EXT8035 : Big Data
- Offered for Year: 2026/27
- Module Leader(s):
- Owning School: Engineering
- Teaching Location: Newcastle City Campus
Semesters
Your programme is made up of credits; the total differs from programme to programme.
| Semester 2 Credit Value: | 10 |
| ECTS Credits (European Credit Transfer System): | 5.0 |
Pre-requisite
Modules you must have done previously to study this module
Pre Requisite Comment
N/A
Co-Requisite
Modules you need to take at the same time
Co Requisite Comment
N/A
Aims
The aim of this module is to provide an overview of the big data problem and present the main principles and technologies behind distributed/parallel systems with data intensive applications.
Outline Of Syllabus
“Big Data” refers to data whose volume, diversity and complexity exceed the normal processing capabilities of a single computer, and which therefore require new technologies, algorithms and analyses to extract valuable knowledge. The field of Big Data has many facets, such as databases, security and privacy, visualisation, computational infrastructure and data analytics/mining. This module covers the following topics:
1. Introduction to Big Data: introducing the main principles behind distributed/parallel systems with data-intensive applications, and identifying the key challenges: capturing, storing, searching, analysing and visualising the data.
2. SQL Databases vs. NoSQL Databases: understanding the growing amounts of data; relational database management systems (RDBMS); an overview of structured query languages (e.g. SQL); an introduction to NoSQL databases; understanding the difference between a relational DBMS and a NoSQL database; identifying when a NoSQL database is needed.
3. Big Data frameworks and how to deal with big data: this includes the MapReduce programming model, as well as an overview of recent technologies (the Hadoop ecosystem and Apache Spark). You will then learn how to interact with the latest APIs of Apache Spark (RDDs, DataFrames and Datasets) to create distributed programs capable of dealing with big datasets (using Python and/or Scala).
4. Data mining and machine learning: this final part of the module covers data preprocessing approaches (to obtain quality data), distributed machine learning algorithms and data stream algorithms. You will use the machine learning library of Apache Spark (MLlib) to understand how some machine learning algorithms (e.g. Decision Trees, Random Forests, k-means) can be deployed at scale.
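As a minimal illustration of the MapReduce programming model named in item 3, the following Python sketch counts words on a single machine. It is not taken from the module material: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, chosen only to make the three phases visible. A framework such as Hadoop or Apache Spark would run the same phases distributed across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data needs big tools", "data tools scale"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both phases can in principle be split across many machines; the shuffle step is where data has to move between them.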
Learning Outcomes
Intended Knowledge Outcomes
Knowledge and Understanding: Understand the importance of data and the principles that allow the processing of big data sets. Understand the workings and features of existing machine learning algorithms capable of handling big data. Learn to use the main tools of the big data ecosystem. Understand the current limitations of big data technologies for distributed machine learning.
Intended Skill Outcomes
Intellectual Skills: Understand complex ideas and relate them to specific problems or questions in the area of parallel computation. Be able to identify distributed solutions/approaches to handle big datasets with existing technologies.
Professional/Practical Skills: Hands-on experience with state-of-the-art technologies to handle big data.
Transferable/Key Skills: Experience in problem solving. Experience in working in groups. Ability to retrieve information from appropriate sources (e.g. the Spark API).
Teaching Methods
Teaching Activities
| Category | Activity | Number | Length | Student Hours | Comment |
|---|---|---|---|---|---|
| Scheduled Learning And Teaching Activities | Lecture | 1 | 100:00 | 100:00 | Delivered at Nottingham University as part of the Power Electronics CDT |
| Total | | | | 100:00 | |
Teaching Rationale And Relationship
Teaching methods are set by Nottingham University
Reading Lists
Assessment Methods
The format of resits will be determined by the Board of Examiners
Exams
| Description | Length | Semester | When Set | Percentage | Comment |
|---|---|---|---|---|---|
| Written Examination | 120 | 2 | A | 50 | Assessment set by Nottingham University |
Other Assessment
| Description | Semester | When Set | Percentage | Comment |
|---|---|---|---|---|
| Portfolio | 2 | M | 50 | Assessment set by Nottingham University |
Assessment Rationale And Relationship
Assessment is set by Nottingham University
Timetable
- Timetable Website: www.ncl.ac.uk/timetable/
- EXT8035's Timetable
Past Exam Papers
- Exam Papers Online : www.ncl.ac.uk/exam.papers/
- EXT8035's past Exam Papers
General Notes
N/A
Disclaimer
The information contained within the Module Catalogue relates to the 2026 academic year.
In accordance with University Terms and Conditions, the University makes all reasonable efforts to deliver the modules as described.
Modules may be amended on an annual basis to take account of changing staff expertise, developments in the discipline, the requirements of external bodies and partners, staffing changes, and student feedback. Module information for the 2027/28 entry will be published here in early April 2027. Queries about information in the Module Catalogue should, in the first instance, be addressed to your School Office.