Module Catalogue 2019/20

CSC8101 : Big Data Analytics

  • Offered for Year: 2019/20
  • Module Leader(s): Dr Paolo Missier
  • Owning School: Computing
  • Teaching Location: Newcastle City Campus
Semesters
Semester 2 Credit Value: 10
ECTS Credits: 5.0
Pre Requisites
Pre Requisite Comment

None

Co Requisites
Co Requisite Comment

None

Aims

The aim of Big Data Analytics is to analyse large amounts of data in order to extract useful information. Examples include analysing the world wide web to power web search engines, optimising the design of e-commerce sites by analysing user activity, and processing “open linked data” released globally both by governments, in order to improve public services, and by research organisations, in order to improve data sharing. Whilst data analysis has been an important topic for many decades, three developments have led to a surge of interest in new algorithms and methods. Firstly, there has been an explosion in the quantity and variety of data generated by organisations, programs and sensors: the web is one example of this. This has put the processing of such data beyond the reach of existing approaches. Secondly, cloud computing has provided a new type of dynamically scalable platform on which to parallelise data analysis. Thirdly, there is enormous potential for insight and action deriving from the real-time analysis of data, such as data from sensors, social media and e-commerce.
This module focusses on the algorithms, technologies and architectures required to analyse “big data”, with a particular focus on cloud-based solutions.

Outline Of Syllabus

- Scalable data management architectures
- Overview of data-parallel problems in e-science
- Patterns and technology for exploiting cloud infrastructure on data-parallel problems
- Graph databases and their application to social media analysis
- Scalable real-time data processing

Learning Outcomes

Intended Knowledge Outcomes

You will learn to competently discuss the merits of various technologies for Big Data Processing, with respect to specific Data Analytics challenges. Specifically:
- You will learn about the scale of big data and its progression in time, in diverse data domains, including areas of science
- You will learn how to configure and best exploit scalable data architectures in view of their application to challenges in multiple application domains
- You will learn how to combine big data analytics techniques with machine learning approaches to predictive analytics, from the perspective of both Data Science and Data Engineering

Intended Skill Outcomes

- The ability to design, implement and evaluate big data analysis systems
- The ability to apply big data analysis to specific problems in support of science and social science.

Graduate Skills Framework

Graduate Skills Framework Applicable: Yes
  • Cognitive/Intellectual Skills
    • Critical Thinking : Assessed
    • Data Synthesis : Present
    • Active Learning : Assessed
    • Numeracy : Present
    • Literacy : Present
    • Information Literacy
      • Source Materials : Present
      • Synthesise And Present Materials : Present
      • Use Of Computer Applications : Assessed
  • Self Management
    • Self Awareness And Reflection : Present
    • Planning and Organisation
      • Goal Setting And Action Planning : Present
    • Personal Enterprise
      • Innovation And Creativity : Present
      • Independence : Present
      • Problem Solving : Assessed
  • Interaction
    • Communication
      • Written Other : Assessed
  • Application
    • Occupational Awareness : Present

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Guided Independent Study | Assessment preparation and completion | 20 | 1:00 | 20:00 | Lecture follow up
Scheduled Learning And Teaching Activities | Lecture | 20 | 1:00 | 20:00 | Lectures
Scheduled Learning And Teaching Activities | Practical | 18 | 1:00 | 18:00 | Practicals
Guided Independent Study | Project work | 24 | 1:00 | 24:00 | Coursework / Lab reports
Guided Independent Study | Independent study | 18 | 1:00 | 18:00 | Background reading
Total: 100:00
Teaching Rationale And Relationship

Lectures will be used to introduce the learning material and to demonstrate the key concepts by example. Students are expected to follow up lectures within a few days by re-reading and annotating lecture notes to aid deep learning.

This is a very practical subject, and it is important that the learning materials are supported by hands-on opportunities provided by practical classes. Students are expected to spend time on coursework outside timetabled practical classes.

Students aiming for 1st class marks are expected to widen their knowledge beyond the content of lecture notes through background reading.

Assessment Methods

The format of resits will be determined by the Board of Examiners

Other Assessment
Description | Semester | When Set | Percentage | Comment
Practical/lab report | 2 | M | 10 | in-lab sign-off assignment: Neo4J queries. (8 hours)
Practical/lab report | 2 | M | 90 | Spark programming. 24 hours. (Includes demo/discussion of work.)
Assessment Rationale And Relationship

The assessment structure is designed to maximize engagement of the students with an area of technology that is evolving very rapidly. This is achieved in two ways.
- An in-lab assignment, signed off at the end of a 2-hour lab. This covers one of the key topics in the module (querying graph databases) from a hands-on, practical programming perspective.
- A coursework assignment (programming) with free lab time as well as assisted practical hours. The aim is to offer students a rich, hands-on experience using the dominant technology for Big Data Analytics (Spark), on a state-of-the-art industry-grade platform.
As part of the coursework assessment, a short demonstration session is conducted with each student individually to discuss their solution as well as to test students’ knowledge of other topics covered in the module.

General Notes

N/A

Disclaimer: The information contained within the Module Catalogue relates to the 2019/20 academic year. In accordance with University Terms and Conditions, the University makes all reasonable efforts to deliver the modules as described. Modules may be amended on an annual basis to take account of changing staff expertise, developments in the discipline, the requirements of external bodies and partners, and student feedback. Module information for the 2020/21 entry will be published here in early-April 2019. Queries about information in the Module Catalogue should in the first instance be addressed to your School Office.