CSC8631 : Data Management and Exploratory Data Analysis

  • Offered for Year: 2021/22
  • Module Leader(s): Dr Matthew Forshaw
  • Lecturer: Dr Joe Matthews
  • Owning School: Computing
  • Teaching Location: Newcastle City Campus
Semester 1 Credit Value: 10
ECTS Credits: 5.0


This module explores the principles of data management and exploratory data analysis. Furthermore, we introduce the underlying technologies and computational tools to support automation and reproducibility in data analysis.

Specifically, the module aims to equip the students with the following knowledge and skills:
•       To understand the principles of the scientific method and how it is applied in computational analyses
•       To understand methods of data characterisation and data processing
•       To understand the principles of knowledge representation and constructing data models
•       To understand the technologies that support analysis pipelines
•       To understand end-to-end system design for Data Science

PiP sessions may be available during the practical sessions.

Outline Of Syllabus

1.       Scientific method in computational analyses
2.       The software lifecycle
3.       The data lifecycle
4.       Variable characterisation and experimental design
5.       Exploratory data analysis
6.       Open Science and Reproducibility
7.       Data Architectures
8.       System design, microservices and workflows
9.       Developing data products

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Guided Independent Study | Assessment preparation and completion | 10 | 1:00 | 10:00 | Lecture follow-up
Guided Independent Study | Assessment preparation and completion | 20 | 1:00 | 20:00 | Background reading & participation in discussions
Guided Independent Study | Assessment preparation and completion | 3 | 1:00 | 3:00 | Preparation for oral presentation
Guided Independent Study | Assessment preparation and completion | 27 | 1:00 | 27:00 | Coursework
Guided Independent Study | Assessment preparation and completion | 1 | 2:00 | 2:00 | Non-synchronous online presentations session with participation in non-synchronous peer Q&A
Guided Independent Study | Assessment preparation and completion | 5 | 0:30 | 2:30 | Preparation for oral examination
Guided Independent Study | Assessment preparation and completion | 1 | 0:30 | 0:30 | Oral Examination
Scheduled Learning And Teaching Activities | Lecture | 10 | 1:00 | 10:00 | Lectures
Scheduled Learning And Teaching Activities | Practical | 25 | 1:00 | 25:00 | Practical sessions (PiP)
Teaching Rationale And Relationship

Lectures explain the underpinning principles for the module and technologies that support data management and exploratory data analysis. Lectures are complemented by supervised practical sessions to guide the application of these principles using suitable computational tools. The practical work builds up experience working with a computational toolset that is used to complete a substantive project working with data from a real-world context.

Assessment Methods

The format of resits will be determined by the Board of Examiners

Other Assessment
Description | Semester | When Set | Percentage | Comment
Report | 1 | M | 80 | Extended technical project. Word count: up to 2,000 words
Oral Examination | 1 | M | 20 | Oral presentation of the methods and results from the coursework. Length: 15 minutes
Zero Weighted Pass/Fail Assessments
Description | When Set | Comment
Oral Examination | M | A structured interview/discussion including a software demonstration & reflection on the key learning objectives of coursework.
Assessment Rationale And Relationship

The report tests the students’ ability to apply data management techniques in a reproducible manner, using effective tools and methods to solve a real-world challenge. The presentation assesses the students’ ability to communicate their approach and findings. The semi-structured interview facilitates a reflective discussion about how individual students have met the learning objectives of the module and how the principles of data management and exploratory data analysis are embedded in the functionality of their software artefact.

Reading Lists