Module Catalogue 2022/23

CME8124 : Big Data Analytics in the Process Industries

  • Offered for Year: 2022/23
  • Module Leader(s): Dr Chris O'Malley
  • Lecturer: Dr Jie Zhang
  • Owning School: Engineering
  • Teaching Location: Newcastle City Campus
Semesters
Semester 1 Credit Value: 10
ECTS Credits: 5.0
Pre Requisites
Pre Requisite Comment

Basic knowledge of statistics from A-level mathematics or equivalent

Co Requisites
Co Requisite Comment

N/A

Aims

To introduce students to a variety of data analysis techniques that can be used for modelling and analysis of large datasets, aka “big data”, typically encountered in the process industries.

There are many cases in the process industries where it is not possible to undertake experimental design and utilise the resulting data to enhance process understanding. Quite often the only data available is that collected directly from the process via the routine monitoring and control of process variables on plant. This data is often not of the correct format for subsequent modelling and often contains outliers, missing values and mistakes due to things like transcription errors or badly calibrated instruments.

This module aims to introduce students to tools and techniques for working with this type of data and for extracting meaningful relationships from plant data that can subsequently be used to enhance process understanding and to develop data-driven models for process monitoring and prediction. In recent years this has been an active topic in sectors such as bioprocessing, and it forms a key part of the concept of Quality by Design (QbD).

Outline Of Syllabus

Multivariate Data Analysis: Introduction: What problems can be addressed using these techniques; Preliminary Data Analysis – Handling of Inhomogeneous Data (Missing Data; Outliers; Noisy Data; Time Alignment); Graphical Procedures. Dimensionality Reduction (Principal Component Analysis); Modelling techniques: Multiple linear regression, Principal component regression, Projection to Latent Structures. Multivariate Statistical Performance Monitoring – Continuous and Batch Processes. Model simplification. Analysis of Variance. Confidence Intervals. Non-linear modelling techniques. Machine Learning techniques.
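
As an illustration only, and not taken from the module materials, the following minimal sketch shows the kind of workflow the syllabus describes: pre-treatment of data with missing values followed by Principal Component Analysis. The data are synthetic, the function names pretreat and pca are hypothetical, and mean imputation with autoscaling is just one assumed pre-treatment choice among many.

```python
import numpy as np


def pretreat(X):
    """Fill missing values (NaN) with column means, then autoscale each column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)          # column means ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]            # simple mean imputation (assumption)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca(X_scaled, n_components):
    """PCA via singular value decomposition of the scaled data matrix."""
    U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)      # fraction of variance per component
    return scores, loadings, explained[:n_components]


# Synthetic "plant" data: three correlated variables driven by one latent factor.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(200, 3))
X[5, 1] = np.nan                               # simulate missing readings
X[40, 2] = np.nan

X_scaled = pretreat(X)
scores, loadings, explained = pca(X_scaled, n_components=2)
print("Explained variance fractions:", explained)
```

Because the three synthetic variables share a single underlying factor, most of the variance should be captured by the first principal component; real plant data would generally need more careful handling of outliers, noise and time alignment than this sketch attempts.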

Learning Outcomes

Intended Knowledge Outcomes

To develop the knowledge of the students, through their exposure to a raft of methodologies (data pre-screening, feature extraction and process modelling) that are applicable both in the laboratory and the production plant, thereby enabling them to help in the delivery of enhanced process performance, process understanding and process optimisation.

To develop an awareness of the advantages and disadvantages of the different methodologies (data pre-screening, feature extraction and process modelling) presented for the analysis of industrial data.

To develop the critical ability of the students, enabling them to identify the most appropriate methodologies (data pre-screening, feature extraction and process modelling) for the problem to be addressed.

Intended Skill Outcomes

The ability to understand the fundamental statistical techniques that form the basis of multivariate methods and how they relate to the analysis methods.

The ability to interrogate the results from the execution of a multivariate data analysis in the context of the problem being addressed, e.g. to realise an enhanced understanding of process operation, and to determine their validity and applicability for solving the problem.

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Scheduled Learning And Teaching Activities | Lecture | 16 | 1:00 | 16:00 | Present in Person
Guided Independent Study | Assessment preparation and completion | 1 | 10:00 | 10:00 | Problem Solving Exercise, formative assessment on pre-treatment of data
Guided Independent Study | Assessment preparation and completion | 1 | 30:00 | 30:00 | Problem Solving Exercise 2 and subsequent writing up in report format - summative assessment
Scheduled Learning And Teaching Activities | Small group teaching | 6 | 2:00 | 12:00 | Numerical practice sessions - Computing Labs
Guided Independent Study | Independent study | 1 | 32:00 | 32:00 | Review lecture material and prepare for small group teaching
Total | | | | 100:00 |
Teaching Rationale And Relationship

Lectures convey the statistical concepts and theory and their application in process engineering. Numerical practice sessions support the learning introduced in lectures by giving students the opportunity to apply the concepts to a number of problems of varying complexity. The numerical practice sessions also allow completion of some of the assignment work.

Assessment Methods

The format of resits will be determined by the Board of Examiners

Other Assessment
Description | Semester | When Set | Percentage | Comment
Computer assessment | 1 | M | 100 | Assessed report - Process Data Modelling (set Week 6) - 2000 words
Zero Weighted Pass/Fail Assessments
Description | When Set | Comment
Computer assessment | M | Pass/Fail formative report on pre-screening of data
Assessment Rationale And Relationship

Assignments allow engineering problems to be set and solved using computer software. They also provide the opportunity for the key skills listed above to be assessed and implemented. The formative assessment will run as a lead-in to the summative assessment and will be used to assess the students' comprehension of the techniques discussed in the lectures whilst preparing the data for subsequent analysis.

General Notes

N/A

Disclaimer: The information contained within the Module Catalogue relates to the 2022/23 academic year. In accordance with University Terms and Conditions, the University makes all reasonable efforts to deliver the modules as described. Modules may be amended on an annual basis to take account of changing staff expertise, developments in the discipline, the requirements of external bodies and partners, and student feedback. Module information for the 2023/24 entry will be published here in early-April 2023. Queries about information in the Module Catalogue should in the first instance be addressed to your School Office.