DSC8014 : Hypothesis-Led Data Analysis and Modelling for Industrial Biosciences - Global Opportunities

Offered for Year: 2025/26

Available to incoming Study Abroad and Exchange students

Module Leader(s): Dr Matt Bawn

Owning School: Natural and Environmental Sciences
Teaching Location: Newcastle City Campus

Semesters

Your programme is made up of credits; the total differs from programme to programme.

Semester 2 Credit Value:	20
ECTS Credits:	10.0
European Credit Transfer System

Pre-requisite

Modules you must have done previously to study this module

Pre Requisite Comment

N/A

Co-Requisite

Modules you need to take at the same time

Co Requisite Comment

N/A

Aims

This module aims to train students in the design, implementation, and interpretation of hypothesis-led data analysis and modelling approaches relevant to industrial biosciences. Students will develop the ability to frame biologically and commercially relevant research questions, select suitable modelling strategies, and apply them to complex datasets using appropriate tools and techniques.

Emphasis is placed on developing a solid understanding of both statistical and machine learning approaches, including regression, classification, clustering, dimensionality reduction, and time series forecasting. The module also introduces omics-specific modelling methods (e.g., DESeq2 for transcriptomics) and key model evaluation techniques such as cross-validation, ROC analysis, and feature importance, ensuring students can assess both performance and biological relevance.

Students will work with real and simulated datasets from a range of bioscience applications—such as bioprocess monitoring, antimicrobial resistance, and microbial ecology—and learn to interpret modelling results in biological terms, including gene function, phenotype, and yield. Reproducibility and clear communication of analytical workflows are a core focus, with students using R or Python in environments such as R Markdown or Jupyter Notebooks.

Throughout the module, students will engage in hands-on computational workshops supported by asynchronous industrial talks, offering insights into how data modelling supports decision-making and innovation in the bioscience sector. Through both individual and group-based assessments, students will develop the confidence to apply data-driven approaches to complex biological questions and effectively communicate their findings to scientific and industry audiences.

Outline Of Syllabus

• Framing testable hypotheses from bioscience problems in industrial contexts (e.g., synthetic biology, agritech, pharma)
• Experimental design, variable control, and exploratory data analysis (EDA)
• Statistical and machine learning models for regression, classification, clustering, and dimensionality reduction
• Time series analysis and forecasting for bioprocess optimisation
• Omics-specific modelling approaches (e.g., DESeq2 for transcriptomics)
• Model evaluation techniques (e.g., cross-validation, ROC/AUC, feature importance)
• Interpretation, visualisation, and communication of results for scientific and industry audiences

Teaching Methods

Teaching Activities

Category	Activity	Number	Length	Student Hours	Comment
Guided Independent Study	Assessment preparation and completion	10	1:00	10:00	Peer review of weekly analysis tasks (10 x 1 hour)
Guided Independent Study	Assessment preparation and completion	10	1:00	10:00	Weekly analysis tasks (10 x 1 hour). Students can make their work available for peer-to-peer assessment
Guided Independent Study	Assessment preparation and completion	5	2:00	10:00	Mini group project presentation on guided hypothesis testing on a biological dataset
Guided Independent Study	Assessment preparation and completion	1	0:20	0:20	Group recorded presentation
Guided Independent Study	Assessment preparation and completion	1	30:00	30:00	Max 6-page report on full modelling pipeline with biological interpretation
Guided Independent Study	Directed research and reading	1	99:40	99:40	Writing up notes, reading on topics of interest
Scheduled Learning And Teaching Activities	Practical	9	2:00	18:00	Computer Workshop
Scheduled Learning And Teaching Activities	Drop-in/surgery	10	1:00	10:00	Online drop-in
Scheduled Learning And Teaching Activities	Module talk	1	2:00	2:00	Module introduction - Synchronous In-person
Scheduled Learning And Teaching Activities	Module talk	10	1:00	10:00	Asynchronous online module talk
Total	200:00

Teaching Rationale And Relationship

Throughout the module, students will engage in hands-on computational workshops supported by asynchronous industrial talks, offering insights into how data modelling supports decision-making and innovation in the bioscience sector.

This module delivers content through integrated computational workshops, where short lecture segments are interwoven with hands-on modelling exercises using R and Python. This approach ensures students immediately apply statistical and machine learning concepts to real datasets, directly supporting both knowledge and skills outcomes.
Workshops are structured around authentic bioscience scenarios, with a focus on hypothesis framing, model evaluation, and biological interpretation. Weekly practical tasks and group activities enable students to build confidence and fluency with computational tools (e.g., DESeq2, scikit-learn, and Prophet).
To enhance industry relevance, asynchronous industrial talks are included, offering insight into how data modelling is used in fields such as fermentation optimisation, bioreactor monitoring, and antimicrobial resistance prediction. These talks help contextualise technical skills within real-world decision-making and innovation.
The combination of applied instruction, industry engagement, and collaborative activities ensures that students develop both the conceptual understanding and technical competencies required for data-led roles in industrial biosciences.

Assessment Methods

The format of resits will be determined by the Board of Examiners

Other Assessment

Description	Semester	When Set	Percentage	Comment
Report	2	M	60	Max. 6 A4 pages. Full modelling pipeline with biological interpretation (e.g., forecasting CHO bioreactor yield, predicting resistance phenotypes)
Oral Presentation	2	M	40	Group presentation (20 mins). Guided hypothesis testing on a biological dataset (e.g., transcriptomics, metagenomics, AMR). The mini-group project allows students to collaboratively test a biological hypothesis using real or simulated industrial data.

Formative Assessments

Formative assessment develops your skills in being assessed, allows you to receive feedback, and prepares you for summative assessment. However, it does not count towards your final mark.

Description	Semester	When Set	Comment
Written exercise	2	M	Weekly research projects. 10 weekly analysis tasks. Students can make their work available for peer-to-peer assessment.

Assessment Rationale And Relationship

Through both individual and group-based assessments, students will develop the confidence to apply data-driven approaches to complex biological questions and effectively communicate their findings to scientific and industry audiences.
The assessment strategy supports the module’s emphasis on hypothesis-led analysis and real-world application. It consists of a mini-group project (40%) and an individual research report (60%).
The mini-group project allows students to collaboratively test a biological hypothesis using real or simulated industrial data. Throughout the module, students submit key components of their work for peer-to-peer review, receiving formative feedback on hypothesis framing, data exploration, modelling choices, and interpretation. This encourages reflective learning and supports iterative improvement. For summative submission, each group finalises and submits their project, which is assessed using a combination of peer and academic marking to ensure both rigour and collaborative accountability.
The individual research report is a capstone assignment where students independently design and implement a full modelling pipeline. This includes data preparation, model selection, performance evaluation, and biological interpretation, culminating in a report that reflects industry-standard analytical practice. It provides students with the opportunity to demonstrate mastery of both the conceptual and practical aspects of the module.
Weekly portfolio tasks reinforce learning by providing structured, incremental challenges that build technical proficiency and promote critical thinking.
Together, these assessments promote critical thinking, technical fluency, and effective scientific communication, ensuring students are prepared for data-driven research and decision-making in industrial biosciences.

Reading Lists

DSC8014's Reading List

Timetable

Timetable Website: www.ncl.ac.uk/timetable/
DSC8014's Timetable