DSC8014 : Hypothesis-Led Data Analysis and Modelling for Industrial Biosciences
- Offered for Year: 2025/26
- Available to incoming Study Abroad and Exchange students
- Module Leader(s): Dr Matt Bawn
- Owning School: Natural and Environmental Sciences
- Teaching Location: Newcastle City Campus
Semesters
Your programme is made up of credits; the total differs from programme to programme.
| Semester 2 Credit Value: | 20 |
| ECTS (European Credit Transfer System) Credits: | 10.0 |
Pre-requisite
Modules you must have done previously to study this module
Pre Requisite Comment
N/A
Co-Requisite
Modules you need to take at the same time
Co Requisite Comment
N/A
Aims
This module aims to train students in the design, implementation, and interpretation of hypothesis-led data analysis and modelling approaches relevant to industrial biosciences. Students will develop the ability to frame biologically and commercially relevant research questions, select suitable modelling strategies, and apply them to complex datasets using appropriate tools and techniques.
Emphasis is placed on developing a solid understanding of both statistical and machine learning approaches, including regression, classification, clustering, dimensionality reduction, and time series forecasting. The module also introduces omics-specific modelling methods (e.g., DESeq2 for transcriptomics) and key model evaluation techniques such as cross-validation, ROC analysis, and feature importance, ensuring students can assess both performance and biological relevance.
Students will work with real and simulated datasets from a range of bioscience applications—such as bioprocess monitoring, antimicrobial resistance, and microbial ecology—and learn to interpret modelling results in biological terms, including gene function, phenotype, and yield. Reproducibility and clear communication of analytical workflows are a core focus, with students using R or Python in environments such as R Markdown or Jupyter Notebooks.
Throughout the module, students will engage in hands-on computational workshops supported by asynchronous industrial talks, offering insights into how data modelling supports decision-making and innovation in the bioscience sector. Through both individual and group-based assessments, students will develop the confidence to apply data-driven approaches to complex biological questions and effectively communicate their findings to scientific and industry audiences.
Outline Of Syllabus
• Framing testable hypotheses from bioscience problems in industrial contexts (e.g., synthetic biology, agritech, pharma)
• Experimental design, variable control, and exploratory data analysis (EDA)
• Statistical and machine learning models for regression, classification, clustering, and dimensionality reduction
• Time series analysis and forecasting for bioprocess optimisation
• Omics-specific modelling approaches (e.g., DESeq2 for transcriptomics)
• Model evaluation techniques (e.g., cross-validation, ROC/AUC, feature importance)
• Interpretation, visualisation, and communication of results for scientific and industry audiences
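As an illustration of the model evaluation topics listed above (cross-validation and ROC/AUC), the following is a minimal, standard-library-only Python sketch. It is not taken from the module materials: the function names `roc_auc` and `k_fold_indices` and the toy labels and scores are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Hypothetical toy data: 1 = resistant isolate, 0 = susceptible isolate.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

# Five folds over ten samples: each sample is held out exactly once.
folds = k_fold_indices(n_samples=10, k=5)
```

In a full workflow, a model would be fitted on each `train` split and `roc_auc` computed on the matching `test` split, so that performance is always estimated on data the model has not seen.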
Teaching Methods
Teaching Activities
| Category | Activity | Number | Length | Student Hours | Comment |
|---|---|---|---|---|---|
| Guided Independent Study | Assessment preparation and completion | 10 | 1:00 | 10:00 | Peer review of weekly analysis tasks (10 x 1 hour) |
| Guided Independent Study | Assessment preparation and completion | 10 | 1:00 | 10:00 | Weekly analysis tasks (10 x 1 hour). Students can make these available for peer-to-peer assessment |
| Guided Independent Study | Assessment preparation and completion | 5 | 2:00 | 10:00 | Mini group project presentation on guided hypothesis testing on a biological dataset |
| Guided Independent Study | Assessment preparation and completion | 1 | 0:20 | 0:20 | Group recorded presentation |
| Guided Independent Study | Assessment preparation and completion | 1 | 30:00 | 30:00 | Max 6-page report on full modelling pipeline with biological interpretation |
| Guided Independent Study | Directed research and reading | 1 | 99:40 | 99:40 | Writing up notes, reading on topics of interest |
| Scheduled Learning And Teaching Activities | Practical | 9 | 2:00 | 18:00 | Computer Workshop |
| Scheduled Learning And Teaching Activities | Drop-in/surgery | 10 | 1:00 | 10:00 | Online drop-in |
| Scheduled Learning And Teaching Activities | Module talk | 1 | 2:00 | 2:00 | Module introduction - Synchronous In-person |
| Scheduled Learning And Teaching Activities | Module talk | 10 | 1:00 | 10:00 | Asynchronous online module talk |
| Total | | | | 200:00 | |
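The Student Hours column can be checked against the stated total with a short tally. This is a minimal sketch: the values are copied from the table above, and the helper name `to_minutes` is hypothetical.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'H:MM' duration string to whole minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# Student Hours column, in table order.
student_hours = [
    "10:00", "10:00", "10:00", "0:20", "30:00", "99:40",  # guided independent study
    "18:00", "10:00", "2:00", "10:00",                    # scheduled learning and teaching
]

total_minutes = sum(to_minutes(h) for h in student_hours)
total = f"{total_minutes // 60}:{total_minutes % 60:02d}"

# Hours per credit for this 20-credit module.
hours_per_credit = total_minutes / 60 / 20
```

The tally gives 200:00, which works out to 10 hours of study per credit for this 20-credit module.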
Teaching Rationale And Relationship
Throughout the module, students will engage in hands-on computational workshops supported by asynchronous industrial talks, offering insights into how data modelling supports decision-making and innovation in the bioscience sector.
This module delivers content through integrated computational workshops, where short lecture segments are interwoven with hands-on modelling exercises using R and Python. This approach ensures students immediately apply statistical and machine learning concepts to real datasets, directly supporting both knowledge and skills outcomes.
Workshops are structured around authentic bioscience scenarios, with a focus on hypothesis framing, model evaluation, and biological interpretation. Weekly practical tasks and group activities enable students to build confidence and fluency with computational tools such as DESeq2, scikit-learn, and Prophet.
To enhance industry relevance, asynchronous industrial talks are included, offering insight into how data modelling is used in fields such as fermentation optimisation, bioreactor monitoring, and antimicrobial resistance prediction. These talks help contextualise technical skills within real-world decision-making and innovation.
The combination of applied instruction, industry engagement, and collaborative activities ensures that students develop both the conceptual understanding and technical competencies required for data-led roles in industrial biosciences.
Assessment Methods
The format of resits will be determined by the Board of Examiners
Other Assessment
| Description | Semester | When Set | Percentage | Comment |
|---|---|---|---|---|
| Report | 2 | M | 60 | Max. 6 A4 pages. Full modelling pipeline with biological interpretation (e.g., forecasting CHO bioreactor yield, predicting resistance phenotypes) |
| Oral Presentation | 2 | M | 40 | Group presentation (20 mins) on guided hypothesis testing on a biological dataset (e.g., transcriptomics, metagenomics, AMR). The mini-group project allows students to collaboratively test a biological hypothesis using real or simulated industrial data. |
Formative Assessments
Formative assessment develops your skills in being assessed, allows you to receive feedback, and prepares you for summative assessment. It does not count towards your final mark.
| Description | Semester | When Set | Comment |
|---|---|---|---|
| Written exercise | 2 | M | Weekly research projects: 10 weekly analysis tasks, which students can make available for peer-to-peer assessment. |
Assessment Rationale And Relationship
Through both individual and group-based assessments, students will develop the confidence to apply data-driven approaches to complex biological questions and effectively communicate their findings to scientific and industry audiences.
The assessment strategy supports the module’s emphasis on hypothesis-led analysis and real-world application. It consists of a mini-group project (40%) and an individual research report (60%).
The mini-group project allows students to collaboratively test a biological hypothesis using real or simulated industrial data. Throughout the module, students submit key components of their work for peer-to-peer review, receiving formative feedback on hypothesis framing, data exploration, modelling choices, and interpretation. This encourages reflective learning and supports iterative improvement. For summative submission, each group finalises and submits their project, which is assessed using a combination of peer and academic marking to ensure both rigour and collaborative accountability.
The individual research report is a capstone assignment where students independently design and implement a full modelling pipeline. This includes data preparation, model selection, performance evaluation, and biological interpretation, culminating in a report that reflects industry-standard analytical practice. It provides students with the opportunity to demonstrate mastery of both the conceptual and practical aspects of the module.
Weekly portfolio tasks reinforce learning by providing structured, incremental challenges that build technical proficiency and promote critical thinking.
Together, these assessments promote critical thinking, technical fluency, and effective scientific communication, ensuring students are prepared for data-driven research and decision-making in industrial biosciences.
Reading Lists
Timetable
- Timetable Website: www.ncl.ac.uk/timetable/