GEO8026 : Data analysis for Geoscience

  • Offered for Year: 2022/23
  • Module Leader(s): Dr Matthew Perks
  • Lecturer: Professor Steve Juggins
  • Owning School: Geography, Politics & Sociology
  • Teaching Location: Newcastle City Campus
Semester 1 Credit Value: 20
ECTS Credits: 10.0


This module will provide students with a working knowledge of software widely used for numerical
analysis in the Geosciences and within a range of industries (e.g. data science, engineering). By the
end of the module, students will be equipped with the knowledge and skills to organise, query,
analyse and display environmental datasets. This skillset will be developed through the completion of
practical exercises using research datasets acquired from across the sub-disciplines of the
geosciences. Following acquisition of the core skills, students will apply their knowledge to solve real-world problems.

This module is split into two parts:

Part 1
Students will be provided with skills and experience of working with the MATLAB programming
language for data import, manipulation, processing, and analysis using a range of environmental
datasets. The focus of this section is the development of code capable of processing datasets that
are large, geo-spatial and/or time series.

Part 2
This section teaches key statistical methods and graphical techniques for the analysis of geoscience
data using the R language and statistical computing environment. The first lecture and practical will
provide an introduction (or refresher) to basic statistical methods and R/RStudio software.
Subsequent sections will explore advanced regression methods and techniques for the analysis and
display of time series and multivariate data.

Outline Of Syllabus

Part 1:
• Fundamentals of MATLAB: Key commands, variable types, structure.
• Matrix and vector operations.
• Data handling: Data import, quality control.
• Time series analysis: Managing time, interpolation, periodicity, filtering and smoothing, automation, data export.
• Statistics: Hypothesis testing, uni-variate and bi-variate tests, quantifying uncertainty.
• Image analysis: Image pre-processing, enhancement, registration, movement detection.
• Visualisation: Production of publication quality plots from 2D and 3D datasets.

Part 2:
• Introduction to R and RStudio: using tidyverse packages for data import, cleaning, restructuring and
• Visualising earth science data: traditional and modern graphics, publication quality plots and
interactive web graphics.
• Refresher on basic statistical methods using R (correlation, least squares regression, t-test,
analysis of variance)
• Advanced regression: multiple regression and model building, generalised linear models (GLMs),
generalised additive models (GAMs), classification and regression trees (CARTs).
• Analysis of multivariate data: ordination and cluster analysis; analysis of time-series data.

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Scheduled Learning And Teaching Activities | Lecture | 10 | 1:00 | 10:00 | PiP or synchronous online depending on coronavirus infection rates
Scheduled Learning And Teaching Activities | Practical | 10 | 2:00 | 20:00 | A - follow lecture in same room - PiP or synchronous online depending on coronavirus infection rates
Scheduled Learning And Teaching Activities | Practical | 10 | 3:00 | 30:00 | B - morning on day after lecture + A - PiP or synchronous online depending on coronavirus infection rates
Guided Independent Study | Project work | 26 | 5:00 | 130:00 | N/A
Scheduled Learning And Teaching Activities | Drop-in/surgery | 10 | 1:00 | 10:00 | Follow B on same day - PiP or synchronous online depending on coronavirus infection rates
Teaching Rationale And Relationship

Part One: The module will be taught over a two-week period. This will take the form of five three-hour blocks of teaching, each comprising a lecture to provide background and underpinning theory, followed by a two-hour practical session where students will apply the techniques to a single practical example. Following this, students will be given access to additional resources and will be expected to apply their knowledge and skills to an extended problem. The outputs generated during these sessions will form part of the requirements of the portfolio. Students will be allocated time to work on this problem independently the following day (Practical B) and a drop-in surgery will be available in the afternoon to troubleshoot any problems and offer further

Part Two: Staff teaching will take place each morning (9am-1pm) in the form of short (30-minute) lectures, each followed by a 30-minute “taster” practical where students can apply methods learned in the lectures to real-world data. Students will then work independently during the afternoon on more substantial practical exercises. There will be a one-hour surgery session towards the end of the afternoon practical to troubleshoot problems and provide feedback on student work.

Assessment Methods

The format of resits will be determined by the Board of Examiners.

Other Assessment
Description | Semester | When Set | Percentage | Comment
Portfolio | 1 | M | 50 | 1500 word equivalent practical portfolio report
Portfolio | 1 | M | 50 | 1500 word equivalent practical portfolio report based on statistical analysis of bespoke datasets
Assessment Rationale And Relationship

The practical portfolio will evaluate students’ ability to analyse a series of bespoke geoscience datasets using a range of statistical techniques. The portfolio will test their knowledge in terms of selecting an appropriate statistical method, assessing the assumptions of the method, interpreting the output, and relating results to the wider environmental context of the problem. The portfolio will also test students’ programming, graphical and other practical skills to perform the analysis and produce tabular and/or graphical summaries of results. A practical report is essential given the practical nature of the module.

Reading Lists