Module

GEO8026: Data analysis for Geoscience

  • Offered for Year: 2024/25
  • Module Leader(s): Dr Matthew Perks
  • Co-Module Leader: Prof. Steve Juggins
  • Owning School: Geography, Politics & Sociology
  • Teaching Location: Newcastle City Campus
Semesters

Your programme is made up of credits; the total differs from programme to programme.

Semester 1 Credit Value: 20
ECTS (European Credit Transfer System) Credits: 10.0

Aims

This module will provide students with a working knowledge of software widely used for numerical analysis in the Geosciences and across a range of industries (e.g. data science, environmental consultancy/engineering). By the end of the module, students will have the knowledge and skills to organise, query, analyse and display environmental datasets. These skills will be developed through practical exercises using research datasets drawn from across the sub-disciplines of the Geosciences. Having acquired the core skills, students will apply their knowledge to solve real-world problems.

This module is split into two parts:

Part 1
Students will gain skills and experience in using the MATLAB programming language for data import, manipulation, processing and analysis, working with a range of environmental datasets. The focus of this part is on developing code capable of processing large, geospatial and/or time-series datasets.

Part 2
This part teaches key statistical methods and graphical techniques for the analysis of geoscience data using the R language and statistical computing environment. The first lecture and practical will provide an introduction to (or refresher on) basic statistical methods and the R/RStudio software. Subsequent sessions will explore advanced regression methods and techniques for the analysis and display of time-series and multivariate data.

Outline Of Syllabus

Part 1:
• Fundamentals of MATLAB: Key commands, variable types, structure.
• Matrix and vector operations.
• Data handling: Data import, quality control.
• Time series analysis: Managing time, interpolation, periodicity, filtering and smoothing, automation, data export.
• Statistics: Hypothesis testing, univariate and bivariate tests, quantifying uncertainty.
• Image analysis: Image pre-processing, enhancement, registration, movement detection.
• Visualisation: Production of publication quality plots from 2D and 3D datasets.

Part 2:
• Introduction to R and RStudio: using tidyverse packages for data import, cleaning, restructuring and summary.
• Visualising earth science data: traditional and modern graphics, publication quality plots and interactive web graphics.
• Refresher on basic statistical methods using R (correlation, least squares regression, t-test, analysis of variance).
• Advanced regression: multiple regression and model building, generalised linear models (GLMs), generalised additive models (GAMs), classification and regression trees (CARTs).
• Analysis of multivariate data: ordination and cluster analysis; analysis of time-series data.

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Scheduled Learning And Teaching Activities | Lecture | 10 | 1:00 | 10:00 | PiP
Scheduled Learning And Teaching Activities | Practical | 10 | 2:00 | 20:00 | A - follow lecture in same room - PiP
Scheduled Learning And Teaching Activities | Practical | 10 | 3:00 | 30:00 | B - morning on day after lecture + A - PiP
Guided Independent Study | Project work | 26 | 5:00 | 130:00 | N/A
Scheduled Learning And Teaching Activities | Drop-in/surgery | 10 | 1:00 | 10:00 | Follow B on same day
Total | | | | 200:00 |
Teaching Rationale And Relationship

Part One: The module will be taught over a two-week period. Teaching will take the form of five three-hour blocks, each comprising a lecture that provides background and underpinning theory, followed by a two-hour practical session in which students apply the techniques to a single practical example. Students will then be given access to additional resources and will be expected to apply their knowledge and skills to an extended problem. Time will be allocated for students to work on this problem independently the following day (Practical B), and a drop-in surgery will be available in the afternoon to troubleshoot problems and offer further guidance.

Part Two: Staff-led teaching will take the form of short (30-minute) lectures, each followed by a 30-minute “taster” practical in which students apply the methods learned in the lecture to real-world data. Students will then work independently during the afternoon on more substantial practical exercises. A one-hour surgery session will be available to troubleshoot problems and provide feedback on student work.

Assessment Methods

The format of resits will be determined by the Board of Examiners.

Other Assessment
Description | Semester | When Set | Percentage | Comment
Portfolio | 1 | M | 50 | 1500 word equivalent practical portfolio report
Portfolio | 1 | M | 50 | 1500 word equivalent practical portfolio report based on statistical analysis of bespoke datasets
Assessment Rationale And Relationship

The practical portfolio will evaluate students’ ability to analyse a series of bespoke geoscience datasets using a range of statistical techniques. The portfolio will test their knowledge in terms of selecting an appropriate statistical method, assessing the assumptions of the method, interpreting the output, and relating results to the wider environmental context of the problem. The portfolio will also test students’ programming, graphical and other practical skills to perform the analysis and produce tabular and/or graphical summaries of results. A practical report is essential given the practical nature of the module.
