

CSC1033 : Information Storage and Retrieval

  • Offered for Year: 2019/20
  • Module Leader(s): Dr Marie Devlin
  • Lecturer: Dr John Colquhoun, Dr Dan Nesbitt
  • Owning School: Computing
  • Teaching Location: Newcastle City Campus
Semester 1 Credit Value: 10
Semester 2 Credit Value: 10
ECTS Credits: 10.0


This module will provide students with an understanding of information storage and retrieval. This relates to all forms of data, including text and multimedia (image, video and audio) stored on and consumed from the web, amongst other sources. The module covers fundamental techniques and strategies of information storage and retrieval used in a variety of online applications such as web-search engines, document matching systems and business storage and analytics.

Outline Of Syllabus

Database design
•       Retrieval, browsing, user information needs, and other core concerns.
•       Notions of structured, unstructured and semi-structured data
•       Relational databases, SQL
•       Exemplar applications including business data collections and website design.

Data representation
•       Data models and query languages
•       Metadata and semantics, faceted classifications, and other “linked data” issues
•       Image and video features and classifications that enable access to other media types
•       Data standards and governance
•       Query expansion and its relationship with the Semantic Web
•       Spiders/crawlers, stopwords and keywords, indexing and stemming
•       Exemplar applications including publishing multimedia data archives, web-based search engines

Data Integration
•       Information models, databases and data normalization for transactional systems (OLTP)
•       Data de-normalization, data marts / data warehouses, star and snowflake schemas, and cubes as support for analytical systems (OLAP)
•       Exemplar applications, including organisation-wide analytics, e-commerce

Distributed databases
•       The challenges presented by “Big Data”
•       NoSQL and Cloud Computing for distributed and scalable treatment of “Big Data”.
•       Overview of IoT and edge computing
•       Exemplar applications such as real-time data processing, smart video stream surveillance, urban observatory data publishing.

Teaching Methods

Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment
Guided Independent Study | Assessment preparation and completion | 24 | 1:00 | 24:00 | Skills Practice: Practical follow up and technique practice/tool use
Guided Independent Study | Assessment preparation and completion | 44 | 0:30 | 22:00 | Revision for exam
Guided Independent Study | Assessment preparation and completion | 1 | 2:00 | 2:00 | Examination
Guided Independent Study | Assessment preparation and completion | 24 | 1:00 | 24:00 | Lecture follow-up
Scheduled Learning And Teaching Activities | Lecture | 24 | 2:00 | 48:00 | Lectures
Scheduled Learning And Teaching Activities | Practical | 24 | 1:00 | 24:00 | Computer classroom
Guided Independent Study | Independent study | 56 | 1:00 | 56:00 | Background reading

Teaching Rationale And Relationship

Techniques and theory are presented in lectures. Classroom-based practical sessions provide experience of designing and building database applications.

This is a very practical subject, so it is important that the learning materials are supported by hands-on opportunities, both in the practical classes and in the related Programming Portfolio modules.

Assessment Methods

The format of resits will be determined by the Board of Examiners

Description | Length | Semester | When Set | Percentage | Comment
Written Examination | 120 | 2 | A | 100 | N/A

Assessment Rationale And Relationship

In the written examination, the questions will assess fundamental knowledge and understanding of the theory and application of database design and usage.

Practical exercises are set during lab practical classes.

Reading Lists