
Streaming Data Modelling Research

Extracting value from the data firehose.

Data scientists are becoming increasingly interested in modelling and analysing live data streams. Often, we want to analyse them in (near) real time.

Our work in this theme of Newcastle Data considers:

  • streaming data engineering
  • online algorithms for the analysis of streaming data
  • impactful applications of streaming data modelling research

From lakes to streams

Traditional data science relies on the analysis of complete data sets, at rest in data lakes. But as we instrument and measure more, this model breaks down. No data set is ever complete, and so attention focuses more on making sense of data as it flows and accumulates.

The shift towards managing streamed data requires fundamental changes to how we approach both data engineering and statistical modelling.

Recent years have seen significant developments in both hardware and software architectures tailored to streaming data. There are also various software libraries that purport to simplify deployment in production. Similarly, a range of online statistical modelling and machine learning approaches are being developed that can process and make inferences from streaming data in (near) real time.
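
As a small illustration of what "online" means here, the sketch below keeps a running mean and variance that are updated one observation at a time, without ever storing the full data set. It uses Welford's algorithm, chosen only as a well-known example and not as a method specific to this theme. The names `RunningStats` and `simulated_stream`, and the readings themselves, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    n: int = 0          # number of observations seen so far
    mean: float = 0.0   # current running mean
    m2: float = 0.0     # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        """Fold one new observation into the summary in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean and from the updated mean.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; defined as 0.0 until two observations have arrived."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def simulated_stream():
    """Hypothetical stand-in for a live feed (e.g. sensor readings)."""
    yield from [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


stats = RunningStats()
for reading in simulated_stream():
    stats.update(reading)  # an up-to-date summary is available after every reading

print(stats.n, stats.mean, stats.variance)
```

Because each update touches only three numbers, the same pattern applies whether the stream delivers eight readings or eight billion, which is the property that separates online methods from batch analysis of data at rest.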

Streaming data