Professor Paolo Missier

Professor of Big Data Analytics


I am a Professor in Scalable Data Analytics with the School of Computing at Newcastle University and was a Fellow (2018-2021) of the Alan Turing Institute, the UK's National Institute for Data Science and Artificial Intelligence.

I joined academia in 2011, after a prior career as a Research Scientist at Bell Communications Research, USA (1994-2001), and as a Research Fellow at the University of Manchester, School of Computer Science (2004-2011), where I got my PhD in 2008. My thesis on Data and Information Quality in e-science received the ICIQ (International Conference on Information Quality) 2009 Ballou & Pazer DQ/IQ Best Dissertation Award and the Best Dissertation Award from the School of Computer Science, University of Manchester, and came second for the 2008 British Computer Society Distinguished Dissertation Award.

I was also involved in the specification of the W3C PROV data model for provenance (2011-2013), which follows the Open Provenance Model [15], and I contributed to the main recommendation documents [12,14].

- Google Scholar
- BibBase
- ResearchGate

I lead our School’s post-graduate academic teaching on Big Data Analytics (see teaching).
I am Senior Associate Editor for the ACM Journal on Data and Information Quality (JDIQ).


Note: References in this text are linked through and also available as a list from

Current research focus: 

Data Science and Engineering for Health [16,17]:
- Methods for predicting and preventing age-related diseases through Machine Learning;
- Personalised disease trajectories;
- Discovery of digital biomarkers from self-monitoring devices (wearables).

Other research interests, current and past:
  • Provenance of data and processes [1,2,3,4,5,6]
  • Optimisation of algorithmic fairness [7]
  • ReComp: preserving value from large-scale data analytics over time through selective re-computation [invited talk] [8,9,3,5]
  • Social media analytics (Twitter) to help health authorities combat Zika and Dengue epidemics [10]
  • Enabling trust-less and fair marketplaces for data streams using blockchain technology [11,12]
  • Implementing efficient and cost-effective genomics data processing pipelines using workflow technology on the Cloud (funded project: Cloud-eGenome, 2013-2015, PI, MRC/NIHR) [13]
  • Data and Information Quality. During my PhD, I proposed the notion of Quality Views, a semantics-based method for semi-automatically adding data quality control to scientific workflows [14,15]


Post-graduate (MSc) teaching: Big Data Analytics (CSC8101)