EPSRC Centre for Doctoral Training in Cloud Computing for Big Data


Professor Paul Watson

Director of the Digital Institute, Professor of Computer Science


Paul Watson FREng FBCS CEng PhD is Director of the UK's National Innovation Centre for Data, Professor of Computer Science at Newcastle University, and a Fellow of the Alan Turing Institute. He also directs the EPSRC Centre for Doctoral Training in Cloud Computing for Big Data. After a BSc and PhD at Manchester University, he began his career there as a lecturer before moving to industry (ICL) to design parallel database servers. In 1995 he joined Newcastle University, where his research and teaching have focussed on scalable data engineering. From 2009 to 2015 he directed the UKRI Social Inclusion through the Digital Economy Hub, which tackled social exclusion by designing digital technologies to help people and communities, including older people and those with disabilities. Professor Watson is a Fellow of the Royal Academy of Engineering, a Fellow of the British Computer Society, and a Chartered Engineer. He received the 2014 Microsoft Jim Gray eScience Award.


Paul Watson's research is in scalable information management. This includes work on:

- Cloud Computing. Methods for building scalable cloud-based applications are explored in a range of projects. Since 2007, this has been focused on e-Science Central, a portable cloud platform for storing, analysing and sharing data. e-Science Central is used to support a wide range of applications and users through the Digital Institute. It is also our main research vehicle for cloud research in areas including provenance, scalability, formal methods and federated clouds. Our work on using federated clouds to meet the security requirements of applications won the Best Paper award at IEEE CloudCom 2011. Our work on clouds builds on what we learnt in the Carmen project, which designed and built a virtual laboratory to support neuroscientists.

- Data-intensive e-science. Interest in e-science is resulting in vast amounts of data being published. We worked on how to publish data through services (e.g. see the Databases and the Grid paper, and Databases in Grid Applications: Distribution and Locality) so that it can be exploited in distributed applications. From this, and earlier work in parallel query processing, we collaborated with Norman Paton's group at Manchester to explore how to integrate data held in distributed database servers, exploiting grid computing to dynamically acquire computational resources as they are needed, e.g. to speed up queries through parallel joins. This led to the release of the widely adopted OGSA-DQP. The legacy of this work continues in many of the e-Research applications we support on e-Science Central.

- Streaming Data Analytics. Sensors are now generating vast quantities of data. Extracting value from them requires new tools and techniques that combine statistics and computing to find and act on important patterns in the data. We work on systems that start with a declarative description of the functional and non-functional requirements, and work out how to map the computation across distributed infrastructure, including healthcare monitors, field gateways and clouds. A paper on some of this work shows how the approach can dramatically extend the battery life of wearables.

- Exploring how advanced technologies can improve the lives of those from vulnerable groups, including older people, disabled people, and marginalised youth. Paul directed the £12M "Social Inclusion through the Digital Economy" (SiDE) project, which aimed to realise the potential of digital technologies to transform the lives of those who are excluded. This drove much of our work on cloud computing, as SiDE made heavy use of sensor-based systems, which generate large amounts of data that must be analysed, often in real time, in order to understand human behaviour. Work started in SiDE on data analytics and sensors is now being driven forward through the National Innovation Centre for Ageing.