Dependability

The Dependability group is a world leader in dependability research with a long and successful record of developing and applying fundamental ideas in the area (including highly influential work on recovery blocks, conversations, coordinated atomic actions and defining the dependability taxonomy).

The group investigates fundamental concepts, development techniques, models, architectures and mechanisms that directly contribute to creating modern information systems, networks and infrastructures that are dependable and secure in all aspects. Such systems need to be resilient to malicious attacks and accidental faults to deliver services that can be justifiably trusted by their stakeholders.

Group research combines a rigorous dependability-explicit approach to requirements and design with a coherent framework for system operation so that fault and attack tolerance is available at the component, system, systems of systems, network and socio-environment levels (the interdisciplinary dimension was to the forefront of our leadership of the DIRC Interdisciplinary Research Collaboration and is continuing in our TrAmS-2 Platform grant on Trustworthy Ambient Systems: Resource-constrained Ambience).

The group targets challenges to the dependability and security of next generation information systems and critical infrastructures. It is involved in substantial flagship research projects sponsored by EU and EPSRC/UK (DEPLOY, TrAmS-2, DESTECS) and has a strong record of collaboration with major UK and European companies.

The group works closely with the Centre for Software Reliability (CSR), an established University Research Centre. CSR also delivers a programme of research and technology transfer, primarily (but not exclusively) focussed on industry. A major component of this activity is the national UK Safety-Critical Systems Club, which is managed by CSR.

The group focuses on the following areas:

  • Formal methods for dependability
  • Design and evaluation of secure systems
  • Resilience
  • Trustworthy ambient systems
  • Dependable architectures.

The group members have extensive experience in developing and applying industrial-strength modelling tools, including RODIN (Event-B), Atelier B (B), Overture (VDM) and FDR (CSP), for specification and verification of complex computer-based systems.

Formal methods for dependability

The fact that today, computers are part of nearly all complex systems means that the software in those computers must be designed to tolerate failures in the physical (and human) components of the overall system. These overall systems are of such complexity that formal reasoning is essential to show that the system will behave in a predictable way even when, say a connection is lost or a driver crosses a red signal. We by no means claim that formalism solves the problems of dependability: one needs suitable architectures, careful analysis of potential sources of component failure. But we see formal reasoning as essential if the software is to help rather than reduce overall system dependability.

Our group has significant strength in formal methods and is exploring new avenues in formal reasoning tools. We led the EU RODIN STREP project that developed tools for the Event-B method; these were so successful that we are now leading a major follow-on EU DEPLOY Integrated Project that is deploying the Rodin tools in major European companies. This interaction with industrial users is, as expected, throwing up new research challenges that are pursued within and beyond the project.

The EU DESTECS project has just started and will extend the investigation to interface with tools that handle control systems. The Overture project is developing open-source tools for the VDM method. We have just been awarded an EPSRC project (joint with Edinburgh) to decrease the load on engineers needing to prove theorems by deploying AI techniques (AI4FM).

A further area, being explored in cooperation with members of the Concurrent Asynchronous Systems Group, is that of structured occurrence nets, a new formalism for recording information about the activity of an intended or actual complex evolving system, and for analyzing complex fault-error-failure chains in such systems.

Design and evaluation of secure systems

Computer security is a major concern in the digital economy due to the prevalence of online systems. Our research is centred on the vitally important and intellectually challenging task of designing robust system security solutions that will prove effective in practical use (in particular by being truly usable as well as technically sound). This mainly involves three lines of work:

  1. the design of various tools,
  2. the investigation of methodologies and
  3. the invention of novel solutions.

Our work covers a wide spectrum in security, from theory to practice. We contribute to the fundamentals of computer security, by designing new cryptographic systems. The well-known Pret a Voter voting system was originated from the DIRC project that was led by our group. We also designed the J-PAKE protocol, which is a fundamental building block for many secure communication systems and has been widely deployed in practice. Our work on breaking CAPTCHAs has also had a high impact in security industry.

While our recent work is more oriented to systems security, our research also covers applied cryptography and cybercrime. We are heavily involved in the work of the Centre for Cybercrime and Computer Security, which carries out research and provides education to make the Internet safer for families, businesses and organisations.

Our recent research activities in this area are mainly funded by Microsoft Research and the Royal Academy of Engineering.

Resilience

Even well-designed computer systems can be at the mercy of malicious attacks and accidental faults in their (networked) environment, and yet businesses and communities expect systems to be resilient, ie to return to a predetermined level of service after an attack or fault. We work on methods and tools needed to design systems in such a way that they can be relied on to be resilient. These include techniques for modelling faults and failures in networked systems and for analysing their consequences, such as their effects on information flow, as well as designing policies to manage recovery.

Modern networked systems can, in principle, respond to threats dynamically. Our research on formal modelling of virtual organisations (work with Dstl and industry) has shown how formal modelling can help in analysing policies that should govern such organisations.

We have initiated a series of workshops on formal aspects of virtual organisations and are working with other institutes to establish the ARCoNet, a national research network on agile resilient communities.

We have been substantially contributing to the International ERCIM Working Group on Software Engineering for Resilient Systems (SERENE).

Together with our UK and EU collaborators, we are working on the run-time use of metadata to govern reconfiguration in response to threats, aiming to use elements of self-adaptive and autonomic computing in order to achieve resilience.

Trustworthy ambient systems

Because they cannot be designed as a coherent whole, ambient systems pose huge dependability challenges. Their mobility means that they are open to new forms of malicious interference as well as accidental failure modes that are difficult to predict at design time. Their decentralised character and separate ownership of components result in a lack of control over evolution and upgrades, making recovery a challenge.

The group is investigating the use of formal features for defining, modelling and verifying fault tolerance properties which are required of ambient systems before, during and after the evolution/upgrade stages. Our TrAmS-2 Platform grant is directly active in this arena.

We are working on ensuring predictability of fault tolerance during the evolution of an ambient system. In our recent work on the CAMA (Context-Aware Mobile Agents) framework, we have developed abstractions for programming trustworthy ambient systems, providing a formal analytic capability.

Our research on multi-agent systems focuses on developing patterns that would support rigorous development of large-scale open fault-tolerant applications in which components are anonymous, mobile and asynchronous. The semantics of reconfiguration policies are investigated to support run-time reconfiguration ensuring ambient system trustworthiness.

Dependable architectures

The group strives to develop architectural modelling and reasoning technologies that would allow designers to introduce dependability mechanisms, such as fault-tolerance, and evaluate their effectiveness earlier in the development cycle. Our recent focus has been on investigating advanced, architectural solutions for ensuring the dependability of service-oriented systems, web services, system biology and cloud-based applications.

A number of fault tolerance architectures have been developed for specific domains (such as e-science, system biology and space sectors) to support explicit reasoning about the resources required for dealing with faults and errors.

Recent research commissioned by the UK government and industry has been exploring a formal description of interface contracts at the boundaries of components in large-scale systems of systems.

The group conducts substantial experimental work on assessing dependability of service and cloud-based architectures run over the Internet.

Developing blueprints and patterns for systematic and re-usable integration of fault tolerance at the architectural level is a prominent area of research in our group.

In 2002 we initiated a successful series of Workshops on Architecting Dependable Systems at the major dependability and software engineering conferences, which resulted in six LNCS volumes, collecting the best work in the field.

Further information: