Disaster Recovery in Telecommunications Systems

Following a successful collaboration with Hewlett Packard Laboratories Bristol on the application of fail-silent processes, this second external research project (ERP) concentrates on mechanisms for providing disaster recovery in telecommunications systems. Fault-tolerant systems are usually constructed from a collection of fault-tolerant units. A single unit may comprise a number of replicas and can mask internal failures. Such an internal failure is not deemed a disaster, since its effects are never felt outside the unit. Our current work addresses system disasters. We define a disaster as a situation in which a unit fails and the recovery information it requires is not available.
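The distinction drawn above between a masked internal failure and a disaster can be illustrated with a minimal sketch. It is not taken from the project itself: the names Unit, Outcome and classify are hypothetical, and the sketch assumes, purely for illustration, that a unit fails only when all of its replicas are down.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Outcome(Enum):
    HEALTHY = "healthy"
    MASKED = "internal failure masked by the unit"
    RECOVERABLE = "unit failed, recovery information available"
    DISASTER = "unit failed, recovery information unavailable"


@dataclass
class Unit:
    """Hypothetical fault-tolerant unit built from replicas."""
    name: str
    replicas_up: List[bool]               # True for each replica still running
    recovery_info_available: bool = True  # e.g. a reachable checkpoint or log

    def classify(self) -> Outcome:
        # Assumption: the unit as a whole fails only when no replica survives.
        if all(self.replicas_up):
            return Outcome.HEALTHY
        if any(self.replicas_up):
            # Surviving replicas hide the failure from the rest of the system.
            return Outcome.MASKED
        if self.recovery_info_available:
            return Outcome.RECOVERABLE
        # The unit has failed and the information it needs to recover is gone.
        return Outcome.DISASTER


if __name__ == "__main__":
    units = [
        Unit("switch-a", [True, False, True]),
        Unit("switch-b", [False, False], recovery_info_available=True),
        Unit("switch-c", [False, False], recovery_info_available=False),
    ]
    for unit in units:
        print(unit.name, "->", unit.classify().value)
```

Under these assumptions only the third unit counts as a disaster: the first masks its failed replica, and the second has failed but can still be recovered.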