Bayesian inference for stochastic kinetic genetic regulatory networks

BBSRC/EPSRC Bioinformatics initiative
Grant #: BIO14454, March 2001-March 2004

Investigators

Outline

Biology: The project aims to improve understanding of the stochastic kinetics of the genetic regulatory mechanisms that control transcription in eukaryotic cells, by developing software and algorithms to aid investigation. There is strong evidence for intrinsic stochastic variation in gene expression, which can have a variety of effects on cell fate and function. Stochastic models will be developed for eukaryotic gene regulation which capture the key feedback mechanisms for expression.

Statistics: The project will develop computer-intensive algorithms for Bayesian inference for the parameters and structure of the highly complex continuous-time Markov processes used to model the biochemical networks which regulate protein synthesis in eukaryotic cells.

Objectives

Stochastic models will be developed for eukaryotic gene regulation which capture the key feedback mechanisms for expression. Computer-intensive statistical methods will be developed so that inferences can be made from experimental data for both model parameters and model structure. Freely available software will be produced, both for the modelling and simulation of eukaryotic genetic regulatory circuits and for inferring parameters of such networks based on real-time imaging data.

Publications

Several papers have resulted from the research carried out as part of this project. They are listed here, and links to PDF versions will be maintained until the papers appear in print.

The following work was not part of the BBSRC-funded project, but the result of a "spin-off" project sponsored by an EPSRC studentship, which is examining, inter alia, the use of diffusion approximations for inferential purposes.

Models

This project was more concerned with demonstrating that inference for stochastic kinetic models from (discrete) time course data is possible than with any particular real regulatory network. Here we provide links to some very simple example models which can be used in conjunction with the simulation and inference software we provide below. The model format used is a subset of SBML Level 1. It is assumed that all rate laws are stochastic and that all units are self-consistent. The units, compartments and rules sections of the SBML document are not processed, i.e. there is only one compartment and rules are not supported. See the example models for further details. All of the models and software are free in the sense of the GNU General Public License.

  • models.tgz - models in a gzipped tar file. It can be unpacked on a Linux system with a command like tar xvfz models.tgz. See the enclosed README.txt for further details.

Software

Again, the inference software that we provide is intended more as a proof of concept than as a practical application for applied bioinformatics researchers. All of the models and software are free in the sense of the GNU General Public License. At present we provide two packages.

  • gillespie is a simple SBML Level 1 simulator designed to simulate exact realisations from models defined in the above manner. This is useful for simulating test data for the inference software. This too is packaged as a gzipped tar file. See the enclosed README.txt for information on compilation and execution.
  • stochInf is the inference program, which accepts models, as described above, together with data (in the format produced by gillespie), in order to carry out parameter inference. The parameter values in the SBML model file are used only as starting values for the MCMC scheme.
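To illustrate what simulating "exact realisations" of a stochastic kinetic model involves, here is a minimal Python sketch of Gillespie's direct method. It is not the source of the gillespie package and does not read SBML. The function name gillespie_direct, the way reactions are represented, and the dimerisation model with its rate constants are all hypothetical, chosen only for illustration.

```python
import random


def gillespie_direct(x0, reactions, t_max, seed=None):
    """Simulate one exact realisation with Gillespie's direct method.

    x0        -- dict mapping species name to initial molecule count
    reactions -- list of (hazard, change) pairs: hazard(state) returns the
                 stochastic rate of the reaction, and change maps species
                 name to the integer change in count when the reaction fires
    t_max     -- time at which to stop simulating
    Returns a list of (time, state) pairs: the initial state, followed by
    the state after each reaction event.
    """
    rng = random.Random(seed)
    t, state = 0.0, dict(x0)
    path = [(t, dict(state))]
    while True:
        hazards = [hazard(state) for hazard, _ in reactions]
        total = sum(hazards)
        if total <= 0.0:  # no reaction can fire any more
            break
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_max:
            break
        # Pick a reaction with probability proportional to its hazard.
        u, acc = rng.random() * total, 0.0
        for hz, (_, change) in zip(hazards, reactions):
            acc += hz
            if u < acc:
                break
        for species, delta in change.items():
            state[species] += delta
        path.append((t, dict(state)))
    return path


# Hypothetical dimerisation model (assumed for illustration only):
# 2P -> P2 with rate constant c1, and P2 -> 2P with rate constant c2.
c1, c2 = 0.005, 0.1
reactions = [
    (lambda s: c1 * s["P"] * (s["P"] - 1) / 2, {"P": -2, "P2": +1}),
    (lambda s: c2 * s["P2"], {"P": +2, "P2": -1}),
]
path = gillespie_direct({"P": 100, "P2": 0}, reactions, t_max=10.0, seed=1)
print(len(path) - 1, "reaction events; final state:", path[-1][1])
```

In this sketch the state is recorded at every reaction event. Data on a discrete time grid, of the kind the inference problem is concerned with, could be obtained by reading off the most recent state at each observation time.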
Links