Centre for Health and Bioinformatics


Dr Ian Wilson



Research Interests

Development of Statistical Methods for Population Genomic Data

My main research focuses on building models for statistical inference of the processes underlying genetic variation in large datasets of closely linked genetic markers. Publicly accessible datasets are now available that give detailed pictures of haplotype diversity from a sample of human populations. Genomic data of this form is likely to become increasingly important in studying the genetics of disease, and using it successfully requires new statistical techniques. My main interest is in how we use known human evolutionary history to inform genetic studies of common human diseases, such as type 1 and type 2 diabetes, hypertension and coronary artery disease.

I am developing models that describe genomic data and can be used to estimate genetic parameters from subdivided populations and, more importantly, to sample from the distribution of an unseen variant conditional on the observed genomic variation. Many population genomic problems can be described under this prediction-with-subdivision framework, including problems of immediate interest such as fine-scale mapping for case-control studies of genetic variation and the search for loci that have undergone different selective regimes in sampled subpopulations.

Recent work has focussed on methods for inferring human evolutionary history from worldwide DNA samples. Projects have concentrated on the X and Y chromosomes and mitochondrial DNA. I am the author of BATWING, which has been used extensively in inferring human history using the Y chromosome.

The questions that reflect my research interests are:

  • How can we make best use of large databases of genetic variation?
  • How do we use known human history to
  • How can we best make use of complete genetic information?
  • What approximations can be made so that calculations are feasible for thousands of SNPs at the same time?
  • What evidence do we have for the genetic basis of difference between populations?
  • What proportion of selected loci could one hope to find?
  • How widespread are functional genetic variants between populations?