Simon Whelan

Lecturer
Faculty of Life Sciences,
The University of Manchester
Oxford Road, Manchester, M13 9PT, UK

Tel: +44-(0)161-3068901
e-mail: simon.whelan@manchester.ac.uk

Research interests

The ultimate goal of my research is to learn how the selective and mutational pressures acting on molecular sequences relate to their function, and how the linear information held in the hereditary material translates to the complex organismal phenotypes observed in the natural world.
It is difficult to understand the selective and mutational pressures acting on a sequence when it is observed in only a single species. Studying the patterns of change among multiple species can be highly informative, and much of my research consists of developing new statistical methods for making these comparisons.

Statistical models of evolution
The rates at which different characters (e.g. nucleotides or amino acids) replace one another in biological sequences are described using Markov models. Whilst models are not expected to explain the full complexities of sequence evolution, making them more reflective of reality will improve tree estimates and provide considerable insights into how molecules evolve. My research has improved models describing protein evolution, and demonstrated that large-scale mutations affecting multiple nucleotides are more frequent than previously thought.
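
As a concrete illustration of what such a Markov model looks like, the sketch below computes substitution probabilities under the Jukes-Cantor (JC69) model, the simplest standard nucleotide model, in which every nucleotide replaces every other at the same rate. JC69 is chosen here only for brevity and is not one of the models developed in my research; the function name jc69_transition_matrix is illustrative.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_transition_matrix(branch_length):
    """Return the JC69 substitution probabilities P(x -> y) as a dict of dicts.

    branch_length is the expected number of substitutions per site
    separating the two sequences.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay  # nucleotide is unchanged
    p_diff = 0.25 - 0.25 * decay  # nucleotide changes to one specific other
    return {
        x: {y: (p_same if x == y else p_diff) for y in NUCLEOTIDES}
        for x in NUCLEOTIDES
    }


if __name__ == "__main__":
    matrix = jc69_transition_matrix(0.1)
    for x in NUCLEOTIDES:
        print(x, " ".join(f"{matrix[x][y]:.4f}" for y in NUCLEOTIDES))
```

Each row of the matrix sums to one, and as the branch length grows every entry approaches 0.25, reflecting the loss of information about the ancestral state. More realistic models relax the equal-rates assumption.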

Phylogenetic tree reconstruction
Using patterns of change to infer the evolutionary relationships between sequences is a common task in comparative sequence analysis. The maximum likelihood of a tree indicates how well it describes the evolution of the sequences. I am currently developing new algorithms that are highly effective at finding trees with high likelihood. Results suggest that, when implemented in the program Leaphy, these algorithms compare favourably with existing programs.
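
To make the notion of a tree's likelihood concrete, the sketch below evaluates the log-likelihood of a fixed tree with fixed branch lengths using Felsenstein's pruning algorithm. It assumes the simple JC69 nucleotide model and an illustrative tree representation (a leaf is a taxon name, an internal node is a list of (child, branch length) pairs). It does not reproduce the search algorithms in Leaphy, and a full maximum likelihood analysis would also optimise the branch lengths and search over tree topologies.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_prob(x, y, t):
    """JC69 probability that state x is replaced by state y along a branch of length t."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


def partial_likelihoods(node, column):
    """Return {state: P(data observed below node | node is in that state)}."""
    if isinstance(node, str):  # leaf: node is a taxon name
        return {s: 1.0 if s == column[node] else 0.0 for s in NUCLEOTIDES}
    partial = {s: 1.0 for s in NUCLEOTIDES}
    for child, t in node:  # internal node: list of (child, branch length)
        below = partial_likelihoods(child, column)
        for s in NUCLEOTIDES:
            partial[s] *= sum(jc69_prob(s, c, t) * below[c] for c in NUCLEOTIDES)
    return partial


def log_likelihood(tree, alignment):
    """Sum the per-site log-likelihoods; alignment maps taxon name to sequence."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        root = partial_likelihoods(tree, column)
        # JC69 has uniform equilibrium frequencies (0.25 per nucleotide).
        total += math.log(sum(0.25 * root[s] for s in NUCLEOTIDES))
    return total


if __name__ == "__main__":
    # Hypothetical toy data: ((human:0.1, chimp:0.1):0.05, mouse:0.3)
    tree = [([("human", 0.1), ("chimp", 0.1)], 0.05), ("mouse", 0.3)]
    alignment = {"human": "ACGTAC", "chimp": "ACGTAT", "mouse": "ACATGT"}
    print(log_likelihood(tree, alignment))
```

Comparing this value across candidate trees, after optimising each tree's branch lengths, is what allows one tree to be preferred over another.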

Comparative genomics
The aim of comparative genomics is to infer biologically interesting information, usually in the form of functional annotation, from homologous genomic regions. The application of statistical models to this problem has been particularly successful, and they are now frequently used to identify conserved regions of genomes. My research focuses on developing new models that may lead to the identification of new elements and/or the grouping of existing elements by their evolutionary properties.

PANDIT
An objective comparison of methodology in molecular evolution requires its assessment over large amounts of data. PANDIT is a database of homologous sequence alignments accompanied by estimates of their corresponding phylogenetic trees. Currently in version 17.0, it comprises 7738 families of homologous protein domains; for each family, multiple alignments of the DNA and corresponding amino acid sequences are available, together with high-quality phylogenetic tree estimates.
 

Last modified: 22 May, 2007