Lecturer
Faculty of Life Sciences, The University of Manchester
Oxford Road, Manchester, M13 9PT, UK
Tel: +44-(0)161-3068901
e-mail: simon.whelan@manchester.ac.uk
Research interests
The ultimate goal of my research is to learn how the selective and
mutational pressures acting on molecular sequences relate to their
function, and how the linear information held in the hereditary material
translates to the complex organismal phenotypes observed in the natural
world. It is difficult to understand the selective and mutational
pressures acting on a sequence when observing it in only a single species.
Studying the patterns of change between multiple species can be highly
informative, and much of my research consists of developing new
statistical methods for making these comparisons.
Statistical models of evolution
The rates at which different characters (e.g. nucleotides or amino acids)
replace each other in biological sequences are described using Markov models. Whilst
models are not expected to explain the full complexities of sequence
evolution, making them more reflective of reality will improve tree
estimates and provide considerable insights into how molecules evolve. My
research has improved models describing protein evolution, and
demonstrated that large-scale mutations affecting multiple nucleotides are
more frequent than previously thought.
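
As a minimal illustration of what a Markov model of substitution looks like, the sketch below uses the Jukes-Cantor (JC69) model, the simplest standard nucleotide model, in which every nucleotide is replaced by every other at the same rate. It is chosen only for illustration and is not one of the models developed in this research; the function name `jc69_transition_matrix` is hypothetical. Branch length is assumed to be measured in expected substitutions per site.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_transition_matrix(branch_length):
    """Return JC69 transition probabilities P[a][b] for a given branch length.

    branch_length is assumed to be in expected substitutions per site.
    Under JC69 all substitutions occur at the same rate, so only two
    distinct probabilities exist: staying the same or changing to one
    particular other nucleotide.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return {
        a: {b: (p_same if a == b else p_diff) for b in NUCLEOTIDES}
        for a in NUCLEOTIDES
    }


if __name__ == "__main__":
    P = jc69_transition_matrix(0.1)
    # Probability that an A is still an A after a branch of length 0.1
    print(P["A"]["A"])
    # Each row is a probability distribution over the four nucleotides
    print(sum(P["A"].values()))
```

More realistic models relax the equal-rates assumption, for example by allowing unequal character frequencies or different rates for different kinds of change.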
Phylogenetic tree reconstruction
Using patterns of change to infer the evolutionary relationships between
sequences is a common task in comparative sequence analysis. The maximum
likelihood of a tree indicates how well it describes the evolution of the
sequences. I am currently developing new algorithms that are highly
effective at finding trees with high likelihood. Results suggest that, when
implemented in the program Leaphy, these algorithms perform favourably
relative to other existing programs.
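
To make the idea of a tree's likelihood concrete, the following sketch computes the log-likelihood of a small alignment on a fixed tree using the standard pruning recursion under the Jukes-Cantor (JC69) model. It only scores one given tree with given branch lengths; it does not search over trees and is not the algorithm used in Leaphy. The tree representation (a leaf is a sequence name, an internal node is a list of `(child, branch_length)` pairs) and all function names are assumptions made for this example.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_prob(a, b, t):
    """JC69 probability of observing b after time t, starting from a."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay


def partial_likelihoods(node, site):
    """Likelihood of the data below `node`, conditional on each state at `node`.

    A leaf is a sequence name (str); an internal node is a list of
    (child, branch_length) pairs. `site` maps sequence names to the
    nucleotide observed at one alignment column.
    """
    if isinstance(node, str):
        observed = site[node]
        return {x: 1.0 if x == observed else 0.0 for x in NUCLEOTIDES}
    result = {x: 1.0 for x in NUCLEOTIDES}
    for child, t in node:
        child_partial = partial_likelihoods(child, site)
        for x in NUCLEOTIDES:
            result[x] *= sum(
                jc69_prob(x, y, t) * child_partial[y] for y in NUCLEOTIDES
            )
    return result


def site_likelihood(tree, site):
    """Likelihood of one column, averaging over root states (frequency 1/4 each)."""
    root = partial_likelihoods(tree, site)
    return sum(0.25 * root[x] for x in NUCLEOTIDES)


def log_likelihood(tree, alignment):
    """Sum of per-site log-likelihoods, assuming sites evolve independently."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += math.log(site_likelihood(tree, site))
    return total


if __name__ == "__main__":
    # Toy data and branch lengths, invented purely for illustration
    alignment = {"s1": "ACGT", "s2": "ACGA", "s3": "ATGA"}
    tree = [("s1", 0.1), ([("s2", 0.1), ("s3", 0.1)], 0.05)]
    print(log_likelihood(tree, alignment))
```

A tree search would call a scoring function of this kind repeatedly on candidate trees, optimising branch lengths for each and keeping the tree with the highest likelihood.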
Comparative genomics
The aim of comparative genomics is to infer biologically interesting
information, usually in the form of
functional annotation, from homologous genomic regions. The application of
statistical models to this problem has been particularly successful, and
they are now frequently used to identify conserved regions of genomes. My
research focuses on developing new models that may lead to the
identification of new elements and/or the grouping of existing elements by
their evolutionary properties.
PANDIT
An objective comparison of methods in molecular evolution requires
assessing them over large amounts of data. PANDIT is a database of
homologous sequence alignments accompanied by estimates of their
corresponding phylogenetic trees. Currently in version 17.0, it comprises
7738 families of homologous protein domains; for each family, multiple
alignments of the DNA and corresponding amino acid sequences are available,
together with high-quality phylogenetic tree estimates.