
Why is PHASE different from other phylogenetic programs?

This package is designed specifically for use with RNA sequences that have a conserved secondary structure, e.g., rRNA and tRNA. It is well known that compensatory substitutions occur in the paired regions of RNA secondary structures: a substitution on one side of a pair tends to be accompanied by a substitution on the other side that restores the pairing. Most phylogenetic programs assume that each site in a molecule evolves independently of the others, but this assumption is not valid for the paired regions of RNA genes.

Substitution models that consider pairs of sites rather than single sites are implemented in this package, along with the standard nucleotide substitution models in common use. When an RNA molecule with a secondary structure is analysed with an RNA substitution model, PHASE requires a structure-based alignment of the sequences, with the consensus secondary structure indicated in bracket-and-dot notation at the top of the alignment. We assume that you can provide this structure.
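
To illustrate what a bracket-and-dot structure line encodes, the following minimal Python sketch splits alignment columns into paired and unpaired sites. It is not part of PHASE: the function name `parse_dot_bracket`, the toy alignment, and the restriction to the symbols "(", ")" and "." are assumptions made for this example only.

```python
def parse_dot_bracket(structure):
    """Return (pairs, unpaired) for a dot-bracket string, using 0-based columns.

    Assumes only '(', ')' and '.' occur (an assumption of this sketch).
    """
    stack = []      # columns of '(' still waiting for a partner
    pairs = []      # (left_column, right_column) tuples
    unpaired = []   # columns marked '.'
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at column {i}")
            pairs.append((stack.pop(), i))
        elif ch == ".":
            unpaired.append(i)
        else:
            raise ValueError(f"unexpected symbol {ch!r} at column {i}")
    if stack:
        raise ValueError(f"unmatched '(' at column {stack[-1]}")
    return sorted(pairs), unpaired


# Hypothetical toy alignment, invented for illustration.
structure = "((..))."
alignment = {"seq1": "GCAAGCA", "seq2": "GUAAACU"}

pairs, unpaired = parse_dot_bracket(structure)
for name, seq in alignment.items():
    paired_states = [seq[i] + seq[j] for i, j in pairs]
    loop_states = [seq[i] for i in unpaired]
    print(name, paired_states, loop_states)
```

In this toy data the pair at columns 1 and 4 reads CG in one sequence and UA in the other, the kind of compensatory change that a paired-site model treats as a change of a single pair state rather than as two independent substitutions.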

It is now commonplace to perform combined analyses of heterogeneous sequence data, in which nucleotides with different patterns of evolution are sequenced for the set of species under study. PHASE can use several substitution models simultaneously (for paired and/or unpaired sites), for example when analysing protein-coding genes or when both the stems and the loops of RNA genes are used.

PHASE provides a Markov chain Monte Carlo (MCMC) sampler that generates large numbers of possible phylogenetic trees with probability proportional to their likelihood. This is a Bayesian statistical method that allows posterior probabilities to be estimated for alternative trees and alternative clades. These posterior probabilities provide a sound statistical measure of support for alternative phylogenetic hypotheses, and they remove the need for bootstrapping. Where many alternative arrangements of a given set of species exist, posterior probabilities for all of these arrangements can be calculated in a convenient way.
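
As a rough illustration of how clade support is read off an MCMC sample, the sketch below estimates the posterior probability of a clade as the fraction of sampled trees that contain it. This is not PHASE's own code or output format: the function `clade_posteriors`, the representation of a tree as a set of clades, and the four sampled trees are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def clade_posteriors(sampled_trees):
    """Estimate clade posterior probabilities from MCMC tree samples.

    Each sampled tree is given as a set of clades, and each clade as a
    frozenset of taxon names (a representation assumed for this sketch).
    """
    counts = Counter()
    for clades in sampled_trees:
        counts.update(set(clades))   # count each clade once per tree
    n = len(sampled_trees)
    return {clade: c / n for clade, c in counts.items()}


# Hypothetical samples for four taxa, invented for illustration.
tree_a = {frozenset({"human", "chimp"}), frozenset({"mouse", "rat"})}
tree_b = {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "mouse"})}
tree_c = {frozenset({"mouse", "rat"}), frozenset({"mouse", "rat", "human"})}
samples = [tree_a, tree_a, tree_b, tree_c]

posteriors = clade_posteriors(samples)
for clade, p in sorted(posteriors.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(clade), p)
```

The same counting applies to whole trees or to alternative arrangements of a subset of species; only the object being tallied across the sample changes.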

Standard Maximum Likelihood techniques for inferring the optimal tree with any of the DNA or RNA evolution models are also implemented.

The program's features include:

* Bayesian estimation of phylogenies and substitution model parameters
* standard ML search algorithms for inferring the optimal tree with optional topology constraints
* 6-, 7- and 16-state RNA models
* standard 4-state DNA models
* invariant-sites and discrete gamma models for substitution rate heterogeneity among sites
* mixing of molecular data types in a single analysis
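
The state counts of the RNA models in the list above can be made concrete with a short sketch. It assumes the convention commonly used in the RNA substitution model literature: 16-state models track every ordered pair of bases, 6-state models track only the six frequently observed pairs (AU, GU, GC, UA, UG, CG), and 7-state models add one lumped mismatch state. The function `encode_pair`, the label "MM", and the choice to return `None` for a mismatch under a 6-state model are assumptions of this example, not PHASE behaviour.

```python
from itertools import product

BASES = "ACGU"
# All 16 ordered base pairs: AA, AC, ..., UU.
ALL_PAIRS = ["".join(p) for p in product(BASES, repeat=2)]
# The six Watson-Crick and GU wobble pairs.
CANONICAL = ["AU", "GU", "GC", "UA", "UG", "CG"]


def encode_pair(left, right, n_states):
    """Map a pair of bases to a state label for a 6-, 7- or 16-state model."""
    if n_states not in (6, 7, 16):
        raise ValueError("n_states must be 6, 7 or 16")
    pair = left + right
    if pair not in ALL_PAIRS:
        raise ValueError(f"not a pair of RNA bases: {pair!r}")
    if n_states == 16 or pair in CANONICAL:
        return pair
    if n_states == 7:
        return "MM"   # single lumped mismatch state (label chosen here)
    return None       # 6-state: mismatch has no state in this sketch


for n in (6, 7, 16):
    print(n, [encode_pair(a, b, n) for a, b in ("GC", "GU", "AA")])
```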


Journal publications:

* C. Hudelot, V. Gowri-Shankar, H. Jow, M. Rattray and P. Higgs. "RNA-based Phylogenetic Methods: Application to Mammalian Mitochondrial RNA Sequences". Molecular Phylogenetics and Evolution (in press, 2003).
* H. Jow, C. Hudelot, M. Rattray and P. Higgs. "Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution". Molecular Biology and Evolution, 19(9):1591-1601 (2002).

Gowri-Shankar Vivek 2003-04-24