However, a recent publication (Sumner et al. 2012) shows that GTR, along with several other commonly used models, has an undesirable mathematical property that may be a cause of concern for the thoughtful phylogeneticist. In mathematical terms, the problem is simple: matrix multiplication of two GTR substitution matrices does not return another GTR matrix. It is the purpose of this article to give examples that demonstrate why this lack of closure may pose a problem for phylogenetic analysis and thus add GTR to the growing list of factors that are known to cause model misspecification in phylogenetics.

The phylogenetics literature is rich with other examples of model misspecification, each with the potential to cause problems for inference. For example, Galtier (2004) considers the effect of nonindependence of sites, whereas Galtier and Gouy (1995) and Jermiin et al. (2004) have considered the effect of ignoring changing base composition across the tree. […] (2000) found instances of covarion evolution where the sites that are free to vary are different in distinct lineages, and Grievink et al. (2010) showed the potential for phylogenetic error in these scenarios. Kolaczkowski and Thornton (2004) looked at the effect on phylogenetic accuracy when different partitions of the data have different branch lengths (heterotachy).

To exhibit that GTR could be causing model misspecification, we proceed as follows. Consider a single molecular sequence where each site evolves under the same, albeit heterogeneous, Markov process (we assume that sites evolve independently and under identical conditions). For time $t = 0$ to $t_1$, the sequence evolves under a time-homogeneous Markov process with substitutions governed by a rate matrix $Q_1$ whose $ij$th entry is the rate at which state $i$ changes to state $j$, so that the corresponding probability substitution matrix is given by $M_1 = e^{Q_1 t_1}$. Then a disruption occurs, and over time $t_2$, the sequence again evolves under a time-homogeneous Markov process but with a different rate matrix $Q_2$ governing the substitution rates, so that the corresponding substitution matrix for this time period is $M_2 = e^{Q_2 t_2}$. This scenario is illustrated in Figure 1.

Figure 1. Can a heterogeneous process with a single disruption be represented as a homogeneous process? The rate matrix $\hat{Q}$ satisfies $e^{\hat{Q}(t_1 + t_2)} = e^{Q_1 t_1} e^{Q_2 t_2}$. If $Q_1$ and $Q_2$ belong to a particular model class, $\bar{Q}$ is the rate matrix in the same model class subject to the condition that $e^{\bar{Q}(t_1 + t_2)}$ be as close to $e^{Q_1 t_1} e^{Q_2 t_2}$ as possible.

As a consequence of the Markov assumption for the overall process, the substitution matrix that describes the probability of substitutions between time $t = 0$ and $t = t_1 + t_2$ can be expressed as the matrix product $\hat{M} = M_1 M_2$. So far, so good, but then two questions naturally arise:

1. Is there a single rate matrix $\hat{Q}$ that can be used to describe the process from $t = 0$ to $t = t_1 + t_2$ as a homogeneous Markov process with $e^{\hat{Q}(t_1 + t_2)} = \hat{M}$?
2. If $Q_1$ and $Q_2$ are in a particular model class and (1) is true, will $\hat{Q}$ necessarily belong to the same class?

The answer to the first question is “usually.” It is possible that, if $Q_1$ and $Q_2$ are very different, then $\hat{Q}$ may not be stochastic, that is, $\hat{Q}$ may have some negative or even complex off-diagonal entries.
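The lack of closure under matrix multiplication can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes two GTR rate matrices that share base frequencies `pi` but differ in their exchangeabilities, and all helper names (`mat_exp`, `gtr_rate_matrix`, `detailed_balance_gap`) and numeric values are made up for illustration. Because `pi` is stationary for both $M_1$ and $M_2$, it is also stationary for the product $M_1 M_2$. Any substitution matrix generated by a GTR rate matrix satisfies detailed balance with respect to its stationary distribution, so a nonzero detailed-balance gap for the product indicates that no GTR rate matrix generates it. The matrix exponential is approximated here by a truncated Taylor series with scaling and squaring, chosen only to keep the sketch dependency-free.

```python
# Illustrative sketch only: names and numbers are assumptions, not from the source.

def mat_mul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, squarings=10, terms=12):
    """Approximate e^{Q t}: Taylor series on Q t / 2^squarings, then repeated squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds a^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def symmetric(a, b, c, d, e, f):
    """Symmetric 4x4 exchangeability matrix from its six off-diagonal values."""
    return [[0, a, b, c],
            [a, 0, d, e],
            [b, d, 0, f],
            [c, e, f, 0]]


def gtr_rate_matrix(pi, exch):
    """GTR rate matrix: q_ij = s_ij * pi_j for i != j, with rows summing to zero."""
    n = len(pi)
    q = [[exch[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q


def detailed_balance_gap(m, pi):
    """Largest violation of pi_i * m_ij == pi_j * m_ji over all pairs (i, j)."""
    n = len(m)
    return max(abs(pi[i] * m[i][j] - pi[j] * m[j][i])
               for i in range(n) for j in range(n))


pi = [0.1, 0.2, 0.3, 0.4]                      # shared base frequencies (assumed)
Q1 = gtr_rate_matrix(pi, symmetric(1, 2, 3, 4, 5, 6))
Q2 = gtr_rate_matrix(pi, symmetric(6, 1, 4, 2, 5, 3))
t1 = t2 = 0.1

M1 = mat_exp(Q1, t1)
M2 = mat_exp(Q2, t2)
M_hat = mat_mul(M1, M2)                        # substitution matrix over t1 + t2

print("gap for M1:   ", detailed_balance_gap(M1, pi))
print("gap for M2:   ", detailed_balance_gap(M2, pi))
print("gap for M1*M2:", detailed_balance_gap(M_hat, pi))
```

The first two gaps should be zero up to rounding error, while the third should be clearly nonzero. If SciPy is available, `scipy.linalg.expm` could replace the hand-rolled exponential, and `scipy.linalg.logm` applied to `M_hat` would give a candidate $\hat{Q}(t_1 + t_2)$ to inspect directly.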