Q and P Matrices

Evolutionary analyses of sequences are conducted on a wide variety of time scales, so it is convenient to express substitution models in terms of the instantaneous rates of change between different states. This representation of a model is typically called the model's Q matrix.

SubstitutionModels.Q — Function

Generate a Q matrix for a NucleicAcidSubstitutionModel, of the form:

\[Q = \begin{bmatrix} Q_{A, A} & Q_{A, C} & Q_{A, G} & Q_{A, T} \\ Q_{C, A} & Q_{C, C} & Q_{C, G} & Q_{C, T} \\ Q_{G, A} & Q_{G, C} & Q_{G, G} & Q_{G, T} \\ Q_{T, A} & Q_{T, C} & Q_{T, G} & Q_{T, T} \end{bmatrix}\]

Call as either

1. Q(model), or
2. Q(model, bool)

Form (2) scales the matrix so that $_π(model) ⋅ -diag(Q(model)) = 1$ when bool=true, that is, so that the expected substitution rate at equilibrium is one. Q(model, false) is equivalent to Q(model).
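
The scaling condition can be illustrated with a minimal sketch that does not call the package at all. It assumes a hand-built Jukes-Cantor-style rate matrix with uniform base frequencies; the names `freqs`, `λ`, `Q_unscaled`, `rate` and `Q_scaled` are hypothetical and chosen only for this illustration.

```julia
using LinearAlgebra

# Equilibrium base frequencies (A, C, G, T); uniform in this illustration.
freqs = fill(0.25, 4)

# Unscaled rate matrix: every off-diagonal rate is λ, and each diagonal
# entry is set so that its row sums to zero.
λ = 1.0
Q_unscaled = fill(λ, 4, 4)
Q_unscaled[diagind(Q_unscaled)] .= -3λ

# Expected substitution rate at equilibrium: π ⋅ -diag(Q).
rate = dot(freqs, -diag(Q_unscaled))

# Dividing by that rate yields a matrix satisfying π ⋅ -diag(Q) = 1.
Q_scaled = Q_unscaled ./ rate

println(rate)
println(dot(freqs, -diag(Q_scaled)))
```

Dividing the whole matrix by a single scalar leaves the relative rates unchanged; only the unit of time changes.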


If we are given a starting state at one position in a DNA sequence, the model's Q matrix, and a branch length expressing the expected number of changes that have occurred since the ancestor, then we can derive the probability of the descendant sequence having each of the four states.

This transformation from the instantaneous rate matrix (Q matrix) to a probability matrix for a given time period (P matrix) is described here.
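
As general background, and not as a statement of how the package computes it internally, the standard relationship between the two matrices for a continuous-time Markov model is the matrix exponential:

\[P(t) = e^{Qt} = \sum_{k=0}^{\infty} \frac{(Qt)^k}{k!}\]

Here $P_{i, j}(t)$ is the probability of observing state $j$ after time $t$ given starting state $i$, so the row of $P(t)$ for the starting state holds the probabilities of the four descendant states and sums to one.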

SubstitutionModels.P — Function

Generate a P matrix for a NucleicAcidSubstitutionModel, of the form:

\[P = \begin{bmatrix} P_{A, A} & P_{A, C} & P_{A, G} & P_{A, T} \\ P_{C, A} & P_{C, C} & P_{C, G} & P_{C, T} \\ P_{G, A} & P_{G, C} & P_{G, G} & P_{G, T} \\ P_{T, A} & P_{T, C} & P_{T, G} & P_{T, T} \end{bmatrix}\]

for a specified time t.

Call as either

1. P(model, t), or
2. P(model, t, bool)

Form (2) obtains its probabilities from the scaled Q matrix if bool=true. Branch lengths estimated from a scaled P matrix are in units of expected number of substitutions per site. P(model, t, false) is equivalent to P(model, t).
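
As a rough illustration of probabilities derived from a scaled rate matrix, the sketch below builds a Jukes-Cantor-style scaled Q matrix by hand and exponentiates it with `exp` from the `LinearAlgebra` standard library. It does not call the package, and whether the package uses a matrix exponential or closed-form expressions internally is not stated here. The names `Q_scaled`, `P_t`, `p_same` and `p_diff`, and the branch length of 0.5, are hypothetical.

```julia
using LinearAlgebra

# Scaled Jukes-Cantor-style rate matrix (states ordered A, C, G, T):
# off-diagonal rates of 1/3 and diagonal entries of -1.
Q_scaled = fill(1 / 3, 4, 4)
Q_scaled[diagind(Q_scaled)] .= -1.0

# Branch length in expected substitutions per site (arbitrary example value).
t = 0.5

# Transition probabilities via the matrix exponential of Q * t.
P_t = exp(Q_scaled * t)

# Closed-form Jukes-Cantor probabilities for comparison.
p_same = 1 / 4 + 3 / 4 * exp(-4t / 3)
p_diff = 1 / 4 - 1 / 4 * exp(-4t / 3)

println(sum(P_t, dims = 2))   # each row should sum to approximately 1
println(P_t[1, 1] ≈ p_same)
println(P_t[1, 2] ≈ p_diff)
```

Because the rate matrix is scaled, `t` can be read directly as a branch length in expected substitutions per site.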
