Identifying and counting sequence sites
GeneticVariation.jl extends the site-counting methods in BioSequences.jl, using the same bit-parallel techniques to rapidly compute the numbers of different types of mutations between two large biological sequences. Such counts are required for many population genetic analyses of variation, such as the computation of evolutionary distances.
Types of site added
GeneticVariation.Conserved — Type
A Conserved site describes a site where two aligned nucleotides are definitely conserved. Definitely conserved means that both symbols at the site are non-ambiguity symbols and they are the same symbol.
GeneticVariation.Mutated — Type
A Mutated site describes a site where two aligned nucleotides are definitely mutated. Definitely mutated means that both symbols at the site are non-ambiguity symbols and they are not the same symbol.
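The two definitions above can be illustrated with a minimal sketch in plain Julia. It is not the package's implementation: it works on ordinary strings with a simple loop rather than bit-parallel operations, and the names UNAMBIGUOUS, is_certain, classify_site and tally_sites are hypothetical helpers introduced only for this example. It assumes DNA, where A, C, G and T are the only non-ambiguity symbols.

```julia
# Hypothetical helpers for illustration; not part of GeneticVariation.jl.

# Assumption: DNA input, so A, C, G and T are the only non-ambiguity symbols.
# Anything else (N, R, Y, gaps, ...) is treated here as ambiguous.
const UNAMBIGUOUS = ('A', 'C', 'G', 'T')

is_certain(x::Char) = x in UNAMBIGUOUS

# Classify one aligned pair of symbols following the definitions above.
function classify_site(a::Char, b::Char)
    (is_certain(a) && is_certain(b)) || return :ambiguous
    return a == b ? :conserved : :mutated
end

# Tally the site types across two aligned sequences of equal length.
function tally_sites(seqa::AbstractString, seqb::AbstractString)
    length(seqa) == length(seqb) ||
        throw(ArgumentError("sequences must be aligned to the same length"))
    counts = Dict(:conserved => 0, :mutated => 0, :ambiguous => 0)
    for (a, b) in zip(seqa, seqb)
        counts[classify_site(a, b)] += 1
    end
    return counts
end

counts = tally_sites("ATCGNA", "ATGGNR")
```

In this example the last two sites are neither Conserved nor Mutated: N paired with N and A paired with R both involve an ambiguity symbol, so nothing definite can be said about them.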
GeneticVariation.Segregating — Type
Segregating sites are positions that show differences (polymorphisms) between related genes in a sequence alignment; that is, they are not conserved. Segregating sites include conservative, semi-conservative and non-conservative mutations.
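As a rough illustration of the idea, the sketch below counts the columns of a small alignment that contain more than one distinct symbol. It is written in plain Julia on ASCII strings, and count_segregating is a hypothetical helper, not a function of the package. It also assumes every symbol is unambiguous; how the package treats ambiguity symbols or gaps at such positions is not covered here.

```julia
# Hypothetical helper for illustration; not part of GeneticVariation.jl.
# Assumption: ASCII sequences of equal length with no ambiguity symbols.
function count_segregating(alignment::Vector{String})
    isempty(alignment) && return 0
    n = length(first(alignment))
    all(s -> length(s) == n, alignment) ||
        throw(ArgumentError("all sequences must have the same length"))
    nseg = 0
    for i in 1:n
        # Collect the symbols found in column i of the alignment.
        column = [s[i] for s in alignment]
        # A column with more than one distinct symbol is segregating.
        if length(unique(column)) > 1
            nseg += 1
        end
    end
    return nseg
end

alignment = ["ATCGA", "ATGGA", "ATCGT"]
nseg = count_segregating(alignment)
```

Here the third and fifth columns differ between sequences, so they are the two positions counted as segregating.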
See the site-counting section of the BioSequences.jl documentation to see how to use the count and count_pairwise methods to count different types of site.