Random generation of kmers
Kmers.jl supports efficient random generation of kmers through a package extension, which is
activated when the Random stdlib is loaded.
Sampling a symbol from a kmer
julia> m = mer"PLQWHY"a;
julia> rand(m)
AA_L
julia> rand(m)
AA_H
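As a minimal sketch of the same idea outside the REPL (assuming that `using Kmers` exports the `mer"..."a` string macro used above, and that kmers support `in` like other sequences), the following checks that a sampled symbol is always one of the kmer's own symbols:

```julia
using Random  # loading Random activates the extension, per the text above
using Kmers

m = mer"PLQWHY"a

# Every draw should be one of the six amino acids contained in `m`.
draws = [rand(m) for _ in 1:100]
@assert all(s -> s in m, draws)
```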
Sampling a random kmer from a type
When sampling a random kmer from a type, kmers are sampled from the following distributions:
- For T <: AAKmer, the kmer symbols are sampled uniformly from the 20 standard proteinogenic amino acids ACDEFGHIKLMNPQRSTVWY. This excludes amino acids such as selenocysteine, and pseudo-amino acids such as AA_Term and AA_Gap.
- For T <: Kmer{<:NucleicAcidAlphabet{4}}, the kmer symbols are sampled uniformly from the 4 unambiguous nucleotides. E.g. for T <: Kmer{DNAAlphabet{4}}, the only symbols sampled are A, C, G and T.
- For all other kmer types, the symbols are sampled uniformly from all symbols in the alphabet.
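The first rule in the list can be checked empirically. The sketch below assumes that BioSequences is installed alongside Kmers and provides `AminoAcid`, `AA_Term`, `AA_Gap` and the `aa"..."` string macro; the variable names `seen` and `standard` are illustrative only.

```julia
using Random        # loading Random activates the extension, per the text above
using BioSequences  # assumed to provide AminoAcid, AA_Term, AA_Gap and aa"..."
using Kmers

# Draw many random amino acid 4-mers and record every symbol observed.
seen = Set{AminoAcid}()
for _ in 1:1000
    for aa in rand(AAKmer{4})
        push!(seen, aa)
    end
end

# Only the 20 standard amino acids should ever appear.
standard = Set(aa"ACDEFGHIKLMNPQRSTVWY")
@assert issubset(seen, standard)
@assert AA_Term ∉ seen && AA_Gap ∉ seen
```

The check is statistical in one direction only: it can show that a non-standard symbol was never drawn, but it does not by itself demonstrate that the sampling is uniform.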
You can sample from either a concrete type, or a type with the trailing N
type parameter elided:
julia> rand(DNAKmer{9})
DNA 9-mer:
TGGGAGCCA
julia> rand(AAKmer{4, 1})
AminoAcid 4-mer:
RRIP
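To see what the elided form returns, a small sketch such as the following can be used. It relies only on the `rand(DNAKmer{9})` call shown above, plus the assumption that `length` and `typeof` behave on kmers as they do on other sequences; the printed concrete type is not reproduced here.

```julia
using Random  # loading Random activates the extension, per the text above
using Kmers

kmer = rand(DNAKmer{9})

# The result is an instance of the requested type, with K = 9 symbols.
@assert kmer isa DNAKmer{9}
@assert length(kmer) == 9

# Printing the type shows the concrete value chosen for the elided N parameter.
println(typeof(kmer))
```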