Generating random sequences
Long sequences
You can generate random long sequences using the randseq family of functions (randseq, randdnaseq, randrnaseq, randaaseq) and the Sampler types implemented in BioSequences:
BioSequences.randseq — Function
randseq([rng::AbstractRNG], A::Alphabet, len::Integer)
Generate a LongSequence{A} of length len from the specified alphabet, drawn from the default distribution. User-defined alphabets should implement this method to support random LongSequence generation.
For RNA and DNA alphabets, the default distribution is uniform across A, C, G, and T/U. For AminoAcidAlphabet, it is uniform across the 20 standard amino acids. For a user-defined alphabet A, the default is uniform across all elements of symbols(A).
Example:
julia> seq = randseq(AminoAcidAlphabet(), 50)
50aa Amino Acid Sequence:
VFMHSIRMIRLMVHRSWKMHSARHVNFIRCQDKKWKSADGIYTDICKYSM
randseq([rng::AbstractRNG], A::Alphabet, sp::Sampler, len::Integer)
Generate a LongSequence{A} of length len with elements drawn from the given sampler.
Example:
# Generate a 50-nt RNA sequence with a 4% chance of N and 24% each for A, C, G, and U
julia> sp = SamplerWeighted(rna"ACGUN", fill(0.24, 4))
julia> seq = randseq(RNAAlphabet{4}(), sp, 50)
50nt RNA Sequence:
CUNGGGCCCGGGNAAACGUGGUACACCCUGUUAAUAUCAACNNGCGCUNU
BioSequences.randdnaseq — Function
randdnaseq([rng::AbstractRNG], len::Integer)
Generate a random LongSequence{DNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, T].
BioSequences.randrnaseq — Function
randrnaseq([rng::AbstractRNG], len::Integer)
Generate a random LongSequence{RNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, U].
BioSequences.randaaseq — Function
randaaseq([rng::AbstractRNG], len::Integer)
Generate a random LongSequence{AminoAcidAlphabet} sequence of length len, with amino acids sampled uniformly from the 20 standard amino acids.
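The optional rng argument in the three signatures above makes a draw reproducible. The following is a minimal sketch, assuming BioSequences is installed and using MersenneTwister from the Random standard library; the seed 42 and the length 30 are arbitrary choices for illustration.

```julia
using BioSequences
using Random

# Seeded RNG, so repeated runs of this script produce the same sequences.
rng = MersenneTwister(42)

dna = randdnaseq(rng, 30)   # bases drawn uniformly from A, C, G, T
rna = randrnaseq(rng, 30)   # bases drawn uniformly from A, C, G, U
aa  = randaaseq(rng, 30)    # residues drawn uniformly from the 20 standard amino acids

# A fresh RNG with the same seed reproduces the first draw exactly.
dna_again = randdnaseq(MersenneTwister(42), 30)
println(dna == dna_again)
```

Omitting rng falls back to the default generator, which is convenient interactively but not reproducible across sessions unless the global RNG is seeded.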
BioSequences.SamplerUniform — Type
SamplerUniform{T}
Uniform sampler of type T. Instantiate with a collection of eltype T containing the elements to sample.
Examples
julia> sp = SamplerUniform(rna"ACGU");
BioSequences.SamplerWeighted — Type
SamplerWeighted{T}
Weighted sampler of type T. Instantiate with a collection of eltype T containing the elements to sample, and an ordered collection of probabilities to sample each element except the last. The probability of the last element is whatever remains so that the total is 1.
Examples
julia> sp = SamplerWeighted(rna"ACGUN", fill(0.2475, 4));
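To see how the implicit last probability plays out, here is a hedged sketch that passes both sampler types to randseq and checks the empirical frequency of N. It assumes BioSequences is installed; the seed, the sequence lengths, and the purine-only sampler are illustrative choices, not part of the library's documentation.

```julia
using BioSequences
using Random

rng = MersenneTwister(1)  # arbitrary seed, for reproducibility only

# Four explicit weights for A, C, G, U; N receives the remainder,
# 1 - 4 * 0.2475 = 0.01.
weighted = SamplerWeighted(rna"ACGUN", fill(0.2475, 4))
seq = randseq(rng, RNAAlphabet{4}(), weighted, 100_000)

# The observed fraction of N should be close to 0.01.
n_freq = sum(x == RNA_N for x in seq) / length(seq)
println("fraction of N: ", n_freq)

# A uniform sampler can restrict the draw to a subset of the alphabet,
# here the two purines A and G.
purines = SamplerUniform(rna"AG")
pu = randseq(rng, RNAAlphabet{4}(), purines, 20)
println(pu)
```

Because the last weight is inferred, only n - 1 probabilities are supplied for n symbols, and the order of the symbols in the collection determines which one receives the remainder.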