Working with haplotypes
Calling variants
The first step in working with sequence variation is to identify (call) variations between two sequences. SequenceVariation can directly call variants using the Haplotype(::PairwiseAlignment)
constructor of the Haplotype
type.
julia> using SequenceVariation, BioAlignments, BioSequences
julia> bovine = dna"GACCGGCTGCATTCGAGGCTGCCAGCAAGCAG";
julia> ovine = dna"GACCGGCTGCATTCGAGGCTGTCAGCAAACAG";
julia> human = dna"GACAGGCTGCATCAGAAGAGGCCATCAAGCAG";
julia> bos_ovis_alignment = PairwiseAlignment(AlignedSequence(ovine, Alignment("32M", 1, 1)), bovine);
julia> bos_human_alignment = PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), bovine);
julia> bos_ovis_haplotype = Haplotype(bos_ovis_alignment)
Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29A
julia> bos_human_haplotype = Haplotype(bos_human_alignment)
Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 7 edits: C4A T13C C14A G17A C19A T20G G25T
Sequence reconstruction
If the alternate sequence of a haplotype is no longer available (as is often the case when calling variants from alignment files), then the sequence can be retrieved using the reconstruct
function.
julia> human2 = reconstruct(bos_human_haplotype)
32nt DNA Sequence: GACAGGCTGCATCAGAAGAGGCCATCAAGCAG
julia> human2 == bovine
false
julia> human2 == human
true
Reference switching
All variations within a haplotype can be mapped to a new reference sequence given an alignment between the new and old references using the translate
function. This could be useful if variants were called against a reference sequence for the entire species, but need to be analyzed as variants of a subtype later.
julia> ovis_human_alignment = PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), ovine)
BioAlignments.PairwiseAlignment{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}}: seq: 1 GACAGGCTGCATCAGAAGAGGCCATCAAGCAG 32 ||| |||||||| || | | || ||| ||| ref: 1 GACCGGCTGCATTCGAGGCTGTCAGCAAACAG 32
julia> SequenceVariation.translate(bos_ovis_haplotype, ovis_human_alignment)
Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29A