Working with haplotypes
Calling variants
The first step in working with sequence variation is to identify (call) variations between two sequences. SequenceVariation can directly call variants using the Haplotype(::PairwiseAlignment) constructor of the Haplotype type.
julia> using SequenceVariation, BioAlignments, BioSequencesjulia> bovine = dna"GACCGGCTGCATTCGAGGCTGCCAGCAAGCAG";julia> ovine = dna"GACCGGCTGCATTCGAGGCTGTCAGCAAACAG";julia> human = dna"GACAGGCTGCATCAGAAGAGGCCATCAAGCAG";julia> bos_ovis_alignment = PairwiseAlignment(AlignedSequence(ovine, Alignment("32M", 1, 1)), bovine);julia> bos_human_alignment = PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), bovine);julia> bos_ovis_haplotype = Haplotype(bos_ovis_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29Ajulia> bos_human_haplotype = Haplotype(bos_human_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 7 edits: C4A T13C C14A G17A C19A T20G G25T
Sequence reconstruction
If the alternate sequence of a haplotype is no longer available (as is often the case when calling variants from alignment files), then the sequence can be retrieved using the reconstruct function.
julia> human2 = reconstruct(bos_human_haplotype)32nt DNA Sequence: GACAGGCTGCATCAGAAGAGGCCATCAAGCAGjulia> human2 == bovinefalsejulia> human2 == humantrue
Reference switching
All variations within a haplotype can be mapped to a new reference sequence given an alignment between the new and old references using the translate function. This could be useful if variants were called against a reference sequence for the entire species, but need to be analyzed as variants of a subtype later.
julia> ovis_human_alignment = PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), ovine)BioAlignments.PairwiseAlignment{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}}: seq: 1 GACAGGCTGCATCAGAAGAGGCCATCAAGCAG 32 ||| |||||||| || | | || ||| ||| ref: 1 GACCGGCTGCATTCGAGGCTGTCAGCAAACAG 32julia> SequenceVariation.translate(bos_ovis_haplotype, ovis_human_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29A