Submodule: APE

The APE submodule provides compatibility with ape, an R package for phylogenetic and evolutionary analyses.

Currently compatibility with the following R classes is provided:

DNAbin

APE provides a DNAbin class which represents DNA sequences using a byte per nucleotide, with a specific binary encoding. We currently provide support for the matrix form of DNAbin.
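
To make the "byte per nucleotide" idea concrete, here is a minimal sketch in plain Julia (written for Julia 1.x, no packages needed). It assumes the bit layout described in ape's own documentation: the four high bits flag A, G, C and T, `0x08` marks an exactly known base, `0x04` a gap and `0x02` a completely unknown base. The names `encode_dnabin`, `is_purine_byte` and `complement_byte` are hypothetical helpers for illustration only; they are not part of BioBridgeR.APE.

```julia
# Assumed ape bit layout (restated from ape's documentation, not from this package):
# 0x80 = A, 0x40 = G, 0x20 = C, 0x10 = T, 0x08 = base known exactly,
# 0x04 = alignment gap, 0x02 = completely unknown base.
const A_BIT = 0x80
const G_BIT = 0x40
const C_BIT = 0x20
const T_BIT = 0x10
const KNOWN_BIT = 0x08

# Hypothetical helper: encode one character as a DNAbin-style byte.
# Only a subset of the IUPAC codes is handled to keep the sketch short.
function encode_dnabin(c::Char)
    c = uppercase(c)
    c == 'A' && return A_BIT | KNOWN_BIT               # 0x88
    c == 'G' && return G_BIT | KNOWN_BIT               # 0x48
    c == 'C' && return C_BIT | KNOWN_BIT               # 0x28
    c == 'T' && return T_BIT | KNOWN_BIT               # 0x18
    c == 'R' && return A_BIT | G_BIT                   # 0xc0, A or G
    c == 'Y' && return C_BIT | T_BIT                   # 0x30, C or T
    c == 'N' && return A_BIT | G_BIT | C_BIT | T_BIT   # 0xf0, any base
    c == '-' && return 0x04
    c == '?' && return 0x02
    throw(ArgumentError("unsupported character: $c"))
end

# A byte is a purine if it allows A and/or G but neither C nor T.
is_purine_byte(b::UInt8) =
    (b & (A_BIT | G_BIT)) != 0x00 && (b & (C_BIT | T_BIT)) == 0x00

# Complementing swaps the A/T bits and the G/C bits and keeps the flag bits.
function complement_byte(b::UInt8)
    out = b & 0x0f
    (b & A_BIT) != 0x00 && (out |= T_BIT)
    (b & T_BIT) != 0x00 && (out |= A_BIT)
    (b & G_BIT) != 0x00 && (out |= C_BIT)
    (b & C_BIT) != 0x00 && (out |= G_BIT)
    return out
end
```

With this layout, questions such as "is this a purine?" or "what is the complement?" reduce to a few bitwise operations on a single byte, which is why a dedicated bitstype is a natural fit on the Julia side.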

BioBridgeR.APE provides a bitstype, DNAbin, which is a subtype of the abstract type NucleicAcid from BioSymbols.jl. DNAbin values can be created and used to represent nucleic acid data, just like the DNA and RNA types defined in BioSymbols.jl, and they work with the methods defined there:

julia> using BioSequences, RCall, BioBridgeR.APE

julia> ispurine(DNAbin_A)
true

julia> complement(DNAbin_G)
DNAbin_C

DNAbin symbols can also be created from text and converted to text just as DNA and RNA symbols can be.

julia> DNAbin('t')
DNAbin_T

julia> Char(DNAbin_Gap)
'-': ASCII/Unicode U+002d (category Pd: Punctuation, dash)

DNAbin symbols can also be converted to and from the DNA symbols defined in BioSymbols.jl.

julia> DNAbin(DNA_N)
DNAbin_N

julia> DNA(DNAbin_R)
DNA_R

DNA sequences in R, represented as a DNAbin object, can be transferred into a Julia session as an Array{DNAbin,2} or, by default, as a vector of BioSequence{DNAAlphabet{4}} values as defined in BioSequences.jl:

julia> R"""
       library(ape)
       data(woodmouse)
       """
RCall.RObject{RCall.StrSxp}
[1] "woodmouse"

julia> @rget woodmouse::Array{DNAbin,2}
15×965 Array{BioBridgeR.APE.DNAbin,2}:
 DNAbin_N  DNAbin_T  DNAbin_T  DNAbin_C  …  DNAbin_A  DNAbin_T  DNAbin_A
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_G  DNAbin_T  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_A  DNAbin_T  DNAbin_A
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C  …  DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C  …  DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_N  DNAbin_N  DNAbin_N  DNAbin_N     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_A  DNAbin_T  DNAbin_T  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N
 DNAbin_N  DNAbin_N  DNAbin_N  DNAbin_C     DNAbin_N  DNAbin_N  DNAbin_N

julia> @rget woodmouse
15-element Array{BioSequences.BioSequence{BioSequences.DNAAlphabet{4}},1}:
 NTTCGAAAAACACACCCACTACTAAAANTTATCAGTCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAGACCCTATA
 ATTCGAAAAACACACCCACTACTAAAAATTATCAACCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCTGTN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCTATA
 ATTCGAAAAACACACCCACTACTAAAAATCATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAATACNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAACCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATTAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 ATTCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN
 NNNCGAAAAACACACCCACTACTAAAAATTATCAATCAC…ACGCAGCCTAATATTCCGCCCAATTACTCAAACCCNNNN

Conversely, Array{DNAbin,2} and BioSequence{DNAAlphabet{4}} variables can be transferred from the Julia session to the R session as a DNAbin object. Note that currently you have to explicitly request conversion to a RawSxp, the structure R uses to store binary data. DNAbin objects are stored this way: they are arrays of bytes with specific behaviour.

julia> sequences = DNASequence[dna"AAAAA", dna"TTTTT", dna"CCCCC", dna"GGGGG"]
4-element Array{BioSequences.BioSequence{BioSequences.DNAAlphabet{4}},1}:
 AAAAA
 TTTTT
 CCCCC
 GGGGG

julia> @rput sequences::RawSxp
Ptr{RCall.RawSxp} @0x00000000063564e8

julia> R"sequences"
RCall.RObject{RCall.RawSxp}
4 DNA sequences in binary format stored in a matrix.

All sequences of same length: 5

Labels:

Base composition:
   a    c    g    t
0.25 0.25 0.25 0.25
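
The "binary format stored in a matrix" that R reports above can be pictured as a plain byte matrix with one row per sequence and one column per site. The following is a rough sketch of that layout in plain Julia (Julia 1.x, no packages), assuming ape's byte values for the four unambiguous bases. `BASE_BYTES`, `pack_matrix` and `base_composition` are hypothetical names used only for illustration; the actual conversion is done by BioBridgeR.APE and RCall and may differ in detail.

```julia
# Assumed ape byte values for the four unambiguous bases.
const BASE_BYTES = Dict('A' => 0x88, 'C' => 0x28, 'G' => 0x48, 'T' => 0x18)

# Hypothetical helper: pack equal-length sequences into a byte matrix,
# one row per sequence and one column per site.
function pack_matrix(seqs::Vector{String})
    ncols = length(seqs[1])
    all(s -> length(s) == ncols, seqs) ||
        throw(ArgumentError("all sequences must have the same length"))
    mat = Matrix{UInt8}(undef, length(seqs), ncols)
    for (i, s) in enumerate(seqs), (j, c) in enumerate(s)
        mat[i, j] = BASE_BYTES[c]
    end
    return mat
end

# Fraction of all matrix entries occupied by each base.
function base_composition(mat::Matrix{UInt8})
    return Dict(base => count(x -> x == byte, mat) / length(mat)
                for (base, byte) in BASE_BYTES)
end

m = pack_matrix(["AAAAA", "TTTTT", "CCCCC", "GGGGG"])
comp = base_composition(m)
```

For the four sequences used above, `m` is a 4×5 matrix and each base accounts for a quarter of its entries, which agrees with the base composition printed by R.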