Accessing and modifying annotations
Feature
Features (genes) can be added using addgene!. A feature must have a feature name and a locus (position), and can have any number of additional qualifiers associated with it (see next section).
GenomicAnnotations.addgene! — Function
addgene!(chr::Record, feature, locus; kw...)
Add gene to chr. locus can be a Locus, a UnitRange, or a StepRange (for decreasing ranges, which will be annotated on the complementary strand).
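The two range forms mentioned above can be told apart in plain Julia. The following is a minimal sketch that uses only Base and does not call addgene!. The helper is_decreasing is hypothetical and not part of GenomicAnnotations; it only restates the rule that a decreasing range is the form associated with the complementary strand.

```julia
fwd = 1:756        # UnitRange: increasing positions
rev = 756:-1:1     # StepRange with step -1: decreasing positions

# Hypothetical helper, not part of GenomicAnnotations: true for ranges
# that run from a higher to a lower position.
is_decreasing(r::AbstractRange) = step(r) < 0

println(typeof(fwd))
println(typeof(rev))
println(is_decreasing(fwd), " ", is_decreasing(rev))
```

Note that 756:1 without an explicit step is an empty range in Julia, so the step of -1 has to be written out when a decreasing range is intended.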
Example
addgene!(chr, "CDS", 1:756;
    locus_tag = "gene0001",
    product = "Chromosomal replication initiator protein dnaA")
After adding a new feature, sort! can be used to make sure that the annotations are stored (and printed) in the order in which they occur on the chromosome:
sort!(chr)
Existing features can be removed using delete!:
Base.delete! — Method
delete!(gene::AbstractGene)
Delete gene from parent(gene). Warning: does not work when broadcasted! Use delete!(::AbstractVector{Gene}) instead.
Base.delete! — Method
delete!(genes::AbstractArray{Gene, 1})
Delete all genes in genes from parent(genes[1]).
Example
delete!(@genes(chr, length(gene) <= 60))
Qualifiers
Features can have multiple qualifiers, which can be modified using Julia's property syntax:
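# Remove newlines from gene product descriptions
for gene in @genes(chr, CDS)
    ismissing(gene.product) && continue  # skip features without a product qualifier
    # Strings are immutable, so build a new string and assign it back
    gene.product = replace(gene.product, '\n' => ' ')
end
Properties also work on views of genes, typically generated using @genes: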
interestinggenes = readlines("/path/to/list/of/interesting/genes.txt")
@genes(chr, CDS, :locus_tag in interestinggenes).interesting .= true
Sometimes features have multiple instances of the same qualifier, such as genes having several EC numbers. Assigning a qualifier with property syntax overwrites any data previously stored for that feature, and trying to assign a vector of values to a qualifier that currently stores scalars results in an error. To safely assign qualifiers that might have several instances, use pushproperty!:
GenomicAnnotations.pushproperty! — Function
pushproperty!(gene::AbstractGene, qualifier::Symbol, value::T)
Add a property to gene, similarly to Base.setproperty!(::gene), but if the property is not missing in gene, it will be transformed to store a vector instead of overwriting existing data.
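The rule described here can be modelled in a few lines of plain Julia. This is only a sketch: push_value! and the Dict used to hold qualifiers are hypothetical stand-ins, not the package's implementation, and appending to an already stored vector is an assumption about the intended behaviour.

```julia
# Hypothetical model of the rule above; GenomicAnnotations keeps qualifiers
# in chr.genedata, not in a Dict.
function push_value!(quals::Dict{Symbol,Any}, key::Symbol, value)
    old = get(quals, key, missing)
    if ismissing(old)
        quals[key] = value          # nothing stored yet: plain assignment
    elseif old isa AbstractVector
        push!(old, value)           # assumed: append to an existing vector
    else
        quals[key] = [old, value]   # scalar stored: promote to a vector
    end
    return quals
end

quals = Dict{Symbol,Any}()
push_value!(quals, :EC_number, "EC:1.2.3.4")
push_value!(quals, :EC_number, "EC:4.3.2.1")
println(quals[:EC_number])
```

The first call stores a scalar and the second promotes it to a two-element vector, which mirrors the sequence of steps in the package's own example.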
julia> eltype(chr.genedata[!, :EC_number])
Union{Missing,String}
julia> chr.genes[1].EC_number = "EC:1.2.3.4"
"EC:1.2.3.4"
julia> pushproperty!(chr.genes[1], :EC_number, "EC:4.3.2.1"); chr.genes[1].EC_number
2-element Array{String,1}:
"EC:1.2.3.4"
"EC:4.3.2.1"
julia> eltype(chr.genedata[!, :EC_number])
Union{Missing, Array{String,1}}
Accessing properties that haven't been stored will return missing. For this reason, it often makes more sense to use get() than to access the property directly.
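The reason direct access is risky comes from Base Julia rather than from the package: a condition must be a Bool, and missing is not one. The sketch below uses a plain variable as a stand-in for an unset qualifier, so the name pseudo is illustrative only; coalesce is a Base function shown as one way to supply a default.

```julia
pseudo = missing   # stand-in for the value of an unset qualifier

# Using missing as a condition raises a TypeError.
threw = try
    if pseudo
        println("pseudogene")
    end
    false
catch err
    err isa TypeError
end
println(threw)

# Supplying a default turns missing into a usable Bool.
println(coalesce(pseudo, false))
```

Calling get with a default value, as in the package example, serves the same purpose as coalesce does here.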
# chr.genes[2].pseudo returns missing, so this will throw an error
if chr.genes[2].pseudo
    println("Gene 2 is a pseudogene")
end
# ... but this works:
if get(chr.genes[2], :pseudo, false)
    println("Gene 2 is a pseudogene")
end
Sequences
The sequence of a Chromosome chr is stored in chr.sequence. Sequences of individual features can be read with sequence:
GenomicAnnotations.sequence — Method
sequence(gene::AbstractGene; translate = false)
Return genomic sequence for gene. If translate is true, the sequence will be translated to a LongAA, excluding the stop codon; otherwise it will be returned as a LongDNA{4} (including the stop codon).