Accessing and modifying annotations
Feature
Features (genes) can be added using addgene!. A feature must have a feature name and a locus (position), and can have any number of additional qualifiers associated with it (see next section).
GenomicAnnotations.addgene!
— Function
addgene!(chr::Record, feature, locus; kw...)
Add gene to chr. locus can be a Locus, a UnitRange, or a StepRange (for decreasing ranges, which will be annotated on the complementary strand).
Example
addgene!(chr, "CDS", 1:756;
    locus_tag = "gene0001",
    product = "Chromosomal replication initiator protein dnaA")
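The locus types mentioned above are ordinary Julia ranges, so it can help to see how an increasing and a decreasing range differ before passing one to addgene!. The following is a minimal sketch in plain Base Julia; the variable names are illustrative only, and no GenomicAnnotations functions are called.

```julia
# Plain Base Julia; no GenomicAnnotations calls are made here.
forward = 1:756             # UnitRange: increasing, implicit step of 1
reverse_strand = 756:-1:1   # StepRange: decreasing, explicit step of -1

@assert forward isa UnitRange
@assert reverse_strand isa StepRange
@assert length(forward) == length(reverse_strand) == 756

# Pitfall: without an explicit negative step the range is empty,
# so it cannot describe a feature on the complementary strand.
@assert isempty(756:1)

# reverse() turns an increasing range into the matching decreasing one.
@assert reverse(forward) == reverse_strand
```

Writing the step explicitly (756:-1:1) is therefore the form to use for a locus on the complementary strand.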
After adding a new feature, sort! can be used to make sure that the annotations are stored (and printed) in the order in which they occur on the chromosome:
sort!(chr)
Existing features can be removed using delete!:
Base.delete!
— Method
delete!(gene::AbstractGene)
Delete gene from parent(gene). Warning: does not work when broadcasted! Use delete!(::AbstractVector{Gene}) instead.
Base.delete!
— Method
delete!(genes::AbstractArray{Gene, 1})
Delete all genes in genes from parent(genes[1]).
Example
delete!(@genes(chr, length(gene) <= 60))
Qualifiers
Features can have multiple qualifiers, which can be modified using Julia's property syntax:
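# Remove newlines from gene product descriptions
for gene in @genes(chr, CDS)
    ismissing(gene.product) && continue
    # Strings are immutable, so assign the result of replace back to the qualifier
    gene.product = replace(gene.product, '\n' => ' ')
end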
Properties also work on views of genes, typically generated using @genes:
interestinggenes = readlines("/path/to/list/of/interesting/genes.txt")
@genes(chr, CDS, :locus_tag in interestinggenes).interesting .= true
Sometimes features have multiple instances of the same qualifier, such as genes with several EC numbers. Assigning a qualifier with property syntax overwrites any data previously stored for that feature, and trying to assign a vector of values to a qualifier that currently stores scalars will result in an error. To safely assign qualifiers that might have more than one instance, use pushproperty!:
GenomicAnnotations.pushproperty!
— Function
pushproperty!(gene::AbstractGene, qualifier::Symbol, value::T)
Add a property to gene, similarly to Base.setproperty!(::gene), but if the property is not missing in gene, it will be transformed to store a vector instead of overwriting existing data.
julia> eltype(chr.genedata[!, :EC_number])
Union{Missing,String}
julia> chr.genes[1].EC_number = "EC:1.2.3.4"
"EC:1.2.3.4"
julia> pushproperty!(chr.genes[1], :EC_number, "EC:4.3.2.1"); chr.genes[1].EC_number
2-element Array{String,1}:
"EC:1.2.3.4"
"EC:4.3.2.1"
julia> eltype(chr.genedata[!, :EC_number])
Union{Missing, Array{String,1}}
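The promotion rule described for pushproperty! can be sketched on a plain Dict. Note that push_qualifier! below is a hypothetical helper written for illustration, not a GenomicAnnotations function and not the package's actual implementation; the Dict merely stands in for the qualifiers of a single gene.

```julia
# Hypothetical helper (not part of GenomicAnnotations): mimics the documented
# behaviour of pushproperty! on a Dict standing in for one gene's qualifiers.
function push_qualifier!(qualifiers::Dict{Symbol,Any}, name::Symbol, value)
    current = get(qualifiers, name, missing)
    if ismissing(current)
        qualifiers[name] = value                  # nothing stored yet: store the scalar
    elseif current isa AbstractVector
        qualifiers[name] = vcat(current, value)   # already a vector: append
    else
        qualifiers[name] = [current, value]       # scalar stored: promote to a vector
    end
    return qualifiers
end

q = Dict{Symbol,Any}()
push_qualifier!(q, :EC_number, "EC:1.2.3.4")   # stored as a plain String
push_qualifier!(q, :EC_number, "EC:4.3.2.1")   # promoted to a Vector{String}
@assert q[:EC_number] == ["EC:1.2.3.4", "EC:4.3.2.1"]
```

The point of the sketch is only the three cases (missing, scalar, vector); how the package stores the promoted column internally is visible in the eltype output above.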
Accessing properties that haven't been stored will return missing. For this reason, it often makes more sense to use get() than to access the property directly.
# chr.genes[2].pseudo returns missing, so this will throw an error
if chr.genes[2].pseudo
    println("Gene 2 is a pseudogene")
end
# ... but this works:
if get(chr.genes[2], :pseudo, false)
    println("Gene 2 is a pseudogene")
end
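The error in the first case comes from how Base Julia treats missing in a boolean context. The sketch below reproduces that behaviour without any genome data; the variable pseudo is a stand-in for the value of an unset qualifier, and coalesce is shown as a Base alternative for values that have already been fetched (an addition here, not something the package documentation prescribes).

```julia
# Plain Base Julia; `pseudo` stands in for the value of an unset qualifier.
pseudo = missing

# Using missing as a condition throws a TypeError.
threw = try
    pseudo ? nothing : nothing
    false
catch err
    err isa TypeError
end
@assert threw

# Comparisons propagate missing instead of returning false ...
@assert ismissing(pseudo == true)

# ... so supply an explicit default before branching.
@assert coalesce(pseudo, false) == false
@assert isequal(pseudo, true) == false
```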
Sequences
The sequence of a Chromosome chr is stored in chr.sequence. Sequences of individual features can be read with sequence:
GenomicAnnotations.sequence
— Method
sequence(gene::AbstractGene; translate = false)
Return genomic sequence for gene. If translate is true, the sequence will be translated to a LongAA, excluding the stop, otherwise it will be returned as a LongDNA{4} (including the stop codon).
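As a quick check on the two return forms, the expected lengths can be worked out by hand. The numbers below are hypothetical: they assume a complete CDS spanning 1:756 that ends in a stop codon, and the snippet uses only Base arithmetic rather than calling sequence.

```julia
# Hypothetical numbers: a complete CDS spanning 1:756 that ends in a stop codon.
nucleotides = length(1:756)   # 756 bases, stop codon included
codons = nucleotides ÷ 3      # 252 codons
residues = codons - 1         # 251 amino acids once the stop is dropped

@assert nucleotides % 3 == 0
@assert (codons, residues) == (252, 251)
```

Under those assumptions, the translated sequence would be one residue shorter than a third of the nucleotide length.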