Accessing and modifying annotations

Feature

Features (genes) can be added using addgene!. A feature must have a feature name and a locus (position), and can have any number of additional qualifiers associated with it (see next section).

GenomicAnnotations.addgene! - Function
addgene!(chr::Record, feature, locus; kw...)

Add gene to chr. locus can be a Locus, a UnitRange, or a StepRange (for decreasing ranges, which will be annotated on the complementary strand).
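The difference between the two range types can be checked in plain Julia before calling addgene!. The sketch below uses only Base; the variable names are illustrative, and the commented-out call assumes a Record named chr that is not constructed here.

```julia
# A forward-strand position: an increasing UnitRange.
forward_range = 1:756

# A complementary-strand position: a decreasing range needs an explicit
# negative step, which makes it a StepRange.
complement_range = 756:-1:1

# Pitfall: without the step, a "decreasing" range is simply empty.
empty_range = 756:1

println(forward_range isa UnitRange)     # true
println(complement_range isa StepRange)  # true
println(length(empty_range))             # 0

# With GenomicAnnotations loaded and `chr` available (assumed, not shown):
# addgene!(chr, "CDS", complement_range; locus_tag = "gene0002")
```

Writing 756:1 instead of 756:-1:1 produces an empty range rather than a reversed one, so it is worth checking the length of a locus before passing it on.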

Example

addgene!(chr, "CDS", 1:756;
    locus_tag = "gene0001",
    product = "Chromosomal replication initiator protein dnaA")

After adding a new feature, sort! can be used to make sure that the annotations are stored (and printed) in the order in which they occur on the chromosome:

sort!(chr)

Existing features can be removed using delete!:

Base.delete! - Method
delete!(gene::AbstractGene)

Delete gene from parent(gene). Warning: does not work when broadcasted! Use delete!(::AbstractVector{Gene}) instead.

Base.delete! - Method
delete!(genes::AbstractArray{Gene, 1})

Delete all genes in genes from parent(genes[1]).

Example

delete!(@genes(chr, length(gene) <= 60))

Qualifiers

Features can have multiple qualifiers, which can be modified using Julia's property syntax:

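# Remove newlines from gene product descriptions.
# Strings are immutable, so assign the result of replace back to the qualifier.
# (Assumes every CDS has a product; a missing product needs to be skipped.)
for gene in @genes(chr, CDS)
    gene.product = replace(gene.product, '\n' => ' ')
end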

Properties also work on views of genes, typically generated using @genes:

interestinggenes = readlines("/path/to/list/of/interesting/genes.txt")
@genes(chr, CDS, :locus_tag in interestinggenes).interesting .= true

Sometimes features have multiple instances of the same qualifier, such as genes having several EC-numbers. Assigning qualifiers with property syntax overwrites any data that was previously stored for that feature, and trying to assign a vector of values to a qualifier that is currently storing scalars will result in an error. To safely assign qualifiers that might have more than one instance, use pushproperty!:
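The overwrite-versus-append distinction can be sketched in plain Julia. The example below is not the implementation of pushproperty!; push_qualifier! and the Dict used to hold qualifiers are hypothetical stand-ins that only mimic the behaviour described here.

```julia
# Hypothetical stand-in for a feature's qualifiers (not the package's data model).
function push_qualifier!(quals::Dict{Symbol,Any}, key::Symbol, value)
    old = get(quals, key, missing)
    if ismissing(old)
        quals[key] = value             # nothing stored yet: keep a scalar
    elseif old isa AbstractVector
        quals[key] = vcat(old, value)  # already a vector: append
    else
        quals[key] = [old, value]      # scalar becomes a two-element vector
    end
    return quals
end

quals = Dict{Symbol,Any}()

# Plain assignment overwrites whatever was stored before.
quals[:EC_number] = "EC:0.0.0.0"
quals[:EC_number] = "EC:1.2.3.4"

# Pushing keeps the existing value and adds the new one.
push_qualifier!(quals, :EC_number, "EC:4.3.2.1")
println(quals[:EC_number])
```

After the last call the stand-in holds both EC-numbers in a vector, which is the behaviour the pushproperty! docstring describes for real features.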

GenomicAnnotations.pushproperty! - Function
pushproperty!(gene::AbstractGene, qualifier::Symbol, value::T)

Add a property to gene, similarly to Base.setproperty!(::gene), but if the property is not missing in gene, it will be transformed to store a vector instead of overwriting existing data.

julia> eltype(chr.genedata[!, :EC_number])
Union{Missing,String}

julia> chr.genes[1].EC_number = "EC:1.2.3.4"
"EC:1.2.3.4"

julia> pushproperty!(chr.genes[1], :EC_number, "EC:4.3.2.1"); chr.genes[1].EC_number
2-element Array{String,1}:
 "EC:1.2.3.4"
 "EC:4.3.2.1"

julia> eltype(chr.genedata[!, :EC_number])
Union{Missing, Array{String,1}}

Accessing a qualifier that has not been stored returns missing. Because missing cannot be used as the condition of an if statement, it is often safer to use get with a default value than to access the property directly.

# chr.genes[2].pseudo returns missing, so this will throw an error
if chr.genes[2].pseudo
    println("Gene 2 is a pseudogene")
end

# ... but this works:
if get(chr.genes[2], :pseudo, false)
    println("Gene 2 is a pseudogene")
end
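The error in the first snippet comes from Julia itself rather than from GenomicAnnotations. The following Base-only sketch reproduces it with a plain variable standing in for an unset qualifier (the variable names are illustrative) and shows two further ways to collapse missing to false before branching.

```julia
# Stand-in for the value returned by an unset qualifier.
pseudo = missing

# Using `missing` as an `if` condition throws a TypeError.
threw_type_error = try
    if pseudo
        println("Gene is a pseudogene")
    end
    false
catch err
    err isa TypeError
end

# Base-only alternatives that always yield a Bool:
is_pseudo_coalesce = coalesce(pseudo, false)  # replace missing with a default
is_pseudo_strict   = pseudo === true          # true only for a literal `true`

println(threw_type_error)
println(is_pseudo_coalesce)
println(is_pseudo_strict)
```

coalesce is convenient when the value has already been read from the feature, while get with a default, as shown above, avoids the intermediate missing altogether.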

Sequences

The sequence of a Chromosome chr is stored in chr.sequence. Sequences of individual features can be read with sequence:

GenomicAnnotations.sequence - Method
sequence(gene::AbstractGene; translate = false)

Return genomic sequence for gene. If translate is true, the sequence will be translated to a LongAA, excluding the stop codon; otherwise it will be returned as a LongDNA{4} (including the stop codon).
