BED
Description
BED is a text-based file format for representing genomic annotations such as genes and transcripts. Each line of a BED file holds a variable number of tab-delimited fields; the first three fields, which denote a genomic interval, are mandatory.
This is an example of RNA transcripts:
chr9 68331023 68424451 NM_015110 0 +
chr9 68456943 68486659 NM_001206 0 -
I/O tools for BED are provided by the GenomicFeatures.BED module, which exports the following three types:
- Reader type: BED.Reader
- Writer type: BED.Writer
- Element type: BED.Record
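As a rough illustration of the tab-delimited layout described above, the following sketch splits one BED line into fields and checks that the three mandatory ones are present. It uses only Julia's standard library; `split_bed_line` is a hypothetical helper written for this example and is not part of GenomicFeatures.

```julia
# Hypothetical helper (not part of GenomicFeatures): split one BED line
# into its tab-delimited fields and check that the mandatory ones exist.
function split_bed_line(line::AbstractString)
    fields = split(chomp(line), '\t')
    length(fields) >= 3 || error("a BED line needs at least chrom, start, and end")
    return fields
end

fields = split_bed_line("chr9\t68331023\t68424451\tNM_015110\t0\t+")
chrom = String(fields[1])
start_field = parse(Int, fields[2])  # start exactly as written in the file
end_field = parse(Int, fields[3])    # end exactly as written in the file
println(chrom, " ", start_field, " ", end_field, " (", length(fields), " fields)")
```

The values here are the raw text of the file; the accessors of BED.Record described in the API section may report positions differently (for example, 1-based starts).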
Examples
Here is a common workflow to iterate over all records in a BED file:
# Import the BED module.
using GenomicFeatures
# Open a BED file.
reader = open(BED.Reader, "data.bed")
# Iterate over records.
for record in reader
    # Do something with the record (see Accessors section).
    chrom = BED.chrom(record)
    # ...
end
# Finally, close the reader.
close(reader)
If you repeatedly access records within specific ranges, it is more efficient to construct an IntervalCollection object from a BED reader:
# Create an interval collection in memory.
icol = open(BED.Reader, "data.bed") do reader
    IntervalCollection(reader)
end
# Query overlapping records.
for interval in eachoverlap(icol, Interval("chrX", 40001, 51500))
    # A record is stored in the metadata field of an interval.
    record = metadata(interval)
    # ...
end
API
GenomicFeatures.BED.Reader — Type
BED.Reader(input::IO; index=nothing)
BED.Reader(input::AbstractString; index=:auto)
Create a data reader of the BED file format.
The first argument specifies the data source. When it is a file path that ends with .bgz, the file is treated as a block-compressed file (BGZF), and the function looks for a tabix index file (<filename>.tbi) and reads it if one exists. See http://www.htslib.org/doc/tabix.html for the bgzip and tabix tools.
Arguments
input: data source
index: path to a tabix file
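The `input::IO` form can be sketched as follows, assuming the reader accepts any `IO` object as the signature above suggests. The in-memory `IOBuffer` and its two lines (the transcripts from the Description, with tabs restored) stand in for a real file, and `record_names` is just an illustrative variable.

```julia
using GenomicFeatures

# Two BED lines held in memory instead of a file on disk.
data = IOBuffer("chr9\t68331023\t68424451\tNM_015110\t0\t+\n" *
                "chr9\t68456943\t68486659\tNM_001206\t0\t-\n")

# Wrap the IO object in a reader; no tabix index is involved here.
reader = BED.Reader(data)

# Collect the name field of every record.
record_names = String[]
for record in reader
    push!(record_names, BED.name(record))
end
close(reader)

println(record_names)
```

Reading from an `IO` object is convenient in tests or when the data arrives over a pipe; for files on disk, the `open(BED.Reader, "data.bed")` form from the Examples section is shorter.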
GenomicFeatures.BED.Writer — Type
BED.Writer(output::IO)
Create a data writer of the BED file format.
Arguments
output: data sink
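A minimal sketch of writing one record, under two assumptions that this page does not state: that `write(writer, record)` is defined for BED records, and that closing the writer flushes and closes the underlying file. The path "out.bed" is a placeholder.

```julia
using GenomicFeatures

# Build a record from a tab-delimited line (see BED.Record below).
record = BED.Record("chr9\t68331023\t68424451\tNM_015110\t0\t+")

# Wrap an ordinary file handle; "out.bed" is a placeholder path.
writer = BED.Writer(open("out.bed", "w"))

# Assumption: `write` has a method for (BED.Writer, BED.Record).
write(writer, record)

# Assumption: closing the writer flushes buffered output to the file.
close(writer)
```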
GenomicFeatures.BED.Record — Type
BED.Record()
Create an unfilled BED record.
BED.Record(data::Vector{UInt8})
Create a BED record object from data.
This function verifies and indexes fields for accessors. Note that the ownership of data is transferred to the new record object.
BED.Record(str::AbstractString)
Create a BED record object from str.
This function verifies and indexes fields for accessors.
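The string constructor can be tried without any file, as in this sketch. It builds a record from the first transcript in the Description (with tabs restored) and reads it back through the accessors listed below; the printed values are not shown because they depend on the package's coordinate handling.

```julia
using GenomicFeatures

# Fields: chrom, start, end, name, score, strand.
record = BED.Record("chr9\t68331023\t68424451\tNM_015110\t0\t+")

# Read the fields back through the accessor functions.
println("chrom:  ", BED.chrom(record))
println("start:  ", BED.chromstart(record))
println("end:    ", BED.chromend(record))
println("name:   ", BED.name(record))
println("score:  ", BED.score(record))
println("strand: ", BED.strand(record))
```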
GenomicFeatures.BED.chrom — Function
chrom(record::Record)::String
Get the chromosome name of record.
GenomicFeatures.BED.chromstart — Function
chromstart(record::Record)::Int
Get the starting position of record.
Note that the first base is numbered 1.
GenomicFeatures.BED.chromend — Function
chromend(record::Record)::Int
Get the end position of record.
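The BED format itself stores a start that is 0-based and an end that is exclusive, so a 1-based start such as the one noted above corresponds to the stored start plus one, while the end value is unchanged. The sketch below shows that arithmetic in plain Julia; `to_one_based` is a hypothetical helper for illustration, and whether the package performs exactly this conversion internally is an assumption.

```julia
# Hypothetical helper: convert the 0-based, half-open (start, end) pair
# stored in a BED file into a 1-based, closed range.
to_one_based(start0::Integer, end0::Integer) = (start0 + 1):end0

# First transcript from the Description: chr9 68331023 68424451.
r = to_one_based(68331023, 68424451)
println(first(r), "-", last(r), " spans ", length(r), " bases")
```

Because the stored end is exclusive and the converted end is inclusive, the interval length is the same in both conventions: stored end minus stored start.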
GenomicFeatures.BED.name — Function
name(record::Record)::String
Get the name of record.
GenomicFeatures.BED.score — Function
score(record::Record)::Int
Get the score between 0 and 1000.
GenomicFeatures.BED.strand — Function
strand(record::Record)::GenomicFeatures.Strand
Get the strand of record.
GenomicFeatures.BED.thickstart — Function
thickstart(record::Record)::Int
Get the starting position at which record is drawn thickly.
Note that the first base is numbered 1.
GenomicFeatures.BED.thickend — Function
thickend(record::Record)::Int
Get the end position at which record is drawn thickly.
GenomicFeatures.BED.itemrgb — Function
itemrgb(record::Record)::ColorTypes.RGB
Get the RGB value of record.
The return type is defined in ColorTypes.jl.
GenomicFeatures.BED.blockcount — Function
blockcount(record::Record)::Int
Get the number of blocks (exons) in record.
GenomicFeatures.BED.blocksizes — Function
blocksizes(record::Record)::Vector{Int}
Get the block (exon) sizes of record.
GenomicFeatures.BED.blockstarts — Function
blockstarts(record::Record)::Vector{Int}
Get the block (exon) starts of record.
Note that the first base is numbered 1.
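The block accessors can be combined into absolute exon coordinates. The sketch below is plain Julia and rests on an assumption: that `blockstarts` returns offsets relative to the record's start, numbered from 1 as the note above suggests. `exon_ranges` is a hypothetical helper, not part of the package, and the input values are made up for illustration.

```julia
# Hypothetical helper: absolute 1-based exon ranges.
# Assumes `blockstarts` holds 1-based offsets relative to `chromstart`,
# and that `chromstart` itself is 1-based.
function exon_ranges(chromstart::Int, blocksizes::Vector{Int}, blockstarts::Vector{Int})
    length(blocksizes) == length(blockstarts) ||
        error("block vectors differ in length")
    return [(chromstart + s - 1):(chromstart + s + n - 2)
            for (s, n) in zip(blockstarts, blocksizes)]
end

# Made-up record: starts at base 1001 with two blocks of 100 and 50 bases.
exons = exon_ranges(1001, [100, 50], [1, 201])
for (i, exon) in enumerate(exons)
    println("exon ", i, ": ", first(exon), "-", last(exon))
end
```

With a real record, the three arguments would come from `BED.chromstart(record)`, `BED.blocksizes(record)`, and `BED.blockstarts(record)`.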