BED
Description
BED is a text-based file format for representing genomic annotations like genes, transcripts, and so on. A BED file has tab-delimited and variable-length fields; the first three fields denoting a genomic interval are mandatory.
This is an example of RNA transcripts:
    chr9    68331023    68424451    NM_015110    0    +
    chr9    68456943    68486659    NM_001206    0    -
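To make the field layout concrete, here is a minimal sketch in plain Julia (no package required) that splits one BED line on tabs and checks that the three mandatory fields are present. The helper name `parse_bed_line` is hypothetical and is not part of GenomicFeatures; it only illustrates the layout and does not replace the parser described below.

```julia
# Hypothetical helper for illustration only; not part of GenomicFeatures.
function parse_bed_line(line::AbstractString)
    fields = split(chomp(line), '\t')
    length(fields) >= 3 || error("a BED line needs at least chrom, start, and end")
    chrom = String(fields[1])
    # The BED text format stores a 0-based start and an exclusive end.
    chromstart = parse(Int, fields[2])
    chromend = parse(Int, fields[3])
    # Any remaining columns (name, score, strand, ...) are optional.
    optional = String.(fields[4:end])
    return (chrom = chrom, chromstart = chromstart, chromend = chromend, optional = optional)
end

rec = parse_bed_line("chr9\t68331023\t68424451\tNM_015110\t0\t+")
println(rec.chrom, ":", rec.chromstart, "-", rec.chromend, " ", rec.optional)
```

The sketch keeps the raw coordinates from the file, whereas the accessors documented in the API section report positions with the first base numbered 1.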
I/O tools for BED are provided by the GenomicFeatures.BED module, which exports the following three types:
- Reader type: BED.Reader
- Writer type: BED.Writer
- Element type: BED.Record
Examples
Here is a common workflow to iterate over all records in a BED file:
    # Import the BED module.
    using GenomicFeatures

    # Open a BED file.
    reader = open(BED.Reader, "data.bed")

    # Iterate over records.
    for record in reader
        # Do something on record (see Accessors section).
        chrom = BED.chrom(record)
        # ...
    end

    # Finally, close the reader.
    close(reader)
If you repeatedly access records within specific ranges, it would be more efficient to construct an IntervalCollection object from a BED reader:
    # Create an interval collection in memory.
    icol = open(BED.Reader, "data.bed") do reader
        IntervalCollection(reader)
    end

    # Query overlapping records.
    for interval in eachoverlap(icol, Interval("chrX", 40001, 51500))
        # A record is stored in the metadata field of an interval.
        record = metadata(interval)
        # ...
    end
API
GenomicFeatures.BED.Reader — Type.
BED.Reader(input::IO; index=nothing)
BED.Reader(input::AbstractString; index=:auto)
Create a data reader of the BED file format.
The first argument specifies the data source. When it is a file path that ends with .bgz, it is treated as a block-compressed file (BGZF), and the function will try to find a tabix index file (
Arguments
- input: data source
- index: path to a tabix file
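Because the first signature accepts any `IO`, a reader can also be built over in-memory data, which is convenient for quick experiments. The following is a small sketch under that assumption; the record contents and the variable name `featurenames` are made up for illustration.

```julia
using GenomicFeatures

# Two BED records held in memory instead of a file (illustrative data).
data = IOBuffer("chr1\t10\t20\tfeature1\nchr1\t30\t40\tfeature2\n")
reader = BED.Reader(data)

# Collect the name column of every record.
featurenames = String[]
for record in reader
    push!(featurenames, BED.name(record))
end

close(reader)
println(featurenames)
```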
GenomicFeatures.BED.Writer — Type.
BED.Writer(output::IO)
Create a data writer of the BED file format.
Arguments
- output: data sink
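A possible way to use the writer is sketched below. It assumes that `write(writer, record)` is defined for a `BED.Writer` and a `BED.Record`, and that `flush(writer)` pushes buffered output to the sink; neither call is documented in this section, so treat both as assumptions to check against the installed version.

```julia
using GenomicFeatures

# Use an in-memory buffer as the data sink.
buffer = IOBuffer()
writer = BED.Writer(buffer)

# Assumption: `write` accepts a BED.Writer and a BED.Record.
write(writer, BED.Record("chr1\t10\t20\tfeature1"))

# Assumption: `flush` forwards buffered output to the underlying sink.
flush(writer)

output = String(take!(buffer))
print(output)
```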
GenomicFeatures.BED.Record — Type.
BED.Record()
Create an unfilled BED record.
BED.Record(data::Vector{UInt8})
Create a BED record object from data.
This function verifies and indexes fields for accessors. Note that the ownership of data is transferred to a new record object.
BED.Record(str::AbstractString)
Create a BED record object from str.
This function verifies and indexes fields for accessors.
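As a short illustration, a record can be built from one line of text and inspected with the accessors documented in the rest of this section. This sketch reuses the first transcript from the example in the description and joins its fields with tabs.

```julia
using GenomicFeatures

# Build a record from a single tab-delimited BED line.
record = BED.Record("chr9\t68331023\t68424451\tNM_015110\t0\t+")

# Read individual fields through the accessors.
println(BED.chrom(record))       # chromosome name
println(BED.chromstart(record))  # start position, with the first base numbered 1
println(BED.chromend(record))    # end position
println(BED.name(record))        # record name
println(BED.score(record))       # score
println(BED.strand(record))      # strand
```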
GenomicFeatures.BED.chrom — Function.
chrom(record::Record)::String
Get the chromosome name of record.
GenomicFeatures.BED.chromstart — Function.
chromstart(record::Record)::Int
Get the starting position of record.
Note that the first base is numbered 1.
GenomicFeatures.BED.chromend — Function.
chromend(record::Record)::Int
Get the end position of record.
GenomicFeatures.BED.name — Function.
name(record::Record)::String
Get the name of record.
GenomicFeatures.BED.score — Function.
score(record::Record)::Int
Get the score of record, an integer between 0 and 1000.
GenomicFeatures.BED.strand — Function.
strand(record::Record)::GenomicFeatures.Strand
Get the strand of record.
GenomicFeatures.BED.thickstart — Function.
thickstart(record::Record)::Int
Get the starting position at which record is drawn thickly.
Note that the first base is numbered 1.
GenomicFeatures.BED.thickend — Function.
thickend(record::Record)::Int
Get the end position at which record is drawn thickly.
GenomicFeatures.BED.itemrgb — Function.
itemrgb(record::Record)::ColorTypes.RGB
Get the RGB value of record.
The return type is defined in ColorTypes.jl.
GenomicFeatures.BED.blockcount — Function.
blockcount(record::Record)::Int
Get the number of blocks (exons) in record.
GenomicFeatures.BED.blocksizes — Function.
blocksizes(record::Record)::Vector{Int}
Get the block (exon) sizes of record.
GenomicFeatures.BED.blockstarts — Function.
blockstarts(record::Record)::Vector{Int}
Get the block (exon) starts of record.
Note that the first base is numbered 1.
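The block accessors can be combined to recover the coordinates of each block (exon). The sketch below uses a made-up 12-column record and assumes that the values returned by `blockstarts` are 1-based offsets relative to `chromstart`, as the note above suggests; if the installed version behaves differently, the arithmetic needs adjusting.

```julia
using GenomicFeatures

# A made-up 12-column record with two blocks (illustrative data).
record = BED.Record("chr1\t1000\t2000\tgene1\t0\t+\t1200\t1800\t255,0,0\t2\t100,200\t0,800")

# Assumption: blockstarts are 1-based offsets relative to chromstart,
# so a block starts at chromstart + offset - 1 and spans `size` bases.
first_base = BED.chromstart(record)
exons = [(first_base + offset - 1, first_base + offset + size - 2)
         for (offset, size) in zip(BED.blockstarts(record), BED.blocksizes(record))]

for (exonstart, exonend) in exons
    println(BED.chrom(record), ":", exonstart, "-", exonend)
end
```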