bigBed
Description
bigBed is a binary file format for representing genomic annotations and is often created from BED files. bigBed files are indexed so that records in a specific region can be fetched quickly.
I/O tools for bigBed are provided by the GenomicFeatures.BigBed module, which exports the following three types:
- Reader type: BigBed.Reader
- Writer type: BigBed.Writer
- Element type: BigBed.Record
Examples
A common workflow is to open a file, iterate over records, and close the file:
    # Import the BigBed module.
    using GenomicFeatures

    # Open a bigBed file.
    reader = open(BigBed.Reader, "data.bb")

    # Iterate over records overlapping with a query interval.
    for record in eachoverlap(reader, Interval("Chr2", 5001, 6000))
        # Extract the chromosome name, start position and end position of the record,
        chrom = BigBed.chrom(record)
        startpos = BigBed.chromstart(record)
        endpos = BigBed.chromend(record)
        # and do something...
    end

    # Finally, close the reader.
    close(reader)
Iterating over all records is also supported:
    reader = open(BigBed.Reader, "data.bb")
    for record in reader
        # ...
    end
    close(reader)
The write call takes a tuple of 3 to 12 elements: chromosome name, start position, end position, name, score, strand, thickstart, thickend, RGB color, blockcount, blocksizes and blockstarts. The first three fields are mandatory; the others are optional. A bigBed file can be created as follows:
    # Import the BigBed module and the RGB type.
    using GenomicFeatures
    using ColorTypes

    file = open("data.bb", "w")
    writer = BigBed.Writer(file, [("chr1", 1000)])
    write(writer, ("chr1", 1, 100, "some name", 100, '+', 10, 90, RGB(0.5, 0.1, 0.2), 2, [4, 10], [10, 20]))
    close(writer)
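Since only the first three fields are mandatory, a record can presumably be written as a plain 3-tuple. The following is a minimal sketch under that assumption, using only the calls shown on this page; the temporary path and the coordinates are hypothetical, and every record is given the same number of fields to stay on the safe side.

```julia
using GenomicFeatures

# Hypothetical output location; any writable path would do.
path = tempname() * ".bb"

# Write two records that carry only the mandatory fields
# (chromosome name, start position, end position).
writer = BigBed.Writer(open(path, "w"), [("chr1", 1000)])
write(writer, ("chr1", 1, 100))
write(writer, ("chr1", 201, 300))
close(writer)

# Read back the records overlapping a query interval.
reader = open(BigBed.Reader, path)
for record in eachoverlap(reader, Interval("chr1", 50, 250))
    println(BigBed.chrom(record), ":",
            BigBed.chromstart(record), "-", BigBed.chromend(record))
end
close(reader)
```

Accessors for optional fields such as name or score are deliberately not called here, because these records do not carry them.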
Accessors
GenomicFeatures.BigBed.Reader
— Type.
BigBed.Reader(input::IO)
Create a reader for the bigBed file format.
Note that input must be seekable.
GenomicFeatures.BigBed.chromlist
— Function.
chromlist(reader::BigBed.Reader)::Vector{Tuple{String,Int}}
Get the (name, length) pairs of chromosomes/contigs.
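As a rough illustration of chromlist, the sketch below writes a small file with the Writer example from this page and then lists the chromosomes recorded in it. The temporary path and the data are hypothetical, and no particular ordering of the returned pairs is assumed.

```julia
using GenomicFeatures

# Hypothetical file, created here so the example is self-contained.
path = tempname() * ".bb"
writer = BigBed.Writer(open(path, "w"), [("chr1", 12345), ("chr2", 9100)])
write(writer, ("chr1", 101, 150, "gene 1"))
write(writer, ("chr2", 211, 250, "gene 2"))
close(writer)

# Reopen the file and print each (name, length) pair.
reader = open(BigBed.Reader, path)
chroms = BigBed.chromlist(reader)
for (chromname, chromlen) in chroms
    println(chromname, "\t", chromlen)
end
close(reader)
```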
GenomicFeatures.BigBed.Writer
— Type.
BigBed.Writer(output::IO, chromlist; binsize=64)
Create a data writer for the bigBed file format.
Arguments
- output: data sink
- chromlist: chromosome list with length
- binsize=64: size of a zoom with the highest resolution
Examples
    output = open("data.bb", "w")
    writer = BigBed.Writer(output, [("chr1", 12345), ("chr2", 9100)])
    write(writer, ("chr1", 101, 150, "gene 1"))
    write(writer, ("chr2", 211, 250, "gene 2"))
    close(writer)
GenomicFeatures.BigBed.Record
— Type.
BigBed.Record()
Create an unfilled bigBed record.
GenomicFeatures.BigBed.chrom
— Function.
chrom(record::Record)::String
Get the chromosome name of record.
GenomicFeatures.BigBed.chromid
— Function.
chromid(record::Record)::UInt32
Get the chromosome ID of record.
GenomicFeatures.BigBed.chromstart
— Function.
chromstart(record::Record)::Int
Get the start position of record.
GenomicFeatures.BigBed.chromend
— Function.
chromend(record::Record)::Int
Get the end position of record.
GenomicFeatures.BigBed.name
— Function.
name(record::Record)::String
Get the name of record.
GenomicFeatures.BigBed.score
— Function.
score(record::Record)::Int
Get the score of record, a value between 0 and 1000.
GenomicFeatures.BigBed.strand
— Function.
strand(record::Record)::GenomicFeatures.Strand
Get the strand of record.
GenomicFeatures.BigBed.thickstart
— Function.
thickstart(record::Record)::Int
Get the starting position at which record is drawn thickly.
Note that the first base is numbered 1.
GenomicFeatures.BigBed.thickend
— Function.
thickend(record::Record)::Int
Get the end position at which record is drawn thickly.
GenomicFeatures.BigBed.itemrgb
— Function.
itemrgb(record::Record)::ColorTypes.RGB
Get the RGB value of record.
The return type is defined in ColorTypes.jl.
GenomicFeatures.BigBed.blockcount
— Function.
blockcount(record::Record)::Int
Get the number of blocks (exons) in record.
GenomicFeatures.BigBed.blocksizes
— Function.
blocksizes(record::Record)::Vector{Int}
Get the block (exon) sizes of record.
GenomicFeatures.BigBed.blockstarts
— Function.
blockstarts(record::Record)::Vector{Int}
Get the block (exon) starts of record.
Note that the first base is numbered 1.
GenomicFeatures.BigBed.optionals
— Function.
optionals(record::Record)::Vector{String}
Get optional fields as strings.
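To show how the accessors above fit together, here is a minimal sketch that writes one full 12-field record (the same tuple as in the writing example) and reads each field back. The temporary path is hypothetical, and the sketch assumes that every optional field written is available through its accessor on reading; no output is shown because the exact printed form depends on the package version.

```julia
using GenomicFeatures
using ColorTypes  # provides the RGB type used for the item color

# Hypothetical file holding a single 12-field record.
path = tempname() * ".bb"
writer = BigBed.Writer(open(path, "w"), [("chr1", 1000)])
write(writer, ("chr1", 1, 100, "some name", 100, '+', 10, 90,
               RGB(0.5, 0.1, 0.2), 2, [4, 10], [10, 20]))
close(writer)

# Read the record back and print each field via its accessor.
reader = open(BigBed.Reader, path)
for record in reader
    println("chrom:       ", BigBed.chrom(record))
    println("chromstart:  ", BigBed.chromstart(record))
    println("chromend:    ", BigBed.chromend(record))
    println("name:        ", BigBed.name(record))
    println("score:       ", BigBed.score(record))
    println("strand:      ", BigBed.strand(record))
    println("thickstart:  ", BigBed.thickstart(record))
    println("thickend:    ", BigBed.thickend(record))
    println("itemrgb:     ", BigBed.itemrgb(record))
    println("blockcount:  ", BigBed.blockcount(record))
    println("blocksizes:  ", BigBed.blocksizes(record))
    println("blockstarts: ", BigBed.blockstarts(record))
end
close(reader)
```

For records written with fewer fields, the accessors for the missing optional fields should not be called without first knowing that the field is present.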