bigBed
Description
bigBed is a binary file format for representing genomic annotations and is often created from BED files. bigBed files are indexed so that specific regions can be fetched quickly.
I/O tools for bigBed are provided by the GenomicFeatures.BigBed module, which exports the following three types:
- Reader type: BigBed.Reader
- Writer type: BigBed.Writer
- Element type: BigBed.Record
Examples
A common workflow is to open a file, iterate over records, and close the file:
    # Import the BigBed module.
    using GenomicFeatures

    # Open a bigBed file.
    reader = open(BigBed.Reader, "data.bb")

    # Iterate over records overlapping with a query interval.
    for record in eachoverlap(reader, Interval("Chr2", 5001, 6000))
        # Extract the chromosome name, start position and end position of the record,
        chrom = BigBed.chrom(record)
        startpos = BigBed.chromstart(record)
        endpos = BigBed.chromend(record)
        # and do something...
    end

    # Finally, close the reader.
    close(reader)
Iterating over all records is also supported:
    reader = open(BigBed.Reader, "data.bb")
    for record in reader
        # ...
    end
    close(reader)
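As a rough sketch of what a full pass over a file might look like, the helper below tallies records per chromosome. The name `count_by_chrom` is hypothetical and not part of GenomicFeatures; the sketch relies only on `open`, iteration, `BigBed.chrom` and `close` as described on this page.

```julia
using GenomicFeatures

# Hypothetical helper, not part of GenomicFeatures:
# count how many records each chromosome has in a bigBed file.
function count_by_chrom(path::AbstractString)
    counts = Dict{String,Int}()
    reader = open(BigBed.Reader, path)
    try
        for record in reader
            chrom = BigBed.chrom(record)
            counts[chrom] = get(counts, chrom, 0) + 1
        end
    finally
        # Close the reader even if iteration throws.
        close(reader)
    end
    return counts
end
```

Calling `count_by_chrom("data.bb")` would return a dictionary mapping each chromosome name to its record count. The `try`/`finally` block is a design choice to avoid leaking the file handle on errors.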
Creating a bigBed file can be done as follows. The write call takes a tuple of 3 to 12 elements (i.e. chromosome name, start position, end position, name, score, strand, thickstart, thickend, RGB color, blockcount, blocksizes and blockstarts). The first three fields are mandatory; the others are optional.
    # Import the BigBed module and the RGB type.
    using GenomicFeatures
    using ColorTypes

    file = open("data.bb", "w")
    writer = BigBed.Writer(file, [("chr1", 1000)])
    write(writer, ("chr1", 1, 100, "some name", 100, '+', 10, 90, RGB(0.5, 0.1, 0.2), 2, [4, 10], [10, 20]))
    close(writer)
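When only the three mandatory fields are available, writing a whole collection might be sketched as below. The helper name `write_intervals` is hypothetical. The sketch also assumes that the writer expects records grouped by chromosome and ordered by start position, and that lexicographic sorting of the tuples yields an acceptable order; neither point is stated on this page.

```julia
using GenomicFeatures

# Hypothetical helper, not part of GenomicFeatures:
# write (chromosome, start, end) tuples to a bigBed file.
function write_intervals(path::AbstractString, chromlist, intervals)
    output = open(path, "w")
    writer = BigBed.Writer(output, chromlist)
    # Assumption: records should be written in sorted order.
    # Tuples sort by chromosome name first, then by start position.
    for interval in sort(intervals)
        write(writer, interval)
    end
    # As in the examples on this page, only the writer is closed.
    close(writer)
    return path
end
```

A call such as `write_intervals("data.bb", [("chr1", 1000)], [("chr1", 200, 300), ("chr1", 1, 100)])` would then write the two records starting with the one at position 1.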
API
GenomicFeatures.BigBed.Reader — Type.
BigBed.Reader(input::IO)
Create a reader for the bigBed file format.
Note that input must be seekable.
GenomicFeatures.BigBed.chromlist — Function.
chromlist(reader::BigBed.Reader)::Vector{Tuple{String,Int}}
Get the (name, length) pairs of chromosomes/contigs.
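A minimal sketch of how the chromosome list could be turned into a lookup table follows. The helper name `chromlengths` is hypothetical; the only library calls are `open`, `BigBed.chromlist` and `close`.

```julia
using GenomicFeatures

# Hypothetical helper, not part of GenomicFeatures:
# map each chromosome/contig name to its length.
function chromlengths(path::AbstractString)
    reader = open(BigBed.Reader, path)
    try
        # chromlist returns (name, length) tuples, which Dict accepts directly.
        return Dict(BigBed.chromlist(reader))
    finally
        close(reader)
    end
end
```

With such a table, a query interval can be checked against the chromosome length before it is passed to `eachoverlap`.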
GenomicFeatures.BigBed.Writer — Type.
BigBed.Writer(output::IO, chromlist; binsize=64)
Create a data writer of the bigBed file format.
Arguments
- output: data sink
- chromlist: chromosome list with length
- binsize=64: size of a zoom with the highest resolution
Examples
    output = open("data.bb", "w")
    writer = BigBed.Writer(output, [("chr1", 12345), ("chr2", 9100)])
    write(writer, ("chr1", 101, 150, "gene 1"))
    write(writer, ("chr2", 211, 250, "gene 2"))
    close(writer)
GenomicFeatures.BigBed.Record — Type.
BigBed.Record()
Create an unfilled bigBed record.
GenomicFeatures.BigBed.chrom — Function.
chrom(record::Record)::String
Get the chromosome name of record.
GenomicFeatures.BigBed.chromid — Function.
chromid(record::Record)::UInt32
Get the chromosome ID of record.
GenomicFeatures.BigBed.chromstart — Function.
chromstart(record::Record)::Int
Get the start position of record.
GenomicFeatures.BigBed.chromend — Function.
chromend(record::Record)::Int
Get the end position of record.
GenomicFeatures.BigBed.name — Function.
name(record::Record)::String
Get the name of record.
GenomicFeatures.BigBed.score — Function.
score(record::Record)::Int
Get the score of record, an integer between 0 and 1000.
GenomicFeatures.BigBed.strand — Function.
strand(record::Record)::GenomicFeatures.Strand
Get the strand of record.
GenomicFeatures.BigBed.thickstart — Function.
thickstart(record::Record)::Int
Get the starting position at which record is drawn thickly.
Note that the first base is numbered 1.
GenomicFeatures.BigBed.thickend — Function.
thickend(record::Record)::Int
Get the end position at which record is drawn thickly.
GenomicFeatures.BigBed.itemrgb — Function.
itemrgb(record::Record)::ColorTypes.RGB
Get the RGB value of record.
The return type is defined in ColorTypes.jl.
GenomicFeatures.BigBed.blockcount — Function.
blockcount(record::Record)::Int
Get the number of blocks (exons) in record.
GenomicFeatures.BigBed.blocksizes — Function.
blocksizes(record::Record)::Vector{Int}
Get the block (exon) sizes of record.
GenomicFeatures.BigBed.blockstarts — Function.
blockstarts(record::Record)::Vector{Int}
Get the block (exon) starts of record.
Note that the first base is numbered 1.
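To illustrate how block starts and sizes might be combined, the sketch below converts them into absolute position ranges. It is plain Julia and the helper name `blockranges` is hypothetical. It assumes that block starts are 1-based offsets relative to the start of the record, so that an offset of 1 refers to the first base of the record; this page does not confirm that interpretation, so check it against real data before relying on it.

```julia
# Hypothetical helper, not part of GenomicFeatures.
# Assumption: `starts` are 1-based offsets relative to `chromstart`,
# so an offset of 1 refers to the first base of the record.
function blockranges(chromstart::Integer, starts, sizes)
    return [(chromstart + s - 1):(chromstart + s + n - 2) for (s, n) in zip(starts, sizes)]
end

# Example with made-up values: a record starting at 1001 with two blocks.
ranges = blockranges(1001, [1, 51], [10, 20])
# ranges == [1001:1010, 1051:1070]
```

For a real record, the call would be `blockranges(BigBed.chromstart(record), BigBed.blockstarts(record), BigBed.blocksizes(record))`.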
GenomicFeatures.BigBed.optionals — Function.
optionals(record::Record)::Vector{String}
Get optional fields as strings.
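One possible use of `optionals` is to label records without failing when optional fields are absent. The sketch below assumes that the returned vector follows the field order used by the writer, so that the first optional field is the name; the helper name `record_labels` and the `"(unnamed)"` placeholder are hypothetical.

```julia
using GenomicFeatures

# Hypothetical helper, not part of GenomicFeatures: return one label per record.
# Assumption: the first optional field, when present, is the name field.
function record_labels(path::AbstractString)
    labels = String[]
    reader = open(BigBed.Reader, path)
    try
        for record in reader
            fields = BigBed.optionals(record)
            # Fall back to a placeholder when the record has no optional fields.
            push!(labels, isempty(fields) ? "(unnamed)" : fields[1])
        end
    finally
        close(reader)
    end
    return labels
end
```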