FASTQ formatted files
FASTQ is a text-based file format for representing DNA sequences along with qualities for each base. A FASTQ file stores a list of sequence records in the following format:
@{name} {description}?
{sequence}
+
{qualities}
Here is an example of one record from a FASTQ file:
@FSRRS4401BE7HA
tcagTTAAGATGGGAT
+
###EEEEEEEEE##E#
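To make the four-line layout concrete, here is a minimal sketch in plain Julia that splits one unwrapped record into its fields. The helper name `parse_fastq_record` is hypothetical and is not part of FASTX; it only illustrates the format described above.

```julia
# Hypothetical helper, not part of FASTX: split one unwrapped four-line
# FASTQ record into its fields.
function parse_fastq_record(text::AbstractString)
    lines = split(chomp(text), '\n')
    length(lines) == 4 || error("expected exactly four lines")
    header, sequence, separator, qualities = lines
    startswith(header, "@") || error("header line must start with '@'")
    startswith(separator, "+") || error("third line must start with '+'")
    length(sequence) == length(qualities) ||
        error("sequence and quality lengths differ")
    # The name runs up to the first space; the rest is the optional description.
    parts = split(header[2:end], ' '; limit = 2)
    description = length(parts) == 2 ? parts[2] : nothing
    return (name = parts[1], description = description,
            sequence = sequence, qualities = qualities)
end

record = parse_fastq_record("@FSRRS4401BE7HA\ntcagTTAAGATGGGAT\n+\n###EEEEEEEEE##E#\n")
```

The sketch checks the one invariant the format imposes on every record: the quality string has exactly one character per base.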
Readers and Writers
The reader and writer for FASTQ formatted files are found within the FASTQ module of FASTX.
They can be created with IOStreams:
using FASTX
r = FASTQ.Reader(open("my-reads.fastq", "r"))
w = FASTQ.Writer(open("my-output.fastq", "w"))
Alternatively, Base.open is overloaded with a method for convenience:
r = open(FASTQ.Reader, "my-reads.fastq")
w = open(FASTQ.Writer, "my-out.fastq")
Note that FASTQ.Reader does not support line-wraps within sequence and quality.
Usually, sequence records are read sequentially from a file by iteration:
reader = open(FASTQ.Reader, "my-reads.fastq")
for record in reader
    # Do something
end
close(reader)
You can also overwrite records in a while loop to avoid excessive memory allocation.
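reader = open(FASTQ.Reader, "my-reads.fastq")
record = FASTQ.Record()  # allocated once and reused for every read
while !eof(reader)
    read!(reader, record)  # overwrites the contents of `record` in place
    # Do something.
end
close(reader)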
Reading a record from a FASTQ formatted file gives you a value of type FASTQ.Record.
Various getters and setters are available for FASTQ.Record values:
FASTQ.hasidentifier
FASTQ.identifier
FASTQ.hasdescription
FASTQ.description
FASTQ.hassequence
FASTQ.sequence
FASTQ.hasquality
FASTQ.quality
To write a BioSequence to a FASTQ file, you first have to create a FASTQ.Record.
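A minimal sketch of that step, under stated assumptions: it assumes a three-argument constructor of the form `FASTQ.Record(identifier, sequence, quality)` that takes the qualities as a vector of Phred scores, and that `write` accepts a `FASTQ.Writer` and a record. The identifier `read1`, the scores, and the temporary path are illustrative only; check the constructor signature against your installed FASTX version.

```julia
using FASTX
using BioSequences

# Assumed constructor form: FASTQ.Record(identifier, sequence, quality),
# with one Phred score per base.
seq = dna"ACGT"
record = FASTQ.Record("read1", seq, [30, 30, 30, 30])

# Write the record to a temporary file (the path is illustrative).
path = joinpath(mktempdir(), "my-out.fastq")
w = open(FASTQ.Writer, path)
write(w, record)
close(w)  # closing flushes buffered output to disk
```

The sequence and the quality vector must have the same length, since each base needs exactly one score.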
As always with Julia IO types, remember to close your file readers and writers after you are finished.
Using open with a do-block can help ensure you close a stream after you are finished:
open(FASTQ.Reader, "my-reads.fastq") do reader
    for record in reader
        # Do something
    end
end
Quality encodings
FASTQ records have a quality string whose encoding is platform-dependent. The FASTQ submodule has encoding and decoding support for several quality encodings. These can be used with a FASTQ.quality method to ensure the correct quality score values are extracted from your FASTQ quality strings.
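To show why the encoding matters, here is a minimal sketch in plain Julia of how a quality string maps to Phred scores. It assumes the common convention that each score is the character's ASCII code minus an offset: 33 for Sanger and recent Illumina output, 64 for older Illumina 1.3 and 1.5 output. The helpers `decode_phred` and `error_probability` are hypothetical and are not part of FASTX.

```julia
# Hypothetical helper, not part of FASTX: decode an ASCII quality string
# into Phred scores by subtracting the encoding offset from each byte.
function decode_phred(quality::AbstractString; offset::Integer = 33)
    return [Int(byte) - offset for byte in codeunits(quality)]
end

# A Phred score Q corresponds to an error probability of 10^(-Q/10).
error_probability(q::Real) = 10.0^(-q / 10)

# Decode the quality string from the example record with offset 33.
scores = decode_phred("###EEEEEEEEE##E#")
```

Decoding the same string with the wrong offset shifts every score by the difference between the offsets, which is why the encoding has to be chosen to match the sequencing platform.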