FASTQ formatted files
FASTQ is a text-based file format for representing DNA sequences along with qualities for each base. A FASTQ file stores a list of sequence records in the following format:
@{name} {description}?
{sequence}
+
{qualities}

Here is an example of one record from a FASTQ file:
@FSRRS4401BE7HA
tcagTTAAGATGGGAT
+
###EEEEEEEEE##E#

Readers and Writers
The reader and writer for FASTQ formatted files are found within the FASTQ module of FASTX.
They can be created with IOStreams:
using FASTX
r = FASTQ.Reader(open("my-reads.fastq", "r"))
w = FASTQ.Writer(open("my-output.fastq", "w"))

Alternatively, Base.open is overloaded with a method for convenience:
r = open(FASTQ.Reader, "my-reads.fastq")
w = open(FASTQ.Writer, "my-out.fastq")

Note that FASTQ.Reader does not support line-wraps within sequence and quality. Usually sequence records will be read sequentially from a file by iteration.
reader = open(FASTQ.Reader, "my-reads.fastq")
for record in reader
    ## Do something
end
close(reader)

You can also overwrite records in a while loop to avoid excessive memory allocation.
reader = open(FASTQ.Reader, "my-reads.fastq")
record = FASTQ.Record()
while !eof(reader)
    read!(reader, record)
    ## Do something.
end
close(reader)

Reading in a record from a FASTQ formatted file will give you a variable of type FASTQ.Record.
Various getters and setters are available for FASTQ.Records:
FASTQ.hasidentifier
FASTQ.identifier
FASTQ.hasdescription
FASTQ.description
FASTQ.hassequence
FASTQ.sequence
FASTQ.hasquality
FASTQ.quality
To write a BioSequence to a FASTQ file, you first have to create a FASTQ.Record.
As always with Julia IO types, remember to close your file readers and writers after you are finished.
Using open with a do-block can help ensure you close a stream after you are finished.
open(FASTQ.Reader, "my-reads.fastq") do reader
    for record in reader
        ## Do something
    end
end

Quality encodings
Each FASTQ record has a quality string whose encoding is platform dependent. The FASTQ submodule has encoding and decoding support for the following quality encodings. These can be used with a FASTQ.quality method to ensure the correct quality score values are extracted from your FASTQ quality strings.
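
To illustrate what a quality encoding means in practice, here is a minimal sketch in plain Julia that decodes a quality string by subtracting an ASCII offset. It does not use FASTX. The helper names decode_phred and error_probability are hypothetical and exist only for this example. It assumes the common offset conventions: 33 for Sanger and Illumina 1.8+, 64 for Illumina 1.3 to 1.7.

```julia
# Hypothetical helper, not part of FASTX: turn each quality character into a
# Phred score by subtracting an ASCII offset (assumed 33 unless stated otherwise).
function decode_phred(quality::AbstractString; offset::Integer = 33)
    scores = Vector{Int}(undef, ncodeunits(quality))
    for (i, byte) in enumerate(codeunits(quality))
        score = Int(byte) - offset
        # A negative score means the string was not written with this offset.
        score >= 0 || throw(ArgumentError("quality character '$(Char(byte))' is below offset $offset"))
        scores[i] = score
    end
    return scores
end

# A Phred score Q corresponds to an error probability of 10^(-Q/10).
error_probability(q::Real) = 10.0^(-q / 10)

# The quality string from the example record above.
# '#' is ASCII 35 and 'E' is ASCII 69, so with offset 33 they decode to 2 and 36.
scores = decode_phred("###EEEEEEEEE##E#")
```

Decoding the same string with the wrong offset, for example decode_phred("###EEEEEEEEE##E#"; offset = 64), raises an error in this sketch because '#' falls below 64. This is why the encoding has to be chosen to match the sequencing platform.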