FASTQ (.fastq) File Format

Description

 • FASTQ is a plaintext format for storing biological sequences and their associated quality scores. It is similar to the FASTA format but, in addition to the sequence data, it includes a quality score for each element of the sequence.
 • The commands Import and Export support this format.
 • Import produces an n×3 Matrix, where n is the number of sequence records in the file. Each row corresponds to one record:
 – the first column contains a sequence descriptor
 – the second column contains the sequence itself
 – the third column contains quality information
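
To illustrate this row layout outside of Maple, the following Python sketch reads FASTQ text into (descriptor, sequence, quality) triples. It is not how Import is implemented. It assumes the common unwrapped layout of four lines per record (an "@" header line, the sequence, a "+" separator line, and the quality string), and the function name parse_fastq is hypothetical. The sample record is built from the first row of the Matrix in the Examples section, on the assumption that the file stores it in this layout.

```python
def parse_fastq(text):
    """Return a list of (descriptor, sequence, quality) tuples.

    Assumption: every record occupies exactly four non-blank lines
    (header, sequence, separator, quality) with no line wrapping.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four non-blank lines")

    rows = []
    for i in range(0, len(lines), 4):
        header, seq, sep, qual = lines[i:i + 4]
        record_no = i // 4 + 1
        if not header.startswith("@") or not sep.startswith("+"):
            raise ValueError(f"malformed record {record_no}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in record {record_no}")
        # Drop the leading "@" so the descriptor matches the first column.
        rows.append((header[1:], seq, qual))
    return rows


sample = (
    "@Sequence1\n"
    "AACAGGGTTTGTTAAGATGGCAGAG\n"
    "+\n"
    ";;7;<=>=<;;;;:979:;;;;988\n"
)
rows = parse_fastq(sample)
```

Each tuple plays the role of one Matrix row. The length check reflects the requirement that the quality string carries exactly one character per sequence element.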

Details on the FASTQ format

 • The FASTQ format employs the standard IUB/IUPAC conventions for encoding protein or nucleic acid sequences as alphabetic characters. For details on this encoding, consult the FASTA format.
 • The quality string assigns a quality score to each element of the sequence. Each score is encoded as a single ASCII character with a mapping defined by the FASTQ specification.
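
As a concrete illustration of the character mapping, the Sanger variant of FASTQ described in the Cock et al. reference stores each Phred score Q as the character with ASCII code Q + 33. The Python sketch below decodes a quality string under that assumption. The function names and the offset parameter are hypothetical, and some older Illumina variants use an offset of 64, so the offset has to match the variant of the file being read.

```python
def decode_quality(quality, offset=33):
    """Map each quality character to an integer score (ASCII code minus offset).

    Assumption: offset=33 corresponds to the Sanger FASTQ variant.
    """
    scores = [ord(ch) - offset for ch in quality]
    if any(s < 0 for s in scores):
        raise ValueError("character below the offset; the file may use another variant")
    return scores


def error_probability(score):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


scores = decode_quality(";;7;<")
print(scores)  # [26, 26, 22, 26, 27]
```

Under this mapping, a score of 20 corresponds to an estimated error probability of 1 in 100, and a score of 30 to 1 in 1000.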

Notes

 • Content-Type: chemical/seq-na-fastq

Examples

Import a DNA sequence from a FASTQ file.

 > $\mathrm{Import}\left("example/sample.fastq",\mathrm{base}=\mathrm{datadir}\right)$
 $\left[\begin{array}{ccc}{"Sequence1"}& {"AACAGGGTTTGTTAAGATGGCAGAG"}& {";;7;<=>=<;;;;:979:;;;;988"}\\ {"Sequence2"}& {"CAATACACTGAAAATGTCGATGGAT"}& {";;;;;;<=>?@A?><;;;:-;3;83"}\\ {"Sequence3"}& {"CATACACAAACGCCTGAGCCTAGCA"}& {"9393393;7;;;;;:::;;;:;;;9"}\end{array}\right]$ (1)

References

 Cock PJA, Fields CJ, Goto N, Heuer M, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research (2010) 38(6): 1767-1771, doi:10.1093/nar/gkp1137.