Quality Control: FastQC

This page takes an estimated 1 hour to complete.

Modern high-throughput sequencers can generate tens of millions of sequences in a single run. Before analyzing these sequences to draw biological conclusions, you should always perform some simple quality control checks to confirm that the raw data looks good and contains no problems or biases that could affect downstream analysis.

Theory

Most sequencers will generate a QC report as part of their analysis pipeline, but this is usually only focused on identifying problems that were generated by the sequencer itself. FastQC aims to provide a QC report which can spot problems that originate either in the sequencer or in the starting library material.
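To make the idea of a QC check concrete, here is a minimal Python sketch of one kind of measurement a report like this summarizes: the mean base quality at each read position. It is an illustration only, not FastQC's own implementation. The function name `mean_quality_per_position` and the two example reads are made up, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def mean_quality_per_position(handle, offset=33):
    """Return the mean Phred score at each read position of a FASTQ stream.

    Assumes each record is exactly four lines (header, sequence, '+', quality)
    and that qualities use the Phred+33 ASCII encoding.
    """
    totals = []  # sum of quality scores seen at each position
    counts = []  # number of reads that reach each position
    while True:
        header = handle.readline()
        if not header:
            break  # end of input
        handle.readline()  # sequence line (not needed for this check)
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        for i, char in enumerate(quality):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


# Two made-up 4-base reads: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\nII##\n")
print(mean_quality_per_position(example))  # [40.0, 40.0, 21.0, 21.0]
```

A drop in mean quality toward the end of the reads, as in this toy example, is the kind of pattern a QC report is meant to surface before any downstream analysis.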

We will use FastQC throughout this guide, so review the following material thoroughly.

Materials

You can use the following files, which are similar to the examples presented in the video.

The fastq files are stored in the following directory on the server:

/home/shared/ngs_data/fastq
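As a rough sketch of how FastQC could be run on these files from Python, the helper below builds a `fastqc` command line and executes it. Several things here are assumptions rather than part of this guide: that the `fastqc` executable is on your `PATH`, that the files end in `.fastq` or `.fastq.gz`, and the output directory name `fastqc_results`. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

FASTQ_DIR = Path("/home/shared/ngs_data/fastq")  # directory given above


def build_fastqc_command(fastq_files, out_dir):
    """Build the argument list for one FastQC run over several files."""
    return ["fastqc", "--outdir", str(out_dir)] + [str(f) for f in fastq_files]


def run_fastqc(fastq_dir=FASTQ_DIR, out_dir="fastqc_results"):
    """Run FastQC on every FASTQ file found in fastq_dir."""
    # Assumed naming convention: *.fastq or *.fastq.gz
    files = sorted(Path(fastq_dir).glob("*.fastq*"))
    if not files:
        raise FileNotFoundError(f"no FASTQ files found in {fastq_dir}")
    if shutil.which("fastqc") is None:
        raise RuntimeError("the fastqc executable was not found on PATH")
    out = Path(out_dir)
    # FastQC does not create the output directory itself, so create it first.
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(files, out), check=True)
    return out
```

Calling `run_fastqc()` on the server should leave one HTML report and one zip archive per input file in the output directory; open the HTML files in a browser to inspect the results.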

FastQC Documentation (.pptx below)

Using FastQC to Check the Quality of High Throughput Sequence

Congratulations!

You have successfully completed this section. Move to the next portion of the guide with the arrow buttons below.
