Alignment and Sequence Variant Detection

This page takes an estimated 35 minutes to complete.

Theory

Genomic variations are differences in the DNA sequence between individuals within a species. There are several different types of genomic variations, including single nucleotide polymorphisms (SNPs), insertions and deletions, copy number variations (CNVs), and structural variations.

SNPs are the most common type of genomic variation. They occur when a single nucleotide (A, T, C, or G) at a given position in the DNA sequence differs between individuals. Their effects range from no impact on an organism's characteristics to significant changes, for example when a SNP in a coding region changes an amino acid in the resulting protein.

Insertions and deletions (often called indels) are changes in the DNA sequence that involve the addition or removal of nucleotides. These variations can be small, involving just a few nucleotides, or large, involving thousands of nucleotides.
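To make the difference between SNPs, insertions, and deletions concrete, here is a minimal Python sketch that labels a variant by comparing the reference allele with the alternate allele. The function name `classify_variant` and the labels it returns are illustrative choices, not part of any standard tool, and the sketch does not cover CNVs or structural variations.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a simple variant by comparing reference and alternate alleles."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"        # one base replaced by another
    if len(ref) < len(alt):
        return "insertion"  # the alternate allele has extra bases
    if len(ref) > len(alt):
        return "deletion"   # the alternate allele has lost bases
    return "MNP"            # same length, several bases substituted


# Toy allele pairs, invented for illustration only
for ref, alt in [("A", "G"), ("A", "ATC"), ("GTC", "G"), ("AT", "GC")]:
    print(ref, alt, classify_variant(ref, alt))
```

The same comparison can be applied to the REF and ALT columns of a VCF file once you start working with real variant calls.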

Copy number variations (CNVs) are changes in the number of copies of a particular section of DNA, which arise when that section is duplicated or deleted. Depending on what the affected section contains, a CNV can have no impact on an organism's characteristics or cause significant changes.

Structural variations are large-scale changes in the organization of the genome. They involve the rearrangement of large sections of DNA, for example inversions, translocations, and large insertions, deletions, or duplications. Like the other variant types, they can have a range of effects, from no impact on an organism's characteristics to significant changes.

Overall, genomic variations are an important source of diversity within a species, and they can have a range of effects on an organism's characteristics and behavior.

Read the following chapter to gain a general understanding of the use of variant identification and analysis in bioinformatics and genomics.

Practice

The bioinformatics workflow for detecting sequence variations starts with aligning the reads to a reference genome. This is followed by variant calling, in which positions where the aligned reads differ from the reference are identified and reported. To learn more about the pipeline, watch the following videos from the Coursera course, and return here afterward:
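The two stages described above (alignment, then variant calling) can be sketched as a short Python script that only builds and prints the shell commands without running them. The choice of `bwa`, `samtools`, and `bcftools` is an assumption made for illustration, since the videos may use different tools, and the output prefix `example` is a hypothetical name.

```python
import shlex
from typing import List


def build_pipeline(reference: str, reads: str, prefix: str) -> List[str]:
    """Return shell commands for a basic align-then-call workflow (dry run)."""
    ref, fq = shlex.quote(reference), shlex.quote(reads)
    bam = shlex.quote(prefix + ".sorted.bam")
    vcf = shlex.quote(prefix + ".vcf")
    return [
        # 1. Index the reference so the aligner can search it
        f"bwa index {ref}",
        # 2. Align the reads and sort the alignments by position
        f"bwa mem {ref} {fq} | samtools sort -o {bam} -",
        # 3. Index the sorted BAM file
        f"samtools index {bam}",
        # 4. Pile up the aligned reads and call variants into a VCF file
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -Ov -o {vcf}",
    ]


commands = build_pipeline(
    "/home/shared/ngs_data/fasta/Sars_cov_2.fa",
    "/home/shared/ngs_data/fastq/example.fastq",
    "example",  # hypothetical output prefix
)
for command in commands:
    print(command)
```

Printing the commands first lets you check paths and options before running anything on the server. Indexing writes files next to the reference, so you may need to copy the FASTA file into your own directory first.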

Materials

You can use the following files, which are similar to the examples presented in the video.

The files are stored in the following directory on the server:

VCF files

/home/shared/ngs_data/vcf
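If you want to inspect a VCF file without a dedicated library, the following minimal Python sketch reads the eight fixed columns of each data line. The sample records and the chromosome name `chrDemo` are invented for illustration. To read a real file, pass an open file handle from the directory above instead of the sample text.

```python
from typing import Dict, Iterable, Iterator

# The eight fixed columns defined by the VCF format
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dictionary per variant, skipping header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # meta-information (##) and column header (#CHROM)
        fields = line.split("\t")
        record: Dict[str, object] = dict(zip(VCF_COLUMNS, fields[:8]))
        record["POS"] = int(fields[1])      # 1-based position
        record["ALT"] = fields[4].split(",")  # several ALT alleles are possible
        yield record


# Invented sample content, for illustration only
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chrDemo\t241\t.\tC\tT\t228\tPASS\tDP=100\n"
    "chrDemo\t1500\t.\tGTA\tG\t99\tPASS\tDP=80\n"
)

records = list(parse_vcf(SAMPLE_VCF.splitlines()))
for rec in records:
    print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"])
```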

For the alignment practice, you can use the following FASTQ file as the reads and the FASTA file as the reference:

# Example FASTQ raw reads

/home/shared/ngs_data/fastq/example.fastq
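Before aligning, it can help to look at what a FASTQ file contains. The sketch below is a minimal Python reader that assumes the common four-line record layout (header, sequence, `+` separator, quality string) and Phred+33 quality encoding. The two demo reads are invented. To read the real file, open the path above and pass the handle to `read_fastq`.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError("malformed FASTQ record: " + header.strip())
        yield header[1:].strip(), seq, qual


def mean_phred(quality: str) -> float:
    """Average base quality, assuming Phred+33 encoding."""
    return sum(ord(char) - 33 for char in quality) / len(quality)


# Two invented reads, for illustration only
demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCATT\n+\n!!!!!!\n")
for name, seq, qual in read_fastq(demo):
    print(name, len(seq), mean_phred(qual))
```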

# Example reference FASTA file

/home/shared/ngs_data/fasta/Sars_cov_2.fa
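A reference FASTA file holds one or more named sequences, each introduced by a line starting with `>`. The following minimal Python sketch loads such a file into a dictionary and reports the length and GC content of each sequence. The demo sequence is invented. To inspect the real reference, open the path above and pass the handle to `read_fasta`.

```python
import io
from typing import Dict, List, Optional, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Return a mapping from sequence name to sequence."""
    parts: Dict[str, List[str]] = {}
    name: Optional[str] = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""  # keep only the first word as ID
            parts[name] = []
        elif name is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            parts[name].append(line.upper())
    return {key: "".join(chunks) for key, chunks in parts.items()}


def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Invented toy sequence, for illustration only
demo = io.StringIO(">demo toy sequence\nACGT\nGGCC\n")
for seq_name, seq in read_fasta(demo).items():
    print(seq_name, len(seq), gc_content(seq))
```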

Congratulations!

If you made it here, you have successfully completed this section. Move to the next portion of the guide with the arrow buttons below.
