ABI Bioinformatics Guide 2024

DNA strings and sequencing file formats

Last updated 10 months ago

In this section, you will watch a series of videos to review the nature of DNA strings, the basics of working with genomes and reads, and the file formats used to store reads. After the videos, you will find a few problems to test your knowledge and programming skills. The practical videos use Python, but feel free to reproduce the same algorithms in any language you choose.

DNA strings and sequencing

Why study DNA sequencing and computational genomics?

DNA: the molecule, the string

String definitions and Python syntax

Practical: String basics

Practical: Manipulating DNA strings
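
As a warm-up for this practical, here is a minimal sketch of treating DNA as an ordinary Python string. The function name `longest_common_prefix` and the example sequences are illustrative assumptions and are not taken from the video.

```python
def longest_common_prefix(s1: str, s2: str) -> str:
    """Return the longest prefix shared by two strings."""
    i = 0
    # Advance while both strings have a character at i and they agree.
    while i < len(s1) and i < len(s2) and s1[i] == s2[i]:
        i += 1
    return s1[:i]


seq = "ACGTTGCA"              # hypothetical example sequence
print(len(seq))               # number of bases
print(seq[0:4])               # first four bases (slicing)
print(seq.count("G"))         # occurrences of one base
print(longest_common_prefix("ACCATGT", "ACCAGAC"))
```

Slicing, counting and prefix comparison are plain string operations, which is why a DNA sequence can be handled with the same tools as any other text.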

Practical: Downloading and reading a genome

How DNA is copied

Sequencing by Synthesis

Base calling and sequencing errors
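
Base-call confidence is commonly reported as a Phred quality score, Q = -10 * log10(p), where p is the estimated probability that the base call is wrong. The sketch below assumes this standard definition. The function names are illustrative.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred score."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
```

On this scale, every 10 points of quality corresponds to a tenfold drop in the estimated error probability.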

File formats to store DNA reads

Reads in FASTQ format
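
A FASTQ record usually spans four lines: a header starting with `@`, the base sequence, a separator line starting with `+`, and a quality string with one character per base. The sketch below is a minimal reader, assuming exactly four lines per record (no line wrapping) and Phred+33 quality encoding. The names `read_fastq` and `phred33_to_q` and the example record are illustrative.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:                 # end of file (or a blank line) stops the reader
            break
        seq = handle.readline().rstrip()
        handle.readline()              # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual    # drop the leading '@' from the name


def phred33_to_q(ch: str) -> int:
    """Decode one Phred+33 quality character into an integer score."""
    return ord(ch) - 33


# Hypothetical one-record file held in memory instead of on disk.
example = io.StringIO("@read1\nACGT\n+\nII#5\n")
for name, seq, qual in read_fastq(example):
    print(name, seq, [phred33_to_q(c) for c in qual])
```

The same generator works on a real file by passing an object returned by `open(path)` in place of the `io.StringIO` buffer.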

Practical: Working with sequencing reads

Practical: Analyzing reads by position
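
One way to analyze reads by position is to tally which bases appear at each offset across all reads. The sketch below is an assumption about what such an analysis might look like, not the method shown in the video. It counts bases per position and derives the G/C fraction at each offset. The function names and example reads are illustrative.

```python
from collections import Counter


def base_counts_by_position(reads):
    """Return a list of Counters, one per read offset, tallying observed bases."""
    counts = []
    for read in reads:
        for i, base in enumerate(read):
            if i == len(counts):       # first read long enough to reach this offset
                counts.append(Counter())
            counts[i][base] += 1
    return counts


def gc_fraction_by_position(reads):
    """Return the fraction of G or C bases at each read offset."""
    fractions = []
    for c in base_counts_by_position(reads):
        total = sum(c.values())        # always > 0: a Counter exists only if a base was seen
        fractions.append((c["G"] + c["C"]) / total)
    return fractions


reads = ["ACGT", "GCGA", "TCG"]        # hypothetical reads of unequal length
print(gc_fraction_by_position(reads))
```

Reads of different lengths are handled by letting later offsets be averaged over fewer reads, so values near the ends of long reads rest on less data.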

Sequencers give pieces to genomic puzzles

Problems to solve

Try to solve these problems after completing all the videos above. Some of them may be challenging :)

To solve the following problems, you need to create an account on the Rosalind website and log in. On this site, you will find instructions on how to solve the problems and submit your answers. Note that the problems must be solved in the order listed: each problem unlocks only after you have solved the previous ones.

  1. Counting DNA Nucleotides
  2. Transcribing DNA into RNA
  3. Complementing a Strand of DNA
  4. Computing GC Content
  5. Translating RNA into Protein

Congratulations!

If you made it here, congratulations! You have successfully completed this section. Move on to the next part of the guide, "Read alignment: exact matching".