ABI Bioinformatics Guide 2024

Gene Expression: Transcription

Introduction

DNA sequences known as genes encode proteins or functional RNAs. However, even protein-coding genes do not directly drive protein synthesis but act through the process of gene expression. Gene expression includes two main processes:

  1. Transcription: RNA synthesis occurs on the DNA template.

  2. Translation: protein synthesis takes place on the RNA template.

In eukaryotic cells, RNA molecules undergo several processing steps before translation can occur.

Question on the topic

The nucleotide sequence of a DNA strand serving as a template for transcription is given below:

5'-AACCTGACAGA-3'

What is the nucleotide sequence of the RNA produced on this template?

Note that nucleic acid sequences should be written in the 5'-3' direction.

Answer

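5'-UCUGUCAGGUU-3'

The RNA is complementary and antiparallel to the template, so the template is read from its 3' end, pairing A with U, T with A, G with C and C with G.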
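The answer can also be checked with a few lines of Python. This is only a minimal sketch: the function name `transcribe_template` is made up for illustration, and the input is assumed to be an uppercase or lowercase DNA string written 5'-to-3' that contains only A, C, G and T.

```python
# Base-pairing rules: DNA template base -> RNA base placed opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5to3: str) -> str:
    """Return the RNA (written 5'->3') made on a DNA template given 5'->3'."""
    template = template_5to3.upper()
    # The RNA is antiparallel to the template, so the template is read
    # starting from its 3' end, i.e. in reverse.
    return "".join(DNA_TO_RNA[base] for base in reversed(template))


print(transcribe_template("AACCTGACAGA"))  # UCUGUCAGGUU
```

Note that the function expects the template strand. If the coding (non-template) strand were given instead, the RNA would simply be that sequence with T replaced by U.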

RNA polymerase

The main enzyme involved in transcription is RNA polymerase, which catalyses the synthesis of RNA using a DNA template. Like DNA polymerase, it extends the growing chain in the 5'-to-3' direction. However, unlike DNA polymerase, RNA polymerase does not require a primer to initiate polynucleotide synthesis.

In bacteria, a single RNA polymerase enzyme is responsible for transcription. Eukaryotes typically possess three nuclear RNA polymerases, each responsible for transcribing different types of genes. Of these, RNA polymerase II transcribes all nuclear protein-coding genes.

The RNA produced on a protein-coding gene is referred to as messenger RNA (mRNA). Transcription of other genes results in non-coding RNAs, which include ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nuclear RNAs (snRNAs), microRNAs (miRNAs) and several other RNA types.

Initiation of Transcription

To start transcription, RNA polymerase must bind to a specific DNA sequence known as a promoter, which marks the start site for RNA synthesis. Most, though not all, promoters are located upstream from (before) the transcription start site.

Transcription initiation in bacteria usually requires just the RNA polymerase and one additional protein called the σ factor. Bacterial promoters vary, but they typically contain two similar sequence elements centred about 10 and 35 base pairs upstream of the transcription start site (positions -10 and -35). Initiation requires the unwinding of a short region of DNA, and the σ factor stabilises this unwound region, which is termed a transcription bubble. Once the transcription bubble is formed, RNA polymerase initiates RNA synthesis. Only one of the two DNA strands, called the template strand, is used for transcription.
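To make the idea of the -35 and -10 elements concrete, the sketch below scans a DNA string for a pair of hexamers resembling them. Several things here are assumptions that do not come from the text above: the consensus sequences TTGACA and TATAAT (commonly quoted for the main E. coli σ factor), the spacer of 16 to 18 base pairs between them, the tolerance of one mismatch per element, the function names, and the toy sequence. Real promoter prediction uses more elaborate scoring models.

```python
# Assumed consensus hexamers (E. coli sigma-70 promoters); not from the text.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=1, spacers=range(16, 19)):
    """Return (start of -35 box, start of -10 box, spacer length) tuples."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        # Look for a -10-like hexamer at each allowed distance downstream.
        for spacer in spacers:
            j = i + 6 + spacer
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j, spacer))
    return hits


# Hypothetical toy sequence: both elements separated by a 17-bp GC spacer.
toy = "GG" + MINUS_35 + "GC" * 8 + "G" + MINUS_10 + "GCGCCGA"
print(find_promoter_candidates(toy))  # [(2, 25, 17)]
```

The positions reported are 0-based string indices on the coding strand, not the biological -35/-10 coordinates, which are counted relative to the transcription start site.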

In eukaryotes, the binding of the promoter by RNA polymerase requires the assistance of multiple additional proteins known as transcription factors. A typical promoter recognised by RNA polymerase II contains a conserved nucleotide sequence rich in adenines and thymines, known as the TATA box.

Elongation

RNA polymerase moves downstream from the transcription start site, adding nucleotides to the growing RNA molecule. The RNA chain grows in the 5'-to-3' direction: each new nucleotide is added to its 3' end.

Multiple RNA polymerase enzymes can transcribe the same gene at the same time to increase RNA production.

Transcription Termination

Termination in bacteria occurs when RNA polymerase encounters a termination signal. In most cases, termination involves the formation of a "hairpin" structure in the newly synthesised RNA, which causes the polymerase to release the transcript and detach from the DNA.
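A hairpin can form wherever a stretch of RNA is followed, after a short loop, by its own reverse complement. The sketch below searches an RNA string for such stem-loops. It is only an illustration under stated assumptions: the stem length of 5 bases, the loop sizes of 3 to 8 bases, the restriction to Watson-Crick pairs (no G-U pairing), the function names, and the toy sequence are all chosen for the example and are not taken from the text above.

```python
# Watson-Crick pairing in RNA (G-U wobble pairs are ignored in this sketch).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(s: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_PAIR[b] for b in reversed(s))


def find_hairpins(rna, stem_len=5, loop_sizes=range(3, 9)):
    """Return (start, loop length, 5' stem arm, 3' stem arm) tuples."""
    rna = rna.upper()
    hits = []
    last_start = len(rna) - 2 * stem_len - min(loop_sizes)
    for start in range(last_start + 1):
        left = rna[start:start + stem_len]
        target = reverse_complement_rna(left)
        for loop in loop_sizes:
            right_start = start + stem_len + loop
            right = rna[right_start:right_start + stem_len]
            # The 3' arm must pair base-for-base with the 5' arm.
            if right == target:
                hits.append((start, loop, left, right))
    return hits


# Hypothetical transcript end: GC-rich stem, 4-base loop, then a run of U.
toy_rna = "GG" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU"
print(find_hairpins(toy_rna))  # [(2, 4, 'GCCGC', 'GCGGC')]
```

Because the search uses a fixed stem length, a longer stem in a real sequence would be reported as several overlapping hits; merging them is left out to keep the example short.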

In eukaryotes, the process of transcription termination is more complex and is not yet fully understood. RNA polymerase transcribes a specific sequence called the polyadenylation signal, which marks the end of the transcript and recruits an additional enzyme that cleaves the newly synthesised RNA. After the cleavage, RNA polymerase may continue transcribing until it is eventually displaced from the DNA by enzymes that degrade the remaining non-functional RNA.

Eukaryotic RNA polymerase II. Image source: Litvinanna - Own work using https://www.rcsb.org/structure/1WCM, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=77523301
Image source: Connie Rye, Robert Wise, Vladimir Jurukovski, Jean DeSaix, Jung Choi, Yael Avissar - Biology, Oct 21, 2016, https://openstax.org/books/biology/pages/15-2-prokaryotic-transcription
RNA polymerase II and transcription factors required to bind the promoter. Image source: CNX OpenStax - http://cnx.org/contents/GFy_h8cu@10.53:rZudN6XP@2/Introduction, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=49929297
RNA elongation. Image source: Rosti1985 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=89613003
Multiple RNA polymerase enzymes transcribe three closely located genes. "Begin" indicates the 5' end of the coding strand of DNA, where new RNA synthesis begins; "end" indicates the 3' end, where the transcripts are almost complete. RNA polymerase molecules are seen on DNA as dots. Image source: CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1830046
Transcription termination in bacteria. Image source: Untitled image. "Stages of transcription". Accessed February 29, 2024. https://www.khanacademy.org/science/biology/gene-expression-central-dogma/transcription-of-dna-into-rna/a/stages-of-transcription