ABI Bioinformatics Guide 2024

Gene Expression: RNA Processing

Last updated 1 year ago

In eukaryotes, transcription of a gene by RNA polymerase II produces a pre-mRNA, which must undergo several processing steps to become a fully functional mRNA. These steps include modification of the transcript's 5' and 3' ends and the excision of some internal parts of the RNA molecule.

RNA Capping

Soon after RNA polymerase II starts transcription, a modified guanine nucleotide (7-methylguanosine) is added to the 5' end of the growing RNA molecule.

This modification, known as a 5' cap, serves several functions:

  1. Protection from degradation: the 5' cap protects the pre-mRNA from degradation by nucleases.

  2. Nuclear export signal: the presence of the 5' cap serves as one of the molecular signals indicating that the RNA is mature and ready to be exported from the nucleus to the cytoplasm. This ensures that only fully processed mRNA molecules are transported to the site of protein synthesis.

  3. Initiation of translation: in eukaryotes, the 5' cap is essential for the ribosome to bind to the mRNA and initiate protein synthesis.

RNA Splicing

Most eukaryotic protein-coding genes contain two major types of segments: exons, which are retained in the mature mRNA and carry the protein-coding sequence, and non-coding segments called introns. During transcription by RNA polymerase II, both exons and introns are included in the pre-mRNA transcript.

However, prior to translation, introns must be removed from the pre-mRNA and exons must be joined together in a process called RNA splicing. Splicing is catalysed by a large molecular complex known as the spliceosome, which consists of several proteins and small nuclear RNAs (snRNAs). The spliceosome recognises specific sequences at the boundaries between exons and introns, called splice sites. Splicing results in the production of mature mRNA molecules containing only the sequences necessary for protein synthesis.
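The removal of introns described above can be sketched in a few lines of Python. This is a toy illustration, not a model of the spliceosome: the sequences are made up, the function name `splice` and the interval convention (0-based, end-exclusive) are assumptions, and the check relies on the widely observed rule, not stated in the text above, that most introns begin with GU and end with AG.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as 0-based, end-exclusive (start, end) intervals."""
    exon_parts = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Assumption: only canonical GU...AG introns are accepted in this sketch.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exon_parts.append(pre_mrna[position:start])  # exon preceding this intron
        position = end
    exon_parts.append(pre_mrna[position:])  # last exon
    return "".join(exon_parts)


# Hypothetical toy transcript: exon1 + intron1 + exon2 + intron2 + exon3
exon1, intron1 = "AUGGCC", "GUAAGUCCUAG"
exon2, intron2 = "UUCGAA", "GUGAGCAUCAG"
exon3 = "GGAUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

# Intron coordinates derived from the segment lengths.
i1_start = len(exon1)
i1_end = i1_start + len(intron1)
i2_start = i1_end + len(exon2)
i2_end = i2_start + len(intron2)

mature_mrna = splice(pre_mrna, [(i1_start, i1_end), (i2_start, i2_end)])
print(mature_mrna)
```

In real data the intron coordinates would come from a gene annotation rather than being computed from known segments as they are here.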

While the precise role of introns and splicing in eukaryotic organisms is still not fully understood, there are several potential advantages associated with splicing:

  1. Regulatory elements in introns: some introns contain regulatory elements that can modulate gene expression.

  2. Exon-intron structure and protein evolution: the exon-intron structure of genes may facilitate the faster evolution of new proteins. It has been observed that many protein domains, which are functional units of proteins capable of folding into stable tertiary structures, are encoded by single exons. Additionally, proteins often consist of similar sets of domains. The alternation of exons with introns allows for exon shuffling through recombination events. This process can generate new combinations of exons, giving rise to novel genes that encode functionally active proteins, thus contributing to protein evolution and diversification.

  3. Alternative splicing for protein diversity: RNA splicing enables the generation of diverse protein isoforms from a single gene through a mechanism known as alternative splicing. Different exons can be included or excluded from mature mRNA transcripts, leading to the production of multiple protein isoforms with distinct functions.
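To make the idea of alternative splicing concrete, the sketch below enumerates the transcripts that could arise from one simple mode, exon skipping, in which the first and last exons are always kept and each internal exon is either included or skipped. The exon sequences and the function name `exon_skipping_isoforms` are hypothetical. In a real cell, splicing is regulated, so only some of these combinations would actually be produced, and other modes such as alternative splice sites also exist.

```python
from itertools import combinations


def exon_skipping_isoforms(exons: list[str]) -> list[str]:
    """List transcripts that keep the first and last exon and any
    subset of internal exons, preserving the original exon order."""
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    first, *internal, last = exons
    isoforms = []
    for n_kept in range(len(internal) + 1):
        # combinations() keeps elements in their original order.
        for kept in combinations(internal, n_kept):
            isoforms.append("".join([first, *kept, last]))
    return isoforms


# Hypothetical gene with four exons (two internal exons)
exons = ["AUGGCC", "UUCGAA", "CCAGGU", "GGAUAA"]
isoforms = exon_skipping_isoforms(exons)
for isoform in isoforms:
    print(isoform)
```

With two internal exons there are four possible combinations, which shows how the number of potential isoforms grows quickly as exons are added.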

Polyadenylation

As RNA polymerase II progresses along the gene during transcription, it eventually encounters a specific sequence known as the polyadenylation signal, which is transcribed into the growing pre-mRNA. The canonical polyadenylation signal sequence is AAUAAA.

The polyadenylation signal is recognised by cleavage proteins that cut the transcript a short distance (typically about 10 to 30 nucleotides) downstream of the signal. This cleavage releases the pre-mRNA from the RNA polymerase II complex.

An enzyme called poly-A polymerase (PAP) adds a string of adenine nucleotides, known as the poly-A tail, to the 3' end of the pre-mRNA. Typically, about 200 adenine nucleotides are added to create the poly-A tail.
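The three steps above (find the signal, cleave downstream of it, add the tail) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `polyadenylate`, the fixed `cleavage_offset` of 20 nucleotides, and the example sequence are all hypothetical, and the sketch simply uses the first AAUAAA it finds.

```python
def polyadenylate(
    pre_mrna: str,
    signal: str = "AAUAAA",
    cleavage_offset: int = 20,   # assumed distance from signal end to cut site
    tail_length: int = 200,      # approximate tail length given in the text
) -> str:
    """Cleave downstream of the polyadenylation signal and add a poly-A tail."""
    signal_start = pre_mrna.find(signal)
    if signal_start == -1:
        raise ValueError("no polyadenylation signal found")
    # Cut a fixed distance after the signal, or at the transcript end if shorter.
    cleavage_site = min(signal_start + len(signal) + cleavage_offset, len(pre_mrna))
    return pre_mrna[:cleavage_site] + "A" * tail_length


# Hypothetical transcript: body + signal + downstream sequence
body = "AUGGCCUUCGAAGGAUAACUGC"
signal = "AAUAAA"
downstream = "GCU" * 12
transcript = body + signal + downstream

processed = polyadenylate(transcript)
print(len(transcript), len(processed))
```

Everything downstream of the cleavage site is discarded, so the processed transcript ends with the poly-A tail rather than with the sequence that was originally transcribed.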

Similar to the 5' cap, the poly-A tail protects mRNA from degradation and is required for mRNA export from the nucleus.

Summary

The figure below summarises RNA processing in eukaryotic cells.

More details on RNA splicing can be found in the following videos.

Figure: The structure of the 5' cap. Image source: Zephyris - English Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1379696

Figure: RNA splicing: introns are excised and exons are fused. Image source: Genomics Education Programme - Splice donor sites, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=50543001

Figure: Alternative splicing results in different protein isoforms. Image source: National Human Genome Research Institute - http://www.genome.gov/Images/EdKit/bio2j_large.gif, Public Domain, https://commons.wikimedia.org/w/index.php?curid=2132737

Figure: Polyadenylation of RNA. Image source: Zephyris (en Wikipedia user) - English Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1379635

Figure: RNA processing in eukaryotic cells. Image source: Biorender.com