ABI Bioinformatics Guide 2024

Proteins


Last updated 12 months ago


Introduction

Proteins play a pivotal role in virtually every cellular function. They are responsible for molecular transport, provide structural support, and participate in defence mechanisms. Proteins known as enzymes act as catalysts, speeding up biochemical reactions. Other proteins are responsible for communication between cells.

Proteins are made of amino acids

A polymer formed by the linkage of amino acids is termed a polypeptide and is called a protein when it serves a biological function. Any compound containing both an amino group (-NH2) and a carboxyl group (-COOH) can be broadly termed an amino acid. However, proteins are built specifically from α-amino acids, in which the amino and carboxyl groups are attached to the same carbon atom (the α-carbon). The side chain, also called the R group, is attached to the α-carbon as well and varies among amino acids.

A total of 22 amino acids can be incorporated into polypeptides by ribosomes. Of these, 20 are considered "canonical" and are used by all organisms; the remaining two, selenocysteine and pyrrolysine, occur only in certain organisms and proteins. The distinctive properties of an amino acid are determined by its side chain, which can be categorised as negatively charged (acidic), positively charged (basic), nonpolar (hydrophobic), or polar but uncharged (hydrophilic).
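The four side-chain categories can be sketched as a small Python lookup. The grouping below follows one common textbook convention for the 20 canonical amino acids (one-letter codes); it is an assumption rather than the guide's own table, and borderline residues such as glycine, cysteine, tyrosine and histidine are placed differently by some sources. The names `SIDE_CHAIN_CLASS` and `side_chain_composition` are hypothetical.

```python
from collections import Counter

# One common textbook grouping of the 20 canonical amino acids.
# Assumption: other sources classify some borderline residues differently.
SIDE_CHAIN_CLASS = {
    "acidic": "DE",
    "basic": "KRH",
    "nonpolar": "GAVLIMFWP",
    "polar": "STCYNQ",
}

# Invert the grouping into a residue -> class lookup.
RESIDUE_TO_CLASS = {
    residue: side_chain_class
    for side_chain_class, residues in SIDE_CHAIN_CLASS.items()
    for residue in residues
}


def side_chain_composition(sequence):
    """Count residues per side-chain class; unknown letters count as 'other'."""
    return Counter(RESIDUE_TO_CLASS.get(aa, "other") for aa in sequence.upper())


# Made-up eight-residue peptide used only for illustration.
composition = side_chain_composition("MKTDELLS")
print(dict(composition))
```

A composition like this gives only a rough first impression of a protein's character; it says nothing about where the residues end up once the chain is folded.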

A covalent linkage between the carboxyl group of one amino acid and the amino group of another amino acid is called a peptide bond. It forms in a dehydration reaction that releases one molecule of water.
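Because each peptide bond releases one water molecule, a linear chain of n amino acids contains n - 1 peptide bonds and weighs less than the sum of its free amino acids. The sketch below illustrates this bookkeeping for a tripeptide. The masses are approximate average molecular masses for glycine, alanine and serine, included here as illustrative values; the function name `peptide_mass` is hypothetical.

```python
# Approximate average molecular masses (g/mol) of three free amino acids.
# Assumption: rounded illustrative values, limited to G, A and S.
FREE_AMINO_ACID_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}
WATER_MASS = 18.015  # mass of the water released per peptide bond


def peptide_mass(sequence):
    """Mass of a linear peptide: free amino acids minus one water per bond."""
    if not sequence:
        return 0.0
    free_total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    peptide_bonds = len(sequence) - 1
    return free_total - peptide_bonds * WATER_MASS


# Gly-Ala-Ser has two peptide bonds, so two water molecules are subtracted.
tripeptide_mass = peptide_mass("GAS")
print(round(tripeptide_mass, 2))
```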

Levels of protein structure

Following synthesis by a ribosome, a polypeptide undergoes a folding process, dictated by the sequence of amino acids. The resulting complex shape of the protein and the physical properties of its surface govern its capability to interact with other molecules and determine its function. The structure of proteins can be classified into four levels: primary, secondary, tertiary, and, in some cases, quaternary.

The primary structure is the linear sequence of amino acids in a polypeptide. It is determined by the genetic information encoded in the DNA.
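The link between DNA and primary structure can be sketched in a few lines of Python. The example assumes the standard genetic code, an input that is already the coding strand in the correct reading frame, and only the letters A, C, G and T. The sequence is a made-up illustration and `translate` is a hypothetical helper, not a function from any particular library.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence until the first stop codon."""
    seq = coding_dna.upper()
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for start in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[start:start + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up coding sequence: eight codons followed by a stop codon.
primary_structure = translate("ATGGTGCACCTGACTCCTGAGGAGTAA")
print(primary_structure)
```

Ambiguous bases such as N raise a `KeyError` in this sketch; real data would need explicit handling for them.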

The secondary structure is a regular local conformation of the polypeptide backbone, stabilised by hydrogen bonds between backbone carbonyl oxygen atoms and hydrogen atoms attached to backbone nitrogen atoms. The two most common types of secondary structure are the spiral-shaped alpha helix and the beta-pleated sheet, the latter formed by two or more polypeptide segments lying side by side in either parallel or antiparallel orientation.

The tertiary structure refers to the overall three-dimensional shape of a polypeptide and is stabilised by interactions between the R groups of its amino acids. These interactions include hydrogen bonds, ionic bonds, disulfide bridges formed between cysteine residues, van der Waals forces, and hydrophobic interactions. In hydrophobic interactions, hydrophobic amino acid residues cluster at the core of a protein globule to minimise contact with water, while hydrophilic residues predominantly occupy the surface, exposed to the aqueous environment.
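A rough way to see which stretches of a sequence are likely to be buried or exposed is to average a hydropathy score along the chain. The sketch below uses the Kyte-Doolittle hydropathy scale, which the guide does not mention and which is brought in here as an assumption. Positive values indicate hydrophobic residues and negative values hydrophilic ones. The function names and the example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumption: not part of the original guide).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a non-empty sequence of canonical residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def window_hydropathy(sequence, window=5):
    """Mean hydropathy of every full window along the sequence."""
    seq = sequence.upper()
    return [
        mean_hydropathy(seq[start:start + window])
        for start in range(len(seq) - window + 1)
    ]


# Made-up sequence: a hydrophobic stretch followed by a charged, polar stretch.
profile = window_hydropathy("LLVVIKDERN", window=5)
print([round(score, 2) for score in profile])
```

Such a profile is only a heuristic: it suggests candidate core or surface segments but does not predict the folded structure.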

The quaternary structure is the arrangement of two or more polypeptides, referred to as subunits, within a single protein. A classic example is haemoglobin, which consists of two α-globin and two β-globin subunits, each carrying a non-protein haem group.

Overall, the diversity of functions performed by proteins is reflected by the diversity of their shapes.

Functions of proteins. Image source: Biorender.com
Amino acid structure. Image source: OpenStax College - Anatomy & Physiology, Connexions Web site, http://cnx.org/content/col11496/1.6/, Jun 19, 2013, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=30131160
Proteinogenic amino acids. Image source: CNX OpenStax - http://cnx.org/contents/GFy_h8cu@10.53:rZudN6XP@2/Introduction, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=49923700
Formation of a peptide bond by a dehydration reaction.
Levels of protein structure. Image source: CNX OpenStax - https://cnx.org/contents/5CvTdmJL@4.4, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=53712842
Secondary structures: alpha helix and beta-pleated sheet. Image source: CNX OpenStax - https://cnx.org/contents/5CvTdmJL@4.4, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=53712849
Interactions stabilising the tertiary structure. Image source: CNX OpenStax - https://cnx.org/contents/5CvTdmJL@4.4, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=53712850
Haemoglobin structure. α globin and β globin subunits are shown in red and blue respectively. Haem groups are shown in green. Image source: Zephyris at the English-language Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=2300973
Examples of proteins' 3D shapes. Image source: David Goodsell, https://book.bionumbers.org/how-big-is-the-average-protein/, with changes