ABI Bioinformatics Guide 2024

Confidence Intervals

Confidence intervals are a statistical concept used to estimate the range of values within which we expect a population parameter, such as the mean or proportion, to lie. They provide a measure of uncertainty around an estimated statistic based on sample data.

  1. Purpose:

    • Estimate Precision: Confidence intervals help quantify the uncertainty in our estimates of population parameters derived from sample data.

    • Inferential Tool: They provide a range of plausible values for the parameter, allowing us to make inferences about the population.

  2. Construction:

    • Sample Data: Start with a sample from the population and compute a sample statistic (e.g., mean, proportion).

    • Distribution Assumptions: Underlying assumptions about the population distribution (e.g., normality for means, binomial for proportions) guide the calculation.

    • Formula: Typically constructed as $\text{Estimate} \pm \text{Margin of Error}$, where the margin of error accounts for variability and is based on the standard error of the statistic.

  3. Interpretation:

    • Confidence Level: Usually expressed as a percentage (e.g., 95%, 99%). It is the long-run proportion of intervals that would contain the true population parameter if the sampling and estimation process were repeated many times. It is not the probability that one particular computed interval contains the parameter.

    • Example: With a 95% confidence level, if we took 100 independent samples and computed a confidence interval from each, we would expect about 95 of those intervals to contain the true population parameter.

  4. Factors Influencing Width:

    • Sample Size: Larger samples generally result in narrower confidence intervals because the standard error shrinks as the sample size grows (for a mean, in proportion to $1/\sqrt{n}$).

    • Variability: Higher variability in the data results in wider intervals, as it increases the uncertainty in estimating the parameter.
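
The construction, interpretation, and width factors described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes a confidence interval for a mean built with a normal (z) approximation, which is reasonable for moderately large samples (for small samples a t critical value would usually be preferred). The function names `mean_confidence_interval` and `coverage`, and all simulated values, are hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for the population mean (normal approximation)."""
    n = len(sample)
    estimate = mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value (about 1.96 for a 95% confidence level).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin_of_error = z * standard_error
    return estimate - margin_of_error, estimate + margin_of_error


def coverage(true_mean=10.0, true_sd=2.0, n=50, repeats=2000,
             confidence=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Draw a fresh sample from a population with a known mean.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = mean_confidence_interval(sample, confidence)
        hits += lower <= true_mean <= upper
    return hits / repeats


if __name__ == "__main__":
    # Repeated sampling: the share of intervals covering the true mean
    # should be close to the chosen confidence level.
    print("Empirical coverage:", coverage())

    # Larger samples give narrower intervals (simulated data, fixed seed).
    rng = random.Random(1)
    for n in (25, 100, 400):
        sample = [rng.gauss(10.0, 2.0) for _ in range(n)]
        lower, upper = mean_confidence_interval(sample)
        print(f"n={n:4d}  width={upper - lower:.3f}")
```

Running the script prints the empirical coverage of the simulated intervals and the interval width for three sample sizes. Because the standard error is divided by the square root of the sample size, quadrupling the sample size roughly halves the width.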

Practical Use:

  • Decision Making: Confidence intervals aid in making informed decisions by providing a range of plausible values for a population parameter.

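  • Comparisons: They allow comparisons between groups or over time. For example, if a confidence interval for the difference between two group means excludes zero, the difference is statistically significant at the corresponding level.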
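
As a sketch of the comparison use case, the snippet below computes an approximate confidence interval for the difference between two group means. It assumes independent groups, allows unequal variances, and uses a normal approximation instead of a t distribution. The function name `difference_confidence_interval` and the simulated "control" and "treated" measurements are hypothetical and serve only as an illustration.

```python
import math
import random
from statistics import NormalDist, mean, variance


def difference_confidence_interval(group_a, group_b, confidence=0.95):
    """Approximate CI for mean(group_b) - mean(group_a)."""
    difference = mean(group_b) - mean(group_a)
    # Standard error of the difference between two independent means.
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return difference - z * standard_error, difference + z * standard_error


if __name__ == "__main__":
    # Simulated measurements for two groups (illustrative values only).
    rng = random.Random(42)
    control = [rng.gauss(10.0, 2.0) for _ in range(100)]
    treated = [rng.gauss(12.0, 2.0) for _ in range(100)]

    lower, upper = difference_confidence_interval(control, treated)
    print(f"95% CI for the difference in means: ({lower:.2f}, {upper:.2f})")
    print("Interval excludes zero:", not (lower <= 0.0 <= upper))
```

Working with the interval for the difference directly is generally safer than checking whether the two groups' individual intervals overlap, since overlapping individual intervals do not by themselves rule out a significant difference.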


Last updated 10 months ago
