scRNA-seq Data Analysis Workflow
Analyzing single-cell sequencing data involves several key steps:
Preprocessing: This includes quality control, normalization, and filtering to ensure high-quality data.
Dimensionality Reduction: Techniques like PCA, t-SNE, and UMAP are used to reduce the complexity of the data and visualize the relationships between cells.
Clustering: Cells are grouped based on similar gene expression patterns, allowing for the identification of distinct cell types or states.
Differential Expression Analysis: Genes that are differentially expressed between cell clusters or conditions are identified to understand the underlying biology.
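The four steps above can be sketched end to end on a small synthetic count matrix. This is a toy illustration in plain NumPy under stated assumptions, not the method of any particular toolkit: the data are simulated, the thresholds (min_counts, min_cells, target_sum) are arbitrary, k-means stands in for the graph-based clustering that dedicated packages typically use, and the "differential expression" step is only a difference of mean log-normalized expression rather than a statistical test. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated counts (assumption): 60 cells from two hypothetical cell types
# plus 3 near-empty droplets, measured over 50 genes.
# Genes 0-4 are elevated in type 0, genes 5-9 are elevated in type 1.
n_real, n_empty, n_genes = 60, 3, 50
true_type = np.concatenate([np.repeat([0, 1], n_real // 2), np.full(n_empty, -1)])
rates = np.full((n_real + n_empty, n_genes), 5.0)
rates[true_type == 0, 0:5] = 50.0
rates[true_type == 1, 5:10] = 50.0
rates[true_type == -1, :] = 0.05
counts = rng.poisson(rates)


def preprocess(counts, min_counts=100, min_cells=3, target_sum=1e4):
    """Step 1: quality control, filtering, and normalization."""
    cell_ok = counts.sum(axis=1) >= min_counts       # drop low-count cells
    counts = counts[cell_ok]
    gene_ok = (counts > 0).sum(axis=0) >= min_cells  # drop rarely detected genes
    counts = counts[:, gene_ok]
    # Scale every cell to the same total count, then log-transform.
    scaled = counts / counts.sum(axis=1, keepdims=True) * target_sum
    return np.log1p(scaled), cell_ok, gene_ok


def pca(x, n_components=2):
    """Step 2: dimensionality reduction with PCA computed via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def kmeans_two(scores, n_iter=20):
    """Step 3: group cells into two clusters with a basic k-means loop."""
    # Deterministic start: the two cells at opposite ends of the first PC.
    centers = scores[[scores[:, 0].argmin(), scores[:, 0].argmax()]]
    for _ in range(n_iter):
        dist = np.linalg.norm(scores[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        centers = np.array([scores[labels == k].mean(axis=0) for k in range(2)])
    return labels


def top_marker_genes(logexpr, labels, cluster, n_top=5):
    """Step 4: rank genes by mean log-expression difference (in minus out)."""
    in_mean = logexpr[labels == cluster].mean(axis=0)
    out_mean = logexpr[labels != cluster].mean(axis=0)
    return np.argsort(in_mean - out_mean)[::-1][:n_top]


logexpr, cell_ok, gene_ok = preprocess(counts)
scores = pca(logexpr, n_components=2)
labels = kmeans_two(scores)
markers = top_marker_genes(logexpr, labels, cluster=0)

print("cells kept:", int(cell_ok.sum()), "genes kept:", int(gene_ok.sum()))
print("cluster sizes:", np.bincount(labels))
print("top genes for cluster 0:", sorted(markers.tolist()))
```

On real data, each function would be replaced by the corresponding routine of a dedicated single-cell package, which also handles sparse matrices, highly variable gene selection, and proper statistical testing.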
One of the most widely used packages for single-cell data analysis is Seurat, a comprehensive R toolkit that supports the entire analysis workflow, from data preprocessing to advanced visualization and downstream analysis. A guided tutorial of a basic single-cell analysis workflow for a 10X Genomics dataset of Peripheral Blood Mononuclear Cells (PBMC) in Seurat can be found here:
A Python alternative to Seurat is scanpy, a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. A scanpy tutorial on the same PBMC dataset can be found here: