# UMAP (Uniform Manifold Approximation and Projection)

**Overview:**

* **UMAP** is another non-linear dimensionality reduction technique. It aims to preserve the local neighborhood structure of the data while retaining more of the global structure than t-SNE typically does.
* It builds a weighted nearest-neighbor graph of the data in the original high-dimensional space, then optimizes a low-dimensional layout whose graph is as similar as possible to the high-dimensional one.
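
The graph-construction step can be made concrete with a small sketch. The code below is a simplified, pure-Python illustration of how a weighted nearest-neighbor graph of this kind can be built: each point's neighbor distances are turned into weights in `[0, 1]`, and the two directed weights of an edge are merged with a fuzzy union. The function names (`knn`, `smooth_sigma`, `fuzzy_graph`) and the toy points are hypothetical, and the brute-force neighbor search is an assumption made for brevity. It is not the umap-learn implementation, and it leaves out the low-dimensional optimization entirely.

```python
import math

def knn(points, k):
    """For each point, return its k nearest neighbors as (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

def smooth_sigma(dists, rho, target, n_iter=64):
    """Search for sigma so that sum(exp(-(d - rho) / sigma)) is close to target."""
    lo, hi, sigma = 0.0, math.inf, 1.0
    for _ in range(n_iter):
        total = sum(math.exp(-max(d - rho, 0.0) / sigma) for d in dists)
        if abs(total - target) < 1e-5:
            break
        if total > target:   # weights too large: shrink sigma
            hi = sigma
            sigma = (lo + hi) / 2
        else:                # weights too small: grow sigma
            lo = sigma
            sigma = sigma * 2 if hi == math.inf else (lo + hi) / 2
    return sigma

def fuzzy_graph(points, k):
    """Return symmetric edge weights {(i, j): w} with i < j."""
    target = math.log2(k)
    directed = {}
    for i, neighbors in enumerate(knn(points, k)):
        dists = [d for d, _ in neighbors]
        # rho: distance to the nearest distinct neighbor, so that edge gets weight 1
        rho = next((d for d in dists if d > 0), 0.0)
        sigma = smooth_sigma(dists, rho, target)
        for d, j in neighbors:
            directed[(i, j)] = math.exp(-max(d - rho, 0.0) / sigma)
    symmetric = {}
    for (i, j), w in directed.items():
        w_rev = directed.get((j, i), 0.0)
        # fuzzy union of the two directed weights
        symmetric[(min(i, j), max(i, j))] = w + w_rev - w * w_rev
    return symmetric

# Toy data (hypothetical): two small, well-separated groups of points
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (5.0, 5.2)]
graph = fuzzy_graph(points, k=3)
for edge, weight in sorted(graph.items()):
    print(edge, round(weight, 3))
```

In this toy example, edges inside a group get weights near 1, while the few edges that cross between the groups get weights near 0. The low-dimensional layout is then optimized so that strongly connected points end up close together.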

**Key Characteristics:**

* Generally faster and more scalable than t-SNE, making it suitable for larger datasets.
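* Often preserves more of the global structure than t-SNE in the low-dimensional representation, although distances between well-separated clusters should still be interpreted with caution.
* Often considered less sensitive to hyperparameters than t-SNE. The two main parameters to tune are n\_neighbors (the size of the local neighborhood) and min\_dist (how closely points may be packed in the embedding).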
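
As an illustration of the two parameters, here is a minimal sketch that assumes the third-party `umap-learn` package (imported as `umap`) is installed. The helper `embed_with_umap` and the `PARAM_GRID` values are hypothetical and chosen only for demonstration. `n_neighbors=15` and `min_dist=0.1` are the umap-learn defaults.

```python
def embed_with_umap(X, n_neighbors=15, min_dist=0.1, n_components=2, seed=42):
    """Hypothetical helper: reduce X (n_samples x n_features) with umap-learn."""
    import umap  # third-party package, installed as `umap-learn`

    reducer = umap.UMAP(
        n_neighbors=n_neighbors,    # size of the local neighborhood
        min_dist=min_dist,          # how tightly points may be packed
        n_components=n_components,  # dimensionality of the embedding
        random_state=seed,          # fixed seed for repeatable layouts
    )
    return reducer.fit_transform(X)

# Hypothetical settings to compare side by side
PARAM_GRID = [
    {"n_neighbors": 5, "min_dist": 0.0},    # fine local detail, tight clumps
    {"n_neighbors": 15, "min_dist": 0.1},   # umap-learn defaults
    {"n_neighbors": 100, "min_dist": 0.5},  # broader structure, looser layout
]

# Usage, given a numeric array X with more rows than the largest n_neighbors:
# embeddings = [embed_with_umap(X, **params) for params in PARAM_GRID]
```

Plotting the resulting embeddings next to each other is a simple way to judge how much a given dataset's picture depends on these choices.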

**Applications:**

* Similar to t-SNE, UMAP is used for visualizing high-dimensional data in fields like genomics, image analysis, and natural language processing.
* It is also used as a preprocessing step for clustering and classification algorithms.

{% embed url="<https://www.youtube.com/watch?v=eN0wFzBA4Sc>" %}

{% embed url="<https://www.youtube.com/watch?v=jth4kEvJ3P8>" %}

### Comparison

| Feature            | PCA                                      | t-SNE                                          | UMAP                                           |
| ------------------ | ---------------------------------------- | ---------------------------------------------- | ---------------------------------------------- |
| **Algorithm Type** | Linear                                   | Non-linear                                     | Non-linear                                     |
| **Parameters**     | Number of components                     | Perplexity, learning rate                      | n\_neighbors, min\_dist                        |
| **Scalability**    | Efficient, handles large datasets        | Computationally intensive                      | Fast, scalable                                 |
| **Output**         | Linear combinations of original features | 2D or 3D embedding for visualization           | Low-dimensional embedding (usually 2D or 3D)   |
| **Strengths**      | Simple, fast, captures variance          | Reveals clusters, good for visualization       | Fast, captures both local and global structure |
| **Weaknesses**     | Only captures linear relationships       | Computationally expensive, parameter-sensitive | Harder to interpret, needs parameter tuning    |

