
Commit 8231e14

added content for linear algebra for ML

1 parent 8981b1b commit 8231e14

File tree

12 files changed: +1081 −27 lines changed

docs/machine-learning/introduction/ml-engineer-vs-ai-engineer.mdx

Lines changed: 9 additions & 1 deletion
@@ -67,4 +67,12 @@ The AIE role is often used interchangeably with MLE, but when distinct, it typic

:::important
**CodeHarborHub's Focus:** This tutorial is geared towards the **Machine Learning Engineer** skillset. We will give you the *modeling foundation* of a Data Scientist and the *engineering discipline* of a Software Engineer, emphasizing the MLOps skills needed for real-world production.
:::

<LiteYouTubeEmbed
  id="Ff8HHBITvfs"
  params="autoplay=1&autohide=1&showinfo=0&rel=0"
  title="AI VS ML Engineer What Do They Do?"
  poster="maxresdefault"
  webp
/>

docs/machine-learning/introduction/what-is-ml.mdx

Lines changed: 45 additions & 26 deletions
@@ -24,32 +24,30 @@ A widely accepted, formal definition of Machine Learning was provided by compute

Let's break down this concept with a simple example: **Spam Filtering**.

| Component             | Description                                     | Spam Filtering Example                                               |
| :-------------------- | :---------------------------------------------- | :------------------------------------------------------------------- |
| **Task ($T$)**        | The problem the ML system is trying to solve.   | Classifying an email as "Spam" or "Not Spam (Ham)".                   |
| **Experience ($E$)**  | The data the ML system uses to train itself.    | A large dataset of historical emails labeled as either spam or ham.   |
| **Performance ($P$)** | A metric used to evaluate the system's success. | **Accuracy:** The percentage of emails correctly classified.          |

:::tip
The core idea is that the program's ability to classify new, unseen emails gets better the more labeled examples it processes. The program _learns_ the rules itself.
:::

## ML vs. Traditional Programming

This is the most crucial concept when starting out. Machine Learning fundamentally shifts the paradigm of software development.

<Tabs>
<TabItem value="traditional" label="Traditional Programming" default>

In traditional programming, you (the programmer) write explicit **Rules** (algorithms, logic, conditions) that process **Data** to produce an **Answer**.

```mermaid
graph LR
    A[Data] --> B(Rules/Program);
    B --> C[Answer];
```

**Example (Temperature Conversion):**
You explicitly write the formula: `Fahrenheit = (Celsius * 9/5) + 32`. The computer executes this static rule.
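To make the contrast concrete, here is a minimal sketch (illustrative only, not part of the committed file) of the traditional approach: the rule is hand-written by the programmer and never changes, no matter how much data passes through it.

```python
# Traditional programming: the "rule" is hard-coded by the programmer.
def celsius_to_fahrenheit(celsius: float) -> float:
    """Apply the fixed, explicitly written rule F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32

print(celsius_to_fahrenheit(100))  # 212.0 -- the rule stays the same regardless of data
```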
@@ -67,26 +65,47 @@ graph LR

```

**Example (Predicting House Price):**
You feed it past house data (size, location) and the final sale price. The ML algorithm creates a complex mathematical model (the "Rule") that predicts the price of a _new_ house based on its features.

</TabItem>
</Tabs>
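To illustrate the second paradigm, here is a minimal sketch (not part of the committed files) in which the pricing "rule" is learned from example data rather than hard-coded. The toy feature values and the use of scikit-learn's `LinearRegression` are assumptions made purely for demonstration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Experience E: past houses described by [size_sqft, distance_to_city_km] and their sale prices.
X = np.array([[1400, 10], [1600, 8], [1700, 15], [1875, 6], [1100, 20]])
y = np.array([245_000, 312_000, 279_000, 308_000, 199_000])

# The algorithm discovers the "rule" (the model weights) from the data instead of us writing it.
model = LinearRegression().fit(X, y)

# Task T: predict the price of a *new*, unseen house.
new_house = np.array([[1500, 12]])
print(model.predict(new_house))
```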

## Key Characteristics of Machine Learning

- **Data-Driven:** ML models require vast amounts of high-quality data to learn effectively.
- **Automatic Pattern Discovery:** The system discovers hidden patterns, correlations, and rules in the data without human intervention.
- **Generalization:** A good ML model can accurately predict or classify data it has never seen before (its performance improves with experience $E$).
- **Iterative Process:** Developing an ML model is a cyclical process of data collection, training, evaluation, and refinement.

## Where is ML Used?

Machine Learning is the engine behind many everyday technologies:

| Domain             | Application                                                         | ML Task                                   |
| :----------------- | :------------------------------------------------------------------ | :----------------------------------------- |
| **E-commerce**     | Recommendation Systems (e.g., "People who bought X also bought Y")  | Classification / Ranking                   |
| **Healthcare**     | Tumor detection in X-rays or MRIs                                   | Image Segmentation / Classification        |
| **Finance**        | Fraud detection in credit card transactions                         | Anomaly Detection / Classification         |
| **Speech**         | Voice assistants (Siri, Alexa)                                      | Natural Language Processing (NLP)          |
| **Transportation** | Self-driving cars                                                   | Computer Vision / Reinforcement Learning   |

<Tabs>
<TabItem value="en" label="In English" default>
  <LiteYouTubeEmbed
    id="MAKw4DrYMWA"
    params="autoplay=1&autohide=1&showinfo=0&rel=0"
    title="What is Machine Learning (ML)?"
    poster="maxresdefault"
    webp
  />
</TabItem>
<TabItem value="hi" label="In Hindi">
  <LiteYouTubeEmbed
    id="cE-Ej1ycXtk"
    params="autoplay=1&autohide=1&showinfo=0&rel=0"
    title="What is Machine Learning With Full Information?"
    poster="maxresdefault"
    webp
  />
</TabItem>
</Tabs>
Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@

---
title: Determinants
sidebar_label: Determinants
description: "Understanding the determinant of a matrix, its geometric meaning (scaling factor), and its crucial role in checking for matrix invertibility in ML."
tags:
  [
    determinants,
    linear-algebra,
    mathematics-for-ml,
    invertibility,
    singular-matrix,
    geometric-meaning,
  ]
---

The **Determinant** is a special scalar value associated with every **square matrix** ($\mathbf{A}$). It provides crucial information about the matrix, particularly whether it can be inverted and its effect on geometric transformations.

## 1. Geometric Meaning

Conceptually, a matrix represents a **linear transformation** in space. The determinant of that matrix measures the **scaling factor** of the area (in 2D), volume (in 3D), or hyper-volume (in higher dimensions) caused by that transformation.

* If the determinant is $2$, the transformation doubles the area/volume.
* If the determinant is $0.5$, the transformation halves the area/volume.
* If the determinant is negative, the orientation of the space is flipped (like mirroring).

## 2. Calculation of the Determinant

The determinant of a matrix $\mathbf{A}$ is denoted as $\det(\mathbf{A})$ or $|\mathbf{A}|$.

### A. $2 \times 2$ Matrix

For a $2 \times 2$ matrix, the determinant is calculated simply:

$$
\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
$$

$$
\det(\mathbf{A}) = |\mathbf{A}| = ad - bc
$$

**Example: $2 \times 2$**

Let $\mathbf{A} = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$.

$$
\det(\mathbf{A}) = (4)(3) - (1)(2) = 12 - 2 = 10
$$

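The hand calculation is easy to sanity-check in code. The following is a minimal NumPy sketch (illustrative only, not part of the committed file):

```python
import numpy as np

A = np.array([[4, 1],
              [2, 3]])

# ad - bc = 4*3 - 1*2 = 10
print(np.linalg.det(A))  # ~10.0 (floating-point result; mathematically exactly 10)
```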
### B. $3 \times 3$ Matrix (Method of Cofactors)

For larger matrices, the calculation is more complex and typically involves the **cofactor expansion** method. This process recursively breaks down the determinant into determinants of smaller sub-matrices.

For a $3 \times 3$ matrix $\mathbf{A}$:

$$
\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
$$

Expanding along the first row:

$$
\det(\mathbf{A}) = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
$$

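The expansion can also be coded directly and compared against NumPy. This is a sketch with an arbitrary example matrix chosen only for the demo:

```python
import numpy as np

A = np.array([[2, 0, 1],
              [3, 5, 2],
              [1, 4, 6]], dtype=float)

def det3_cofactor(m):
    """Expand along the first row: a11*M11 - a12*M12 + a13*M13."""
    def det2(a, b, c, d):
        return a * d - b * c
    return (m[0, 0] * det2(m[1, 1], m[1, 2], m[2, 1], m[2, 2])
            - m[0, 1] * det2(m[1, 0], m[1, 2], m[2, 0], m[2, 2])
            + m[0, 2] * det2(m[1, 0], m[1, 1], m[2, 0], m[2, 1]))

print(det3_cofactor(A))   # 51.0
print(np.linalg.det(A))   # ~51.0, same result up to floating-point error
```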
## 3. Determinants in Machine Learning: Invertibility

The most crucial role of the determinant in ML, particularly in classical linear models, is determining whether a matrix is **invertible**.

### A. Non-Singular (Invertible) Matrix

If $\det(\mathbf{A}) \ne 0$:

* The matrix $\mathbf{A}$ has an **inverse** ($\mathbf{A}^{-1}$).
* The system of linear equations represented by $\mathbf{A}\mathbf{x} = \mathbf{b}$ has a unique solution.

### B. Singular (Non-Invertible) Matrix

If $\det(\mathbf{A}) = 0$:

* The matrix $\mathbf{A}$ is **singular** (or degenerate) and **does not have an inverse**.
* The transformation compresses the space into a lower dimension (e.g., a 3D volume collapses onto a 2D plane, hence the volume scaling factor is zero).
* The columns (or rows) of $\mathbf{A}$ are **linearly dependent** (redundant).

:::caution ML Application: Normal Equation
In Linear Regression, the closed-form solution (Normal Equation) is given by:

$$
\mathbf{w} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}
$$

This equation requires the matrix $(\mathbf{X}^T\mathbf{X})$ to be invertible. If $\det(\mathbf{X}^T\mathbf{X}) = 0$, the inverse does not exist, and the Normal Equation cannot be solved directly. This often happens if features are perfectly correlated (multicollinearity). **Regularization techniques (like Ridge Regression) address exactly this problem: by adding a penalty term, the matrix that must be inverted becomes $(\mathbf{X}^T\mathbf{X} + \lambda\mathbf{I})$, which is guaranteed to be invertible for any $\lambda > 0$.**
:::

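A small sketch of this failure mode (the duplicated-feature data below is made up purely for the demo): with perfectly correlated columns, $\mathbf{X}^T\mathbf{X}$ is singular, and a Ridge-style shift restores invertibility.

```python
import numpy as np

# Two perfectly correlated features: the second column is exactly 2x the first.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

XtX = X.T @ X
print(np.linalg.det(XtX))   # ~0: XtX is singular, so the plain Normal Equation breaks

# Ridge-style fix: add lambda * I before solving.
lam = 0.1
w = np.linalg.solve(XtX + lam * np.eye(2), X.T @ y)
print(w)                    # a usable weight vector despite the multicollinearity
```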
## 4. Properties of the Determinant

1. **Identity Matrix:** $\det(\mathbf{I}) = 1$.
2. **Transpose:** The determinant of a matrix is equal to the determinant of its transpose: $\det(\mathbf{A}) = \det(\mathbf{A}^T)$.
3. **Product:** The determinant of a product of matrices is the product of their determinants: $\det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A})\det(\mathbf{B})$ (checked numerically in the sketch below).

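These properties are easy to verify numerically. The matrices below are arbitrary illustrative values, not part of the committed file:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 2.0], [0.0, 5.0]])

print(np.linalg.det(np.eye(2)))                                                # 1.0
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))                        # True
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))   # True
```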
---

The concept of invertibility, driven by the determinant, leads directly to the next crucial topic: finding the matrix inverse itself.
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@

---
title: Diagonalization
sidebar_label: Diagonalization
description: "Understanding matrix diagonalization, its geometric meaning as a change of basis, and how it simplifies matrix computations, especially in complex systems and Markov chains."
tags:
  [
    diagonalization,
    linear-algebra,
    mathematics-for-ml,
    change-of-basis,
    markov-chains,
    eigen-decomposition,
  ]
---

**Diagonalization** is the process of transforming a square matrix $\mathbf{A}$ into an equivalent diagonal matrix $\mathbf{D}$ by using its eigenvectors. This process is fundamentally a **change of basis** that simplifies many complex matrix operations, particularly when dealing with repetitive transformations.

## 1. The Diagonalization Formula

A square matrix $\mathbf{A}$ is diagonalizable if and only if it has a full set of linearly independent eigenvectors. If it is diagonalizable, it can be written as:

$$
\mathbf{A} = \mathbf{P} \mathbf{D} \mathbf{P}^{-1}
$$

Let's break down the components:

| Component         | Role               | Description                                                                                 |
| :---------------- | :----------------- | :------------------------------------------------------------------------------------------ |
| $\mathbf{A}$      | Original Matrix    | The linear transformation we want to analyze.                                                |
| $\mathbf{P}$      | Eigenvector Matrix | Columns are the linearly independent eigenvectors of $\mathbf{A}$.                          |
| $\mathbf{D}$      | Diagonal Matrix    | A diagonal matrix whose diagonal entries are the corresponding eigenvalues of $\mathbf{A}$. |
| $\mathbf{P}^{-1}$ | Inverse Matrix     | The inverse of the eigenvector matrix.                                                       |

:::tip Connection to Eigen-Decomposition
The diagonalization formula is simply a rearrangement of the Eigen-Decomposition formula we saw earlier: $\mathbf{A} = \mathbf{V} \mathbf{\Lambda} \mathbf{V}^{-1}$. Here, $\mathbf{P}$ is the matrix of eigenvectors ($\mathbf{V}$), and $\mathbf{D}$ is the diagonal matrix of eigenvalues ($\mathbf{\Lambda}$).
:::

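As a quick illustration (a sketch with an arbitrary example matrix, not part of the committed page), NumPy's `eig` returns exactly these pieces, and multiplying them back together recovers $\mathbf{A}$:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)   # P: eigenvectors as columns
D = np.diag(eigenvalues)            # D: eigenvalues on the diagonal

# A == P D P^{-1} (up to floating-point error)
A_rebuilt = P @ D @ np.linalg.inv(P)
print(np.allclose(A, A_rebuilt))    # True
```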
## 2. The Geometric Meaning: Change of Basis

The true power of diagonalization lies in its geometric interpretation: **it describes the transformation $\mathbf{A}$ from a simpler perspective.**

* **Step 1: $\mathbf{P}^{-1}$ (Changing the Basis):** This transforms the coordinate system from the standard basis (x, y axes) into the **eigenbasis** (the axes defined by the eigenvectors).
* **Step 2: $\mathbf{D}$ (The Simple Transformation):** In this new eigenbasis, the complex transformation $\mathbf{A}$ simply becomes a scaling operation $\mathbf{D}$. Diagonal matrices only scale vectors along the axes, which is the easiest transformation possible!
* **Step 3: $\mathbf{P}$ (Changing Back):** This transforms the result back from the eigenbasis into the standard coordinate system.

The complex transformation $\mathbf{A}$ can therefore be understood as: **Change to Eigenbasis $\rightarrow$ Scale $\rightarrow$ Change Back**.

## 3. Application: Simplifying Powers of a Matrix

Calculating high powers of a matrix, such as $\mathbf{A}^{100}$, is computationally intensive and tedious. Diagonalization makes this trivial.

If $\mathbf{A} = \mathbf{P} \mathbf{D} \mathbf{P}^{-1}$, then:

$$
\mathbf{A}^2 = (\mathbf{P} \mathbf{D} \mathbf{P}^{-1})(\mathbf{P} \mathbf{D} \mathbf{P}^{-1})
$$

Since $\mathbf{P}^{-1}\mathbf{P} = \mathbf{I}$ (the Identity Matrix):

$$
\mathbf{A}^2 = \mathbf{P} \mathbf{D} (\mathbf{P}^{-1}\mathbf{P}) \mathbf{D} \mathbf{P}^{-1} = \mathbf{P} \mathbf{D} \mathbf{I} \mathbf{D} \mathbf{P}^{-1} = \mathbf{P} \mathbf{D}^2 \mathbf{P}^{-1}
$$

For any power $k$:

$$
\mathbf{A}^k = \mathbf{P} \mathbf{D}^k \mathbf{P}^{-1}
$$

### Why this is simple

The power of a diagonal matrix $\mathbf{D}^k$ is found simply by raising each diagonal element to the power $k$.

If $\mathbf{D} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$, then $\mathbf{D}^3 = \begin{bmatrix} 2^3 & 0 \\ 0 & 3^3 \end{bmatrix} = \begin{bmatrix} 8 & 0 \\ 0 & 27 \end{bmatrix}$.

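A hedged sketch of this shortcut (using an arbitrary diagonalizable matrix, for illustration only), comparing the diagonalization route against NumPy's direct matrix power:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)

k = 10
# D^k is just the eigenvalues raised to the k-th power, placed on the diagonal.
Dk = np.diag(eigenvalues ** k)
A_power = P @ Dk @ np.linalg.inv(P)

print(np.allclose(A_power, np.linalg.matrix_power(A, k)))  # True
```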
## 4. Application in ML: Markov Chains

Diagonalization is critical for analyzing **Markov Chains**, which model systems (like user behavior, or language transitions) that change state over time.

* The system's transition probabilities are captured in a matrix $\mathbf{A}$.
* The state of the system after many time steps ($k \to \infty$) is given by $\mathbf{A}^k$.
* By diagonalizing $\mathbf{A}$, we can easily compute $\mathbf{A}^k$ to find the **long-term steady state** (equilibrium) of the system, which is crucial for modeling language, search engine rankings (PageRank), and customer journey analysis (see the sketch below).

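As a toy illustration (the two-state transition probabilities and state names are invented for the demo), raising a column-stochastic transition matrix to a high power via diagonalization reveals the steady state:

```python
import numpy as np

# Column-stochastic transition matrix: each column sums to 1.
# State 0 = "browsing", state 1 = "buying" (made-up probabilities).
A = np.array([[0.9, 0.5],
              [0.1, 0.5]])

eigenvalues, P = np.linalg.eig(A)
k = 50
A_k = P @ np.diag(eigenvalues ** k) @ np.linalg.inv(P)

x0 = np.array([1.0, 0.0])   # start fully in state 0
print(A_k @ x0)             # -> approx [0.833, 0.167], the long-term steady state
```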
## Conclusion of Linear Algebra

You have successfully completed the foundational concepts of Linear Algebra! You now understand the basic data structures (scalars, vectors, matrices, tensors), the core operations (multiplication, transpose, inverse), and the decompositions (Eigen-Decomposition, SVD) that underpin all modern Machine Learning algorithms.

---

Your next module will delve into Calculus, the mathematics of change, which is the engine that drives the learning process in ML models.
