Linear Algebra for Machine Learning

Linear algebra is the workhorse of machine learning. Nearly every machine learning algorithm — from linear regression to deep neural networks — relies on vectors, matrices, and the operations between them.

This notebook covers the essentials, supplementing lecture 3:

| Topic | What you'll learn |
| --- | --- |
| Vectors | Creation, addition, subtraction, scalar multiplication |
| Matrices | Arithmetic, transpose, multiplication (three views) |
| Linear maps | How matrices transform space, Gram matrices |
| Interactive explorer | Drag a 2x2 matrix and watch geometry change |

All computations use PyTorch tensors, the same objects you'll use to build neural networks.

Vectors

A vector is an ordered list of numbers. In machine learning we usually think of vectors as column vectors:

$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \in \mathbb{R}^n$$

Two views:

  • Algebraic: a vector is a tuple of coordinates.
  • Geometric: a vector is an arrow from the origin to a point in space.

In PyTorch, vectors are 1-D tensors.

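A minimal cell to create the two example vectors used throughout this notebook (a sketch reconstructed from the output shown below):

```python
import torch

# two example vectors in R^2, stored as 1-D tensors
v = torch.tensor([2., 1.])
w = torch.tensor([1., 3.])
v, w
```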
tensor([2., 1.])
tensor([1., 3.])

Vector addition & subtraction

$$\mathbf{v} + \mathbf{w} = \begin{bmatrix} v_1 + w_1 \\ v_2 + w_2 \end{bmatrix}, \qquad \mathbf{v} - \mathbf{w} = \begin{bmatrix} v_1 - w_1 \\ v_2 - w_2 \end{bmatrix}$$

Geometrically, $\mathbf{v} + \mathbf{w}$ is the diagonal of the parallelogram formed by $\mathbf{v}$ and $\mathbf{w}$.

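A quick numerical check (a sketch, assuming the vectors `v` and `w` defined above):

```python
print(v + w)   # tensor([3., 4.])
print(v - w)   # tensor([1., -2.])
```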

Scalar multiplication

$$c \cdot \mathbf{v} = \begin{bmatrix} c \, v_1 \\ c \, v_2 \end{bmatrix}$$

  • $|c| > 1$: stretches the vector
  • $|c| < 1$: shrinks the vector
  • $c < 0$: reverses direction
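For example, with the vector `v = [2., 1.]` defined earlier (a sketch; the printed values are what PyTorch would show):

```python
print(2.0 * v)    # tensor([4., 2.])            |c| > 1: stretched
print(0.5 * v)    # tensor([1.0000, 0.5000])    |c| < 1: shrunk
print(-1.0 * v)   # tensor([-2., -1.])          c < 0: direction reversed
```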

Matrices

A matrix is a rectangular array of numbers — a 2-D tensor in PyTorch:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \in \mathbb{R}^{m \times n}$$
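A sketch of the cell that creates the two example matrices whose values appear in the output below:

```python
A = torch.tensor([[1., 2.],
                  [3., 4.]])
B = torch.tensor([[5., 6.],
                  [7., 8.]])
A, B
```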
tensor([[1., 2.],
        [3., 4.]])
tensor([[5., 6.],
        [7., 8.]])

Matrix arithmetic

| Operation | Formula | PyTorch |
| --- | --- | --- |
| Addition | $A + B$ | `A + B` |
| Scalar multiplication | $cA$ | `c * A` |
| Transpose | $A^\top$ | `A.T` |
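Applying each operation to the matrices defined above (a sketch; the scalar is 3, matching the output shown below):

```python
print(A + B)   # element-wise sum
print(3 * A)   # scalar multiplication
print(A.T)     # transpose
```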
tensor([[ 6.,  8.],
        [10., 12.]])
tensor([[ 3.,  6.],
        [ 9., 12.]])
tensor([[1., 3.],
        [2., 4.]])

Dot product & outer product

The dot product (inner product) of two vectors $\mathbf{v}, \mathbf{w} \in \mathbb{R}^n$:

$$\mathbf{v}^\top \mathbf{w} = \sum_{i=1}^{n} v_i \, w_i \in \mathbb{R}$$

The outer product produces a matrix:

$$\mathbf{v} \, \mathbf{w}^\top = \begin{bmatrix} v_1 w_1 & v_1 w_2 \\ v_2 w_1 & v_2 w_2 \end{bmatrix} \in \mathbb{R}^{n \times n}$$
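With the vectors from the start of the notebook (a sketch; `.item()` converts the zero-dimensional tensor returned by `torch.dot` into a plain Python float, matching the `5.0` shown below):

```python
print(torch.dot(v, w).item())   # 2*1 + 1*3 = 5.0
print(torch.outer(v, w))        # 2x2 matrix with entries v_i * w_j
```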
5.0
tensor([[2., 6.],
        [1., 3.]])

Matrix multiplication

For $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, the product $C = AB \in \mathbb{R}^{m \times p}$.

Three equivalent views:

  1. Entry-wise: $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$; each entry is a dot product of a row of $A$ with a column of $B$.

  2. Column view: Each column of $C$ is a linear combination of the columns of $A$, with coefficients from the corresponding column of $B$.

  3. Row view: Each row of $C$ is a linear combination of the rows of $B$, with coefficients from the corresponding row of $A$.

In PyTorch, the @ operator is used for matrix multiplication:

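A sketch that computes the product of the example matrices and then asserts the three views numerically; only the product itself is printed, matching the output below:

```python
C = A @ B
print(C)

# entry-wise view: C[i, j] is a dot product of row i of A with column j of B
assert torch.isclose(C[0, 1], torch.dot(A[0, :], B[:, 1]))

# column view: each column of C is A applied to the corresponding column of B
assert torch.allclose(C[:, 0], A @ B[:, 0])

# row view: each row of C is the corresponding row of A applied to B
assert torch.allclose(C[1, :], A[1, :] @ B)
```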
tensor([[19., 22.],
        [43., 50.]])

Matrix as a linear map

Multiplying a matrix by a vector applies a linear transformation:

$$\mathbf{y} = A\mathbf{x} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}$$

The output is a linear combination of the columns of $A$, weighted by the entries of $\mathbf{x}$. This means:

  • The columns of $A$ are where the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2$ get mapped.
  • Lines through the origin stay lines (a consequence of linearity).
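Both facts can be checked with the matrix `A` from earlier (a sketch; the vector `x` is an arbitrary example):

```python
e1 = torch.tensor([1., 0.])
e2 = torch.tensor([0., 1.])
print(A @ e1, A @ e2)   # the two columns of A: tensor([1., 3.]) tensor([2., 4.])

x = torch.tensor([2., -1.])
print(A @ x)                              # tensor([0., 2.])
print(x[0] * A[:, 0] + x[1] * A[:, 1])    # same result as a combination of columns
```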

Gram matrix

The Gram matrix $G = A^\top A$ captures the inner products between the columns of $A$. It is always symmetric ($G = G^\top$) and positive semi-definite.

The Gram matrix appears in PCA, kernel methods, and the normal equations for least squares.

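A sketch of the cell producing the output below: compute the Gram matrix of `A` and check its symmetry element-wise (the comparison is assumed to be `G == G.T`):

```python
G = A.T @ A
print(G)          # inner products between the columns of A
print(G == G.T)   # symmetric: every entry is True
```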
tensor([[10., 14.],
        [14., 20.]])
tensor([[True, True],
        [True, True]])

Interactive: explore 2D transformations

Drag the matrix entries below to see how different matrices transform the plane. Try these classic transformations:

| Transformation | Matrix |
| --- | --- |
| Identity | $\begin{bmatrix}1&0\\0&1\end{bmatrix}$ |
| Scaling | $\begin{bmatrix}s_x&0\\0&s_y\end{bmatrix}$ |
| Rotation by $\theta$ | $\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}$ |
| Reflection (y-axis) | $\begin{bmatrix}-1&0\\0&1\end{bmatrix}$ |
| Shear | $\begin{bmatrix}1&k\\0&1\end{bmatrix}$ |

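Any of these transformations can be reproduced numerically. A sketch applying a shear (with an assumed `k = 0.5`) to the standard basis vectors and one corner of the unit square:

```python
S = torch.tensor([[1., 0.5],
                  [0., 1.]])           # shear with k = 0.5
points = torch.tensor([[1., 0.],       # e1
                       [0., 1.],       # e2
                       [1., 1.]]).T    # corner of the unit square
print(S @ points)                      # columns are the transformed points
```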
[Interactive 2x2 matrix explorer widget]

Geometric formulas

Rotation by angle $\theta$ (counter-clockwise):

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Reflection across the $x$-axis: $\begin{bmatrix}1&0\\0&-1\end{bmatrix}$, across the $y$-axis: $\begin{bmatrix}-1&0\\0&1\end{bmatrix}$.

Dilation (uniform scaling): $\begin{bmatrix}c&0\\0&c\end{bmatrix}$ stretches ($c>1$) or shrinks ($0<c<1$) all directions equally.
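A sketch constructing a rotation matrix and applying it (assuming a 90-degree angle for illustration):

```python
import math

theta = math.pi / 2                  # 90 degrees, counter-clockwise
R = torch.tensor([[math.cos(theta), -math.sin(theta)],
                  [math.sin(theta),  math.cos(theta)]])
print(R @ torch.tensor([1., 0.]))    # e1 maps to (approximately) [0., 1.]
```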


Summary

| Concept | Math | PyTorch |
| --- | --- | --- |
| Vector addition | $\mathbf{v} + \mathbf{w}$ | `v + w` |
| Scalar multiplication | $c\mathbf{v}$ | `c * v` |
| Dot product | $\mathbf{v}^\top\mathbf{w}$ | `torch.dot(v, w)` |
| Outer product | $\mathbf{v}\mathbf{w}^\top$ | `torch.outer(v, w)` |
| Matrix multiply | $AB$ | `A @ B` |
| Transpose | $A^\top$ | `A.T` |
| Gram matrix | $A^\top A$ | `A.T @ A` |

Key takeaway: Every matrix encodes a linear map. Understanding how matrices act on vectors — stretching, rotating, reflecting — gives you geometric intuition for the transformations at the heart of machine learning.