Linear algebra is the workhorse of machine learning. Nearly every machine learning algorithm — from linear regression to deep neural networks — relies on vectors, matrices, and the operations between them.
This notebook covers the essentials, supplementing lecture 3 of the course:
| Topic | What you'll learn |
|---|---|
| Vectors | Creation, addition, subtraction, scalar multiplication |
| Matrices | Arithmetic, transpose, multiplication (three views) |
| Linear maps | How matrices transform space, Gram matrices |
| Interactive explorer | Drag a 2x2 matrix and watch geometry change |
All computations use PyTorch tensors, the same objects you'll use to build neural networks.
A vector is an ordered list of numbers. In machine learning we usually think of vectors as column vectors:

$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \in \mathbb{R}^n$$
Two views: a vector can be read as a point in $\mathbb{R}^n$, or as an arrow from the origin to that point.
In PyTorch, vectors are 1-D tensors.
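A minimal sketch of creating two such vectors (the names `v` and `w` and their values match the output shown below and are reused throughout this section):

```python
import torch

# Vectors as 1-D tensors
v = torch.tensor([2., 1.])
w = torch.tensor([1., 3.])
print(v)
print(w)
```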
```
tensor([2., 1.])
tensor([1., 3.])
```
Geometrically, $\mathbf{v} + \mathbf{w}$ is the diagonal of the parallelogram formed by $\mathbf{v}$ and $\mathbf{w}$.
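A short sketch of the basic vector operations, continuing with `v` and `w` from above; the results in the comments are what PyTorch prints for these values:

```python
# Elementwise addition, subtraction, and scalar multiplication
print(v + w)   # tensor([3., 4.]), the parallelogram diagonal
print(v - w)   # tensor([ 1., -2.])
print(2 * v)   # tensor([4., 2.])
```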
A matrix is a rectangular array of numbers — a 2-D tensor in PyTorch:
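A minimal sketch of creating the two matrices shown in the output below (the names `A` and `B` are assumed):

```python
# Matrices as 2-D tensors
A = torch.tensor([[1., 2.],
                  [3., 4.]])
B = torch.tensor([[5., 6.],
                  [7., 8.]])
print(A)
print(B)
```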
```
tensor([[1., 2.],
        [3., 4.]])
tensor([[5., 6.],
        [7., 8.]])
```
| Operation | Formula | PyTorch |
|---|---|---|
| Addition | $A + B$ | `A + B` |
| Scalar multiplication | $cA$ | `c * A` |
| Transpose | $A^\top$ | `A.T` |
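A short sketch of these three operations on `A` and `B`, producing the output shown below:

```python
print(A + B)   # elementwise addition
print(3 * A)   # scalar multiplication
print(A.T)     # transpose (rows and columns swapped)
```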
```
tensor([[ 6.,  8.],
        [10., 12.]])
tensor([[ 3.,  6.],
        [ 9., 12.]])
tensor([[1., 3.],
        [2., 4.]])
```
The dot product (inner product) of two vectors $\mathbf{v}, \mathbf{w} \in \mathbb{R}^n$ is a scalar:

$$\mathbf{v} \cdot \mathbf{w} = \mathbf{v}^\top \mathbf{w} = \sum_{i=1}^{n} v_i w_i$$
The outer product $\mathbf{v}\mathbf{w}^\top$ produces a matrix whose $(i, j)$ entry is $v_i w_j$:
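A minimal sketch of both products for the vectors `v` and `w` defined earlier; the output follows:

```python
print(torch.dot(v, w).item())   # 5.0  (2*1 + 1*3)
print(torch.outer(v, w))        # entry (i, j) is v[i] * w[j]
```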
```
5.0
tensor([[2., 6.],
        [1., 3.]])
```
For $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, the product $AB \in \mathbb{R}^{m \times p}$.
Three equivalent views:
1. Entry-wise: $(AB)_{ij} = \sum_k A_{ik} B_{kj}$ — each entry is a dot product of a row of $A$ with a column of $B$.
2. Column view: Each column of $AB$ is a linear combination of the columns of $A$, with coefficients from the corresponding column of $B$.
3. Row view: Each row of $AB$ is a linear combination of the rows of $B$, with coefficients from the corresponding row of $A$.
In PyTorch, the @ operator is used for matrix multiplication:
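A quick sketch using the matrices defined earlier, also verifying the column view numerically:

```python
C = A @ B
print(C)

# Column view: the j-th column of C is A times the j-th column of B
print(torch.allclose(C[:, 0], A @ B[:, 0]))   # True
print(torch.allclose(C[:, 1], A @ B[:, 1]))   # True
```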
Multiplying a matrix by a vector applies a linear transformation: $\mathbf{x} \mapsto A\mathbf{x}$.
The output $A\mathbf{x}$ is a linear combination of the columns of $A$, weighted by the entries of $\mathbf{x}$. This means:

$$A\mathbf{x} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n,$$

where $\mathbf{a}_i$ denotes the $i$-th column of $A$.
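A small sketch of this column-combination view (the test vector `x` here is an arbitrary choice for illustration):

```python
x = torch.tensor([2., -1.])

# A @ x equals x[0] * (first column of A) + x[1] * (second column of A)
print(A @ x)                             # tensor([0., 2.])
print(x[0] * A[:, 0] + x[1] * A[:, 1])   # tensor([0., 2.]), same result
```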
The Gram matrix $A^\top A$ captures the inner products between the columns of $A$: entry $(i, j)$ is $\mathbf{a}_i^\top \mathbf{a}_j$. It is always symmetric ($(A^\top A)^\top = A^\top A$) and positive semi-definite.
The Gram matrix appears in PCA, kernel methods, and the normal equations for least squares.
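A minimal sketch computing the Gram matrix of `A` and checking its symmetry; the output follows:

```python
G = A.T @ A
print(G)           # entries are inner products between columns of A
print(G == G.T)    # symmetric: every entry is True
```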
```
tensor([[10., 14.],
        [14., 20.]])
tensor([[True, True],
        [True, True]])
```
Drag the matrix entries below to see how different matrices transform the plane. Try these classic transformations:
| Transformation | Matrix |
|---|---|
| Identity | $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ |
| Scaling | $\begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$ |
| Rotation by $\theta$ | $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ |
| Reflection (y-axis) | $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$ |
| Shear | $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$ |
Rotation by angle $\theta$ (counter-clockwise):

$$R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Reflection across the $y$-axis: $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$; across the $x$-axis: $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$.

Dilation (uniform scaling): $cI$ stretches ($c > 1$) or shrinks ($0 < c < 1$) all directions equally.
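A small sketch applying a rotation to a basis vector (the angle and the vector are arbitrary choices for illustration):

```python
import math

theta = math.pi / 2   # rotate 90 degrees counter-clockwise
R = torch.tensor([[math.cos(theta), -math.sin(theta)],
                  [math.sin(theta),  math.cos(theta)]])

e1 = torch.tensor([1., 0.])
print(R @ e1)   # approximately tensor([0., 1.]): the x-axis lands on the y-axis
```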
| Concept | Math | PyTorch |
|---|---|---|
| Vector addition | $\mathbf{v} + \mathbf{w}$ | `v + w` |
| Scalar multiplication | $c\mathbf{v}$ | `c * v` |
| Dot product | $\mathbf{v}^\top \mathbf{w}$ | `torch.dot(v, w)` |
| Outer product | $\mathbf{v}\mathbf{w}^\top$ | `torch.outer(v, w)` |
| Matrix multiply | $AB$ | `A @ B` |
| Transpose | $A^\top$ | `A.T` |
| Gram matrix | $A^\top A$ | `A.T @ A` |
Key takeaway: Every matrix encodes a linear map. Understanding how matrices act on vectors — stretching, rotating, reflecting — gives you geometric intuition for the transformations at the heart of machine learning.