Linear Maps

Requires: Basis & Dimension · Linear Transformations of the Plane

⚗ Dr. Möbius, from the lab

We built spaces. Now we build the bridges between them — but not just any functions, hell no. The good ones, the structure-respecting ones, the maps that don't betray the addition and scaling we worked so hard to axiomatize. They're called linear maps, and they hide a theorem so useful and so criminally under-advertised it makes me want to throw a beaker across the lab: a linear map is completely determined by what it does to a basis. Functions from the Sets stratum, plus spaces from this one. This node is the wedding.

THE BIG IDEA

A linear map preserves addition and scalar multiplication; it is fully determined by its action on a basis; its kernel and image are subspaces; and it is injective exactly when its kernel is $\{\mathbf{0}\}$.

The functions that respect the structure

Back in the Functions stratum a function was any rule sending inputs to outputs. Most functions are barbarians — they shred whatever structure they touch. A linear map (or linear transformation) is a function $T : V \to W$ between vector spaces that respects the two operations:

Additivity: $T(u + v) = T(u) + T(v)$ for all $u, v \in V$.
Homogeneity: $T(cv) = c\,T(v)$ for all $c \in \mathbb{R}$, $v \in V$.

Add-then-map equals map-then-add; scale-then-map equals map-then-scale. The structure survives the trip: the map cannot corrupt what it's supposed to preserve. (One immediate consequence, worth checking yourself: $T(\mathbf{0}) = \mathbf{0}$. Set $c = 0$ in homogeneity to get $T(\mathbf{0}) = T(0 \cdot v) = 0 \cdot T(v) = \mathbf{0}$. A linear map always pins the origin, no exceptions.) The 2D linear transformations from the Matrices stratum were exactly this, in the special case $V = W = \mathbb{R}^2$. Now we free the idea from arrows and let it do what it was always meant to do.
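The two conditions can be spot-checked on a handful of vectors. Here is a minimal Python sketch; the helper names and the two sample maps (`quarter_turn`, `shift`) are illustrative assumptions, not part of the lesson. A failed check is a genuine counterexample, but a passed check proves nothing: it only says no counterexample turned up among the samples.

```python
from itertools import product

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiplication."""
    return tuple(c * a for a in v)

def passes_linearity_spot_check(T, samples, scalars):
    """Return False as soon as additivity or homogeneity fails on the samples.

    True only means that no counterexample was found among the samples.
    """
    for u, v in product(samples, repeat=2):
        if T(add(u, v)) != add(T(u), T(v)):
            return False
    for c, v in product(scalars, samples):
        if T(scale(c, v)) != scale(c, T(v)):
            return False
    return True

# Illustrative maps (assumptions, not from the text).
def quarter_turn(v):
    x, y = v
    return (-y, x)      # rotation by 90 degrees: linear

def shift(v):
    x, y = v
    return (x + 1, y)   # moves the origin to (1, 0): not linear

samples = [(0, 0), (1, 0), (0, 1), (2, -3), (-1, 4)]
scalars = [-2, 0, 1, 3]

print(passes_linearity_spot_check(quarter_turn, samples, scalars))  # True
print(passes_linearity_spot_check(shift, samples, scalars))         # False
```

The shift fails at the very first pair, because it does not pin the origin.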

A linear map is determined by a basis (the theorem nobody frames)

Here is the most useful theorem in the subject and almost no textbook says it loudly. Let me say it loudly, clearly, and with genuine fury at everyone who buried it in a footnote.

Theorem. Let $B = \{v_1, \dots, v_n\}$ be a basis of $V$. If you know $T(v_1), \dots, T(v_n)$, just the images of the basis vectors, then $T(v)$ is forced for every $v \in V$. The map is completely pinned down.

Proof. Take any $v \in V$. Because $B$ is a basis, $v$ has unique coordinates (the uniqueness theorem from Basis & Dimension): $v = c_1 v_1 + \cdots + c_n v_n$. Now apply $T$ and use linearity repeatedly: $T(v) = T(c_1 v_1 + \cdots + c_n v_n) = c_1 T(v_1) + \cdots + c_n T(v_n),$ where additivity splits the sum and homogeneity pulls out each scalar. Every term on the right is known. So $T(v)$ is determined; there's no freedom left. $\blacksquare$

Read what that means: to specify a linear map out of an $n$-dimensional space, you don't supply infinitely many input-output pairs. You supply $n$ of them (where the basis goes) and linearity does the rest. That's a staggering compression, and it's why a finite matrix can capture a map defined on infinitely many inputs. If this doesn't make you feel something, check your pulse.
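The formula $T(v) = c_1 T(v_1) + \cdots + c_n T(v_n)$ is short enough to sketch in Python. The function name `apply_via_basis` and the sample basis and images below are hypothetical, chosen only to illustrate the theorem; the coordinates of $v$ are assumed to be already known.

```python
def apply_via_basis(coords, basis_images):
    """Compute T(v) = c_1*T(v_1) + ... + c_n*T(v_n).

    coords       -- the coordinates (c_1, ..., c_n) of v in the chosen basis
    basis_images -- the known outputs T(v_1), ..., T(v_n)
    """
    dim = len(basis_images[0])
    result = [0] * dim
    for c, image in zip(coords, basis_images):
        for i in range(dim):
            result[i] += c * image[i]
    return tuple(result)

# Hypothetical data: the basis v1 = (1, 1), v2 = (1, -1) of R^2,
# with T(v1) = (2, 0) and T(v2) = (0, 2).
basis_images = [(2, 0), (0, 2)]

# v = (3, 1) = 2*v1 + 1*v2, so its coordinates in this basis are (2, 1).
coords = (2, 1)

print(apply_via_basis(coords, basis_images))  # (4, 2)
```

Note that the function never sees a formula for $T$. Two stored outputs and the coordinates of the input are all it needs.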

The matrix of a linear map: the decoder ring generalizes

This is exactly the Matrices stratum's "columns are the images of the basis vectors" rule, now revealed as a theorem rather than a recipe. To build the matrix of $T : \mathbb{R}^n \to \mathbb{R}^m$, feed in each standard basis vector and write the outputs as columns: $A = \big[\; T(e_1) \;\big|\; T(e_2) \;\big|\; \cdots \;\big|\; T(e_n) \;\big].$ Then $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x}$, because $\mathbf{x} = x_1 e_1 + \cdots + x_n e_n$ and linearity gives $T(\mathbf{x}) = x_1 T(e_1) + \cdots + x_n T(e_n)$, which is precisely $A\mathbf{x}$ (a matrix-vector product is the combination of the columns weighted by the entries of $\mathbf{x}$). The 2D decoder ring you learned generalizes verbatim to any finite dimensions, and to any choice of bases (you read coordinates relative to the chosen bases). Same move, bigger stage.
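The column recipe can be sketched in a few lines of Python. The helper names (`matrix_of`, `mat_vec`) and the sample map from $\mathbb{R}^3$ to $\mathbb{R}^2$ are assumptions made for illustration; matrices are stored as plain lists of rows.

```python
def matrix_of(T, n):
    """Matrix of a linear map T : R^n -> R^m, stored as a list of rows.

    Column j is T(e_j), the image of the j-th standard basis vector.
    """
    columns = []
    for j in range(n):
        e_j = tuple(1 if i == j else 0 for i in range(n))
        columns.append(T(e_j))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)

# Illustrative linear map R^3 -> R^2 (an assumption, not from the text).
def T(v):
    x, y, z = v
    return (x + 2 * z, 3 * y - z)

A = matrix_of(T, 3)
print(A)              # [[1, 0, 2], [0, 3, -1]]

x = (4, -1, 2)
print(mat_vec(A, x))  # (8, -5)
print(T(x))           # (8, -5)
```

The recipe is only meaningful when `T` really is linear. For a non-linear function `matrix_of` still returns a matrix, but $A\mathbf{x}$ will generally disagree with $T(\mathbf{x})$.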

Kernel and image, and both are subspaces

Every linear map carries two subspaces with it, like a murderer's two weapons — the two questions "what gets crushed?" and "what gets hit?".

Kernel: $\ker(T) = \{v \in V : T(v) = \mathbf{0}\}$ — everything sent to zero (the Subspaces node's null space, generalized).
Image: $\operatorname{im}(T) = \{T(v) : v \in V\} \subseteq W$ — everything actually reached.

Theorem. $\ker(T)$ is a subspace of $V$ and $\operatorname{im}(T)$ is a subspace of $W$.

Proof (kernel). Three-condition test. (1) $T(\mathbf{0}) = \mathbf{0}$, so $\mathbf{0} \in \ker(T)$. (2) If $u, v \in \ker(T)$ then $T(u+v) = T(u) + T(v) = \mathbf{0} + \mathbf{0} = \mathbf{0}$, so $u + v \in \ker(T)$. (3) If $v \in \ker(T)$ and $c \in \mathbb{R}$, then $T(cv) = cT(v) = c\mathbf{0} = \mathbf{0}$, so $cv \in \ker(T)$. ✓

Proof (image). Again three conditions, in $W$. (1) $\mathbf{0} = T(\mathbf{0}) \in \operatorname{im}(T)$. (2) If $w_1 = T(u), w_2 = T(v)$ are in the image, then $w_1 + w_2 = T(u) + T(v) = T(u + v) \in \operatorname{im}(T)$. (3) For any scalar $c$, $cw_1 = cT(u) = T(cu) \in \operatorname{im}(T)$. ✓ Both are subspaces, both proved by the same machine from Subspaces. $\blacksquare$

kernel = {0} ⟺ injective (beautiful and short)

Now the gem. Recall injective (from the Functions stratum) means one-to-one: distinct inputs give distinct outputs.

Theorem. A linear map $T$ is injective $\iff \ker(T) = \{\mathbf{0}\}$.

Proof. ($\Rightarrow$) If $T$ is injective: $T(\mathbf{0}) = \mathbf{0}$ always, so no other vector can map to $\mathbf{0}$ (that would be two inputs sharing an output). Hence $\ker(T) = \{\mathbf{0}\}$.

($\Leftarrow$) Suppose $\ker(T) = \{\mathbf{0}\}$. Take any $u, v$ with $T(u) = T(v)$. Then by linearity $T(u - v) = T(u) - T(v) = \mathbf{0},$ so $u - v \in \ker(T) = \{\mathbf{0}\}$, forcing $u - v = \mathbf{0}$, i.e. $u = v$. So distinct inputs can't share an output, and $T$ is injective. $\blacksquare$

Savor how short that is. For a general function, injectivity means checking all pairs of inputs, potentially infinitely many of them. For a linear map, you check one thing: does anything besides $\mathbf{0}$ get crushed to $\mathbf{0}$? That's it. Linearity collapses a condition on all pairs of inputs into a single question about one subspace. This is what structure buys you, and this is why we went through the hell of those ten axioms.

A jolt: differentiation is a linear map

Final beat, and it should rearrange your skull a little. Take $P_n$, the polynomials of degree $\le n$ (with $n \ge 1$), and define the derivative as a formal rule: no calculus prerequisite, just the pattern $\frac{d}{dx}(x^k) = k x^{k-1}$ extended linearly. So $D(3x^2 - 5x + 7) = 6x - 5$. This $D : P_n \to P_{n-1}$ is a linear map: $D(p + q) = D(p) + D(q), \qquad D(cp) = c\,D(p).$ (Differentiation splitting over sums and pulling out constants: those rules are linearity, stated.)

Now compute its kernel: which polynomials does $D$ send to $\mathbf{0}$? Exactly the ones with derivative zero: the constants. So $\ker(D) = \{\text{constant polynomials}\} = \operatorname{span}(1), \quad \dim \ker(D) = 1.$ Hold on. WHAT? The kernel of differentiation is the constants. That bland, boring fact from a calculus class ("the derivative of a constant is zero") is secretly a statement about a linear map's null space: a subspace, one-dimensional, sitting inside a polynomial vector space. The whole machinery we built for $\mathbb{R}^n$ (kernel, image, subspace, dimension) governs calculus too, because calculus operators are linear maps. That's the business model of abstraction collecting its biggest dividend yet, and we didn't even need to build a reactor for it. File it away: it detonates across all of higher mathematics.
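The formal derivative is easy to sketch in Python if a polynomial is stored as its coefficient list $[a_0, a_1, a_2, \dots]$. That representation, the helper names, and the second polynomial `q` are assumptions made for this illustration.

```python
def D(p):
    """Formal derivative of a polynomial given as coefficients [a0, a1, a2, ...].

    The coefficient of x^k becomes k * a_k, attached to x^(k-1).
    """
    return [k * a for k, a in enumerate(p)][1:]

def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]

p = [7, -5, 3]   # 3x^2 - 5x + 7, the example from the text
q = [1, 0, 4]    # 4x^2 + 1, a hypothetical second polynomial

print(D(p))                                         # [-5, 6], i.e. 6x - 5
print(D(poly_add(p, q)) == poly_add(D(p), D(q)))    # True (additivity)
print(D(poly_scale(2, p)) == poly_scale(2, D(p)))   # True (homogeneity)

# A constant polynomial in P_2 is [a0, 0, 0]; D sends it to the zero polynomial.
print(D([9, 0, 0]))                                 # [0, 0]
```

Reading off the kernel from the code: `D(p)` is all zeros exactly when every coefficient after the first is zero, which is the statement that $\ker(D)$ is the constants.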

🔬 SPECIMENS (worked examples)

Worked example 1: is this map linear?

Is $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(x, y) = (2x - y,\, x)$ a linear map?

Check both conditions directly.

Additivity. Take $u = (x_1, y_1)$, $v = (x_2, y_2)$. Then $u + v = (x_1+x_2,\, y_1+y_2)$ and $T(u+v) = \big(2(x_1+x_2) - (y_1+y_2),\; x_1+x_2\big) = (2x_1 - y_1,\, x_1) + (2x_2 - y_2,\, x_2) = T(u) + T(v).$ ✓

Homogeneity. For a scalar $c$: $T(c x, c y) = (2cx - cy,\, cx) = c(2x - y,\, x) = c\,T(x,y)$. ✓

Both hold, so $T$ is linear. As a sanity check, $T(0,0) = (0, 0) = \mathbf{0}$: it pins the origin, as every linear map must. Its matrix (columns $T(e_1), T(e_2)$): $T(1,0) = (2,1)$, $T(0,1) = (-1, 0)$, giving $A = \begin{pmatrix} 2 & -1 \\ 1 & 0 \end{pmatrix}$.

Worked example 2: two data points pin the whole damn map

A linear map $T : \mathbb{R}^2 \to \mathbb{R}^2$ satisfies $T(1, 0) = (3, 1)$ and $T(0, 1) = (2, 4)$. Compute $T(5, -2)$.

No formula given, but I don't need one. The basis theorem says the images of $e_1, e_2$ determine everything. Write the input in coordinates: $(5, -2) = 5(1,0) + (-2)(0,1) = 5e_1 - 2e_2.$ Apply $T$ using linearity: $T(5, -2) = 5\,T(e_1) - 2\,T(e_2) = 5(3, 1) - 2(2, 4) = (15, 5) - (4, 8) = (11, -3).$ Done: $T(5,-2) = (11, -3)$. I knew the map only on two vectors and recovered its value on a third, because linearity forces it. (Equivalently: the matrix is $A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$ and $A(5,-2)^\top = (11, -3)^\top$.) Two data points pin an entire map on $\mathbb{R}^2$.

Worked example 3: kernel, image, and why this map is not injective

For $T : \mathbb{R}^2 \to \mathbb{R}^2$ with matrix $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$, find $\ker(T)$ and decide whether $T$ is injective.

$\ker(T) = \{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$ is the null space. Solve: $\begin{cases} x + 2y = 0 \\ 2x + 4y = 0 \end{cases}$ The second equation is twice the first, so the system is just $x + 2y = 0$, i.e. $x = -2y$. One free variable ($y$), so $\ker(T) = \{(-2y,\, y) : y \in \mathbb{R}\} = \operatorname{span}\big((-2, 1)\big),$ a line through the origin, not just $\{\mathbf{0}\}$.

By the theorem, $T$ is injective $\iff \ker(T) = \{\mathbf{0}\}$. Here the kernel is a whole line, so $T$ is not injective. Concretely, $T(-2, 1) = \mathbf{0} = T(0,0)$: two distinct inputs, same output. The redundant second column (it's twice the first) is what crushed a whole line to zero. A kernel bigger than $\{\mathbf{0}\}$ is the fingerprint of a non-injective linear map.
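For $2 \times 2$ matrices the kernel test can be sketched in Python. This version swaps in the determinant as a shortcut instead of the elimination used in the worked example, and the function name `kernel_vector_2x2` is hypothetical. It handles only the $2 \times 2$ case with exact (integer) entries.

```python
def kernel_vector_2x2(A):
    """Return a nonzero vector in ker(A) for a 2x2 matrix, or None if ker(A) = {0}."""
    (a, b), (c, d) = A
    if a * d - b * c != 0:
        return None        # nonzero determinant: only the zero vector is crushed
    if (a, b) != (0, 0):
        return (-b, a)     # row 1 gives a*(-b) + b*a = 0; row 2 gives a*d - b*c = 0
    if (c, d) != (0, 0):
        return (-d, c)     # first row is zero, so only the second row matters
    return (1, 0)          # zero matrix: every vector is in the kernel

def mat_vec(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)

A = [[1, 2], [2, 4]]       # the matrix from worked example 3
k = kernel_vector_2x2(A)
print(k)                   # (-2, 1)
print(mat_vec(A, k))       # (0, 0)

# The matrix from worked example 2 has trivial kernel, so that map is injective.
print(kernel_vector_2x2([[3, 2], [1, 4]]))  # None
```

A return value of `None` means the map is injective; any returned vector is a concrete witness that it is not.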

☠ KNOWN HAZARDS

Thinking every function is linear. $f(x)=x^2$ isn't (for example $f(1+1)=4$ but $f(1)+f(1)=2$, so $f(u+v)\ne f(u)+f(v)$), nor is $f(x)=x+1$ (it fails $f(0)=0$: it sends $0$ to $1$). Linearity is a strict, rare condition. Most functions are barbarians.
Confusing kernel with image. Kernel lives in the domain $V$ (what gets crushed to $\mathbf{0}$); image lives in the codomain $W$ (what gets reached). Different spaces, different jobs. Mix them up and your proof will be wrong and embarrassing.
Checking injectivity pair-by-pair for a linear map. Don't do that. Just check $\ker(T)=\{\mathbf{0}\}$. Linearity reduces a global condition to one kernel computation. Doing it the hard way wastes time and signals that the theorem hasn't landed yet.
Forgetting the basis theorem's power. You never need infinitely many input-output pairs to pin down a linear map on an $n$-dimensional space: $n$ of them (the basis images) suffice. Specifying more is redundant, because the extra values are already forced. Specifying fewer is incomplete. Exactly $n$, then you're done.

TL;DR

▸ A linear map $T:V\to W$ satisfies $T(u+v)=T(u)+T(v)$ and $T(cv)=cT(v)$: it respects addition and scaling. Consequence: $T(\mathbf{0})=\mathbf{0}$.
▸ A linear map is determined by its action on a basis: knowing $T(v_1),\dots,T(v_n)$ fixes $T(v)$ for all $v$, via $T(v)=c_1T(v_1)+\cdots+c_nT(v_n)$.
▸ Its matrix has columns $T(e_1), \dots, T(e_n)$ (the 2D decoder ring, generalized), so $T(\mathbf{x})=A\mathbf{x}$.
▸ Kernel $\ker(T)=\{v:T(v)=\mathbf{0}\}$ and image $\operatorname{im}(T)=\{T(v)\}$ are both subspaces (three-condition test). And $T$ is injective $\iff \ker(T)=\{\mathbf{0}\}$ (a two-line proof).
▸ Beyond $\mathbb{R}^n$: differentiation $D:P_n\to P_{n-1}$ (as a formal rule) is linear, and $\ker(D)=$ the constants. Calculus runs on linear maps.

Unlocks: Rank–Nullity