Linear Maps

⚗ Dr. Möbius, from the lab

We built spaces. Now we build the bridges between them — but not just any functions, hell no. The good ones, the structure-respecting ones, the maps that don't betray the addition and scaling we worked so hard to axiomatize. They're called linear maps, and they hide a theorem so useful and so criminally under-advertised it makes me want to throw a beaker across the lab: a linear map is completely determined by what it does to a basis. Functions from the Sets stratum, plus spaces from this one. This node is the wedding.

THE BIG IDEA

A linear map preserves addition and scalar multiplication; it is fully determined by its action on a basis, and its kernel and image are subspaces with kernel = {0} exactly when the map is injective.

The functions that respect the structure

Back in the Functions stratum a function was any rule sending inputs to outputs. Most functions are barbarians: they shred whatever structure they touch. A linear map (or linear transformation) is a function $T : V \to W$ between vector spaces that respects the two operations:

  1. Additivity: $T(u + v) = T(u) + T(v)$ for all $u, v \in V$.
  2. Homogeneity: $T(cv) = c\,T(v)$ for all $c \in \mathbb{R}$, $v \in V$.

Add-then-map equals map-then-add; scale-then-map equals map-then-scale. The structure survives the trip: the map cannot corrupt what it's supposed to preserve. (One immediate consequence, worth proving yourself: $T(\mathbf{0}) = \mathbf{0}$; set $c = 0$ in homogeneity. A linear map always pins the origin, no exceptions.) The 2D linear transformations from the Matrices stratum were exactly this, in the special case $V = W = \mathbb{R}^2$. Now we free the idea from arrows and let it do what it was always meant to do.
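The two conditions can be spot-checked by machine. The sketch below is only an illustration: the helper names (`add`, `scale`, `looks_linear`) and the two sample maps are hypothetical, and a passing check is evidence rather than proof, since it only tries finitely many random integer vectors. A failing check, on the other hand, is a genuine counterexample.

```python
import random

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiplication."""
    return tuple(c * a for a in v)

def looks_linear(T, dim, trials=200, seed=0):
    """Spot-check additivity and homogeneity on random integer vectors.

    Returns False as soon as either condition fails; True only means
    no counterexample was found among the sampled inputs.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(rng.randint(-9, 9) for _ in range(dim))
        v = tuple(rng.randint(-9, 9) for _ in range(dim))
        c = rng.randint(-9, 9)
        if T(add(u, v)) != add(T(u), T(v)):
            return False
        if T(scale(c, u)) != scale(c, T(u)):
            return False
    return True

def respects(v):
    """Hypothetical sample map (x, y) -> (2x - y, x)."""
    x, y = v
    return (2 * x - y, x)

def shifts(v):
    """Hypothetical sample map (x, y) -> (x + 1, y); it moves the origin."""
    x, y = v
    return (x + 1, y)

print(looks_linear(respects, 2))  # True
print(looks_linear(shifts, 2))    # False
```

The second map fails on the very first trial: adding 1 before or after summing gives different answers, which is the same reason it cannot pin the origin.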

A linear map is determined by a basis (the theorem nobody frames)

Here is the most useful theorem in the subject and almost no textbook says it loudly. Let me say it loudly, clearly, and with genuine fury at everyone who buried it in a footnote.

Theorem. Let $B = \{v_1, \dots, v_n\}$ be a basis of $V$. If you know $T(v_1), \dots, T(v_n)$ (just the images of the basis vectors), then $T(v)$ is forced for every $v \in V$. The map is completely pinned down.

Proof. Take any $v \in V$. Because $B$ is a basis, $v$ has unique coordinates (the uniqueness theorem from Basis & Dimension): $v = c_1 v_1 + \cdots + c_n v_n$. Now apply $T$ and use linearity repeatedly: $$T(v) = T(c_1 v_1 + \cdots + c_n v_n) = c_1 T(v_1) + \cdots + c_n T(v_n),$$ where additivity splits the sum and homogeneity pulls out each scalar. Every term on the right is known. So $T(v)$ is determined; there's no freedom left. $\blacksquare$

Read what that means: to specify a linear map out of an $n$-dimensional space, you don't supply infinitely many input-output pairs. You supply $n$ of them (where the basis goes) and linearity does the rest. That's a staggering compression, and it's why finite matrices can capture maps on infinite sets of inputs. If this doesn't make you feel something, check your pulse.
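The formula in the proof is short enough to run. Here is a minimal sketch, assuming the coordinates of $v$ in the chosen basis are already known; the function name `apply_via_basis`, the basis $\{(1,1), (1,-1)\}$ and the two images are all made up for illustration.

```python
def apply_via_basis(coords, images):
    """Return c_1*T(v_1) + ... + c_n*T(v_n).

    coords: the coordinates (c_1, ..., c_n) of v in the chosen basis
    images: the known outputs T(v_1), ..., T(v_n), each a tuple in W
    """
    if len(coords) != len(images):
        raise ValueError("need exactly one image per basis vector")
    m = len(images[0])
    result = [0] * m
    for c, image in zip(coords, images):
        for i in range(m):
            result[i] += c * image[i]
    return tuple(result)

# Hypothetical data: a basis of R^2 and where T sends each basis vector.
basis = [(1, 1), (1, -1)]
images = [(2, 0), (0, 3)]

# v = (3, 1) has coordinates (2, 1) in this basis: 2*(1, 1) + 1*(1, -1).
coords = (2, 1)
print(apply_via_basis(coords, basis))   # (3, 1), reconstructs v itself
print(apply_via_basis(coords, images))  # (4, 3), the forced value of T(v)
```

The same linear combination is used twice: once on the basis vectors to rebuild $v$, once on their images to produce $T(v)$. That is the whole theorem in two calls.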

The matrix of a linear map: the decoder ring generalizes

This is exactly the Matrices stratum's "columns are the images of the basis vectors" rule, now revealed as a theorem rather than a recipe. To build the matrix of $T : \mathbb{R}^n \to \mathbb{R}^m$, feed in each standard basis vector and write the outputs as columns: $$A = \big[\; T(e_1) \;\big|\; T(e_2) \;\big|\; \cdots \;\big|\; T(e_n) \;\big].$$ Then $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x}$, because $\mathbf{x} = x_1 e_1 + \cdots + x_n e_n$ and linearity gives $T(\mathbf{x}) = x_1 T(e_1) + \cdots + x_n T(e_n)$, which is precisely $A\mathbf{x}$. The 2D decoder ring you learned generalizes verbatim to any dimensions, and to any bases (you read coordinates relative to a chosen basis). Same move, bigger stage.
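The column recipe can be sketched directly. The helpers `matrix_of` and `mat_vec` and the sample map from $\mathbb{R}^3$ to $\mathbb{R}^2$ are hypothetical, chosen only to show a non-square case; matrices are stored as plain lists of rows.

```python
def matrix_of(T, n):
    """Build the matrix of T: R^n -> R^m as a list of rows.

    Column j of the result is T(e_j), the image of the j-th standard
    basis vector.
    """
    columns = []
    for j in range(n):
        e_j = tuple(1 if i == j else 0 for i in range(n))
        columns.append(T(e_j))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by the vector x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def T(v):
    """Hypothetical sample map (x, y, z) -> (x + 2z, 3y - z)."""
    x, y, z = v
    return (x + 2 * z, 3 * y - z)

A = matrix_of(T, 3)
print(A)                                        # [[1, 0, 2], [0, 3, -1]]
print(mat_vec(A, (4, -1, 2)) == T((4, -1, 2)))  # True
```

Three function calls (one per basis vector) were enough to capture the map; after that, `mat_vec` reproduces `T` on any input.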

Kernel and image, and both are subspaces

Every linear map carries two subspaces with it, like a murderer's two weapons — the two questions "what gets crushed?" and "what gets hit?".

  • Kernel: $\ker(T) = \{v \in V : T(v) = \mathbf{0}\}$, everything sent to zero (the Subspaces node's null space, generalized).
  • Image: $\operatorname{im}(T) = \{T(v) : v \in V\} \subseteq W$, everything actually reached.

Theorem. $\ker(T)$ is a subspace of $V$ and $\operatorname{im}(T)$ is a subspace of $W$.

Proof (kernel). Three-condition test. (1) $T(\mathbf{0}) = \mathbf{0}$, so $\mathbf{0} \in \ker(T)$. (2) If $u, v \in \ker(T)$ then $T(u + v) = T(u) + T(v) = \mathbf{0} + \mathbf{0} = \mathbf{0}$, so $u + v \in \ker(T)$. (3) If $v \in \ker(T)$ and $c$ is any scalar, then $T(cv) = c\,T(v) = c\,\mathbf{0} = \mathbf{0}$, so $cv \in \ker(T)$. ✓

Proof (image). Again three conditions, in $W$. (1) $\mathbf{0} = T(\mathbf{0}) \in \operatorname{im}(T)$. (2) If $w_1 = T(u)$ and $w_2 = T(v)$ are in the image, then $w_1 + w_2 = T(u) + T(v) = T(u + v) \in \operatorname{im}(T)$. (3) For any scalar $c$, $c\,w_1 = c\,T(u) = T(cu) \in \operatorname{im}(T)$. ✓ Both subspaces, both proved by the same machine from Subspaces. $\blacksquare$

kernel = {0} ⟺ injective (beautiful and short)

Now the gem. Recall injective (from the Functions stratum) means one-to-one: distinct inputs give distinct outputs.

Theorem. A linear map $T$ is injective $\iff \ker(T) = \{\mathbf{0}\}$.

Proof. ($\Rightarrow$) If $T$ is injective: $T(\mathbf{0}) = \mathbf{0}$ always, so no other vector can map to $\mathbf{0}$ (that would be two inputs sharing an output). Hence $\ker(T) = \{\mathbf{0}\}$.

($\Leftarrow$) Suppose $\ker(T) = \{\mathbf{0}\}$. Take any $u, v$ with $T(u) = T(v)$. Then by linearity $$T(u - v) = T(u) - T(v) = \mathbf{0},$$ so $u - v \in \ker(T) = \{\mathbf{0}\}$, forcing $u - v = \mathbf{0}$, i.e. $u = v$. So distinct inputs can't share an output: $T$ is injective. $\blacksquare$

Savor how short that is. For a general function, injectivity means checking all pairs of inputs, potentially infinitely many. For a linear map, you check one thing: does anything besides $\mathbf{0}$ get crushed to $\mathbf{0}$? That's it. Linearity collapses a global property to a single local one. This is what structure buys you, and this is why we went through the hell of those ten axioms.

A jolt: differentiation is a linear map

Final beat, and it should rearrange your skull a little. Take $P_n$, polynomials of degree $\le n$, and define the derivative as a formal rule: no calculus prerequisite, just the pattern $\frac{d}{dx}(x^k) = k x^{k-1}$ extended linearly. So $D(3x^2 - 5x + 7) = 6x - 5$. This $D : P_n \to P_{n-1}$ is a linear map: $$D(p + q) = D(p) + D(q), \qquad D(cp) = c\,D(p).$$ (Differentiation splitting over sums and pulling out constants: those are linearity, stated.)

Now compute its kernel: which polynomials does $D$ send to $\mathbf{0}$? Exactly the ones with derivative zero, the constants. So $$\ker(D) = \{\text{constant polynomials}\} = \operatorname{span}(1), \quad \dim \ker(D) = 1.$$ Hold on. WHAT? The kernel of differentiation is the constants. That bland, boring fact from a calculus class ("the derivative of a constant is zero") is secretly a statement about a linear map's null space: a subspace, one-dimensional, sitting inside a polynomial vector space. The whole machinery we built for $\mathbb{R}^n$ (kernel, image, subspace, dimension) governs calculus too, because calculus operators are linear maps. That's the business model of abstraction collecting its biggest dividend yet, and we didn't even need to build a reactor for it. File it away: it detonates across all of higher mathematics.
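The formal rule is easy to run on coefficient lists. This is a sketch under one assumption about representation (not something the lesson fixes): a polynomial $a_0 + a_1 x + \cdots + a_n x^n$ in $P_n$ is stored as the list `[a_0, a_1, ..., a_n]`, so `D` turns a list of length $n+1$ into one of length $n$. The helper names are hypothetical.

```python
def D(p):
    """Formal derivative: x^k -> k*x^(k-1), applied term by term.

    p is the coefficient list [a_0, a_1, ..., a_n]; the result has
    one fewer entry, matching D: P_n -> P_(n-1).
    """
    return [k * p[k] for k in range(1, len(p))]

def poly_add(p, q):
    """Add two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    """Multiply every coefficient by the scalar c."""
    return [c * a for a in p]

p = [7, -5, 3]   # 3x^2 - 5x + 7
q = [1, 0, 4]    # 4x^2 + 1 (a made-up second polynomial)

print(D(p))                                        # [-5, 6], i.e. 6x - 5
print(D(poly_add(p, q)) == poly_add(D(p), D(q)))   # True (additivity)
print(D(poly_scale(2, p)) == poly_scale(2, D(p)))  # True (homogeneity)
print(D([9, 0, 0]))                                # [0, 0], a constant is in the kernel
```

In this representation the kernel is visible at a glance: `D` discards `a_0` and multiplies every other coefficient by a nonzero integer, so the output is all zeros exactly when only `a_0` can be nonzero.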

🔬 SPECIMENS (worked examples)

Worked example 1 — is this map linear?

Is $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(x, y) = (2x - y,\, x)$ a linear map?

Check both conditions directly.

Additivity. Take $u = (x_1, y_1)$, $v = (x_2, y_2)$. Then $u + v = (x_1 + x_2,\, y_1 + y_2)$ and $$T(u + v) = \big(2(x_1 + x_2) - (y_1 + y_2),\; x_1 + x_2\big) = (2x_1 - y_1,\, x_1) + (2x_2 - y_2,\, x_2) = T(u) + T(v).$$ ✓

Homogeneity. For scalar $c$: $T(cx, cy) = (2cx - cy,\, cx) = c\,(2x - y,\, x) = c\,T(x, y)$. ✓

Both hold, so $T$ is linear. As a sanity check, $T(0, 0) = (0, 0) = \mathbf{0}$: it pins the origin, as every linear map must. Its matrix (columns $T(e_1), T(e_2)$): $T(1, 0) = (2, 1)$, $T(0, 1) = (-1, 0)$, giving $A = \begin{pmatrix} 2 & -1 \\ 1 & 0 \end{pmatrix}$.

Worked example 2 — two data points pin the whole damn map

A linear map $T : \mathbb{R}^2 \to \mathbb{R}^2$ satisfies $T(1, 0) = (3, 1)$ and $T(0, 1) = (2, 4)$. Compute $T(5, -2)$.

No formula given, but I don't need one. The basis theorem says the images of $e_1, e_2$ determine everything. Write the input in coordinates: $$(5, -2) = 5(1, 0) + (-2)(0, 1) = 5e_1 - 2e_2.$$ Apply $T$ using linearity: $$T(5, -2) = 5\,T(e_1) - 2\,T(e_2) = 5(3, 1) - 2(2, 4) = (15, 5) - (4, 8) = (11, -3).$$ Done: $T(5, -2) = (11, -3)$. I knew the map only on two vectors and recovered its value on a third, because linearity forces it. (Equivalently: the matrix is $A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$ and $A\,(5, -2)^\top = (11, -3)^\top$.) Two data points pin an entire map on $\mathbb{R}^2$.

Worked example 3 — kernel, image, and why this map is not injective

For $T : \mathbb{R}^2 \to \mathbb{R}^2$ with matrix $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$, find $\ker(T)$ and decide whether $T$ is injective.

$\ker(T) = \{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$ is the null space. Solve: $$\begin{cases} x + 2y = 0 \\ 2x + 4y = 0 \end{cases}$$ The second equation is twice the first, so the system is just $x + 2y = 0$, i.e. $x = -2y$. One free variable ($y$), so $$\ker(T) = \{(-2y,\, y) : y \in \mathbb{R}\} = \operatorname{span}\big((-2, 1)\big),$$ a line through the origin, not just $\{\mathbf{0}\}$.

By the theorem, $T$ is injective $\iff \ker(T) = \{\mathbf{0}\}$. Here the kernel is a whole line, so $T$ is not injective. Concretely, $T(-2, 1) = \mathbf{0} = T(0, 0)$: two distinct inputs, same output. The redundant second column (it's twice the first) is what crushed a whole line to zero. Kernel bigger than $\{\mathbf{0}\}$ is the fingerprint of a non-injective linear map.
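The kernel computed above can be double-checked numerically. This is only a sanity check on a handful of integer multiples of $(-2, 1)$, not a kernel solver; `mat_vec` is a hypothetical helper, and the test input $(1, 1)$ is chosen arbitrarily.

```python
def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by the vector x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

A = [[1, 2], [2, 4]]

# Every multiple of (-2, 1) should be crushed to the zero vector.
kernel_samples = [(-2 * y, y) for y in range(-3, 4)]
print(all(mat_vec(A, v) == (0, 0) for v in kernel_samples))  # True

# Non-injectivity in action: shifting an input by a kernel vector
# leaves the output unchanged.
u = (1, 1)
shifted = (u[0] - 2, u[1] + 1)             # u + (-2, 1) = (-1, 2)
print(mat_vec(A, u), mat_vec(A, shifted))  # (3, 6) (3, 6)
```

The second print is the proof of the injectivity theorem run backwards: two inputs that differ by a nonzero kernel vector are distinct, yet $T$ cannot tell them apart.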

☠ KNOWN HAZARDS

  • Thinking every function is linear. $f(x) = x^2$ isn't (in general $f(u + v) \ne f(u) + f(v)$; for instance $f(1 + 1) = 4$ but $f(1) + f(1) = 2$), nor is $f(x) = x + 1$ (it fails to pin the origin: $f(0) = 1 \ne 0$). Linearity is a strict, rare condition. Most functions are barbarians.

  • Confusing kernel with image. Kernel lives in the domain $V$ (what gets crushed to $\mathbf{0}$); image lives in the codomain $W$ (what gets reached). Different spaces, different jobs. Mix them up and your proof will be wrong and embarrassing.

  • Checking injectivity pair-by-pair for a linear map. Don't do that. Just check $\ker(T) = \{\mathbf{0}\}$. Linearity reduces a global condition to one kernel computation. Doing it the hard way wastes time and signals that the theorem hasn't landed yet.

  • Forgetting the basis theorem's power. You never need infinitely many input-output pairs to pin down a linear map: $n$ of them (the basis images) suffice. Specifying more is redundant. Specifying fewer is incomplete. Exactly $n$, then you're done.

TL;DR

  • A linear map $T : V \to W$ satisfies $T(u + v) = T(u) + T(v)$ and $T(cv) = c\,T(v)$: it respects addition and scaling. Consequence: $T(\mathbf{0}) = \mathbf{0}$.

  • A linear map is determined by its action on a basis: knowing $T(v_1), \dots, T(v_n)$ fixes $T(v)$ for all $v$, via $T(v) = c_1 T(v_1) + \cdots + c_n T(v_n)$.

  • Its matrix has columns $T(e_1), \dots, T(e_n)$ (the 2D decoder ring, generalized), so $T(\mathbf{x}) = A\mathbf{x}$.

  • Kernel $\ker(T) = \{v : T(v) = \mathbf{0}\}$ and image $\operatorname{im}(T) = \{T(v) : v \in V\}$ are both subspaces (three-condition test). And $T$ is injective $\iff \ker(T) = \{\mathbf{0}\}$ (a two-line proof).

  • Beyond $\mathbb{R}^n$: differentiation $D : P_n \to P_{n-1}$ (as a formal rule) is linear, and $\ker(D)$ is the constants. Calculus runs on linear maps.
