The Maximal Self-Similar Island: Why Every Infinite Set Has the Same Size as Its Square

One of the clean cardinal-arithmetic facts about infinite sets is that an infinite set has the same cardinality as its Cartesian square. In symbols, if X is infinite, then

\displaystyle |X|=|X\times X|.

Equivalently, there is a bijection

\displaystyle X\longrightarrow X\times X.

For X=\mathbb N, this is familiar. One can enumerate the lattice \mathbb N\times\mathbb N by diagonals, or use a coding such as

\displaystyle (p,q)\mapsto 2^p(2q+1),

with the usual harmless adjustment depending on whether one starts \mathbb N at 0 or 1.

But for an arbitrary infinite set X, there need not be a visible grid. There may be no natural ordering, no preferred coordinates, no obvious diagonal traversal. The proof I want to describe instead uses a maximality argument. Its basic idea is this:

Find the largest possible subset Y\subseteq X which already knows how to enumerate its own square.

That is, we look for subsets Y equipped with a bijection

\displaystyle \phi:Y\longrightarrow Y\times Y.

Then we make such a pair maximal by Zorn’s lemma. If the leftover set X\setminus Y is small, then Y absorbs it and hence X\cong Y. If the leftover set is large, then there is enough fresh material outside Y to enlarge Y, contradicting maximality.

The proof has a slightly strange texture because it does not directly build a global bijection X\to X\times X. Instead, it builds a maximal self-similar island inside X, and then proves that maximality forces the island to be all of X up to bijection.

The strange texture is exactly the point.


A small absorption lemma

I will use the following notation. Write A\preceq B to mean that there is an injection from A into B, and write A\cong B to mean that there is a bijection between A and B.

We first record a useful lemma.

Lemma. Suppose Y is infinite and

\displaystyle Y\cong Y\times Y.

Then, for every positive finite integer n,

\displaystyle Y\times n\cong Y.

In particular,

\displaystyle Y\times{0,1,2}\cong Y.

Proof. Since Y is infinite, choose distinct elements

\displaystyle a_0,a_1,\dots,a_{n-1}\in Y.

Define an injection

\displaystyle Y\times n\hookrightarrow Y\times Y

by sending

\displaystyle (y,i)\mapsto (y,a_i).

Since Y\times Y\cong Y, this gives an injection

\displaystyle Y\times n\hookrightarrow Y.

On the other hand, there is an obvious injection

\displaystyle Y\hookrightarrow Y\times n,

for example

\displaystyle y\mapsto (y,0).

By Cantor-Bernstein,

\displaystyle Y\times n\cong Y.

This proves the lemma. \square

The lemma is the small but important compression move. Once a set can absorb its own square, it can also absorb finitely many copies of itself.


The theorem

Theorem. Assuming Zorn’s lemma, if X is infinite, then

\displaystyle X\cong X\times X.

Proof. Since we are using Zorn’s lemma, we are working in the usual choice context. Thus every infinite set contains a countably infinite subset. Choose a countably infinite subset

\displaystyle A\subseteq X.

Since A\cong\mathbb N, we know

\displaystyle A\cong A\times A.

Fix one bijection

\displaystyle \phi_0:A\longrightarrow A\times A.

Now consider the collection \mathcal P of pairs

\displaystyle (Y,\phi)

such that

\displaystyle A\subseteq Y\subseteq X

and

\displaystyle \phi:Y\longrightarrow Y\times Y

is a bijection extending \phi_0. That is,

\displaystyle \phi|_A=\phi_0.

We partially order \mathcal P by extension. Thus

\displaystyle (Y,\phi)\leq (Y',\phi')

means that

\displaystyle Y\subseteq Y'

and

\displaystyle \phi'|_Y=\phi.

The collection \mathcal P is nonempty because it contains

\displaystyle (A,\phi_0).

Now let \mathcal C be a chain in \mathcal P. Define

\displaystyle Y_{\mathcal C}=\bigcup_{(Y,\phi)\in\mathcal C}Y

and define

\displaystyle \phi_{\mathcal C}=\bigcup_{(Y,\phi)\in\mathcal C}\phi.

Because \mathcal C is a chain and the maps are ordered by extension, these maps agree on overlaps. Therefore \phi_{\mathcal C} is a well-defined function.

We claim that

\displaystyle \phi_{\mathcal C}:Y_{\mathcal C}\longrightarrow Y_{\mathcal C}\times Y_{\mathcal C}

is a bijection.

First, it is injective. Suppose

\displaystyle \phi_{\mathcal C}(u)=\phi_{\mathcal C}(v).

Then u belongs to some member Y_1 of the chain, and v belongs to some member Y_2 of the chain. Since the chain is linearly ordered, one of these two sets contains the other. Hence u and v both lie in a common member of the chain, where the corresponding map is injective. Therefore

\displaystyle u=v.

Now we check surjectivity. Take any pair

\displaystyle (a,b)\in Y_{\mathcal C}\times Y_{\mathcal C}.

Then a belongs to some Y_1 in the chain, and b belongs to some Y_2 in the chain. Again, since \mathcal C is a chain, one of Y_1,Y_2 contains the other. Thus both a and b lie in a common member Y_3 of the chain. Since the corresponding map

\displaystyle \phi_3:Y_3\longrightarrow Y_3\times Y_3

is surjective, there is some y\in Y_3 such that

\displaystyle \phi_3(y)=(a,b).

Therefore

\displaystyle \phi_{\mathcal C}(y)=(a,b).

So \phi_{\mathcal C} is surjective.

Thus every chain has an upper bound in \mathcal P. By Zorn’s lemma, \mathcal P has a maximal element. Fix one and call it

\displaystyle (Y,\phi).

So

\displaystyle A\subseteq Y\subseteq X

and

\displaystyle \phi:Y\longrightarrow Y\times Y

is a bijection, maximal with respect to extension.

Since A\subseteq Y and A is countably infinite, the set Y is infinite.

Let

\displaystyle R=X\setminus Y.

We now analyze the complement R.


First case: the complement is small

Suppose

\displaystyle R\preceq Y.

So there is an injection

\displaystyle j:R\hookrightarrow Y.

We claim that in this case

\displaystyle X\cong Y.

Choose two distinct elements

\displaystyle y_0,y_1\in Y.

Define an injection

\displaystyle F:X\hookrightarrow Y\times Y

as follows.

For x\in Y, set

\displaystyle F(x)=(x,y_0).

For x\in R, set

\displaystyle F(x)=(j(x),y_1).

This is injective: the second coordinate separates the two pieces Y and R, and j is injective on R.

Since

\displaystyle Y\times Y\cong Y,

we obtain an injection

\displaystyle X\hookrightarrow Y.

The inclusion

\displaystyle Y\hookrightarrow X

is also an injection. By Cantor-Bernstein,

\displaystyle X\cong Y.

Now conjugate the bijection \phi:Y\to Y\times Y along a bijection b:X\to Y. Define

\displaystyle \Phi:X\longrightarrow X\times X

by

\displaystyle \Phi=(b^{-1}\times b^{-1})\circ\phi\circ b.

Since each piece of this composition is a bijection, \Phi is a bijection. Hence

\displaystyle X\cong X\times X.

So the theorem is proved in this case.


Second case: the complement is large

Now suppose instead that

\displaystyle R\not\preceq Y.

Since we are in the choice context, cardinalities are comparable. Therefore

\displaystyle Y\preceq R.

Thus we may choose a subset

\displaystyle T\subseteq R

such that

\displaystyle T\cong Y.

Let

\displaystyle Z=Y\cup T.

Since T\subseteq R=X\setminus Y, this is a disjoint union.

Now decompose the square Z\times Z into four blocks:

\displaystyle Z\times Z=(Y\times Y)\dot\cup(Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).

The original map

\displaystyle \phi:Y\longrightarrow Y\times Y

already accounts for the old block Y\times Y.

So, to extend \phi from Y to Z=Y\cup T, it remains only to find a bijection from the new elements T onto the three new blocks

\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).

But T\cong Y. Therefore

\displaystyle Y\times T\cong Y\times Y,

\displaystyle T\times Y\cong Y\times Y,

and

\displaystyle T\times T\cong Y\times Y.

Since Y\times Y\cong Y, each of these three blocks has cardinality |Y|. Hence their disjoint union has cardinality

\displaystyle |Y\times{0,1,2}|.

By the absorption lemma,

\displaystyle Y\times{0,1,2}\cong Y.

And since T\cong Y, we conclude that

\displaystyle T\cong (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).

Choose a bijection

\displaystyle \theta:T\longrightarrow (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).

Now define

\displaystyle \phi^+:Z\longrightarrow Z\times Z

as follows.

For z\in Y, set

\displaystyle \phi^+(z)=\phi(z).

For z\in T, set

\displaystyle \phi^+(z)=\theta(z).

The range of \phi is exactly Y\times Y, and the range of \theta is exactly

\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).

These two ranges are disjoint, and together they cover all of Z\times Z. Therefore

\displaystyle \phi^+:Z\longrightarrow Z\times Z

is a bijection.

Moreover, \phi^+ extends \phi, and Z strictly contains Y, because T is nonempty. Thus

\displaystyle (Z,\phi^+)

is a strictly larger element of \mathcal P than

\displaystyle (Y,\phi).

This contradicts the maximality of (Y,\phi).

Therefore the second case is impossible.


Conclusion

The large-complement case cannot occur. Therefore the complement X\setminus Y must satisfy

\displaystyle X\setminus Y\preceq Y.

As shown in the first case, this implies

\displaystyle X\cong Y.

But by construction,

\displaystyle Y\cong Y\times Y.

Conjugating this bijection along X\cong Y gives

\displaystyle X\cong X\times X.

Hence every infinite set has the same cardinality as its Cartesian square. \square


What the proof is really doing

The proof is best understood through the block decomposition

\displaystyle (Y\cup T)^2=Y^2\dot\cup(Y\times T)\dot\cup(T\times Y)\dot\cup T^2.

The old set Y already knows how to enumerate Y^2. That is the role of the bijection

\displaystyle \phi:Y\to Y^2.

If we try to enlarge Y by adding a fresh copy T of Y, then the newly added elements T do not need to enumerate all of (Y\cup T)^2. They only need to enumerate the new part of the square, namely

\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup T^2.

And if T\cong Y, then this new part has the size of three copies of Y^2. But Y^2\cong Y, and three copies of Y still have size Y. Thus the fresh copy T is exactly large enough to pay for all the new ordered pairs created by adding T.

That is the heart of the proof.

The maximality argument says: keep enlarging the self-similar island Y as long as possible. Once you cannot enlarge it anymore, the outside cannot contain another full copy of Y. If it did, that copy could be used to absorb all the new square-blocks and produce a larger self-similar island. Therefore the outside is no larger than Y, and so the whole set X is absorbed by Y.

In compressed cardinal notation, the extension step is the calculation

\displaystyle T\cong Y\cong 3Y\cong 3Y^2\cong (Y\times T)\dot\cup(T\times Y)\dot\cup T^2.

But the proof above spells out why this calculation is legitimate without assuming in advance the theorem we are trying to prove.

This is the attractive feature of the argument: instead of proving |X|^2=|X| by global cardinal arithmetic, it proves that any maximal partial instance of the identity must already be global up to bijection. The proof is not a diagonal enumeration. It is a self-similarity trap.

Leave a comment