The Maximal Self-Similar Island: Why Every Infinite Set Has the Same Size as Its Square

One of the clean cardinal-arithmetic facts about infinite sets is that an infinite set has the same cardinality as its Cartesian square. In symbols, if $X$ is infinite, then

$\displaystyle |X|=|X\times X|.$

Equivalently, there is a bijection

$\displaystyle X\longrightarrow X\times X.$

For $X=\mathbb N$ , this is familiar. One can enumerate the lattice $\mathbb N\times\mathbb N$ by diagonals, or use a coding such as

$\displaystyle (p,q)\mapsto 2^p(2q+1),$

with the usual harmless adjustment depending on whether one starts $\mathbb N$ at $0$ or $1$ .

But for an arbitrary infinite set $X$ , there need not be a visible grid. There may be no natural ordering, no preferred coordinates, no obvious diagonal traversal. The proof I want to describe instead uses a maximality argument. Its basic idea is this:

Find the largest possible subset $Y\subseteq X$ which already knows how to enumerate its own square.

That is, we look for subsets $Y$ equipped with a bijection

$\displaystyle \phi:Y\longrightarrow Y\times Y.$

Then we make such a pair maximal by Zorn’s lemma. If the leftover set $X\setminus Y$ is small, then $Y$ absorbs it and hence $X\cong Y$ . If the leftover set is large, then there is enough fresh material outside $Y$ to enlarge $Y$ , contradicting maximality.

The proof has a slightly strange texture because it does not directly build a global bijection $X\to X\times X$ . Instead, it builds a maximal self-similar island inside $X$ , and then proves that maximality forces the island to be all of $X$ up to bijection.

The strange texture is exactly the point.

A small absorption lemma

I will use the following notation. Write $A\preceq B$ to mean that there is an injection from $A$ into $B$ , and write $A\cong B$ to mean that there is a bijection between $A$ and $B$ .

We first record a useful lemma.

Lemma. Suppose $Y$ is infinite and

$\displaystyle Y\cong Y\times Y.$

Then, for every positive finite integer $n$ ,

$\displaystyle Y\times n\cong Y.$

In particular,

$\displaystyle Y\times{0,1,2}\cong Y.$

Proof. Since $Y$ is infinite, choose distinct elements

$\displaystyle a_0,a_1,\dots,a_{n-1}\in Y.$

Define an injection

$\displaystyle Y\times n\hookrightarrow Y\times Y$

by sending

$\displaystyle (y,i)\mapsto (y,a_i).$

Since $Y\times Y\cong Y$ , this gives an injection

$\displaystyle Y\times n\hookrightarrow Y.$

On the other hand, there is an obvious injection

$\displaystyle Y\hookrightarrow Y\times n,$

for example

$\displaystyle y\mapsto (y,0).$

By Cantor-Bernstein,

$\displaystyle Y\times n\cong Y.$

This proves the lemma. $\square$

The lemma is the small but important compression move. Once a set can absorb its own square, it can also absorb finitely many copies of itself.

The theorem

Theorem. Assuming Zorn’s lemma, if $X$ is infinite, then

$\displaystyle X\cong X\times X.$

Proof. Since we are using Zorn’s lemma, we are working in the usual choice context. Thus every infinite set contains a countably infinite subset. Choose a countably infinite subset

$\displaystyle A\subseteq X.$

Since $A\cong\mathbb N$ , we know

$\displaystyle A\cong A\times A.$

Fix one bijection

$\displaystyle \phi_0:A\longrightarrow A\times A.$

Now consider the collection $\mathcal P$ of pairs

$\displaystyle (Y,\phi)$

such that

$\displaystyle A\subseteq Y\subseteq X$

and

$\displaystyle \phi:Y\longrightarrow Y\times Y$

is a bijection extending $\phi_0$ . That is,

$\displaystyle \phi|_A=\phi_0.$

We partially order $\mathcal P$ by extension. Thus

$\displaystyle (Y,\phi)\leq (Y',\phi')$

means that

$\displaystyle Y\subseteq Y'$

and

$\displaystyle \phi'|_Y=\phi.$

The collection $\mathcal P$ is nonempty because it contains

$\displaystyle (A,\phi_0).$

Now let $\mathcal C$ be a chain in $\mathcal P$ . Define

$\displaystyle Y_{\mathcal C}=\bigcup_{(Y,\phi)\in\mathcal C}Y$

and define

$\displaystyle \phi_{\mathcal C}=\bigcup_{(Y,\phi)\in\mathcal C}\phi.$

Because $\mathcal C$ is a chain and the maps are ordered by extension, these maps agree on overlaps. Therefore $\phi_{\mathcal C}$ is a well-defined function.

We claim that

$\displaystyle \phi_{\mathcal C}:Y_{\mathcal C}\longrightarrow Y_{\mathcal C}\times Y_{\mathcal C}$

is a bijection.

First, it is injective. Suppose

$\displaystyle \phi_{\mathcal C}(u)=\phi_{\mathcal C}(v).$

Then $u$ belongs to some member $Y_1$ of the chain, and $v$ belongs to some member $Y_2$ of the chain. Since the chain is linearly ordered, one of these two sets contains the other. Hence $u$ and $v$ both lie in a common member of the chain, where the corresponding map is injective. Therefore

$\displaystyle u=v.$

Now we check surjectivity. Take any pair

$\displaystyle (a,b)\in Y_{\mathcal C}\times Y_{\mathcal C}.$

Then $a$ belongs to some $Y_1$ in the chain, and $b$ belongs to some $Y_2$ in the chain. Again, since $\mathcal C$ is a chain, one of $Y_1,Y_2$ contains the other. Thus both $a$ and $b$ lie in a common member $Y_3$ of the chain. Since the corresponding map

$\displaystyle \phi_3:Y_3\longrightarrow Y_3\times Y_3$

is surjective, there is some $y\in Y_3$ such that

$\displaystyle \phi_3(y)=(a,b).$

Therefore

$\displaystyle \phi_{\mathcal C}(y)=(a,b).$

So $\phi_{\mathcal C}$ is surjective.

Thus every chain has an upper bound in $\mathcal P$ . By Zorn’s lemma, $\mathcal P$ has a maximal element. Fix one and call it

$\displaystyle (Y,\phi).$

$\displaystyle A\subseteq Y\subseteq X$

and

$\displaystyle \phi:Y\longrightarrow Y\times Y$

is a bijection, maximal with respect to extension.

Since $A\subseteq Y$ and $A$ is countably infinite, the set $Y$ is infinite.

Let

$\displaystyle R=X\setminus Y.$

We now analyze the complement $R$ .

First case: the complement is small

Suppose

$\displaystyle R\preceq Y.$

So there is an injection

$\displaystyle j:R\hookrightarrow Y.$

We claim that in this case

$\displaystyle X\cong Y.$

Choose two distinct elements

$\displaystyle y_0,y_1\in Y.$

Define an injection

$\displaystyle F:X\hookrightarrow Y\times Y$

as follows.

For $x\in Y$ , set

$\displaystyle F(x)=(x,y_0).$

For $x\in R$ , set

$\displaystyle F(x)=(j(x),y_1).$

This is injective: the second coordinate separates the two pieces $Y$ and $R$ , and $j$ is injective on $R$ .

Since

$\displaystyle Y\times Y\cong Y,$

we obtain an injection

$\displaystyle X\hookrightarrow Y.$

The inclusion

$\displaystyle Y\hookrightarrow X$

is also an injection. By Cantor-Bernstein,

$\displaystyle X\cong Y.$

Now conjugate the bijection $\phi:Y\to Y\times Y$ along a bijection $b:X\to Y$ . Define

$\displaystyle \Phi:X\longrightarrow X\times X$

$\displaystyle \Phi=(b^{-1}\times b^{-1})\circ\phi\circ b.$

Since each piece of this composition is a bijection, $\Phi$ is a bijection. Hence

$\displaystyle X\cong X\times X.$

So the theorem is proved in this case.

Second case: the complement is large

Now suppose instead that

$\displaystyle R\not\preceq Y.$

Since we are in the choice context, cardinalities are comparable. Therefore

$\displaystyle Y\preceq R.$

Thus we may choose a subset

$\displaystyle T\subseteq R$

such that

$\displaystyle T\cong Y.$

Let

$\displaystyle Z=Y\cup T.$

Since $T\subseteq R=X\setminus Y$ , this is a disjoint union.

Now decompose the square $Z\times Z$ into four blocks:

$\displaystyle Z\times Z=(Y\times Y)\dot\cup(Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).$

The original map

$\displaystyle \phi:Y\longrightarrow Y\times Y$

already accounts for the old block $Y\times Y$ .

So, to extend $\phi$ from $Y$ to $Z=Y\cup T$ , it remains only to find a bijection from the new elements $T$ onto the three new blocks

$\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).$

But $T\cong Y$ . Therefore

$\displaystyle Y\times T\cong Y\times Y,$

$\displaystyle T\times Y\cong Y\times Y,$

and

$\displaystyle T\times T\cong Y\times Y.$

Since $Y\times Y\cong Y$ , each of these three blocks has cardinality $|Y|$ . Hence their disjoint union has cardinality

$\displaystyle |Y\times{0,1,2}|.$

By the absorption lemma,

$\displaystyle Y\times{0,1,2}\cong Y.$

And since $T\cong Y$ , we conclude that

$\displaystyle T\cong (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).$

Choose a bijection

$\displaystyle \theta:T\longrightarrow (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).$

Now define

$\displaystyle \phi^+:Z\longrightarrow Z\times Z$

as follows.

For $z\in Y$ , set

$\displaystyle \phi^+(z)=\phi(z).$

For $z\in T$ , set

$\displaystyle \phi^+(z)=\theta(z).$

The range of $\phi$ is exactly $Y\times Y$ , and the range of $\theta$ is exactly

$\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup(T\times T).$

These two ranges are disjoint, and together they cover all of $Z\times Z$ . Therefore

$\displaystyle \phi^+:Z\longrightarrow Z\times Z$

is a bijection.

Moreover, $\phi^+$ extends $\phi$ , and $Z$ strictly contains $Y$ , because $T$ is nonempty. Thus

$\displaystyle (Z,\phi^+)$

is a strictly larger element of $\mathcal P$ than

$\displaystyle (Y,\phi).$

This contradicts the maximality of $(Y,\phi)$ .

Therefore the second case is impossible.

Conclusion

The large-complement case cannot occur. Therefore the complement $X\setminus Y$ must satisfy

$\displaystyle X\setminus Y\preceq Y.$

As shown in the first case, this implies

$\displaystyle X\cong Y.$

But by construction,

$\displaystyle Y\cong Y\times Y.$

Conjugating this bijection along $X\cong Y$ gives

$\displaystyle X\cong X\times X.$

Hence every infinite set has the same cardinality as its Cartesian square. $\square$

What the proof is really doing

The proof is best understood through the block decomposition

$\displaystyle (Y\cup T)^2=Y^2\dot\cup(Y\times T)\dot\cup(T\times Y)\dot\cup T^2.$

The old set $Y$ already knows how to enumerate $Y^2$ . That is the role of the bijection

$\displaystyle \phi:Y\to Y^2.$

If we try to enlarge $Y$ by adding a fresh copy $T$ of $Y$ , then the newly added elements $T$ do not need to enumerate all of $(Y\cup T)^2$ . They only need to enumerate the new part of the square, namely

$\displaystyle (Y\times T)\dot\cup(T\times Y)\dot\cup T^2.$

And if $T\cong Y$ , then this new part has the size of three copies of $Y^2$ . But $Y^2\cong Y$ , and three copies of $Y$ still have size $Y$ . Thus the fresh copy $T$ is exactly large enough to pay for all the new ordered pairs created by adding $T$ .

That is the heart of the proof.

The maximality argument says: keep enlarging the self-similar island $Y$ as long as possible. Once you cannot enlarge it anymore, the outside cannot contain another full copy of $Y$ . If it did, that copy could be used to absorb all the new square-blocks and produce a larger self-similar island. Therefore the outside is no larger than $Y$ , and so the whole set $X$ is absorbed by $Y$ .

In compressed cardinal notation, the extension step is the calculation

$\displaystyle T\cong Y\cong 3Y\cong 3Y^2\cong (Y\times T)\dot\cup(T\times Y)\dot\cup T^2.$

But the proof above spells out why this calculation is legitimate without assuming in advance the theorem we are trying to prove.

This is the attractive feature of the argument: instead of proving $|X|^2=|X|$ by global cardinal arithmetic, it proves that any maximal partial instance of the identity must already be global up to bijection. The proof is not a diagonal enumeration. It is a self-similarity trap.