Cantor's diagonal argument

An illustration of Cantor's diagonal argument (in base 2) for the existence of uncountable sets. The sequence at the bottom cannot occur anywhere in the enumeration of sequences above.

In set theory, Cantor's diagonal argument, also called the diagonalisation argument, the diagonal slash argument, the anti-diagonal argument, the diagonal method, and Cantor's diagonalization proof, was published in 1891 by Georg Cantor as a mathematical proof that there are infinite sets which cannot be put into one-to-one correspondence with the infinite set of natural numbers.[1][2]: 20– [3] Such sets are now known as uncountable sets, and the size of infinite sets is now treated by the theory of cardinal numbers which Cantor began.

The diagonal argument was not Cantor's first proof of the uncountability of the real numbers; that first proof appeared in 1874.[4][5] However, the diagonal argument demonstrates a general technique that has since been used in a wide range of proofs,[6] including the first of Gödel's incompleteness theorems[2] and Turing's answer to the Entscheidungsproblem. Diagonalization arguments are often also the source of contradictions like Russell's paradox[7][8] and Richard's paradox.[2]: 27 

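Cantor considered the set T of all infinite sequences of binary digits (i.e. each digit is zero or one).[note 1] He begins with a constructive proof of the following lemma:

If s1, s2, ... , sn, ... is any enumeration of elements from T, then an element s of T can be constructed that does not correspond to any sn in the enumeration.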

Next, a sequence s is constructed by choosing the 1st digit as complementary to the 1st digit of s1 (swapping 0s for 1s and vice versa), the 2nd digit as complementary to the 2nd digit of s2, the 3rd digit as complementary to the 3rd digit of s3, and generally for every n, the nth digit as complementary to the nth digit of sn. For the example above, this yields:

By construction, s is a member of T that differs from each sn, since their nth digits differ (highlighted in the example). Hence, s cannot occur in the enumeration.
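The construction above can be sketched in Python. This is a minimal illustration, not part of Cantor's text: an infinite binary sequence is modelled as a function from a 0-based index to a digit, and `sample_enumeration` is a hypothetical enumeration invented only for the demonstration.

```python
from typing import Callable

# An infinite binary sequence, modelled as a function from a 0-based
# position to the digit (0 or 1) at that position.
BinarySequence = Callable[[int], int]


def diagonal_complement(
    enumeration: Callable[[int], BinarySequence],
) -> BinarySequence:
    """Return the sequence s whose n-th digit is the complement of the
    n-th digit of the n-th sequence in the enumeration."""
    return lambda n: 1 - enumeration(n)(n)


def sample_enumeration(k: int) -> BinarySequence:
    """Hypothetical enumeration (an assumption for this example):
    sequence k has digit 1 at position n exactly when n + k is
    divisible by 3."""
    return lambda n: 1 if (n + k) % 3 == 0 else 0


s = diagonal_complement(sample_enumeration)

# s differs from the k-th sequence at position k, for every k checked.
for k in range(100):
    assert s(k) != sample_enumeration(k)(k)
```

The loop can only inspect finitely many positions, but the disagreement at position k holds for every k by the definition of `diagonal_complement`, which is the content of the lemma.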

Based on this lemma, Cantor then uses a proof by contradiction to show that the set T is uncountable.

The proof starts by assuming that T is countable. Then all its elements can be written in an enumeration s1, s2, ... , sn, ... . Applying the previous lemma to this enumeration produces a sequence s that is a member of T, but is not in the enumeration. However, if T is enumerated, then every member of T, including this s, is in the enumeration. This contradiction implies that the original assumption is false. Therefore, T is uncountable.[1]

An injection from T to R is given by mapping binary strings in T to decimal fractions, such as mapping t = 0111... to the decimal 0.0111.... This function, defined by f(t) = 0.t, is an injection because it maps different strings to different numbers.[note 3]

Constructing a bijection between T and R is slightly more complicated. Instead of mapping 0111... to the decimal 0.0111..., it can be mapped to the base-b number 0.0111..._b. This leads to the family of functions f_b(t) = 0.t_b. The functions f_b are injections, except for f_2. This function will be modified to produce a bijection between T and R.
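A short worked computation may help show why base 2 is the exception. Writing f_b for the map that reads the digits of t in base b, the two distinct strings 0111... and 1000... (chosen here only as an illustration) are sent to the same number when b = 2, but to different numbers when b is at least 3:

```latex
\begin{align*}
f_2(0111\ldots) &= \sum_{k=2}^{\infty} 2^{-k} = \tfrac{1}{2} = f_2(1000\ldots), \\
f_b(0111\ldots) &= \sum_{k=2}^{\infty} b^{-k} = \frac{1}{b(b-1)} < \frac{1}{b} = f_b(1000\ldots)
  \qquad (b \ge 3).
\end{align*}
```

The second line only checks this one pair of strings; it is not a full proof that f_b is injective for b ≥ 3.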

A generalized form of the diagonal argument was used by Cantor to prove Cantor's theorem: for every set S, the power set of S—that is, the set of all subsets of S (here written as P(S))—cannot be in bijection with S itself. This proof proceeds as follows:

Let f be any function from S to P(S). It suffices to prove that f cannot be surjective, that is, that some member T of P(S) (i.e. some subset of S) is not in the image of f. As a candidate, consider the set:

T = { s ∈ S : s ∉ f(s) }

For every s in S, either s is in T or not. If s is in T, then by definition of T, s is not in f(s), so T is not equal to f(s). On the other hand, if s is not in T, then by definition of T, s is in f(s), so again T is not equal to f(s); cf. picture. For a more complete account of this proof, see Cantor's theorem.
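For a finite set S the argument can be checked exhaustively. The sketch below is an illustration under stated assumptions: a function f from S to P(S) is represented as a dictionary, and the helper names `power_set`, `diagonal_set` and `diagonal_always_missed` are hypothetical and introduced only for this example. It builds the diagonal set for every possible f and confirms that it is never a value of f.

```python
from itertools import combinations, product


def power_set(elements):
    """Return all subsets of `elements` as a list of frozensets."""
    items = list(elements)
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


def diagonal_set(domain, f):
    """Return the set of all s in `domain` with s not in f[s]."""
    return frozenset(s for s in domain if s not in f[s])


def diagonal_always_missed(domain):
    """Check every function f from `domain` to its power set and report
    whether the diagonal set is outside the image of f each time."""
    items = list(domain)
    subsets = power_set(items)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(items, f) in f.values():
            return False
    return True


# One concrete f on S = {0, 1, 2}: 0 is in f(0), but 1 and 2 are not in
# their own images, so the diagonal set is {1, 2}.
example_f = {0: frozenset({0}), 1: frozenset(), 2: frozenset({0, 1})}
example_diagonal = diagonal_set([0, 1, 2], example_f)
```

A finite check of this kind does not replace the proof, which applies to arbitrary S, but it makes the role of the diagonal set concrete: for S = {0, 1, 2} there are 512 functions to try, and none of them has the diagonal set in its image.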

Assuming the law of excluded middle, every subcountable set (subcountability being a property defined in terms of surjections) is already countable.

Motivated by the insight that the set of real numbers is "bigger" than the set of natural numbers, one is led to ask if there is a set whose cardinality is "between" that of the integers and that of the reals. This question leads to the famous continuum hypothesis. Similarly, the question of whether there exists a set whose cardinality is between |S| and |P(S)| for some infinite S leads to the generalized continuum hypothesis.

Russell's paradox shows that naive set theory, based on an unrestricted comprehension scheme, is contradictory. Note that there is a similarity between the construction of T and the set in Russell's paradox. Therefore, depending on how the axiom scheme of comprehension is modified to avoid Russell's paradox, arguments such as the non-existence of a set of all sets may or may not remain valid.

Analogues of the diagonal argument are widely used in mathematics to prove the existence or nonexistence of certain objects. For example, the conventional proof of the unsolvability of the halting problem is essentially a diagonal argument. Also, diagonalization was originally used to show the existence of arbitrarily hard complexity classes and played a key role in early attempts to prove P does not equal NP.

The above proof fails for W. V. Quine's "New Foundations" set theory (NF). In NF, the naive axiom scheme of comprehension is modified to avoid the paradoxes by introducing a kind of "local" type theory. In this axiom scheme,

{ s ∈ S : s ∉ f(s) }

is not a set, i.e., it does not satisfy the axiom scheme. On the other hand, we might try to create a modified diagonal argument by noticing that

{ s ∈ S : s ∉ f({s}) }

is a set in NF. In that case, if P1(S) is the set of one-element subsets of S and f is a proposed bijection from P1(S) to P(S), one is able to use proof by contradiction to prove that |P1(S)| < |P(S)|.

It is not possible to put P1(S) in a one-to-one relation with S, as the two have different types, and so any function so defined would violate the typing rules for the comprehension scheme.