COMS W3261 CS Theory
Lecture 16: The Universal Language

1. Countable and Uncountable Sets

Two sets A and B are the same size if there is one-to-one correspondence (one-to-one, onto mapping) from A to B.
A set A is countable if either it is finite or it has the same size as the set of integers {1, 2, 3, … }.
Every language is countable. In particular, {0, 1}* is countable.
The set of all Turing machines using the input alphabet {0, 1} is countable.
The set of all real numbers is uncountable.
The set of all infinite binary sequences is uncountable.
The set of all languages over the alphabet {0, 1} is uncountable. Hence, it comes as no surprise that there are (uncountably many) languages that cannot be accepted by any Turing machine.

We have seen one language, the diagonalization language, that is not accepted by any Turing machine. This proves the diagonalization language is not recursively enumerable.
We shall now define a language L_u, the universal language, that can be accepted by a Turing machine but is still undecidable. L_u is recursively enumerable but not recursive.
L_u is the set of binary strings that consist of encoded pairs (M, w) such that M is an encoding of a Turing machine and w is an encoding of a binary input string accepted by that Turing machine.
Formally, L_u = { (M, w) | M is an encoding of a Turing machine and w is an encoding of a binary string accepted by M }.

To show that L_u is recursively enumerable, we shall construct a Turing machine U, called the universal Turing machine, to recognize L_u.
It is convenient to think of U as a multitape TM in which:

The first tape holds the input with the encodings of M and w. We use the encodings of Turing machines and binary strings we developed in Lecture 14.
A second tape is used to simulate M's input tape. We initialize the second tape with the encoded form of w. We then move the head on the second tape to the first simulated cell.
A third tape is used to keep track of M's state. We initialize the third tape with the start state of M.
A fourth tape is used as a scratch tape.

To simulate a move of M, U searches tape 1 for a transition on the current state of M (stored on tape 3) and the current tape symbol of M (stored on tape 2). Then,

U simulates the behavior of M on the input string w by repeated making moves in this manner. If M enters its final state signaling that M accepts w, then U accepts the coded pair (M, w) and halts.
Thus, L(U) = L_u.
As we can see, U has the elements of a modern stored program digital computer!

Suppose L_u were recursive. Then there would exist a TM M that accepts the complement of L_u.
But now we can transform M into a TM M' that accepts L_d as follows:

M' transforms its input string w into a pair (w, w).
M' simulates M on (w, w) assuming the first w is an encoding of a TM M_i and the second w is an encoding of a binary string w_i. Since M accepts the complement of L_u, M will accept (w, w) if and only if M_i does not accept w_i.

Thus, M' accepts w if and only if w is in L_d. But we have previously shown there does not exist a TM that recognizes L_d.
We conclude L_u is not recursive.

In Alan Turing’s original formulation of Turing machines acceptance was just by halting not necessarily by halting in a final state.
We can define H(M) for a Turing machine M to be the set of input strings w on which M halts in either a final or a nonfinal state.
The famous halting problem is the set of pairs {(M, w) | w is in H(M)}.
We can show the halting problem is recursively enumerable but not recursive.
A similar argument can be used to show that many practical problems associated with software verification are undecidable. For example, the problem of determining whether a program will ever go into an infinite loop is undecidable.
See Geoffrey Pullum's "Scooping the Loop Snooper" for an amusing Dr. Seuss-like poetic proof of the undecidability of the halting problem.

If we can reduce a problem A to a decidable problem B, can we conclude A is decidable?
If we can reduce a decidable problem A to a problem B, can we conclude B is decidable?
If we can reduce a problem A to an undecidable problem B, can we conclude problem A is undecidable?
If we can reduce an undecidable problem A to a problem B, can we conclude problem B is undecidable?
Prove that for a Turing machine M the halting problem language { (M, w) | w is in H(M) } is recursively enumerable but not recursive.
Is the proposition "This statement is not true" true or false?