Consider the following two strings of 32 lowercase letters and digits:
Any string s has at least one description. For example, the second string above is output by the pseudo-code:
The length of the shortest description will depend on the choice of description language; but the effect of changing languages is bounded (a result called the invariance theorem).
There are some description languages which are optimal, in the following sense: given any description of an object in a description language, said description may be used in the optimal description language with a constant overhead. The constant depends only on the languages involved, not on the description of the object, nor the object being described.
Here is an example of an optimal description language. A description will have two parts:
In more technical terms, the first part of a description is a computer program (specifically: a compiler for the object's language, written in the description language), with the second part being the input to that computer program which produces the object as output.
The theorem says that, among algorithms that decode strings from their descriptions (codes), there exists an optimal one. This algorithm, for all strings, allows codes as short as allowed by any other algorithm up to an additive constant that depends on the algorithms, but not on the strings themselves. Solomonoff used this algorithm and the code lengths it allows to define a "universal probability" of a string on which inductive inference of the subsequent digits of the string can be based. Kolmogorov used this theorem to define several functions of strings, including complexity, randomness, and information.
This program iterates through all possible programs (by iterating through all possible strings and only considering those which are valid programs), starting with the shortest. Each program is executed to find the result produced by that program, comparing it to the input s. If the result matches the length of the program is returned.
What is more, no program at all can compute the function K, be it ever so sophisticated. This is proven in the following.
There is a corollary, humorously called the "full employment theorem" in the programming language community, stating that there is no perfect size-optimizing compiler.
We can find an effective enumeration of all the formal proofs in S by some procedure
Finally, consider the program consisting of all these procedure definitions, and a main call: