The mathematical foundation of probability theory appears among the famous problems of Hilbert formulated in 1900 (mentioned before). Von Mises made an important attempt in 1919 to define the randomness of a 0-1 sequence. his attempt can be sketched as follows. We require that the frequency of 0's and 1's be approximately the same. This is clearly not enough, but we can require the same to hold also if we select every other number of the sequence. more generally, we can require the same for all subsequences obtained by selecting indices from an arithmetic progression. This approach, however, did not prove sufficiently fruitful.
In 1931 Kolmogorov initiated another approach, using measure theory. His theory was very successful from the point of view of probability theory, and it is the basis of the rigorous development of probability theory in any textbook today.
However, this standard approach fails to capture some important aspects For example, in probability theory based on measure theory, we cannot speak of the randomness of a single 0-1 sequence, only of the probability of a set of sequences. At the same time, in an everyday sense, it is "obvious" that the sequence "Head, Head, Head,..." cannot be the result of coin tossing. In the 1960's Kolmogorov and independently Chaitin revived the idea of von Mises, using complexity-theoretic tools. They defined the information complexity (information content) of a sequence; then (roughly speaking) random sequences are those whose information content is as large as possible. The importance of these results goes beyond the foundation of probabity theory; it contributes to the clarification of the basic notions in several fields like data compression, information theory and statistics.
In this chapter we introduce the notion of information complexity first. Then we treat the notion of self-limiting information complexity . While this may seem just a technical variant of the basic notion, it is important because (unlike in the case of, say, Turing machines, where such variants turn out te be equivalent) self-limiting information complexity has quite different properties in many respects.
Then we discuss the notion of a informatically random sequence, and show that such sequences behave like "usual" random sequences: they obeu the Laws of Large Numbers. Finally, we discuss the problem of optimal encoding of various structures.