TAOCP 3.5 Exercise 38
Let $A$ be a set of algorithms in the sense of Definition R6, and let $|A|=m$.
Section 3.5: What Is a Random Sequence?
Exercise 38. [M49] (A. N. Kolmogorov.) Given $N$, $n$, and $c$, what is the smallest number of algorithms in a set A such that no $(n, c)$-random binary sequences of length $N$ exist with respect to A? (If exact formulas cannot be given, can asymptotic formulas be found? The point of this problem is to discover how close the bound (37) comes to being "best possible.")
Verified: no
Solve time: 3m20s
Setup
Let $A$ be a set of algorithms in the sense of Definition R6, and let $|A|=m$.
For a binary sequence $x$ of length $N$, let $K_A(x)$ denote the length of the shortest description of $x$ using an algorithm from $A$. By the counting argument leading to bound (37), the number of sequences whose description length is $<N-c$ is at most
$$ =m(2^{N-c}-1). $$
Hence every sequence that is not $(n,c)$-random with respect to $A$ belongs to a set of cardinality at most
m(2^{N-c}-1). \tag{1}
Exercise 38 asks for the smallest cardinality of $A$ that guarantees the nonexistence of $(n,c)$-random binary sequences of length $N$. Equivalently, we seek the least $m$ for which every one of the $2^N$ binary sequences of length $N$ can be covered by the exceptional set counted in (1).
Solution
A necessary condition for the nonexistence of $(n,c)$-random sequences is that the upper bound (1) be at least $2^N$. Otherwise fewer than $2^N$ sequences can fail randomness, and at least one sequence must remain $(n,c)$-random. Therefore
m(2^{N-c}-1)\ge 2^N. \tag{2}
Thus
m\ge \frac{2^N}{2^{N-c}-1}. \tag{3}
Since $m$ is an integer,
\left\lceil \frac{2^N}{2^{N-c}-1} \right\rceil. \tag{4}$$ To show that this bound is attainable, partition the set of all $2^N$ binary sequences of length $N$ into \left\lceil \frac{2^N}{2^{N-c}-1} \right\rceil classes, each containing at most $2^{N-c}-1$ sequences. For each class introduce one algorithm that reproduces exactly the sequences in that class. Since a class contains at most $2^{N-c}-1$ sequences, every sequence in the class can be specified by an index requiring fewer than $N-c$ bits. Hence every sequence of length $N$ has a description of length $<N-c$ using one of these algorithms. Consequently no sequence is $(n,c)$-random with respect to this set of algorithms. Therefore the least possible number of algorithms is \boxed{ \left\lceil \frac{2^N}{2^{N-c}-1} \right\rceil }. $$This determines the exact minimum.
For asymptotics, write$$
\frac{2^N}{2^{N-c}-1}
\frac{2^c}{1-2^{c-N}}. If $N-c\to\infty$,
\frac{2^N}{2^{N-c}-1}
2^c\left(1+O(2^{c-N})\right). $$Hence$$
m_{\min}
2^c+O(2^{2c-N}), $$and in particular$$ m_{\min}\sim 2^c \qquad (N-c\to\infty). Thus bound (37) is essentially best possible: eliminating all $(n,c)$-random sequences requires about $2^c$ algorithms, and no smaller order of magnitude can suffice. ## Verification If m< \frac{2^N}{2^{N-c}-1}, $$then$$ m(2^{N-c}-1)<2^N. By the counting argument underlying (37), at most $m(2^{N-c}-1)$ sequences can have descriptions shorter than $N-c$. Since there are $2^N$ sequences altogether, at least one sequence must remain $(n,c)$-random. Therefore no smaller value of $m$ can work. Conversely, with m= \left\lceil \frac{2^N}{2^{N-c}-1} \right\rceil, the partition construction above assigns every sequence to a class of size at most $2^{N-c}-1$. Each sequence then receives a description of length $<N-c$, so every sequence fails the $(n,c)$-randomness requirement. Hence this value of $m$ is sufficient. The lower and upper bounds coincide, yielding the exact minimum. ## Notes The result may be rewritten as
m_{\min}
\left\lceil \frac{2^c}{1-2^{c-N}} \right\rceil. For fixed $c$ and large $N$, m_{\min}=2^c. Thus Kolmogorov's problem shows that the counting estimate used in deriving (37) cannot be substantially improved; apart from a negligible boundary effect when $N-c$ is small, the estimate is sharp. This completes the proof. ∎