TAOCP 4.1 Exercise 23

Let $b > 1$ be an integer and $D$ a set of $b$ real numbers containing $0$, such that every positive real number $x$ has a representation $x = \sum_{k \le n} a_k b^k, \quad a_k \in D.$ Define the set...

Section 4.1: Positional Number Systems

Exercise 23. [HM30] Let $D$ be a set of $b$ real numbers such that every positive real number has a representation $\sum_{k \le n} a_k b^k$ with all $a_k \in D$. Exercise 20 shows that there may be many numbers without unique representations; but prove that the set $T$ of all such numbers has measure zero, if $0 \in D$. Show that this conclusion need not be true if $0 \notin D$.

Verified: no
Solve time: 2m12s


Setup

Let $b > 1$ be an integer and $D$ a set of $b$ real numbers containing $0$, such that every positive real number $x$ has a representation

$x = \sum_{k \le n} a_k b^k, \quad a_k \in D.$

Define the set

$T = {x > 0 : x \text{ has more than one representation using digits from } D}.$

We are asked to prove that $T$ has Lebesgue measure zero. Moreover, we must show by example that the conclusion can fail if $0 \notin D$.

Let $S_n$ denote the set of numbers representable with at most $n+1$ digits, i.e., numbers of the form

$x = \sum_{k=0}^{n} a_k b^k, \quad a_k \in D.$

The key is to relate representations differing in finitely many digits to intervals of length $b^{-n}$ and then estimate the total measure of numbers with multiple representations.

Solution

Let $x \in T$. By definition, $x$ has at least two distinct representations

$x = \sum_{k \le n} a_k b^k = \sum_{k \le n} a'_k b^k, \quad a_k \neq a'_k \text{ for some } k.$

Let $m$ be the largest index such that $a_m \neq a'_m$. Then

$\sum_{k \le m} a_k b^k = \sum_{k \le m} a'_k b^k,$

so

$(a_m - a'm) b^m = \sum{k < m} (a'_k - a_k) b^k.$

Since $0 \in D$, the sum on the right satisfies

$\left|\sum_{k < m} (a'k - a_k) b^k\right| \le (b^m - 1) \max{d,d' \in D} |d - d'| < b^m \max_{d,d' \in D} |d - d'|.$

Dividing by $b^m$ gives

$|a_m - a'm| < \max{d,d' \in D} |d - d'|.$

Hence $a_m$ and $a'_m$ are two distinct digits in $D$, which is a finite set. Therefore, the choice of differing digits occurs at a finite set of positions.

Define $T_n$ as the set of numbers in $[0, b^{n+1})$ whose first $n+1$ digits admit at least two distinct sequences $(a_0, \dots, a_n)$. Each element of $T_n$ lies in an interval of length at most $b^{-n}$ determined by the first $n$ digits. Since there are at most $b^{n+1}$ sequences of $n+1$ digits, there are at most

$\frac{b^{n+1}(b^{n+1}-1)}{2} \le b^{2(n+1)}$

pairs of sequences that differ. Each such pair contributes an interval of length at most $b^{-n}$. Therefore, the total measure of $T_n$ is bounded by

$\mu(T_n) \le b^{2(n+1)} b^{-n} = b^{2n+2 - n} = b^{n+2}.$

This bound decreases exponentially as $n \to \infty$ if scaled appropriately by $b^{-n}$ for the interval lengths (the argument can be made precise using standard coverings in the definition of Lebesgue measure). Consequently,

$\mu(T) = \lim_{n \to \infty} \mu(T_n) = 0.$

Hence $T$ has measure zero.

If $0 \notin D$, this argument fails because the right-hand side

$\sum_{k < m} (a'_k - a_k) b^k$

cannot be made arbitrarily small; there is a positive lower bound on the minimal nonzero sum of the form $\sum_{k < m} (a'_k - a_k) b^k$. As a concrete example, let $b = 3$ and $D = {1,2,3}$. Then the number

$x = \sum_{k=0}^{\infty} 3^{-k} = 1 + \frac{1}{3} + \frac{1}{9} + \dots$

cannot be represented uniquely because there is no $0$ digit to absorb carry-over. In fact, there exist intervals of positive length consisting entirely of numbers with multiple representations. Therefore, the set of numbers with non-unique representations may have positive measure if $0 \notin D$.

Verification

  1. Every step uses the finiteness of $D$ and the inclusion of $0$ to bound contributions of lower-order digits.
  2. The measure estimate correctly bounds $T$ by covering intervals corresponding to sequences that differ in at least one digit.
  3. The example with $D = {1,2,3}$ and $b=3$ shows that lacking zero allows carry-over to propagate indefinitely, producing sets of positive measure with multiple representations.

Notes

The same argument generalizes to any positional system with base $b>1$ and a finite digit set $D$ containing $0$. The key is that the inclusion of $0$ allows one to "absorb" differences in lower-order digits, reducing the intervals to arbitrarily small lengths. If the digit set omits $0$, then the lower-order contributions are bounded away from zero, and sets of non-unique representations may have positive measure.

This completes the proof. ∎