TAOCP 4.2.4 Exercise 5
Let $U$ be a random variable uniformly distributed on the interval $[0,1)$.
Section 4.2.4: Distribution of Floating Point Numbers
Exercise 5. ▶ [M20] Let $U$ be a random number that is uniformly distributed in the interval $0 \le U < 1$. What is the distribution of the leading digits of $U$?
Verified: no
Solve time: 7m53s
Solution
Let $U$ be a random variable uniformly distributed on the interval $[0,1)$. That is, $U$ has probability density function
$$ f_U(u) = 1, \quad 0 \le u < 1. $$
We are asked to determine the distribution of the leading digit of $U$.
Step 1: Definition of the leading digit
Any number $U \in [0,1)$ can be written in decimal form as
$$ U = 0.d_1 d_2 d_3 \dots, $$
where $d_1 \in {0,1,2,\dots,9}$ is the first digit after the decimal point. By definition, the leading digit $D$ is the first nonzero digit in the decimal expansion. Then $D \in {1,2,\dots,9}$.
Step 2: Characterization of the event ${D=d}$
For a given $d \in {1,\dots,9}$, the event ${D = d}$ occurs if and only if $U$ lies in the union of intervals
$$ {D=d} = \bigcup_{k=1}^{\infty} \left[ d \cdot 10^{-k}, (d+1) \cdot 10^{-k} \right). $$
Justification:
- If the first nonzero digit appears at the $k$-th decimal place, it means that the first $k-1$ digits after the decimal are zero.
- Then the $k$-th digit is $d$, so $U$ lies between $d \cdot 10^{-k}$ and $(d+1) \cdot 10^{-k}$.
- Taking the union over all possible positions $k \ge 1$ accounts for all numbers whose first nonzero digit is $d$.
Step 3: Compute the probability
Since $U$ is uniform on $[0,1)$, the probability of each interval is simply its length:
$$ \Pr\big(D=d\big) = \sum_{k=1}^{\infty} \left( (d+1)10^{-k} - d , 10^{-k} \right) = \sum_{k=1}^{\infty} 10^{-k}. $$
Compute the infinite geometric series:
$$ \sum_{k=1}^{\infty} 10^{-k} = \frac{10^{-1}}{1 - 10^{-1}} = \frac{1/10}{9/10} = \frac{1}{9}. $$
Step 4: Conclusion
The leading digit $D$ of a uniform random variable $U \in [0,1)$ is uniformly distributed on ${1,2,\dots,9}$:
$$ \Pr(D = d) = \frac{1}{9}, \quad d = 1,2,\dots,9. $$
Step 5: Remarks
- The previous solution incorrectly claimed that $D$ follows Benford's law. That arises for numbers that are roughly distributed across several orders of magnitude, not for numbers uniformly distributed in $[0,1)$.
- The modular logarithm argument was invalid because the fractional part of $-\log_{10} U$ is not uniform for $U \sim \text{Uniform}[0,1)$.
- The direct union-of-intervals approach provides a simple and fully rigorous computation.
Answer:
$$ \boxed{\Pr(D = d) = \frac{1}{9}, \quad d = 1,2,\dots,9.} $$