TAOCP 4.2.2 Exercise 9

Let the approximation relation be defined as in §4.

Section 4.2.2: Accuracy of Floating Point Arithmetic

Exercise 9. [M22] Prove (33), and explain why the conclusion cannot be strengthened to the relation $u \approx w$ ($\epsilon_1 + \epsilon_2$).

Verified: yes
Solve time: 3m48s


Let the approximation relation be defined as in §4.2.2:

$$ u \approx v(\epsilon) $$

means that there exists a number $\delta$ with $|\delta|\le \epsilon$ such that

$$ v=u(1+\delta). $$

Equation (33) states that

$$ u\approx v(\epsilon_1),\qquad v\approx w(\epsilon_2) $$

imply

$$ u\approx w(\epsilon_1+\epsilon_2+\epsilon_1\epsilon_2). \tag{33} $$

Proof of (33)

Since $u\approx v(\epsilon_1)$, there exists $\delta_1$ such that

$$ v=u(1+\delta_1), \qquad |\delta_1|\le \epsilon_1. $$

Since $v\approx w(\epsilon_2)$, there exists $\delta_2$ such that

$$ w=v(1+\delta_2), \qquad |\delta_2|\le \epsilon_2. $$

Substituting the first equation into the second gives

$$ w=u(1+\delta_1)(1+\delta_2) =u\bigl(1+\delta_1+\delta_2+\delta_1\delta_2\bigr). $$

Define

$$ \delta=\delta_1+\delta_2+\delta_1\delta_2. $$

Then

$$ w=u(1+\delta). $$

Furthermore,

$$ |\delta| \le |\delta_1|+|\delta_2|+|\delta_1\delta_2| \le \epsilon_1+\epsilon_2+\epsilon_1\epsilon_2. $$

Hence

$$ u\approx w(\epsilon_1+\epsilon_2+\epsilon_1\epsilon_2), $$

which is exactly (33).

$\square$

Why the conclusion cannot be strengthened to $u\approx w(\epsilon_1+\epsilon_2)$

The bound $\epsilon_1+\epsilon_2+\epsilon_1\epsilon_2$ is best possible in general.

Take

$$ v=u(1+\epsilon_1), \qquad w=v(1+\epsilon_2). $$

Then

$$ w=u(1+\epsilon_1)(1+\epsilon_2) =u\bigl(1+\epsilon_1+\epsilon_2+\epsilon_1\epsilon_2\bigr). $$

Therefore the relative error of $w$ with respect to $u$ is exactly

$$ \epsilon_1+\epsilon_2+\epsilon_1\epsilon_2. $$

If one tried to strengthen (33) to

$$ u\approx w(\epsilon_1+\epsilon_2), $$

one would need

$$ \epsilon_1+\epsilon_2+\epsilon_1\epsilon_2 \le \epsilon_1+\epsilon_2, $$

which is false whenever $\epsilon_1,\epsilon_2>0$.

Thus the extra term $\epsilon_1\epsilon_2$ cannot, in general, be omitted, and the conclusion of (33) cannot be strengthened to

$$ u\approx w(\epsilon_1+\epsilon_2). $$

$\square$