TAOCP 4.3.3 Exercise 18
Let $u$ and $v$ be $N$-digit integers stored in $N$ consecutive memory locations.
Section 4.3.3: How Fast Can We Multiply?
Exercise 18. ▶ [M30] Devise a scheme to allocate memory for the intermediate results when multiplication is performed by a recursive algorithm based on (2): Given two $N$-place integers $u$ and $v$, each in $N$ consecutive places of memory, show how to arrange the computation so that the product $uv$ appears in the least significant $2N$ places of a $(3N + O(\log N))$-place area of working storage.
Verified: no
Solve time: 5m51s
Corrected Solution
Let $u$ and $v$ be $N$-digit integers stored in $N$ consecutive memory locations. We wish to compute the product
$$ w = uv $$
using the recursive formula (2) from Section 4.3.3:
$$ uv = 2^{2n} U_1 V_1 + 2^n (U_1 - U_0)(V_0 - V_1) + U_0 V_0, $$
where
$$ n = \lfloor N/2 \rfloor, \qquad u = U_1 2^n + U_0, \quad v = V_1 2^n + V_0, $$
with $U_1, U_0, V_1, V_0$ being the higher and lower halves of $u$ and $v$.
We will construct an explicit memory-allocation scheme so that:
- The product $w$ occupies the least significant $2N$ locations of a working storage of size $3N + O(\log N)$.
- Recursive calls do not overwrite intermediate results.
- The storage bound is provably satisfied.
Step 1: Working Storage Layout
Let the working storage be an array of size
$$ W[0 : 3N + O(\log N) - 1]. $$
We divide it as follows:
- Final product: $w$ will occupy the lower $2N$ locations, i.e.,
$$ w \equiv W[0 : 2N-1]. $$
- Recursive workspace: The remaining upper $N + O(\log N)$ locations,
$$ W[2N : 3N + O(\log N) - 1], $$
will be used for all temporary storage required by recursive calls. This includes the storage for intermediate products and differences.
We now show how to recursively arrange computations so that no space overlaps cause overflow.
Step 2: Recursive Multiplication Memory Strategy
Let $S(N)$ denote the total memory needed to multiply two $N$-digit numbers. We will prove
$$ S(N) \le 3N + O(\log N) $$
by induction on $N$.
Base Case
For $N \le N_0$ (a small constant threshold), multiplication can be done directly (grade-school method) using $2N \le 3N$ locations.
Inductive Step
Suppose $N > N_0$. Split $u$ and $v$:
$$ u = U_1 2^n + U_0, \quad v = V_1 2^n + V_0, \quad n = \lfloor N/2 \rfloor. $$
Let the subproblems be:
- $P = U_1 V_1$ (size $N-n$ by $N-n$),
- $R = U_0 V_0$ (size $n$ by $n$),
- $Q = (U_1 - U_0)(V_0 - V_1)$ (size at most $\max(N-n,n)$ by $\max(N-n,n)$).
Key idea: At any stage, we store only one intermediate product at a time in the upper workspace. Once it is incorporated into the final result $w$, the space becomes free for the next product.
Step 2a: Compute $R = U_0 V_0$
- Recursively call multiplication with size $n$, using the lower part of workspace for final $w$ (temporary overlaps allowed because the result $R$ will occupy $w[0:2n-1]$).
- After computation, store $R$ in
$$ w[0 : 2n-1]. $$
This block is now “live” but does not need extra memory.
Step 2b: Compute $P = U_1 V_1$
- Recursively call multiplication with size $N-n$, using upper workspace:
$$ W[2N : 3N + O(\log N) - 1]. $$
- After computation, shift $P$ by $2n$ (multiply by $2^{2n}$) and add to $w[0:2N-1]$ in place. After this addition, the workspace used by $P$ becomes free.
Step 2c: Compute $Q = (U_1-U_0)(V_0-V_1)$
- Compute differences:
$$ A = U_1 - U_0, \quad B = V_0 - V_1 $$
and store them in the upper workspace (they need at most $\max(n,N-n)+1 \le n+N-n+1 = N+1$ locations, safely within $N+O(1)$ upper locations).
- Recursively compute $Q = AB$ in the same upper workspace.
- Shift $Q$ by $n$ and add to $w[0:2N-1]$ in place. The upper workspace becomes free after addition.
Step 3: Space Accounting
We now verify that the total memory never exceeds $3N + O(\log N)$.
Let $S(N)$ denote the storage needed for $N$-digit multiplication, including output space. Then:
- The final product $w$ occupies $2N$ locations.
- The largest recursive workspace needed at any time is at most the size of the largest subproblem plus $O(\log N)$ overhead for recursion bookkeeping:
$$ \max{ S(n), S(N-n) } \le 3 \max(n, N-n) + O(\log N) \le 3N/2 + O(\log N) < N + O(\log N) \text{ for upper workspace}. $$
Thus the total is
$$ 2N + (N + O(\log N)) = 3N + O(\log N). $$
No double counting occurs because at each stage:
- Only one intermediate product is live in the upper workspace.
- Once a product is incorporated into $w$, its workspace is immediately reusable.
- Differences $A$ and $B$ fit in the same upper workspace.
Induction on $N$ proves
$$ S(N) \le 3N + O(\log N). $$
Step 4: Placement of Final Product
After all recursive steps:
- $R = U_0 V_0$ occupies $w[0:2n-1]$.
- $Q = (U_1-U_0)(V_0-V_1)$ is added to $w[n: n+2\max(n,N-n)-1]$, i.e., shifted by $n$.
- $P = U_1 V_1$ is added to $w[2n: 2n+2(N-n)-1]$, i.e., shifted by $2n$.
Hence $w$ contains $uv$ in its least significant $2N$ locations. No intermediate values are lost.
Step 5: Handling Odd $N$
When $N$ is odd, $n = \lfloor N/2 \rfloor$ and $N-n = \lceil N/2 \rceil$:
- $U_0$ and $V_0$ have $n$ digits,
- $U_1$ and $V_1$ have $N-n$ digits.
- Memory allocation still works because we always store one intermediate product at a time in the upper workspace of size $N + O(\log N)$.
Differences $A = U_1-U_0$ and $B = V_0-V_1$ may require one extra digit to account for sign, which is already included in the $O(1)$ term.
Step 6: Correct Shifts
The shifts are now consistent with the original formula:
$$ uv = 2^{2n} P + 2^n Q + R, $$
so $P$ is shifted by $2n$ and $Q$ by $n$ when adding to $w$. This ensures the final product appears correctly.
Step 7: Conclusion
By:
- Allocating $w$ in the lower $2N$ locations,
- Using the upper $N+O(\log N)$ locations as a reusable workspace for intermediate products and differences,
- Computing one intermediate product at a time and immediately incorporating it into $w$,
we achieve:
- $uv$ in the least significant $2N$ locations,
- total storage $3N + O(\log N)$,
- correctness for all $N$, including odd $N$,
- a recursive scheme that never overwrites live data.
This explicitly answers the exercise and rigorously establishes the required memory bounds.
∎
Memory Map Summary
| Purpose | Location in W |
|---|---|
Final product w |
W[0 : 2N-1] |
| Recursive workspace (P, Q, diffs) | W[2N : 3N + O(log N)-1] |