TAOCP 4.3.1 Exercise 3
Exercise 2 generalizes Algorithm A to the addition of $m$ nonnegative $n$-place numbers (u^{(1)}_{n-1}\cdots u^{(1)}_0)_b,\; (u^{(2)}_{n-1}\cdots u^{(2)}_0)_b,\; \ldots,\; (u^{(m)}_{n-1}\cdots u^{(m)}...
Section 4.3.1: The Classical Algorithms
Exercise 3. [21] Write a MIX program for the algorithm of exercise 2, and estimate its running time as a function of $m$ and $n$.
Verified: no
Solve time: 7m25s
Solution
Exercise 2 generalizes Algorithm A to the addition of $m$ nonnegative $n$-place numbers
$$ (u^{(1)}_{n-1}\cdots u^{(1)}_0)b,; (u^{(2)}{n-1}\cdots u^{(2)}_0)b,; \ldots,; (u^{(m)}{n-1}\cdots u^{(m)}_0)_b, $$
where $m\le b$. At stage $j$, the algorithm forms
$$ t=k+\sum_{r=1}^{m}u^{(r)}_j, $$
then sets
$$ w_j=t\bmod b,\qquad k=\lfloor t/b\rfloor . $$
After the last column, $w_n=k$.
Storage conventions
Let
$$ U_r+j $$
contain the digit $u^{(r)}_j$, for $1\le r\le m$ and
$0\le j\le n-1$.
The result digits are stored in
$$ W+j,\qquad 0\le j\le n. $$
The values $m$, $n$, and $b$ are stored in locations
$M$, $N$, and $B$.
The carry $k$ is kept in location $K$.
The current column sum $t$ is accumulated in location $T$.
The index register $I_1$ contains the current digit position $j$.
The index register $I_2$ contains the current summand number $r$.
A table $BASE$ contains the addresses $U_1,U_2,\ldots,U_m$. Thus
$BASE+r$ contains the starting address of the $r$th summand.
MIX program
ENT1 0 j ← 0
ENTA 0
STA K k ← 0
COL LDA K t ← k
STA T
ENT2 0 r ← 0
SUM INC2 1 r ← r+1
CMP2 M
JG DONE
LDA BASE,2 A ← address(Ur)
STA P
LDA T
ADD P,1 add u(r)j
STA T
JMP SUM
DONE LDA T
ENTX 0
DIV B quotient = ⌊t/b⌋,
remainder = t mod b
STA K new carry
STX W,1 store wj
INC1 1 j ← j+1
CMP1 N
JL COL
LDA K
STA W,1 store wn
HLT
The essential point is the use of MIX division. Before DIV B, the
quantity $t$ is placed in register $A$ and register $X$ is set to
zero. The operation divides the double-length quantity $(A:X)$ by
$b$. Since $t<b^2$ when $m\le b$,
$$ t=(\lfloor t/b\rfloor)b+(t\bmod b), $$
and after the division the quotient $\lfloor t/b\rfloor$ is in $A$,
while the remainder $t\bmod b$ is in $X$. Hence the carry and output
digit are obtained exactly as required by Algorithm A.
Correctness
At the beginning of each execution of label COL, let $j$ be the value
contained in $I_1$. The invariant is:
$$ K= \left\lfloor \frac{ k_{j-1}+\sum_{i=0}^{j-1} \left(\sum_{r=1}^{m}u^{(r)}_i\right)b^i } {b^j} \right\rfloor , $$
and the locations $W,W+1,\ldots,W+j-1$ contain the already computed
digits $w_0,\ldots,w_{j-1}$.
The loop SUM forms
$$ T=K+\sum_{r=1}^{m}u^{(r)}_j=t. $$
The division step computes
$$ A=\left\lfloor \frac{t}{b}\right\rfloor, \qquad X=t\bmod b. $$
Therefore
$$ K\leftarrow \left\lfloor \frac{t}{b}\right\rfloor, \qquad W+j\leftarrow t\bmod b, $$
which is exactly the rule of Exercise 2. Thus the invariant is preserved
for the next column. After the final column $j=n-1$ has been processed,
the carry remaining in $K$ is $w_n$, which is stored and the program
halts. Hence the program correctly computes the $(n+1)$-digit sum.
Running time
For each column:
- Initialize $T$ from the carry: $2$ instructions.
- Initialize $I_2$: $1$ instruction.
- For each of the $m$ summands:
- increment $I_2$,
- compare with $M$,
- load base address,
- store it,
- load $T$,
- add the digit,
- store $T$,
- jump back.
This is $8m+O(1)$ instructions. 4. Perform division and store carry and digit: $5$ instructions. 5. Advance $j$ and test termination: $3$ instructions.
Hence the instruction count is
$$ (8m+11)n+O(1). $$
Therefore the running time grows linearly in both $m$ and $n$:
$$ \boxed{\Theta(mn)}. $$
This is the expected complexity, since every one of the $mn$ input
digits must be examined exactly once.