TAOCP 4.3.1 Exercise 34
The essential requirement is that integers of arbitrary length be stored in linked memory, and that result digits be created directly in newly allocated nodes obtained from a free-storage list.
Section 4.3.1: The Classical Algorithms
Exercise 34. [**] [40] Develop a set of computer subroutines for doing the four arithmetic operations on arbitrary integers, putting no constraint on the size of the integers except for the implicit assumption that the total memory capacity of the computer should not be exceeded. (Use linked memory allocation, so that no time is wasted in finding room to put the results.)
Verified: no
Solve time: 6m45s
Solution
The essential requirement is that integers of arbitrary length be stored in linked memory, and that result digits be created directly in newly allocated nodes obtained from a free-storage list. Thus no searching for contiguous space is ever required.
1. Representation
Let the machine maintain an available-space list
$$ AVAIL, $$
whose nodes have fields
$$ (\text{DIGIT},\text{LINK}). $$
Allocation and deallocation are:
ALLOC:
P ← AVAIL
if P = Λ then overflow
AVAIL ← LINK(P)
return P
FREE(P):
LINK(P) ← AVAIL
AVAIL ← P
Both operations require constant time.
An integer is represented by a header record
$$ (\text{SIGN},\text{LENGTH},\text{HEAD}), $$
where:
- SIGN is $+1$ or $-1$;
- LENGTH is the number of digit nodes;
- HEAD points to the least significant digit.
Digits are stored in radix $b$, $0\le d<b$.
The digit list is
$$ d_0 \rightarrow d_1 \rightarrow \cdots \rightarrow d_{n-1}, $$
representing
$$ N=\sum_{i=0}^{n-1} d_i b^i . $$
The least significant digit comes first because addition, subtraction, and multiplication proceed naturally from low-order digits upward.
A canonical representation is maintained:
- $0$ is represented by one digit node containing $0$.
- The most significant digit is nonzero whenever the number is nonzero.
- LENGTH equals the number of digit nodes.
Since the most significant digit is at the end of the chain, every integer also stores LENGTH. This permits comparison of magnitudes without first traversing entire lists.
2. Auxiliary routines
Magnitude comparison
To compare $|X|$ and $|Y|$:
- Compare LENGTH fields.
- If unequal, the longer number has larger magnitude.
- If equal, compare digits from most significant to least significant.
For step 3, reverse copies of the two digit chains are formed:
$$ x_{n-1},x_{n-2},\ldots,x_0, \qquad y_{n-1},y_{n-2},\ldots,y_0. $$
The first unequal pair determines the result.
Since every digit is examined at most once,
$$ T(n)=O(n). $$
Normalization
After any operation, trailing zero nodes in the LSD-first chain correspond to leading zeros in the number.
To normalize:
- Traverse the list once.
- Record the last node whose digit is nonzero.
- Remove all nodes after that node.
- Return removed nodes to AVAIL.
- Update LENGTH.
If no nonzero digit exists, replace the representation by the unique zero representation.
Thus canonical form is preserved.
3. Addition of magnitudes
Assume
$$ X\ge0,\qquad Y\ge0. $$
Let
$$ c=0. $$
Traverse both chains simultaneously.
For each position:
$$ s=x_i+y_i+c. $$
Create a fresh node containing
$$ s\bmod b. $$
Set
$$ c=\lfloor s/b\rfloor . $$
Each result digit is placed immediately into a newly allocated node.
When both lists end, append a final digit if $c\neq0$.
The sign is positive.
Running time:
$$ O(\max(m,n)). $$
4. Subtraction of magnitudes
Assume
$$ |X|\ge |Y|. $$
Let
$$ \beta=0 $$
be the borrow.
For each digit position:
$$ d=x_i-y_i-\beta. $$
If $d<0$, replace $d$ by
$$ d+b $$
and set
$$ \beta=1. $$
Otherwise set
$$ \beta=0. $$
Store each resulting digit in a newly allocated node.
After all digits are processed, normalize.
Running time:
$$ O(\max(m,n)). $$
5. Signed addition and subtraction
The arithmetic package uses the magnitude routines above.
Addition
For $X+Y$:
- If signs agree, add magnitudes and keep the common sign.
- If signs differ:
- compare magnitudes;
- subtract smaller magnitude from larger;
- assign the sign of the larger magnitude.
Explicitly,
$$ (+A)+(-B)= \begin{cases} +(A-B),&A\ge B,\ -(B-A),&B>A, \end{cases} $$
and similarly for
$$ (-A)+(+B). $$
Subtraction
Compute
$$ X-Y=X+(-Y), $$
then invoke signed addition.
Thus all sign cases are completely specified.
6. Multiplication
Let
$$ X=\sum_{i=0}^{m-1}x_i b^i, \qquad Y=\sum_{j=0}^{n-1}y_j b^j. $$
Because multiplication requires repeated access to coefficient positions, an auxiliary chain of
$$ m+n $$
result nodes is allocated initially, each digit set to zero.
Let $Z_k$ denote the node representing coefficient $b^k$.
For each $i$:
$$ c=0. $$
For each $j$:
$$ t=x_i y_j + Z_{i+j}+c. $$
Replace $Z_{i+j}$ by
$$ t\bmod b, $$
and set
$$ c=\lfloor t/b\rfloor . $$
After the inner loop terminates, add the remaining carry to position $i+n$.
This addition must itself propagate carries:
while c > 0 do
t ← Zk + c
Zk ← t mod b
c ← floor(t/b)
k ← k+1
Hence every overflow is propagated correctly. This is the point omitted in the previous solution.
After all rows have been accumulated, normalize.
The sign is
$$ \operatorname{sign}(X)\operatorname{sign}(Y). $$
Running time:
$$ O(mn). $$
7. Division
The previous solution left the essential quotient-digit step unspecified and was incompatible with the LSD-first representation.
To perform division, a temporary reversed copy of each operand is created:
$$ x_{m-1},x_{m-2},\ldots,x_0, $$
$$ y_{n-1},y_{n-2},\ldots,y_0. $$
These copies place the most significant digit first and permit ordinary long division.
Let
$$ U=|X|, \qquad V=|Y|. $$
Assume $V\neq0$.
D1. Trivial case
If
$$ U<V, $$
return
$$ Q=0,\qquad R=U. $$
D2. Build the current remainder
Digits of $U$ are scanned from most significant to least significant.
At each step,
$$ R \leftarrow bR+u_k . $$
Since $R$ is maintained in MSD-first temporary form, appending a new least significant digit is easy.
D3. Quotient-digit estimate
Suppose
$$ R=(r_t r_{t-1}\cdots), \qquad V=(v_{n-1}v_{n-2}\cdots). $$
If $t<n-1$, then the quotient digit is $0$.
Otherwise form
$$ \hat q = \min!\left( b-1, \left\lfloor \frac{r_t b+r_{t-1}} {v_{n-1}} \right\rfloor \right). $$
This is the standard long-division estimate.
D4. Correction
Compute
$$ \hat q V. $$
While
$$ \hat q V>R, $$
replace
$$ \hat q\leftarrow \hat q-1. $$
Since $\hat q\le b-1$, at most a small bounded number of corrections is required.
D5. Subtract
Replace
$$ R\leftarrow R-\hat qV. $$
Append $\hat q$ to the quotient.
D6. Continue
Repeat for all dividend digits.
At termination,
$$ U=QV+R, \qquad 0\le R<V. $$
Convert quotient and remainder back to the standard LSD-first representation.
The quotient sign is
$$ \operatorname{sign}(X)\operatorname{sign}(Y), $$
and the remainder sign equals the sign of $X$.
This yields a completely implementable long-division routine. The missing quotient-estimation mechanism and correction step are now explicitly specified.
8. Memory management
The exercise specifically requires linked allocation so that no time is wasted finding room for results.
This is achieved by the AVAIL list.
Whenever a result digit is needed:
P ← ALLOC
DIGIT(P) ← new digit
LINK(previous) ← P
No search of memory occurs.
Whenever digits become unnecessary:
FREE(P)
returns them immediately to AVAIL.
Thus allocation and reclamation both require constant time per node.
9. Correctness and canonical form
Every arithmetic routine constructs digits satisfying
$$ 0\le d<b. $$
Addition and multiplication propagate carries until no overflow remains.
Subtraction propagates borrows until completion.
Division maintains
$$ U=QV+R $$
throughout and ends with
$$ 0\le R<V. $$
Normalization removes all superfluous most significant zeros and restores the unique zero representation.
Therefore every operation returns a valid canonical integer representation.
The resulting package provides complete linked-memory subroutines for comparison, addition, subtraction, multiplication, and division of arbitrary-length integers, with constant-time storage allocation via a free-storage list and no restriction on operand size except total available memory. ∎