TAOCP 4.2.2 Exercise 17
Let the floating point number in location ACC be u=s_u\,m_u\,b^{e_u}, and let the floating point number in register A be
Section 4.2.2: Accuracy of Floating Point Arithmetic
Exercise 17. [28] Write a MIX subroutine, FCMP, that compares the floating point number $u$ in location ACC with the floating point number $v$ in register A, setting the comparison indicator to LESS, EQUAL, or GREATER according as $u < v$, $u = v$, or $u > v$ ${e}$; here $e$ is stored in location EPSILON as a nonnegative fixed point quantity with the radix point assumed at the left of the word. Assume normalized inputs.
Verified: yes
Solve time: 5m56s
Solution
Let the floating point number in location ACC be
$$ u=s_u,m_u,b^{e_u}, $$
and let the floating point number in register A be
$$ v=s_v,m_v,b^{e_v}, $$
where the inputs are normalized. Let $e$ be the nonnegative fixed-point quantity stored in EPSILON.
The notation ${e}$ means that $u$ and $v$ are to be regarded as equal whenever their relative difference is at most $e$. Thus FCMP should set the comparison indicator according to
$$ u<v-e|v|,\qquad |u-v|\le e|v|,\qquad u>v+e|v|. $$
Equivalently, if
$$ d=u-v, \qquad t=e|v|, $$
then
$$ d<-t \Rightarrow \text{LESS}, \qquad |d|\le t \Rightarrow \text{EQUAL}, \qquad d>t \Rightarrow \text{GREATER}. $$
The problem is therefore reduced to computing $d$ and $t$.
Computation of $d=u-v$
The critical step is exponent alignment.
Suppose first that $e_u\ge e_v$. Then
$$ u-v = b^{e_u} \left( s_u m_u - s_v m_v b^{e_v-e_u} \right). $$
Since $e_v-e_u\le0$, the factor $b^{e_v-e_u}$ is obtained by dividing the mantissa of $v$ by $b^{e_u-e_v}$.
Similarly, if $e_v>e_u$,
$$ u-v = b^{e_v} \left( s_u m_u b^{e_u-e_v} - s_v m_v \right), $$
and the mantissa of $u$ must be divided by $b^{e_v-e_u}$.
Thus exponent alignment is performed by repeatedly shifting the mantissa belonging to the smaller exponent one radix position to the right until the exponents agree. The earlier description $m_v'=m_v b^{e_v-e_u}$ is correct only when interpreted as division by $b^{e_u-e_v}$; the actual operation is a right shift.
After alignment, the signed mantissas are subtracted and the result normalized. This yields the floating point number
$$ d=u-v. $$
Computation of $t=e|v|$
The quantity stored in EPSILON is a fixed-point fraction with radix point at the left of the word. Let
$$ e=\frac{E}{b^{5}}, $$
where $E$ is the integer represented by the five bytes of EPSILON.
Let
$$ |v|=m_v b^{e_v}, $$
with $m_v>0$.
To form
$$ t=e|v|, $$
the mantissa of $|v|$ is multiplied by the fixed-point fraction $e$. Since both operands are represented as ordinary MIX words, this is done by integer multiplication of the mantissa field by $E$, followed by normalization and adjustment of the exponent. No floating-point multiply instruction is required. The result is exactly the floating-point quantity
$$ t=e|v|. $$
Computation of $|d|$
After $d=u-v$ has been formed, its sign byte is inspected.
- If $d\ge0$, leave it unchanged.
- If $d<0$, change the sign byte from minus to plus.
Since the magnitude field is unchanged, the resulting value is $|d|$.
This explicitly supplies the absolute-value operation needed for the equality test.
Comparison
The comparison is now straightforward.
- Compare $d$ with $-t$.
If
$$ d<-t, $$
set the comparison indicator to LESS and return. 2. Compare $|d|$ with $t$.
If
$$ |d|\le t, $$
set the comparison indicator to EQUAL and return. 3. Otherwise,
$$ d>t, $$
so set the comparison indicator to GREATER and return.
The equivalence of these tests with the relative-error criterion follows immediately from the definition
$$ t=e|v|. $$
Indeed,
$$ |d|\le t \iff |u-v|\le e|v|, $$
which is exactly the condition for equality within tolerance ${e}$.
MIXAL sketch
A complete implementation follows the structure
FCMP STJ RETADR
* Form d = u - v.
LDA ACC
STA UBUF
STA DBUF
* Align exponents.
* Repeatedly right-shift the mantissa
* belonging to the smaller exponent
* until exponents agree.
* Subtract aligned mantissas.
* Normalize result.
* Store d in DBUF.
* Form t = e|v|.
LDA A
JAN ABSV
CONT1 STA VABS
LDA EPSILON
MUL VABS
* Normalize product.
STA TBUF
* Test d < -t.
CMPA ...
JL LESS
* Test |d| <= t.
LDA DBUF
JAN ABSD
CONT2 CMPA TBUF
JLE EQUAL
GREATER CMPA =0=
JG RETURN
LESS * set comparison indicator LESS
JMP RETURN
EQUAL * set comparison indicator EQUAL
RETURN JMP RETADR
ABSV ENTA |A|
JMP CONT1
ABSD ENTA |A|
JMP CONT2
The essential points are:
- Exponent alignment is done by right-shifting the mantissa associated with the smaller exponent.
- $|d|$ is obtained by replacing a negative sign byte by a positive sign byte.
- $t=e|v|$ is formed by multiplying the mantissa of $|v|$ by the fixed-point fraction stored in EPSILON and normalizing.
- The comparison indicator is set according to
$$ d<-t,\qquad |d|\le t,\qquad d>t. $$
Hence FCMP correctly implements comparison of normalized floating point numbers with relative tolerance ${e}$. ∎