TAOCP 4.6.4 Exercise 20

Let P(x)=u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5.

Section 4.6.4: Evaluation of Polynomials

Exercise 20. ▶ [21] Write a MIX program that evaluates a fifth-degree polynomial according to scheme (11); try to make the program as efficient as possible, by making slight modifications to (11). Use MIX's floating point arithmetic operators FADD and FMUL, which are described in Section 4.2.1.

Verified: yes
Solve time: 4m11s


Solution

Let

$$ P(x)=u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5. $$

Scheme (11) in §4.6.4 is Horner's rule:

$$ P(x) = \bigl(\bigl(\bigl(\bigl(u_0x+u_1\bigr)x+u_2\bigr)x+u_3\bigr)x+u_4\bigr)x+u_5. $$

A direct implementation requires five multiplications by $x$ and five additions.

The exercise asks for a fifth-degree polynomial and suggests making slight modifications to (11) in order to improve efficiency. Since the degree is fixed, we can group two Horner steps at a time by introducing

$$ y=x^2. $$

Then

$$ \begin{aligned} P(x) &= u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5 \ &= x(u_0y^2+u_2y+u_4) +(u_1y^2+u_3y+u_5). \end{aligned} $$

Define

$$ A(y)=u_0y^2+u_2y+u_4, \qquad B(y)=u_1y^2+u_3y+u_5. $$

Both $A$ and $B$ are quadratic polynomials in $y$, hence

$$ A(y)=(u_0y+u_2)y+u_4, \qquad B(y)=(u_1y+u_3)y+u_5. $$

The evaluation can therefore proceed as follows:

  1. Compute $y=x^2$.
  2. Evaluate $A(y)$.
  3. Evaluate $B(y)$.
  4. Form $xA(y)+B(y)$.

This uses

$$ 1+2+2+1=6 $$

floating multiplications and

$$ 2+2+1=5 $$

floating additions.

The number of additions is unchanged, but the evaluation of the two quadratics is independent and allows better use of registers and memory. In MIX this leads to a compact program.

Assume:

  • XVAL contains $x$,
  • U0, U1, U2, U3, U4, U5 contain the coefficients,
  • Y, A, B are temporary storage locations.

All quantities are floating-point numbers.

        LDA   XVAL
        FMUL  XVAL        * y := x^2
        STA   Y

* A = (u0*y + u2)*y + u4

        LDA   U0
        FMUL  Y
        FADD  U2
        FMUL  Y
        FADD  U4
        STA   A

* B = (u1*y + u3)*y + u5

        LDA   U1
        FMUL  Y
        FADD  U3
        FMUL  Y
        FADD  U5
        STA   B

* P(x) = x*A + B

        LDA   A
        FMUL  XVAL
        FADD  B

        HLT

Every FMUL and FADD has a memory operand, exactly as required by the MIX floating-point instruction set.

To verify correctness, let the value stored in A be

$$ A=(u_0y+u_2)y+u_4=u_0y^2+u_2y+u_4, $$

and the value stored in B be

$$ B=(u_1y+u_3)y+u_5=u_1y^2+u_3y+u_5. $$

The final accumulator value is

$$ xA+B = x(u_0y^2+u_2y+u_4) +(u_1y^2+u_3y+u_5). $$

Substituting $y=x^2$ gives

$$ u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5 = P(x). $$

Thus the program correctly evaluates the fifth-degree polynomial.

The slight modification of Horner's scheme consists of treating the polynomial as a linear polynomial in $x$ whose coefficients are quadratic polynomials in $x^2$. This yields an efficient MIX floating-point program using only valid FMUL and FADD instructions with memory operands. ∎