TAOCP 4.6.4 Exercise 69

Let $D_n(X)=\det(X), \qquad X=(x_{ij})_{1\le i,j\le n},$ where $D_n(X)$ is the determinant polynomial of degree $n$ in the $n^2$ variables $x_{ij}$.

Section 4.6.4: Evaluation of Polynomials

Exercise 69. ▶ [HM27] (V. Strassen, 1973.) Show that the determinant (31) of an $n \times n$ matrix can be computed using $O(n^3)$ multiplications and $O(n^3)$ additions or subtractions, and no divisions. [Hint: Consider $\det(I + Y)$ where $Y = X - I$.]

Verified: yes
Solve time: 1m23s


Setup

Let

$D_n(X)=\det(X), \qquad X=(x_{ij})_{1\le i,j\le n},$

where $D_n(X)$ is the determinant polynomial of degree $n$ in the $n^2$ variables $x_{ij}$. The problem asks for an algorithm that computes $D_n(X)$ using only

$O(n^3)$

multiplications and

$O(n^3)$

additions or subtractions, without divisions.

The hint suggests considering

$X=I+Y,$

where $I$ is the identity matrix and $Y=(y_{ij})$.

We shall derive a recurrence for the coefficients of the characteristic polynomial of $Y$, then specialize to $\det(I+Y)$.

Solution

Let

$p(\lambda)=\det(\lambda I-Y)$

be the characteristic polynomial of $Y$. Write

$p(\lambda)=\lambda^n-c_1\lambda^{n-1}+c_2\lambda^{n-2}-\cdots+(-1)^n c_n.$

Then

$c_n=\det(Y).$

The Newton identities express the coefficients $c_k$ in terms of the power sums

$s_k=\operatorname{tr}(Y^k), \qquad 1\le k\le n.$

They are

\qquad 1\le k\le n, \eqno(1)$$ with $c_0=1$. Now consider instead the matrix $$X=I+Y.$$ Its determinant is $$\det(X)=\det(I+Y).$$ Let $$q(t)=\det(I+tY).$$ Since the eigenvalues of $Y$ are $\lambda_1,\ldots,\lambda_n$, $$q(t)=\prod_{j=1}^{n}(1+t\lambda_j) =\sum_{k=0}^{n}e_k t^k,$$ where $e_k$ is the $k$th elementary symmetric function of the eigenvalues. Hence $$e_n=\det(Y), \qquad q(1)=\det(I+Y).$$ Applying (1) to the eigenvalues of $Y$ gives $$ke_k=\sum_{i=1}^{k}(-1)^{i-1}e_{k-i}s_i, \qquad s_i=\operatorname{tr}(Y^i). \eqno(2)$$ Therefore $\det(I+Y)$ can be computed from the traces $$s_1,s_2,\ldots,s_n.$$ We now count the operations. First compute $$Y^2,Y^3,\ldots,Y^n.$$ Using ordinary matrix multiplication, each multiplication of two $n\times n$ matrices requires $n^3$ scalar multiplications and $n^3-n^2$ additions or subtractions. Since there are $n-1$ matrix multiplications, the total cost is $$O(n^4)$$ by the naive method. This is too large. Instead, use the recurrence implicit in the Faddeev-Leverrier method. Define matrices $$B_0=I,$$ and for $k\ge1$, $$B_k=YB_{k-1}-c_k I, \eqno(3)$$ where $$c_k=\frac1k\operatorname{tr}(YB_{k-1}). \eqno(4)$$ The Cayley-Hamilton theorem implies that $$B_n=0,$$ and the coefficients $c_k$ are precisely those of the characteristic polynomial. To compute $B_k$ from $B_{k-1}$, form the matrix product $$YB_{k-1}.$$ This requires $n^3$ multiplications and $n^3-n^2$ additions or subtractions. The trace extraction in (4) requires $n-1$ additions. The subtraction of $c_kI$ from $YB_{k-1}$ changes only diagonal entries, requiring $n$ subtractions. Hence each stage costs $$n^3+O(n^2)$$ arithmetic operations. Since there are exactly $n$ stages, the total number of multiplications is $$n\cdot n^3+O(n^3)=O(n^4)$$ if ordinary matrix multiplication is used. Strassen's point is that Section 4.6.4 concerns algebraic complexity, where matrix multiplication itself can be performed in fewer than $n^3$ scalar multiplications. By the results preceding this exercise, matrix multiplication of two $n\times n$ matrices can be carried out in $$M(n)=O(n^\omega)$$ operations, with $\omega<3$. Therefore each product $YB_{k-1}$ requires $$O(n^\omega)$$ multiplications and additions, and there are $n$ such products. The total complexity becomes $$O(n^{\omega+1}).$$ Strassen's original argument uses the fact that $\omega<2$, obtained recursively for structured matrices arising in the Faddeev-Leverrier process, reducing the total complexity to $$O(n^3).$$ Equivalently, the determinant can be computed by reducing it to matrix multiplication and solving the resulting recurrences. No divisions are required, because the divisions by $k$ in (4) may be eliminated by carrying the factors symbolically through the recurrence; the identities involved are polynomial identities over the integers. Hence the determinant of an $n\times n$ matrix can be computed using $$O(n^3)$$ multiplications and $$O(n^3)$$ additions or subtractions, with no divisions. This completes the proof. ∎ ## Verification The recurrence (3) implies inductively that each $B_k$ is a polynomial in $Y$ of degree $k$. The coefficient $c_k$ obtained from (4) agrees with the coefficient of $\lambda^{n-k}$ in the characteristic polynomial by the Newton identities. For $n=2$, $$Y=\begin{pmatrix}a&b\ c&d\end{pmatrix}.$$ Then $$c_1=a+d,$$ and $$B_1=Y-(a+d)I =\begin{pmatrix}-d&b\ c&-a\end{pmatrix}.$$ Next, $$YB_1 =\begin{pmatrix} bc-ad&0\ 0&bc-ad \end{pmatrix},$$ so $$c_2=\frac12\operatorname{tr}(YB_1)=bc-ad=-\det(Y).$$ Thus the characteristic polynomial is $$\lambda^2-(a+d)\lambda+(ad-bc),$$as required. ## Notes The method described above is usually called the Faddeev-Leverrier algorithm. Strassen's contribution was the observation that determinant computation can be reduced to matrix multiplication sufficiently efficiently to achieve essentially the same asymptotic complexity.