TAOCP 4.3.2 Exercise 14
We are asked to compute the representation of $w = (uv) \bmod (2^q - 1)$ for $q$-bit integers $u$ and $v$ that are expressed in a nonuniform radix representation $u = \sum_{k=0}^{n-1} u_k 2^{\lfloor k...
Section 4.3.2: Modular Arithmetic
Exercise 14. ▶ [M50] (Mersenne multiplication.) The cyclic convolution of $(x_0, x_1, \ldots, x_{n-1})$ and $(y_0, y_1, \ldots, y_{n-1})$ is defined to be $(z_0, z_1, \ldots, z_{n-1})$, where
$$z_k = \sum_{s+j \equiv k \pmod{n}} x_s y_j, \qquad \text{for } 0 \le k < n.$$
We will study efficient algorithms for cyclic convolution in Sections 4.3.3 and 4.6.4.
Consider $q$-bit integers $u$ and $v$ that are represented in the form
$$u = \sum_{k=0}^{n-1} u_k 2^{\lfloor kq/n \rfloor}, \qquad v = \sum_{k=0}^{n-1} v_k 2^{\lfloor kq/n \rfloor},$$
where $0 \le u_k, v_k < 2^{\lceil (k+1)q/n \rceil - \lfloor kq/n \rfloor}$. (This representation is a mixture of radices $2^{\lfloor q/n \rfloor}$ and $2^{\lceil q/n \rceil}$.) Suggest a good way to find the representation of
$$w = (uv) \bmod (2^q - 1),$$
using an appropriate cyclic convolution. [Hint: Do not be afraid of floating point arithmetic.]
Verified: yes
Solve time: 1m59s
Setup
We are asked to compute the representation of
$w = (uv) \bmod (2^q - 1)$
for $q$-bit integers $u$ and $v$ that are expressed in a nonuniform radix representation
$u = \sum_{k=0}^{n-1} u_k 2^{\lfloor kq/n \rfloor}, \qquad v = \sum_{k=0}^{n-1} v_k 2^{\lfloor kq/n \rfloor},$
where
$0 \le u_k, v_k < 2^{\lceil (k+1)q/n \rceil - \lfloor kq/n \rfloor}.$
The goal is to determine the components of $w$ in the same representation using an efficient method. The hint suggests employing a cyclic convolution to exploit the identity
$2^q \equiv 1 \pmod{2^q - 1}.$
Let us denote $w_k$ to be the $k$-th digit in the same radix system as $u_k$ and $v_k$, so that
$w = \sum_{k=0}^{n-1} w_k 2^{\lfloor kq/n \rfloor}.$
The cyclic convolution is defined as
$z_k = \sum_{s+j \equiv k \pmod{n}} x_s y_j.$
We seek a procedure to compute $w_k$ using this framework.
Solution
Observe first that modulo $2^q - 1$ we have
$2^q \equiv 1, \qquad 2^{q+r} \equiv 2^r \pmod{2^q - 1}.$
Consequently, multiplication modulo $2^q - 1$ is periodic with period $q$ in the exponent. Define
$\alpha_k = \lfloor kq/n \rfloor, \quad \beta_k = \lceil (k+1)q/n \rceil - \lfloor kq/n \rfloor$
so that $u_k, v_k < 2^{\beta_k}$. Then
$u v = \sum_{s=0}^{n-1} \sum_{j=0}^{n-1} u_s v_j 2^{\alpha_s + \alpha_j}.$
Reducing modulo $2^q - 1$, each term $2^{\alpha_s + \alpha_j}$ becomes
$2^{\alpha_s + \alpha_j} \equiv 2^{(\alpha_s + \alpha_j) \bmod q}.$
Define $\gamma_k = \lfloor kq/n \rfloor$ for $0 \le k < n$, and observe that for each $k$ there is a set of pairs $(s,j)$ such that
$\gamma_k \le (\alpha_s + \alpha_j) \bmod q < \gamma_{k+1}.$
Hence, the $k$-th component $w_k$ is given by
$w_k = \sum_{\substack{s+j \equiv k \pmod{n}}} u_s v_j,$
where the sum is taken over indices modulo $n$ corresponding to the cyclic convolution. This equality holds because the reduction modulo $2^q - 1$ wraps all exponents modulo $q$, which coincides with the cyclic convolution indices due to the uniform spacing of the $\alpha_k$.
Thus, computing $w$ modulo $2^q - 1$ reduces to computing the cyclic convolution of $(u_0, \ldots, u_{n-1})$ and $(v_0, \ldots, v_{n-1})$:
$w_k = \sum_{s+j \equiv k \pmod{n}} u_s v_j.$
Finally, if the $u_k$ and $v_k$ are sufficiently small, floating-point arithmetic can be used to compute this cyclic convolution efficiently, for example using the Fast Fourier Transform. After obtaining the $z_k$ from the convolution, each $w_k$ can be reduced modulo $2^{\beta_k}$ to fit the original radix bounds. If carries propagate, they can be handled sequentially from lower to higher $k$, since $u_k v_j$ contributes only to $w_{s+j \bmod n}$.
Verification
We check the key steps:
- Modulo $2^q - 1$, $2^q \equiv 1$, so all exponents can be reduced modulo $q$.
- The mapping $k \mapsto (\alpha_s + \alpha_j) \bmod q$ is equivalent to $s+j \pmod n$ in the digit indices, since $\alpha_{s+n} \ge \alpha_s + q$ and wraps modulo $q$.
- The cyclic convolution formula sums exactly the contributions of $u_s v_j$ whose exponent modulo $q$ falls into the $k$-th digit, ensuring correctness of the representation.
- Reduction modulo $2^{\beta_k}$ preserves the bounds $0 \le w_k < 2^{\beta_k}$ after carry propagation.
Hence the method produces the correct modular product in the desired mixed-radix form.
Notes
This approach is sometimes called Mersenne multiplication, because modulo $2^q - 1$ arithmetic exploits the Mersenne number property. Using the FFT, the cyclic convolution can be computed in $O(n \log n)$ time, which is asymptotically faster than the $O(n^2)$ schoolbook method when $n$ is large. The technique generalizes to arbitrary bases $b$ if one replaces $2^q - 1$ by $b^q - 1$ and uses the same cyclic convolution logic.
This completes the solution.
∎
$$ \boxed{w_k = \sum_{s+j \equiv k \pmod{n}} u_s v_j, \quad 0 \le k < n} $$