In zero-knowledge proof systems, we (almost) always operate over a finite field
and because the prover usually has to do a lot of field operations to generate the proof, we naturally want our field operations to be as fast as possible.
With elliptic curve cryptography we are restricted to fields of “cryptographic size” e.g.
around 256 bits for 128 bits of security.
However, STARK-like techniques (Reed-Solomon IOPs)
have no such direct dependency between the security parameter and field-size. We can therefore use much smaller fields.

By doing so we can use fields with very efficient arithmetic, such as small binary fields or small prime fields (think 32-bit or 64-bit).
Some of the fastest such fields are small Mersenne prime fields, which have very simple modular reduction.

Unfortunately, until recently, the structure of these fields made them unsuitable for STARKs because they could not efficiently accommodate FRI/STIR. That has now changed.
This is the beginning of a four part series explaining recent construction by Haböck, Levit and Papini dupped Circle STARKs, which enables the use of these particular fields for STARKs with greater efficiency. The series will be structured as follows:

Part I – intro, context and motivation for Mersenne prime fields (this post).

Part II – we explore the unit circle as a group.

Part III – implement the circle FFT and apply the techniques within FRI.

Part IV – we explore the arithmetization of Circle STARKs.

Prerequisites for FRI and STARKs

This post is not aiming to explain the general workings of FRI, for that I recommend this blog post or this one, although we will cover Circle FRI and Circle STARKs in future parts of this series. For now it is sufficient to know that to instantiate FRI (and STARKs) you need:

A finite field $\mathbb{F}$.

A group $L_0$ of order $|L_0| = 2^k$: a smooth group with its operation defined over $\mathbb{F}$:
It can be computed by a (small degree) polynomial/rational function over $\mathbb{F}$.

A sequence of $k$ group homomorphisms $\phi_i : L_i \to L_{i+1}$ with small (order 2) kernel.
Each of which can be expressed as (small degree) polynomials/rational functions over $\mathbb{F}$.

The homomorphisms define a sequence of smaller and smaller smooth groups $L_i = \phi_i(L_{i-1})$:

Here is a brief overview of the landscape of possible FRI instantiations:

1) Smooth Multiplicative Subgroup

Usable when the multiplicative subgroup $\mathbb{F}^*$ of $\mathbb{F}$ is smooth: $2^k {\large\vert} |\mathbb{F}^{*}|$

Suitable for the prime fields $\mathbb{F}_p$ when $p - 1 = 2^k \cdot n$ for some $n$.

The subgroup is $L_0 = \langle \omega \rangle = \left\{ \omega^i \mid i \in \left[0, 2^k\right) \right\}$ where $\omega \in \mathbb{F}_p^*$ has order $2^k$.

The homomorphism is simply squaring $\phi : X \mapsto X^2$.

Therefore the smaller groups $L_i$ are $L_i = \langle \omega^{2^i} \rangle$.

Usable when the field is a binary field $\mathbb{F}_{2^m}$.

This flavor is useful for applications where the computation being verified is, well, binary.
For instance the evaluation of Keccak is most efficiently expressed over binary fields.
Additionally, binary fields are exceptionally fast in hardware.

The subgroup is a linear subspace: we view $\mathbb{F}_{2^m}$ as an $m$-dimensional vector space over $\mathbb{F}_2$
and take a subspace of dimension $k$, in other words for $\mathbb{F}_2$-linearly independent vectors $v_1, \ldots, v_k \in \mathbb{F}^{2^m}$.
$$L_0 = \langle \vec{v_1}, \ldots, \vec{v_k} \rangle = \left\{ \sum_i \lambda_i \vec{v_i} \ {\large\mid} \ \lambda_i \in \mathbb{F}_2 \right\}$$

The homomorphism quotients by a subspace of $L_0$:
$$
\phi_i(X) = \prod_{\vec{e} \ \in \ \langle \vec{v_i} \rangle} (X - \vec{e})
$$
This is an $\mathbb{F}_2$-linear transformation,
it corresponds to a matrix with kernel-space $\langle \vec{v_i} \rangle = \left\{ \vec{v_i}, \vec{0} \right\}$ i.e. sending some one-dimensional subspace generated by $\vec{v_i}$ to zero.

Useful if you need a specific field. For instance if you want to verify secp256k1 ECDSA signatures inside a STARK,
you can use the base field of secp256k1 as the field for the STARK, allowing efficient verification of the curve operations.

The technique actually slightly generalizes the form we outlined above, the evaluation domain $L_0$ is not a subgroup, but a coset of a subgroup. The subgroup is a smooth subgroup of some elliptic curve with points in $\mathbb{F}$:
$$
\begin{aligned}
\mathbb{E}_0(\mathbb{F}) &= \left\{ (x, y) \mid y^2 = x^3 + A \cdot x + B \right\} \subseteq \mathbb{F}^2 \\\
G &\leq \mathbb{E}_0(\mathbb{F}) \ \text{of order} \ 2^k \\
L_0 &= e + G \subseteq \mathbb{E}_0(\mathbb{F})
\end{aligned}
$$
Hence the coset $L_0$ also has order $2^k$.
A coset is used (in place of $G$) because we eventually want to evaluate polynomials at field elements, not elliptic curve points $(x, y)$,
this is achieved by mapping the curve point $(x, y)$ to its x-coordinate and we must therefore ensure that every element of $L_0$ has a unique x-coordinate – which is not the case of a subgroup of an elliptic curve: the element and its inverse have the same x-coordinate.

The homomorphisms are a sequence of isogenies, each with a kernel of size 2, between elliptic curves:
mapping $L_i \subseteq \mathbb{E}_i(\mathbb{F})$ to $L_{i+1} \subseteq \mathbb{E}_{i+1}(\mathbb{F})$
with $|L_{i+1}| = |L_i| \ / \ 2$.
Since isogenies between elliptic curves are rational functions the homomorphisms are rational function, not polynomials.

Implementation puts efficiency at a comparable level with the multiplicative subgroup case,
but obviously using a large field (like the base field of an elliptic curve) will make the STARK slower.
One problem is that the field needs to be large(ish), quadratic in the size of the domain $L_0$, to ensure that an elliptic curve with a smooth subgroup exists: Hasse’s theorem states that the order of an elliptic curve is $p + 1 \pm 2\sqrt{p}$ where $p$ is the size of the field, so to have an order divisible by $2^k$ we need $p \approx (2^k)^2$.

This construction was introduced in 2021 and is split across two papers ECFFT Part I and ECFFT Part II by Ben-Sasson, Carmon, Kopparty and Levit.

4) Unit Circle (Circle STARK / Circle FRI)

Suitable for prime fields $\mathbb{F}_p$ where $2^k \mid p + 1$. Most notably Mersenne primes: $p = 2^k - 1$.

Useful if you need a fast STARK over a prime field.

The subgroup is a set of points on the unit circle over $\mathbb{F}_p$.

The homomorphism is point doubling on the circle (and complex conjugation).

Unlike the elliptic curve case, the homomorphism is a polynomial: because complex multiplication and conjugation are polynomials, unlike addition/doubling on elliptic curves.

Essentially a simplified (genus 0) special case of the more powerful ECFFT techniques.

The first two set of techniques has been widely deployed, for instance by the following teams:

Polygon Zero. Using a prime field $\mathbb{F}_p$ with $ p = 2^{64} - 2^{32} + 1 $.

StarkWare. Using a prime field $\mathbb{F}_p$ with $ p = 2^{61} + 20 \cdot 2^{32} + 1 $.

Ulvetanna. Using binary fields $\mathbb{F}_{2^k}$.

These fields have been chosen because they enable fast arithmetic, and hence fast STARK proving. In the case of prime fields, the recent introduction of the fourth technique, Circle STARKs, will allow for even faster STARKs for a class of prime fields: Mersenne prime fields.

Circle STARKs are defined over fields $\mathbb{F} = \mathbb{Z} / (2^k - 1) \mathbb{Z}$ where $ 2^k - 1 $ is a prime number: the integers modulo a Mersenne prime $ 2^k - 1 $.
In binary the prime is simply a string of $ k $ ones:

$$
2^k - 1 = 1111\ldots 1111_2
$$

And it turns out modular reduction is as simple as ANDing with a mask, and adding the high bits shifted down e.g.
for the 31-bit Mersenne prime $ 2^{31} - 1 $ the reduction can be implemented as:

This takes a 64-bit integer and produces a partial reduced result, applying the reduction twice yields the fully reduced result. To see why this works, observe that if we define:

$$
\begin{align*}
\mathsf{low} &= x \ \text{&} \ \texttt{MASK} = x \ \mathrm{mod} \ 2^{31} \\
\mathsf{high} &= x \ \texttt{>>} \ 31 = \lfloor x \ / \ 2^{31} \rfloor
\end{align*}
$$

We can express $ x $ as:

$$
x = \mathsf{low} + 2^{31} \cdot \mathsf{high}
$$

This is obviously very fast: requiring just a couple of cycles on any modern CPU. Because it is a 31-bit prime, the unreduced result of any field multiplication is at most 62 bits and thus fits in a 64-bit integer. It is also trivially vectorizable using e.g VPMULUDQ.

If 31 bits is not enough, you can use the 61-bit Mersenne prime $ 2^{61} - 1 $, which is the largest Mersenne prime that fits in a 64-bit integer and apply similar techniques.

Take Away: We want to use Mersenne prime fields for our STARKs because the modular reduction is very fast: it really doesn’t get any better than this.

The Problem with Mersenne Primes

Usually when designing a STARK we have many possible choices of (prime) field $\mathbb{F}_p$ and we can pick one with a smooth multiplicative subgroup.
One way to do so, is to pick some $n$ such that $p = 2^m + n \cdot 2^k + 1$ is prime, then the field will have a smooth subgroup of order $2^k$ since $2^k \mid p - 1$.
For instance, the ethSTARK prime
$p = 2^{61} + 20 \cdot 2^{32} + 1$ has $n = 20$, simply because it’s the smallest $n > 0$ that makes $p$ prime.

The problem with Mersenne primes is that we cannot pick them,
only 51 are known and they grow very very fast.
So unlike the ethSTARK prime, we cannot just “pick another one” and unfortunately they do not have smooth multiplicative subgroups,
for instance, the 31-bit Mersenne prime $ p = 2^{31} - 1 $ has a multiplicative group of order:
$$
\left| \mathbb{F}_{2^{31} - 1}^* \right| =
2^{31} - 1 = 2 \cdot 3^2 \cdot 7 \cdot 11 \cdot 31 \cdot 151 \cdot 331
$$
With no large power of 2 in sight :(

We will start exploring an alternative group structure we can exploit in the case of Mersenne primes, namely the unit circle, in part II of this series.