DoubleOdd Elliptic Curves
Doubleodd elliptic curves are a large subset of elliptic curves: these are curves whose order is equal to 2 modulo 4 (i.e. the order is the double of an odd integer). On top of doubleodd elliptic curves, we can define a prime order group suitable for building cryptographic protocols. This is similar to what Ristretto achieves for twisted Edwards curves, albeit with a different method. Some highlights of doubleodd elliptic curves are:

There is no cofactor to deal with. A clean prime order group abstraction is provided, with a canonical encoding for elements which is automatically verified when decoding.

The encoding is economical: for \(n\)bit security, group elements are encoded over \(2n+1\) bits (compared to \(2n+3\) for Ristretto).

All group operations are supported by unified and complete formulas amenable to efficient and secure (constanttime) implementations.

No drastically new cryptographic hypothesis is introduced. Doubleodd elliptic curves are fairly mundane; about 1/4th of all elliptic curves are doubleodd curves. Such curves are considered secure for cryptography, just like any other random elliptic curve with a large enough subgroup of prime order.

The formulas are fast and provide performance at least on par with twisted Edwards curves. In some respects they are faster; e.g. point doublings can be performed, for half of doubleodd curves, with six multiplications (down to 1M+5S for some specific curves).
In order to leverage these techniques, two specific doubleodd elliptic curves are defined: do255e and do255s. Both provide "128bit security" (technically, 127 bits); do255e is a GLV curve that can leverage an internal endomorphism for further improved performance, while do255s is a perfectly ordinary curve.
Resources
The whitepaper contains all the formulas and demonstrations; it also specifies the use of do255e and do255s in higherlevel cryptographic functionalities (key pair generation, key exchange, signature, and hashtocurve). Most of the present site consists in extracts from this whitepaper. Note: the whitepaper is now also published on ePrint.
Several implementations of do255e and do255s exist and are listed in the implementations section.
Outline
Subsequent pages in this site include the following:

A geometrical introduction to doubleodd elliptic curves: a selfcontained explanatory text (with pictures) to explain the motivation and the core ideas of this work.

The formulas section lists useful formulas for computing over points of doubleodd elliptic curves with several convenient systems of coordinates.

Explicit curve choices are listed: these are curves do255e and do255s. Selection criteria and curve parameters are provided.

Available opensource implementations are listed.
About
Doubleodd elliptic curves were designed by Thomas Pornin, who also wrote this Web site, with contributions from Paul Bottinelli, Gérald Doussot and Eric Schorn.
A Geometrical Introduction to DoubleOdd Elliptic Curves
To give an intuition about doubleodd elliptic curves, why we use them and how we leverage their structure, we present here a geometrical interpretation.
For use in cryptography, we work over a finite field \(\mathbb{F}_q\) of order \(q\), typically integers modulo a given prime. All we explain here also works over field extensions, in which case \(q\) is a given power of a prime integer (called the field characteristic). The only restriction is that the characteristic should not be 2 or 3 (elliptic curves over such fields use different equations and rules). However, finite field are not very visual; the pictures below pretend that we are using real numbers, where there are such things as continuous functions. These pictures are still useful in getting used to the underlying ideas.
Motivation
Short Weierstraß Curves
Let's start with a classic "short Weierstraß" curve and the definition of point addition, as show below.
The curve here is the set of points \(x, y\) that fulfill the curve equation \(y^2 = x^3 + Ax + B\) for two given constants \(A\) and \(B\) that define the curve shape. As is wellknown, every elliptic curve can be converted into that kind of shape and equation with appropriate changes of variables.
Point addition is defined as follows:

The neutral element is the formal extra "pointatinfinity" (which is not in the plane and does not have coordinates).

For a point \(P\), the point \(P\) is the symmetric of the point relatively to the horizontal axis.

When adding two points \(P\) and \(Q\) together, the line \((PQ)\) is drawn; it intersects the curve in a third point which is \((P+Q)\), i.e. the opposite of the sum of \(P\) and \(Q\).
This is illustrated by the red lines on the figure. Given the coordinates of \(P\) and \(Q\), the coordinates of \(P+Q\) can be computed as rational functions of \(P\) and \(Q\), with coefficients that depend only on the curve parameters. Unfortunately, this process has exceptional cases which must be handled differently:

The pointatinfinity does not have defined coordinates, so it can be hard to apply rational functions on these nonexisting coordinates.

When \(P = Q\), the line is vertical \((PQ)\) and the third intersection point is "at infinity", again with no defined coordinates.

When \(P = Q\) (i.e. a point doubling), the line \((PQ)\) is not welldefined; if we apply the rational functions that compute point addition in general, we end up dividing zero by zero. Instead, we have to use the tangent to the curve on \(P\), and this leads to different rational functions to obtain the result.
These exceptional cases are mightily inconvenient. The pointatinfinity can somehow be handled by switching to fractional coordinates, in which each coordinate is represented by a fraction. In that case, infinity corresponds to a denominator value of zero. There are several wellknown systems of coordinates that use fractions, e.g. projective coordinates (\((x, y) = (X/Z, Y/Z)\)) or Jacobian coordinates (\((x, y) = (X/Z^2, Y/Z^3)\)). Formulas working on fractional coordinates can be established, that will handle infinities, and have the added benefit of allowing most computations to be done only with additions, subtractions and multiplications, and no divisions. A single division will be needed only at the end of a long computation, usually for encoding a point into a welldefined sequence of bytes. This promotes performance, because divisions are more expensive than multiplications. The cost of formulas is thus often expressed as a number of multiplications; it is common practice to separate general multiplications from squarings, the latter being usually somewhat faster than multiplications.
Handling the exceptional case related to point doublings is harder. This case can be detected with a simple test, but this leads to conditional execution, i.e. non constanttime behaviour, which is unsafe in all generality (timing attacks will detect that occurrence which may lead to information leaks). In general, one can apply both formulas, for general addition and for doubling, and select the right one afterwards in a constanttime way, but this is expensive. Complete formulas that do not have exceptional cases, in projective coordinates, have been published; they allow safe implementations of any curve (of odd order), but at a cost: each point addition requires 12 multiplications in the field (12M), while specialized doubling formulas (when the two operands are statically known to be the same point) need 8 multiplications and 3 squarings (8M+3S). If we forego completeness, Jacobian coordinates lead to more efficient formulas (especially doublings, in cost 3M+5S), but analysis becomes a lot harder because we must prove that no exceptional case may be inadvertently reached when using incomplete formulas. This can lead, and has led, to exploitable vulnerabilities.
Apart from this tension between performance and safety, short Weierstraß curves can have nice characteristics:

The order of a curve is close to that of the field, but can be about any integer in a range centered on that value, so that we may choose curves with a prime order. Higherlevel cryptographic functionalities building on elliptic curves usually need a prime order group.

With point compression, elements of a curve with \(2n\) points and offering "\(n\)bit security" can be encoded over \(2n+1\) bits. This is the best one can realistically hope for with elliptic curves.
Twisted Edwards Curves and Montgomery Curves
Twisted Edwards curves use a different kind of polynomial equation (of degree 4) that leads to the shape shown here; the point addition rule is somewhat different and harder to represent on a picture (I won't try). They are still elliptic curves, though, and therefore they can be turned into a Weierstraß curves with a birational transform on coordinates. As Weierstraß curves, twisted Edwards curves become Montgomery curves (which were historically discovered before Edwards curves). A wellknown Montgomery curve is Curve25519 (defined over the field of integers modulo \(2^{255}19\)) and its twisted Edwards counterpart is called Edwards25519. The Ed25519 signature algorithm uses Edwards25519, while the X25519 key exchange algorithm works over Curve25519. From a security point of view, they are the same curve.
The most interesting aspect of twisted Edwards curves is that point addition can be expressed with formulas that have no exceptional cases, and these formulas are substantially more efficient than the ones for short Weierstraß curves (8M for general point addition, 4M+4S for point doubling).
They have, however, a drawback, which is the cofactor. A twisted Edwards curve cannot have prime order, since its order is always a multiple of 4. The ratio between the curve order and the prime order of the subgroup which is relevant to cryptographic applications is called the cofactor. The cofactor for Edwards25519 is 8. The cofactor tends to be a nuisance for cryptographic protocols, because it allows the existence of several "equivalent" points that still look different from each other on the wire. This can lead to several issues, e.g.:

The cofactor induces public key malleability, which was exploited in a doublespend attack on the Monero cryptocurrency.

Several equations are possible to validate Ed25519 signatures, and while they have equivalent security, they imply that it is possible to (maliciously) make signatures that will be accepted by some verifiers, and rejected by others. This tends to break consensusbased distributed applications.
For more cases and extensive analysis, see this report from Cramer and Jackson. The gist of it is simple: we need a prime order group. When using a curve with a nontrivial cofactor, the loworder points and the malleability may require some corrective actions, which is usually a generous sprinkling of multiplications by the cofactor in various place of the upper protocol, and explicit detection of the few points whose order is a divisor of the cofactor. Even when such mitigations are not required, some extensive and nonobvious analysis is needed to ascertain this fact. In that sense, a lot of the apparent simplicity of use of twisted Edwards curves has been obtained not by removal of complexity, but by moving it into upper layers of the overal protocol design.
Decaf and Ristretto
Decaf and Ristretto are an almost perfect solution to the cofactor issues with twisted Edwards curves. They are applicable to such curves when the cofactor is exactly 4 (for Decaf) or 8 (for Ristretto). With Decaf/Ristretto, what you decode is a point on the twisted Edwards curve, which is not necessarily a member of the prime order subgroup on that curve. However, all points that differ from each other by only a low order point will encode to the same sequence of bits, and any two points that encode to a given sequence of bits differ from each other by only a low order point. This leverages all the performance and completeness of formulas on twisted Edwards curves, but provides the required abstraction for building higherlevel cryptographic functionalities: a prime order group with a canonical and verified encoding.
There are a few remaining points where Decaf/Ristretto is suboptimal:

The decoding and encoding processes both use an inverse square root in the field, which is implemented with a modular exponentiation, with a nonnegligible cost. Ordinary twisted Edwards curves, and short Weierstraß curves, also need a square root for decoding (when using a compressed encoding format), but for encoding they "only" need an inversion. Inversion is traditionally implemented with Fermat's little theorem, i.e. another exponentiation, but faster methods exist. Decaf and Ristretto cannot leverage that.

The cofactor still requires some space: for \(n\)bit security, the curve must work over a field of size \(2n+2\) bits (for Decaf) or \(2n+3\) bits (for Ristretto), leading to the same size requirement for encoding.
And, more generally, while the currently known formulas for twisted Edwards curves are fast, nothing proves that one cannot be better.
DoubleOdd Elliptic Curves
Enter doubleodd elliptic curves. This started with a question: since short Weierstraß curves can have prime order, but Montgomery/Edwards curves have cofactor at least 4, what of the "intermediate" curves with cofactor 2? Is there anything that can be done with them?
It turns out that such curves, which we named doubleodd elliptic curves, have interesting properties that help in both finding a canonical encoding for a prime order group (in a way similar to Ristretto) while also giving options for unified and complete formulas. It so happens that the resulting formulas are also quite fast, in particular with regard to computing point doublings  and point doublings tend to represent the large majority of the cost of point multiplication by a scalar, so any gains in that area are most welcome.
Curve Equation and Point Encoding
Let's consider a curve \(E\) with order \(2r\) for some odd integer \(r\) (in practice, we'll choose a curve such that \(r\) is prime).
A doubleodd curve necessarily has a single point of order 2, i.e. a point which, added to itself, yields the pointatinfinity. Equivalently, points or order 2 are points with \(y = 0\). Let's call that point \(N\).
With a simple change of variable (adding a constant to the \(x\) coordinate), we can always arrange for \(N\) to be the point \((0, 0)\), as illustrated on the figure. This transform the curve equation into another form which is similar but not identical to short Weierstraß equations: \(y^2 = x(x^2 + ax + b)\) for two constants \(a\) and \(b\).
Mathematical analysis shows that for all doubleodd elliptic curves, we get such an equation such that neither \(b\) nor \(a^2  4b\) is a square in the field. The converse is also true: all curves with equation \(y^2 = x(x^2 + ax + b)\) with nonsquare \(b\) and \(a^2  4b\) are doubleodd curves.
Given a doubleodd elliptic curve with order \(2r\), there is a subgroup of order \(r\), which consists in all the points that, multiplied by \(r\), yield the pointatinfinity. This subgroup is called the points of \(r\)torsion and is traditionally noted \(E[r]\). In particular, for any point \(P\) on \(E\):
 if \(P \in E[r]\), then \(P + N \notin E[r]\);
 if \(P \notin E[r]\), then \(P + N \in E[r]\).
The subgroup \(E[r]\) has order \(r\) (which we chose to be prime) so this looks like the prime order subgroup we want to use for cryptography.
Let's now consider a point \(P\) on the curve (distinct from \(N\), and not the pointatinfinity either), and the addition of \(N\) to \(P\).
By the geometrical definition of the point addition, we draw the line \((PN)\), which then intersects the curve on a third point, which is \((P+N) = (P)+N\). Crucially, exactly one of \(P\) and \(P+N\) is a point of \(r\)torsion. Since the line \((PN)\) goes through \(N\) (which is at a fixed, known position) and is not vertical (because \(N\) is the only point with \(x = 0\)), that line is uniquely defined by its slope \(w = y/x\). Thus, for a given slope value \(w\) that defines a line through \(N\):

either this line does not intersect the curve in any other point;

or it intersects the curve in exactly two points, exactly one of which is in \(E[r]\).
This means that the mapping of a point \((x, y) \in E[r]\) into the slope value \(w = y/x\) is injective. Thus, it can potentially be used to unambiguously encode elements of \(E[r]\).
How about decoding? It so happens (this is not obvious geometrically) that an extra property applies here: for any \(P = (x, y) \in E\), with \(P \neq N\) and not the pointatinfinity, then \(P \in E[r]\) if and only if \(x\) is a square in the field. This result is obtained analytically from the following:

Every element of \(E[r]\) is the double of some other point (because \(P = (r+1)P\), and \(r+1\) is an even integer), and the \(x\) coordinate of the double of \((x, y)\) is computed as \(((x^2b)/(2y))^2\), which is a square.

If \(P = (x, y)\) then \(P+N = (b/x, by/x^2)\), and since \(b\) is not a square, if \(x\) is a square then \(b/x\) is not, and vice versa.
At that point, we can encode and decode elements of \(E[r]\):
 Encoding of \((x, y)\) is simply the slope value \(w = y/x\).
 Decoding is using the curve equation to find the two points corresponding to a given slope \(w = y/x\) (it is a degree2 equation, which involves a square root computation), then keeping the solution \(x\) which is a square.
Note that encoding only requires a single inversion (thus, this is as fast as with twisted Edwards curves) while decoding uses a square root and a square detection (Legendre symbol), which would make it slower than twisted Edwards curves and Decaf/Ristretto. However, practical implementation of inversion and Legendre symbol can be quite fast; on small microcontrollers based on the ARM Cortex M0+, inversion and Legendre symbol can be computed in, respectively, 1/5th and 1/6th of the cost of a square root (see this). Thus, the total cost of decoding and encoding can realistically be lower than that of Decaf/Ristretto, and close to what is obtained with plain Weierstraß curves or twisted Edwards curves.
Since encoding uses \(w\), we can use it directly as a point coordinate, as a replacement to \(y\). This does not lose information, and it makes the computation of \(P+N\) from \(P\) even easier: for a point \(P = (x, w)\), we have \(P+N = (b/x, w)\).
Use of \(w\) adds an extra difficulty, which we will have to deal with, namely that \(w\) is not defined for point \(N\) (even though that point has welldefined \(x\) and \(y\) coordinates). We then have two tricky points: the pointatinfinity, and \(N\). As will be exposed later on, we can avoid the pointatinfinity altogether. Point \(N\) can be represented with fractional representations with a denominator equal to 0. Another solution (which we also leverage) is to use \(u = x/y\) (instead of \(w = y/x\)) which can be extended to \(u = 0\) for the point \(N\) without breaking formulas.
Point Addition and Unified Formulas
Let's consider the exceptional case of adding a point \(P\) to itself.
The classical definition involves taking the tangent to the curve on point \(P\) as the line that leads to \((P+P)\). This is inconvenient because the formulas that yield the slope of that tangent are quite different from those that are used for "normal" point addition, so you have to know (or check) that the two addition operands are the same point. This leads to either unsafe implementations (non constanttime), or computational overhead and decreased performance.
However, on doubleodd elliptic curves, we can use \(N\). As explained previously, from a point \(P\), we can obtain \(P+N\) easily. This leads to the idea of computing an addition \(P+Q\) as the sum of \(P\) and \(Q+N\).
As is illustrated on this figure, \(P\) and \(P+N\) are distinct points, and not opposite to each other, so that addition of \(P\) with \(P+N\) is no longer an exceptional case, and no tangent line is involved.
More generally, when \(P\) and \(Q\) are two elements of \(E[r]\), then we can add \(P\) with \(Q+N\) and it is guaranteed that \(P\) and \(Q+N\) are distinct points, and not opposite to each other, since \(P \in E[r]\) but \(Q+N \notin E[r]\). This naturally avoids exceptional cases related to point doubling; this is our main method to achieve unified formulas (formulas for which the only exceptional cases involve the pointatinfinity).
This trick requires two additions of \(N\) (one on one of the operands, and another on the result to get back to \(E[r]\)). Though these operations are inexpensive, we can do slightly better, with a trick which we will integrate in the formal definition of our prime order group.
A Prime Order Group
We define our prime order group \(\mathbb{G}\) as the points of \(E\) which are not in \(E[r]\): \[ \mathbb{G} = \{\, P+N \,\, P \in E[r] \,\} \] Each element of \(\mathbb{G}\) can be written as \(P+N\) with \(P\) being a point in \(E[r]\). We can thus define the group law on \(\mathbb{G}\) as follows:

Neutral element is \(N\).

The opposite of \(P+N\) is \(P+N\) (which is equal to \((P+N)\) and thus corresponds to the negation of the \(y\), \(w\) or \(u\) coordinate, depending on whichever coordinate system we are using).

The addition in \(\mathbb{G}\) (here denoted with \( * \)) is: \[\begin{eqnarray*} (P+N) * (Q+N) &=& (P+N) + ((Q+N)+N) \\ &=& P + (Q+N) \\ \end{eqnarray*}\]
This definition means that a single extra addition with \(N\) is needed to compute the group law. Maybe more importantly, it also removes the need to deal with the pointatinfinity, except in a transient way: the pointatinfinity is not part of the group, and the neutral \(N\) has welldefined \(x, y\) coordinates. This simplifies formulas and helps with achieving completeness.
On this group, we can derive formulas that achieve the following results:

In Jacobian \((x, w)\) coordinates, a group element is represented by \((X{:}W{:}Z)\) with \(x = X/Z^2\) and \(w = W/Z\) (for the point \(N\), we have \(X = Z = 0\) and \(W \neq 0\)). This yields:

Unified point addition formulas in cost 8M+6S (mixed addition in 8M+3S).

Complete point doubling formulas in cost 2M+5S. For half of the curves, this can be reduced to 1M+6S. For some specific curves, 2M+4S can be achieved.

Additional savings are obtained when computing multiple successive doublings. For half of the curves, \(n\) doublings cost \(n\)(4M+2S)+1M. For some specific curves, this gets down to \(n\)(1M+5S)+1S.


In fractional \((x, u)\) coordinates, a group element is represented by \((X{:}Z{:}U{:}T)\) with \(x = X/Z\) and \(u = U/T\), and \(ZT \neq 0\). Formulas yield:

Complete point addition in cost 10M (mixed addition in 8M).

Complete point doubling in cost 3M+6S. When computing a sequence of successive doublings, the same perdoubling marginal costs as in Jacobian \((x, w)\) coordinates are obtained, albeit with a higher initial overhead (e.g. \(n\)(1M+5S)+3M, i.e. overhead of 3M instead of 1S at the start of the sequence).


A ladder \(x\)only algorithm can be performed in cost 8M+2S per bit. This is somewhat slower than Montgomery curves (in 5M+4S) but still a tolerable option for small systems with severe RAM constraints. For key exchange, this ladder can also avoid the square root computation involved in point decoding.
Formulas
We can derive many formulas working in several coordinate systems. We give in the following pages formulas for:
 Affine \((x, y)\) coordinates
 Affine and Jacobian \((x, w)\) coordinates
 Affine and fractional \((x, u)\) coordinates
 xonly point multiplication ladders
Proofs for these formulas, as well as extra structures and other mathematical considerations, are explained at length in the whitepaper.
In all these formulas, the following apply:

The base curve \(E\) has equation \(y^2 = x(x^2 + ax + b)\). Group \(\mathbb{G}\) consists of the point of \(E\) which are not points of \(r\)torsion.

The neutral is \(N = (0, 0)\). The group law is called "addition" and denoted as such. Group \(\mathbb{G}\) is homomorphic to the subgroup of points of \(r\)torsion \(E[r]\) and thus has the same resistance to discrete logarithm.
Affine (x, y) Coordinates
Given points \(P_1 = (x_1, y_1)\) and \(P_2 = (x_2, y_2)\) in \(\mathbb{G}\), their sum in the group \(P_3 = (x_3, y_3) = P_1 + P_2\) is computed with: \[\begin{eqnarray*} x_3 &=& \frac{b((x_1 + x_2)(x_1 x_2 + b) + 2a x_1 x_2 + 2 y_1 y_2)}{(x_1 x_2  b)^2} \\ y_3 &=& \frac{b(2a(x_1 y_2 + x_2 y_1)(x_1 x_2 + b) + (x_1^2 y_2 + x_2^2 y_1)(x_1 x_2 + 3b) + (y_1 + y_2)(3b x_1 x_2 + b^2))}{(x_1 x_2  b)^3} \end{eqnarray*}\]
These formulas are complete. Denominators can never be zero (because \(x_1 x_2\) is a quadratic residue, but \(b\) is not).
Affine and Jacobian (x, w) Coordinates
In \((x, w)\) coordinates, the curve equation becomes: \[ w^2 x = x^2 + ax + b \] The neutral \(N\) does not have a defined affine \(w\) coordinate. For points in \(\mathbb{G}\) other than \(N\), the \(w\) coordinates is nonzero.
Encoding and Decoding
Encoding of \((x, w)\) is a simple encoding of the \(w\) coordinate.
Decoding proceeds as follows:

If \(w = 0\), then return the point \(N\).

Solve for \(x\) the equation \(x^2  (w^2  a)x + b = 0\):
 Compute \(\Delta = (w^2  a)^2  4b\) (which is necessarily nonzero).
 If \(\Delta\) is not a quadratic residue, then there is no solution (the provided encoding is invalid). Otherwise, there are two solutions for \(x\), depending on which square root of \(\Delta\) is used: \[ x = \frac{w^2  a \pm \sqrt{\Delta}}{2} \]
 Exactly one of the two solutions is a quadratic residue; use the other one.
Affine Coordinates
Given points \(P_1 = (x_1, w_1)\) and \(P_2 = (x_2, w_2)\) in \(\mathbb{G}\), their sum in the group \(P_3 = (x_3, w_3) = P_1 + P_2\) is computed with: \[\begin{eqnarray*} x_3 &=& \frac{b x_1 x_2 (w_1 + w_2)^2}{(x_1 x_2  b)^2} \\ w_3 &=& \frac{(w_1 w_2 + a)(x_1 x_2 + b) + 2b(x_1 + x_2)}{(w_1 + w_2)(x_1 x_2  b)} \end{eqnarray*}\]
These formulas are unified: they work properly for all points except when \(P_1\), \(P_2\) or \(P_3\) is the neutral \(N\).
Specialized doubling formulas, to compute the double \((x', w')\) of the point \((x, w)\): \[\begin{eqnarray*} x' &=& \frac{4bw^2}{(2x + a  w^2)^2} \\ w' &=&  \frac{w^4 + (4ba^2)}{2w(2x + a  w^2)} \end{eqnarray*}\]
Again, these formulas are only unified, since \(N\) does not have a defined \(w\) coordinate. Note that if the input point \(P\) is not \(N\), then its double \(2P\) is not \(N\) either, since the group \(\mathbb{G}\) has odd order \(r\).
Jacobian Coordinates
Jacobian coordinates represent point \((x, w)\) as the triplet \((X{:}W{:}Z)\) with \(x = X/Z^2\) and \(w = W/Z\). The point \(N\) is represented by \((0{:}W{:}0)\) for any \(W \neq 0\). Jacobian coordinates correspond to an isomorphism between curve \(E(a, b)\) and curve \(E(aZ^2, bZ^4)\).
The addition of \(P_1 = (X_1{:}W_1{:}Z_1)\) with \(P_2 = (X_2{:}W_2{:}Z_2)\) into \(P_3 = (X_3{:}W_3{:}Z_3)\) is obtained with the following: \[\begin{eqnarray*} X_3 &=& b X_1 X_2 (W_1 Z_2 + W_2 Z_1)^4 \\ W_3 &=& ((W_1 W_2 + a Z_1 Z_2)(X_1 X_2 + b Z_1^2 Z_2^2) + 2b Z_1 Z_2 (X_1 Z_2^2 + X_2 Z_1^2)) \\ Z_3 &=& (X_1 X_2 b Z_1^2 Z_2^2)(W_1 Z_2 + W_2 Z_1) \end{eqnarray*}\] These formulas can be computed with cost 8M+6S. If using Chudnovsky coordinates \((X{:}W{:}Z{:}Z^2)\) (i.e. we also keep around the value \(Z^2\) in addition to \(Z\)), then cost is lowered to 8M+5S, but inmemory representation is larger.
Addition formulas are unified, because they do not compute the correct result when one or both of the operands are equal to \(N\). However, if \(P_1 \neq N\) and \(P_2 = P_1\), then a valid representation of \(N\) is computed. Thus, the only exceptional cases that must be handled in an implementation are \(P_1 = N\) and \(P_2 = N\). These two cases can be handled inexpensively with a couple of conditional copies, to obtain a complete routine which induces no tension between security and performance.
For doubling point \((X{:}W{:}Z)\) into \((X'{:}W'{:}Z')\), formulas are: \[\begin{eqnarray*} X' &=& 16bW^4 Z^4 \\ W' &=& (W^4 + (4ba^2)Z^4) \\ Z' &=& 2WZ(2X + aZ^2  W^2) \end{eqnarray*}\] These formulas are complete: they work properly for all inputs, including the neutral element \(N\).
Generic implementation of point doubling is easily done in cost 2M+5S. If the base field is such that \(q = 3\bmod 4\), then \(4b  a^2\) is a square, and, provided that we have \(e = \sqrt{4b  a^2}\) and that the constant \(e\) is small, then we can compute \(W'\) as \((e/2)(2WZ)^2  (Z^2 + eW^2)^2\), which leads to an implementation in cost 1M+6S.
More interestingly, if parameters are such that \(a\) is a square, and that \(2b = a^2\), then we can obtain an implementation of doubling in cost 2M+4S. Analysis shows that such a situation can happen only if \(q = 3\bmod 8\), and there will be a single curve (up to isomorphism) that matches these conditions; we can thus assume that the curve parameters are \(a = 1\) and \(b = 1/2\). In that case, the point doubling algorithm becomes the following:
 \(t_1 \leftarrow WZ\)
 \(t_2 \leftarrow t_1^2\)
 \(X' \leftarrow 8 t_2^2\)
 \(t_3 \leftarrow (W + Z)^2  2t_1\)
 \(W' \leftarrow 2t_2  t_3^2\)
 \(Z' \leftarrow 2t_1 (2X  t_3)\)
Successive doublings rely on the definition of some isogenies: \[\begin{eqnarray*} \psi_\pi : E(a, b) &\longrightarrow& E(2a\pi^2, \pi^4(a^24b))[r] \\ (x, w) &\longmapsto& \left(\pi^2 w^2, \frac{\pi (x  b/x)}{w}\right) \end{eqnarray*}\] for a constant \(\pi \neq 0\) (and defining that the image of the pointatinfinity and the image of \(N\) are both the pointatinfinity in the target curve). These functions are isogenies on the curves (that is, using classic point addition) and the image is the subgroup of points of \(r\)torsion in the destination curve. If two constants \(\pi\) and \(\pi'\) are such that \(2\pi\pi' = 1\), then for any \(P \in E(a, b)\), \(\psi_{\pi'}(\psi_{\pi}(P)) = 2P\). In other words, composing the two isogenies brings us back to the original curve, and together they compute a point doubling (on the curve).
To obtain a similar effect with our group \(\mathbb{G}\), we also define the following functions: \[\begin{eqnarray*} \theta_\pi : E(a, b) &\longrightarrow& \mathbb{G}(2a\pi^2, \pi^4(a^24b)) \\ (x, w) &\longmapsto& \left(\frac{\pi^2(a^24b)}{w^2}, \frac{\pi (x  b/x)}{w}\right) \end{eqnarray*}\] using the notation that \(\mathbb{G}(a, b)\) is our prime order group, defined over a base doubleodd elliptic curve with parameters \(a\) and \(b\). With these functions, we get that both \(\theta_{1/2}(\psi_1(P))\) and \(\theta_{1/2}(\theta_1(P))\) compute the double (in \(\mathbb{G}\)) of point \(P\).
We can then compute a succession of doublings by a sequence of invocations of these functions, freely switching between \(\psi\) and \(\theta\) as best suits computations, in order to minimize cost. This yields complete formulas; the neutral \(N\) is represented by \((0{:}W{:}0)\) for any \(W \neq 0\), and the pointatinfinity (which will appear whenever applying a \(\psi\) function on the neutral \(N\)) is represented by \((W^2{:}W{:}0)\) for any \(W \neq 0\). Using these techniques, we can obtain the following costs for computing \(n\) successive doublings, depending on the base field and the curve parameters:
 \(n\)(2M+5S) (for all curves)
 \(n\)(1M+6S) (if \(q = 3\bmod 4\))
 \(n\)(4M+2S)+1M (if \(q = 3\) or \(5\bmod 8\))
 \(n\)(1M+5S)+1S (if \(a = 0\))
 \(n\)(2M+4S) (if \(a = 1\) and \(b = 1/2\))
Please refer to the whitepaper (section 4.1) for additional details on these formulas.
Affine and Fractional (x, u) Coordinates
It can be shown that a doubleodd elliptic curve is birationally equivalent to a subgroup of a twisted Edwards curve defined in a quadratic extension \(\mathbb{F}_{q^2}\). The general point addition formulas on that twisted Edwards curve can then be shown to be complete as long as we work over that specific subgroup. From these formulas, we can derive back a coordinate system on the original doubleodd elliptic curve, with only values in \(\mathbb{F}_q\), that yield complete formulas for general point addition in the group \(\mathbb{G}\). These are the \(x, u\) coordinates.
Affine Coordinates
The \(x\) coordinate is the same one as in the affine \(x, y\) coordinates. The \(u\) coordinate is \(u = x/y\). For the neutral \(N\), the \(u\) coordinate is set to 0. Given points \(P_1 = (x_1, u_1)\) and \(P_2 = (x_2, u_2)\) in \(\mathbb{G}\), their sum in the group \(P_3 = (x_3, u_3)\) can be obtained with the following formulas: \[\begin{eqnarray*} x_3 &=& \frac{b((x_1 + x_2)(1 + a u_1 u_2) + 2 u_1 u_2 (x_1 x_2 + b))} {(x_1 x_2 + b)(1  a u_1 u_2)  2b u_1 u_2 (x_1 + x_2)} \\ u_3 &=& \frac{(u_1 + u_2)(x_1 x_2  b)} {(x_1 x_2 + b)(1 + a u_1 u_2) + 2b u_1 u_2 (x_1 + x_2)} \end{eqnarray*}\] These formulas are complete.
Fractional Coordinates
In order to perform group operations efficiently, we use a fractional representation of the \((x, u)\) coordinates. Namely, a point \((x, u)\) is represented by the quadruplet \((X{:}Z{:}U{:}T)\) with \(ZT \neq 0\), and such that \(x = X/Z\) and \(u = U/T\).
Since \(u = 1/w\), decoding of a point from its external representation (encoding of the value \(w\)) into fractional \((x, u)\) is done in the same way as in \((x, w)\) coordinates; when \(x\) and \(w\) have been obtained, the fractional coordinates are set to \((x{:}1{:}1{:}w)\) (with only a special treatment for the case of the neutral \(N\)).
The addition of \(P_1 = (X_1{:}Z_1{:}U_1{:}T_1)\) with \(P_2 = (X_2{:}Z_2{:}U_2{:}T_2)\) into \(P_3 = (X_3{:}Z_3{:}U_3{:}T_3)\) is obtained with the following addition formulas: \[\begin{eqnarray*} X_3 &=& b((X_1 Z_2 + X_2 Z_1)(T_1 T_2 + a U_1 U_2) + 2 U_1 U_2 (X_1 X_2 + b Z_1 Z_2)) \\ Z_3 &=& (X_1 X_2 + b Z_1 Z_2)(T_1 T_2  a U_1 U_2)  2b U_1 U_2 (X_1 Z_2 + X_2 Z_1) \\ U_3 &=& (U_1 T_2 + U_2 T_1)(X_1 X_2  b Z_1 Z_2) \\ T_3 &=& (X_1 X_2 + b Z_1 Z_2)(T_1 T_2 + a U_1 U_2) + 2b U_1 U_2 (X_1 Z_2 + X_2 Z_1) \end{eqnarray*}\] Like their affine counterparts, these formulas are complete.
They can be implemented in cost 10M by using the following algorithm, with the help of two precomputed constants \(\alpha = (4ba^2)/(2ba)\) and \(\beta = (a2)/(2ba)\):
 \(t_1 \leftarrow X_1 X_2\)
 \(t_2 \leftarrow Z_1 Z_2\)
 \(t_3 \leftarrow U_1 U_2\)
 \(t_4 \leftarrow T_1 T_2\)
 \(t_5 \leftarrow (X_1+Z_1)(X_2+Z_2)  t_1  t_2\)
 \(t_6 \leftarrow (U_1+T_1)(U_2+T_2)  t_3  t_4\)
 \(t_7 \leftarrow t_1 + b t_2\)
 \(t_8 \leftarrow t_4 t_7\)
 \(t_9 \leftarrow t_3 (2bt_5 + at_7)\)
 \(t_{10} \leftarrow (t_4 + \alpha t_3)(t_5 + t_7)\)
 \(X_3 \leftarrow b(t_{10}  t_8 + \beta t_9)\)
 \(Z_3 \leftarrow t_8  t_9\)
 \(U_3 \leftarrow t_6(t_1  bt_2)\)
 \(T_3 \leftarrow t_8 + t_9\)
The cost of 10M assumes that multiplication by constants \(a\), \(b\), \(\alpha\) and \(\beta\) is inexpensive, i.e. that these are all small integers or easily computed fractions such as \(1/2\). Also, \(\alpha\) and \(\beta\) are not defined in \(a = 2b\); in such a case, the implementation can still use the algorithm above by working on an isomorphic curve where \(a \neq 2b\). Indeed, the transform \((x, u) \mapsto (2x, u/2)\) is an easily computable isomorphism from \(E(a, b)\) into \(E(4a, 16b)\), which sidesteps the issue.
For point doublings, applying the general algorithm leads to a cost of 4M+6S (the first six multiplications are squarings in that case). However, we can do better, with cost 3M+6S. If the source point is \((X{:}Z{:}U{:}T)\), then we can compute its double in the group \((X''{:}Z''{:}U''{:}T'')\) as follows: \[\begin{eqnarray*} X' &=& (a^24b) XZ \\ Z' &=& X^2 + aXZ + bZ^2 \\ X'' &=& 4b X'Z' \\ Z'' &=& X'^2  2a X'Z' + (a^24b) Z'^2 \\ U'' &=& 2(a^2  4b) (X^2  bZ^2) Z'U \\ T'' &=& (X'^2  (a^2  4b)Z'^2) T \end{eqnarray*}\] These formulas, internally, apply the \(\theta_1\) isogeny, then \(\theta_{1/2}\); together, these two isogenies compute a point doubling in \(\mathbb{G}\).
As with Jacobian coordinates, sequences of successive doublings allow for extra optimizations. In fractional \((x, u)\) coordinates, we can apply the \(\psi_1\) isogeny in cost 4M+2S, with an output in Jacobian \((x, w)\) coordinates. We can then apply the \(\psi\) and \(\theta\) isogenies repeatedly, to achieve the same marginal perdoubling costs as in Jacobian \((x, w)\) coordinates. The final isogeny can be modified to produce an output back in fractional \((x, u)\) coordinates with very low overhead (at worst one extra squaring). This leads to the following costs for \(n\) successive doublings:
 \(n\)(2M+5S)+2M+1S (for all curves)
 \(n\)(1M+6S)+4M1S (if \(q = 3\bmod 4\))
 \(n\)(4M+2S)+2M+2S (if \(q = 3\) or \(5\bmod 8\))
 \(n\)(1M+5S)+3M (if \(a = 0\))
 \(n\)(2M+4S)+2M+2S (if \(a = 1\) and \(b = 1/2\))
The persequence overheads are higher than with Jacobian \((x, w)\) coordinates; thus, it is beneficial to organize operations such that doublings are always performed in long sequences.
Please refer to the whitepaper (section 4.2) for additional details on these formulas.
xonly Ladders
A ladder algorithm working only over the \(x\) coordinates of points can be used to compute DiffieHellman key exchanges over doubleodd elliptic curves. Such ladders are already known, but our formulas are a bit faster than what was previously published for such curves.
Suppose that we are computing a multiplication of point \(P_0\) whose \(x\) coordinate is known as a field element \(x_0\). At any point in a doubleandadd algorithm, our current accumulator point is represented as the \(x\) coordinates of two points \(P_1\) and \(P_2\) which are such that \(P_2 = P_0 + P_1\) (addition in the group \(\mathbb{G}\)). These \(x\) coordinates are known as fractions, i.e. \(x_1 = X_1/Z_1\) and \(x_2 = X_2/Z_2\). We can then compute the \(x_3 = X_3/Z_3\) coordinate of \(P_3 = P_1 + P_2\) with: \[\begin{eqnarray*} X_3 &=& b (X_1 Z_2  X_2 Z_1)^2 \\ Z_3 &=& x_0 (X_1 X_2  b Z_2 Z_1)^2 \end{eqnarray*}\] with a cost of 4M+2S.
For doubling either \(P_1\) or \(P_2\), we observe that when applying the isogeny \(\theta_1\) on a point \(P\) whose \(x\) coordinate is \(X/Z\), to obtain a point \(P'\) with \(x' = X'/Z'\) on the dual curve, we can compute \(X'\) and \(Z'\) as: \[\begin{eqnarray*} X' &=& (a^2  4b) XZ \\ Z' &=& X^2 + aXZ + bZ^2 \\ &=& (X + Z)(X + bZ) + (a  1  b)XZ \end{eqnarray*}\] with a cost of 2M. We can similarly compute \(\theta_{1/2}\) on that point \(P'\) with cost 2M, and that yields the \(x\) coordinate of \(2P\). In total, we can have the \(x\) coordinate of \(2P\), starting from the \(x\) cordinate of \(P\), in cost 4M.
Each ladder step combines the two operations above: we replace \((P_1, P_2)\) with either \((2P_1, P_1+P_2)\) (if the scalar bit is 0) or with \((P_1+P_2, 2P_2)\) (if the scalar bit is 1). Either way, the cost per scalar bit is 8M+2S. This is not as efficient as using Jacobian \((x, w)\) or fractional \((x, u)\) coordinates, but it allows implementation of key exchange with less code, and in a more RAMefficient way, compared with doubleandadd algorithms with window optimizations.
Since we encode points as their \(w\) coordinate, not \(x\) coordinate, it is convenient to define the DiffieHellman key algorithm to output, as shared secret, the value \(w^2\), for the \(w\)coordinate of the resulting point. Indeed, with the \(\theta_1\) isogeny, the \(x\) coordinate of \(\theta_1(x, w)\) is equal to \((a^24b)/w^2\); thus, we can make a \(w^2\)only ladder on curve \(E(a,b)\) by simply using an \(x\)only ladder (with the formulas shown above) on the dual curve \(E(2a, a^24b)\). In that way, a DiffieHellman key exchange (multiplication of the point from the peer by the local private key) can be implemented with a ladder algorithm without requiring full point decoding. This saves a square root computation. Namely, when the peer's point is received as value \(w\), it suffices to verify that \(w \neq 0\) and that \((w^2  a)^2  4b\) is a quadratic residue (with a Legendre symbol), thereby validating that the value \(w\) is correct and could be decoded into a full point, but it is not necessary to actually decode it and retrieve the \(x\) coordinate, which is not needed for a \(w^2\)only ladder.
Curve Choices
We define two specific sets of curve parameters for doubleodd curves that leverage the formulas for good performance: do255e and do255s. The do255e curve is a GLV curve (with equation \(y^2 = x(x^2 + b)\)) which permits use of our doubling formulas with the lowest perdoubling cost (1M+5S). GLV curves feature some extra structure, namely a fast endomorphism on the curve that can be leveraged to considerably speed up point multiplication. Since this extra structure has historically generated unease about security (no weakness was found, but these curves are still admittedly "special" in a fuzzy way), we also define do255s, an ordinary curve with no such structure. do255s uses the equation \(y^2 = x(x^2  x + 1/2)\), for which we have doubling formulas in cost 2M+4S.
In both cases, the base field is chosen to be the field of integers modulo \(q = 2^{255}  m\) for a small integer \(m\); the value of \(m\) is the main selection parameter. Fastest known implementations on both 64bit x86 CPUs and ARM Cortex M0+ CPUs use registersized limbs and will provide the same performance for all values of \(m\), as long as \(m < 2^{15}\). Moreover, there are some slight benefits, when doing modular reduction, to a 255bit modulus (as opposed to a 256bit modulus), because that allows an efficient partial reduction. Since the curve order is close to \(q\) and that order is \(2r\), where \(r\) is the order of the final group, we will obtain a group of size about \(2^{254}\), which is sufficient to claim "128bit security" (in the same way that Curve25519 / Edwards25519 also claims the 128bit security level with a subgroup prime order of about \(2^{252}\)).
We thus choose the target values of \(a\) and \(b\) such that all constants in applicable formulas lead to fast operations, then explore increasing values of \(m\) until a match is found, i.e. \(q = 2^{255}  m\) is prime and the curve equation leads to order \(2r\) with \(r\) prime.
Once the field is chosen, it only remains to define a conventional generator \(G\) for the group. Which point we use is not important for security (discrete logarithm relatively to one generator is equivalent to discrete logarithm relatively to any other generator). To make the selection deterministic, we use the point in the group \(\mathbb{G}\) with the lowest nonzero \(u\) coordinate, interpreted as an integer in the \(0...q1\) range.
Specific parameters are provided in the following subpages:
do255e
The curve do255e is a GLV curve, i.e. with equation \(y^2 = x(x^2 + b)\). We apply the following criteria:

Modulus \(q = 2^{255}  m\) should be equal to 5 modulo 8. For the GLV curve not to be supersingular, we need \(q\) to be equal to 1 or 5 modulo 8; the second choice makes computation of square roots in the field easier to implement.

Since the \(j\)invariant of such a curve is fixed (it's 1728), there are only two potential curves (up to isomorphisms) to check for a given field. We can thus always enforce that \(b = 2\) or \(2\), which are convenient values for implementation.

Curve order must be equal to \(2r\) for a prime integer \(r\).
Under these criteria, the first match is for \(m = 18651\). Here are the resulting curve parameters:
 Name: do255e
 Field: integers modulo \(q = 2^{255}  18651\)
 Equation: \(y^2 = x(x^2  2)\)
 Order: \(2r\), with \(r = 2^{254}  131528281291764213006042413802501683931\)
 Generator: \[\begin{eqnarray*} G_x &=& 2 \\ G_y &=& 2 \end{eqnarray*}\]
do255s
The curve do255s is an ordinary curve with no easily computable endomorphism. Its equation parameters are such that \(a\) is not a quadratic residue, and \(a^2 = 2b\), so that point doubling formulas in cost 2M+4S (in Jacobian \((x, w)\) coordinates) may be used. As exposed in the whitepaper, this implies that the curve \(j\)invariant is 128, and that there is only a single curve per field (up to isomorphisms) that matches these criteria, and we can thus enforce that \(a = 1\) and \(b = 1/2\).
We apply the following criteria:

Curve equation is \(y^2 = x(x^2  x + 1/2)\).

Modulus \(q = 2^{255}  m\) should be equal to 3 modulo 8. This is needed for the curve with that equation to be a doubleodd curve.

Curve order must be equal to \(2r\) for a prime integer \(r\).
Under these criteria, the first match is for \(m = 3957\). Here are the resulting curve parameters:
 Name: do255s
 Field: integers modulo \(q = 2^{255}  3957\)
 Equation: \(y^2 = x(x^2  x + 1/2)\)
 Order: \(2r\), with \(r = 2^{254} + 56904135270672826811114353017034461895\)
 Generator: \[\begin{eqnarray*} G_x &=& 26116555989003923291153849381583511726884321626891190016751861153053671511729 \\ G_y &=& 28004200202554007000979780628642488551173104653237157345493551052336745442580 \end{eqnarray*}\]
Implementations
The following opensource (MIT license) implementations of curves do255e and do255s are available:

Reference implementation in Python:
https://github.com/doubleodd/pydo255
This implementation is only for reference purposes; it computes correct values, but does so in a suboptimal way and, crucially, is not free of timingbased side channels, since it relies on Python's big integers.
This repository includes comprehensive test vectors for curves do255e and do255s (both lowlevel operations such as point addition, and highlevel functionalities: key exchange, signature, hashtocurve). The script that generates these vectors leverages the Python implementation.

Go implementation:
https://github.com/doubleodd/godo255
The Go implementation is considered correct and secure; all the code is strictly constanttime. It is pure Go code that should run on any supported Go platform. It internally uses 64bit types with
math/bits
functionalities (especiallyAdd64()
,Sub64()
andMul64()
) which provide decent performance in a portable way on 64bit platforms. 
C and assembly implementations:
https://github.com/doubleodd/cdo255
This repository contains some extensive demonstration code in C and assembly:
 C code (with 64bit intrinsics) for 64bit x86
 C code with some inline assembly for 64bit x86 with BMI2 and ADX opcodes (e.g. Intel Skylake and later cores)
 C code (with 32bit intrinsics) for 32bit x86
 Assembly code for ARM Cortex M0+
While this implementation was written for research and demonstration purposes, it follows APIs and conventions appropriate for general inclusion in applications. All the code is strictly constanttime. This is the code that was used for the benchmarks reported in the whitepaper.