Curve25519 implementation is both the most blessed and cursed code in the world.

On one hand, it has the cleanest math and attracted the best minds to work on it. Now the code deployed in most projects have been fully optimized, and formally verified to be mathematically correct and secure. One of the biggest accomplishment in crypto applications in recent years.

On the other hand, the monstrous, machine-generated assembly or C code will give any unsuspected programmer a heart attack.

Seriously, "you are not expected to understand this."

BTW, human-understandable implementations do exist...

Curve25519 in 18 tweets

https://www.cs.ru.nl/bachelors-theses/2014/Wesley_Janssen___4037332___Curve25519_in_18_tweets.pdf

/me now trying to implement it in PDP-11 assembly...

This paper is even better, fully annotated commentary of TweetNaCl's X25519 code. https://martin.kleppmann.com/papers/curve25519.pdf

Now I need to find a way to do carry propagation on smaller limbs. Two papers all "cheated" by representing 256-bit bignum using 16 x 64-bit integers to ensure they never overflow, avoiding the need to carry completely, but it would be too slow on the PDP-11..

Michael Hutter & Peter Schwabe's code for 8-bit Atmel chips look like a good start, but it also features multi-thousand lines of fully-unrolled and semi-machine generated code. No macros, the same thing just repeats over and over again, not the easiest thing to read... 🙈

I was staring at the screen for an entire afternoon and trying to understand why the carry propagation is performed twice in Curve25519 modular additions, no idea at all... Why is the second carry necessary?

Later, after a web search, I found a historical OpenSSL bug caused by not doing the second carry... Okay, I'm glad I'm not the only one who doesn't understand it.

Follow

PDP-11's (optional) multiplier can only perform SIGNED multiplication, to do unsigned multiplication in branchless, constant-time cryptography code, I need to do this convoluted sequence of adjustments... Totally a brain teaser. #retrocomputing #PDP11

niconiconi@niconiconi@cybre.spaceUnsigned multiplication is just terrible on the #PDP11. I wonder whether switching from 16-bit to 15-bit integers for bignum is a feasible workaround - multiplication is easier, but difficult to do carry propagation... #retrocomputing