Benchmarks the BLS12-381 cryptography that Cashu would need if it migrates
BDHKE from secp256k1 to BLS12-381. The benches implement NUT-00's BLS12-381
(v3) protocol (cashubtc/nuts PR #371, keysets with version byte 02) and run
on an ESP32-C3, including a path that offloads the field arithmetic to the
chip's RSA/MPI peripheral.

esp32c3-bench-blst/ is the one to look at. It implements NUT-00's BLS12-381 (v3)
protocol faithfully (see esp32c3-bench-blst/src/main.rs), and a startup gate checks
it against the spec's test vectors (tests/00-tests.md) byte-for-byte:
- Multiplicative blinding: B_ = r·Y, C_ = a·B_, C = r⁻¹·C_ = a·Y. No point additions in the BDHKE steps, no − r·K unblind.
- Mint pubkey K = a·G2 on G2 only: 96-byte keyset keys; no G1 mint key. (Additive blinding would need the key on both G1 and G2 = 144 bytes; that's why the spec is multiplicative.)
- Hash-to-G1 via RFC 9380 SSWU, DST CASHU_BLS12_381_G1_XMD:SHA-256_SSWU_RO_.
- Mandatory point validation (NUT-00 §Point Validation, flagged CRITICAL): every received B_/C_/C/K is decompressed from canonical bytes and rejected unless on-curve, non-identity, and in the prime-order subgroup (uncompress + in_g1/in_g2). The mint validates B_ before signing; the wallet validates K, C_, and C.
- Wallet verify e(C, G2) == e(Y, K); batch verify collapses N proofs into 1 + U Miller loops (U = unique keysets), with the random-linear-combination weights derived deterministically via a Fiat-Shamir SHA-256 transcript + per-proof rejection sampling in Fr* (BLS_BATCH_DST).
- No DLEQ for v3: NUT-12 scopes DLEQ to secp256k1; the pairing check replaces it.
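
The multiplicative blinding algebra in the first bullet can be checked with a toy model. The sketch below is not BLS12-381 and has no security: it assumes a "point" x·G can be stood in for by the scalar x modulo a small prime Q, which is enough to show that r⁻¹·(a·(r·Y)) = a·Y. The modulus, the function names, and the sample values are all illustrative assumptions, not taken from the bench.

```rust
// Toy model: the point x·G is represented by the scalar x modulo a small prime.
// This only mirrors the group algebra of the blinding steps; it is NOT BLS12-381.
const Q: u64 = 2_147_483_647; // 2^31 - 1, stand-in for the group order (assumption)

fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % Q as u128) as u64
}

// Square-and-multiply exponentiation modulo Q.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= Q;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base);
        }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

// Inverse via Fermat's little theorem (Q is prime, a must be non-zero mod Q).
fn inv_mod(a: u64) -> u64 {
    pow_mod(a, Q - 2)
}

fn blind(y: u64, r: u64) -> u64 { mul_mod(r, y) }               // B_ = r·Y
fn sign(a: u64, b_: u64) -> u64 { mul_mod(a, b_) }              // C_ = a·B_
fn unblind(c_: u64, r: u64) -> u64 { mul_mod(inv_mod(r), c_) }  // C  = r⁻¹·C_

fn main() {
    // Arbitrary sample values: hashed message point, mint secret, blinding factor.
    let (y, a, r) = (123_456_789u64, 987_654_321u64, 555_555_555u64);
    let b_ = blind(y, r);   // wallet
    let c_ = sign(a, b_);   // mint
    let c = unblind(c_, r); // wallet
    assert_eq!(c, mul_mod(a, y)); // unblinded signature equals a·Y
    println!("unblinded C matches a·Y: {}", c == mul_mod(a, y));
}
```

Note that the unblind step uses only a scalar inversion and one scalar multiplication; the mint key never appears on the wallet side of the blinding, which is consistent with K living on G2 only.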
Backend: blst 0.3.16, vendored + patched (esp32c3-bench-blst/vendor/blst/,
wired via [patch.crates-io]). On RV32IMC blst has no asm path, so it falls back
to portable C — and the patch routes blst's Montgomery multiply and squaring
(mul_mont_n, mul_mont_nonred_n, sqr_mont_382x) through the C3's RSA/MPI
peripheral via mpi_mul_mont_n in esp32c3-bench-blst/src/mpi.rs. The bench
prints a spec-conformance gate (the NUT-00 test vectors — Y/K/B_/C_/C, the
batch challenge, and the rejection-sampled weights — must match byte-for-byte)
and a bit-exact MPI-vs-software mul_mont diagnostic.
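
To illustrate what a Montgomery multiply computes and what a bit-exact comparison against a reference looks like, here is a minimal single-limb sketch with R = 2^64. It is an assumption-laden toy: blst's mul_mont_n operates on multi-limb 384-bit field elements, and nothing here touches the MPI peripheral. The modulus N, the helper names, and the pseudo-random sample generator are all hypothetical.

```rust
// Single-limb Montgomery multiplication with R = 2^64 (toy sizes only).
// N is an arbitrary odd modulus below 2^63 so the u128 intermediate cannot overflow.
const N: u64 = 0x7fff_ffff_ffff_ffe7;

/// Returns -N^{-1} mod 2^64, computed with Newton iteration.
fn neg_inv(n: u64) -> u64 {
    let mut inv = n; // n*n ≡ 1 (mod 8) for odd n, so 3 bits are already correct
    for _ in 0..6 {
        // Each step doubles the number of correct low bits.
        inv = inv.wrapping_mul(2u64.wrapping_sub(n.wrapping_mul(inv)));
    }
    inv.wrapping_neg()
}

/// Montgomery product: a * b * R^{-1} mod N, for a, b < N.
fn mul_mont(a: u64, b: u64, n_prime: u64) -> u64 {
    let t = a as u128 * b as u128;
    let m = (t as u64).wrapping_mul(n_prime); // makes t + m*N divisible by R
    let u = ((t + m as u128 * N as u128) >> 64) as u64;
    if u >= N { u - N } else { u } // single conditional subtraction
}

fn to_mont(a: u64) -> u64 { (((a as u128) << 64) % N as u128) as u64 } // a*R mod N
fn from_mont(a: u64, n_prime: u64) -> u64 { mul_mont(a, 1, n_prime) }   // a*R^{-1} mod N

/// Plain reference multiply used for the comparison.
fn mul_ref(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % N as u128) as u64 }

fn main() {
    let np = neg_inv(N);
    let mut x: u64 = 0x1234_5678_9abc_def0 % N;
    for _ in 0..1000 {
        // Simple LCG to generate sample operands (not cryptographic).
        let y = x.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407) % N;
        let got = from_mont(mul_mont(to_mont(x), to_mont(y), np), np);
        assert_eq!(got, mul_ref(x, y)); // must agree exactly with the reference
        x = y;
    }
    println!("mul_mont matches the reference on 1000 samples");
}
```

The same pattern (run both implementations on identical inputs and require equal outputs) is what a hardware-versus-software diagnostic needs, since any single differing limb would corrupt every later field operation.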
Headline numbers (ESP32-C3 rev v0.4 @ 160 MHz; full table in
RESULTS.md): portable-C blst does bdhke_full_round in 459 ms
and pairing_verification in 1.31 s; with the MPI peripheral those drop to
104 ms and 304 ms, roughly 4.4× and 4.3× faster respectively. A typical 10-proof token
(all one keyset) batch-verifies in ~0.9 s on the bare chip, a realistic
3–4-keyset mix in ~1.1-1.2 s — at parity with today's secp256k1+DLEQ wallet
(~1.5 s for 10 proofs), no coprocessor. All figures now include the spec's
mandatory point validation on every received point.
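
The difference between the one-keyset and mixed-keyset batch timings follows from the 1 + U Miller-loop count. The sketch below just does that counting. It assumes a naive per-proof check of e(C, G2) == e(Y, K) costs two Miller loops, and the keyset ID strings are made-up placeholders; it models loop counts only, not the timings quoted above.

```rust
use std::collections::HashSet;

/// Miller loops for a batch verify: one for the combined C side paired with G2,
/// plus one per unique keyset key K on the Y side.
fn batch_miller_loops(keyset_ids: &[&str]) -> usize {
    if keyset_ids.is_empty() {
        return 0;
    }
    let unique: HashSet<&str> = keyset_ids.iter().copied().collect();
    1 + unique.len()
}

/// Assumption: verifying each proof on its own costs two Miller loops.
fn naive_miller_loops(keyset_ids: &[&str]) -> usize {
    2 * keyset_ids.len()
}

fn main() {
    // Ten proofs, all from one (hypothetical) keyset.
    let single = ["02aa"; 10];
    // Ten proofs spread over four (hypothetical) keysets.
    let mixed = [
        "02aa", "02aa", "02aa", "02bb", "02bb",
        "02cc", "02cc", "02cc", "02dd", "02dd",
    ];

    println!("one keyset:   {} batched vs {} naive",
             batch_miller_loops(&single), naive_miller_loops(&single)); // 2 vs 20
    println!("four keysets: {} batched vs {} naive",
             batch_miller_loops(&mixed), naive_miller_loops(&mixed));   // 5 vs 20
}
```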

esp32-bench-mpi/ is a hardware-characterization microbench for the original
ESP32 (Xtensa LX6 @ 240 MHz). It times the RSA peripheral's native single-mul
mode, for comparison against the C3's double-mul quirk and the
modexp-early-exit workaround. It is not a Cashu-scheme bench. It needs the
Xtensa Rust toolchain (espup install); build with cargo +esp run --release
from that crate.
The legacy/crypto/, legacy/esp32c3-bench/, legacy/host-bench/ crates
predate the spec. They mock an additive-blinding BDHKE
(B' = Y + r·G, C = C' − r·K) — which is not what NUT-00 (v3) does, and
would force the mint key onto both G1 (for the − r·K unblind) and G2 (for the
pairing check) = 144-byte keyset keys. They also still hash with a placeholder
DST and use the pure-Rust bls12_381 (zkcrypto) backend rather than blst.
They live under legacy/ and are kept only for the historical
pure-Rust-vs-blst per-primitive comparison (the ~9-40×-per-op gap is
interesting; see RESULTS.md). Do not use them for protocol-accurate numbers —
use esp32c3-bench-blst/ for anything that needs to match the NUT-00 spec.
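
For reference, the additive scheme that the legacy crates mock can be written out in the same kind of toy model, where the point x·G is replaced by the scalar x modulo a small prime. This is only a sketch of the algebra under that assumption (it is not the legacy crate's code, and the modulus, names, and values are illustrative). It shows why the unblind step needs r·K in the same group as C'.

```rust
// Toy model of additive blinding: the point x·G is represented by x mod Q.
// Not BLS12-381 and not secure; it only mirrors the algebra.
const Q: u64 = 2_147_483_647; // 2^31 - 1, stand-in for the group order (assumption)

fn add_mod(a: u64, b: u64) -> u64 { (a + b) % Q }
fn sub_mod(a: u64, b: u64) -> u64 { (a + Q - b) % Q }
fn mul_mod(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % Q as u128) as u64 }

const G: u64 = 1; // generator in the toy representation

fn blind(y: u64, r: u64) -> u64 { add_mod(y, mul_mod(r, G)) }  // B' = Y + r·G
fn sign(a: u64, b: u64) -> u64 { mul_mod(a, b) }               // C' = a·B'
// The wallet must compute r·K in the same group as C'. With real curves that
// means K has to exist on G1, in addition to the G2 copy used by the pairing check.
fn unblind(c_blind: u64, r: u64, k_g1: u64) -> u64 {
    sub_mod(c_blind, mul_mod(r, k_g1))                         // C = C' − r·K
}

fn main() {
    let (y, a, r) = (123_456_789u64, 987_654_321u64, 555_555_555u64);
    let k_g1 = mul_mod(a, G); // mint public key K = a·G on the signature group
    let c = unblind(sign(a, blind(y, r)), r, k_g1);
    assert_eq!(c, mul_mod(a, y)); // a·(Y + r·G) − r·(a·G) = a·Y
    println!("additive unblind recovers a·Y: {}", c == mul_mod(a, y));
}
```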
bls-bench/
├── esp32c3-bench-blst/ ← the bench that matters: blst + MPI, matches NUT-00 v3
│ ├── src/mpi.rs MPI peripheral driver + mpi_mul_mont_n
│ └── vendor/blst/ vendored+patched blst 0.3.16
├── esp32-bench-mpi/ ← original-ESP32 (Xtensa) MPI hardware microbench
└── legacy/ ← ⚠️ SUPERSEDED — additive-blinding mock, not NUT-00 v3
├── crypto/ additive-blinding BDHKE mock (zkcrypto backend)
├── esp32c3-bench/ runs the mock on ESP32-C3
└── host-bench/ criterion host baseline for the mock
ESP32-C3 rev v0.4 @ 160 MHz, RISC-V RV32IMC, 400 KB SRAM, connected over USB
serial. (esp32-bench-mpi targets the original ESP32, Xtensa LX6 @ 240 MHz.)
# The NUT-00-v3-accurate bench (board connected, espflash installed)
cd esp32c3-bench-blst && cargo run --release
# Original-ESP32 MPI hardware microbench (also needs `espup install`)
cd esp32-bench-mpi && cargo +esp run --release
# Superseded — historical pure-Rust comparison only
cd legacy/esp32c3-bench && cargo run --release
cargo bench -p host-bench

MIT; see LICENSE.