- Rust crate, edition 2021, rust-version 1.83.0
- Dependencies include: curve25519-dalek, hax-lib, libcrux-ml-kem, prost, sha2, etc.
- Already has hax proof infrastructure (
proofs/dir,hax-libdependency,hax.py)
Extract the Rust code to Lean 4 using Aeneas for formal verification.
- Forked the repo on GitHub
- Cloned locally and opened in VSCode
- Ran
npx aeneas-cli init(interactive wizard) - Set crate directory to
.(current dir) - Aeneas pinned to
mainbranch ofhttps://github.com/AeneasVerif/aeneas.git - Charon preset:
aeneas - Aeneas options:
loops-to-rec,split-files - Output destination:
LeanOutput/ - Fixed crate name from
SparsePostQuantumRatchettospqr(must matchCargo.tomlpackage name)
- Listed all crate modules in
aeneas-config.ymlundercharon.start_from - Modules (in planned extraction order):
spqr::util(active)spqr::kdfspqr::chainspqr::encodingspqr::authenticatorspqr::serializespqr::incremental_mlkem768spqr::v1spqr::proto
- All commented out except
spqr::utilto start with the simplest module
- Ran
npx aeneas-cli installto build Aeneas locally in.aeneas/ - Ran
npx aeneas-cli extractfor each module
| Module | Charon | Aeneas | Notes |
|---|---|---|---|
spqr::util |
OK | OK | Clean extraction |
spqr::kdf |
Hangs | - | Charon runs indefinitely |
spqr::chain |
Hangs | - | Charon runs indefinitely |
spqr::encoding |
OK | Partial | Funs.lean generated but 7 functions have sorry |
spqr::authenticator |
OK | ? | Currently active alongside util |
spqr::serialize |
Not tried | - | |
spqr::incremental_mlkem768 |
Not tried | - | |
spqr::v1 |
Not tried | - | |
spqr::proto |
Not tried | - |
LeanOutput/Types.lean- 267 lines, type definitions extracted cleanlyLeanOutput/Funs.lean- 3807 lines, function definitionsLeanOutput/TypesExternal_Template.lean- external type stubsLeanOutput/FunsExternal_Template.lean- external function stubs
cpufeaturesinit_inner (external crate inline asm)GF16.div_impl(GF16 division, src/encoding/gf.rs:548)Poly.compute_at(polynomial evaluation, src/encoding/polynomial.rs:255)Poly.from_complete_points(src/encoding/polynomial.rs:291)PolyEncoder.from_pb(protobuf deserialization)PolyDecoder.from_pb(protobuf deserialization)PolyDecoder.decoded_message(src/encoding/polynomial.rs:883)
The encoding module has 3 submodules:
encoding::gf- Galois field GF(2^16) arithmeticencoding::polynomial- Polynomial operations over GF(2^16)encoding::round_robin- Round-robin encoding
Error 1: Arrow types not supported (gf.rs:371-372)
static TOKEN: LazyLock<use_accelerated::InitToken> = LazyLock::new(use_accelerated::init);- The
cpufeatures::new!macro generates function pointer types. Aeneas does not support arrow/function-pointer types. - Fix: Mark
spqr::encoding::gf::check_acceleratedasopaquein config. This is CPU feature detection for hardware-accelerated GF multiplication (PCLMULQDQ/PMULL) - not needed for verification.
Error 2: Invalid input for unop cast (gf.rs:372)
- Same
cpufeaturescode path, cast from function to InitToken. - Fix: Same as above - make opaque.
Error 3: Iterator methods missing in Lean library (warnings)
zip,map,collectmethods onIteratortrait not modeled in Aeneas Lean library.- These are used in polynomial operations.
- Fix: May need to rewrite Rust to avoid iterator combinators, or accept
sorryfor those functions.
Error 4: Internal errors (4 occurrences during file generation)
- "Internal error: please file an issue" - likely cascading from the arrow type errors.
Attempt 1: Mark check_accelerated as opaque
- Reduced from 9 to 7 unique errors, but arrow type errors persisted (Aeneas still sees type signatures of opaque items)
Attempt 2: Use exclude instead of just opaque for check_accelerated
- Arrow type errors for
TOKENeliminated - But
mul2_u16andMulAssignstill errored because they reference excludedTOKEN
Attempt 3: Also exclude accelerated module + mark mul2_u16 and MulAssign as opaque
- Down to 5 unique errors (from original 9)
- GF accelerated code fully handled
- Remaining errors are all "Unreachable" in
polynomial.rs- iterator combinator patterns
Attempt 4 (failed): RUSTFLAGS="--cfg hax" to use hax-friendly code paths
- The
cfg(hax)flag is for hax toolchain, not Charon. It triggered compilation errors incore-modelsdependency.
All remaining sorry functions use iterator combinators (zip, map, collect) unsupported by Aeneas Lean library:
| Function | File:Line | Root cause |
|---|---|---|
GF16.div_impl |
gf.rs:548 | Tuple destructuring in loop body |
Poly.compute_at |
polynomial.rs:255 | Iterator zip+fold pattern |
Poly.from_complete_points |
polynomial.rs:291 | Iterator map+collect pattern |
PolyEncoder.from_pb |
polynomial.rs:569 | Iterator map+collect pattern |
PolyDecoder.from_pb |
polynomial.rs:808 | Iterator map+collect pattern |
PolyDecoder.decoded_message |
polynomial.rs:883 | Iterator zip+map+collect pattern |
- Tried both simplified names (
PolyEncoder::from_pb) and full Charon{impl}syntax - Charon
opaqueflag does NOT prevent Aeneas from encountering "Unreachable" errors during its prepasses - The errors happen in Aeneas's interpretation phase, before opaque/transparent distinction matters
- Conclusion:
opaquein Charon config cannot suppress these Aeneas-internal errors
- With opaque (6 entries): 5 errors (4 unique), 172 opaque functions, 136 transparent, 5 sorry in Funs.lean
- Without opaque (entries commented out): 6 errors (5 unique), 174 opaque functions, 139 transparent, 6 sorry in Funs.lean
- The only difference:
decoded_messagegains a sorry without the opaque entry (it was previously treated as opaque and skipped) - The other 5 functions (
div_impl,compute_at,from_complete_points,PolyEncoder::from_pb,PolyDecoder::from_pb) produce sorry regardless - Conclusion: The opaque list has minimal effect - saves exactly 1 function from sorry. The real fix is refactoring the Rust code. Keeping the opaque entries is harmless but not a substitute for the refactoring.
- Lean files ARE generated despite exit code 1 (Aeneas treats errors as non-fatal for output)
- 5 functions have
sorrybodies - these need manual Lean implementations or Rust rewrites - All other functions (130+ transparent functions) extracted successfully
These are not peripheral helpers - they are the mathematical core and serialization backbone:
GF16.div_impl- Implements GF(2^16) division (viaa^(2^16-2)exponentiation). Powers allDiv/DivAssignoperators. Used in Lagrange interpolation, the foundation of the erasure coding.Poly.compute_at- Polynomial evaluation at a point. Called bydecoded_messageand the encoder'spoint_at. Core to the encoding/decoding correctness.Poly.from_complete_points- Builds Lagrange interpolation polynomials from precomputed points. Used bypoint_atto initialize the encoding scheme.PolyEncoder.from_pb/PolyDecoder.from_pb- Reconstruct encoder/decoder state from protobuf. Used in every V1 protocol state deserialization path (v1/chunked/send_ek/serialize.rs,v1/chunked/send_ct/serialize.rs).PolyDecoder.decoded_message- Reassembles a message from erasure-coded chunks. Called by the V1 state machine insend_ct.rsandsend_ek.rs. The whole point of the encoding layer.
We will return to refactor these functions to replace iterator combinators with explicit index-based loops:
| Function | Current pattern | Refactor to |
|---|---|---|
GF16.div_impl |
(a, b) = mul2_u16(...) tuple destructuring |
Separate assignments in loop |
Poly.compute_at |
iter().zip(). fold |
for i in 0..len with index access |
Poly.from_complete_points |
iter().map().collect() |
for loop pushing to pre-allocated Vec |
PolyEncoder.from_pb |
iter().enumerate() + chunks_exact |
for i in 0..len with manual chunking |
PolyDecoder.from_pb |
chunks_exact iteration |
for i in (0..len).step_by(4) or index loop |
PolyDecoder.decoded_message |
binary_search + Option::as_ref |
Keep structure but simplify iterator usage |
These refactors should be semantically equivalent and testable against the existing test suite. They could potentially be upstreamed since the project already avoids iterators in some places for hax compatibility (see comment in encoding.rs:49).
Applied minimal Rust changes to eliminate all sorry functions. All 53 tests pass after each change.
| Function | Problem | Minimal fix |
|---|---|---|
GF16.div_impl |
Opaque mul2_u16 returns tuple; Aeneas can't handle tuple access from opaque fn |
Replace mul2_u16 call with direct * operator (already extracted) |
Poly.compute_at |
.iter().zip() pattern |
Index-based for i in 0..len loop |
Poly.lagrange_sum |
.iter().zip() pattern |
Index-based loop |
Poly.from_complete_points |
.iter().map().collect() in fallback branch |
Explicit for loop with push |
Poly.deserialize |
.chunks_exact(2) |
while j+2 <= len with manual indexing |
PolyEncoder.from_pb |
? inside if-else binding + core::array::from_fn closure |
Flatten to sequential if/return; explicit array literal init |
PolyDecoder.from_pb |
return Err inside for loop (any early return in loop is "Unreachable") |
Hoist validation to unrolled if before loop |
Key Aeneas limitations discovered:
- Iterator combinators (
.zip(),.map(),.collect()) - no Lean model, use index loops - Early return inside loops (
return Err(...)infor) - triggers "Unreachable", hoist validation before loop core::array::from_fnwith closures - use explicit array literals- Tuple access from opaque functions (
.0,.1on opaque return) - avoid calling opaque functions that return tuples chunks_exact()- usewhileloop with manual index stepping
Result: 0 errors, 0 sorry, 126 transparent functions extracted cleanly.
Current state:
- Aeneas installed and configured (
aeneas-config.yml) spqr::util,spqr::serialize,spqr::encodingall extract cleanly (0 sorry)spqr::kdf,chain,authenticator,v1,proto- Charon hangs (external crypto dependency graphs)spqr::incremental_mlkem768- Charon OK, Aeneas 2 cosmetic errors, 0 sorry, 139 transparent functions
Charon: Completes successfully (~10s). No hanging — libcrux_ml_kem dependency graph is tractable.
Aeneas errors (2, both from same root cause):
log::__private_api::log— "Internal error" fromlogcrate macros.rs. Thelog::info!andlog::warn!macros expand to function pointer types (arrow types) that Aeneas doesn't support.- Cascading: body of
potentially_fix_state_incorrectly_encoded_by_libcrux_issue_1275ignored due to error 1.
Root cause: The function uses #[cfg(not(hax))] log::info!(...) and log::warn!(...) (lines 126, 132). Charon doesn't set the hax cfg, so these calls are compiled in. The log crate's internal __private_api::log function uses arrow types.
Fix: Added log to the exclude list in aeneas-config.yml. This reduced errors from 3 to 2. The function is already #[hax_lib::opaque] in the source, so Aeneas correctly generates an axiom (opaque declaration) instead of a sorry body.
Result: 0 sorry, 139 transparent functions, 203 opaque functions. The 2 remaining errors are cosmetic — they don't affect the output quality.
Warnings (non-blocking):
- Unknown types with region parameters:
core::slice::iter::Chunks,log::Metadata,log::Record,core::panic::location::Location - Missing Iterator methods:
map,collect(not used by incremental_mlkem768's transparent functions)
Refactor the sorry functions in encodingDONE - 0 sorryClean up opaque listDONEInvestigate Charon hanging modulesBLOCKED - Charon hangs at the MIR analysis level (inside charon-driver/rustc), before--opaque/--exclude/--start-fromflags take effect. Tested withkdf(simplest hanging module, only 2 functions). Addinghkdf::*,sha2::*, etc. to opaque/exclude lists has no effect. Even--start-from "spqr::kdf::hkdf_to_vec"with--exclude "hkdf"hangs. Root cause: Charon must process the full rustc MIR including all type resolution for generic trait hierarchies (hkdf::Hkdf<sha2::Sha256>->Digest->CoreWrapper->BlockSizeUser->generic-array->typenum-> ...). This is a Charon upstream limitation - needs either Charon improvements orcfg(hax)-gated alternative code paths that avoid referencing external crypto crates.TryDONE - 0 sorry, excludedspqr::incremental_mlkem768logcrate- Set up a Lean project to actually build/typecheck the generated output
- Charon hanging on
kdfandchainmay be due to heavy use of external crypto crates (hkdf, libcrux-hmac) pulling in large dependency graphs sorryfunctions tend to involve: loops with complex control flow, inline asm (cpufeatures), or iterator patterns- The
encodingmodule extracted the most code successfully since it's mostly pure math (GF16 arithmetic, polynomial ops) - Arrow/function-pointer types are a hard Aeneas limitation - must use
exclude(not justopaque) to fully remove them - Missing Iterator method models (
zip,map,collect) in Aeneas Lean library is the primary remaining blocker RUSTFLAGS="--cfg hax"doesn't work because downstream deps (core-models) fail to compile with hax cfg