- Home
- Post-Quantum Cryptography Tooling
Post-Quantum Cryptography Tooling
Optimizing the Leading Lattice Sieving Toolkit
Updated
Post-quantum cryptography standards like Kyber and Dilithium are built on the assumption that lattice problems are hard. Testing that assumption requires attacking those lattices with the best tools available — and the best tool is G6K. I optimize G6K's C++ sieving engine and its supporting infrastructure. The main performance work lives in the C++ core. Alongside that, I rewrote the spherical code generator — a Python hill-climber that produces SimHash probe banks — in Rust with an algorithmic fix that cut generation time by two orders of magnitude.
What This Means for Your Business
G6K is the toolkit the research community reaches for when stress-testing lattice-based cryptographic schemes — the systems that NIST is standardizing as post-quantum replacements for RSA and ECC. The sieving engine at G6K's core is C++, and that is where the primary optimization work happens: the inner loops that search for short lattice vectors across databases that grow exponentially with dimension. Performance gains in the C++ core translate directly into higher dimensions reached and tighter security estimates for production cryptographic standards.
One concrete example: G6K's SimHash filter depends on sparse spherical codes generated by a Python hill-climber that took minutes per dimension. I rewrote the generator in Rust around the observation that a coordinate swap only modifies one row of the Gram matrix — collapsing per-swap cost from O(N·n) to O(N). The tool now generates the full code bank for dimensions 6 through 1000 on a laptop, with strictly better output quality. But this is a supporting tool. The main work is in the C++ sieving engine itself.
Case Study: G6K's Spherical Code Generator, Python vs Rust
G6K's SimHash filter depends on sparse spherical code banks generated by a Python hill-climber. The rewrite changed the algorithm first, the language second.
| Metric | Original (Python) | Rewrite (Rust) |
|---|---|---|
| Single dimension (n=128, bitlen=512) | 3–6 minutes | 2–5 seconds — roughly 60–100× faster |
| Full sweep n=6..1000 | Days, sequential | One overnight run on a laptop |
| Code quality score (lower is better) | Baseline | Strictly lower across all sampled dimensions |
| Output format | .def sparse index files | Byte-identical .def files — drop-in replacement |
Byte-identical output means no downstream G6K changes were required.
How I Have Used This in Production
g6k-rs: Full Rust Lattice Sieving Library
Solo-built a 16K-line Rust lattice sieving and reduction library from scratch. Implements LLL with incremental GSO and deep insertion, BKZ 2.0 with Schnorr-Euchner enumeration (extended GCD insertion, Gaussian heuristic pruning) and sieve-based SVP oracles (BDGL with FHT-LSH, BGJ1, HK3), Babai CVP solvers, Coppersmith/Howgrave-Graham small-root finding, and LWE attack via Kannan embedding. Includes multi-threaded sieving, Metal GPU acceleration, arbitrary-precision GMP-backed lattices, and 215 tests across 6 suites.
G6K C++ Sieving Engine Optimization
Performance engineering on G6K’s C++ core — the sieving inner loop that maintains vector databases and searches for reduction pairs across exponentially growing dimensions. This is the primary optimization surface: gains in the C++ engine determine what dimensions are practically reachable and how tight the resulting security estimates are for NIST post-quantum standards.
100× Spherical Code Generator in Rust
Rewrote the Python hill-climber that generates SimHash probe banks. Incremental Gram matrix updates collapse per-swap cost from O(N·n) to O(N). A single dimension (n=128, bitlen=512) dropped from 3–6 minutes to 2–5 seconds. The full sweep across n=6..1000 completes on a laptop instead of running for days. Output files are byte-identical — drop-in replacement, no downstream changes required.
Property-Tested Numerical Correctness
Built a property-test suite verifying that every incremental Gram matrix update and every cached row-sum-of-squares matches a from-scratch recompute on every entry. This test caught two off-by-one bugs during development that would have produced silently wrong spherical codes — codes that pass format checks but degrade SimHash filter quality in ways that only surface as unexplained sieving slowdowns.
The Bottleneck Was Algorithmic, Not the Language
The reflex move was to keep the algorithm and accelerate the language: vectorized NumPy bought about 3x, Numba JIT another 5-8x, multiprocessing scaled with cores. Stacked, maybe 30-50x on a good day - but each candidate swap still triggered a full Gram matrix row recompute and a full score pass. The JIT only compiled the wasted work to be wasted faster.
The actual fix was upstream of any language choice: a coordinate swap inside one row of the code matrix only changes one row of the Gram matrix, plus its symmetric column. Updating just those entries is O(bitlen) arithmetic instead of O(bitlen times n), and a cached row-sum-of-squares collapses the score query from O(bitlen squared) to O(bitlen). At bitlen 512, that is roughly 1,000 entries touched per swap instead of 262,144 recomputed.
Cheap swap evaluation then changed the search itself. The Python original kept the first improving swap it found; with incremental updates it became affordable to scan every candidate pair in a row and apply only the best one, which converges to lower local minima in fewer passes. Running multiple seeded candidates in parallel with rayon and keeping the lowest score finished the job.
Technologies
Related Expertise
Cryptographic research tooling demands the same correctness guarantees as financial infrastructure. See why Rust is the foundation for systems where silent bugs have outsized consequences.
Rust for Mission-Critical Systems — Why Rust When Failure Is Not an OptionRust compiled to WebAssembly brings systems-level performance to the browser. See how I built cryptographic pipelines for ZK proof generation in client-side wallet engines.
Rust and WebAssembly — Native Performance in the BrowserZK-SNARK systems share the same intersection of cryptography and performance engineering. See how I built production proving pipelines at Panther Protocol.
ZK-SNARK Development — From Circuits to Production WalletsFrequently Asked Questions
What is post-quantum cryptography tooling and why does it matter?
Post-quantum cryptography tooling is the software used to stress-test the lattice problems behind standards like Kyber and Dilithium — NIST's replacements for RSA and ECC. The leading tool is G6K, the open-source lattice sieving toolkit the research community reaches for. Its C++ sieving engine searches for short vectors across databases that grow exponentially with dimension, so performance gains translate directly into higher dimensions reached and tighter security estimates for production cryptographic standards.
When does it make sense to rewrite research code in Rust instead of optimizing the existing Python?
A rewrite pays off when there is an algorithmic fix that JIT alone cannot deliver. G6K's spherical code generator is the proof case: a Python hill-climber took 3-6 minutes per dimension, but the key observation was that a coordinate swap only modifies one row of the Gram matrix, collapsing per-swap cost from O(N·n) to O(N). The Rust rewrite runs a single dimension in 2-5 seconds and the full n=6..1000 sweep on a laptop instead of days, with byte-identical output.
What experience does Oleksii Vasylenko have with G6K and post-quantum tooling?
Oleksii Vasylenko does performance engineering on G6K's C++ sieving core — the inner loops that maintain vector databases and search for reduction pairs, which determine what dimensions are practically reachable. Alongside that, he rewrote G6K's spherical code generator in Rust with a 100x speedup as a byte-identical drop-in replacement, and solo-built g6k-rs, a 16K-line Rust lattice sieving and reduction library covering LLL, BKZ 2.0, three sieve algorithms, CVP solvers, Coppersmith, and LWE attacks, with 215 tests.
What goes wrong when optimizing cryptographic research code?
The dangerous failures are silent. While rewriting G6K's spherical code generator, Oleksii Vasylenko built a property-test suite verifying every incremental Gram matrix update and cached row-sum-of-squares against a from-scratch recompute on every entry. It caught two off-by-one bugs that would have produced spherical codes that pass format checks but degrade SimHash filter quality — a defect that only surfaces later as unexplained sieving slowdowns. Without property testing, such bugs ship and quietly corrupt downstream results.
How can a research team engage Oleksii Vasylenko when their tooling bottlenecks experiments?
Oleksii Vasylenko takes on engagements where research code bottlenecks experiments. Cryptographic research tools are full of inner loops that silently recompute invariant work; his approach is to find the algorithmic fix first, then build the Rust implementation that makes parameter sweeps practical instead of prohibitive. Teams with a sieving, proving, or numerical pipeline stuck behind a Python bottleneck that JIT alone cannot fix can contact him through ovasylenko.com.
Further Reading
Research code bottlenecking your experiments?
Cryptographic research tools are full of inner loops that silently recompute invariant work. I find the algorithmic fix first, then build the Rust implementation that makes parameter sweeps practical instead of prohibitive. If your sieving, proving, or numerical pipeline has a Python bottleneck that JIT alone cannot fix, let’s talk.
Discuss your cryptography tooling