When we set out to prove that autonomous C-to-Rust transpilation could work at real-world scale, we needed a target that would be unambiguous: a large, widely-used, well-tested open-source C project. We chose JQ — the ubiquitous command-line JSON processor — and the results exceeded our own expectations.
Why JQ?
JQ is the kind of project that stress-tests every aspect of a transpilation pipeline. At 139,000 lines of C spread across 131 files, it's a substantial codebase. But size alone isn't what makes it interesting. JQ features:
- A hand-written parser and lexer
- A custom bytecode compiler and virtual machine
- Complex memory management with reference counting
- Heavy use of C macros, function pointers, and unions
- An extensive test suite with hundreds of edge cases
If Velociportr could handle JQ, it could handle the kind of legacy C code sitting inside defense agencies, banks, and telecom infrastructure.
The 5-Stage Pipeline
Velociportr doesn't just run a find-and-replace on syntax. It operates through a five-stage pipeline, each stage building on the last:
- AST Analysis — Parse the C source into an abstract syntax tree and build a semantic model of types, functions, and dependencies.
- Dependency Mapping — Resolve cross-file dependencies, identify shared state, and plan the transpilation order.
- Transpilation — Generate idiomatic Rust code that preserves the original semantics while leveraging Rust's ownership model.
- Compilation & Repair — Compile the Rust output and autonomously fix any compiler errors through iterative repair cycles.
- Oracle Verification — Test the transpiled Rust against the original C binary to verify functional equivalence.
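To make the "plan the transpilation order" step of Dependency Mapping concrete, here is a minimal sketch of one way such an order could be computed: a topological sort (Kahn's algorithm) over a file dependency graph. This is an illustration under assumptions, not Velociportr's actual implementation; the function name transpilation_order is hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Returns the files in an order where each file appears after everything it
/// depends on, or None if the dependency graph contains a cycle.
/// `deps` maps a file to the files it depends on (hypothetical input shape).
fn transpilation_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    // Collect every file mentioned, either as a key or as a dependency.
    let mut files: BTreeSet<&str> = BTreeSet::new();
    for (file, needs) in deps {
        files.insert(*file);
        files.extend(needs.iter().copied());
    }

    // remaining[f] = number of dependencies of f that have not been emitted yet.
    let mut remaining: BTreeMap<&str, usize> = files.iter().map(|f| (*f, 0)).collect();
    // dependents[d] = files that depend on d.
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (file, needs) in deps {
        for dep in needs {
            *remaining.get_mut(file).unwrap() += 1;
            dependents.entry(*dep).or_default().push(*file);
        }
    }

    // Start with files that depend on nothing.
    let mut ready: VecDeque<&str> = remaining
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(f, _)| *f)
        .collect();

    let mut order = Vec::new();
    while let Some(file) = ready.pop_front() {
        order.push(file.to_string());
        // Emitting `file` satisfies one dependency of each file that uses it.
        for user in dependents.get(file).into_iter().flatten() {
            let n = remaining.get_mut(user).unwrap();
            *n -= 1;
            if *n == 0 {
                ready.push_back(*user);
            }
        }
    }

    // If some files were never emitted, they sit on a dependency cycle.
    (order.len() == files.len()).then_some(order)
}
```

Given a map where execute.c depends on jv.h and bytecode.h, and bytecode.h depends on jv.h, the function emits jv.h first and execute.c last (the file names and edges here are made up for illustration). A real pipeline would also need a strategy for cycles, which this sketch only detects.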
Oracle-Based Verification
This is the part that separates Velociportr from syntax-level translators. After transpilation, we don't just check whether the Rust code compiles; we verify that it produces the same output as the original C on every input we test.
"Compiling is not correctness. A transpiled function that compiles but returns the wrong result is worse than one that doesn't compile at all — because it ships."
Our oracle takes the original C binary and the transpiled Rust binary, feeds them identical inputs, and compares outputs. For JQ, this meant running hundreds of test cases through both binaries and verifying byte-for-byte equivalence.
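The comparison described above can be sketched as a small differential-testing harness. This is a minimal illustration built only on the Rust standard library, not the actual Velociportr oracle: the function names are hypothetical, and comparing exit codes in addition to stdout is our assumption about what a reasonable oracle would check.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// What we record from one run of one binary.
#[derive(Debug, PartialEq, Eq)]
struct RunResult {
    stdout: Vec<u8>,
    exit_code: Option<i32>,
}

/// Runs `program` with `args`, feeds `input` on stdin, and captures stdout
/// and the exit code. Suitable for small inputs only: writing all of stdin
/// before reading stdout can block if the pipes fill up.
fn run_with_input(program: &str, args: &[&str], input: &[u8]) -> std::io::Result<RunResult> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // The stdin handle is a temporary here, so it is dropped (and the pipe
    // closed) at the end of this statement, signalling EOF to the child.
    child.stdin.take().expect("stdin was piped").write_all(input)?;
    let output = child.wait_with_output()?;
    Ok(RunResult {
        stdout: output.stdout,
        exit_code: output.status.code(),
    })
}

/// Index of the first byte at which the two outputs differ, or None if they
/// are byte-for-byte identical. If one output is a prefix of the other, the
/// index is the length of the shorter one.
fn first_mismatch(expected: &[u8], actual: &[u8]) -> Option<usize> {
    if expected == actual {
        return None;
    }
    let common = expected
        .iter()
        .zip(actual)
        .take_while(|(a, b)| a == b)
        .count();
    Some(common)
}

/// True when both binaries produce identical stdout and exit code for the
/// same arguments and input.
fn equivalent(c_bin: &str, rust_bin: &str, args: &[&str], input: &[u8]) -> std::io::Result<bool> {
    let reference = run_with_input(c_bin, args, input)?;
    let candidate = run_with_input(rust_bin, args, input)?;
    Ok(reference == candidate)
}
```

A test driver would call equivalent once per test case, with paths to the two builds (for example ./jq-c and ./jq-rust, both hypothetical), and use first_mismatch to report where a failing case diverges. Passing such a harness shows equivalence on the inputs exercised, which is why the breadth of the test suite matters.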
The Numbers
Here's what Velociportr achieved on the JQ codebase:
Project: JQ (command-line JSON processor)
Source: 139,000 lines of C
Files: 131 source files
Time: < 8 days (autonomous)
Output: Idiomatic Rust (safe, no unsafe blocks)
Compilation: Successful
Test Suite: Passing
Human Input: Zero during transpilation
What Made It Hard
Several aspects of JQ pushed the pipeline to its limits:
- Reference-counted memory management — JQ uses its own reference counting system for jv values. Velociportr had to map this onto Rust's ownership model without introducing leaks or double-frees.
- C macros — Heavy macro usage required pre-expansion and semantic analysis before transpilation could begin.
- Function pointers and callbacks — The bytecode VM dispatches through function pointers, which needed to be translated into Rust trait objects and closures.
- Union types — C unions have no safe direct equivalent in Rust: Rust does have its own union type, but reading a field from it requires unsafe code. Velociportr mapped them to tagged enums where type information could be inferred.
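To illustrate the kind of target code these mappings point at, here is a small hand-written sketch, not actual Velociportr output. It assumes a much-simplified JSON value type (the Value enum, its append method, and lookup_builtin are all hypothetical names) and shows a tagged enum in place of a C union, Rc in place of manual reference counting, and boxed closures in place of C function pointers.

```rust
use std::rc::Rc;

/// A simplified JSON value: the enum tag replaces a C "kind" field plus union.
/// Illustrative only; this is not the layout of JQ's jv type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    // Heap payloads are shared through Rc: cloning increments a count and
    // dropping decrements it, so there is no manual free to get wrong.
    Str(Rc<str>),
    Array(Rc<Vec<Value>>),
}

impl Value {
    /// Appends to an array with copy-on-write semantics: the Vec is mutated
    /// in place when this is the only owner, and cloned first otherwise.
    /// Non-array values are returned unchanged in this sketch.
    fn append(self, item: Value) -> Value {
        match self {
            Value::Array(mut items) => {
                Rc::make_mut(&mut items).push(item);
                Value::Array(items)
            }
            other => other,
        }
    }
}

/// A builtin is a boxed closure (a trait object) rather than a C function pointer.
type Builtin = Box<dyn Fn(&Value) -> Value>;

/// Looks up a builtin by name. The two builtins here are simplified stand-ins.
fn lookup_builtin(name: &str) -> Option<Builtin> {
    match name {
        "length" => Some(Box::new(|v: &Value| match v {
            Value::Array(items) => Value::Number(items.len() as f64),
            Value::Str(s) => Value::Number(s.chars().count() as f64),
            _ => Value::Null,
        })),
        "not" => Some(Box::new(|v: &Value| {
            // Only null and false are treated as falsy.
            Value::Bool(matches!(v, Value::Null | Value::Bool(false)))
        })),
        _ => None,
    }
}
```

The design point is that each hazard in the list becomes a compile-time or library-level guarantee: the enum makes it impossible to read the wrong variant, Rc handles the count bookkeeping, and the closure table keeps dynamic dispatch without raw pointers.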
What This Means for the Industry
The JQ port isn't just a benchmark — it's proof that autonomous transpilation is viable for production-scale codebases. For organizations sitting on millions of lines of legacy C/C++, this changes the equation entirely:
- Manual rewrites cost $5-15 per line and take years
- Velociportr processed JQ at roughly 17,000 lines per day (139,000 lines in under 8 days)
- No engineers were pulled off other work during the process
- The output is verified, not just translated
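For readers who want to check the arithmetic behind these figures, the short sketch below reproduces it from the numbers quoted in this post. The function names are hypothetical, and the $5-15 per line range is the estimate stated above, not something the code validates.

```rust
/// Average throughput in whole lines per day.
fn lines_per_day(lines: u64, days: u64) -> u64 {
    lines / days
}

/// Low and high manual-rewrite cost in dollars, given per-line rates in dollars.
fn manual_cost_range(lines: u64, low_rate: u64, high_rate: u64) -> (u64, u64) {
    (lines * low_rate, lines * high_rate)
}
```

With the figures above, lines_per_day(139_000, 8) gives 17,375, and manual_cost_range(139_000, 5, 15) gives a range of $695,000 to $2,085,000 for a manual rewrite of a codebase this size.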
The White House, CISA, and the NSA have all called for a transition to memory-safe languages. The technology to make that transition at scale now exists.
What's Next
We're scaling Velociportr to handle even larger codebases and more complex C/C++ patterns — including C++ templates, virtual dispatch, and multi-threaded code. If you're responsible for legacy C/C++ in defense, finance, telecom, or critical infrastructure, we'd love to talk.