ReaComp: Compiling LLM Reasoning into Symbolic Solvers for Efficient Program Synthesis

Atharva Naik, Yash Mathur, Prakam, Carolyn Rose, David Mortensen

Published May 8, 2026

Editorial review7.2

Relevance0.515

Freshness0.000

Why It Matters

What makes this one worth your time

This work is relevant for AI researchers and engineers interested in improving the efficiency and reliability of program synthesis by reducing dependency on LLMs during inference, thus saving computational resources and potentially broadening the applicability of AI solutions.

ReaComp compiles LLM reasoning into symbolic solvers, enhancing program synthesis efficiency and accuracy.

Summary

The paper presents ReaComp, a method for compiling reasoning traces from large language models (LLMs) into symbolic solvers for program synthesis. These solvers operate without LLM calls at test time, achieving high accuracy on benchmarks and complementing LLM inference in hard cases. The approach also shows potential for zero-shot transfer to historical linguistics tasks.

Key contributions

Development of symbolic solvers from LLM reasoning traces that operate without LLM calls at test time.
Demonstration of improved accuracy and efficiency on program synthesis benchmarks.
Zero-shot transfer of solvers to a historical linguistics task.

Notable insights

Compiling reasoning traces into symbolic solvers can significantly reduce token usage and improve accuracy in challenging tasks.
The approach demonstrates zero-shot transfer capabilities to unrelated domains, such as historical linguistics.

Possible limitations

Not stated in the abstract

Abstract

arXiv:2605.05485v1 Announce Type: cross Abstract: LLMs can solve program synthesis tasks but remain inefficient and unreliable on hard instances requiring large combinatorial search. Given a small set of reasoning traces, we use coding agents to compile them into reusable symbolic program synthesizers over constrained DSLs. The resulting solvers require no LLM calls at test time and are strong standalone systems: symbolic solver ensembles reach 91.3% accuracy on PBEBench-Lite and 84.7% on PBEBench-Hard, outperforming LLMs with test-time scaling for the latter by +16.3 percentage points at zero LLM inference cost. They also complement LLM search, improving PBEBench-Hard accuracy from 68.4% to 85.8% while reducing reported token usage by 78%, and raising SLR-Bench hard-tier accuracy from 34.4% to 58.0% in a neuro-symbolic hybrid setting. Compared to directly using coding agents as per-instance solvers, induced solvers are substantially more Pareto-efficient, amortizing a small one-time construction cost over many zero-token executions. Finally, most solvers transfer zero-shot to a real historical linguistics task - predicting sound changes in natural language data - reaching 80.1% accuracy under ensembling and recovering some plausible linguistic rules. Together, these results show that reasoning traces can be compiled into reusable symbolic solvers that solve many tasks directly, complement LLM inference on hard cases, and provide a scalable route to domain-general solver induction. We release code and data for reproducibility.