AI and SAT Solvers Unite to Crack Unsolvable Math Puzzles
November 13, 2025 · 2 min read
In a groundbreaking fusion of artificial intelligence techniques, researchers at Carnegie Mellon University are combining large language models with satisfiability solvers to push the boundaries of automated mathematical proof. Marijn Heule, a key figure at the Institute for Computer-Aided Reasoning in Mathematics, has pioneered this approach, aiming to solve problems that have stumped humans for decades. His work leverages SAT, a symbolic AI method that reduces complex problems to true-or-false logic puzzles, enabling computers to generate irrefutable proofs.
SAT, or satisfiability, operates on propositional formulas akin to massive Sudoku boards in which each cell holds a binary value. This simplicity belies its power: it has already conquered notorious puzzles such as the empty hexagon problem and Keller's conjecture. Heule's expertise lies in encoding these mathematical enigmas into a format that SAT solvers can process, a skill he has honed since solving puzzles as a child. Now, he envisions a future where LLMs automate this translation, making the technology accessible beyond specialists.
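To make the Sudoku comparison concrete, here is a minimal sketch of what an encoding can look like. It is not taken from Heule's work: the puzzle (a small Latin square), the helper names `exactly_one`, `latin_square_cnf`, and `solve`, and the toy backtracking search are all illustrative assumptions. Production SAT solvers are vastly more sophisticated, but the input has the same shape: a list of clauses over true-or-false variables.

```python
from itertools import combinations

def exactly_one(variables):
    """CNF clauses forcing exactly one of the given variables to be true."""
    clauses = [list(variables)]                                   # at least one is true
    clauses += [[-a, -b] for a, b in combinations(variables, 2)]  # no two are true
    return clauses

def latin_square_cnf(n):
    """Encode 'fill an n x n grid so each value appears once per row and column'."""
    def var(r, c, v):
        # Hypothetical numbering: one Boolean variable per (row, column, value).
        return 1 + (r * n + c) * n + v
    clauses = []
    for i in range(n):
        for j in range(n):
            clauses += exactly_one([var(i, j, v) for v in range(n)])  # cell (i, j) holds one value
            clauses += exactly_one([var(i, c, j) for c in range(n)])  # value j once in row i
            clauses += exactly_one([var(r, i, j) for r in range(n)])  # value j once in column i
    return clauses, var

def solve(clauses, assignment=None):
    """Toy backtracking search: returns {variable: bool}, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        remaining = [lit for lit in clause if abs(lit) not in assignment]
        if not remaining:
            return None                   # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment                 # all clauses satisfied
    lit = min(simplified, key=len)[0]     # branch on a literal from a shortest clause
    for value in (lit > 0, not (lit > 0)):
        result = solve(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None

clauses, var = latin_square_cnf(3)
model = solve(clauses)
grid = [[next(v for v in range(3) if model.get(var(r, c, v))) for c in range(3)]
        for r in range(3)]
for row in grid:
    print(row)
```

The design point is that the puzzle's rules never appear in the solver. Everything specific to the problem lives in the encoding step, which is the part the article says currently requires a specialist.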
The integration with large language models addresses a critical bottleneck: the manual effort required to frame problems for SAT. By training LLMs on examples of successful encodings, the system can propose logical decompositions and lemmas. Automated reasoning then checks each piece and, when a piece fails, returns a counterexample that is used to refine the AI's next suggestion. This iterative process mirrors human collaboration, with tools like Lean ensuring end-to-end verification for trustworthiness.
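The propose, verify, and refine loop described above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the CMU system: the "lemma" is the claim that n^3 - n is divisible by some k, the `propose` function is a hypothetical stand-in for the LLM that simply picks the next candidate consistent with every counterexample seen so far, and `find_counterexample` is an exhaustive check over a finite range standing in for the automated reasoner.

```python
def find_counterexample(k, domain):
    """Stand-in verifier: search the domain for an n where k does not divide n^3 - n."""
    for n in domain:
        if (n**3 - n) % k != 0:
            return n
    return None

def propose(candidates, counterexamples):
    """Hypothetical stand-in for the LLM: the first candidate k that
    survives every counterexample collected so far."""
    for k in candidates:
        if all((n**3 - n) % k == 0 for n in counterexamples):
            return k
    return None

def refine(candidates, domain):
    """Alternate proposing and checking until a candidate passes or none remain."""
    counterexamples = []
    history = []                          # (candidate, counterexample or None)
    while True:
        k = propose(candidates, counterexamples)
        if k is None:
            return None, history          # every candidate was refuted
        cex = find_counterexample(k, domain)
        history.append((k, cex))
        if cex is None:
            return k, history             # candidate holds on the whole domain
        counterexamples.append(cex)       # feed the failure back to the proposer

result, history = refine([12, 8, 6], range(50))
print(result, history)
```

In this run the first guess, 12, is refuted by n = 2. That single counterexample also rules out 8 without a separate verification call, so the next proposal is 6, which passes. A finite check like this is only evidence, not a proof; the article's point is that a proof assistant such as Lean would supply the end-to-end guarantee.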
Heule argues that this shift from understanding to trust could revolutionize mathematics. While some purists decry machine-generated proofs as 'disgusting,' he contends that reliability trumps comprehension in advancing the field. The partnership between AI and mathematicians amplifies creativity, with humans providing intuitive leaps and machines handling exhaustive validation. This synergy has already yielded results in geometry and combinatorics, hinting at broader applications.
Looking ahead, the team aims to tackle problems deemed unsolvable by humans alone. By democratizing encoding through LLMs, they hope to empower more researchers to harness SAT's capabilities. This collaboration model, in which AI handles brute-force reasoning and humans guide high-level strategy, could unlock new frontiers in pure math and beyond, reinforcing the role of trust in scientific progress.
Ultimately, this research underscores a pivotal moment in AI development: the convergence of neural and symbolic approaches to overcome inherent limitations. As Heule notes, the magic emerges not from replacing humans, but from enhancing their efforts with unwavering computational rigor, paving the way for discoveries that redefine possibility.