Compiling Deterministic Structure into SLM Harnesses
Zan Kai Chong, Hiroyuki Ohsaki, Bryan Ng

TL;DR
This paper introduces Semantic Gradient Descent (SGDe), a teacher-student framework in which a frontier LLM compiles agentic workflows for small language models (SLMs) into discrete execution plans, with sample-complexity bounds derived under PAC learning.
Contribution
It formalizes SGDe under PAC learning, demonstrating sample efficiency and significant accuracy gains over prompt optimization methods.
Findings
Achieves 91.3% accuracy with 5 examples and 99.3% with 3 examples on GSM-Hard.
Provides PAC sample-complexity bounds under which convergence is possible with as few as three training examples, using the teacher as a statistical prior.
Enhances deterministic workflow structuring through capability offloading and structural consensus.
Abstract
Enterprise SLM deployment faces epistemic asymmetry: small models cannot self-correct reasoning errors, while frontier LLMs incur prohibitive costs and data sovereignty risks at scale. We propose Semantic Gradient Descent (SGDe), a teacher-student framework that compiles agentic workflows into discrete execution plans: DAG topologies, system prompts, and deterministic code. The trailing "e" distinguishes this discrete, compilation-based approach from stochastic gradient descent. Operating in discrete semantic space, a frontier teacher generates natural-language critiques that serve as directional gradients to iteratively refine the SLM's workflow artefacts. We formalise SGDe under PAC learning, establishing sample-complexity bounds that enable convergence with as few as three training examples by leveraging the teacher as a statistical prior. On an adversarially synthesized GSM-Hard test…
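The critique-driven refinement loop described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the Workflow dataclass, run_student, teacher_critique, apply_critique, and refine are hypothetical names, the "student" is a stub that fails on large products unless arithmetic is offloaded to code, and the "teacher" is a stub that returns a fixed critique instead of calling a frontier LLM.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

Example = Tuple[str, int]  # (question of the form "a*b", expected answer)


@dataclass(frozen=True)
class Workflow:
    """Hypothetical stand-in for compiled artefacts: a system prompt plus a
    flag saying whether arithmetic is offloaded to deterministic code."""
    system_prompt: str
    offload_arithmetic: bool = False


def run_student(wf: Workflow, question: str) -> int:
    """Toy student (assumption): reliable only when arithmetic is offloaded."""
    a, b = (int(tok) for tok in question.split("*"))
    if wf.offload_arithmetic:
        return a * b  # deterministic code path
    # Simulated SLM behaviour: small products succeed, large ones fail.
    return a * b if a * b < 100 else -1


def teacher_critique(wf: Workflow, failures: List[Example]) -> Optional[str]:
    """Toy teacher (assumption): a natural-language critique, or None if
    there are no failures left to explain."""
    if not failures:
        return None
    return "Offload multiplication to deterministic code."


def apply_critique(wf: Workflow, critique: str) -> Workflow:
    """Discrete update: edit the artefact in the direction the critique names."""
    if "deterministic code" in critique:
        return replace(
            wf,
            offload_arithmetic=True,
            system_prompt=wf.system_prompt + " Use the calculator tool.",
        )
    return wf


def refine(wf: Workflow, train: List[Example], max_steps: int = 5) -> Workflow:
    """Iterate: run the student, collect failures, ask for a critique, update."""
    for _ in range(max_steps):
        failures = [(q, y) for q, y in train if run_student(wf, q) != y]
        critique = teacher_critique(wf, failures)
        if critique is None:
            break
        wf = apply_critique(wf, critique)
    return wf


# Three training examples, echoing the small-sample regime in the abstract.
train = [("3*4", 12), ("12*12", 144), ("25*40", 1000)]
compiled = refine(Workflow("Solve the problem."), train)
```

In this sketch the critique plays the role of a gradient only loosely: it names a direction of change, and apply_critique makes a discrete edit to the artefact rather than a numeric step. A real system would presumably replace both stubs with model calls and edit richer artefacts such as a DAG of steps.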
