SLM Finetuning for Natural Language to Domain Specific Code Generation in Production

Renjini R. Nair (Microsoft); Damian K. Kowalczyk (Microsoft); Marco Gaudesi (Microsoft); Chhaya Methani (Microsoft)

arXiv:2604.09952 · cs.LG · April 14, 2026

TL;DR

This paper demonstrates that fine-tuning small language models for domain-specific code generation improves performance and latency, offering an efficient alternative to large models in production environments.

Contribution

The paper evaluates fine-tuning small models such as Mistral for domain-specific code generation, showing improved accuracy, latency, and adaptability without degrading general performance.

Findings

01. Fine-tuned small models outperform larger models in test accuracy.

02. Fine-tuning enables quick adaptation to customer-specific scenarios.

03. Load testing confirms optimal latency and quality in production.

Abstract

Many applications today use large language models for code generation; however, production systems have strict latency requirements that can be difficult to meet with large models. Small language models with a few billion parameters are resource-efficient but may suffer from limited reasoning, hallucinations, or poor retention of longer context. Fine-tuning improves task-specific accuracy by embedding domain knowledge directly into model weights, reducing reliance on runtime context. We previously implemented a baseline natural-language-to-code generation approach using a retrieval-augmented generation pipeline that dynamically selected few-shot examples to embed domain-specific language context for a large language model. In this study, we evaluate small language models for generating domain-specific language from natural language by fine-tuning variants of Mistral and other models on…
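The baseline described in the abstract selects few-shot examples dynamically at request time. The abstract does not say how the selection works, so the following is only a minimal sketch under stated assumptions: bag-of-words cosine similarity stands in for whatever retriever the authors used, and the example pool, the DSL syntax, and the names `select_few_shot` and `build_prompt` are all hypothetical illustrations rather than anything taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_few_shot(query, pool, k=2):
    """Return the k pool entries whose natural-language side best matches the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        pool,
        key=lambda ex: cosine(query_vec, Counter(tokenize(ex["nl"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples):
    """Place the selected examples ahead of the new request."""
    parts = [f"Request: {ex['nl']}\nDSL: {ex['dsl']}" for ex in examples]
    parts.append(f"Request: {query}\nDSL:")
    return "\n\n".join(parts)


# Hypothetical example pool; the DSL syntax here is invented for illustration.
EXAMPLE_POOL = [
    {"nl": "send an email when a file is uploaded",
     "dsl": "on(file.uploaded) -> email.send()"},
    {"nl": "post a chat message every morning",
     "dsl": "every(day, 09:00) -> chat.post()"},
    {"nl": "archive a file when it is older than a year",
     "dsl": "on(file.age > 365d) -> file.archive()"},
]

query = "send an email when a new file is created"
chosen = select_few_shot(query, EXAMPLE_POOL, k=2)
prompt = build_prompt(query, chosen)
print(prompt)
```

A prompt built this way grows with every example it carries, which is the runtime context cost that fine-tuning is meant to reduce by moving domain knowledge into the model weights.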
