Dynamic Template Selection for Output Token Generation Optimization: MLP-Based and Transformer Approaches

Bharadwaj Yadavalli

arXiv:2511.20683 · cs.CL · November 27, 2025

Open Access

TL;DR

This paper introduces Dynamic Template Selection (DTS), a cost-efficient method for adaptive response generation in large language models. A router, either an MLP or a transformer, matches a response template to each query's complexity, reducing token usage without sacrificing response quality.

Contribution

The paper presents a formal framework for template selection, compares MLP and transformer routing methods, and empirically demonstrates that routing decisions generalize across multiple LLM providers.

Findings

01  The MLP router achieves 90.5% routing accuracy, outperforming the fine-tuned RoBERTa router's 89.5%.

02  Token reduction ranges from 32.6% to 33.9% across providers.

03  Routing decisions generalize effectively across different LLMs.
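To make the token-reduction finding concrete, the arithmetic can be sketched as below. This is only an illustration: the function names, the baseline token volume, and the per-million-token price are hypothetical placeholders, not figures from the paper. Only the 32.6% and 33.9% reduction rates come from the findings above.

```python
def output_cost(tokens: float, price_per_million: float) -> float:
    """Cost of generating `tokens` output tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def saving_from_reduction(baseline_tokens: float,
                          reduction: float,
                          price_per_million: float) -> float:
    """Absolute saving when output tokens shrink by the fraction `reduction`."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    baseline = output_cost(baseline_tokens, price_per_million)
    reduced = output_cost(baseline_tokens * (1.0 - reduction), price_per_million)
    return baseline - reduced


if __name__ == "__main__":
    # Hypothetical workload and price, chosen only for illustration.
    BASELINE_TOKENS = 1_000_000
    PRICE_PER_MILLION = 10.0
    # Reduction rates reported in the findings.
    for rate in (0.326, 0.339):
        saved = saving_from_reduction(BASELINE_TOKENS, rate, PRICE_PER_MILLION)
        print(f"reduction {rate:.1%}: saves {saved:.2f} per million baseline tokens")
```

Because the saving scales linearly with the output price, the same percentage reduction is worth more on providers that charge more for output tokens.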

Abstract

Contemporary large language model deployments typically employ uniform prompting strategies across diverse query types, applying verbose response patterns to both complex analytical tasks and straightforward factual questions. This one-size-fits-all methodology leads to substantial token inefficiency, a concern amplified by the significant cost differential between input and output tokens, with the latter commanding 4-8x higher prices across major providers. We present Dynamic Template Selection (DTS), which adaptively matches response templates to query complexity, achieving significant cost reductions without compromising response quality. We compare two routing approaches: a simple MLP that uses pre-computed embeddings and a more complex fine-tuned RoBERTa transformer. Through comprehensive evaluation on 1,000 MMLU questions, we find that the MLP router achieves 90.5% routing accuracy…
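The MLP routing idea in the abstract can be illustrated with a minimal sketch: a small feed-forward network takes a pre-computed query embedding and picks a response template. Everything specific here is an assumption made for illustration and is not taken from the paper: the template names, the layer sizes, the single hidden layer, and the random (untrained) weights. A real router would be trained on labeled queries.

```python
import math
import random

# Hypothetical template names; the source does not list its templates.
TEMPLATES = ["concise", "standard", "detailed"]


def init_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Create randomly initialized (untrained) weights for a one-hidden-layer MLP."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
        "W2": matrix(out_dim, hidden_dim), "b2": [0.0] * out_dim,
    }


def linear(weights, bias, x):
    """Affine map: one dot product per output unit, plus its bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def route(params, embedding):
    """Return the selected template name and the per-template probabilities."""
    hidden = [max(0.0, v) for v in linear(params["W1"], params["b1"], embedding)]  # ReLU
    probs = softmax(linear(params["W2"], params["b2"], hidden))
    best = max(range(len(probs)), key=probs.__getitem__)
    return TEMPLATES[best], probs


if __name__ == "__main__":
    params = init_mlp(in_dim=8, hidden_dim=4, out_dim=len(TEMPLATES))
    query_embedding = [0.1] * 8  # stand-in for a pre-computed embedding
    template, probs = route(params, query_embedding)
    print(template, [round(p, 3) for p in probs])
```

Because the embedding is computed once upstream, the router itself only performs two small matrix-vector products per query, which is the sense in which such a router is "simple" relative to a fine-tuned transformer.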

Taxonomy

Topics: Natural Language Processing Techniques · Graph Theory and Algorithms · Machine Learning and Data Classification