Interfaze: The Future of AI is built on Task-Specific Small Models
Harsha Vardhan Khurdula, Vineet Agarwal, and Yoeven D Khemlani

TL;DR
Interfaze is a modular system that combines task-specific small models, external tools, and a thin controller, with the aim of making AI applications more flexible and efficient across diverse tasks.
Contribution
The paper presents Interfaze, an architecture that integrates heterogeneous small models, external data sources, and a control layer behind a single OpenAI-style endpoint, so that a user-selected LLM receives distilled context rather than raw inputs.
Findings
Achieves high accuracy on multiple benchmarks (e.g., 83.6% on MMLU-Pro).
Handles most queries with small models and tools, reducing reliance on large LLMs.
Demonstrates strong multimodal performance across various datasets.
Abstract
We present Interfaze, a system that treats modern LLM applications as a problem of building and acting over context, not just picking the right monolithic model. Instead of a single transformer, we combine (i) a stack of heterogeneous DNNs paired with small language models as perception modules for OCR involving complex PDFs, charts and diagrams, and multilingual ASR with (ii) a context-construction layer that crawls, indexes, and parses external sources (web pages, code, PDFs) into compact structured state, and (iii) an action layer that can browse, retrieve, execute code in a sandbox, and drive a headless browser for dynamic web pages. A thin controller sits on top of this stack and exposes a single, OpenAI-style endpoint: it decides which small models and actions to run and always forwards the distilled context to a user-selected LLM that produces the final response. On this…
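The controller pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `Controller`, the keyword-based routing table, and the stub tools `fake_ocr` and `fake_search` are all hypothetical, and the abstract does not say how the real controller selects models or actions. The sketch only shows the overall shape: pick small models or actions, collect their outputs into compact context, and always forward that context to a user-selected LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types; none of these names come from the paper.
Tool = Callable[[str], str]  # a small model or action: query -> distilled text
LLM = Callable[[str], str]   # the user-selected LLM: prompt -> final response


@dataclass
class Controller:
    tools: Dict[str, Tool]
    # Toy routing rule (an assumption): trigger keyword -> tool name.
    routes: Dict[str, str] = field(default_factory=dict)

    def plan(self, query: str) -> List[str]:
        """Return the names of the tools whose trigger keyword appears in the query."""
        q = query.lower()
        selected: List[str] = []
        for keyword, name in self.routes.items():
            if keyword in q and name in self.tools and name not in selected:
                selected.append(name)
        return selected

    def build_context(self, query: str) -> str:
        """Run the selected tools and join their outputs into compact context."""
        return "\n".join(f"[{name}] {self.tools[name](query)}" for name in self.plan(query))

    def respond(self, query: str, llm: LLM) -> str:
        """Always forward the distilled context plus the query to the chosen LLM."""
        prompt = f"Context:\n{self.build_context(query)}\n\nQuestion: {query}"
        return llm(prompt)


# Stub tools standing in for perception modules and actions (assumptions).
def fake_ocr(query: str) -> str:
    return "invoice total: 42 USD"


def fake_search(query: str) -> str:
    return "top result: example.org"


controller = Controller(
    tools={"ocr": fake_ocr, "web_search": fake_search},
    routes={"pdf": "ocr", "scan": "ocr", "latest": "web_search"},
)

# An identity function stands in for the user-selected LLM so the prompt is visible.
print(controller.respond("Read the total from this scanned PDF", llm=lambda p: p))
```

Keeping the LLM as a plain callable mirrors the abstract's point that the final model is chosen by the user, while the controller and its tools stay the same.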
Taxonomy
Topics: Handwritten Text Recognition Techniques · Web Data Mining and Analysis · Natural Language Processing Techniques
