Model Callers for Transforming Predictive and Generative AI Applications
Mukesh Dalal

TL;DR
This paper presents a new 'model caller' abstraction that improves AI/ML model deployment by enhancing accuracy, reducing latency, and streamlining development, with a prototype Python library for practical use.
Contribution
Introduction of the 'model caller' abstraction and a prototype Python library to improve AI/ML model deployment and management.
Findings
Enhanced prediction accuracy and reduced latency.
Improved model monitoring and observability.
Simplified AI system architecture and development processes.
Abstract
We introduce a novel software abstraction, the "model caller," which acts as an intermediary for calls to AI and ML models, and we argue that its utility extends well beyond existing model-serving frameworks. The abstraction offers several advantages: enhanced accuracy and reduced latency in model predictions, superior monitoring and observability of models, more streamlined AI system architectures, simplified AI development and management processes, and improved collaboration and accountability across AI/ML/Data Science, software, data, and operations teams. Model callers are valuable to both creators and users of models, in predictive as well as generative AI applications. We have also released a prototype Python library for model callers, which can be installed via pip or downloaded from GitHub.
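To make the idea of an intermediary concrete, the following is a minimal sketch of what a model caller could look like. It is not the API of the released library: the names `ModelCaller`, `CallRecord`, `mean_latency`, and `toy_model` are hypothetical, and the sketch covers only the monitoring aspect (logging each call and its latency), not the accuracy or latency improvements described in the abstract.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class CallRecord:
    """One logged call: the input, the output, and the model's latency."""
    inputs: Any
    output: Any
    latency_s: float


@dataclass
class ModelCaller:
    """Hypothetical intermediary between client code and a model.

    Assumption: the model is any callable taking one argument. The real
    library may define this abstraction quite differently.
    """
    model: Callable[[Any], Any]
    history: List[CallRecord] = field(default_factory=list)

    def __call__(self, inputs: Any) -> Any:
        # Time the underlying model call and record it for observability.
        start = time.perf_counter()
        output = self.model(inputs)
        elapsed = time.perf_counter() - start
        self.history.append(CallRecord(inputs, output, elapsed))
        return output

    def mean_latency(self) -> float:
        # Average latency in seconds over all recorded calls (0.0 if none).
        if not self.history:
            return 0.0
        return sum(r.latency_s for r in self.history) / len(self.history)


def toy_model(x: float) -> float:
    # Stand-in for a real predictive model (illustrative only).
    return 2.0 * x + 1.0


caller = ModelCaller(toy_model)
predictions = [caller(x) for x in (0.0, 1.0, 2.0)]
```

Because client code calls `caller(x)` rather than the model directly, the model behind it can be swapped or instrumented without changing the call sites, which is one way to read the abstract's claim about simpler architectures.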
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)
Methods: Lib
