Model Callers for Transforming Predictive and Generative AI Applications

Mukesh Dalal

arXiv:2406.15377 · cs.CY · June 25, 2024

Open Access

TL;DR

This paper presents a new 'model caller' abstraction that improves AI/ML model deployment by enhancing accuracy, reducing latency, and streamlining development, with a prototype Python library for practical use.

Contribution

Introduction of the 'model caller' abstraction and a prototype Python library to improve AI/ML model deployment and management.

Findings

1. Enhanced prediction accuracy and reduced latency.
2. Improved model monitoring and observability.
3. Simplified AI system architecture and development processes.

Abstract

We introduce a novel software abstraction termed "model caller," which acts as an intermediary for calling AI and ML models, and we argue that its utility extends beyond that of existing model-serving frameworks. This abstraction offers multiple advantages: enhanced accuracy and reduced latency in model predictions, superior monitoring and observability of models, more streamlined AI system architectures, simplified AI development and management processes, and improved collaboration and accountability across AI/ML/Data Science, software, data, and operations teams. Model callers are valuable to both creators and users of models, in both predictive and generative AI applications. Additionally, we have developed and released a prototype Python library for model callers, which can be installed via pip or downloaded from GitHub.
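To make the idea of an intermediary concrete, here is a minimal Python sketch of a wrapper that sits between application code and a model and records every call. It is an illustration under stated assumptions, not the API of the released library: the names `ModelCaller`, `CallRecord`, and `toy_model` are hypothetical, and the sketch covers only the monitoring aspect mentioned in the abstract, not the claimed accuracy or latency gains.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class CallRecord:
    """One logged call: inputs, output, and elapsed wall-clock seconds."""
    args: tuple
    output: Any
    seconds: float


@dataclass
class ModelCaller:
    """Hypothetical intermediary between application code and a model.

    This is an assumed design for illustration only; it does not
    reproduce the interface of the paper's prototype library.
    """
    model: Callable[..., Any]
    history: List[CallRecord] = field(default_factory=list)

    def __call__(self, *args: Any) -> Any:
        # Time the underlying model call and keep a record of it.
        start = time.perf_counter()
        output = self.model(*args)
        elapsed = time.perf_counter() - start
        self.history.append(CallRecord(args, output, elapsed))
        return output

    def call_count(self) -> int:
        """Number of calls routed through this caller so far."""
        return len(self.history)


def toy_model(x: float) -> float:
    """Stand-in for a real predictive model (hypothetical)."""
    return 2.0 * x + 1.0


# Application code calls the caller, never the model directly.
caller = ModelCaller(toy_model)
prediction = caller(3.0)
```

Because every request passes through one object, the application can inspect `caller.history` for inputs, outputs, and timings without changing the model itself.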

Taxonomy

Topics: Machine Learning and Data Classification · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)

Methods: Lib