DNN-Powered MLOps Pipeline Optimization for Large Language Models: A Framework for Automated Deployment and Resource Management
Mahesh Vaijainthymala Krishnamoorthy, Kuppusamy Vellamadam Palavesam, Siva Venkatesh Arcot, Rajarajeswari Chinniah Kuppuswami

TL;DR
This paper introduces a DNN-based framework that automates and optimizes the deployment and resource management of Large Language Models, significantly improving efficiency and reducing costs in diverse cloud environments.
Contribution
It presents a novel DNN-powered framework with adaptive resource allocation and deployment orchestration for scalable, cost-effective LLM operations, advancing automated MLOps capabilities.
Findings
40% improvement in resource utilization
35% reduction in deployment latency
30% decrease in operational costs
Abstract
The exponential growth in the size and complexity of Large Language Models (LLMs) has introduced unprecedented challenges in their deployment and operational management. Traditional MLOps approaches often fail to efficiently handle the scale, resource requirements, and dynamic nature of these models. This research presents a novel framework that leverages Deep Neural Networks (DNNs) to optimize MLOps pipelines specifically for LLMs. Our approach introduces an intelligent system that automates deployment decisions, resource allocation, and pipeline optimization while maintaining optimal performance and cost efficiency. Through extensive experimentation across multiple cloud environments and deployment scenarios, we demonstrate significant improvements: 40% enhancement in resource utilization, 35% reduction in deployment latency, and 30% decrease in operational costs compared to…
Taxonomy
Topics: Service-Oriented Architecture and Web Services · Distributed and Parallel Computing Systems · Robotics and Automated Systems
