Cognitive Platform Engineering for Autonomous Cloud Operations
Vinoth Punniyamoorthy, Nitin Saksena, Srivenkateswara Reddy Sankiti, Nachiappan Chockalingam, Aswathnarayan Muthukrishnan Kirubakaran, Shiva Kumar Reddy Carimireddy, Durgaraman Maruthavanan

TL;DR
This paper presents Cognitive Platform Engineering, a new paradigm integrating sensing, reasoning, and autonomous actions into cloud platform management to improve efficiency, resilience, and compliance in dynamic cloud-native environments.
Contribution
It introduces a four-plane reference architecture unifying data collection, inference, policy orchestration, and human interaction, with a prototype demonstrating tangible operational improvements.
Findings
Reduced mean time to resolution
Enhanced resource efficiency
Improved compliance and resilience
Abstract
Modern DevOps practices have accelerated software delivery through automation, CI/CD pipelines, and observability tooling,but these approaches struggle to keep pace with the scale and dynamism of cloud-native systems. As telemetry volume grows and configuration drift increases, traditional, rule-driven automation often results in reactive operations, delayed remediation, and dependency on manual expertise. This paper introduces Cognitive Platform Engineering, a next-generation paradigm that integrates sensing, reasoning, and autonomous action directly into the platform lifecycle. This paper propose a four-plane reference architecture that unifies data collection, intelligent inference, policy-driven orchestration, and human experience layers within a continuous feedback loop. A prototype implementation built with Kubernetes, Terraform, Open Policy Agent, and ML-based anomaly detection…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware System Performance and Reliability · IoT and Edge/Fog Computing · Cognitive Computing and Networks
