Machine Learning and CPU (Central Processing Unit) Scheduling Co-Optimization over a Network of Computing Centers

Mohammadreza Doostmohammadian; Zulfiya R. Gabidullina; Hamid R. Rabiee

arXiv:2510.25176 · cs.LG · October 30, 2025

TL;DR

This paper presents a co-optimization algorithm for distributed machine learning over a network of computing centers. It jointly schedules CPU resources and trains each node on its local data, with proven convergence and an over 50% reduction in the cost optimality gap compared to existing solutions.

Contribution

It introduces a new distributed co-optimization framework for CPU scheduling and data processing in networked computing centers, with convergence guarantees and quantization handling.

Findings

01 Over 50% reduction in cost optimality gap compared to existing solutions

02 Convergence proven using Lyapunov stability and eigen-spectrum analysis

03 Algorithm supports time-varying networks and log-quantized data exchange
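To make the log-quantized data exchange in the last finding concrete, here is a minimal Python sketch of one common form of logarithmic quantizer. This summary does not give the paper's exact quantizer, so the function name `log_quantize`, the parameter `rho`, and the rounding rule are illustrative assumptions.

```python
import math


def log_quantize(z: float, rho: float = 0.125) -> float:
    """Hypothetical logarithmic quantizer (an assumption, not taken from the paper).

    Maps z to sign(z) * exp(rho * k), where k is the integer nearest to
    log|z| / rho. Quantization levels are therefore uniform on a log scale.
    """
    if z == 0.0:
        return 0.0  # zero is kept exactly; log(0) is undefined
    level = round(math.log(abs(z)) / rho)
    return math.copysign(math.exp(rho * level), z)


def relative_error_bound(rho: float) -> float:
    """Worst-case relative error |q(z) - z| / |z| of log_quantize for a given rho."""
    return math.exp(rho / 2.0) - 1.0
```

Under these assumptions the quantization error scales with the magnitude of the value being sent, so it shrinks as the exchanged quantity shrinks. That is one reason log-scale quantizers are often paired with consensus-type schemes, in contrast to uniform quantizers whose error has a fixed size.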

Abstract

In the rapidly evolving research on artificial intelligence (AI), the demand for fast, computationally efficient, and scalable solutions has increased in recent years. The problem of optimizing the computing resources for distributed machine learning (ML) and optimization is considered in this paper. Given a set of data distributed over a network of computing nodes/servers, the idea is to optimally assign the CPU (central processing unit) usage while simultaneously training each computing node locally via its own share of data. This formulates the problem as a co-optimization setup to (i) optimize the data processing and (ii) optimally allocate the computing resources. The information-sharing network among the nodes might be time-varying, but with balanced weights to ensure consensus-type convergence of the algorithm. The algorithm is all-time feasible, which implies that the computing…
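The role of balanced weights can be illustrated with a small Python sketch. It is not the paper's algorithm: it covers only the resource-allocation part, and it assumes that "all-time feasible" refers to the total allocated resource staying constant at every iteration. The quadratic costs, the directed-ring weights, the step size, and all names (`allocation_step`, `run`, `W`, `a`, `c`) are hypothetical.

```python
def allocation_step(x, grads, W, eta):
    """One Laplacian-gradient step: x_i -= eta * sum_j W[i][j] * (g_i - g_j).

    If W is weight-balanced (each row sum equals the matching column sum),
    the updates cancel in aggregate, so sum(x) is unchanged by the step.
    """
    n = len(x)
    return [
        x[i] - eta * sum(W[i][j] * (grads[i] - grads[j]) for j in range(n))
        for i in range(n)
    ]


def run(x0, a, c, W, eta=0.05, steps=500):
    """Iterate with assumed local costs f_i(x) = a_i * (x - c_i)**2."""
    x = list(x0)
    for _ in range(steps):
        grads = [2.0 * a[i] * (x[i] - c[i]) for i in range(len(x))]
        x = allocation_step(x, grads, W, eta)
    return x


# Directed ring on 4 nodes: node i listens to node (i + 1) % 4 with weight 1.
# Every row and every column sums to 1, so W is balanced but not symmetric.
n = 4
W = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
a = [1.0, 2.0, 0.5, 1.5]   # assumed cost curvatures
c = [1.0, 2.0, 3.0, 4.0]   # assumed preferred CPU shares
x0 = [3.0, 3.0, 3.0, 3.0]  # feasible start: total resource is 12
x_final = run(x0, a, c, W)
```

In this toy setup every iterate keeps the same total as `x0`, so stopping early still yields a feasible allocation, and the iteration settles where all nodes have equal marginal cost.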
