Computational Advantages of Multi-Grade Deep Learning: Convergence Analysis and Performance Insights
Ronglong Fang, Yuesheng Xu

TL;DR
This paper analyzes the computational benefits of multi-grade deep learning (MGDL) over single-grade models, providing convergence proofs and insights into its improved robustness and stability in image tasks.
Contribution
It offers the first convergence analysis for MGDL under gradient descent and explains its more stable training by analyzing the eigenvalue distributions of the Jacobian matrices of the gradient descent iterations.
Findings
MGDL outperforms SGDL in image regression, denoising, and deblurring.
MGDL is more robust to learning rate choices during training.
Eigenvalue analysis explains MGDL's enhanced stability.
Abstract
Multi-grade deep learning (MGDL) has been shown to significantly outperform standard single-grade deep learning (SGDL) across various applications. This work investigates the computational advantages of MGDL, focusing on its performance in image regression, denoising, and deblurring tasks and comparing it with SGDL. We establish convergence results for the gradient descent (GD) method applied to these models and provide mathematical insights into MGDL's improved performance. In particular, we demonstrate that MGDL is more robust to the choice of learning rate under GD than SGDL. Furthermore, we analyze the eigenvalue distributions of the Jacobian matrices associated with the iterative schemes arising from the GD iterations, offering an explanation for MGDL's enhanced training stability.
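The link between learning rate, Jacobian eigenvalues, and GD stability mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's MGDL or SGDL model: it assumes a simple quadratic loss f(x) = 0.5 * x^T A x with a hypothetical symmetric 2x2 matrix A, for which the GD map x -> x - lr * A x has the constant Jacobian I - lr * A. GD on this loss converges exactly when every eigenvalue of that Jacobian lies strictly inside (-1, 1). All function names and the example matrix are assumptions chosen for illustration.

```python
import math


def sym2x2_eigenvalues(a, b, d):
    """Eigenvalues (smallest, largest) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def gd_spectral_radius(a, b, d, lr):
    """Spectral radius of I - lr*A, the Jacobian of the GD map x -> x - lr*A x."""
    return max(abs(1.0 - lr * lam) for lam in sym2x2_eigenvalues(a, b, d))


def run_gd(a, b, d, lr, steps=200, x0=(1.0, 1.0)):
    """Run GD on f(x) = 0.5 * x^T A x; return the Euclidean norm of the final iterate."""
    x, y = x0
    for _ in range(steps):
        gx, gy = a * x + b * y, b * x + d * y  # gradient of f is A x
        x, y = x - lr * gx, y - lr * gy
    return math.hypot(x, y)


if __name__ == "__main__":
    # Hypothetical ill-conditioned example matrix A = [[10, 2], [2, 1]].
    a, b, d = 10.0, 2.0, 1.0
    for lr in (0.1, 0.25):
        rho = gd_spectral_radius(a, b, d, lr)
        final_norm = run_gd(a, b, d, lr)
        status = "stable" if rho < 1.0 else "unstable"
        print(f"lr={lr}: spectral radius={rho:.4f} ({status}), final |x|={final_norm:.3e}")
```

In this toy setting the admissible learning rates are those below 2 divided by the largest eigenvalue of A, so a narrow eigenvalue spread leaves more room for the step size. The paper's analysis concerns the Jacobians arising from GD on the actual MGDL and SGDL training problems, which this linear example only loosely mirrors.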
