Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks
Surgan Jandial, Ayush Chopra, Mausoom Sarkar, Piyush Gupta, Balaji Krishnamurthy, Vineeth Balasubramanian

TL;DR
This paper introduces a retrospective loss function that leverages past model states during training to enhance deep neural network performance across various domains and architectures.
Contribution
The paper proposes a novel retrospective loss that utilizes prior model states to improve training effectiveness of deep neural networks.
Findings
Improved performance across images, speech, text, and graphs.
Effective in various architectures and tasks.
Improves training by pushing the current parameters toward the optimal state and away from a past parameter state.
Abstract
Deep neural networks (DNNs) are powerful learning machines that have enabled breakthroughs in several domains. In this work, we introduce a new retrospective loss to improve the training of deep neural network models by utilizing the prior experience available in past model states during training. Minimizing the retrospective loss, along with the task-specific loss, pushes the parameter state at the current training step towards the optimal parameter state while pulling it away from the parameter state at a previous training step. Although the idea is simple, we analyze the method and conduct comprehensive sets of experiments across domains - images, speech, text, and graphs - to show that the proposed loss results in improved performance across input domains, tasks, and architectures.
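The push/pull behavior described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so everything here is an assumption: the comparison is made on model outputs rather than directly on parameters, the ground-truth target stands in for the unknown optimal state, the distance is L1, and the weight `kappa` and all function names are hypothetical.

```python
def l1_distance(a, b):
    """Sum of absolute element-wise differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrospective_loss(current_out, past_out, target, kappa=2.0):
    """Illustrative retrospective term (assumed form, not taken from the abstract).

    Lower when the current output is close to the target (proxy for the
    optimal state) and far from the output of a past model snapshot.
    """
    pull_to_target = l1_distance(current_out, target)   # want this small
    push_from_past = l1_distance(current_out, past_out)  # want this large
    return (kappa + 1.0) * pull_to_target - kappa * push_from_past


def total_loss(task_loss, current_out, past_out, target, kappa=2.0):
    """Task-specific loss plus the illustrative retrospective term."""
    return task_loss + retrospective_loss(current_out, past_out, target, kappa)


# Usage: the past snapshot predicted [0.5, 0.5]; the target is [1.0, 0.0].
past = [0.5, 0.5]
target = [1.0, 0.0]
improved = retrospective_loss([0.9, 0.1], past, target)   # moved toward target
stalled = retrospective_loss([0.6, 0.4], past, target)    # barely moved
```

In this sketch an output that has moved toward the target and away from the past snapshot receives a lower retrospective term than one that has stayed near the past snapshot, which matches the direction of the behavior the abstract describes.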
