Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time
Anshul Nasery, Soumyadeep Thakur, Vihari Piratla, Abir De, Sunita Sarawagi

TL;DR
This paper introduces a Gradient Interpolation (GI) loss that regularizes the temporal complexity of a model with time-sensitive parameters, so that it generalizes better to the near future when the data distribution drifts gradually over time. GI outperforms existing methods.
Contribution
The paper proposes a simple, scalable GI loss for regularizing temporal changes in models, improving temporal generalization without relying on unlabeled future data.
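As a rough illustration of the idea, the sketch below shows one way a gradient-interpolation-style penalty could be written for a toy model that takes time as an input. The exact formulation used in the paper is not given in this summary, so everything here is an assumption made for illustration: the toy model `predict`, the finite-difference `time_derivative`, the first-order extrapolation from time `t - delta` to `t`, and the weighting `lam` are all hypothetical names and choices.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: the penalty compares the label with a prediction that is
# extrapolated to time t from a nearby earlier time t - delta, using the
# model's first-order change along time.

def predict(params, x, t):
    """Hypothetical toy model whose weight depends on time t."""
    w0, w1, b = params
    return (w0 + w1 * t * t) * x + b

def time_derivative(params, x, t, eps=1e-5):
    """Central finite-difference estimate of d predict / d t."""
    return (predict(params, x, t + eps) - predict(params, x, t - eps)) / (2 * eps)

def gi_style_loss(params, x, y, t, delta):
    """Squared error of the prediction extrapolated from t - delta to t."""
    t_past = t - delta
    extrapolated = predict(params, x, t_past) + delta * time_derivative(params, x, t_past)
    return (extrapolated - y) ** 2

def total_loss(params, x, y, t, delta, lam=0.5):
    """Ordinary squared error plus the weighted extrapolation penalty."""
    plain = (predict(params, x, t) - y) ** 2
    return plain + lam * gi_style_loss(params, x, y, t, delta)

if __name__ == "__main__":
    params = (1.0, 0.5, 0.0)
    print(total_loss(params, x=2.0, y=3.0, t=1.0, delta=0.5))
```

In this toy setup, a model that changes smoothly along time has a small penalty, because its first-order extrapolation from the recent past already lands close to the label. A model that changes abruptly is penalized more, which is the intuition behind regularizing temporal complexity.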
Findings
GI outperforms complex generative and adversarial methods.
GI surpasses simpler gradient regularization techniques.
GI achieves superior results on multiple real-world datasets.
Abstract
In several real-world applications, machine learning models are deployed to make predictions on data whose distribution changes gradually along time, leading to a drift between the train and test distributions. Such models are often re-trained on new data periodically, and they hence need to generalize to data not too far into the future. In this context, there is much prior work on enhancing temporal generalization, e.g., continuous transportation of past data, kernel-smoothed time-sensitive parameters and, more recently, adversarial learning of time-invariant features. However, these methods share several limitations, e.g., poor scalability, training instability, and dependence on unlabeled data from the future. Responding to the above limitations, we propose a simple method that starts with a model with time-sensitive parameters but regularizes its temporal complexity using a Gradient…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
