Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint
Wei Liu, Jun Wang, Haozhao Wang, Ruixuan Li, Yang Qiu, YuanKai Zhang,, Jie Han, Yixiong Zou

TL;DR
This paper introduces a novel method called DR that decouples the generator and predictor with asymmetric learning rates to effectively prevent degeneration in rationalization models by controlling the predictor's Lipschitz constant.
Contribution
The paper theoretically links degeneration to the predictor's Lipschitz continuity and proposes a decoupled, asymmetric learning rate approach to address this issue.
Findings
DR effectively restrains predictor Lipschitz constant
Method improves rationalization quality on benchmarks
Decoupling enhances model stability and interpretability
Abstract
A self-explaining rationalization model is generally constructed by a cooperative game where a generator selects the most human-intelligible pieces from the input text as rationales, followed by a predictor that makes predictions based on the selected rationales. However, such a cooperative game may incur the degeneration problem where the predictor overfits to the uninformative pieces generated by a not yet well-trained generator and in turn, leads the generator to converge to a sub-optimal model that tends to select senseless pieces. In this paper, we theoretically bridge degeneration with the predictor's Lipschitz continuity. Then, we empirically propose a simple but effective method named DR, which can naturally and flexibly restrain the Lipschitz constant of the predictor, to address the problem of degeneration. The main idea of DR is to decouple the generator and predictor to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies
