Less is More: A Lightweight and Robust Neural Architecture for Discourse Parsing
Ming Li, Ruihong Huang

TL;DR
This paper introduces a lightweight neural architecture for discourse parsing that places learnable self-attention modules on top of pretrained language models, and reports better generalizability and efficiency than models built on complex feature extractors.
Contribution
It proposes a simplified neural architecture that removes multiple complex feature extractors and keeps only two learnable self-attention layers over a pretrained language model, reducing complexity while maintaining or improving discourse parsing performance.
Findings
Generalizes better and is more robust than models built on complex feature extractors.
Achieves comparable or better system performance with fewer parameters.
Reduces processing time significantly.
Abstract
Complex feature extractors are widely employed for text representation building. However, these complex feature extractors make the NLP systems prone to overfitting especially when the downstream training datasets are relatively small, which is the case for several discourse parsing tasks. Thus, we propose an alternative lightweight neural architecture that removes multiple complex feature extractors and only utilizes learnable self-attention modules to indirectly exploit pretrained neural language models, in order to maximally preserve the generalizability of pre-trained language models. Experiments on three common discourse parsing tasks show that powered by recent pretrained language models, the lightweight architecture consisting of only two self-attention layers obtains much better generalizability and robustness. Meanwhile, it achieves comparable or even better system performance…
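To make the idea in the abstract concrete, here is a minimal sketch of two self-attention layers stacked on fixed input vectors. It is an illustration under stated assumptions, not the authors' implementation: the random vectors stand in for the outputs of a pretrained language model, and the single-head design, the dimension of 8, the absence of residual connections, and all names (SelfAttentionLayer, encode, frozen_embeddings) are hypothetical choices made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SelfAttentionLayer:
    """Single-head scaled dot-product self-attention (hypothetical layout)."""

    def __init__(self, dim, rng):
        self.dim = dim
        self.w_q = init_matrix(dim, dim, rng)  # query projection
        self.w_k = init_matrix(dim, dim, rng)  # key projection
        self.w_v = init_matrix(dim, dim, rng)  # value projection

    def __call__(self, vectors):
        queries = [matvec(self.w_q, v) for v in vectors]
        keys = [matvec(self.w_k, v) for v in vectors]
        values = [matvec(self.w_v, v) for v in vectors]
        outputs = []
        for q in queries:
            # Similarity of this position to every position, scaled by sqrt(dim).
            scores = [
                sum(a * b for a, b in zip(q, k)) / math.sqrt(self.dim)
                for k in keys
            ]
            weights = softmax(scores)
            # Weighted average of the value vectors.
            mixed = [
                sum(w * val[i] for w, val in zip(weights, values))
                for i in range(self.dim)
            ]
            outputs.append(mixed)
        return outputs


def encode(frozen_embeddings, layers):
    """Pass fixed input vectors through the attention layers in order."""
    hidden = frozen_embeddings
    for layer in layers:
        hidden = layer(hidden)
    return hidden


rng = random.Random(0)
dim = 8
# Assumption: 5 random vectors replace real pretrained language model outputs.
frozen_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(5)]
layers = [SelfAttentionLayer(dim, rng), SelfAttentionLayer(dim, rng)]
outputs = encode(frozen_embeddings, layers)
print(len(outputs), len(outputs[0]))  # prints "5 8"
```

In this sketch the only trainable weights are the three projection matrices in each layer, which is the sense in which such a design stays small relative to a stack of additional feature extractors; how the actual system is trained and how it consumes the language model are not specified in the text above.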
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
