Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability of Language Models
Nancy Tyagi, Surjodeep Sarkar, Manas Gaur

TL;DR
This paper proposes a knowledge-guided reinforcement learning ensemble method that integrates external knowledge sources to improve the reliability and accuracy of language models across multiple NLP benchmarks.
Contribution
It introduces a novel ensemble approach leveraging reinforcement learning and external knowledge graphs to enhance language model reliability, addressing a gap in current evaluation metrics.
Findings
Ensembling improves reliability and accuracy scores across nine GLUE datasets.
On those datasets, the method is reported to outperform the state of the art.
External knowledge integration enhances model robustness.
Abstract
The Natural Language Processing (NLP) community has been using crowdsourcing techniques to create benchmark datasets such as General Language Understanding Evaluation (GLUE) for training modern Language Models (LMs) such as BERT. GLUE tasks measure reliability scores using inter-annotator metrics, i.e., Cohen's Kappa. However, the reliability aspect of LMs has often been overlooked. To counter this problem, we explore a knowledge-guided LM ensembling approach that leverages reinforcement learning to integrate knowledge from ConceptNet and Wikipedia as knowledge graph embeddings. This approach mimics human annotators resorting to external knowledge to compensate for information deficits in the datasets. Across nine GLUE datasets, our research shows that ensembling strengthens reliability and accuracy scores, outperforming the state of the art.
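For readers unfamiliar with the metric named in the abstract, Cohen's Kappa compares the observed agreement between two raters, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that standard formula, not the authors' evaluation code. The function name `cohens_kappa` and the toy labels are illustrative assumptions, and the abstract does not specify how the paper applies the metric to model outputs.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length label sequences (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    # Degenerate case: both raters always use the same single label.
    if p_e == 1.0:
        return 1.0

    return (p_o - p_e) / (1.0 - p_e)


# Toy example (made-up labels, not data from the paper).
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(rater_1, rater_2)
print(kappa)  # p_o = 0.75, p_e = 0.5, so kappa = 0.5
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.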
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Mobile Crowdsensing and Crowdsourcing
Methods: Attention Is All You Need · Residual Connection · Weight Decay · Linear Layer · Attention Dropout · WordPiece · Softmax · Dense Connections · Layer Normalization
