Comparative Study of Language Models on Cross-Domain Data with Model Agnostic Explainability

Mayank Chhipa; Hrushikesh Mahesh Vazurkar; Abhijeet Kumar; Mridul Mishra

arXiv:2009.04095·cs.CL·September 10, 2020

Open Access

TL;DR

This paper systematically compares various transformer-based language models on cross-domain datasets, introduces model-agnostic explainability, and achieves new state-of-the-art results in sentiment and rating classification tasks.

Contribution

It provides a comprehensive comparison of BERT, ELECTRA, RoBERTa, ALBERT, and DistilBERT on non-GLUE datasets and introduces a model-agnostic explainability approach.

Findings

1. Achieved new state-of-the-art accuracy on Yelp 2013 and Financial Phrasebank datasets.

2. Demonstrated the effectiveness of model-agnostic explainability in verifying context capturing.

3. Provided insights for industry researchers on model selection based on performance and efficiency.

Abstract

With the recent influx of bidirectional contextualized transformer language models in NLP, a systematic comparative study of these models on a variety of datasets has become necessary. In addition, the performance of these language models has not been explored on non-GLUE datasets. The study presented in this paper compares state-of-the-art language models: BERT, ELECTRA, and the BERT derivatives RoBERTa, ALBERT and DistilBERT. We conducted experiments by fine-tuning these models on cross-domain and disparate data and provide an in-depth analysis of the models' performance. Moreover, we present an explainability method for language models that is coherent with pretraining and verifies the context-capturing capabilities of these models through a model-agnostic approach. The experimental results establish a new state of the art for the Yelp 2013 rating classification task and Financial Phrasebank…
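The abstract does not spell out how the model-agnostic explanation is computed. As a rough illustration only, the sketch below uses token masking (occlusion), one common black-box technique: each token is replaced with a placeholder and the drop in the classifier's score is taken as that token's importance. The names `token_importance`, `toy_sentiment`, and the `[MASK]` placeholder are assumptions made for this example and are not taken from the paper; the toy scorer stands in for a fine-tuned model.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"  # placeholder token; an assumption, not taken from the paper


def token_importance(
    tokens: List[str],
    predict_proba: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Score each token by how much masking it lowers the model's output.

    predict_proba is only ever called as a black box, which is what makes
    the procedure independent of the underlying model.
    """
    baseline = predict_proba(tokens)
    scores = []
    for i, tok in enumerate(tokens):
        # Replace exactly one token and re-score the perturbed input.
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        scores.append((tok, baseline - predict_proba(masked)))
    return scores


def toy_sentiment(tokens: List[str]) -> float:
    """Hypothetical stand-in for a fine-tuned classifier (positive-class score)."""
    positive = {"great", "good"}
    negative = {"bad", "poor"}
    score = 0.5
    for t in tokens:
        if t in positive:
            score += 0.2
        elif t in negative:
            score -= 0.2
    return min(1.0, max(0.0, score))


scores = token_importance("the food was great".split(), toy_sentiment)
for token, delta in scores:
    print(f"{token:>6}  {delta:+.2f}")
```

Because `token_importance` needs only a scoring function, the same loop could in principle wrap any of the compared models by passing a function that tokenizes the input and returns a class probability.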

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare

Methods: Linear Layer · LAMB · ELECTRA · ALBERT · Layer Normalization · Weight Decay · Dropout · Linear Warmup With Linear Decay · RoBERTa · Dense Connections