Can Language Model Understand Word Semantics as A Chatbot? An Empirical Study of Language Model Internal External Mismatch

Jinman Zhao; Xueyan Zhang; Xingyu Yue; Weizhe Chen; Zifan Qian; Ruiyu Wang

arXiv:2409.13972·cs.CL·September 24, 2024

TL;DR

This paper investigates the gap between what language models encode about word semantics internally and what they express externally through full inference, comparing Encoder-only, Decoder-only, and Encoder-Decoder architectures in an empirical study.

Contribution

It provides a comprehensive empirical analysis of internal and external semantic mismatches in different pre-trained language model architectures.

Findings

01 Identifies significant internal-external semantic mismatches in models

02 Highlights architecture-specific differences in semantic understanding

03 Provides insights into model internal representations versus external outputs

Abstract

The most common way to interact with language models today is through full inference. This approach may not necessarily align with the model's internal knowledge. Studies show discrepancies between prompt-based outputs and internal representations, but most focus on sentence understanding. We study the internal-external mismatch in word semantics understanding across Encoder-only, Decoder-only, and Encoder-Decoder pre-trained language models.

Taxonomy

Topics: Topic Modeling · AI in Service Interactions

Methods: Focus · ALIGN