Towards explainable evaluation of language models on the semantic similarity of visual concepts
Maria Lymperaiou, George Manoliadis, Orfeas Menis Mastromichalakis, Edmund G. Dervakos, Giorgos Stamou

TL;DR
This paper investigates the robustness and explainability of evaluation metrics for language models on semantic similarity tasks involving visual concepts, proposing new metrics and analyzing vulnerabilities.
Contribution
It introduces explainable evaluation metrics for semantic similarity, revealing limitations of existing methods and exposing model vulnerabilities through adversarial interventions.
Findings
Existing metrics lack explainability and robustness.
Proposed metrics offer detailed insights into model performance.
Adversarial attacks reveal vulnerabilities in semantic representations.
Abstract
Recent breakthroughs in NLP research, such as the advent of Transformer models, have indisputably contributed to major advancements in several tasks. However, few works research robustness and explainability issues of their evaluation strategies. In this work, we examine the behavior of high-performing pre-trained language models, focusing on the task of semantic similarity for visual vocabularies. First, we address the need for explainable evaluation metrics, necessary for understanding the conceptual quality of retrieved instances. Our proposed metrics provide valuable insights at both the local and the global level, showcasing the shortcomings of widely used approaches. Second, adversarial interventions on salient query semantics expose vulnerabilities of opaque metrics and highlight patterns in learned linguistic representations.
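To make the retrieval setting in the abstract concrete, here is a minimal sketch of similarity-based retrieval: concepts are ranked by the cosine similarity between their embedding and a query embedding. Everything in it is an assumption for illustration: the names `cosine` and `retrieve`, the three-dimensional toy vectors, and the concept labels are hypothetical, and the code does not reproduce the paper's models or its proposed metrics.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, vocabulary, k=2):
    """Return the k (concept, score) pairs most similar to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in vocabulary.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; a real setup would take these from a
# pre-trained language model.
toy_vocabulary = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]

top = retrieve(toy_query, toy_vocabulary, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

A ranked list like this is the kind of output an evaluation metric has to score. A single aggregate number over such lists says little about why a retrieved concept was judged close to or far from the query, which is the gap the abstract points to.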
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Absolute Position Encodings · Adam · Softmax · Multi-Head Attention · Residual Connection · Position-Wise Feed-Forward Layer · Dropout
