Regulation of Language Models With Interpretability Will Likely Result In A Performance Trade-Off

Eoin M. Kenny; Julie A. Shah

arXiv:2412.12169·cs.LG·December 18, 2024

TL;DR

This paper demonstrates that regulating large language models to use human-defined features causes a performance trade-off but can enhance human-AI collaboration efficiency and confidence in practical applications.

Contribution

It introduces a method to build regulatable LLMs and quantifies the impact of regulation constraints on performance and human collaboration.

Findings

1. Regulation causes a 7.34% drop in classification accuracy.
2. Regulated models improve human task speed.
3. Regulated models increase appropriate confidence in AI assistance.

Abstract

Regulation is increasingly cited as the most important and pressing concern in machine learning. However, it is currently unknown how to implement it and, perhaps more importantly, how it would affect model performance and human collaboration if actually realized. In this paper, we attempt to answer these questions by building a regulatable large language model (LLM), and then quantifying how the additional constraints involved affect (1) model performance and (2) human collaboration. Our empirical results reveal that it is possible to force an LLM to use human-defined features in a transparent way, but a previously unconsidered "regulation performance trade-off" reveals itself in the form of a 7.34% classification performance drop. Surprisingly, however, we show that despite this, such systems actually improve human task performance speed and appropriate confidence in a…
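
The abstract does not describe the architecture, so the following is only a minimal sketch of one common way to make a classifier depend on human-defined features transparently: a concept-bottleneck-style linear layer over named feature scores. It is not the authors' method. The feature names, class labels, weights, and the `predict` helper are all hypothetical, and in practice the feature scores would come from an upstream model rather than being typed in by hand.

```python
import math

# Hypothetical human-defined features; the names and weights below are
# illustrative assumptions, not taken from the paper.
CONCEPTS = ["mentions_refund", "mentions_delay", "polite_tone"]

# One weight per (class, concept). A real system would learn these.
WEIGHTS = {
    "complaint": {"mentions_refund": 1.5, "mentions_delay": 2.0, "polite_tone": -0.5},
    "praise": {"mentions_refund": -1.0, "mentions_delay": -1.5, "polite_tone": 1.0},
}


def softmax(logits):
    """Convert a dict of label -> logit into label -> probability."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


def predict(concept_scores):
    """Classify using only the named concept scores.

    Returns the predicted label, the class probabilities, and each
    concept's additive contribution to the winning class's logit.
    """
    logits = {
        label: sum(weights[c] * concept_scores[c] for c in CONCEPTS)
        for label, weights in WEIGHTS.items()
    }
    probs = softmax(logits)
    label = max(probs, key=probs.get)
    contributions = {c: WEIGHTS[label][c] * concept_scores[c] for c in CONCEPTS}
    return label, probs, contributions


# Concept scores in [0, 1]; assumed to be produced by an upstream scorer.
example = {"mentions_refund": 1.0, "mentions_delay": 1.0, "polite_tone": 0.0}
label, probs, contributions = predict(example)
print(label, contributions)
```

Because the prediction is a weighted sum over named features, each feature's contribution can be read off directly and audited. Restricting the model to such a bottleneck is also the kind of constraint that could plausibly cost accuracy, which is the trade-off the abstract reports.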

Code & Models

Repositories

eoinkenny/regulatable_llms (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Multi-Agent Systems and Negotiation · Economic Policies and Impacts

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings