I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation

Cheng-Kuang Wu; Zhi Rui Tam; Chao-Chung Wu; Chieh-Yen Lin; Hung-yi Lee; Yun-Nung Chen

arXiv:2407.14767·cs.CL·October 1, 2024

TL;DR

This paper investigates how large language models can proactively seek user support in text-to-SQL tasks, proposing metrics to evaluate when and how models should ask for help to balance performance and user burden.

Contribution

It introduces new metrics for assessing support-seeking behavior in LLMs and analyzes their ability to determine when to request user assistance under different information conditions.

Findings

01. Many LLMs struggle to recognize their need for support without external feedback.

02. External signals significantly improve models' ability to seek help appropriately.

03. The paper offers insights for developing future support-seeking strategies in LLMs.

Abstract

This study explores the proactive ability of LLMs to seek user support. We propose metrics to evaluate the trade-off between performance improvements and user burden, and investigate whether LLMs can determine when to request help under varying information availability. Our experiments show that without external feedback, many LLMs struggle to recognize their need for user support. The findings highlight the importance of external signals and provide insights for future research on improving support-seeking strategies. Source code: https://github.com/appier-research/i-need-help
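To make the trade-off between performance improvement and user burden concrete, the sketch below computes one (burden, improvement) point per decision threshold. It is a minimal illustration under stated assumptions, not the paper's metric definitions: the function name `delta_burden_points`, the per-example "need help" score, and the two correctness lists are all hypothetical. Burden is taken here to be the fraction of examples on which the model asks for help, and improvement is the accuracy gain over never asking.

```python
from typing import List, Tuple


def delta_burden_points(
    need_help_scores: List[float],
    correct_without_help: List[bool],
    correct_with_help: List[bool],
    thresholds: List[float],
) -> List[Tuple[float, float]]:
    """Return one (burden, accuracy gain) pair per threshold.

    Assumption: the model asks for help on an example whenever its
    hypothetical "need help" score is at or above the threshold.
    """
    n = len(need_help_scores)
    # Accuracy if the model never asks the user for anything.
    baseline = sum(correct_without_help) / n
    points = []
    for t in thresholds:
        asks = [s >= t for s in need_help_scores]
        # User burden: share of examples where the user is asked.
        burden = sum(asks) / n
        # Use the with-help outcome only where help was requested.
        accuracy = sum(
            with_help if ask else without_help
            for ask, with_help, without_help in zip(
                asks, correct_with_help, correct_without_help
            )
        ) / n
        points.append((burden, accuracy - baseline))
    return points


# Toy data (invented for illustration): four text-to-SQL queries.
scores = [0.9, 0.2, 0.7, 0.1]
without_help = [False, True, False, True]
with_help = [True, True, True, True]

print(delta_burden_points(scores, without_help, with_help, [1.1, 0.5, 0.0]))
# [(0.0, 0.0), (0.5, 0.5), (1.0, 0.5)]
```

In this toy run, asking on half of the queries already recovers the full gain, while asking on every query doubles the burden without improving accuracy further. A model that cannot tell when it needs help would instead spread its requests over queries it already answers correctly, which raises burden with no gain.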

Code & Models

Repositories

appier-research/i-need-help (official)

Videos

I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation (Underline)

Taxonomy

Topics: Mathematics, Computing, and Information Processing