IntentGPT: Few-shot Intent Discovery with Large Language Models

Juan A. Rodriguez; Nicholas Botzer; David Vazquez; Christopher Pal; Marco Pedersoli; Issam Laradji

arXiv:2411.10670 · cs.CL · November 19, 2024 · 5 cites

Open Access · 3 Reviews

TL;DR

IntentGPT leverages large language models with prompt engineering to discover new user intents in dialogue systems efficiently, requiring minimal labeled data and outperforming traditional data-intensive methods.

Contribution

We introduce IntentGPT, a training-free approach that uses prompt-based learning with LLMs for effective intent discovery without extensive data or fine-tuning.

Findings

01. Outperforms existing intent discovery methods on benchmark datasets.

02. Requires minimal labeled data and no model fine-tuning.

03. Effective in identifying emerging user intents in dialogue systems.

Abstract

In today's digitally driven world, dialogue systems play a pivotal role in enhancing user interactions, from customer service to virtual assistants. In these dialogues, it is important to identify users' goals automatically so that their needs can be resolved promptly. This has necessitated the integration of models that perform Intent Detection. However, users' intents are diverse and dynamic, making it challenging to maintain a fixed set of predefined intents. As a result, a more practical approach is to develop a model capable of identifying new intents as they emerge. We address the challenge of Intent Discovery, an area that has drawn significant attention in recent research efforts. Existing methods need to train on a substantial amount of data to correctly identify new intents, demanding significant human effort. To overcome this, we introduce IntentGPT, a novel training-free method that…

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

1. The paper is written well and cites the relevant literature.
2. Intent classification and discovery is an interesting problem on its own.
3. It also has many potentially useful downstream applications.
4. The results improve upon prior baselines.

Weaknesses

While studying LLMs and their performance is an important endeavour, I am not convinced by the novelty of the proposed method or the efficacy of the individual components. For instance:
1. The first part of the pipeline (ICP) seems to add little value. From row 3 vs. row 5 in the bottom panel of Table 2, it seems that on average ICP led to no performance boost.
2. The semantic few-shot sampling (SFS) also seems to add limited value (row 3 vs. row 4).
3. Intuitively "Feedback" should help, but I am no

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

1. This is a well-written paper, well-organized, and clear to read and follow.
2. There is a substantive literature review which is well presented, clearly comparing each work to the paper.
3. There are substantive appendices that are very helpful to the reader.
4. Throughout all of it, there is obvious diligence, attention to detail, and care for the presentation.
5. This work could be useful to anyone new to the area of Intent Discovery, and in particular applying LLMs to this problem. a. Thi

Weaknesses

The first LLM does not appear to contribute any value to the model, as discussed below. If this criticism is correct, then this work becomes more of an application paper: a study of applying an off-the-shelf LLM to a known problem, showing how readily available new technology outperforms previous methods while significantly increasing simplicity and flexibility. While valuable, it is probably not a good match for this conference/track. The role of the first LLM (LLM1) in the model is to create the

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

The paper has good empirical validation, is clear in its explanations and figures, and is attacking a difficult and significant problem. The paper does a thorough empirical validation by comparing multiple LLMs, including an open-source one. This highlights that the proposed methodology is actually what is working for the task at hand. The paper also shows its proposed method's performance relative to state-of-the-art methods, which thoroughly validates the method's effectiveness. Overa

Weaknesses

The paper is missing some grounding in previous literature, and it is not clear what value each component of its proposed methodology adds. First, the whole component of Semantic Few-shot Sampling (SFS) reads to me as just being Retrieval-Augmented Generation (RAG). See https://ai.meta.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models/ for a description. The SFS component then is not a novel contribution and should be c
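
To make the reviewer's comparison concrete, the following is a minimal sketch of retrieval-style few-shot selection: for each incoming utterance, the most similar labeled examples are retrieved and placed in the prompt. It is not the authors' implementation. The function names (`embed`, `select_few_shot`, `build_prompt`), the example pool, and the prompt layout are all hypothetical, and the bag-of-words similarity is a toy stand-in for whatever sentence encoder a real system would use.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector (assumption: a stand-in for a real sentence encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_few_shot(query, pool, k=2):
    """Return the k labeled (utterance, intent) pairs most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]

def build_prompt(query, shots):
    """Format retrieved examples followed by the unlabeled query (hypothetical layout)."""
    blocks = [f"Utterance: {u}\nIntent: {i}" for u, i in shots]
    blocks.append(f"Utterance: {query}\nIntent:")
    return "\n\n".join(blocks)

# Hypothetical labeled pool, invented for illustration only.
POOL = [
    ("what is my card balance", "check_balance"),
    ("i lost my card", "report_lost_card"),
    ("book a flight to paris", "book_flight"),
    ("how much money is on my card", "check_balance"),
]

shots = select_few_shot("show my card balance", POOL, k=2)
prompt = build_prompt("show my card balance", shots)
```

The prompt string would then be sent to an LLM of choice. The retrieval step is the part that resembles RAG: only the examples nearest to the query are shown to the model, rather than a fixed or random set.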

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: travel james · Attention Is All You Need · Dense Connections · Label Smoothing · Adam · Residual Connection · Byte Pair Encoding · Linear Layer · Softmax · Position-Wise Feed-Forward Layer