What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain

Antonis Maronikolakis; Ana Peleteiro Ramallo; Weiwei Cheng; Thomas Kober

arXiv:2408.08907·cs.IR·August 20, 2024

Open Access

TL;DR

This paper introduces a multilingual dataset of 4,000 fashion-related conversations to evaluate large language models' ability to serve as conversational agents in online fashion retail, focusing on their capacity to interact with backend systems.

Contribution

It presents a new high-quality, multilingual dataset for evaluating LLMs in fashion e-commerce and demonstrates its utility in assessing models' capabilities for practical deployment.

Findings

01 The dataset effectively scales to business needs.

02 LLMs show varying performance in calling backend systems.

03 The dataset facilitates iterative development of conversational tools.

Abstract

Large language models (LLMs) are poised to revolutionize the domain of online fashion retail, enhancing customer experience and discovery of fashion online. LLM-powered conversational agents introduce a new way of discovery by directly interacting with customers, enabling them to express in their own ways, refine their needs, obtain fashion and shopping advice that is relevant to their taste and intent. For many tasks in e-commerce, such as finding a specific product, conversational agents need to convert their interactions with a customer to a specific call to different backend systems, e.g., a search system to showcase a relevant set of products. Therefore, evaluating the capabilities of LLMs to perform those tasks related to calling other services is vital. However, those evaluations are generally complex, due to the lack of relevant and high quality datasets, and do not align…

Taxonomy

Topics: AI in Service Interactions · Social Robot Interaction and HRI

Methods: Sparse Evolutionary Training · ALIGN