Image-Seeking Intent Prediction for Cross-Device Product Search

Mariya Hendriksen; Svitlana Vakulenko; Jordan Massiah; Gabriella Kazai; Emine Yilmaz

arXiv:2511.14764·cs.IR·November 20, 2025

Open Access

TL;DR

This paper introduces a novel task and model for predicting when a user query in e-commerce requires visual augmentation and cross-device switching, enhancing personalized shopping experiences.

Contribution

The paper proposes the Image-Seeking Intent Prediction task and a new model, IRP, built on large-scale production data, to predict when a spoken product query warrants a proactive switch to a screen-enabled device.

Findings

01

Combining query semantics with product data improves prediction accuracy.

02

Lightweight summarization enhances model performance.

03

A differentiable loss reduces false positives.
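
The third finding can be illustrated with a generic soft-precision surrogate. This is a minimal sketch under stated assumptions, not the loss used in the paper: the function name `soft_precision_loss`, the `eps` smoothing term, and the toy inputs are all hypothetical. The idea is to replace hard true-positive and false-positive counts with sums of predicted probabilities, so the quantity stays differentiable with respect to the model outputs and penalizes confident false positives.

```python
def soft_precision_loss(probs, labels, eps=1e-8):
    """Return 1 - soft precision for predicted probabilities and 0/1 labels.

    Hypothetical illustration only; not the paper's actual loss.
    """
    # Soft true positives: probability mass placed on truly positive queries.
    soft_tp = sum(p * y for p, y in zip(probs, labels))
    # Soft false positives: probability mass placed on negative queries.
    soft_fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    # eps avoids division by zero when all probabilities are zero.
    return 1.0 - soft_tp / (soft_tp + soft_fp + eps)


# Toy data: one positive query and one negative query.
confident_fp = soft_precision_loss([0.9, 0.8], [1, 0])  # negative scored high
hesitant_fp = soft_precision_loss([0.9, 0.2], [1, 0])   # negative scored low
print(confident_fp > hesitant_fp)
```

In practice a term like this would typically be combined with a standard classification loss in an autodiff framework; the pure-Python version here only shows how the surrogate behaves.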

Abstract

Large Language Models (LLMs) are transforming personalized search, recommendations, and customer interaction in e-commerce. Customers increasingly shop across multiple devices, from voice-only assistants to multimodal displays, each offering different input and output capabilities. A proactive suggestion to switch devices can greatly improve the user experience, but it must be offered with high precision to avoid unnecessary friction. We address the challenge of predicting when a query requires visual augmentation and a cross-device switch to improve product discovery. We introduce Image-Seeking Intent Prediction, a novel task for LLM-driven e-commerce assistants that anticipates when a spoken product query should proactively trigger a visual on a screen-enabled device. Using large-scale production data from a multi-device retail assistant, including 900K voice queries, associated…

Taxonomy

Topics: Multimodal Machine Learning Applications · AI in Service Interactions · Sentiment Analysis and Opinion Mining