SAGE: Structured Attribute Value Generation for Billion-Scale Product Catalogs
Athanasios N. Nikolakopoulos, Swati Kaul, Siva Karthik Gade, Bella Dubrov, Umit Batur, Suleiman Ali Khan

TL;DR
SAGE is a novel generative model for inferring product attribute values in e-Commerce catalogs, capable of handling implicit, absent, or inapplicable attributes across multiple languages and product types, reducing labeling needs.
Contribution
Introduces SAGE, a Seq2Seq-based approach that predicts attribute values without predefined options, handles implicit and absent attributes, and enables zero-shot inference in e-Commerce catalogs.
Findings
SAGE outperforms state-of-the-art methods in attribute-value prediction.
Effective in zero-shot settings, reducing the need for labeled data.
Capable of handling multiple languages and product types.
Abstract
We introduce SAGE, a generative LLM for inferring attribute values for products across worldwide e-Commerce catalogs. We introduce a novel formulation of the attribute-value prediction problem as a Seq2Seq summarization task across languages, product types, and target attributes. Our novel modeling approach lifts both the restriction of predicting attribute values within a pre-specified set of choices and the requirement that the sought attribute values be explicitly mentioned in the text. SAGE can infer attribute values even when such values are mentioned implicitly using periphrastic language, or not at all, as is the case for common-sense defaults. Additionally, SAGE is capable of predicting whether an attribute is inapplicable for the product at hand, or non-obtainable from the available information. SAGE is the first method able to tackle all aspects of the…
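To make the Seq2Seq framing concrete, the following is a minimal sketch of how one (product, attribute) query might be serialized into a source sequence and how a generated target might be interpreted. It is not the authors' implementation: the field layout in `build_source`, the sentinel strings `[inapplicable]` and `[unobtainable]`, and the `Prediction` type are all hypothetical names chosen for illustration, and the model call itself is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sentinel strings: the source does not specify how SAGE
# encodes "inapplicable" or "non-obtainable" in its output.
INAPPLICABLE = "[inapplicable]"
UNOBTAINABLE = "[unobtainable]"


@dataclass(frozen=True)
class Prediction:
    status: str           # "value", "inapplicable", or "unobtainable"
    value: Optional[str]  # free-text value when status == "value"


def build_source(language: str, product_type: str, attribute: str, text: str) -> str:
    """Serialize one (product, attribute) query into a single source sequence.

    The field layout is an assumption made for this sketch.
    """
    return (
        f"lang: {language} | type: {product_type} | "
        f"attribute: {attribute} | text: {text.strip()}"
    )


def parse_target(generated: str) -> Prediction:
    """Map a generated target string to a structured prediction."""
    out = generated.strip()
    if out.lower() == INAPPLICABLE:
        return Prediction("inapplicable", None)
    if out.lower() == UNOBTAINABLE:
        return Prediction("unobtainable", None)
    # Any other string is treated as an open-vocabulary attribute value,
    # so predictions are not limited to a pre-specified set of choices.
    return Prediction("value", out)
```

Because the target is free text, the same interface covers explicit values, implicitly stated values, and the two special outcomes without a per-attribute label set.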
Taxonomy
Topics: Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
