Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach
Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, Jinho D. Choi

TL;DR
This paper introduces a lightly-supervised method for open-world product attribute mining in e-commerce, leveraging self-supervised heuristics and a new dataset to expand attribute vocabularies and discover new attribute types with minimal human input.
Contribution
The paper proposes Amacer, a novel approach that effectively mines open-world product attributes using limited supervision and self-supervised signals, outperforming baselines.
Findings
Surpasses various baselines by 12 F1 points.
Expands the attribute vocabulary of existing types by up to 12 times.
Discovers values from 39% new attribute types.
Abstract
We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. Our supervision comes from a high-quality seed attribute set bootstrapped from existing resources, and we aim both to expand the attribute vocabulary of existing seed types and to discover new attribute types automatically. A new dataset is created to support our setting, and our approach, Amacer, is proposed specifically to tackle the limited supervision. In particular, given that no direct supervision is available for unseen new attributes, our novel formulation exploits a self-supervised heuristic and unsupervised latent attributes, which attain implicit semantic signals as additional supervision by leveraging product context. Experiments suggest that our approach surpasses various baselines by 12 F1,…
Taxonomy
Topics: Web Data Mining and Analysis · Rough Sets and Fuzzy Logic · Text and Document Classification Technologies
