TL;DR
This paper introduces unsupervised methods for extracting fashion attributes from Instagram text using deep learning and weak supervision, achieving near-human classification performance despite noisy, multilingual data.
Contribution
It presents a novel approach combining word embeddings and generative models for weakly supervised fashion attribute extraction from Instagram text.
Findings
Word embeddings outperform Levenshtein distance in information extraction
Weak supervision with generative models improves classification accuracy
Achieved F1 score of 0.61, comparable to human performance
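To make the first finding concrete, here is a minimal sketch contrasting the two matching strategies: picking the closest vocabulary term by Levenshtein (edit) distance versus by cosine similarity of word embeddings. It is not the paper's pipeline, which this summary does not describe. The helper names and the three-dimensional vectors are invented for illustration; a real system would load embeddings trained on a large corpus.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, made up for this example only.
TOY_EMBEDDINGS = {
    "denim": [0.8, 0.2, 0.1],
    "jeans": [0.9, 0.1, 0.0],
    "venom": [0.0, 0.1, 0.9],
}


def match_by_edit_distance(token, vocabulary):
    """Return the vocabulary term with the smallest edit distance to token."""
    return min(vocabulary, key=lambda term: levenshtein(token, term))


def match_by_embedding(token, vocabulary, embeddings):
    """Return the vocabulary term whose vector is most similar to token's."""
    return max(
        vocabulary,
        key=lambda term: cosine_similarity(embeddings[token], embeddings[term]),
    )


vocabulary = ["jeans", "venom"]
print(match_by_edit_distance("denim", vocabulary))
print(match_by_embedding("denim", vocabulary, TOY_EMBEDDINGS))
```

With these toy inputs, edit distance links "denim" to the similarly spelled "venom", while the embedding match links it to the semantically related "jeans". This is one plausible reason an embedding-based matcher could do better on informal text, though the toy vectors were chosen to show the contrast and prove nothing about real data.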
Abstract
With the advent of social media, our online feeds increasingly consist of short, informal, and unstructured text. This textual data can be analyzed for the purpose of improving user recommendations and detecting trends. Instagram is one of the largest social media platforms, containing both text and images. However, most of the prior research on text processing in social media is focused on analyzing Twitter data, and little attention has been paid to text mining of Instagram data. Moreover, many text mining methods rely on annotated training data, which in practice is both difficult and expensive to obtain. In this paper, we present methods for unsupervised mining of fashion attributes from Instagram text, which can enable a new kind of user recommendation in the fashion domain. In this context, we analyze a corpus of Instagram posts from the fashion domain, introduce a system for…
