Global dense vector representations for words or items using shared parameter alternating Tweedie model
Taejoon Kim, Haiyan Wang

TL;DR
This paper introduces the Shared parameter Alternating Tweedie (SA-Tweedie) model for analyzing high-dimensional cooccurrence count data, improving estimation accuracy with a novel algorithm suitable for recommender systems and text analysis.
Contribution
The paper proposes the SA-Tweedie model and an estimation algorithm combining Fisher scoring with learning rate adjustment, addressing high-dimensional cooccurrence data challenges.
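As a rough illustration of what "Fisher scoring with learning rate adjustment" can look like, the following minimal sketch fits an ordinary Tweedie GLM with a log link and observed covariates. It is not the SA-Tweedie algorithm: in the paper the covariates are unobserved and parameters are shared and updated in alternation, and the summary does not specify the learning rate rule. The function name `fisher_scoring_tweedie`, the fixed step size `lr`, and the variance power `power=1.5` are assumptions made for this example only.

```python
import numpy as np

def fisher_scoring_tweedie(X, y, power=1.5, lr=0.5, n_iter=200):
    """Damped Fisher scoring for a Tweedie GLM with log link.

    Assumes Var(y) = phi * mu**power; the dispersion phi cancels
    in the update, so it does not appear below.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                               # mean under the log link
        score = X.T @ ((y - mu) * mu ** (1.0 - power))      # score vector (up to 1/phi)
        info = X.T @ (X * (mu ** (2.0 - power))[:, None])   # Fisher information (up to 1/phi)
        beta = beta + lr * np.linalg.solve(info, score)     # step scaled by the learning rate
    return beta

# Noise-free toy data (hypothetical): the estimate should recover beta_true.
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
beta_true = np.array([0.5, -0.3])
y = np.exp(X @ beta_true)
beta_hat = fisher_scoring_tweedie(X, y)
```

With `lr=1.0` this reduces to plain Fisher scoring; a smaller value trades speed for stability, which is the general motivation for damping such updates.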
Findings
The proposed algorithm outperforms alternatives in simulations.
SA-Tweedie effectively models zero-inflated cooccurrence data.
The pseudo-likelihood approach is less suitable when covariates are unobserved.
Abstract
In this article, we present a model for analyzing cooccurrence count data derived from practical fields, such as user-item or item-item data from online shopping platforms, or cooccurring word-word pairs in sequences of text. Such data contain important information for developing recommender systems or for studying the relevance of items or words from non-numerical sources. Unlike in traditional regression models, there are no observations for covariates. Additionally, the cooccurrence matrix is typically of such high dimension that it does not fit into a computer's memory for modeling. We extract numerical data by defining windows of cooccurrence and using weighted counts on the continuous scale. Positive probability mass is allowed for zero observations. We present the Shared parameter Alternating Tweedie (SA-Tweedie) model and an algorithm to estimate the parameters. We introduce a learning rate…
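The window-based extraction of weighted cooccurrence counts described in the abstract can be sketched as follows. The weighting scheme (1/distance within a symmetric window), the function name `weighted_cooccurrence`, and the toy sentences are assumptions for illustration; the abstract does not state which weights the authors use. Counts are stored sparsely, so pairs that never cooccur are simply absent, which corresponds to the zero observations the model must accommodate.

```python
from collections import defaultdict

def weighted_cooccurrence(sentences, window=2):
    """Sparse, symmetric weighted cooccurrence counts.

    Each pair of tokens at most `window` positions apart contributes
    1/distance (an assumed weighting), giving counts on a continuous scale.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            last = min(i + window, len(tokens) - 1)
            for j in range(i + 1, last + 1):
                weight = 1.0 / (j - i)
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)

# Hypothetical toy corpus.
sentences = [["the", "cat", "sat"], ["the", "cat", "ran"]]
counts = weighted_cooccurrence(sentences, window=2)
```

Keeping only nonzero entries in a dictionary avoids materializing the full vocabulary-by-vocabulary matrix, which is the memory concern the abstract raises.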
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Adam
