Hierarchical Topic Presence Models
Jason Wang, Robert E. Weiss

TL;DR
This paper extends the Poisson factor analysis topic model to hierarchical topic presence models by adding binary topic presence parameters at the web site and/or web page level, enabling analysis of web pages nested in web sites, with topic presence modeled on web site covariates.
Contribution
It introduces hierarchical topic presence models with local topics and covariate-dependent topic inclusion, using novel data augmentation techniques for efficient inference.
Findings
Identified patterns of topic presence across US health department websites.
Demonstrated the model's ability to handle sparse and nested text data.
Provided insights into national health topics through web page analysis.
Abstract
Topic models analyze text from a set of documents. Documents are modeled as a mixture of topics, with topics defined as probability distributions on words. Inferences of interest include the most probable topics and characterization of a topic by inspecting the topic's highest probability words. Motivated by a data set of web pages (documents) nested in web sites, we extend the Poisson factor analysis topic model to hierarchical topic presence models for analyzing text from documents nested in known groups. We incorporate an unknown binary topic presence parameter for each topic at the web site and/or the web page level to allow web sites and/or web pages to be sparse mixtures of topics and we propose logistic regression modeling of topic presence conditional on web site covariates. We introduce local topics into the Poisson factor analysis framework, where each web site has a local…
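To make the structure described in the abstract concrete, the following is a minimal generative sketch in Python, not the authors' implementation. It simulates word counts from a Poisson factor analysis model in which each web site has a binary presence indicator per topic, and the presence probability follows a logistic regression on web site covariates. The function name `simulate_topic_presence`, the gamma and Dirichlet distributions used to draw loadings and topics, the normal regression coefficients, and all default sizes are assumptions made for illustration. Page-level presence and local topics are left out.

```python
import numpy as np


def simulate_topic_presence(n_sites=4, pages_per_site=5, n_topics=3,
                            vocab_size=20, seed=0):
    """Simulate page-by-word counts with site-level topic presence.

    Illustrative only: every distributional choice below is an assumption,
    not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Topics: each row is a probability distribution over the vocabulary.
    phi = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)

    # Web site covariates: an intercept plus one hypothetical covariate.
    x = np.column_stack([np.ones(n_sites), rng.normal(size=n_sites)])

    # Logistic regression for topic presence, one coefficient vector per topic.
    beta = rng.normal(size=(2, n_topics))
    p_present = 1.0 / (1.0 + np.exp(-(x @ beta)))

    # Binary topic presence indicator for each (web site, topic) pair.
    present = rng.random((n_sites, n_topics)) < p_present

    # Pages are nested in sites: page d belongs to site site_of_page[d].
    site_of_page = np.repeat(np.arange(n_sites), pages_per_site)
    n_pages = n_sites * pages_per_site

    # Nonnegative topic loadings, zeroed where the topic is absent from the site.
    theta = rng.gamma(shape=1.0, scale=10.0, size=(n_pages, n_topics))
    theta = theta * present[site_of_page]

    # Poisson factor analysis: count rate is loadings times topics.
    rate = theta @ phi
    counts = rng.poisson(rate)
    return counts, present, site_of_page, phi


if __name__ == "__main__":
    counts, present, site_of_page, phi = simulate_topic_presence()
    print("count matrix shape:", counts.shape)
    print("topics present per site:", present.sum(axis=1))
```

Because a topic's loadings are exactly zero on every page of a site where the topic is absent, each site is a sparse mixture of topics. A site with no topics present produces pages with all-zero counts in this simplified version.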
Taxonomy
Topics: Complex Network Analysis Techniques · Data-Driven Disease Surveillance · Bayesian Methods and Mixture Models
Methods: Logistic Regression
