Hamiltonian Monte Carlo for Regression with High-Dimensional Categorical Data
Szymon Sacher, Laura Battaglia, Stephen Hansen

TL;DR
This paper introduces a Hamiltonian Monte Carlo method with automatic differentiation for high-dimensional categorical data, demonstrating its efficiency and applicability in economic latent variable models.
Contribution
It presents a new approach combining HMC and parallelized automatic differentiation for high-dimensional categorical data analysis, improving efficiency and methodological soundness.
Findings
HMC with automatic differentiation effectively analyzes high-dimensional categorical data.
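The new model, Supervised Topic Model with Covariates, can lead to substantially different conclusions than the simpler, frequently used two-step approach.
A simulation study and a re-analysis of Bandiera et al. (2020)'s study of executive time use support the method's efficiency and applicability.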
Abstract
Latent variable models are increasingly used in economics for high-dimensional categorical data like text and surveys. We demonstrate the effectiveness of Hamiltonian Monte Carlo (HMC) with parallelized automatic differentiation for analyzing such data in a computationally efficient and methodologically sound manner. Our new model, Supervised Topic Model with Covariates, shows that carefully modeling this type of data can have significant implications for conclusions compared to a simpler, frequently used, yet methodologically problematic, two-step approach. A simulation study and a revisit of Bandiera et al. (2020)'s study of executive time use demonstrate these results. The approach accommodates thousands of parameters and doesn't require custom algorithms specific to each model, making it accessible for applied researchers.
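To make the sampling idea concrete, here is a minimal sketch of HMC on a toy categorical model. It is not the authors' implementation or their Supervised Topic Model with Covariates. It assumes a single vector of category counts, a softmax parameterization with logits `theta`, and an independent Gaussian prior with standard deviation `prior_sd`; all function names and settings are hypothetical. The gradient is written by hand here, whereas the paper obtains gradients through automatic differentiation, which is what removes the need for model-specific derivations.

```python
import math
import random


def softmax(theta):
    # Numerically stable softmax over a list of logits.
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def log_post(theta, counts, prior_sd=2.0):
    # Multinomial log-likelihood (up to a constant) plus Gaussian log-prior.
    m = max(theta)
    log_norm = m + math.log(sum(math.exp(t - m) for t in theta))
    loglik = sum(n * (t - log_norm) for n, t in zip(counts, theta))
    logprior = -0.5 * sum(t * t for t in theta) / prior_sd ** 2
    return loglik + logprior


def grad_log_post(theta, counts, prior_sd=2.0):
    # Hand-coded gradient (assumption: stands in for automatic differentiation).
    p = softmax(theta)
    total = sum(counts)
    return [n - total * pk - t / prior_sd ** 2
            for n, pk, t in zip(counts, p, theta)]


def leapfrog(theta, mom, counts, step, n_steps):
    # Leapfrog integrator: half momentum step, full position step, half momentum step.
    theta, mom = list(theta), list(mom)
    g = grad_log_post(theta, counts)
    for _ in range(n_steps):
        mom = [r + 0.5 * step * gi for r, gi in zip(mom, g)]
        theta = [t + step * r for t, r in zip(theta, mom)]
        g = grad_log_post(theta, counts)
        mom = [r + 0.5 * step * gi for r, gi in zip(mom, g)]
    return theta, mom


def hmc(counts, n_samples=2000, step=0.05, n_steps=20, seed=0):
    # Basic HMC with an identity mass matrix and a Metropolis accept/reject step.
    rng = random.Random(seed)
    theta = [0.0] * len(counts)
    samples, accepted = [], 0
    for _ in range(n_samples):
        mom = [rng.gauss(0.0, 1.0) for _ in theta]
        h0 = -log_post(theta, counts) + 0.5 * sum(r * r for r in mom)
        new_theta, new_mom = leapfrog(theta, mom, counts, step, n_steps)
        h1 = -log_post(new_theta, counts) + 0.5 * sum(r * r for r in new_mom)
        # Accept with probability min(1, exp(h0 - h1)).
        if rng.random() < math.exp(min(0.0, h0 - h1)):
            theta = new_theta
            accepted += 1
        samples.append(theta)
    return samples, accepted / n_samples
```

Calling `hmc([50, 30, 20])` returns the list of sampled logit vectors and the acceptance rate; applying `softmax` to each draw gives posterior draws of the category probabilities. The fixed step size and trajectory length are illustrative choices for this toy target; practical samplers usually tune them adaptively.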
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference
