MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data
Martin Becker, Florian Lemmerich, Philipp Singer, Markus Strohmaier, Andreas Hotho

TL;DR
MixedTrails introduces a Bayesian framework for comparing hypotheses about the generative processes of heterogeneous sequential data, enabling insights into varying user behaviors across different groups or phases.
Contribution
It presents a novel Bayesian approach using mixed transition Markov chains and Bayes factors to evaluate hypotheses about sequence data with heterogeneous transition dynamics.
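To make the idea of comparing hypotheses with Bayes factors concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the simplest setting, where every observed transition belongs to exactly one known group, so the evidence factorizes into one Dirichlet-categorical marginal likelihood per group. The function names, the concentration parameter `kappa`, and the prior construction `1 + kappa * p` are assumptions made for this illustration, and the counts are toy data.

```python
from math import lgamma

def log_evidence(counts, alpha):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors."""
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        # Normalizing terms of the Dirichlet prior and posterior for this source state.
        total += lgamma(sum(a_row)) - lgamma(sum(a_row) + sum(n_row))
        # Per-target-state terms.
        total += sum(lgamma(a + n) - lgamma(a) for a, n in zip(a_row, n_row))
    return total

def prior_from_hypothesis(probs, kappa):
    """Assumed elicitation: uniform pseudo-count of 1 plus kappa times the believed probability."""
    return [[1.0 + kappa * p for p in row] for row in probs]

def grouped_log_evidence(counts_by_group, probs_by_group, kappa):
    """Sum of per-group log evidences (valid when each transition has one known group)."""
    return sum(
        log_evidence(counts_by_group[g], prior_from_hypothesis(probs_by_group[g], kappa))
        for g in counts_by_group
    )

# Toy data: two states, two groups with opposite behavior.
counts = {
    "a": [[9, 1], [1, 9]],  # group a mostly stays in the same state
    "b": [[1, 9], [9, 1]],  # group b mostly switches state
}

# Hypothesis 1: behavior differs between groups.
h_hetero = {
    "a": [[0.9, 0.1], [0.1, 0.9]],
    "b": [[0.1, 0.9], [0.9, 0.1]],
}
# Hypothesis 2: both groups move uniformly at random.
uniform = [[0.5, 0.5], [0.5, 0.5]]
h_uniform = {"a": uniform, "b": uniform}

kappa = 10.0
log_bf = (grouped_log_evidence(counts, h_hetero, kappa)
          - grouped_log_evidence(counts, h_uniform, kappa))
print("log Bayes factor (heterogeneous vs. uniform):", log_bf)
```

A positive log Bayes factor favors the first hypothesis. In this sketch, larger `kappa` values express stronger belief in a hypothesis, and at `kappa = 0` both hypotheses collapse to the same flat prior and become indistinguishable.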
Findings
Effective in distinguishing plausible hypotheses on synthetic data
Applied successfully to real-world Wikipedia and Flickr data
Provides analytical and approximate methods for inference
Abstract
Sequential traces of user data are frequently observed online and offline, e.g., as sequences of visited websites or as sequences of locations captured by GPS. However, understanding factors explaining the production of sequence data is a challenging task, especially since the data generation is often not homogeneous. For example, navigation behavior might change in different phases of browsing a website, or movement behavior may vary between groups of users. In this work, we tackle this task and propose MixedTrails, a Bayesian approach for comparing the plausibility of hypotheses regarding the generative processes of heterogeneous sequence data. Each hypothesis is derived from existing literature, theory or intuition and represents a belief about transition probabilities between a set of states that can vary between groups of observed transitions. For example, when trying to understand…
