Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data
Simret Araya Gebreegziabher, Allison E Sproul, Yinuo Yang, Chaoran Chen, Diego Gómez-Zará, Toby Jia-Jun Li

TL;DR
This paper advocates for a shift to longitudinal, context-aware methods for measuring human-LLM alignment, introducing a framework and a browser tool that reveal preference changes over time.
Contribution
It proposes a novel methodological framework combining in-situ preference capture, follow-up reflection, and privacy-preserving behavioral data, exemplified by the BITE system.
Findings
Users' immediate preferences for LLM outputs differed from the preferences they reported later.
Longitudinal data reveals limitations of single-moment preference datasets.
The approach underscores the importance of temporal methods for alignment evaluation.
Abstract
Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preference as static, even though many LLM-mediated decisions unfold over time and may be re-evaluated differently after real-world consequences and observed outcomes. Therefore, we argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM…
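The three components listed in the abstract can be pictured as a small data model. The following is a minimal sketch under stated assumptions, not the BITE implementation: the `PreferenceRecord` fields, the 1-5 rating scale, and the salted-hash `pseudonymize` helper are all hypothetical names chosen for illustration. It pairs an in-situ rating with a later follow-up rating and computes the shift between them, and it shows one generic way to store a behavioral trace without keeping the raw value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from hashlib import sha256
from typing import Optional


@dataclass
class PreferenceRecord:
    """One LLM interaction with an immediate and an optional follow-up rating.

    Field names and the 1-5 rating scale are illustrative assumptions.
    """
    interaction_id: str
    captured_at: datetime
    immediate_rating: int                   # in-situ rating, right after the response
    followup_at: Optional[datetime] = None
    followup_rating: Optional[int] = None   # rating given on later reflection


def preference_shift(record: PreferenceRecord) -> Optional[int]:
    """Return follow-up minus immediate rating, or None if no follow-up exists."""
    if record.followup_rating is None:
        return None
    return record.followup_rating - record.immediate_rating


def pseudonymize(value: str, salt: str) -> str:
    """Replace a raw behavioral value (e.g. a visited domain) with a salted hash.

    A generic privacy-preserving choice, assumed here for illustration only.
    """
    return sha256((salt + value).encode("utf-8")).hexdigest()[:16]


record = PreferenceRecord(
    interaction_id="demo-1",
    captured_at=datetime(2025, 1, 1, 9, 0),
    immediate_rating=5,
    followup_at=datetime(2025, 1, 1, 9, 0) + timedelta(days=7),
    followup_rating=3,
)
print(preference_shift(record))  # -2
```

A negative shift means the user rated the same output lower on reflection than in the moment, which is the kind of signal a single-moment preference dataset cannot record.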
