Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks
Luke Guerdan, Devansh Saxena, Stevie Chancellor, Zhiwei Steven Wu, Kenneth Holstein

TL;DR
Data scientists construct target variables for fuzzy concepts through a bricolage process, creatively combining pragmatic strategies to meet multiple criteria like validity and predictability, especially in education and healthcare contexts.
Contribution
This study reveals how data scientists pragmatically build target variables using bricolage, highlighting adaptive problem reformulation strategies for fuzzy concepts.
Findings
Data scientists use bricolage to construct target variables.
They adaptively swap or combine outcomes to meet criteria.
The process is creative and pragmatic, tailored to limited data.
Abstract
Data scientists often formulate predictive modeling tasks involving fuzzy, hard-to-define concepts, such as the "authenticity" of student writing or the "healthcare need" of a patient. Yet the process by which data scientists translate fuzzy concepts into a concrete, proxy target variable remains poorly understood. We interview fifteen data scientists in education (N=8) and healthcare (N=7) to understand how they construct target variables for predictive modeling tasks. Our findings suggest that data scientists construct target variables through a bricolage process, in which they use creative and pragmatic approaches to make do with the limited data at hand. Data scientists attempt to satisfy five major criteria for a target variable through bricolage: validity, simplicity, predictability, portability, and resource requirements. To achieve this, data scientists adaptively apply problem…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
