What Shapes Participant Data Quality? A Scoping Review and Case Study of Crowdsourced Webcam Eye Tracking in AI Interviews
Ka Hei Carrie Lau, Enkelejda Kasneci

TL;DR
This paper reviews the factors that influence data quality in crowdsourced webcam eye tracking and reports a case study that identifies key behavioral and technical predictors of data reliability.
Contribution
It provides a scoping review of current practices in crowdsourced eye tracking and a case study whose predictive findings can help improve data quality.
Findings
Higher fixation counts and shorter sessions predict better data quality.
Operating system choice significantly affects data quality.
Fragmented reporting and lack of benchmarks hinder progress in the field.
Abstract
Webcam-based eye tracking is a cost-effective, scalable method for remote research that effectively reaches broader populations. However, uncontrolled environments and hardware diversity lead to inconsistent data quality in crowdsourcing. To assess current practices, we conducted a scoping review of crowdsourced eye-tracking from 2011-2025. The review confirms fragmented reporting and a lack of established quality benchmarks. To address this lack of predictive insight, we conducted a case study on AI fairness interviews (N=205) using the RealEye platform. Applying Ordered Logistic Regression (OLR) to the platform quality metric, we found that behavioral and technical factors significantly predict data quality. Specifically, within the RealEye platform, higher fixation counts, shorter sessions, and operating system choice yield significantly higher quality grades. Based on this review…
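To make the Ordered Logistic Regression step in the abstract more concrete, here is a minimal sketch of how a proportional-odds model turns predictors into probabilities over ordered quality grades. Everything numeric in it is an assumption for illustration: the three grade labels, the coefficients in `BETA`, the thresholds in `CUTPOINTS`, and the predictor encoding (standardized fixation count, standardized session duration, and a 0/1 operating-system indicator) are hypothetical and are not taken from the paper or from RealEye. Only the signs follow the reported direction: more fixations raise the expected grade and longer sessions lower it.

```python
import math

# Hypothetical ordered quality grades, lowest to highest (not RealEye's actual scale).
GRADES = ["low", "medium", "high"]

# Hypothetical coefficients: [fixation count (z-score), session duration (z-score), OS indicator].
BETA = [0.8, -0.5, 0.3]

# Hypothetical thresholds separating adjacent grades; must be increasing.
CUTPOINTS = [-1.0, 1.0]


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(x, beta, cutpoints):
    """Return P(Y = k) for each ordered category under a proportional-odds model.

    The model assumes P(Y <= k | x) = sigmoid(cutpoints[k] - x . beta),
    so a larger linear predictor shifts probability toward higher categories.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities for each threshold, plus 1.0 for the top category.
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    # Differences of adjacent cumulative probabilities give per-category probabilities.
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


if __name__ == "__main__":
    # A hypothetical participant: above-average fixations, shorter session, OS indicator = 1.
    participant = [1.0, -1.0, 1.0]
    for grade, p in zip(GRADES, ordered_logit_probs(participant, BETA, CUTPOINTS)):
        print(f"{grade}: {p:.3f}")
```

In practice the coefficients and thresholds would be estimated from data, for example with an ordinal regression routine in a statistics library, rather than set by hand as they are here.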
