Eliminating self-selection: Using data science for authentic undergraduate research in a first-year introductory course
Lior Shamir

TL;DR
This paper presents an inclusive data science research experience integrated into a first-year course, enabling all students to participate in authentic research without selection barriers, thereby enhancing engagement and learning in STEM.
Contribution
The study introduces a scalable, inclusive research model that eliminates self-selection bias, allowing all students to engage in authentic research early in their academic journey.
Findings
Students analyze large text datasets using discovery tools.
Students identify patterns in congressional speeches.
All students participate without selection barriers.
Abstract
Research experience and mentoring has been identified as an effective intervention for increasing student engagement and retention in the STEM fields, with high impact on students from underserved populations. However, one-on-one mentoring is limited by the number of available faculty, and in certain cases also by the availability of funding for stipends. One-on-one mentoring is further limited by the selection and self-selection of students. Since research positions are often competitive, they are often taken by the best-performing students. More importantly, many students who do not see themselves as the top students of their class, or do not identify themselves as researchers, might not apply, and that self-selection can have the highest impact on non-traditional students. To address the obstacles of scalability, selection, and self-selection, we designed a data science research…
Taxonomy
Topics: Statistics Education and Methodologies · Online Learning and Analytics · Teaching and Learning Programming
