How to Data in Datathons
Carlos Mougan, Richard Plant, Clare Teng, Marya Bazzi, Alvaro Cabrejas-Egea, Ryan Sze-Yin Chan, David Salvador Jasin, Martin Stoffel, Kirstie Jane Whitaker, Jules Manser

TL;DR
This paper offers guidelines and best practices for organizing datathons, based on extensive experience and case studies, to help address data-related challenges effectively.
Contribution
It introduces a comprehensive framework and practical recommendations for datathon organizers to manage data issues successfully.
Findings
Framework applied successfully to 10 case studies
Guidelines improve data handling in datathons
Enhanced organizer preparedness for data challenges
Abstract
The rise of datathons, also known as data or data science hackathons, has provided a platform to collaborate, learn, and innovate in a short timeframe. Despite their significant potential benefits, organizations often struggle to effectively work with data due to a lack of clear guidelines and best practices for potential issues that might arise. Drawing on our own experiences and insights from organizing >80 datathon challenges with >60 partnership organizations since 2016, we provide guidelines and recommendations that serve as a resource for organizers to navigate the data-related complexities of datathons. We apply our proposed framework to 10 case studies.
Taxonomy
Topics: Biomedical and Engineering Education · Genetics, Bioinformatics, and Biomedical Research
