We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing
Fredrik Olsson, Magnus Sahlgren

TL;DR
This paper emphasizes the critical role of data readiness in NLP project success, highlighting the gap between research and application, and proposes a communication method based on Data Readiness Levels to improve collaboration.
Contribution
It introduces a novel method for enhancing communication about data readiness between researchers and stakeholders, facilitating better transfer of NLP research to practical applications.
Findings
The method has been iterated on and applied in multiple projects.
Stakeholders gain a better understanding of data accessibility and validity.
The paper raises awareness of the importance of data readiness in NLP.
Abstract
In this paper, we identify the state of data as being an important reason for failure in applied Natural Language Processing (NLP) projects. We argue that there is a gap between academic research in NLP and its application to problems outside academia, and that this gap is rooted in poor mutual understanding between academic researchers and their non-academic peers who seek to apply research results to their operations. To foster transfer of research results from academia to non-academic settings, and the corresponding influx of requirements back to academia, we propose a method for improving the communication between researchers and external stakeholders regarding the accessibility, validity, and utility of data based on Data Readiness Levels (Lawrence, 2017). While still in its infancy, the method has been iterated on and applied in multiple innovation and research projects…
Taxonomy
Topics: Data Quality and Management · Explainable Artificial Intelligence (XAI) · Big Data and Business Intelligence
