Evolution and use of data science vocabulary. How much have we changed in 13 years?
Igor Barahona

TL;DR
This paper analyzes the evolution of data science vocabulary over 13 years, identifying key words, documents, and phases of emergence, growth, and boom to understand how the discipline has changed.
Contribution
It introduces a classification of data science's vocabulary evolution into three periods and identifies characteristic words and pioneering documents for each phase.
Findings
Identified three distinct periods: emergence, growth, and boom.
Mapped characteristic vocabulary and key documents for each period.
Provided insights into the discipline's linguistic and conceptual development.
Abstract
Here I present an investigation into the evolution and use of vocabulary in data science over the last 13 years. A database of 12,787 documents containing the words "data science" in the title, abstract or keywords is analyzed using rigorous statistical methods. I propose classifying the evolution of this discipline into three periods: emergence, growth and boom. Characteristic words and pioneering documents are identified for each period. By identifying the distinctive vocabulary and relevant topics of data science and classifying them by time period, these results add value to the discipline's scientific community.
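The abstract describes two steps that can be illustrated in code: selecting documents that mention "data science" in the title, abstract or keywords, and finding words characteristic of each period. The sketch below is only an illustration under stated assumptions. The abstract does not name the statistical method used, so the frequency-ratio score here is a simple stand-in, not the author's procedure. The record fields (`title`, `abstract`, `keywords`, `year`), the function names, and the `period_of` mapping are all hypothetical, and the period boundaries must be supplied by the caller because the abstract does not give them.

```python
import re
from collections import Counter


def mentions_data_science(record):
    """Return True if 'data science' appears in the title, abstract, or keywords.

    `record` is a hypothetical dict with string fields; missing fields count as empty.
    """
    fields = (
        record.get("title", ""),
        record.get("abstract", ""),
        record.get("keywords", ""),
    )
    return any("data science" in field.lower() for field in fields)


def characteristic_words(records, period_of, top_n=3):
    """Rank words per period by over-representation relative to the whole corpus.

    Stand-in score (an assumption, not the paper's method):
    (relative frequency within the period) / (relative frequency in the corpus).
    `period_of` maps a publication year to a period label chosen by the caller.
    """
    per_period = {}
    overall = Counter()
    for rec in records:
        words = re.findall(r"[a-z]+", rec["abstract"].lower())
        period = period_of(rec["year"])
        per_period.setdefault(period, Counter()).update(words)
        overall.update(words)

    total = sum(overall.values())
    result = {}
    for period, counts in per_period.items():
        size = sum(counts.values())
        score = {
            word: (count / size) / (overall[word] / total)
            for word, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for a stable order.
        result[period] = sorted(score, key=lambda w: (-score[w], w))[:top_n]
    return result
```

A caller would first keep only the records for which `mentions_data_science` returns `True`, then pass them to `characteristic_words` together with a function that assigns each year to emergence, growth or boom. A real analysis would also remove stop words and would likely use a formal test of over-representation rather than a raw ratio.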
Taxonomy
Topics: Computational and Text Analysis Methods · Big Data and Business Intelligence
