Adapting CRISP-DM for Idea Mining: A Data Mining Process for Generating Ideas Using a Textual Dataset

W. Y. Ayele

arXiv:2105.00574·cs.IR·May 4, 2021

TL;DR

This paper adapts the CRISP-DM data mining process model to idea mining, enabling systematic extraction of innovative ideas from textual datasets using machine learning techniques like Dynamic Topic Modeling.

Contribution

It introduces CRISP-IM, a novel, reusable process model for idea mining that leverages standard data mining practices and machine learning on unstructured textual data.

Findings

01

CRISP-IM facilitates trend identification in scholarly and patent datasets.

02

It integrates Dynamic Topic Modeling for idea generation.

03

The model supports diverse textual datasets across domains.
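Trend identification over a time-stamped text collection can be illustrated with a small sketch. The code below is not the paper's method and is not Dynamic Topic Modeling: it is a much simpler stand-in that tracks each term's relative frequency per time slice and ranks terms by how much that frequency rose between the first and last slice. The function names `term_trends` and `rising_terms` and the toy documents are hypothetical, chosen only for illustration.

```python
from collections import Counter


def term_trends(docs_by_year):
    """Map each term to its relative frequency in every year (sorted order).

    Simplified stand-in for topic evolution: whitespace tokens, no topics.
    """
    years = sorted(docs_by_year)
    counts = {}
    for year in years:
        counter = Counter()
        for doc in docs_by_year[year]:
            counter.update(doc.lower().split())
        counts[year] = counter

    vocab = set().union(*counts.values())
    trends = {}
    for term in vocab:
        # max(..., 1) guards against a year with no tokens at all.
        trends[term] = [
            counts[year][term] / max(sum(counts[year].values()), 1)
            for year in years
        ]
    return years, trends


def rising_terms(trends, top_n=3):
    """Rank terms by (last-slice share minus first-slice share), descending.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(
        trends.items(),
        key=lambda kv: (-(kv[1][-1] - kv[1][0]), kv[0]),
    )
    return [term for term, _ in ranked[:top_n]]


# Toy, made-up corpus grouped by publication year (illustrative only).
docs_by_year = {
    2018: ["battery storage grid", "grid storage cost"],
    2019: ["battery drone delivery", "grid storage drone"],
    2020: ["drone delivery route", "drone swarm delivery"],
}

years, trends = term_trends(docs_by_year)
print(rising_terms(trends, top_n=2))
```

A real idea mining pipeline would replace the raw term counts with topics inferred per time slice, which is the role Dynamic Topic Modeling plays in the findings above.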

Abstract

Data mining project managers can benefit from using standard data mining process models. Standard process models such as the Cross-Industry Standard Process for Data Mining (CRISP-DM), the de facto standard and the most popular, reduce cost and time. Standard models also facilitate knowledge transfer and the reuse of best practices, and they minimize knowledge requirements. On the other hand, digital innovation is increasingly needed to unlock the potential of ever-growing textual data such as publications, patents, social media data, and documents of various forms. Furthermore, the introduction of cutting-edge machine learning tools and techniques enables the elicitation of ideas. The processing of unstructured textual data to generate new and useful ideas is referred to as idea mining. Existing literature about idea mining largely overlooks the utilization of…
