Method and Dataset Mining in Scientific Papers

Rujing Yao; Linlin Hou; Yingchun Ye; Ou Wu; Ji Zhang; Jian Wu

arXiv:1911.13096·cs.LG·December 2, 2019

TL;DR

This paper introduces MDER, a new entity recognition model for extracting methods and datasets from scientific papers, along with datasets from PAKDD conference papers, to enhance literature analysis in machine learning.

Contribution

The paper presents a novel entity recognition model, MDER, and constructs datasets for extracting methods and datasets from scientific papers, enabling better discipline analysis.

Findings

01 MDER achieves promising extraction performance

02 Mining results are visualized for better understanding

03 Constructed datasets facilitate further research

Abstract

Literature analysis helps researchers better understand the development of science and technology. Conventional literature analysis focuses on topics, authors, abstracts, keywords, references, etc., and rarely pays attention to the content of papers. In the field of machine learning, the methods (M) and datasets (D) involved are key information in papers. The extraction and mining of M and D are useful for discipline analysis and algorithm recommendation. In this paper, we propose a novel entity recognition model, called MDER, and construct datasets from the papers of the PAKDD conferences (2009-2019). Some preliminary experiments are conducted to assess the extraction performance, and the mining results are visualized.

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Biomedical Text Mining and Ontologies