Method and Dataset Mining in Scientific Papers
Rujing Yao, Linlin Hou, Yingchun Ye, Ou Wu, Ji Zhang, Jian Wu

TL;DR
This paper introduces MDER, a new entity recognition model for extracting methods and datasets from scientific papers, along with datasets from PAKDD conference papers, to enhance literature analysis in machine learning.
Contribution
The paper presents a novel entity recognition model, MDER, and constructs datasets for extracting methods and datasets from scientific papers, enabling better discipline analysis.
Findings
MDER achieves promising extraction performance
Mining results are visualized for better understanding
Constructed datasets facilitate further research
Abstract
Literature analysis helps researchers better understand the development of science and technology. Conventional literature analysis focuses on topics, authors, abstracts, keywords, references, etc., and rarely pays attention to the content of papers. In the field of machine learning, the methods (M) and datasets (D) involved are key information in papers. The extraction and mining of M and D are useful for discipline analysis and algorithm recommendation. In this paper, we propose a novel entity recognition model, called MDER, and construct datasets from the papers of the PAKDD conferences (2009-2019). Some preliminary experiments are conducted to assess the extraction performance, and the mining results are visualized.
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
