MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base

Qianyu He; Xintao Wang; Jiaqing Liang; Yanghua Xiao

arXiv:2212.05254 · cs.CL · December 13, 2022


TL;DR

This paper introduces MAPS-KB, a large-scale probabilistic knowledge base of 4.3 million simile triplets, enhancing AI's understanding and generation of diverse similes beyond high-frequency examples.

Contribution

The paper presents a novel framework for constructing a large-scale probabilistic simile knowledge base and introduces two metrics for better simile understanding.

Findings

01 Constructed MAPS-KB with 4.3 million triplets over 0.4 million terms, extracted from 70 GB of corpora.

02 Achieved state-of-the-art performance on three downstream tasks.

03 Validated the effectiveness of the proposed framework and metrics.

Abstract

The ability to understand and generate similes is an imperative step to realize human-level AI. However, there is still a considerable gap between machine intelligence and human cognition in similes, since deep models based on statistical distribution tend to favour high-frequency similes. Hence, a large-scale symbolic knowledge base of similes is required, as it contributes to the modeling of diverse yet unpopular similes while facilitating additional evaluation and reasoning. To bridge the gap, we propose a novel framework for large-scale simile knowledge base construction, as well as two probabilistic metrics which enable an improved understanding of simile phenomena in natural language. Overall, we construct MAPS-KB, a million-scale probabilistic simile knowledge base, covering 4.3 million triplets over 0.4 million terms from 70 GB corpora. We conduct sufficient experiments to…
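To make the idea of a probabilistic simile triplet concrete, here is a minimal sketch. It assumes a triplet takes the common (topic, property, vehicle) form, as in "her smile is as bright as the sun", and it scores vehicles by simple relative frequency. The names `SimileTriplet` and `conditional_vehicle_probs` are hypothetical, the data is a toy example, and the frequency ratio is only an illustration; it is not one of the two metrics proposed in the paper.

```python
from collections import Counter
from typing import NamedTuple


class SimileTriplet(NamedTuple):
    """Hypothetical record for one extracted simile (assumed schema)."""
    topic: str    # the thing being described, e.g. "smile"
    prop: str     # the shared property, e.g. "bright"
    vehicle: str  # the thing it is compared to, e.g. "sun"


def conditional_vehicle_probs(triplets, topic, prop):
    """Relative frequency of each vehicle among triplets matching (topic, prop).

    Illustrative only: a plain count ratio, not the paper's metrics.
    Returns an empty dict when no triplet matches.
    """
    counts = Counter(
        t.vehicle for t in triplets if t.topic == topic and t.prop == prop
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {vehicle: n / total for vehicle, n in counts.items()}


# Toy observations, invented for this example (not taken from MAPS-KB).
observed = [
    SimileTriplet("smile", "bright", "sun"),
    SimileTriplet("smile", "bright", "sun"),
    SimileTriplet("smile", "bright", "sun"),
    SimileTriplet("smile", "bright", "star"),
    SimileTriplet("voice", "smooth", "silk"),
]

probs = conditional_vehicle_probs(observed, "smile", "bright")
print(probs)
```

Storing counts alongside each triplet, rather than only the set of distinct triplets, is what lets rare similes stay in the knowledge base while still being ranked below frequent ones.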


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques

Methods: Balanced Selection