TrustDataFilter: Leveraging Trusted Knowledge Base Data for More Effective Filtering of Unknown Information
Jinghong Zhang, Yidong Cui, Weiling Wang, and Xianyou Cheng

TL;DR
TrustDataFilter leverages trusted knowledge bases and natural language inference models to enhance the accuracy and consistency of filtering unknown information in domain-specific knowledge base construction.
Contribution
The paper introduces the self-nli-TDF framework, which uses large language models for trustworthiness assessment and reasoning to improve filtering performance in knowledge base construction.
Findings
Improved filtering quality with more consistent results
Effective use of large language models like RoBERTa and GPT-3.5
Validated on datasets from biology, radiation, and science domains
Abstract
With the advancement of technology and changes in the market, the demand for the construction of domain-specific knowledge bases has been increasing, either to improve model performance or to promote enterprise innovation and competitiveness. The construction of domain-specific knowledge bases typically relies on web crawlers or existing industry databases, leading to problems with accuracy and consistency of the data. To address these challenges, we considered the characteristics of domain data, where internal knowledge is interconnected, and proposed the Self-Natural Language Inference Data Filtering (self-nli-TDF) framework. This framework compares trusted filtered knowledge with the data to be filtered, deducing the reasoning relationship between them, thus improving filtering performance. The framework uses plug-and-play large language models for trustworthiness assessment and…
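As a rough illustration of the idea in the abstract, the sketch below compares each candidate statement with a set of trusted statements and drops any candidate that a natural language inference (NLI) judge labels as contradicting trusted knowledge. Everything here is an assumption made for illustration: `toy_nli` is a hypothetical keyword-based stand-in for a real NLI model (the paper mentions RoBERTa and GPT-3.5), `filter_candidates` is a hypothetical helper name, and the "reject on contradiction" rule is a simplification, not the authors' actual procedure.

```python
from typing import Callable, List

# The three labels an NLI model conventionally returns.
ENTAILMENT, CONTRADICTION, NEUTRAL = "entailment", "contradiction", "neutral"


def toy_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for a real NLI model (not from the paper)."""
    p, h = premise.lower(), hypothesis.lower()
    if p == h:
        return ENTAILMENT
    # Crude rule: sentences that differ only by the word "not" contradict.
    if p.replace(" not ", " ") == h.replace(" not ", " "):
        return CONTRADICTION
    return NEUTRAL


def filter_candidates(
    trusted: List[str],
    candidates: List[str],
    nli: Callable[[str, str], str] = toy_nli,
) -> List[str]:
    """Keep candidates that no trusted statement contradicts (assumed rule)."""
    kept = []
    for cand in candidates:
        labels = [nli(t, cand) for t in trusted]
        if CONTRADICTION in labels:
            continue  # conflicts with trusted knowledge, so drop it
        kept.append(cand)
    return kept


trusted = ["Water is a liquid at room temperature"]
candidates = [
    "Water is not a liquid at room temperature",
    "Ice floats on water",
]
result = filter_candidates(trusted, candidates)
```

Passing the judge in as the `nli` argument mirrors the plug-and-play design the abstract describes: a real model could replace `toy_nli` without changing the filtering loop.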
Taxonomy
Topics: Security and Verification in Computing · Cloud Data Security Solutions · Cryptography and Data Security
Methods: Attention Is All You Need · Adam · Softmax · Dropout · Weight Decay · Linear Layer · Layer Normalization · WordPiece · Dense Connections
