Hierarchical Neural Network for Extracting Knowledgeable Snippets and Documents
Ganbin Zhou, Rongyu Cao, Xiang Ao, Ping Luo, Fen Lin, Leyu Lin, Qing He

TL;DR
This paper introduces a CNN-based hierarchical model for extracting knowledgeable snippets and annotating knowledgeable documents from social media, improving efficiency and accuracy over pattern-based methods.
Contribution
A novel semantic-based CNN model with a hierarchical structure that handles multiple domains simultaneously for extracting knowledgeable content.
Findings
The proposed model outperforms pattern-based methods in accuracy.
Joint training across domains reduces training time.
Demonstrated effectiveness on WeChat social media data.
Abstract
In this study, we focus on extracting knowledgeable snippets and annotating knowledgeable documents from a Web corpus consisting of documents from social media and We-media. Informally, knowledgeable snippets refer to text describing concepts, properties of entities, or relations among entities, while knowledgeable documents are those with enough knowledgeable snippets. These knowledgeable snippets and documents could be helpful in multiple applications, such as knowledge base construction and knowledge-oriented services. Previous studies extracted knowledgeable snippets using a pattern-based method. Here, we propose a semantic-based method for this task. Specifically, a CNN-based model is developed to extract knowledgeable snippets and annotate knowledgeable documents simultaneously. Additionally, a "low-level sharing, high-level splitting" structure of CNN is designed…
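The "low-level sharing, high-level splitting" idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated, so the layer sizes, the random untrained weights, the domain names ("health", "finance"), and the ratio-based document rule are all assumptions made here for illustration. The sketch only shows the general shape: one convolutional filter bank shared by every domain, followed by a separate scoring head per domain.

```python
import math
import random


def conv1d_max_pool(embeddings, filters, width):
    """Slide each filter over windows of `width` tokens, apply ReLU, max-pool over time."""
    features = []
    for filt in filters:
        best = 0.0  # starting at 0 makes the max equivalent to ReLU followed by max-pooling
        for start in range(len(embeddings) - width + 1):
            # Flatten the window of token vectors into one vector.
            window = [x for tok in embeddings[start:start + width] for x in tok]
            best = max(best, sum(w * x for w, x in zip(filt, window)))
        features.append(best)
    return features


class SharedSplitCNN:
    """Hypothetical model: shared low-level filters, one high-level head per domain."""

    def __init__(self, domains, emb_dim=4, width=2, n_filters=3, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Low-level sharing: a single filter bank used for every domain.
        self.shared_filters = [
            [rng.uniform(-1, 1) for _ in range(emb_dim * width)]
            for _ in range(n_filters)
        ]
        # High-level splitting: separate (weights, bias) for each domain.
        self.heads = {
            d: ([rng.uniform(-1, 1) for _ in range(n_filters)], 0.0)
            for d in domains
        }

    def shared_features(self, embeddings):
        return conv1d_max_pool(embeddings, self.shared_filters, self.width)

    def score(self, embeddings, domain):
        """Probability-like score that a snippet is knowledgeable in `domain`."""
        weights, bias = self.heads[domain]
        z = sum(w * f for w, f in zip(weights, self.shared_features(embeddings))) + bias
        return 1.0 / (1.0 + math.exp(-z))


def annotate_document(model, snippets, domain, snippet_threshold=0.5, min_ratio=0.5):
    """Assumed rule: a document is knowledgeable if enough of its snippets score highly."""
    if not snippets:
        return False
    hits = sum(model.score(s, domain) >= snippet_threshold for s in snippets)
    return hits / len(snippets) >= min_ratio


# Usage with made-up 4-dimensional token embeddings for a three-token snippet.
model = SharedSplitCNN(["health", "finance"])
snippet = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.1, 0.9]]
health_score = model.score(snippet, "health")
finance_score = model.score(snippet, "finance")
is_knowledgeable_doc = annotate_document(model, [snippet, snippet], "health")
```

Because the filter bank is shared, every domain's training examples would update the same low-level parameters, while each head stays specific to its domain. In a trained version the weights would be learned rather than sampled, and the document-level decision could come from the network itself rather than from the fixed ratio used here.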
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
