Entity Extraction with Knowledge from Web Scale Corpora
Zeyi Wen, Zeyu Huang, Rui Zhang

TL;DR
This paper introduces techniques that leverage web-scale corpora to enhance entity extraction accuracy and efficiency in text mining tasks.
Contribution
The paper presents novel post-processing methods that use models trained on web-scale corpora to improve existing dictionary-based entity extraction.
Findings
Significant improvement in extraction accuracy
Enhanced efficiency in processing large datasets
Robustness across diverse text sources
Abstract
Entity extraction is an important task in text mining and natural language processing. A popular method for entity extraction is to compare substrings from free text against a dictionary of entities. In this paper, we present several techniques that serve as a post-processing step to improve the effectiveness of existing entity extraction techniques. These techniques utilise models trained on web-scale corpora, which makes them robust and versatile. Experiments show that our techniques bring a notable improvement in efficiency and effectiveness.
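The dictionary-based baseline mentioned in the abstract (comparing substrings of free text against a dictionary of entities) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_entities`, the whitespace tokenisation, the case-insensitive exact matching, and the `max_len` cap on candidate length are all assumptions made for the example. The paper's post-processing techniques are not reproduced here.

```python
def extract_entities(text, dictionary, max_len=4):
    """Return (start_token, end_token, entity) for each token n-gram found in the dictionary.

    Hypothetical helper for illustration only: tokenises on whitespace and
    matches case-insensitively against exact dictionary entries.
    """
    tokens = text.lower().split()
    lookup = {entry.lower() for entry in dictionary}
    matches = []
    for start in range(len(tokens)):
        # Try every candidate substring of 1..max_len tokens beginning at `start`.
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            candidate = " ".join(tokens[start:end])
            if candidate in lookup:
                matches.append((start, end, candidate))
    return matches


if __name__ == "__main__":
    entities = {"New York", "San Francisco", "York"}
    print(extract_entities("I flew from New York to San Francisco", entities))
```

Note that a plain dictionary lookup like this reports overlapping candidates (here both "new york" and "york"), which is the kind of raw output a post-processing step would then need to filter or rank.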
Taxonomy
Topics: Topic Modeling · Web Data Mining and Analysis · Advanced Text Analysis Techniques
