Kernelized Hashcode Representations for Relation Extraction
Sahil Garg, Aram Galstyan, Greg Ver Steeg, Irina Rish, Guillermo Cecchi, Shuyang Gao

TL;DR
This paper introduces a scalable method using kernelized locality-sensitive hashing with random subspaces to create explicit NLP structure representations, significantly improving relation extraction accuracy and speed.
Contribution
It proposes a novel approach combining KLSH with random subspaces and mutual information optimization for efficient, accurate relation extraction.
Findings
Significant accuracy improvements over state-of-the-art classifiers
Orders-of-magnitude speedup compared to traditional kernel methods
Effective on biomedical relation extraction datasets
Abstract
Kernel methods have produced state-of-the-art results for a number of NLP tasks such as relation extraction, but suffer from poor scalability due to the high cost of computing kernel similarities between natural language structures. A recently proposed technique, kernelized locality-sensitive hashing (KLSH), can significantly reduce the computational cost, but is only applicable to classifiers operating on kNN graphs. Here we propose to use random subspaces of KLSH codes for efficiently constructing an explicit representation of NLP structures suitable for general classification methods. Further, we propose an approach for optimizing the KLSH model for classification problems by maximizing an approximation of mutual information between the KLSH codes (feature vectors) and the class labels. We evaluate the proposed approach on biomedical relation extraction datasets, and observe…
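The idea of turning kernel similarities into explicit binary feature vectors, and then feeding random subsets of those bits to ordinary classifiers, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the RBF kernel on numeric vectors stands in for a kernel over NLP structures, the nearest-neighbor bit rule is one simple way to build a kernelized hash function and may differ from the one used in the paper, and all function names and parameters are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Stand-in kernel on vectors (assumption); the paper's setting uses
    # kernels defined over natural language structures instead.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def make_hash_functions(n_ref, n_bits, subset_size, rng):
    # Each hash function is a small random subset of the reference set,
    # split into two groups (hypothetical construction).
    funcs = []
    for _ in range(n_bits):
        idx = rng.choice(n_ref, size=subset_size, replace=False)
        half = subset_size // 2
        funcs.append((idx[:half], idx[half:]))
    return funcs

def hashcodes(K, funcs):
    # K has shape (n_samples, n_ref): similarities to the reference set only,
    # so no full pairwise kernel matrix over the samples is needed.
    bits = np.empty((K.shape[0], len(funcs)), dtype=np.uint8)
    for j, (g0, g1) in enumerate(funcs):
        # Bit is 1 when the most similar reference item lies in group 1.
        bits[:, j] = K[:, g1].max(axis=1) > K[:, g0].max(axis=1)
    return bits

def random_subspaces(n_bits, n_subspaces, subspace_size, rng):
    # Random subsets of bit positions; each subset defines one feature view.
    return [rng.choice(n_bits, size=subspace_size, replace=False)
            for _ in range(n_subspaces)]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                       # toy data (assumption)
ref = X[rng.choice(20, size=8, replace=False)]     # small reference set
K = rbf_kernel(X, ref)
funcs = make_hash_functions(n_ref=8, n_bits=16, subset_size=4, rng=rng)
codes = hashcodes(K, funcs)                        # explicit binary features
subspaces = random_subspaces(n_bits=16, n_subspaces=3, subspace_size=6, rng=rng)
views = [codes[:, s] for s in subspaces]           # one view per base classifier
```

Each entry of `views` is a plain binary feature matrix, so any general-purpose classifier could be trained on it, for example one decision tree per view combined into an ensemble. The sketch omits the mutual-information-based tuning of the hashing model that the abstract describes.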
Taxonomy
Topics: Machine Learning in Bioinformatics · Advanced Image and Video Retrieval Techniques · Text and Document Classification Technologies
