svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery
Nicolas Turenne

TL;DR
This paper introduces svcR, an R package for support vector clustering (SVC) that uses a 2D-grid labeling technique to speed up cluster extraction and a Jaccard-Radial base kernel to classify terms for lexical pattern discovery.
Contribution
The paper presents an original 2D-grid labeling method and applies a Jaccard-Radial kernel within SVC to improve clustering efficiency and classification in lexical pattern discovery tasks.
Findings
Enhanced cluster extraction speed with 2D-grid labeling
Effective classification of biological terms into ontological classes
Improved rule definition for information extraction
Abstract
We present a new R package that takes a numerical matrix as input and computes clusters using support vector clustering (SVC). We have implemented an original 2D-grid labeling approach to speed up cluster extraction. In this sense, SVC can be seen as an efficient cluster extraction method when clusters are separable in a 2D map. Second, we show that this SVC approach, using a Jaccard-Radial base kernel, can classify a set of terms into ontological classes reasonably well and help define regular expression rules for information extraction from documents. Our case study concerns a set of terms and documents about developmental and molecular biology.
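The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the svcR implementation, and the abstract does not give the exact definitions, so two assumptions are made here: the Jaccard-Radial kernel is taken to be a Gaussian-style radial function of the Jaccard distance, and 2D-grid labeling is taken to mean finding connected regions of grid cells that fall inside the cluster contours. The names `jaccard_radial_kernel`, `label_grid`, and the width parameter `q` are hypothetical.

```python
import math
from collections import deque


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def jaccard_radial_kernel(a, b, q=1.0):
    """ASSUMPTION: radial kernel exp(-q * d^2) on the Jaccard distance d = 1 - J.

    The paper's exact formula is not stated in the abstract.
    """
    d = 1.0 - jaccard(a, b)
    return math.exp(-q * d * d)


def label_grid(inside):
    """Label 4-connected regions of True cells in a 2D boolean grid.

    `inside[r][c]` is assumed to be True when the grid cell lies inside a
    cluster contour (for SVC, where the point's image falls within the
    minimal enclosing sphere). Returns (labels, number_of_clusters), with
    label 0 meaning "outside every cluster".
    """
    rows = len(inside)
    cols = len(inside[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not inside[r][c] or labels[r][c]:
                continue
            # Unvisited inside-cell: start a new cluster and flood-fill it.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and inside[ni][nj] and not labels[ni][nj]):
                        labels[ni][nj] = count
                        queue.append((ni, nj))
    return labels, count


# Toy mask with two separate regions on a 3 x 4 grid.
mask = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, True, True],
]
grid_labels, n_clusters = label_grid(mask)
```

In this sketch a data point would receive the label of the grid cell it falls into, so the cost of labeling depends on the grid resolution rather than on pairwise checks between points. That is one plausible reading of why a grid can speed up cluster extraction when clusters are separable in a 2D map.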
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Natural Language Processing Techniques · Data Mining Algorithms and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
