Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui

TL;DR
This paper introduces pointwise HSIC (PHSIC), a kernel-based co-occurrence measure for sparse linguistic data. In a dialogue response selection task it is learned thousands of times faster than an RNN-based PMI while being more accurate, and it is also used to score data pairs in machine translation.
Contribution
The paper proposes PHSIC, a novel, linear-time kernelized co-occurrence measure derived from HSIC, offering a scalable alternative to PMI for linguistic expressions.
Findings
PHSIC is learned thousands of times faster than an RNN-based PMI in a dialogue response selection task.
PHSIC outperforms PMI in accuracy on the same dialogue response selection task.
PHSIC effectively scores data pairs in machine translation tasks.
Abstract
In this paper, we propose a new kernel-based co-occurrence measure that can be applied to sparse linguistic expressions (e.g., sentences) with a very short learning time, as an alternative to pointwise mutual information (PMI). Just as PMI is derived from mutual information, we derive this new measure from the Hilbert-Schmidt independence criterion (HSIC); thus, we call the new measure the pointwise HSIC (PHSIC). PHSIC can be interpreted as a smoothed variant of PMI that allows various similarity metrics (e.g., sentence embeddings) to be plugged in as kernels. Moreover, PHSIC can be estimated by simple and fast matrix calculations (linear in the size of the data), regardless of whether we use linear or nonlinear kernels. Empirically, in a dialogue response selection task, PHSIC is learned thousands of times faster than an RNN-based PMI while outperforming PMI in accuracy. In addition,…
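The "simple and fast matrix calculations" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest setting only: linear kernels over fixed-size feature vectors (for example, sentence embeddings), with PHSIC scored as a centered bilinear form through the empirical cross-covariance matrix. The function names `fit_phsic_linear` and `phsic_linear` are hypothetical and are not taken from the paper or any released code; nonlinear kernels would need an additional approximation step that is not shown here.

```python
import numpy as np

def fit_phsic_linear(X, Y):
    """Fit from n observed pairs; X is (n, dx), Y is (n, dy).

    Assumption: linear kernels, so features are used as given.
    Cost is one pass over the data, i.e. linear in n.
    """
    mean_x = X.mean(axis=0)
    mean_y = Y.mean(axis=0)
    # Empirical cross-covariance between the two feature spaces, shape (dx, dy).
    C = (X - mean_x).T @ (Y - mean_y) / X.shape[0]
    return mean_x, mean_y, C

def phsic_linear(x, y, mean_x, mean_y, C):
    """Score one pair (x, y): centered x and y compared through C."""
    return float((x - mean_x) @ C @ (y - mean_y))
```

Under these assumptions, averaging the scores of the training pairs gives the squared Frobenius norm of `C`, which mirrors the way averaging PMI over the joint distribution gives mutual information. Scoring a new pair costs a single matrix-vector product, independent of the number of training pairs.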
