URECA: The Chain of Two Minimum Set Cover Problems exists behind Adaptation to Shifts in Semantic Code Search
Seok-Ung Choi, Joonghyuk Hahn, Yo-Sub Han

TL;DR
This paper introduces URECA, a clustering algorithm that leverages relationships between disentangled representations to improve few-shot adaptation to distribution shifts in semantic code search, addressing a limitation of the minimum entropy formulation.
Contribution
The paper extends the minimum entropy problem to a chain of two minimum set cover problems and proposes URECA, a new clustering algorithm based on recursive union-find for better adaptation.
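For readers unfamiliar with union-find, the following is a minimal sketch of generic union-find clustering, not the authors' URECA. It merges any two vectors whose cosine similarity reaches a threshold. The names `UnionFind`, `cosine`, and `cluster_by_threshold`, the choice of cosine similarity, and the single fixed threshold are all assumptions made for illustration; the recursion and update rule that URECA uses are not reproduced here.

```python
import math


class UnionFind:
    """Disjoint-set forest with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Locate the root, then point every node on the path at it.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def cosine(u, v):
    # Assumes both vectors are non-zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_by_threshold(vectors, threshold):
    """Group indices whose vectors are linked by similarity >= threshold."""
    uf = UnionFind(len(vectors))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                uf.union(i, j)
    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(uf.find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
    print(cluster_by_threshold(points, threshold=0.9))
```

Because merges are transitive, two vectors can end up in the same cluster through a chain of similar neighbours even when their own similarity is below the threshold.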
Findings
URECA achieves consistent performance gains in few-shot adaptation.
URECA outperforms the state of the art in query-shift scenarios on CoSQA.
Theoretical analysis links minimum entropy and set cover problems.
Abstract
Adaptation makes a model learn patterns that have shifted from the training distribution. In general, adaptation is formulated as the minimum entropy problem. However, the minimum entropy problem has an inherent limitation: the shifted initialization cascade phenomenon. We extend the relationship between the minimum entropy problem and the minimum set cover problem via the Lebesgue integral. This extension reveals that the internal mechanism of the minimum entropy problem ignores the relationships between disentangled representations, which leads to the shifted initialization cascade. Based on this analysis, we introduce a new clustering algorithm, the Union-find based Recursive Clustering Algorithm (URECA). URECA is an efficient clustering algorithm that leverages the relationships between disentangled representations. The update rule of URECA depends on the Thresholdly-Updatable Stationary Assumption to…
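As background on the minimum set cover problem named in the abstract, here is a minimal sketch of the standard greedy approximation: repeatedly pick the subset that covers the most still-uncovered elements. This is the textbook heuristic, not the paper's chain of two set cover problems or its Lebesgue-integral construction; the function name `greedy_set_cover` and the toy data are assumptions for illustration.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets chosen greedily to cover `universe`.

    The greedy choice gives an approximate, not necessarily minimum, cover.
    """
    uncovered = set(universe)
    subsets = [set(s) for s in subsets]
    chosen = []
    while uncovered:
        # Pick the subset covering the most uncovered elements
        # (ties go to the lowest index).
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        gain = uncovered & subsets[best]
        if not gain:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


if __name__ == "__main__":
    universe = {1, 2, 3, 4, 5}
    subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
    print(greedy_set_cover(universe, subsets))
```

Finding a true minimum cover is NP-hard, which is why the sketch settles for the greedy rule.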
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Web Data Mining and Analysis
Methods: Sparse Evolutionary Training
