The Role of Codeword-to-Class Assignments in Error-Correcting Codes: An Empirical Study
Itay Evron, Ophir Onn, Tamar Weiss Orzech, Hai Azeroual, Daniel Soudry

TL;DR
This paper demonstrates that the way codewords are assigned to classes in error-correcting codes significantly impacts classification performance, with similarity-preserving assignments leading to easier subproblems and better generalization.
Contribution
The study shows that codeword-to-class assignments matter in error-correcting codes (ECC): similarity-preserving mappings improve performance and make predefined codebooks effectively problem-dependent.
Findings
Similarity-preserving assignments improve generalization.
Predefined codebooks become problem-dependent with these assignments.
Enhanced codebooks benefit extreme classification tasks.
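To make "similarity-preserving" concrete, here is a minimal sketch of one plausible way to score an assignment: sum, over all class pairs, the class distance times the Hamming distance between the assigned codewords, so that higher scores mean dissimilar classes received distant codewords. The codebook, the class-distance matrix, the `agreement` score, and the brute-force search are all illustrative assumptions, not the paper's actual measure or optimization procedure.

```python
from itertools import permutations

import numpy as np

# Hypothetical codebook: 4 codewords of 3 bits each (illustrative values only).
codebook = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
])

# Hypothetical class distances: classes 0 and 1 are similar, as are 2 and 3.
class_dist = np.array([
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
])


def hamming_matrix(codes):
    """Pairwise Hamming distances between the rows of a binary matrix."""
    return (codes[:, None, :] != codes[None, :, :]).sum(axis=2)


def agreement(assignment, class_dist, code_dist):
    """Assumed score: sum over class pairs of class distance times the
    Hamming distance of the assigned codewords (higher = more preserving)."""
    a = np.asarray(assignment)
    # code_dist[np.ix_(a, a)][i, j] is the distance between the codewords
    # assigned to classes i and j; halve the sum to count each pair once.
    return float((class_dist * code_dist[np.ix_(a, a)]).sum() / 2.0)


code_dist = hamming_matrix(codebook)

# Brute force over all 4! assignments; assignment[c] is the codebook row
# given to class c. This is only feasible for very small class counts.
scores = {p: agreement(p, class_dist, code_dist) for p in permutations(range(4))}
best = max(scores, key=scores.get)
worst = min(scores, key=scores.get)

print("most similarity-preserving assignment:", best, scores[best])
print("least similarity-preserving assignment:", worst, scores[worst])
```

Under this assumed score, the best assignment places the similar classes 0 and 1 on codewords that differ in a single bit, while the worst places them on codewords that differ in all three bits. Exhaustive search does not scale beyond a handful of classes, so a realistic setting would need a heuristic search instead.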
Abstract
Error-correcting codes (ECC) are used to reduce multiclass classification tasks to multiple binary classification subproblems. In ECC, classes are represented by the rows of a binary matrix, corresponding to codewords in a codebook. Codebooks are commonly either predefined or problem dependent. Given predefined codebooks, codeword-to-class assignments are traditionally overlooked, and codewords are implicitly assigned to classes arbitrarily. Our paper shows that these assignments play a major role in the performance of ECC. Specifically, we examine similarity-preserving assignments, where similar codewords are assigned to similar classes. Addressing a controversy in existing literature, our extensive experiments confirm that similarity-preserving assignments induce easier subproblems and are superior to other assignment policies in terms of their generalization performance. We find that…
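The reduction described in the abstract can be sketched as follows: each column of the codebook defines one binary subproblem, a codeword-to-class assignment decides which row each class receives, and prediction picks the class whose codeword is nearest in Hamming distance to the predicted bits. The codebook values, the `assignment` array, and the function names are hypothetical, and the binary learners are replaced by a hand-made bit flip, so this is a sketch under stated assumptions rather than the authors' implementation.

```python
import numpy as np

# Hypothetical 4-class, 5-bit codebook; rows are codewords (illustrative values).
# Its minimum pairwise Hamming distance is 3, so one flipped bit is correctable.
CODEBOOK = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
])


def binary_targets(labels, codebook, assignment):
    """Return an (n_samples, n_bits) matrix of binary subproblem targets.

    assignment[c] is the codebook row given to class c.
    """
    return codebook[assignment[labels]]


def decode(bit_predictions, codebook, assignment):
    """Map predicted bit vectors to the class with the nearest codeword."""
    class_codes = codebook[assignment]  # row c is the codeword of class c
    dists = (bit_predictions[:, None, :] != class_codes[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)


labels = np.array([0, 1, 2, 3])
assignment = np.array([2, 0, 3, 1])  # an arbitrary codeword-to-class assignment

targets = binary_targets(labels, CODEBOOK, assignment)

# Simulate one binary learner making a mistake on the first sample.
noisy = targets.copy()
noisy[0, 0] ^= 1

decoded = decode(noisy, CODEBOOK, assignment)
print(decoded)
```

Changing `assignment` leaves the codebook's error-correcting capability untouched but changes which classes each binary subproblem must separate, which is the degree of freedom the paper studies.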
Taxonomy
Topics: Imbalanced Data Classification Techniques · Text and Document Classification Technologies · Machine Learning in Bioinformatics
