TL;DR
This paper investigates how carefully curated fixed training datasets can improve the data efficiency of syndrome-based neural decoders, reducing the need for large amounts of training data.
Contribution
It introduces heuristics for selecting training samples and demonstrates that fixed datasets can outperform dynamic data generation in neural decoding.
Findings
Curated fixed datasets lead to better neural decoder performance.
Sample selection heuristics improve training efficiency.
Fewer training examples are needed for high accuracy.
Abstract
While significant research efforts have been directed toward developing more capable neural decoding architectures, comparatively little attention has been paid to the quality of training data. In this study, we address the challenge of constructing effective training datasets to maximize the potential of existing syndrome-based neural decoder architectures. We emphasize the advantages of using fixed datasets over generating training data dynamically and explore the problem of selecting appropriate training targets within this framework. Furthermore, we propose several heuristics for selecting training samples and present experimental evidence demonstrating that, with carefully curated datasets, it is possible to train neural decoders to achieve superior performance while requiring fewer training examples. Code to reproduce all results is available at https://github.com/lebidan/sbnd.
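To make the fixed-dataset idea concrete, here is a minimal sketch of drawing a set of (syndrome, error pattern) pairs once and reusing it across training epochs, rather than sampling fresh noise at every step. It is an illustration under stated assumptions, not the authors' pipeline: the (7,4) Hamming code, the binary symmetric channel, the helper name `make_fixed_dataset`, and its parameters are all hypothetical choices made for brevity, and no sample-selection heuristic from the paper is implemented.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code: a textbook example chosen
# for illustration, not necessarily a code studied in the paper.
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
], dtype=np.uint8)

def make_fixed_dataset(num_samples, flip_prob, seed=0):
    """Hypothetical helper: draw error patterns once and pair each with its syndrome.

    Assumes a binary symmetric channel with crossover probability flip_prob.
    """
    rng = np.random.default_rng(seed)
    # Each bit is flipped independently with probability flip_prob.
    errors = (rng.random((num_samples, H.shape[1])) < flip_prob).astype(np.uint8)
    # Syndrome s = H e^T (mod 2), computed for all samples at once.
    syndromes = (errors @ H.T) % 2
    return syndromes, errors

# The dataset is generated a single time; a decoder would then be trained
# on these same pairs at every epoch.
syndromes, errors = make_fixed_dataset(num_samples=1000, flip_prob=0.05)
```

Because the generator is seeded, the same dataset can be regenerated exactly, which is what makes curating it (for example, filtering or reweighting samples) meaningful in the first place.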
