TL;DR
This paper introduces a method to generate synthetic Morse code datasets with adjustable difficulty for machine learning, analyzing how noise and feature expansion impact neural network performance, and proposing metrics to assess dataset difficulty.
Contribution
It presents an open-source algorithm for creating tunable Morse code datasets and evaluates metrics for dataset difficulty in machine learning contexts.
Findings
Adding noise decreases network accuracy.
Increasing feature set size affects learning difficulty.
Proposed metrics effectively indicate dataset complexity.
Abstract
We present an algorithm to generate synthetic datasets of tunable difficulty on classification of Morse code symbols for supervised machine learning problems, in particular, neural networks. The datasets are spatially one-dimensional and have a small number of input features, leading to high density of input information content. This makes them particularly challenging when implementing network complexity reduction methods. We explore how network performance is affected by deliberately adding various forms of noise and expanding the feature set and dataset size. Finally, we establish several metrics to indicate the difficulty of a dataset, and evaluate their merits. The algorithm and datasets are open-source.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
