Morse Code Datasets for Machine Learning

Sourya Dey; Keith M. Chugg; Peter A. Beerel

arXiv:1807.04239 · cs.LG · April 29, 2019

TL;DR

This paper introduces an algorithm that generates synthetic Morse code classification datasets of tunable difficulty, analyzes how added noise, feature-set expansion, and dataset size affect neural network performance, and proposes metrics for assessing dataset difficulty.

Contribution

It presents an open-source algorithm for creating tunable Morse code datasets and evaluates metrics for dataset difficulty in machine learning contexts.

Findings

01. Adding noise decreases network accuracy.

02. Increasing feature set size affects learning difficulty.

03. Proposed metrics effectively indicate dataset complexity.

Abstract

We present an algorithm to generate synthetic datasets of tunable difficulty on classification of Morse code symbols for supervised machine learning problems, in particular, neural networks. The datasets are spatially one-dimensional and have a small number of input features, leading to high density of input information content. This makes them particularly challenging when implementing network complexity reduction methods. We explore how network performance is affected by deliberately adding various forms of noise and expanding the feature set and dataset size. Finally, we establish several metrics to indicate the difficulty of a dataset, and evaluate their merits. The algorithm and datasets are open-source.
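
To make the idea in the abstract concrete, here is a minimal sketch of how a one-dimensional, noisy Morse code sample could be generated. It is not the authors' algorithm. The function name `make_sample`, the frame length of 64, the dot, dash, and gap lengths, the 0/1 intensity levels, and the additive Gaussian noise model are all assumptions chosen for illustration, and the symbol table is deliberately partial.

```python
import random

# Partial International Morse table; only a few symbols, for brevity.
MORSE = {"A": ".-", "B": "-...", "C": "-.-.", "E": ".", "O": "---", "S": "..."}


def make_sample(symbol, frame_len=64, noise_std=0.0, rng=None):
    """Return a list of frame_len floats encoding one Morse symbol.

    All numeric choices below are illustrative assumptions, not values
    taken from the paper.
    """
    rng = rng or random.Random()
    code = MORSE[symbol]

    # Build (level, length) segments: marks are 1.0, gaps are 0.0.
    segments = []
    for i, mark in enumerate(code):
        # Assumed lengths: a dot spans 1-3 values, a dash spans 4-9.
        length = rng.randint(1, 3) if mark == "." else rng.randint(4, 9)
        segments.append((1.0, length))
        if i < len(code) - 1:
            # Assumed gap of 1-3 values between consecutive marks.
            segments.append((0.0, rng.randint(1, 3)))

    used = sum(length for _, length in segments)
    if used > frame_len:
        raise ValueError("frame_len is too small for this symbol")

    # Place the codeword at a random offset; the rest of the frame is silence.
    frame = [0.0] * frame_len
    pos = rng.randint(0, frame_len - used)
    for level, length in segments:
        for j in range(pos, pos + length):
            frame[j] = level
        pos += length

    # Optional additive Gaussian noise as a difficulty knob.
    if noise_std > 0:
        frame = [x + rng.gauss(0.0, noise_std) for x in frame]
    return frame
```

Calling `make_sample("S", noise_std=0.2, rng=random.Random(0))` yields one 64-value input vector labeled "S". In this sketch, raising `noise_std` or widening the overlap between dot and dash lengths would be the levers for making the classification task harder.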
