Efficient Sparse Training with Structured Dropout

Andy Lo

arXiv:2411.01238·cs.LG·November 5, 2024

TL;DR

This paper introduces SparseDrop, a structured dropout method that creates hardware-friendly sparsity, enabling faster training on GPUs while maintaining regularisation effectiveness comparable to standard dropout.

Contribution

The paper presents SparseDrop, a structured dropout technique with a CUDA implementation that runs faster than its dense counterpart on GPUs, even at low sparsity levels, while retaining the regularisation benefits of standard dropout.

Findings

1. SparseDrop achieves GPU speed-ups over its dense counterpart, even at low sparsity levels.

2. SparseDrop regularises about as well as standard dropout, and sometimes better.

3. The source code is publicly available for reproducibility.

Abstract

Dropout is a common regularisation technique in deep learning that improves generalisation. Even though it introduces sparsity and thus potential for higher throughput, it usually cannot bring speed-ups on GPUs due to its unstructured nature. In this project, I experiment with SparseDrop, a structured, hardware-friendly variant of dropout that can exploit such sparsity. I provide a CUDA implementation of SparseDrop, achieving speed-ups against its dense counterpart even at low sparsity levels. The empirical results demonstrate that SparseDrop provides regularisation properties similar to, or sometimes even better than, those of standard dropout. This suggests its potential as a drop-in replacement for standard dropout with faster training speeds. The source code is available at https://github.com/andylolu2/sparse-dropout
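To make the contrast between unstructured and structured dropout concrete, here is a minimal pure-Python sketch. It is not the author's CUDA kernel. It assumes that "structured" means whole square tiles of the activation matrix are kept or dropped together, which is one common way to make sparsity hardware-friendly; the tile size, the inverted-dropout rescaling by 1 / (1 - p), and all function names below are hypothetical and chosen only for illustration.

```python
import random


def block_dropout_mask(rows, cols, block, p, rng):
    """Draw one keep/drop decision per block x block tile (hypothetical helper)."""
    if rows % block or cols % block:
        raise ValueError("rows and cols must be multiples of block")
    # tile_keep[ti][tj] is True when the whole tile survives dropout.
    return [[rng.random() >= p for _ in range(cols // block)]
            for _ in range(rows // block)]


def expand_mask(tile_keep, block):
    """Expand the per-tile decisions into a per-element 0/1 mask."""
    return [[1 if tile_keep[i // block][j // block] else 0
             for j in range(len(tile_keep[0]) * block)]
            for i in range(len(tile_keep) * block)]


def dense_dropout_matmul(x, w, mask, p):
    """Reference: mask and rescale every element, then do a full matmul.

    Assumes 0 <= p < 1 so the rescaling factor is finite.
    """
    scale = 1.0 / (1.0 - p)
    n = len(w[0])
    return [[sum(x[i][k] * mask[i][k] * scale * w[k][j] for k in range(len(w)))
             for j in range(n)]
            for i in range(len(x))]


def sparse_dropout_matmul(x, w, tile_keep, block, p):
    """Same result, but dropped tiles of x are skipped instead of multiplied by 0."""
    scale = 1.0 / (1.0 - p)
    n = len(w[0])
    out = [[0.0] * n for _ in range(len(x))]
    tiles_visited = 0
    for ti, keep_row in enumerate(tile_keep):
        for tk, keep in enumerate(keep_row):
            if not keep:
                continue  # the whole tile contributes nothing, so skip its work
            tiles_visited += 1
            for i in range(ti * block, (ti + 1) * block):
                for k in range(tk * block, (tk + 1) * block):
                    for j in range(n):
                        out[i][j] += x[i][k] * scale * w[k][j]
    return out, tiles_visited
```

Calling `sparse_dropout_matmul` with the output of `block_dropout_mask` gives the same product as `dense_dropout_matmul` with the expanded mask, while visiting only the kept tiles. With per-element (unstructured) dropout the zeros are scattered, so no contiguous tile can be skipped, which is the intuition behind the abstract's remark that standard dropout rarely yields GPU speed-ups.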

Code & Models

Repositories

andylolu2/sparse-dropout (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Dropout