Efficient Online Processing with Deep Neural Networks

Lukas Hedegaard

arXiv:2306.13474 · cs.LG · June 26, 2023 · 1 citation

TL;DR

This paper focuses on improving the efficiency of deep neural networks during online inference by introducing Continual Inference Networks (CINs) and structured pruning techniques, reducing computational costs while maintaining accuracy.

Contribution

It proposes Continual Inference Networks (CINs) for online processing and introduces structured pruning adapters for efficient model adaptation and acceleration.
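
The idea behind continual inference can be sketched with a toy two-layer temporal convolution. This is an illustrative assumption, not the author's implementation and not the API of the continual-inference library: the class `ContinualConv1d` and the helper functions are hypothetical names, and plain Python lists stand in for tensors. The windowed baseline re-runs the whole network on the latest clip for every new frame, while the continual variant caches each layer's recent inputs and computes only the one new value per layer.

```python
from collections import deque


class ContinualConv1d:
    """Hypothetical step-wise temporal convolution that caches its last inputs."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        self.buffer = deque(maxlen=len(self.kernel))  # oldest input first
        self.mults = 0  # multiplication counter, for comparison only

    def step(self, x):
        self.buffer.append(x)
        if len(self.buffer) < len(self.kernel):
            return None  # warm-up: not enough history for an output yet
        self.mults += len(self.kernel)
        return sum(w * v for w, v in zip(self.kernel, self.buffer))


def conv1d_valid(signal, kernel, counter):
    """Plain 'valid' convolution (no padding) over a whole clip."""
    k = len(kernel)
    out = []
    for t in range(len(signal) - k + 1):
        counter[0] += k
        out.append(sum(w * v for w, v in zip(kernel, signal[t:t + k])))
    return out


def windowed_inference(stream, k1, k2):
    """Baseline: recompute both layers on the full receptive field per frame."""
    receptive = len(k1) + len(k2) - 1
    counter = [0]
    outputs = []
    for t in range(receptive, len(stream) + 1):
        clip = stream[t - receptive:t]
        hidden = conv1d_valid(clip, k1, counter)
        outputs.append(conv1d_valid(hidden, k2, counter)[0])
    return outputs, counter[0]


def continual_inference(stream, k1, k2):
    """Continual variant: one new value per layer per incoming frame."""
    layer1, layer2 = ContinualConv1d(k1), ContinualConv1d(k2)
    outputs = []
    for x in stream:
        h = layer1.step(x)
        if h is None:
            continue
        y = layer2.step(h)
        if y is not None:
            outputs.append(y)
    return outputs, layer1.mults + layer2.mults


if __name__ == "__main__":
    stream = list(range(20))
    k1, k2 = [1, 2, 1], [1, 0, -1]
    win_out, win_mults = windowed_inference(stream, k1, k2)
    con_out, con_mults = continual_inference(stream, k1, k2)
    print("same outputs:", win_out == con_out)
    print("multiplications (windowed vs continual):", win_mults, con_mults)
```

Both paths produce the same predictions once the buffers are warm. The saving comes from the first layer no longer recomputing hidden values it already produced for earlier frames, and the gap widens as kernels get longer or more layers are stacked.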

Findings

01. CINs improve online inference efficiency by an order of magnitude.
02. 3D CNNs, ST-GCNs, and Transformers are reformulated as CINs.
03. Structured pruning adapters outperform fine-tuning in accuracy with fewer weights.
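
To make the term "structured pruning" concrete, here is a minimal sketch of generic magnitude-based unit pruning between two dense layers. It is not the structured pruning adapter method from the dissertation, which this page does not describe in detail; the function name `prune_hidden_units` and the L2-norm ranking criterion are assumptions chosen for illustration. The point is that whole hidden units are removed, so the result is a pair of smaller dense matrices rather than a sparse mask.

```python
import math


def prune_hidden_units(w1, w2, keep):
    """Keep the `keep` hidden units whose rows in w1 have the largest L2 norm.

    w1: list of rows, shape (hidden, inputs)  -- first layer weights
    w2: list of rows, shape (outputs, hidden) -- second layer weights
    Returns the pruned w1, the pruned w2, and the kept unit indices.
    """
    norms = [math.sqrt(sum(v * v for v in row)) for row in w1]
    ranked = sorted(range(len(w1)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original unit order
    w1_pruned = [w1[i] for i in kept]                   # drop whole rows
    w2_pruned = [[row[i] for i in kept] for row in w2]  # drop matching columns
    return w1_pruned, w2_pruned, kept


if __name__ == "__main__":
    w1 = [[3.0, 4.0], [0.0, 0.1], [1.0, 0.0], [0.0, 2.0]]
    w2 = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
    p1, p2, kept = prune_hidden_units(w1, w2, keep=2)
    print("kept units:", kept)
    print("pruned shapes:", (len(p1), len(p1[0])), (len(p2), len(p2[0])))
```

Because the pruned layers stay dense, they run faster on ordinary hardware without needing sparse kernels, which is the usual motivation for structured over unstructured pruning.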

Abstract

The capabilities and adoption of deep neural networks (DNNs) grow at an exhilarating pace: Vision models accurately classify human actions in videos and identify cancerous tissue in medical scans as precisely as human experts; large language models answer wide-ranging questions, generate code, and write prose, becoming the topic of everyday dinner-table conversations. Even though their uses are exhilarating, the continually increasing model sizes and computational complexities have a dark side. The economic cost and negative environmental externalities of training and serving models are in evident disharmony with financial viability and climate action goals. Instead of pursuing yet another increase in predictive performance, this dissertation is dedicated to the improvement of neural network efficiency. Specifically, a core contribution addresses the efficiency aspects during online…

Code & Models

Repositories

LukasHedegaard/continual-inference (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods

Methods: Multi-Head Attention · Attention Is All You Need · Pruning · Absolute Position Encodings · Linear Layer · Position-Wise Feed-Forward Layer · Layer Normalization · Label Smoothing · Adam · Byte Pair Encoding