Patch DCT vs LeNet

David Sinclair

arXiv:2211.02392·cs.CV·November 7, 2022

TL;DR

This paper compares a neural network fed with the DCT (Discrete Cosine Transform) of image patches against LeNet for MNIST digit classification. It notes that the DCT basis functions resemble some learned Transformer features but are much faster to apply.

Contribution

It introduces a DCT-based neural network approach and compares its performance and efficiency to LeNet on MNIST classification.

Findings

01. DCT-based approach is faster to compute than LeNet.

02. DCT basis functions resemble some learned features of Visual Transformers.

03. Performance comparison shows competitive accuracy.

Abstract

This paper compares the performance of a NN taking the output of a DCT (Discrete Cosine Transform) of an image patch with LeNet for classifying MNIST handwritten digits. The basis functions underlying the DCT bear a passing resemblance to some of the learned basis functions of the Visual Transformer but are an order of magnitude faster to apply.
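As an illustration of the front end the abstract describes, here is a minimal sketch of turning an image into per-patch DCT coefficients. It is not the paper's implementation: the 7x7 non-overlapping patch size, the orthonormal DCT-II variant, and the helper names `dct_matrix` and `patch_dct_features` are all assumptions made for this example, and the classifier that would consume the features is left out.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # rescale the DC row so that C is orthonormal
    return c

def patch_dct_features(image: np.ndarray, patch: int = 7) -> np.ndarray:
    """Split an image into non-overlapping patches and 2-D DCT each one.

    The patch size is an assumption for illustration, not taken from the paper.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "patch size must tile the image"
    c = dct_matrix(patch)
    # (h/p, p, w/p, p) -> (h/p, w/p, p, p): one p x p block per patch
    blocks = image.reshape(h // patch, patch, w // patch, patch).transpose(0, 2, 1, 3)
    coeffs = c @ blocks @ c.T   # separable 2-D DCT applied to every block
    return coeffs.reshape(-1)   # flat feature vector for a downstream classifier

# Random array standing in for a 28x28 MNIST digit.
image = np.random.default_rng(0).random((28, 28))
features = patch_dct_features(image)
```

Because the basis is fixed, the transform is two small matrix products per patch with nothing to learn, which is the kind of cost advantage the abstract points to. `scipy.fft.dctn` with `norm="ortho"` computes the same orthonormal 2-D DCT and could replace the hand-built matrix.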

Taxonomy

Topics: Neural Networks and Applications · Hand Gesture Recognition Systems · Image Processing Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Dense Connections · Absolute Position Encodings · Layer Normalization