TL;DR
PICASO is a permutation-invariant set operator that helps deep networks process set inputs robustly, both under reordering of the set and under data variations such as image translation or viewpoint change, improving performance on tasks like classification, clustering, and anomaly detection.
Contribution
We introduce PICASO, a permutation-invariant cascaded attentional set operator that outperforms existing algorithms in handling data variations and can be adapted to various machine learning tasks.
Findings
Improves image classification accuracy by 10% on SmallNORB when evaluated on new viewpoints.
Raises anomaly detection ROC AUC by 22% and PR AUC by 10%.
Boosts state prediction AP by 40% on CLEVR.
Abstract
Set-input deep networks have recently drawn much interest in computer vision and machine learning. This is in part due to the increasing number of important tasks such as meta-learning, clustering, and anomaly detection that are defined on set inputs. These networks must take an arbitrary number of input samples and produce the output invariant to the input set permutation. Several algorithms have been recently developed to address this urgent need. Our paper analyzes these algorithms using both synthetic and real-world datasets, and shows that they are not effective in dealing with common data variations such as image translation or viewpoint change. To address this limitation, we propose a permutation-invariant cascaded attentional set operator (PICASO). The gist of PICASO is a cascade of multihead attention blocks with dynamic templates. The proposed operator is a stand-alone module…
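
To make the idea of a permutation-invariant attentional cascade concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it uses a single attention head with no learned projections (the abstract describes multihead blocks), and the function names `attend` and `cascaded_pool`, the residual template update, and the number of blocks are all assumptions chosen for illustration. What it shows is that pooling a set with a template query gives an output independent of element order and set size, and that feeding the pooled result back into the template makes the template depend on the input across blocks.

```python
import math
import random


def attend(query, items):
    """Pool a set of vectors into one vector with dot-product attention.

    Single head, no learned weights (a simplification for illustration).
    """
    d = len(query)
    scores = [sum(q * x for q, x in zip(query, item)) / math.sqrt(d)
              for item in items]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * item[k] for w, item in zip(weights, items)) / total
            for k in range(d)]


def cascaded_pool(template, items, n_blocks=3):
    """Apply attention pooling repeatedly, updating the template each time.

    The residual update below is a hypothetical stand-in for a
    "dynamic template": after the first block the query depends on the set.
    """
    for _ in range(n_blocks):
        pooled = attend(template, items)
        template = [t + p for t, p in zip(template, pooled)]
    return template


if __name__ == "__main__":
    rng = random.Random(0)
    items = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
    template = [0.1, -0.2, 0.3, 0.4]

    shuffled = items[:]
    rng.shuffle(shuffled)

    a = cascaded_pool(template, items)
    b = cascaded_pool(template, shuffled)
    # The two outputs agree up to floating-point rounding.
    print(all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b)))
```

Because the softmax weights and the weighted sum are both computed over the whole set, reordering the elements only changes the order of summation, so the output is the same up to rounding. The output length is fixed by the template, not by the number of set elements.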
