Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks

Haobo Song; Soumajit Majumder; Tao Lin

arXiv:2307.08013 · cs.LG · October 23, 2023

TL;DR

This paper revisits implicit models, particularly weight-tied models, demonstrating their superior efficiency and stability over DEQs in vision tasks, and explores sparsity techniques to enhance their capacity.

Contribution

The paper shows that weight-tied models are effective for vision tasks, benchmarks them against DEQ variants, and proposes distinct sparse masks to improve their capacity.

Findings

01. Weight-tied models outperform DEQs in efficiency and stability.

02. Sparsity masks can enhance model capacity.

03. Guidelines for designing depth, width, and sparsity in weight-tied models.

Abstract

Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite layer models with elegant solution-finding procedures and constant memory footprint. However, despite several attempts, these methods are heavily constrained by model inefficiency and optimization instability. Furthermore, fair benchmarking across relevant methods for vision tasks is missing. In this work, we revisit the line of implicit models and trace them back to the original weight-tied models. Surprisingly, we observe that weight-tied models are more effective, stable, as well as efficient on vision tasks, compared to the DEQ variants. Through the lens of these simple-yet-clean weight-tied models, we further study the fundamental limits in the model capacity of such models and propose the use of distinct sparse masks to improve the model…
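To make the idea in the abstract concrete, here is a minimal sketch of a weight-tied forward pass with and without per-depth sparse masks. It is an illustration under stated assumptions, not the authors' implementation: the update rule `z = relu(W_t z + U x)`, the random binary masks, and every name (`weight_tied_forward`, `make_mask`, `sparsity`) are hypothetical choices made for this example. The point it shows is that one shared weight matrix `W` is reused at every depth step, and a distinct mask per step lets each step use a different sparse view of that same matrix without adding trainable weights.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def make_mask(rows, cols, sparsity, rng):
    # Hypothetical binary mask: each entry is zeroed with probability `sparsity`.
    return [[0.0 if rng.random() < sparsity else 1.0 for _ in range(cols)]
            for _ in range(rows)]

def apply_mask(W, M):
    # Elementwise product of the shared weights with one mask.
    return [[w * m for w, m in zip(w_row, m_row)] for w_row, m_row in zip(W, M)]

def weight_tied_forward(W, U, x, depth, masks=None):
    """Unroll `depth` steps that all reuse the same W (weight tying).

    If `masks` is given, step t uses W * masks[t] (assumed scheme), so each
    step sees a different sparse sub-network of the shared weights.
    """
    z = [0.0] * len(W)
    injected = matvec(U, x)  # input injection, recomputed from x once
    for t in range(depth):
        W_t = W if masks is None else apply_mask(W, masks[t])
        z = relu([a + b for a, b in zip(matvec(W_t, z), injected)])
    return z

rng = random.Random(0)
dim, depth = 4, 3
W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
U = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
x = [1.0, -1.0, 0.5, 2.0]

masks = [make_mask(dim, dim, 0.5, rng) for _ in range(depth)]
dense_out = weight_tied_forward(W, U, x, depth)          # plain weight tying
masked_out = weight_tied_forward(W, U, x, depth, masks)  # distinct mask per step
```

In this sketch the masks are fixed and random; how masks are actually chosen, and how depth, width, and sparsity should be balanced, is what the paper's guidelines address.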


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research

Methods: Deep Equilibrium Models