RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild

Weiyao Wang; Byung-Hak Kim; Varun Ganapathi

arXiv:2211.01165 · cs.CV · November 3, 2022

Open Access

TL;DR

RegCLR is a self-supervised learning framework for tabular and structured document images. It combines contrastive and regularized methods to improve representation quality in real-world scenarios.

Contribution

Introduces RegCLR, a self-supervised framework for tabular and structured document images that integrates contrastive and regularized methods and is compatible with the standard Vision Transformer architecture.

Findings

01 Significant average precision (AP) improvements in table and GUI object detection.

02 Effective in diverse real-world document image scenarios.

03 Improves downstream performance over supervised baselines.

Abstract

Recent advances in self-supervised learning (SSL) using large models to learn visual representations from natural images are rapidly closing the gap between the results produced by fully supervised learning and those produced by SSL on downstream vision tasks. Inspired by this advancement and primarily motivated by the emergence of tabular and structured document image applications, we investigate which self-supervised pretraining objectives, architectures, and fine-tuning strategies are most effective. To address these questions, we introduce RegCLR, a new self-supervised framework that combines contrastive and regularized methods and is compatible with the standard Vision Transformer architecture. Then, RegCLR is instantiated by integrating masked autoencoders as a representative example of a contrastive method and enhanced Barlow Twins as a representative example of a regularized…
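
The abstract names Barlow Twins as the regularized component of RegCLR. As a rough illustration only, the sketch below computes the original Barlow Twins objective on two batches of embeddings with NumPy; it is not the paper's "enhanced" variant, whose details are not given here. The function name `barlow_twins_loss`, the weight `lam`, and the stabilizer `eps` are assumptions made for this example.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lam=5e-3, eps=1e-6):
    """Original Barlow Twins objective (illustrative sketch, not RegCLR's variant).

    z_a, z_b: arrays of shape (batch, dim) holding embeddings of two
    augmented views of the same inputs. lam and eps are assumed defaults.
    """
    n = z_a.shape[0]
    # Standardize each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: push diagonal entries toward 1.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: push off-diagonal entries toward 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + lam * off_diag


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(256, 8))
    view_b = view_a + 0.1 * rng.normal(size=(256, 8))  # lightly perturbed view
    print(barlow_twins_loss(view_a, view_b))
```

The loss is small when the two views yield nearly identical, decorrelated features, and grows when the views disagree, which is the regularizing behavior the abstract attributes to this family of methods.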

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Adam · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Layer Normalization