Reproducing BowNet: Learning Representations by Predicting Bags of Visual Words

Harry Nguyen; Stone Yun; Hisham Mohammad

arXiv:2201.03556·cs.CV·January 19, 2022

TL;DR

This paper attempts to reproduce the results of BowNet, a self-supervised learning method that uses bag-of-visual-words descriptors as its training target, but does not match the accuracy improvements originally reported.

Contribution

It provides a reimplementation of BowNet and discusses possible reasons for the difficulties encountered in reproducing its results.

Findings

01

Failed to replicate the original accuracy improvements

02

Identified potential factors affecting reproducibility

03

Highlights challenges in reproducing SSL methods

Abstract

This work aims to reproduce results from the CVPR 2020 paper by Gidaris et al. Self-supervised learning (SSL) is used to learn feature representations of an image from an unlabeled dataset. The original paper proposes using bag-of-words (BoW) deep feature descriptors as a self-supervised learning target to learn robust, deep representations. BowNet is trained to reconstruct the histogram of visual words (i.e., the deep BoW descriptor) of a reference image when presented with a perturbed version of that image as input. Thus, this method aims to learn perturbation-invariant and context-aware image features that can be useful for few-shot tasks or supervised downstream tasks. In the paper, the authors describe BowNet as a network consisting of a convolutional feature extractor $\Phi(\cdot)$ and a Dense-softmax layer $\Omega(\cdot)$ trained to predict BoW features from images. After BoW training, the…
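The training target described in the abstract can be illustrated with a small, standard-library-only Python sketch. It is not the authors' implementation: the function names (`nearest_word`, `bow_histogram`, `bow_loss`), the hard nearest-word assignment, the L1 normalization, and the cross-entropy loss are all assumptions made for illustration, and the toy codebook and features are made-up values. The `logits` list stands in for the output that $\Omega(\Phi(\cdot))$ would produce for a perturbed image.

```python
import math

def nearest_word(feature, codebook):
    """Index of the codebook entry closest to `feature` (squared L2 distance)."""
    dists = [sum((f - c) ** 2 for f, c in zip(feature, word)) for word in codebook]
    return dists.index(min(dists))

def bow_histogram(local_features, codebook):
    """Assign each local feature to a visual word; return the L1-normalized histogram."""
    counts = [0.0] * len(codebook)
    for feat in local_features:
        counts[nearest_word(feat, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def bow_loss(pred_logits, target_hist):
    """Cross-entropy between the predicted word distribution and the target histogram."""
    probs = softmax(pred_logits)
    return -sum(t * math.log(p) for t, p in zip(target_hist, probs) if t > 0)

# Toy setup (assumed values): 2-D local features and a 3-word codebook.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
reference_features = [[0.1, 0.0], [0.9, 0.1], [1.1, -0.1], [0.0, 0.8]]

# Target: BoW histogram of the unperturbed reference image.
target = bow_histogram(reference_features, codebook)

# Hypothetical network output for the perturbed image.
logits = [0.2, 1.0, 0.1]
loss = bow_loss(logits, target)
print(target, loss)
```

In this sketch the target comes only from the reference image, while the prediction comes from the perturbed one, so minimizing the loss pushes the predicted word distribution toward a histogram that does not change under the perturbation.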

Code & Models

Repositories

StoneY1/Reproducing-BowNet
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques