Scalable Visual Attribute Extraction through Hidden Layers of a Residual ConvNet
Andres Baloian, Nils Murrugarra-Llerena, Jose M. Saavedra

TL;DR
This paper presents a scalable method for extracting visual attributes from images using hidden layers of a ResNet-50, enabling high-accuracy attribute discrimination without retraining for new attribute sets.
Contribution
It introduces a novel approach leveraging pre-trained convnet hidden layers for attribute extraction, avoiding the need for retraining as attributes change.
Findings
The second block of ResNet-50 discriminates colors with over 93% accuracy.
The fourth block of ResNet-50 discriminates textures with over 93% accuracy.
Feature embeddings can be reduced in size with UMAP while maintaining high accuracy.
Abstract
Visual attributes play an essential role in real applications based on image retrieval. For instance, extracting attributes from images allows an eCommerce search engine to produce retrieval results with higher precision. The traditional way to build an attribute extractor is to train a convnet-based classifier with a fixed number of classes. However, this approach does not scale to real applications where the set of attributes changes frequently. Therefore, in this work, we propose an approach for extracting visual attributes from images that leverages the learned capability of the hidden layers of a general convolutional network to discriminate among different visual features. We run experiments with a ResNet-50 trained on ImageNet, evaluating how well the output of each of its blocks discriminates between colors and textures. Our results show that the second block…
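The idea of reusing hidden-layer outputs without retraining can be illustrated with a small sketch. It is not the authors' implementation: it assumes the feature map from a hidden block is reduced with global average pooling and that attributes are assigned by nearest centroid, neither of which is confirmed by the text above. The toy feature maps stand in for real ResNet-50 block outputs, and the names `global_average_pool`, `build_centroids`, and `predict` are hypothetical.

```python
from math import dist

def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a C-dim vector."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def build_centroids(labeled_embeddings):
    """Average the embeddings of each attribute value into one centroid."""
    centroids = {}
    for label, vectors in labeled_embeddings.items():
        n = len(vectors)
        centroids[label] = [sum(column) / n for column in zip(*vectors)]
    return centroids

def predict(embedding, centroids):
    """Return the attribute value whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: dist(embedding, centroids[label]))

# Toy 2-channel, 2x2 feature map standing in for a hidden-block output
# (assumption: a real one would come from a frozen, pre-trained network).
toy_map = [
    [[1.0, 1.0], [0.5, 1.5]],    # channel 0
    [[0.0, 0.0], [0.25, 0.25]],  # channel 1
]
pooled = global_average_pool(toy_map)

# Reference embeddings per attribute value (made-up numbers).
labeled = {
    "red": [[1.0, 0.0], [0.75, 0.25]],
    "blue": [[0.0, 1.0], [0.25, 0.75]],
}
centroids = build_centroids(labeled)
first_prediction = predict(pooled, centroids)

# Adding a new attribute value only needs new reference embeddings;
# the feature extractor itself is left untouched.
labeled["green"] = [[0.5, 0.5]]
centroids = build_centroids(labeled)
second_prediction = predict([0.5, 0.5], centroids)
```

In this sketch, extending the attribute set amounts to adding entries to `labeled` and recomputing centroids, which mirrors the scalability argument in the abstract under the stated assumptions.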
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
Methods: 1x1 Convolution · Average Pooling · Batch Normalization · Global Average Pooling · Max Pooling · Residual Connection · Bottleneck Residual Block · Convolution · Kaiming Initialization
