TL;DR
This paper presents a deep neural network approach for relative attribute prediction in images, outperforming previous methods by learning effective features end-to-end with a ranking loss.
Contribution
It introduces a convolutional neural network with a ranking layer, trained end-to-end for relative attribute prediction, which improves accuracy over prior methods built on hand-crafted features.
Findings
Outperforms baseline and state-of-the-art methods
Learns effective attribute-specific features
Provides visualizations demonstrating learned saliency maps
Abstract
Visual attributes are great means of describing images or scenes, in a way both humans and computers understand. In order to establish a correspondence between images and to be able to compare the strength of each property between images, relative attributes were introduced. However, since their introduction, hand-crafted and engineered features were used to learn increasingly complex models for the problem of relative attributes. This limits the applicability of those methods for more realistic cases. We introduce a deep neural network architecture for the task of relative attribute prediction. A convolutional neural network (ConvNet) is adopted to learn the features by including an additional layer (ranking layer) that learns to rank the images based on these features. We adopt an appropriate ranking loss to train the whole network in an end-to-end fashion. Our proposed method…
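The abstract mentions "an appropriate ranking loss" without naming it in the text available here. As a rough illustration only, the sketch below assumes a RankNet-style pairwise loss: binary cross-entropy applied to the difference between the ranking-layer scores of two images. The function name `pairwise_ranking_loss`, its arguments, and the target encoding (1.0, 0.0, 0.5) are hypothetical and not taken from the paper.

```python
import numpy as np


def pairwise_ranking_loss(score_i, score_j, target):
    """Mean pairwise ranking loss over a batch of image pairs.

    Assumption (not confirmed by the source): a RankNet-style loss,
    i.e. cross-entropy between `target` and sigmoid(score_i - score_j).

    score_i, score_j: ranking-layer outputs for the two images of each pair.
    target: 1.0 if image i shows more of the attribute, 0.0 if it shows
            less, 0.5 if the two images are judged similar.
    """
    diff = np.asarray(score_i, dtype=float) - np.asarray(score_j, dtype=float)
    target = np.asarray(target, dtype=float)
    # -t*log(sigmoid(d)) - (1-t)*log(1-sigmoid(d)) simplifies to
    # log(1 + exp(d)) - t*d, which logaddexp evaluates without overflow.
    loss = np.logaddexp(0.0, diff) - target * diff
    return loss.mean()


# A correctly ordered pair should cost less than a misordered one.
good = pairwise_ranking_loss([2.0], [0.0], [1.0])
bad = pairwise_ranking_loss([0.0], [2.0], [1.0])
print(good < bad)
```

In an end-to-end setup, the two scores would come from the same ConvNet applied to both images of a pair, so the gradient of this loss would update the feature layers and the ranking layer together.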
