PixelNet: Towards a General Pixel-level Architecture
Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan

TL;DR
PixelNet proposes a single pixel-level prediction architecture for tasks ranging from edge detection to surface normal estimation to semantic segmentation. It improves accuracy by using stratified sampling and multi-scale features, outperforming task-specific models without additional post-processing.
Contribution
The paper introduces a unified architecture for pixel-level prediction that enhances learning efficiency and accuracy through stratified sampling and multi-scale features.
Findings
Achieved state-of-the-art results on PASCAL-Context, NYUDv2, and BSDS datasets.
Demonstrated the effectiveness of stratified sampling in training.
Showed that a single architecture can outperform task-specific models.
Abstract
We explore architectures for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation. Convolutional predictors, such as the fully-convolutional network (FCN), have achieved remarkable success by exploiting the spatial redundancy of neighboring pixels through convolutional processing. Though computationally efficient, we point out that such approaches are not statistically efficient during learning precisely because spatial redundancy limits the information learned from neighboring pixels. We demonstrate that (1) stratified sampling allows us to add diversity during batch updates and (2) sampled multi-scale features allow us to explore more nonlinear predictors (multiple fully-connected layers followed by ReLU) that improve overall accuracy. Finally, our objective is to show how an architecture can…
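The two ideas named in the abstract, sampling pixels for batch diversity and reading multi-scale features at the sampled locations, can be illustrated with a minimal sketch. It assumes that "stratified sampling" means drawing a small number of pixels from each of several images, and it uses nearest-neighbour lookup into each feature map for brevity. The function names, the dictionary layout, and the toy data generator are all hypothetical and are not taken from the paper.

```python
import random


def sample_pixels(height, width, num_samples, rng):
    """Draw num_samples distinct (row, col) locations uniformly from one image."""
    flat = rng.sample(range(height * width), num_samples)
    return [divmod(i, width) for i in flat]


def multiscale_descriptor(feature_maps, row, col, height, width):
    """Concatenate the features found at (row, col) in maps of differing resolution.

    Each map is a nested list indexed [r][c] -> list of channel values.
    Nearest-neighbour lookup is a simplification assumed for this sketch.
    """
    descriptor = []
    for fmap in feature_maps:
        fh, fw = len(fmap), len(fmap[0])
        r = min(row * fh // height, fh - 1)
        c = min(col * fw // width, fw - 1)
        descriptor.extend(fmap[r][c])
    return descriptor


def build_batch(images, samples_per_image, rng):
    """Take a few pixels from each of many images instead of every pixel of one."""
    batch = []
    for img in images:
        h, w = img["height"], img["width"]
        for row, col in sample_pixels(h, w, samples_per_image, rng):
            feat = multiscale_descriptor(img["feature_maps"], row, col, h, w)
            batch.append((feat, img["labels"][row][col]))
    return batch


def make_toy_image(height, width, scales, rng):
    """Hypothetical stand-in for a network's feature maps: random values only.

    scales is a list of (stride, channels) pairs, one per feature map.
    """
    maps = []
    for stride, channels in scales:
        fh, fw = max(1, height // stride), max(1, width // stride)
        maps.append([[[rng.random() for _ in range(channels)]
                      for _ in range(fw)] for _ in range(fh)])
    labels = [[rng.randrange(2) for _ in range(width)] for _ in range(height)]
    return {"height": height, "width": width,
            "feature_maps": maps, "labels": labels}


if __name__ == "__main__":
    rng = random.Random(0)
    images = [make_toy_image(8, 8, [(1, 2), (2, 3), (4, 4)], rng)
              for _ in range(3)]
    batch = build_batch(images, samples_per_image=4, rng=rng)
    print(len(batch), "samples, descriptor length", len(batch[0][0]))
```

Each entry of the batch pairs a per-pixel descriptor with its label, so a per-pixel nonlinear predictor such as the stack of fully-connected layers with ReLU mentioned in the abstract could be trained on these pairs directly. Because neighbouring pixels are highly correlated, drawing few pixels from many images is one way to make each batch more diverse than a dense batch built from a single image.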
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
