Semantic Video Segmentation : Exploring Inference Efficiency

Subarna Tripathi; Serge Belongie; Youngbae Hwang; Truong Nguyen

arXiv:1509.02441·cs.CV·September 9, 2015

TL;DR

This paper presents an efficient joint CRF inference method for video semantic segmentation that combines semantic co-labeling with more expressive models, improving accuracy by up to 8% on CamVid over per-frame image segmentation with no additional time overhead.

Contribution

It introduces an inference formulation that performs CRF inference jointly across video frames rather than one image at a time, integrating semantic co-labeling with more expressive models and outperforming individual (per-frame) semantic image segmentation.

Findings

01 Achieves up to 8% higher accuracy on the CamVid dataset (with TextonBoost unaries) than individual semantic image segmentation

02 Performs inference over ten thousand images within seconds

03 Adds no time overhead in exchange for the improved accuracy

Abstract

We explore the efficiency of CRF inference beyond image-level semantic segmentation and perform joint inference over video frames. The key idea is to combine the best of two worlds: semantic co-labeling and more expressive models. Our formulation enables us to perform inference over ten thousand images within seconds and makes the system amenable to performing video semantic segmentation effectively. On the CamVid dataset, with TextonBoost unaries, our proposed method achieves up to 8% improvement in accuracy over individual semantic image segmentation without additional time overhead. The source code is available at https://github.com/subtri/video_inference

Code & Models

Repositories

subtri/video_inference (Official)

Taxonomy

Methods: Conditional Random Field