FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
Junjie He, Pengyu Li, Yifeng Geng, Xuansong Xie

TL;DR
FastInst is a simple query-based model for real-time instance segmentation that reaches 40.5 AP at 32.5 FPS on COCO test-dev without bells and whistles.
Contribution
The paper introduces FastInst, a query-based framework for real-time instance segmentation that follows the Mask2Former meta-architecture. Its key designs are instance activation-guided queries, a dual-path update strategy, and ground truth mask-guided learning.
Findings
FastInst runs at 32.5 FPS on COCO test-dev.
Achieves 40.5 AP on COCO test-dev, surpassing many real-time models.
Outperforms state-of-the-art real-time counterparts in speed and accuracy.
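To put the reported speed in per-image terms, the short sketch below converts frames per second into average latency. It assumes images are processed one at a time, so that latency is simply the reciprocal of throughput; the page does not state the measurement setup, and the helper name `fps_to_latency_ms` is hypothetical.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to average per-frame latency in ms.

    Assumption: frames are processed sequentially (batch size 1), so
    latency is the reciprocal of throughput.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 32.5 FPS is the speed reported for FastInst on COCO test-dev.
latency = fps_to_latency_ms(32.5)
print(f"{latency:.1f} ms per image")  # prints "30.8 ms per image"
```

Under that assumption, 32.5 FPS corresponds to roughly 30.8 ms per image, just under the 33.3 ms budget of a 30 FPS real-time target.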
Abstract
Recent attention in instance segmentation has focused on query-based models. Despite being non-maximum suppression (NMS)-free and end-to-end, the superiority of these models on high-accuracy real-time benchmarks has not been well demonstrated. In this paper, we show the strong potential of query-based models on efficient instance segmentation algorithm designs. We present FastInst, a simple, effective query-based framework for real-time instance segmentation. FastInst can execute at a real-time speed (i.e., 32.5 FPS) while yielding an AP of more than 40 (i.e., 40.5 AP) on COCO test-dev without bells and whistles. Specifically, FastInst follows the meta-architecture of recently introduced Mask2Former. Its key designs include instance activation-guided queries, dual-path update strategy, and ground truth mask-guided learning, which enable us to use lighter pixel decoders, fewer…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Colorectal Cancer Screening and Detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Dropout
