Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions
Lishi Zhang, Chenghan Fu, Jia Li

TL;DR
This paper introduces a collaborative human-agent system for efficiently annotating per-pixel semantic object masks in images using multi-granularity supervision, reducing manual effort and improving accuracy.
Contribution
It presents a novel approach combining weak and strong supervisions with human interaction to automate and enhance semantic object mask annotation.
Findings
Reduces annotation time compared to traditional methods
Produces masks highly consistent with manual annotations
Leverages multi-granularity (weak and strong) supervisions effectively
Abstract
Per-pixel masks of semantic objects are very useful in many applications, but they are tedious to annotate. In this paper, we propose a human-agent collaborative annotation approach that can efficiently generate per-pixel masks of semantic objects in tagged images with multi-granularity supervisions. Given a set of tagged images, a computer agent is first dynamically generated to roughly localize the semantic objects described by the tag. The agent first extracts a large number of object proposals from an image and then infers the tag-related ones under the weak and strong supervisions from linguistically and visually similar images and previously annotated object masks. By representing such supervisions with over-complete dictionaries, the tag-related object proposals pop out according to their sparse coding length and are then converted to superpixels with binary labels. After…
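The proposal-selection step in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not define "sparse coding length", so the sketch assumes the Lasso objective value (reconstruction error plus an L1 penalty on the code) as a stand-in score, solved with plain ISTA. The names `ista_sparse_code`, `coding_cost`, `select_proposals`, the parameter `lam`, and the toy dictionary are all hypothetical.

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA."""
    # Step size 1/L, where L is the squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)          # gradient of the quadratic term
        z = a - grad / L                  # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def coding_cost(D, x, lam=0.1):
    """Assumed stand-in for 'sparse coding length': the Lasso objective value."""
    a = ista_sparse_code(D, x, lam)
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def select_proposals(D, feats, lam=0.1, keep=2):
    """Return indices of the `keep` cheapest-to-encode proposals, plus all costs."""
    costs = np.array([coding_cost(D, f, lam) for f in feats])
    return np.argsort(costs)[:keep], costs

# Toy dictionary: 4 unit-norm atoms spanning a 2-D subspace of R^4
# (over-complete for that subspace). Columns are atoms.
s = 1.0 / np.sqrt(2.0)
D = np.array([[1.0, 0.0, s,  s],
              [0.0, 1.0, s, -s],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])

# Hypothetical proposal features: 0 and 2 lie in the dictionary span,
# 1 and 3 are orthogonal to it.
feats = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [s,   s,   0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])

selected, costs = select_proposals(D, feats, lam=0.1, keep=2)
```

In this toy setup the proposals that the dictionary can encode cheaply (indices 0 and 2) are selected, while the proposals orthogonal to every atom keep the full cost of an all-zero code. A real system would build the dictionaries from the supervisions the abstract mentions and would use learned proposal features.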
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
