C-DLinkNet: considering multi-level semantic features for human parsing

Yu Lu; Muyan Feng; Ming Wu; Chuang Zhang

arXiv:2001.11690·cs.CV·April 7, 2020·1 cites

C-DLinkNet: considering multi-level semantic features for human parsing

Yu Lu, Muyan Feng, Ming Wu, Chuang Zhang

PDF

Open Access 1 Repo

TL;DR

This paper introduces C-DLinkNet, an end-to-end human parsing model that effectively combines multi-level features to improve semantic segmentation accuracy without extra data or large inputs.

Contribution

The paper presents C-DLinkNet with a novel Smooth Module for multi-level feature fusion, achieving competitive results with smaller inputs and no additional information.

Findings

01

Achieves mIoU=53.05 on LIP dataset validation set.

02

Outperforms some state-of-the-art methods with smaller input sizes.

03

Uses a new Smooth Module for feature combination.

Abstract

Human parsing is an essential branch of semantic segmentation, which is a fine-grained semantic segmentation task to identify the constituent parts of human. The challenge of human parsing is to extract effective semantic features to resolve deformation and multi-scale variations. In this work, we proposed an end-to-end model called C-DLinkNet based on LinkNet, which contains a new module named Smooth Module to combine the multi-level features in Decoder part. C-DLinkNet is capable of producing competitive parsing performance compared with the state-of-the-art methods with smaller input sizes and no additional information, i.e., achiving mIoU=53.05 on the validation set of LIP dataset.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

MindSpore-MS-Code2/code0/tree/main/dlinknet
mindspore

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications