TIPS: Text-Induced Pose Synthesis
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein

TL;DR
This paper introduces TIPS, a novel text-based human pose transfer method that generates poses from textual descriptions, addressing limitations of existing image-based approaches and including a new dataset for training.
Contribution
The paper proposes a new text-to-pose transfer framework with a three-stage process and introduces the DF-PASS dataset with descriptive pose annotations.
Findings
The method achieves promising qualitative results.
Quantitative scores demonstrate effectiveness.
The approach outperforms existing pose transfer techniques.
Abstract
In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose from an already available observation of that person. Though researchers have recently proposed several methods to achieve this task, most of these techniques derive the target pose directly from the desired target image on a specific dataset, making the underlying process challenging to apply in real-world scenarios as the generation of the target image is the actual aim. In this paper, we first present the shortcomings of current pose transfer algorithms and then propose a novel text-based pose transfer technique to address those issues. We divide the problem into three independent stages: (a) text to pose representation, (b) pose refinement, and (c) pose rendering. To the best of our knowledge, this is one of the first attempts to develop a text-based…
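The three-stage split named in the abstract can be pictured as a simple composition of functions. The sketch below is only an illustration of that structure under stated assumptions: the names `Pose`, `text_to_pose`, `refine_pose`, `render_pose`, and `three_stage_pipeline` are hypothetical, the keypoint count of 18 is arbitrary, and each stage body is a trivial placeholder rather than the authors' models, which the abstract does not describe.

```python
from dataclasses import dataclass
from typing import List, Tuple

Keypoint = Tuple[float, float]  # normalized (x, y) coordinates in [0, 1]
Image = List[List[int]]         # toy grayscale image as nested lists


@dataclass
class Pose:
    keypoints: List[Keypoint]


def text_to_pose(description: str, num_keypoints: int = 18) -> Pose:
    """Stage (a) placeholder: map a text description to a coarse pose.

    A real system would use a learned text-conditioned model. This stub
    ignores the text and places keypoints on a vertical line.
    """
    step = 1.0 / (num_keypoints - 1)
    return Pose([(0.5, i * step) for i in range(num_keypoints)])


def refine_pose(pose: Pose) -> Pose:
    """Stage (b) placeholder: clean up the coarse pose.

    Here refinement only clamps coordinates into the valid [0, 1] range.
    """
    def clamp(v: float) -> float:
        return min(1.0, max(0.0, v))

    return Pose([(clamp(x), clamp(y)) for x, y in pose.keypoints])


def render_pose(source: Image, pose: Pose) -> Image:
    """Stage (c) placeholder: produce an image of the person in the pose.

    This stub copies the source image and marks each keypoint with 255.
    """
    height, width = len(source), len(source[0])
    out = [row[:] for row in source]  # copy so the source stays untouched
    for x, y in pose.keypoints:
        out[int(y * (height - 1))][int(x * (width - 1))] = 255
    return out


def three_stage_pipeline(source: Image, description: str) -> Image:
    """Chain the stages: text -> coarse pose -> refined pose -> image."""
    coarse = text_to_pose(description)
    refined = refine_pose(coarse)
    return render_pose(source, refined)


source_image = [[0] * 8 for _ in range(8)]
result = three_stage_pipeline(source_image, "a person standing upright")
```

Because the stages only communicate through the `Pose` value, any one of them can be replaced or evaluated on its own, which is the practical benefit of treating them as independent.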
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Advanced Neural Network Applications
