IntentTuner: An Interactive Framework for Integrating Human Intents in Fine-tuning Text-to-Image Generative Models
Xingchen Zeng, Ziyao Gao, Yilin Ye, and Wei Zeng

TL;DR
IntentTuner is an interactive framework that enhances fine-tuning of text-to-image models by integrating human intentions through user-friendly tools and new metrics, improving model alignment and reducing effort.
Contribution
It introduces an interactive system for incorporating human intentions into fine-tuning, along with novel metrics for measuring intent alignment, and it improves the user experience.
Findings
Reduces cognitive effort in the fine-tuning process
Produces models with better alignment to user intentions
Streamlines the fine-tuning workflow
Abstract
Fine-tuning facilitates the adaptation of text-to-image generative models to novel concepts (e.g., styles and portraits), empowering users to forge creatively customized content. Recent efforts on fine-tuning focus on reducing training data and lightening computation overload but neglect alignment with user intentions, particularly in manual curation of multi-modal training data and intent-oriented evaluation. Informed by a formative study with fine-tuning practitioners for comprehending user intentions, we propose IntentTuner, an interactive framework that intelligently incorporates human intentions throughout each phase of the fine-tuning workflow. IntentTuner enables users to articulate training intentions with imagery exemplars and textual descriptions, automatically converting them into effective data augmentation strategies. Furthermore, IntentTuner introduces novel metrics to…
Taxonomy
Topics: Human Motion and Animation · 3D Modeling in Geospatial Applications · Image Processing and 3D Reconstruction
