TL;DR
OmniColor is a unified framework for multi-modal lineart colorization that combines spatially-aligned and semantic-reference guidance to produce controllable, temporally stable colorization results.
Contribution
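It introduces a unified approach that supports arbitrary combinations of control signals: dual-path encoding with a Dense Feature Alignment loss for spatially-aligned inputs, VLM-only encoding with Temporal Redundancy Elimination for semantic-reference inputs, and a gating strategy.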
Findings
Achieves superior controllability and visual quality.
Demonstrates enhanced temporal stability in colorization.
Outperforms existing methods in handling multi-modal guidance.
Abstract
Lineart colorization is a critical stage in professional content creation, yet achieving precise and flexible results under diverse user constraints remains a significant challenge. To address this, we propose OmniColor, a unified framework for multi-modal lineart colorization that supports arbitrary combinations of control signals. Specifically, we systematically categorize guidance signals into two types: spatially-aligned conditions and semantic-reference conditions. For spatially-aligned inputs, we employ a dual-path encoding strategy paired with a Dense Feature Alignment loss to ensure rigorous boundary preservation and precise color restoration. For semantic-reference inputs, we utilize a VLM-only encoding scheme integrated with a Temporal Redundancy Elimination mechanism to filter repetitive information and enhance inference efficiency. To resolve potential input conflicts, we…
