TL;DR
This paper presents a lightweight, pixel-wise generator architecture for high-resolution image translation that runs up to 18x faster than state-of-the-art methods while maintaining comparable visual quality.
Contribution
The authors introduce a novel pixel-wise network architecture with spatially varying parameters and coordinate encoding, enabling fast and efficient high-resolution image translation.
Findings
The model runs up to 18x faster than state-of-the-art methods.
It achieves comparable visual quality across a range of resolutions.
It is effective for high-resolution image-to-image translation.
Abstract
We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three important steps to equip such a seemingly simple function with adequate expressivity. First, the parameters of the pixel-wise networks are spatially varying, so they can represent a broader function class than simple 1x1 convolutions. Second, these parameters are predicted by a fast convolutional network that processes an aggressively low-resolution representation of the input. Third, we augment the input image with a sinusoidal encoding of spatial coordinates, which provides an effective inductive…
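The three steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`sinusoidal_encoding`, `upsample_nearest`, `pixelwise_mlp`, `translate`), the encoding periods, the ReLU nonlinearity, the layer sizes, and the nearest-neighbour upsampling of the parameter grid are all assumptions made for illustration. Random low-resolution parameters stand in for the output of the fast convolutional network.

```python
import numpy as np


def sinusoidal_encoding(height, width, num_freqs=4):
    """Return an (H, W, 4 * num_freqs) array of sin/cos features of pixel coordinates."""
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    feats = []
    for k in range(num_freqs):
        period = 2.0 ** (k + 2)  # assumed periods of 4, 8, 16, ... pixels
        for coord in (xs, ys):
            angle = 2.0 * np.pi * coord / period
            feats.append(np.sin(angle))
            feats.append(np.cos(angle))
    return np.stack(feats, axis=-1)


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of an (h, w, ...) grid by an integer factor."""
    return grid.repeat(factor, axis=0).repeat(factor, axis=1)


def pixelwise_mlp(features, weights, biases):
    """Apply an independent small MLP at every pixel.

    features: (H, W, C_in)
    weights:  list of (H, W, C_l, C_{l+1}) arrays, one weight matrix per pixel
    biases:   list of (H, W, C_{l+1}) arrays
    """
    x = features
    for i, (w, b) in enumerate(zip(weights, biases)):
        # Per-pixel affine map: no information is shared between pixels.
        x = np.einsum("hwc,hwcd->hwd", x, w) + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)  # ReLU used here as a stand-in nonlinearity
    return x


def translate(image, low_res_params, factor, num_freqs=4):
    """Concatenate the coordinate encoding, then run the spatially varying MLP."""
    height, width, _ = image.shape
    enc = sinusoidal_encoding(height, width, num_freqs)
    x = np.concatenate([image, enc], axis=-1)
    weights = [upsample_nearest(w, factor) for w, _ in low_res_params]
    biases = [upsample_nearest(b, factor) for _, b in low_res_params]
    return pixelwise_mlp(x, weights, biases)


# Toy usage with hypothetical sizes: a 16x16 RGB image and a 2x2 parameter grid.
rng = np.random.default_rng(0)
H, W, factor, num_freqs, hidden = 16, 16, 8, 4, 8
image = rng.random((H, W, 3))
dims = [3 + 4 * num_freqs, hidden, 3]
h, w = H // factor, W // factor
# Random parameters stand in for the prediction of the low-resolution network.
low_res_params = [
    (0.1 * rng.normal(size=(h, w, dims[i], dims[i + 1])), np.zeros((h, w, dims[i + 1])))
    for i in range(len(dims) - 1)
]
out = translate(image, low_res_params, factor, num_freqs)
print(out.shape)
```

Because every pixel has its own weights, the full-resolution work is a handful of small matrix products per pixel. In this sketch, all spatial context has to enter through the low-resolution parameter grid and the coordinate encoding, since the per-pixel function never sees neighbouring pixels.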
