
TL;DR
This paper introduces Attentive VQ-VAE, which extends the standard VQ-VAE with a residual encoder and a residual pixel attention layer to improve data representation and generation while keeping the parameter count practical.
Contribution
The paper proposes the Attentive Residual Encoder (AREN) with inter-pixel auto-attention for better VQ-VAE performance while maintaining practical parameter levels.
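For background, the quantization step that any VQ-VAE encoder feeds into can be sketched as a nearest-neighbour lookup in a codebook. This is a minimal illustration of standard vector quantization, not the AREN encoder itself; the function name `quantize` and the toy codebook values are assumptions made for the example.

```python
def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (squared L2 distance).

    latents:  list of N vectors, each a list of D floats (encoder outputs).
    codebook: list of K vectors, each a list of D floats (learned embeddings).
    Returns (indices, quantized): chosen codebook indices and the matching vectors.
    """
    indices = []
    quantized = []
    for z in latents:
        # Squared Euclidean distance from this latent to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(z, e)) for e in codebook]
        # Index of the closest entry.
        k = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(k)
        quantized.append(list(codebook[k]))
    return indices, quantized


# Toy values, chosen only for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]
indices, quantized = quantize(latents, codebook)
```

In this picture, a better encoder is one that produces latent vectors that survive this lookup with less information loss.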
Findings
Significant improvements in data representation quality.
Enhanced generative performance across tasks.
Efficient attention mechanism with minimal parameters.
Abstract
We present a novel approach to enhance the capabilities of VQ-VAE models through the integration of a Residual Encoder and a Residual Pixel Attention layer, named Attentive Residual Encoder (AREN). The objective of our research is to improve the performance of VQ-VAE while maintaining practical parameter levels. The AREN encoder is designed to operate effectively at multiple levels, accommodating diverse architectural complexities. The key innovation is the integration of an inter-pixel auto-attention mechanism into the AREN encoder. This approach allows us to efficiently capture and utilize contextual information across latent vectors. Additionally, our model uses additional encoding levels to further enhance its representational power. Our attention layer employs a minimal parameter approach, ensuring that latent vectors are modified only when pertinent information from other…
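The idea of a residual attention layer over latent "pixels" can be illustrated with a small sketch. The abstract excerpt does not specify the layer, so the following is an assumption-laden toy: parameter-free scaled dot-product self-attention across latent vectors, added back to the input through a single hypothetical scalar gate `gamma`. With `gamma = 0` the latents pass through unchanged, which is one simple way a residual layer can leave vectors untouched by default.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_pixel_attention(latents, gamma=0.0):
    """Toy residual self-attention over latent vectors.

    latents: list of N vectors (lists of D floats), one per latent position.
    gamma:   hypothetical scalar gate on the attention branch (an assumption,
             not taken from the paper).
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in latents:
        # Similarity of this position to every position (itself included).
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in latents]
        weights = softmax(scores)
        # Attention-weighted average of all latent vectors.
        context = [sum(w * v[d] for w, v in zip(weights, latents)) for d in range(dim)]
        # Residual connection: the input plus the gated context.
        out.append([x + gamma * c for x, c in zip(q, context)])
    return out


latents = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
unchanged = residual_pixel_attention(latents, gamma=0.0)  # identity mapping
mixed = residual_pixel_attention(latents, gamma=0.5)      # context blended in
```

A learned version would typically add projection weights for queries, keys, and values; this sketch omits them to keep the residual structure visible.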
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods
Methods: VQ-VAE
