TL;DR
This study demonstrates that simple convolutional neural networks outperform complex attention-based models in physics-constrained InSAR phase unwrapping, emphasizing the importance of simplicity and physical consistency.
Contribution
The paper provides the first large-scale ablation study showing that simpler U-Net architectures outperform complex attention models in geophysical phase unwrapping tasks.
Findings
Vanilla U-Net (7.76M parameters) outperforms 11.37M-parameter attention-based models by 34% in R^2 and 51% in RMSE.
Attention models introduce unphysical high-frequency artifacts.
U-Net achieves faster inference suitable for operational early-warning systems.
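The two scores named in the findings can be sketched as follows. This is a minimal illustration using the standard definitions of RMSE and the coefficient of determination; whether the paper evaluates them per pixel, per patch, or with masking is not stated here, so the flat pixel-wise averaging below is an assumption, and the function names are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (e.g. cm)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)
```

Both functions accept arrays of any matching shape, so a full unwrapped-phase patch can be passed directly. A perfect prediction gives an RMSE of 0 and an R^2 of 1, while predicting the mean of the reference everywhere gives an R^2 of 0.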
Abstract
Operational phase unwrapping is the primary computational bottleneck in InSAR-based volcanic and seismic monitoring. We challenge the industry trend of adopting high-complexity computer vision architectures, such as attention mechanisms, without validating their suitability for physics-constrained geophysical regression. We present the first large-scale architectural ablation study on a global LiCSAR benchmark (20 frames, 39,724 patches, 651M pixels). Our results reveal a significant "complexity penalty": a vanilla U-Net (7.76M parameters) outperforms 11.37M-parameter attention-based models by 34% in R^2 and 51% in RMSE (measured in cm). Power Spectral Density (PSD) analysis provides the physical justification: while attention excels at capturing sharp semantic edges in natural images, it injects unphysical high-frequency artifacts ( cycles/pixel)…
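The kind of PSD check described in the abstract can be sketched as below. This is an illustrative sketch, not the authors' procedure: it computes a radially averaged power spectrum of a 2-D field with NumPy's FFT and the share of spectral power above a cutoff frequency. The function names, the bin count, and the `cutoff` argument are hypothetical; in particular, the cutoff passed in is a placeholder and not the paper's threshold.

```python
import numpy as np


def _power_and_radius(field):
    """Return the 2-D power spectrum and radial frequency (cycles/pixel)."""
    field = np.asarray(field, dtype=float)
    field = field - field.mean()                  # remove the DC component
    power = np.abs(np.fft.fft2(field)) ** 2
    fy = np.fft.fftfreq(field.shape[0])           # cycles/pixel along rows
    fx = np.fft.fftfreq(field.shape[1])           # cycles/pixel along columns
    radius = np.hypot(fy[:, None], fx[None, :])
    return power, radius


def radial_psd(field, n_bins=32):
    """Radially averaged PSD over [0, 0.5) cycles/pixel."""
    power, radius = _power_and_radius(field)
    edges = np.linspace(0.0, 0.5, n_bins + 1)
    idx = np.digitize(radius.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)           # drop corners beyond Nyquist
    sums = np.bincount(idx[valid], weights=power.ravel()[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, sums / np.maximum(counts, 1)


def high_freq_fraction(field, cutoff):
    """Fraction of total spectral power at radial frequency >= cutoff."""
    power, radius = _power_and_radius(field)
    return float(power[radius >= cutoff].sum() / power.sum())
```

Comparing `high_freq_fraction` for a model's output against the same quantity for the reference unwrapped phase is one simple way to flag excess high-frequency content: a smooth deformation field concentrates its power in the lowest radial bins, so a prediction with a noticeably larger high-frequency share would be suspect.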
