Inference from Real-World Sparse Measurements
Arnaud Pannatier, Kyle Matoba, François Fleuret

TL;DR
This paper introduces a simple, robust attention-based model for forecasting from sparse, irregular spatiotemporal measurements, outperforming existing graph neural network approaches across multiple domains.
Contribution
The authors propose a ViT-like transformer model that processes both context and read-out positions uniformly, eliminating the need for domain-specific encoders and improving performance.
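As a rough illustration of what treating context and read-out positions uniformly can look like, the sketch below builds one token sequence from both kinds of points and passes it through a single self-attention step. It is not the authors' implementation: the token layout (position, value, target flag), the helper names `build_tokens` and `self_attention`, and the use of identity projections instead of learned weights are all assumptions made for this example, and plain Python lists stand in for a tensor library.

```python
import math

def build_tokens(context, targets, value_dim):
    """Encode context and read-out points with one shared layout (assumed):
    position + value + [is_target flag]. Targets carry a zero value."""
    tokens = []
    for pos, val in context:
        tokens.append(list(pos) + list(val) + [0.0])
    for pos in targets:
        tokens.append(list(pos) + [0.0] * value_dim + [1.0])
    return tokens

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head self-attention with identity projections (no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

# Two context measurements at 2-D positions, one read-out position (toy data).
context = [((0.0, 0.0), (1.0,)), ((1.0, 0.0), (3.0,))]
targets = [(0.5, 0.0)]

tokens = build_tokens(context, targets, value_dim=1)
mixed = self_attention(tokens)
readout = mixed[len(context):]  # rows that correspond to the read-out positions
```

Because every token goes through the same layers, no separate encoder for the context or decoder for the targets is needed; the prediction is simply read from the rows that belong to the read-out positions.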
Findings
Model outperforms state-of-the-art in wind nowcasting and heat diffusion.
Reduces RMSE significantly in key tasks.
Addresses bottleneck issues in latent representations.
Abstract
Real-world problems often involve complex and unstructured sets of measurements, which occur when sensors are sparsely placed in either space or time. Being able to model this irregular spatiotemporal data and extract meaningful forecasts is crucial. Deep learning architectures capable of processing sets of measurements with positions varying from set to set, and extracting readouts anywhere are methodologically difficult. Current state-of-the-art models are graph neural networks and require domain-specific knowledge for proper setup. We propose an attention-based model focused on robustness and practical applicability, with two key design contributions. First, we adopt a ViT-like transformer that takes both context points and read-out positions as inputs, eliminating the need for an encoder-decoder structure. Second, we use a unified method for encoding both context and read-out…
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
_Originality:_ Neither the task nor the method seems to be particularly new. The modularization of the transformer into different components is appealing, though. _Quality:_ Given the conducted experiments, it is not at all clear whether the proposed model delivers what the authors promise and whether it is state-of-the-art. My biggest concern is the limited amount of related work cited in the manuscript, and I do not see the field of related work well explored. There are numerous works that operate o
1. Several unclear details. What is the spatial and temporal distance of measurements in the different datasets? What do you mean by bottleneck? Do you refer to information compression when projecting to latent space (I could not find where it is defined)? 2. The naming of context and target was confusing to me. In Section 3.1 you specify: `Data is in the form of pairs of vectors $(c_x,c_y)$ and $(t_x,t_y)$ where $c_x$ and $t_x$ are the position and $c_y$ and $t_y$ are the measurements (or val
I think this paper is a nice application paper. Clearly flexible function approximators are needed for weather forecasting from sparse mobile sensors. The empirical validation also appears sound, and some of the experiments pulling apart the architectures in Section 5 are a great complement. The paper itself is also reasonably well written and reads very easily. Figures are nicely prepared and support the textual exposition. The supplementary materials and code (although I didn’t actually r
I found this a really difficult paper to review. The paper is enjoyable to read, fairly self-contained, and the experiments seem to support the hypothesis. Presenting a simple method _should not_ present a barrier to publication. This paper is also an absolute A+ in terms of clarity and supporting materials. Kudos to the authors. My only real concrete criticism is that I think the technical contribution is minor. Separating the input encoder is not exactly an awe-inspiring innovation – b
- The paper is well written; - The authors demonstrate the soundness of the proposed method on several problems; - Experiments are convincing; - In particular, results on the wind nowcasting problem are impressive.
- The link with the literature is unclear making the novelty of the approach difficult to assess. In particular, works related to motion forecasting (which shows strong similarities with the reference example of this paper) are not referenced [1, 2, 3]. In a context of explosion of attention-based architectures, a more detailed literature review is expected. - The global presentation of the attention mechanism does not follow the same notations and presentation as is usually the case in the lite
- This paper tackles irregularly sampled datasets, which is an under-explored area in the scientific ML community. - The proposed method is simple and powerful. It is applicable to data from different domains.
- The presentation of this paper can be improved. I think the description of the baseline models can be shortened, and the authors may expand the introduction of their proposed method in the Methodology section. - The Experiments part is kind of weak in terms of evaluation metrics. This paper only considers RMSE. For ERA5 datasets, scientists usually conduct comprehensive evaluations from different perspectives [1-2]. For scientific data (heat diffusion, and fluid dynamics), relative L2 errors a
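For reference, the two metrics named in this review can be computed as in the sketch below. The review text is cut off, so how the reviewer wanted relative L2 error to be used is not known; the code only gives the standard definitions, and the function names `rmse` and `relative_l2` are chosen for this example.

```python
import math

def rmse(pred, true):
    """Root mean squared error: sqrt(mean((pred - true)^2))."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def relative_l2(pred, true):
    """Relative L2 error: ||pred - true||_2 / ||true||_2 (true must be non-zero)."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    return num / den

# Toy values: predicting all zeros for a field with norm 5.
true = [3.0, 4.0]
pred = [0.0, 0.0]
err_rmse = rmse(pred, true)
err_rel = relative_l2(pred, true)
```

RMSE is expressed in the units of the measured quantity, while relative L2 error is normalized by the magnitude of the reference field, which makes it easier to compare across datasets with different scales.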
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Air Quality Monitoring and Forecasting · Advanced Neural Network Applications
Methods: Diffusion · Test
