Optimizing a Transformer-based network for a deep learning seismic processing workflow
Randy Harsuko, Tariq Alkhalifah

TL;DR
This paper enhances a Transformer-based seismic processing model by replacing key components with more efficient alternatives, leading to faster training and competitive results on realistic datasets.
Contribution
It introduces relative positional encoding and low-rank attention matrices into StorSeismic, improving efficiency and expressiveness for seismic tasks.
Findings
Faster pretraining observed with the proposed modifications
Achieved competitive results on seismic processing tasks
Reduced number of trainable parameters
Abstract
StorSeismic is a recently introduced Transformer-based model that adapts to various seismic processing tasks through a pretraining and fine-tuning strategy. In the original implementation, StorSeismic used sinusoidal positional encoding and a conventional self-attention mechanism, both borrowed from natural language processing (NLP) applications. For seismic processing, these components yielded good results but also hinted at limitations in efficiency and expressiveness. We propose modifications to these two key components, replacing the vanilla versions with relative positional encoding and low-rank attention matrices. The proposed changes are tested on processing tasks applied to realistic Marmousi and offshore field data as a sequential strategy, starting from denoising, direct arrival removal, multiple attenuation, and finally root-mean-squared velocity…
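The two replaced components can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which relative positional encoding or which low-rank scheme is used, so the sketch assumes a learned per-offset bias added to the attention logits and a Linformer-style projection of keys and values along the sequence axis. All names (`relative_bias`, `low_rank_attention`, `e_proj`, `f_proj`) and the sizes `n`, `d`, `r` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_bias(n, table):
    # table holds one scalar per offset i - j in [-(n-1), n-1] (2n-1 entries).
    # The result is an (n, n) Toeplitz matrix: entry (i, j) depends only on i - j.
    idx = np.arange(n)[:, None] - np.arange(n)[None, :] + (n - 1)
    return table[idx]

def attention_with_relative_bias(x, wq, wk, wv, table):
    # Standard single-head attention with an additive relative-position bias
    # (assumed variant; no absolute positional encoding is added to x).
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(q.shape[-1]) + relative_bias(x.shape[0], table)
    return softmax(logits) @ v

def low_rank_attention(x, wq, wk, wv, e_proj, f_proj):
    # Assumed Linformer-style variant: compress the sequence axis of K and V
    # from n to r, so the logits are (n, r) instead of (n, n).
    q = x @ wq
    k = e_proj @ (x @ wk)  # (r, d)
    v = f_proj @ (x @ wv)  # (r, d)
    logits = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(logits) @ v  # (n, d)

# Toy sizes: n sequence positions, d features, r << n compressed positions.
rng = np.random.default_rng(0)
n, d, r = 16, 8, 4
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
table = rng.standard_normal(2 * n - 1)
e_proj = rng.standard_normal((r, n)) / np.sqrt(n)
f_proj = rng.standard_normal((r, n)) / np.sqrt(n)

out_rel = attention_with_relative_bias(x, wq, wk, wv, table)
out_low = low_rank_attention(x, wq, wk, wv, e_proj, f_proj)
print(out_rel.shape, out_low.shape)
```

In this sketch the relative bias costs only `2n - 1` extra parameters per head, and the low-rank variant shrinks the attention logits from `n × n` to `n × r`, which is one way such changes could reduce cost. Whether this matches the mechanism behind the paper's reported speed-up is not stated in the abstract.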
Taxonomy
Topics: Seismic Imaging and Inversion Techniques · Reservoir Engineering and Simulation Methods · Seismology and Earthquake Studies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Label Smoothing · Adam · Residual Connection · Dense Connections · Dropout
