Do Masked Autoencoders Improve Downhole Prediction? An Empirical Study on Real Well Drilling Data

Aleksander Berezowski; Hassan Hassanzadeh; Gouri Ginde

arXiv:2604.20909 · cs.LG · April 24, 2026

TL;DR

This study empirically evaluates masked autoencoder (MAE) pretraining for downhole drilling metric prediction on real well data, reporting a 19.8% reduction in test error relative to a supervised GRU baseline and identifying which architectural choices matter most.

Contribution

First empirical assessment of MAE pretraining for downhole drilling prediction, showing its effectiveness over supervised models and analyzing key design factors.

Findings

01

The best masked autoencoder configuration reduces test mean absolute error by 19.8% compared to the supervised GRU baseline.

02

Latent space width is the most influential architectural parameter.

03

Masking ratio has negligible effect due to high temporal redundancy.

Abstract

Downhole drilling telemetry presents a fundamental labeling asymmetry: surface sensor data are generated continuously at 1 Hz, while labeled downhole measurements are costly, intermittent, and scarce. Current machine learning approaches for downhole metric prediction universally adopt fully supervised training from scratch, which is poorly suited to this data regime. We present the first empirical evaluation of masked autoencoder (MAE) pretraining for downhole drilling metric prediction. Using two publicly available Utah FORGE geothermal wells comprising approximately 3.5 million timesteps of multivariate drilling telemetry, we conduct a systematic full-factorial design space search across 72 MAE configurations and compare them against supervised LSTM and GRU baselines on the task of predicting Total Mud Volume. Results show that the best MAE configuration reduces test mean absolute…
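
To make the pretraining objective concrete, here is a minimal sketch of the masking step that masked autoencoder pretraining generally relies on: hide a random fraction of timesteps in a telemetry window, then score a reconstruction only on the hidden positions. This is an illustration under stated assumptions, not the authors' implementation. The function names `mask_timesteps` and `masked_mae`, the zero fill value, masking whole timesteps rather than individual channels, and the choice to score only masked positions are all hypothetical choices made for the example; the paper's actual masking scheme and loss may differ.

```python
import random


def mask_timesteps(window, mask_ratio, rng, mask_value=0.0):
    """Hide a random subset of timesteps in a multivariate window.

    window: list of timesteps, each a list of channel values.
    Returns (masked_window, masked_indices).
    Assumption: whole timesteps are masked and filled with mask_value.
    """
    n = len(window)
    n_masked = max(1, round(mask_ratio * n))
    masked_idx = sorted(rng.sample(range(n), n_masked))
    hidden = set(masked_idx)
    masked = [
        [mask_value] * len(step) if t in hidden else list(step)
        for t, step in enumerate(window)
    ]
    return masked, masked_idx


def masked_mae(target, reconstruction, masked_idx):
    """Mean absolute error computed over the masked timesteps only."""
    errors = [
        abs(a - b)
        for t in masked_idx
        for a, b in zip(target[t], reconstruction[t])
    ]
    return sum(errors) / len(errors)


# Toy window: 10 timesteps, 2 channels (stand-ins for surface sensor signals).
rng = random.Random(0)
window = [[float(t), float(2 * t)] for t in range(10)]
masked, idx = mask_timesteps(window, mask_ratio=0.3, rng=rng)

# A trivial "reconstruction" that just returns the masked input; a real
# encoder-decoder would be trained to drive this loss down.
loss = masked_mae(window, masked, idx)
```

In a real pipeline the encoder would see `masked` and the decoder output would replace the trivial reconstruction used here. Because no downhole labels are needed for this step, the unlabeled 1 Hz surface data can be used for pretraining before fine-tuning on the scarce labeled measurements.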
