# Cross-well machine learning prediction of sonic logs in Newfoundland and Labrador

**Authors:** Bahare Zare, Mohammad Mojammel Huque, Lesley A. James, Hamid Usefi

PMC · DOI: 10.1038/s41598-026-36053-9 · Scientific Reports · 2026-01-15

## TL;DR

This paper presents a machine learning approach for predicting sonic logs, specifically compressional slowness (DTCO), from non-sonic logs in two offshore Newfoundland and Labrador wells. Such predictions can reduce acquisition cost, fill data gaps, and support field planning.

## Contribution

The study introduces a leakage-free, features-only strategy for cross-well sonic log prediction using tree-based models and depth-aware feature engineering.
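A minimal sketch of what a leakage-free, features-only lag window could look like, using only the Python standard library. The channel names (`GR`, `RHOB`, `DTCO`, and the contents of `SONIC_CHANNELS`) and the helper `build_causal_lags` are hypothetical, and including the current sample as lag 0 is an assumption; the paper's actual implementation may differ.

```python
from typing import Dict, List, Optional

# Hypothetical mnemonics: the target and any sonic-derived channels.
SONIC_CHANNELS = {"DTCO", "DTSM", "VPVS"}


def build_causal_lags(
    logs: Dict[str, List[float]], n_lags: int
) -> Dict[str, List[Optional[float]]]:
    """Build lagged copies of every non-sonic channel.

    Samples are assumed to be ordered along the well. Row i of feature
    "<name>_lag<k>" holds logs[name][i - k], so each row only sees the
    current and earlier samples. Rows without enough history get None.
    """
    features: Dict[str, List[Optional[float]]] = {}
    for name, values in logs.items():
        if name in SONIC_CHANNELS:
            continue  # never let sonic or sonic-derived data become a feature
        for k in range(n_lags + 1):  # assumption: k = 0 is the current sample
            features[f"{name}_lag{k}"] = [
                values[i - k] if i - k >= 0 else None
                for i in range(len(values))
            ]
    return features


# Made-up values, for illustration only.
logs = {
    "GR": [50.0, 55.0, 60.0, 65.0],
    "RHOB": [2.30, 2.35, 2.40, 2.45],
    "DTCO": [300.0, 295.0, 290.0, 285.0],  # prediction target
}
feats = build_causal_lags(logs, n_lags=2)
print(sorted(feats))
```

Filtering on channel name before any lagging keeps the target out of the feature matrix by construction, which is the point of a features-only design.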

## Key findings

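- Tuned XGBoost with the top 20 predictors and a 10-sample lag reached R² = 0.895, MAE = 11.38 µs/m, and RMSE = 15.12 µs/m when trained on Well 1 and tested blind on Well 2; the reverse direction scored lower, indicating inter-well distribution shift.
- Random Forest performed competitively in several configurations, while BiLSTM underperformed on these data.
- Rigorous leakage control, depth-aware feature engineering, and principled feature selection were the key drivers of performance.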
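For readers unfamiliar with the reported metrics, the sketch below computes R², MAE, and RMSE from their standard definitions in plain Python. The function name `regression_metrics` and the sample values are illustrative assumptions and are not taken from the paper's data.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Return R^2, MAE and RMSE for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Made-up slowness values (µs/m), for illustration only.
y_true = [300.0, 310.0, 320.0, 330.0]
y_pred = [302.0, 308.0, 323.0, 329.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MAE and RMSE carry the units of the target (µs/m here), whereas R² is dimensionless. In a blind cross-well test, R² is measured against the mean of the test well, not the training well.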

## Abstract

Predicting compressional slowness (DTCO) from non-sonic logs can reduce acquisition cost, fill data gaps, and support field planning. We evaluate blind cross-well DTCO prediction on two offshore Newfoundland & Labrador wells using a strictly leakage-free, features-only strategy: causal lag windows are built from past non-sonic logs and all sonic/sonic-derived channels are excluded. The pipeline includes deterministic depth conditioning, relative-depth features, multi-scale depth derivatives, rank-aggregated feature selection, and time-aware validation on the training well. We compare three model families: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a BiLSTM. In this setting, tuned XGBoost with the top 20 predictors and a 10-sample lag attains blind cross-well performance of $R^2 = 0.895$, MAE $= 11.38~\mu\mathrm{s/m}$, RMSE $= 15.12~\mu\mathrm{s/m}$ when trained on Well 1 and tested on Well 2; the reverse direction is lower, indicating inter-well distribution shift. RF performs competitively in several configurations, whereas BiLSTM underperforms on these data. Overall, rigorous leakage control, depth-aware feature engineering, and principled feature selection are key drivers of performance, and tree-based ensembles provide strong, data-efficient baselines for cross-well pseudo-sonic prediction.
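The abstract names rank-aggregated feature selection without giving the aggregation rule. One common way to do this is mean-rank (Borda-style) aggregation, sketched below as an assumption and not as the authors' method. The scorer tables, feature names, and the helpers `rank_features` and `aggregate_ranks` are all hypothetical.

```python
from typing import Dict, List


def rank_features(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank features by descending score; rank 1 is the most important."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def aggregate_ranks(score_tables: List[Dict[str, float]], top_k: int) -> List[str]:
    """Average each feature's rank across scorers and keep the top_k best.

    Every table is assumed to score the same set of features.
    """
    per_scorer = [rank_features(table) for table in score_tables]
    mean_rank = {
        name: sum(ranks[name] for ranks in per_scorer) / len(per_scorer)
        for name in per_scorer[0]
    }
    # Lower mean rank is better; ties are broken alphabetically.
    return sorted(mean_rank, key=lambda name: (mean_rank[name], name))[:top_k]


# Made-up importance scores from two hypothetical scorers.
correlation = {"GR_lag0": 0.62, "RHOB_lag0": 0.80, "NPHI_lag0": 0.75, "REL_DEPTH": 0.40}
tree_importance = {"GR_lag0": 0.10, "RHOB_lag0": 0.35, "NPHI_lag0": 0.40, "REL_DEPTH": 0.15}

selected = aggregate_ranks([correlation, tree_importance], top_k=2)
print(selected)
```

Working with ranks instead of raw scores sidesteps the fact that different scorers produce values on incomparable scales. To stay leakage-free, the scores would need to be computed on the training well only.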

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12881373/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12881373/full.md

## References

8 references — full list in the complete paper: https://tomesphere.com/paper/PMC12881373/full.md

---
Source: https://tomesphere.com/paper/PMC12881373