Efficient dataset construction using active learning and uncertainty-aware neural networks for plasma turbulent transport surrogate models
Aaron Ho (1), Lorenzo Zanisi (2), Bram de Leeuw (3), Vincent Galvan (1), Pablo Rodriguez-Fernandez (1), Nathaniel T. Howard (1) ((1) MIT Plasma Science and Fusion Center, Cambridge, USA, (2) UKAEA Culham Centre for Fusion Energy, Abingdon, UK, (3) Radboud University, Nijmegen)

TL;DR
This paper presents a method combining uncertainty-aware neural networks and active learning to efficiently build datasets for surrogate models of plasma turbulent transport, reducing data needs while maintaining high accuracy.
Contribution
It introduces an active learning strategy with uncertainty-aware architectures applied to plasma turbulence modeling, improving dataset efficiency for surrogate models.
Findings
Achieved ~0.8 F1 classification performance
Reached ~0.75 R^2 regression performance
Needed training sets of only 10^2 to 10^4 samples
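For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of their standard textbook definitions in Python with NumPy. The function names are illustrative, and this summary does not state how the paper averages the scores across output variables, so the sketch covers a single output only.

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 for a binary classifier: harmonic mean of precision and recall."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # true positives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

An F1 near 0.8 means precision and recall are both reasonably high for the classifier, and an R^2 near 0.75 means the regressor explains roughly three quarters of the variance in the test labels.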
Abstract
This work demonstrates a proof-of-principle for using uncertainty-aware architectures, in combination with active learning techniques and an in-the-loop physics simulation code as a data labeller, to construct efficient datasets for data-driven surrogate model generation. Building off of a previous proof-of-principle successfully demonstrating training set reduction on static pre-labelled datasets, using the ADEPT framework, this strategy was applied again to the plasma turbulent transport problem within tokamak fusion plasmas, specifically the QuaLiKiz quasilinear electrostatic gyrokinetic turbulent transport code. While QuaLiKiz provides relatively fast evaluations, this study specifically targeted small datasets to serve as a proxy for more expensive codes, such as CGYRO or GENE. The newly implemented algorithm uses the SNGP architecture for the classification component of the…
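The loop described in the abstract (train an uncertainty-aware model, pick the candidate inputs it is least sure about, label them with the simulation code, retrain) can be sketched as follows. This is an illustration under loud assumptions and not the authors' implementation: a bootstrap ensemble of ridge regressors stands in for the uncertainty-aware neural networks, `toy_labeller` is a hypothetical stand-in for QuaLiKiz, and all sizes and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_labeller(x):
    """Hypothetical stand-in for the physics code: zero flux below a threshold."""
    return np.maximum(0.0, x[:, 0] - 0.5) * (1.0 + x[:, 1])

def features(x):
    # Quadratic feature map (bias, linear, squared, cross term) for a 2D input.
    return np.column_stack([np.ones(len(x)), x, x**2, x[:, :1] * x[:, 1:2]])

def fit_ensemble(x, y, n_members=10, ridge=1e-3):
    """Assumed surrogate: bootstrap ensemble whose spread approximates uncertainty."""
    phi = features(x)
    weights = []
    for _ in range(n_members):
        idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
        a = phi[idx]
        w = np.linalg.solve(a.T @ a + ridge * np.eye(a.shape[1]), a.T @ y[idx])
        weights.append(w)
    return np.array(weights)

def predict(weights, x):
    preds = features(x) @ weights.T  # shape (n_points, n_members)
    return preds.mean(axis=1), preds.std(axis=1)

def active_learning(n_init=20, n_iter=5, n_candidates=500, n_select=10):
    # Start from a small random training set labelled by the simulator.
    x_train = rng.uniform(0.0, 1.0, size=(n_init, 2))
    y_train = toy_labeller(x_train)
    for _ in range(n_iter):
        weights = fit_ensemble(x_train, y_train)
        candidates = rng.uniform(0.0, 1.0, size=(n_candidates, 2))
        _, sigma = predict(weights, candidates)
        # Acquire the candidates with the largest predictive spread.
        chosen = candidates[np.argsort(sigma)[-n_select:]]
        x_train = np.vstack([x_train, chosen])
        y_train = np.concatenate([y_train, toy_labeller(chosen)])
    return x_train, y_train, fit_ensemble(x_train, y_train)

x_train, y_train, weights = active_learning()
```

Each iteration spends the labelling budget only on the points where the ensemble members disagree most, which is the general idea behind keeping the training set small when every label is an expensive simulation.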
