A Physics-Informed Neural Network for In Vivo Dosimetry Using Quantitative Radiacoustic Imaging

Leshan Sun; Kristina Bjegovic; Lucia Rodriguez-Gonzalez; Yifei Xu; Yuchen Yan; Gilberto Gonzalez; Lucy Whitmore; Luke Connell; Yankun Lang; Prabodh Pandey; Lei Ren; Emil Schüler; Yong Chen; Shawn Xiang

PMC · DOI: 10.21203/rs.3.rs-8503498/v1 · January 20, 2026

Open Access

TL;DR

This paper introduces a new method using physics-informed neural networks to measure radiation dose inside patients during treatment, enabling more accurate and real-time dosimetry.

Contribution

The novel contribution is a physics-informed neural network framework for quantitative radiacoustic imaging that enables in vivo dose reconstruction.

Findings

01

The PINN-based qRAI method successfully reconstructs quantitative dose maps from limited-view radiacoustic data.

02

The method outperforms purely data-driven models in robustness and generalizability across clinical scenarios.

03

Validation in water tanks, phantoms, and FLASH therapy shows the potential for real-time in vivo dosimetry.

Abstract

Accurate dosimetry is critical for safe and effective radiotherapy, yet no clinical method currently measures dose directly within the patient in vivo. Radiacoustic imaging (RAI), which detects acoustic waves generated by thermoelastic expansion during radiation delivery, offers a promising solution but has been limited to qualitative output. We present a quantitative RAI (qRAI) framework powered by a physics-informed neural network (PINN) that reconstructs quantitative dose maps in vivo. The PINN incorporates the physics of acoustic wave generation and propagation, along with a digital twin of the radiation delivery and radiacoustic detection systems, enabling accurate reconstruction from limited-view data. Reconstructed pressure maps are calibrated against experimental and simulated dose references. We validate the method across diverse clinical scenarios, including water tank…

Funding

—National Institute of Health
—UCI Chao Family Comprehensive Cancer Center

Keywords

Radiotherapy · In Vivo Dosimetry · Quantitative Radiacoustic Imaging · Physics-informed Neural Network

Taxonomy

Topics: Effects of Radiation Exposure · Photoacoustic and Ultrasonic Imaging · Advanced Radiotherapy Techniques

Full text

Introduction

Cancer remains one of the leading causes of death globally, with radiotherapy playing a central role in treatment—used in over half of all cancer cases^1,2^. As treatment technologies evolve, advanced modalities such as proton therapy^3^ and FLASH^4^ radiotherapy have gained prominence due to their ability to improve tumor targeting while minimizing damage to healthy tissues. FLASH radiotherapy^5,6^ delivers radiation at ultra-high dose rates (≥40 Gy/s), with instantaneous rates reaching up to millions of Gy/s, and proton therapy^7,8^ uses charged particles to deposit energy precisely at the Bragg peak, reducing exposure beyond the tumor. The clinical adoption of proton therapy is accelerating rapidly, with treated patient numbers tripling between 2012 and 2021, and over 110 centers now in operation globally^9^. A recent milestone in accessibility was marked by Stanford’s installation of the first compact Proton Therapy System^™^ in a standard LINAC vault^10^, indicating a scalable path to broader clinical use. As these advanced radiotherapies gain traction, the need for accurate in vivo dosimetry—the ability to measure the delivered dose in real time inside the patient—becomes critical to ensuring both treatment efficacy and patient safety^11–13^.

Yet, no existing clinical method enables direct, real-time in vivo measurement of radiation dose^14,15^. Current techniques rely on pre-treatment planning CT and external detectors like ion chambers^16,17^ or radiochromic films^18–20^, which cannot capture dose variations during treatment. Alternatives such as PET^21–23^ and prompt gamma imaging^24–27^ offer limited utility due to low detection efficiency, delayed signal acquisition, and complex system integration^28^. Cerenkov imaging provides promising surface dose verification but lacks the ability to resolve deep tissue dose in three dimensions^29–31^. These limitations highlight the urgent need for a real-time, 3D volumetric dosimetry method capable of verifying delivered dose directly on patients during radiotherapy^13^.

Radiacoustic imaging (RAI) has recently emerged as a promising solution for this unmet need^32–39^. RAI detects acoustic waves generated by rapid thermoelastic expansion of irradiated tissues, enabling non-invasive dose visualization with ultrasound detectors^40^. A major milestone was achieved with real-time 3D visualization of radiation dose in liver cancer patients using clinical linear accelerators^41^. RAI has been demonstrated with X-rays^42^, electrons^43^, and proton beams^37^. However, most existing RAI image reconstructions rely on simple back-projection algorithms^44^, which assume idealized imaging conditions and fail under practical constraints such as limited-view acquisition, noise, and finite transducer bandwidth^45–48^. These algorithms produce artifacts, including negative pixel values that invalidate physical dose interpretation^49^. While model-based iterative reconstruction methods can yield higher-fidelity images by incorporating accurate physical models and regularization, they are computationally intensive and not feasible for real-time applications in clinical settings^50–53^.

To address these challenges, deep learning-based reconstruction has shown promise in recovering accurate images from limited-view data^54–61^. Neural networks can learn inverse mappings that correct artifacts and recover dose distributions efficiently^62–64^. However, purely data-driven approaches require large and diverse training datasets^65–67^—which are currently unavailable for RAI due to its novelty. Furthermore, generating ground truth for network training is nontrivial, as the initial pressure distribution inside biological tissues is experimentally inaccessible^14,15^.

To overcome these limitations, we introduce a quantitative RAI (qRAI) framework powered by a physics-informed neural network (PINN), illustrated in Fig. 1. In this framework, the radiation beam is delivered to the targeted area, and the resulting qRAI signals are captured by an ultrasound (US) transducer array, forming a radiofrequency sinogram (Fig. 1b). This sinogram is then reconstructed using a time-reversal algorithm^68^, resulting in a limited-view dose map (Fig. 1c). To improve this reconstruction, we apply a PINN-enhanced model, yielding an enhanced dose map (Fig. 1d). The complete workflow of the radiacoustic imaging process is illustrated in Fig. 1e and will be further detailed in subsequent sections. Our PINN incorporates the underlying physics of acoustic wave generation and propagation, along with digital twins of the radiation delivery and ultrasound detection systems. This hybrid approach enables robust reconstruction of quantitative dose maps from limited-view measurements, without the need for extensive ground-truth datasets. Unlike traditional deep learning, PINNs integrate governing physical equations directly into the training process, resulting in models that are more data-efficient, interpretable, and generalizable^69–73^. Specifically, our framework includes a general radiacoustic model for wave generation and propagation, a transducer twin capturing system-specific characteristics (geometry, aperture, impulse response), and a radiation beam twin created using a treatment planning system or TOPAS to simulate dose deposition and temporal pulse profiles. These components guide the PINN training through both forward and inverse models, enabling accurate image reconstruction under clinically relevant conditions.

We validate the proposed system using proton and electron beam experiments in water phantoms and a human torso phantom, demonstrating the feasibility of real-time, in vivo radiation dosimetry with potential to significantly advance the precision and adaptability of next-generation radiotherapy.

Results

Framework of PINN

We developed a physics-informed neural network (PINN) to enhance limited-view qRAI, as illustrated in Fig. 2. Unlike conventional deep learning approaches that require large, labeled datasets, our PINN integrates physical models directly into the learning process, enabling robust reconstruction from limited experimental data. There are four major components:

a) Forward Operation of radiacoustic imaging (RAI). As shown in Fig. 2a, radiation dose deposition induces an initial pressure rise, $p_0$, which generates spherically propagating acoustic waves. These pressure signals, $p(\mathbf{r}, t)$, are detected by an ultrasound transducer array over time, forming the measured sinogram $S_m$. This forward process is modeled using a digital twin system that incorporates the radiation pulse profile, acoustic propagation through heterogeneous media, and the transducer impulse response and geometry via finite-element modeling. This comprehensive forward operator $F$ provides the physical basis for accurate modeling of the qRAI data acquisition process.

b) Inverse Operation: Time-Reversal Reconstruction. As shown in Fig. 2b, the inverse process is approximated via time-reversal (TR) reconstruction^68^. The measured sinogram $S_m$ is reversed in time to obtain $S_{TR}$, which is then backpropagated through the medium to yield a reconstructed pressure map $p_{rec}$. However, this TR-based inverse operator does not yield a unique solution under limited-view conditions^55^, which is a major limitation in clinical settings. Our prior work^63^ addressed this issue using a U-Net architecture for post-processing and a direct reconstruction approach^62^, both validated in simulation only due to the lack of experimental datasets.

c) Dose Mapping via PINN Enhancement. To overcome the limited-view artifacts and derive quantitative dose maps, we introduce a PINN (see Fig. 2c). The TR reconstruction $p_{rec}$ is enhanced by the PINN to yield a predicted pressure map $p_{pred}$, which is then calibrated into a dose map using experimentally determined coefficients (detailed in Methods).

d) PINN Architecture and Physics-Based Loss. The PINN architecture is based on a 3D U-Net (Fig. 2d), designed to map $p_{rec}$ to the initial pressure $p_0$. To embed physical consistency, we incorporate a non-trainable forward operator module $F$ into each training iteration. The enhanced prediction $p_{pred}$ is passed through $F$ to generate the predicted sinogram $S_{pred}$. A physics-based loss $\mathcal{L}_S$ is computed between $S_{pred}$ and the actual measured sinogram $S_m$. A conventional loss $\mathcal{L}_p$, comparing $p_{pred}$ to the reference initial pressure $p_0$, is also computed. The total loss is then defined as a weighted combination, $\mathcal{L} = \lambda_1 \mathcal{L}_p + \lambda_2 \mathcal{L}_S$, where $\lambda_1$ and $\lambda_2$ are tunable weighting factors. This joint optimization enforces both data fidelity and physical plausibility, enabling the model to generalize well in the absence of large-scale training data, a key advantage for this emerging imaging modality.
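The weighted combination of an image-domain loss and a sinogram-domain loss described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the forward operator F is replaced by a fixed random matrix `A`, both terms are taken to be mean-squared errors (the text does not specify the norm), and the names `forward_operator`, `total_loss`, `lam1`, and `lam2` are hypothetical.

```python
import numpy as np

def forward_operator(p, A):
    """Toy stand-in for the non-trainable forward operator F (assumed linear)."""
    return A @ p.ravel()

def total_loss(p_pred, p_ref, s_meas, A, lam1=1.0, lam2=1.0):
    """Return (total, image-domain term, sinogram-domain term)."""
    # Conventional loss: predicted pressure map vs. reference initial pressure.
    loss_p = np.mean((p_pred - p_ref) ** 2)
    # Physics-based loss: push the prediction through F and compare sinograms.
    s_pred = forward_operator(p_pred, A)
    loss_s = np.mean((s_pred - s_meas) ** 2)
    return lam1 * loss_p + lam2 * loss_s, loss_p, loss_s

rng = np.random.default_rng(0)
p_ref = rng.random((4, 4, 4))                      # toy reference pressure volume
A = rng.standard_normal((32, p_ref.size))          # toy forward matrix (assumption)
s_meas = forward_operator(p_ref, A)                # toy "measured" sinogram
p_pred = p_ref + 0.1 * rng.standard_normal(p_ref.shape)

loss, lp, ls = total_loss(p_pred, p_ref, s_meas, A, lam1=1.0, lam2=0.5)
print(f"total={loss:.4f}  image term={lp:.4f}  sinogram term={ls:.4f}")
```

Because the sinogram term depends only on the measurement and the forward operator, it can still be evaluated when no reference pressure map is available, which is the self-supervision property the framework relies on.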

Improved Signal Fidelity with Digital Twin

To enable accurate simulation of system-specific effects in qRAI, we developed a digital twin^74,75^ model of our experimental setup (Fig. 3a). This virtual replica captures the influence of both the radiation delivery system and the ultrasound detection hardware. Here, we present validation results demonstrating the improved fidelity of the digital twin simulation compared to conventional acoustic-only models. Fig. 3b compares three representative radiacoustic signals: the yellow line shows a standard k-Wave simulation^76,77^ that includes only medium properties and basic acoustic propagation, producing a single peak corresponding to the initial pressure response; the red line represents the experimentally measured signal from a water tank irradiated by a proton beam, which exhibits a distinct double-peak structure; and the blue line shows the digital twin simulation that incorporates radiation pulse profile, medium characteristics, transducer geometry, and measured impulse response—closely matching the experimental signal aside from minor electromagnetic interference. Fig. 3c further demonstrates that time-reversal reconstruction based on the experimental and digital twin signals both yield two localized peaks of comparable size and location, accurately reflecting the expected dose distribution. In contrast, reconstruction from the standard simulation fails to recover this structure, showing only a single, broadened peak. These results confirm that integrating system-specific effects through the digital twin significantly enhances agreement with experimental data in both temporal and spatial domains. Moreover, the high fidelity of the digital twin simulation supports its use as a reliable surrogate for generating training data in physics-informed or hybrid deep learning frameworks, addressing the current limitation of experimental data scarcity in emerging qRAI applications.
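The idea of layering system-specific effects on top of an acoustic-only simulation can be sketched as two successive convolutions. This is a simplified illustration, assuming a linear, time-invariant signal chain; it omits transducer geometry and medium effects, and every waveform below (the Gaussian trace, the rectangular pulse, the damped 1 MHz impulse response, the 20 MHz sampling rate) is an assumed placeholder, not a measured quantity from the paper. The function name `digital_twin_signal` is hypothetical.

```python
import numpy as np

def digital_twin_signal(ideal, pulse_profile, impulse_response):
    """Apply pulse-profile and transducer effects by causal convolution."""
    n = ideal.size
    with_pulse = np.convolve(ideal, pulse_profile, mode="full")[:n]
    return np.convolve(with_pulse, impulse_response, mode="full")[:n]

fs = 20e6                          # sampling rate in Hz (assumed)
t = np.arange(800) / fs            # 40 microsecond record

# Acoustic-only trace: a single Gaussian peak (stand-in for a k-Wave output).
ideal = np.exp(-0.5 * ((t - 20e-6) / 0.5e-6) ** 2)

# Radiation pulse: 4 microsecond rectangle normalized to unit sum (assumed shape).
pulse = np.ones(int(round(4e-6 * fs)))
pulse /= pulse.sum()

# Transducer impulse response: damped 1 MHz oscillation (assumed, not measured).
t_ir = np.arange(100) / fs
impulse_response = np.exp(-t_ir / 1e-6) * np.sin(2 * np.pi * 1e6 * t_ir)

twin = digital_twin_signal(ideal, pulse, impulse_response)
print("acoustic-only peak:", ideal.max(), " twin peak:", twin.max())
```

Replacing the assumed pulse and impulse response with measured ones is what would turn this toy chain into a system-specific model.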

Water Tank Evaluation in Proton Therapy

To evaluate the performance of the PINN, we first conducted a validation study using experimental data acquired from a clinical proton beam irradiation in a water tank (Fig. 4a). A clinical proton therapy system (Hyperscan S250i, Mevion, USA) was operated in service mode to enable precise control of pencil beam energy and pulse number. The beam operated at a pulse repetition rate of 750 Hz, delivering approximately 8 picocoulombs of protons per pulse. Log files from the proton system were recorded and used for quantitative benchmarking. A 16×16 matrix ultrasound array (Doppler Tech Inc., Guangzhou, China) was positioned on the side opposite to the proton gantry without interference to the proton beam. Radiacoustic signals generated by the proton beam were captured by this matrix array, amplified, and digitized using a custom data acquisition (DAQ) system (Photosound Tech Inc., Houston, USA).

The proton beam was initially targeted at the center of the array, followed by four additional measurements with 1 cm transducer shifts in the left, right, up, and down directions to simulate off-axis scenarios (Fig. 4b). This yielded a dataset of 7 different proton beam energies captured at 5 transducer positions, resulting in 35 sinograms. Two datasets with severe electromagnetic interference (EMI) were excluded, leaving 33 usable sinograms. These were divided into 20 for training, 5 for validation, and 8 for testing. Due to the limited size of the experimental dataset, we augmented the training data with synthetic sinograms generated from clinical treatment plans using our validated digital twin model. Specifically, 50 pencil beams from a prostate cancer plan and 25 from a liver cancer plan were simulated to create high-fidelity training examples. Initial model performance was evaluated using only the simulated data (details provided in Supplementary Information I and Supplementary Fig. 4). To further improve robustness and suppress experimental noise, the water tank data were combined with the simulation dataset to form a joint training set.

Fig. 4c shows representative comparisons of the ground truth (GT), time-reversal reconstruction (TR Rec), U-Net enhancement, and PINN enhancement across axial, sagittal, and coronal views, as well as full 3D volumes. Fig. 4d illustrates the reconstructions from off-center beam positions (left, up, down, right). The green dashed lines denote the beam center of the original central position, highlighting the spatial shifts and the ability of the PINN to correct off-center distortions. We additionally evaluated performance under varying proton pulse numbers, and Supplementary Fig. 7 shows that the quantitative reconstruction faithfully recovers distributions from the summed signals. Together, these results validate the accuracy and generalizability of the PINN framework across both central and off-axis beam positions using real-world proton beam data. Quantitative analysis in Table 1 confirms that the PINN outperforms the U-Net across multiple metrics: it achieves a substantially higher structural similarity index, improved peak signal-to-noise ratio, and a higher gamma index passing rate. These results demonstrate that integrating the forward physics model into the network makes the PINN a powerful tool for accurate quantitative dose reconstruction even in the presence of measurement noise.
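Of the metrics named above, the peak signal-to-noise ratio is the simplest to state in code. The sketch below uses the common definition 10·log10(peak²/MSE) with the reference maximum as the peak; the paper does not say which peak convention it uses, so that choice and the function name `psnr` are assumptions.

```python
import numpy as np

def psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB, using the reference maximum as the peak."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")        # identical volumes
    peak = reference.max()
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy 3D dose map with a single hot voxel, and a copy with a uniform 0.1 offset.
gt = np.zeros((8, 8, 8))
gt[4, 4, 4] = 1.0
offset = gt + 0.1

print(f"PSNR = {psnr(gt, offset):.2f} dB")
```

With a peak of 1 and a uniform error of 0.1, the mean-squared error is 0.01 and the result is 20 dB.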

Human Torso Evaluation during Proton Therapy

To evaluate the PINN under conditions that closely mimic clinical scenarios, we performed a validation experiment using an adult human-torso phantom (True Phantom Solutions, Ontario, Canada), offering a more anatomically realistic model than the water tank setup. A planning CT scan of the phantom was acquired, and single pencil beam dose distributions were generated using a commercial treatment planning system (RayStation, RaySearch Laboratories, Stockholm, Sweden).

As shown in Fig. 5a, the proton beam was delivered from beneath the human torso, targeting the liver region, while a planar ultrasound transducer array was placed on the abdominal surface. The liver was chosen due to the presence of a favorable acoustic window in the abdomen and the absence of bone in the proton beam path, allowing for relatively homogeneous acoustic propagation. According to Supplementary Table 1, the acoustic properties of soft tissues and organs in this region are sufficiently similar to justify this assumption. Three distinct proton energies—160.67, 165.95, and 170.08 MeV—were used to deliver Bragg peaks at depths of 5.2 cm, 6.2 cm, and 7.2 cm from the transducer, respectively. Because this dataset included only three pencil beam energies, it was not sufficient for model retraining or fine-tuning. Instead, we directly applied the PINN model trained on simulated and water tank data to this new dataset. Fig. 5b presents line profile comparisons of the reconstructed dose distributions for all three energies. Time-reversal reconstruction (blue solid lines) consistently localized the Bragg peaks but failed to reproduce the correct amplitude and peak morphology. The U-Net predictions (green dashed lines) recovered amplitude more effectively, but introduced noticeable shape distortion, particularly for the 7.2 cm beam. In contrast, the PINN results (black dashed lines) closely matched the ground truth dose profiles (red solid lines), accurately capturing both shape and intensity across all energy levels.

Fig. 5c further visualizes dose reconstructions in axial, sagittal, and coronal views, overlaid on the planning CT. Compared to time-reversal and U-Net results, the PINN-enhanced reconstructions show superior spatial agreement with the planned dose distributions. These findings highlight the PINN’s ability to generalize to complex, real-world anatomies without retraining, thanks to its embedded physics-based forward model.

Evaluation in FLASH Therapy

To further evaluate the versatility of the PINN, we tested its performance using a different radiation modality: FLASH electron radiotherapy. As shown in Fig. 6a, the experiment was conducted in a water tank, with the beam delivered through a collimator. This controlled configuration allowed us to capture dose distributions with high temporal resolution, making it ideal for testing qRAI’s real-time dosimetry capabilities under ultra-high dose rate conditions. A key characteristic of this experiment is the extreme dose rate of the FLASH electron beam, illustrated in Fig. 6b. The system delivers 1 Gy of dose in just 1 microsecond per pulse, corresponding to an instantaneous dose rate of 10^6 Gy/s. This mirrors the conditions of future clinical FLASH therapy^78,79^, where precise in vivo dosimetry remains a major technical challenge due to the rapid dose deposition.
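The instantaneous dose rate quoted above follows directly from the per-pulse dose and pulse width. A short check, with the hypothetical helper name `instantaneous_dose_rate`:

```python
def instantaneous_dose_rate(dose_per_pulse_gy, pulse_width_s):
    """Dose rate within a single pulse, in Gy/s."""
    return dose_per_pulse_gy / pulse_width_s

# Values stated in the text: 1 Gy delivered in 1 microsecond.
rate = instantaneous_dose_rate(1.0, 1e-6)
print(f"{rate:.3g} Gy/s")
```

This is the within-pulse rate only; the mean dose rate would also depend on the pulse repetition rate, which is not given here for the FLASH beam.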

To benchmark performance, we compared the reconstruction results of four methods (Fig. 6c): ground truth (GT), time-reversal (TR) reconstruction, U-Net enhancement, and PINN enhancement, across axial, sagittal, coronal, and 3D views. The GT dose maps were obtained using TOPAS^80^ Monte Carlo simulation, serving as the reference standard (Summary statistics of the training dataset are provided in Supplementary Figure 6 and Supplementary Table 2). Both U-Net and PINN models were fine-tuned via transfer learning using a new FLASH electron dataset that included six different collimator configurations to introduce variability and improve generalization (training details provided in Supplementary Information III).

The reconstruction results show that the TR method can roughly localize the dose but lacks structural accuracy. The U-Net model recovers some spatial features but performs inconsistently across different views and energy levels. The PINN model shows modest improvements in both spatial fidelity and intensity reconstruction; however, it still exhibits limitations in capturing the full dose structure. These challenges are largely attributed to the limited size and variability of the training dataset under FLASH conditions.

Despite these constraints, the experiment demonstrates the feasibility of extending the PINN-empowered qRAI framework to FLASH therapy applications. With further optimization and larger training datasets, this approach holds promise for accurate, real-time in vivo dosimetry in ultra-high dose rate radiotherapy.

Discussion

We introduce a physics-informed neural network (PINN) framework for radiacoustic imaging (RAI), enabling quantitative, real-time, volumetric dosimetry during radiotherapy. Our results demonstrate that PINN offers several critical advantages over conventional reconstruction approaches—most notably, the ability to reconstruct accurate, quantitative dose maps from limited-view data without requiring labeled in vivo ground truth during treatment. This capability represents a major step towards in vivo dosimetry, where dose delivery can be verified in real time, even under clinical constraints.

A major challenge for deep learning in novel imaging modalities is the scarcity of large-scale annotated datasets. PINN directly addresses this by dramatically reducing training data requirements. Rather than relying on patient images, PINN is trained using synthesized sinograms generated from digital twins, which simulate a wide range of dose distributions and anatomical features. This not only eliminates the need for time-consuming and costly data collection but also mitigates overfitting to specific anatomical characteristics present in small clinical datasets. By encoding the governing physics into the loss function, PINN leverages self-supervision from the measured sinogram, ensuring that the learned reconstruction remains physically consistent even in out-of-distribution scenarios.

This approach is particularly advantageous for emerging imaging technologies like qRAI, where ground truth data from patients is either unavailable or impractical to obtain. Unlike purely data-driven networks (e.g., U-Net), PINN’s physics-constrained architecture enables robust performance in real-world conditions with sparse or noisy measurements. In our studies, PINN consistently outperformed both time-reversal and U-Net approaches across water tank, human phantom, and FLASH electron experiments—accurately recovering dose shape and magnitude, even with limited experimental training data.

The digital twin model plays a pivotal role in enabling quantitative reconstruction. Traditional RAI forward models often ignore real-world system complexities, leading to discrepancies between simulations and measurements. Our digital twin framework overcomes this by incorporating measured beam profiles, transducer impulse responses, and medium-specific acoustic properties—tuned to each experimental setup. With the digital twin, simulation data can now closely replicate experimental data, enabling networks trained on simulation data to be directly applicable to experimental setups. Although calibration requires upfront effort, once constructed, the digital twin remains stable and reusable across experiments, forming a foundation for reliable training data generation.

In our human torso phantom experiments, we demonstrated that PINN, trained from simulation and water tank data, can be directly applied to realistic anatomical geometries without retraining. This highlights its potential for clinical deployment, where acquiring diverse patient data for model training is infeasible. To address tissue heterogeneity and motion in future patient studies, we are developing a method to dynamically update the acoustic model using co-registered 3D ultrasound and planning CT. This will further improve robustness for in vivo applications during treatment^81^.

A key limitation remains the low signal-to-noise ratio (SNR) of radiacoustic signals, particularly in proton therapy where multiple beam pulses must be accumulated to produce usable images. This requirement currently limits temporal resolution. While FLASH electrons generate stronger signals due to higher per-pulse doses, 10 pulses are still needed to achieve optimal reconstruction. Although we have developed deep-learning-based denoising methods in prior work, their purely data-driven nature requires clinical datasets that are not yet available. A future direction is to integrate physics-informed denoising into the PINN workflow to enable single pulse imaging.

Importantly, the principles behind PINN are generalizable to other imaging modalities governed by physical models, such as ultrasound tomography^82^, X-ray computed tomography^83^, photoacoustic imaging^84,85^, and magnetic resonance imaging^86^. By embedding modality-specific physics into the network, PINN can serve as a flexible and powerful reconstruction engine across diverse applications, especially in settings where conventional supervised learning falls short due to data limitations.

In conclusion, qRAI powered by the PINN framework offers a transformative solution for precision dosimetry. It achieves quantitative, real-time dose reconstruction with minimal data requirements, supports in vivo imaging without labeled ground truth, and is extensible to a wide range of radiation and imaging modalities. Our experimental results confirm its accuracy in both proton and FLASH electron therapy and its compatibility with realistic anatomical settings. Looking ahead, we aim to further enhance performance through improved SNR, volumetric ultrasound integration, and expanded datasets that capture patient heterogeneity. With these developments, we envision qRAI as a cornerstone technology for adaptive radiotherapy, enabling online dose verification and ultimately improving patient safety and outcomes.

Methods

3D RAI system

We evaluated the performance of the physics-informed neural network (PINN) framework using a radiacoustic imaging (RAI) system (Supplementary Fig. 5), integrated with a clinical radiotherapy machine capable of delivering either proton beams or FLASH electron beams. Radiacoustic signals were detected by a 256-element matrix ultrasound array, amplified, and processed through a custom 256-channel data acquisition system. To ensure precise synchronization, a trigger signal was generated using a photodiode coupled with a scintillator. This setup eliminates the need for mechanical scanning and enables real-time 3D radiacoustic imaging during radiation delivery. With the proton beam operating at a repetition rate of 750 Hz, the system achieves imaging rates of up to 75 frames per second using 10-signal averaging.
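The quoted imaging rate is the pulse repetition rate divided by the number of averaged signals. A small sketch of that arithmetic, with the hypothetical helper name `imaging_frame_rate`:

```python
def imaging_frame_rate(pulse_rate_hz, pulses_per_frame):
    """Frames per second when each frame averages a fixed number of pulses."""
    return pulse_rate_hz / pulses_per_frame

# Values stated in the text: 750 Hz repetition rate, 10-signal averaging.
frame_rate = imaging_frame_rate(750, 10)
frame_time_ms = 1000.0 / frame_rate

print(f"{frame_rate:.0f} frames/s, {frame_time_ms:.2f} ms per frame")
```

The same relation shows the trade-off between averaging and temporal resolution: doubling the number of averaged pulses halves the frame rate.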

Physics Model

This section outlines the theoretical foundations of the radiacoustic physics model integrated into our PINN framework, covering both the forward model of radiacoustic wave generation and the inverse reconstruction of the initial pressure distribution.

Radiacoustic Wave Generation and Propagation:

In quantitative radiacoustic imaging (qRAI), radiation energy deposition leads to a rapid local temperature rise, which in turn induces thermoelastic expansion and generates acoustic waves. Under the assumptions of thermal confinement and negligible acoustic attenuation, the wave equation governing pressure wave propagation is given by^87^:

[eqn]

where $[eqn]$ denotes the acoustic pressure at location $[eqn]$ and time t, c is the speed of sound, $[eqn]$ is the initial pressure distribution, and δ(t) describes the temporal distribution of $[eqn]$, which in RAI is the temporal profile of the radiation pulse.

The initial pressure $p_0$ is proportional to the deposited dose $D$ and can be modeled as:

$$p_0 = \Gamma \, \eta_{th} \, \rho \, D$$

where Γ is the Grüneisen parameter, η_th_ is the fraction of absorbed dose converted to heat, and ρ is the density of the irradiated medium.
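As a rough numerical illustration of the dose-to-pressure relation, the sketch below assumes the product form p0 = Γ·η_th·ρ·D, with dose in Gy (J/kg) and density in kg/m³, which gives pressure in Pa. The function name, the water-like Grüneisen value of about 0.11, and η_th = 1 are illustrative assumptions rather than values reported here.

```python
def initial_pressure_pa(dose_gy, grueneisen, density_kg_m3, eta_th=1.0):
    """Initial pressure (Pa) from dose, assuming p0 = Gamma * eta_th * rho * D."""
    # Dose in Gy is J/kg; multiplying by density gives energy density in J/m^3 (= Pa).
    return grueneisen * eta_th * density_kg_m3 * dose_gy


# Assumed, water-like values for illustration only.
GAMMA_WATER = 0.11   # dimensionless, approximate room-temperature value
RHO_WATER = 1000.0   # kg/m^3

# 1.72 cGy per pulse is the Bragg-peak dose quoted in the calibration section.
p0 = initial_pressure_pa(dose_gy=1.72e-2, grueneisen=GAMMA_WATER, density_kg_m3=RHO_WATER)
print(f"p0 is roughly {p0:.2f} Pa per pulse")
```

Under these assumptions a single pulse produces an initial pressure on the order of a pascal, which is consistent with the need for signal averaging and amplification in the acquisition chain.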

The analytical solution for the pressure at a detection point $[eqn]$ and time t is^44^:

[eqn]

where S′(t) denotes a spherical surface defined by $[eqn]$. In compact operator notation, this relationship is expressed as^88^:

[eqn]

where M represents the physical operator encompassing the forward process of acoustic wave generation and propagation.

Initial Pressure Reconstruction via Time-Reversal:

To reconstruct the initial pressure distribution $[eqn]$, we apply a time-reversal (TR) method^68^, which numerically back-propagates the recorded acoustic signals in time. This inverse solution leverages the time-symmetry of the acoustic wave equation, allowing accurate recovery of the original pressure distribution:

[eqn]

where $[eqn]$ is the measured pressure signal and T is the acquisition time. The initial conditions for solving the wave equation during back-propagation are:

[eqn]

Solving the time-reversed wave equation with these conditions yields the reconstructed pressure distribution:

[eqn]

where TR denotes the time-reversal operator. Importantly, incorporating prior knowledge of the acoustic properties of the medium—such as speed of sound, density, and attenuation—into the TR process enhances quantitative accuracy compared to standard universal back-projection (UBP) methods.

Digital Twin Modeling

To accurately simulate the system-specific signal formation process in quantitative radiacoustic imaging (qRAI), we developed a digital twin^74,75^ model that incorporates the temporal characteristics of the radiation beam, the spatial integration effects of the ultrasound transducer, and the transducer’s impulse response. This comprehensive forward model enables the generation of realistic training data and serves as the physics engine embedded in the PINN framework.

The digital twin framework consists of four key components:

Radiation Beam Temporal Profile: As the first component of the digital twin, we captured the temporal profile of the radiation beam delivered by the radiotherapy system. A photodiode–scintillator assembly was placed directly beneath the beam to record its intensity output as a function of time. This temporal profile, denoted as δ(t), modulates the pressure waveform generated by the radiation energy deposition. Accordingly, the time-resolved pressure signal $[eqn]$ at a detection point $[eqn]$ is given by the convolution of the pressure output from the physical model $[eqn]$ with the radiation pulse profile:

[eqn]

The measured pulse shapes are shown in Supplementary Fig. 1.

Finite-Element Transducer Model: Next, we account for the spatial averaging effect of the finite-sized transducer elements. Each transducer in the planar array has a physical size of 3×3 mm. To match the resolution of our image reconstruction, each element is subdivided into nine 1×1 mm sub-elements. The resulting pressure signal $[eqn]$ is calculated as the sum of the contributions from all sub-elements:

[eqn]

where $[eqn]$ is the displacement vector from the center of the transducer element to the ith sub-element. This procedure is illustrated in Supplementary Figure 2.

Transducer Impulse Response: The final component of the digital twin model incorporates the frequency response of the ultrasound transducer (shown in Supplementary Figure 3), which behaves as a damped harmonic oscillator. The measured pressure signal $[eqn]$ is modeled as the convolution of the finite-element signal $[eqn]$ with the transducer’s impulse response, IR:

[eqn]

Unified Forward Operator: By combining all three components—the radiation pulse profile, finite-element spatial averaging, and transducer impulse response—we define a unified forward operator F, which represents the complete signal acquisition process in our digital twin system. Thus, the measured pressure signal can be expressed as:

[eqn]

This digital twin framework provides a high-fidelity, end-to-end simulation of the qRAI system and plays a central role in training and validating the PINN model.
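The three stages of the digital twin can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the parametric damped-oscillator impulse response, the point-source signal model, and all numerical values except the 3×3 mm element split into nine 1×1 mm sub-elements are hypothetical.

```python
import math


def convolve(signal, kernel):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def sub_element_offsets(element_mm=3.0, n_side=3):
    """Sub-element centers relative to the element center, in mm."""
    pitch = element_mm / n_side
    half = (n_side - 1) / 2.0
    return [((ix - half) * pitch, (iy - half) * pitch)
            for ix in range(n_side) for iy in range(n_side)]


def damped_oscillator_ir(n_samples, dt_s, f0_hz, tau_s):
    """Impulse response of a damped harmonic oscillator (assumed parametric form)."""
    return [math.exp(-n * dt_s / tau_s) * math.sin(2 * math.pi * f0_hz * n * dt_s)
            for n in range(n_samples)]


def forward_operator(ideal_signal_at, element_center_mm, pulse, impulse_response):
    """Toy F: pulse convolution, sub-element summation, then IR convolution."""
    cx, cy = element_center_mm
    summed = None
    for dx, dy in sub_element_offsets():
        # Stage 1: ideal signal at the sub-element, convolved with the pulse profile.
        p_delta = convolve(ideal_signal_at((cx + dx, cy + dy)), pulse)
        # Stage 2: sum the contributions of all sub-elements.
        summed = p_delta if summed is None else [a + b for a, b in zip(summed, p_delta)]
    # Stage 3: convolve with the transducer impulse response.
    return convolve(summed, impulse_response)


def point_source_signal(position_mm, n_samples=64, depth_mm=30.0,
                        c_mm_per_us=1.5, dt_us=0.5):
    """Hypothetical ideal signal: one spike at the time of flight from a point source."""
    x, y = position_mm
    dist = math.sqrt(x ** 2 + y ** 2 + depth_mm ** 2)
    sig = [0.0] * n_samples
    sig[int(round(dist / c_mm_per_us / dt_us))] = 1.0 / dist
    return sig


pulse = [0.25, 0.5, 0.25]  # assumed normalized pulse shape
ir = damped_oscillator_ir(n_samples=16, dt_s=0.5e-6, f0_hz=0.5e6, tau_s=2e-6)
measured = forward_operator(point_source_signal, (0.0, 0.0), pulse, ir)
print(len(measured))
```

Because convolution is linear, summing the sub-element signals before or after the pulse convolution gives the same result; the order above simply follows the description in the text.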

Physics-Informed Neural Network

The core of our PINN framework is a 3D U-Net^89^ that maps the time-reversal reconstructed pressure map $[eqn]$ to the estimated initial pressure distribution $[eqn]$. This mapping is represented by a nonlinear function N(⋅), and the traditional learning objective for such a model is to minimize the L2 norm between the predicted and true initial pressure maps:

[eqn]

To enforce physical consistency, we extend this formulation by integrating the forward physics operator F, which maps predicted pressure distributions to simulated sinograms. This leads to a dual-loss objective:

[eqn]

where λ1 and λ2 are weighting parameters that control the trade-off between data fidelity and physics consistency. The model is trained by minimizing a composite loss function consisting of two terms:

Pressure loss (image domain):

[eqn]

Sinogram loss (measurement domain):

[eqn]

This physics-informed formulation provides two key benefits: (1) it enables the network to generalize to unseen data by learning from both image-level supervision and sinogram-level self-consistency, and (2) it reduces the dependency on large amounts of labeled data, which are difficult to obtain for emerging imaging modalities like qRAI.
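The dual-loss objective can be illustrated with a small sketch. It assumes squared L2 distances for both terms and works on flattened lists; the function names and the toy linear forward operator are hypothetical stand-ins for the network output N(·) and the digital-twin operator F, and a real training loop would use an automatic-differentiation framework.

```python
def squared_l2(a, b):
    """Squared L2 distance between two equally sized flattened arrays."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def composite_loss(p_pred, p_true, sinogram_meas, forward_op, lam1=1.0, lam2=1.0):
    """Return (total, pressure_loss, sinogram_loss) for one training sample."""
    pressure_loss = squared_l2(p_pred, p_true)                     # image domain
    sinogram_loss = squared_l2(forward_op(p_pred), sinogram_meas)  # measurement domain
    total = lam1 * pressure_loss + lam2 * sinogram_loss
    return total, pressure_loss, sinogram_loss


def toy_forward(p):
    """Hypothetical linear stand-in for F: each detector sums two neighbouring voxels."""
    return [p[i] + p[i + 1] for i in range(len(p) - 1)]


p_true = [1.0, 2.0, 3.0]
sinogram = toy_forward(p_true)   # simulated measurement of the true pressure
p_pred = [1.0, 2.0, 4.0]         # stands in for the network output

total, l_p, l_s = composite_loss(p_pred, p_true, sinogram, toy_forward,
                                 lam1=1.0, lam2=0.5)
print(total, l_p, l_s)
```

The sinogram term needs only the measured signals and the forward operator, so it can still be evaluated when no ground-truth pressure map is available by setting lam1 to zero.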

Pressure-to-Dose Calibration

Once the PINN reconstructs a quantitative pressure map, the final step is to convert this acoustic pressure into a radiation dose distribution. While Eq. (2) provides a theoretical link between dose and pressure, the reconstructed pressure p_pred_ is relative in practice. This is because the acquired signals are measured in millivolts rather than pascals and are affected by unknown amplifier gains within the acquisition system. To establish a pressure-to-dose relationship, we performed a calibration using a water tank experiment. We selected the Bragg peak location at 7 cm from the transducer as the calibration point, where the proton machine delivers 1.72 cGy per pulse (8 pC). The delivered dose at this location, D_c_, was extracted from the machine log files, and the corresponding reconstructed pressure value, p_c_, was obtained from the qRAI reconstruction. We then define the calibration factor K as:

$$K = \frac{D_c}{p_c}$$

This scalar factor enables conversion of relative pressure values into quantitative dose by applying K·p_pred_ across the reconstructed pressure map in the water tank setup.
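A minimal sketch of this single-point calibration follows. It assumes K is the ratio of the known dose to the reconstructed pressure at the calibration point, which is what applying K to the predicted pressure implies. The function names and the relative pressure values are hypothetical; only the 1.72 cGy per pulse figure comes from the text.

```python
def calibration_factor(dose_at_cal_point, pressure_at_cal_point):
    """K = D_c / p_c, in dose units per relative pressure unit."""
    if pressure_at_cal_point == 0:
        raise ValueError("calibration pressure must be non-zero")
    return dose_at_cal_point / pressure_at_cal_point


def pressure_to_dose(pressure_map, k):
    """Scale every voxel of a relative pressure map (list of rows) by K."""
    return [[k * p for p in row] for row in pressure_map]


D_C_CGY = 1.72                          # dose per pulse at the Bragg peak (from the text)
p_pred = [[0.0, 21.5], [43.0, 86.0]]    # hypothetical relative pressures (arbitrary units)
p_c = p_pred[1][1]                      # reconstructed value at the calibration point

K = calibration_factor(D_C_CGY, p_c)
dose_map_cgy = pressure_to_dose(p_pred, K)
print(dose_map_cgy)
```

Because K absorbs the amplifier gain and the medium-dependent response, a factor measured in one setup should not be reused in a different medium without recalibration.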

For the human torso phantom, a direct pressure-to-dose conversion is more challenging because the absolute Grüneisen parameter (Γ) is unknown. Therefore, we performed a separate calibration using the same strategy. We selected the Bragg peak location at 5.2 cm depth, where the proton machine delivers a known dose of 1.88 cGy per pulse (8 pC). This approach allows for a relative calibration within the phantom geometry, enabling approximate dose estimation even in the absence of a precisely known tissue response.

Evaluation Metrics

To quantitatively evaluate the accuracy of the reconstructed dose distributions, we employed three widely used image quality and dosimetric metrics: gamma index (γ), structural similarity index (SSIM), and peak signal-to-noise ratio (PSNR).

Gamma Index (γ)^90^:

The gamma index $\gamma(\mathbf{r}_r)$ was selected as the primary evaluation metric, as it is a standard criterion in radiotherapy for assessing both dose differences and spatial agreement between the planned and delivered dose. It combines distance-to-agreement (DTA) and dose difference into a single scalar value and is defined as:

$$\gamma(\mathbf{r}_r) = \min_{\mathbf{r}_p} \sqrt{\frac{\lvert \mathbf{r}_p - \mathbf{r}_r \rvert^2}{\Delta d^2} + \frac{\left[D_p(\mathbf{r}_p) - D_r(\mathbf{r}_r)\right]^2}{\Delta D^2}}$$

where:

$D_r$ is the reference (ground-truth) dose and $D_p$ is the predicted dose; $\mathbf{r}_r$ is the coordinate in $D_r$ and $\mathbf{r}_p$ is the coordinate in $D_p$;
Δd is the distance-to-agreement (DTA) criterion (in mm) and ΔD is the dose difference criterion (as a percentage of the reference dose).

A gamma value $\gamma(\mathbf{r}_r) \le 1$ indicates that the reconstructed dose passes the specified acceptance criteria at location $\mathbf{r}_r$.
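For intuition, a brute-force gamma evaluation on a 1D dose profile can be sketched as follows. This is an illustrative simplification: it assumes the standard minimum-over-positions definition, a global dose criterion expressed as a fraction of the maximum reference dose, and 3%/3 mm criteria chosen as an example. The function names are hypothetical, and clinical tools add interpolation and 3D search.

```python
import math


def gamma_index_1d(ref_dose, pred_dose, spacing_mm, dta_mm=3.0, dose_crit_frac=0.03):
    """Gamma value at every reference sample of a 1D profile (global normalization)."""
    dose_tol = dose_crit_frac * max(ref_dose)
    gammas = []
    for i, d_ref in enumerate(ref_dose):
        best = math.inf
        for j, d_pred in enumerate(pred_dose):
            dist_mm = (j - i) * spacing_mm
            # Squared generalized distance combining spatial and dose disagreement.
            g2 = (dist_mm / dta_mm) ** 2 + ((d_pred - d_ref) / dose_tol) ** 2
            best = min(best, g2)
        gammas.append(math.sqrt(best))
    return gammas


def pass_rate(gammas):
    """Percentage of points with gamma <= 1."""
    return 100.0 * sum(g <= 1.0 for g in gammas) / len(gammas)


reference = [0.1, 0.4, 1.0, 1.72, 0.2]   # hypothetical depth-dose samples (cGy)
same = gamma_index_1d(reference, reference, spacing_mm=1.0)

flat_ref = [1.0] * 5
flat_pred = [1.1] * 5                    # uniform 10% overestimate
shifted = gamma_index_1d(flat_ref, flat_pred, spacing_mm=1.0)

print(pass_rate(same), pass_rate(shifted))
```

A perfect reconstruction gives gamma equal to zero everywhere, while a uniform 10% dose error on a flat profile cannot be rescued by the spatial search and fails at every point.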

Structural Similarity Index (SSIM):

To assess structural fidelity, we used the structural similarity index (SSIM), which measures perceptual similarity between the reconstructed dose distribution and the reference dose. SSIM is defined as^91^:

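$$\mathrm{SSIM}(I, R) = \frac{(2\mu_I \mu_R + c_1)(2\tau_{IR} + c_2)}{(\mu_I^2 + \mu_R^2 + c_1)(\sigma_I^2 + \sigma_R^2 + c_2)}$$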

where:

I and R are the reconstructed and reference dose images, respectively,
μ_R_ is the mean of R, $[eqn]$ is the variance of I, and τ_IR_ is the covariance of I and R.

The constants c_1_ = (k_1_L)^2^ and c_2_ = (k_2_L)^2^ stabilize the division when the denominator is small, where L is the dynamic range of the pixel intensities and, by default, k_1_ = 0.01 and k_2_ = 0.03.
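The SSIM definition can be checked numerically with a short sketch. It assumes the standard SSIM expression evaluated once over the whole image (global statistics), whereas common implementations average the index over local windows. The function name and test values are hypothetical, and images are passed as flattened lists.

```python
def ssim_global(img, ref, dynamic_range, k1=0.01, k2=0.03):
    """Single-window SSIM between two flattened images of equal length."""
    n = len(img)
    mu_i = sum(img) / n
    mu_r = sum(ref) / n
    var_i = sum((x - mu_i) ** 2 for x in img) / n
    var_r = sum((y - mu_r) ** 2 for y in ref) / n
    cov_ir = sum((x - mu_i) * (y - mu_r) for x, y in zip(img, ref)) / n
    c1 = (k1 * dynamic_range) ** 2   # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2   # stabilizes the contrast/structure term
    numerator = (2 * mu_i * mu_r + c1) * (2 * cov_ir + c2)
    denominator = (mu_i ** 2 + mu_r ** 2 + c1) * (var_i + var_r + c2)
    return numerator / denominator


reference = [0.0, 0.5, 1.0, 1.72]    # hypothetical dose values (cGy)
inverted = list(reversed(reference))

print(ssim_global(reference, reference, dynamic_range=1.72))
print(ssim_global(inverted, reference, dynamic_range=1.72))
```

Identical images score 1, and an image whose structure is anti-correlated with the reference scores below zero even though its mean and variance match.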

Peak Signal-to-Noise Ratio (PSNR):

Lastly, PSNR was used to quantify the dose reconstruction fidelity. It is defined as^92^:

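$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\lVert G \rVert_\infty^2}{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(T_{ij} - G_{ij}\right)^2}\right)$$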

where:

T is the reconstructed dose image,
G is the ground-truth dose image,
M and N are the image dimensions (rows and columns),
∥G∥∞ is the maximum pixel value in G.
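A short sketch of the PSNR computation, assuming the usual definition as ten times the base-10 logarithm of the squared peak value of the ground truth divided by the mean squared error. The function name and the small test images are hypothetical.

```python
import math


def psnr_db(recon, truth):
    """PSNR in dB between two 2D images given as lists of rows."""
    n_rows, n_cols = len(truth), len(truth[0])
    sq_err = sum((t - g) ** 2
                 for t_row, g_row in zip(recon, truth)
                 for t, g in zip(t_row, g_row))
    mse = sq_err / (n_rows * n_cols)
    peak = max(abs(g) for row in truth for g in row)   # maximum pixel value of G
    if mse == 0.0:
        return math.inf                                # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


truth = [[0.0, 1.0], [2.0, 4.0]]   # hypothetical ground-truth dose image
recon = [[0.0, 1.0], [2.0, 3.0]]   # one pixel off by 1

value = psnr_db(recon, truth)
print(f"{value:.2f} dB")
```

Here the mean squared error is 0.25 and the peak is 4, so the ratio is 64 and the PSNR is about 18.06 dB.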

Together, these metrics provide a comprehensive evaluation of both the spatial and dosimetric accuracy of our reconstructed dose maps.

Supplementary Material

This is a list of supplementary files associated with this preprint.

• Supplementary.pdf

Bibliography

1. Baskar R., Lee K. A., Yeo R. & Yeoh K.-W. Cancer and Radiation Therapy: Current Advances and Future Directions. Int J Med Sci 9, 193–199 (2012). doi:10.7150/ijms.3635. PMID 22408567. PMCID PMC3298009.
2. Abdel-Wahab M. Global Radiotherapy: Current Status and Future Directions—White Paper. JCO Glob Oncol 827–842 (2021). doi:10.1200/GO.21.00029. PMID 34101482. PMCID PMC8457786.
3. Liu H. & Chang J. Y. Proton therapy in clinical practice. Chin J Cancer 30, 315–326 (2011). doi:10.5732/cjc.010.10529. PMID 21527064. PMCID PMC4013396.
4. Gao Y. A potential revolution in cancer treatment: A topical review of FLASH radiotherapy. J Appl Clin Med Phys 23, e13790 (2022). doi:10.1002/acm2.13790. PMID 36168677. PMCID PMC9588273.
5. Tang R., Yin J., Liu Y. & Xue J. FLASH radiotherapy: A new milestone in the field of cancer radiotherapy. Cancer Letters 587, 216651 (2024). doi:10.1016/j.canlet.2024.216651. PMID 38342233.
6. Matuszak N. FLASH radiotherapy: an emerging approach in radiation therapy. Reports of Practical Oncology and Radiotherapy 27, 343–351 (2022).
7. Proton Therapy: Current Status and Controversies | JCO Oncology Practice. https://ascopubs.org/doi/10.1200/OP.24.00132.
8. Hughes J. R. & Parsons J. L. FLASH Radiotherapy: Current Knowledge and Future Insights Using Proton-Beam Therapy. Int J Mol Sci 21, E6492 (2020).