Thermoacoustic Ultrasound Assessment of Liver Steatosis—A Novel Approach for MASLD Diagnosis
Jang Hwan Cho, Christopher M. Bull, Michael Thornton, Jing Gao, Jonathan M. Rubin, Idan Steinberg

TL;DR
This study introduces a new non-invasive ultrasound method using thermoacoustic signals to accurately assess liver fat levels, offering a promising alternative to expensive MRI scans for diagnosing liver disease.
Contribution
The novel thermoacoustic fat fraction (TAFF) method provides a non-invasive, accurate, and cost-effective alternative for diagnosing MASLD.
Findings
TAFF showed strong correlation (r = 0.89) with MRI-PDFF and low average error (3.04%).
High classification performance with AUROC of 0.92 at 12% and 0.99 at 20% fat fraction thresholds.
TAFF demonstrated robust stability with ICC of 0.89 and negligible inter-operator differences.
Abstract
Background/Objectives: Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) is a global health crisis, but current diagnostics are limited. Liver biopsy is invasive, magnetic resonance imaging-proton density fat fraction (MRI-PDFF) is expensive, and quantitative ultrasound methods are low-accuracy, especially in patients with a high body mass index (BMI). This study introduces a novel thermo-acoustic (TA) method that generates ultrasound signals based on tissue electrical conductivity, where lean tissue (high in water and electrolytes) absorbs more radio-frequency (RF) energy than fatty tissue, providing a direct molecular contrast for fat. Methods: A prospective, cross-sectional feasibility study compared a new thermo-acoustic fat fraction (TAFF) score with the reference standard MRI-PDFF in 40 subjects with suspected fatty liver disease. Bland–Altman analysis, Deming…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLiver Disease Diagnosis and Treatment · Infrared Thermography in Medicine · Body Composition Measurement Techniques
1. Introduction
1.1. Clinical Background
Metabolically dysfunction-associated steatotic liver disease (MASLD), formerly known as nonalcoholic fatty liver disease (NAFLD), is a complex, multi-system metabolic disorder characterized by elevated liver fat fraction (LFF). The steatotic liver is not passive; it acts as an active endocrine organ, secreting hepatokines such as Fetuin-A that drive systemic insulin resistance and chronic inflammation through a vicious cycle of lipotoxicity and oxidative stress [1,2,3,4,5,6,7,8,9,10,11]. This metabolic cascade has profound systemic implications, such as cardiovascular disease (CVD) [12], type 2 diabetes (T2D) [13], hypertension, and neurodegenerative diseases [14]. Recently, MASLD has emerged as a major global health concern, with an estimated prevalence of approximately 30%, establishing MASLD as the most common chronic liver disease in Western nations [15,16,17].
An estimated 20–30% of individuals with MASLD develop metabolic dysfunction-associated steatohepatitis (MASH), characterized by inflammation and fibrosis that places them at high risk for severe morbidity [18]. This disease progression is clinically significant as MASH is a leading driver for the development of cirrhosis, hepatocellular carcinoma (HCC), liver failure, and ultimately liver transplantation [19,20]. Furthermore, MASLD is strongly associated with a complex of chronic diseases that extend beyond the liver, with CVD representing the primary cause of mortality in the MASLD patient population [12]. However, this risk is dynamic rather than static, as clinical evidence shows that reversing hepatic steatosis successfully ameliorates systemic dysfunction and significantly reduces the risk of associated cardiovascular events [21,22,23,24,25].
Historically, the management of MASH has been predicated upon lifestyle modifications, primarily focused on inducing sustained weight loss [26]. The therapeutic landscape, however, is undergoing a rapid and profound transformation. This evolution was dramatically underscored by the 2024 landmark FDA approval of the first MASH-specific pharmacotherapy, Resmetirom (being marketed as Rezdiffra) [27]. In parallel, robust Phase 3 clinical trial data have demonstrated the compelling efficacy of various incretin therapies, including Glucagon-like Peptide-1 receptor agonists (GLP-1 RAs) such as semaglutide. These agents exhibit high rates of MASH resolution and marked improvements in hepatic fibrosis, firmly establishing them as a cornerstone of future metabolic-based therapeutic strategies [28]. And because MASLD patients die primarily from CVD, GLP-1 RAs offer crucial extrahepatic benefits that target these metabolic complications. Patients with MASLD and coexisting type 2 diabetes, GLP-1 RAs are associated with a significantly lower incidence of major adverse cardiovascular events and all-cause mortality compared to other glucose-lowering drugs [29,30,31].
Current diagnostic methods for steatotic liver disease exhibit a clear hierarchy based on their accuracy and clinical application. Liver biopsy remains the only definitive method for diagnosing MASH by identifying characteristic inflammation and ballooning; however, its utility for precise, longitudinal quantification is significantly limited by its invasiveness, substantial cost, high risk of sampling error, and inherent inter-observer variability [32,33]. In contrast, magnetic resonance imaging-proton density fat fraction (MRI-PDFF) has emerged as the non-invasive reference standard for quantifying hepatic fat, demonstrating high accuracy and reproducibility, thereby establishing it as the preferred modality for clinical trials [34,35]. Positioning itself as a widely accessible screening method, the first-generation quantitative ultrasound tool, the controlled attenuation parameter (CAP), offers adequate diagnostic accuracy. However, its performance is known to be confounded by factors such as high body mass index (BMI) [36,37,38]. This emphasis on objective measurement is mandated by the 2025 American Association for the Study of Liver Diseases (AASLD) guidance, which explicitly recommends quantitative methodologies, such as CAP or MRI-PDFF, over standard, subjective gray-scale ultrasound for detecting steatosis [39].
1.2. MASLD Diagnosis
The most recent advancements involve state-of-the-art Quantitative Ultrasound (QUS) techniques that leverage advanced analysis of beamformed images or raw radiofrequency ultrasound signals to assess complementary acoustic properties [40,41,42]. Quantitative ultrasound methods, specifically Attenuation Coefficient (AC) algorithms and/or backscatter signals for liver fat assessment, such as the CAP, ultrasound-derived fat fraction (UDFF), and ultrasound-guided attenuation parameter (UGAP), offer non-invasive approaches for assessing hepatic steatosis.
While the prevalence and related morbidity of MASLD, in combination with the introduction of new treatment options such as GLP-1 drugs, necessitate screening and monitoring of a broad population [43], none of the abovementioned methods is suitable for such a task. Invasive biopsies are painful, costly, and introduce a non-negligible risk to patients [32]. Although MRI-PDFF is accurate and non-invasive [34], it is cost-prohibitive and cannot be performed at the bedside. Ultrasound-based methods, albeit low cost and portable, often suffer from low predictive power, especially in patients with high BMI or confounding fibrosis due to reliance on surrogate parameters in place of fatty tissue molecular contrast. Thus, there is an unmet clinical need for a non-invasive, low-cost, rapid, and portable method with high predictive power for screening and monitoring MASLD treatment.
Thermoacoustic (TA) imaging is a hybrid electromagnetic (EM) and acoustic imaging modality that leverages differential absorption of time-varying EM radiation to generate pressure waves in tissue [44]. As EM radiation propagates through tissue, its energy is differentially absorbed by regions based on electrical conductivity [45,46,47]. This selective absorption results in a rapid, localized increase in temperature. Although the temperature rise is minuscule—often on the order of milli-Kelvins—the almost instantaneous, localized change in tissue temperature is sufficient to generate an acoustic pressure wave that is detectable by an ultrasound transducer array. The wave intensity is proportional to the tissue conductivity, thus providing molecular contrast [48]. The detected pressure signals can then be used to reconstruct images or volumes of data related to tissue composition [49].
Currently, most academic research on TA focuses on system engineering [50,51,52] or pre-clinical imaging, such as functional brain imaging and cerebral hemorrhage detection [53] or joint and bone imaging [54]. Other examples include deep-tissue (>6 cm) kidney imaging in large swine and real-time monitoring of therapeutic interventions, such as microwave ablation (MWA) therapy [55]. However, in recent years, TA imaging has emerged as a clinical tool for diagnosing and monitoring various tissues and disease states. For example, in their seminal paper, Kruger et al. [56] demonstrated clear contrast enhancement in the tumor region for three patients imaged both before and after chemotherapy. Conversely, the two breast cancer patients imaged after completing their chemotherapy showed no contrast enhancement. This finding correlated with subsequent pathologic examinations, which confirmed their disease was in complete remission.
This paper describes a system and the clinical study used to assess the feasibility of a point-of-care, portable thermoacoustic system for measuring liver fat fraction. To that end, TA techniques can estimate liver fat since their contrast mechanism relies on differences in tissue conductivity when subjected to an RF pulse: lean tissues, which have high water and ion content, absorb significantly more energy than fatty tissues, which have low water and ion content. This difference in energy absorption, particularly at the boundary between muscle and liver, generates a stronger thermoacoustic signal as liver fat content increases, thereby enabling quantitative measurement of steatosis. Moreover, the layered structure of the tissue (skin, subcutaneous fat, muscle, and then liver) provides an inherent calibration method for the measured signal, as subcutaneous fatty tissue is predominantly fat (>90%) and muscle is predominantly lean tissue (>95%). Thus, for each measurement, signals from purely fatty and purely lean tissues are acquired along with those from the unknown liver tissue, enabling their use for calibration. The aim of this paper is to estimate the feasibility, accuracy compared to the reference standard MRI-PDFF, and reproducibility of using a handheld TA device for liver fat estimation.
2. Materials and Methods
2.1. TAEUS® Liver System
The TAEUS^®^ Liver System (ENDRA Life Sciences, Inc., Ann Arbor, MI, USA) is a point-of-care thermoacoustic system for measuring LFF. It includes four main subsystems, as illustrated in Figure 1A: (1) Console: A cart-mounted enclosure that contains an RF Source, electronics, and software. (2) Probe: A handheld component that includes an RF applicator and a receive-only ultrasound transducer. (3) Display assembly: A touchscreen tablet that is the graphical user interface for the ultrasound probe, a touchscreen monitor (TA display) that is the graphical user interface and input device. (4) Handheld ultrasound probe: a wireless U/S probe. Figure 1B shows a photo of the system.
During a scan, the TA console generates high-power RF pulses (7 kW peak power). It delivers these pulses to the RF applicator within the TA probe via a coaxial cable. The RF applicator is an open-ended waveguide antenna designed to efficiently couple RF energy into human tissue. Using an RF tap, a small fraction (<0.01%) of the power is redirected to a power monitor inside the console, ensuring every pulse is monitored for safety. The system operates below the specific absorption rate (SAR) limits for standard mode MRI operation [57] and below the maximal permissible exposure (MPE) for the electric field [58].
The transducer within the probe receives the ultrasound signal generated by the TA conversion of the RF pulse energy into acoustic energy. The ultrasound signal is then amplified, digitized, and sent to the console for processing and subsequent display on the graphical user interface. A TA exam includes a few preview scans (i.e., 125 pulses each) to ensure that the probe is adequately positioned and can pick a clear liver signal, followed by a longer one or more measurement scans (250 pulses each). Both the preview and measure scans are processed in the same way. Acquired scan data are averaged and digitally filtered to reduce noise and artifacts. A Hilbert transform is applied to the filtered data, followed by universal back-projection [59] and envelope detection to generate the beamformed image. Since the probe is handheld, its positioning and placement affect the quality of the acquired scan. To identify valid scans with features critical to the accurate thermo-acoustic fat fraction (TAFF) estimation, a scan-quality metric was designed and calculated for each scan. The scan quality metric comprises various geometric and image-based features to ensure that signals are free of interference and their locations are consistent with ultrasound imaging. Measurements with low scan quality were automatically removed from the study. On average, each patient had 5 valid measurements, with a range of 1 to 16. If multiple TAFF scores were obtained per patient, the mean of all TAFF scores was used in the analysis.
For each scan, the TAFF value is estimated from the corresponding beamformed image. Since the TA probe was positioned to include the skin, subcutaneous fat, muscle, and liver in the field of view, the beamformed image from that location is expected to show three signals: the skin-to-subcutaneous fat boundary, the fat-to-muscle boundary, and the muscle-to-liver boundary. Based on these expected signals at the tissue boundaries, the TAFF estimation is formulated as an optimization problem that seeks the liver fat content by minimizing the difference between measured and theoretically expected signal strengths. The cost function relies on a thermoacoustic signal-generation model that incorporates various tissue properties, including the Grüneisen parameter, tissue permittivity, tissue electrical conductivity, and subject-specific electric-field distributions generated by an electric-field model, as described in Appendix A. The electric field model generated the necessary electric field distributions for subjects with various body habitus and liver fat content, and was estimated using Ansys HFSS 2023 R2 (Ansys, Canonsburg, PA, USA) simulation software to account for these variations.
2.2. Study Description
To assess the ability of TA to accurately estimate LFF across various disease states and body habitus, a prospective, HIPAA-compliant, cross-sectional study was conducted as a proof-of-concept and preliminary evaluation of the method. A single-site (Ann Arbor, MI, USA) single-operator study was approved by the Advarra Institutional Review Board (IRB approval number: Pro00089043). First, prospective participants recruited from the local population were screened via questionnaire for clinically proven or suspected elevated liver steatosis (based on known comorbidities, lifestyle, and body habitus). If liver steatosis was confirmed or suspected, participants were invited to join the study, and written informed consent was obtained. This was done primarily to increase the number of subjects with liver steatosis in the study and to enable testing of the system across all disease grades. Inclusion and exclusion criteria are described in Table 1 below:
Subjects with liver diseases other than steatosis, such as cancer or ascites, or abnormalities such as missing liver lobes, polycystic disease (multiple hepatic cysts) were excluded from the study to prevent potential confounding factors. Any subject with an obstruction that prevented imaging of the liver capsule through the right intercostal imaging window was also rejected to ensure adequate measurement of the liver. Subjects with perihepatic fat at the measurement site, as identified by ultrasound or MRI, were also excluded from the study, as the current TA liver fat estimation algorithm assumes no perihepatic fat at that location. Finally, due to the low-frequency transducer (0.5 MHz) of the probe, subjects with subcutaneous fat at the measurement location of less than or equal to 6 mm were also rejected, as this thin fatty layer produced poorly resolved signals at the fat-to-muscle boundary. A flowchart of the study is shown in Figure 2.
Subjects were instructed to fast for 4 h prior to the ultrasound scan. The scan was performed by an operator experienced in using the TAEUS system. Subjects were in a slight (30-degree) left lateral decubitus position, with the right arm raised to extend the intercostal space. During a scanning session, the subject’s anatomy was assessed using a wireless, handheld ultrasound system (Vscan Air™ CL, GE Health Care, Milwaukee, WI, USA). A liver window clear of obstructions was first identified with the curved-array side of the probe, after which the subcutaneous fat thickness and liver capsule depth were measured using the linear-array side. The measurement position was marked on the subject with a surgical marker, and the probe orientation was captured with a 9-axis Bluetooth inclinometer (WT901BLECL, WitMotion, Shenzhen, China). Subsequently, a TA scan was performed. A second inclinometer was attached to the TA probe to ensure its orientation matched that of the ultrasound probe. The procedure time for the entire session was usually shorter than 10 min.
TAFF and MRI-PDFF evaluations were obtained within a four-week period to minimize changes in the subject’s liver fat. MRI-PDFF images were acquired by certified personnel at the University of Michigan Hospital on a 3.0-T clinical MRI scanner (Ingenia, Philips Healthcare, Amsterdam, The Netherlands). A three-plane localization gradient-echo sequence was first obtained, followed by an mDIXON-Quant sequence in a single breath hold, which automatically reconstructs the PDFF map. The sequence parameters were: 6 TEs (first TE 0.98 ms, delta TE 0.8 ms), TR 6.3 ms, flip angle 3°, parallel imaging SENSE factor 2, no signal averaging, matrix size 160 × 140, field-of-view (FOV) 400 × 350 mm, 77 slices, and slice thickness 3 mm. MRI-PDFF measurements were obtained from nine manually placed circular regions of interest (ROIs), one in each hepatic segment (I, II, III, IVa, IVb, V, VI, VII, and VIII), ranging from 78 mm^2^ (~10 mm diameter) to 314 mm^2^ (~20 mm diameter), while avoiding large blood vessels [60]. The mean PDFF of the nine ROIs was calculated for each scan. University personnel were blinded to the TA LFF results to avoid skewing the comparison.
2.3. Data Analysis
Regression, area under the receiver operating characteristic curve (AUROC), and accuracy analyses were conducted to evaluate TAFF’s performance relative to the reference-standard MRI-PDFF, with each subject represented by a paired measurement (mean TAFF, mean MRI-PDFF). A Bland–Altman analysis [61] was used to visualize the agreement and distribution of measurement errors. As this clinical study is the first to utilize TAFF, the Bland–Altman plot was also used to identify and correct for measurement bias (i.e., the mean error). Following bias correction, a Deming regression analysis [62] was used to evaluate the relationship between the MRI-PDFF and TAFF estimates of the liver fat. Deming regression was chosen because it appropriately accounts for measurement error in both the reference and test methods, making it well-suited to equivalence testing across methods. Unlike ordinary linear regression, which assumes the independent variable is measured without error, Deming regression provides an unbiased estimate of the relationship between two imperfectly measured variables. It is widely used for comparing clinical and analytical methods and is recommended in several methodological guidelines [63].
To estimate the Deming regression line, the ratio of variances between TAFF and MRI-PDFF measurements was required. Measurement variability was calculated from repeated measurements on multiple subjects. To avoid underestimating the variance of TAFF measurements, subjects with only one valid scan were excluded from the average estimate (35 subjects). As MRI performance can vary across vendors and sites, a conservative approach was adopted, relying on the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA) publication, which quantified MRI-PDFF’s variance using phantom data [15,64]. These publications suggest a standard deviation of 1.67% for pt fat fraction, corresponding to a variance of 2.778. For living subjects, this variance is expected to increase due to inherent variability among in vivo measurements; thus, the performance reported here represents a lower bound. For the TA, the single-operator repeated-measurement standard deviation was computed within subjects with multiple valid scans and then averaged across subjects. The resulting Deming regression confidence interval was compared with the identity line (MRI-PDFF = TAFF) to assess method equivalence.
In addition, the accuracy of TA in differentiating grades of steatotic liver disease was evaluated using paired MRI-PDFF measurements to estimate sensitivity, specificity, and positive and negative predictive values for binary classification. As there are no widely accepted cut-offs for MASLD staging other than S0-S1, cut-offs were selected based on the published studies comparing biopsy staging of SLD to MRI-PDFF measurements, with multiple cutoffs presented based on different publications [65,66,67,68,69]. The steatosis grade cutoffs analyzed are: S0-S1 = 5% LFF, S1-S2 = 12% & 17% LFF, and S2-S3 = 20% & 22% LFF. Additionally, the accuracy of TA in determining whether a subject has ≤8% MRI-PDFF was estimated, as clinical guidelines [70] suggest this MRI-PDFF threshold as one of the decision criteria for therapeutic intervention with Resmetirom. Receiver operating characteristic (ROC) analysis was also performed at all thresholds using Harrell’s C-statistic [71].
Finally, an exploratory analysis was conducted to identify demographic characteristics commonly associated with MASLD that might contribute to measurement error. To determine anatomical and population-based systematic errors, the correlation between measurement error and gender, BMI, and hip-to-waist ratio was assessed.
2.4. Measurement Reliability Analysis
To characterize system stability, measurement repeatability (intra-operator) and reproducibility (inter-operator) were evaluated in a dedicated study under the IRB committee of Rocky Vista University (IRB approval number: 2018-0029). Two operators (one experienced with the TAEUS system and one a radiologist experienced in U/S imaging). For each subject, two operators independently performed two examinations, with the sequence of operators randomized for each subject to prevent systematic bias. Each of the four examinations per subject was conducted as a completely independent event. Operators were required to fully reset the clinical workflow for every acquisition, which involved repositioning the subject, performing independent ultrasound anatomical localization to identify a clear liver window, identifying distances to tissue boundaries, and finally conducting the thermoacoustic exam. To ensure consistency across the study, a measurement was standardized to require five valid thermoacoustic scans per examination. Scan validity was determined using the scan-quality metric described in Section 2, ensuring that each scan met the necessary geometric and signal-based criteria for accurate TAFF estimation. A single TAFF measurement was defined for analysis as the mean of these five liver fat estimates.
A total of 56 TAFF measurements (14 subjects, 2 operators, 2 repetitions) were included in the final analysis. Reliability and precision were evaluated using a two-way random-effects ANOVA model, which partitioned total variance into components attributable to subjects, operators, and residual error. This model was used to calculate the ICC(2,1) for single measurements. The ICC(2,1) was selected to assess absolute agreement between measurements, ensuring that the results are generalizable to any trained operator in a clinical setting. The Standard Error of Measurement (SEM) was also calculated as the square root of the residual mean square error from the ANOVA model to quantify the precision of an individual measurement. Finally, a Bland–Altman analysis was performed to visualize inter-operator reproducibility, enabling assessment of systematic bias and calculation of 95% limits of agreement across users.
3. Results
3.1. Study Subjects
Forty-nine (49) subjects consented, enrolled, and were scanned, with eight (8) excluded due to the exclusion criteria shown in Table 1. A preliminary U/S scan determined that three subjects had subcutaneous fat thickness of less than 6 mm. Due to anatomical variation, the liver could not be identified in three excluded subjects who lacked the required right intercostal ultrasound access. Two subjects were excluded due to the presence of perihepatic fat at the measurement site, which was later confirmed by MRI in each case. Finally, one additional subject could not be analyzed due to MRI-PDFF motion artifacts, preventing accurate measurement of LFF. No subject reported a prior diagnosis of MASH or Fibrosis.
The study population is described in Table 2 below. The study population had a higher mean BMI than the US general population, as only subjects with suspected elevated liver fat were enrolled. The mean BMI of the subjects was 34 kg/m^2^ in females and 32.2 kg/m^2^ in males, compared with the national average of 29.6 kg/m^2^ [72]. Two-thirds of the subjects are obese (BMI > 30). Waist circumference was also larger than the national average and exceeded the threshold for abdominal obesity. For females, the mean waist circumference was 113 cm, substantially exceeding the national average of 95.5 cm and the cutoff for abdominal obesity of 88 cm. For males, the mean waist circumference was 112 cm or higher, exceeding the national average of 100.5 cm and the cutoff for abdominal obesity of 102 cm [72].
Due to enrollment criteria, the incidence of MASLD in this study was higher than that of the United States population [15]. The distribution of liver steatosis grades is shown in Figure 3A. This more uniform distribution of MASLD grades enabled adequate estimation of TA performance across all MASLD grades. Similarly, the distributions of body habitus (subcutaneous fat thickness and liver capsule depth) shown in Figure 3B and Figure 3C, respectively, indicate a greater tendency toward a larger body habitus.
3.2. TAFF Performance
The mean estimated TAFF value and the paired MRI-PDFF measurement were used to generate a Bland–Altman plot to visualize the distribution of error across MASLD grades. The mean TAFF measurement error for the 40 subjects who met the inclusion criteria was 1.95% LFF. This 1.95% error was subtracted from each subject’s TAFF value to correct for bias. This systematic offset is attributed to the transition from theoretical models and idealized simulations to actual in vivo human data. To further characterize measurement variability and investigate potential dependencies, a separate Repeatability and Reproducibility (R&R) study was conducted; details of which are provided in the following section. Figure 4A shows that across a wide range of LFF (1–32%), and for all steatosis grades (colored differently), the average absolute error is 3.04%. TAFF measurement error was higher in subjects with low MRI-PDFF values than in those with higher LFF. The limits of agreement (LOA) were ±7.21% LFF. Figure 4B illustrates the distribution of absolute error across LFF. It represents the percentage of subjects whose measurement error falls below a given threshold. Percentiles were calculated via linear interpolation of the empirical cumulative distribution function (folded CDF). From the plot, we observe that 49.1% of subjects had a TAFF estimate within 2.5% of their MRI-PDFF value, and 89.1% within 5.0%. These plots illustrate the accuracy and consistency of the TAFF estimates of LFF across the study population compared with MRI-PDFF measurements.
The within-subject variability of TAFF was assessed, yielding an average standard deviation of 2.66% (variance of 7.085) LFF for repeated measurements with TA for subjects with more than a single valid scan. Deming regression, accounting for TAFF’s higher variability relative to MRI-PDFF (standard variation of 1.667% or variance of 2.778, yielding a TAFF to MRI-PDFF variance ratio of 2.55), was used to evaluate agreement between TAFF and MRI-PDFF estimates of LFF, as shown in Figure 4B. The Deming regression yielded a slope of 0.86 (95% confidence interval: 0.71 to 1.02) and an intercept of 1.31% LFF (95% confidence interval: −0.30 to 2.93), which includes the identity line, indicating good agreement between TAEUS and MRI-PDFF. In addition, the Pearson correlation coefficient was determined to be 0.89, further demonstrating a strong correlation between TAFF and MRI-PDFF measurements of LFF. The within-subject variability of repeated TAFF estimates was higher at low LFF than at higher LFF.
The classification accuracy of TAFF across different LFF thresholds is summarized in Table 3 below. For each value, the 95% confidence interval was also calculated and presented in parentheses. Consistent with the earlier observation that signal variability is higher at low fat fractions, classification performance is limited at the lowest LFF threshold (5%). As LFF increases, both sensitivity and specificity improve, resulting in higher overall accuracy and a larger AUROC. The positive predictive values are good across the board, and the negative predictive values increase markedly at moderate fat fractions. Furthermore, as illustrated in Figure 5, the ROC curves show a marked improvement at higher fat fraction thresholds, accompanied by corresponding increases in AUROC, reflecting enhanced classification performance of TAFF for moderate-to-severe liver fat.
3.3. Relationship Between TAFF and Demographic Information
Differences between TAFF and MRI were analyzed with respect to demographics. We found that the average difference in males (n = 15) was 1.6% LFF lower than in females (n = 25), yielding a p-value of 0.26. This finding suggests that there is no statistically significant difference between the genders. Additionally, a statistical analysis was conducted to investigate the relationship between the various metadata collected on subjects and the estimation differences as reported in Table 4 below.
This analysis found low correlations and no statistically significant relationships between metadata variables and TA estimation errors. Figure 6 presents the distribution of TAFF estimation errors, categorized by subjects’ demographic and anthropometric characteristics. These box plots illustrate the median, interquartile range, and outliers for each subgroup (gender, height, weight, BMI, and body circumference metrics). Consistent with the low correlation coefficients (r < 0.16) and non-significant p-values (p > 0.25) previously noted, the plots show no systematic trends or distinct separation between categories. While specific subgroups, such as the 90 to 110 kg weight class or the 30 to 35 BMI range, exhibit greater variability, the substantial overlap in error distributions across those bins reinforces the conclusion that these metadata variables are unreliable predictors of TAFF estimation performance.
3.4. Measurement Reliability
The reliability of the TAFF estimation was evaluated by a total of 56 measurements obtained from 14 subjects. The two-way random-effects ANOVA model yielded an ICC(2,1) of 0.89 (95% CI: 0.74–0.93), indicating good reliability for clinical liver fat quantification.
The system’s precision was characterized by a SEM of 3.31%, similar in magnitude to the 3.04% average absolute error observed in the primary feasibility cohort, suggesting comparable overall measurement variability across datasets.
Inter-operator reproducibility was further visualized via Bland–Altman analysis (Figure 7). The analysis revealed high agreement between operators, with a mean difference of only 0.36% and 95% limits of agreement ranging from −5.82% to 5.09%. The mean absolute error between operators was 2.28%. These results demonstrate that the TAEUS system provides stable and consistent estimates across different users and independent scan setups.
4. Discussion
This single-site feasibility study is the first of its kind to utilize thermoacoustic principles to estimate fat in human subjects. The TA liver system accurately estimates liver fat across a wide range of body habitus, including mild to severe disease. The portable TA liver system provides liver fat estimates that are consistent with MRI-PDFF, with costs and mobility similar to those of cart-based ultrasound. Thus, it indicates the potential of such technology as a valuable tool to aid in the diagnosis of MASLD, to inform therapeutic decisions, and to monitor disease progression.
Bland–Altman plots across all disease grades show the limits of agreement to be ±7.21% as compared to the reference standard MRI-PDFF. The cumulative error distribution (mountain plot) indicates that half (49.1%) of subjects had a TAFF estimate within 2.5% of their MRI-PDFF value, and the vast majority (89.1%) within 5.0%. Results demonstrate narrower limits of agreement than those reported for CAP, which exhibits a 95% limit of agreement for ±12% in fat fraction after converting its native measurements to match the MRI [73]. More importantly, the TA estimates of LFF were strongly correlated with MRI-PDFF scores (r = 0.89). In contrast, quantitative ultrasound–based estimates of LFF have shown substantial variability, with reported correlations with MRI-PDFF ranging from 0.49 to 0.89 at baseline [74,75,76,77], and more limited correlation with longitudinal changes in MRI-PDFF (r ≈ 0.52) [74,75]. In contrast to these quantitative ultrasound approaches, TAFF estimates in the present study exhibit approximately constant bias and do not demonstrate evidence of proportional bias, as indicated by the Bland–Altman analysis.
The 1.95% systematic bias identified via Bland–Altman analysis reflects the transition from idealized theoretical models to actual in vivo human data. While the specific contributions of device, operator, or cohort factors remain to be characterized, data from the separate Repeatability and Reproducibility (R&R) study provide evidence of robust measurement stability. The analysis yielded an ICC(2,1) of 0.89 (95% CI: 0.74–0.93) and a SEM of 3.31%. Notably, this SEM closely aligns with the 3.04% average absolute error observed in the primary cohort, suggesting that measurement precision is a fundamental characteristic of the current system configuration rather than a result of operator technique. In future multi-center studies, these data-driven refinements will be integrated into automated calibration protocols to ensure consistent performance across diverse populations without the need for manual bias correction.
In predicting the presence of moderate to severe steatosis (S2-S3) at the 12% LFF threshold, TAFF scores achieve 100% specificity and PPV of 100%, with an AUROC of 0.92 in the available cohort. Although derived from a limited sample, these results indicate exceptional accuracy in identifying subjects with moderate-to-severe SLD, a critical clinical threshold for urgent and aggressive management of MASLD and MASH. Although this study did not include a direct ‘head-to-head’ comparison with QUS, literature shows, through a large meta-analysis of widely used point-of-care quantitative U/S methods for assessing moderate steatosis, reports AUROCs of approximately 0.82 and lower specificities ranging from 74% to 79% [78,79]. While these existing tools have a role in screening, their relatively low specificity often necessitates confirmatory testing, in contrast with TA LFF estimation. The data gathered in this study suggests that the TAFF has the potential to offer high diagnostic accuracy for quantifying moderate liver fat (AUROC > 0.90) as it has a similar contrast mechanism as MRI-PDFF. Thus, it might overcome some of the limitations of current quantitative U/S and provide superior performance.
Performance in the low liver fat range is inherently more variable because the diminished contrast between lean liver and muscle results in a lower contrast-to-noise ratio. In this feasibility assessment, the system achieved an accuracy of 77.5% and an AUROC of 0.83 at an 8% MRI-PDFF threshold. While these results indicate that detection in the ‘lean’ population remains challenging with the current system configuration, the negative predictive value (NPV) of 83.3% at this threshold provides preliminary evidence of the technology’s potential to rule out significant steatosis. These findings help define the system’s current operating envelope, demonstrating that while the methodology shows promise for identifying moderate-to-severe steatosis, additional refinements are necessary to improve precision and reliability for early-stage MASLD assessment.
Demographic analysis revealed no significant difference in errors between male and female subjects. In addition, the TAFF scores demonstrated low correlation (r < 0.16) between estimation errors and subjects’ age or body habitus parameters (e.g., height, BMI, waist circumference). While a larger body habitus and increased liver capsule depth (as shown in Figure 3) can theoretically reduce the signal-to-noise ratio by attenuating RF energy, these factors did not significantly impact estimation accuracy in this cohort. This implies that the method is applicable across various body types, age groups, and sexes. For QUS, reported clinical findings indicate a significant dependence of LFF measurement accuracy on BMI [77,80], which was not demonstrated by the TAEUS liver system in this study. UGAP, for example, demonstrates markedly lower accuracy for diagnosing severe steatosis (S3) in subjects with BMI > 30 kg/m^2^ [76]. These BMI-associated limitations arise from increased skin-to-capsule distance that degrades acoustic signal quality [12], depth-dependent variability in measurement thresholds, and anthropometric factors that independently affect whether CAP measurements achieve acceptable interquartile ranges [81]. Additional BMI-related tissue characteristics further impair liver fat quantification for these techniques: Karlas et al. [82] described “muscular viscoelastic adaptations” in overweight individuals that alter acoustic propagation, and Maar et al. showed that hepatic imaging quality decreases with increasing BMI. However, high-performance probes can partially mitigate this effect [83]. Consistent with these findings, Cusi et al. [84] reported that ultrasound exhibits low sensitivity for detecting hepatic steatosis in obese and severely obese patients. Depth dependence compounds these challenges, as a retrospective study of acoustic attenuation coefficient algorithms from Canon, Philips, and Siemens demonstrated a progressive decline in AC values with increasing measurement depth, as well as significantly higher values with a 1 cm ROI compared with a 3 cm ROI [85]. In addition, some quantitative ultrasound methods showed a nonlinear relationship (i.e., proportional error) with MRI-PDFF [86,87]. Collectively, these BMI- and depth-related constraints highlight the need for standardized acquisition protocols and improved quantitative ultrasound approaches capable of producing accurate and reliable liver fat measurements across diverse body habitus. Unlike all other imaging techniques except perhaps MRI, thermoacoustics is strictly a fat-specific parameter that does not rely on non-fat-related tissue properties to produce contrast.
These overall findings support TA as a potentially accurate, non-invasive, point-of-care solution that could address the cost and access constraints of MRI and the limitations of current quantitative ultrasound methods. The high PPV and NPV allow for the accurate identification of individuals with severe MASLD and for separating healthy individuals and those with mild MASLD from individuals with moderate and severe MASLD. These trends indicate that TAFF is most effective at detecting moderate to severe steatosis, whereas classification at very low fat fractions remains more challenging due to inherently weaker signals. Specifically, at an LFF of 8%, the Resmetirom (THrβ) treatment threshold, the negative predictive value exceeds 83%, suggesting that TA may be valuable in guiding therapeutic decisions and monitoring disease progression.
The current feasibility study also highlights some limitations of thermoacoustic imaging and the first-generation TA system. In subjects, the current system is not suitable as an early screening tool since it can only resolve contrast differences in tissue conductivity. Thus, it suffers from a lack of signal at low LFF as the differences between lean liver tissue and (lean) muscle become minute, leading to a lower contrast-to-noise ratio and more variable measurements. As liver tissue becomes fattier, the contrast increases, and with it, the robustness of the liver fat estimation. This limitation of the first-generation thermoacoustic system can be addressed by incorporating a more sensitive transducer to improve signal resolution for low LFF. The current 16-element, 0.5 MHz phased-array configuration offers limited performance that can be improved. In addition, the improvement in contrast with increasing fat fraction, rather than extinguishing signal observed with pure ultrasonic methods [77], suggests that a combined quantitative ultrasound and TA device can deliver improved performance across all steatosis grades.
Another difficulty with this first-generation TA system is the need to rely on a separate U/S device to acquire tissue anatomy before a TA scan can proceed. Proper positioning of the wireless ultrasound probe and the larger TA probe contributed substantially to estimation inaccuracies, lengthened the session time, and imposed additional strain on the user. This observation suggests that a future device combining both ultrasound and TA capabilities would be much easier to use and would achieve greater accuracy in fat fraction estimation. It is unclear if fibrosis has a non-negligible effect on TA measurements. The TA signal amplitude is proportional to the tissue conductivity, which decreases with liver fibrosis [88,89,90], but also to the tissue bulk modulus (through the Grüneisen parameter), which increases with fibrosis [91,92]. In addition, liver fibrosis and other potential confounders should be thoroughly investigated in future studies to ensure TA liver fat estimation is applicable to a wide range of patients. Fibrosis can affect permittivity, perturbing the amount of electromagnetic power that reaches and is deposited in the liver.
It is also worth noting that while the results reported in this study are quite promising, they are based on a preliminary study with a single site and a limited number of subjects (N = 40). Moreover, the current study excluded subjects with liver fibrosis to minimize this confounder, relying on self-reported medical history. It is worth noting that many patients with steatosis may be unaware of their fibrosis status, which may contribute to inaccuracies in the results. Future studies should include direct measurement of fibrosis to address that. Although the current work successfully established the system’s initial reliability through a dedicated R&R study, demonstrating good agreement (ICC = 0.89) and a negligible mean inter-operator difference of 0.36%, a larger-scale, multi-site, multi-user follow-up study remains essential to further validate these findings across broader clinical environments. In addition, direct comparison with QUS by a qualified sonographer would allow better benchmarking of the different techniques, thus outlining the benefits and drawbacks of each method. Thus, it is expected that in the coming years, more studies will be conducted using TA, opening new avenues for accurate, low-cost, portable liver fat estimation and monitoring.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Koo D.J. Lee W.Y. The crosstalk between insulin resistance and nonalcoholic fatty liver disease/metabolic dysfunction-associated fatty liver disease: A culprit or a consequence?Cardiovasc. Prev. Pharmacother.2022413214110.36011/cpp.2022.4.e 17 · doi ↗
- 2VeskovićM. ŠutulovićN. HrnčićD. StanojlovićO. Macut D. MladenovićD. The interconnection between hepatic insulin resistance and metabolic dysfunction-associated steatotic liver disease-the transition from an adipocentric to liver-centric approach Curr. Issues Mol. Biol.2023459084910210.3390/cimb 4511057037998747 PMC 10670061 · doi ↗ · pubmed ↗
- 3Diehl A.M. Tumor necrosis factor and its potential role in insulin resistance and nonalcoholic fatty liver disease Clin. Liver Dis.2004861963810.1016/j.cld.2004.04.01215331067 · doi ↗ · pubmed ↗
- 4Meex R.C.R. Watt M.J. Hepatokines: Linking nonalcoholic fatty liver disease and insulin resistance Nat. Rev. Endocrinol.20171350952010.1038/nrendo.2017.5628621339 · doi ↗ · pubmed ↗
- 5Stefan N. Schick F. Birkenfeld A.L. Häring H.U. White M.F. The role of hepatokines in NAFLD Cell Metab.20233523625210.1016/j.cmet.2023.01.00636754018 PMC 10157895 · doi ↗ · pubmed ↗
- 6Zhao Y.C. Zhao G.J. Chen Z. She Z.G. Cai J. Li H. Nonalcoholic fatty liver disease Hypertension 20207527528410.1161/HYPERTENSIONAHA.119.1341931865799 · doi ↗ · pubmed ↗
- 7Miao X. Alidadipour A. Saed V. Sayyadi F. Jadidi Y. Davoudi M. Amraee F. Jadidi N. Afrisham R. Hepatokines: Unveiling the molecular and cellular mechanisms connecting hepatic tissue to insulin resistance and inflammation Acta Diabetol.2024611339136110.1007/s 00592-024-02335-939031190 · doi ↗ · pubmed ↗
- 8Alam S. Mustafa G. Alam M. Ahmad N. Insulin resistance in development and progression of nonalcoholic fatty liver disease World J. Gastrointest. Pathophysiol.2016721121710.4291/wjgp.v 7.i 2.21127190693 PMC 4867400 · doi ↗ · pubmed ↗
