# Rapid and reagent-free screening of occult hepatitis B virus infection based on plasma Vis-NIR spectral pattern recognition

**Authors:** Linbin Huang, Xiaoting Huang, Jingjing Xia, Lining Huang, Huanjie Zhou, Min Chen, Baoren He, Meijun Chen, Qiuhong Mo, Tao Pan, Chao Ou

PMC · DOI: 10.3389/fbioe.2026.1774782 · Frontiers in Bioengineering and Biotechnology · 2026-02-23

## TL;DR

This paper introduces a fast, reagent-free method that uses visible-near-infrared (Vis-NIR) spectroscopy of plasma to screen blood samples for occult hepatitis B virus infection.

## Contribution

The study proposes a reagent-free, rapid Vis-NIR spectral pattern recognition method for accurate occult hepatitis B detection.

## Key findings

- On the independent external validation set, a Vis-NIR spectral model achieved 98.7% total accuracy in distinguishing OBI from normal plasma samples.
- The few-wavelength kNN model, using 5 optimal wavelengths, reached 100% specificity and 96.6% sensitivity.
- The method is suited to large-scale, low-cost OBI screening, and the few-wavelength model could inform the design of small dedicated blood analyzers.

## Abstract

Occult hepatitis B virus infection (OBI) is a specific form of hepatitis B virus (HBV) infection in which hepatitis B surface antigen (HBsAg) tests negative while HBV DNA is present in the blood. Because HBV DNA testing is complex and costly, it is rarely included in routine physical examinations, which leads to underdiagnosis of OBI. In this study, pattern recognition on plasma visible-near-infrared (Vis-NIR) spectra was used to develop discriminant analysis models that distinguish OBI plasma from healthy (normal control) plasma.

A total of 444 plasma samples from voluntary blood donors (204 OBI, 240 normal controls) were collected, and their Vis-NIR spectra were measured. The samples were divided into training, prediction, and independent external validation sets. Partial least squares-discriminant analysis (PLS-DA) and k-nearest neighbor (kNN) were used as spectral classifiers; standard normal variate (SNV) and Norris derivative filtering (NDF) were applied for spectral preprocessing. An integrated algorithm combining separation degree priority combination (SDPC) with wavelength step-by-step phase-out (WSP) was used for optimal wavelength selection.
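The SNV and kNN steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy spectra, the labels, and `k=3` are invented for illustration, and Norris derivative filtering and SDPC-WSP wavelength selection are left out because the abstract gives no parameters for them.

```python
import math
from collections import Counter

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum
    by its own mean and sample standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training spectra closest to the query
    (Euclidean distance)."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy absorbance values at 4 hypothetical wavelengths (not data from the paper).
raw_train = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [5, 3, 2, 1]]
train_y = ["OBI", "OBI", "normal", "normal"]
train_X = [snv(s) for s in raw_train]

# A query that is a scaled copy of the first training spectrum:
# SNV removes the multiplicative difference before classification.
query = snv([2, 4, 6, 8])
prediction = knn_predict(train_X, train_y, query, k=3)
```

Because SNV normalizes each spectrum independently, it needs no fitting on the training set, so it can be applied to new samples without leaking information across the training, prediction, and validation sets.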

Plasma spectral discriminant models for OBI and normal controls were successfully established. Based on the optimal SNV-NDF preprocessed spectra, the SDPC-WSP-kNN and SDPC-WSP-PLS-DA methods determined the optimal number of wavelengths N to be 5 and 26, respectively. When evaluated on the independent external validation set, the SDPC-WSP-kNN model demonstrated better robustness, achieving sensitivity, specificity, and total recognition accuracy rates of 96.6%, 100%, and 98.7%, respectively. After a grey judgment zone was introduced, both sensitivity and specificity reached 100%, with a detection recovery rate of 96.8%.
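As a rough illustration of how these figures relate, the sketch below computes sensitivity, specificity, and accuracy, and shows how a grey zone trades coverage for correctness. Several parts are assumptions rather than the paper's definitions: the score is taken to be a value in [0, 1] (for example, the fraction of OBI neighbors), the thresholds `low` and `high` are hypothetical, and "recovery" is assumed to mean the share of samples that receive a definite call. The toy scores are invented.

```python
def classify_with_grey_zone(score, low=0.4, high=0.6):
    """Return a definite label, or None when the score falls in the grey zone.
    `low` and `high` are hypothetical thresholds, not values from the paper."""
    if score >= high:
        return "OBI"
    if score <= low:
        return "normal"
    return None  # undetermined: refer the sample for confirmatory testing

def screening_metrics(y_true, y_pred, positive="OBI"):
    """Metrics over samples with a definite call; assumes at least one
    determined sample in each class (otherwise a division by zero occurs)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),       # true positives among actual OBI
        "specificity": tn / (tn + fp),       # true negatives among actual normals
        "accuracy": (tp + tn) / len(pairs),  # correct calls among determined samples
        "recovery": len(pairs) / len(y_true),  # assumed: share with a definite call
    }

# Toy example (not data from the paper): one borderline OBI sample is withheld.
y_true = ["OBI", "OBI", "OBI", "normal", "normal"]
scores = [0.9, 0.8, 0.5, 0.1, 0.2]
y_pred = [classify_with_grey_zone(s) for s in scores]
metrics = screening_metrics(y_true, y_pred)
```

In this toy case the borderline sample is excluded rather than misclassified, so sensitivity and specificity are both 1.0 while recovery drops to 0.8, which mirrors the trade-off the reported grey-zone result describes.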

These results indicate that Vis-NIR spectroscopy pattern recognition can accurately discriminate between plasma samples from OBI cases and normal controls. The method is reagent-free, rapid, and simple, making it suitable for large-scale, low-cost screening of OBI. In particular, the proposed few-wavelength model can serve as a reference for the development of small, specialized blood analyzers for OBI detection.

## Linked entities

- **Diseases:** hepatitis B virus infection (MONDO:0005344)

## Full-text entities

- **Genes:** UGT1A (UDP glucuronosyltransferase family 1 member A complex locus) [NCBI Gene 7361] {aka GNT1, UGT, UGT1, UGT1A@}, ADH1A (alcohol dehydrogenase 1A (class I), alpha polypeptide) [NCBI Gene 124] {aka ADH1}, AKR1A1 (aldo-keto reductase family 1 member A1) [NCBI Gene 10327] {aka ALDR1, ALR, ARM, DD3, HEL-S-6}
- **Diseases:** death (MESH:D003643), breast cancer (MESH:D001943), hepatitis (MESH:D056486), ovarian cancer (MESH:D010051), HCC (MESH:D006528), infectious disease (MESH:D003141), trauma (MESH:D014947), cirrhosis (MESH:D005355), NDF (MESH:C537849), TTIs (MESH:D065227), HBV infections (MESH:D006509), beta-thalassemia (MESH:D017086)
- **Chemicals:** C2H5NO2 (-), Si (MESH:D012825), bile acid (MESH:D001647), amino acids (MESH:D000596), C3H7NO3 (MESH:D012694), fatty acid (MESH:D005227), Arginine (MESH:D001120), lipid (MESH:D008055), hydrogen (MESH:D006859), Proline (MESH:D011392), Alanine (MESH:D000409), C6H13NO2 (MESH:D007930), retinol (MESH:D014801), Glycine (MESH:D005998)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12968168/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12968168/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12968168/full.md

---
Source: https://tomesphere.com/paper/PMC12968168