# AWGE-ESPCA: An edge sparse PCA model based on adaptive noise elimination regularization and weighted gene network for Hermetia illucens genomic data analysis

**Authors:** Rui Miao, Hao-Yang Yu, Bing-Jie Zhong, Hong-Xia Sun, Qiang Xia

PMC · DOI: 10.1371/journal.pcbi.1012773 · PLOS Computational Biology · 2025-02-13

## TL;DR

This paper introduces AWGE-ESPCA, a new AI model for analyzing the Hermetia illucens genome under copper stress, improving gene and pathway selection.

## Contribution

AWGE-ESPCA is, to the best of the authors' knowledge, the first AI feature selection model designed specifically for Hermetia illucens genomic data. It combines adaptive noise elimination regularization with pathway-based gene weighting.

## Key findings

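- Across five independent experiments, AWGE-ESPCA outperforms four recent Sparse PCA models and representative supervised and unsupervised baselines in selecting target genes and key pathways in Hermetia illucens.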
- The model's adaptive noise elimination and weighted gene network improve biological interpretability.
- The model is shown to be extendable to other insect genome analysis tasks.

## Abstract

Hermetia illucens is an important insect resource. Studies have shown that exploring the effects of Cu2+ stress on the growth and development of Hermetia illucens at the genome level is of significant scientific importance. Current studies of Hermetia illucens genomic data analysis face three major challenges. First, the lack of available genomic data limits researchers. Second, to the best of our knowledge, no Artificial Intelligence (AI) feature selection model has been designed specifically for the Hermetia illucens genome, and noise is a more serious problem in Hermetia illucens data than in human genomic data. Third, it is unclear how to select genes located in pathway enrichment regions: existing models assume that every gene probe has the same prior weight, whereas researchers usually pay more attention to gene probes in pathway enrichment regions. To address these challenges, we first conduct experiments and establish a new genome dataset of Hermetia illucens growth under Cu2+ stress. We then propose AWGE-ESPCA, an edge Sparse PCA model based on adaptive noise elimination regularization and a weighted gene network. AWGE-ESPCA introduces an adaptive noise elimination regularization method that addresses the noise challenge in Hermetia illucens genomic data. We also integrate known gene-pathway quantitative information into the Sparse PCA (SPCA) framework as prior knowledge, which allows the model to select gene probes in pathway-rich regions as far as possible. Finally, we conduct five independent experiments and compare against four recent Sparse PCA models as well as representative supervised and unsupervised baseline models to validate model performance. The experimental results demonstrate the superior pathway and gene selection capabilities of the AWGE-ESPCA model. Ablation experiments validate the role of the adaptive regularizer and network weighting module. In summary, this paper presents an innovative unsupervised model for Hermetia illucens genome analysis that can help researchers identify potential biomarkers. Working AWGE-ESPCA model code is available at https://github.com/yhyresearcher/AWGE_ESPCA.

Hermetia illucens is an insect of high economic value that is widely used in feed production. Existing research suggests that Cu2+ stress can significantly affect the growth of Hermetia illucens. Identifying the genetic targets that affect its growth and development is therefore crucial for food safety. However, because of the lack of high-quality datasets, high data noise, and small sample sizes, none of the existing genomic analysis models handles Hermetia illucens data well. To address these problems, this paper proposes a novel unsupervised Hermetia illucens genomic analysis model (AWGE-ESPCA). The model introduces adaptive noise elimination regularization to address noise in the data and uses a weighted gene network to enhance biological interpretability. The experimental results show that AWGE-ESPCA selects potential target genes and key pathways well. In addition, we demonstrate that the model can be extended to other insect genome analysis tasks.
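
To make the idea of injecting prior gene weights into Sparse PCA concrete, here is a minimal sketch. It is not the AWGE-ESPCA algorithm: it uses a generic truncated power iteration and a simple per-gene weight vector, whereas the paper works with an edge-based gene network and an adaptive noise elimination regularizer that this summary does not specify. The function name `weighted_sparse_pc`, the `weights` vector, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def weighted_sparse_pc(X, weights, k, n_iter=100, tol=1e-8):
    """Leading sparse loading vector via truncated power iteration.

    X       : (n_samples, n_genes) expression matrix
    weights : (n_genes,) non-negative prior weights (hypothetical stand-in
              for pathway information; larger = more likely to be kept)
    k       : number of genes allowed a non-zero loading
    """
    Xc = X - X.mean(axis=0)                      # center each gene
    # Start from the dense leading principal direction.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    v = vt[0]
    for _ in range(n_iter):
        z = Xc.T @ (Xc @ v)                      # power step on X^T X
        score = weights * np.abs(z)              # prior-weighted ranking
        keep = np.argsort(score)[-k:]            # indices of the k best genes
        v_new = np.zeros_like(z)
        v_new[keep] = z[keep]                    # truncate everything else
        norm = np.linalg.norm(v_new)
        if norm == 0.0:
            break
        v_new /= norm                            # back to unit length
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return v

# Synthetic illustration: 20 samples, 50 genes, genes 0-4 share one signal.
rng = np.random.default_rng(0)
latent = rng.standard_normal(20)
X = rng.standard_normal((20, 50))
X[:, :5] += 4.0 * latent[:, None]

weights = np.ones(50)
weights[:5] = 2.0                                # pretend these lie in an enriched pathway

v = weighted_sparse_pc(X, weights, k=5)
selected = sorted(np.flatnonzero(v).tolist())
print(selected)
```

The weights only influence which genes survive truncation, not the size of their loadings, so a gene with weak signal is not inflated merely because it has a large prior weight. Whether the published model makes the same choice is not stated in this summary.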

## Linked entities

- **Chemicals:** Cu2+ (PubChem CID 27099)
- **Species:** Hermetia illucens (taxon 343691)

## Full-text entities

- **Chemicals:** Cu2+ (-)
- **Species:** Hermetia illucens (black soldier fly, species) [taxon 343691], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11825005/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11825005/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/PMC11825005/full.md

---
Source: https://tomesphere.com/paper/PMC11825005