# The variable selection of two-part regression model for semicontinuous data

**Authors:** Yahui Lu, Aiyi Liu, Tao Jiang, Flavio Ziegelmann

PMC · DOI: 10.1371/journal.pone.0322937 · PLOS One · 2025-06-03

## TL;DR

This paper introduces Lasso and adaptive Lasso variable selection methods for a two-part regression model of semicontinuous data, that is, measurement data with a large share of zeros.

## Contribution

The paper proposes novel Lasso and adaptive Lasso-based variable selection methods for the Bernoulli-Normal two-part regression model.
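As a reading aid, one common way to write a Bernoulli-Normal two-part model with an L1 penalty is sketched below. The notation ($\pi_i$, $\alpha$, $\beta$, $\sigma^2$, $\lambda_1$, $\lambda_2$, $w_j$, $v_k$, $\gamma$) is an assumption made for illustration and is not taken from the paper, which may use a different link function, a transformation of the nonzero values, or a different penalty parameterization.

$$
\Pr(Y_i \neq 0 \mid x_i) = \pi_i, \qquad \operatorname{logit}(\pi_i) = x_i^\top \alpha, \qquad Y_i \mid (Y_i \neq 0,\, x_i) \sim N\!\left(x_i^\top \beta,\ \sigma^2\right)
$$

$$
(\hat\alpha, \hat\beta) = \arg\min_{\alpha,\beta}\ \Big\{ -\ell(\alpha, \beta, \sigma^2) + \lambda_1 \sum_{j} w_j\,|\alpha_j| + \lambda_2 \sum_{k} v_k\,|\beta_k| \Big\}
$$

Under this sketch, the Lasso corresponds to $w_j = v_k = 1$, while the adaptive Lasso uses data-driven weights such as $w_j = 1/|\tilde\alpha_j|^{\gamma}$ and $v_k = 1/|\tilde\beta_k|^{\gamma}$, where $\tilde\alpha$ and $\tilde\beta$ are initial (for example, unpenalized) estimates and $\gamma > 0$. Because the log-likelihood $\ell$ separates into a Bernoulli term in $\alpha$ and a Normal term in $(\beta, \sigma^2)$, the two parts can be penalized and fitted separately.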

## Key findings

- In simulations, both the Lasso and adaptive Lasso methods select variables for the Bernoulli-Normal two-part (BNT) regression model and are easy to implement.
- In the same simulations, adaptive Lasso outperforms standard Lasso in variable selection.
- The methods were successfully applied to dietary intake data to identify important factors.

## Abstract

In many research fields, measurement data containing a large proportion of zeros are often called semicontinuous data. The most common method for semicontinuous data is the two-part model, which establishes a regression model for the zero-valued part and another for the nonzero-valued part. Because each part of the two-part regression model often faces a large number of candidate variables, variable selection becomes an important problem in semicontinuous data analysis. However, there is little research literature on this topic. To bridge this gap, we propose new variable selection methods for the two-part regression model. In this paper, the Bernoulli-Normal two-part (BNT) regression model is presented, and a variable selection method based on the Lasso penalty function is proposed. Because the Lasso estimator does not have the oracle property, we then propose a variable selection method based on the adaptive Lasso penalty function. The simulation results show that both methods can select variables for the BNT regression model and are easy to implement, and that the adaptive Lasso method performs better than the Lasso method. We demonstrate the effectiveness of the proposed tools on dietary intake data, identifying the important factors that affect the dietary intake of patients.
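The following is a minimal sketch of the general idea described in the abstract, not the authors' implementation. It assumes a logistic model for zero versus nonzero and a least-squares (Normal) model for the nonzero values, fits each part with a plain proximal-gradient L1 solver, and builds adaptive Lasso weights from an unpenalized pilot fit. All function names (`soft_threshold`, `fit_l1`, `fit_two_part`, `simulate`), the simulated coefficients, and the tuning values (`lam`, `gamma`, `iters`) are hypothetical choices made for illustration.

```python
import math
import random

def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def fit_l1(X, y, lam, weights=None, logistic=False, iters=300):
    """Proximal gradient for mean loss + lam * sum_j w_j*|b_j| (intercept unpenalized)."""
    n, p = len(X), len(X[0])
    w = weights or [1.0] * p
    # Conservative step size from a trace bound on the curvature of the smooth loss.
    step = 1.0 / (1.0 + sum(v * v for x in X for v in x) / n)
    b0, b = 0.0, [0.0] * p
    for _ in range(iters):
        lin = [b0 + sum(bj * xj for bj, xj in zip(b, x)) for x in X]
        pred = [sigmoid(u) for u in lin] if logistic else lin
        res = [pi - yi for pi, yi in zip(pred, y)]
        grad = [sum(r * x[j] for r, x in zip(res, X)) / n for j in range(p)]
        b0 -= step * sum(res) / n
        b = [soft_threshold(bj - step * gj, step * lam * wj)
             for bj, gj, wj in zip(b, grad, w)]
    return b0, b

def fit_two_part(X, y, lam, adaptive=False, gamma=1.0):
    """Fit the zero/nonzero part and the nonzero-value part separately."""
    z = [1.0 if yi != 0 else 0.0 for yi in y]          # Bernoulli part response
    X_pos = [x for x, yi in zip(X, y) if yi != 0]      # rows with nonzero outcome
    y_pos = [yi for yi in y if yi != 0]                # Normal part response
    parts = {"binary": (X, z, True), "positive": (X_pos, y_pos, False)}
    out = {}
    for name, (Xa, ya, logistic) in parts.items():
        w = None
        if adaptive:
            # Pilot estimate with no penalty (assumption: any consistent estimate works).
            _, pilot = fit_l1(Xa, ya, 0.0, logistic=logistic)
            w = [1.0 / (abs(bj) ** gamma + 1e-8) for bj in pilot]
        out[name] = fit_l1(Xa, ya, lam, w, logistic)
    return out

def simulate(n=200, seed=0):
    """Toy data: x[0] drives zero vs nonzero, x[1] drives the nonzero value."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(4)]
        nonzero = rng.random() < sigmoid(0.5 + 2.0 * x[0])
        y.append(3.0 + 1.5 * x[1] + rng.gauss(0, 0.5) if nonzero else 0.0)
        X.append(x)
    return X, y

X, y = simulate()
lasso = fit_two_part(X, y, lam=0.05)
adaptive = fit_two_part(X, y, lam=0.05, adaptive=True)
for label, fit in (("lasso", lasso), ("adaptive lasso", adaptive)):
    for part, (b0, b) in fit.items():
        print(label, part, [round(v, 3) for v in b])
```

In this sketch the two parts are fitted independently because the two-part likelihood factorizes; in practice the penalty level for each part would be chosen separately, for example by cross-validation or an information criterion, and the covariates would typically be standardized first.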

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12132987/full.md

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12132987/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12132987/full.md

---
Source: https://tomesphere.com/paper/PMC12132987