Variable selection in a specific regression time series of counts

Marina Gomtsyan

arXiv:2307.00929·stat.ME·July 4, 2023

TL;DR

This paper introduces a new variable selection method for overdispersed count time series using sparse negative binomial GLARMA models, improving variable recovery and computational efficiency.

Contribution

It develops a three-step estimation procedure for sparse negative binomial GLARMA models, integrating ARMA coefficient estimation and variable selection, implemented in the NBtsVarSel package.

Findings

01. Outperforms existing methods in variable selection accuracy
02. Effective on synthetic and RNA sequencing data
03. Computationally efficient for large datasets

Abstract

Time series of counts occurring in various applications are often overdispersed, meaning their variance is much larger than the mean. This paper proposes a novel variable selection approach for processing such data. Our approach consists in modelling them using sparse negative binomial GLARMA models. It combines estimating the autoregressive moving average (ARMA) coefficients of GLARMA models and the overdispersion parameter with performing variable selection in regression coefficients of Generalized Linear Models (GLM) with regularised methods. We describe our three-step estimation procedure, which is implemented in the NBtsVarSel package. We evaluate the performance of the approach on synthetic data and compare it to other methods. Additionally, we apply our approach to RNA sequencing data. Our approach is computationally efficient and outperforms other methods in selecting variables,…
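To make the notion of overdispersion concrete, here is a minimal Python sketch. It is not taken from the paper or from the NBtsVarSel package. It assumes the common mean/overdispersion parameterisation of the negative binomial, in which a count with mean `mu` and overdispersion parameter `alpha` has variance `mu + mu**2 / alpha`. The function names are hypothetical, and the sampler uses the standard gamma-Poisson mixture representation with only the standard library.

```python
import math
import random
from statistics import mean, pvariance


def nb_variance(mu, alpha):
    """Variance of a negative binomial count with mean mu and
    overdispersion alpha (assumed parameterisation: mu + mu^2 / alpha)."""
    return mu + mu ** 2 / alpha


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method
    (adequate for the moderate rates used in this illustration)."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_negative_binomial(mu, alpha, rng):
    """Draw one negative binomial count as a gamma-Poisson mixture:
    the Poisson rate is Gamma(shape=alpha, scale=mu/alpha), so its mean is mu."""
    lam = rng.gammavariate(alpha, mu / alpha)
    return sample_poisson(lam, rng)


rng = random.Random(0)  # fixed seed so the illustration is reproducible
counts = [sample_negative_binomial(10.0, 2.0, rng) for _ in range(5000)]

# A Poisson model would force variance == mean; here the variance is far larger.
print("sample mean:        ", mean(counts))
print("sample variance:    ", pvariance(counts))
print("theoretical variance:", nb_variance(10.0, 2.0))
```

Under this assumed parameterisation, smaller values of `alpha` give stronger overdispersion, and the variance approaches the Poisson value `mu` as `alpha` grows.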

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Metabolomics and Mass Spectrometry Studies · Bayesian Methods and Mixture Models