Variable selection in a specific regression time series of counts
Marina Gomtsyan

TL;DR
This paper introduces a new variable selection method for overdispersed count time series using sparse negative binomial GLARMA models, improving variable recovery and computational efficiency.
Contribution
It develops a three-step estimation procedure for sparse negative binomial GLARMA models, integrating ARMA coefficient estimation and variable selection, implemented in the NBtsVarSel package.
Findings
Outperforms existing methods in variable selection accuracy
Effective on synthetic and RNA sequencing data
Computationally efficient for large datasets
Abstract
Time series of counts occurring in various applications are often overdispersed, meaning their variance is much larger than the mean. This paper proposes a novel variable selection approach for processing such data. Our approach consists in modelling them using sparse negative binomial GLARMA models. It combines estimating the autoregressive moving average (ARMA) coefficients of GLARMA models and the overdispersion parameter with performing variable selection in regression coefficients of Generalized Linear Models (GLM) with regularised methods. We describe our three-step estimation procedure, which is implemented in the NBtsVarSel package. We evaluate the performance of the approach on synthetic data and compare it to other methods. Additionally, we apply our approach to RNA sequencing data. Our approach is computationally efficient and outperforms other methods in selecting variables,…
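To make the notion of overdispersion in the abstract concrete, here is a minimal sketch that simulates negative binomial counts and compares their sample variance with their sample mean. It does not reproduce the paper's GLARMA model or the NBtsVarSel package. The parameterisation (mean `mu`, overdispersion parameter `alpha`, variance `mu + mu**2 / alpha`), the helper names, and the numeric values are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(mu, alpha, rng):
    """Draw one count with mean mu and variance mu + mu**2 / alpha.

    Uses the gamma-Poisson mixture: the Poisson rate is itself gamma
    distributed with shape alpha and scale mu / alpha.
    """
    rate = rng.gammavariate(alpha, mu / alpha)
    return sample_poisson(rate, rng)


rng = random.Random(0)
mu, alpha = 5.0, 2.0  # illustrative values, not taken from the paper
counts = [sample_negative_binomial(mu, alpha, rng) for _ in range(20000)]

sample_mean = statistics.fmean(counts)
sample_var = statistics.variance(counts)
theoretical_var = mu + mu**2 / alpha

print(f"mean={sample_mean:.2f}  variance={sample_var:.2f}  "
      f"theoretical variance={theoretical_var:.2f}")
```

A Poisson model would force the variance to equal the mean. In this parameterisation, smaller values of `alpha` give stronger overdispersion, and the Poisson case is recovered as `alpha` grows large.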
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Metabolomics and Mass Spectrometry Studies · Bayesian Methods and Mixture Models
