Variable selection in sparse multivariate GLARMA models: application to germination control by environment

Bibliographic Details
Title: Variable selection in sparse multivariate GLARMA models: application to germination control by environment
Authors: Gomtsyan, M., Lévy-Leduc, Céline, Ouadah, Sarah, Sansonnet, Laure, Bailly, Christophe, Rajjou, Loïc
Contributors: Lévy-Leduc, Céline, Mathématiques et Informatique Appliquées (MIA Paris-Saclay), AgroParisTech-Université Paris-Saclay-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Laboratoire de Biologie du Développement IBPS (LBD), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut de Biologie Paris Seine (IBPS), Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Institut Jean-Pierre - Sciences du végétal (IJPB)
Source: Statistical Methods & Applications. 34:291-324
Publication Status: Preprint
Publisher Information: Springer Science and Business Media LLC, 2025.
Publication Year: 2025
Subject Terms: FOS: Computer and information sciences, multivariate GLARMA, sparsity, Statistics - Applications, 01 natural sciences, [STAT] Statistics [stat], Methodology (stat.ME), gene expression, Applications (stat.AP), 0101 mathematics, seed quality, Statistics - Methodology, variable selection
Description: We propose a novel and efficient iterative two-stage variable selection approach for multivariate sparse GLARMA models, which can be used for modelling multivariate discrete-valued time series. Our approach iteratively combines two steps: estimation of the autoregressive moving average (ARMA) coefficients of multivariate GLARMA models, and variable selection among the coefficients of the Generalized Linear Model (GLM) part of the model, performed by regularized methods. We explain how to implement our approach efficiently. We then assess the performance of our methodology using synthetic data and compare it with alternative methods. Finally, we illustrate it on RNA-Seq data resulting from polyribosome profiling to determine translational status for all mRNAs in germinating seeds. Our approach, which is implemented in the MultiGlarmaVarSel R package available on CRAN, is attractive because it has a low computational load and can outperform the other methods in recovering the null and non-null coefficients.
Document Type: Article
Conference object
Report
Language: English
ISSN: 1613-981X
1618-2510
DOI: 10.1007/s10260-025-00786-0
DOI: 10.48550/arxiv.2208.14721
Access URL: http://arxiv.org/abs/2208.14721
https://hal.science/hal-04239746v1
Rights: Springer Nature TDM
arXiv Non-Exclusive Distribution
Accession Number: edsair.doi.dedup.....5e726040a6a73b694adb6a9b743cfc0f
Database: OpenAIRE