On the inconsistency of ℓ1-penalised sparse precision matrix estimation

Heinävaara, O, Leppä-Aho, J, Corander, J & Honkela, A 2016, 'On the inconsistency of ℓ1-penalised sparse precision matrix estimation', BMC Bioinformatics, vol. 17, no. Suppl 16, pp. 99-107. https://doi.org/10.1186/s12859-016-1309-x

Title: On the inconsistency of ℓ1-penalised sparse precision matrix estimation
Author: Heinävaara, Otte; Leppä-Aho, Janne; Corander, Jukka; Honkela, Antti
Contributor organization: Helsinki Institute for Information Technology
Department of Computer Science
Department of Mathematics and Statistics
Jukka Corander / Principal Investigator
Probabilistic Mechanistic Models for Genomics research group / Antti Honkela
The Finnish Center of Excellence in Computational Inference Research (COIN)
Biostatistics Helsinki
Date: 2016-12-13
Language: eng
Number of pages: 9
Belongs to series: BMC Bioinformatics
ISSN: 1471-2105
DOI: https://doi.org/10.1186/s12859-016-1309-x
URI: http://hdl.handle.net/10138/208461
Abstract: Background: Various ℓ1-penalised estimation methods such as graphical lasso and CLIME are widely used for sparse precision matrix estimation and learning of undirected network structure from data. Many of these methods have been shown to be consistent under various quantitative assumptions about the underlying true covariance matrix. Intuitively, these conditions are related to situations where the penalty term will dominate the optimisation. Results: We explore the consistency of ℓ1-based methods for a class of bipartite graphs motivated by the structure of models commonly used for gene regulatory networks. We show that all ℓ1-based methods fail dramatically for models with nearly linear dependencies between the variables. We also study the consistency on models derived from real gene expression data and note that the assumptions needed for consistency never hold even for modest-sized gene networks, and that ℓ1-based methods also become unreliable in practice for larger networks. Conclusions: Our results demonstrate that ℓ1-penalised undirected network structure learning methods are unable to reliably learn many sparse bipartite graph structures, which arise often in gene expression data. Users of such methods should be aware of the consistency criteria of the methods and check whether they are likely to be met in their application of interest.
Subject: 112 Statistics and probability
113 Computer and information sciences
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion

Files in this item

Files Size Format
s12859_016_1309_x.pdf 974.0Kb PDF
