Information Criteria and Effective Feature Size Estimation for Data with Inherent Dependencies

Show full item record

Title: Information Criteria and Effective Feature Size Estimation for Data with Inherent Dependencies
Author: Bouri, Ioanna
Contributor: University of Helsinki, Faculty of Science
Publisher: Helsingin yliopisto
Date: 2019
Language: eng
Thesis level: master's thesis
Degree program: Datatieteen maisteriohjelma
Master's Programme in Data Science
Magisterprogrammet i data science
Specialisation: ei opintosuuntaa
no specialization
ingen studieinriktning
Discipline: none
Abstract: In model selection, it is necessary to select a model from a set of candidate models based on some observed data. The model should fit the data well, but without being overly complex, since that would not allow the model to generalize well its predictions to unseen data. Information criteria are widely used model selection methods that select a model based on some criteria. Information criteria estimate a score for each candidate model, and use that score to make a selection. A common way of estimating such a score, rewards the candidate model for its goodness of fit on some observed data and penalizes for the model complexity. Many popular information criteria, such as Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC) penalize model complexity by the feature dimension. However, in a non-standard setting with inherent dependencies, these criteria are prone to over-penalizing the complexity of the model. Motivated by how these commonly used criteria tend to over-penalize, we evaluate AIC and BIC on a multi-target setting with correlated features. We compare AIC and BIC, with the Fisher Information Criterion (FIC), a criterion that takes into consideration correlations amongst features and does not penalize model complexity solely by the feature dimension of the candidate model. We evaluate the feature selection and predictive performances of the three information criteria in a linear regression setting with correlated features. We evaluate the precision, recall and F1 score of the set of features each criterion selects, compared to the feature set of the generative model. Under this setting's assumptions, we find that FIC yields the best results, compared to AIC and BIC, both in the feature selection and predictive performance evaluation. Finally, using FIC's properties for feature selection, we derive a formulation that allows to approximate the effective feature dimension of models with correlated features, in linear regression settings.

Files in this item

Total number of downloads: Loading...

Files Size Format View
thesis.pdf 703.3Kb PDF View/Open

This item appears in the following Collection(s)

Show full item record