Browsing by Subject "model selection"

Now showing items 1-4 of 4
  • Laine, Marko (2008)
    Finnish Meteorological Institute Contributions
  • Lehtonen, Jussi (Helsingfors universitet, 2006)
    The Seychelles magpie robin Copsychus sechellarum is among the most threatened of the twelve avian species endemic to Seychelles. Originally relatively abundant and widespread in Seychelles, its population has severely diminished, mainly on account of introduced predators and habitat destruction. The lowest count was recorded in 1965, when the number of birds was between eight and fifteen, all confined to just one island. In 1990 a successful recovery program was implemented, and the population has now recovered to approximately 150 birds on four islands. Translocations to new islands are a major part of the recovery program. So far Seychelles magpie robins have been translocated to three islands, and three more target islands still need to be chosen if the long-term goals of the project are to be reached. Determining habitat suitability at the translocation target area is essential before attempting a translocation. Cockroaches are a major part of the bird's diet. Cockroach abundance has been shown to be a good indicator of magpie robin habitat quality, but it is very laborious and time-consuming to measure. The objective of this study is to build a model that can be used to predict cockroach abundance, and thereby Seychelles magpie robin habitat quality, with relatively simple and inexpensive measurements. The data for this project were collected on Cousin, Cousine and Aride islands in Seychelles during October and November 2004. Several potential predictors of cockroach abundance were measured using simple and low-cost methods. The measured variables were Seychelles skink Mabuya sechellensis abundance, altitude above sea level, slope gradient, rock type, canopy cover, shrub cover, litter depth, soil depth, the relative canopy cover of different tree species, and percentage of ground covered by litter, rock, grass, woody vegetation and bare ground. These variables were measured in several 25 × 25 m plots on each island. Cockroach abundance had been measured in the same plots previously. Several multiple linear regression models were built using these variables, with cockroach abundance as the dependent variable. The Akaike Information Criterion was used to rank the models and to find the most important explanatory variables. Stepwise regression was used as a secondary method in order to see whether two different methods point to the same result. Both methods indicated that slope gradient and granitic rock type are the two most important predictors of those tested. These are features that are invariably associated with granitic terrain, and both are positively correlated with cockroach abundance. A simple comparison of average cockroach abundance between granitic and coralline rock types also showed a significant difference, with granitic terrain providing higher cockroach abundance. The islands in Seychelles can be divided into two main categories: granitic and coralline islands. Steep slopes and granitic rock are both features associated with granitic terrain and granitic islands only. Therefore, based on the results of this study, granitic islands are likely to provide higher cockroach abundance and thus better Seychelles magpie robin habitat quality. Hence it is recommended that Seychelles magpie robin translocation efforts be focused preferentially on granitic islands.
  • Matsuda, Takeru; Uehara, Masatoshi; Hyvärinen, Aapo (2021)
    Many statistical models are given in the form of non-normalized densities with an intractable normalization constant. Since maximum likelihood estimation is computationally intensive for these models, several estimation methods have been developed which do not require explicit computation of the normalization constant, such as noise contrastive estimation (NCE) and score matching. However, model selection methods for general non-normalized models have not been proposed so far. In this study, we develop information criteria for non-normalized models estimated by NCE or score matching. They are approximately unbiased estimators of discrepancy measures for non-normalized models. Simulation results and applications to real data demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner.
  • Grunwald, Peter; Roos, Teemu (2019)
    This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model selection and averaging and hypothesis testing, as well as the first completely general definition of MDL estimators. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC versus BIC and cross-validation versus Bayes can, to a large extent, be viewed from a unified perspective.
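
Several of the abstracts listed here rank candidate regression models with information criteria such as AIC and BIC. The following is a minimal sketch of that idea, not the method of any of the listed works. It assumes ordinary least squares with Gaussian errors, for which AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n) up to an additive constant. The data are synthetic, and the predictor names (`slope`, `litter_depth`, `canopy_cover`) and helper functions are hypothetical, chosen only for illustration.

```python
import itertools
import numpy as np


def fit_rss(X, y):
    """Fit OLS with an intercept and return the residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def information_criteria(rss, n, n_coef):
    """Gaussian-likelihood AIC and BIC, up to an additive constant."""
    k = n_coef + 1  # regression coefficients plus the error variance
    base = n * np.log(rss / n)
    return base + 2 * k, base + k * np.log(n)


def rank_models(X, y, names):
    """Score every subset of predictors and sort by AIC (lower is better)."""
    n = len(y)
    rows = []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(range(len(names)), size):
            rss = fit_rss(X[:, list(subset)], y)
            # len(subset) slopes plus one intercept
            aic, bic = information_criteria(rss, n, len(subset) + 1)
            rows.append((tuple(names[i] for i in subset), aic, bic))
    return sorted(rows, key=lambda row: row[1])


# Synthetic data (an assumption for illustration): only "slope" drives the response.
rng = np.random.default_rng(0)
names = ["slope", "litter_depth", "canopy_cover"]
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=60)

ranked = rank_models(X, y, names)
for predictors, aic, bic in ranked:
    print(f"{', '.join(predictors) or '(intercept only)':40s} AIC={aic:8.2f} BIC={bic:8.2f}")
```

Because ln(n) exceeds 2 once n is larger than about 7, BIC penalizes each extra parameter more heavily than AIC, so the two criteria can disagree on the preferred model when predictors carry only weak signal.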