Browsing by Subject "RANDOM FOREST"


Now showing items 1-6 of 6
  • Peris Tamayo, Ana-Maria; Devineau, Olivier; Praebel, Kim; Kahilainen, Kimmo K.; Østbye, Kjartan (2020)
    Adaptive radiation is the diversification of species into different ecological niches and has repeatedly occurred in salmonid fishes of postglacial lakes. In Lake Tinnsjøen, one of the largest and deepest lakes in Norway, the salmonid Arctic charr (Salvelinus alpinus (L.)) has likely radiated within 9,700 years after deglaciation into ecologically and genetically segregated Piscivore, Planktivore, Dwarf, and Abyssal morphs in the pelagial, littoral, shallow-moderate profundal, and deep-profundal habitats. We compared trait variation in the size of the head, the eye and olfactory organs, as well as the volumes of five brain regions of these four Arctic charr morphs. We hypothesised that specific habitat characteristics have promoted divergent body, head, and brain sizes related to the depths used, which differ in environmental constraints (e.g., light, oxygen, pressure, temperature, and food quality). The most important ecomorphological variables differentiating morphs were eye area, habitat, and number of lamellae. The Abyssal morph living in the deepest areas of the lake had the smallest brain region volumes, head, and eye size. Relative to the optic tectum, the olfactory bulb was larger in the Abyssal morph than in the Piscivore morph. The Piscivore and Planktivore morphs that use more illuminated habitats had the largest optic tectum volume, followed by the Dwarf. The observed differences in body size and sensory capacities in terms of vision and olfaction in shallow and deepwater morphs likely relate to foraging and mating habitats in Lake Tinnsjøen. Further seasonal and experimental studies of brain volume in polymorphic species are needed to test the role of plasticity and adaptive evolution behind the observed differences.
  • Hurskainen, Pekka; Adhikari, Hari; Siljander, Mika; Pellikka, Petri; Hemp, Andreas (2019)
    Classifying land use/land cover (LULC) with sufficient accuracy in heterogeneous landscapes is challenging using only satellite imagery. To improve classification accuracy, the inclusion of features from auxiliary geospatial datasets in classification models has been applied since the 1980s. However, the method is mostly limited to pixel-based classifications, and the coverage, accuracy and resolution of free and open-access auxiliary datasets have been poor until recent years. We evaluated how recent global coverage open-access geospatial datasets improve object-based LULC classification accuracy compared to using only spectral and texture features from satellite images. We applied feature sets covering topography, population, soil, canopy cover, distance to watercourses and spectral-temporal metrics from Landsat-8 time series on the southern foothills and savanna of Mt. Kilimanjaro, Tanzania, where the landscape is characterized by a heterogeneous and fragmented mosaic of disturbed savanna vegetation, croplands, and settlements. The classification was based on image objects (groups of spectrally similar pixels) derived from segmentation of four Formosat-2 scenes with 8 m spatial resolution, using 1370 ground reference points for training, validation, and for defining 17 LULC classes. We built six Random Forest classification models with different sets of object features in each. The baseline model having only spectral and texture features was compared with five other models supplemented with auxiliary features. Inclusion of auxiliary features significantly improved classification overall accuracy (OA). The baseline model gave a median OA of 60.7%, but auxiliary features in other models increased median OA by between 6.1 and 16.5 percentage points. The best OA was achieved with a model including all features, of which elevation was the most important auxiliary feature, followed by Enhanced Vegetation Index temporal range and slope degree. Applying object-based classification to geospatial information on topography, soil, settlement patterns and vegetation phenology can significantly improve the discriminatory potential of challenging LULC classes. We demonstrated this for the first time, and the technique shows good potential for improving LULC mapping across a multitude of fragmented landscapes worldwide.
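The overall accuracy (OA) figures in the abstract above are differences in percentage points, not relative percentages. The sketch below shows how OA is computed from a confusion matrix and how a percentage-point gain differs from a relative gain. The two confusion matrices are hypothetical values made up for illustration and are not results from the study; only 60.7, 6.1 and 16.5 come from the abstract.

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

# Hypothetical 3-class confusion matrices (rows: reference, columns: predicted).
baseline = [[30, 10, 10],
            [8, 32, 10],
            [12, 8, 30]]
with_auxiliary = [[40, 5, 5],
                  [4, 41, 5],
                  [6, 4, 40]]

oa_base = overall_accuracy(baseline)
oa_aux = overall_accuracy(with_auxiliary)
gain_pp = (oa_aux - oa_base) * 100  # gain in percentage points

# Figures reported in the abstract: baseline median OA and the range of gains.
reported_baseline = 60.7
reported_low = reported_baseline + 6.1    # lower end of improved median OA
reported_high = reported_baseline + 16.5  # upper end of improved median OA
relative_gain_high = 16.5 / reported_baseline * 100  # same gain as a relative %

print(f"hypothetical OA: {oa_base:.3f} -> {oa_aux:.3f} (+{gain_pp:.1f} pp)")
print(f"reported median OA range: {reported_low:.1f}% to {reported_high:.1f}%")
print(f"16.5 pp is a relative gain of about {relative_gain_high:.1f}%")
```

Reading the gain as percentage points places the improved models at a median OA of roughly 66.8% to 77.2%.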
  • Lange, Moritz Johannes; Suominen, Henri Johannes; Kurppa, Mona; Järvi, Leena; Oikarinen, Emilia; Savvides, Rafael; Puolamäki, Kai (2021)
    Running large-eddy simulations (LESs) can be burdensome and computationally too expensive from the application point of view, for example, to support urban planning. In this study, regression models are used to replicate modelled air pollutant concentrations from LES in urban boulevards. We study the performance of regression models and discuss how to detect situations where the models are applied outside their training domain and their outputs cannot be trusted. Regression models from 10 different model families are trained, and a cross-validation methodology is used to evaluate their performance and to find the best set of features needed to reproduce the LES outputs. We also test the regression models on an independent testing dataset. Our results suggest that, in general, log-linear regression gives the best and most robust performance on new independent data. It clearly outperforms the dummy model, which would predict constant concentrations for all locations (multiplicative minimum RMSE (mRMSE) of 0.76 vs. 1.78 for the dummy model). Furthermore, we demonstrate that it is possible to detect concept drift, i.e. situations where the model is applied outside its training domain and a new LES run may be necessary to obtain reliable results. Regression models can be used to replace LES in estimating air pollutant concentrations, unless higher accuracy is needed. To obtain reliable results, however, it is important to do the model and feature selection carefully to avoid overfitting and to use methods to detect concept drift.
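As a rough illustration of the comparison described in the abstract above, the sketch below fits a one-feature log-linear regression, compares it with a constant-prediction dummy model on held-out data, and flags test inputs that fall outside the training range. Everything here is an assumption made for illustration: the data are synthetic, the error measure is a plain RMSE on log-transformed concentrations (not necessarily the study's mRMSE), and the range check is a simplistic stand-in for the concept drift detection the authors describe.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of log(y) = a + b * x; returns (a, b)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxy / sxx
    return mean_log - b * mean_x, b

def log_rmse(predicted, observed):
    """RMSE computed on log-transformed values (illustrative metric)."""
    squared = [(math.log(p) - math.log(o)) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic data: concentration decays with a hypothetical distance feature.
train_x = [0, 5, 10, 15, 20, 25, 30]
train_noise = [1.05, 0.95, 1.02, 0.98, 1.03, 0.97, 1.00]
train_y = [50 * math.exp(-0.1 * x) * f for x, f in zip(train_x, train_noise)]

test_x = [2, 12, 22, 40]
test_noise = [0.97, 1.04, 0.99, 1.02]
test_y = [50 * math.exp(-0.1 * x) * f for x, f in zip(test_x, test_noise)]

a, b = fit_log_linear(train_x, train_y)
model_pred = [math.exp(a + b * x) for x in test_x]

# Dummy model: one constant (the geometric mean of training concentrations).
constant = math.exp(sum(math.log(y) for y in train_y) / len(train_y))
dummy_pred = [constant] * len(test_x)

model_err = log_rmse(model_pred, test_y)
dummy_err = log_rmse(dummy_pred, test_y)

# Simplistic domain check: flag inputs outside the range seen in training.
outside_domain = [x for x in test_x if x < min(train_x) or x > max(train_x)]

print(f"log-linear RMSE: {model_err:.3f}, dummy RMSE: {dummy_err:.3f}")
print("inputs outside training range:", outside_domain)
```

In this toy setup the flagged input is the one for which the prediction is an extrapolation, which is the kind of situation where the abstract suggests a new LES run may be needed.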
  • Okser, Sebastian; Pahikkala, Tapio; Airola, Antti; Salakoski, Tapio; Ripatti, Samuli; Aittokallio, Tero (2014)
  • Mikola, Juha; Virtanen, Tarmo; Linkosalmi, Maiju; Vähä, Emmi; Nyman, Johanna; Postanogova, Olga; Räsänen, Aleksi; Kotze, D. Johan; Laurila, Tuomas; Juutinen, Sari; Kondratyev, Vladimir; Aurela, Mika (2018)
    Arctic tundra ecosystems will play a key role in future climate change due to intensifying permafrost thawing, plant growth and ecosystem carbon exchange, but monitoring these changes may be challenging due to the heterogeneity of Arctic landscapes. We examined spatial variation and linkages of soil and plant attributes in a site of Siberian Arctic tundra in Tiksi, northeast Russia, and evaluated possibilities to capture this variation by remote sensing for the benefit of carbon exchange measurements and landscape extrapolation. We distinguished nine land cover types (LCTs) and, to characterize them, sampled 92 study plots for plant and soil attributes in 2014. Moreover, to test if variation in plant and soil attributes can be detected using remote sensing, we produced a normalized difference vegetation index (NDVI) and topographical parameters for each study plot using three very high spatial resolution multispectral satellite images. We found that soils ranged from mineral soils in bare soil and lichen tundra LCTs to soils with a high percentage of organic matter (OM) in graminoid tundra, bog, dry fen and wet fen. OM content of the top soil was on average 14 g dm⁻³ in bare soil and lichen tundra and 89 g dm⁻³ in other LCTs. Total moss biomass varied from 0 to 820 g m⁻², total vascular shoot mass from 7 to 112 g m⁻² and vascular leaf area index (LAI) from 0.04 to 0.95 among LCTs. In late summer, soil temperatures at 15 cm depth were on average 14 °C in bare soil and lichen tundra, and varied from 5 to 9 °C in other LCTs. On average, depth of the biologically active, unfrozen soil layer doubled from early July to mid-August. When contrasted across study plots, moss biomass was positively associated with soil OM % and OM content and negatively associated with soil temperature, explaining 14-34% of variation. Vascular shoot mass and LAI were also positively associated with soil OM content, and LAI with active layer depth, but only explained 6-15% of variation. NDVI captured variation in vascular LAI better than in moss biomass, but while this difference was significant with late season NDVI, it was minimal with early season NDVI. For this reason, soil attributes associated with moss mass were better captured by early season NDVI. Topographic attributes were related to LAI and many soil attributes, but not to moss biomass, and could not increase the amount of spatial variation explained in plant and soil attributes above that achieved by NDVI. The LCT map we produced had low to moderate uncertainty in predictions for plant and soil properties except for moss biomass and bare soil and lichen tundra LCTs. Our results illustrate a typical tundra ecosystem with great fine-scale spatial variation in both plant and soil attributes. Mosses dominate plant biomass and control many soil attributes, including OM % and temperature, but variation in moss biomass is difficult to capture by remote sensing reflectance, topography or an LCT map. Despite the general accuracy of landscape level predictions in our LCT approach, this indicates challenges in the spatial extrapolation of some of those vegetation and soil attributes that are relevant for the regional ecosystem and global climate models.
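The NDVI used in the abstract above follows the standard definition, NDVI = (NIR - Red) / (NIR + Red), computed from near-infrared and red reflectance. A minimal sketch is given below. The reflectance values and plot names are hypothetical and chosen only to show the contrast between a vegetated and a sparsely vegetated surface; they are not measurements from the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (nir - red) / denominator

# Hypothetical per-plot reflectances as (NIR, red) pairs.
plots = {
    "vegetated_plot": (0.50, 0.10),
    "sparse_plot": (0.30, 0.25),
}

ndvi_by_plot = {name: ndvi(nir, red) for name, (nir, red) in plots.items()}

for name, value in ndvi_by_plot.items():
    print(f"{name}: NDVI = {value:.3f}")
```

Higher values indicate denser green vegetation, which is consistent with the abstract's finding that NDVI tracks vascular LAI more closely than moss biomass.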
  • Tang, Zhipeng; Amatulli, Giuseppe; Pellikka, Petri; Heiskanen, Janne (2022)
    The number of Landsat time-series applications has grown substantially because of its approximately 50-year history and relatively high spatial resolution for observing long term changes in the Earth's surface. However, missing observations (i.e., gaps) caused by clouds and cloud shadows, orbit and sensing geometry, and sensor issues have broadly limited the development of Landsat time-series applications. Due to the large area and temporal and spatial irregularity of time-series gaps, it is difficult to find an efficient and highly precise method to fill them. The Missing Observation Prediction based on Spectral-Temporal Metrics (MOPSTM) method has been proposed and delivered good performance in filling large-area gaps of single-date Landsat images. However, it can be less practical for a time series longer than one year due to the lack of mechanisms that exclude dissimilar data in time series (e.g., different phenology or changes in land cover). To solve this problem, this study proposes a new gap-filling method, Spectral Temporal Information for Missing Data Reconstruction (STIMDR), and examines its performance in Landsat reflectance time series. Two groups of experiments, including 2000 × 2000 pixel Landsat single-date images and Landsat time series acquired from four sites (Kenya, Finland, Germany, and China), were performed to test the new method. We simulated artificial gaps to evaluate predicted pixel values against real observations. Quantitative and qualitative evaluations of gap-filled images through comparisons with other state-of-the-art methods confirmed the more robust and accurate performance of the proposed method. In addition, the proposed method was able to fill gaps contaminated by extreme cloud cover for a period (e.g., winter in high-latitude areas). A downstream task of random forest supervised classification using both gap-filled simulated datasets and the original valid datasets verified that STIMDR-generated products are relevant to the user community for land cover applications.
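The evaluation strategy mentioned in the abstract above, masking valid observations to create artificial gaps and then comparing the filled values against the real ones, can be sketched as follows. The gap filler here is plain linear interpolation in time, used only as a simple stand-in: it is not STIMDR or MOPSTM. The reflectance series and gap positions are hypothetical.

```python
import math

def fill_gaps_linear(series):
    """Fill None entries by linear interpolation between the nearest valid
    neighbours in time; gaps at the edges take the nearest valid value."""
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        raise ValueError("series has no valid observations")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if before and after:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = series[lo] + weight * (series[hi] - series[lo])
        elif before:
            filled[i] = series[before[-1]]
        else:
            filled[i] = series[after[0]]
    return filled

# Hypothetical reflectance time series for one pixel (all values observed).
observed = [0.10, 0.12, 0.16, 0.22, 0.30, 0.28, 0.20, 0.14]

# Simulate artificial gaps by masking observations at chosen time steps.
gap_indices = [2, 5]
masked = [None if i in gap_indices else v for i, v in enumerate(observed)]

filled = fill_gaps_linear(masked)

# Evaluate only at the simulated gaps, where the true values are known.
errors = [filled[i] - observed[i] for i in gap_indices]
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print("filled values at gaps:", [round(filled[i], 3) for i in gap_indices])
print(f"RMSE at simulated gaps: {rmse:.4f}")
```

Because the error is measured only where observations were deliberately removed, any gap-filling method can be scored this way without needing extra reference data.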