Browsing by Subject "Random Forest"

Now showing items 1-4 of 4
  • Hurskainen, Pekka; Adhikari, Hari; Siljander, Mika; Pellikka, Petri; Hemp, Andreas (2019)
    Classifying land use/land cover (LULC) with sufficient accuracy in heterogeneous landscapes is challenging using satellite imagery alone. To improve classification accuracy, features from auxiliary geospatial datasets have been included in classification models since the 1980s. However, the method has mostly been limited to pixel-based classifications, and the coverage, accuracy and resolution of free and open-access auxiliary datasets were poor until recent years. We evaluated how recent open-access geospatial datasets with global coverage improve object-based LULC classification accuracy compared to using only spectral and texture features from satellite images. We applied feature sets describing topography, population, soil, canopy cover and distance to watercourses, together with spectral-temporal metrics from a Landsat-8 time series, on the southern foothills and savanna of Mt. Kilimanjaro, Tanzania, where the landscape is characterized by a heterogeneous and fragmented mosaic of disturbed savanna vegetation, croplands, and settlements. The classification was based on image objects (groups of spectrally similar pixels) derived from segmentation of four Formosat-2 scenes with 8 m spatial resolution, using 1370 ground reference points for training, validation, and for defining 17 LULC classes. We built six Random Forest classification models, each with a different set of object features. The baseline model, which had only spectral and texture features, was compared with five other models supplemented with auxiliary features. Inclusion of auxiliary features significantly improved overall classification accuracy (OA). The baseline model gave a median OA of 60.7%, and auxiliary features in the other models increased median OA by between 6.1 and 16.5 percentage points. The best OA was achieved with a model including all features, of which elevation was the most important auxiliary feature, followed by Enhanced Vegetation Index temporal range and slope degree.
    By applying object-based classification to geospatial information on topography, soil, settlement patterns and vegetation phenology, the discriminatory potential of challenging LULC classes can be significantly improved. We demonstrated this for the first time, and the technique shows good potential for improving LULC mapping across a multitude of fragmented landscapes worldwide.
  • Kantola, Tuula; Vastaranta, Mikko; Lyytikäinen-Saarenmaa, Päivi; Holopainen, Markus; Kankare, Ville; Talvitie, Mervi; Hyyppä, Juha (2013)
    Forest disturbances caused by pest insects threaten ecosystem stability, sustainable forest management and economic return in boreal forests. Climate change and increased extreme weather patterns can magnify the intensity of forest disturbances, particularly at higher latitudes. Because they respond rapidly to rising temperatures, forest insect pests can flexibly change their survival, dispersal and geographic distributions. The outbreak pattern of forest pests in Finland has evidently changed during the last decade. Projecting shifts in the distribution of insect-caused forest damage has become a critical issue in forest research. The common pine sawfly (Diprion pini L.) (Hymenoptera, Diprionidae) is regarded as a significant threat to boreal pine forests. Defoliation by D. pini has resulted in severe growth loss and mortality of Scots pine (Pinus sylvestris L.) (Pinaceae) in eastern Finland. In this study, tree-wise defoliation was estimated for five different needle loss classification schemes and for 10 different simulated airborne laser scanning (ALS) pulse densities. The nearest neighbor (NN) approach, a nonparametric estimation method, was used for estimating the needle loss of 701 Scots pines, using the means of individual tree features derived from ALS data. The Random Forest (RF) method was applied in the NN search. For the full-density data (~20 pulses/m²), the overall estimation accuracies for tree-wise defoliation level varied between 71.0% and 86.5% (kappa values of 0.56 and 0.57, respectively), depending on the classification scheme. The overall classification accuracies for two-class estimation with different ALS pulse densities varied between 82.8% and 83.7% (kappa values of 0.62 and 0.67, respectively). We conclude that ALS-based estimation of needle loss may be of acceptable accuracy for individual trees. Our method did not appear sensitive to the applied pulse densities.
  • Pelttari, Hannu (Helsingin yliopisto, 2020)
    Federated learning is a method for training a machine learning model on multiple remote datasets without the need to gather the data from the remote sites into a central location. In healthcare, gathering data from different hospitals into a central location can be a difficult and time-consuming task because of privacy concerns and regulations on the use of sensitive data, which makes federated learning an attractive alternative to more traditional methods. This thesis adapted an existing federated gradient boosting model, developed a new federated random forest model, and applied both to mortality prediction in intensive care units. The results were then compared to those of the centralized counterparts of the models. The results showed that while the federated models did not perform as well as the centralized models on a similarly sized dataset, the federated random forest model can achieve superior performance when trained on multiple hospitals' data compared to centralized models trained on a single hospital. In scenarios where the centralized models had data from multiple hospitals, the federated models could not perform as well as the centralized models. It was also found that the performance of the centralized models could not be improved with further federated training. In addition to offering practical advantages, such as the possibility of parallel or asynchronous training without modifications to the algorithm, the federated random forest performed better than federated gradient boosting in all scenarios. The performance of the federated random forest was also found to be more consistent across scenarios than that of federated gradient boosting, which was highly dependent on factors such as the order in which the hospitals were traversed.
  • Joensuu, Marianna (Helsingfors universitet, 2014)
    In forest inventories, the aim is to obtain ever more detailed information about the growing stock, both at the national level and at the level of private forests. At present, features describing wood quality are rarely estimated from standing trees in forest planning, because precise tree measurements are expensive and resources for them are limited. The principal aim of this study was to determine the precision with which externally assessed features that predict the internal quality of log-sized pine can be estimated manually using Terrestrial Laser Scanning (TLS). The examined features were tree height, diameter at breast height, upper diameter, and the heights of the lowest dead and lowest living branch. The second main objective was to determine the precision with which the tree class can be predicted from measured and derived tree attributes. The derived attributes were the volume of the wood, the crown ratio, the ratios of the dead-branch and branch-free parts of the tree to the tree height, and the form factor. For prediction, the nearest neighbor method was used, with the search for the nearest neighbors performed using the Random Forest method. The relative accuracy (RMSE %) of the TLS data in relation to the reference field data was 7.54% (bias -6.16%) for tree height, 6.39% (-2.46%) for diameter at breast height, 10.01% (0.40%) for upper diameter, 9.21% (-5.99%) for the height of the lowest living branch and 34.95% (-1.47%) for the height of the lowest dead branch. In predicting the tree class, which indicates stem quality, the TLS data reached 78% classification accuracy with five tree classes. With a coarser categorization into three tree classes, 87% classification accuracy was reached. The results indicate that quality factors, such as the lowest branches, can be measured from TLS data with reasonably adequate accuracy.
    The prediction of the tree class also works decently with five classes and well with the coarser three-class categorization. The forecasting method described in this study can still be improved, for example by automatic interpretation of the laser scanning data and by combining several laser scanning points from the examined target. The most promising near-future application is using TLS data as reference data for airborne laser scanning, because for this purpose the accuracy of the coarser categorization already seems very promising.
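
The relative accuracy figures in the last abstract above (RMSE % and bias %) can be illustrated with a minimal sketch. It assumes, since the abstract does not state its formulas, that both measures are normalized by the mean of the reference field measurements and that the error is computed as TLS estimate minus reference, so a negative bias means underestimation. The function name and the tree heights are hypothetical and used only for illustration.

```python
import math


def relative_rmse_and_bias(estimates, references):
    """Return (RMSE %, bias %) relative to the mean reference value.

    Assumptions (not stated in the abstract): error = estimate - reference,
    and both measures are divided by the mean of the reference values.
    """
    if len(estimates) != len(references) or not references:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(references)
    errors = [est - ref for est, ref in zip(estimates, references)]
    mean_reference = sum(references) / n
    rmse = math.sqrt(sum(err * err for err in errors) / n)
    bias = sum(errors) / n
    return 100.0 * rmse / mean_reference, 100.0 * bias / mean_reference


# Hypothetical tree heights in metres: TLS-derived values vs. field reference.
tls_heights = [18.2, 21.5, 19.9, 23.1]
field_heights = [19.0, 22.4, 20.1, 24.6]

rmse_pct, bias_pct = relative_rmse_and_bias(tls_heights, field_heights)
print(f"RMSE: {rmse_pct:.2f} %, bias: {bias_pct:.2f} %")
```

Under these assumptions, a large RMSE % combined with a bias close to zero, as reported for the height of the lowest dead branch, would point to large but unsystematic errors, whereas a bias of similar magnitude to the RMSE %, as reported for tree height, would point to a systematic underestimate.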