Geometric Approaches to Big Data Modeling and Performance Prediction

Permanent link

http://urn.fi/URN:NBN:fi-fe201804208657
Title: Geometric Approaches to Big Data Modeling and Performance Prediction
Author: Goetsch, Peter
Contributor: University of Helsinki, Faculty of Science, Department of Computer Science
Publisher: University of Helsinki
Date: 2018
Language: eng
URI: http://urn.fi/URN:NBN:fi-fe201804208657
http://hdl.handle.net/10138/273494
Thesis level: master's thesis (pro gradu)
Abstract: Big Data frameworks (e.g., Spark) have many configuration parameters, such as memory size, CPU allocation, and the number of nodes (parallelism). Regular users and even expert administrators struggle to understand the relationship between different parameter configurations and the overall performance of the system. In this work, we address this challenge by proposing a performance prediction framework that builds performance models over varied configurable parameters on Spark. We take inspiration from the field of Computational Geometry to construct a d-dimensional mesh using Delaunay Triangulation over a selected set of features. From this mesh, we predict execution time for unknown feature configurations. To minimize the time and resources spent in building a model, we propose an adaptive sampling technique that collects only as many training points as required. Our evaluation on a cluster of computers using several workloads shows that our prediction error is lower than that of state-of-the-art methods while requiring fewer training samples.
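
The mesh-based prediction described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it is restricted to two features (the thesis works in d dimensions), it builds the Delaunay triangulation by brute force using the empty-circumcircle test, and it assumes the sample points are in general position (no four on a common circle). The feature names (memory in GB, cores), the function names, and the execution times are all hypothetical; the times are synthetic and chosen to be linear in the features so the result is easy to check. The adaptive sampling step is not shown.

```python
from itertools import combinations

def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    # Multiplying by the orientation makes the test independent of vertex order.
    return det * orientation(a, b, c) > 1e-9

def delaunay(points):
    """Brute-force 2D Delaunay triangulation: keep every non-degenerate
    triangle whose circumcircle contains no other sample point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if abs(orientation(a, b, c)) < 1e-12:
            continue  # collinear points do not form a triangle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles

def barycentric(a, b, c, p):
    """Barycentric weights of p with respect to triangle abc."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1

def predict(points, times, triangles, query):
    """Linearly interpolate the execution time at query from the triangle
    that contains it; return None if query is outside the mesh."""
    for i, j, k in triangles:
        w = barycentric(points[i], points[j], points[k], query)
        if all(x >= -1e-9 for x in w):
            return w[0] * times[i] + w[1] * times[j] + w[2] * times[k]
    return None

# Hypothetical training configurations: (memory in GB, number of cores).
samples = [(2, 1), (8, 2), (4, 6), (16, 4), (10, 8), (6, 3)]
# Synthetic execution times in seconds (made up, linear in both features).
times = [200 - 5 * m - 10 * c for m, c in samples]

mesh = delaunay(samples)
# Close to 125, because the synthetic times are linear in the features.
print(predict(samples, times, mesh, (7, 4)))
```

The brute-force construction is only practical for a handful of points. For larger or higher-dimensional feature sets, a library routine such as `scipy.spatial.Delaunay` would be the more realistic choice.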


Files

File: Peter Goetsch Master's Thesis 6-June-2018.pdf
Size: 923.0KB
Format: PDF
