A Survey on Automatic Parameter Tuning for Big Data Processing Systems

Permalink

http://hdl.handle.net/10138/318447

Citation

Herodotou, H, Chen, Y & Lu, J 2020, 'A Survey on Automatic Parameter Tuning for Big Data Processing Systems', ACM Computing Surveys, vol. 53, no. 2, 43, pp. 1-37. https://doi.org/10.1145/3381027

Title: A Survey on Automatic Parameter Tuning for Big Data Processing Systems
Author: Herodotou, Herodotos; Chen, Yuxing; Lu, Jiaheng
Other contributor: University of Helsinki, Department of Computer Science
University of Helsinki, Unified DataBase Management System research group / Jiaheng Lu

Date: 2020-04
Language: eng
Number of pages: 37
Belongs to series: ACM Computing Surveys
ISSN: 0360-0300
DOI: https://doi.org/10.1145/3381027
URI: http://hdl.handle.net/10138/318447
Abstract: Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of configuration parameters controlling parallelism, I/O behavior, memory settings, and compression. Improper parameter settings can cause significant performance degradation and stability issues. However, regular users and even expert administrators struggle to understand and tune these parameters to achieve good performance. We investigate existing approaches to parameter tuning for both batch and stream data processing systems and classify them into six categories: rule-based, cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. We summarize the pros and cons of each approach and raise some open research problems for automatic parameter tuning.
Subject: 113 Computer and information sciences
Parameter tuning
self-tuning
MapReduce
Spark
Storm
stream
MAPREDUCE
PERFORMANCE
OPTIMIZATION
MANAGEMENT
SIMULATION
TOOLKIT

Files in this item

Files Size Format View
3381027.pdf 847.4Kb PDF View/Open
