Speeding up dynamic bit vectors

Show full item record


Title: Speeding up dynamic bit vectors
Author: Dönges, Saska
Contributor: University of Helsinki, Faculty of Science
Publisher: Helsingin yliopisto
Date: 2021
Language: eng
URI: http://urn.fi/URN:NBN:fi:hulib-202105112139
Thesis level: master's thesis
Degree program: Tietojenkäsittelytieteen maisteriohjelma
Master's Programme in Computer Science
Magisterprogrammet i datavetenskap
Specialisation: Algoritmit
Abstract: Bit vectors have many applications within succinct data structures, compression and bioinformatics among others. Any improvements in bit vector performance translates to improvements in the applications. In this thesis we focus on dynamic bit vector performance. Fully dynamic succinct bit vectors enable other dynamic succinct data structures, for example dynamic compressed strings. We briefly discuss the theory of bit vectors and the current state of research related to static and dynamic bit vectors. The main focus of the thesis is on our research into improving the dynamic bit vector implementation in the DYNAMIC C++ library (Prezza, 2017). Our main contribution is the inclusion of buffering to speed up insertions and deletions while not negatively impacting non-modifying operations. In addition, we optimized some of the code in the DYNAMIC library and experimented with vectorizing some of the access operations. Our code optimizations yield a substantial improvement to insertion and deletion performance. Our buffering implementation speeds up insertions and deletions significantly, with negligible impact to other operations or space efficiency. Our implementation acts as proof-of-concept for buffering and suggests that future research into more advanced buffering is likely to increase performance. Finally, our testing indicates that using vectorized instructions in the AVX2 and AVX512 microarchitecture extensions is beneficial in at least some cases and should be researched further. Our implementation available at https://github.com/saskeli/DYNAMIC should only be considered a proof-of-concept as there are known bugs in some of the operations that are not extensively tested.
Subject: bit vector
data structure
SIMD Instructions

Files in this item

Total number of downloads: Loading...

Files Size Format View
Donges_Saska_Thesis_2021.pdf 3.322Mb PDF View/Open

This item appears in the following Collection(s)

Show full item record