# Relations Between Greedy and Bit-Optimal LZ77 Encodings

http://hdl.handle.net/10138/237761

#### Citation

Kosolobov, D 2018, Relations Between Greedy and Bit-Optimal LZ77 Encodings. in R Niedermeier & B Vallée (eds), Relations Between Greedy and Bit-Optimal LZ77 Encodings, 46, Leibniz International Proceedings in Informatics (LIPIcs), vol. 96, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl, pp. 46:1-46:14, Symposium on Theoretical Aspects of Computer Science, Caen, France, 28/02/2018. https://doi.org/10.4230/LIPIcs.STACS.2018.46

- Title: Relations Between Greedy and Bit-Optimal LZ77 Encodings
- Author: Kosolobov, Dmitry
- Other contributor: Niedermeier, Rolf; Vallée, Brigitte
- Contributor organization: Department of Computer Science; Genome-scale Algorithmics research group / Veli Mäkinen
- Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
- Date: 2018
- Language: eng
- Number of pages: 14
- Belongs to series: Relations Between Greedy and Bit-Optimal LZ77 Encodings
- Belongs to series: Leibniz International Proceedings in Informatics (LIPIcs)
- ISBN: 978-3-95977-062-0
- ISSN: 1868-8969
- DOI: https://doi.org/10.4230/LIPIcs.STACS.2018.46
- URI: http://hdl.handle.net/10138/237761
- Abstract: This paper investigates the size in bits of the LZ77 encoding, which is the most popular and efficient variant of the Lempel-Ziv encodings used in data compression. We prove that, for a wide natural class of variable-length encoders for LZ77 phrases, the size of the greedily constructed LZ77 encoding on constant alphabets is within a factor $O(\frac{\log n}{\log\log\log n})$ of the optimal LZ77 encoding, where $n$ is the length of the processed string. We describe a series of examples showing that, surprisingly, this bound is tight, thus improving both the previously known upper and lower bounds. Further, we obtain a more detailed bound $O(\min\{z, \frac{\log n}{\log\log z}\})$, which uses the number $z$ of phrases in the greedy LZ77 encoding as a parameter, and construct a series of examples showing that this bound is tight even for a binary alphabet. We then investigate the problem on non-constant alphabets: we show that the known $O(\log n)$ bound is tight even for alphabets of logarithmic size, and provide tight bounds for some other important cases.
- Subject: 113 Computer and information sciences
- Peer reviewed: Yes
- Rights: cc_by
- Usage restriction: openAccess
- Self-archived version: publishedVersion
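The greedily constructed LZ77 encoding and the phrase count $z$ mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definition of the greedy parsing (each phrase is the longest prefix of the remaining text that also starts at an earlier position, overlaps allowed, or a single literal letter otherwise); the record does not spell out the paper's exact definition or its variable-length phrase encoders. The names `greedy_lz77` and `decode` are hypothetical, and the quadratic-time search is chosen for clarity only.

```python
def greedy_lz77(s):
    """Greedy LZ77 parsing of s (assumed textbook definition).

    Returns a list of phrases, each either ("lit", char) or
    ("copy", distance, length). The copy source may overlap the phrase.
    """
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_dist = 0, 0
        # Try every earlier starting position j and extend the match.
        for j in range(i):
            l = 0
            while i + l < n and s[j + l] == s[i + l]:
                l += 1
            if l > best_len:
                best_len, best_dist = l, i - j
        if best_len == 0:
            # The letter has not occurred before: emit a literal phrase.
            phrases.append(("lit", s[i]))
            i += 1
        else:
            phrases.append(("copy", best_dist, best_len))
            i += best_len
    return phrases


def decode(phrases):
    """Rebuild the string from the phrases (sanity check for the parser)."""
    out = []
    for p in phrases:
        if p[0] == "lit":
            out.append(p[1])
        else:
            _, dist, length = p
            for _ in range(length):
                out.append(out[-dist])  # letter by letter, so overlaps work
    return "".join(out)


parsing = greedy_lz77("abababab")
z = len(parsing)  # number of phrases, the parameter z in the abstract
```

The bit size studied in the paper depends on how each distance and length is written out by a variable-length encoder, which this sketch deliberately leaves out; it only produces the phrases and their count.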