TY - GEN
T1 - RePair Grammars Are the Smallest Grammars for Fibonacci Words
AU - Mieno, Takuya
AU - Inenaga, Shunsuke
AU - Horiyama, Takashi
N1 - Publisher Copyright:
© Takuya Mieno, Shunsuke Inenaga, and Takashi Horiyama; licensed under Creative Commons License CC-BY 4.0
PY - 2022/6/1
Y1 - 2022/6/1
N2 - Grammar-based compression is a loss-less data compression scheme that represents a given string w by a context-free grammar that generates only w. While computing the smallest grammar which generates a given string w is NP-hard in general, a number of polynomial-time grammar-based compressors which work well in practice have been proposed. RePair, proposed by Larsson and Moffat in 1999, is a grammar-based compressor which recursively replaces all possible occurrences of a most frequently occurring bigrams in the string. Since there can be multiple choices of the most frequent bigrams to replace, different implementations of RePair can result in different grammars. In this paper, we show that the smallest grammars generating the Fibonacci words Fk can be completely characterized by RePair, where Fk denotes the k-th Fibonacci word. Namely, all grammars for Fk generated by any implementation of RePair are the smallest grammars for Fk, and no other grammars can be the smallest for Fk. To the best of our knowledge, Fibonacci words are the first non-trivial infinite family of strings for which RePair is optimal.
AB - Grammar-based compression is a loss-less data compression scheme that represents a given string w by a context-free grammar that generates only w. While computing the smallest grammar which generates a given string w is NP-hard in general, a number of polynomial-time grammar-based compressors which work well in practice have been proposed. RePair, proposed by Larsson and Moffat in 1999, is a grammar-based compressor which recursively replaces all possible occurrences of a most frequently occurring bigrams in the string. Since there can be multiple choices of the most frequent bigrams to replace, different implementations of RePair can result in different grammars. In this paper, we show that the smallest grammars generating the Fibonacci words Fk can be completely characterized by RePair, where Fk denotes the k-th Fibonacci word. Namely, all grammars for Fk generated by any implementation of RePair are the smallest grammars for Fk, and no other grammars can be the smallest for Fk. To the best of our knowledge, Fibonacci words are the first non-trivial infinite family of strings for which RePair is optimal.
UR - http://www.scopus.com/inward/record.url?scp=85134335970&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134335970&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.CPM.2022.26
DO - 10.4230/LIPIcs.CPM.2022.26
M3 - Conference contribution
AN - SCOPUS:85134335970
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
A2 - Bannai, Hideo
A2 - Holub, Jan
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
Y2 - 27 June 2022 through 29 June 2022
ER -