Online Algorithms for Finding Distinct Substrings with Length and Multiple Prefix and Suffix Conditions

Laurentius Leonard, Shunsuke Inenaga, Hideo Bannai, Takuya Mieno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Let two static sequences of strings P and S, representing prefix and suffix conditions respectively, be given as input for preprocessing. For the query, let two positive integers k_1 and k_2 be given, as well as a string T given in an online manner, such that T_i represents the length-i prefix of T for 1 ≤ i ≤ |T|. In this paper we are interested in computing the set ans_i of distinct substrings w of T_i such that k_1 ≤ |w| ≤ k_2, and w contains some p ∈ P as a prefix and some s ∈ S as a suffix. More specifically, the counting problem is to output |ans_i|, whereas the reporting problem is to output all elements of ans_i, for each iteration i. Let σ denote the alphabet size, and for a sequence of strings A, ‖A‖ = ∑_{u ∈ A} |u|. Then, we show that after O((‖P‖ + ‖S‖) log σ)-time preprocessing, the solutions for the counting and reporting problems for each iteration up to i can be output in O(|T_i| log σ) and O(|T_i| log σ + |ans_i|) total time. The preprocessing time can be reduced to O(‖P‖ + ‖S‖) for integer alphabets of size polynomial with regard to ‖P‖ + ‖S‖. Our algorithms have possible applications to network traffic classification.
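To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper and does not achieve the stated time bounds; it only illustrates what the answer set for each length-i prefix of T contains. The function name `brute_force_answers` and the example inputs are assumptions made for illustration.

```python
def brute_force_answers(T, P, S, k1, k2):
    """Naive reference (hypothetical helper, not the paper's method).

    Returns a list whose (i-1)-th entry is the set of distinct substrings w
    of the length-i prefix of T with k1 <= len(w) <= k2 such that w starts
    with some p in P and ends with some s in S.
    """
    answers = []
    current = set()
    for i in range(1, len(T) + 1):
        # Every substring that is new at iteration i ends at position i,
        # so it suffices to check the suffixes of the length-i prefix.
        for length in range(k1, min(k2, i) + 1):
            w = T[i - length:i]
            has_prefix = any(w.startswith(p) for p in P)
            has_suffix = any(w.endswith(s) for s in S)
            if has_prefix and has_suffix:
                current.add(w)
        answers.append(set(current))  # snapshot for iteration i
    return answers


if __name__ == "__main__":
    result = brute_force_answers("abab", ["a"], ["b"], 2, 4)
    for i, ans in enumerate(result, start=1):
        # Counting problem: len(ans); reporting problem: the elements of ans.
        print(i, len(ans), sorted(ans))
```

For the example input, the counts per iteration are 0, 1, 1, 2, with the final answer set being {"ab", "abab"}. The sketch relies on the fact that every substring of an earlier prefix is also a substring of every later prefix, so the answer sets only grow as characters of T arrive.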

Original language: English
Title of host publication: String Processing and Information Retrieval - 29th International Symposium, SPIRE 2022, Proceedings
Editors: Diego Arroyuelo, Barbara Poblete
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 24-37
Number of pages: 14
ISBN (Print): 9783031206429
Publication status: Published - 2022
Event: 29th International Symposium on String Processing and Information Retrieval, SPIRE 2022 - Concepción, Chile
Duration: Nov 8 2022 → Nov 10 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13617 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Symposium on String Processing and Information Retrieval, SPIRE 2022
Country/Territory: Chile
City: Concepción
Period: 11/8/22 → 11/10/22

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
