A Branch-and-Bound Approach to Efficient Classification and Retrieval of Documents

Kotaro Ii, Hiroto Saigo, Yasuo Tabei

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Text classification and retrieval have been crucial tasks in natural language processing. In this paper, we present novel techniques for these tasks by leveraging the invariance of the evaluation results to feature order. Building on the assumption that text retrieval or classification models have already been constructed from the training documents, we propose efficient approaches that can restrict the search space spanned by the test documents. Our approach encompasses two key contributions. The first contribution introduces an efficient method for traversing a search tree, while the second contribution involves the development of novel pruning conditions. Through computational experiments using real-world datasets, we consistently demonstrate that the proposed approach outperforms the baseline method in various scenarios, showcasing its superior speed and efficiency.

Original language: English
Title of host publication: Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods
Editors: Modesto Castrillon-Santana, Maria De Marsico, Ana Fred
Publisher: Science and Technology Publications, Lda
Pages: 205-214
Number of pages: 10
ISBN (Print): 9789897586842
Publication status: Published - 2024
Event: 13th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2024 - Rome, Italy
Duration: Feb 24 2024 → Feb 26 2024

Publication series

Name: International Conference on Pattern Recognition Applications and Methods
Volume: 1
ISSN (Electronic): 2184-4313

Conference

Conference: 13th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2024
Country/Territory: Italy
City: Rome
Period: 2/24/24 → 2/26/24

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
