Budgeted Recommendation with Delayed Feedback

Kweiguu Liu, Setareh Maghsudi, Makoto Yokoo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In a conventional contextual multi-armed bandit problem, the feedback (or reward) is observable immediately after an action. However, delayed feedback arises in numerous real-life situations and is especially crucial in time-sensitive applications. The exploration-exploitation dilemma becomes particularly challenging under such conditions, as it is coupled with the interplay between delays and limited resources. Moreover, a limited budget often aggravates the problem by restricting the exploration potential. A motivating example is the distribution of medical supplies at the early stage of COVID-19: delayed testing results left insufficient information for learning, which degraded the efficiency of resource allocation. Motivated by such applications, we study the effect of delayed feedback on constrained contextual bandits. We develop a decision-making policy, delay-oriented resource allocation with learning (DORAL), to optimize the resource expenditure in a contextual multi-armed bandit problem with arm-dependent delayed feedback.

Original language: English
Title of host publication: Good Practices and New Perspectives in Information Systems and Technologies - WorldCIST 2024
Editors: Álvaro Rocha, Hojjat Adeli, Gintautas Dzemyda, Fernando Moreira, Aneta Poniszewska-Maranda
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 202-213
Number of pages: 12
ISBN (Print): 9783031602207
Publication status: Published - 2024
Event: 12th World Conference on Information Systems and Technologies, WorldCIST 2024 - Lodz, Poland
Duration: Mar 26 2024 to Mar 28 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 987 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 12th World Conference on Information Systems and Technologies, WorldCIST 2024
Country/Territory: Poland
City: Lodz
Period: 3/26/24 to 3/28/24

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications
