Abstract

A common challenge in AI is that of an agent that possesses N distinct behaviors, each effective in certain tasks and circumstances, and must choose among them at any particular time. Bandit algorithms are a popular approach for learning optimal selections in such contexts. However, traditional bandits often (1) assume the environment is stationary, (2) focus on asymptotic performance, and (3) do not incorporate available external information about the environment. Contextual bandits, which do consider external information, address one of these challenges, but the others remain problematic in realistic domains. This is especially true of non-stationary domains, such as multi-agent environments (where agents adapt their behavior to each other), since most bandits assume that regret minimization, a reasonable goal in stationary environments, is always valid. Furthermore, existing contextual bandits are not without their own difficulties: they typically either assume one bandit per context (which is not feasible in domains with large state spaces) or employ expert advice without considering how the current state might impact expert performance. To help combat these issues, we explored the use of Assumption-Alignment Tracking (AAT) to design contextual bandit algorithms that are successful in complex and varied multi-agent domains. This dissertation is composed of four studies. The first three examine the ability of an AAT-based bandit to achieve strong performance in small multi-agent systems; a complex, real-world, zero-sum domain; and a complex, general-sum, collective-action game. The third study also explores the impact of a bandit's learning framework. The fourth and final study explores improvements to the AAT design process, which, while successful, can be time-intensive and demanding; specifically, it examines how to automate the design process to improve simplicity and ease of implementation.

Degree

PhD

College and Department

Computer Science; Computational, Mathematical, and Physical Sciences

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2025-11-13

Document Type

Dissertation

Keywords

bandit algorithms, contextual bandits, bandits with expert advice, proficiency self-assessment, multi-agent domains

Language

English
