KDD 2024

A Tutorial on Multi-Armed Bandit Applications for Large Language Models

Djallel Bouneffouf, Raphaël Féraud

Cited by 2

Abstract

This tutorial offers a comprehensive guide on using multi-armed bandit (MAB) algorithms to improve Large Language Models (LLMs). As Natural Language Processing (NLP) tasks grow, efficient and adaptive language generation systems are increasingly needed. MAB algorithms, which balance exploration and exploitation under uncertainty, are promising for enhancing LLMs.