bn:02220500n
Noun Concept
Categories: Stochastic optimization, Sequential methods, Sequential experiments, Machine learning
EN
multi-armed bandit, Approximate solutions of the multi-armed bandit problem, bandit, Bandit model, bandit problem
In probability theory and machine learning, the multi-armed bandit problem is a problem in which a fixed, limited set of resources must be allocated among competing choices so as to maximize the expected gain, when each choice's properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to that choice. Wikipedia
Definitions
A problem in probability theory Wikipedia Disambiguation
Reinforcement learning problem exemplifying the exploration–exploitation tradeoff Wikidata
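The exploration–exploitation tradeoff named in these definitions can be illustrated with a small simulation. The sketch below uses epsilon-greedy, which is only one of several common approximate strategies and is not prescribed by this entry. The function name `epsilon_greedy`, the Bernoulli (0/1) reward model, and all parameter values are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (illustrative assumption).

    true_means: hidden success probability of each arm, unknown to the learner.
    Returns (pull counts, estimated means, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms        # how often each arm was chosen
    estimates = [0.0] * n_arms  # running average reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm's hidden distribution.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Incremental update of the sample mean for the chosen arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward
```

Calling `epsilon_greedy([0.2, 0.5, 0.8])` would be expected to concentrate most pulls on the last arm, while the occasional random pulls keep refining the estimates of the other two. Larger values of `epsilon` spend more of the budget on exploration, smaller values on exploitation.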