A problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice’s properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice. – Wikipedia
Relevance in CRO (Conversion Rate Optimization)
The multi-armed bandit problem is a thought experiment in which a gambler is presented with a row of slot machines and must decide how to play them to maximize their winnings. A “one-armed bandit” is a nickname for a slot machine: they’re a form of gambling and, more often than not, they take your money. A row of slot machines is therefore a “multi-armed bandit,” and therein lies the origin of the term.
Within the field of A/B testing, a multi-armed bandit is a type of split test that employs machine learning to dynamically control bucketing and direct users into variations more likely to win. Think of it like this: instead of variations having a set 50/50 split, the split is adjusted dynamically; it could be 60/40, 83/17, etc. The testing tool regularly evaluates and adjusts the split based on which variations are performing better or worse for the primary goal.
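To make the dynamic split concrete, here is a minimal epsilon-greedy sketch in Python. It is not the algorithm of any particular testing tool, and the variation names and conversion rates are made up for illustration: most traffic is routed to the current best-performing variation, while a small fraction keeps sampling the others.

```python
import random

EPSILON = 0.1  # fraction of traffic reserved for exploration

class EpsilonGreedyBandit:
    def __init__(self, variations):
        self.counts = {v: 0 for v in variations}       # users bucketed per variation
        self.conversions = {v: 0 for v in variations}  # conversions per variation

    def rate(self, v):
        # Observed conversion rate so far (0.0 before any traffic).
        return self.conversions[v] / self.counts[v] if self.counts[v] else 0.0

    def choose(self):
        # Explore: with probability EPSILON, pick a random variation.
        if random.random() < EPSILON:
            return random.choice(list(self.counts))
        # Exploit: otherwise, pick the variation with the best observed rate.
        return max(self.counts, key=self.rate)

    def record(self, v, converted):
        self.counts[v] += 1
        self.conversions[v] += int(converted)

# Simulate traffic with hypothetical "true" conversion rates.
true_rates = {"control": 0.10, "variant": 0.15}
bandit = EpsilonGreedyBandit(list(true_rates))
random.seed(42)
for _ in range(10_000):
    v = bandit.choose()
    bandit.record(v, random.random() < true_rates[v])

split = {v: bandit.counts[v] / 10_000 for v in bandit.counts}
print(split)  # the better variant ends up receiving most of the traffic
```

Note how the split is never fixed at 50/50: it shifts toward the winner as evidence accumulates, which is exactly the behavior described above.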
Two terms you’ll see crop up a lot when reading about MABs are “exploration” and “exploitation,” and the trade-off between the two. Exploration is the process of gathering information: repeatedly pulling a lever to see what happens. Exploitation is the process of maximizing returns: repeatedly pulling a winning lever to get the payout. Traditional A/B testing emphasizes exploration and risk mitigation, holding the split fixed until the result reaches statistical significance. MAB testing emphasizes exploitation and maximizing lift, sacrificing statistical significance to send more traffic to winners sooner. One testing method is not strictly better than the other, and both have valid use cases. Your strategy will depend on what data and resources you have available: traffic sample size, acceptable level of risk, what is being tested, and so on.
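One common way bandit tools balance this trade-off is Thompson sampling (other algorithms exist, and the tools discussed in the links below may use different ones). The sketch here is a hedged illustration with made-up running totals: each variation keeps a Beta distribution over its conversion rate, we draw one sample per variation, and the user sees whichever variation drew the highest sample. Uncertain variations produce wide, occasionally high draws (exploration); proven performers produce consistently high draws (exploitation).

```python
import random

def choose(stats):
    """stats maps variation -> (conversions, non_conversions).

    Draw a plausible conversion rate for each variation from its
    Beta posterior, then serve the variation with the highest draw.
    """
    samples = {v: random.betavariate(c + 1, f + 1) for v, (c, f) in stats.items()}
    return max(samples, key=samples.get)

# Hypothetical running totals after some traffic:
# control has converted 90 of 1,000 users (9%); variant 30 of 200 (15%).
stats = {"control": (90, 910), "variant": (30, 170)}

random.seed(0)
picks = [choose(stats) for _ in range(1000)]
print(picks.count("variant") / 1000)  # mostly the variant, but not always
```

The variant wins most draws because its observed rate is higher, yet the control still gets picked occasionally: its narrow posterior sometimes overlaps the variant's wider one. That residual traffic is the exploration that protects against locking onto an early fluke.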
- Comparing Multi-Armed Bandit Algorithm on Marketing Use Cases – Towards Data Science
- When to Run Bandit Tests Instead of A/B/n Tests – CXL
- Minimize Your A/B Test Losses Due to Low-Performing Variations – VWO