- Published on
Start with five restaurants whose true quality is unknown, then follow the same problem through multi-armed bandits, dueling bandits, noisy pairwise ranking, and top-K selection: the optimization structure is known, but its parameters must be learned by trying.