Efficient exploration in sequential decision-making problems ---- Yasin Abbasi

  • When Sep 17, 2019 from 11:00 AM to 12:00 PM (Europe/Amsterdam / UTC+200)
  • Where L016

I will discuss recent results on designing more adaptive bandit algorithms. Our first approach is based on the bootstrap method and yields a more efficient, data-dependent algorithm for the multi-armed bandit problem. Our second approach is a model-selection method for bandit problems. As an example of its usefulness, when the reward function is largely independent of the contexts, the method automatically converges to the simpler and more efficient non-contextual algorithm.
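To give a flavour of bootstrap-based exploration (not the speaker's specific algorithm), here is a minimal illustrative sketch: each round, every arm's mean reward is re-estimated from a bootstrap resample of its observed rewards, and the arm with the highest resampled mean is pulled. The Bernoulli arm means, the pseudo-reward priming of each arm's history (a common trick to keep early bootstrap estimates from collapsing), and all parameter choices are assumptions for illustration.

```python
import random

def bootstrap_bandit(arm_means, horizon, seed=0):
    """Illustrative bootstrap-based bandit on Bernoulli arms.

    Each arm's history is primed with one pseudo-reward of 0 and one
    of 1 (an assumption of this sketch) so that bootstrap resampling
    produces variable estimates from the start.
    """
    rng = random.Random(seed)
    histories = [[0.0, 1.0] for _ in arm_means]  # primed reward histories
    counts = [0] * len(arm_means)
    total = 0.0
    for _ in range(horizon):
        # Bootstrap estimate: resample each arm's history with replacement
        # and take the mean of the resample.
        estimates = [
            sum(rng.choice(h) for _ in h) / len(h) for h in histories
        ]
        arm = max(range(len(estimates)), key=estimates.__getitem__)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        histories[arm].append(reward)
        counts[arm] += 1
        total += reward
    return total, counts

total, counts = bootstrap_bandit([0.2, 0.5, 0.8], horizon=2000)
```

The resampling step plays the role that posterior sampling plays in Thompson sampling, but it is driven purely by the observed data, which is one sense in which such methods are data-dependent.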