Contextual Bandits with Stochastic Experts

Time: Thursday, April 26, 2018 - 2:00pm - 3:00pm
Type: Seminar Series
Presenter: Karthikeyan Shanmugam; IBM Research, NY
Room/Office: Room 335
Location:
17 Hillhouse Avenue
New Haven, CT 06511
United States

Department of Electrical Engineering & the Yale Institute for Network Science (YINS) Seminar


"Contextual Bandits with Stochastic Experts"

Karthikeyan Shanmugam
IBM Research, NY

Abstract: Contextual bandits is an online learning problem with applications in online recommendation systems, for example targeting ads or suggesting a treatment in healthcare. The basic problem is to learn a good policy that maximizes reward given contextual information. We consider this problem in the (non-parametric) expert setting.

We consider a set of stochastic experts, where each expert is a conditional distribution over the choices given a context. Regret is defined with respect to the best expert. We propose upper confidence bound (UCB) algorithms to minimize this regret. The algorithms are built on two importance-sampling-based estimators that we propose. Both estimators exploit information leakage between experts: samples collected under all the experts are used to estimate the mean reward of any given expert.
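To illustrate the idea of information leakage, here is a minimal Python sketch of a plain importance-sampling estimator. It is not one of the two estimators proposed in the talk, which the abstract does not specify. It assumes each expert can be queried for its arm probabilities in a given context. The toy experts, the reward table MEAN_REWARD, and all function names are hypothetical.

```python
import random

def importance_sampling_mean(samples, experts, target):
    """Estimate the mean reward of experts[target] from logged samples.

    samples: list of (context, arm, reward, logging_expert_index)
    experts: list of functions mapping a context to a list of arm probabilities
    Assumes the logging expert gives positive probability to every logged arm.
    """
    total = 0.0
    for context, arm, reward, logger in samples:
        p_target = experts[target](context)[arm]
        p_logger = experts[logger](context)[arm]
        total += reward * p_target / p_logger  # reweight toward the target expert
    return total / len(samples)

# Hypothetical toy problem: two contexts, two arms, two stochastic experts.
def expert_a(context):
    return [0.8, 0.2] if context == 0 else [0.3, 0.7]

def expert_b(context):
    return [0.4, 0.6] if context == 0 else [0.6, 0.4]

# Assumed Bernoulli reward means, indexed by (context, arm).
MEAN_REWARD = {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.7}

def true_mean(expert):
    """Exact mean reward of an expert when contexts are uniform over {0, 1}."""
    return sum(0.5 * expert(x)[a] * MEAN_REWARD[(x, a)]
               for x in (0, 1) for a in (0, 1))

def collect(expert_index, experts, n, rng):
    """Play one expert for n rounds and log what happened."""
    samples = []
    for _ in range(n):
        x = rng.randrange(2)
        probs = experts[expert_index](x)
        arm = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < MEAN_REWARD[(x, arm)] else 0.0
        samples.append((x, arm, reward, expert_index))
    return samples

rng = random.Random(0)
experts = [expert_a, expert_b]
logged = collect(0, experts, 20000, rng)   # only expert A is ever played
estimate_b = importance_sampling_mean(logged, experts, target=1)
```

Here estimate_b approximates true_mean(expert_b) even though expert B was never played. The variance of such an estimator grows with the probability ratios, so it works best when the experts place mass on similar arms.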

We derive problem-dependent logarithmic regret bounds. The pre-log factor of these bounds quantifies the information leakage between the experts. We then implement the algorithm together with a binary-classification-based oracle that generates experts online. Our implementation shows superior performance on some real-world datasets compared to other state-of-the-art contextual bandit algorithms in the expert setting.

Joint work with Rajat Sen (UT Austin) and Sanjay Shakkottai (UT Austin).

Bio: Karthikeyan Shanmugam is currently a Research Staff Member in the AI Science group at IBM Research, NY. Previously, he was a Herman Goldstine Postdoctoral Fellow in the Math Sciences Division at IBM Research, NY. He obtained his Ph.D. in Electrical and Computer Engineering from UT Austin in 2016 under the supervision of Dr. Alex Dimakis. Prior to this, he obtained an MS degree from USC and B.Tech and M.Tech degrees from IIT Madras. His research interests lie broadly in statistical machine learning, graph algorithms, coding theory, and information theory. In machine learning, his current research focuses on causal inference, online learning, and interpretability.

Hosted by: Professor Leandros Tassiulas
