Susan Murphy: Inference for Batched Bandits 

Online Causal Inference Seminar

"Inference for Batched Bandits"
Susan Murphy, Harvard University
Discussant: Stefan Wager, Stanford University
Abstract: As bandit algorithms are increasingly used in scientific studies and industrial applications, there is a growing need for reliable inference methods based on the resulting adaptively collected data. In this work, we develop methods for inference on data collected in batches using a bandit algorithm. When there is no unique optimal arm, we prove that the ordinary least squares (OLS) estimator is not asymptotically normal on data collected using standard bandit algorithms. This is the case even when the bandit is constrained to select each arm with probabilities bounded away from 0 and 1. We show that this problem can be traced to the fact that the arm selection probabilities do not concentrate. We take advantage of the batched setting to develop a Batched OLS estimator (BOLS) that we prove is (1) asymptotically normal on data collected from both multi-arm and contextual bandits and (2) robust to nonstationarity in the baseline reward. This is joint work with Kelly Zhang and Lucas Janson.
May 19, 2020
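
The batched idea in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' implementation: it assumes two arms with equal means (so there is no unique optimal arm), Gaussian rewards with a known noise level `sigma`, and a clipped greedy rule for the arm selection probability. The function name `bols_statistic` and all of its parameters are hypothetical. Within each batch the difference in arm means is estimated by OLS, standardized by sqrt(n0 * n1 / n) / sigma using that batch's arm counts, and the per-batch values are then averaged with a 1/sqrt(T) scaling.

```python
import math
import random


def bols_statistic(rng, n_batches=5, batch_size=100, sigma=1.0, clip=0.1):
    """Simulate one two-arm batched bandit run and return a batched-OLS z-statistic.

    Assumptions (illustrative only): equal arm means, Gaussian noise with known
    sigma, clipped greedy arm selection updated between batches.
    """
    arm_means = (0.0, 0.0)                 # no unique optimal arm
    true_margin = arm_means[1] - arm_means[0]
    totals = [0.0, 0.0]                    # cumulative reward per arm (drives the bandit)
    counts = [0, 0]                        # cumulative pulls per arm
    p_arm1 = 0.5                           # first batch: uniform randomization
    z_sum = 0.0

    for _ in range(n_batches):
        # The first two pulls are fixed so that both arms are observed in every batch.
        arms = [0, 1] + [1 if rng.random() < p_arm1 else 0
                         for _ in range(batch_size - 2)]
        batch_totals = [0.0, 0.0]
        batch_counts = [0, 0]
        for a in arms:
            reward = arm_means[a] + rng.gauss(0.0, sigma)
            batch_totals[a] += reward
            batch_counts[a] += 1

        # OLS on this batch alone: difference of the two sample means.
        n0, n1 = batch_counts
        margin_hat = batch_totals[1] / n1 - batch_totals[0] / n0
        # Standardize with this batch's own arm counts.
        z_sum += math.sqrt(n0 * n1 / batch_size) * (margin_hat - true_margin) / sigma

        # Update the selection probability from all data seen so far,
        # clipped so it stays bounded away from 0 and 1.
        for a in (0, 1):
            totals[a] += batch_totals[a]
            counts[a] += batch_counts[a]
        greedy_is_arm1 = totals[1] / counts[1] > totals[0] / counts[0]
        p_arm1 = 1.0 - clip if greedy_is_arm1 else clip

    return z_sum / math.sqrt(n_batches)


rng = random.Random(0)
draws = [bols_statistic(rng) for _ in range(500)]
mean = sum(draws) / len(draws)
std = math.sqrt(sum((d - mean) ** 2 for d in draws) / len(draws))
print(f"mean={mean:.3f} std={std:.3f}")
```

In this toy setup each per-batch term is standard normal given the past, because the standardization uses the realized arm counts of that batch, so the spread of `draws` should be close to that of a standard normal even though the selection probability keeps jumping between `clip` and `1 - clip`.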

Published: Oct 3, 2024
