Learnt something today, thanks! I think for the last example from Unlearn.AI, they will still need to test a few real people with a placebo to validate their model's performance. With a proven working model, they can then test mainly with the real drug for side effects, etc.
It's not often you hear a researcher give a high-level talk that regular folks can understand. Great talk, enjoyed it thoroughly. About that $20 though, what's the algo haha
At the moment it often uses UCB (Upper Confidence Bound) to maximise the utility return. But the overall problem is that in a casino the reward is not simply one state; it's far more complex than the simple one-state bandit setting. The casino example is a mere oversimplification.
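For anyone curious what UCB looks like in practice, here's a minimal UCB1 sketch for the simple one-state Bernoulli bandit being discussed (the payout probabilities are made-up numbers, and this is the textbook UCB1 rule rather than anything specific from the talk):

```python
import math
import random

random.seed(0)

# Made-up hidden payout probabilities for three slot machines.
true_probs = [0.1, 0.2, 0.8]
n_arms = len(true_probs)

counts = [0] * n_arms    # times each arm was pulled
rewards = [0.0] * n_arms  # total reward collected per arm

for t in range(1, 2001):
    if 0 in counts:
        # Play every arm once before applying the UCB rule.
        arm = counts.index(0)
    else:
        # UCB1: pick the arm maximising mean reward plus an
        # exploration bonus sqrt(2 ln t / n_i).
        scores = [
            rewards[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
            for i in range(n_arms)
        ]
        arm = scores.index(max(scores))
    reward = 1.0 if random.random() < true_probs[arm] else 0.0
    counts[arm] += 1
    rewards[arm] += reward

# Over time, pulls concentrate on the best arm (index 2 here),
# while the exploration bonus keeps the others occasionally sampled.
```

The bonus term shrinks as an arm gets pulled more, which is how UCB1 balances exploring uncertain arms against exploiting the current best one.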
Hello Sandeep, thank you for the quick run-through. Would you mind telling us how to connect or discuss with you after this session? As a follow-up: I feel that the Multi-Armed Bandit is a sort of optimisation problem for constraints under which it is quite hard and ineffective to perform A/B testing. Do you agree with that notion? Let me know your thoughts.
You can't use multi-armed bandits in online experimentation because they cause return-user bias. MABs can only be used once per user. The problem is that bandit machines have a fixed probability of payout, whilst a website user's probability of buying something increases over time. This means that if users are switched into a new variation, that new variation is more likely to incur a sale: a flawed experiment!