ML Interpretability: feature visualization, adversarial example, interp. for language models 

Umar Jamil

In this video, I will be introducing Machine Learning Interpretability, a vast topic that aims to understand the inner mechanisms by which machine learning models make their predictions, so that we can debug them and make them more transparent and trustworthy.
I will start by reviewing deep learning and the back-propagation algorithm, which are necessary for understanding adversarial example generation and feature visualization for computer vision classification models. In the second part, I will show how we can leverage the knowledge built in the first part of the video and apply it to language models. In particular, we will see how to gain insight into the biases of a language model by generating a prompt that maximizes the likelihood of the next token being a certain concept of our choice (a minimal sketch of this idea follows the examples below). This allows us to answer questions like:
"What does my language model think of women?"
"What does my language model think of minorities?"
This video has been built in collaboration with Leap Labs, an AI research lab that works on machine learning interpretability and built the Leap Labs Interpretability Engine, which provides insights into how computer vision models work and how to improve them by generating prototypes, isolating features, and understanding entanglement between classes.
Leap Labs: www.leap-labs....
Leap Labs Tutorials: docs.leap-labs...
As usual, the code and PDF slides are available at the following links:
- PDF slides: github.com/hkp...
- Adversarial Example Generation (tricking a classifier): github.com/hkp...
- Generate inputs for language models: github.com/jes...

Published: 16 Sep 2024
