
Inferencing and training LLMs with less GPUs - Hung Tran 

Machine Learning and AI Meetup
433 views

From the January 2024 Machine Learning & AI Meetup: www.meetup.com/machine-learning-ai-meetup/
Talk Description: This talk dives into clever ways to run Large Language Models (LLMs) with fewer GPUs, tackling both inference and training stages. For inference on limited hardware, we'll explore techniques like model partitioning/offloading and quantization to shrink their memory footprint. To slash training costs, we'll delve into ZeRO memory optimization, LoRA, and prompt-tuning, paving the way for more efficient model development. And to cap it off, we'll showcase running LLMs solely on a personal laptop with CPUs and RAM, demonstrating the potential to democratise access to these powerful language models.
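One of the inference-side techniques the description mentions is quantization, which shrinks a model's memory footprint by storing weights in a low-precision integer format. A minimal, illustrative sketch of symmetric per-tensor int8 quantization in plain NumPy follows (this is not code from the talk, just the basic idea):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map float weights onto [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
# int8 storage is 4x smaller than float32 (1 byte vs 4 bytes per weight)
ratio = w.nbytes // q.nbytes  # 4
# rounding error is bounded by half a quantization step
err = np.abs(dequantize(q, scale) - w).max()  # <= scale / 2
```

Real LLM quantization schemes (e.g. per-channel scales, 4-bit formats, activation-aware methods) are more elaborate, but they build on this same quantize/dequantize round trip.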
Speaker Bio: As an advancing Ph.D. student at Deakin University, specializing in video understanding, Hung Tran is a pioneer in applying deep learning and large language models to complex video data. His research involves processing video data, designing deep learning architectures and training them on multiple GPUs to uncover hidden patterns and predict future events. Recently, he explored how the inductive biases of large language models can enhance video analysis. Beyond research, he is a passionate coder with some experience in backend web development.
Link to Slides: docs.google.co...

Published: 29 Aug 2024

Comments: 1

@mobes3 (6 months ago): Fewer GPUs
Up next:

Elizabeth Silver - Causality and Causal Discovery (1:17:58)
[1hr Talk] Intro to Large Language Models (59:48, 2.1M views)
This is why Deep Learning is really weird. (2:06:38, 384K views)
A Survey of Techniques for Maximizing LLM Performance (45:32)
The Turing Lectures: The future of generative AI (1:37:37, 591K views)