
Inference & GPU Optimization: AWQ 

AI Makerspace

Join us as we explore techniques for optimizing Large Language Models (LLMs) for inference! This event covers the tradeoffs between performance and cost in both LLMs and Small Language Models (SLMs). Learn how quantization, specifically Activation-aware Weight Quantization (AWQ), compresses models with little loss in model quality. We'll break down the findings from recent research and show you how to apply these techniques using Transformers. If you want to maximize output while minimizing compute, this is an event you won't want to miss!
Event page: bit.ly/GPUOpti...
Have a question for a speaker? Drop them here:
app.sli.do/eve...
Speakers:
Dr. Greg, Co-Founder & CEO AI Makerspace
/ gregloughane
The Wiz, Co-Founder & CTO AI Makerspace
/ csalexiuk
Apply for our new AI Engineering Bootcamp on Maven today!
bit.ly/aie1
For team leaders, check out:
aimakerspace.i...
Join our community to start building, shipping, and sharing with us today!
/ discord
How'd we do? Share your feedback and suggestions for future events.
forms.gle/ZTeb...

Published: Oct 1, 2024

Comments: 1
@AI-Makerspace, 6 days ago
AI Makerspace: Activation Aware Weight Quantization (AWQ): colab.research.google.com/drive/1eCwenXmSd7u8ZM3TSqIDm4V7BZ_3K6dA?usp=sharing
Event Slides: www.canva.com/design/DAGRxxAiqtw/MIy6aqafzIfThBRBTPc86Q/view?DAGRxxAiqtw&