
vLLM Office Hours - FP8 Quantization Deep Dive - July 9, 2024 

Neural Magic
832 views

In this session, we brought on vLLM committers from Anyscale for an in-depth dive into FP8 quantization. They discussed why FP8 is important, how to get started with FP8 in vLLM, and shared quality and performance results for FP8 quantization.
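As a rough illustration of the "getting started" flow discussed in the session, the sketch below loads a model with vLLM's dynamic FP8 weight quantization and an FP8 KV cache. The model name and sampling settings are placeholders, and exact behavior depends on your vLLM version and GPU support for FP8.

```python
# Hedged sketch: online (dynamic) FP8 quantization in vLLM.
# The checkpoint and generation settings below are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder checkpoint
    quantization="fp8",    # quantize weights to FP8 at load time
    kv_cache_dtype="fp8",  # optional: store the KV cache in FP8 as well
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```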
We also covered the latest updates in vLLM v0.5.1, including pipeline parallelism and model support for Gemma 2, Jamba, and DeepSeek-V2.
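For the pipeline parallelism update, a minimal sketch is shown below, assuming at least two GPUs and a placeholder model name. In the v0.5.1 timeframe the feature was exposed through the OpenAI-compatible server's --pipeline-parallel-size flag; newer releases may surface it differently.

```python
# Hedged sketch: launching vLLM's OpenAI-compatible server with two pipeline stages.
# Assumes >= 2 GPUs; the model name is a placeholder.
import subprocess

subprocess.run([
    "python", "-m", "vllm.entrypoints.openai.api_server",
    "--model", "meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder checkpoint
    "--pipeline-parallel-size", "2",  # split model layers into 2 pipeline stages
])
```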
For more details, check out the session slides here: docs.google.co...
Join our bi-weekly vLLM office hours to stay current with vLLM, ask questions, meet the community, and give feedback: neuralmagic.co...

Published: Sep 21, 2024
