
vLLM Office Hours - Multimodal Models in vLLM with Roblox - August 8, 2024 

Neural Magic
396 views

Published: Sep 21, 2024

Comments: 2
@hari000-f6y, 15 days ago
I have a question! I'm serving a quantized multimodal model (InternVL2) with vLLM on an L4. A single request takes ~5-6 s to complete, but when multiple requests arrive at the same time they take much longer, ~30 s, to finish. How can I handle this so that concurrent requests also complete in ~5 s? I have a limited understanding of batched requesting.
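One point relevant to the question above: vLLM's server performs continuous batching, so issuing requests concurrently from the client (rather than serially, one after another) usually keeps total wall time close to a single request's latency instead of N times it. The sketch below illustrates that effect only; the stub function and its sleep stand in for a real HTTP call to a vLLM endpoint, and all names and timings are illustrative assumptions, not the setup from the video.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_generate(prompt: str) -> str:
    # Stand-in for a POST to a vLLM OpenAI-compatible endpoint
    # (e.g. http://localhost:8000/v1/chat/completions — hypothetical).
    # The sleep simulates ~0.5 s of server-side generation time.
    time.sleep(0.5)
    return f"response to {prompt!r}"

prompts = [f"describe image {i}" for i in range(8)]

start = time.perf_counter()
# Submit all 8 requests at once so they overlap in flight,
# the same way vLLM's scheduler batches concurrent requests.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fake_generate, prompts))
elapsed = time.perf_counter() - start

# Wall time stays near one request's latency (~0.5 s here),
# not 8x it as a serial loop would give.
print(f"{len(results)} responses in {elapsed:.1f}s")
```

On a real deployment the equivalent client-side change is to fire requests from multiple threads or async tasks; if concurrency still degrades latency badly, server-side limits such as available KV-cache memory or max batch size on the GPU are the usual suspects.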
@shumshvenhiszali, 1 month ago
They say the code is open source, but where is it?