
Xupeng Miao (Purdue University) - Faster Inference of LLMs Seminar ⚡️ 

Nadav Timor

About the seminar: faster-llms.ve...
Title: Towards Fast and Affordable Serving Systems for Large Language Models
Abstract: In the rapidly evolving field of generative artificial intelligence, efficient deployment of large language models (LLMs) is a critical challenge. In this talk, I will introduce three approaches to improving the efficiency and cost-effectiveness of LLM inference and serving systems. First, I will present SpecInfer, the first tree-based speculative inference system, which reduces LLM serving latency by 1.5-3.5x compared to existing solutions by leveraging a novel token tree speculation and verification mechanism. Next, I will describe SpotServe, the first LLM serving system for spot instances, which handles preemptions with dynamic reparallelization, keeps tail latency relatively low, and reduces monetary cost by 54%. Finally, I will present Mirage, a superoptimizer that automatically discovers highly optimized GPU implementations for LLMs and beyond, which can even outperform expert-designed implementations such as FlashAttention.
Recorded on Aug 28, 2024.
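The token tree speculation and verification mechanism mentioned in the abstract can be illustrated with a toy sketch. This is not SpecInfer's actual implementation: the "draft" and "target" models below are hypothetical deterministic stand-ins, and the tree shape (depth, branching factor) is an arbitrary choice for illustration. The idea is that a cheap draft model proposes a tree of candidate continuations, and the expensive target model verifies a root-to-leaf path, accepting the longest prefix that matches its own predictions.

```python
# Toy sketch of tree-based speculative decoding (illustrative only,
# not SpecInfer's code). Token vocabulary is the integers 0..4.

def target_next(prefix):
    # Hypothetical "target" model: deterministically picks the next token.
    return sum(prefix) % 5

def draft_tree(prefix, depth, branch):
    # Hypothetical "draft" model: proposes `branch` candidate tokens per
    # node, recursively, building a token tree of the given depth.
    if depth == 0:
        return {}
    guesses = [(sum(prefix) + d) % 5 for d in range(branch)]
    return {g: draft_tree(prefix + [g], depth - 1, branch) for g in guesses}

def verify(prefix, tree):
    # Walk the tree, accepting a child only when it matches the target
    # model's next token; return the verified continuation. In a real
    # system all tree nodes are scored by the target model in one batched
    # forward pass, which is where the latency savings come from.
    accepted, node = [], tree
    while node:
        t = target_next(prefix + accepted)
        if t not in node:
            break
        accepted.append(t)
        node = node[t]
    return accepted

prefix = [1, 2]
tree = draft_tree(prefix, depth=3, branch=2)
print(verify(prefix, tree))  # prints [3, 1, 2]: all speculated tokens accepted
```

Because this toy draft model always includes the target's choice among its guesses, the whole speculated path is accepted; with a real draft model, acceptance stops at the first mismatch and the target model's own token is emitted there instead.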

Published: Oct 7, 2024
