19 - Google BigQuery / Dremel (CMU Advanced Databases / Spring 2023) 

CMU Database Group
9K views

Published: 12 Sep 2024

Comments: 2
@StasPakhomov-wj1nn, 8 months ago
The missing 18th lecture can be substituted with 2020's at the link below! Cheers
@SteveLoughran, 8 months ago
AFAIK Hadoop MR only uses local storage between Map and Reduce; the output of each job is committed to shared storage. That is where writing to HDFS takes place; writing to cloud storage is "trickier" due to non-POSIX semantics, especially on rename, plus a tendency to throttle. And complex SQL-equivalent statements can become multiple MR jobs. Storage of intermediate shuffle data is managed by the YARN NodeManager, so it outlives the mapper/reducer processes. You can also plug in new shufflers, e.g. for Spark. The design does contain the "hosts are long-lived" assumption, so it doesn't suit compute-only VMs running at spot prices.
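The storage pattern described in the comment above can be illustrated with a toy, single-process sketch. This is not Hadoop code: `run_job`, `read_output`, the `.spill` file names, and the `_temporary-` prefix are all hypothetical names chosen for illustration. A temporary directory stands in for node-local disk, an ordinary directory stands in for shared storage, and the commit step is a rename, which is atomic on a POSIX file system but generally not on an object store.

```python
import os
import tempfile
from collections import Counter


def run_job(lines, shared_out, num_reducers=2):
    """Toy word count: map output on local disk, job output committed by rename."""
    with tempfile.TemporaryDirectory() as local_dir:  # stands in for node-local disk
        # Map phase: write each word to a local spill file, one file per reducer.
        # A byte-sum partitioner is used because Python's str hash() is
        # randomized per process.
        spills = [
            open(os.path.join(local_dir, f"part-{r}.spill"), "w")
            for r in range(num_reducers)
        ]
        for line in lines:
            for word in line.split():
                r = sum(word.encode()) % num_reducers
                spills[r].write(word + "\n")
        for f in spills:
            f.close()

        # Reduce phase: read the local spill, write to a temporary name in the
        # "shared" directory, then commit with a rename.
        os.makedirs(shared_out, exist_ok=True)
        for r in range(num_reducers):
            with open(os.path.join(local_dir, f"part-{r}.spill")) as f:
                counts = Counter(w.strip() for w in f)
            tmp_path = os.path.join(shared_out, f"_temporary-part-{r}")
            final_path = os.path.join(shared_out, f"part-{r}")
            with open(tmp_path, "w") as out:
                for word, n in sorted(counts.items()):
                    out.write(f"{word}\t{n}\n")
            # Commit step: atomic on POSIX file systems. On an object store a
            # "rename" is typically a copy plus delete, hence the difficulty.
            os.replace(tmp_path, final_path)


def read_output(shared_out):
    """Collect committed part files into a single dict of word counts."""
    result = {}
    for name in sorted(os.listdir(shared_out)):
        if name.startswith("part-"):
            with open(os.path.join(shared_out, name)) as f:
                for line in f:
                    word, n = line.rstrip("\n").split("\t")
                    result[word] = int(n)
    return result
```

In this sketch the spill files disappear as soon as the job function returns, whereas the comment notes that in Hadoop the shuffle data is managed by a separate long-lived service and so outlives the task processes.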