
The future of Delta Lake and Apache Iceberg 

NextGenLakehouse
Views: 905

Published: 27 Sep 2024

Comments: 3
@jeanchindeko5477, 2 months ago
Why did Apple switch to Apache Iceberg?
@jeanchindeko5477, 2 months ago
Yes, Liquid Clustering is a good starting point and moves things in the right direction in terms of user/developer experience. It might not solve all the problems, but it will already help with part of your small-files concern.
@utsavchanda4190, 2 months ago
Really insightful discussion, thank you. Honestly, I've always wondered whether lakehouses built on open table formats can guarantee the same performance as MPP warehouses. My biggest concern has been that in Delta every operation (insert, update, or delete) is essentially an insert (a new file) under the hood. Then there are other considerations, such as the small-file problem and optimized writes. I've also always felt there was significant development and operational overhead in running OPTIMIZE and Z-ORDER, and now enabling deletion vectors, to keep tables performant as they grow. Does Liquid Clustering take that overhead away from customers and make their lives easier? I know Databricks promises intelligent optimization and automatic clustering for managed tables, but what about external tables? Most companies have external tables, where the underlying files are in their own realm.