
Delta Lake Deep Dive: Liquid Clustering 

Delta Lake

Join us on Thursday, December 7 at 10 AM PST for a session on Delta Lake's Liquid Clustering, a new approach to data layout and optimization, with Vítor Teixeira, Senior Data Engineer at Veeva Systems.
Liquid Clustering is Delta Lake's answer to the complex challenges of Big Data. Traditionally, partitioning and Z-Order clustering have been used to improve query performance by managing large datasets effectively. However, these methods come with limitations such as complexity in implementation, rigidity in data layout, and the need for frequent data rewrites. Delta Lake’s Liquid Clustering offers a dynamic solution. It allows for flexible redefinition of clustering keys without the need to rewrite existing data, adapting effortlessly to evolving analytic needs.
This session will cover how Liquid Clustering simplifies data layout decisions and optimizes query performance, marking a significant advancement over traditional partitioning and Z-Order clustering methods. Don’t miss this opportunity to learn about Liquid Clustering and how it can revolutionize your data management strategy.
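The idea of redefining clustering keys can be sketched with a few small Python helpers that only build SQL strings. This is a minimal sketch, not the speaker's code: the helper names, the `events` table, and its columns are hypothetical, and it assumes a Delta Lake version whose SQL accepts the `CLUSTER BY` clause on `CREATE TABLE` and `ALTER TABLE`. The resulting strings would typically be passed to something like `spark.sql(...)`.

```python
import re

# Plain identifiers only; this sketch does not handle quoted or exotic names.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name):
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def _table(name):
    """Validate a possibly qualified table name such as 'db.events'."""
    return ".".join(_ident(part) for part in name.split("."))


def _keys(cluster_by):
    """Render the clustering columns as a comma-separated list."""
    if not cluster_by:
        raise ValueError("at least one clustering column is required")
    return ", ".join(_ident(c) for c in cluster_by)


def create_clustered_table_sql(table, columns, cluster_by):
    """Build CREATE TABLE ... CLUSTER BY; columns maps name -> SQL type."""
    missing = [c for c in cluster_by if c not in columns]
    if missing:
        raise ValueError(f"clustering columns not in schema: {missing}")
    cols = ", ".join(f"{_ident(n)} {t}" for n, t in columns.items())
    return (f"CREATE TABLE {_table(table)} ({cols}) "
            f"USING DELTA CLUSTER BY ({_keys(cluster_by)})")


def recluster_sql(table, cluster_by):
    """Build the statement that changes clustering keys on an existing table."""
    return f"ALTER TABLE {_table(table)} CLUSTER BY ({_keys(cluster_by)})"


def optimize_sql(table):
    """Build the OPTIMIZE statement that triggers clustering of the data."""
    return f"OPTIMIZE {_table(table)}"
```

For example, `recluster_sql("events", ["country", "event_date"])` yields `ALTER TABLE events CLUSTER BY (country, event_date)`. In line with the description above, changing the keys this way is not expected to rewrite existing files by itself; running the statement from `optimize_sql` afterwards is what applies the clustering.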
Quick Links
Join us on Slack: go.delta.io/slack
GitHub: github.com/del...
Join Google Groups: groups.google....

Published: 28 Sep 2024

Comments: 7
@alexischicoine2072 5 months ago
Very interesting. For Z-ordering you can store the columns in table properties at table creation and then retrieve them when optimizing; it's not that much code.
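The approach in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the commenter's code: the property key `zorder.columns` and both helper names are hypothetical, and the properties are passed in as a plain dict (in Spark they would have to be read from the table first, for example via `SHOW TBLPROPERTIES`).

```python
# Hypothetical custom table property used to remember the Z-order columns.
ZORDER_KEY = "zorder.columns"


def zorder_property(columns):
    """Return the property to set at table creation, as a key/value dict."""
    return {ZORDER_KEY: ",".join(columns)}


def zorder_optimize_sql(table, properties):
    """Build an OPTIMIZE statement from previously stored table properties.

    Falls back to a plain OPTIMIZE when no Z-order columns were stored.
    """
    raw = properties.get(ZORDER_KEY, "")
    columns = [c.strip() for c in raw.split(",") if c.strip()]
    if not columns:
        return f"OPTIMIZE {table}"
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"
```

With `props = zorder_property(["country", "event_date"])`, calling `zorder_optimize_sql("events", props)` produces `OPTIMIZE events ZORDER BY (country, event_date)`.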
@luisriveros1119 9 months ago
Hi! I have a question: is it possible to implement liquid clustering for DataFrames saved directly to Delta files (df.write.format("delta").save("path")), instead of the conventional approach involving table creation?
@alexischicoine2072 5 months ago
It's a great combo with deletion vectors, as you don't have to rewrite the data. Without deletion vectors it could make deletes more expensive, as the data would be spread and mixed across files.
@k.saibhargav8072 6 months ago
What is the difference between bucketBy and Liquid Clustering?
@chrisstephenson9890 8 months ago
Thanks for sharing this talk. Would you be so kind as to share a link to the slide deck presented by Vítor?
@raviv5109 7 months ago
One question: is it a wise decision to apply partitioning to a liquid clustering table?
@paulfunigga 7 months ago
Partitioning is not compatible with liquid clustering.