Whenever we explain how Delta works with Parquet, rewriting whole files and making redundant copies of "unchanged" data whenever a record is updated or deleted, people are understandably shocked - it's a huge amount of unnecessary work. With Delta Deletion Vectors, we finally have a better answer - deleting records is now a quick, simple metadata operation!
In this video Simon walks through the concept of deletion vectors, looks at how they are implemented, and follows a simple example to see what happens at the file and transaction log level.
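For anyone who wants the idea in code before watching, here is a rough Python sketch of the difference between the two delete strategies. It is a toy model under our own assumptions, not how Delta is implemented: files are plain dicts of row lists rather than Parquet, the deletion vector is a Python set of row positions rather than a bitmap referenced from the transaction log, and the names delete_copy_on_write, delete_with_vectors and read are made up for illustration.

```python
# Toy model only: real Delta tables keep data in Parquet files and track
# deletion vectors through the transaction log. Everything here is hypothetical.

def delete_copy_on_write(files, predicate):
    """Rewrite every file holding a matching row; return (new_files, rows_rewritten)."""
    new_files = {}
    rows_rewritten = 0
    for name, rows in files.items():
        if any(predicate(r) for r in rows):
            # The surviving rows are copied into a brand new file.
            kept = [r for r in rows if not predicate(r)]
            new_files[name + ".rewritten"] = kept
            rows_rewritten += len(kept)
        else:
            new_files[name] = rows
    return new_files, rows_rewritten


def delete_with_vectors(files, vectors, predicate):
    """Leave data files untouched; record the positions of deleted rows per file."""
    new_vectors = {name: set(pos) for name, pos in vectors.items()}
    for name, rows in files.items():
        for i, r in enumerate(rows):
            if predicate(r):
                new_vectors.setdefault(name, set()).add(i)
    return new_vectors


def read(files, vectors):
    """Return all rows, skipping any position marked as deleted."""
    out = []
    for name, rows in files.items():
        deleted = vectors.get(name, set())
        out.extend(r for i, r in enumerate(rows) if i not in deleted)
    return out


files = {
    "part-0": [{"id": 1}, {"id": 2}, {"id": 3}],
    "part-1": [{"id": 4}, {"id": 5}],
}
is_target = lambda r: r["id"] == 2

cow_files, rewritten = delete_copy_on_write(files, is_target)
vectors = delete_with_vectors(files, {}, is_target)
```

In this sketch both approaches return the same rows on read, but the copy-on-write delete has to copy the two surviving rows of part-0 into a new file, while the vector-based delete only records a single row position and leaves the data files alone.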
To learn more about deletion vectors, check out: docs.databricks.com/en/delta/...
And if you need help on your Data & AI journey, give Advancing Analytics a call!
15 Oct 2023