Physical Plans in Spark SQL-continues - David Vrba (Socialbakers) 

Databricks
8K views

Published: 21 Aug 2024

Comments: 6
@JoHeN1990, 4 years ago
These are some of the best optimizations I have seen. This kind of optimization can only come from a deep understanding of Spark internals plus lots of experience. Kudos to the speaker! Much appreciated!
@navjotmannan7303, 3 years ago
You have shared great optimizations out here!!
@hasun-tv, 4 years ago
These are the slides for this video: www.slideshare.net/mobile/databricks/physical-plans-in-spark-sql
@sgarai, 4 years ago
In my program, an anti join or NOT EXISTS condition within the same table dataset produces a BroadcastHashJoin, and it performs a nested loop join. I have tried cache and repartition, but every time it hits the broadcast threshold of 8 GB. Even disabling the broadcast threshold through the Spark conf does not seem to work. Can you please suggest a solution?
@sgarai, 4 years ago
Actually, it's tricky to explain the whole scenario, but the takeaway from this video would be enabling CBO and analyzing the table just before the anti join. The program is written in PySpark. Any suggestions for dealing efficiently with anti joins or NOT EXISTS with a correlated subquery (which actually breaks down to a join) would be of great help.
@antonioperalta6338, 4 years ago
@sgarai Have you found a solution?