Making PySpark code faster with DuckDB 

MotherDuck
4.1K subscribers
3.3K views

In this video, @mehdio dives into a new experimental DuckDB feature: running PySpark code on the DuckDB engine ⚡
Note: This is not yet supported on MotherDuck.
📓 Resources
* GitHub repo of the tutorial: github.com/mehd-io/duckdb-pys...
* Niels Claeys's benchmark on SQL engines: / head-to-head-compariso...
➡️ Follow Us
LinkedIn: / motherduck
X (formerly known as Twitter): / motherduck
Blog: motherduck.com/blog/
0:00 Intro
0:53 Challenges of Apache Spark development
3:24 The Java boat load
6:01 PySpark with DuckDB demo
8:10 A word about benchmarks
8:50 Limitations
9:26 Conclusions
#duckdb #pyspark #apachespark #dataengineering

Published: 28 Jul 2024
Comments: 4
@Gretschi 6 months ago
Very interesting topic! I'm currently writing my master's thesis on PySpark performance optimization on Kubernetes with respect to Spark configuration parameters 👍 I will also take a look at DuckDB.
@ryguyrg 8 months ago
Another awesome video Mehdi! Love the animations! ❤
@tosinadekunle646 2 months ago
Do we still need to install Java and Hadoop and modify the environment variables on the local machine to do this, or can we just install DuckDB, pip install pyspark, and start using SparkSession and SparkContext? Thank you.
@motherduckdb 2 months ago
It's an API translation, meaning you can write Spark code while the execution is done on DuckDB if you want. In that case, no PySpark/Java/Hadoop is needed. Hope that clarifies!
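A minimal sketch of what that API translation can look like in practice, assuming the `duckdb` and `pandas` packages are installed. The module path `duckdb.experimental.spark.sql` is DuckDB's experimental Spark-compatible API and may change between releases. The helper names `load_spark` and `demo` and the sample data are made up for illustration and are not taken from the video or the tutorial repo.

```python
import importlib

# Module paths for the two engines. The DuckDB entry is the experimental
# Spark-compatible API, so it may move or change in future releases.
SPARK_SQL_MODULES = {
    "duckdb": "duckdb.experimental.spark.sql",
    "pyspark": "pyspark.sql",
}


def load_spark(engine="duckdb"):
    """Return (SparkSession class, functions module) for the chosen engine.

    Hypothetical helper: only the selected engine gets imported, so PySpark
    and Java are not required when engine == "duckdb".
    """
    base = SPARK_SQL_MODULES[engine]
    sql = importlib.import_module(base)
    functions = importlib.import_module(base + ".functions")
    return sql.SparkSession, functions


def demo(engine="duckdb"):
    """Run the same DataFrame code on either engine and return the rows."""
    # Imported lazily so this module still loads when pandas is missing.
    import pandas as pd

    SparkSession, F = load_spark(engine)
    spark = SparkSession.builder.getOrCreate()

    # Made-up sample data, for illustration only.
    people = pd.DataFrame({"name": ["Joan", "Peter", "John"], "age": [34, 45, 23]})

    df = spark.createDataFrame(people)
    df = df.withColumn("location", F.lit("Seattle"))
    return df.select(F.col("name"), F.col("location")).collect()
```

Calling `print(demo("duckdb"))` should need only `pip install duckdb pandas`, while `demo("pyspark")` runs the same DataFrame code on a regular Spark session and therefore needs PySpark and a Java runtime. Because the DuckDB API is experimental, some Spark methods may not be implemented yet.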