Designing Structured Streaming Pipelines: How to Architect Things Right - Tathagata Das, Databricks

Structured Streaming has proven to be the best platform for building distributed stream processing applications. Its unified SQL/Dataset/DataFrame APIs and Spark's built-in functions make it easy for developers to express complex computations. However, expressing the business logic is only part of the larger problem of building end-to-end streaming pipelines that interact with a complex ecosystem of storage systems and workloads. It is important for the developer to truly understand the business problem that needs to be solved.
What are you trying to consume? A single source? Multiple streaming sources joined together? Streaming data joined with static data?
What are you trying to produce? What is the final output that the business wants? What type of queries does the business want to run on the final output?
When do you want it? When does the business want the data? What is the acceptable latency? Do you really want millisecond-level latency?
How much are you willing to pay for it? This is the ultimate question, and the answer largely determines how feasible it is to address the questions above.
These are the questions that we ask every customer in order to help them design their pipeline. In this talk, I am going to go through the decision tree of designing the right architecture for solving your problem.
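As a rough illustration of the "When do you want it?" question, the sketch below maps an acceptable latency to keyword arguments for PySpark's `DataStreamWriter.trigger`. The function name `choose_trigger`, the `TriggerChoice` type, and the latency thresholds are hypothetical and chosen only for illustration; they are not taken from the talk. The trigger options themselves (`availableNow`, `processingTime`, `continuous`) are existing Spark options, with `availableNow` requiring Spark 3.3 or later.

```python
from dataclasses import dataclass


@dataclass
class TriggerChoice:
    kwargs: dict  # keyword arguments intended for DataStreamWriter.trigger(...)
    note: str     # short rationale for the choice


def choose_trigger(acceptable_latency_s: float) -> TriggerChoice:
    """Pick a trigger from an acceptable latency given in seconds.

    The thresholds below are illustrative assumptions, not recommendations
    from the talk.
    """
    if acceptable_latency_s <= 0:
        raise ValueError("acceptable latency must be positive")
    if acceptable_latency_s >= 3600:
        # Hours of slack: process what is available on a schedule, then stop,
        # so the cluster does not have to run continuously.
        return TriggerChoice({"availableNow": True}, "scheduled incremental run")
    if acceptable_latency_s >= 1:
        # Seconds to minutes: micro-batches at half the latency budget (assumed).
        interval = max(1, int(acceptable_latency_s // 2))
        return TriggerChoice(
            {"processingTime": f"{interval} seconds"}, "micro-batch"
        )
    # Sub-second: continuous processing; the value is the checkpoint interval.
    return TriggerChoice({"continuous": "1 second"}, "continuous processing")


if __name__ == "__main__":
    for latency in (7200, 30, 0.05):
        choice = choose_trigger(latency)
        print(latency, choice.kwargs, choice.note)
```

With a real streaming DataFrame `df`, the result could be applied as `df.writeStream.trigger(**choose_trigger(30).kwargs)`. Lower latency generally means a continuously running cluster, which is where the cost question comes in.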

Published: 15 Jun 2024
Comments: 5
@vinr, 4 years ago
Super presentation, thank you
@karthikeyanbalachandran4146
Thank you very much for the education.
@karthikeyanbalachandran4146
Could you please share the links to reference previous deep dive talks/sessions/demos?
@abhinee, 4 years ago
basically use delta lake and all problems solved !!!!
@agammishra9674, 2 years ago
🤣🤣🤣🤣