Riccardo Amadio | Declarative data manipulation pipeline with Dagster | PyData Amsterdam 2023 

PyData

Bored of your old pipeline orchestrator? Struggling to tell whether your data is up to date? Having trouble with the development workflow of your data pipelines?
Dagster, an open-source tool, offers a unique paradigm that simplifies the orchestration and management of data pipelines.
By adopting declarative principles, data engineers and data scientists can build scalable, maintainable, and reliable pipelines effortlessly.
We will commence with an introduction to Dagster, covering its fundamental concepts to ensure a comprehensive understanding of the material.
Subsequently, we will explore practical scenarios and use cases, including DBT to harness the power of the SQL language.
Minutes 0-5: Explain the design pattern problem of current data pipeline frameworks.
Minutes 5-15: Introduction to Dagster and its core concepts.
Minutes 10-25: Practical examples of building declarative data pipelines with Dagster, including DBT and the power of the gRPC server.
Minutes 25-30: Q&A and conclusion.
Are you tired of struggling with outdated pipeline orchestrators? Do you find it challenging to ensure your data is always up-to-date? Are you facing difficulties with the development workflow of your data pipeline?
In this session, we will introduce Dagster, an open-source tool that revolutionizes the orchestration and management of data pipelines. By embracing declarative principles, data engineers and data scientists can effortlessly build scalable, maintainable, and reliable pipelines.
We will begin by providing an overview of the design pattern problem that many existing data pipeline frameworks face. Understanding the limitations of these frameworks will set the stage for exploring the transformative capabilities of Dagster.
Next, we will delve into the core concepts of Dagster, ensuring a comprehensive understanding of the material. You will learn how Dagster simplifies pipeline development and execution by providing a declarative and intuitive approach. Through practical examples and hands-on demonstrations, we will showcase how you can leverage Dagster to build powerful data pipelines.
But that's not all! We will also explore the integration of DBT, empowering you to harness the full potential of the SQL language within your data pipelines. You will witness the synergy between Dagster and DBT, unlocking new possibilities for data manipulation and transformation.
By the end, you'll be equipped with the knowledge and inspiration to elevate your data pipeline workflows to new heights.
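The declarative idea described above can be sketched in a few lines of plain Python. This is a toy illustration, not Dagster's API or implementation: the `asset` decorator, the `ASSETS` registry, `materialize_all`, and the sample order data are all hypothetical names invented for this sketch. It borrows one convention that Dagster's own `@asset` decorator also uses, namely that a function's parameter names declare which upstream assets it depends on, so the execution order is derived from the declarations rather than written by hand.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry of declared assets (not part of Dagster).
ASSETS = {}


def asset(fn):
    """Register fn as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_orders():
    # Made-up sample data standing in for an extract step.
    return [
        {"id": 1, "amount": 20.0},
        {"id": 2, "amount": 5.0},
        {"id": 3, "amount": 12.5},
    ]


@asset
def large_orders(raw_orders):
    # Depends on raw_orders simply by naming it as a parameter.
    return [o for o in raw_orders if o["amount"] >= 10]


@asset
def revenue(large_orders):
    return sum(o["amount"] for o in large_orders)


def materialize_all(assets=ASSETS):
    """Compute every asset once, upstream assets first."""
    # Map each asset name to the set of asset names it depends on.
    graph = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in assets.items()
    }
    results = {}
    # static_order() yields dependencies before the assets that need them.
    for name in TopologicalSorter(graph).static_order():
        kwargs = {dep: results[dep] for dep in graph[name]}
        results[name] = assets[name](**kwargs)
    return results


if __name__ == "__main__":
    print(materialize_all()["revenue"])  # prints 32.5
```

Nothing in the sketch states the order in which the three functions run. Adding a new asset only requires declaring what it consumes, which is the property the talk attributes to a declarative pipeline.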
Outline:
Minutes 0-5: Understanding the design pattern problem of existing data pipeline frameworks
Minutes 5-15: Introduction to Dagster and its core concepts
Minutes 10-25: Practical examples of building declarative data pipelines with Dagster, including the integration with DBT and the power of gRPC server
Minutes 25-30: Q&A and conclusion
Bio:
Riccardo Amadio
Senior Data Engineer at Agile Lab with a background as a Data Scientist and Software Engineer.
When I don't work with data pipelines, I juggle between closing some of my 100+ open tabs on the browser and my true passion: collecting stars on GitHub 🔭🌟. In this treasure trove of more than 2,000 repositories, I am pretty sure I can find any tool to solve a problem, and I can’t wait to share them with you.
===
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/RU-vidVi...

Science

Published: 5 Jul 2024
