Intro to Airflow with DBT and Cosmos, Big Data Utah Meetup 

UtahGeekEvents

Originally presented at the Big Data Utah Meetup in January 2023.
www.meetup.com...
Abstract
Data teams often use dbt and Airflow together as complementary tools. dbt (data build tool) is an open-source command-line tool that enables data analysts and engineers to transform data in their warehouse using SQL. Airflow, traditionally used by Python-savvy data engineers, orchestrates data pipelines across various systems. dbt users are traditionally analytics engineers: folks who sit closer to the business and are skilled in SQL but perhaps less well-versed in Python. Because Airflow and dbt can be integrated in multiple ways, confusion and challenges can arise, leading to a loss of data observability and/or dependency conflicts.
Cosmos is an Apache 2.0 licensed OSS project from Astronomer designed to generate DAGs (directed acyclic graphs) dynamically from other frameworks. Leveraging Cosmos with dbt in Airflow results in a first-class ETL authoring experience, allowing for native Airflow connections and virtual environment management, as well as native Airflow operators to run dbt commands.
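To make the idea of generating a DAG from another framework concrete, here is a minimal sketch in plain Python. It is not Cosmos's implementation and does not use the Cosmos or Airflow APIs; the project name `shop`, the model names, the `_run` task suffix, and both helper functions are hypothetical. It assumes a simplified, hand-written stand-in for the `nodes` section of a dbt manifest, maps each model to one task, and derives a valid run order from the model dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified stand-in for the "nodes" section of a dbt manifest.
# Each model lists the models it depends on.
MANIFEST_NODES = {
    "model.shop.stg_orders": {"depends_on": {"nodes": []}},
    "model.shop.stg_customers": {"depends_on": {"nodes": []}},
    "model.shop.orders": {
        "depends_on": {
            "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
        }
    },
    "model.shop.revenue": {"depends_on": {"nodes": ["model.shop.orders"]}},
}


def task_id(unique_id):
    """Turn 'model.shop.orders' into the (assumed) task name 'orders_run'."""
    return unique_id.split(".")[-1] + "_run"


def build_task_graph(nodes):
    """Return {task_id: [upstream task_ids]} with one task per dbt model."""
    return {
        task_id(uid): sorted(task_id(dep) for dep in node["depends_on"]["nodes"])
        for uid, node in nodes.items()
    }


def execution_order(graph):
    """Return one valid order in which every task follows its upstreams."""
    return list(TopologicalSorter(graph).static_order())


graph = build_task_graph(MANIFEST_NODES)
order = execution_order(graph)

for name in order:
    print(name, "<-", graph[name])
```

In an orchestrator, each entry of `graph` would become a task and each upstream list a set of task dependencies, so a failure in one model is visible as a failure of one task rather than of the whole dbt run.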
This talk will go through an introduction to Apache Airflow as well as the pros/cons of using different options for integrating Airflow and dbt. We’ll end with an introduction to and demo of Cosmos.
Chris Hronek Bio
Chris Hronek is an experienced Analytics Engineer with a proven track record in collecting, storing, analyzing, and presenting data to meet organizational needs. His professional background includes experience in senior data engineering, data engineering, and business intelligence engineering roles with expertise in Apache Airflow and SQL. Chris also enjoys backcountry skiing, mountain biking, and backpacking in the Uintas.
Ben Garrison Bio
Ben Garrison is a Field Engineer at Astronomer, serving as a technical advisor to folks desiring to improve how they manage, scale and leverage Apache Airflow for all their data pipeline needs. Ben is a recent transplant to Salt Lake City and loves hiking, mountain biking and snowboarding in the glorious Wasatch mountains. Ben is also an avid board game enthusiast and lover of barbecue.

Published: 7 Sep 2024

Comments: 8
@litan5006 1 year ago
Great video. Thank you. You should also speak at the summit
@hemkumarchheda1581 1 year ago
This is Huge. Thank you so much for sharing this video. Loved it!! :)
@softwareengineer5764 5 days ago
If I want to deploy this dbt, Cosmos, and Airflow stack on top of Kubernetes (e.g. GKE) with BigQuery storage, any suggestions, please?
@phnv 8 months ago
Great video! Thanks a lot! One question: do you run 'dbt init' to start another dbt project?
@nonojinomo 1 year ago
Great video! Would you have any example of how to run only a specific model, or any other commands, instead of the whole project? Couldn't find it in the docs!
@paulellicapadilla3421 1 year ago
This is great but the whole stack only accounts for pushing data. The stack doesn't account for pulling data. As a data orchestrating stack, you would expect all the metadata would be accounted for except for one place, and that would be data streams like Kafka. Kafka is a big thing if you have to deal with massive data that has to be real time for data analytics and I don't see this being accounted for in this stack. Maybe I'm missing something.
@mezo9163 10 months ago
How could we run a macro in our DAGs rather than a model?
@popo-je8ze 1 year ago
Does it support dbt Cloud?