Unit testing with Databricks | Jonathan Neo | November 2021 

Melbourne Databricks User Group
17K views

Just like eating vegetables, no one likes writing tests. However, writing unit tests is good for your programming diet. It helps ensure that data flows from one end of the pipeline to the other without any hitches.
In this talk, Jonathan Neo, Senior Data Engineer at Cuusoo, will explain why and how you can write unit tests, and where unit testing fits in the bigger picture.
Jonathan will demo how you can write your own unit tests in Databricks using Databricks Connect and pytest (a popular Python testing library), and how to automate the execution of unit tests using CI/CD pipelines.
--
This was the talk from the November 2021 event of the Melbourne Databricks User Group.
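
For readers who want a feel for the kind of test the description mentions, here is a minimal pytest-style sketch. It is not code from the talk: the function `drop_incomplete_orders` and the column names `order_id` and `amount` are hypothetical, and the rows are plain Python dictionaries so the example runs without a cluster. With Spark, the same pattern would build a small DataFrame inside each test instead.

```python
from typing import Dict, List

# A "row" here is a plain dict; this stands in for a DataFrame row (assumption).
Row = Dict[str, object]


def drop_incomplete_orders(rows: List[Row]) -> List[Row]:
    """Hypothetical pipeline step: keep rows with an order_id and a non-negative amount."""
    return [
        row
        for row in rows
        if row.get("order_id") is not None
        and isinstance(row.get("amount"), (int, float))
        and row["amount"] >= 0
    ]


# pytest collects functions whose names start with "test_".
def test_drops_rows_without_order_id():
    rows = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": None, "amount": 5.0},
    ]
    assert drop_incomplete_orders(rows) == [{"order_id": 1, "amount": 10.0}]


def test_drops_negative_amounts():
    rows = [
        {"order_id": 1, "amount": -3.0},
        {"order_id": 2, "amount": 0.0},
    ]
    assert drop_incomplete_orders(rows) == [{"order_id": 2, "amount": 0.0}]


def test_empty_input_returns_empty_list():
    assert drop_incomplete_orders([]) == []
```

Saved in a file such as `test_orders.py` (the file name is illustrative), these tests would be picked up by running `pytest` in the same directory, which is also the command a CI/CD pipeline step would typically invoke.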

Published: 4 Nov 2021

Comments: 12
@harshaaleti5609 2 years ago
Unit testing info starts at 10:32
@MuzicForSoul 1 month ago
Hi Jonathan, thanks for this demo, this is fantastic. A few things have changed in these 2 years, but the basics are the same. When are you going to show us the remaining two parts, integration testing and data quality testing? Or if you already have those videos, can you please upload them to your channel? Thanks.
@jhonsen9842 4 months ago
Great session, very thankful to you.
@adityaranjanmohanty5980 2 years ago
Thanks a ton. Loved it
@anoj4985 2 years ago
Good stuff! Liked and subscribed! :)
@allieubisse316 2 years ago
informative
@BjarneThorsted 2 years ago
Great video! Since databricks-connect is now deprecated, how should we set up unit testing?
@yoshitjuh 2 years ago
Hi Bjarne, have you managed to get an answer on this somewhere else?
@BjarneThorsted 2 years ago
@yoshitjuh, actually there's an update to the databricks-connect package so that it now supports runtime 10.4 LTS, but Databricks apparently recommends not using it and instead using dbx by Databricks Labs to set up a project, run all unit tests locally, and get convenient command-line functions for deployment and running jobs. Still not sure about testing during pull requests and stuff like that.
@peterko8871 1 year ago
Why are 45 minutes needed for a demo example?
@brijesh0808 1 year ago
Nothing is visible in your code screenshots. Very bad presentation.