
How to test your Python ETL pipelines | Data pipeline | Pytest 

BI Insights Inc

Published: 26 Sep 2024

Comments: 35
@BiInsightsInc 1 year ago
Part two, Pytest integration with an ETL pipeline: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7FPksG-LYOA.html
Part three of Pytest, data quality report: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Sv6QWF7J63k.html
@srh1034 3 months ago
Can you share a blog post or link that shows the roadmap or sequence of your ETL videos?
@BiInsightsInc 3 months ago
@srh1034 Sure. Here is an overview of the channel's content and the ETL series sequence: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-pjiv6j7tyxY.html
@safkaify7875 20 days ago
Nicely explained. Good presentation, well organized, well spoken. Keep up the good work.
@willosullivan3571 1 year ago
The best data engineering YouTuber I've had the pleasure to find. Thanks and please keep it up!
@ShubbhasmitaSahani 1 year ago
Heartfelt thanks to you for all these recorded sessions and tutorials. You have made life so simple.
@farhadshakibaca 1 year ago
The best data engineering YouTuber. Thank you.
@poojaak1678 1 year ago
Articulate explanation! You're the best! Thank you so much.
@Sreenu1523 1 year ago
You did a great job. I had been looking for this material for a long time. Thanks for sharing great content. I have many questions on pytest and will ask them once I have gone through all the videos. Thanks.
@MyChannel-ns3ct 6 months ago
Thanks for this video. Is there a video on how to run these tests on SQL Server, pgAdmin, or Athena?
@BiInsightsInc 6 months ago
Here is the link to the video in the series that runs data quality tests against SQL Server: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7FPksG-LYOA.html
Here is the link to the series: ru-vid.com/group/PLaz3Ms051BAkgmoRZEcGFvQzY4YW_SR8b
@ZarifouDjibril 6 months ago
Very helpful. Thank you.
@BillusTinnus 1 year ago
Great video, thanks
@ashishvats1515 1 year ago
Could you please do this with Apache Beam, from a JDBC source to BigQuery? Or could you help me with this? I really need this kind of information.
@bharamkarvivek4632 1 year ago
Thanks for such important info. How can these test cases be automated?
@BiInsightsInc 1 year ago
You can embed these tests in your data pipeline; below is an example. Once you schedule it via an orchestrator, the tests will run each time your pipeline is triggered. You can use any tool such as Airflow, Dagster, Prefect, or cron to schedule Python-based pipelines. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7FPksG-LYOA.html&ab_channel=BIInsightsInc
Airflow: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-eZfD6x9FJ4E.html&ab_channel=BIInsightsInc
Dagster: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-f1TbVGdhmYg.html&ab_channel=BIInsightsInc
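A minimal sketch of the idea in the reply above, assuming a pandas-based pipeline: the checks run between extract and load, so every scheduled run validates the data before writing it. The names `check_no_nulls`, `check_unique`, `run_pipeline`, and the sample `ProductKey` data are illustrative assumptions, not code from the video.

```python
import pandas as pd


def check_no_nulls(df: pd.DataFrame, column: str) -> None:
    # Fail the run if the column has any missing values.
    assert df[column].notnull().all(), f"{column} contains nulls"


def check_unique(df: pd.DataFrame, column: str) -> None:
    # Fail the run if the column has duplicate values.
    assert df[column].is_unique, f"{column} contains duplicates"


def run_pipeline(extract, load) -> int:
    # extract and load are hypothetical callables supplied by the caller.
    df = extract()
    check_no_nulls(df, "ProductKey")
    check_unique(df, "ProductKey")
    load(df)  # only reached when every check passed
    return len(df)


# Usage with in-memory stand-ins for a real source and target.
loaded = []
rows = run_pipeline(
    extract=lambda: pd.DataFrame({"ProductKey": [1, 2, 3]}),
    load=loaded.append,
)
```

An orchestrator or cron job would then call `run_pipeline` on a schedule, and a failed assertion stops the run before the load step. Python's `-O` flag disables `assert` statements, so a production pipeline may prefer to raise an explicit exception instead.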
@kiranpatil4968 1 year ago
Please make a video on ETL automation testing from scratch and create separate playlists.
@BiInsightsInc 1 year ago
I will try to cover this in the future. In the meantime, you can check out the following videos on testing and automating ETL pipelines.
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7FPksG-LYOA.html
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Sv6QWF7J63k.html&t
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7UQ91Ib7PtU.html&t
How to automate Python-based ETL pipelines:
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-f1TbVGdhmYg.html&t
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-eZfD6x9FJ4E.html&t
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-IsuAltPOiEw.html
@vivek2319 1 year ago
Thanks
@SP-db6sh 1 year ago
How can I add a logger to it with a tqdm progress bar?
@BiInsightsInc 1 year ago
If you want to log the tests for review or sharing, check out the next video. I haven't played around with tqdm, but here are its docs and implementation. Maybe I will implement this in a future project. github.com/tqdm/tqdm
@gulnarabekirova4741 8 months ago
Thank you for a great tutorial! You already have a few different videos. Could you add a number to each tutorial to order them? It would help show which video is first and which is last.
@BiInsightsInc 8 months ago
Thanks and good suggestion. I have consolidated the data quality videos in their own playlist. Here is the link: ru-vid.com/group/PLaz3Ms051BAkgmoRZEcGFvQzY4YW_SR8b
@lalalf4535 1 year ago
The function test_null_check(df) will always pass.
@BiInsightsInc 1 year ago
Thanks for spotting this. I have updated the code base. You can use the following assertion.
# check for nulls
def test_null_check(df):
    assert df['ProductKey'].notnull().all()
@lalalf4535 1 year ago
@BiInsightsInc Thank you. Your content is very useful.
@dmunagala 3 months ago
def test_Genre_dtype_str(df):
    assert (df["Genre"].dtype == str or df["Genre"].dtype == 'O')
This test case always passes.
@BiInsightsInc 3 months ago
If the data type of this column is string or object, the test will pass. If the data type is int or float, it will fail. You can also remove the "O" and test only for string if that's the objective. Here is an example of this test with int: github.com/hnawaz007/pythondataanalysis/blob/main/ETL%20Pipeline/Pytest/Session%20one/string%20and%20object%20test%20result.png
@dmunagala 3 months ago
@BiInsightsInc Thanks for responding. When the column value is 1, which is an int, the assertion below passes. I tried removing the "O", and then it fails, but it fails even if the data type is string.
assert (df["Genre"].dtype == str or df["Genre"].dtype == 'O')
@BiInsightsInc 3 months ago
@dmunagala You need to check the data type. The value might be 1, but it can be stored as a string. Check my previous comment; it links to this test failing with the int data type.
@dmunagala 3 months ago
@BiInsightsInc Yes, you are right. I checked the data types using df.info() and found the exact data types for all columns in my CSV file. It is working as expected. Thank you so much for your help, you are amazing!
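The point of this exchange, that a value of 1 may be stored as either an integer or a string, can be sketched as follows. The CSV text and the helper `genre_is_string` are made up for illustration. The helper uses `pandas.api.types.is_string_dtype` in place of the `== str or == 'O'` comparison from the thread, since the exact dtype pandas assigns to text columns can differ between versions.

```python
import io

import pandas as pd
from pandas.api.types import is_string_dtype

csv_text = "Genre\n1\n2\n3\n"  # hypothetical file contents

# Default type inference reads the digits as integers.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to text stores the same values as strings.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"Genre": str})


def genre_is_string(df: pd.DataFrame) -> bool:
    # True for text columns, False for numeric ones.
    return bool(is_string_dtype(df["Genre"]))


# Inspect the stored types, as df.info() does in the thread.
print(inferred.dtypes)
print(as_text.dtypes)
```

Here `genre_is_string(inferred)` is false and `genre_is_string(as_text)` is true, even though both frames display the same values. That is why checking the stored types with `df.info()` or `df.dtypes` settled the question.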
@robertclayton3189 1 year ago
Video resolution is poor.
@BiInsightsInc 1 year ago
Please try it in HD 1080p.
@muddashir 1 year ago
Thanks
@soheilahg921 1 year ago
Great and very helpful content. Thank you.