End To End Data Engineering Project With Snowflake | Parquet, JSON & CSV Data Files 

Data Engineering Simplified
Subscribers: 46K
Views: 17K

Published: 7 Sep 2024

Comments: 68
@parikshitchavan2211 10 months ago
Wow, superb. You covered almost everything required to survive as a data engineer in the industry 🙂🤐
@DataEngineering 10 months ago
Thanks for your note. If you want to manage Snowflake more programmatically, you can watch my paid content. Many folks don't know the power of Snowpark, and these 2 videos (one for JSON and one for CSV) will help you broaden your knowledge. They are available at a discounted price for a limited time. They cover automatically creating DDL and DML, running the COPY command, and making all SQL statements available for CI/CD. 1. www.udemy.com/course/snowpark-python-ingest-json-data-automatically-in-snowflake/?couponCode=SPECIAL50 2. www.udemy.com/course/automatic-data-ingestion-using-snowflake-snowpark-python-api/?couponCode=SPECIAL35
@harshshah3546 5 months ago
Appreciate the time and effort you've put in to create this tutorial.
@uttapa22 8 months ago
Great video. It would have been wonderful if it also covered: 1. How to do end-to-end CI/CD. 2. How to set up a pipeline dependency between the data ingestion tool and a Snowflake task (assuming we can bundle all the loading steps covered in this video into a Snowflake task). Apologies if you have already covered these elsewhere; if so, please direct me. Many thanks. 1:21:30
@DataEngineering 8 months ago
Glad you liked the content, and your request for CI/CD is noted. CI/CD is not yet covered in my videos.
@ugvidhyalakshmi1530 20 days ago
Thanks a lot. Great work... 👏
@anilkumark3573 1 year ago
Sir, we appreciate your efforts and knowledge sharing.
@DataEngineering 1 year ago
Glad you liked the content. Thank you so much for your note, Anil.
@rakeshkumarsharma195 2 months ago
Awesome, just awesome 👍
@dilshadsayed7202 4 months ago
Great video, thanks. Would it be possible to share the data used in this project?
@ganeshlakshman2506 1 year ago
Very comprehensive, thank you :) I don't see 2 years of data in the GitLab link, just one month, Jan 2020. Am I looking at the wrong location?
@user-tv4cl1wm6f 11 months ago
Thanks for everything. You helped a lot ❤! May I ask if you can make videos on exception handling and error logging? For example, one of the CSV files has an additional column. Another example: the Wi-Fi connection fails while loading data into the internal stage; how do we resume the job? Thanks bro! :)
@DataEngineering 11 months ago
If some files were already loaded and you run the load again, the load will skip them...
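For the extra-column case raised in the question above, one possible pre-check before uploading a file is sketched below. It is not from the video: the function name `unexpected_columns` and the expected column list are hypothetical, and it only compares the CSV header against the columns the target table is assumed to have.

```python
import csv
import io


def unexpected_columns(csv_text: str, expected: list) -> list:
    """Return header columns that are not in the expected column list.

    `expected` is an assumed list of target-table column names; adjust it
    to match the real table definition.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [column for column in header if column not in expected]


# Example: the second file carries an extra "discount" column.
good_file = "order_id,amount\n1,10.5\n"
bad_file = "order_id,amount,discount\n1,10.5,0.1\n"

for name, text in (("good.csv", good_file), ("bad.csv", bad_file)):
    extra = unexpected_columns(text, ["order_id", "amount"])
    if extra:
        print(f"{name}: skipping upload, unexpected columns {extra}")
    else:
        print(f"{name}: header matches, safe to upload")
```

A file flagged this way could be logged and set aside instead of being uploaded, so the rest of the batch still loads.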
@gyt7504 3 months ago
great tutorial. thanks!
@user-uv5xf1mc6b 1 year ago
Thanks for Everything...
@DataEngineering 1 year ago
Always welcome
@satishbalaji1832 1 year ago
Thank you for the informative session. Can't we achieve the same solution with Snowflake SQL-based queries or stored procedures?
@DataEngineering 1 year ago
Yes, you can do it. In its current version, Snowpark is nothing but a SQL generator. You may want to watch these videos on what it is and what it is not: 1. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE--awSPRW9AOY.html (What is Snowpark) 2. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7tToBddZ_is.html (What is NOT Snowpark)
@mariumbegum7325 1 year ago
Interesting video and great explanation.
@DataEngineering 1 year ago
Glad you liked the content...
@hritiksharma7154 1 year ago
Really great project 👌
@DataEngineering 1 year ago
Glad you liked it.
@rodrigoschammass5205 11 months ago
Thank you very much, but Step 4.2, Loading Data To Internal Stage Using Snowpark File API, is not working for me. I run the code but no data shows up.
@prajeetkatari3742 1 year ago
Hi, I was trying to load the CSV and other files into the internal stage. I even get the directory output, but I'm not able to see the data as a result. Please help; I have been trying to fix this for hours with no clue. Thanks!
@DataEngineering 1 year ago
Not sure which step you are talking about. If you can give me a timestamp, that would be helpful, or you can share a screenshot via my Instagram account (instagram.com/learn_dataengineering/).
@srinivasp6579 11 months ago
Thank you for sharing such good content. I should say you are a rockstar in the Snowflake world. I have a question. In this case, since a lot of DataFrames are created in the snowpark-python scripts and the code runs from a local machine, does it consume local storage/compute or push everything to Snowflake storage/compute? Thank you in advance!
@DataEngineering 11 months ago
Thanks for your note. When you perform an operation using a DataFrame in Snowflake, it uses Snowflake's compute power. When you pull data to your local machine, it uses your local compute.
@srinivasp6579 11 months ago
Thank you for your quick response. If I would like to push everything to Snowflake storage and compute, how should we do it? How should we register the snowpark-python programs in the Snowflake database and run/debug them (instead of the stored procedure route)? Is it really possible? Maybe a separate video would help. @DataEngineering
@DataEngineering 11 months ago
Watch ch-08 from this Snowpark playlist and you will understand how to deploy it (playlist link: ru-vid.com/group/PLba2xJ7yxHB4yPg3pUrobdzeMxk4mP24S).
@srinivasp6579 11 months ago
@DataEngineering Thank you. I already watched it. Does that mean we should test it locally first and then deploy to the SF sandbox? I am looking for options to develop, test, debug, and deploy directly in the SF sandbox itself. Is that possible? Any insight?
@affanamin105 8 months ago
Hi, thanks for this end-to-end project. Where can I find the complete dataset you used in this video?
@DataEngineering 8 months ago
The complete dataset is too big; the description has a link to a limited dataset. ----- And yes, I know many of us are not fully aware of the Snowpark Python API. If you want to manage Snowflake more programmatically, you can watch my paid content (data + code available). Many folks don't know the power of Snowpark, and these 2 videos (one for JSON and one for CSV) will help you broaden your knowledge. They are available on Udemy and cover automatically creating DDL and DML and running the COPY command. 1. www.udemy.com/course/snowpark-python-ingest-json-data-automatically-in-snowflake/ 2. www.udemy.com/course/automatic-data-ingestion-using-snowflake-snowpark-python-api/
@uravakondakhadhar5458 1 year ago
Appreciate your work
@DataEngineering 1 year ago
Thank you so much 😀
@jaelinjordan1104 1 year ago
Quick question: I am on Part 4. I downloaded the data to my computer, but it does not show up when I try to run it through Snowflake. Is there a reason for that?
@DataEngineering 1 year ago
Could you please provide additional detail? I am not able to understand the issue. Please attach a timestamp or share a screenshot via my Instagram account.
@user-le8cf9ck4v 1 year ago
Sir, can you please do one end-to-end project in SnowSQL as well? That would be very beneficial for us.
@DataEngineering 1 year ago
SnowSQL is just a CLI tool. Do you mean Snowflake SQL? If so, watch ch-19 from my Snowflake tutorial; the end-to-end flow is covered using SQL.
@balajikarthik8366 11 months ago
Thanks for your effort. A question: how would you productionize the entire flow? Should your Python code be converted to a stored procedure?
@DataEngineering 11 months ago
Yes, exactly. Otherwise it needs a runtime environment outside of Snowflake and has to be scheduled with some kind of scheduler.
@user-tv4cl1wm6f 11 months ago
@DataEngineering Can we have a video on that as well? For example, how a .py file becomes a stored procedure 😅😅😅
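As a rough illustration of the question above (not the video's code): a Python stored procedure handler in Snowflake is a function that receives the session as its first argument, so one way to prepare a script is to move its logic into such a function. In the sketch below, `main`, `stage_name`, and `FakeSession` are hypothetical names; `FakeSession` is only a local stand-in so the handler can be exercised without a Snowflake connection.

```python
def main(session, stage_name: str) -> str:
    """Handler in the shape a Python stored procedure uses: session first.

    It lists the files in a stage and reports how many were found.
    """
    rows = session.sql(f"LIST @{stage_name}").collect()
    return f"{len(rows)} file(s) found in @{stage_name}"


class _FakeResult:
    """Mimics only the .collect() call used by the handler."""

    def __init__(self, rows):
        self._rows = rows

    def collect(self):
        return self._rows


class FakeSession:
    """Hypothetical local stand-in for a Snowpark session (testing only)."""

    def __init__(self, rows):
        self._rows = rows
        self.queries = []  # records the SQL text the handler issued

    def sql(self, query):
        self.queries.append(query)
        return _FakeResult(self._rows)


# Exercise the handler locally before registering it in Snowflake.
print(main(FakeSession(["a.csv", "b.csv"]), "my_stage"))
```

Registering the handler (for example with Snowpark's `session.sproc.register`, or a `CREATE PROCEDURE ... LANGUAGE PYTHON` statement) needs a live connection and is left out here.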
@Vidhvamsam-Villain 3 months ago
The curated-to-consumption code is not working properly.
@hansrony5684 11 months ago
Hi bro, while going through the course, I found that not all the data is provided in the GitLab link, nor is the exchange_rates.csv shown at 50:00. The exchange rate column is null for all rows after moving the file into the curated stage. Could you update the link with all the files mentioned in the course? Thanks
@sabarisri4515 1 year ago
There is no primary key in Snowflake. Then why do we use primary and foreign keys here? Can you please explain?
@DataEngineering 1 year ago
When you connect to a BI tool like Power BI, it needs these relationships so it can build the model for slicing and dicing. Also, if you have to draw an ER diagram to understand the relationships, it is important to have them defined.
@vaibhavverma1340 10 months ago
Can you please tell me how to update a row in snowflake_sample_data.tpch_sf100.orders? I am getting the error: "Object 'ORDERS' does not exist or not authorized."
@DataEngineering 9 months ago
That is a shared object; you cannot update it.
@bharathkandati3911 1 year ago
Hi, thank you for sharing the project. Where can I find the Python code? GitLab has the data files. Please advise.
@DataEngineering 1 year ago
In the description
@praveenkumar-sk8nx 10 months ago
How can we do reverse engineering without a third-party tool?
@DataEngineering 10 months ago
Then you have to write a program for it. Snowpark can do it, or you can also write Python, unless Snowsight comes up with some kind of UI for that. And yes, if you want to manage Snowflake more programmatically, you can watch my paid content. Many folks don't know the power of Snowpark, and these 2 videos (one for JSON and one for CSV) will help you broaden your knowledge. They are available at a discounted price for a limited time. They cover automatically creating DDL and DML, running the COPY command, and making all SQL statements available for CI/CD. 1. www.udemy.com/course/snowpark-python-ingest-json-data-automatically-in-snowflake/?couponCode=SPECIAL50 2. www.udemy.com/course/automatic-data-ingestion-using-snowflake-snowpark-python-api/?couponCode=SPECIAL35
@antonybest2599 10 months ago
Hi, I keep getting this error:
  File "C:\Users\anbest\OneDrive - Capgemini\Documents\Git\Snowpark_project\LoadData.py", line 57, in main
    put_result(file_element," => ",put_result[0].status)
  TypeError: 'list' object is not callable
I tested the traverse func on its own, and it is picking up my file names, locations, etc. It seems to be put_result causing the issue.
@DataEngineering 10 months ago
It is not clear to me what kind of error you are getting. Your result is not what the program expects, so you need to check the type of the object and whether it is a list or not.
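Going by the traceback quoted above, line 57 calls `put_result` as if it were a function, while `put_result` is a list; the likely intent was to print the file name and the upload status. The sketch below reproduces the error and shows one possible fix. `FakePutResult` and `fake_put` are hypothetical stand-ins for the real upload call and its result, used only so the example runs without a Snowflake connection.

```python
from dataclasses import dataclass


@dataclass
class FakePutResult:
    """Hypothetical stand-in for one entry of the list an upload returns."""
    source: str
    status: str


def fake_put(file_name: str) -> list:
    """Hypothetical stand-in for the upload call; it returns a list."""
    return [FakePutResult(source=file_name, status="UPLOADED")]


def calling_the_list_fails(file_element: str) -> bool:
    """Reproduce the reported error: a list cannot be called like a function."""
    put_result = fake_put(file_element)
    try:
        put_result(file_element, " => ", put_result[0].status)
    except TypeError:
        return True
    return False


def describe_upload(file_element: str) -> str:
    """Possible fix: index the list and format the status instead of calling it."""
    put_result = fake_put(file_element)
    return f"{file_element} => {put_result[0].status}"


print(calling_the_list_fails("sales.csv"))
print(describe_upload("sales.csv"))
```

In the original script, the equivalent change would be replacing `put_result(` with `print(` on line 57, assuming printing was the intent.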
@srinigoud7393 7 months ago
Do you have any Udemy course? Can you please send me the GitLab repo or Udemy course details?
@DataEngineering 7 months ago
This content is available on Udemy (one course for JSON and one for CSV). The courses cover automatically creating DDL and DML and running the COPY command. 1. www.udemy.com/course/snowpark-python-ingest-json-data-automatically-in-snowflake/ 2. www.udemy.com/course/automatic-data-ingestion-using-snowflake-snowpark-python-api/
@ramakrishnatirumala428 11 months ago
Sir, where can I get all this code?
@DataEngineering 11 months ago
Check description
@chittaranjanpradhan5290 9 months ago
Hello, How can I get the source code of this project?
@DataEngineering 9 months ago
It is in the description. And yes, I know many of us are not fully aware of the Snowpark Python API. If you want to manage Snowflake more programmatically, you can watch my paid content (data + code available). Many folks don't know the power of Snowpark, and these 2 videos (one for JSON and one for CSV) will help you broaden your knowledge. They are available at a discounted price for a limited time and cover automatically creating DDL and DML and running the COPY command. 1. www.udemy.com/course/snowpark-python-ingest-json-data-automatically-in-snowflake/?couponCode=DIWALI50 2. www.udemy.com/course/automatic-data-ingestion-using-snowflake-snowpark-python-api/?couponCode=DIPAWALI35
@uravakondakhadhar5458 1 year ago
Hi bro
@DataEngineering 1 year ago
Please share your query.
@SaurabhYadav-ep6cu 1 year ago
The GitLab link has just one month of data, Jan 2020. Can you send us the proper link containing the whole data file? @DataEngineering
@DataEngineering 1 year ago
Yes, it is hard to put so much data on any platform. That's why only 1 month of data is provided.
@s.satishkumar8089 1 year ago
How can I contact you, brother?
@DataEngineering 1 year ago
instagram.com/learn_dataengineering/
@s.satishkumar8089 1 year ago
@DataEngineering I already sent you a message but got no reply.