Using Fabric notebooks (pySpark) to clean and transform real-world JSON data 

Learn Microsoft Fabric with Will
18K subscribers
6K views

Published: 20 Oct 2024

Comments: 18
@woliveiras · 9 months ago
Hi Will. Really good material. Keep going!!! Congratulations!!!
@LearnMicrosoftFabric · 9 months ago
Hey thanks for watching :) plenty more to come in 2024 😀
@adilmajeed8439 · 8 days ago
Thanks for sharing. I hope the sequence of the videos will be maintained, so viewers can go through the whole series in order instead of having to hunt for where the next part is.
@josuedegbun6270 · 4 months ago
I really like your videos; they are quite simple, short, and things are well explained.
@AmritaOSullivan · 1 year ago
Awesome video!!!! I understand the data factory pipeline runs daily and loads the daily json file in the lake house folders. Then the notebook code is extracting the data, transforming it and then loading it to the table. Appending daily. How is the notebook executed daily? Thank you!
@HasanCatalgol · 3 months ago
Hi Will, in Azure Data Factory, transformations were handled by no-code drag-and-drop components. But in Fabric, there are Power Query-like transformations and notebooks. Are these the only options for transformation inside Fabric? Thanks
@LearnMicrosoftFabric · 3 months ago
Hi there, for no-code transformations in Fabric, here are two options: 1) the Dataflow Gen2 visual editor and 2) the Data Warehouse T-SQL Visual Query Editor. The Data Pipeline is another no/low-code solution that performs a similar role to ADF, but mostly for orchestration only; transformations will need to be done with Dataflows, Notebooks, or T-SQL scripts/stored procs 👍
@hasancatalgol1273 · 3 months ago
I really wouldn't want to delve into no-code transformation because it was hard to manage in ADF. Do you prefer Spark notebooks for daily transformations?
@gvasvas · 10 months ago
Hi Will. Thanks for your tutorials! Very smooth learning experience. Do you have sample code for how to loop through YYYY/MM/DD folders and read and load files incrementally? Also, have you shared your tutorial notebooks on your GitHub by chance? I only see some of the oldest notebooks there.
@LearnMicrosoftFabric · 9 months ago
Hey thanks for watching... hmm let me have a look and see which notebooks are not on my GitHub and I'll add the ones which aren't yet
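The folder-looping question above could be sketched like this; the base Lakehouse path, date range, and table name below are assumptions for illustration, and the commented lines show where PySpark's `spark.read.json` would load each day's drop inside a Fabric notebook:

```python
from datetime import date, timedelta

def daily_folder_paths(base_path: str, start: date, end: date):
    """Yield one YYYY/MM/DD folder path per day between start and end (inclusive)."""
    current = start
    while current <= end:
        yield f"{base_path}/{current.year:04d}/{current.month:02d}/{current.day:02d}"
        current += timedelta(days=1)

# Build paths for the first three days of October 2024
# ("Files/raw" is a hypothetical Lakehouse Files folder)
paths = list(daily_folder_paths("Files/raw", date(2024, 10, 1), date(2024, 10, 3)))

# In a notebook, each day's JSON could then be read and appended, e.g.:
# for p in paths:
#     df = spark.read.json(p)
#     df.write.mode("append").saveAsTable("my_table")  # hypothetical table name
```

The generator only builds path strings, so the date logic can be tested outside Spark; incremental loads would track the last processed date rather than re-reading the full range.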
@pphong · 1 year ago
Hey @LearnMicrosoftFabric, why do you prefer Azure Storage Explorer over OneLake File Explorer?
@LearnMicrosoftFabric · 1 year ago
Just habit really, I have been using Azure Storage Explorer for a while. It is also more functional; you can do a lot more things in it than in OneLake File Explorer (which is mainly just an upload/delete tool). I also read that Storage Explorer is quicker for uploading bigger files as it's more optimised.
@AmritaOSullivan · 1 year ago
Another question: currently in the code the JSON file path is hard-coded. How can that be made dynamic? Thanks!!!
@LearnMicrosoftFabric · 1 year ago
Hey thanks for your awesome questions!! I think the answers will be helpful for everyone so I'll make a short video about these two topics and post later. Thanks, Will
@AmritaOSullivan · 1 year ago
@LearnMicrosoftFabric Thanks a million, that would be awesome!
@LearnMicrosoftFabric · 1 year ago
Uploaded... let me know if that answers your question. Thanks, Will
@johnuzoma1823 · 6 months ago
@LearnMicrosoftFabric Really awesome video Will! Can you please provide the link to the video on the dynamic JSON file path? Cheers!
@johnuzoma1823 · 6 months ago
ah seen it now. Thanks!
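The hard-coded path discussed in this thread could be made dynamic by building it from the run date; the folder layout and file name below are assumptions, not the layout from the video:

```python
from datetime import date

def json_path_for(run_date: date, base_path: str = "Files/raw") -> str:
    """Build the Lakehouse path for a given day's JSON drop (hypothetical YYYY/MM/DD layout)."""
    return f"{base_path}/{run_date:%Y/%m/%d}/data.json"

path = json_path_for(date(2024, 10, 20))
# In a daily-scheduled notebook, the current run's file could then be read with:
# df = spark.read.json(json_path_for(date.today()))
```

Because the notebook is scheduled daily, deriving the path from `date.today()` (or from a pipeline parameter passed into the notebook) removes the need to edit the code for each new file.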