@@LearnMicrosoftFabric I may have missed this in your videos, but do you have a section on how to show the contents of a file directly and load the most recent file (my files all have date stamps in them)? I haven't had any luck with os.listdir().
@@jampeauk Hi James, for file-system searching you probably want to use mssparkutils, which has that kind of list-the-files-in-a-directory functionality - I plan to cover this in my upcoming video on mssparkutils 👍
@@LearnMicrosoftFabric Awesome, thanks Will, looking forward to it. For a little extra context: I'd like to list the files in my S3 bucket, which I've added as a Shortcut.
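As a minimal sketch of what that might look like - the folder path and the YYYY-MM-DD filename pattern here are assumptions, and in a Fabric notebook mssparkutils.fs.ls works on shortcut folders under Files/ the same way as on native ones:

```python
import re

def latest_dated(names):
    """Pick the filename with the most recent YYYY-MM-DD date stamp.
    ISO dates sort correctly as plain strings, so max() on the match works."""
    dated = [n for n in names if re.search(r"\d{4}-\d{2}-\d{2}", n)]
    return max(dated, key=lambda n: re.search(r"\d{4}-\d{2}-\d{2}", n).group())

# In a Fabric notebook (folder path is a placeholder):
# files = mssparkutils.fs.ls("Files/my_s3_shortcut")
# newest = latest_dated([f.name for f in files])
# df = spark.read.csv(f"Files/my_s3_shortcut/{newest}", header=True)
```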
I came here with the same question, which some people have already asked: how do you call this API for multiple cities? I watched your other videos where you used a notebook to transform data, and another where you scheduled it in a pipeline. If you could show how to call this API for multiple cities, it would make a great project - you could create a playlist as an end-to-end project. I really like your channel and follow your daily Spark videos; I believe this will become one of the main Fabric YouTube channels.
Hey! Massive thanks! Do you have plans to cover any OAuth-based APIs? Also, how do you parallelise these API calls for massive data loads? Say you want to fetch data for 100 cities on a daily basis, plus trigger a load when a 101st city is added - all those scenarios.
Hi, great questions! Absolutely yes - I plan to do more videos about handling different auth scenarios, and also about loading very big datasets with parallel reads. Watch this space :)
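In the meantime, a rough sketch of one common way to parallelise per-city calls in a notebook - the fetch function is stubbed out here, since the real endpoint and response shape are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_city(city):
    # Stand-in for a real call, e.g. requests.get(f"{base_url}?q={city}").json();
    # the endpoint is hypothetical, so return a fixed shape instead.
    return {"city": city, "temp_c": 20}

cities = ["London", "Paris", "Tokyo"]

# Threads suit this job because it is I/O-bound (mostly waiting on HTTP);
# pool.map preserves the input order of the cities.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch_city, cities))
```

Handling a newly added 101st city then just means driving the `cities` list from a config table or file, so the list grows without code changes.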
Thank you so much Will for your detailed instructions!!! Could you help me with instructions for loading Excel files in OneLake (specifically, stored in a lakehouse) into tables in a Data Warehouse?
Hey, thanks for watching! To read Excel into a lakehouse table, you can either use pandas to load into a pandas DataFrame and convert it to a Spark DataFrame (and then a lakehouse table), or you can use the pyspark.pandas library (pandas within Spark) - good luck!
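A sketch of the first option (pandas into Spark into a table) - the file path and table name are placeholders, and `spark` is the session that Fabric notebooks predefine:

```python
import pandas as pd

def frame_to_table(pdf, table_name, spark):
    """Convert a pandas DataFrame to a Spark DataFrame and save it
    as a lakehouse Delta table. Returns the row count loaded."""
    sdf = spark.createDataFrame(pdf)
    sdf.write.mode("overwrite").saveAsTable(table_name)
    return len(pdf)

# In a Fabric notebook (reading .xlsx needs the openpyxl engine installed):
# pdf = pd.read_excel("/lakehouse/default/Files/sales.xlsx")
# frame_to_table(pdf, "sales", spark)
```

From there, a pipeline Copy activity is one way to move the lakehouse table into the Warehouse.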
Great content, thanks for explaining the different options available in Fabric. I need to load fact data (bookings) through a REST API call. How do I set up loading into the lakehouse to ingest weekly updates? Do I need to start with a pipeline, or is there a way to start with a notebook directly to load data into the lakehouse?
Thanks for watching! It depends on the complexity of your API call, really. If it's simple, then you can use Dataflows or Data Pipelines; more complex authentication or transformation will require a notebook.
Amazing video, thanks for this Will! I wanted to ask whether PySpark would be the best choice for this, or whether I could use SQL to achieve the same goal?
Yes, you could also use SQL! The good thing about Fabric is that you're free to use whichever language you're comfortable with (well, as long as it's T-SQL, Python, R, Scala or KQL).
@@LearnMicrosoftFabric Thanks, that's really useful to know! I guess my follow-up would be whether there are any compatibility issues or limitations I might encounter if I were to use SQL within MS Fabric?
How do I run a pipeline for data copying? I have an API that uses two authentication systems: token and basic authentication (username and password). The first call to the API (via the POST method) retrieves a token, which is then used by the second request to execute the query itself. Is it possible to create a pipeline that can do the job? Should I use notebooks, or is there another solution? The result of the second query will, of course, be stored in a lakehouse table.
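For what it's worth, this two-step flow is straightforward in a notebook with the requests library - a sketch under assumptions (the URLs, credentials, and the `access_token` field name are all placeholders for your specific API):

```python
import requests

def get_token(session, auth_url, user, password):
    """Step 1: POST with basic auth; the API responds with a token."""
    resp = session.post(auth_url, auth=(user, password))
    resp.raise_for_status()
    return resp.json()["access_token"]

def fetch_data(session, data_url, token):
    """Step 2: call the data endpoint, passing the token as a bearer header."""
    resp = session.get(data_url, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    return resp.json()

# In a Fabric notebook (all names below are placeholders):
# s = requests.Session()
# token = get_token(s, "https://api.example.com/auth", "myuser", "mypassword")
# rows = fetch_data(s, "https://api.example.com/bookings", token)
# spark.createDataFrame(rows).write.mode("append").saveAsTable("bookings")
```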
Hi there, good thanks, you? In this video I go right through end-to-end, covering extraction, storage and then visualization. Hope it helps 👍 ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-hwwU8V48g-4.html
@@LearnMicrosoftFabric Will, how are you? Your videos are useful! I have a question: is it possible to obtain data from a JSON REST API and transform it into a table in a data lake? I can't manage it - I can only transform it in a Warehouse! Thanks!
Thanks. Please explain the best practice for making nested API calls and merging the results back into one JSON file. For example, the first API call, /students, gives me a list of all students; then for each student I need to make another call, /{student_id}/courses, to get their course information. I need to save all the students' courses as one JSON file. It's easy to do in a Dataflow, but a Dataflow can only save the results as a table, not as JSON. So what is the right way to do it in a Pipeline? Thanks!
Hey, it's not something I've done with Data Pipelines, to be honest, but it might be possible with the ForEach activity. If you know Python, I'd recommend doing this in Fabric Notebooks with the requests library - it's much easier to manage this kind of logic in a notebook.
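A notebook sketch of that pattern with requests - the endpoints come from the question above, but the base URL, the `id` field name, and the output path are assumptions:

```python
import json
import requests

def collect_courses(session, base_url):
    """Nested calls: GET /students, then /{student_id}/courses for each,
    merging the course list into each student record."""
    students = session.get(f"{base_url}/students").json()
    for student in students:
        student["courses"] = session.get(
            f"{base_url}/{student['id']}/courses"
        ).json()
    return students

# In a Fabric notebook, write the merged result as one JSON file:
# merged = collect_courses(requests.Session(), "https://api.example.com")
# with open("/lakehouse/default/Files/students_courses.json", "w") as f:
#     json.dump(merged, f)
```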
Thanks. One of the main advantages of the Power BI tools is the low-code/no-code experience. I know Python, but we need a simple low-code GUI experience, like Power Query / Dataflows. I hope Pipelines can provide it @@LearnMicrosoftFabric
@@mshparber If it helps, there is now a GUI which should do what you're after - do some watching/reading on "Data Wrangler". It's currently only available for pandas in Notebooks, but it should be useful.
Hello, very good, thanks very much. My ERP is 100% online but I can't connect to it. I think I have all the necessary details: URL, DB name, username, password, and the API.
Hey, if it's 100% online and an ERP system, it's likely to have an API to connect to. Google "{ERP NAME} API documentation" and find out how to connect to it. Or, if it's one of the big ERP systems, you could use a Dataflow, because they might have a pre-built connector for your ERP system available. Good luck!