
3.06 Mastering Common Silver and Gold zone transformations with PySpark in Microsoft Fabric 

Fikrat Azizov

• Microsoft Fabric For B...
This video explores common transformation techniques in the Silver and Gold zones of the Medallion architecture. I explain data enrichment and type conversion transformations and demonstrate how to use PySpark APIs and methods to address these tasks.
I also demonstrate how to process historical data from the Bronze layer using Window functions. Next, I explain core Kimball dimensional modelling concepts and demonstrate how they can be implemented using PySpark methods.
Finally, I demonstrate creating aggregates.
You can download the related demo notebook from here: github.com/faz...
Chapters:
00:00- Introduction
02:21- Preview
06:19- Lakehouse historical data storage strategy
09:00- Demo start- preparing data
10:24- Creating shortcuts to Bronze tables
11:24- Notebook demo- reading data from shortcuts
12:30- Inspecting data frame schema
13:48- Data Type conversion transformations
16:05- Ordering data
20:00- Handling historical data using Window functions
24:25- Data enrichment transformations
25:45- Using regular expressions to parse text data
26:40- Generating time dimension
30:45- Dimensional modelling concepts
32:12- Slowly changing dimensions (SCD)
33:05- SCD Type-2 dimensions
34:54- Surrogate keys
35:32- Relationships between facts and dimensions
37:00- Generating surrogate keys using monotonically_increasing_id function
38:00- Distributed computing and Spark partitions
41:31- Reducing data frame partition count
43:02- How to link Fact and Dimension tables
47:14- Incremental write into destination tables
49:02- Using MERGE INTO query for destination write
50:50- Aggregation transformations
Please subscribe: @fazizov
Official Documentation:
learn.microsof...
learn.microsof...
sparkbyexample...
www.kimballgro...
spark.apache.o...
Hashtags:
#datafactory, #microsoft, #microsoftfabric, #azure, #dataengineering, #cloudcomputing, #dataanalytics, #lakehouse, #azuretutorial, #azuretraining, #datapipeline, #dataextraction, #dataintegration, #datatransfer, #dataflow, #spark, #deltalake, #synapse, #synapsedataenginering, #demo, #datalake, #transformation, #ingested, #datawarehouse, #dataintegration, #azuredatabricks, #databricks, #bigdata, #bigdatatechnologies, #pyspark, #sparksql, #notebook, #transformationvideo, #bronze, #medallion, #kimball, #dimensions, #modeling, #facts, #silver, #gold, #historical data, #dimensional

Published: 17 Aug 2024

Comments: 6
@joseluiscorreasalazar5670 1 month ago
Thank you very much! This is one of the best tutorials on Fabric Lakehouses out there
@fazizov 1 month ago
Thanks for watching!
@kevthebandit 6 months ago
Thanks for breaking this down!
@fazizov 6 months ago
Thanks for feedback!
@digitalevidenceofthings 6 months ago
This is incredible, exactly what I needed to see to ensure I'm on the right track. Thank you for taking the time to do this video!
@fazizov 6 months ago
Glad it was helpful, thanks!