Are you ready to embark on a thrilling journey into the world of Azure Databricks and Apache Spark? Look no further! Our channel is your go-to destination for all things related to these powerful data processing and analytics tools.
Join us as we delve into the depths of Azure Databricks and Apache Spark, unraveling their capabilities, exploring best practices, and unlocking the secrets to harnessing their true potential. Whether you're a data engineer, data scientist, or a curious learner passionate about big data technologies, our channel offers a wealth of knowledge to fuel your growth.
Here's what you can expect:
- In-depth Tutorials
- Best Practices and Tips
- Use Case Discussions
- Performance Optimization
- Interview Preparation
Get ready to unlock the full potential of Azure Databricks and Apache Spark with our engaging and informative videos. Don't forget to subscribe to our channel and hit the notification bell, so you never miss an update.
Thanks, mate, for the detailed info. I started with the first video and then continued watching the complete series. Could you please attach the lab work you explain in the videos? Much appreciated for your great work!
I have a doubt regarding the update operation. You mentioned that the Delta engine scans for the particular files containing records that need to be updated and then updates them. But if that's the case, how is time travel possible? Updating the existing files would result in the loss of historical data.
Parquet files are immutable in nature. So during an update, the relevant files are scanned and, based on the updated values, new parquet files are created. Existing parquet files are never overwritten.
Sir, your videos are really good and very understandable, in very simple language. I'm stuck as my cluster is not running; it says "Azure Quota Exceeded Exception". I'd be grateful if you could help me solve this.
Awesome explanation, you deserve a big applause for this. Every second of this video plays a key role in understanding the concept of data partitioning. Really loved the content and the way you explained it.
What is lit()? Whenever we want to add a constant literal value as a column across the entire DataFrame, we use lit(). We can also apply such values only to certain records using when() and otherwise(). E.g.: empDF = df.withColumn("Bonus", when(df.sal > 50000, lit(5000)).otherwise(lit(1000))). Thanks for the amazing session, Raj sir.
Wonderful. Isn't Delta Lake schema-on-write? Delta Lake tables are schema-on-write, which means the schema is enforced at the time data is written, rather than inferred when it is read. Delta Lake detects when data with a different schema is being appended and rejects it unless schema evolution is enabled.
If possible, could you also explain whether we can update only a certain range of partitioned data? For example, if the data is partitioned by month and I want to update only the last 3 months of partitions, how can we achieve that?