Raja's Data Engineering
Welcome to Raja's Data Engineering!

Are you ready to embark on a thrilling journey into the world of Azure Databricks and Apache Spark? Look no further! Our channel is your go-to destination for all things related to these powerful data processing and analytics tools.

Join us as we delve into the depths of Azure Databricks and Apache Spark, unraveling their capabilities, exploring best practices, and unlocking the secrets to harnessing their true potential. Whether you're a data engineer, data scientist, or a curious learner passionate about big data technologies, our channel offers a wealth of knowledge to fuel your growth.

Here's what you can expect:
In-depth Tutorials
Best Practices and Tips
Use Case Discussions
Performance Optimization
Interview Preparation

Get ready to unlock the full potential of Azure Databricks and Apache Spark with our engaging and informative videos. Don't forget to subscribe to our channel and hit the notification bell, so you never miss an update.

Comments
@ronakpatil9402 1 day ago
Sir, which is your Hindi channel?
@ArupSankarRoy 2 days ago
Mistake at 14:15: the partition size should be 100 MB.
@ramswaroop1520 2 days ago
In one of my interviews they asked, "What is the biggest/well-known drawback of Azure Synapse Analytics?" Can you please clarify?
@nitin1808 3 days ago
Snowflake and Databricks are almost the same...?
@srijanbansal6078 4 days ago
In the UDF, can you please explain how data is read for an individual element and not the entire array?
@NetNet-sn3nd 4 days ago
Can you share this CSV file on Drive for practice?
@sowjanyagvs7780 5 days ago
Every piece of code written in a notebook is Python and is applied on DataFrames, so how does it differ when using a UDF?
@basavasankethbn3016 6 days ago
Nice!!!
@rajasdataengineering7585 6 days ago
Thank you! Cheers!
@shivanaga3302 6 days ago
Thanks, mate, for the detailed info. I started with the first video and then continued watching the complete series. Could you please attach the lab work you explain in the videos? Much appreciated for your great work!
@YerramBhavana 7 days ago
Where can we write the code?
@amrutavastrad 10 days ago
Thank you!!!
@rajasdataengineering7585 10 days ago
You're welcome!
@PinaakGoel 10 days ago
I have a doubt regarding the update operation. You mentioned that the Delta engine scans for the particular files that have records needing to be updated and then updates them. But if that is the case, how is time travel possible, since updating existing files would result in loss of historical data?
@rajasdataengineering7585 10 days ago
Parquet files are immutable in nature. So during an update, the relevant files are scanned and, based on the updated values, new parquet files are created. It won't overwrite the existing parquet files.
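A minimal PySpark sketch of this copy-on-write behaviour with Delta Lake; the table path, column names and version number below are hypothetical, not taken from the video.

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical Delta table path.
delta_table = DeltaTable.forPath(spark, "/mnt/datalake/employees")

# The update scans only the files containing matching records and writes
# new parquet files with the updated values; the old files stay on storage
# and remain referenced by the previous table version.
delta_table.update(
    condition="dept = 'Sales'",
    set={"bonus": "bonus * 1.1"},
)

# Time travel: read the table as it was before the update.
previous = (
    spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("/mnt/datalake/employees")
)
previous.show()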
@PinaakGoel 10 days ago
@rajasdataengineering7585 Understood, thanks for your reply, and kudos for your effort in compiling this Databricks playlist!
@rajasdataengineering7585 10 days ago
You are welcome!
@strikerarijit 11 days ago
Great Sir!!
@rajasdataengineering7585 11 days ago
Thanks
@shivanaga3302 12 days ago
Best intro to Spark I've seen till now...
@rajasdataengineering7585 12 days ago
Thank you
@NA-dg6um 13 days ago
Can you provide the documentation for this video?
@harimanikanta1 14 days ago
Thanks, anna. I found this explanation far easier to understand than other sources available on YouTube.
@rajasdataengineering7585 14 days ago
Glad it was helpful! You are welcome
@Hemant-k51 14 days ago
Please send me a copy of the notebook.
@srijanbansal6078 16 days ago
I am not getting the correct outputs.
@abhi.isnt.awesome 16 days ago
Hello sir, do you run live paid batches for data engineering? If yes, please let me know. I want to learn data engineering.
@NikhilGosavi-go7be 16 days ago
done
@NikhilGosavi-go7be 16 days ago
done
@NikhilGosavi-go7be 16 days ago
done
@rajasdataengineering7585 16 days ago
Good progress!
@kamaltheja 16 days ago
Can you please share all the notebooks in this series?
@NikhilGosavi-go7be 16 days ago
done
@NikhilGosavi-go7be 16 days ago
done
@NikhilGosavi-go7be 16 days ago
done
@YashSharma-ou7rh 20 days ago
Sir, your videos are really good and explained in very simple, understandable language. I'm stuck because my cluster is not starting; it says "Azure Quota Exceeded Exception". I'd be grateful if you could help solve this.
@seenme2951 22 days ago
That's great sir
@rajasdataengineering7585 22 days ago
Thank you
@Reddy-i3e 22 days ago
Awesome explanation, you deserve a big round of applause for this. Every second of this video plays a key role in understanding the concept of data partitioning. Really loved the content explained in this manner.
@rajasdataengineering7585 22 days ago
It's my pleasure! Thanks for your comment 😊
@sowjanyagvs7780 22 days ago
What is lit()? Whenever we want to add a constant literal value to an entire DataFrame, we go with lit(). We can also apply such values only to certain records using when() and otherwise(), e.g. empDF = df.withColumn("Bonus", when(df.sal > 50000, df.sal * 10).otherwise(df.sal * 20)). Thanks for the amazing session, Raj sir.
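A runnable sketch of the idea above; the sample data, salary threshold and bonus percentages are illustrative assumptions, not values from the video.

from pyspark.sql import SparkSession
from pyspark.sql.functions import when, lit, col

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("Asha", 60000), ("Ravi", 40000)], ["name", "sal"])

# lit(): a constant literal value for every row.
df = df.withColumn("country", lit("India"))

# when()/otherwise(): a conditional value per record (assumed 10% / 20% bonus).
emp_df = df.withColumn(
    "bonus",
    when(col("sal") > 50000, col("sal") * 0.10).otherwise(col("sal") * 0.20),
)
emp_df.show()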
@rahulbhusari1478 22 days ago
Super! Can you please cover scenario-based PySpark interview questions on optimization?
@swapnilgosawi 23 days ago
Wonderful! Isn't Delta Lake schema-on-write? Delta Lake tables are schema-on-write, which means the schema is already defined and enforced when the data is written. Delta Lake is aware when data with a different schema has been appended.
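A minimal sketch of that schema-on-write enforcement with Delta Lake; the path, columns and sample rows are assumptions for illustration only.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create a small Delta table with a fixed schema.
orders = spark.createDataFrame([(1, "laptop")], ["order_id", "product"])
orders.write.format("delta").mode("overwrite").save("/tmp/orders_delta")

# Appending a DataFrame with an extra column fails schema enforcement.
new_orders = spark.createDataFrame([(2, "phone", 499.0)],
                                   ["order_id", "product", "price"])
try:
    new_orders.write.format("delta").mode("append").save("/tmp/orders_delta")
except Exception as err:
    print("Schema enforcement rejected the write:", type(err).__name__)

# Schema evolution has to be requested explicitly.
(new_orders.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("/tmp/orders_delta"))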
@swapnilgosawi 23 days ago
If possible, can you also explain whether we can update only a certain range of partitions? For example, if the data is partitioned by month and I want to update only the last 3 months of partitions, how can we achieve that?
@arabajshaikh8411 23 days ago
Excellent, Thank you so much.
@rajasdataengineering7585 23 days ago
Glad it was helpful! You are welcome
@swapnilgosawi 24 days ago
Do you have a document with all these details? If yes, it would be great to share it on Git. Really great explanation. Thank you!!
@debasishkalia135 24 days ago
This explanation is great, very detailed.
@rajasdataengineering7585 24 days ago
Thank you!
@firazmohiuddin7183 24 days ago
It would be better if you linked the files you worked on in the description.
@Ustaad_Phani 25 days ago
Nice explanation sir
@rajasdataengineering7585 25 days ago
Thank you! Keep watching
@Ustaad_Phani 25 days ago
Nice explanation sir
@rajasdataengineering7585 25 days ago
Thanks and welcome
@Ustaad_Phani 25 days ago
Very nice explanation sir
@rajasdataengineering7585 25 days ago
Thanks for liking
@Ustaad_Phani 25 days ago
Very informative
@rajasdataengineering7585 25 days ago
Glad it was helpful!
@sowjanyagvs7780 26 days ago
I am trying to grab an opportunity in Databricks, and I'm glad I found your channel. Your explanations are far better than those trainings.
@rajasdataengineering7585 26 days ago
Welcome aboard! Thank you
@manibaddireddy5477 26 days ago
What is the difference between a UDF and using transform()?
@pianikalje2758 27 days ago
Recently I was asked an interview question on handling bad records which need to be deleted. The company was Bosch.
@rajasdataengineering7585 26 days ago
Thank you for sharing your experience
@sowjanyagvs7780 27 days ago
When you mention referring to other videos, can you also keep adding those links in the description? Thanks a lot for your explanation!!
@rajasdataengineering7585 26 days ago
Sure thing! Will add links
@avinash1722 27 days ago
Very informative. Way better than paid courses.
@rajasdataengineering7585 26 days ago
Thank you!
@priyankatangirala6342 27 days ago
Please provide the answers for the exercise.
@hanumantharaokaryampudi8857 28 days ago
Hi sir, are you providing any training on Databricks? Let me know the details if you do.
@Hemant-k51 28 days ago
Nice, you explained it excellently 👌
@rajasdataengineering7585 28 days ago
Thank you so much 🙂