BigData Thoughts
Bigdata Thoughts is a channel focused on big data technologies and cloud solutions. I share my experience of building different use cases on the cloud, the challenges faced, and the solutions to them. I also discuss the technology options, architecture, and design principles behind each solution.
All you need to know about Spark Monitoring
14:11
2 months ago
Google Gemini vs ChatGPT
11:55
4 months ago
What is generative AI
19:36
4 months ago
Stream Processing Fundamentals
19:05
5 months ago
Evolution of Data Architectures in last 40 years
21:26
6 months ago
Spark low level API - Distributed variables
10:24
9 months ago
Spark low level API - RDDs
12:11
9 months ago
Spark structured API - Dataframe and Datasets
17:05
10 months ago
Spark structured API - Dataframe
14:37
10 months ago
Spark Architecture in Depth Part 2
15:15
11 months ago
Spark Architecture in Depth Part 1
13:02
11 months ago
All About Continuous Integration
12:05
1 year ago
Understanding Spark Execution
15:33
1 year ago
Structured Streaming in Spark
18:56
1 year ago
All about Debugging Spark
18:29
1 year ago
What is Quantum Computing?
12:28
1 year ago
All about Blockchains
16:54
1 year ago
All about Data Vaults
21:15
1 year ago
Top 8 Bigdata Trends
15:40
1 year ago
How to build efficient Data lakes
23:13
1 year ago
All about stream processing
11:20
1 year ago
All about partitions in Spark
12:27
1 year ago
What is Kubernetes
14:30
1 year ago
What is Azure AD
9:42
1 year ago
Comments
@yashawanthraj8872
@yashawanthraj8872 10 days ago
Can a node/thread have more partitions than the number of executors? If yes, where is the partition count information stored?
@gvnreddy2244
@gvnreddy2244 15 days ago
Very good session, ma'am. If it were shown practically it would be even more useful. Thank you for your efforts.
@BigDataThoughts
@BigDataThoughts 14 days ago
Thanks
@nishchaysharma5904
@nishchaysharma5904 17 days ago
Thank you for this video.
@BigDataThoughts
@BigDataThoughts 14 days ago
Thanks
@vaibhavjoshi6853
@vaibhavjoshi6853 19 days ago
Getting confidence in Spark because of you. Thanks so much!
@BigDataThoughts
@BigDataThoughts 14 days ago
Thanks
@ambar752
@ambar752 24 days ago
To summarize: what data marts are to a data warehouse, the data mesh is to a data lake.
@rovashri566
@rovashri566 28 days ago
How did you create such good visual explanations? Which tool did you use to draw the sketches? Please guide 🙏
@muralichiyan
@muralichiyan 1 month ago
Are a data mesh and Snowflake the same? Are a data mesh and Microsoft Fabric the same?
@Learn2Share786
@Learn2Share786 1 month ago
Thanks, appreciate it. Is there a plan to post practical videos around Spark performance tuning?
@user-zb9hm5yh1m
@user-zb9hm5yh1m 1 month ago
Thank you for sharing your thoughts.
@BishalKarki-pe8hs
@BishalKarki-pe8hs 1 month ago
This is not exactly the answer.
@ranyasri1092
@ranyasri1092 1 month ago
Please do videos with sample data sets so that they would help with hands-on practice.
@mindwithcuriosity5347
@mindwithcuriosity5347 1 month ago
It seems it is PaaS, as mentioned on the Microsoft website.
@sanketdhamane5941
@sanketdhamane5941 1 month ago
Really, thanks for the good and in-depth explanation.
@BigDataThoughts
@BigDataThoughts 1 month ago
Thanks
@sindhuchowdary572
@sindhuchowdary572 2 months ago
Let's say there is no change in records for the next day. Does the data get overwritten again with the same records?
@BigDataThoughts
@BigDataThoughts 2 months ago
No, we only take the new differential data when we do CDC.
@sunnyd9878
@sunnyd9878 2 months ago
This is excellent and valuable knowledge sharing. One can easily tell these trainings come from deep personal hands-on experience and not mere theory. Great work.
@BigDataThoughts
@BigDataThoughts 2 months ago
Thanks
@Learn2Share786
@Learn2Share786 2 months ago
Thank you, please also post some practical videos around the same topic.
@user-zb9hm5yh1m
@user-zb9hm5yh1m 2 months ago
Thank you for sharing thoughts
@KiranKumar-cg3yg
@KiranKumar-cg3yg 2 months ago
First one to monitor the notification from you
@shreyachakravarty1347
@shreyachakravarty1347 2 months ago
Thanks
@ahmedaly6999
@ahmedaly6999 2 months ago
How do I join a small table with a big table when I want to keep all the data from the small table? The small table is 100k records and the large table is 1 million records. df = smalldf.join(largedf, smalldf.id == largedf.id, how='left_outer') runs out of memory, and I can't broadcast the small df, I don't know why. What is the best approach here? Please help.
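For the join question above, a conceptual sketch in plain Python of the broadcast hash join strategy Spark applies when one side is broadcast: the small side is built into an in-memory hash map once, and the large side streams past it, so only the small table has to fit in memory. The table contents below are made up for illustration; row counts in the comment are only mirrored in the comments.

```python
# Build phase: hash the SMALL table on the join key (id).
small = [(1, "a"), (2, "b"), (3, "c")]               # stands in for the ~100k-row table
large = [(1, 10.0), (1, 11.5), (3, 7.2), (9, 0.1)]   # stands in for the ~1M-row table

small_by_id = {row_id: val for row_id, val in small}

# Probe phase: stream the LARGE table, emitting rows whose key matches.
# No shuffle of the large side is needed; each worker only needs the map.
joined = [(row_id, small_by_id[row_id], amt)
          for row_id, amt in large
          if row_id in small_by_id]

print(joined)  # [(1, 'a', 10.0), (1, 'a', 11.5), (3, 'c', 7.2)]
```

One hedged observation on the commenter's OOM: in Spark, a `left_outer` join can only broadcast the right-hand side, which here is the 1M-row table; a fix worth verifying is to flip the join so the small table is the broadcastable side, e.g. `largedf.join(broadcast(smalldf), "id", "right_outer")`, which still preserves every small-table row.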
@harigovindk
@harigovindk 2 months ago
18 April 2024
@karthikeyanr1171
@karthikeyanr1171 3 months ago
Your videos on Spark are hidden gems.
@BigDataThoughts
@BigDataThoughts 2 months ago
Thanks
@rupaghosh6251
@rupaghosh6251 3 months ago
Nice explanation
@BigDataThoughts
@BigDataThoughts 3 months ago
Thanks
@RameshKumar-ng3nf
@RameshKumar-ng3nf 3 months ago
At the start of the video I was so happy seeing all the diagrams. Later I got fully confused, felt it was complicated, and didn't understand well 😢
@nahomg.4191
@nahomg.4191 3 months ago
I wish I could give 1000 likes. You're an excellent teacher!
@BigDataThoughts
@BigDataThoughts 3 months ago
Thanks
@user-eg9ed5nr8z
@user-eg9ed5nr8z 3 months ago
Nice explanation
@BigDataThoughts
@BigDataThoughts 3 months ago
Thanks
@amitgupta3
@amitgupta3 3 months ago
Found it helpful. You could go slower though; I had to stop and rewind a few times.
@husnabanu4370
@husnabanu4370 3 months ago
What a wonderful explanation, to the point. Thank you.
@BigDataThoughts
@BigDataThoughts 3 months ago
Thanks
@sumonmal009
@sumonmal009 3 months ago
Good playlist for Spark: ru-vid.com/group/PL1RS9FR9qIPEAtSWX3rKLVcRWoaBDqVBV
@BigDataThoughts
@BigDataThoughts 3 months ago
Thanks
@mohnishverma87
@mohnishverma87 4 months ago
Just wow, a very simple explanation of a complex cluster overview. Thanks.
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@masoom002
@masoom002 4 months ago
Best explanation I have come across on RU-vid. Watching all the parts. Thank you for explaining it so smoothly.
@user-zb9hm5yh1m
@user-zb9hm5yh1m 4 months ago
Thank you for sharing thoughts!
@utsavchanda4190
@utsavchanda4190 4 months ago
That was very well explained. Thank you for putting this together. One question though: do you really think data modelling should be done on the Gold layer? I don't think so, because Gold datasets are just business-level aggregates suited to particular business consumption needs, whereas the Silver layer is the warehouse in the Lakehouse. That is where modelling should be done, if needed.
@shrabanti84
@shrabanti84 4 months ago
Thank you so much. All the videos are very clear and effective.
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@user-zb9hm5yh1m
@user-zb9hm5yh1m 4 months ago
Thank you for sharing your thoughts.
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@deepalirathod4929
@deepalirathod4929 4 months ago
It finally became clear to me after reading here and there. Thank you.
@himanshupandey8576
@himanshupandey8576 4 months ago
One of the most helpful sessions!
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@Learn2Share786
@Learn2Share786 4 months ago
Nicely explained, thank you. Looking forward to learning more around this topic.
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@srinivas123j
@srinivas123j 4 months ago
Well explained!!!
@srinivas123j
@srinivas123j 4 months ago
Well explained!!!
@srinivas123j
@srinivas123j 4 months ago
Well explained!!
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@srinivas123j
@srinivas123j 4 months ago
Well explained
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@user-zm2me1gc5z
@user-zm2me1gc5z 4 months ago
Nicely explained, thanks. It's helping a lot.
@hlearningkids
@hlearningkids 4 months ago
Kindly do a similar simple video for Dataproc and also BigQuery.
@user-fz4in8bf1y
@user-fz4in8bf1y 4 months ago
Thank you for the detailed explanation. However, the problem I faced with reading dates prior to 1900 does not resolve even after setting all the mentioned properties. Does anyone have a working example that solves the issue of reading dates prior to 1900? Below is the code that I added, but it did not work:
conf = sparkContext.getConf()
conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED")
conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")
conf.set("spark.sql.datetime.java8API.enabled", "true")
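One possible cause for the issue above, offered as an assumption rather than a verified fix: calling `conf.set(...)` on the configuration of an already-started SparkContext does not affect SQL runtime settings. A sketch of setting the same properties (names taken from the comment) on the live session instead:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# SQL runtime settings can be changed on the active session via spark.conf;
# SparkContext configuration is fixed once the context has started.
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED")
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")
spark.conf.set("spark.sql.datetime.java8API.enabled", "true")
```

Whether this resolves the reader's pre-1900 date problem depends on how the Parquet files were written; treat it as one thing to rule out.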
@hlearningkids
@hlearningkids 4 months ago
Very good information 🎉
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@hlearningkids
@hlearningkids 4 months ago
Very nice 👍
@BigDataThoughts
@BigDataThoughts 4 months ago
Thanks
@hlearningkids
@hlearningkids 4 months ago
@@BigDataThoughts Did you explain BigQuery in this style as well? One improvement for this video would be a slower summary. Please don't be hurt by this comment; you did really well in the video. Excellent explanation.
@vishalmehta5171
@vishalmehta5171 5 months ago
Can you make separate content on the JVM?
@user-zb9hm5yh1m
@user-zb9hm5yh1m 5 months ago
Thank you for sharing thoughts!
@BigDataThoughts
@BigDataThoughts 5 months ago
Thanks