Hadoop vs Spark | Lec-3 | In depth explanation

26K views

In this video I compare Apache Spark and Hadoop and cover the differences in detail. If you have any doubts, please post your questions in the comments section.
Directly connect with me on:- topmate.io/manish_kumar25
For more queries, reach out to me on the social media handles below.
Follow me on LinkedIn:- / manish-kumar-373b86176
Follow Me On Instagram:- / competitive_gyan1
Follow me on Facebook:- / manish12340
My Second Channel -- / @competitivegyan1
Interview series Playlist:- • Interview Questions an...
My Gear:-
Rode Mic:-- amzn.to/3RekC7a
Boya M1 Mic-- amzn.to/3uW0nnn
Wireless Mic:-- amzn.to/3TqLRhE
Tripod1 -- amzn.to/4avjyF4
Tripod2:-- amzn.to/46Y3QPu
camera1:-- amzn.to/3GIQlsE
camera2:-- amzn.to/46X190P
Pentab (Medium size):-- amzn.to/3RgMszQ (Recommended)
Pentab (Small size):-- amzn.to/3RpmIS0
Mobile:-- amzn.to/47Y8oa4 (You should definitely not buy this one)
Laptop -- amzn.to/3Ns5Okj
Mouse+keyboard combo -- amzn.to/3Ro6GYl
21 inch Monitor-- amzn.to/3TvCE7E
27 inch Monitor-- amzn.to/47QzXlA
iPad Pencil:-- amzn.to/4aiJxiG
iPad 9th Generation:-- amzn.to/470I11X
Boom Arm/Swing Arm:-- amzn.to/48eH2we
My PC Components:-
intel i7 Processor:-- amzn.to/47Svdfe
G.Skill RAM:-- amzn.to/47VFffI
Samsung SSD:-- amzn.to/3uVSE8W
WD blue HDD:-- amzn.to/47Y91QY
RTX 3060Ti Graphic card:- amzn.to/3tdLDjn
Gigabyte Motherboard:-- amzn.to/3RFUTGl
O11 Dynamic Cabinet:-- amzn.to/4avkgSK
Liquid cooler:-- amzn.to/472S8mS
Antec Prizm FAN:-- amzn.to/48ey4Pj

Published: 25 Mar 2023

Comments: 75

@pranavbhawane7591 5 months ago

Manish bhai, what an amazing person you are! Your content and knowledge are excellent. Thank you for the videos.

@Lakshvedhi 1 year ago

I have been following your channel for a long time. I love your content. I am preparing for data engineering, and these videos are helping me very much. Thank you so much.

@rishav144 1 year ago

Well explained. Thanks for the consistent videos.

@rawat7203 1 year ago

Thank you Manish, started following you lately ... Amazing content .. Keep up the good work

@nilavnayan4521 1 year ago

Great content Manish bhai, really good comparison, good points! Thanks!

@yashbaviskar6317 1 year ago

Amazing content Manish bhaiya 🙌.. Looking forward to more such exciting and knowledgeable video content.....

@nakulbageja2232 1 year ago

Great work, thank you👌🙌

@danishthev-log2264 1 year ago

You set it on fire, sir! I had already completed Spark earlier, but today I got to learn it this deeply from your channel. Overwhelming content. 🙂🙂

@shreeb7352 1 year ago

thanks for explaining WHYs! very helpful!

@coding7241 1 year ago

i watched it 3 times.....awesome video

@pritiiBisht 7 months ago

Really Appreciated. I like the content.

@sunnyd9878 6 months ago

Bhai, you explained it very well... excellent, thanks.

@ujjalroy1442 1 year ago

Very detailed, my friend... Thanks

@vaibhavkamble3325 3 months ago

Right Class for individual. For beginners.❤❤❤ Thank you.

@SANJAYYADAV-hm2bs 5 months ago

Manish brother, your content is really awesome. Feeling lucky to have found your channel.

@deeksha6514 4 months ago

Best playlist on the internet

@vedant_dhamecha 8 months ago

I am watching this two hours before my university exams! I can understand everything clearly! Hats off, man

@talhaaziz4847 3 months ago

Outstanding... Keep it up. Very good, short, informative videos. Please make more videos with more detail. Highly recommended for all.

@ANJALISINGH-nr6nk 7 months ago

You are the best.

@chandrakantkumar1276 6 months ago

Namaste Sir, I have a doubt about the explanation at 21:00. Fault tolerance in HDFS works at the cluster level: if a node fails, recovery happens, and that recovery is handled by the master node. But Spark is a compute engine. If the storage is still HDFS and a node fails, data recovery will happen the same way it did in the Hadoop ecosystem. So how does the DAG provide fault tolerance in Spark? As far as I understand, the DAG will re-compute the data, but I don't understand under what circumstances Spark will have to use the DAG to re-compute/re-process something. Please explain if you have an example or use case.

@journeyWithAshutosh 6 months ago

Sir, please make a playlist covering the full PySpark syllabus.

@aryankhandelwal8517 11 months ago

GOOD VIDEO🤟

@amanjha5422 1 year ago

Bhaiya, please continue this series. I have wanted to learn this for a long time, and your videos are great.

@manish_kumar_1 1 year ago

Sure

@pratikparbhane8677 6 months ago

Attendance Marked

@wellwisher7333 1 year ago

Thanks bhai

@dataman17 5 months ago

Brilliant explanations!

@rajandeshmukh3094 5 months ago

Are you a fellow data engineering aspirant ?

@coding7241 1 year ago

Thanks

@ComedyXRoad 3 months ago

thank you brother

@sanooosai 4 months ago

thank you sir

@rajeshwarreddyracha4655 11 months ago

Why would we use Hive if we already have Spark in our project? Is there any specific reason?

@harshi993 4 months ago

What in what ? Data storage or processing ?

@lifelearningneo 1 year ago

Bhaiya, in Hadoop, fault tolerance exists only at the storage level, right? There is no fault tolerance at the application level, is there? Correct me if I am wrong.

@LOFI_WORLD_SONG 7 months ago

I don't want to code. Can I learn data engineering or should I go for Devops engineering?

@navjotsingh-hl1jg 1 year ago

Bro, please upload a video every day; it will help us stay consistent.

@shubhajitadhikary1960 9 months ago

🔥🙇🏻🙏🏻

@chiragsharma9430 1 year ago

Hi Manish, can you also make a video on a Spark-related project that would be useful for aspiring data scientists, just like the one you created specifically for data engineering? Thanks in advance!

@manish_kumar_1 1 year ago

Will try

@chiragsharma9430 1 year ago

@@manish_kumar_1 thanks

@nitilpoddar 7 months ago

done

@manish_kumar_1 1 year ago

Directly connect with me on:- topmate.io/manish_kumar25

@amitkumar-ij9sw 6 months ago

Manish, Hadoop was developed by Doug Cutting, a former Yahoo developer, not by Google.

@ytsh9366 1 year ago

Hello Manish bhaiya, I have two years of experience in web development at a service-based company, and I want to switch to a data engineering profile. I have learnt SQL and am learning Python after watching your videos. My company does not allow internal role changes, so how can I switch to a data engineering role? Please answer.

@manish_kumar_1 1 year ago

Watch my video titled "How I bagged 12 offers".

@anshukumari6616 1 year ago

Thanks for the detailed explanation!!

@gchanakya2979 1 year ago

Marking my attendance 🙏

@punkad2337 1 year ago

Manish sir, can you provide a Data Engineer course or tutorial videos? If you can, please share the link so that I can buy the tutorials or course.

@manish_kumar_1 1 year ago

I teach for free. Please watch my "12 offers" video. You will find all the free resources there.

@reachrishav 1 year ago

Hi Manish, how do you make such notes in onenote? What stylus/device is required for this? I want to purchase a similar device for digital note-taking. Please advise.

@manish_kumar_1 1 year ago

A pentab (pen tablet) is required to write in a notebook or PPT. You can buy one online. I have the medium-size one; you can find the link in the description.

@reachrishav 1 year ago

@@manish_kumar_1 Is it the iPad pencil you're referring to? Will wacom one pen tablet work the same?

@manish_kumar_1 1 year ago

@@reachrishav Yes, but it won't have a screen. You get a pad and a stylus. You write on the pentab with the stylus, and whatever you write shows up on the laptop in OneNote, PPT, or whichever software you are using.

@reachrishav 1 year ago

@@manish_kumar_1 Thanks. I guess you are using wacom tablet/stylus for this video?

@Watson22j 1 year ago

Bhaiya, 128 MB is the default block size, which we can customise as per our needs, right? So my question is: in which cases do we decrease the block size, and in which cases do we increase it?

@manish_kumar_1 1 year ago

If we have many small disk blocks, more time is spent seeking (looking up where the information is). Having many small blocks is also a burden on the name node/master: the name node stores the metadata, so it has to keep track of every one of those blocks.

@Watson22j 1 year ago

@@manish_kumar_1 Thank you :)
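
To put rough numbers on the reply above, here is a small back-of-the-envelope sketch in Python. It only counts how many blocks one file is split into at different block sizes; the 1 GB file size and the helper name `block_count` are assumptions made for illustration, and the real per-block metadata cost on the name node is not modelled.

```python
import math


def block_count(file_size_mb: float, block_size_mb: float) -> int:
    """Number of blocks needed to hold one file (hypothetical helper)."""
    return math.ceil(file_size_mb / block_size_mb)


# Assumed example: a single 1 GB (1024 MB) file.
file_size_mb = 1024

for block_size_mb in (16, 64, 128, 256):
    blocks = block_count(file_size_mb, block_size_mb)
    # Each block is one more entry the name node has to track.
    print(f"block size {block_size_mb:>3} MB -> {blocks:>2} blocks to track")
```

The smaller the block size, the more blocks the same file produces, which is the extra bookkeeping the reply refers to.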

@adityaanand835 1 year ago

I think the title should be MapReduce vs Spark. You can use both within Hadoop, after all.

@manish_kumar_1 1 year ago

Yes, it should be MapReduce vs Spark, but the term "Hadoop vs Spark" is more popular.

@adityaanand835 1 year ago

@@manish_kumar_1 Don't stick with popularity; stick with the concept, to avoid confusion.

@alakmarshafin9065 5 months ago

Minor correction: Hadoop was created by Yahoo!, not Google.

@TheBest-yh1yj 8 months ago

Bhai, I have a question related to the DAG. If process 3 fails, the DAG knows the steps needed to regenerate the output of process 3. What happens when process 1 fails? How does the DAG recover from that? And what is a process?

@soumyaranjanrout2843 8 months ago

If "process 1" fails in the DAG, recovery typically involves retrying or restarting "process 1" itself. Whether this succeeds depends on whether "process 1" is independent or has dependencies; if it has dependencies, those may need to be reprocessed as well to keep the workflow in a consistent state. Essentially, DAG recovery for a failed process means identifying the failure point, addressing it, and potentially rerunning dependent processes to maintain the integrity of the workflow. (Thanks, ChatGPT.)

Let me elaborate. A Directed Acyclic Graph (DAG) in Spark represents a computational workflow in which nodes denote tasks or operations and directed edges show the dependencies between those tasks. For fault tolerance, if a task like "process 1" fails, the DAG aids recovery by re-executing the failed task using the information recorded about its dependencies, so the computation can continue.

Consider a scenario where you apply five transformations to a DataFrame (DF). Each transformation creates a new DF, because DFs are immutable. If, for instance, "transformation 4" fails during execution, Spark goes back to the DF produced by "transformation 3" (its dependency) and then re-executes "transformation 4."

Regarding your question about a "process 1" failure: if it fails, recovery involves restarting "process 1." Given the interdependencies between tasks, subsequent transformations won't proceed if the initial process fails. The DAG orchestrates this recovery by making sure the failed task is restarted, which allows the rest of the workflow to proceed.

If I am wrong, please let me know, because I am also a beginner in the data domain.

@TheBest-yh1yj 7 months ago

@@soumyaranjanrout2843 Thanks for the detailed explanation. What is the meaning of "Given interdependencies between tasks, subsequent transformations won't proceed if the initial process fails."?

@soumyaranjanrout2843 6 months ago

@@TheBest-yh1yj In simpler terms, if one step in a process fails, the following steps that depend on it are stuck until the initial issue is resolved. To simplify further: the tasks are interdependent, so if task 1 fails (as in your question), the remaining tasks that rely on its output cannot continue until task 1 has completed successfully. Hope that makes sense 😊
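
The recovery idea discussed in this thread can be sketched as a toy model in plain Python. This is not Spark code and not how Spark is implemented: the `Step` class, its method names, and the choice to keep every step's output in memory are all assumptions made for illustration. Real Spark tracks lineage per partition and recomputes lost partitions from it. The sketch only shows the principle that a lost result can be rebuilt by replaying its recorded dependencies.

```python
class Step:
    """One node in a toy lineage chain (hypothetical, not a Spark API)."""

    def __init__(self, name, fn, parent=None):
        self.name = name
        self.fn = fn          # how to build this step's output
        self.parent = parent  # recorded dependency (the "lineage")
        self.cached = None    # output kept in memory (a simplification)
        self.runs = 0         # how many times this step was executed

    def compute(self):
        # Reuse the output if it is still available.
        if self.cached is not None:
            return self.cached
        # Otherwise rebuild it from the parent, recursively.
        source = self.parent.compute() if self.parent else None
        self.runs += 1
        self.cached = self.fn(source)
        return self.cached

    def lose_output(self):
        # Simulate a failure that wipes this step's result.
        self.cached = None


load = Step("load", lambda _: [1, 2, 3, 4])
double = Step("double", lambda xs: [x * 2 for x in xs], load)
keep_big = Step("keep_big", lambda xs: [x for x in xs if x > 4], double)

first = keep_big.compute()

# Case 1: only the last step's output is lost, so only it is re-run.
keep_big.lose_output()
keep_big.compute()

# Case 2: every output is lost, so the chain is replayed from the first step.
for step in (load, double, keep_big):
    step.lose_output()
recovered = keep_big.compute()

print(recovered == first, load.runs, double.runs, keep_big.runs)
```

In this toy model, losing the first step is not a special case: once its output is gone, it is simply re-run from its own definition, and everything that depends on it is rebuilt afterwards.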