GK Codelabs

I am Arpit Singh.
I have been working in the IT industry for about 10 years, including 8+ years in Big Data.
I provide useful content for Big Data aspirants who need essential interview questions and use-case scenarios, all built and explained from scratch.
Please subscribe to my channel for all the interesting videos!

For any queries, write to me at gkcodelabs@gmail.com

Keep Watching :)

Test Your AWS Skills with This Fun Quiz 🤓 | Data Storage & Migrations on AWS

6:19

3 months ago

Large Data Migration in AWS Cloud: Challenges & Solutions

11:29

3 months ago

Data Security Strategies in Data Pipelines | Apache Spark | Best Practices

14:02

4 months ago

Automate EMR Creation & Termination for Spark | Ready to use setup

2:12

4 months ago

Get Personalized Guidance for creating your video content for YouTube or Instagram!

1:52

6 months ago

🚀 Unlock Success: 8 Essential Tips Every CS/IT B.Tech Graduate Must Know! 🎓

7:29

6 months ago

What to do? When Manager Asks to "Something Out Of the Box" | Best IT Career Tips 🔥

12:13

6 months ago

Career Tip for BTech CS/IT Students | Job Market Alert

6:47

6 months ago

2024 me Apache Spark Roadmap? Data Engineering career ke liye must.

9:18

6 months ago

All about AWS S3 | Versioning | Replication | Lifecycle

19:46

1 year ago

AWS IAM Policies in 10 Mins with Hands-on Demo | AWS Series | Session-3

9:30

1 year ago

AWS User, Groups, Roles, Policies in 15 mins | AWS Complete Handson Series.

16:45

1 year ago

AWS Architecture & Billing | Session 2 | AWS Complete Hands-On Series

10:02

1 year ago

Free Cloud Series Announcement | Poll results | AWS Vs Azure Vs GCP

3:28

1 year ago

5 points to add to checklist before going for a Big-Data Interview

8:58

2 years ago

NULL Values in Spark ☹️| A Common mistake ❌ | Spark Interview Question

5:57

2 years ago

Important Cloud Terminologies | Cloud Interview Question

15:43

2 years ago

Insert-Overwrite in Spark | 7 Important Scenarios Explained

7:40

2 years ago

Using Kafka for REST API Inputs | Kafka REST | Kafka with React Apps

10:45

2 years ago

Using Currying in Spark | Use case explained

14:09

2 years ago

When to use Higher Order Function in SPARK? | Scala & Scala

17:32

2 years ago

Why to choose Cloud Storage over HDFS??

5:33

2 years ago

Describe and Summary in Apache spark

8:28

2 years ago

Big Data Insights | Should you move to Big Data? | Experienced Discussion

48:15

2 years ago

Copy Row Data | Spark Interview question

14:59

2 years ago

coalesce vs repartition vs partitionBy in spark | Interview question Explained

8:02

2 years ago

PIVOT in Spark DataFrames | When to use PIVOT? | USE CASE!

20:10

3 years ago

Read Spark DataFrame from different paths | Spark Interview Question

11:57

3 years ago

Handling nested Json in Apache Spark | Big Data | Interview Questions | Part-1 | Json Traversing

13:00

3 years ago

Comments

@chinnalearns9565 16 days ago

Code please

@NewThingsAk 16 days ago

Can you please send me the GitHub code?

@naveenbhandari5097 2 months ago

Really helpful video. It helped me a lot.

@BishalKarki-pe8hs 2 months ago

Please provide the code, sir.

@BishalKarki-pe8hs 2 months ago

Please can you provide the code for this?

@ravishankarrallabhandi531 2 months ago

How can we handle the case where source records are closed or deleted?

@chandanpatra1053 2 months ago

What about Lambda and Kappa architecture? Can't we mention this in the interview?

@1HourBule 2 months ago

A person on YouTube who actually knows Spark

@dev4128 3 months ago

Thanks! Please make videos on your daily work.

@pravinmahindrakar6144 3 months ago

Thanks a lot. I think we can use the row_number window function to get the updated records, partitioning by emp_id and ordering by date desc, and then filter for row_number = 1.
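
The row_number idea in this comment can be illustrated with a minimal plain-Python sketch. The sample records, the `salary` field and the helper name `latest_per_key` are hypothetical; only `emp_id` and the descending date ordering come from the comment. In PySpark the same logic would typically be written with `row_number()` over `Window.partitionBy("emp_id").orderBy(col("date").desc())`; that version is not run here.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: several versions of a record per employee.
records = [
    {"emp_id": 1, "date": "2024-01-01", "salary": 100},
    {"emp_id": 1, "date": "2024-03-01", "salary": 120},
    {"emp_id": 2, "date": "2024-02-15", "salary": 90},
    {"emp_id": 2, "date": "2024-02-01", "salary": 80},
]


def latest_per_key(rows, key="emp_id", order_by="date"):
    """Keep the newest row per key, mimicking a filter on row_number = 1
    over (PARTITION BY key ORDER BY order_by DESC)."""
    # Order by date descending first (ISO dates sort correctly as strings),
    # then by key; Python's sort is stable, so the date order is preserved
    # inside each key group.
    ordered = sorted(rows, key=itemgetter(order_by), reverse=True)
    ordered = sorted(ordered, key=itemgetter(key))

    result = []
    for _, partition in groupby(ordered, key=itemgetter(key)):
        # Number the rows inside each partition, starting at 1.
        for row_number, row in enumerate(partition, start=1):
            if row_number == 1:
                result.append(row)
    return result


latest = latest_per_key(records)
for row in latest:
    print(row)
```

Rows that share the same `emp_id` and `date` are not distinguished by this sketch; a real pipeline would need an extra tie-breaking column in the ordering.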

@soulamazing1228 3 months ago

The point of using Apache Spark is to handle large datasets, so why are you converting the values to pandas? Pointless video in real-world scenarios.

@GKCodelabs 3 months ago

Conversion to Pandas is only done on the final, aggregated and normalized data, where the volume has been brought down to only what is required for visualization. Don't confuse data processing with visualization. You never run huge Spark jobs when a visual report is requested on the fly; that is the approach that becomes pointless. Reports are pulled from aggregated data. Matplotlib is one approach; there are many other tools and approaches. Thanks for bringing this up, by the way; it will help others who have a similar doubt.

@mallinathbirajdar6610 3 months ago

Where were all these butterfly photos and videos taken? I mean, in the village or in Pune? And if they were taken in Pune, what exactly is the location?

@ravikumart6561 3 months ago

Could you please share methods or some pseudocode to implement this concept, Arpit?

@rkdatalabs404 4 months ago

Very useful information, clearly explained. Thank you, Arpit ❤

@mohitjain2196 4 months ago

best course ❤🔥

@electricalsir 5 months ago

Thanks man, you are amazing 😍❤❤❤

@sonurohini6764 5 months ago

Can someone from a non-IT background, without any experience in data engineering, join as a DE? If yes, how supportive is the team?

@shashireddy3573 5 months ago

Hi Arpit, do you have any real-time Spark-scale projects in a cloud environment? I searched your playlist but couldn't find any relevant videos.

@user-de6zx5er2s 6 months ago

Hi, can you provide the dataset? We will try it from our end.

@MerleNader 6 months ago

Promo-SM 😔

@satyendrakumar4349 6 months ago

Very good information. Keep making videos like this regularly.

@ashutoshojha4244 6 months ago

Hey, Arpit! Do you suggest going with Databricks after learning PySpark, even though I want to make a career as an AWS data engineer? Or would it be more apt for me to just go with AWS Glue instead?

@GKCodelabs 6 months ago

No doubts about going with AWS. But don't just stick to AWS Glue; it is just one convenient way to run Spark loads. Explore all the Data Analytics services. PS: AWS Data Analytics is now the new term for Data Engineering services. Happy Learning 👍🏻

@akshaygidwani4360 6 months ago

Hi Arpit! I have been practicing PySpark for a while now, and I have studied and implemented most of the concepts you mentioned in the video as individual concepts. I am looking for projects/use cases where I can combine all the concepts to build a meaningful end result. Could you please suggest some? Thanks!

@ashutoshojha4244 6 months ago

Hi, where did you learn PySpark from? I am thinking of buying the Udemy PySpark course by Jose Portilla. Can you share your resources, please?

@charangowdamn8661 6 months ago

Which course did you refer to for PySpark?

@piashreetalukdar4258 7 months ago

Great Video. Nicely explained. Thank you 😊

@user-lp7sb5dw7l 8 months ago

When you do repartition and then partitionBy, the data is already partitioned based on the partitionBy column, so why does the number of part files still depend on repartition()?

@ravulapallivenkatagurnadha9605 9 months ago

Nice videos

@kampfer6375 10 months ago

Nice explanation

@astropanda1623 10 months ago

Very good explanation

@முரளிதரன் 10 months ago

2 months big data which company is hiring.. Not suitable for freshers and also for some experienced. Join u will know..😂😂

@anurodhpatil4776 10 months ago

Wooo great sir.....step by step ❤

@vinitpandey4424 11 months ago

It's very basic. I would suggest keeping it in Python code rather than SQL.

@JjCSJ 11 months ago

beautiful explanation

@FormulaMedia-gl9pi 11 months ago

I am getting an SSH connection timeout in Git Bash. What could be the reason behind this?

@absarusain5196 11 months ago

GCP

@avinash7003 1 year ago

Do Spark + Scala roles exist?

@yashgupta6684 1 year ago

File sink is not discussed. I was looking for file sink only.

@shashireddy3573 1 year ago

Hi, I need big data real-time projects to add to my resume.

@subhadipsamanta35 1 year ago

So life-saving 🙌🏻 Thank you

@Sagar0155 1 year ago

Playlist is really helpful. Explained from very basic to high level including important minor services

@2002asimanand 1 year ago

Good initiative. Waiting for GCP...

@MalayaleeYoutuber 1 year ago

OLAP cubes and the data warehouse are different (in the diagram they are marked together). A data warehouse persists data; in Big Data it would be the gold layer of the data lake.

@srikanthreddy4516 1 year ago

AWS

@prabhatgupta6415 1 year ago

Azure data engineer is in huge demand compared to AWS.

@my_j.a.r.v.i.s. 1 year ago

Sirrrrr.... you are back. I started my data engineering journey from here. I learnt basics of GCP but now in my job AWS will be used. I am happy you chose AWS.

@fenixbros-1 1 year ago

GCP

@PrashantKumar-vt2wr 1 year ago

Fantastic material...very easily and smoothly you describe everything...

@GKCodelabs 1 year ago

Thanks Prashant 😊👍🏻

@A_Dasgupta 1 year ago

Perfect, would definitely like to explore it!

@prabakaran758 1 year ago

AWS

@sravankumar1767 1 year ago

Thank you. I am getting duplicate records when I use overwrite mode: the previous records as well as the new records. How can we resolve this issue?