Solving Real-World Data Science Interview Questions! (with Python Pandas) 

Keith Galli
222K subscribers
111K views

Visit brilliant.org/KeithGalli/ to get started learning STEM for free, and the first 200 people will get 20% off their annual premium subscription
In this video we solve a series of Data Science Interview questions on Stratascratch. We start with easy problems using Python Pandas and then progressively get more difficult. At the end of the video we do five non-coding interview questions that force you to think at a high level.
Mentioned Resources!
Second Channel: / techtrekbykeithgalli
Regex Cheat Sheet: cheatography.com/davechild/ch...
Probability text book: www.amazon.com/dp/188652923X/...
Here are the questions that we complete (in order)
~~ Coding ~~
1. Finding Updated Records: platform.stratascratch.com/co...
2. Number of Bathrooms and Bedrooms: platform.stratascratch.com/co...
3. Counting Instances in Text: platform.stratascratch.com/co...
4. Customer Revenue in March: platform.stratascratch.com/co...
5. Monthly Percentage Difference: platform.stratascratch.com/co...
6. Premium vs Freemium: platform.stratascratch.com/co...
~~ Non-Coding ~~
1. Credit Card Activity: platform.stratascratch.com/te...
2. Outliers Detection: platform.stratascratch.com/te...
3. Probability of Having a Sister: platform.stratascratch.com/te...
4. Uber Black Rides: platform.stratascratch.com/te...
5. Terabyte of Data: platform.stratascratch.com/te...
The skills that we work on in this video include:
- Python Pandas
- Groupby & Aggregate DataFrames
- Use regexes to analyze text
- Datetime objects in Pandas
- Filtering by Conditionals
- Applying a lambda function to a data frame
If you have any questions, let me know in the comments!
If you enjoyed this video, make sure to throw it a like & subscribe for all future content :)
-------------------------
Video Timeline!
0:00 - Intro & Video Overview
0:46 - Check out this Video’s Sponsor, Brilliant!
3:10 - Coding #1 (Microsoft, Easy) - Finding Updated Records
10:36 - Coding #2 (Airbnb, Easy) - Number of Bathrooms and Bedrooms
16:38 - Coding #3 (Google, Medium) - Counting Instances in Text
28:23 - Coding #4 (Meta/Facebook, Medium) - Customer Revenue in March
36:51 - Coding #5 (Amazon, Hard) - Monthly Percentage Difference
56:38 - Coding #6 (Microsoft, Hard) - Premium vs Freemium
01:10:28 - Non-Coding #1 (Visa, Easy) - Credit Card Activity
01:13:33 - Non-Coding #2 (IBM, Easy) - Outliers Detection
01:16:46 - Non-Coding #3 (Google, Medium) - Probability of Having a Sister
01:27:19 - Non-Coding #4 (Uber, Medium) - Uber Black Rides
01:36:57 - Non-Coding #5 (Capital One, Hard) - Terabyte of Data
01:46:41 - Video Conclusion & Recap
-------------------------
Follow me on social media!
Instagram | / keithgalli
Twitter | / keithgalli
TikTok | / keithgalli
-------------------------
If you are curious to learn how I make my tutorials, check out this video: • How to Make a High Qua...
Practice your Python Pandas data science skills with problems on StrataScratch!
stratascratch.com/?via=keith
Join the Python Army to get access to perks!
YouTube - / @keithgalli
Patreon - / keithgalli
*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
This video was Sponsored by Brilliant

Published: 3 Jul 2024

Comments: 83
@KeithGalli 1 year ago
Thank you Brilliant for sponsoring this video! Check out brilliant.org/KeithGalli/ to get started learning STEM for free, and the first 200 people will get 20% off their annual premium subscription. Hope you all enjoyed this video :). I'm working on a bunch of new content right now so be on the lookout for another video or two in the next couple of weeks. If you have any questions about the topics covered in this or have a request for a future video, let me know here in the comments!!
@edwardj.warden5072 1 year ago
Hi @KeithGalli. I’ve got two questions to ask you. I have watched lots of your videos that I like, and learned a lot. Do you think that the certificate that Datacamp provides for data science is worth earning, and would it help me find a data science job? And what is the best place online that you would recommend for getting a data science certificate that would help me find a data science job? Thank you.
@yogeshuttekar8542 1 year ago
Glad to see you back mate. I have really learned more from your videos than attending University.
@hardiktyagi1955 1 year ago
At 37:48 I work for Amazon's RPA team, trying to make a career in data science. Last month I was appearing for an IJP and got the same question in the SQL coding round. Thanks for making this Keith. Keep them coming.
@KeithGalli 1 year ago
Dang that's too funny. My hope is that this video will help people in similar situations to yours moving forward. Thanks for watching!
@shivamburnwal7765 1 year ago
Hey Hardik, can you tell me why exactly you are trying to make a career in Data Science? Is it because RPA doesn't have a good future in the industry, or is it because you personally prefer the Data Science field? I am asking this question as I am also starting as a member of EXL's RPA team.
@nicholasgrandizio7596 1 year ago
Thank you for all the hard work you put into teaching Data Science. Your videos, and others like them, provide more to people like me who are trying to build a career in data than university programs do. You're playing an important role in the future of Data Science by leading current students along the path to becoming future industry leaders.
@laurentreynaud4404 1 year ago
Thank you so much for these data science courses!
@deepaksaikumar5178 1 year ago
Hi Keith, you have been a great resource for learning Python and data science-related skills. Thank you!
@edwardj.warden5072 1 year ago
Very helpful. Thank you Keith.
@danielefarotti1061 1 year ago
I really like your approach in explaining things. I am currently transitioning from pure maths into data science, and I find these videos very helpful!
@BOGABOOfull 1 year ago
Glad you're back bro ;) love these types of vids. Love from Portugal
@adeafni9544 1 year ago
Thank you Keith, you're amazingg, keep it up!!!
@dinkinflicka157 1 year ago
Yay! Another real world problem solving video. Thanks Keith. Love your content as always.
@KeithGalli 1 year ago
Glad to hear it, I appreciate your support!! :)
@masked00000 1 year ago
You're literally the best tutor I have seen. I myself am a Data Scientist, but the amount of data science approaches I learn from you is incredible. I started from your channel and always wait for you to post a new video. Hats off. Love from Pakistan.
@kennethstephani692 1 year ago
Great video, Keith!
@xxxihabxxx1 8 months ago
This took me a week to finish all the coding questions. It 10000% helped me a lot to practice everything I learned in your previous pandas crash course. Thanks
@niteshprajapat7918 1 year ago
You are a gem ❤️ the way you explain concepts is next level 🔥🔥
@troy671 1 year ago
Thanks for the video. It is great to see your thinking process even though you are not an expert in pandas.
@netanelmad 1 year ago
Thanks for the video! Would love to see your approach to more non-coding questions specifically :)
@kumaripritika2799 1 year ago
Really helpful video!
@ansekao4516 1 year ago
Great video, please do more like that. Been watching you for a long time
@Lnd2345 1 year ago
Here's a one-liner chained version I've come up with for coding #6:
df = ms_user_dimension.merge(ms_acc_dimension, on='acc_id').merge(ms_download_facts, on='user_id').pivot_table(index='date', columns='paying_customer', values='downloads', aggfunc='sum').reset_index().query('no > yes')
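To see what the chained merge and pivot in the comment above does, here is a minimal sketch on made-up data. The three toy tables and their values are assumptions for illustration only; the table and column names are taken from the comment, and the real StrataScratch tables may contain more columns.

```python
import pandas as pd

# Hypothetical toy tables; only the names come from the comment above.
ms_user_dimension = pd.DataFrame({"user_id": [1, 2, 3], "acc_id": [10, 20, 30]})
ms_acc_dimension = pd.DataFrame(
    {"acc_id": [10, 20, 30], "paying_customer": ["yes", "no", "no"]}
)
ms_download_facts = pd.DataFrame({
    "date": pd.to_datetime(["2020-08-01"] * 3 + ["2020-08-02"] * 3),
    "user_id": [1, 2, 3, 1, 2, 3],
    "downloads": [5, 2, 1, 1, 3, 4],
})

df = (
    ms_user_dimension
    .merge(ms_acc_dimension, on="acc_id")      # attach the paying flag to each user
    .merge(ms_download_facts, on="user_id")    # attach daily downloads
    .pivot_table(index="date", columns="paying_customer",
                 values="downloads", aggfunc="sum")  # one "no" and one "yes" column per date
    .reset_index()
    .query("no > yes")                         # keep dates where non-paying downloads exceed paying
)
print(df)
```

With this toy data only 2020-08-02 survives the filter, because non-paying users have 7 downloads against 1 for paying users on that date.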
@arashomranpour5468 1 year ago
good having you back
@KeithGalli 1 year ago
Good to be back! :)
@a.5214 1 year ago
amazing! we want more of this stuff 👌
@KeithGalli 1 year ago
Appreciate it! More coming soon :)
@udayabhaskar1495 1 year ago
Thank you for this video!👍
@9eartheyes 1 year ago
great video! thank you!
@wiz8058 1 year ago
Great work man!! you're always doing the best.🔥🔥🔥
@KeithGalli 1 year ago
Thank you for the support as always!!
@mekuzeeyo 1 year ago
Thank you for coming back🤗
@KeithGalli 1 year ago
Happy to be back!!
@user-xj9re7gv5g 2 months ago
It is very great. Thank You!
@phoenixcollege6608 1 year ago
makes it easy to understand watching your vid on a friday night and these are the best years of my young life
@expat2010 1 year ago
I really enjoy the real world feel of your videos. Probably now ChatGPT would be a lot faster than searching Stack Overflow or the Pandas docs for those things that one doesn't know by heart.
@zanerios2776 1 year ago
really love the style and format of vid, just subbed
@KeithGalli 1 year ago
Glad you liked it man! Thanks for the sub
@n_12346 1 year ago
Brilliant video! very helpful
@bobbyg603 1 year ago
Glad you're back bro!
@KeithGalli 1 year ago
thanks brother!!
@ranjithraghunathan1267 1 year ago
Thanks Keith
@wahaha108 1 year ago
long time no see keith, welcome back 😀😀
@iamfavoured9142 1 year ago
Welcome back Keith 💃🏻💃🏻
@iamTHIEN013 1 year ago
Hi Keith, thank you so much for these videos. Could you make more videos about Power BI or Tableau? Really really appreciate it.
@pratikpawar336 1 year ago
great video, please make more videos like this
@prof_albert 1 year ago
That was great. Bravo and all of your videos are awesome 🌺👌💞🤩💪
@DendrocnideMoroides 1 year ago
yes please make more videos like this
@mehdismaeili3743 1 year ago
excellent, thanks.
@KeithGalli 1 year ago
You're welcome :)
@finnnelson5472 1 year ago
TY :)
@phsopher 1 year ago
For the fifth problem, pandas has a built-in percentage difference method (pct_change). The solution could be as follows, for example:
sf_transactions['year_and_month'] = sf_transactions.created_at.dt.strftime("%Y-%m")
monthly_revenue = sf_transactions.groupby("year_and_month")["value"].sum().reset_index()
monthly_revenue['pct_change'] = (monthly_revenue.value.pct_change() * 100).round(2)
monthly_revenue[['year_and_month', 'pct_change']]
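A minimal runnable sketch of the pct_change idea from the comment above, on made-up data. The sample rows are assumptions for illustration; the created_at and value column names follow the comment. The sketch selects the value column before summing, which avoids trying to sum the datetime column in recent pandas versions.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the comment above.
sf_transactions = pd.DataFrame({
    "created_at": pd.to_datetime(
        ["2019-01-05", "2019-01-20", "2019-02-10", "2019-03-15"]
    ),
    "value": [100.0, 100.0, 300.0, 150.0],
})

# Label each row with its year-month, then total the revenue per month.
sf_transactions["year_and_month"] = sf_transactions["created_at"].dt.strftime("%Y-%m")
monthly_revenue = (
    sf_transactions.groupby("year_and_month")["value"].sum().reset_index()
)

# pct_change() gives the fractional change relative to the previous row,
# so the first month has no predecessor and comes out as NaN.
monthly_revenue["pct_change"] = (monthly_revenue["value"].pct_change() * 100).round(2)
result = monthly_revenue[["year_and_month", "pct_change"]]
print(result)
```

Here the monthly totals are 200, 300 and 150, so the computed changes are NaN, 50.0 and -50.0.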
@KeithGalli 1 year ago
Oh cool, I didn't know that! Thanks for sharing :). Nice solution 🤠.
@drakkarleon 1 year ago
Yeeeeeeeeyyy!!!! i love your enthusiastic cry of success :D 26:31
@AIdevel 1 year ago
The problem lies in your use of the round function: you're supposed to wrap the whole expression in round and then set the decimals to 2
@jovanjanjic9029 11 months ago
In question #3, Counting Instances in Text, you should add flags=re.I to account for capital letters: len(re.findall(r'\bbull\b', text, flags=re.I))
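A small sketch of the difference the case-insensitive flag makes, following the suggestion in the comment above. The sample sentence is made up for illustration; the pattern and the re.I flag are the ones the comment mentions.

```python
import re

# Hypothetical sample text, not the StrataScratch data.
text = "Bull markets reward the bull. A bullish BULL is still a bull, said the bulldog."

# \b marks word boundaries, so "bullish" and "bulldog" are not counted.
case_sensitive = len(re.findall(r"\bbull\b", text))
case_insensitive = len(re.findall(r"\bbull\b", text, flags=re.I))

print(case_sensitive, case_insensitive)
```

Without the flag only the two lowercase occurrences match; with re.I the capitalized "Bull" and "BULL" are counted as well, for a total of four.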
@jovanjanjic9029 11 months ago
Great video btw!
@user-if1dj7fy2y 1 month ago
Bravo 👏 Lit 🌠 Impressive 👌 ❤ Gratitude 🥳 for your satisfactory Work 💪🚀💯💪
@user-zm6kj7oi3d 1 year ago
you are helping a high schooler out by being back
@KeithGalli 1 year ago
More videos coming soon :)
@fcoatis 1 year ago
Great video Keith. I just got curious: how do you comment out a block of code?
@kinghezzy 1 year ago
Highlight and ctrl+/
@RahmanIITDelhi 1 year ago
Hey Keith, can we access libraries while solving in a real-time exam?
@fantasyxpress7966 11 months ago
Is DSA important for data scientists too, Keith?
@anonviewerciv 1 year ago
That first one and others are SQL problems converted to pandas. I suppose that's a decent way to get basic pd questions. (28:48)
17:20 I know it's more a reference to the stock market terms, but I can't stop thinking of Fallout: New Vegas.
1:11:00 If you have the locations that's just a simple matter of putting it on a map and seeing where it clusters the most.
1:28:00 Context, context, context. Was that the only reduction?
@konstantinpluzhnikov4862 1 year ago
These stratascratch tasks could be solved in SQL. The site provides this option.
@ranjithraghunathan1267 1 year ago
How can I download or copy the raw dataset for each part?
@AIdevel 1 year ago
Replace yes with 1 and no with zero and sum them
@MikeResurrected 1 year ago
Could you actually google for help during a DS coding interview nowadays?
@vanshmalik1446 1 year ago
Hey! Does anyone know of more data analysis pay-after-placement programs accepting applications from all over the globe?
@balakumar.n4891 1 year ago
super
@manphu2515 1 year ago
Thanks so much for the video, I learn a lot from you. And you are super cute 😍
@YunusFidan_ 1 year ago
Noice!
@konstantinpluzhnikov4862 1 year ago
LifeHack: if you are short of money but want to use a service, use a VPN from a relatively poor country. The result will be interesting.
@DendrocnideMoroides 1 year ago
did you ever use it? and on which website?
@jovanjanjic9029 11 months ago
Your solution for the Probability of Having a Sister question is not correct. We know for sure that the random girl must be from the [1, 2, 3, 4] part of the dataset, which amounts to 0.7. We should divide the probabilities for 1, 2, 3, 4 by 0.7 to get the probabilities that the girl is from each of these families. She theoretically can't be from families with 0 and 5 children. Essentially, you are counting the possibility of her being in families with 0 and 5 children, even though it's impossible. (In practical terms, you are needlessly ignoring info you already have.) So the correct solution is: 0.25/0.7 x 0 + 0.2/0.7 x 0.5 + 0.15/0.7 x 0.75 + 0.1/0.7 x 0.875 = 0.42857, which is 0.43 when rounded.
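The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch of the commenter's calculation, not an official answer: the family-size probabilities are the ones quoted in the comment, and the sketch assumes each sibling is independently a girl with probability 0.5.

```python
# Family-size probabilities quoted in the comment above (1 to 4 children).
p_children = {1: 0.25, 2: 0.20, 3: 0.15, 4: 0.10}


def p_sister(n_children: int) -> float:
    """Chance that a girl with n_children - 1 siblings has at least one sister.

    Assumes each sibling is independently a girl with probability 0.5.
    """
    return 1 - 0.5 ** (n_children - 1)


# Renormalise over the family sizes the girl could belong to (they total 0.7).
total = sum(p_children.values())
answer = sum(p / total * p_sister(n) for n, p in p_children.items())

print(round(answer, 5))
```

The per-family sister probabilities come out as 0, 0.5, 0.75 and 0.875, matching the factors in the comment, and the weighted sum is 3/7, roughly 0.42857.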
@meujie8835 1 year ago
Hi, I'm Jiemeu and I love your channel. I hope to discuss business cooperation with you.....
@doulaishamrashikhasan8425 1 year ago
you disappeared again 😢
@KeithGalli 1 year ago
My apologies! I have a video that I'm finalizing the editing for. It should be out in the next 3-4 days and then I'm going to try to be more consistent!!
@YaIdcReportMe 29 days ago
Probably not a good use of your time to watch this guy struggle with coding questions for over an hour
@ratkillerthe 1 year ago
I solved the Bathrooms/Bedrooms problem with:
cols_of_interest = airbnb_search_details[['city', 'property_type', 'bathrooms', 'bedrooms']]
property_results = cols_of_interest.groupby(['city', 'property_type']).agg(avg_bathrooms=('bathrooms', 'mean'), avg_bedrooms=('bedrooms', 'mean')).reset_index()
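The named-aggregation approach from the comment above can be tried on a tiny made-up table. The rows here are assumptions for illustration; the DataFrame and column names are the ones used in the comment.

```python
import pandas as pd

# Hypothetical rows; only the names come from the comment above.
airbnb_search_details = pd.DataFrame({
    "city": ["NYC", "NYC", "NYC", "LA"],
    "property_type": ["Apartment", "Apartment", "House", "House"],
    "bathrooms": [1, 2, 3, 2],
    "bedrooms": [1, 3, 4, 2],
})

cols_of_interest = airbnb_search_details[
    ["city", "property_type", "bathrooms", "bedrooms"]
]

# Named aggregation: output_column=(source_column, aggregation_function).
property_results = (
    cols_of_interest.groupby(["city", "property_type"])
    .agg(avg_bathrooms=("bathrooms", "mean"), avg_bedrooms=("bedrooms", "mean"))
    .reset_index()
)
print(property_results)
```

Each (city, property_type) pair becomes one row, so the two NYC apartments collapse into a single row with 1.5 average bathrooms and 2.0 average bedrooms.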