25 Nooby Pandas Coding Mistakes You Should NEVER make. 

Rob Mulla
182K subscribers
271K views

In this video I go over my list of 25 mistakes commonly made by beginners learning pandas in Python. Pandas is a great tool, but there are some pitfalls to avoid!
Shoutout to mCoding who inspired the idea for this video! / mcodingwithjamesmurphy
Follow me on twitch for live coding streams: / medallionstallion_
My other videos:
Speed Up Your Pandas Code: • Make Your Pandas Code ...
Intro to Pandas video: • A Gentle Introduction ...
Exploratory Data Analysis Video: • Exploratory Data Analy...
Working with Audio data in Python: • Audio Data Processing ...
Efficient Pandas Dataframes: • Speed Up Your Pandas D...
* YouTube: youtube.com/@r...
* Discord: / discord
* Twitch: / medallionstallion_
* Twitter: / rob_mulla
* Kaggle: www.kaggle.com...
#python #pandas #datascience

Published: Sep 27, 2024

Comments: 515
@viewsfromthechris7810 2 years ago
I need to implement the chaining methods and using functions into what I do, much easier to use and read. Great video as always.
@robmulla 2 years ago
Totally. Just those two things alone are huge! Glad you enjoyed the video.
@texloch1401 1 year ago
Usually these videos address REALLY nooby mistakes that any general programmer already avoids. THIS video however ACTUALLY addresses library FUNCTIONALITY and discusses the tools that a programmer may be unaware of to increase readability and efficiency. Rob, my good sir, you just earned a sub.
@robmulla 1 year ago
So happy to have you as a sub. Even happier to read such a kind comment. Looking forward to more videos like this in the future.
@murphygreen8484 1 year ago
I couldn't have said that better myself. I am self taught and definitely learned new tricks here. You have also earned my sub!
@javiercmh 1 year ago
Same! I had no idea many of these even existed!!
@xinaesthetic 1 year ago
@robmulla I am a fairly experienced programmer, not so much with Python, but I have a few things I might want to use Pandas for at some point and this has given me a bit of a taste for features that I look forward to trying.
@akmalmir8531 2 years ago
00:18 #1. Writing into csv with unnecessary index
00:53 #2. Using column names which include spaces
01:25 #3. Filter dataset like a PRO with QUERY method
01:44 #4. Query strings with the @ symbol to easily reach variables
02:07 #5. "inplace" could be removed in future versions; better to explicitly overwrite modifications
02:35 #6. Better vectorization instead of iteration
03:01 #7. Vectorized methods are preferable to the apply method
03:30 #8. df.copy() method
04:08 #9. Chaining is better than creating many intermediate dataframes
04:28 #10. Properly set column dtypes
05:01 #11. Using booleans instead of strings
05:25 #12. pandas plot method instead of matplotlib import
05:45 #13. pandas str.upper() instead of apply, etc.
06:10 #14. Use a data pipeline once instead of repeating many times
06:41 #15. Learn the proper way of renaming columns
06:59 #16. Learn the proper way of grouping values
07:31 #17. Proper way of complex grouping of values
08:01 #18. Percent change or difference can be implemented with a built-in method
08:25 #19. Save time and space on large datasets with pickle, parquet, feather formats
08:58 #20. Conditional formatting in pandas (like in Microsoft Excel)
09:22 #21. Use suffixes while merging TWO dataframes
09:48 #22. Check that a merge succeeded with validation
10:13 #23. Wrapping expressions so they are readable
10:33 #24. Categorical datatypes use less space
10:55 #25. Duplicate columns after concatenating, code snippet
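Tips #3 and #4 in the list above can be sketched roughly as follows. This is only an illustration: the DataFrame, its column names, and the `min_year` variable are made up here and are not taken from the video.

```python
import pandas as pd

# Hypothetical data; snake_case column names work directly inside query strings.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy", "Dee"],
        "year": [1975, 1990, 2001, 2010],
        "time_mins": [12.5, 9.0, 15.2, 8.1],
    }
)

min_year = 1980  # local variable, referenced with @ inside the query string

# Boolean-mask version with .loc
with_loc = df.loc[(df["year"] > min_year) & (df["time_mins"] < 10)]

# Equivalent filter written with query (tips #3 and #4)
with_query = df.query("year > @min_year and time_mins < 10")

print(with_query)
```

Both versions select the same rows; the query form mainly saves typing and repeated `df[...]` references.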
@robmulla 2 years ago
Thanks for making this!
@akmalmir8531 2 years ago
@robmulla I wish I had commented better, as English is not my native language. Thank you for bringing us valuable tutorials that save us time and energy! I wish I could help and learn from you more.
@kongson14 2 years ago
egg bro
@PGhai 1 year ago
thanks, I like no 4
@bonumonu5534 1 year ago
This needs to be pinned
@vishnurj6207 1 year ago
Please keep doing this. No additional jargon; crisp, straight-to-the-point explanations are what is required. Nobody needs a 10 hour tutorial. Thank you for this.
@robmulla 1 year ago
I'll try my best! I do like trying to cram a ton of information into a short format, but these videos take a while to create. I totally copied the format from mCoding (check out the channel if you haven't already)
@jelmermulder7276 1 year ago
I thought I was pretty good in Pandas, but you gave me so many new things to improve. HUGE thank you!
@robmulla 1 year ago
Glad I could help! I'm constantly learning better ways to do things in pandas myself.
@olyaagapova989 1 year ago
I was thinking that I was pretty bad, but surprisingly I usually only make 2 mistakes from the video (which is a cool chance to improve). I just love such videos because not only they help to improve your skills, but also to be realistic about your expectations and ambitions. Thanks for the video, Rob!
@DeadLine171 1 year ago
I have been working 2 years now with pandas and I can strongly affirm that I have made like 70% of those bad practices, appreciate a lot your video!
@robmulla 1 year ago
Thanks for commenting. Honestly I still make many of them to this day.
@DataCraftsman 2 years ago
I feel personally attacked. Thanks so much for releasing this. I knew my code was bad, but not THIS bad.
@robmulla 2 years ago
Haha. With coding we all are learning and getting better every day. Me included. Thanks for watching!
@alberttu8120 1 year ago
These are fantastic refactoring suggestions.
@singsinghai1505 1 year ago
The pandas query function does not outperform the loc method. In fact, it is sometimes much slower when your query/data is so big. We industry users will utilize the loc method for quick EDA. Query might be useful when you have a scheduled cron
@robmulla 1 year ago
Yea. Query isn’t for speed of processing but speed of writing the code.
@ryantakers 1 year ago
I'm currently working on my first major pandas project and I reckon that I may have done around 15/25 of these 'mistakes'. Looks like I have some optimisation to do over the coming days!
@robmulla 1 year ago
We all have to start somewhere. I didn't learn many of these until I had been using pandas for years.
@ladiesperfume 2 years ago
Wow dude! You are single handedly responsible for my data science growth. PLEASE keep making more of these videos I really appreciate it.
@robmulla 2 years ago
Wow! I love hearing feedback like this. I'll keep making videos if you all keep watching! :D
@iReaperYo 1 year ago
One of the best videos I've seen on Pandas! So glad someone prominent enough is advocating for method chaining and pandas methods!
@iReaperYo 1 year ago
The 'Query' method in particular is relatively unknown. In conjunction with not using 'snake case' this leads to beginners being very inefficient at code due to not being able to use dot syntax. I am just an intermediate level so I can relate to many of these mistakes. It goes as deep as university however. They do not teach clean, efficient code at all!
@robmulla 1 year ago
Glad you enjoyed it! I confess I don't use chaining nearly as much as I should.
@kaymaqsood8920 10 months ago
Rob, thank you for all the time and energy you have put in for us. Would appreciate an updated video on "Exploratory Data Analysis", maybe expanding on your year-old one. Thank you again!
@leaky3955 1 month ago
I had no experience with Pandas before joining a team where I need to work with it a lot. Have been learning as I go and it feels like the perfect time to see this video. I have enough time under my belt to have made or inherited code with many of these mistakes. With that context, I absorbed so much from what you shared. Thank you for helping me improve. I’m excited to refactor and apply what I learned!
@magisterumbrae 1 year ago
This can be, some of my first times commenting in youtube after years of usage. This video was INCREDIBLY USEFUL! There's a lot of my previous team members did on scripts and sometimes are complicated to maintain or create new ones following the same logic. This covers exactly what they used and what is the best option to rewrite it and make it more understandable. Thank you so much for this godly information.
@robmulla 1 year ago
You're very welcome! I really appreciate the positive feedback. I’ll try to keep making helpful videos like this. Share with your friends in the meantime!
@FrocketGaming 2 years ago
This video rocked me. I've been using python for a few months and watching this video made me bust out my laptop so I could try all of these items out. Thank you for this.
@robmulla 2 years ago
So glad you found it helpful. Share with a friend!
@NERGYStudios 1 year ago
Learned more about Pandas in this video than a whole many videos worth hours combined. Seriously, thank you.
@popnitro 2 months ago
I've had little to no formal training. These tips are amazing and concise. Thank you so much.
@krmunoz2169 1 year ago
Dude I've worked with pandas for 7 years and learned some new tricks, thanks a lot!
@robmulla 1 year ago
Great to hear! You've been working with it longer than I have. Please share my channel with any friends you think might also learn from it.
@joaomurilopalonefauvel2843 2 years ago
Matt Harrison's "Effective Pandas: Patterns for Data Manipulation" is one of the best resources I've read on idiomatic pandas.
@robmulla 2 years ago
I really need to get myself a copy! He knows his stuff for sure.
@MrEo89 1 year ago
He has a great video (series?) on effective pandas also!
@lord_voldemort44 4 months ago
ty i will look into this book
@alexanderreznik1700 2 years ago
I have used the Pandas lib for more than 2 years, but today I learned something new! Thank you, man!
@robmulla 2 years ago
Glad you learned something new! Share with anyone else you think might appreciate it!
@我想想-e5d 1 year ago
The space need to be avoid part is so true! But wait a second, every time I face the space but not underscore is from others data, so I think what we actually need is how to deal with the space condition.(Which is a pain of journey)
@Saareem 1 year ago
Maybe rename all the columns with versions without a space. Like, you replace all the spaces with an underscore. df.rename can take dictionaries or even a mapper function so this is easy to do. Using a dictionary is preferable as you can just reverse map it, if you want to use the columns with spaces in them in the end.
@robmulla 1 year ago
Good point. In most cases it can be done with a list comprehension one-liner!
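The renaming ideas in this thread could look something like the sketch below. The column names are invented for illustration; the dictionary is kept around so the original names can be restored later, as suggested above.

```python
import pandas as pd

# Hypothetical frame with column names that contain spaces.
df = pd.DataFrame({"First Name": ["Ann"], "Time Mins": [12.5], "year": [1990]})

# Option 1: build an old -> new mapping and pass it to rename.
to_snake = {col: col.strip().lower().replace(" ", "_") for col in df.columns}
clean = df.rename(columns=to_snake)

# Option 2: the list comprehension one-liner, applied to a copy.
one_liner = df.copy()
one_liner.columns = [c.strip().lower().replace(" ", "_") for c in one_liner.columns]

# Reversing the mapping restores the original names, e.g. before exporting.
from_snake = {new: old for old, new in to_snake.items()}
restored = clean.rename(columns=from_snake)

print(list(clean.columns))
```

The dictionary route is slightly more verbose, but it makes the round trip back to the original headers trivial.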
@JakeStetter-wo6jr 5 months ago
Really enjoyed how fast this content came. I felt like it was a great speed to keep me engaged. I usually find these types of videos boring.
@scottbrewer474 2 years ago
Found lots of favorite annoyances and learned a few new tricks! I'll add a shout-out to the ".pipe()" method to allow for wrapping all your transforms in a single statement when a single .method can't cover the required transform. An added bonus of "pipe()" - since it's using user defined functions to do the transforms, you can add decorators to automatically print out metadata on the resulting transform steps to get a quick insight into potential bugs.
@robmulla 2 years ago
Oh. Great one. I forgot to add pipe and assign in this video but wish I did.
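A rough sketch of the `.pipe()` plus decorator idea described above. The `log_shape` decorator, the step functions, and the column names are all hypothetical and only show the pattern.

```python
import functools

import pandas as pd


def log_shape(func):
    """Hypothetical decorator: print the frame shape before and after a step."""

    @functools.wraps(func)
    def wrapper(df, *args, **kwargs):
        result = func(df, *args, **kwargs)
        print(f"{func.__name__}: {df.shape} -> {result.shape}")
        return result

    return wrapper


@log_shape
def drop_missing(df):
    # Remove rows that contain any missing value.
    return df.dropna()


@log_shape
def add_speed(df, distance_col="distance_km", time_col="time_hr"):
    # Add a derived column without mutating the input frame.
    return df.assign(speed_kmh=df[distance_col] / df[time_col])


raw = pd.DataFrame({"distance_km": [10.0, 20.0, None], "time_hr": [1.0, 2.0, 3.0]})

# Each step is a plain function, so the whole transform reads as one chain.
processed = raw.pipe(drop_missing).pipe(add_speed)
```

Because every step prints its input and output shape, a step that silently drops or duplicates rows shows up immediately in the log.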
@digitsphinx 10 months ago
oh wow the quality and clarity is worth subscribing! thank you !
@JustCrateIt 1 year ago
I can't believe how good this video is. I love your no-nonsense delivery; I don't have time at work to watch a 4-hour "intro" video. Keep it up!
@efi3825 9 days ago
Oh man, I am making so many of these mistakes. Honestly, this is a great checklist to improve my clean coding.
@Lewstars 2 years ago
I really think this should be written up in a medium blog article. Would be awesome to refer to.
@robmulla 2 years ago
That’s a good idea. I really want to make blogs for all my videos but I don’t have the time. Maybe someday
@AlexTheAnalyst 1 year ago
I was genuinely worried I was making noob mistakes in Pandas...
@robmulla 1 year ago
😂 Hey Alex! Now I'm dying to know... did you have any reason to be worried?
@dedoseis 10 months ago
Dear Rob, I'm a total beginner in Python and Pandas. From what I understand, the warning at 3:30 is not about making a copy of sliced data, but rather about not using the .loc method and using "direct assignment" for columns (or whatever it's called). I could be wrong, but this is what I've gathered from reading the documentation and encountering a similar warning in my code. Thanks for your valuable content. It has been a great help
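For context on the warning discussed above: it is usually raised when assigning into a frame that pandas thinks may be a slice of another frame, and whether it appears at all depends on the pandas version. One way to make the intent unambiguous is sketched below, with made-up column names; take the subset with `.loc` and an explicit `.copy()` before adding columns.

```python
import pandas as pd

df = pd.DataFrame({"year": [1975, 1990, 2001], "score": [1.0, 2.0, 3.0]})

# Explicit copy of the filtered rows: later assignments clearly target
# a new, independent DataFrame rather than a slice of df.
recent = df.loc[df["year"] > 1980].copy()
recent["score_doubled"] = recent["score"] * 2

print(recent)
print(df.columns.tolist())  # the original frame has no new column
```

If the goal is instead to change the original frame, assigning through `df.loc[mask, "column"] = value` on `df` itself is the usual alternative.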
@haskellbear 5 months ago
I'm an experienced developer looking to get familiar with Pandas. I found this video very valuable.
@emily2e2e 1 year ago
This is awesome, I’ve been wanting to know what are the better ways to write my code and why. Please continue to make these videos.
@robmulla 1 year ago
Wow! Thanks so much Emily. Really appreciate the feedback and super thanks!
@hwlee03a 1 year ago
Oh god. I clicked on this video just to confirm that this is one more overly exaggerated self-confident dude trying to teach newbies of 2 weeks experience. After watching this, this is god damn life changing. As an engineer focusing on fluid dynamics and floater response, I use pandas daily basis. Out of 25, I didn’t know approximately 20. Every single person who has any plan to use pandas must watch this. Awesome!
@protohale 2 years ago
I'm so guilty of number 8! Thank you for this!
@robmulla 2 years ago
I’ve made every one of these mistakes at some point so I know how you feel. Thanks for watching!
@julsmanbr8152 1 year ago
Awesome stuff. I've been using pandas for over 4 years, but it never occurred to me to start using the query method instead of loc (despite me finding it tiresome to keep repeating "df" all over the place when using loc). I also appreciate the quick format. You see YouTubers taking too long to say nothing at all, so congrats on actually going through 25 tips in 10 minutes. You got yourself a sub!
@santiagoperman3804 2 years ago
I do several of these and never imagined Pandas has styling. Time to rewrite and share with my peers.
@robmulla 2 years ago
My mind was blown when I found out about the styling and I use it a lot now. Please do share with others who you think might find this helpful.
@omagodourado 1 year ago
This video made me realize i have still a long road ahead in Pandas. Thanks! Just subscribed ;D
@robmulla 1 year ago
Thanks for the sub! We all start somewhere, but you'll pick it up quickly in no time.
@TravisGore-ep4yk 1 year ago
I can't believe I watched this whole video and only 2 of them were things I didn't know about! Thank you for sharing!
@checher100 1 year ago
Awesome video! I work with Pandas for +3 years and learned a lot here! Thanks
@robmulla 1 year ago
Happy to hear it. Tell your friends!
@julians.2597 2 years ago
9:13 another pet peeve, though this one is more important than the last one. Do not use backslashes. ever. well, not _never_, use them when writing a `with` statement with more than two context managers. But otherwise, don't. I'll quote the `Black` (the formatter) documentation: Backslashes and multiline strings are one of the two places in the Python grammar that break significant indentation. You never need backslashes, they are used to force the grammar to accept breaks that would otherwise be parse errors. That makes them confusing to look at and brittle to modify. This is why Black always gets rid of them.
@robmulla 2 years ago
Good point. The backslashes are an old habit I've been trying to stop using. We all are learning constantly!
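As a small illustration of the point about backslashes, a method chain can be wrapped in parentheses so each step sits on its own line with no line-continuation characters. The data and column names below are invented for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"team": ["a", "a", "b", "b"], "points": [3, 1, 0, 3], "played": [1, 1, 1, 0]}
)

# The surrounding parentheses let the expression span several lines,
# so no trailing backslashes are needed.
summary = (
    df
    .query("played == 1")
    .groupby("team", as_index=False)
    .agg(total_points=("points", "sum"))
    .sort_values("total_points", ascending=False)
    .reset_index(drop=True)
)

print(summary)
```

Steps can then be commented out or reordered one line at a time, which is harder to do safely with backslash continuations.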
@narudh 1 year ago
some great tips here. i usually chain with \ and i didn't know a query method exists!! guess you learn everything new all the time!
@robmulla 1 year ago
Glad you learned something new! Cheers.
@YassFuentes 1 year ago
Hey, Rob! Super video this one. I myself am Sr. DS working each day intensively with pandas, I will implement many of the tips you show! Thanks a million :)
@robmulla 1 year ago
Awesome to hear! I'm still learning new tricks with pandas every day.
@DiegoRamirez-kv3fq 1 year ago
When you mention the slice warning sometimes you don't care about the original data frame so it doesn't matter if you modified it
@robmulla 1 year ago
That’s true. But I don’t like seeing the warnings. And if you don’t need the rest of the data you can just overwrite it with the slice?
@TimoTalksTech 1 year ago
found your channels few days ago and man you have some epic content . The noob mistakes here are the exact way most tutorials teach you..just wondering why the hell the non noob ways are not taught as they are easier and shorter and the syntax makes more sense... thank you for this video
@robmulla 1 year ago
Glad you like them! I’m trying to continue to make more stuff like this so keep watching!
@nickolastradess 2 years ago
Have not, and will not make any of these mistakes because I’ve seen your “A Gentle Introduction to Pandas Guide” !!
@PMU004 2 years ago
Trueeeeee
@robmulla 2 years ago
Love it! Thanks nick.
@garyfritz4709 2 years ago
Where, please? I found your twitter feed, and lots of “gentle introductions” from other people, but not yours.
@robmulla 2 years ago
@garyfritz4709 here is the link ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-_Eb0utIRdkw.html
@garyfritz4709 2 years ago
@robmulla Aha. I was googling out on the web, and it didn't find THAT video in YT. Merci!
@thaynangamarano3340 1 year ago
I started to watch your videos recently, and from now on I'm doing the chaining and putting each function in "one row" to make the data cleaner, and also, the query method, so powerful and simple, I was used to replicate the dataframe with the column and value searched to filter my df. You are boosting my studies! Thanks for that!
@artemissrijan473 2 months ago
This video is too damn good, I would love to find more videos like this.
@spaceyfounder5040 1 year ago
Oh man, that guide is pro! Thanks, gonna apply all of that when refactoring my project!
@robmulla 1 year ago
Glad it helped! Tell a friend!
@shivangagarwal8332 2 years ago
Excellent points! Learned new stuff that a lot of tutorials don't explicitly teach.
@robmulla 2 years ago
Glad it was helpful! Thanks for watching and please share with others.
@SamusUy 1 year ago
Regarding the 'inplace' comment at 02:07 there's a very valid and very useful reason to prefer that and it's memory usage. `df = df.reset_index()` or anything similar creates an entire copy of the dataset before replacing it with the original and for extremely big data that is a problem, it may get over the physical memory available and have the OS kill the script.
@robmulla 1 year ago
Interesting. I’ve heard this but then also thought it was debunked. I think the fact that the pandas core developers want to remove inplace gives good reason to try and avoid using it.
@SamusUy 1 year ago
@robmulla I guess it's more "functional style" to do it like they want but I recently had this problem with the memory when creating copies and I solved it by using 'inplace' (Python 3.7 and Pandas 1.3.5 if it matters)
@robmulla 1 year ago
@SamusUy good to know!
@nikhhiilreddi1371 2 years ago
Extremely underrated channel Extremely helpful
@robmulla 2 years ago
Thanks Nikhhilil!
@fizipcfx 2 years ago
This video is literally a gem
@robmulla 2 years ago
Glad you liked it Fizip. Hopefully you learned a thing or two that will help you write better code!
@fizipcfx 2 years ago
@robmulla Thank you for your reply. I am thankful for your content.
@djangoworldwide7925 1 year ago
As an R user, I would've never thought of doing all these mistakes. R is naturally a vectorized language. Use R.
@robmulla 1 year ago
R indeed is good for what it is. I'm more of a python fan because of everything that you can do with machine learning and non-data related tasks.
@piotrkulinski922 4 months ago
OMG! I had to rest after first 10. So huge dose of information. Thanks.
@bendirval3612 2 years ago
Oi! There were several of those I didn't know. I wouldn't have thought I was a noob, but I guess we all have a bit of that in us. Thanks for the video!
@robmulla 2 years ago
Glad you learned something new. I find I’m always learning something new with python and data science. That’s why I love it so much.
@avirajankitjain256 2 years ago
Dude, Amazing video apparently clear the concept.
@robmulla 2 years ago
Glad you think so! Share with your friends!
@gregglind 2 years ago
Releasing a notebook showing all these tips would be a great benefit to the community. The `.style()` trick at @9:18 is amazing.
@robmulla 2 years ago
If this video gets 100k views I’ll share the notebook cringe 😬!
@peterappel9154 7 months ago
@robmulla It currently has 241k views 😉
@Singularitarian 1 year ago
Very illuminating video! I learned a lot quickly.
@robmulla 1 year ago
Thanks for the feedback Daniel!
@flusyrom 2 years ago
Wow, very useful - a true "tour de force" for better Pandas code. THX for this !
@robmulla 2 years ago
Glad it was helpful! Please consider sharing it with anyone else you think would benefit from watching.
@erikyoung5139 1 year ago
I wish I had this video 6 years ago. Thank you.
@robmulla 1 year ago
Glad you found it helpful!
@artemaleksandrin7582 1 year ago
Fun fact about the *query* method that wasn't mentioned here: inside *query* you actually (!) reference pandas columns, so you can do things like:
`df.query('Name.isna()')` to select rows where Name is *NaN*
`df.query('Name.str.contains("John")')` to filter all rows where Name contains John
And even something crazy like `df.query('Price.rolling(7,1).mean() > Price.mean()')` to take rows whose rolling mean is above the overall average
@robmulla 1 year ago
🔥 great tips! Almost needs a video specifically on this.
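The query tricks from the comment above might be tried out along these lines. The data is invented, the rolling window is shortened to 2 to fit the tiny frame, and `engine="python"` and `na=False` are cautious additions that the original comment did not use.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John Smith", None, "Ann Lee", "Johnny Ray"],
        "Price": [10.0, 20.0, 30.0, 40.0],
    }
)

# engine="python" is an assumption made to be safe: method calls such as
# .str.contains may not be supported by the numexpr engine.
missing_name = df.query("Name.isna()", engine="python")

# na=False treats the missing name as "no match" instead of NaN.
johns = df.query('Name.str.contains("John", na=False)', engine="python")

# Rows whose 2-row rolling mean (min_periods=1) exceeds the overall mean of 25.
above_avg = df.query("Price.rolling(2, 1).mean() > Price.mean()", engine="python")

print(missing_name, johns, above_avg, sep="\n\n")
```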
@CodeEmporium 1 year ago
Nice video! I have been using pandas for years and still run into these issues :)
@robmulla 1 year ago
Thanks! Glad you enjoyed the video. I really enjoy your videos too.
@ClearVista 2 years ago
Learned tons with this. Short and succinct. New subscriber.
@robmulla 2 years ago
Thanks for subscribing!
@jjhendriks8652 1 year ago
Merge validator! Excellent thanks!
@robmulla 1 year ago
👍
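For readers curious about the merge validation mentioned here, a minimal sketch with invented frames: `suffixes` labels overlapping column names, and `validate` makes pandas raise an error when the key relationship is not what you expected.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
right = pd.DataFrame({"id": [1, 2, 2], "value": [100, 200, 250]})  # id 2 is duplicated

# suffixes replaces the default _x / _y on overlapping column names.
merged = left.merge(right, on="id", how="inner", suffixes=("_left", "_right"))

# validate checks the key relationship and raises MergeError if it does not hold.
try:
    left.merge(right, on="id", validate="one_to_one")
    is_one_to_one = True
except pd.errors.MergeError:
    is_one_to_one = False

print(merged)
print(is_one_to_one)
```

Without `validate`, the duplicated key would silently produce an extra row in the result.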
@mschuer100 1 year ago
Rob, as always, fantastic video. I have to admit, i get caught on some of those mistakes so it is great to have you point out and make suggestions on how to correct them. Thanks for sharing. Much appreciated.
@robmulla 1 year ago
I fall into these a lot too! We can all get better, glad you found the video helpful.
1 year ago
At 6:23 (#14) you're returning the dataframe, but you're also modifying it in place. Having a return there gives the impression that the original dataframe isn't modified, specially if you also assign it to itself later. It ties back to #5.
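One possible way to address the concern above is to have the processing function work on a copy, so the returned frame is the only thing that changes. The function body and column names here are hypothetical, not the code shown in the video.

```python
import pandas as pd


def process_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return a processed copy; the caller's DataFrame is left untouched."""
    out = df.copy()
    out["name"] = out["name"].str.upper()
    out["mins_per_km"] = out["time_mins"] / out["distance_km"]
    return out


raw = pd.DataFrame(
    {"name": ["ann", "bob"], "time_mins": [50.0, 60.0], "distance_km": [10.0, 10.0]}
)
clean = process_data(raw)

print(raw.columns.tolist())    # unchanged
print(clean.columns.tolist())  # includes the new column
```

The extra copy costs memory on large frames, so this is a trade-off between safety and footprint.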
@aspeno5613 1 year ago
Thanks Rob! I just made my first Kaggle notebook and I think I made all 25 of these mistakes 😂
@РоманЩурко-х4й 1 year ago
Great video! I wanted to add to #7, in case someone finds it helpful: if you need to apply a function to several values in a row, one of the fastest solutions is numpy.vectorize. Something like:
def divide(num, denom):
    if denom == 0:
        return 0
    else:
        return num / denom
So instead of doing
df["div"] = df.apply(lambda row: divide(row["value1"], row["value2"]), axis=1)
you go with
df["div"] = np.vectorize(divide)(df["value1"], df["value2"])
@robmulla 1 year ago
Great tip! np.vectorize can be really handy. I think your example could be vectorized without having to use it though.
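To make the exchange above concrete, here is a sketch comparing `np.vectorize` with one fully vectorized alternative. The data is made up, and `otypes=[float]` is an addition: without it, `np.vectorize` infers the output dtype from the first result, which can matter when the helper sometimes returns an integer.

```python
import numpy as np
import pandas as pd


def divide(num, denom):
    """Scalar helper from the comment: return 0 when the denominator is 0."""
    if denom == 0:
        return 0.0
    return num / denom


df = pd.DataFrame({"value1": [1.0, 4.0, 9.0], "value2": [2.0, 0.0, 3.0]})

# np.vectorize still calls divide once per element under the hood.
df["div_vectorize"] = np.vectorize(divide, otypes=[float])(df["value1"], df["value2"])

# Fully vectorized alternative: turn zero denominators into NaN, divide,
# then fill the resulting NaN with 0. Note that fillna would also replace
# NaN that came from missing inputs.
safe_denom = df["value2"].where(df["value2"] != 0)
df["div_where"] = (df["value1"] / safe_denom).fillna(0.0)

print(df)
```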
@РоманЩурко-х4й 1 year ago
@robmulla yeah) just couldn't come up with anything else))
@MariaSaleem-gi4uj 6 months ago
As a beginner this video made me learn some basic concept about pandas. thanks
@b16ftw 1 year ago
lots of good info! thank you!
@robmulla 1 year ago
Glad you learned from it!
@rafaelcaballeroroldan9582 1 year ago
Thanks for the video!! A small comment about number nine, creating multiple intermediate dataframes. I understand that this can be costly in terms of memory, but I also think it can be nice for debugging and understanding during the development phase. Moreover, reusing the same name 'df' again and again can be prone to errors if you have different operations in different cells and you are 'playing' with skipping some of them to see the effect, because you don't know which 'df' you are actually taking as input.
@robmulla 1 year ago
Good point! It really depends on what you're doing and the time it takes to develop sometimes is more important than the code itself. However, once you are done debugging then changing it to using chaining methods is typically preferred.
@adrianmuresan7764 2 years ago
Thank you! The .diff method is a lifesaver when computing velocities. The advice on not using inplace is excellent i got into various troubles because of it but i thought that's what the "experienced guys" do.
@robmulla 2 years ago
Thanks for watching. inplace is very tricky. Diff method is really powerful, and there are parameters you can use within it depending on your use case.
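As an illustration of the velocity use case mentioned above, `diff` can be applied to both the position and the time column, which also copes with uneven time steps. The column names and numbers are invented.

```python
import pandas as pd

# Hypothetical track with an uneven final time step.
track = pd.DataFrame(
    {"time_s": [0.0, 1.0, 2.0, 4.0], "position_m": [0.0, 2.0, 6.0, 10.0]}
)

# diff() subtracts the previous row from each row; the first row has no
# predecessor, so its velocity is NaN.
track["velocity_m_s"] = track["position_m"].diff() / track["time_s"].diff()

print(track)
```

`diff` also accepts a `periods` argument for differences over more than one row.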
@joeymea 1 year ago
1. Using Pandas. It has always been more trouble than it's worth - at least in 90% of the places I encounter it at my job.
@robmulla 1 year ago
Interesting. What do you prefer?
@karlduckett 10 months ago
This was great! Just what I needed :)
@smiley-wu1kn 2 years ago
This is amazing! Thanks a lot.
@robmulla 2 years ago
Glad you like it!
@dimasamchuk4733 1 year ago
the last 5 were cool! thank you
@robmulla 1 year ago
Glad you found them helpful.
@sayantanghosh6619 1 year ago
I loved this to s be to my students. You did a great job in a short video!
@robmulla 1 year ago
Thank you so much! It's hard to make it short but is worth it in the end.
@buraktiras93 1 year ago
Rob, you're an amazing person
@robmulla 1 year ago
Thanks!
@HitAndMissLab 1 month ago
phenomenal video. real learning accelerator.
@shreyaroraa2234 1 year ago
Great video for new users not knowing tips and tricks.. Wish you shared the code also to keep it handy for reference
@robmulla 1 year ago
Thanks for watching. I don’t think I kept the code unfortunately
@reneulloa2647 1 year ago
Rob, amazing video and intuitive. Happy to subscribe!
@willykitheka7618 1 year ago
Hey Rob! You got me on that one right off the bat! I write a file to csv and when I load it back in, I get an 'unnamed' column and I wonder why....then I have to drop the column. 🤐Unnecessary work! Thanks a heap!
@robmulla 1 year ago
That's good to hear that you learned something new only a few seconds into the video :D - if you enjoyed it please share it on social or with any friends who might learn from it.
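The 'unnamed' column described above can be reproduced with a quick round trip. This sketch writes to an in-memory buffer instead of a file, and the data is made up.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Bob"], "year": [1990, 2001]})

# Default to_csv also writes the index, which comes back as "Unnamed: 0".
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False leaves the index out, so the round trip keeps the same columns.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_clean = pd.read_csv(without_index)

print(reloaded_with_index.columns.tolist())
print(reloaded_clean.columns.tolist())
```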
@siqueirapaty 1 year ago
Great video. Thank you for being so direct and giving us valuable tips ☺
@robmulla 1 year ago
Glad you liked it! Thanks for giving feedback. Share the video with anyone else you think might also like it.
@richjaxxon 2 years ago
Great video. Very helpful. Please keep making more like this
@robmulla 2 years ago
Appreciate that. I plan to!
@Handsdownification 1 year ago
#9 : I dont wanna, i like having legible transformations, but i'm aware its saving memory
@robmulla 1 year ago
It all depends on what you're trying to achieve so I can respect that.
@SolidBuildersInc 1 year ago
You have turned my Dictionary into a Pamphlet 😂😂😂 Thank you
@goldmemeber2000 2 years ago
I feel like such an idiot, I wasted so much time writing inefficient code for my papers, if only I had known these. Better late than never though, thanks for the vid!
@robmulla 2 years ago
Never feel like an idiot when you learn something new! I learned many of these recently too. Thanks for watching!
@deepakramani05 2 years ago
Another awesome, useful video, Rob. Thank you.
@robmulla 2 years ago
Thanks for watching Deepak!
@werneckpaiva 1 year ago
Very useful! Thank you for sharing in such an easy and agile way.
@robmulla 1 year ago
Hey! Glad you learned something. Appreciate the feedback!
@nishb9567 2 years ago
Great Video! I think its important to add that the Pandas Vectorization doesn't always mean the code will run faster. In particular for the case of working with string data types it can sometimes be slower (even if it looks cleaner).
@robmulla 2 years ago
Good point! I didn't know that was the case. Do you have an example where vectorization is slower? I'd love to give it a look. Sometimes it's worth giving up a little bit of speed for readability. query is slightly slower than .loc but I prefer the former.
@MagnusAnand 2 years ago
@robmulla a couple of months ago I found an article that showed an example where vectorization wasn't faster. If I find it I'll post it here
@gabrielcosta4513 2 years ago
Great video! I also like the jazz bass behind you, I also play bass :)
@robmulla 2 years ago
Awesome! I’m more of a guitar player but I also enjoy playing bass.
@batman9937 1 year ago
The apply method is also a vectorisation function
@robmulla 1 year ago
Is it? I think it depends. Vectorized functions apply across the entire series at once.
@batman9937 1 year ago
@robmulla you're right, but I see it classified as such generally. In my experience, it's much faster than looping. I assume there's not much overhead in looping.
@matthewcaron6446 1 year ago
Great video. Lots of operations and procedures that are helpful for effective coding. Would be really helpful to have a cheat sheet linked for easy reference.
@drlirankatzir 1 year ago
Thanks for this great video. One minor comment on number 14: to "pipe" and chain several data transformations, one can use `df.pipe(process_data).pipe(yet_another)`.
@robmulla 1 year ago
That’s a good add. I’m not very comfortable with pipes. I need to learn and stop being a noob! 😂
@swannschilling474 5 months ago
Thanks for this great tutorial!!
@Dongnanjie 8 months ago
Love it. Thank you!
@alexwalker4642 2 years ago
I have been using the vectorised notation purely due to it requiring less syntax lol. But good to know its faster.
@robmulla 2 years ago
Yep! It can be a lot faster.
@davidcorona644 2 years ago
dot syntax is a noob way to do things. the correct way is always to use brackets because more often than not, you're accessing data, not creating it. also, dot syntax makes things more confusing.
@robmulla 2 years ago
I agree. Columns with spaces are still bad too.
@slapfighting 1 year ago
Wonderful, this video is super helpful
@robmulla 1 year ago
Glad you think so!
@slapfighting 1 year ago
@robmulla videos like these actually help a data scientist's sword to be sharper. Thank you for sharing it, hope to look for more advanced videos like these in future.
@trashantrathore4995 2 years ago
Great insights, thanks for these important tips
@robmulla 2 years ago
Glad you found them helpful. Share it somewhere on social you think people might learn from!
@junas4837 10 months ago
Most of these always offer basic useless stuff that 99% already knows. This was not the case here! Well done mate.