Introduction to Data Processing in Python with Pandas | SciPy 2019 Tutorial | Daniel Chen 

Enthought
119K views

This is a tutorial for beginners on using the Pandas library in Python for data manipulation. We will go from the basics of how to load and look at a dataset in pandas (Python) for the first time, and begin the process of preparing data for analysis. The topics covered are:
- Load and look at slices and views of data
- Groupby aggregates to summarize data
- Tidy and reshape data
- Write functions and apply them to data
- Plotting data using Seaborn
- Encode dummy variables to prepare for analysis and model fit
- Fitting a model using sklearn
By the end of this tutorial, you should have a solid foundation on working with datasets in Python. The last topic of encoding dummy variables segues into using other libraries, such as scikit-learn and statsmodels, to fit models on your data.
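A minimal sketch of several of the listed topics, using a small made-up table rather than the tutorial's datasets. The table, column names, and the to_thousands helper are hypothetical; only standard pandas calls (loc, groupby, melt, apply, get_dummies) are used, and plotting and model fitting are left out to keep it short.

```python
import pandas as pd

# Hypothetical toy data; the tutorial itself works with real datasets.
sales = pd.DataFrame(
    {
        "store": ["north", "north", "south", "south"],
        "q1": [10, 20, 30, 40],
        "q2": [15, 25, 35, 45],
    }
)

# Look at a slice: rows labelled 0 through 1, two selected columns.
first_rows = sales.loc[0:1, ["store", "q1"]]

# Groupby aggregate: mean of q1 per store.
mean_q1 = sales.groupby("store")["q1"].mean()

# Tidy/reshape: wide quarter columns become long (store, quarter, amount) rows.
sales_long = sales.melt(id_vars="store", var_name="quarter", value_name="amount")


def to_thousands(amount):
    """Hypothetical helper: express an amount in thousands."""
    return amount / 1000


# Write a function and apply it to a column, one value at a time.
sales_long["amount_k"] = sales_long["amount"].apply(to_thousands)

# Encode dummy variables so the categorical columns can feed a model.
sales_dummies = pd.get_dummies(
    sales_long[["store", "quarter", "amount"]],
    columns=["store", "quarter"],
    drop_first=True,
)
```

drop_first=True drops one level per categorical column, which avoids perfectly collinear dummy columns when the result is passed to a linear model.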
Tutorial information may be found at www.scipy2019....
See the full SciPy 2019 playlist at • SciPy 2019: Scientific...

Published: Aug 21, 2024
Comments: 73
@aghileslounis 4 years ago
Daniel, best teacher in the world! Nothing is better than teaching with live examples; it is very intuitive!
@MehdiZouaoui 1 year ago
That was a long video but I managed to complete it. I liked the honesty of the guy and he was doing things on the go. Chapeau bas!
@zoexu3997 4 years ago
This is hands down the best pandas tutorial I've ever watched. Thank you, Daniel :)
@Don_Modern_Ancestor 3 years ago
His book, Pandas for Everyone, is the best out there. Really in-depth.
@siddiqkhan246 3 years ago
This video explains Pandas so well. Great job, Daniel; this is by far the best Pandas video on YouTube.
@Barry_L 3 years ago
Sweet! All this for freeeee... I'm a true believer that information should be free, and I say a BIG THANK YOU for this, Daniel.
@xt.7933 4 years ago
This is really awesome. I just started as an absolute beginner of coding, only finished Dojo's tutorial for the absolute beginner, and I am able to catch up with most of what you taught so far (1:39:00)!! Thank you!!!
@zzhou3894 4 years ago
Best Pandas tutorial I have found so far. Thanks.
@PP-im6lu 2 years ago
I've watched a bunch of Pandas tutorial videos and this is definitely the best one so far.
@zkinguk 5 years ago
Watched the entire video - really helpful stuff as a pseudo beginner.
@semrana1986 4 years ago
one of the best tutorials on pandas
@vigneshpadmanabhan 2 years ago
Best pandas tutorial… glad I found this talk.
@bbyum7618 3 years ago
Very useful class for understanding some basic aspects of pandas that are often not explained in other tutorials: long data, applying functions to DataFrames, and using accessors. Thank you!!
@jmyable4 3 years ago
that mitigated my pandas headache! Thanks!
@tonypendletoniii3209 4 years ago
@1:15:40 it is: ebola_long['cd_country'].str.split('_').str.get(0)
@marialaustsen9016 5 years ago
Great video for beginners. Thanks for sharing.
@dhananjaywalunj3652 3 years ago
Well explained... Thank you, Daniel.
@thegreatgreenpea835 4 years ago
It was very helpful and informative. Thank you very much for posting this video!
@hashimkhan4731 3 years ago
Nice tutorial indeed. Can you point out any such nice tutorial for beginners of ML?
@narendraful 14 days ago
Great lecture! Thanks. I just have one doubt: at 2:05:39 we use the avg_2 function and did not need to vectorise it, whereas avg_2_mod needed vectorisation. I can't understand the difference between the two functions, i.e. why does one need vectorisation and the other doesn't for the same inputs?
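A sketch of the likely reason, assuming avg_2 is plain arithmetic and avg_2_mod adds an if check on a single value. The function bodies below are a guess at the shape of the tutorial's code, not a copy of it. Arithmetic such as (x + y) / 2 works element-wise on whole Series, but `if x == 20` needs a single True or False, which a Series cannot provide, so pandas raises a ValueError. np.vectorize wraps the function so that it is called once per pair of elements.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30], "b": [20, 30, 40]})


def avg_2(x, y):
    # Pure arithmetic: works on scalars and on whole Series alike.
    return (x + y) / 2


def avg_2_mod(x, y):
    # Assumed shape of the tutorial's function: a scalar-only condition.
    if x == 20:
        return np.nan
    return (x + y) / 2


# Element-wise arithmetic on two Series; no vectorization needed.
plain = avg_2(df["a"], df["b"])

# Passing Series straight into the conditional version fails, because
# `x == 20` yields a Series of booleans, not one boolean.
try:
    avg_2_mod(df["a"], df["b"])
    raised = False
except ValueError:
    raised = True

# np.vectorize calls avg_2_mod on one pair of values at a time.
avg_2_mod_vec = np.vectorize(avg_2_mod)
vectorized = avg_2_mod_vec(df["a"], df["b"])
```

np.vectorize is a convenience wrapper around a Python loop rather than a speed-up; for a condition like this, an element-wise expression such as np.where is another option.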
@AjayKumar-mh9um 5 years ago
Recommended for beginners
@rohscx 4 years ago
This is awesome. Thank you.
@rohitpurkait4046 4 years ago
Sir, at 1:15:45 we need to call .str twice to get the desired value, like: ebola_long['cd_country'].str.split('_').str.get(0)
@vijaypalmanit 4 years ago
True, I know it works by calling it twice, but does it make intuitive sense to call it twice?
@MouradBENKADOUR 2 years ago
Excellent. He forgot to do it this time, but he did it at the PyData conference in 2018: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-iYie42M1ZyU.html
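A small sketch of why .str appears twice in this thread's expression, using made-up values as a stand-in for the ebola_long column (the data here is hypothetical). The first .str.split returns a Series whose elements are Python lists, so a second .str accessor is needed to index into each list.

```python
import pandas as pd

# Hypothetical values standing in for ebola_long['cd_country'].
ebola_long = pd.DataFrame({"cd_country": ["Cases_Guinea", "Deaths_Liberia"]})

# Step 1: each element becomes a list such as ['Cases', 'Guinea'].
parts = ebola_long["cd_country"].str.split("_")

# Step 2: .str.get(i) picks element i out of every list.
status = parts.str.get(0)
country = parts.str.get(1)

# Alternative: expand=True returns a DataFrame with one column per piece.
split_df = ebola_long["cd_country"].str.split("_", expand=True)
```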
@vittorio8087 4 years ago
Great tutorial, great Daniel :) Thanks
@steveoshaughnessy3736 3 years ago
Excellent tutorial. Very detailed. I have one gripe though. And it's not Daniel. EVERYONE/EVERY tutorial does this. They name their dataframe df. That's like naming your spreadsheet "spreadsheet" or "ss". Or naming a variable by its datatype. No one ever names age as "i" or "int". They call their variables by the real world things they are. And a dataframe is a variable. DataFrames should be named like we name spreadsheets (their tabs) or database tables.
@rje4242 2 years ago
Hungarian notation has a place in Python. Including the type in the name tells you what type it should be, though you need type checking and asserts to guarantee that.
@hamoda510 2 years ago
Thanks Daniel
@_asim_ktk 4 years ago
@1:18 How can I be sure that the new columns correspond to the correct rows?
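One way to check this, sketched on hypothetical data: when a Series is assigned as a new column, pandas matches it to the DataFrame's rows by index label rather than by position. A Series derived from an existing column keeps that column's index, so its values end up on the rows they came from, even if the Series has been reordered in between.

```python
import pandas as pd

# Hypothetical data with a non-default index to make the alignment visible.
df = pd.DataFrame(
    {"cd_country": ["Cases_Guinea", "Deaths_Liberia", "Suspected_Mali"]},
    index=[10, 20, 30],
)

# Derived Series: same index labels (10, 20, 30) as the source column.
status = df["cd_country"].str.split("_").str.get(0)

# Reverse the order on purpose; the labels travel with the values.
reversed_status = status.iloc[::-1]

# Assignment aligns on index labels, so each row gets its own value back.
df["status"] = reversed_status
```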
@yasseralkindi7350 3 years ago
is there a single place where we can find these datasets, like a shared drive perhaps? Would be good to follow along with that as well.
@shereenkhanzada7953 4 years ago
I have a question about running my Python code in Jupyter Notebook. Sometimes, in the middle of running code, the cursor jumps to the next cell instead of running the code. I have tried many things, e.g. restarting the notebook and rewriting the code, but the result is the same. Can anybody help me with this issue?
@puar6124 4 years ago
Check if your kernel shut off due to inactivity or something
@shereenkhanzada7953 4 years ago
@puar6124 Checked that too... but still the same :(
@mohammadghanatian114 4 years ago
He was really a nice guy
@da_ta 4 years ago
Excellent, thank you
@srinivasdasari6614 4 years ago
At 2:10:49 you split the Series directly, without using .str.split('/'). How does that split a DataFrame Series? In the previous example we used .str.split when splitting. Please explain.
@woOpPerjr 4 years ago
I think you're talking about the "function" example/question. I'm not using .str.split there, because that's how you use split on a pandas Series. We're writing a function that takes in a single string, so we have direct access to the string methods, because it's really just regular "my_string".split("_") in base Python. We then apply the function to our data.
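The distinction described in this reply can be sketched as follows. The function name and sample values are hypothetical. Inside the function, value is an ordinary Python str, so the built-in split method is called directly; on the Series itself, the same operation goes through the .str accessor.

```python
import pandas as pd

# Hypothetical date strings standing in for the tutorial's column.
dates = pd.Series(["1/5/2015", "12/31/2014"])


def get_month(value):
    # value is a plain Python str here, so str.split applies directly.
    return value.split("/")[0]


# apply calls get_month once per element of the Series.
months_apply = dates.apply(get_month)

# The Series-level equivalent needs the .str accessor at each step.
months_accessor = dates.str.split("/").str.get(0)
```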
@pavansaitadepalli6097 5 years ago
excellent
@sidhantmahipal9934 9 months ago
Where can I access the datasets being used in this video?
@adds5257 4 years ago
I need to remember the syntax, whereas Excel shows you the average value: just drag over your data and the average is shown.
@vijaypalmanit 4 years ago
Yeah, but you can't automate any reporting in Excel. With pandas you only need to write the code once for any report, and from then on you can reproduce it.
@habrom1000 2 years ago
2:00:00 vectorize it is useful for me
@rahularanger407 2 years ago
Why does my output include NaN values, unlike the table shown at ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-5rNu16O3YNE.html? For example, for the day "Thur" it shows both Lunch and Dinner (Dinner has NaN), but in the video there is only Lunch.
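One possible cause, sketched on hypothetical data: if day and time are categorical columns (as I believe they are in the seaborn tips dataset), groupby can report every combination of categories, including ones that never occur in the data, and those appear as NaN. Passing observed=True keeps only the combinations actually present. Which behavior is the default depends on the pandas version, so the sketch sets the parameter explicitly in both cases.

```python
import pandas as pd

# Hypothetical miniature of a tips-like table with categorical columns.
tips = pd.DataFrame(
    {
        "day": pd.Categorical(
            ["Thur", "Thur", "Fri", "Fri"], categories=["Thur", "Fri"]
        ),
        "time": pd.Categorical(
            ["Lunch", "Lunch", "Lunch", "Dinner"], categories=["Lunch", "Dinner"]
        ),
        "tip": [2.0, 3.0, 1.5, 4.0],
    }
)

# observed=False: every day/time pair is listed; (Thur, Dinner) has no
# rows, so its mean is NaN.
all_combos = tips.groupby(["day", "time"], observed=False)["tip"].mean()

# observed=True: only pairs that actually occur in the data are listed.
observed_only = tips.groupby(["day", "time"], observed=True)["tip"].mean()
```

If the output should match the video, comparing the installed pandas version with the one used in the recording may explain the difference.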
@maxbart1353 4 years ago
i need an extra tutorial for that
@radyoalmikyel6881 4 years ago
You dropped total_bill in X = tips_dummy, no?
@kamilwhite8139 3 years ago
million likes
@souhamahmoudi7745 2 years ago
Where can I find the data that was used in this video, please?
@tunkyi7162 4 years ago
The coding window should be full screen; I can't see it very well.
@woOpPerjr 4 years ago
thanks for letting me know. I just realized the other day that I can get a little more screen real estate by hitting F11 so I'll be sure to do this in the future.
@surajviswakarma254 4 years ago
Can I have access to your notes, please? Or does someone else have them?
@rohitpurkait4046 4 years ago
You can get all the notes from GitHub
@calluma8472 4 years ago
Does the audio artifact on this video ever stop? Driving me crazy.
@RAL2010 4 years ago
it's his phone, he should have switched it off.
@woOpPerjr 4 years ago
@RAL2010 Oh, I did not know that's what caused it. :\ I use my phone for my teaching notes. Since it's a live coding session, it would be super disruptive to tab back and forth on the screen... It might also be that the phone was plugged in and charging. Would the interference be just from the charger? Or does putting it in airplane mode help?
@DanWhalen 4 years ago
what os is that, is he on kde neon?
@ankushmenat 4 years ago
KDE for sure.
@woOpPerjr 4 years ago
I run/ran arch (antergos) with KDE.
@yasseralkindi7350 3 years ago
great content, annoying crackling noise :(
@inhyeokbaek6258 3 years ago
24:35
@MrEstate 4 years ago
Is the Slack Channel still working? I can't find it.
@enthought 4 years ago
Sorry, Robb. The SciPy 2019 Slack Channel is no longer active.
@KukaKaz 4 years ago
@enthought Hi! Where can I find the dataset and the code to follow along? I couldn't find it on Daniel Chen's GitHub page. Could you please send me the link or email me at tolekbaeva@bk.ru? Thank you!!
@elliottscott666 4 years ago
@KukaKaz I just found it today by searching on GitHub using the description in the video
@codingwithjoyk 5 years ago
you think. wowl
@anveshicharuvaka4650 4 years ago
Melt around 50:00
@anveshicharuvaka4650 4 years ago
Melt 50:00 Pivot 1:20:00 Apply 1:39:00
@Imran_et_al 3 years ago
Pandas was never that easy elsewhere
@maxbart1353 4 years ago
In Excel the pivot table stuff is much easier (for me at least)
@Dev-yv5xl 4 years ago
Next time you make a video, put that mobile phone away from your mic.
@Michael-ur3zs 3 years ago
he has great content, but that phone interference is so distracting