
Pandas for Data Analysis | SciPy 2017 Tutorial | Daniel Chen 

Enthought
67K subscribers
84K views

There are audio issues with this video that cannot be fixed. We recommend listening to the tutorial without headphones to minimize the buzzing sound.
Tutorial information may be found at scipy2017.scip...
Data science and machine learning have become synonymous with languages like Python. Libraries like NumPy and Pandas have become the de facto standard when working with data.
The DataFrame object provided by Pandas gives us the ability to work with the heterogeneous, unstructured data that is common in "real world" datasets.
New learners are often drawn to Python and Pandas because of all the different and exciting types of models and insights the language can provide, but are overwhelmed when faced with the initial learning curve.
This tutorial aims to guide the learner from using spreadsheets to using the Pandas DataFrame.
Not only does moving to a programming language allow the user to have a more reproducible workflow, but as datasets get larger, some cannot even be opened in a spreadsheet program. The goal is to make an absolute beginner proficient enough with Pandas that they can start working with data in Python.
We will cover how to load and view our data, then some basic methods for quick visualizations of our data for exploratory data analysis. We will then work on combining and working with multiple datasets (concatenating and merging), and introduce what Dr. Hadley Wickham has coined "tidy data". Tidy data is an important concept because the process of tidying data fixes a host of data problems that must be resolved before analysis can be performed. We then cover functions and applying methods to our data with a focus on data cleaning, and how we can use the concept of split-apply-combine (groupby) to summarize or reduce our data.
Finally, we cover the basics of string manipulation and how to use it to clean data before briefly covering the role of Pandas in analysis packages such as scikit-learn. The tutorial will end with a fitted model.
The goal is to get people familiar with Python and Pandas so they can learn and explore many other parts of the Python ecosystem.
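As a rough illustration of two ideas named in the description (tidy data and split-apply-combine), here is a minimal sketch. The table, column names, and values are invented for the example and are not taken from the tutorial's notebooks.

```python
import pandas as pd

# Hypothetical "wide" table: one column per year (invented data).
wide = pd.DataFrame({
    "country": ["A", "B"],
    "2016": [10, 20],
    "2017": [30, 40],
})

# Tidy it: one row per (country, year) observation.
tidy = wide.melt(id_vars="country", var_name="year", value_name="value")

# Split-apply-combine: split by country, apply mean, combine the results.
mean_by_country = tidy.groupby("country")["value"].mean()
print(tidy)
print(mean_by_country)
```

In the tidy form each variable is a column and each observation is a row, which is the shape that groupby and most plotting and modeling tools expect.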

Published: 21 Aug 2024

Comments: 42
@gndp 6 years ago
Topic Annotations
7:16 - Intro
19:59 - loc, iloc
39:26 - Ok there
49:53 - Assemble
1:21:46 - Missing Values
1:42:25 - 955 break
1:45:38 - Tidy data
1:53 04 - Tidy data
2:31:38 05 - Data types
2:41:00 - Spelling Mistake
2:42:33 - Apply
3:05:30 - Group by
3:19:29 - Stats models
3:36:46 - Python 2 hasn't ended
3:43:01 - Vectorize function
@texaswriter6314 6 years ago
Thanks for adding this
@adityasinghaswal4923 6 years ago
thanks paji
@giorda77 6 years ago
Thanks :)
@ItsRainingSteak 4 years ago
You are the man
@rosgori 7 years ago
You can find notebooks here: github.com/chendaniely/scipy-2017-tutorial-pandas
@Skirmitch 7 years ago
Incredible channel, watched two conferences already, absolutely subscribing
@namanmehta5243 7 years ago
ya, what the hell Enthought is...
@PUNEETKUMAR-le9vo 5 years ago
By using --------(ignore_index=True) you can get rid of the unwanted index sequence and get a new, correct index in ascending order.
@bigdataarchitect7163 6 years ago
How many times can I like this video?....Great work!
@clinton11994 5 years ago
1:20:00 The output has duplicate rows because of the left join; use "on" to take the intersection of the two dataframes.
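A minimal sketch of what may be going on at that point, using invented tables rather than the tutorial's data. As far as pandas semantics go, repeated rows after a merge come from one key matching several rows in the other table, while the `how` argument decides whether unmatched keys are kept; the `on`, `left_on`, and `right_on` arguments only name the key columns.

```python
import pandas as pd

# Hypothetical tables: two visits refer to the same site, one site has no visits.
sites = pd.DataFrame({"name": ["DR-1", "DR-3", "MSK-4"]})
visits = pd.DataFrame({"ident": [619, 622, 734],
                       "site": ["DR-1", "DR-1", "DR-3"]})

# Left join: every site is kept; DR-1 appears twice because it matches two visits.
left = sites.merge(visits, how="left", left_on="name", right_on="site")

# Inner join: only keys present in both tables (the intersection) survive.
inner = sites.merge(visits, how="inner", left_on="name", right_on="site")

print(left)
print(inner)
```

Here the left join returns four rows (MSK-4 has missing visit values) and the inner join returns three, so the repeated DR-1 row is present in both results.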
@dipanjansaha6824 6 years ago
52:25: To get the index numbers in ascending order: row_concat = pd.concat([df1, df2, df3], ignore_index=True) row_concat
@Ali-mi9up 5 years ago
Nicely explained; I couldn't understand the reason for passing in a list for multiple columns previously, now I do!
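For readers who have not reached that part of the video, a small sketch of the point about lists, with an invented DataFrame: a single label inside the brackets selects one column, while a list of labels selects several.

```python
import pandas as pd

# Invented example data.
df = pd.DataFrame({"country": ["A", "B"],
                   "year": [2016, 2017],
                   "pop": [1.0, 2.0]})

one_col = df["country"]            # a single label returns a Series
subset = df[["country", "year"]]   # a list of labels returns a DataFrame

print(type(one_col).__name__, type(subset).__name__)
```

The outer brackets are the indexing operation and the inner brackets are an ordinary Python list, which is why the doubled brackets appear.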
@GeorgeKlucsarits 6 years ago
Very nice introduction; thanks for doing this.
@amirkahinpour547 6 years ago
@enthought this is an amazziiinnng playlist. Thank you times and times more for sharing
@mariav1234 6 years ago
Amazing tutorial! Thank you!
@shikharagrawal1797 6 years ago
In the 'apply' section after 2:42:33, when we apply the average function to the dataset, we are able to compute the average for each column. But we are not able to apply it to a particular column. Why?
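One possible explanation, sketched with a stand-in function since the exact function used in the video is not reproduced here: `DataFrame.apply` passes each whole column to the function, whereas `Series.apply` passes one element at a time, so a function written for a whole column does not fit the second case. The name `avg` and the data below are assumptions for illustration.

```python
import pandas as pd

def avg(values):
    """Average of a whole column (expects a Series, not a single number)."""
    return values.sum() / len(values)

# Invented example data.
df = pd.DataFrame({"a": [10, 20, 30], "b": [20, 30, 40]})

per_column = df.apply(avg)   # avg receives each column in turn
one_column = avg(df["a"])    # for a single column, call the function directly

# df["a"].apply(avg) would call avg once per element (10, then 20, ...),
# and a single number has no length, so that call raises an error.
print(per_column)
print(one_column)
```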
@Tmmmey 6 years ago
For a collection of 18+ pandas video tutorials, and more, see this GitHub repo: github.com/tommyod/awesome-pandas
@taisenzhuang2302 6 years ago
Great Training Video!! Thank you!!
@jiminy2731 6 years ago
pd.concat([df1, df2, df3], ignore_index=True) is the code to reset the index after concatenating. @55:00
@cluelesscoder8526 6 years ago
If you've already concatenated the data, I think you can use: row_concat.reset_index(drop=True). In this case, "row_concat" is just the variable name used in the video.
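The two suggestions in this thread can be compared side by side. This is a minimal sketch with invented frames; `df1`, `df2`, `df3`, and `row_concat` follow the names used in the comments, while `stacked` and `reset` are names made up for the example.

```python
import pandas as pd

# Invented stand-ins for the three frames concatenated in the video.
df1 = pd.DataFrame({"A": ["a0", "a1"]})
df2 = pd.DataFrame({"A": ["a2", "a3"]})
df3 = pd.DataFrame({"A": ["a4", "a5"]})

# Plain concat keeps each frame's own labels, so the index repeats.
stacked = pd.concat([df1, df2, df3])

# Option 1: ask concat to build a fresh 0..n-1 index.
row_concat = pd.concat([df1, df2, df3], ignore_index=True)

# Option 2: renumber afterwards, discarding the old labels.
reset = stacked.reset_index(drop=True)

print(list(stacked.index))
print(list(row_concat.index))
```

Without `drop=True`, `reset_index` would keep the old labels as a new column named `index`.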
@shivam13juna 6 years ago
I love this guy
@mjja99 5 years ago
Is it just me - or is this guy Leonard Hofstadter's body double - even his mannerisms are similar!
@plotsandsetups8539 6 years ago
Amazing video
@jmick175 6 years ago
I'm sure this entire video is helpful, but can you provide minute notations for specific topics? Specifically, I'm hoping that somewhere in this 3.75 hour video there is a discussion of reading from and writing to SQL databases with Pandas. Even if there is not any discussion of SQL, it would be even more helpful to have jump-to points for different topics. Thanks.
@jza1996 6 years ago
check the top comment
@texaswriter6314 6 years ago
Thanks for uploading.
@aznfoever35 5 years ago
very useful! thanks!
@oscarbecerril8343 7 years ago
Very very nice. Thanks
@bhargav1389 1 year ago
Hi there, I'm not able to download the repository. Can you provide an alternate way to download it?
@yvonneachieng7257 6 years ago
Thank you Sir
@sumithrap3155 5 years ago
How to transform non-numerical labels to numerical labels?
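Two common pandas approaches, shown as a hedged sketch on an invented Series; neither is claimed to be what the tutorial itself uses. `pd.factorize` numbers labels in order of first appearance, and the category dtype numbers them by sorted category.

```python
import pandas as pd

# Invented example labels.
labels = pd.Series(["cat", "dog", "cat", "bird"])

# Option 1: integer codes in order of first appearance.
codes, uniques = pd.factorize(labels)

# Option 2: categorical codes, numbered by sorted category order.
cat_codes = labels.astype("category").cat.codes

print(list(codes), list(uniques))
print(list(cat_codes))
```

The two options assign different integers to the same label, so whichever is chosen should be used consistently across a dataset.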
@Superdooperhero 5 years ago
Felt like Samuel L Jackson in Pulp Fiction watching this. Say "so" one more time.
@didierleprince6106 5 years ago
A true Bible. Merci
@lucasqian7488 6 years ago
How to auto complete the function?
@rahuljassal6080 5 years ago
Can anyone tell me the link for the files used in this particular tutorial? I am interested in downloading the Jupyter notebooks that were used in this tutorial.
@tayyabnawaz4679 5 years ago
github.com/chendaniely/scipy-2017-tutorial-pandas
@fernandosanchezros3341 4 years ago
Does anyone know a tutorial like this but in R?
@legalfictionnaturalfact3969 3 years ago
pandas
@user-zt8dj4nq9g 7 years ago
1:19:30
@gongfei 6 years ago
the url is github.com/chendaniely/scipy-2017-tutorial-pandas it took 6 minutes...
@daxmahoney7912 7 years ago
assmeble