
Stack, Unstack, Melt, Pivot - Pandas 

Data Talks
40K views

“There should be one-- and preferably only one --obvious way to do it,” says the Zen of Python. I certainly wish that were the case with pandas. Reading the docs, it feels like there are a thousand ways to do each operation, and it is hard to tell whether they do exactly the same thing or which one you should use. That's why I made An Opinionated Guide to pandas: to present one consistent (and a bit opinionated) way of doing data science with pandas and cut out all the confusion and cruft.
I'll talk about which methods I use, why I use them and most importantly tell you the stuff that I've never touched in my years of data science practice. If this sounds helpful to you then please watch and provide feedback in your comments.
This series is beginner-friendly but aimed most directly at intermediate users.
“Opinionated Guide - Group Operations” contents:
github.com/kna...
Helpful links:
pandas stack/unstack: pandas.pydata....
An Opinionated Guide to pandas - Intro and Environment Setup: • Installing Pandas
An Opinionated Guide to pandas - Intro to Data Structures: Series: • Series - Pandas
An Opinionated Guide to pandas - Intro to Data Structures: DataFrames: • Series - Pandas
An Opinionated Guide to pandas - Intro to Data Structures P3: • DataFrame Functions - ...
An Opinionated Guide to pandas - Indexing and Selecting: • Indexing and Selecting...
Categorical Encodings: • Categorical Encoding
Link to GitHub repo including environment setup for tutorials: github.com/kna...
Link to GitHub Intro To Data Structures Jupyter Notebook: github.com/kna...
PEP 20 - The Zen of Python link: www.python.org...

Published: Oct 2, 2024

Comments: 36
@arhataria 4 years ago
Lecture notes - Stack, Unstack, Melt, Pivot
1. stack, unstack - moving things from the columns into the index and from the index back into the columns (useful when you need to deal with pivot table format)
2. melt, pivot - a fancy way of stacking and unstacking
3. get_dummies() - for machine learning
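The first point in these notes can be illustrated with a minimal sketch. The table, its labels, and the variable names are made up for illustration and do not come from the video.

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per year.
wide = pd.DataFrame(
    {"2019": [10, 20], "2020": [11, 21]},
    index=pd.Index(["Lima", "Oslo"], name="city"),
)
wide.columns.name = "year"

# stack: move the column labels into the innermost index level.
# The result is a Series indexed by (city, year).
long = wide.stack()

# unstack: move the "year" index level back out into the columns.
restored = long.unstack("year")

print(long)
print(restored)
```

Here `long` has one row per (city, year) pair, and `restored` has the same shape and values as `wide`.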
@barefootalex 3 years ago
Started my analytics journey as a way to upskill during the pandemic. It has not been easy. However, the journey has been worth it. For those who are just starting out, stick with it!! Things will begin to click. Be patient and it’ll come!
@automatewithamit 6 months ago
Thanks for your video! Can you please let me know whether we can add that collapse and expand functionality to a pivot table actually generated in Excel using Python?
@InteligenciadeNegocios 2 years ago
Thank you! was really helpful
@AnjanBasumatary 1 year ago
How do I display empty row and column combinations with the pivot table function? In the output I only get the rows and columns that have values.
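One possible approach to this question, not taken from the video, is to build the pivot table and then reindex it against the full list of labels you expect. The data and the label lists below (`all_regions`, `all_products`) are hypothetical.

```python
import pandas as pd

# Hypothetical sales data with no "west" region and no "plum" product.
df = pd.DataFrame({
    "region": ["north", "north", "south"],
    "product": ["apple", "pear", "apple"],
    "sales": [5, 3, 7],
})

table = df.pivot_table(
    index="region", columns="product", values="sales", aggfunc="sum"
)

# Assumed: you know the complete set of labels you want to display.
all_regions = ["north", "south", "west"]
all_products = ["apple", "pear", "plum"]

# reindex adds the missing rows and columns, filled with NaN.
full = table.reindex(index=all_regions, columns=all_products)
print(full)
```

The added rows and columns hold NaN; `reindex` also accepts a `fill_value` argument if a placeholder such as 0 is preferred.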
@jessyjames2950 3 years ago
The problem I had with stack() was that I was not able to use the stacked data in DataFrame structure.. melt() saved me though!
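For readers who hit the same problem: `stack()` on a frame with a single level of columns returns a Series with a MultiIndex, and one way to get back to an ordinary DataFrame is `reset_index`. This is a sketch with made-up data; the names `student`, `subject`, and `score` are hypothetical.

```python
import pandas as pd

# Hypothetical wide table of scores.
wide = pd.DataFrame(
    {"math": [90, 80], "art": [70, 60]},
    index=pd.Index(["ann", "bob"], name="student"),
)

# stack() returns a Series with a (student, subject) MultiIndex.
stacked = wide.stack()

# Name both index levels, then turn them into regular columns.
tidy = stacked.rename_axis(["student", "subject"]).reset_index(name="score")
print(tidy)
```

The result `tidy` is a flat DataFrame with one row per student and subject, much like what `melt()` produces.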
@vitorribeirosa 2 months ago
Neat!!! Thanks for sharing.
@DookyButter 2 years ago
Awesome video and tutorial. One caveat I would add about pd.get_dummies() and using it in preparation for any type of linear models (at 8:50) is that you would need to drop either the sex_Male or sex_Female columns so as to avoid the dummy variable trap.
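A small sketch of this caveat, using made-up data (the `age` column is hypothetical; `sex_Male` and `sex_Female` follow the column names mentioned in the comment):

```python
import pandas as pd

df = pd.DataFrame({"sex": ["Male", "Female", "Female"], "age": [30, 25, 41]})

# Default: one indicator column per category. The two columns always
# sum to 1, which is redundant in a linear model with an intercept.
full = pd.get_dummies(df, columns=["sex"])

# drop_first=True drops the first category (here "Female"),
# leaving a single indicator column.
reduced = pd.get_dummies(df, columns=["sex"], drop_first=True)

print(list(full.columns))
print(list(reduced.columns))
```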
@NikitaShilyaev 3 months ago
So, in simple words: when you have a binary feature, you just leave it as it is. But once your feature is not only True or False, you should create dummies with pd.get_dummies(..., drop_first=True) to avoid multicollinearity among the dummies.
@bubblebath2892 4 months ago
Nice tutorial
@jessyjames2950 3 years ago
Thank you! You saved my day!
@JLRocco43 2 years ago
omg the riddler
@vinceangeloespada6927 2 years ago
DUDE YOURE A BLESSING! THANKS FOR THIS!
@son3305 1 year ago
amazing explanation! Thank you
@riley09 3 years ago
Great content, and I really like your voice!
@dr.kingschultz 2 years ago
Very nice video! I have two simple problems; if you have a video about them, that would be nice! 1 - I have multiple values in the same cell, so I have to split the value, create multiple columns, and then stack(). How do I split the values from one cell? 2 - I have values in one column, and if the value is a one I want to select that row, check whether the value in the second column exists or not, and if it does, count it.
@DataTalks 2 years ago
Thanks! If these are coming from a problem set online, definitely send it over. Every once in a while I do one of those in another video series. But I do have some short answers here as to what I think I would do given your descriptions above. To split the value out and create multiple columns, I would very likely do a .str manipulation and then assign a new column on the data frame from the output. For having values in one column and wanting to check whether that second value is in the other column, I would do an apply of the isin operator. Ultimately it will depend on the problem itself and how big the data is. Using the apply operator in pandas is often extremely expensive. Great question!
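A rough sketch of both suggestions, under assumptions: the data, the `;` separator, the column names, and the `known` list are all invented, and the second part is only one reading of the question. It uses the vectorized `isin` directly instead of `apply`.

```python
import pandas as pd

# 1) Split several values stored in one cell into separate columns.
df = pd.DataFrame({"id": [1, 2], "tags": ["a;b", "c;d"]})

# .str.split with expand=True returns one column per piece.
parts = df["tags"].str.split(";", expand=True)
parts.columns = ["tag_1", "tag_2"]
df = df.join(parts)

# The new columns can then be stacked into long form.
long = df.set_index("id")[["tag_1", "tag_2"]].stack()

# 2) Among rows where "flag" is 1, count how many "value" entries
#    appear in a list of known values (hypothetical interpretation).
flags = pd.DataFrame({"flag": [1, 0, 1], "value": ["a", "b", "z"]})
known = ["a", "b", "c"]

selected = flags[flags["flag"] == 1]
count = selected["value"].isin(known).sum()

print(df)
print(long)
print(count)
```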
@mimanadade4patas701 2 years ago
Great video! Thank you
@pranay6481 5 years ago
Cool stuff Nathaniel! I use stacking/unstacking over melt too for quite similar reasons
@marlonmag-isa3500 4 years ago
Hi, how do I pivot multiple columns? For example, I have 9 columns and I want to retain columns 1 to 3 and pivot columns 4 to 9.
@DataTalks 4 years ago
Stack and unstack both take lists of columns/indexes as input :)
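One way to apply this to the question above is to move the columns you want to retain into the index and stack the rest. This is a sketch with six made-up columns instead of nine; all column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "name": ["a", "b"],
    "zone": ["x", "y"],
    "q1": [10, 20],
    "q2": [11, 21],
    "q3": [12, 22],
})

keep = ["id", "name", "zone"]  # columns to retain as identifiers

# Put the retained columns in the index, then stack the remaining ones.
long = (
    df.set_index(keep)
      .rename_axis(columns="quarter")
      .stack()
      .reset_index(name="value")
)

# Going back: unstack the "quarter" level into columns again.
wide_again = (
    long.set_index(keep + ["quarter"])["value"]
        .unstack("quarter")
        .reset_index()
)

print(long)
print(wide_again)
```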
@ericwr4965 4 years ago
Phenomenal video and had no idea about the creation of the dummy variables.
@imotivate963 4 years ago
I have only one column with numeric values, and I want to stack grouped by that column. Columns: house_number, zone, full_name, occupation. Can I stack by house_number using groupby? And how can I do that?
@DataTalks 4 years ago
Hey Roel, great question! If house_number is an integer you can just apply a groupby and it will bring the house_number into the index. If house number is continuous, then you will need to use something like pd.cut to discretize it first. I hope that helps!
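Both cases can be sketched as follows. The rows are made up; the column names come from the question, while `price`, `sold`, and the bin edges are hypothetical stand-ins for a continuous column.

```python
import pandas as pd

df = pd.DataFrame({
    "house_number": [1, 1, 2],
    "zone": ["A", "A", "B"],
    "full_name": ["Ann Lee", "Bo Lee", "Cy Roe"],
    "occupation": ["nurse", "cook", "clerk"],
})

# Integer key: groupby moves house_number into the index of the result.
by_house = df.groupby("house_number")["full_name"].agg(list)

# Continuous key: discretize with pd.cut first, then group by the bins.
prices = pd.DataFrame({"price": [95.0, 150.5, 310.0], "sold": [1, 0, 1]})
prices["price_band"] = pd.cut(prices["price"], bins=[0, 100, 200, 400])
by_band = prices.groupby("price_band", observed=True)["sold"].sum()

print(by_house)
print(by_band)
```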
@imotivate963 4 years ago
@DataTalks I got the result I wanted. Thanks
@moongihong2794 4 years ago
I don't need the multi-index functionality of pandas, so I stick to pivot/melt rather than stack/unstack. By the way, great quality of videos.
@DataTalks 4 years ago
Awesome! Yeah, that is definitely the other way to do it! Out of curiosity, were you familiar with Excel before you started using pandas? I've heard that Excel is similar to the pivot + melt syntax.
@moongihong2794 4 years ago
@DataTalks Bad at Excel lol. Frankly, I'm familiar with SQL. Some DBMSs support multi-index and some do not.
@eatcake7572 3 years ago
I like the explanation. I've been using melt, though I'll start looking into stack/unstack. Is there an advantage of one over the other (stack vs. melt)?
@DataTalks 3 years ago
Not really! Melt and pivot are just like stack and unstack, so if you know one pair that is good enough :)
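To make the correspondence concrete, here is a sketch on a made-up table showing that `melt` and a `set_index` + `stack` chain produce the same long-format rows, and that `pivot` reverses the reshaping. All names are hypothetical.

```python
import pandas as pd

wide = pd.DataFrame({
    "city": ["Lima", "Oslo"],
    "2019": [10, 20],
    "2020": [11, 21],
})

# melt works on ordinary columns and returns a flat frame.
melted = wide.melt(id_vars="city", var_name="year", value_name="value")

# stack works through the index; reset_index flattens the result.
stacked = (
    wide.set_index("city")
        .rename_axis(columns="year")
        .stack()
        .reset_index(name="value")
)

# The two results hold the same rows, possibly in a different order.
melted_sorted = melted.sort_values(["city", "year"]).reset_index(drop=True)
stacked_sorted = stacked.sort_values(["city", "year"]).reset_index(drop=True)

# pivot is the counterpart of melt, as unstack is of stack.
back = melted.pivot(index="city", columns="year", values="value")

print(melted_sorted)
print(stacked_sorted)
print(back)
```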
@cooky123 3 years ago
Thank you!
@tylerjnewman 4 years ago
Super useful, thanks
@jordanhensiek3882 4 years ago
3:00 unstack mark can do the thing you're lookin to do
@lolita000018 4 years ago
Great work explaining how stack and unstack work. Just a question about that magic line to convert multi-level columns into columns: I can't understand the point of .strip() after join! I got the same result without it!
@DataTalks 4 years ago
I think strip is just there to help with spaces at the beginning or end of column names - so a pretty uncommon use case!
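A sketch of a case where `.strip()` does change the result. The exact line used in the video is not reproduced here; this assumes the column tuples are joined with a space, and the data is made up.

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 3, 5]})

# Aggregating with a list of functions creates two-level columns.
# After reset_index, the "team" column becomes the tuple ("team", "").
agg = df.groupby("team").agg({"score": ["min", "max"]}).reset_index()

# Joining with a space leaves a trailing space on "team ".
flat_no_strip = [" ".join(col) for col in agg.columns.values]

# strip() removes that leading or trailing whitespace.
flat = [" ".join(col).strip() for col in agg.columns.values]

agg.columns = flat
print(flat_no_strip)
print(flat)
```

If the tuples are joined with an underscore and no level is empty or padded with spaces, `strip()` has nothing to remove, which would explain getting the same result without it.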