Seaborn distplot | Seaborn distplot interpretation and how to make a distribution plot in seaborn

Kimberly Fessel

Published: 30 Sep 2024

Comments: 77

@KimberlyFessel 3 years ago

If you're using Seaborn 0.11.0+, check out my videos all about the new Seaborn distribution plots: displot (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-4DA_dgc521o.html) and histplot (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Bjz00ygERxY.html) 😄

@MagnusAnand 2 years ago

NOTE: distplot is deprecated

@ketanbutte3497 1 year ago

Great explanation, and I found out about those cool dropdowns in nbextensions!! Thanks, Kimberly, for your work!!

@jaimefernandopelembe4339 2 months ago

Thank you for the video, your explanation is very easy to understand 😊

@KimberlyFessel 2 months ago

Most welcome! Glad you found it helpful!

@World_Exploror 1 year ago

Hello, the title is misleading: you haven't interpreted the plots, you have just shown how to make the distplots. The important part is the interpretation of the plots.

@yxngboypolo 1 year ago

Did you just use a semicolon in Python? >:( Great video btw

@AadityaGupta-cm6mj 4 months ago

linewidth, bandwidth, fit, and coloring only the rug are not working with the latest version.

@KimberlyFessel 4 months ago

Thanks for the update. Yes, the distplot has been replaced by the displot. You may find that one more useful. Cheers!

@anashasiba2894 2 years ago

Can you please make a video on I PLOT, please!

@rajatgupta7344 3 years ago

Seaborn 0.11.0 deprecates distplot and suggests using histplot or displot instead, but several distplot arguments are not supported by displot 😅

@KimberlyFessel 3 years ago

Yes -- the most recent Seaborn update was a big one! The distribution plot is now called the displot and a simple histogram plot now exists, the histplot. The displot is supposed to be more similar to the catplot and the relplot, so some arguments were removed (for example, fit) while others only work if "kind" matches (for example, bins only works for kind="hist"). Planning to do videos on both the updated displot and the histplot in the future!

@rajatgupta7344 3 years ago

@KimberlyFessel Thanks, you are awesome. Waiting for your videos on both topics.

@060584saurav 3 years ago

Hi Kimberly. Can we change the bin width and bin range in distplot? I ask because the fit option is not available with histplot.

@KimberlyFessel 3 years ago

As far as I know, binwidth and binrange were introduced with the seaborn histplot (in seaborn version 0.11.0) -- the distplot only has the bins argument. So, you can pass in your own custom bin list to the bins argument, but just not the binwidth and binrange options. And I agree - it was a bummer that the "fit" argument was removed!

@vedanthbaliga7686 4 years ago

How did you do the dropdown in Jupyter Notebook?

@KimberlyFessel 4 years ago

Great question! I installed nbextensions (jupyter-contrib-nbextensions.readthedocs.io/en/latest/) which allows for additional functionality in Jupyter Notebook. Those dropdowns are an extension called "Collapsible Headings" (jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/collapsible_headings/readme.html).

@gopikrishna1404 2 years ago

Can you also upload the data sheet?

@warrenarnold 2 years ago

you are so happy

@sergejlust7927 3 years ago

Nice, I just missed the interpretation part that was in the title. Nevertheless informative.

@danielsan6676 3 years ago

Thanks a lot, your video was so useful

@priyanshjoshi7 3 years ago

You're absolutely amazing in everything. The best part of your videos is a really sweet voice and the explanations are awesome!! Thank you so much Kimberly. Your channel is quite underrated.

@KimberlyFessel 3 years ago

Wow, thanks so much! Really appreciate the support and glad to hear you are enjoying my content - cheers!

@slungilemhlongo8028 3 years ago

Thank you for sharing your knowledge with us. Your videos are really helpful, thank you!!

@KimberlyFessel 3 years ago

So glad to hear that the videos have helped - cheers!

@xeeharshit 3 years ago

Hey, great work here, really awesome! A small suggestion: please update some of the code for the new version, either in the description box or in the video. Kinda stuck with a huge warning box.

@KimberlyFessel 3 years ago

Thank you! And yes, a few months after I launched this series, Seaborn underwent a big revamp in version 0.11.0. Will definitely take your suggestion into consideration, but I did go ahead and make new videos about the updated functions: displot (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-4DA_dgc521o.html) and histplot (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Bjz00ygERxY.html) 👍

@shubhamtalks9718 3 years ago

How can I draw the mean line in this plot?

@KimberlyFessel 3 years ago

I would probably use matplotlib to add a mean line. First import pyplot (from matplotlib import pyplot as plt) and calculate your mean (say, m = cars.horsepower.mean()). Then you can use pyplot to add a vertical line at the mean (plt.axvline(m)). This code can be added right after your seaborn figure in the same Jupyter Notebook cell. I also have a video about adding vertical or horizontal lines with matplotlib if you'd like to learn more: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-xKeu1W2mn64.html

@shubhamtalks9718 3 years ago

@KimberlyFessel Thanks Kimberly. Keep doing more wonderful videos.

@adityajangir7149 3 years ago

Please provide the link to the datasets you use.

@KimberlyFessel 3 years ago

Sure thing - I mostly use datasets that are included in the seaborn library, and those datasets are also publicly available through GitHub (github.com/mwaskom/seaborn-data). In this particular video, I used the "mpg" dataset from seaborn (github.com/mwaskom/seaborn-data/blob/master/mpg.csv).

@adityajangir7149 3 years ago

@KimberlyFessel Okay, thanks ☺

@adityajangir7149 3 years ago

@KimberlyFessel And also thank you for such great videos.

@sebastianw8952 3 years ago

Very good explanation. Hope you are getting more viewers.

@KimberlyFessel 3 years ago

Thanks very much -- and I hope so, too! 😄

@aniketsinghvats6441 3 years ago

Great work. Really enjoyed your tutorials and this seaborn series seems interesting.

@KimberlyFessel 3 years ago

Very glad to hear that! The more I learn about seaborn, the more interesting it becomes!

@akintomiwaodusanya3735 4 years ago

Thank you Kimberly. You have found a way to make graphs fun.

@KimberlyFessel 4 years ago

Glad you enjoyed it -- I have fun making both graphs and videos!

@harshitarawat8941 3 years ago

I just want to say, I love your videos and way of explaining. Thank you so much!!

@KimberlyFessel 3 years ago

Thanks very much for stopping by and for the compliment! 😄

@eatbreathedatascience9593 2 years ago

Excellent video!

@phssyk2 3 years ago

Great job, Kimberly. Please keep on adding more tutorials. Really liked the way you teach the concepts and the code. Already subscribed and will look out for more videos on ML and Data Science from you!!!

@KimberlyFessel 3 years ago

Thank you -- definitely will continue adding more videos!

@harkawalsohi9761 3 years ago

Hey. What if the given data is about the weights of different fruits and you want to make a displot of the weight of just one fruit, say apple?

@KimberlyFessel 3 years ago

Hi -- Your best bet is probably to use pandas to filter down to the apples first and then do a displot of the apple weights. So if your dataframe, df, has two columns "fruit_type" and "weight", you could do a displot on df[df.fruit_type == "apple"].weight to draw out a distribution plot for just the apple weights. This assumes that you have several apple weight measurements.

@harkawalsohi9761 3 years ago

Kimberly Fessel, thanks. I will try this.

@induchanti9966 3 years ago

How can I make a distplot for the name column in this dataset?

@KimberlyFessel 3 years ago

Distplot is used to show distributions of numerical values. Since "name" in this dataset has descriptive categories, distplot won't be able to show you much. You could perhaps build a barplot to count up the number of cars from each make, which is the first word in name: cars.name.apply(lambda x: x.split()[0]).value_counts().plot(kind='bar')
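
The one-liner above can be tried on a small example. The car names below are a hypothetical handful in the same "make model" style, not the full dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample of names in "make model" form
cars = pd.DataFrame({"name": [
    "ford torino", "ford pinto", "toyota corolla",
    "honda civic", "toyota corona", "ford maverick",
]})

# Take the first word of each name as the make, then count occurrences
make_counts = cars.name.apply(lambda x: x.split()[0]).value_counts()

# One bar per make, tallest first (value_counts sorts in descending order)
ax = make_counts.plot(kind="bar")
```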

@rskura 3 years ago

Great video tutorials Kimberly! I am wondering what the best way is to handle bounded data with the kdeplot. For example, measurements from an instrument that will always be positive (so bounded on the left by zero). The kdeplot can show positive probability for values below zero. I understand you can limit the x-axis of the plot to start at zero, but the probability is underestimated because some of it "spills over" onto the negative side of the x-axis. I have read about (and tried) a method to include all negative values of the same dataset and the resulting kdeplot shows accurate probability density as it approaches zero. Then you can limit the x-axis to start at zero. Have you used this approach or is it better to use the scipy.stats.skewnorm?

@KimberlyFessel 3 years ago

Hi Richard - thank you for this interesting question! Yes, that is definitely a drawback of the KDE, that values can spill into unnatural areas. Seaborn has an option called "clip". If you set that to a pair of numbers, Seaborn will not evaluate density outside of the bounds you provide. But I haven't looked into the source code. Not sure if it will redistribute the density into the allowed area or just clip off the ends. Otherwise, the idea you mentioned should work! If you add in equivalent negative values and THEN lop off your x-axis, you will get an equivalent added part from the negative values even though part of the positive values are getting cut off. 👍
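
The reflection idea can be checked numerically. The sketch below uses scipy's gaussian_kde as a stand-in for the curve seaborn draws, with synthetic exponential data. One assumption worth stating: the mirrored estimate spreads its mass over both sides of zero, so it is doubled here to make the positive side integrate to about one. For reference, seaborn documents clip as not evaluating the density outside the given limits.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic, strictly positive measurements
rng = np.random.default_rng(3)
samples = rng.exponential(scale=1.0, size=2000)

grid = np.linspace(0, 10, 1001)

# Plain KDE: some probability mass leaks below zero
plain_density = gaussian_kde(samples)(grid)

# Reflection: mirror the data about zero, estimate, then double the density
# on the positive side so it still integrates to about one there
mirrored = np.concatenate([samples, -samples])
reflected_density = 2 * gaussian_kde(mirrored)(grid)


def area(y, x):
    """Trapezoidal-rule area under y(x)."""
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))


plain_area = area(plain_density, grid)          # noticeably below 1
reflected_area = area(reflected_density, grid)  # close to 1
```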

@rskura 3 years ago

@KimberlyFessel Thank you for the detailed reply. I think I might have tried “clip” before, but I might have to revisit. Your videos are great and truly appreciated. They have helped me a lot. Keep up the great work!

@murtazajabalpurwala8124 3 years ago

Thanks a lot

@KimberlyFessel 3 years ago

Most welcome - cheers!

@SwavimanKumar 4 years ago

Awesome videos. Thanks for making them. I have one small question: what is xkcd?

@KimberlyFessel 4 years ago

Excellent -- glad you enjoyed the video. Also glad you asked this question! xkcd is a fun comic series. About ten years ago they conducted a large-scale color survey: blog.xkcd.com/2010/05/03/color-survey-results/ The resulting named colors (xkcd.com/color/rgb/) can be accessed by matplotlib or seaborn by prepending the color name with 'xkcd:' I am planning to do a video all about color options and color palettes within seaborn soon!
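
As a small illustration of the 'xkcd:' prefix, matplotlib can resolve one of the survey names directly. The name "sky blue" is just an example pick from the survey list.

```python
from matplotlib import colors

# Resolve a named xkcd survey color to RGB floats and to a hex string
rgb = colors.to_rgb("xkcd:sky blue")
hex_code = colors.to_hex("xkcd:sky blue")
```

The same string can be passed wherever matplotlib or seaborn accepts a color, for example color="xkcd:sky blue".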

@SwavimanKumar 4 years ago

@KimberlyFessel Wow. That was interesting information. Awaiting your video on color options. 👍👍

@gauthamambethkar4483 3 years ago

Hi Kimberly, These are amazing tutorials. Just one question. At 5:23, when you make the plot right skewed using skewnorm, how will you do it for left skew?

@Himanshu-ed3mf 3 years ago

I guess distplot() automatically recognizes whether data is skewed to the left or right.

@KimberlyFessel 3 years ago

Thank you! And yes - Himanshu is correct - skewnorm would automatically detect left or right skewness. Unfortunately, however, the "fit" argument I mentioned in this video has been removed from the new displot and histplot functions in Seaborn version 0.11.0.
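
Since the fit argument is gone, the direction of the skew can still be checked by fitting with scipy directly. This is a minimal sketch on synthetic left-skewed data: the fitted shape parameter comes out negative for left skew (it would be positive for right skew), so nothing has to be specified up front.

```python
import numpy as np
from scipy import stats

# Synthetic left-skewed sample: a negative shape parameter skews left
left_skewed = stats.skewnorm.rvs(a=-8, loc=0, scale=1, size=3000, random_state=4)

# Sample skewness is negative for a left-skewed sample
sample_skewness = stats.skew(left_skewed)

# Maximum-likelihood fit; returns (shape, loc, scale)
a_hat, loc_hat, scale_hat = stats.skewnorm.fit(left_skewed)
```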

@gauthamambethkar4483 3 years ago

Thank you both 😊

@ajay6015 3 years ago

How do I add a legend to the distplot?

@KimberlyFessel 3 years ago

You can add a legend to the distplot using matplotlib commands. If you have imported and aliased pyplot (from matplotlib import pyplot as plt), add "plt.legend(["label"])" as a line of code after your distplot. You can also label the kde and histogram separately by making a longer list, e.g. "plt.legend(["kde", "histogram"])".

@andreazecchi812 3 years ago

Over the top. Great job, Kimberly!

@KimberlyFessel 3 years ago

Thank you!

@hudsonjorge2000 3 years ago

Nicely explained. Thanks.

@KimberlyFessel 3 years ago

Most welcome - glad you enjoyed it!

@anshulsingh3210 4 years ago

Hey. Can you please tell me how can we change the size of plots?

@KimberlyFessel 4 years ago

Hi there -- you can add a line of code before your Seaborn plot to change the figure size via matplotlib: plt.figure(figsize=(10,6)). Just input your required dimensions in place of my example tuple, (10,6), and be sure to do: "from matplotlib import pyplot as plt" at the start of your code.

@anshulsingh3210 4 years ago

Thank you so much... it solved my problem.

@zouhir2010 3 years ago

Thanks, see you later

@KimberlyFessel 3 years ago

You're welcome!

@AlvessSavio 3 years ago

Awesome, that helps a lot

@KimberlyFessel 3 years ago

Excellent! Glad it was helpful.

@dhirajp4677 4 years ago

Hey, thank you for the video. Can you explain what the y-axis is in KDE plots and how to interpret it?

@KimberlyFessel 4 years ago

Hi there -- the KDEplot provides you with an estimate of your data's probability density function. The height of the graph is scaled so that the area under the curve sums (or integrates) to one.
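
That normalization can be verified numerically. The sketch below uses scipy's gaussian_kde as a stand-in for the curve seaborn draws, on synthetic data, and integrates it with the trapezoidal rule. The y-values are densities rather than probabilities: a probability comes from the area over an interval, not from the height at a point.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(7)
values = rng.normal(loc=100, scale=15, size=500)

# Evaluate the KDE on a grid that extends well past the data on both sides
kde = gaussian_kde(values)
grid = np.linspace(values.min() - 60, values.max() + 60, 2000)
density = kde(grid)

# Trapezoidal-rule area under the curve; should be very close to one
area = float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid)))

# Estimated probability of a value falling between 85 and 115
p_85_to_115 = kde.integrate_box_1d(85, 115)
```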