
How to draw a histogram from a set of data 

Jeremy Blitz-Jones

Step-by-step guide for how to create a histogram from a set of data.
Video transcript:
I recently looked up local dogs that are up for adoption. I found that the dogs had a range of weights. Many dogs were the kind you could fit in your bag and some were too big to sit in your lap.
To better understand and visualize this data on local dog weights, we can create a histogram. Start by drawing the y-axis, which shows the number of data points in each bin, in this case the number of dogs. Then draw the x-axis, which represents our variable: dog weight. When drawing the x-axis, we need to decide what intervals, or bin sizes, to use for the dog weights. For example, we could use increments of 5 pounds or increments of 10 pounds. Let’s start with bins of 10, so our labels would be 0-10 pounds, 10-20 pounds, and so on. Some histograms label the ranges while others label only the boundaries.
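As a rough sketch of the bin-labelling step, the short Python function below builds range labels for a chosen bin width. The function name `bin_labels` and the maximum weight of 65 pounds are assumptions made for illustration; the transcript only says that the heaviest dog falls in the 60-69 pound bin.

```python
def bin_labels(max_weight, width):
    """Return range labels such as '0-10', '10-20', ... that cover max_weight."""
    labels = []
    lower = 0
    # Keep adding bins until one of them contains max_weight.
    while lower <= max_weight:
        labels.append(f"{lower}-{lower + width}")
        lower += width
    return labels

# 65 is a hypothetical heaviest weight, used only to decide where the axis ends.
print(bin_labels(65, 10))
print(bin_labels(65, 5))
```

Switching the second argument from 10 to 5 doubles the number of labels on the x-axis, which is the choice the speaker is weighing here.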
Next, let’s sort our dog weights from smallest to largest. Then, in each bin, we put how many dogs are within that range. The first bin is 0-10 pounds. There is a 5-pound dog, 6-pound dog, 7-pound dog, and three 8-pound dogs for a total of six dogs in the 0-10 pound range. So the first bar for 0-10 pounds has a height of six. Next, count how many dogs are in the 10-20 pound bin. There are 8 dogs in this range. So the 10-20 pound bar has a height of 8. There are only two dogs in the 20-30 pound range so that bar has a height of two.
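The sorting and counting described above could be sketched in Python roughly as follows. Only the weights 5, 6, 7, 8, 8, 8, 31, 40, and 45 are stated in the transcript; every other value in `weights` is a hypothetical fill-in chosen so that the bin counts match the ones given (six, eight, and two). The helper name `count_in_range` is also an assumption.

```python
# Stated in the transcript: 5, 6, 7, 8, 8, 8, 31, 40, 45.
# All other values are HYPOTHETICAL, picked only to reproduce the stated counts.
weights = [12, 8, 45, 5, 19, 31, 8, 22, 10, 65,
           7, 15, 40, 6, 13, 27, 8, 16, 11, 18]

# Sort from smallest to largest, as in the video.
weights.sort()

def count_in_range(values, lower, upper):
    """Count the values v with lower <= v < upper."""
    return sum(1 for v in values if lower <= v < upper)

# The first three bins walked through in the transcript.
for lower in (0, 10, 20):
    upper = lower + 10
    n = count_in_range(weights, lower, upper)
    print(f"{lower}-{upper} pounds: bar height {n}")
```

Each printed count is the height of the corresponding bar.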
There’s a dog weighing 31 pounds and a dog weighing 40 pounds. Does the 40-pound dog go in the 30-40 pound bin or in the 40-50 pound bin? The convention is to put borderline values in the upper bin, so the 40-pound dog goes in the 40-50 pound range rather than the 30-40 pound range. In other words, each bin includes the bottom value of its range but not the top value, so with whole-pound weights the 30-40 bin effectively covers 30 to 39 pounds. This means our 30 to 39 pound bin includes only the one 31-pound doggo. Our 40 to 49 pound bin gets the 40- and 45-pound dogs. In our data set, there aren’t any dogs in the 50-59 pound range, so we’ll leave that bin empty, add our last dog to the 60-69 pound bin, and draw a bar with a height of one.
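One way to express the borderline convention in code is floor division: dividing a weight by the bin width and rounding down always lands a boundary value in the upper bin. This is a minimal sketch, and the function name `bin_bounds` is an assumption made for illustration.

```python
def bin_bounds(weight, width=10):
    """Return (lower, upper) for the bin holding weight, with lower <= weight < upper."""
    lower = (weight // width) * width  # floor division sends 40 to the 40-50 bin
    return lower, lower + width

# The three weights named in this part of the transcript.
for w in (31, 40, 45):
    lower, upper = bin_bounds(w)
    print(f"{w}-pound dog goes in the {lower}-{upper} pound bin")
```

Because the lower edge is included and the upper edge is not, no weight can ever fall into two bins at once.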
Congratulations! You now know how to draw a histogram for a set of data. Now let’s explore how different bin sizes affect the look of the histogram and the story the data tell. First, let’s reduce the bin size to intervals of 5 pounds. We can now see more details about the data, which is useful for the dogs between 5 and 20 pounds but looks a bit silly for the remaining weights where it just shows us that there is one dog in some of the other weight ranges.
Let’s try a larger bin size of 20 pounds. This conveys the main trend, which is that there are many more lighter-weight dogs, but it doesn’t allow us to differentiate between 0-10 pound dogs and 10-20 pound dogs like we could before. As you’re noticing, with histograms there’s no way to know where within each bin a data point falls, so we don’t know whether the one dog in the 60-80 pound range weighs 60 or 79 pounds. It seems the 10-pound bins are probably the best choice for this data set because they give a sense of the trend and also enough detail for someone considering adopting a dog.
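To compare bin sizes side by side, a text-only sketch like the one below can help. As before, only 5, 6, 7, 8, 8, 8, 31, 40, and 45 come from the transcript; the remaining weights are hypothetical values chosen to match the stated bar heights, and the function name `histogram` is an assumption.

```python
# Stated in the transcript: 5, 6, 7, 8, 8, 8, 31, 40, 45.
# All other values are HYPOTHETICAL, picked only to reproduce the stated counts.
weights = [5, 6, 7, 8, 8, 8, 10, 11, 12, 13,
           15, 16, 18, 19, 22, 27, 31, 40, 45, 65]

def histogram(values, width):
    """Map each bin's lower edge to its count, keeping empty bins."""
    top = (max(values) // width) * width
    counts = {lower: 0 for lower in range(0, top + width, width)}
    for v in values:
        counts[(v // width) * width] += 1
    return counts

# Print one text histogram per bin width discussed in the video.
for width in (5, 10, 20):
    print(f"bin width: {width} pounds")
    for lower, n in histogram(weights, width).items():
        print(f"  {lower:>2}-{lower + width:<2} {'#' * n}")
```

With a width of 20, the first bar merges the 0-10 and 10-20 bins into a single count of 14, which is the loss of detail the speaker describes.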
Have fun visualizing your data!

Published: 17 Nov 2020

Comments: 9
@streemguitar 3 years ago
The vocal tone and quality of the speaker is on point
@anishnitinachanta6680 3 years ago
Yours is an underrated channel. Great work, sir!
@Anttimation 3 years ago
The animations visualize it very well. Nice!
@RidleyE 2 years ago
Keep making videos, these are fantastic
@eriklokensgard2351 2 years ago
Excellent video!
@DOUBLEECaDA 2 years ago
Hi, thank you for sharing. What about a video on A/B split testing? Or ads testing: calculating the right sample size, choosing the right confidence level, calculating the probability of the same results if a test is repeated or more money is invested into a winning ad. I know it is more complex, but maybe it would be a good topic since there are no such videos on RU-vid. Cheers.
@sunildmello2610 2 years ago
Fantastic..liked the explanation
@oliverkowalski5838 6 months ago
Thx
@NickyMehula 4 months ago
Feels like watching an American teen movie😂