
violin plots should not exist 

Angela Collier
199K subscribers
200K views

Violin plots are never the best version of a plot. They are hard to read and bad.
Violinplot: www.stat.cmu.e...
Beanplot: www.stat.cmu.e...
Most plots from Harvard Open Courseware stuff: www.labxchange...
Patreon (join for exclusive video each month): / acollierastro

Published: 29 Sep 2024

Comments: 2.4K
@KingBobXVI 1 year ago
If I ever write a paper, I'm going to not use a violin plot, and I'm going to cite you for why I didn't use a violin plot.
@acollierastro 1 year ago
Great point. I demand citations every time someone avoids a violin plot in the future.
@bearsaroundhere 1 year ago
@acollierastro If I wasn't going to use one anyway, should I still cite you so that it's obvious why I didn't?
@ozymantiasVI 1 year ago
@acollierastro I'll do you one better: when I review a paper and it uses a violin plot, I'll ask the authors to replace it with a boxplot + histogram and cite your video.
@georgelionon9050 1 year ago
Instead of having people cite a video, you should write a paper. I suggest the title "Violin plots considered harmful" (because their past existence cannot be undone).
@xaxfixho 1 year ago
Getting passive aggressive, annoying vibes 😮 Transwomen are women 🙋‍♀️🙋🙋‍♂️
@AdrianBoyko 1 year ago
Hello! I am the creator of the Turnip Plot, which is similar to a violin plot but rotated around the vertical axis and rendered in 3D. Please cite my paper. Thanks!
@LimeyLassen 1 year ago
Sounds like a lot of accidental buttplugs to me 😅
@carpathianhermit7228 1 year ago
Why do you need a citation?
@ts4gv 1 year ago
@carpathianhermit7228 Turnip
@hsm4983 1 year ago
if u don't cite my beyblade plot paper in your turnip plot paper I'm citing u for plagiarism
@samsowden 1 year ago
hur hur hur looks like b*schrödinger'scat*tt pl*schrödinger'scat*g
@michaelfairchild6768 1 year ago
Violin plots are overused, but they have a use case for comparisons of a large number of samples that have complex distributions. We use them for this when comparing gene expression in cell populations. We can quickly see the 'shape' and get the vibe of the multimodal gene expression for large sample numbers.
@rikwisselink-bijker 11 months ago
Exactly, the point of the violin is just the vibe. The data is in the box plot inside of the violin.
@bordeterre5234 11 months ago
Wouldn’t ridge plots work in that kind of situation?
@s_de-x6r 8 months ago
each time she points to the void and a violin plot appears I absolutely and completely lose my shit they get uglier each time without missing a beat oh my god
@scritoph3368 10 months ago
the mad scientist plotting to take over the world cackles madly as he circles “violins?” on his list of evil plots.
@ronytakchi7199 8 months ago
I really liked the comparison you showed @33:57 that compares violin plot to ridgeline plot. That drove the message home.
@stephanieparker1250 1 year ago
So, I like all your videos for different reasons, but I have to say this one is my favorite. Fantastic points. Also, I had a (sorta) similar experience at work. Our boss usually brought snacks to meetings. She brought ice cream once. I was the only overweight person on a large team of fit, young sports nuts. So what does she do upon wrapping up her PowerPoint? She says, “there’s lots of ice cream left if anyone wants extra. Stephanie?” Everyone turns to me because of COURSE the fat girl will want more ice cream, right?! I shouldn’t have to give an excuse to why I don’t want any ice cream.. but I still had to explain I’m lactose intolerant. So I hate meetings where she brought snacks. Which is a shame.
@criznach 1 year ago
Most typewriters allowed you to apply whiteout without removing the paper. And some even had a white ribbon, to strike over incorrect letters. :D
@UffeHellum 1 year ago
Thank you for yet again making things so crystal clear! Agree 99.9% -- Beginner question: Why would I ever need units on a distribution? If I know that the area is always equal to 100%, then the shape and length (in this case the vertical axis), and smoothing, are the only three things that matter for a distribution? Admittedly, I do not have ANY math OR science background, but that tiny corner of the argument seems trite to me. I will absolutely never, ever use violin plots, over my dead body. I do not wish to be perceived as one of those guys, and I don't wish to feed that side of the room, ever. As a non-english, male speaker with no math background, I apologize for any language errors of mine.
@lux_incola4224 1 year ago
It's like a ruler without sub-markings on it vs with them as normal. If you know the whole thing is a certain length, you can estimate the length of stuff you put beside it sorta well. With the markers on it though, you can know how long the stuff is much quicker and much more accurately.
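To make the ruler analogy concrete, here is a minimal Python sketch. The height values and the helper name `histogram_density` are made up for illustration. The enclosed area is 1 whatever the units are, but each bar height is a probability per unit of the measured quantity, so it changes when the units change. Without a labelled axis, a reader cannot turn a height or width back into an actual probability.

```python
def histogram_density(values, bin_width):
    """Map each bin's left edge to a density so that sum(density * bin_width) == 1."""
    counts = {}
    for v in values:
        left = (v // bin_width) * bin_width  # left edge of the bin containing v
        counts[left] = counts.get(left, 0) + 1
    n = len(values)
    return {left: c / (n * bin_width) for left, c in sorted(counts.items())}

# Hypothetical heights, first in centimetres, then the same data in millimetres.
heights_cm = [150, 152, 155, 161, 163, 164, 168, 170, 171, 179]
heights_mm = [h * 10 for h in heights_cm]

density_per_cm = histogram_density(heights_cm, bin_width=10)
density_per_mm = histogram_density(heights_mm, bin_width=100)

# Both histograms enclose a total area of (approximately, in floating point) 1 ...
area_cm = sum(d * 10 for d in density_per_cm.values())
area_mm = sum(d * 100 for d in density_per_mm.values())

# ... but the bar heights differ by a factor of 10, because density is "per unit".
print(density_per_cm[150], density_per_mm[1500])
```

The same shape therefore corresponds to different numbers on the density axis, which is the information an unlabelled violin outline leaves out.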
@johnstoddart5523 6 months ago
I’m old enough to remember wet copying. And making stencils. You bloody think before you write. On the really old typewriter, you would backspace, find the error and scratch it out. You may or may not use a paper filler to reinforce the integrity of the paper. Old-time paper was a heavier grade, e.g. bond paper, for precisely this reason. You’re welcome to the good old days.
@BillTranmer 1 year ago
I love your videos. It's like if science was a city and you're a tour guide taking us to all the places where people get stabbed.
@Tekenduis98 1 year ago
Funniest description ever!
@donpietruk1517 1 year ago
Could you make a violin plot outlining the distribution density of those stabbing areas for us please? I'll show myself out now. 😂😂
@gravity_mxk5663 1 year ago
Omg I’m dead 😂
@samwiseshanti 1 year ago
Omg please tell me you didn't come up with that off the cuff, what a perfect description
@n20games52 1 year ago
Also known as: Violence Plots.
@richardurwin4432 1 year ago
Have you noticed that most graph paper uses a relatively faint blue colour? That is because photocopiers (before they became scanners) couldn't see blue well. If you were careful with the contrast, you could draw a diagram or a form on graph or squared paper and the photocopier would come out with only your lines on it. It was a great way to create forms, character sheets for RPG games or, indeed, the sorts of diagrams in an academic paper. Regarding the typewriter thing, you could get whiteout on paper strips. You backspaced over the error, poked the whiteout paper behind the ink-ribbon and hit the erroneous letter again. That deleted it and you could over-type again with the correction. The line spacing on typewriters is quantised so you can go up and down by exact lines easily. It's only when you noticed the error after you had taken the paper out that you would have to resort to liquid whiteout and fiddle to get the positioning correct.
@mikedavis979 1 year ago
In my typing class in high school (wow, am I really that old?), we used that method as well. We also had fancy Selectric-2's that had a built-in white-out strip! But I also remember just rolling the paper up a bit, applying white-out, letting it dry, and rolling it back down by the same amount. No need to take the paper out altogether. This would have been 1987 or 1988. The tail end of the typing class days, before keyboarding became standard, and Apple II's or Macs, etc., became cheap enough to replace typewriters, to teach typing. I mean, we had a few Apple IIs, but they were for "computer class", not "typing class".
@richardurwin4432 1 year ago
@mikedavis979 That was around a decade after I went to school. In my day only the girls did shorthand and typing. I've cursed that fact many times during my IT career. By the time I realised it would be useful to be able to touch-type, I was too fast without to be able to stick to learning.
@LathosZan 11 months ago
I had a typewriter from my dad that had a little like tape strip in it for corrections instead of white out, and you'd hit the backspace key to go back one character space and the delete lever would push the ribbons up so the correction tape was in line and you'd just smack the key for the offending character a few times until the tape pulled all the ink up, then you turn off delete and keep typing. You could even do neat stuff by combining some inked characters with other characters on the correction tape, like clearing a + across an inked M looked pretty cool, and it made good on-demand icons if you needed them for something.
@richardurwin4432 11 months ago
@LathosZan That was a carbon ribbon instead of an ink ribbon, a plastic film with a layer of carbon on it. The rich people like important executives used those. They produced more professional-looking text because the ink didn't bleed into the paper. But the ribbon could only be used once; when the shape of a letter had been transferred onto the paper, it left a transparent hole in the tape. There was industrial espionage where people stole the secretaries' used typewriter ribbons and read the documents they'd been typing from the ribbon. Carbon ribbons were expensive and had to be regularly replaced. Ink ribbons just reversed themselves each time through and you kept using them until your text looked too faint. If you were enterprising you could even re-ink them yourself, in much the same way that you can re-fill ink cartridges today.
@LathosZan 11 months ago
@richardurwin4432 Neat!
@TalysAlankil 1 year ago
i spent 27 minutes going "okay is she going to say they look like vulvas" and then felt very validated
@ookazi1000 10 months ago
Yeah, I was nodding along to the mechanical critiques of the plots, and realized about three-quarters of the way through the first half (I'm a bit slow about these sorts of things cause I'm an asexual cis man who ain't get none and ain't want none neither) that oh yeah, these do kinda look like genitals (and it only clicked cause one of em kinda looked like a penis and that made it click that the rest of em look like vulvas) and was like: Huh, That's weird: I wonder if anyone else noticed that, and if she's gonna mention it at all?
@luckyape 9 months ago
spoiler alert
@GSBarlev 9 months ago
Meanwhile I'm here getting distracted by the ones that look like stingrays. 🤷‍♂️
@fburton8 8 months ago
I call them snot plots because to me they look like gloopy boogers such as you sometimes see in kids with colds.
@nickcarroll8565 6 months ago
@GSBarlev The shrinks are going to have a field day with you 😂
@davichk 1 year ago
I used to work at a tight tolerance thin film deposition optics manufacturer. One day my current supervisor visited my workstation unexpectedly and asked, "you haven't been using violin graphs in any of your report generators, have you?"; "Never heard of such a thing. Why?"; "Good. I knew you were a smart one. Just don't. They're useless anyway." ... Later she quietly explained their inappropriateness. Turns out she wasn't just trying to prevent her team from using them either. The owner got top management together and instructed them to purge any current work of their existence. HE was smart.
@SuND4a1 1 year ago
This story is so wholesome.
@SlenderSmurf 10 months ago
Inappropriateness? Did they mean from a scientific or a social perspective?
@luna010 1 year ago
The technology exists for us to put 3D objects into PDFs. For this reason, I propose the vase plot. The diameter of the vase is the probability density. Or maybe the cross-sectional area is the probability density. It is intentionally ambiguous. Please cite my comment whenever you use my vase plot in your paper.
@jainabraina 1 year ago
The inner diameter or outer diameter or cross sectional area (selected at random when you run the program, no the module is not seedable) of the vase is the probability density. uploading this to npm and pypi asap
@marcellarisa7239 1 year ago
Buttplug plot
@rojnx9 1 year ago
I suggest a 4 dimensional plot, called the aerofoil plot, where the aerodynamic drag coefficient of each 3 dimensional cross section of the 4d shape determines the probability density. Also every single 3d cross section is vaguely penis shaped. Please cite me
@rbr1170 1 year ago
How about a 4d object casting a 3d shadow? The orientation of a 4d object projects n 3d shadow where n gives the spectrum of possible densities depending on the k-orientation of a complex data mapped onto a 4d-manifold.
@vaporisedair4919 1 year ago
Call it the amphora plot, to get the Greek creds
@bilbobaggin3 1 year ago
Librarian here: Physical Journals/books/etc. are often preferred when it comes to preservation and access, since when we buy a physical copy, we won't lose access to it when the publisher decides that we need to pay another $400 per user to access their system, or if a ransomware attack takes out the archive. Obviously, storing all these is impractical for all but the largest library systems, but the number of times someone needs to cite an article in the 2003 summer edition of Phrenology Today means that you can get away with only one copy in an offsite, climate controlled storage space. The number of times I've had to do dumb piracy shit because a publisher literally pulled access to a non-downloadable article I was using for a paper *mid semester* was infuriating. Not to mention the time my school literally stopped paying our publisher liaison because they upped their rates by a factor of 10. We had a lot of 3rd/4th generation photocopies floating around that year, lmao
@Emilio1985 1 year ago
I love love love my campus library staff, at every campus I've been affiliated with. You all rock! And there's something so nice about just walking through the stacks to flip through journals rather than just scrolling and clicking through online lists of volume/issue numbers.
@trespaul 1 year ago
god bless Alexandra Elbakyan
@AR0ACE 1 year ago
Phrenology Today lol
@martinnovacek9151 1 year ago
@trespaul She's the best. I really miss the old, up-to-date scihub tho :(
@obrotherwhereartliam 1 year ago
Public access now!
@blakethomson7901 1 year ago
The kooky patterns make it so that colorblind people can read the histograms. I'm red-green colorblind, and I get really passionate about data visualization, partially because I have trouble with certain color visuals that other people don't struggle with, and partly because data visualization is a very efficient way to communicate info when done correctly, and I also have adhd, so I appreciate their communicative strengths. But yeah, all that to say, I really appreciate when kooky patterns are included. It immediately tells me that the person who made that visual is thinking about how their data will be perceived and wants to communicate it effectively to as many people as possible.
@tglittle3166 1 year ago
Designing data representation and slides for talks with color blind people and dyslexics in mind is a drum that I always have to beat with my trainees. A lot of the arbitrary complaining that people do about visualization and typeface etc is actually pretty ableist. Fun example is that comic sans is actually one of the easier fonts for people with dyslexia to read.
@QuentinWes 11 months ago
Interesting that it helps with colorblindness. I always ran into it in school worksheets and tests. They were printed in black and white so needed a way to differentiate between grey and slightly lighter grey, and we all just had pencils to make them ourselves. It became a bit of a competition as to who could come up with the weirdest patterns for their bar charts that were still distinct. Always nice when adaptations for specific things end up being useful for unintended reasons
@TheMvlproductionsinc 11 months ago
@tglittle3166 The Comic Sans thing is a myth, look it up. The same goes for fonts specifically designed for dyslexia, like OpenDyslexic. More important than anything is font size.
@charlesloeffler333 10 months ago
Yes, the data visualization field should deliberately popularize plot color selections that don’t penalize the red-green folks. Or, encourage thinking about using line and hatching types that don’t require colors to distinguish
@hasch5756 10 months ago
This is called hatching and there's a whole system behind it. It comes from heraldry and was developed in the 17th century back when there was already significant demand for printed illustrations but printable coloured ink was not yet invented. Basically, you have hatching patterns representing six basic colours; red which you represent with vertical lines, blue which is horizontal lines, yellow (or gold) which is dots, green which is diagonals from top left to bottom right, purple which is diagonals the other way, and black which is a grid. You get pastel tones by replacing the lines with dashes, and you can mix colours by overlaying the hatching of both, for example dashed verticals give you pink and if you interleave that with dots, you get orange
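The convention described in this comment is essentially a lookup table, which can be sketched in Python as below. The pattern strings borrow matplotlib's `hatch` notation; that pairing, and the helper names `hatch_for` and `combine`, are assumptions of this sketch rather than anything the comment specifies.

```python
# Colour-to-hatching table as described in the comment above.
# Pattern strings use matplotlib-style hatch characters (an assumption of this sketch).
HERALDIC_HATCH = {
    "red": "|",      # vertical lines
    "blue": "-",     # horizontal lines
    "yellow": ".",   # dots (also used for gold)
    "green": "\\",   # diagonals from top left to bottom right
    "purple": "/",   # diagonals the other way
    "black": "+",    # grid
}

def hatch_for(colour):
    """Return the hatch pattern for a colour name, or '' (no hatching) if unknown."""
    return HERALDIC_HATCH.get(colour.lower(), "")

def combine(*colours):
    """Overlay the hatchings of several colours by concatenating their patterns."""
    return "".join(hatch_for(c) for c in colours)

print(hatch_for("green"), combine("red", "blue"))
```

If matplotlib is available, a string from this table can be passed as the `hatch` argument of a bar or patch so that series stay distinguishable without relying on colour.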
@VivekPatel-ze6jy 1 year ago
Your story about everyone turning to look at you as the only woman... it gave me a flashback to school sex-ed where the teacher (to be inclusive) said "or a**l sex" and people turned to look at me which basically outed me to the teacher. But also what reaction were they expecting from me lmaoo
@GSBarlev 9 months ago
This was a great reminder about effective strategies for allyship. Any of the other ten guys in the room could have made the response she described, and it wouldn't have even seemed like White Knighting. That they turned to her was definitely motivated by empathy, but the effect is the same as what happened to you.
@eddieantonio 1 year ago
During my masters, my advisor actually ENCOURAGED the use of violin plots, and I didn't really question it at the time. Rude jokes were made. I have a publication that has FIVE SEPARATE PAIRS of violin plots (and plots that cluster points into hexagons? for some reason?). And you're right! Side-by-side histograms would have been BETTER and MORE COMPACT 😭😭😭
@HoneyBadgerLikesYou 1 year ago
Angela utilizing her platform to indoctrinate the public against evil plots in an intellectual crusade is what I live for
@livingroomviewing2987 1 year ago
That part.
@werawerlnwerlnrlnelr 1 year ago
the plot thickens
@idontwantahandlethough 1 year ago
that was quite clever
@benedixtify 1 year ago
Evil plots
@zimbu_ 1 year ago
What do we want? Good data visualization! When do we want it? Now!
@christopherknight4908 1 year ago
Yes to the symmetry argument. I spent the entire video trying to figure out why they were twice the size they needed to be.
@brindlebucker4741 1 year ago
I'm not a scientist. Just a humble welder. I like your videos. I had no idea where you were going with this, but you had me nodding along, thinking, 'I get it. These violin plots are stupidly complicated and are not as efficient as other plots.' I get it! And you know, not being involved with the sciences, I didn't actually care about the plots one way or the other, but I could see how it would annoy an actual scientist/researcher. Then it finally got where it was going, and I was like, 'Damn! That was masterfully done.' Keep up the great work.
@lost4468yt 11 months ago
"humble welder" - I like how you just had to validate your stereotype of "how do you know someone's a welder? They'll tell you" on the first line.
@teremleonheart3776 10 months ago
@lost4468yt Just like vegans and horse girls.
@NightsReign 10 months ago
@teremleonheart3776 When you say "horse girls" are you meaning equestrians, or is this some new horror? 🤔
@teremleonheart3776 10 months ago
@NightsReign this comment honestly made me chuckle for a good bit. I mean girls that are super duper into horses, n have posters of em and backpacks with horses on it, that typa horse girl lmao 😂😂😂😂
@CrownRock1 8 months ago
@teremleonheart3776 And now I'm laughing about the connection between welders and horse girls that I had never seen before, but can't unsee now.
@erdvige 9 months ago
As a data analyst, you've made amazing points for not using violin plots--scientifically. But in the business world, violin plots are **pretty** and the vibe is how you get execs to make the decisions in favor of what you want 😂😂😂😂
@GiovanniBottaMuteWinter 8 months ago
But do the execs understand those plots or do they just think they are cute?
@daniel6678 7 months ago
@GiovanniBottaMuteWinter I think that’s the point they’re making… they’re saying that in the business world, the information doesn’t matter so long as it’s presented in an attractive way.
@londonalicante 6 months ago
@daniel6678 Executives are only interested in the overview. They will naturally gravitate towards the violin plot for the overview data and assume the technical people will handle the details in the histogram. We make judgements on how relevant info is to us all the time, and pretty is not necessarily better. For example the more visually appealing a letter through my door is, the quicker it will be identified as junkmail and be thrown away.
@spuriusbrocoli4701 3 months ago
Head on the nail, tbh. As I was learning data science post-BA, I was struck by how differently academics are expected to visualize info vs how you need to present that same data to "general audiences", i.e. rich kids who coasted into "business".
@robertaylor9218 3 months ago
I’m a super lay person. My problem is that it is about the only plot that I’ve seen that doesn’t intuitively communicate anything.
@GeoQuag 1 year ago
The comment at the end about “why have two flaps” is the most upsetting part of them to me. The only time I’ve seen something like this (where they used histograms instead of smoothing, so less yonic) that seemed even a little defensible is the man/woman population age plots for different counties. It’s still probably better to arrange them differently but at least they were using the two sides for something.
@the_mad_fool 1 year ago
Honestly, those would also just be better if they had both on the same side, as then you can compare them properly....
@GeneralTaco155555a 1 year ago
Exactly. What is the point of having your data mirrored onto both sides? Your smoothed out histograms were so nice you had to show them twice? BS. I do see how stacking histograms can get cluttered, but as you pointed out: if the goal is to compare histograms without stacking them, then make an asymmetrical violin chart with labeled axes so you can actually interpret the data.
@Appletank8 1 year ago
The one sorta viable use case that just uses the other side for something useful, but they also don't just smooth it out. Ex. population age plot between men and women, Vertical axis is age, horizontal axis is pop count. men on left bars and women on right bars.
@cadosian078 1 year ago
I thought population pyramids were good ways to visualize the data honestly.
@Qwicksilver 1 year ago
Saw this on Polymatter and I thought the same thing. That’s the one viable use of this data visualization technique. But there it’s just two histograms turned on their side and placed opposite one another.
@user-td3yi1mq7p 1 year ago
This video sort of gave me the urge to come up with plots that are even more cursed than the violin plot. Like a stick figure plot where different aspects of the data set are represented by the size and orientation of the body parts.
@raygivler 1 year ago
It exists. Pie charts.
@nerdinleather 1 year ago
@raygivler nah this is like you took a pie chart and cursed it
@crzyprplmnky 1 year ago
Can you pull up the latest set of Homunculus plots please? I'm a bit concerned about some outliers I saw 😂
@PokeCube_ 1 year ago
what about a scatter plot in audio form? you map numerical values to hertz values, and instead of coloring points and lines, you give them an instrument. like a guitar note would be played for each data point, and a violin is played to show the line of best fit. i'd call it a song plot
@L3X1N 11 months ago
@PokeCube_ Symphony plot?
@RandyGoble 1 year ago
When I was in undergrad getting my degree in economics, I saw these every once in a while when conducting research for papers. Turns out I wasn't an idiot for not being able to read these plots, I was just an idiot for getting a degree in economics.
@Gersberms 1 year ago
I thought a degree in economics was a guaranteed job at Amazon. Is that still a thing?
@therealpbristow 1 year ago
@Gersberms If true, that's the best reason I've heard for *not* getting an economics degree! =:o\
@Bozebo 1 year ago
@Gersberms Isn't Amazon fundamentally bad for the economy, so they're guaranteed to only hire bad economists? xD
@TessHKM 1 year ago
@Bozebo it depends on if you view "the economy" as something that's meant to serve small businesses/producers or something that serves consumers.
@azlanadil3646 1 year ago
@TessHKM I think “the economy” is generally meant to serve everyone. Amazon is obviously not good for producers and small businesses, but it is also bad for consumers in the long term. Yes, it does provide them cheaper goods, but it also results in wealth being funnelled out of communities, which kills small towns. It also results in worse working conditions and lower pay for people in cities. Overall it’s a net negative to the living standard of the average person.
@IntuitiveAndExhaustive 1 year ago
Hello! I like violin plots a lot; Im a data science researcher. A violin plot displays not only easily comparable mean and quartile information, but also more granular information about the shape of tge distrobution. This gives violin plots a unique ability to intuitively inform certain dicisions about further analysis, especially when youre exploring new data with numerous distrobutions for the first time. Histograms have an issue of sharing the same axis, which, when trying to understand intricacies of distributions, can be difficult to read. Box plots are easy to read but can obscure information, maybe leaving readers to question if the choice of a box plot was appropriate. A violin plot allows you to render an easily interpretable plot which lays bare qualitative aapects of the underlying dustribution. This not only allows for easy analysis via the box plot, but also high level qualitative understanding. I never, when i read a violin plot, care about the scale of the distribution, but the shape, which i think they do fairly well. Of course, when publishing I may or may not use them. I find them incredibly good for visualizing data exploration, and like to use them when explaining datasets moreso than results. On the point of smoothing, totally. Thats why ive gravitated towards swarm plots for general qualitative distribution understanding. But, smoothing is an issue within itself, histograms have essentially the same exact issue in terms of bin size. Also, worth noting, im colorblind af, so overlayed color infornation may as well be jibberish to me, which might be part of the reason i hate overlayed histograms so much.
@benprytherch9202 1 year ago
I'm not colorblind and I also can't read an overlaid histogram. Way too cluttered, and in my mind I'm trying to imagine what they'd look like not overlaid.
@IntuitiveAndExhaustive 1 year ago
I promise I'm not as stupid as my spelling suggests.
@TheManifoldTruth 1 year ago
Honest question, but why not just use staggered histograms then? What does the rotation and mirroring add? Including information on medians/averages (and quartiles if you really need to, but if you have the histogram there anyway why would you) could be done in pretty much any format you choose.
@IntuitiveAndExhaustive 1 year ago
@TheManifoldTruth That's a great question, and the honest answer is it's really convenient to plot violin plots with seaborn. Another more honest answer is the aspect ratio of monitors. While they don't have to, histograms have their density along the Y axis, meaning, if you have a lot of distributions you want to compare, it's easier to fit violin plots, which orient things horizontally. Yes you could just rotate the histogram horizontally, but the love of making the "perfect" vs the "good enough" plot starts to die out around your 10,000th plot in your career. Another, maybe more satisfying but less honest answer, is that the mean and standard deviation of the distributions are useful in comparison, and they come out of the box in most violin plots. Really, the debate around this feels like the debate around the Oxford comma; strong opinions around "rules" which are really well entrenched but still arbitrary preferences. I don't have any evidence to back this up, but I wouldn't be surprised if violin plots were more common in more data rich and fast moving research domains like data science rather than physics. In data science, making a plot that's good enough quickly is way more attractive given the sheer volume of visualization required in the domain. I have to note though, a lot of my takes make a lot more sense in a business context, rather than an academic context. Papers take a long time to make, so having a sub-par plot makes much less sense.
@tglittle3166 1 year ago
Came here to say many of the things that you said. Thank you for saying them more completely.
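The point raised in this thread, that smoothing has the same tuning problem as histogram bin size, can be illustrated with a minimal Python sketch. It builds a plain Gaussian kernel density estimate by hand and counts the peaks for two bandwidths. The sample values, the bandwidths and the helper names `gaussian_kde` and `count_peaks` are illustrative assumptions; plotting libraries pick bandwidths by their own rules.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a density function built as an average of Gaussian bumps, one per data point."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return density

def count_peaks(f, lo, hi, steps=200):
    """Count strict local maxima of f on an evenly spaced grid over [lo, hi]."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    vals = [f(x) for x in grid]
    return sum(1 for i in range(1, steps) if vals[i - 1] < vals[i] > vals[i + 1])

# Made-up data with two clear clusters, around 1 and around 5.
sample = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]

narrow = count_peaks(gaussian_kde(sample, bandwidth=0.5), -2, 8)
wide = count_peaks(gaussian_kde(sample, bandwidth=3.0), -2, 8)
print(narrow, wide)
```

With the narrow bandwidth both clusters show up as separate bumps, while the wide one merges them into a single bump. A violin outline drawn from the second estimate would hide the bimodality that a reasonably binned histogram of the same data would show.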
@ealloc 1 year ago
An alternative to a violin plot is a "beeswarm" plot: Instead of a smoothed density you plot each individual datapoint as a dot at its exact y-value, and the x-values of the points are chosen so the dots don't overlap, causing y-coords with a lot of dots to bulge out. I like them because you can simultaneously see the raw datapoints, and also see the broad distribution. One problem is that in naive implementations you get chains of points extending out from the center in a line, giving a christmas-tree appearance. But good implementations can avoid this.
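A minimal Python sketch of the layout idea described in this comment: every point keeps its exact y-value, and its x offset is pushed outwards until the dot no longer overlaps one that is already placed. The function name `beeswarm_x`, the dot diameter and the data are assumptions made for illustration. This naive version only tries offsets in whole-diameter steps, so it is coarser than what real plotting libraries do and can show the line artefacts the comment mentions.

```python
import math

def beeswarm_x(ys, diameter=1.0):
    """Assign an x offset to each y so that no two dots of the given diameter overlap."""
    placed = []              # (x, y) centres of dots already positioned
    xs = [0.0] * len(ys)
    # Place points from lowest to highest y, keeping track of original indices.
    for i in sorted(range(len(ys)), key=lambda k: ys[k]):
        y = ys[i]
        step = 0
        while True:
            # Candidate offsets in order: 0, then +d and -d, then +2d and -2d, ...
            candidates = [0.0] if step == 0 else [step * diameter, -step * diameter]
            free = [x for x in candidates
                    if all(math.hypot(x - px, y - py) >= diameter for px, py in placed)]
            if free:
                xs[i] = free[0]
                placed.append((free[0], y))
                break
            step += 1
    return xs

ys = [1.0, 1.0, 1.0, 5.0, 1.2]   # made-up data: a crowded region near 1 and a lone point
xs = beeswarm_x(ys)
print(list(zip(xs, ys)))
```

The crowded y-values end up spread sideways while the isolated point stays on the centre line, which is what makes the bulge of a beeswarm readable as density while still showing every raw data point.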
@HunchbackJack 1 year ago
I'm an old man, so I know something about typewriting erasure techniques. In rough order of technological advancement:
1. Hand-erasing with an ink eraser.
2. Liquid ink eraser solution, in a bottle. You would apply this to the typo on the page and it would break down the ink and fade it somehow.
3. An "eraser strip" as part of the ink ribbon, where you would retype the offending letter using the strip and it would abrade/absorb/dissolve the ink.
4. Hand painting over the typo with white-out (from a bottle).
5. Hand-held whiteout strips, with dried whiteout on one side. You would retype the letter with one hand, holding the strip against the page where the hammer hits with the other hand.
6. Whiteout strip built into the ink ribbon. Same as the whiteout strip above, but you don't need to hold the strip; it's part of the ink ribbon.
Most solutions did not require removing the paper, because you can *never* get it aligned again. There's typically enough space where the hammers hit the page for you to get in there with whatever erasing solution you're trying to use.
@RealDevastatia 1 year ago
IIRC, the IBM Selectric III would automatically retype the previous character with the whiteout ribbon when you pressed the backspace key.
@RealDevastatia 1 year ago
I'm amazed the spell checker didn't ding me for spelling "whiteout" without a hyphen. It's always "correcting" words that don't need correcting.
@artfuldodger5933 1 year ago
Neat! Thanks for sharing some history!
@tlecoyotl 1 year ago
Being a '90s child, I still managed to use both mechanical and electric typewriters. I had forgotten those whiteout strips! In my mind, those used to come in little red plastic boxes, kinda like chewing gum packs.
@delusionnnnn 1 year ago
Those Selectrics with a delete key and a special ribbon were the best! I forget which technology they used - "glueing off" a plastic based ink, or a white-out ribbon. I think they used white-out if I remember correctly, but the few that used a plastic based ink that you could glue off within a few seconds were an absolute delight since you could use that technology on almost any paper and it didn't matter what colour the paper was.
@martinnovacek9151 1 year ago
This channel really feels like having a cool older PhD friend who tells you all the secret tips and tricks and cool stories in academia
@tuomasmassa2954 1 year ago
Exactly! ❤
@castroski7 1 year ago
It's the best
@Marc42 1 year ago
Spot-on!
@davido2644 1 year ago
Thanks, you really summarised the feeling so well! Love this channel ❤
@ubahfly5409 1 year ago
Who you callin' "old", buster?
@sageanastasi2028 1 year ago
Related to that very last point about "why not just make it half the graph": when they made us do violin plots in high school biology, we were told to put *half* the data on each side so that the width of the *whole* thing matched the amount of data. Which is absolutely not how anyone else does their violin plots. Also we had to do them by hand, which is as excruciating as it sounds.
@SlenderSmurf 10 months ago
I thought that was how they were drawn as well. So that the area of the shape has an actual meaning, which is something intuitive to look at. Although now that I think about it there is no x-axis so doubling or halving all of the widths doesn't change anything.
@FordFourD-aka-Ford4D 1 year ago
You'd *apply whiteout w/ the page STILL inside* the typewriter, wait about 1 minute, and then use the *backspace* key to shift _back a space_ to your mistake so you can apply new ink over the dried whiteout. Later on there were typewriters that could apply the whiteout for you. Usually electric ones. There were some other more esoteric solutions too! But yeah, most people just applied something directly to the paper and shifted back a space. Calling it "backspace" on a computer keyboard is one of many holdovers from the typewriter days. So is the stubborn yet incorrect convention of double-spacing after a sentence. (Double-spacing is essentially a typewriter trick/convention that makes things easier to read because periods are so small and on some typewriters don't offset enough. Single space after a sentence has ALWAYS been typographically correct in the world of typesetting books, plus print & graphic design.) We use a lot of old terms and symbols that don't apply anymore. Like saying "rolling" when a camera starts recording, which comes from the early days of film when there was a step to roll the film. Same way that many save icons are still simplified shapes of floppy disks: lots of kids grow up associating that shape with "saving" without actually knowing it's a real physical thing. Or how we associate the power symbol with turning things on (it actually was originally a standby-reset symbol or something, but that's a whole different conversation).
@NoeLPZC 1 year ago
You have a source for that last paragraph? I've always heard the power symbol was a combination of 0 and 1 - a binary toggle for on/off.
@MissaBrevis 11 months ago
@@NoeLPZC you're both right - the 0 and 1 do represent binary states, but the version of the power symbol with the line crossing the circle was originally a standby symbol. If I remember correctly it was meant to indicate something like what we'd call sleep mode as opposed to turning something all the way off and on. The actual power-on-off icon was supposed to be the line totally within the circle, not crossing it. It's even still used in some specific cases now - I work in a lab and we have vortexers that have marked switch positions for on (line), off (circle) and touch-activated (line breaking circle) modes.
@adora_was_taken 10 months ago
actually most correcting typewriters had an adhesive ribbon that would lift the letter off the paper. there's a fun technology connections video about it
@mykal4779 7 months ago
this concept is called a skeuomorphism
@obrothernotagain4668 1 year ago
I remember vividly how my advisor emphasized how he read papers: title, authors, abstract then plots. If the plots were compelling then he'd dig in. Those plots absolutely need to tell a succinct and coherent story.
@halfstep44 1 year ago
Interesting point. I've always thought of the graphs, plots, whatever they are, as being a side dish. What you said reminds me of my father telling me to always start with the maps in a book of military history, then decide if you want to purchase that book. Similar reason.
@richardbloemenkamp8532 1 year ago
Usually the quality of plots is very quick to evaluate. So if you have limited time and a lot of papers, it makes sense to judge a bit on plot quality before reading the whole paper. However, many plots require some significant explanation of the measurement system and conditions. Therefore, after a quick look at the plots, I often go back to the text.
@NameName-u9e 1 year ago
"You can't actually get data, you're just getting vibes." Brilliant.
@Dongobog-ps9tz 2 months ago
The whole point of a data visualisation is to extract vibes from numbers though surely?
@NameName-u9e 1 month ago
@@Dongobog-ps9tz Vibes are definitely not the whole point of a plot, but a nice feature if you have a well constructed one. You should still be able to recreate the original data from whatever visualization you end up using not only so other people can try to find other useful features in the dataset, but they can verify the plot actually matches your original data. If your visualization is "vibes only", it's a marketing gimmick, not a useful research tool.
@Dongobog-ps9tz 1 month ago
@@NameName-u9e A plot is a lossy compression where you're trying to turn the data into something human-readable. Maybe we have a different definition of vibes, but all I'm talking about is that the plot tells a story with the data. It can be misleading and entirely truthful.
@jasontracey3416 1 year ago
I'm convinced people only use them because they look vaguely sexual
@KillerOfWhales 1 year ago
A pretty good reason tbh
@wtfpwnz0red 1 year ago
I never heard the name until today. I always imagined them as vulvas with misplaced clits and thought when they showed up it was scientists being juvenile somehow
@GelidGanef 1 year ago
I'm convinced they're only called violin plots because they couldn't get a paper to publish "vagina plot"
@Emilio1985 1 year ago
And STEM is still largely a boys-club, so there is increased tolerance for anything that even subtly makes non-men uncomfortable.
@bulldozer8950 1 year ago
To be fair, if there was a plot that looked like a dick, men would certainly also use that just because it looks like genitals.
@vsiegel 10 months ago
My theory: The inventors of the violin plot were *literally trolling,* and call it internally the *pussy plot.*
@nick_eubank 1 year ago
One principle of peer review is that we shouldn't just assume authors are analyzing their data correctly. I appreciate violin plots because they provide the reader reassurance that the use of box plots is appropriate. Absent the density overlay, I worry (and sometimes rightly) that the authors are using box plots in inappropriate contexts (as evidenced by the fact that one sometimes sees multimodal distributions in violin plots in papers).
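A small sketch of that worry, using only Python's standard library: two made-up samples, one with a single peak and one with two, that share exactly the same five numbers a basic box plot is built from. The sample values and the helper name `five_number_summary` are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import quantiles

def five_number_summary(data):
    """Minimum, quartiles and maximum: the numbers a basic box plot is built from."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))

one_peak = [1, 3, 4, 5, 5, 5, 6, 7, 9]    # values pile up in the middle
two_peaks = [1, 4, 4, 4, 5, 6, 6, 6, 9]   # values pile up at 4 and at 6

# The two box plots would be drawn from the same five numbers.
print(five_number_summary(one_peak))
print(five_number_summary(two_peaks))

# The most frequent values show where the samples actually differ.
print(Counter(one_peak).most_common(1))
print(Counter(two_peaks).most_common(2))
```

Any view of the full distribution (a histogram, a density curve, or the raw points) would separate these two samples; the box alone cannot.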
@mikedavis979 1 year ago
I agree, although I do agree with Dr. Collier that violin plots are less aesthetically pleasing. Plotting semi-transparent points over a box plot sometimes can work. Perhaps everyone should make a separate violin plot for reviewers, as well as box plot or something else. Hmmm....
@davidjohnston4240 11 months ago
I prefer to see a test of gaussianness (the actual test name escapes me right now). Then you have a one liner saying "yep, boxplots are good here". No wasted paper.
@toastedbread5985 10 months ago
@@davidjohnston4240 I think you are referring to a Q-Q line plot? It gives a quick indication of whether the data is normal, and of any possible skew, at a glance. They are also very useful for comparing goodness of fit between distributions.
@danielhicks1824 10 months ago
@@davidjohnston4240 normality lol
@mitchellsteindler 9 months ago
Just use a histogram
@ubahfly5409 1 year ago
A violin plot to overthrow the physics department!
@LimeyLassen 1 year ago
There's no need to resort to violins!
@offensivebeefroast5407 1 year ago
Let me get the band together
@PinataOblongata 1 year ago
Plot twist!
@leehurst172 1 year ago
As a non-scientist, I've always thought these plots were confusing and just obviously above my pay grade. Very validating to hear that they are indeed as uninformative as I thought they were. Much appreciated❤️
@thefaboo 1 year ago
Same! I felt the same about radar plots for a long time until I found out there's no real consensus on how to read those either 🙃
@zorinzorinzorin5243 1 year ago
This is such an important video. I remember that one of my high school textbooks had some stupid plot (that I now understand to be a violin plot) that the author loved to use. That book could have been half-a-pound lighter if they just took them out.
@leehurst172 1 year ago
@@thefaboo yeaahhhhhhhh radar plots are cool until you realize the area can be altered by how the spokes are ordered lol. It's just a multi-variable plot with connections between each percentage for no real reason
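A quick sketch of the spoke-order effect in plain Python, assuming evenly spaced spokes and made-up scores. The helper name `radar_area` is hypothetical; it adds up the triangles between neighbouring spokes.

```python
import math

def radar_area(values):
    """Area of the polygon a radar chart draws, with spokes evenly spaced."""
    n = len(values)
    # Each pair of neighbouring spokes spans a triangle of area 0.5 * a * b * sin(angle).
    angle = 2 * math.pi / n
    return 0.5 * math.sin(angle) * sum(
        values[i] * values[(i + 1) % n] for i in range(n)
    )

scores = [1, 5, 1, 5]        # hypothetical scores, in one spoke order
reordered = [1, 1, 5, 5]     # the same four scores with the spokes rearranged

print(radar_area(scores))     # about 10
print(radar_area(reordered))  # about 18
```

Same four numbers, nearly double the shaded area, purely from which spokes sit next to each other.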
@Kevin_the_Caveman 1 year ago
The whole point of data visualisation is to make it easy to understand, otherwise you'd just dump raw data in table format at people, so your POV is perfectly valid. Of course, depending on context, if you are writing something to be read by people familiar with the topic instead of the general public, you can go a bit spicier on the complexity, but it should always be as simple as possible
@ShankarSivarajan 1 year ago
On the contrary, they're information-dense, combining histograms and box-and-whiskers plots for multiple sets of data. They're only uninformative if you decide not to read them.
@punkinholler 1 year ago
My grad school used to have professional plot makers on staff. It was long before my time but the space they worked in was still there and there were some people still working there who remembered them.
@acollierastro 1 year ago
Professional plot makers! I love that.
@robertadsett5273 1 year ago
Pretty much the same for me. There were still a few around but they were on the way out. Images had to be pasted into place
@thosewhowish2b693 10 months ago
The colors on the graph at 20:37 are really really hard to tell apart for people with deuteranopia (some 6% of males). These pastel colors are hard, it's much better if they are very definitely yellow, or blue, or red, or grey, etc. Just wanted to chip in, since we're talking about it already.
@jamesstevenson1766 1 year ago
I've always felt vaguely guilty as a scientist for never using the violin plot functions in any plotting tool - thank you for lifting this weight from me.
@wraithwrecker_ 1 year ago
I thought, "Well they look funny, but surely there's a reason why they'd be useful!" And then at the 8-minute mark, you finish explaining how you make a violin plot and I'm like, "Okay but why would you do that though???" I think it's a terrible plot already and there's still over 30 minutes of reasons to listen to. Brilliant!
@iesmeh 1 year ago
I am not a scientist. I have never heard about violin plots before today. But now I know about them and why they are mostly useless. I cannot overstate how much talent you have for making subjects like these interesting. A lot of the time, I pause science videos while alt tabbing to other things, taking in the videos in chunks. I always seem to watch yours straight through, beginning to end. The way you break up your videos with music and title-cards really helps make them digestible. Thank you!
@thecynicalone7655 11 months ago
As for the violin plot joke thing: those are difficult social choices. A classic is to look confused and ask why the joke is funny in a very sincere way. Another way that tends to work for me is to just say "dude, c'mon", as that puts the onus squarely on them. I do find it very helpful after a moment like the one you described to reflect on what I could have done differently while keeping my goal in mind. Generally speaking, with stuff like this, the best approach is to deflate the other person, so to speak. I wish you all the best in your future strange and awkward social interactions.
@РостиславНізіньковський
About papers 100 years ago: based on the memoirs of Stephen Timoshenko, it seems that there were special people at universities who prepared plots. You would give them hand-sketched drawings and they would then prepare versions for a paper.
@welcomeblack 1 year ago
I think the type of smoothing they're doing is called a Kernel Density Estimate. They didn't teach us about KDE plots in physics classes because, as you say, they mostly just show vibes, but it's still better than the arbitrary-window smoothing you're suggesting they do. See the Seaborn documentation for violinplot
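A minimal sketch of a Gaussian kernel density estimate in plain Python, just to show what that kind of smoothing does. The data and the bandwidth value are made up, the helper name `gaussian_kde` is an assumption for this example, and this is not seaborn's implementation.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, each centred on that observation.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density

data = [1.0, 2.0, 2.5, 4.0]                 # hypothetical measurements
density = gaussian_kde(data, bandwidth=0.5)

# Riemann sum over a wide grid: a proper density should integrate to about 1.
step = 0.01
total = sum(density(-5 + i * step) for i in range(1500)) * step
print(round(total, 3))
```

The bandwidth plays the role that bin width plays for a histogram: a larger value merges the bumps into one smooth hill, a smaller one shows every observation as its own spike.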
@ausiidnd 1 year ago
Finally! Someone else hates these things!
@Amira_Phoenix 1 year ago
No science, only vibes 🙄 also, 🐱
@m.f.3347 1 year ago
Violin plots are just Georgia O'Keeffe paintings for STEMlords
@keenanlarsen1639 1 year ago
that is so spot-on 🤣
@RobertKnutzen 1 year ago
i came to this comment section to make this joke and now i feel unoriginal
@btrenninger1 9 months ago
I'm convinced the original paper was an elaborate troll. It's funny because sexual body parts are inherently funny. For the same reason farts are funny. It takes a physical (physics!) aspect of ourselves which is tied to emotions that we normally try to keep private and forces it out into the open. This emotional discomfort is mostly politely expressed as humor. Now had your joking colleague just said with clear disgust, "It looks like HONK, ugh", wouldn't that have been worse? Of course you are right, the best course of action would have been to say nothing at all. The obvious humorous response to the violin plot is to come up with some sort of "c*ck" plot and then come up with a logical way to overlay it or point it at the violin plot. And get it published straight up. Never acknowledge any physical resemblance. That would be funny. I know you'd like it to be but, clearly, humanity isn't better than this.
@ComradePhoenix 10 months ago
I think the reason the histogram got mirrored is because "symmetry makes it look and function better" (which isn't necessarily true in general, and certainly not true in this specific case, but it feels like a common misconception, though that might also be personal bias because my spicybrain likes symmetry). Also, the joke is that genitals are funny. Not just AFAB people's genitals, everyone's. I've seen some radio reception graphs that look like a different set of genitals, and had to stifle a giggle. Like, the sexism stuff is definitely real and valid, and there's a time and place for genital jokes, and an academic setting definitely isn't that, and the jokes absolutely age more quickly than some short-lived isotopes, but still.
@Nossimid 1 year ago
This is actually kind of funny. I'm a PhD student in statistics, and I learned about violin plots for about 10 seconds in one of my first year courses. Just a few weeks ago I ran into a situation where I actually considered using violin plots to convey the distribution of sequence lengths for a system running in different states. However, I did ultimately decide to use a different plot, because the finished violin plot just looked too weird, and would have been distracting. I admit, I've never seen them used in a professional setting by other statisticians or scientists.
@NeonNijahn 1 year ago
I've never once used a violin plot... but for some reason I still felt like I was in trouble the whole video.
@alexanderkonczal3908 1 year ago
*me, struggling through learning data analysis* time to get my opinions validated about this dumbass plot
@alexanderkonczal3908 1 year ago
ok, I had to stop watching due to being a parent and didn't come back for a long time. whoops. I can safely say ALL my opinions were validated, except that... every time, I think of dangerous butt plugs rather than genitalia, which is even worse, imo? regarding why they mirror the curve about the axis, I think 1. some people are unnaturally obsessed with bilateral symmetry, and 2. mirroring the curve makes the changes in the data more dramatic, and this is a plot for the wholly unsubtle.
@benprytherchstats7702 1 year ago
I want to defend violin plots against some of the criticisms from the first segment of this video. But the criticism in the second part is why I've never made and will never make a violin plot. So maybe let's imagine a world where all violin plots were replaced with their half-violin equivalents, which, as Dr. Collier concedes, are definitely better. Here goes:
- On smoothing: smoothed density plots look different depending on the value of the smoothing parameter. And histograms look different depending on the bin width. I guess histograms have an edge because you can see the bins and thus infer the amount of "smoothing" created by binning. But the basic issue is present in both: you can make two histograms or two density plots of the same data that look very different.
- On violin plots vs. stacked/overlapping histograms: stacked histograms are also ugly and get really hard to read the more groups there are. Overlapping histograms are an abomination. They cannot be read. The plot at 20:17 is indecipherable.
- Joyplots (which we're supposed to call ridgeline plots, but I like joy more than ridgelines) are great. Yay for these. But if we're gonna say "yay for these", how can we also say "boo for the half-violin plot"? They're almost the same thing. The joyplot puts the groups behind each other and does a cool little perspective thing. The half-violin plot puts them next to each other sideways. That's it, right?
- On density plots and sample size: YES!!!! This is a huge issue. If we want to plot the distributions of two samples on the same plot, and the samples are of different sizes, we have a choice to make. And the correct choice is usually the one that conveys information about sample size. But there are exceptions. Either way, this needs to be made clear. I just don't see why it's a violin plot problem. Isn't it also a stacked density plot or stacked histogram problem?
- The plot at 26:15 is probably missing a caption, which points to a problem with these kinds of plots. Sometimes those lines are confidence intervals, meant to capture some hypothetical "true" or "population" mean. Sometimes they're "prediction interval" plots, meant to capture some % (like 95%) of data. Sometimes they're like the boxplot whiskers, extending from maximum to minimum. Sometimes they're the mean +/- a standard error, as opposed to +/- two standard errors like with most confidence intervals. Those lines could be anything, and you gotta read a caption in order to interpret the plot. Which, I think, was a criticism of the violin plots.
Ok, those are the defenses. But they're all pretty much moot because violin plots unfortunately look kinda like genitalia, and it doesn't matter what else they might have going for them, that's enough reason to not use them.
Also, I wish I worked in a field where "too many plots" was a problem. Try reading education research papers. All anyone talks about is directionality: "we made this change to the class and then there was a statistically significant increase in scores on a concept inventory, The End." Most of the time when I read a paper I'm just praying and begging for some plots. Any plots! Show me a picture of the data!!!! But no, it's all endless prose about what was or wasn't statistically significant.
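To make the bin-width point from the "On smoothing" item concrete, here is a small pure-Python sketch with made-up data: the same eight observations binned two ways. The helper name `histogram` and its parameters are assumptions for this example only.

```python
def histogram(data, bin_width, start=0):
    """Count observations in bins [start + k*w, start + (k+1)*w), keyed by left edge."""
    counts = {}
    for x in data:
        left_edge = start + bin_width * ((x - start) // bin_width)
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

data = [1, 2, 2, 3, 7, 8, 8, 9]   # hypothetical sample with two clusters

print(histogram(data, bin_width=1))  # {1: 1, 2: 2, 3: 1, 7: 1, 8: 2, 9: 1}
print(histogram(data, bin_width=5))  # {0: 4, 5: 4}
```

With narrow bins the two clusters and the gap between them are obvious; with wide bins the same data looks flat. A kernel density estimate behaves the same way as its smoothing parameter grows.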
@SteelBlueVision 1 year ago
And just how many histograms can you overlay before 12:33 happens? Because showing histograms side by side certainly does not allow for easy visual comparisons of one vs. another, especially when the number of histograms is > 2. Also, per 13:35, I don't think you know what a histogram is! Those are vertical bar charts.
@gabitheancient7664 1 year ago
actually there IS a typewriter that can erase stuff and uses some weird ass material science I don't know about to kinda suck the ink out after you write it, my good ol' Technology Connections has a whole video about corrections in typewriters
@pmcgee003 1 year ago
I assume whiteout tape originated in typewriters .. as an auxiliary ribbon. People used whiteout out of a little bottle.
@MattMcIrvin 1 year ago
@@pmcgee003 There was whiteout tape that came in dispensers like Scotch tape. There was a way to shift the ribbon out of the way, and you'd backspace over the mistake and type it again while sticking the whiteout tape between the type hammer and the paper. That would type over it in white. My mom used this back in the 70s. Later, there were typewriters that had something like this as an auxiliary ribbon. Actually, more often they would use a carbon-based regular ribbon and the correction ribbon could literally lift the stuff off of the paper when you typed over it.
@pmcgee003 1 year ago
@@MattMcIrvin yeah, the IBM Selectric golfball typewriter was a beast to try moving around the office.
@nos9784 1 year ago
@@MattMcIrvin Ah, interesting! I just looked at my "brother AX310" electric typewriter (heirloom). It seems to have a translucent correction tape. If anyone cares, I might try to find out if that is how it works 😅
@AdrianBoyko 1 year ago
Typewriters like the Selectric were more like Letraset than ink. They erased by using a sticky tape that literally peeled the "Letraset" character off the page.
@Tim3.14 1 year ago
While I also find violin plots a bit hard to read, for something like the paper at 13:30 where they apparently want to compare 7 different probability distributions side-by-side, I'm not sure any other option would be much more readable. It's probably too many to overlay the probability densities on top of each other (although I agree that's a good option for comparing 2 or 3 distributions). I guess they could do 7 side-by-side histograms or pdfs. 🤷🏻‍♂️ (By the way, Fig. 3 and 4 you point to aren't actually a histogram of the same thing, they're a bar chart of something else. Note the x-axis isn't numeric, unlike the y-axis of the violin plot. Sorry to nitpick!)
@ethanpayne4116 10 months ago
I made a comment saying basically the same thing, the arguments in this video don't actually make sense in the context of the examples given. Even though I have never personally used violin plots before, I am now convinced that they are a very effective way of visualizing many distributions at once without overlap.
@andybrice2711 7 months ago
That's what I was thinking. But then the Ridge-Line Plot at 21:20 looks like it's probably superior in every way.
@mattc2327 7 months ago
As a PhD student in Bio, I was also on the way to say this. I have a lot of overlapping distributions for a lot of conditions. I think one solution is to distill your conditions into the truly necessary ones. Then, I think the ridge-line plot (or a less overlapping version of it) is definitely better than a violin plot
@Daniel-ev1gx 2 months ago
google ggridges
@oscarfriberg7661 1 year ago
The animation on 18:20 could've just been a line diagram. Perfectly conveys the same data with just a single image. Super easy to plot in Excel too. No need to make it a complicated animation that’s impossible to understand. I feel like that's often the case with dataisbeautiful. It's almost a competition in presenting the most basic data in the most convoluted way possible. Like those "make the worst volume bar" UI challenges, but serious.
@ultimatedude5686 1 year ago
Since it's supposed to be different levels of legalization I would've gone for a stacked area graph, but I agree with the sentiment. It looks kinda cool but it's definitely worse for conveying information.
@dalmationblack 1 year ago
one of dataisbeautiful's biggest issues is an obsession with making data animated that doesn't need to be. it's honestly way easier to tell how quick something is by looking at a slope on a time plot than by trying to compare different speeds half a minute apart in the same animation
@antonhelsgaun 10 months ago
Why do you want a line, or for them to be connected at all? Shouldn't it be 4 separate bars?
@oscarfriberg7661 10 months ago
@@antonhelsgaun 4 separate lines that demonstrate the change over time. Then you don't need to make it an animation. Or do a stacked area graph like mentioned above.
@Sakkura1 11 months ago
9:09 Histograms are pretty difficult to use for comparison of single-cell RNA sequencing data. You can also make a variation on the violin plot by making each half of the violin represent a different condition, e.g. a control condition vs. some treatment (e.g. how does this drug that inhibits this pathway influence expression levels of this other thing, compared to no drug). The ridgeline plot is one alternative, but it's pretty space-inefficient unless you overlap the data and risk harming readability.
@ethanpayne4116 10 months ago
This video was more of a rant than a legitimate analysis of the use cases for the violin-plot. Even the example shown at 10:22 shows just how unreadable overlapping histograms become once you have more than 2. Violin plots are literally just a way of visualizing several histograms at once without making them collide with each other.
@HaydenLikeHey 1 year ago
When you say that a man brought up that the plots resemble a vagina and combining that incident with the flip across the y axis being superfluous, there's some part of me that can't accept that not being on purpose. That they exist solely as some sort of joke to the original author that somehow managed to get published.
@MrBfiguero 1 year ago
A picture is worth a thousand words. Dr. Collier's beleaguered sigh is worth a thousand data points.
@marcins.1128 1 year ago
When you made a mistake when typing on a typewriter you had to use a backspace then you would use a special white chalk covered piece of paper - put it between the typing tape and the paper then type the wrong character again - that would "erase" the wrong character, so you could use backspace again and press the correct key.
@marcins.1128 1 year ago
Now I feel really old.
@HunchbackJack 1 year ago
@@marcins.1128 I'm old enough to remember when those Tipp-Ex strips came out. They were an amazing innovation. Before that, you needed to use an ink eraser, or some noxious kind of solvent that faded the ink on the page. Later, some typewriters had the whiteout strip *built into the ink ribbon*. It was a magical time.
@mercury5003 1 year ago
@@marcins.1128 I'm only 26 and I happened to grow up with one. There was one at my mom's old job when she'd take me as a kid and I'd play around with it. I'm not sure if the backspace function worked the same way though.
@marcins.1128 1 year ago
@@mercury5003 there were also newer typewriters with two tapes - one of them was the erasing one. They stored some recent characters in memory so you could use the backspace as on your PC.
@jtsiomb 1 year ago
Or just cross it out with X-es and type it correctly right next to that or above that. Or use correction fluid, wait for a bit, then type over it (it always looked different though), or re-type the whole page.... depends on what you're doing and what are the tolerances for nice presentation vs just having the text on a piece of paper.
@jainabraina 1 year ago
I enjoy that this turned into a histogram appreciation video because histograms are really great.
@ariadne4720 10 months ago
I worked as a statistician in the 90s and into the early 00s, and never heard of a violin plot. Knowing what they are now, I see they are entirely useless.
@douglasmagowan2709 1 year ago
People love "chart art." I used to work in finance. There was pressure to replace data tables with charts when possible, even if the charts ultimately distort the data. But the chart makes the report look pretty. A very common chart that I absolutely hate is the 3D pie chart. The pie chart is already a bad chart, but someone has sabotaged what meaning a pie chart has, as the areas have become distorted and are no longer direct representations of the weights.
@OlleLindestad 1 year ago
A 2D pie chart is only bad when there are more than two categories in it. A pie chart with two categories is excellent. You can immediately see whether the fraction displayed is closer to a quarter, half, or three quarters. For more than two categories, a stacked bar is better.
@davidstrickland3510 1 year ago
I mean, you can immediately tell whether X% is closer to a quarter, or a half, or whatever without the chart. Charts should ideally be used to get a snapshot of lots of data--not just to make a list of percentages look fun.
@FrogVoice 1 year ago
When I wrote my bachelor's thesis in uni, my supervisor insisted that I use violin plots to show some data. The problem was, the outliers in my dataset were not many, but they were far out, like really really far out. So in the plots they often just weren't visible at all, but I had to include them to represent the data accurately, at least that's what I was told. In the end the captions for every figure featuring these plots ended up absurdly long because I had to explain what the hell was going on there lest I forget it myself. So I 100% agree with you that these plots are just bad in every regard. The bit about these plots looking like genitalia is also true in every regard: because of these outliers some of my medians ended up at the bottom of the plot, so that's of course where the belly was located. This however had the fun little side effect of making these particular plots look like a cock and balls. So my supervisor basically insisted that I draw a bunch of dicks in my thesis. These things are truly terrible, they just always look like genitalia.
@tanyabils9399 1 year ago
I would argue including the outliers made your visual less accurate not more.
@coreycampbell1689 1 year ago
Maybe we should just call them Rorschach plots
@drmodestoesq 1 year ago
In 1956, Bette Nesmith Graham (mother of future Monkees guitarist Michael Nesmith) invented the first correction fluid in her kitchen. Working as a typist, she used to make many mistakes and always strove for a way to correct them. Starting on a basis of tempera paint she mixed with a common kitchen blender, she called the fluid "Mistake Out" and started to provide her co-workers with small bottles on which the brand's name was displayed.
@rmsgrey 10 months ago
There's a strong link between embarrassment and humour, and a lot of embarrassment over taboo topics like anatomy that roughly half of people have (particularly among teenagers and people who haven't got over having been teenagers). As for the use of references as a form of comedy, the basic idea is "we all laughed at {thing} then; remembering it will bring you to a similar state of mind and probably make you laugh now". If you didn't laugh as a teenager when someone broke taboo, then you're not going to find it funny when people try to evoke the experience as an adult. On the other hand, if you're someone who found Monty Python hilarious, then someone saying "this parrot - " (the pause is essential) is going to remind you of John Cleese screaming at Michael Palin and very likely get at least a smile, if not a chuckle, out of you. There's also a whole in-grouping thing going on - "you and I share an understanding of this reference, therefore I am a member of the in-group and popular and successful"
@Schnowotski 10 months ago
A few counterpoints: 1) Violin plots are hard to understand. I don't see how a violin plot would be any more difficult to understand than a KDE density curve or a histogram. If you know it's a density estimate, you know it's a density estimate. I think it's pretty obvious what violin plots show and how to read them. Then again, it seems there are multiple people commenting here that they have had trouble with this, so maybe you are right, I don't know. 2) Why not just use a histogram? Well, couldn't the same point be made against KDE (or any other smoothed density estimate): why not use histograms? You don't suggest this, but histograms aren't a panacea for density estimation and have their own problems. (Andrew Gelman has many times criticized R's bad default settings for histograms, and I agree.) 3) You suggest that multiple violin plots could be replaced with KDE curves that have a z-axis. I STRONGLY disagree with this one. Adding a z-axis creates more problems than it solves, since now you have to adjust in your brain for perspective. I honestly don't understand how you can sincerely suggest that this would make comparisons easier. You also have another example in which you replace a violin plot with KDE curves (around 10:23). I think you made the graph significantly worse. I don't think it's easier to compare the distributions in your version, quite the contrary: the original violin plot version clearly shows the different shapes of the distributions, and you get an idea about their location and spread with a single glance. Your version, in my opinion, is more cluttered and difficult to read: it takes more time to disentangle the distributions from each other. There's also a greater probability of mixing up the curves (which curve is from what parameter/dataset or whatever is being plotted). --- I think violin plots are useful when you have to display many density estimates at once, for example, when visualizing marginals of a Bayesian posterior distribution.
I don't think your idea of overlaying KDE curves in a single plot, or trying to use a z-axis to de-clutter them, is any better or clearer; I think they might actually be worse. As a kind of "worst case scenario", consider that you want to compare marginal distributions from two 10-dimensional posterior distributions. Would you just shove all 20 density estimates on top of each other in a single plot? Would you make 10 graphs, taking up huge swathes of space? Or just a single violin plot, in which the relevant density estimates are next to each other? You might be right that violin plots are overused in contexts in which information should be conveyed in some simpler way to people who might not be that familiar with density estimation.
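The point that histograms and smoothed density estimates both depend on a tuning choice can be checked with a few lines of code. What follows is only a rough sketch, not anything taken from the video: the sample values, the bandwidths, the bin counts, and the helper names (`gaussian_kde`, `count_modes`, `histogram`) are all made up for illustration, and the kernel is assumed to be a plain fixed-bandwidth Gaussian.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function x -> estimated density (Gaussian kernel, fixed bandwidth)."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density

def count_modes(density, lo, hi, steps=400):
    """Count strict local maxima of the density on an evenly spaced grid."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

def histogram(sample, lo, hi, bins):
    """Counts per equal-width bin on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in sample:
        idx = min(int((s - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical sample: one cluster around 2 and one around 6.
sample = [1.6, 1.8, 2.0, 2.0, 2.2, 2.4, 5.6, 5.8, 6.0, 6.0, 6.2, 6.4]

for bw in (0.3, 3.0):
    modes = count_modes(gaussian_kde(sample, bw), 0.0, 8.0)
    print(f"bandwidth={bw}: {modes} mode(s)")

for bins in (2, 8):
    print(f"bins={bins}: {histogram(sample, 0.0, 8.0, bins)}")
```

With the small bandwidth the estimate keeps both clusters, while the large bandwidth merges them into one hump; likewise two bins give a flat histogram and eight bins show the gap. Neither display is free of a smoothing decision, which is why reporting the bandwidth or bin width matters in both cases.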
@JoshuaNorton 1 year ago
Ha, only a minute in and I already adore the video. Out of curiosity I had a professor give me old dissertations to see how they used to do data visualisation back in their day. And the solution is glue. Glued-in graphs, hand drawn on graph paper. Glued-in photographs of the setups. And then it clicked in my head why we learned to do all that gluing stuff on paper in elementary school.
@spacelem 1 year ago
I'm only 8 minutes into the video, so you may well change my mind before the end, but I have made good use of violin plots in my work. When I've been comparing posterior distributions of multiple parameters from multiple different MCMC chains, the violin plots have been an excellent way for me to tell at a glance what the data is doing, and if there are any severe problems. Boxplots do not tell you if your posteriors are multimodal, violin plots do, and a histogram with 30 variables is going to be completely unreadable. I don't really care about the precise values of the interquartile ranges, I want to see if chains are converging to the same unimodal distributions. None of this information is for presenting in a paper (I'll give sensible posterior distribution plots there), it's for me (and my collaborators) to understand how well my MCMC is converging, and where the problems are. For that they work pretty well. Okay, now I'm going to shut up and continue watching to hear what you have to say! EDIT: all my violin plots were horizontal. It never actually occurred to me what they resembled when viewed vertically.... Also I work in veterinary epidemiology (I'm a mathematical modeller), where the majority are women, and my supervisor is a German woman (also did maths at undergraduate), who has no issues with speaking her mind, so I don't think anyone would be as daft as to joke about it!
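The claim that a box plot cannot reveal multimodality can be illustrated with two tiny samples. This is a hedged sketch with invented numbers (`one_peak`, `two_peaks`) and hypothetical helper names; it assumes the box is drawn from the "inclusive" quartile definition in Python's `statistics` module, and other quartile conventions may give slightly different values.

```python
from collections import Counter
from statistics import quantiles

def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum: the numbers a box plot is built from."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))

def counts_per_integer(data):
    """Crude histogram: how many values fall on each integer from 0 to 10."""
    tally = Counter(int(x) for x in data)
    return [tally.get(k, 0) for k in range(11)]

# Invented samples: one piles up at 5, the other at 4 and 6.
one_peak = [0, 3, 4, 5, 5, 5, 6, 7, 10]
two_peaks = [0, 4, 4, 4, 5, 6, 6, 6, 10]

print(five_number_summary(one_peak))   # same summary for both samples
print(five_number_summary(two_peaks))
print(counts_per_integer(one_peak))    # single peak at 5
print(counts_per_integer(two_peaks))   # peaks at 4 and 6, dip at 5
```

Both samples produce the same five numbers, so their box plots would look identical, while the per-value counts show one peak in the first sample and two in the second. Any density display (histogram, KDE curve, or violin) would expose that difference; the box alone does not.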
@Ibeechu 1 year ago
ja but if your data set is multimodal, why even use a box plot? Or, like, make a histogram since that's showing the important parts and then put the quartiles in a table or something?
@TheFartoholic 1 year ago
I agree with this use case - the shinystan package in R makes good use of violin plots and has saved me a lot of time in evaluating models. But I'd argue that the usefulness of violin plots goes beyond MCMC. Overlaying density plots / histograms is ideal in most situations, but things get incredibly cluttered the moment you have >4 lines to plot. Having multiple panels of densities works - but is essentially a violin plot without the mirroring.
@qu765 1 year ago
ridge line plots tho
@TheFartoholic 1 year ago
@@qu765 Ridgeline plots are good and probably the first choice, but violins are particularly good when you want to compare groups across different strata (e.g., geography and income)
@andrewmatas6984 1 year ago
I have used violin plots and liked them. You have convinced me that I was wrong.
@ubahfly5409 1 year ago
Oh really? Where were you Jan 6th !?
@aidanjimenez9343 1 year ago
@@ubahfly5409what does this mean dawg
@thefaboo 1 year ago
​@@aidanjimenez9343I think they were (jokingly) calling you a terrorist....
@gebali 1 year ago
We all make mistakes. But not all of us admit to them publicly
@rbr1170 1 year ago
​@@gebaligive me a sec and I'll make a violin plot of that.
@TheLgonzal1 10 months ago
Hospital business analyst here... the reason I use violin plots is to specifically highlight the fact that the data is wonky and it cannot accurately be represented by an average... despite it being reported out that way on ten prior occasions
@mojorn8837 10 months ago
That's what I was thinking the whole time. In corporate settings, you're not just trying to explain your data, obviously, but often have to show why the counter side's "data" is misleading if not outright deceptive.
@VikcocVyk 1 year ago
This went from ha ha to not ha ha real fast. I want to say that you do an amazing job explaining the female side of these interactions. The fact that you articulate why it was not ok is a great source of information for people who want to do good but don't yet see how certain things are problematic.
@mapleveritas2698 1 year ago
It is funny, but my master's thesis was actually one of the first theses at my university to be typeset using LaTeX. Probably the first thesis, actually. And my diagrams were drawn using PostScript. Yes, I wrote the programs to draw the diagrams. Of course, Knuth created the whole digital typesetting thing because the expert typesetters (actual people) were retiring, and the new generation could not do his "The Art of Computer Programming" well enough because of all the diagrams and mathematics. Yeah, we came a long way. I was just one of the people right in the middle of the old and the new. Later, I was using gplot to generate PostScript graphs. It still exists, I believe.
@Marstead 1 year ago
My wife is in astrophysics, and she has had a very similar experience to what you described with the violin plots: she'll go out to dinner with a bunch of male physicists and the waiter will come up and say "Well, ladies first!" She can't explain how frustrating it is to have the entire table be alerted/reminded of the fact that she's the only woman there. It's tough in that situation because she can't comment on how it makes her uncomfortable: the waiter's not being a bad person about it, and it'll make her look bad if she brings it up at the table. So she just kind of has to deal with it. It sucks!
@Daniel-ih4zh 1 year ago
Why exactly would this make someone uncomfortable lmao? Bizarre these "diversity and inclusion" people become uncomfortable when they're unique
@vickypedia1308 1 year ago
​​@@Daniel-ih4zhI genuinely hate when people acknowledge I'm a woman and that I'm special because I'm the only woman around at the moment. Why the hell does my gender matter to you guys, I'm here to do my business like everyone else. I didn't "earn" the trait of being a woman so it seems weird to point it out as if it were extraordinary.
@Daniel-ih4zh 1 year ago
@@vickypedia1308 I 100% agree and understand. But it seems like people are having their cake and eating it when they hold this sentiment while also promoting things like WiStem and AA
@vickypedia1308 1 year ago
​@@Daniel-ih4zhI don't know what those terms mean (not a native speaker, so if those are terms where I live they likely have different acronyms). However I would like to add that the people who get uncomfortable when someone highlights that they're "special" for being some sort of minority are usually not the same ones who actively advocate for special treatment. For those who do, it tends to be because they're two different kinds of "special treatment" and one of them feels patronizing while the other doesn't. Personally, I think we should strive to decrease sexism at the workplace, not force women quotas or other artificial stuff like that. Sexism is more likely to happen in fields that are predominantly pursued by men, simply due to the lack of women who can point out if someone is being sexist. (And even if there are one or two women there, you don't want to be *that* person who complains about something nobody else sees an issue with.) In my opinion, the fix isn't to forcibly try to get women into that field and making a big deal out of it. I would certainly not want to be the token woman who only got in because a company needed to fill a quota. I think we should rather try to make the place feel welcome to *any* person, women included.
@lepannean4231 1 year ago
@@Daniel-ih4zh You think it's bizarre when people who want to be included as equal participants, get uncomfortable at being singled out for no reason? It's *almost* like you think "equality for minorities" is the same thing as "special treatment". Hmm. Maybe you should reflect more about that.
@StressDespot 4 months ago
i just got through an 'intro to data science in python' course for my masters of analytics (it was a bloodbath- ~1100 people enrolled in the course. 110 people made a straight up ZERO on the final. got NO points. and 1/3 of the class got a failing grade on the final. which was worth 25% of our grade. but anyways..) there was a violin plot on one of the homeworks- i remember it because i was like "hey, what the hell is this thing?" it was supposed to be visualizing a group of horseshoe crabs, the length of their carapaces, and the average time of survival of the different lengths. i saw the violin plot and thought it was a plot of 'this is what the scientifically ideal horseshoe crab body looks like.' i read it like 5 times before i realized the violin wasn't supposed to be the shape of a horseshoe crab.. smh. shitass plot, zero out of 10.
@moseistrujillo8300 1 year ago
There are so many data visualizations that just should be wiped from existence. Violin plots are at the top of my personal list; they are outclassed in every way in modern data visualization.
@BlueSapphyre 1 year ago
Pie chart/Gauges are the top of my list. But for some reason C-suite loves them.
@PBMS123 1 year ago
@@BlueSapphyre pie charts are fine.
@hyphenatednick 1 year ago
If I could murder one type of plot it would be the pie chart. Not because they're worse than violin plots, but because they are more prevalent.
@rbr1170 1 year ago
​@@hyphenatednickand often used in the wrong way.
@rbr1170 1 year ago
​@@PBMS123only if used properly.
@RobertBlair 1 year ago
Do be careful with plots that heavily rely on color. Color vision impaired folks can't always read them well.
@RobertBlair 1 year ago
I'm also listening at work, so thank you for bleeping appropriately. And sorry that nobody told that turd of a guy to STFU.
@Somebodyherefornow 1 year ago
That's why everyone should use pattern + color + contrast!!!
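One way to make the "contrast" part of that advice concrete is to check how far apart two plot colours are in luminance, since that is roughly what survives greyscale printing. The sketch below uses the WCAG 2.x relative-luminance and contrast-ratio formulas; the colour choices and the helper names are illustrative assumptions, and luminance contrast is only a proxy here, not a simulation of any particular colour-vision deficiency.

```python
def _linear(channel):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(rgb_a, rgb_b):
    """WCAG contrast ratio, from 1 (identical luminance) up to 21 (black on white)."""
    la, lb = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

# Hypothetical series colours for two overlaid curves.
red = (255, 0, 0)
green = (0, 128, 0)
black, white = (0, 0, 0), (255, 255, 255)

print(f"red vs green:   {contrast_ratio(red, green):.2f}")
print(f"black vs white: {contrast_ratio(black, white):.2f}")
```

A red and a mid green that look very different in colour end up close together in luminance, so they are likely to blur into similar greys on a black-and-white printout. Picking colours with a larger ratio, or adding hatch patterns and distinct line styles, avoids relying on hue alone.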
@bbqchezit 1 year ago
I totally agree about the "feminism" point. The level of defensiveness, especially in STEM is insane. Re: the plot: I think they doubled it assuming we'd get a more intuitive sense of the "area" corresponding to an increased histogram value. Still a shitty plot
@peterpeterson8792 1 year ago
Yes, the feminist pivot was a disqualification. Could have started with "the useless plots look like a ****", that would cover it. But dipping in the intersectional victimology half way?
@bbqchezit 1 year ago
@@peterpeterson8792 "Intersectional victimology" is a loaded way of saying "shared how it made her feel". Maybe you don't care about that part. She identified that section pretty clearly. If hearing how she feels makes you feel some kinda way, that's for you to examine.
@peterpeterson8792 1 year ago
@@bbqchezit Do you realize that the hypothetical offensive situations never happened? She went on a free flow of inventing nonsense about some hypothetical men making a stupid joke and how she would feel if that happened. And then "allies" jumped in here ready to pre-save the poor pre-victim of her imaginary pre-situation of a pre-bad-taste infantile joke she imagined that surely would traumatize her forever. Oh, "allies" feel she is already traumatized by her own imagination? And sure, we should feel for her trauma and cancel the plot? Thus, from disliking the plot she figured that if she comes up with a me-too victim imaginary situation, and how horrible she would feel about it, that sure will erase the graph. By the way, unless one has no spatial (2-d spatial!) imagination, there are valid and well demonstrative applications for this plot, perhaps not for her data. I wonder how leaves make her feel? Ever thought of it? Think of it, violin plots or ****** if you wish, all over the forests, trillions of them? I never thought a violin graph looked like a ****** until this chick made a stink about it. And still don't. Any other offensive shapes, circles perhaps? Just get real, get therapy if you need, and don't ask others to participate in your manipulation by your imaginary issues that "make you feel".
@rmb706 4 months ago
I’m confused. How do box plots “show the average of a data set”? I don’t think averages are part of box plots by default. Box plots show the quantiles. Some plotting software I use has options to add means to box plots, but it is not traditionally part of them, right?
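The distinction raised here, that a standard box plot is built from quantiles rather than the mean, is easy to see on skewed data. A minimal sketch, assuming the "inclusive" quartile method from Python's `statistics` module and Tukey's 1.5 × IQR rule for flagging outliers; the sample is invented for illustration.

```python
from statistics import mean, quantiles

# Hypothetical right-skewed sample with one very large value.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 50]

# The box of a box plot spans Q1..Q3, with a line at the median.
q1, median, q3 = quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # Tukey's rule: points beyond this are drawn as outliers
outliers = [x for x in sample if x > upper_fence]

# The mean is not part of the box unless the plotting tool adds it explicitly.
avg = mean(sample)

print(f"box: Q1={q1}, median={median}, Q3={q3}")
print(f"outliers above {upper_fence}: {outliers}")
print(f"mean={avg} (inside the box: {q1 <= avg <= q3})")
```

Here the single large value drags the mean above the whole box and even above the upper fence, while the median stays put. That is the reason a mean marker, where a tool offers one, is an optional extra and not part of the traditional box plot.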
@bejeweled280 1 year ago
I love your videos. Like why would I care why a certain plot is horrendous? 40 plus minutes later I'm super invested and ready to go on the war path about violin plots.
@El_Rey_247 1 year ago
I don't know if anyone has mentioned yet, but a population pyramid comes to mind as a use case where it makes sense to have both the mean or median and quartiles, but the overall shape of the distribution is still important and useful. Mind you, I'm typing this at the 9:20 mark, so maybe that does come up in the video at some point and I'm just jumping the gun.
@just_some_commenter 1 year ago
The population pyramid could instead be two stacked histograms. This would make it much easier to compare the male and female populations at a given age. However, it wouldn't be a pyramid, so you would no longer be able to use your copy of _Demography_ to keep your razor blades sharp.
@3snoW_ 1 year ago
But those convey so much more information. Each bin is labeled and usually the left and right are not exactly symmetrical, the left and right are assigned to male and female population. So say there was a particularly deadly war, you can expect a bigger dent on the male side between certain ages than on the female side for the same ages. The violin plot fails in all of these points, it isn't labeled, it has no bins, it's smoothed so it becomes even more vague, and it's symmetrical for no reason.
@El_Rey_247 1 year ago
@@3snoW_, oh sure, it's not a violin plot, and I wouldn't want it to be one. I just mean that it's a histogram where it probably wouldn't hurt having box plots within them (one per side), so you could quickly compare the median and quartile ranges of male vs. female populations, assuming you wanted to force that into a single visualization and you didn't want to set aside space for a table or something
@RobFisherUK 1 year ago
I sometimes have a lot of possible plots I could do, and generating all of them as violin plots is useful because I don't know in advance if my data is bimodal or whatever. And I can put lots of violins next to each other and compare them, unlike histograms. But sure, I won't put them in my presentations because by then I know the best way to show the data. Fair enough. Also excellent point about the smoothing! People not understanding their statistical models bothers me a lot.
@londonalicante 6 months ago
I've never seen these before. I do think they would make sense in one particular situation. You give an example of temperature data distribution per month (12 separate plots), but suppose instead of months, you want to plot the annual temperature distribution of all the world's capital cities, and you want to put a continuous variable (latitude) on the X axis, with temperature on the Y axis. In that situation, the symmetry of the violin helps centre the data correctly on the X axis. I completely agree that this format should only be used for overview information (but I note that it can work without colour, while certain other types of plots can be difficult to read without colour).
@simonwillover4175 8 months ago
I loved this video. I don't get the whole inappropriate stuff from what looks like gross bugs / fish / sting rays. I'm not a woman though, so who cares about my opinion? XD Seriously though, these plots are horrible, simply for being unintuitive. I can't tell what's going on, and it looks like someone is comparing leaves, or blobs of slime, or rocks, or weird sting rays or something, instead of doing actual science. If I saw this in a paper, "Yo, what is this supposed to be? Where's the data?" would be my actual comment while presenting. Is that a somewhat unprofessional comment? Well, yes, but it's better than calling them something irrelevant, like a sting ray or part of a human body or something. Counselor: What do you see? Me: I'm seeing the footprint of a dinosaur after it rained for a few minutes and the footprint got smoothed out. Counselor: Oh. Well, you were supposed to see a 2-headed butterfly... I guess that means you're a psychopath now. Me: What does this ridiculous and nonsensical image have to do with my mental well-being anyway?
@EXQEX9 1 year ago
I've been watching this video thinking "Mmm, I don't think I'm seeing where Angela is coming from here. They aren't that bad." ...until the bean plot at 23:07. An absolutely chaotic dumpster fire of borderline illegible meaninglessness. I get it now. I'm revolted that it took me this long.
@inafridge8573 1 year ago
The big problems are 1) there is no scale provided for the frequency of the distribution and 2) if you want to compare how two distributions differ, you need them overlaid on top of each other, but violin plots are presented side by side instead of overlaid (you could get rid of the box plot in the middle and then overlay them, but then that just makes them harder-to-read, unnecessarily smoothed histograms)
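The missing frequency scale in point 1 can be demonstrated numerically. The sketch below assumes each violin is stretched to the same maximum width, which is how many plotting tools draw them by default but not all, so check the library's scaling option; the samples, bandwidth, and helper names are made up for illustration.

```python
import math

def kde_on_grid(sample, bandwidth, lo, hi, steps=200):
    """Gaussian kernel density estimate evaluated on an evenly spaced grid."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
        for x in grid
    ]

def violin_half_widths(density, max_width=1.0):
    """Rescale a density so its widest point equals max_width (assumed convention)."""
    peak = max(density)
    return [max_width * d / peak for d in density]

# Invented samples: one tightly clustered, one spread out.
narrow = [4.8, 4.9, 5.0, 5.1, 5.2]
wide = [1.0, 3.0, 5.0, 7.0, 9.0]

dens_narrow = kde_on_grid(narrow, 0.5, 0.0, 10.0)
dens_wide = kde_on_grid(wide, 0.5, 0.0, 10.0)

print(f"peak density, narrow sample: {max(dens_narrow):.3f}")
print(f"peak density, wide sample:   {max(dens_wide):.3f}")
print(f"widest point after rescaling: "
      f"{max(violin_half_widths(dens_narrow))}, {max(violin_half_widths(dens_wide))}")
```

The two density estimates peak at very different heights, yet after per-violin rescaling both violins are exactly as wide at their widest point. Under that convention, widths can be compared within one violin but not between violins, unless the density axis is shown or a shared scaling is used.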
@EXQEX9 10 months ago
​@@inafridge8573I didn't fully consider point 2. Imagining trying to do this disgusts me. Thanks for your reply! :)
@TobiasWeg 1 year ago
I'll come out and say that I used violin plots in my PhD thesis. I had grain size distributions to display. Histograms have the big problem that you cannot easily put a lot of them on top of each other. I had, I think, ten different samples I wanted to display next to each other (to make them comparable). Furthermore, I wanted to use the median to simplify the further discussion, but I also wanted to show the actual distribution of the particles, as it was important for the behavior I was looking at. The violin plot was a good combination: 1. The median is visually represented. 2. I do show the actual distribution, so I can discuss the skew, if there is one. 3. I can pack a lot of them next to each other, and the reader gets a good visual representation of the different distributions. 4. Yes, I find them visually pleasing, if done right. I did plot them horizontally, though. 5. I think the violin plot is symmetrical for the same reason the box plot is symmetrical. And when I think about a metal grain (which is what I worked with), I felt like it represents the form of the actual grains. I did not add another histogram. I did give the smoothing value and normalized the width to one. As I saw the second part of your video: I am sorry to hear about the unfortunate situation and that this plot makes you and other women uncomfortable. I did not know about this, and it was absolutely not obvious to me. I am sorry for that.
@LordOfLemon 10 months ago
I would argue that in the era of information, violin plots have an important purpose. I get that many dislike them, and I appreciate why. It's annoying to take a ruler and run it across a list of plots to figure out the values. But 99.9% of people nowadays are not actually doing knee-deep research that requires precise fact-checking for these plots. Also, usually whoever made the plot has also published their data so you can just make your own plots or use the data directly, which is actually way more accurate than eyeballing some numbers if you're actually doing math for something in your paper. And violin plots have a very useful feature, skimmability. They reduce a plot down to its basic shapes and make it so the human brain, upon skimming, can quickly identify "hey, this visualized 2D object has three humps, whereas this one has two. And this one looks just like a ball. These are fundamentally different samples!" Instead of just showing you the curve of the shape (like a normal histogram), the violin plot also gives you the ability to "feel" the volume of the displayed "object" at a specific point. This is useful because our monkey brains are better at identifying and classifying objects than curves. As an example, what might look like a slightly more sudden downturn (but no big deal) in a histogram could give you a "wow, this sample is tapering out like crazy at the bottom!" effect. Also you can often just throw violin plots on top of your box plots and it will just give readers a better understanding of the data than pure box plots. However, I do agree that smoothing diagrams so they look pretty can be quite obnoxious and sometimes even obstructive to science. And it happens most often with violin plots, since they give more of a "wow" effect if you smooth them like crazy. Additionally, I think violin plots really only "work" with certain types of small data (i.e. just look at these three humps). 
If there are too many metrics, it becomes a nightmare to read and interpret violin plots (as you have stated) and it should just be thrown in a different diagram. Basically, if your violin plot looks more like a weird snake than a violin, you should rethink how you're presenting your data. And honestly, I do just want more research on these plots and their effects on scientific research. If you can get a paper together that actually proves violin plots are trash and obstruct science, I would love that. It would help me accept that everything in this comment (so far) is just a subjective personal experience from me and my professors, and I'd be more inclined to never use one ever again. My counterarguments to the video are subjective experiences, after all. Also, then I could cite you, which would "appreciably increase your market value" as they say on LinkedIn, I think. Oh, and I'd love to see the feedback of the people I've worked with who have conducted extensive research primarily relying on violin plots. I'm cool with sharing videos like this with my friends, but I'd rather give my professors something more academic, you know. Addendum: Wow.. I don't have a vagina and I leave thoughts about genitals at home when doing research, I can't believe I never realized they look like . Yeah, that will actually make me think twice about using them... Thanks to the creator for the story and transparency, it must suck to put themselves out there to this extent. And I'm glad they decided to ignore the idiots who go "huhuhuh va" at work. Especially since many of those people will likely see this video, potentially even including that f-ing guy. The "joke" is so thoroughly unfunny that it made me almost laugh at how badly the guy conducted himself. I think it's best to just pretend those people don't exist, in most cases. I really hope some of his coworkers or friends gave him a serious talking-to after the presentation... 
If any reader experiences this kind of behaviour from someone they know, please just sit them down in private and tell them it's not ok. That said, as many commenters have mentioned before, something looking kinda suggestive does not mean it is wrong to use scientifically. It's just really nice to have a box plot with extra data about a specific metric sometimes. Sometimes you just wanna be able to directly compare like 10-20 different f-ing plots without having them overlap and look like a horrible unreadable cluster-f. Especially if you expect the reader to find something you didn't expect. What if someone finds a cool association that would be hidden behind the back layers of the fancy 3D layered histogram? You cannot seriously tell me you can clearly see the beginning of each distribution at 21:00... Also, accessibility! Not everyone can see colours, colourblind people exist. That's why the one histogram in the beginning had kooky patterns, colourblind people wanna understand data distributions too. Also, I know it's the digital age, but many researchers like printing out their important references and notes. As a starving student, it costs soooo much more to print something in color rather than greyscale. I implore you to take your "useful diagram" at 20:40, print it in greyscale and explain to someone what it means. Or better yet, just ask your red-green colourblind friends to read it. In this case, not even kooky patterns can save you, because you want them all to overlap! Not every design decision is good, many of them have problems. But I would say violin plots have a clear reason to exist, even if that reason is a bit nuanced and overlooked. And yeah, I don't like making women uncomfortable, but anything is suggestive if you look at it too much. I think most of us can agree that the awkward situation was created by that one weirdo, not by the plot. However, I hold hope that perhaps one day some psychology student will set out to prove me wrong on that. 
Until then, I will probably keep using violin plots every now and then
@esmenhamaire6398 8 months ago
I've had the good fortune to have never encountered a violin plot before, despite being a science geek (NB: I am not a scientist, nor have I ever been. I've simply had a lifelong interest in science). I agree with this young lady, in general - violin plots look less informative than histograms or box plots. As a means of presenting data, if I came across some article or paper that interested me that used a violin plot, I suspect I'd be little the wiser for having seen the violin plot. However, I do recall a video on evolution where, for various families of species, they plotted time on the vertical axis against some measure of how common the species in that family were on the horizontal scale, and they did this all in a single image. Now, whilst it did not give me anything in the way of numeric data, I could see that this family arose around then, whilst this other family was dominant, then it expanded over time, there seems to be rivalry with a different family a bit later, and so on. You could see how various sections of the evolutionary tree began, blossomed and died out over time quite easily. For that, to give a rough overview, it worked well. And yes, it looked like a bunch of simultaneous violin plots, but because of the way in which it was done, and bearing in mind its intended audience - the general public - it adequately conveyed what it needed to convey simply and quickly. However, if I were sufficiently interested in biology and evolution in particular to the extent that I wanted actual data, I would not be a happy bunny given something like that. A histogram would, IMO, make it much easier to extract the information I might want from it. I'm prepared to accept that there may be edge cases (when are there not edge cases?!) where a violin plot might be the best way to display something, but I find it hard to imagine such a situation. And yes, I have read the other commenters, including those speaking well of violin plots, whose arguments I find unconvincing.
But perhaps I just need to see cases where violin plots are appropriate rather than ones where they are not?
@CocoBombLeche 1 year ago
😂😂😂 OMG, when I first saw these violin plots I had the same thought. I feel validated that someone else had the same thought, because I felt that my profession was slowly corrupting my brain - I'm an OB/GYN. Thank you for providing content for me to watch on my post-call day. It sucks being a woman in a science based field, and unfortunately this bias has seeped its way over into medicine as well (shocker). This is why representation matters, but we must also continually actively dismantle the patriarchy
@Ziraya0 1 year ago
I love these kooky patterns too! I first ran into this in Trello, where the labels (categories) are colored rectangles; but that sucks for accessibility. So they have Color Blind Friendly mode where they add distinct patterns to the colors. Everything should do this! I'm not colorblind and it helps SO MUCH!
@LimeyLassen 1 year ago
I've often seen them used in colorless graphs. It might be popular in Ecology.
@m.streicher8286 1 year ago
Once, I loudly expressed that a piece of lab equipment looked similar to a "toy" - I still cringe thinking about my own behavior. The key difference being that I was an 11 y/o
@robertofontiglia4148 10 months ago
I'll agree with you that violin plots have some problems, but uhm... 11:03 "If I open a paper and I see a violin plot, if I want to actually get data out of it.." Who's trying to get accurate data out of plots in papers?? Is that something people do??? I always think of plots as a way to tell a story about your data. I don't necessarily think you're meant to be reading them in that much detail. Like, you said "you can't get data from them, just vibes", and that's ... the point of them? Like, you talk about ridge line plots and 3D plots and I'm sorry but I think they're pretty much just as bad. It feels like they're just your æsthetic preference at this point.
@1Cr0w 10 months ago
While i agree that violinplots are not very good, some of the "better" alternatives you give are equally bad:
- The 3D-plot that you claim still lets you compare/measure the data does not allow you to do that. 3D quite generally becomes near unreadable if your data is not neatly aligned or interesting features end up hidden behind layers in front.
- Overlaying histograms on top of each other becomes entirely unparseable very quickly with the number of conditions/data groups
I will agree that raincloud plots (with or without the "rain") are better, and that the researcher (and reviewers) should decide what contrasts are of relevance and what features of the data are worthy of visualization (or can simply be mentioned in text; or given in the supplementary materials), and will also concede i have myself opted for histograms/smoothed histograms instead of violinplots. Do i find them ugly? Yes, i do, mostly -- however, that's not a valid reason. 3D barplots look pretty cool, but are one of the single worst visualizations possible. Scatterplots are kinda ugly (depending on the data), but are a great visualization.
@TheBBQify 1 year ago
i love watching your videos while i procrastinate on my physics 101 homework. makes me feel like i'm actually working
@hiiamelecktro4985 1 year ago
Never heard of these, thanks for letting me know about them. Also I do think it’s important to talk about the woman’s perspective. I know there are a lot of men like the one in the badman example. But there are also men who want to fill in their blind spots. Hearing women’s experiences helps them do that. So thanks for that too.
@nunyabiznes7446 1 year ago
I was thinking "man I've basically never seen these" but then I realized - they're employed for ALMOST EVERY chart showing the age demographics of a country. And it's always weird and hard to parse and takes a second. It probably started with people using the two different sides to show men and women (which honestly you don't need to, just turn it sideways and give a second color for one of them, it would be way more legible, 'down' is not generally the direction we think of time passing in) but nowadays most of them don't even do that. They just turn it into one of these godforsaken graphs because that's how you're supposed to show age demos and it wastes like three seconds of my life every single time
@ESDAFable 1 year ago
im not the only one bothered by this! yes!
@jwbaker 1 year ago
I never thought of these "population pyramids" as akin to violin plots, because they are just bar charts binned on the vertical axis by year. Are we banning the whole genre? These communicate a great deal of information and if demographers want to have domain-specific charts, I am willing to let it slide. The thing about putting the gender on either side of the horizontal is because the gender surplus is pretty interesting. It communicates information about life expectancy, but also for a given place it clearly shows wars.
@ESDAFable 1 year ago
@@jwbaker Yes but it's a LOT easier to see that comparison if the histograms are laid on top of each other instead of flipped. Otherwise the differences, especially when subtle, are not easy to see. You can't easily pick out at a glance the discrepancies unless they're severe, and the math is less obvious too.
@jwbaker 1 year ago
@@ESDAFable Yes, I can see your point. Some of those demographic plots overcome it by using four colors instead of two. The more intense color at the extremes indicates the surplus on the larger side. But, I think the overall flaw in the violin plot is its symmetry. The bilateral symmetry of the violin plot conveys nothing, it serves only to confuse or annoy. But on the population pyramid, the plot is not symmetrical, and the asymmetry conveys meaning. There is almost certainly a better thing than population pyramids. People are always innovating in dataviz, and some of these innovations are even good.
@nunyabiznes7446 1 year ago
@@jwbaker I think it's subject to basically the same issues as all the other violin plots - even when it's serviceable, it would be improved by just cutting it in half and orienting it the same as every other X variable over Y time chart basically every time. Even for the purposes of comparing sexes, it would be dramatically better to overlay two transparent charts, letting you see precisely where and by how much they're different, than having them reflected down the midpoint for some reason forcing you to try and eyeball the two different sides for comparison
@CFTim 10 months ago
While I also don't like violin plots, a lot of this commentary is pretty bad. They're not impossible to read, often they are more intuitive because people are used to seeing dependent values on the Y-axis, and if you have a large number of data sets then superimposed histograms become impossible to read. At some point you "improve" a violin plot by rotating it, but you end up with an axis that goes from high to low. If people don't tell you how they made the plot and what smoothing they used, then that's an issue with underreporting the methods, not an issue with the plot itself. That Vanderwaals paper you showed does NOT show the same data in fig 3 and 4 as in the violin plots in fig 5 and 6. (Also you say fig 3 and 4 are histograms, but they're not. They're a lot of bar graphs without error bars packed in tight, that's not the same thing.) But where you really lost me is about halfway where you show the 3D plot... You cannot be serious. You can no longer interpret the X-axis, a lot of it becomes hard to see and you completely obscure part of the data on the left side of the back few data sets. How on earth is that better? Yes, violin plots are not great and I get that it's fun to have a go at them if that happens to be a pet peeve. But all of these bad takes really don't help your argument or the quality of the video.