Thanks everyone for your amazing comments! I had no idea so many people would see this video! I'm already looking to put together a follow-up video to address some of your questions and suggestions. For those of you who expressed interest in getting this code, I have published it on GitHub (be warned it's a bit hacky 😱): github.com/datatime27/videos/tree/main/word-tracker
I have a few negative words: I can't believe how awful it is that you don't have more subscribers, despite the painful hours of hard work you put in. I hate how terrifyingly underrated this channel is to the ruthless, unfeeling YouTube algorithm.
That tracker cannot detect sarcasm or context, so despite this comment being so friendly and supportive, it has a negative score of around 0.85 at least.
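To illustrate why a word-list scorer rates a comment like the one above as negative, here is a minimal sketch. It is not NLTK's VADER: the tiny lexicon and its weights are invented for illustration, and `toy_sentiment` is a hypothetical helper, not anything from the video's code.

```python
import re

# Hypothetical mini-lexicon. The weights are made up for this example and
# are NOT taken from NLTK's vader_lexicon.
TOY_LEXICON = {
    "awful": -2.0, "painful": -1.5, "hate": -2.5,
    "ruthless": -1.5, "unfeeling": -1.0,
    "amazing": 2.5, "love": 3.0, "great": 2.5,
}

def toy_sentiment(text):
    """Average the weight of every word found in the lexicon (0.0 if none)."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

# A supportive comment phrased with negative words: each word is scored on
# its own, so the friendly intent is invisible to the scorer.
sarcastic_praise = ("I can't believe how awful it is that you don't have more "
                    "subscribers. I hate how underrated this channel is.")
print(toy_sentiment(sarcastic_praise))  # negative, despite the kind intent
```

Because the score is built word by word, nothing in it can tell that "awful" and "hate" are aimed at the algorithm rather than the channel.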
pls 😭😭😭 how many times have i thought about data mining the youtube transcripts but i could not come up with an interesting project idea! this is so great and i am kicking myself rn; good job buddy
Jacksepticeye will produce extremely interesting data due to his high diversity in content, never deleting old videos, and drastic change in character over time. If you make a follow-up video on him, I will kiss you on the mouth.
For a word analysis video, Markiplier would be a challenge since he has interconnected story videos and horror games videos. Would love to see how you tackle the interconnected videos challenge
@@NotHalfADime As much as he uses words like flubberknucked he is a very well-spoken person like he has a very strong hold on vocabulary and word diversity. Honestly that's what got me watching his channel 10 years ago and here we are now
I'm gonna suggest Stand-up Maths, mostly because I think he'd be delighted to be the subject of someone else doing statistical analysis via a bit of python code
Wow, as a data nerd, I love this! I see a top comment saying this is a 'Wednesday afternoon for unemployed friends' but this is a 'late night I need to find this answer' vibe!
I might be more face blind than I thought because I didn't notice that looks like two of the same person until I read comments mentioning it. Then again, I was mostly listening instead of watching.
The youtube algorithm has blessed us with a criminally underrated channel. I can tell how much work you guys put into this video, and you definitely deserve more recognition. Keep it up! I'm excited for your future videos.
I know he’s had a lot of controversy recently. But he’s always been pretty open with his approach to video creation & business. He speaks fast because it keeps people watching. More people watch: He gets more money. There’s a lot to criticise. But speaking fast is still primarily for that reason I think.
TheReportOfTheWeek does food reviews so you could analyze the general positivity or negativity of every review he’s ever done! He also has a unique vocabulary and speaking style, and his videos are less edited than most, so that might add a unique attribute to the analysis
Well he needs to keep it short and simple for kids to understand "If they keep watching, they will get money, candy or video games. They may even go for a ride in his Van."
Well, in case of the rso, as was mentioned somewhere, the CC said "Chandler's cracker" when the actual bit was "Delaware's cracker", so there's that. But I still think there would be helpful data in them. Did you consider publishing them publicly? I'm not a legal expert by any means, but I believe that since it was public, it wouldn't be a problem
As a data analyst, this is so fascinating to me. I can't imagine the work that went into this project. Data is so cool, especially if it's explained in a fun, engaging way like this !
Man I love this channel, going over the flaws in a system, answering questions people may have, explaining concepts that some people may not know, and *so. Much. Data*
So I work at a call center and they analyze our sentiment score. I always wondered what exactly they were using for it, now that you told us about the NLTK I’m extremely confident that’s what they use to analyze. I shall now begin memorizing which terms have the highest sentiment scores in that Vader_Lexicon list haha, that’s going to make my analytics look so much better
would be funny if you just spat out some random words from the list at the end of the call, or at times when the caller is not paying attention, to raise your score
@@yukijoou that is funny haha! What’s even funnier is that the list is based on analysis of text and isn’t really intended for spoken English, so there are some interesting results, like how saying “143” is the third most positive term in the entire list. Imagine sneaking that into random calls
That makes no sense, how did this age like fine wine? What am I missing? Did Mr beast do something scummy with the speed of him saying his words? I'm so confused.
hey man great video, i was wondering which section of the downloaded json file contains the client login file. I have tried the "client_secret" and the client ID, but when I try to debug it using VSC i get this error "No such file or directory" thanks again for your time and marvelous video :)
@@DataTime27 Indeed. I have found the solution. It seems the program was having trouble locating the .json file, so I did the following: in line 25, instead of the default, I used os.environ.get('YouTube_API_CLIENT_SECRETS'). Then I created a .env file and defined the model as YouTube_API_CLIENT_SECRETS = "client_secret". After that I put the following command in the terminal (or cmd for Windows users): export YouTube_API_CLIENT_SECRETS=/home/usr/dir/client_secret.json, with usr = user, dir = where the file is located, and client_secret.json being the name of the file. After that I ran the program and it worked as expected. I hope this helps :)
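For anyone hitting the same "No such file or directory" error, the workaround described above could look roughly like this. It is only a sketch: the environment variable name, the fallback file name, and the `resolve_client_secrets` helper are assumptions for illustration, not what the repository's script actually contains.

```python
import os

# Assumed names: adjust both to whatever your copy of the script uses.
ENV_VAR = "YouTube_API_CLIENT_SECRETS"
DEFAULT_PATH = "client_secret.json"  # hypothetical fallback in the working dir

def resolve_client_secrets(environ=None):
    """Return the client secrets path, preferring the environment variable.

    Raises FileNotFoundError with a readable hint if the file is missing,
    instead of failing later inside the OAuth flow.
    """
    environ = os.environ if environ is None else environ
    path = environ.get(ENV_VAR, DEFAULT_PATH)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Client secrets file not found: {path!r}. "
            f"Set {ENV_VAR} to the full path of the downloaded JSON file."
        )
    return path
```

Using the full path in the variable avoids depending on which directory the debugger happens to launch the script from.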
@@DataTime27 Furthermore, after the code has been run, the words_dictionary.json file doesn't seem to auto-populate itself, could that have something to do with the Captions_Dir?
What do you mean by auto-populate itself? This json file should already have all of the words in it (about 370K). Take a look at this huge link raw.githubusercontent.com/datatime27/videos/main/word-tracker/words_dictionary.json
@@DataTime27 im familiar with the json file, when i mentioned “auto-populate” im referring to the words that are scraped from the transcription of the youtuber we selected.
This is something that I've always wondered about so seeing a video that did the hard work to satisfy my curiosity was great. With all the data you have you can honestly run many more analyses when it comes to a creator's specific accent or speaking quirks. I would honestly love to know stuff like the most uncommon English word that is used regularly, or maybe the most common word in the English dictionary that has never appeared on a channel. These are just two ideas but you can search for these things easily and I think it might lead to interesting results.
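The second idea above (the most common English word a channel has never said) could be sketched like this. It assumes you already have a list of words ordered from most to least common, which the words_dictionary.json file is not claimed to provide; `word_counts` and `first_unused` are hypothetical helpers.

```python
import re
from collections import Counter

def word_counts(transcript):
    """Count lowercase word occurrences in a transcript string."""
    return Counter(re.findall(r"[a-z']+", transcript.lower()))

def first_unused(ranked_words, counts):
    """Return the first word in a frequency-ranked list that never appears.

    ranked_words is assumed to be ordered from most to least common.
    Returns None if every word in the list was used at least once.
    """
    for word in ranked_words:
        if counts[word] == 0:  # Counter returns 0 for unseen words
            return word
    return None

# Usage with a made-up ranking and transcript:
counts = word_counts("The the and of")
print(first_unused(["the", "of", "and", "to"], counts))
```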
MARKIPLIER!!!!! I love the video, especially because I learnt R, so I'm able to understand and grasp what you used to get these analyses! Thank you
first video I've watched of data time. i was completely immersed. first time that's happened with a youtube channel in months, perhaps years. immediately subscribed. incredibly entertaining.
Okay, i love this idea, and i hope to be able to analyse my other channel's videos for this eventually. Might end up a good tool to evaluate my speaking
In an interview, Jimmy actually said that he stopped yelling and talking too much on his vids and started focusing more on storytelling with a calmer voice. So this just confirms that
I'm so surprised you don't have more subs. This video was well put-together, informative, and interacts with the audience. I love your style dude. Can't wait to see more videos! I'm subbing with the bell on so i can watch more
Before watching the video, I thought this video would analyse which words ended up giving him more views, by measuring when it was said, in what video, and how many views the video has, compared to "the background level" of normal speech.
This was so fun to watch and very entertaining. And the fact that your channel is called "Data Time" makes it an instant win. I subscribed from the name alone, I know what type of content I'll get :D Also, I totally didn't understand the "two-person" gag until halfway through the video lol
Really like the data analysis! Subscribed hoping you do more. You should look into topic analysis, grouping, and association if you're not already familiar. It would make for additional interesting data!
For a next video, I think it'd be nice to go more in-depth about outliers, and maybe go over a few methods for removing them from the dataset to get a more meaningful average! Also, as others said, I don't think you can draw many useful conclusions between popularity and words used, though. It's very fun to see more videos on NLP; amidst the AI craze, it's actually nice to see some people using these technologies properly
Tbf I think this video would have been better if framed as demonstrating the limitations of working with large data sets and interpreting conclusions from findings! Pretty much every graph came with caveats like "wpm as a measure of speech rate is thrown off by vids with little or lots of music" or "sentiment analysis doesn't capture the nuances of language like context and sarcasm".
Had a déjà vu about this video. Keep it going man; if your video sneaked into my dream, it could easily sneak into a lot of people's recommendations for sure.
It would've been interesting to see positivity plotted against the timestamp (as a % of each video's length). It would show how he structures his videos: probably positive during the intro, then more negative because of the tension somewhere in the middle.
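A rough sketch of that idea, assuming each caption segment already has a start time in seconds and a sentiment score from some scorer. The `(start_seconds, score)` input shape and the `sentiment_by_position` name are assumptions, not the format the video's code uses.

```python
def sentiment_by_position(segments, duration, buckets=10):
    """Average sentiment per slice of a video's running time.

    segments: iterable of (start_seconds, score) pairs (assumed format).
    duration: video length in seconds.
    Returns a list of `buckets` averages; None where a slice has no captions.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    sums = [0.0] * buckets
    counts = [0] * buckets
    for start, score in segments:
        # Map the timestamp to a slice index, clamping the very last moment
        # of the video into the final slice.
        idx = min(int(start / duration * buckets), buckets - 1)
        sums[idx] += score
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Usage: a 60-second video split into two halves.
print(sentiment_by_position([(0, 0.5), (30, -0.5), (59, 1.0)], 60, buckets=2))
```

Normalizing by duration is what makes videos of different lengths comparable, so the per-slice averages could then be averaged again across a whole channel.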
Found this in my recommended and I'm glad I clicked on it. The video is actually super entertaining and I really enjoyed watching, so you just earned a sub :)