Jordan Boyd-Graber
My research lies at the intersection of natural language processing and machine learning. I post videos from my courses and research.
What Makes for a Bad Question? [Lecture]
11:18
7 months ago
Don’t Cheat with ChatGPT in my class! [Rant]
6:27
7 months ago
Is ChatGPT AI? Is it NLP? [Lecture]
4:04
7 months ago
What made ChatGPT Possible? [Lecture]
5:41
7 months ago
Do iid NLP Data Exist? [Lecture]
10:02
11 months ago
Comments
@BahaedinKhodami 12 days ago
Amazing!
@13strong 24 days ago
You realise this is a comedy show, right? It's not supposed to be taken this seriously.
@JordanBoydGraber 24 days ago
Dropout literally has a whole show about nerds making pedantic corrections called "Um, Actually". Mike Trapp was the host. But I realize I don't get any points because I failed to say "Um, Actually".
@JordanBoydGraber 24 days ago
And if this is a way for new people to understand AI better than Grant does, I consider that a win.
@JordanBoydGraber 24 days ago
Repost, first post was missing a final edit, sorry about that!
@zachcrennen2342 25 days ago
This guy is good, great explanation!
@buzhichun 1 month ago
Very interesting, thanks for sharing
@DanceScholar 1 month ago
Great to see a breakdown of what Cicero is and is not.
@jacklennox1 1 month ago
Great video, thank you 👏👏
@abdulaziza.9654 1 month ago
Beautiful !!
@pseudoki 1 month ago
I absolutely agree. Tossing in more layers just *feels* wrong. There is definitely something missing in these newer neural models: while they perform well, they don't really do so efficiently. Either they will massively improve in the future by using some of the old techniques, or by being crafted architecturally with more biological inspiration.
@tariqkhan1518 1 month ago
Can you please reference the paper you mentioned from Google?
@tariqkhan1518 1 month ago
Got it, it's in the description.
@wilfredomartel7781 1 month ago
🎉
@dursung_ 2 months ago
Masterpiece! Amazing intro, thanks
@michaelmoore7568 3 months ago
As much as I hate LLMs... do LLMs use Chinese Restaurant Processes and/or Kneser-Ney?
@JordanBoydGraber 3 months ago
Not really, this is older technology to relate similar contexts together. Modern LLMs (or Muppet Models, as I like to call them) use continuous representations to do that.
@maryam2677 4 months ago
Perfect! Thank you so much.
@RajivSambasivan 4 months ago
Thanks, that was informative. Learned something.
@420_gunna 5 months ago
I haven't finished the video, so apologies if you cover it, but in the 2023 CS224N NLP lecture on coreference resolution, Chris Manning introduces the (very complicated and demoralizing, to me) Hobbs algorithm, and then basically says something like "Hobbs HIMSELF said publicly that he didn't like the algorithm, and often pointed to it as an example of how we clearly needed something better."
@amoghmishra9222 5 months ago
Synthetic data generation has become so easy now thanks to LLMs!
@exploreyourdreamlife 5 months ago
Your video has sparked a meaningful conversation. How has being a young-onset Parkinson's patient shaped Jessica's perspective on life? As the host of a dream interpretation channel, I'm curious to explore how her experiences with Parkinson's influence her dreams and subconscious mind. I truly appreciate the opportunity to learn more about Jessica's journey, and I've already liked and subscribed to the channel for more insightful content like this.
@donfeto7636 6 months ago
13:11 There is a mistake in the last line: it should be t(e1,f2) * ( t(e2,f0) + t(e2,f1) + t(e2,f2) ); the slide duplicates f2.
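For context, the expression in this comment has the shape of the IBM Model 1 likelihood, in which the probability of the target sentence factorizes into one sum over source positions (including the NULL word, written f0 here) per target word, with each source word appearing exactly once in each sum. A minimal sketch with invented t(e,f) values (not the slide's actual numbers, and omitting the constant alignment-normalization factor):

```python
# Invented lexical translation probabilities t(e | f); not the slide's numbers.
t = {
    ("e1", "f0"): 0.1, ("e1", "f1"): 0.6, ("e1", "f2"): 0.2,
    ("e2", "f0"): 0.2, ("e2", "f1"): 0.3, ("e2", "f2"): 0.4,
}
source = ["f0", "f1", "f2"]  # f0 plays the role of the NULL word
target = ["e1", "e2"]

# IBM Model 1 shape: one factor per target word, each summing over every
# source word exactly once (no duplicated f terms within a factor).
likelihood = 1.0
for e in target:
    likelihood *= sum(t[(e, f)] for f in source)

print(f"{likelihood:.2f}")  # (0.1 + 0.6 + 0.2) * (0.2 + 0.3 + 0.4) = 0.81
```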
@user-nm8tj4rh2t 7 months ago
Jordan is soooooooo cool ...🤭 I really want to meet you at the NLP conference ...!!!!
@user-wr4yl7tx3w 7 months ago
Can you share this video with the president of Harvard? I don’t think she got the message. Yet somehow DEI still think it was okay for her to cheat. DEI is accusing everyone of racism.
@RajivSambasivan 7 months ago
Awesome, can't believe people tried doing this in your class. This is like committing a burglary and leaving a confession note and a business card. This is really funny.
@JordanBoydGraber 7 months ago
Not just that. I'm not sure what the right analogy is, but it's that *plus*: trying to rob the safe company, the thief's guild, or the police station.
@lianghuang3 7 months ago
thanks for using my slides! :)
@JordanBoydGraber 7 months ago
Thanks for making such great slides!
@gametimewitharyan6665 7 months ago
My book mentioned continuous and discrete data but did not explain anything. Your video clarified it so well for me. Thanks a lot!!!
@JordanBoydGraber 7 months ago
You're welcome! Glad to be of help. This is an old video (pre-neural revolution), I just went through it again and it holds up pretty well (except for my not-so-great green screen).
@sebastianM 7 months ago
Fire video after fire video with this guy. Incredible.
@JordanBoydGraber 7 months ago
If you're a human, thank you! If you're a bot, you're an excellent example of the technology in the video, so thank you for providing a real-world example. :)
@leslietetteh7292 7 months ago
Great intro video, and lovely coverage of the key concepts there. I listened to the guy credited with coming up with the transformer model, and I think in adjusting the word vectors to predict the next word in a sequence more effectively, it's also mapping phrases, sentences, ideas and concepts into multidimensional space, up to its input context length. So it ends up having what Isaac Asimov described as a "perceptual schematic" of the world, how everything relates to everything else, encoded in multidimensional space. Then all the behaviours it's trained to perform based on RLHF are possible because it has this initial perceptual schematic.
@JordanBoydGraber 7 months ago
Yes, but that schematic isn't a schematic (yet). It's just a vector space, which means that the exact meanings can get fuzzy. This association can only get us so far, which is why we're starting to see the technology's limits. Exciting to see what happens!
@leslietetteh7292 7 months ago
@JordanBoydGraber I'm not sure we are starting to see the technology's limits? I appreciate your breadth and depth of knowledge in the field, but all of the indications from these companies would appear to suggest that we're not close to approaching an asymptote with regards to these models yet. I do think I know what you're saying though, and I agree: what it has is a set of interrelated numbers, it has no actual "knowledge" per se; it's what it's trained to do with these interrelated numbers really. I think the best analogy to get at what I'm saying is with the vision transformer model. It starts off representing small patches of the image as vectors, like words, and has an associated positional encoding vector for each patch too. It learns not only to classify the entire image, and to cluster similar images in dimensional space when it classifies them, but it also learns positional encodings for each patch of the image, adjusting the positional encodings for each patch to orient it correctly in terms of the image, so it has a much better chance of classifying the whole image. I see the same with the language transformer model. It's adjusting vectors on a word level, but because it's using these word vectors to do something with the whole block of text, it's still learning to place the entire block of text, in one-word iterations, up to its context length, in certain positions in interrelated dimensional space, just like it does with images, even though it only has vectors for words, like it only has vectors for small image patches. Then further training helps it prune down this vast interrelation to a conceptual map (the second part is just a theory from me here). I think there may be a limit with purely language-based models, but potentially the sky is the limit with multimodality. The constraining factor appears to be hardware ATM, imo.
@dipaco_ 8 months ago
This is an amazing video. Very intuitive. Thank you.
@sebastianM 8 months ago
Incredible work. Sharing with my class.
@Kaassap 8 months ago
This was very helpful tyvm!
@yusufahmed2233 9 months ago
9:42 For Rm(H), what is the use of taking the expectation over all samples? As we saw previously, e.g. at 6:12, the calculation of the empirical Rademacher complexity does not use the true labels of the samples, rather just the size of the sample.
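For reference, the standard textbook formulation (e.g., Mohri et al.'s Foundations of Machine Learning; the notation may differ from the lecture slides) distinguishes the sample-dependent and distribution-level quantities:

$$\hat{\mathfrak{R}}_S(H) = \mathbb{E}_{\sigma}\Big[\sup_{h \in H} \frac{1}{m}\sum_{i=1}^{m} \sigma_i\, h(x_i)\Big], \qquad \mathfrak{R}_m(H) = \mathbb{E}_{S \sim D^m}\big[\hat{\mathfrak{R}}_S(H)\big].$$

Neither quantity uses the true labels, but the empirical version does depend on the particular points x_1, ..., x_m in S, not only on m; the outer expectation over samples is what yields a quantity that depends only on the data distribution and the sample size.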
@grospipo20 9 months ago
Interesting
@sebastianM 10 months ago
It's really wonderful when no nonsense science communication comes with a generous helping of low-key courage. Dope.
@sebastianM 10 months ago
Really excellent like the other videos on this series. I am sharing the course with colleagues and hoping to go thru the syllabus in the Spring. Thank you for the excellent work, Prof!
@taofiqaiyeloja1820 10 months ago
Excellent
@sebastianM 10 months ago
This is incredible. Thanks!
@user-qx9cg5hx9w 10 months ago
At 6:55, it is said that H(x, M) = sum(log(M(xi))), but according to the definition of cross entropy it should be H(P, Q) = sum(-1 * P(x)log(Q(x))), so are we assuming P(x) is always one when computing perplexity?
@JordanBoydGraber 10 months ago
This is a really good point. Typically when you evaluate perplexity you have one document that somebody actually wrote. E.g., you're computing the perplexity of the lyrics of "ETA". In that case we have a particular sequence of words: given the prefix "He's been totally", P(x_t = "lying") is one and everything else is zero. For some generative AI applications this might not be true; e.g., for machine translation you might have multiple references. Thanks for catching this unstated assumption!
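To make that one-hot assumption concrete, here is a minimal sketch (not from the video; the token probabilities are invented) showing that when the reference distribution P puts probability one on each observed token, the cross entropy collapses to the average negative log probability of the observed tokens, and perplexity is its exponential:

```python
import math

# Hypothetical model probabilities M(x_t | prefix) assigned to the tokens
# actually observed in a short text (values are invented).
observed_token_probs = [0.2, 0.05, 0.5, 0.1]

# With a one-hot reference P, H(P, M) = -sum_x P(x) log M(x) reduces to
# -log M(observed token) at each position; we average over positions.
cross_entropy = -sum(math.log(p) for p in observed_token_probs) / len(observed_token_probs)

# Perplexity is the exponential of the per-token cross entropy.
perplexity = math.exp(cross_entropy)
print(f"cross entropy: {cross_entropy:.3f} nats, perplexity: {perplexity:.2f}")
```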
@AlbinAichberger 10 months ago
Excellent interview. Excellent YT Channel, thank you!
@heyman620 10 months ago
That's just a brilliant video, I appreciate the fact that your videos always introduce an uncommon point of view that still makes a lot of sense.
@jeromeeusebius 11 months ago
Is there a link to the "mark riddle(?)" transformer diagram? Can't find it in the description.
@samay-u2n 1 month ago
pbs.twimg.com/media/FZUiCbpXgAEd11j?format=jpg&name=large , say no more ;)
@candlespotlight 11 months ago
Amazing video!! I’m so glad you covered this. Your passion and enjoyment about the subject really comes through. Thanks so much for this ☺️
@tombuteux9294 11 months ago
Should equation (6) be 2e^(-epsilon*m/2)? This is because the chance of sampling from the whole highlighted region is epsilon, so the probability of sampling from a specific region is epsilon/2? Thank you for the great lecture!
@andyvon034 11 months ago
Yes, I think so too: epsilon/2 for each side.
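Spelling out the reasoning in these two comments (this assumes the usual PAC-style argument in which the error region is split into two pieces of probability mass epsilon/2 each; the exact setup in the lecture may differ): the probability that all m i.i.d. samples miss one fixed piece is at most e^(-epsilon*m/2), and a union bound over the two pieces gives the suggested factor of 2:

$$\Pr[\text{all } m \text{ samples miss a fixed piece}] = \left(1 - \tfrac{\epsilon}{2}\right)^{m} \le e^{-\epsilon m / 2},$$
$$\Pr[\text{some piece receives no samples}] \le 2\left(1 - \tfrac{\epsilon}{2}\right)^{m} \le 2\, e^{-\epsilon m / 2},$$

using the inequality $1 - x \le e^{-x}$.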
@dundeedideley1773 1 year ago
Cool idea! Other rating ideas: how evenly does the straight line cut the country into two pieces? Are they the same size? Same population on each side of the line? This way you can allow for easy countries and hard countries, where you can score the "even" dissection of countries irrespective of how long the line is. Also a hint: your microphone has some awful automatic gain setting or something, where all the quiet sounds are amplified and all the loud sounds are quieted down, so your tiniest breathing in is the same volume as your loudest talking bits. It's really annoying.
@JordanBoydGraber 1 year ago
1) I like the population bisection idea. It's obviously easier to go through less popular areas. 2) Thanks for mentioning that, it's easy to tune these sorts of things out.
@kwesicobbina9207 1 year ago
Loved this video 😅 for some reason ❤
@JordanBoydGraber 1 year ago
Thanks! Good to know. Perhaps I'll do more things like this. Not relevant to any of my classes, really, but I enjoyed doing it.
@mungojelly 1 year ago
The name "muppet models" is super cute, but alas the perspective that muppet models just make stuff up is misplaced; it's true in some ways but also dangerously wrong. They do get things wrong or out of place when speaking off the top of their head, but, um, statistically far less than humans do already. The confusion is that they're so much better at talking than humans that they can give almost accurate, coherent essays about stuff completely off the top of their heads while a human would just be saying "uhhhhh". If you give them the equivalent of a human salary worth of compute, they can also check the accuracy of things a zillion times better than any human could ever check.
@DomCim 1 year ago
Dance your cares away <clap><clap> Worries for another day <clap><clap>
@JordanBoydGraber 1 year ago
Sing und schwing das Bein, <klatschen> lass die Sorgen Sorgen sein.
@darkskyinwinter 1 year ago
It's canon now.
@shakedg2956 1 year ago
You don't have enough views.
@JordanBoydGraber 1 year ago
From your mouth to the algorithm's parameters!
@JordanBoydGraber 1 year ago
Yuval Pinter makes the excellent point that I shouldn't conflate "writing system" and "language". Indeed, this video should have been titled "How to Know if Your Writing System is Broken". See more in their excellent position paper on the subject: aclanthology.org/2023.cawl-1.1/
@jayronfinan 1 year ago
Lol what was that short powermark on question 32