Dropout literally has a whole show about nerds making pedantic corrections called "Um, Actually". Mike Trapp was the host. But I realize I don't get any points because I failed to say "Um, Actually".
I absolutely agree. Tossing in more layers just *feels* wrong. There's definitely something missing in these newer neural models: while they perform well, they don't do so efficiently. Either they'll massively improve in the future by adopting some of the older techniques, or by being crafted architecturally with more biological inspiration.
Not really, this is older technology to relate similar contexts together. Modern LLMs (or Muppet Models, as I like to call them) use continuous representations to do that.
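A minimal sketch of what "continuous representations" buys you, using made-up 3-dimensional vectors (real models use hundreds or thousands of dimensions, learned from data):

```python
import math

def cosine(u, v):
    # Cosine similarity: how aligned two vectors are, ignoring their lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy 3-d "embeddings" (illustrative numbers, not from any real model).
vec = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "pizza": [0.10, 0.20, 0.90],
}

# Words that appear in similar contexts end up near each other,
# so similarity falls out of simple geometry.
print(cosine(vec["king"], vec["queen"]))  # high: similar contexts
print(cosine(vec["king"], vec["pizza"]))  # lower: dissimilar contexts
```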
I haven't finished the video, so apologies if you cover it, but in the 2023 CS224N NLP lecture on coreference resolution, Chris Manning introduces the (very complicated and, to me, demoralizing) Hobbs algorithm, and then basically says something like "Hobbs HIMSELF said publicly that he didn't like the algorithm, and often pointed to it as ~an example of how we clearly needed something better."
Your video has sparked a meaningful conversation. How has being a young-onset Parkinson's patient shaped Jessica's perspective on life? As the host of a dream interpretation channel, I'm curious to explore how her experiences with Parkinson's influence her dreams and subconscious mind. I truly appreciate the opportunity to learn more about Jessica's journey, and I've already liked and subscribed to the channel for more insightful content like this.
Can you share this video with the president of Harvard? I don’t think she got the message. Yet somehow DEI still thinks it was okay for her to cheat. DEI is accusing everyone of racism.
Awesome, can't believe those guys tried doing this in your class. This is like committing a burglary and leaving behind a confession note and a business card. This is really funny.
You're welcome! Glad to be of help. This is an old video (pre-neural revolution), I just went through it again and it holds up pretty well (except for my not-so-great green screen).
If you're a human, thank you! If you're a bot, you're an excellent example of the technology in the video, so thank you for providing a real-world example. :)
Great intro video, and lovely coverage of the key concepts there. I listened to the guy credited with coming up with the transformer model, and I think that in adjusting the word vectors to predict the next word in a sequence more effectively, it's also mapping phrases, sentences, ideas, and concepts into multidimensional space, up to its input context length. So it ends up having what Isaac Asimov described as a "perceptual schematic" of the world, how everything relates to everything else, encoded in multidimensional space. Then all the behaviours it's trained to perform via RLHF are possible because it has this initial perceptual schematic.
Yes, but that schematic isn't a schematic (yet). It's just a vector space, which means that the exact meanings can get fuzzy. This association can only get us so far, which is why we're starting to see the technology's limits. Exciting to see what happens!
@JordanBoydGraber I'm not sure we are starting to see the technology's limits? I appreciate your breadth and depth of knowledge in the field, but all of the indications from these companies would appear to suggest that we're not close to approaching an asymptote with these models yet. I do think I know what you're saying, though, and I agree: what it has is a set of interrelated numbers; it has no actual "knowledge" per se. It's what it's trained to do with these interrelated numbers that matters.

I think the best analogy for what I'm saying is the vision transformer model. It starts off representing small patches of the image as vectors, like words, with an associated positional encoding vector for each patch. It learns not only to classify the entire image, and to cluster similar images in dimensional space when it classifies them, but also to adjust the positional encodings for each patch, orienting it correctly within the image so it has a much better chance of classifying the whole image.

I see the same with the language transformer model. It's adjusting vectors at the word level, but because it's using these word vectors to do something with the whole block of text, it's still learning to place the entire block of text, in one-word iterations, up to its context length, in certain positions in interrelated dimensional space, just as it does with images, even though it only has vectors for words, like it only has vectors for small image patches. Then further training helps it prune down this vast interrelation to a conceptual map (this second part is just a theory from me). I think there may be a limit with purely language-based models, but potentially the sky is the limit with multimodality. The constraining factor appears to be hardware at the moment, in my opinion.
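A minimal sketch of the vision-transformer input pipeline described above, with random weights standing in for the learned patch embedding and positional encodings (the shapes and numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy image: 8x8, single channel; split it into four 4x4 patches.
image = rng.standard_normal((8, 8))
patches = [image[r:r + 4, c:c + 4].reshape(-1)   # flatten each patch
           for r in (0, 4) for c in (0, 4)]      # 4 patches of 16 pixels

d_model = 8
# These are learned in a real ViT; random here for illustration.
W_embed = rng.standard_normal((16, d_model)) * 0.1
pos_enc = rng.standard_normal((4, d_model)) * 0.1  # one per patch position

# Each patch becomes a vector, plus a positional encoding that tells the
# transformer where in the image the patch came from -- the image analogue
# of word vectors plus word positions.
tokens = np.stack([p @ W_embed for p in patches]) + pos_enc
print(tokens.shape)  # (4, 8): a "sentence" of 4 patch tokens
```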
At 9:42, for R_m(H): what is the use of taking the expectation over all samples? As we saw previously (e.g., at 6:12), the calculation of the empirical Rademacher complexity does not use the true labels of the samples, just the size of the sample.
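For readers following along, the two quantities being contrasted are, in one standard notation (which may differ slightly from the lecture's):

```latex
% Empirical Rademacher complexity: depends on a particular sample S = (x_1, ..., x_m)
\hat{\mathfrak{R}}_S(H) = \mathbb{E}_{\sigma}\Big[\sup_{h \in H} \frac{1}{m}\sum_{i=1}^{m} \sigma_i\, h(x_i)\Big]

% Rademacher complexity: the expectation over samples of size m drawn from D
\mathfrak{R}_m(H) = \mathbb{E}_{S \sim D^m}\big[\hat{\mathfrak{R}}_S(H)\big]
```

The empirical version indeed never uses true labels, but it does depend on which particular unlabeled points landed in S; taking the expectation over samples removes that dependence, leaving a quantity that depends only on the distribution D and the sample size m.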
Really excellent, like the other videos in this series. I am sharing the course with colleagues and hoping to go through the syllabus in the Spring. Thank you for the excellent work, Prof!
At 6:55, it is said that H(x, M) = sum(log(M(xi))), but according to the definition of cross entropy, it should be H(P, Q) = sum(-P(x) * log(Q(x))); so are we assuming P(x) is always one when computing perplexity?
This is a really good point. Typically when you evaluate perplexity you have one document that somebody actually wrote. E.g., you're computing the perplexity of the lyrics of "ETA". In that case we have a particular sequence of words: given the prefix "He's been totally", the probability of x_t = "lying" is one and of everything else is zero. For some generative AI applications, this might not be true; e.g., for machine translation you might have multiple references. Thanks for catching this unstated assumption!
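A minimal sketch of that computation, using a hypothetical toy model whose next-word probabilities are invented for illustration (not taken from the video or from any real model):

```python
import math

# Hypothetical next-word probabilities from some model M, keyed by a
# 3-word prefix. In practice these come from an actual language model.
model_prob = {
    ("He's", "been", "totally"): {"lying": 0.6, "honest": 0.3, "fine": 0.1},
}

def perplexity(tokens, model_prob):
    # Because the reference text fixes each next word, the "true"
    # distribution P is one-hot, so cross entropy reduces to
    # -1/N * sum(log M(x_i)) over the words that were actually written.
    log_probs = []
    for i in range(3, len(tokens)):
        prefix = tuple(tokens[i - 3:i])
        log_probs.append(math.log(model_prob[prefix][tokens[i]]))
    return math.exp(-sum(log_probs) / len(log_probs))

ppl = perplexity(["He's", "been", "totally", "lying"], model_prob)
print(ppl)  # exp(-log 0.6) = 1/0.6
```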
Should equation 6) be 2e^(-epsilon*m/2)? This is because the chance of sampling from the whole highlighted region is epsilon, so the probability of sampling from a specific region is epsilon/2. Thank you for the great lecture!
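If the setup is as described, with the highlighted region of total mass epsilon split into two pieces of mass epsilon/2 each, the reasoning behind the proposed bound would be (a sketch, assuming that geometry):

```latex
% Probability that all m i.i.d. samples miss one piece of mass \epsilon/2:
\Pr[\text{miss one piece}] = (1 - \epsilon/2)^m \le e^{-\epsilon m / 2}

% Union bound over the two pieces:
\Pr[\text{miss either piece}] \le 2\, e^{-\epsilon m / 2}
```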
Cool idea! Other rating ideas: how evenly does the straight line cut the country into two pieces? Are they the same size? The same population on each side of the line? This way you can allow for easy countries and hard countries, and score the "even" dissection of countries irrespective of how long the line is. Also a hint: your microphone has some awful automatic gain setting or something, where all the quiet sounds are amplified and all the loud sounds are quieted down, so your tiniest breath in is the same volume as your loudest talking bits. It's really annoying.
1) I like the population bisection idea. It's obviously easier to go through less populated areas. 2) Thanks for mentioning that, it's easy to tune these sorts of things out.
The name "muppet models" is super cute, but alas, the perspective that muppet models just make stuff up is misplaced: true in some ways but also dangerously wrong. They do get things wrong or out of place when speaking off the top of their head, but, um, statistically far less than humans already do. The confusion is that they're so much better at talking than humans that they can give almost-accurate, coherent essays about stuff completely off the top of their heads, while a human would just be saying "uhhhhh". If you give them the equivalent of a human salary's worth of compute, they can also check the accuracy of things a zillion times better than any human could ever check.
Yuval Pinter makes the excellent point that I shouldn't conflate "writing system" and "language". Indeed, this video should have been titled "How to Know if Your Writing System is Broken". See more in their excellent position paper on the subject: aclanthology.org/2023.cawl-1.1/