
George Hotz | Latent Space Ep 18: Petaflops to the People - with George Hotz of tiny corp | tinygrad 

george hotz archive
196K subscribers
32K views

Date of the podcast: 20 Jun 2023.
Follow, Subscribe to Latent Space:
- latent.space/p/geohot (writeup and show notes)
- youtube.com/@LatentSpace-podcast
- twitter.com/latentspacepod
- twitter.com/swyx (Shawn Wang)
- twitter.com/fanahova (Alessio Fanelli)
Source of this video:
- Ep 18: Petaflops to the People - with George Hotz of tiny corp (Latent Space podcast)
We got permission ( / 1671261099528982528 ) from Shawn Wang (of Latent Space) to upload this video. All material displayed in this video belongs to their respective owners. We uploaded this video in good faith to share the work and progress of George Hotz, tiny corp and comma.ai.
Chapters:
00:00:00 intro
00:00:55 openpilot, devkit, gatekeeping
00:01:35 the hero's journey, what was the portal?
00:02:15 sam altman at congress, ml compute, nvidia, qualcomm
00:03:24 CISC, Arm, RISC-V
00:04:15 a good AMD stack, Google TPU, Google wrote their own ML framework
00:06:05 turing completeness, re-order buffer, speculative execution, branch prediction, halting problem
00:07:40 clockless, analog computing, changing the cache hierarchy, removing branch prediction, warp schedulers
00:08:20 turing completeness is easy, what CUDA is, TPU, systolic arrays
00:10:05 systolic arrays visualization, TPU closed source, AWS Trainium
00:11:25 tinygrad lines of code, pytorch, tensorflow code
00:12:34 tinygrad developer experience, ONNX, ONNX runtime, compliance tests, Core ML
00:13:25 unnecessary memory operations, pytorch lightning, why pytorch relu is a class
00:16:05 laziness, eager vs graph compute model
00:17:30 competing against pytorch's smart people, less complexity
00:18:15 how fusing works, lazy.py
00:19:10 GRAPH=1, DEBUG=2, John Carmack
00:21:05 tinygrad currently uncompetitive on nvidia and x86, slower
00:21:32 tinygrad competitive on qualcomm gpus
00:22:25 tensor core support, AMD bugs, opencl, mlperf
00:23:45 AMD kernel driver, ml framework, user-space runtime, cuda_ioctl_sniffer
00:24:30 kernel panic, intel GPUs, AMD's Lisa Su, AMD communication people
00:26:35 open source culture, nvidia NCCL, nvidia P2P, cuda memcpy
00:28:00 building in public, contributing bug fixes to open source
00:28:32 ggml, M1 pytorch, AMD pytorch
00:30:00 test_ops.py, CI, good tests, mojo, pytorch compatibility
00:31:35 replicating python is hard
00:32:08 tiny box red, limited by GPUs, luxury ai computers, fp16 llama
00:33:22 ggml quantization, compressing the weights, memory bandwidth
00:35:32 int8 support, weights in int8, fp16 to int8 to fp16
00:37:45 tiny box challenges, 6 GPUs, blowers or watercooling, pcie 4 extenders, pcie redrivers
00:39:10 silent tiny box, 45-50 dB, one outlet of power, limiting GPU power
00:40:30 AI hub for the home, personal compute cluster, pcie bandwidth
00:41:50 training limit on the tiny box, 7B, interconnect bandwidth
00:43:05 training longer vs making a bigger model, inference on the cloud
00:44:30 on-device training, fine-tuning
00:45:25 mining FLOPCoin, how to tell a crypto is a scam
00:45:45 how to ensure your data is correct, tiny net
00:46:25 federated training, distributed training
00:47:42 enterprise use, flops per dollar, flops per watt, one person of compute = 20 PFLOPS
00:49:32 one Tampa of compute, GPT-4 mixture model, 16 inferences
00:50:40 secretive companies, hiding something that is not that cool
00:51:10 better training, batch norm, flash attention
00:52:50 Rich Sutton's The Bitter Lesson, compute is all you need
00:53:40 Hutter Prize, RNN, MDL, what OpenAI is getting wrong, OpenAI vs working at Facebook
00:55:38 how to hire people when the computer can do everything
00:56:20 can a model do a simple pull request?
00:57:05 unimpressed by language models, subpar rap lyrics generation
00:58:04 10 LLMs in a room discussing the answer, program generation
00:58:45 tiny corp is a remote company, 1000 job applications, programming challenges
00:59:30 tinygrad pull requests, stipend
01:00:45 coding is tool-complete (above the API line), driving is not tool-complete (below the API line)
01:01:40 stable diffusion replacing artists, tools getting better
01:02:30 full time at tiny corp, working on bounties, proposing bounties
01:03:16 separation in the company
01:04:05 comma body, a software problem
01:05:40 large YOLOs, segment anything, talking to LLMs, latency
01:06:12 LLaMA vs ChatGPT
01:06:40 no distinction between computer vision and language
01:07:30 a company after tiny corp, AI girlfriend, merging with a machine
01:08:50 brain upload, George's brain is already on youtube
01:09:30 living forever, how many weights a human has
01:11:05 the goddess of everything else, AI is not really going to kill us
01:11:35 AI alignment problem, the complexity will continue, paperclippers do not exist
01:12:25 grateful for AI, you don't need hard math to understand ML
01:13:54 John Carmack's six insights, Elon's methodology
01:14:25 accessibility, tiny corp building computers, luck
01:15:25 why transformers work, semi weight sharing, qualcomm
01:16:25 the weights can change dynamically based on context
01:17:10 attention is all you need
01:17:50 Elon's fundamental science is physics, George's is information theory
01:18:55 e/acc, only the left takes ideology seriously
01:19:45 effective accelerationism, Marc Andreessen
01:20:25 why avatar 2 was bad, Jake Sully
01:21:35 a ChatGPT-level pull request
01:22:00 impact of chat bots, spam bots
01:22:40 go try tinygrad
01:22:55 building chips, silicon mines, self-reproducing robots
We archive George Hotz and comma.ai videos for fun.
Thank you for reading and using the SHOW MORE button.
We hope you enjoy watching George's videos as much as we do.
See you in the next video.

Science

Published: 29 Jun 2024

Comments: 44
@geohotarchive · 1 year ago
Writeup and show notes: www.latent.space/p/geohot
Grateful to Shawn Wang (of Latent Space) for allowing us to upload this video.
Follow, Subscribe to Latent Space:
- www.latent.space
- youtube.com/@LatentSpace-podcast
- twitter.com/latentspacepod
- twitter.com/swyx (Shawn Wang)
- twitter.com/fanahova (Alessio Fanelli)
Source: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-K5iDUZPx60E.html
All material displayed in this video belongs to their respective owners. We uploaded this video in good faith to share the work and progress of George Hotz, tiny corp and comma.ai.
@beeztherapy · 1 year ago
George Hotz is not anything lol. I'm 14, I know how to hack Linux, SQL injection, whatever. Anyone can code a self-driving car with OpenAI, no one cares. The reason Tesla was made is that people want a new look for a car, not the same picture they see every day; they want science. So geohot, listen buddy, very close: you're not special. My 11-year-old friend can code better than you, she is in Mensa. You're just a guy who can write a little C++. Any programmer or hacker in the comments would say the same. Do you want a cookie and attention for jailbreaking a weak system? Sit down and stop talking. Your "interviews" have people putting their hands above their heads, annoyed, like, when does this stop LOL
@user-uc9nu1yn1n · 1 year ago
Geohotz 2024!
@gillianorley · 1 year ago
I own two Comma 3 devices. The “research projects.” They drive my car and truck every day.
@atillacodesstuff1223 · 1 year ago
based
@hayd7371 · 1 year ago
I love how Geohot is a man of principles. I trust him.
@semtex6412 · 1 year ago
Can't say this enough, but GeoHot needs to be heard!
@thuh1951 · 1 year ago
n0
@RG-si1qz · 11 months ago
@thuh1951 yES
@WisamAlRawi · 1 year ago
I wish the audio volume was louder. I maxed it out and it was still low. Great interview.
@ChristyZach · 1 year ago
Love to hear George talking. Wish he becomes as successful as Elon. Anarchist engineer...
@kulaengineering · 1 year ago
Are you kidding? One tech-mogul jerk is enough. Send him to Mars.
@SwornInvictus · 11 months ago
GeoHot is my favorite dude in tech
@icanyagmur · 1 year ago
I didn't get why George doesn't understand why they call it "attention" in NLP. I mean, he proves it himself in the video by saying "load the weights given the context" (1:16:57). The word "attention" is a higher-level description of exactly that. Or was he talking about something else? Please clarify if I got him wrong. Also, what does he mean by "semi-weight sharing"?
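For anyone puzzling over the same passage: the standard scaled dot-product attention from "Attention Is All You Need" matches the "load the weights given the context" reading. Below is a minimal numpy sketch (our illustration, not code from the podcast; `attention` and `softmax` are hypothetical helper names). One plausible reading of "semi weight sharing" is visible in it: the stored parameters (the Q/K/V projections, omitted here) are fixed and shared across positions, while the effective mixing weights A are rebuilt from each input.

```python
# Minimal scaled dot-product attention in plain numpy (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d = Q.shape[-1]
    # A is recomputed from the context on every input:
    # these "weights" are not stored parameters of the model.
    A = softmax(Q @ K.T / np.sqrt(d))
    return A @ V  # behaves like a per-input, dynamically built linear layer

# toy example: 4 tokens, 8-dimensional embeddings (self-attention, Q=K=V)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # -> (4, 8)
```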
@andtpfack8243 · 11 months ago
Love geohot but I hope he doesn't lose the plot. To me he seems to be on the verge of being completely absorbed into bits and bytes. I guess that's what it takes.
@psibarpsi · 11 months ago
Yup! That's what it takes.
@Hambxne · 11 months ago
He still has Alex to keep him sane
@justdoeverything8883 · 11 months ago
The comma body moves around so much better than Tesla's lol. Add some climbing capability for stairs and I think it would outperform in some ways, at least in price, simplicity, and utility.
@chaigtin259 · 1 year ago
1:02:21 Why is the closed-captioning so random? "kanban board", "cabin board", "combat board", and finally "compound board".
@miyamotomasao3636 · 11 months ago
Google/YouTube's AI for automatic subtitles is not intelligent. One more proof AI should be called AS: Artificial Stupidity. Or FI: Fake Intelligence.
@dnserror89 · 7 months ago
Lol
@bennguyen1313 · 11 months ago
Regarding the 12m mark, on how Google's TPU is the alternative to Nvidia/AMD chips: I understand the Open Neural Network Exchange format (ONNX) was created so that deep learning models could be exchanged regardless of how they were generated, but how do the conformance tests for ONNX interoperability compare with Core ML's? Where are those tests?
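Not an answer from the episode, just orientation: ONNX's conformance-style tests ship in the ONNX repository (the backend test suite under onnx/backend/test); to our knowledge Core ML has no directly comparable public suite. The basic interchange round-trip the format enables looks roughly like this sketch, assuming the stock torch, onnx, and onnxruntime packages ("model.onnx" is an arbitrary filename):

```python
# Sketch of the ONNX round-trip: export from PyTorch, validate, run elsewhere.
import torch
import onnx
import onnxruntime as ort

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU()).eval()
dummy = torch.randn(1, 4)

# Export the graph, then check structural validity against the ONNX spec.
torch.onnx.export(model, dummy, "model.onnx")
onnx.checker.check_model(onnx.load("model.onnx"))

# Any conforming runtime can now execute the model; here, ONNX Runtime on CPU.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
(out,) = sess.run(None, {sess.get_inputs()[0].name: dummy.numpy()})
print(out.shape)  # -> (1, 8)
```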
@chantc777 · 1 year ago
W project
@SecretState · 11 months ago
If Mr Hotz met the man who's raging at the system 24 hrs a day... then there would be #Carnage 🎉🎉💚✌️
@xyster7 · 1 year ago
Hey, what about Mercedes and their autonomous systems?
@mosicr · 11 months ago
PaLM is a 540-billion-parameter transformer-based large language model developed by Google AI (re: "can't train a model bigger than 220B").
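For scale (our own back-of-the-envelope arithmetic, not a figure from the video): training with mixed-precision Adam is commonly estimated at ~16 bytes of model state per parameter (fp16 weights and gradients plus an fp32 master copy and two Adam moments), which is what pushes half-trillion-parameter training far beyond any single box:

```python
# Back-of-the-envelope training memory, model state only (ignores
# activations, parallelism overhead, etc.). The 2+2+12 bytes/param split
# is the commonly cited mixed-precision Adam estimate, an assumption here.
def train_state_gib(params_billion, bytes_per_param=2 + 2 + 12):
    return params_billion * 1e9 * bytes_per_param / 2**30

for n in (7, 220, 540):
    print(f"{n:>4}B params -> ~{train_state_gib(n):,.0f} GiB of model state")
# ->   7B ≈ 104 GiB,  220B ≈ 3,278 GiB,  540B ≈ 8,047 GiB
```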
@zwillx3953 · 11 months ago
Great podcast. Ordered the box. I wonder if tinygrad would be a good match for Tesla Dojo. A _little bit_ low on the volume levels for me.
@WikiPeoples · 1 year ago
I wonder what drugs geohotz is on...
@miyamotomasao3636 · 11 months ago
Neurotic passion.
@khakim9448 · 11 months ago
Adderall
@ingloriouspancake7529 · 1 year ago
none of this makes sense
@themodfather9382 · 1 year ago
lol
@miyamotomasao3636 · 11 months ago
😂
@miyamotomasao3636 · 11 months ago
What's your IQ? 😎
@RG-si1qz · 11 months ago
😂😂😂😂😂 aren’t you a nerd???????
@Jimmy-Legs · 3 months ago
@miyamotomasao3636 Doesn't have anything to do with IQ, bro.
@MrFujinko · 1 year ago
A bunch of incoherent topics without any real discussion. The hosts only know how to fire off a machine gun of topics and nod at the answers.
@miyamotomasao3636 · 11 months ago
Do you know of any YouTube host with an IQ as high as George's? 🥸
@MrFujinko · 11 months ago
@miyamotomasao3636 Magnus Carlsen's IQ is pretty high. What value has he ever built? He spent his life playing a board game for his own amusement.
@themodfather9382 · 11 months ago
@miyamotomasao3636 ahahahaha
@DiSiBijo · 11 months ago
agreed
@aakash2178 · 5 months ago
Speak a little slower and as simply as possible; you'll have wider reach.