David was the right dude to do this video; his commentary is always so sharp and it translates really well to interview questions. I'm grabbing more popcorn and bingeing this straight through 🍿
People who are as knowledgeable as this guy about a particular field and are able to break down complex concepts from their fields into these palatable bits of information for us normies really fascinate me. Great interview, definitely one of my favs!
I usually watch the podcast in pieces and finish it right before the next one arrives... but this time I actually finished it on Sunday and had to wait 5 days, so I'm so happy right now that they released the bonus episode right when I needed it the most. Haha, thank you guys 😂
Danu is a superb guest. He's a gift for explaining complex ideas beautifully. Not surprised he's in a senior position at Google in such a pathbreaking area of computer research. Thanks for a well-organized podcast, David, and for asking Danu these productive questions.
This episode was worth leaving my first ever comment on YouTube. I love this type of topic and how in this video, the very complex and technical parts are explained so well. Makes me feel both smart for understanding and like an idiot compared to Danu at the same time.
Ok, this was excellent. Thank you David for organizing, and Danu for obliging. Loved it, would love to see more stuff like this alongside my favorite part of Friday: getting home to a new episode of the very enjoyable Waveform podcast!
Very cool episode! Congrats to MKBHD for giving space for everyone else on the team to shine. Great work David, and congratulations to the whole team for putting out great content!
I get this is mostly a sales pitch for Google AI, but there are some major issues today. The first big issue is interpretability: today we can't interpret the model weights, and most of the research shows we don't have a formal model of how to interpret them. The second is that we know a fraction of the weights contribute most of the "important stuff" in neural network models; these are commonly known as "over-parameterized" models. You can see this by comparing the smaller Llama 2 models to GPT-4: Llama 2 gets similar performance to GPT-4 with an order of magnitude fewer parameters. Then there are recent papers suggesting that larger models are harder to align, which means scaling transformers up 10x or 100x could be impossible to align. As Jensen Huang has said, all data centers are power limited, meaning data centers don't have enough power to handle 2x the TPUs or GPUs. This is why Microsoft and Google are building new data centers, and why Tesla is building its own (i.e., Dojo). Scaling these models to handle 100 million concurrent users isn't trivial. OpenAI claims to handle 10 million users today and it's constantly having performance issues. Some people have tried to estimate the cost of a single ChatGPT request: depending on how you calculate it, it's anywhere from 2-4 cents per request, while classic Google search is less than 0.1 cents per request. And if you look at latency, ChatGPT versus classic Google search is seconds versus milliseconds, so roughly 1000x slower per request.
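To make that comparison concrete, here is a quick back-of-the-envelope calculation using the rough figures from the comment above; all the per-request numbers are the commenter's estimates, not measured values:

```python
# Back-of-the-envelope cost/latency comparison. All figures below are
# rough estimates quoted in the comment above, not measured values.

chatgpt_cost_cents = 3.0   # midpoint of the 2-4 cents/request estimate
search_cost_cents = 0.1    # classic search, < 0.1 cents/request (upper bound)

chatgpt_latency_s = 2.0    # LLM responses are on the order of seconds
search_latency_s = 0.002   # classic search serving, on the order of milliseconds

cost_ratio = chatgpt_cost_cents / search_cost_cents
latency_ratio = chatgpt_latency_s / search_latency_s

print(f"Cost ratio:    ~{cost_ratio:.0f}x more expensive per request")
print(f"Latency ratio: ~{latency_ratio:.0f}x slower per request")
```

Under these assumed numbers, the chat-style request comes out roughly 30x more expensive and 1000x slower; both ratios move a lot depending on which estimates you plug in.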
That was super interesting! I love what he said at the end about how one of the practical benefits of AI is that it lowers the barrier of entry for people to be able to implement their ideas. I think this is a much better way to frame how we will use AI, rather than people who might say it promotes laziness or not having to learn things for yourself. I think you still need to have your own knowledge and skillset to generate an idea, but there can be so many other things involved in bringing that idea into reality, and we can't possibly become experts in everything in our lifetime. So that's where you can use AI to fill in certain gaps in your knowledge to some extent to allow you to implement your idea. Could be something like someone who is an awesome baker, but they aren't too business savvy. AI doesn't replace that person being good at baking, but it could help to ease the complications around running a bakery.
Wait are people actually thinking Google employees use Chromebooks? It’s a completely different target market…Most software engineers, including at Google, use MacBooks.
Wow! That was quite something! Danu was a great, fascinating guest, and David was excellent as host, with some very intelligent questions... The deep thought processes behind AI and deep learning are incredible! Brilliant bonus episode guys; thanks very much. 😃 👍
At around 20 mins Danu says that the transformer being able to perform a derivation / write code / etc. counts as emergent properties, because the transformer wasn't trained to perform those specific tasks. But in the context of a transformer trained to chain together the most likely strings of tokens, producing a string of tokens about each of those topics, you would expect the output to be accurate based on the training data. So it would seem not necessarily to be an emergent property in the same sense as a model trained on colour that could also provide shape. But then I guess you could argue that it's emergent in the sense that you wouldn't expect it to be able to perform those activities purely from the likelihoods of strings of tokens? 🤔
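The "chaining together the most likely strings of tokens" intuition can be sketched with a toy greedy decoder; the probability table here is invented purely for illustration (real models derive these probabilities from billions of learned parameters):

```python
# Toy illustration of greedy next-token chaining: at each step, pick the
# most probable continuation given the current last token. The probability
# table below is made up for illustration, not from any real model.

next_token_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "code": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"barked": 0.9, "sat": 0.1},
    "sat": {"down": 0.7, "quietly": 0.3},
}

def greedy_continue(token, steps):
    out = [token]
    for _ in range(steps):
        options = next_token_probs.get(out[-1])
        if not options:
            break
        out.append(max(options, key=options.get))
    return out

print(" ".join(greedy_continue("the", 3)))  # the cat sat down
```

The debate in the comment is essentially about whether abilities like code-writing fall out "for free" from doing this likelihood-chaining at scale, or whether that still deserves the label emergent.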
David, awesome podcast with so much great information. This has to be one of my favorite episodes because of the guest. I hope we get to see much more like this. Very interesting and much enjoyed. ✌️💯
David, that was awesome. I have to say I was confused about why you joined the team, given your complete differences. However, I think you're probably the best addition to MKBHD for that reason alone. I think a team that keeps egos in check could become an actual force in your industry, and for everyone involved. Awesome job, that was awesome.
Transformers are essentially variational autoencoders that approximate the policy or function in the data provided, so they augment AI training with self-supervision, provided the data is complete.
I think you can say a model has emergent capabilities mainly because it has not been trained to have those capabilities. Many text models are trained to predict a token (~= word) in a sentence given the tokens that appear before it. Despite that simple instruction, they learn to reason.
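That "predict the next token given the ones before it" objective boils down to minimizing the negative log probability the model assigned to the token that actually came next. A minimal sketch, with the predicted distribution invented for illustration:

```python
import math

# The next-token training objective: the loss at one position is the
# negative log probability the model assigned to the token that actually
# occurred. The predicted distribution below is invented for illustration.

def next_token_loss(predicted_probs, actual_next_token):
    return -math.log(predicted_probs[actual_next_token])

# Suppose the model sees "the cat" and predicts what comes next:
predicted = {"sat": 0.6, "ran": 0.3, "flew": 0.1}

confident_loss = next_token_loss(predicted, "sat")   # low loss: likely guess was right
surprised_loss = next_token_loss(predicted, "flew")  # high loss: unlikely token occurred
print(f"loss when right: {confident_loss:.3f}, loss when surprised: {surprised_loss:.3f}")
```

Training nudges billions of parameters to shrink this loss over enormous text corpora; the surprising part, which the comment points at, is that reasoning-like behavior shows up as a side effect.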
Interesting that a person from Google, talking about Google, was sitting in front of a MacBook. However, putting my kindergarten-esque sense of whimsy aside: this was a brilliant interview, and one I am going to have to re-watch a few times in order to brain-soak all the information delivered. Many thanks David.
Incredible interview, Danu Mbanga is so intelligent, and knows how to explain things in simple ways… we need more people like him talking about deep tech topics
This is my second "deep dive" into the AI/ML/deep learning stuff... I've learnt a bit more about it here (I did not know that the scalability of machine learning was an unexpected phenomenon)...
Not cool guys. I genuinely believed for a few hours that I was way ahead in the week and looking forward to the weekend. Oh what? What's that? IT'S WEDNESDAY? Well, my hope is crushed.
I think it's more helpful to define Machine Learning not just as the "math behind AI" but in terms of the high-level problems it is invoked to solve: classification, clustering, regression. This makes the distinction between AI, ML, and DL even more clear, at least for me 😅
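For a concrete taste of one of those problem types, here is ordinary least-squares regression worked out in plain Python on made-up data (no libraries, just the closed-form slope and intercept):

```python
# Ordinary least-squares fit of y = slope*x + intercept to toy data,
# illustrating "regression" as one of the core ML problem types.
# The data points are made up so that y is roughly 2x.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.1, 8.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form solution from the normal equations.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(f"fit: y = {slope:.2f}*x + {intercept:.2f}")
```

Classification and clustering have similarly compact toy versions (nearest-centroid and k-means respectively); framing ML by these problem types, as the comment suggests, makes the AI vs. ML vs. DL distinction much easier to hold onto.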
Thank you! Only a true expert can explain something so complex in a way anyone can understand. Usually I’d tune out with content like this but I was riveted. PS Tom using his own laptop is surely cheating. Has anyone else used their own keyboard?
Did MKB get a haircut? I think there is something different about him, just can't put my finger on it! (Don't judge me, I am an AI trained on pictures of white people)
55:09 I find what he's saying a bit worrying. At first it seemed he was partly blaming users for AI hallucinations, for not phrasing questions properly, and then saying it's the user's job to check that it's true. But he did say that Bard took so long because the checks and grounding weren't fully there. So the industry admits it's rushing and not ready, and then tries to blame users when the AI hallucinates?
Here's my two cents. This guy seems very knowledgeable about AI in general, but... once he said he uses it every day without having to tell Bard or ChatGPT where it went wrong, that scared me. Don't get me wrong, I use it every day too, but I still leave the hard stuff to humans, because ChatGPT goes off the rails a lot more than most want to admit.
Just starting to watch, and let me just say, the thumbnail is set up to make me think the other guy is an AI reimagination of the person on the right as an African-American. Let's see where this goes...
It seems like one solution for a chat-style LLM to minimize hallucinations is to have the bot ask questions back, to frame the intent behind the information being sought.
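That clarify-before-answering idea can be sketched as a simple control loop. Everything here is hypothetical for illustration: the ambiguity heuristic is a toy stand-in, and the response strings are stubs, not a real chatbot API:

```python
# Hypothetical sketch of a clarify-before-answering loop: if the request
# looks underspecified, ask a follow-up question instead of guessing.
# The vague-word heuristic and the answer stub are toy stand-ins only.

VAGUE_WORDS = {"it", "this", "that", "thing", "stuff"}

def is_ambiguous(query):
    words = query.lower().split()
    # Toy heuristic: very short queries, or queries leaning on vague
    # pronouns, probably need the intent pinned down first.
    return len(words) < 4 or bool(VAGUE_WORDS & set(words))

def respond(query):
    if is_ambiguous(query):
        return "Clarify: could you say more about what you mean?"
    return f"Answer: grounded response to '{query}'"

print(respond("fix it"))                        # underspecified -> clarifying question
print(respond("how do I center a div in CSS"))  # specific enough -> answer
```

A real system would estimate ambiguity with the model itself rather than a word list, but the control flow, asking back before committing to an answer, is the idea the comment proposes.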