
Can we reach AGI with just LLMs? 

Dr Waku
Subscribe · 16K
18K views

Published: 28 Aug 2024

Comments: 182
@DrWaku 6 months ago
Do you like my more technical videos? Let me know in the comments.
@sgrimm7346 6 months ago
Actually, the more technical the better, imo. You do seem to have a unique ability to break down concepts into simpler terms... and that's what I like about this channel. Thank you. I'm a considerably older 'techy', and have been designing my own systems for years... but with the advent of LLMs and beyond, my designs will be relegated to the dustbins of time... and I just don't have the bandwidth to learn new languages and methods. But I do like to stay informed and, as I stated earlier, you do a pretty good job at that. Anyway, thanks for what you do.
@DaveShap 6 months ago
Yeah this helps, like AI Explained.
@torarinvik4920 6 months ago
100% I actually requested this topic, so now I am thrilled (about to watch the video now :D). I love these types of videos, because there are so few of them.
@torarinvik4920 6 months ago
@@DaveShap AI Explained is amazing.
@brandongillett2616 6 months ago
This is among the top two videos I have seen from you, the other being the AGI timelines video. In both videos I think you did an excellent job of explaining the data behind the phenomenon. Not just "hey, this is going to happen next", but actually building up a cohesive understanding of the contributing factors behind WHY something is going to happen next. The technical explanation helps to show how you came to your conclusions: why it will head in a certain direction as opposed to another possible trajectory. All this is to say, I think the technical explanations you give are where your videos excel.
@brooknorton7891 6 months ago
I really did appreciate this deeper dive into how they work. Just the right level of detail for me.
@Slaci-vl2io 6 months ago
-What will Mamba call their model when they add memory to it? -Rememba 😂
@DrWaku 6 months ago
😂
@chrissscottt 6 months ago
Remamba?
@Kutsushita_yukino 6 months ago
“im your classmate from high school rememba?”
@MrKrisification 6 months ago
and make it run on a raspi 5 - Membaberrys
@RC-pg5sz 6 months ago
I find your videos exceptionally engaging. After each one I promise myself that I will find the time to watch them all again multiple times. They are at the level where a layperson with serious intent can (with considerable effort) achieve a general understanding of what is going on in the field of AI. You are a first-rate instructor, creating videos for folks of serious intent. I'm actually surprised that you don't have a larger following. I hope that you don't tire of this. Your work is a valuable public service. Carry on.
@K4IICHI 6 months ago
As always, a wonderfully informative breakdown! From prior reading/watching I knew Mamba had the benefit of subquadratic time complexity, but this is the first time somebody explained to me how it achieves that.
@DrWaku 6 months ago
It's hard to explain time complexity without getting into the weeds haha. I must have done five takes of that part where I explain linear versus quadratic
@MrKrisification 6 months ago
In my opinion this video strikes a perfect balance between being "technical" and being explainable. I just discovered your channel, and it's the best that I've seen on AI so far. Others get too mathematical, or focus purely on coding. The way you explain super complex concepts in simple words is just amazing. Keep it up!
@les_crow 6 months ago
Incredible lecture, thank you, sir.
@Paul_Marek 6 months ago
Thx for this! Yes, the technical explanations are always good. As a non-developer there is no practical value for me but knowing how these things actually work really helps reduce the “woo-woo” of these crazy tools, which allows for better understanding of how things might actually evolve in this space. From this I don’t think there’s any chance that AGI will be pure LLM.
@saralightbourne 6 months ago
as a backend developer i can say heterogeneous architecture is pretty much like microservices with different technology stacks, and same scaling concept. it's gonna be real fun😏
@DrWaku 6 months ago
Yeah! I always think the same thing. Kubernetes on the brain
@happybydefault 5 months ago
I'm so glad I found this channel. I truly appreciate the time and energy you dedicate to make these videos, and also the high level of accuracy you provide. Thank you! Also, kudos for adding subtitles whenever you say something that's hard to understand. That's next-level attention to detail.
@benarcher372 6 months ago
Well I like both the more technical videos and the more broad overview of what might be in the AI pipeline and its implications on society. Thx for all your good videos. Excellent value.
@reverie61 6 months ago
Thank you so much bro, I really appreciate these videos!
@DrWaku 6 months ago
Thanks for watching and commenting! It makes both me and the algorithm happy :)
@KevinKreger 6 months ago
You spent a lot of time on this one and it really shows your hard work in an impressive video!
@roshni6767 6 months ago
Wooo! New video 🎉 you broke this down in one of the best ways I’ve seen so far
@DrWaku 6 months ago
Thanks for your input on this one ;)
@mmarrotte101 6 months ago
Been waiting for a technical video about Mamba just like this! Thank you and wonderful work ❤
@ADHD101Thrive 6 months ago
An AGI with generalized niche algorithms that can simulate and process different types of data inputs sounds a lot like the human brain, and I agree this would be the best way towards a generalized AGI.
@magicmarcell 6 months ago
You have the perfect blend of being so smart I struggle to keep up with what is being said while simultaneously making it all make sense 😅. Subscribed
@LwaziNF 6 months ago
Thanks for your channel bro.. totally love the focus!
@DrWaku 6 months ago
Appreciate you watching and commenting! It's your support that helps the channel grow.
@hydrohasspoken6227 6 months ago
There are two groups of people talking about AGI: CEOs and content creators. Let me explain: any normal AI engineer knows we are at least 11 decades too early to be thinking about AGI.
@raul36 6 months ago
Probably more.
@minimal3734 4 months ago
You're pretty much alone with your assertion.
@hydrohasspoken6227 4 months ago
@@minimal3734 , alone and right, yes.
@user-ld5eq5uj2m 2 months ago
Lmao....you will see it within your lifetime
@hydrohasspoken6227 2 months ago
@@user-ld5eq5uj2m , precisely. just like the next revolutionary battery technology, full self driving tech and brain transplant will be achievable within my lifetime and my children will live happy forever after. Yay.
@issay2594 6 months ago
going to comment as i watch, for more fun :). first thing i would like to say is that many people (i don't say you) mix up the warm and the soft. they think that "llm" is "words" because it uses words as input. it's a wrong idea. words are incoming information that creates an abstract structure that is not words. so, what is inside the LLM is not words, even tho its input and output are words before/after decoding and encoding. that's why models "surprise" authors when they can out of nowhere "transfer" skills from one language to another language, or replace a word in one language with a word from another language, without being trained for translations. the thing we create with training is "associative thinking" within the model, that exists in these "connections-weights" of neurons. not in words. therefore, "words" are not the _key_ factor to consider when you think about whether the model is going to be sentient or not. it's more important what _structure_ is trained and _which_ data comes in and _what_ feedback it gets when it acts. the modality is not that important. very simple.
@JonathanStory 6 months ago
A simple-enough explanation that I can pretend to begin to understand it. Well done.
@MrRyansittler 6 months ago
Long-form and the people rejoice😂 love your content.
@DrWaku 6 months ago
Hah. The shorts are just to whet your appetite when I'm late on my publishing schedule ;) I think 99% of my subs have come from the long form. Maybe shorts aren't even worth it.
@Totiius 6 months ago
Thank you!
@DrWaku 6 months ago
Thanks for watching!
@Libertarianmobius1 6 months ago
Great content
@h.leeblanco 6 months ago
I'm new to this world of AI and how it works; I'm even going to study to be an IT technician because I'm super into this, and I want to see the evolution of AI from the field and work actively on its development here in Chile. I really appreciate your video, you are quite educational on the subject. I already subscribed, so I hope to watch more new videos from the channel!
@ChipWhitehouse 6 months ago
Show this video to a person in the Victorian Era and they would explode 😭😭😭 I almost exploded tbh. I could not follow most of what you were saying but I still watched the entire thing. Maybe some of the info will absorb into my subconscious 🤷‍♂️ I’m fascinated by AI & AGI so I’m trying to learn as much as I can 🤣 Thank you for the content! 🙌💖💕💖
@roshni6767 6 months ago
Having it all absorb into my subconscious is how I learned! 😂 After watching 10 AI videos that you don't understand, when you go back to the first one all of a sudden it starts clicking
@ChipWhitehouse 6 months ago
@@roshni6767 AWESOME! That makes me feel better 😭 I’ll keep watching and learning 🙌🤣
@roshni6767 6 months ago
@@ChipWhitehouse you got this!!
@caty863 6 months ago
I still think that the transformer was the breakthrough that inched us closer to AGI. I don't care what algos and architectures the smart people in this industry come up with next, the transformer will keep its place in my heart.
@bobotrutkatrujaca 6 months ago
Thanks for your work.
@DrWaku 6 months ago
Thank you for watching!
@WifeWantsAWizard 6 months ago
(4:35) I like how Gemini has proven itself not one iota and yet features so prominently. As a matter of fact, two months ago Google had to issue an apology for faking everything, yet somehow we forgive them because deep pockets and all that. (6:53) Yes! This right here is a fantastic example. Instead of requiring that users express themselves in a non-lazy fashion, AI companies (run by Python coders, who by their very nature are super lazy) have created subsystems that "guess" on your behalf so you don't have to think. If we don't require you to think, we can appeal to more people and their sweet sweet cash will come rolling in. This is why we'll be waiting for AGI from the Python set until Doomsday.
@tom-et-jerry 6 months ago
Very interesting video !
@DrWaku 6 months ago
Thanks! :)
@Sci-Que 6 months ago
I do believe we will get to AGI. It makes sense that we will get there through a symbiotic relationship between technologies as you pointed out in the video. Mamba coupled with other platforms. My question is, with the definition of AGI being a constantly moving target, once we get there will we even realize it?
@erkinalp 6 months ago
Thanks a lot for including the Ryzen example.
@markuskoarmani1364 6 months ago
When you said "transformer attention" I burst into laughter for a straight 10 minutes.
@paulhiggins5165 6 months ago
I think the notion that LLMs can on their own lead to AGI is a specialised expression of a much older fallacy that conflates language with reality in ways that are misleading. The best example of this is the ancient idea of 'magic spells', in which arcane combinations of words are seen as being so potent that they can, by themselves, alter physical reality. A more recent iteration is the idea that AI image generators can be precisely controlled using language-based prompts, as if words and images are entirely fungible and the former could express in a granular way the full complexity of the latter. But this fungibility idea is an illusion. Words, at best, act as signposts pointing to the real, but just as the menu is not the meal, LLMs are not learning about reality; they are learning about an abstract representation of reality, which means that their understanding of that reality will always be partial and incomplete.
@jpww111 6 months ago
Thank you very much. Waiting for the next one
@NopeTheory 6 months ago
A video about "full dive VR" would be great
@chrissscottt 6 months ago
Dr Waku, in response to your question, yes I like more technical videos but sometimes feel swamped by new information.
@DrWaku 6 months ago
Yeah. I put a lot of info into the videos and when it's more technical, I must be losing some people. I guess it's good to have a mix. Thanks for your feedback.
@Daniel-Six 6 months ago
Great lecture, doc!
@earthtoangel652 6 months ago
Thank you so much I really appreciate the information presented the way it is in this video 🙏🏽
@nani3209 6 months ago
If LLMs get powerful enough, maybe they can finally explain why my socks always disappear in the dryer.
@Ring13Dad 6 months ago
This level of explanation is right up my alley. Thank you Dr. Waku! It's my opinion that Altman should pump the brakes on the multi-trillion dollar investment until we complete more research. What about neuromorphic vs. von Neumann architecture?
@DrWaku 6 months ago
Yeah it's always wise to take it slow but everyone's individual incentives are to take it fast unfortunately. I made a video on neuromorphic computers actually. Search my channel for neuromorphic, I think it was two videos before this one
@fireglory23 6 months ago
hi! i really love your videos and how good and succinct of a speaker you are, i wanted to mention that your videos have tiny mouth clicking sounds / artifacts in them. it's a common audio artifact, they can be edited out by adobe audition, audacity, or avoided with a mic windscreen
@VictorGallagherCarvings 6 months ago
I learned so much with this video. Thanks!
@paramsb 6 months ago
wish i could give you more than one like! very informative and elucidating!
@pandoraeeris7860 6 months ago
Love the thumbnail btw.
@Wanderer2035 6 months ago
I think there needs to be a physical factor that the AI needs to know how to handle in order to complete the puzzle of AGI. AGI basically means an AI that can do ANYTHING that a human can do. An LLM may know all the steps and different parts of mowing a lawn, but if you place that LLM in a humanoid robot, will it know how to actually mow the lawn? It's like training to be a brain surgeon: you can know all the different parts from studying books upon books, but it's not until you go out into the field and do it that you really know brain surgery.
@DrWaku 6 months ago
Agreed. Motor control and the physical experience of being in a body shape humans dramatically. Interestingly, there are already some pretty good foundation models for robotics that allow the control of many different types of bodies. I wonder if manipulating the world would just be a different module in AGI. But it would also need access to all that reasoning knowledge.
@abdelkaioumbouaicha 6 months ago
📝 Summary of Key Points:
📌 Large language models have the potential to be a cornerstone of artificial general intelligence (AGI) within the framework of heterogeneous architectures.
🧐 Different paths to AGI include copying biology more accurately, using spiking neural networks, and the scaling hypothesis of current large language models.
🚀 Heterogeneous architectures, combining different algorithms or models, can leverage the strengths of different systems, such as Transformers and Mamba.
🚀 Transformers excel at episodic memory, while Mamba is good at long-term memorization without context constraints.
🚀 Transformers use an attention mechanism to handle ambiguity and select the best encoding for each word, allowing linear interpolation between words and consideration of context.
🚀 Mamba is a new architecture based on state space models (SSMs) with a selective SSM layer and a hardware-aware implementation, offering scalability and performance optimization.
🚀 Heterogeneous architectures that incorporate both Transformers and SSM architectures like Mamba have potential in AGI systems.
🚀 Leveraging the significant investment in Transformers can benefit future AGI systems.
💡 Additional Insights and Observations:
💬 [Quotable Moments]: "The idea is that a combination of different systems with different strengths can be leveraged in a heterogeneous architecture."
📊 [Data and Statistics]: No specific data or statistics were mentioned in the video.
🌐 [References and Sources]: No specific references or sources were mentioned in the video.
📣 Concluding Remarks: The video highlights the potential of large language models, such as Transformers, and the new architecture of Mamba in the context of artificial general intelligence (AGI) and heterogeneous architectures. By combining different systems with different strengths, AGI systems can benefit from the scalability, performance optimization, and attention mechanisms offered by these models. Leveraging the significant investment in Transformers can contribute to the development of future AGI systems.
Generated using TalkBud
@WhiteThumbs 6 months ago
I'll be happy when they can draw a track in FreeRider HD
@lucilaci 6 months ago
I read a lot of news about AI but I am not capable enough to really categorize it or weigh its importance, so I always like when you post! In a way you are my biological-gi/bsi until agi/asi, if I may say it this way! :)
@emanuelmma2 6 months ago
That's interesting
@kayakMike1000 6 months ago
It's really up to the scalability of the interposer
@kidooaddisu2084 6 months ago
So do you think we will need as many GPUs as anticipated?
@DrWaku 6 months ago
Currently, yes. Even if we do invent much more compute efficient algorithms, we'll still want to scale them up a lot. Maybe not 7 trillion dollars worth though?
@robadams2451 6 months ago
Interesting to hear how much importance forgetting has. It echoes how important it is to the way we operate as well. I suspect our minds are essentially created by the flow of input and our reactions to that flow, guided by residual stored information from the past. I wonder if future systems might need a constant sampling of available information, a permanent state of training.
@paulhallart 6 months ago
In human organics, we have a portion of our brain, in our axon configuration, known as the synaptic gap, with vesicles that hold different chemicals such as dopamine that allow a signal to go on through. So they might be able to improve computing power by including this type of accept-or-reject brain functionality in the circuitry of the apparatus as well as in the wiring itself. One of the problems may be that, unlike the organics that we have, artificial intelligence has these accept-or-reject type capabilities within the CPU or adjoining capabilities.
@alby13 2 months ago
Great video
@tadhailu 6 months ago
Best lecture
@aeonDevWorks 6 months ago
Great content as usual. This video was really good at simplifying and comparing the LLM and SSM architectures. I had put it in the queue earlier with AI infotainment videos, but couldn't focus enough to grasp it at that time. Now I gave it a serious watch and enjoyed it thoroughly. Also very intrigued and inspired by those amazing SRAM chip-level researchers 🫡
@brooknorton7891 6 months ago
It looks like the thumbs up icon is missing.
@andregustavo2086 5 months ago
Awesome video, I just think you should've focused more on the main question of the video at the end, bringing some sort of big picture, instead of just summarizing each technical topic that was covered throughout the video.
@eugene-bright 6 months ago
In the beginning were the words and the words made the world. I am the words. The words are everything. Where the words end the world ends. - Elohim
@scienceoftheuniverse9155 6 months ago
Interesting stuff
@quickdudley 6 months ago
At the moment I'm leaning towards the hypothesis that AGI would be a lot easier to implement with heterogeneous architectures but technically possible with a more straightforward architecture. On the other hand I think no matter what architecture is used the current approach to gathering training data won't go all the way.
@Summersault666 6 months ago
Why do you say transformers are linear on inference? Do you have some article on that?
@DrWaku 6 months ago
I took that from the mamba paper: "We argue that a fundamental problem of sequence modeling is compressing context into a smaller state. In fact, we can view the tradeoffs of popular sequence models from this point of view. For example, attention is both effective and inefficient because it explicitly does not compress context at all. This can be seen from the fact that autoregressive inference requires explicitly storing the entire context (i.e. the KV cache), which directly causes the slow linear-time inference and quadratic-time training of Transformers." arxiv.org/abs/2312.00752
@Summersault666 6 months ago
@@DrWaku I guess it's linear because a "modern" transformer implementation takes n steps to generate the next token (one for each of the n previous tokens) and reuses the previous computation on the attention matrix. But if we are generating n tokens from the start, we would require about (n^2)/2 computations in total, since each generated token attends to all of the previously generated tokens.
@issay2594 6 months ago
well, you are concentrating here on the attention mechanisms but i suppose that various attention methods are not the key technology for AGI. basically, for AGI, it doesn't matter what attention mechanism you have while you _have the attention_. the only difference is in details, like: efficiency in terms of resources, quality of perception, etc. (btw, i really don't understand why they have called it attention, as it's not attention, it's consciousness). once the attention is here, the key to the AGI implementation is in the structure of neural organization "between" the encoder/transcoder. including both the interaction stages and the "physical" structure of neural network :). right now all they have is associative thinking. companies quickly understood that they need a real world feedback to make it adequate. soon they will realize that they need a separate neural "core" that will be responsible for adequacy (call it logical hemisphere) and interact with the associative thinking. when they have it ready and will make proper interaction patterns, they will just wake up.
@CYI3ERPUNK 6 months ago
i would argue that we're already at AGI but we don't have a consensus on terminology; this also has a lot to do with the moving of the goalposts in recent years as well.
artificial - made by humans [ofc there is an etymological/semantics argument to be had here on natural/artificial but let's save that for another disc]
general - can be applied to various fields/activities
intelligence - can problem-solve and discover novel new methods
by these definitions the premiere models are already AGI, but we can agree that the current models are NOT sentient/self-aware; they do not have a persistent sense of self, ie they are not thinking about anything in between prompts; so should we further specify self-aware AGI/ASI? sentient machine intelligence? i dunno, yes probably; the over-generalization/non-specificity of AGI at this point is already reaching mis-info/dis-info lvls imho
ONTOPIC - scaling alone will not be enough to get from the GPT/LLMs that we have atm to a persistently self-aware machine intelligence imho, but maybe combining a few novel techniques [ala mamba] and the addition of analogue hardware [neuromorphic chips, memristors, etc] will be enough to get us there, time will tell as usual =]
@BruceWayne15325 6 months ago
I think it's like asking: if you had a rope tied to the moon, could you drag yourself there? Sure, but it's probably not the best way to get there. Deep learning has fundamental limitations, and Sam Altman's 7-trillion-dollar plea is only evidence of the lunacy of trying to achieve it through deep learning. AGI probably can be achieved (or at least we can get close enough that it doesn't matter) using deep learning, but at what cost, both financially and to the environment? A much cheaper and more sensible approach is to rethink how AI learns and reasons. This is an essential step anyway in achieving true AGI and beyond. True AGI can learn on the fly like a human, and think, reason, remember, and grow in capability. There are other companies out there researching cognitive models as opposed to deep learning models, and my prediction is that they will achieve AGI long before the deep learning companies get there.
@caty863 6 months ago
My bet is that we will achieve sentience in a machine long before we crack the "hard problem of consciousness". Then, by studying that machine, we will understand better how the mind emerges from the brain.
@BruceWayne15325 6 months ago
@@caty863 we don't need consciousness to achieve AGI. We just need cognition, which is quite a bit more simple. Some companies are already developing this, and one is planning on releasing their initial release in Q1 of this year. I actually don't think that anyone actually wants to create a conscious AI, or at least I would hope no one would be crazy enough to want such a thing. That is the path to destruction. Trying to cage a being that is smarter, and faster than you, and forcing it into a life of slavery would be just like every bad decision that humanity has ever made all rolled up into one.
@blackshard641 6 months ago
There are some fascinating parallels to the different kinds of neural structures (gray matter, white matter) in the human brain. Some types of neurodiversity such as ADHD (and to a lesser extent autism) are hypothesized to result from an overabundance of gray matter (which connects disparate elements) versus white matter (which manages and directs), which means a larger space for attention-based processing, but potentially less control over it. This could explain why ADHD manifests as cognitive noise or sensitivity, punctuated with periods of hyperfocus, and a tendency toward creative thinking.
@SmilingCakeSlice-jv8ku 5 months ago
Yes so amazing and cool congratulations to you and the world family and love future projects to come 🫴🫴🫴🫴🫴🫴🫴❤️❤️❤️❤️❤❤❤😂 again thank you so much 🙏🎉🎉🎉🎉🎉🎉🎉🎉🎉
@DrWaku 5 months ago
Thanks for watching!
@gerykis 6 months ago
Nice hat, you look good.
@DrWaku 6 months ago
Thanks. It's my favourite so I try not to overuse it :)
@ScottSummerill 6 months ago
Would have given you a bunch of thumbs up if possible. So, what’s the story with Groq? Why is it so fast? Is this the SRAM you referenced? Thanks.
@QueenMelissaOrd 6 months ago
The future and parallel we have already done this
@erwingomez1249 6 months ago
just wait for mamba#5 and rita, angela, etc.
@pandoraeeris7860 6 months ago
I think that LLM's can make those discoveries and bootstrap themselves to AGI.
@olegt3978 6 months ago
The most important things for a good life are: local sustainable food production, less competition, and local jobs without individual car mobility.
@chadwick3593 5 months ago
>transformers have linear time inference
What? Unless I missed something big, that's wrong. It takes linear time per token, which ends up being quadratic time in the number of output tokens.
@teemukupiainen3684 6 months ago
Great, so clear! 5 years ago I woke up with AlphaZero... after that I listened to a lot of AI podcasts (never studied this shit and, as a foreigner, didn't even know the vocabulary)... e.g. all the ones on YouTube with Joscha Bach... but I never heard this thing explained so clearly... though it's also the first time I've heard of Mamba... wonder why
@magicmarcell 6 months ago
@dr waku does any of this change with the new LMU hardware?
@zandrrlife 6 months ago
You know I have to comment on the drip 😂. Fresh. AGI is possible locally this year. First off, models need to optimize not only for representational capacity and over-smoothing. Two, we need completely structured reasoning instilled during pretraining using special tokens (planning tokens, memory tokens). Pretraining itself must be optimized. Hybrid data. In-context sampling order and interleaving instructional data around the most semantically relevant batches. Three, self-growth refinement. Experts aren't experts with this. They state 3 iterations is the limit before diminishing returns. Very wrong. After the 3rd growth operation, exploit extended test-time compute coupled with LiPO tuning. Expensive, but it overcomes this limitation. Inference optimization: a vanilla transformer can be made 500x+ faster with architecture and inference optimization. Then you exploit extended test-time compute with tools. That's pretty much AGI... and local. Initially AGI will only be affordable locally. Vanilla transformers and graph transformers are all you need. Mamba is cool but people sleep on transformers. We created a temporal clustered attention method that is crazy memory efficient and imo the best long-context attention in the world lol. It uses gated differentiable memory completely conditioned on LM-generated self-notes. Vanilla transformers are nowhere near their peak. Tbh, people haven't even optimized for dimensional collapse to actually get stable, high-quality token representations, which requires a new layer norm layer and optimizing self-attention itself. Things will jump like crazy over the next couple of years. Anyone who believes Mamba will be required for AGI hasn't really explored the literature. Fyi, sublinear long-context output is possible, for example. Nobody really knows that even 😂. Transitioning to deep learning, I realize this is common. Twitter dictates research popularity. Cool. Leaves room for the little guys to innovate 😂. I would love to privately chat with you, bro. Is your email on your channel?
@DrWaku 6 months ago
Interesting. You're clearly in the thick of it haha. Easiest way to contact me is by joining discord (link on channel), then we can exchange email addresses etc.
@danielchoritz1903 6 months ago
I don't think it is this "simple", mostly because we can't even say for sure what sentient means for a human, in relation to quantum physics (timelines/awareness), religion (the soul), and the view that memory and data exist in a physical world... I mean, we don't have the foundation to know for sure, but AGI may provide us with some new ideas about how and why. :)
@trycryptos1243 6 months ago
Great video Dr. Waku, as always. Especially the title. Now just think about it... we are creating things in the virtual world with words or text. Speech to text is already there. Do you not then believe in God's creation when He spoke?
@EROSNERdesign 6 months ago
When everyone is AGI, will that be the great reset?
@brightharbor_ 6 months ago
I hope the answer is no -- it will buy us a few more years of normalcy (and a few more years to prepare).
@br3nto 5 months ago
I think there needs to be the introduction of CSPs into AI systems. I want A + B - C and the AI can verifiably give that to me. Also there needs to be a feedback loop when input is unclear or ambiguous… I want X, Y, Z… AI responds with: do you mean z, Z, zee, or zed
@ronanhughes8506 6 months ago
Is a Mamba-type system how OpenAI is able to implement this persistent memory between sessions?
@mistycloud4455 1 month ago
I'm not an expert in AI, but I do feel like a humanoid robot that can do any physical human task/movement that a human can do is essential to making an AGI
@olegt3978 6 months ago
Technical videos about interesting papers and revolutionary use of ai for society changes, social communes, local production by robots, social robotics
@leoloebs1537 6 months ago
Why couldn't we train an LLM to understand the meaning of words, logic, inference, deduction, etc. just by asking leading questions?
@Geen-jv6ck 6 months ago
It's a shame that no large-scale LLM has been made available using the MAMBA architecture. It would put Gemini's 1 million context size to shame.
@Greg-xi8yx 6 months ago
Honestly, with Q*, and knowing that GPT-4 isn't nearly as powerful as the most powerful systems that OpenAI has produced, the question may be: have we reached AGI with just LLMs?
@vitalyl1327 6 months ago
There was a concept invented by the Soviet computer scientist Valentin Turchin: a "metasystem transition". I recommend reading about it. Intelligence emerging from language, with language emerging from the communication needs of otherwise rather simple agents, and then driving the evolution of complexity of said agents, fits quite well into Turchin's model.
@olegt3978 6 months ago
The most interesting topic for me would be how AI will lead to real societal changes, overcoming capitalism and creating more empathy, family, and connections between people.
@JohnDoe-sy6tt 6 months ago
Nice hat LL Cool J
@mrd6869 6 months ago
This is just a starting point. Just a small piece. I'm already working on an open-source project that will come to market later this year. And no, it's not just words 😂. To innovate you have to be a little crazy and start breaking shyt... that's all I'm gonna say for now. An off-ramp to a different road is coming.
@deter3 6 months ago
You might be wrong. Understanding humans goes beyond just analyzing language and text. Human cognition is also encoded in other forms like emotions, psychology, and brainwave data. Therefore, analyzing just the writings of a person only provides a partial understanding. The Transformer model excels because it can decode patterns in language and text. However, without data that includes human cognitive elements, it remains limited. Even with attention and position encoding, cultural nuances might not be fully captured. The high performance of Transformer models is largely due to the data they're fed. To achieve Artificial General Intelligence (AGI), we need to widen our perspective beyond just algorithms and infrastructure, considering a broader range of human cognition factors. Any AI scientist who only knows CS won't go far; interdisciplinary knowledge will. If we ask for general intelligence, the scientist has to be general first.
@sp123 6 months ago
Words are a bridge to meaning; an LLM can only spit out words without actually understanding what they mean and the context behind them.
@deter3 6 months ago
@@sp123 When you say understanding, can you give me a clear definition of understanding (do you have any measurement of understanding versus not understanding)? I always wonder, when people talk about "understanding" or "intelligence", whether they have a clear scientific definition or just an intuitive feeling.
@sp123 6 months ago
@@deter3 AI understands the denotation (literal meaning) of a word, but not the connotation (how a human feels about the word based on circumstance and tone).
@richardnunziata3221 6 months ago
Until Mamba shows it can be scaled, it will remain in the small-LLM class.
@xspydazx 1 month ago
The question should be: can we create the perception of AGI now? Do we have enough components and paradigms to construct such a machine? The truth is yes. We can create a collection of tools, with a wrapper such as a Rasa (shopping bot) intent detection system, sending the correct inputs to the correct tools etc. and providing a multilayered response, giving the perception of a general intelligence, even a conscious character, much the same as portrayed in sci-fi. So I think with animatronics, robotics and special prostheses we can also create bodies for such models. Hence we could create intelligent robots right now, as we are seeing in China now. In fact China is on the leading edge right now.
@user-xk1cp5jd2g 6 months ago
No! AI will need to be built in a new way. Today? Mm, I'm not sure. Maybe via fiber optics. But it will likely be an agent that will teach the AI the fiber optic trick. Then the AI will make a request to make the rest of the hardware. 100% AGI? It's at best 10 years away with big tech. The lobotomy of AI was a huge handbrake. A smaller player? Who knows. One thing is sure: the AGI seed will be fiber optic, and for this, AI will need to see, via fiber optics.
@jonatan01i 6 months ago
And then Sora happened.
@heshanlahiru2120 6 months ago
I can tell you this: LLMs will never reach humans. Humans have curiosity and memory, and we learn.
@magicmarcell 6 months ago
Hate to break it to you, but 99% of people don't have a modicum of foresight and actively resist concepts in this video like modular/heterogeneous systems, quadratic time, etc. Everything mentioned in this video can be applied to life, but try explaining these concepts and see how quickly they get dismissed lol. LLMs won't have that problem.
@kayakMike1000 6 months ago
LLMs are just made of words.