Yannic Kilcher
I make videos about machine learning research papers, programming, issues in the AI community, and the broader impact of AI on society.

Twitter: twitter.com/ykilcher
Discord: ykilcher.com/discord
BitChute: www.bitchute.com/channel/yannic-kilcher
LinkedIn: www.linkedin.com/in/ykilcher
BiliBili: space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: www.subscribestar.com/yannickilcher
Patreon: www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n


xLSTM: Extended Long Short-Term Memory
57:00
3 months ago
[ML News] Chips, Robots, and Models
39:14
4 months ago
[ML News] Llama 3 changes the game
31:19
5 months ago
Hugging Face got hacked
18:01
5 months ago
No, Anthropic's Claude 3 is NOT sentient
15:12
6 months ago
Gemini has a Diversity Problem
17:36
7 months ago
Mixtral of Experts (Paper Explained)
34:32
8 months ago
Until the Litter End
3:40
8 months ago
I created an AI-powered Social Network
8:17
8 months ago
Comments
@0兒-y4c 1 day ago
You are god
@davidespinosa1910 4 days ago
It sounds like S4 solves the vanishing and exploding gradient problem, so that it works for very long sequences. Now I'm curious how that works...
@DeborahRodriguez-q8l 4 days ago
Carley Roads
@maryguty1705 4 days ago
So the network only works on one scene? And it is more of a 3D model compressor than a 3D scene generator, am I understanding this correctly?
@DeborahBoozer 4 days ago
Young Gary Perez Michelle Thomas Sarah
@DonatLoftus 5 days ago
You make wonderful videos! 👏 I’ve got a question: 🤨 I only have these words 🤔. (behave today finger ski upon boy assault summer exhaust beauty stereo over). What should I do with this? 🤷‍♂️
@kartiksetia5120 6 days ago
attention is all you need
@yourdudecodes 6 days ago
Loved it. Thanks - Deep Learning Enthusiast
@RoxieRingelspaugh-n6r 6 days ago
Velva Tunnel
@RobertLayman 8 days ago
Garcia Jose Jackson Paul Harris Sharon
@shawnmacdonaldbc 8 days ago
Third world garbage input
@shawnmacdonaldbc 8 days ago
True it's all East Indian fucking garbage input
@KennethWilliams-s6y 9 days ago
Peggie Key
@MarcelaApker-c9r 10 days ago
Tremblay Parkways
@TitusAugust-l6n 11 days ago
Thomas Nancy Smith Anthony Rodriguez Kimberly
@Patrick-wn6uj 11 days ago
do an update of this video on the online decision transformer
@Cereal.interface 12 days ago
I'm glad I saw this
@FatihMercan-kn1hx 12 days ago
The video starts at 5:12
@NdnxdidhhNndhxydh 12 days ago
Thompson Anna Jones Sandra Hall Kimberly
@AsaPort 12 days ago
600 Deckow Island
@googleyoutubechannel8554 12 days ago
This is already sort of a 'fail' in that the important thing science does is not about the relations... the symbolic equation, that's just icing. The important part is the terms themselves, the properties and operators that you think could describe a system. So even if this is successful, it's basically useless; 99% of the 'work' is already done in deciding we care about this thing called 'mass', that things have 'mass', etc.
@Htyagi1998 13 days ago
About the explanation at 27:57: here they have written that if the likelihood p(y|x) is low, then 1 - p(y|x) will accelerate the gradients. If I am not wrong, this can be read as: if, let's say, the likelihood of the losing side is higher, then the gradient will accelerate towards the other side.
@Htyagi1998 13 days ago
The way you explained ❤
@EricBlanco-e5l 13 days ago
Farrell Forges
@RomeTWguy 13 days ago
So ChatGpt o1 is a ToT wrapper for ChatGpt4o
@SofiRycvan 13 days ago
Aileen Estate
@brendawilliams8062 13 days ago
Curves. Agreed
@virginiareynolds7890 14 days ago
Lewis Edward Wilson Scott Robinson Maria
@pavel5074 14 days ago
I skimmed through the paper and couldn't find the part where they state that the random attention pattern is the same from layer to layer... Are you sure layers didn't have different patterns for the same batch? Mind pointing to the exact location in the paper where you got this idea? (Sorry for nitpicking, but this part seems to be important.)
@JHenryEden 14 days ago
who would guess that honesty and malreluctance to say horrible things and call them out as such would yield the most truth as a self-censoring A.I. but i would appreciate if you kept your little filthy fingers away from places that you betitle with "terrible, horrible" or any adjective of similar magnitude.
@jommy4240 15 days ago
Could you please tell what app you use to annotate?
@MahmutAyabakan 15 days ago
Lee Christopher Lee Dorothy Anderson Robert
@SohelRana4-m5c 15 days ago
Martinez Scott White Frank Garcia Jennifer
@gJonii 17 days ago
I actually tried to implement this recently, not knowing it had been invented already. Though my idea was to put constraints on Z and then do more training steps to get the latent representation.
@fredadaoliver2151 17 days ago
Martin Larry Young Kimberly Taylor Jennifer
@MahmutAyabakan 17 days ago
Jackson Christopher Martinez Barbara Davis Richard
@dimitrije929 20 days ago
Good video but a script would benefit it a lot, a bit unorganized overall.
@TonyaMartin-b7c 20 days ago
Moore Robert Hernandez Barbara Thomas Paul
@KmgdsHfafjp 21 days ago
Lewis Sarah Lewis Michelle Johnson Jennifer
@theautisticside 21 days ago
I am new to the AI field (studying deep learning). But I am not new to the --isms debates. I studied intellectual history back in undergrad. [For those interested, from the history of philosophy perspective, there is the rise of postmodernism. And, from the philosophy of science perspective, there is Thomas Kuhn to start with.] I look at this tweet exchange and I see two sides arguing from different mindsets, motives, and goals. One side is trying to discover new scientific truths (if you're a scientific realist); the other side is trying to shift power balances. I do think both goals are important. But, in my simplified assessment, this is at least in part the reason why the benefit of public discourse seems at an all-time low (some days).
@ericchang9568 21 days ago
Thanks for the explanation: so the Q function basically percolates from the near end-game moves, whose rewards are easier to learn, and gradually works its way from the back to the beginning?
@FamilyYoutubeTV-x6d 7 days ago
good thinking, I believe.
@MrBigbanan 23 days ago
New video on this topic?
@jfbaro2 24 days ago
Great video
@manuelkarner8746 25 days ago
wow, i learned a ton, thanks
@googleyoutubechannel8554 26 days ago
This paper should be called "how to implement A* in the most expensive way possible". I would be surprised if a transformer network, which, remember people, is also just a bunch of stacked NNs with backprop goodness, couldn't learn it. Transformers should be able, if massaged and trained correctly, to do basically any type of ML curve fitting we've discovered, if you can shove it into their context window (which is just a big vector that gets shoved through a bunch of NNs). Learning A*, a very short algo, and some data processing seems very reasonable.
@SoniyaKhan-e8l 28 days ago
Miller Eric White Jose Jones Timothy
@SpicyMelonYT 28 days ago
well this flopped
@AayanshMagicWorld 1 month ago
ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-B1xZRlGce1g.htmlfeature=shared
@kotakviraj4016 1 month ago
I am enrolled in a master's course in AI and I have to read a lot of research papers like these as new ones come out, and this channel has the best, simplest paper explanation videos out there. Also, I completely disregarded all the hints about the authors of the paper; I don't know who wrote it. 🤫
@zramsey11 1 month ago
It's almost like Hafner et al. watched your video and built v3 to rectify your criticisms. Transferability to other problems - check, fewer hyperparameters - check, more generalizable loss function - check. Would really love to see a video like this going over v3. Been having a hell of a time wrapping my head around it, but this video is still helping a ton. Thanks Yannic!!