Efficiently Modeling Long Sequences with Structured State Spaces - Albert Gu | Stanford MLSys #46

Stanford MLSys Seminars

Published: 27 Sep 2024

Comments: 13

@tarepan_YT 2 years ago

Impressive work, clear presentation, and intriguing discussion! Thanks for sharing a great seminar.

@BREAKDRS 2 years ago

Very well organized and easy to follow. Thanks!

@m.d.4979 5 months ago

Hello! Great talk! I am currently studying your SSM-related work. It is amazing! Please share your ideas, challenges, and outcomes for applying your Mamba model to human (sports athlete) action forecasting. Thank you for your kind reply!

@salehgholamzadeh3368 2 years ago

A really great talk. Can S4 be integrated into reinforcement learning?

@brandomiranda6703 2 years ago

How do you think it will compare with memorizing transformers?

@sabrango 5 months ago

Amazing

@stergiosbachoumas2476 1 year ago

With regard to the stability question, and I repeat: "Are HiPPO A matrices stable?": the answer is that they are not stable (or Hurwitz, as we say in control theory) because their eigenvalues are outside the unit circle. This is trivial to show, as they are lower triangular and therefore their eigenvalues sit on the diagonal. Thus the eigenvalues are 1, 2, ..., n for an n×n matrix. Unfortunately, the organizers did not let Albert share his screen to show the form of the A matrix again. With this information, it would be very interesting to talk about stability again, because Albert said that they are stable. Well, in what sense? It's also very interesting that other stable matrices do not lead to good learning.

@jonathanballoch 1 year ago

What are the implications of this instability?

@stergiosbachoumas2476 1 year ago

@@jonathanballoch Everything my comment above says is wrong: the HiPPO matrix is stable because its eigenvalues are -1, -2, ..., -n, all in the left half-plane (i.e. they have negative real part). I forgot to come back and delete this comment, but I will leave it here to remind myself that I must be more careful next time.
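The eigenvalue argument in this exchange can be checked by hand with a few lines of Python. This is only a sketch: the thread never writes the matrix out, so the entries below assume the HiPPO-LegS form (`-(i + 1)` on the diagonal, `-sqrt((2i + 1)(2k + 1))` below it, zero above it), and the helper names `hippo_legs` and `triangular_eigenvalues` are made up for illustration.

```python
import math

def hippo_legs(n):
    """Build an n x n HiPPO-LegS state matrix as a list of rows.

    Assumed formula (not stated in the thread):
      A[i][k] = -sqrt((2i+1)(2k+1))  if i > k
      A[i][k] = -(i+1)               if i == k
      A[i][k] = 0                    if i < k
    """
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1):
            if i == k:
                a[i][k] = -(i + 1.0)
            else:
                a[i][k] = -math.sqrt((2 * i + 1) * (2 * k + 1))
    return a

def triangular_eigenvalues(a):
    """Eigenvalues of a triangular matrix are exactly its diagonal entries."""
    return [a[i][i] for i in range(len(a))]

A = hippo_legs(4)
eigs = triangular_eigenvalues(A)
print(eigs)                        # [-1.0, -2.0, -3.0, -4.0]
print(all(ev < 0 for ev in eigs))  # True: every eigenvalue is in the left half-plane
```

Under that assumption the continuous-time system x' = Ax is stable in the Hurwitz sense. The unit-circle test applies only after the system has been discretized.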

@simonl1938 8 months ago

I'm trying to implement S4 myself in C right now and have the issue of the state exploding. I don't see how the matrix is stable at all. Do you have any suggestions on what I should look into?

@swfsql 7 months ago

@@stergiosbachoumas2476 Thanks for your update. Just a question: by being in the left half-plane, are you referring to the locations of the roots in state-space control theory?

@rohanasokan7338 6 months ago

@@simonl1938 To add to Stergios's reply: it seems Gu keeps the state from exploding by keeping the matrix's eigenvalues in the left half-plane, and he does that by limiting the real part of the diagonal to -1/2. He does some ablations on this in his dissertation, if you are interested. And because the eigenvalues are in the left half-plane, discretization maps them inside the complex unit circle. With eigenvalues in the right half-plane (positive real part), you will always have the state explosion problem.
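For anyone debugging an exploding state, the half-plane versus unit-circle point can be tried out on a single diagonal mode. The sketch below assumes a bilinear (Tustin) discretization, which the thread does not actually specify. The step size `dt`, the imaginary part `3.0`, and the helper names `bilinear` and `run` are arbitrary choices for illustration.

```python
def bilinear(lam, dt):
    """Map a continuous-time eigenvalue to discrete time (bilinear transform).

    Assumed discretization: lam_bar = (1 + dt*lam/2) / (1 - dt*lam/2).
    """
    return (1 + dt * lam / 2) / (1 - dt * lam / 2)

def run(lam_bar, steps):
    """Iterate x_{k+1} = lam_bar * x_k from x_0 = 1 and return |x_steps|."""
    x = 1.0 + 0.0j
    for _ in range(steps):
        x = lam_bar * x
    return abs(x)

dt = 0.1  # illustrative step size
stable = bilinear(complex(-0.5, 3.0), dt)   # real part -1/2: left half-plane
unstable = bilinear(complex(0.5, 3.0), dt)  # real part +1/2: right half-plane

print(abs(stable) < 1)             # True: lands inside the unit circle
print(abs(unstable) > 1)           # True: lands outside the unit circle
print(run(stable, 1000) < 1e-6)    # True: the state decays
print(run(unstable, 1000) > 1e6)   # True: the state explodes
```

If a hand-written implementation still blows up with a left-half-plane matrix, two things are worth checking: whether the recurrence uses the discretized matrix or the raw continuous-time one, and whether a sign was flipped while building the matrix.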