Paper deep dive: Evolutionary Optimization of Model Merging Recipes 

DataScienceCastnet
4.6K subscribers
3.3K views

Sakana AI has a great new paper exploring evolutionary approaches to model merging, showing how to find ways of combining existing models into new ones with impressive new skills. In this video, we dive into the paper and along the way spend some time learning about model merging in general, evolutionary algorithms, and more.

Science

Published: 20 Mar 2024

Comments: 6
@gedankenthesis 4 months ago
That was an excellent overview of not just Sakana's evolutionary methods to identify good merge candidates, but also the popular techniques TIES, DARE and Passthrough/Frankenmerge. Appreciate it as usual, Johno!
@jonatan01i 4 months ago
Oh my god, man, you don't understand how happy I am with your storytelling about the timeline of how the idea of model merging developed up to this point: where it started, how it went, and how people were thinking about the reasons why it works, etc. I want to get into this so that I understand the main ideas and can start working on these as well, but it's so hard to get to the root of things; it requires a huge amount of time to read and digest everything and slowly put the pieces together, so boy do I mean it when I say thank you!
@UmerHA 4 months ago
Hi Johno, at the beginning you said you're somewhat skeptical of model merging. IIUC, your criticism is only about iterative merging for a given goal, which leads to overfitting. Or are you skeptical of the general concept of model merging? Thanks!
@abse-mj8pw 4 months ago
Very good introduction! I can see a lot of effort has been put into this video! It helps a lot in understanding the paper! Thank you for sharing!!
@abse-mj8pw 4 months ago
However, I have one small question about the overfitting part at the end of this video. Is the concern that the math 7B LLM might have learned, or been fine-tuned on, the test set translated into Japanese?