
Supercharge Multi-LLM Intelligence w/ CALM 

Discover AI
44K subscribers
3.1K views

Revolution in AI: Beyond MERGE or MoE for multi-LLMs. Combine the pure intelligence of LLMs w/ CALM - Composition to Augment Language Models, by Google DeepMind. A new revolutionary approach! It integrates ideas from LoRA and Mixture of Experts with cross-attention from the encoder-decoder Transformer architecture.
Delving into the technical heart of the discussion, the focus shifts to the mechanics of combining Large Language Models (LLMs) through an advanced methodology (CALM by Google DeepMind) that surpasses traditional model merging techniques. This approach revolves around the concept of 'composing augmented language models': integrating LLMs at a deeper architectural level, beyond merely passing outputs between them.
Traditional merging combines different LLMs by dissecting and reassembling their layer structures, but its success hinges on the architectural alignment of the original models. To overcome the limitations posed by disparate architectures, a new, more generic methodology is introduced. It allows the composition of various complex LLMs almost independently of their layer structure, with the added advantage of requiring only minimal task-specific training data.
The core of this methodology is the utilization of projection layers and cross-attention mechanisms, with an emphasis on maintaining the original, frozen weight structures of the LLMs. This approach ensures the preservation of the inherent knowledge within each model while introducing new learnable parameters. The process involves mapping the dimensionality of one LLM's layer representation to match that of another, facilitating compatibility for cross-attention operations.
A key aspect of this technique is the projection of layer representations from one model (referred to as model A) to the dimensionality of another (model B). This step is crucial for ensuring compatibility between the layers of the two models. The cross-attention mechanism then dynamically integrates information from model A into model B, effectively allowing the latter to 'consult' the former about specific features or patterns in the data. This is particularly valuable when model B lacks certain knowledge or capabilities that model A possesses.
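To make the projection step concrete, here is a minimal PyTorch-style sketch (class and variable names such as AugmentingProjection, d_A, and d_B are illustrative assumptions, not code from the paper): a single trainable linear layer maps a hidden state of the augmenting model A into the hidden dimension of the anchor model B, while both LLMs themselves stay frozen.

```python
import torch
import torch.nn as nn

class AugmentingProjection(nn.Module):
    """Illustrative sketch: map hidden states of model A (dimension d_A)
    into the hidden dimension of model B (dimension d_B). Only this layer
    introduces new learnable parameters; both LLMs remain frozen."""
    def __init__(self, d_A: int, d_B: int):
        super().__init__()
        self.proj = nn.Linear(d_A, d_B)  # new trainable parameters

    def forward(self, h_A: torch.Tensor) -> torch.Tensor:
        # h_A: (batch, seq_len_A, d_A) -> (batch, seq_len_A, d_B)
        return self.proj(h_A)
```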
The technical execution of this process involves a detailed calculation of the cross-attention mechanism, incorporating query, key, and value matrices from the respective models. The queries are derived from model B (the anchor model), while the key and value pairs originate from model A (the augmenting model). The cross-attention output is then added as a residual connection to the layer representation of model B, and this output serves as the input to the subsequent layer in the composed model.
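As a hedged illustration of this cross-attention step, the sketch below uses PyTorch's standard multi-head attention and assumes the projected representation of model A from the previous snippet; the class name CrossAttentionComposition and the parameter names are hypothetical, and the actual CALM setup inserts such blocks at several selected layers.

```python
import torch
import torch.nn as nn

class CrossAttentionComposition(nn.Module):
    """Illustrative composition block: queries come from the anchor model B,
    keys and values from the (projected) augmenting model A; the attention
    output is added to B's hidden state as a residual connection and serves
    as input to the next layer of the composed model."""
    def __init__(self, d_B: int, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_B, n_heads, batch_first=True)

    def forward(self, h_B: torch.Tensor, h_A_proj: torch.Tensor) -> torch.Tensor:
        # h_B:      (batch, seq_B, d_B) hidden state of the frozen anchor model B
        # h_A_proj: (batch, seq_A, d_B) projected hidden state of the frozen model A
        attn_out, _ = self.cross_attn(query=h_B, key=h_A_proj, value=h_A_proj)
        return h_B + attn_out  # residual connection; input to B's next layer
```

In a full setup, a projection plus one such block would sit between selected layers of model B, and only these new parameters are updated during the small amount of composition training.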
This advanced approach to LLM integration signifies a paradigm shift in the field of AI and machine learning. It enables the creation of models with enhanced capabilities by leveraging the collective intelligence of multiple LLMs. This method not only preserves the unique strengths of each individual model but also fosters the emergence of new abilities that were previously unattainable by either model alone.
The presentation concludes by highlighting the groundbreaking potential of this approach in various applications, including language inclusivity and complex code understanding, setting the stage for future explorations and innovations in the domain of AI.
Literature (all rights w/ authors):
LLM AUGMENTED LLMS:
EXPANDING CAPABILITIES THROUGH COMPOSITION
arxiv.org/pdf/...
#ai
#aieducation
#newtechnology

Published: 21 Oct 2024

Comments: 8
@stephanembatchou5300 · 9 months ago
Excellent explanation. This technique will open the door for super-models.
@dayanemarcos1923 · 9 months ago
There must be some interesting applications for multimodality here.
@mrd6869 · 9 months ago
Already way ahead of you. Multi-agent frameworks work well.
@yorailevi6747 · 9 months ago
So just a generalization of encoder-decoder, good to know.
@mshonle · 9 months ago
I’ve been wondering why there has been less work on encoder-decoder architectures. I guess it made sense to see how far decoder-only architectures could go.
@PaulSchwarzer-ou9sw · 9 months ago
Do you have an X account? Maybe you should make one
@maximilianrck254 · 9 months ago
Great video and great explanation as always! How long do you think this technique will stay relevant if we get self-finetuning pretty soon? Is there a limit on how often you can merge a model, and would it be possible to merge an already fine-tuned one? I'm thinking about self-finetuning for user-specific tasks and merging in new capabilities like patches or updates. What do you think about it?
@justindressler5992 · 9 months ago
A to B or not to B is the question :)