Nice look-under-the-hood paper-reading walkthrough. I was kind of hoping for a practical session, something like "load it and run it in a LlamaIndex pipeline," but thanks for getting it out there. You guys rock!
Very interesting, thank you! I have a question: is the output of the retrieval the page itself (or a set of ranked pages)? Thank you!
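(For context on the question above: in ColPali-style late-interaction retrieval, the output is typically a ranked list of page images, scored via MaxSim between query-token embeddings and page-patch embeddings. Here is a minimal sketch of that scoring step with toy numpy arrays; the function names, embedding shapes, and values are illustrative assumptions, not the actual model API.)

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score: for each query-token embedding,
    take its max similarity over the page's patch embeddings, then sum."""
    sims = query_emb @ page_emb.T          # shape: (n_query_tokens, n_page_patches)
    return float(sims.max(axis=1).sum())

def rank_pages(query_emb: np.ndarray, page_embs: list) -> list:
    """Return page indices sorted by descending MaxSim score."""
    scores = [maxsim_score(query_emb, p) for p in page_embs]
    return sorted(range(len(page_embs)), key=lambda i: -scores[i])

# Toy example with made-up 2-dim embeddings.
q = np.array([[1.0, 0.0], [0.0, 1.0]])            # two query tokens
pages = [
    np.array([[0.9, 0.1], [0.1, 0.9]]),           # page 0: matches both tokens well
    np.array([[0.2, 0.2], [0.3, 0.1]]),           # page 1: weak match
]
print(rank_pages(q, pages))                       # page 0 ranks first
```

So the retriever itself returns ranked pages; any downstream text generation happens in a separate VLM/LLM step.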
It is interesting; however, despite what the young presenters say in the session, using the document image instead of the text content is not at all new. This approach to knowledge retrieval from document images has already been implemented and tested internally by several companies, just not announced or discussed publicly. There are pros and cons. Among the cons are the ViT architecture and its tiling approach. Another issue is that the pre-trained multi-modal embedding model carries very generic "world knowledge" and cannot easily adapt to highly specialized illustration/text content.
Also, I would be interested to know the extra cost of this VLM/LLM-based solution. Calling PaliGemma is not free :)