
Install MiniCPM Llama3-V 2.5 Locally - Beats GPT4o 

Fahd Mirza
4.9K views

This video shows a demo of MiniCPM-Llama3-V 2.5, the latest model in the MiniCPM-V series and a decent-quality VLM, and walks through installing it locally.
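A minimal loading sketch, assuming the standard transformers usage from the model card (the model.chat call is provided by the repository's remote code; the image path and prompt below are placeholders):

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-Llama3-V-2_5"

# trust_remote_code pulls in the custom modeling code that defines model.chat()
model = AutoModel.from_pretrained(model_id, trust_remote_code=True,
                                  torch_dtype=torch.float16).to("cuda").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")   # placeholder image
msgs = [{"role": "user", "content": "Describe this image."}]

answer = model.chat(image=image, msgs=msgs, tokenizer=tokenizer,
                    sampling=True, temperature=0.7)
print(answer)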
🔥 Buy Me a Coffee to support the channel: ko-fi.com/fahd...
▶ Become a Patron 🔥 - / fahdmirza
#minicpm #minicpmllama3 #vlm
PLEASE FOLLOW ME:
▶ LinkedIn: / fahdmirza
▶ YouTube: / @fahdmirza
▶ Blog: www.fahdmirza.com
RELATED VIDEOS:
▶ Model huggingface.co...
All rights reserved © 2021 Fahd Mirza

Published: 22 Sep 2024

Comments: 23
@AshleyBurton · 4 months ago
I love your videos, thank you!
@fahdmirza · 4 months ago
You are so welcome! Much appreciated.
@sergeistadnik8305 · 2 months ago
This is a great model, indeed!
@fahdmirza · 2 months ago
agreed
@andrewowens5653 · 4 months ago
Very interesting. It would also be interesting to compare this model to the gguf quantized version when it comes out.
@HassanAllaham · 4 months ago
That's a very good request
@testales · 4 months ago
Maybe it's possible to do that locally; if I remember correctly, there's a script or tool for Ollama to do the conversion. Once it can be loaded in Ollama, it can be used with Open WebUI, where you can just drag and drop images. We'll see. :)
@fahdmirza · 4 months ago
Noted.
@npc4416 · 4 months ago
Good, but how is it better than GPT-4o?
@HassanAllaham · 4 months ago
In the model card they wrote (for OCR capabilities): 💪 Strong OCR Capabilities. MiniCPM-Llama3-V 2.5 can process images with any aspect ratio up to 1.8 million pixels, achieving a 700+ score on OCRBench and surpassing proprietary models such as GPT-4o, GPT-4V-0409, Qwen-VL-Max and Gemini Pro. Based on recent user feedback, MiniCPM-Llama3-V 2.5 has now enhanced full-text OCR extraction, table-to-markdown conversion, and other high-utility capabilities, and has further strengthened its instruction-following and complex reasoning abilities, enhancing multimodal interaction experiences. I could not find the OCRBench score for GPT-4o, though; maybe it was tested by the model developers. But the evaluation comparison in this image shows that it is a good model (it was compared with GPT-4V-1106): huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/resolve/main/assets/MiniCPM-Llama3-V-2.5-peformance.png
@fahdmirza · 4 months ago
On some benchmarks; please see the model card for the numbers.
@massimogiussani4493 · 3 months ago
Is there a way to use this model with a GPU that has only 6 GB of VRAM? When I run AutoModel.from_pretrained(...), I get a "CUDA out of memory" error.
@fahdmirza · 3 months ago
would need to check
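For reference, a hedged sketch of the usual lower-memory route: loading the int4-quantized checkpoint that OpenBMB publishes alongside the full model. The repository id and the roughly 9 GB memory figure are taken from the model card as I recall it, so treat them as assumptions; even the int4 build may not fit in 6 GB, in which case a GGUF build or CPU offload is the fallback.

from transformers import AutoModel, AutoTokenizer

# int4 variant of the model; loading it requires the bitsandbytes package
model_id = "openbmb/MiniCPM-Llama3-V-2_5-int4"

model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# inference then follows the same model.chat(...) pattern as the full model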
@QorQar · 19 days ago
How do I run this model's GGUF with llama-cpp-python locally, with an image?
@fahdmirza · 18 days ago
Use LM Studio.
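For readers who still want the llama-cpp-python route, the library's generic multimodal pattern looks roughly like this. It is a sketch only: the GGUF and mmproj file names are placeholders, the LLaVA-style chat handler is an assumption, and MiniCPM-Llama3-V 2.5 GGUFs may need a llama.cpp build that explicitly supports the model.

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# the mmproj file holds the vision projector that pairs with the LLM GGUF
chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")

llm = Llama(
    model_path="minicpm-llama3-v-2_5.Q4_K_M.gguf",   # placeholder GGUF path
    chat_handler=chat_handler,
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/image.jpg"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }]
)
print(response["choices"][0]["message"]["content"])

LM Studio, as suggested in the reply, wraps a similar llama.cpp backend behind a GUI, which avoids writing this code at all.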
@jampu · 4 months ago
Can you fine-tune this model?
@fahdmirza · 3 months ago
yes
@WailMaghrane · 3 months ago
Can you show how to fine-tune this model?
@fahdmirza · 3 months ago
Noted
@hamidmohamadzade1920 · 4 months ago
Is there any relation between this and CogVLM-2?
@fahdmirza · 4 months ago
Both are VLMs
@gainstycom · 4 months ago
Is there a way to run it on a server and pay per usage?
@fahdmirza · 3 months ago
Maybe via Hugging Face.