
Marker: Get Your PDFs Ready for RAG & LLMs | High-Accuracy Open-Source Tool

DataEdge
3.4K views

PDFs are widely used in business, academia, and elsewhere for their consistent formatting, but extracting their content can be tricky, especially when it includes images, tables, and formulas. Extraction is a key step in preparing text for Retrieval-Augmented Generation (RAG) applications and large language models (LLMs).
In this video, we show how converting PDFs to plain text simplifies data processing for LLMs, and how Markdown preserves information and formatting during conversion so that your LLM interprets the content accurately.
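To illustrate why Markdown output is convenient for RAG, here is a minimal sketch that splits an already converted Markdown document into heading-based chunks ready for embedding. It does not call Marker itself. The function name `split_markdown_by_heading` and the sample text are hypothetical, chosen only for this illustration, and the splitter is deliberately naive: a line starting with `#` inside a fenced code block would also be treated as a heading.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Section".
HEADING_RE = re.compile(r"^#{1,6}\s+\S")


def split_markdown_by_heading(markdown_text):
    """Split Markdown text into chunks, one per heading.

    Returns a list of dicts with "heading" and "text" keys.
    Text that appears before the first heading gets an empty heading.
    """
    chunks = []
    heading = ""
    body = []

    def flush():
        # Store the current chunk if it has a heading or any non-blank text.
        if heading or any(line.strip() for line in body):
            chunks.append({"heading": heading, "text": "\n".join(body).strip()})

    for line in markdown_text.splitlines():
        if HEADING_RE.match(line):
            flush()
            heading = line.lstrip("#").strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks


# Hypothetical sample standing in for a converted PDF.
sample = (
    "# Report\n\n"
    "Intro text.\n\n"
    "## Results\n\n"
    "| Metric | Value |\n"
    "|---|---|\n"
    "| Pages | 12 |\n"
)

for chunk in split_markdown_by_heading(sample):
    print(chunk["heading"], "->", len(chunk["text"]), "characters")
```

Because the headings and table rows survive conversion as plain Markdown syntax, each chunk keeps its section title and its table intact, which is the kind of structure a retriever or an LLM can use.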
#ai #llm #opensourcellm #generativeai #pdfs
Blog: www.dataedgehub.com
LINKS:
Code: www.dataedgehub.com/2024/07/u...
GitHub code: github.com/VikParuchuri/marker
PyTorch installation: pytorch.org/
• Advanced Function Call...
• MiniCPM-Llama3-V 2.5 -...


Published: May 31, 2024

Comments: 12
@abdinegara3135 2 months ago
Hey man, I really appreciate your video. You actually deserve more viewers ❤
@DassS-dass 2 months ago
It's great 👍
@mukeshkund4465 1 month ago
Appreciate it. How can we build RAG on top of this? If you could make a video on that, it would be very helpful.
@DataEdge01 1 month ago
Noted, thanks.
@isagiyoichi-mg2ds 1 month ago
Same request
@ignaciopincheira23 1 month ago
Could you add the description of each image to the text with the aim of having a single Markdown file, similar to the original PDF? This way, it would be possible to pass a file to a language model that is readable and maintains its content.
@DataEdge01 1 month ago
Noted!
@intellect5124 1 month ago
Very informative video. Could you try to build a system that can run on a large number of PDFs and further convert these to .md files for an LLM to query or generate specific prompts with a UI?
@DataEdge01 1 month ago
Noted, thanks!
@atomobianco 2 months ago
Details matter. You say the index is well formatted as a table, but it seems to me that the Markdown displays two columns while the PDF index had only one.
@DataEdge01 2 months ago
The limitations were addressed at the beginning of the video.