
Best Tool For Getting Your Data Ready For RAG 

Data Science Basics
13K subscribers
3.1K views

In this introductory video about Unstructured, I show how to get started and partition a PDF using three different approaches.
80% of enterprise data exists in difficult-to-use formats like HTML, PDF, CSV, PNG, PPTX, and more. Unstructured effortlessly extracts and transforms complex data for use with every major vector database and LLM framework.
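As a rough illustration of what partitioning a PDF looks like, here is a minimal sketch using the open-source `unstructured` library. It is not the notebook from the video, and it covers only the local library route, since the three approaches are not spelled out in this description. It assumes `unstructured[pdf]` is installed; the helper names `partition_pdf_file` and `count_categories` and the path `example.pdf` are hypothetical.

```python
from collections import Counter


def partition_pdf_file(path, strategy="auto"):
    """Partition a PDF into elements with the open-source unstructured library.

    Assumes `pip install "unstructured[pdf]"` plus any system packages it needs.
    The import is local so the helper below also works without the library.
    """
    from unstructured.partition.pdf import partition_pdf

    # strategy may be "auto", "fast", "hi_res", or "ocr_only"
    return partition_pdf(filename=path, strategy=strategy)


def count_categories(elements):
    """Tally elements by category (e.g. Title, NarrativeText, ListItem, Table)."""
    return Counter(getattr(el, "category", type(el).__name__) for el in elements)


# Hypothetical usage; "example.pdf" is a placeholder path:
# elements = partition_pdf_file("example.pdf", strategy="fast")
# print(count_categories(elements))
```

Counting categories first is a quick way to check whether a given strategy found the titles, tables, and list items you expect before sending anything to a vector database.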
Link ⛓️‍💥
unstructured.io/
Code 👨🏻‍💻
github.com/sudarshan-koirala/...
------------------------------------------------------------------------------------------
☕ Buy me a Coffee: ko-fi.com/datasciencebasics
✌️Patreon: / datasciencebasics
------------------------------------------------------------------------------------------
🤝 Connect with me:
📺 RU-vid: / @datasciencebasics
👔 LinkedIn: / sudarshan-koirala
🐦 Twitter: / mesudarshan
🔉Medium: / sudarshan-koirala
💼 Consulting: topmate.io/sudarshan_koirala
#unstructureddata #unstructuredio #llm #datasciencebasics

Science

Published: 1 Aug 2024

Comments: 18
@aalamansari8643 28 days ago
Sir, when using partition_pdf I am not able to get the bullet points from the PDF (like "- Any bullet point"). How can I get the bullet points? Need help, sir!
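One possible way to approach the bullet-point question, sketched under assumptions rather than taken from the video: elements returned by `partition_pdf` expose `category` and `text` attributes, and list entries are normally categorized as `ListItem`. The helper below (the name `extract_bullets` and the regex are hypothetical) collects those, and also falls back to scanning other elements line by line in case bullets were merged into a larger text block. Whether bullets are detected at all may depend on the PDF and on the chosen strategy.

```python
import re

# Leading bullet glyphs or simple numbering such as "1." or "2)"
BULLET_PREFIX = re.compile(r"^\s*(?:[-*•‣◦▪·]|\d+[.)])\s+")


def extract_bullets(elements):
    """Return bullet texts from a list of unstructured-style elements.

    Each element only needs `category` and `text` attributes.
    """
    bullets = []
    for el in elements:
        text = (getattr(el, "text", "") or "").strip()
        if not text:
            continue
        if getattr(el, "category", "") == "ListItem":
            # Already recognized as a list item; drop any leftover bullet glyph
            bullets.append(BULLET_PREFIX.sub("", text))
        else:
            # Fallback: a bulleted paragraph may arrive as one multi-line element
            for line in text.splitlines():
                if BULLET_PREFIX.match(line):
                    bullets.append(BULLET_PREFIX.sub("", line).strip())
    return bullets
```

If this returns nothing for a PDF that clearly has bullets, trying a different `strategy` value in `partition_pdf` and comparing the resulting categories is a reasonable next step.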
@TooyAshy-100 2 months ago
Congratulations on reaching 10K, and soon it will be even greater.
@datasciencebasics 2 months ago
Thank you 🙏🏼
@d3mist0clesgee12 2 months ago
Congratulations on your 10K!
@datasciencebasics 2 months ago
Thank you 🙏🏼
@mchl_mchl 2 months ago
Great video! I have been using Unstructured data connectors to do hybrid searches and text embedding with Elasticsearch. I would love to see if you have some tips for the JSON mapping there, or anything else. I would also love to get a function for all data types that can handle all the edge cases.
@awakenwithoutcoffee 13 days ago
Hi sir, have you tried Azure AI Document Intelligence? We are figuring out which data parser is the most suitable for production RAG apps. Cheers
@SantK1208 3 months ago
Thanks Sudarshan, could you please make a video on fine-tuning the Llama 3 model?
@datasciencebasics 3 months ago
You are welcome. I will add that to my to-do list!
@yazanrisheh5127 2 months ago
Hello Sudarshan. Can you please make a video on RAG over several PDFs that contain text, images, and tables?
@datasciencebasics 2 months ago
I will add that to my to-do list ✅
@CC-zg4el 2 months ago
Hi Sudarshan, I have been trying to follow your Unstructured tutorials, but I keep getting an error at the beginning because, apparently, my virtual environment lacks something that I cannot figure out. I also forked and cloned your repository locally, hoping to find a spec-file.txt to clone your environment, but it seems there is no such file. Would you mind sharing a spec-file.txt so I can clone your environment and try your notebook? If you have another video where you have already shown your subscribers how to follow along with your tutorials, please just point me in the right direction. Thank you very much for your time!
@datasciencebasics 2 months ago
You are welcome. Installing the unstructured Python SDK can be challenging, as it might need some system-level packages to be installed. This video has some ideas, and there is a link in the notebook; please follow it there. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-hQu8WN8NuVg.htmlsi=V3N2VjcguzLPMiII
@drmetroyt 2 months ago
How do I install it as a Docker container?
@datasciencebasics 2 months ago
You can read and follow the official documentation -> docs.unstructured.io/open-source/installation/docker-installation
@drmetroyt 1 month ago
@datasciencebasics I'm not a tech person, sir. I'm a medical student and I have many PDFs that I want to feed into RAG, so I want to install unstructured.io in Docker, but there is no video on the internet. The documentation mentions installing with Docker, but there is no proper, understandable guide.
@awakenwithoutcoffee 13 days ago
@drmetroyt ask GPT! It should be able to walk you through the process, or at least point you in the right direction. It is not an easy subject for a beginner (I am also new to Docker).