
LocalGPT API: Build Powerful Doc Chat Apps 

Prompt Engineering
161K subscribers · 31K views

In this video, I will show you how to use the localGPT API. With the localGPT API, you can build applications on top of localGPT to talk to your documents from anywhere you want. In this example application, we will be using the Llama-2 LLM.
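As a rough sketch of what "talk to your documents from anywhere" looks like in practice, here is a minimal client for the localGPT Flask API. The endpoint path, port, form field name, and "Answer" response key follow the repo's run_localGPT_API.py as I understand it; treat all of them as assumptions and verify against your checkout.

```python
import json
import urllib.parse
import urllib.request

# Assumed default: the API server from run_localGPT_API.py on port 5110.
API_URL = "http://localhost:5110/api/prompt_route"

def build_request(prompt: str, url: str = API_URL) -> urllib.request.Request:
    # The API is assumed to expect form-encoded data with a "user_prompt" field.
    data = urllib.parse.urlencode({"user_prompt": prompt}).encode()
    return urllib.request.Request(url, data=data)  # data set => POST request

def ask(prompt: str) -> str:
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    # The JSON response is assumed to carry the model's reply under "Answer".
    return body.get("Answer", "")
```

With the API running (python run_localGPT_API.py), a call like ask("What does the document say about free speech?") would return the model's answer over the ingested documents.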
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
#llama #localgpt #llm
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.com/@engineerprom...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/engineerprompt/c...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
LocalGPT Github: github.com/PromtEngineer/loca...
LocalGPT Video: • LocalGPT: OFFLINE CHAT...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
Intro: [00:00]
Setting up localGPT Repo: [01:33]
Llama-2 and other LLMs: [03:18]
Running the API: [04:25]
Running the GUI with API: [05:00]
The GUI Tutorial: [06:15]
What exactly is happening: [08:37]
Add more docs to the Vector Store: [09:05]
Code Walkthrough: [10:30]
Outro: [14:25]
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...

Category: Science

Published: 26 Jun 2024
Comments: 105
@CapiFlvcko · 10 months ago
Thank you and everyone contributing to this!
@mindful-machines · 10 months ago
thanks for sharing this! I think "private AI" is the future and this project definitely makes it easier for people to run their own local models. cloning now 😁
@CesarVegaL · 10 months ago
Thank you for sharing your knowledge. It is greatly appreciated
@tubingphd · 10 months ago
Very useful :) Much appreciate your hard work on the project and the videos
@andrebullitt7212 · 10 months ago
Great Stuff! Thanks
@noreasonsu3883 · 10 months ago
This is amazing. It's just that on my CPU it runs very slowly. Can you show how to use CUDA to speed it up?
@Zayn_Malik_Awan · 10 months ago
You are working great ❤❤
@dycast646 · 10 months ago
Just in time! Great for Sunday, I’ll blame you if wife yells at me for not watching tv with her😂😂😂
@engineerprompt · 10 months ago
🤣🤣
@zorayanuthar9289 · 10 months ago
Your wife should never yell at you. She should respect you for your curious mind and vice versa.
@dycast646 · 10 months ago
@@zorayanuthar9289 I wish my wife were as reasonable as you described 🥹🥹🥹
@cudaking777 · 10 months ago
Thank you
@gabrudude3 · 8 months ago
Thanks for putting together a step-by-step tutorial. Very helpful! All your videos are amazing. I was looking for exactly this kind of solution to query local confidential documents. Two quick questions: how do I switch to the 13B model? And how do I train the model on a custom database schema and SQL queries? I tried it with a schema document, but the SQL queries it returned were not at all useful. A similar scenario with the ChatGPT API returned good results.
@pagadishyam7049 · 9 months ago
Thanks for this amazing video! Can you suggest a high-performance AWS EC2 instance where we can host this app? Any suggestions for running this app in parallel?
@fenix20075 · 10 months ago
Hmm... it looks like the old approach, but with the web API added. Great! Anyway, I found that any sentence-transformer model can serve as the instruction-embedding model for the project. But hkunlp/instructor-xl still remains the most accurate instruction-embedding model.
@jkdragon201 · 10 months ago
Thank you, I'm learning so much from you. I had two questions on scalability. 1) If you had simultaneous queries on the API, how does localGPT handle them? Will it queue the requests or run them in parallel, albeit slower? 2) I noticed that searches sometimes take upwards of 30 seconds on a V100 GPU using a 7B Llama-2 model. Are there any ways to optimize or accelerate the inference/retrieval speeds? Thanks!!
@engineerprompt · 10 months ago
Thanks :) To answer your questions: 1) Right now it will queue them, but that can be improved. 2) There are a few changes that can improve the speed. One possibility is to use a different embedding model and experiment with different LLMs.
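For readers wondering where those swaps happen: in the localGPT repo, the embedding model and LLM are selected in constants.py. This is a sketch of the relevant fragment; the variable names follow the repo, while the specific model ids are illustrative assumptions, not recommendations.

```python
# constants.py (sketch) -- variable names per the localGPT repo; the model ids
# below are illustrative assumptions you would replace with your own choices.
EMBEDDING_MODEL_NAME = "hkunlp/instructor-large"    # try a smaller embedder for speed
MODEL_ID = "TheBloke/Llama-2-7B-Chat-GGML"          # a quantized chat LLM repo id
MODEL_BASENAME = "llama-2-7b-chat.ggmlv3.q4_0.bin"  # the specific quantization file
```

After changing the embedding model you would need to re-run ingestion, since the vector store was built with the old embeddings.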
@waelmashal7594 · 10 months ago
This is cool
@ckgonzales16 · 10 months ago
Great as always. I need a UI for associates so they can just run a query, without being able to reset or add to the local knowledge base.
@engineerprompt · 10 months ago
Thanks :) It's already there, I will be covering it in a future video!
@ckgonzales16 · 10 months ago
@@engineerprompt I'm getting an error on the last line of code before running localGPT. All the dependencies and the env are good; I can't seem to figure out where the bug is. Also, I wanted to touch base on the consultancy thing we discussed. I finally got an update on it.
@engineerprompt · 10 months ago
@@ckgonzales16 What is the error? Would love to connect again, let's schedule some time.
@WHOAMI-hq3nc · 10 months ago
Thanks for sharing, but there is a problem with this model. I'm not sure if it's a bug or intended behavior: if I ask the same question repeatedly, its answering time increases exponentially. Is this caused by reading in the historical conversation data every time?
@petscddm · 9 months ago
When installed and run, I got this error: "File pydantic/main.py:341 in pydantic.main.BaseModel.__init__ ValidationError: 1 validation error for LLMChain llm none is not an allowed value (type=type_error.none.not_allowed)". Any idea how to fix it?
@ernestspicer7728 · 10 months ago
What is the best way to generate the REST API for other applications to call?
@RajendraKumar-ge9cv · 9 months ago
Thank you for the video, I really appreciate your effort in putting together the UI layer. I have a question: running run_localGPT_API.py does not start the API console. The following has been the status on my VS Code terminal for about an hour: "Mac-Studio lgpt1 % python run_localGPT_API.py --device_type mps load INSTRUCTOR_Transformer max_seq_length 512". Am I doing anything wrong? Appreciate your response.
@snehasissengupta2773 · 10 months ago
Sir, please create a Google Colab script to run this for low-end PC users. You are my dream teacher.
@kingpatty4628 · 4 months ago
I can't complete the requirements.txt install because chroma-hnswlib requires MSVC++ 14.0 or above to build the wheels. I installed the Visual Studio Build Tools and everything, but still nothing. Maybe it is a Python version compatibility issue?
@williamwong8424 · 10 months ago
Can you complete it by packaging it as an app, e.g. on Render?
@jamesc2327 · 6 months ago
Is there a more generalized api wrapper? Is this specifically for documents?
@engineerprompt · 6 months ago
At the moment this is specific to the documents you ingest, but I will add a generalized API that you can use to talk to the model itself.
@Kaalkian · 10 months ago
Great progress. Perhaps making a Docker image would be the next step to simplify the devops of this setup.
@wtfisthishandlebs · 10 months ago
I'm running this on a 1070 and it takes about 5 min to answer a question. How much power would it take to get a 30 sec to 1 min answer? Is this possible?
@sachinkr9764 · 10 months ago
Thanks for the video. Can you please make a video on fine-tuning the Llama-2 model on PDF documents?
@paulhanson6387 · 10 months ago
Maybe this will help: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-lbFmceo4D5E.html It's not for fine-tuning, but it will give you a start on doing Q&A with your docs.
@sachinkr9764 · 10 months ago
@@paulhanson6387 thanks a lot Paul
@IanCourtright · 9 months ago
This is such a huge thing and you're not getting enough attention for it! I'm getting the UI to run on port 5111, but I'm running into an issue where the initial python run_localGPT_API.py shows 'FileNotFoundError: No files were found inside SOURCE_DOCUMENTS, please put a starter file inside before starting the API!' even though the constitution PDF is already there. Please advise!
@TrevorDBEYDAG · 5 months ago
I just need to use localGPT on the CLI, pointing some shortcuts at real document folders and digesting them all. Is that possible?
@engineerprompt · 5 months ago
Yes, that's possible. You will need to provide the folder name as a command-line argument. Look at constants.py to understand how it's set in the code.
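The CLI workflow being described would look roughly like this. The flag and constant names follow the localGPT repo's README as I understand it; verify them against your checkout, since depending on the version the documents folder is set either via a command-line argument or via the SOURCE_DIRECTORY constant in constants.py.

```shell
# Sketch of the assumed localGPT CLI workflow (verify flags against your checkout).
python ingest.py --device_type cpu        # build the vector store from your documents folder
python run_localGPT.py --device_type cpu  # interactive chat over the ingested documents
```
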
@chanduramayanam236 · 10 months ago
Can you please tell us the RAM, CPU, and hard-disk requirements to run localGPT? I'm getting answers after 40 minutes for basic questions, and I have 12 GB of RAM. I even tried with Google Colab GPUs, but the answers still take about 40 minutes.
@chanduramayanam236 · 10 months ago
@engineerprompt
@emil8367 · 5 months ago
Thanks for the video and the useful information. The localGPT project uses some models described in the constants.py file as MODEL_ID and MODEL_BASE. Where are these models stored? Also a question about, e.g., fine-tuning with autotrain: can you please tell me where the data is stored when I use "autotrain ... --data_path 'timdettmers/openassistant-guanaco' ..." in a command? I've triggered this command from my user's home folder but don't see any files downloaded.
@engineerprompt · 4 months ago
When you run it, it will create a new folder called "models" and the models will be downloaded into that folder. For autotrain, it should also download the model to that folder.
@emil8367 · 4 months ago
@@engineerprompt many thanks for the details 🙂
@lucianob4845 · 8 months ago
Hi! I have seen at 4:51 in your video that you have the following list of processor architecture features: AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 | Those are the same as my Orange Pi 5's ARM RK3588S 8-core 64-bit processor. NEON is a SIMD architecture extension for ARM processors, and ARM_FMA, similar to x86 FMA, represents Fused Multiply-Add support on ARM. Are you using an Orange Pi 5 too? I'm trying to use the NPU of the RK3588S for localGPT.
@mvdiogo · 10 months ago
Hi, I would like to help make the project better. How can I help? I've found some bugs and some code that could be nicer.
@RABRABB · 10 months ago
Why not H2OGPT? It has more capabilities and a GPU usage option. If you have a consumer-grade GPU (RTX 3060), it is at least 10 times faster.
@nomorecookiesuser2223 · 10 months ago
It remains to be seen whether H2O is as offline and private as they first suggest. Also, I do not want to run Java on my machine. H2O is very much a corporate-controlled model. We will see whether its offline function is anything but bait as time goes on.
@prashantwaghmare5453 · 10 months ago
@Prompt Engineering I have an issue while running the local API file: the db automatically vanishes, and even though source documents are present it says no documents were found. Please help, guys.
@jirivchi · 10 months ago
I have the same problem. FileNotFoundError: No files were found inside SOURCE_DOCUMENTS, please put a starter file inside before starting the API!
@Udayanverma · 9 months ago
Why aren't you building a Dockerfile or Docker images?
@alexandrelalaas8772 · 5 months ago
Hello there, I'm trying to launch the application but I have a problem 😢. When I run "pip install -r requirements.txt" in the Anaconda prompt, I get the error "ERROR: Could not find a version that satisfies the requirement autoawq (from versions: none)". So after many attempts I tried to install AutoAWQ from source (git clone of the repository) and launch it. Then I got a new error: "ERROR: Could not find a version that satisfies the requirement torch (from versions: none)". Has anyone encountered this error?
@nayandeepyadav8790 · 10 months ago
How do I deploy on GCP or AWS and get a website URL instead of localhost?
@mohitbansalism · 9 months ago
Were you able to find a solution for this?
@ikjb8561 · 10 months ago
Is it fast?
@Lorenzo_T · 10 months ago
Could you show us how to use it in Google Colab?
@engineerprompt · 10 months ago
Yeah, I will make a video on it.
@Lorenzo_T · 10 months ago
@@engineerprompt thanks, that would be awesome. I look forward to it 🤩
@anand83r · 10 months ago
Hi, always using only that constitution document is misleading about the output quality. Why don't you use some math or law documents to test the output?
@JustVincentD · 10 months ago
I tried it with ISO standards. The output is bad.
@arturorendon145 · 10 months ago
@@JustVincentD how can we improve it?
@JustVincentD · 10 months ago
@@arturorendon145 I think chunking and embedding must get better; also, saving more metadata, like page numbers, would be nice. I have not looked at the LangChain implementations (which are used). I'm just thinking of something like using different sizes of chunks in sequence. The embeddings remind me a lot of locality-sensitive hash algorithms, so maybe copy some tricks from there.
@caiyu538 · 9 months ago
I used a T4 CUDA 16 GB GPU. It still takes 3-4 minutes to answer my question, but the answer about my file content is very precise across 4-5 pages of content. Is taking 2-4 minutes to get an answer normal under these conditions?
@engineerprompt · 9 months ago
That's on the longer side; you probably need access to a better GPU. Check out RunPod.
@caiyu538 · 9 months ago
@@engineerprompt thank you so much for your great work and for sharing it with us. I am glad that the answers about my files are great, although it takes a little longer to produce them. I will test more files and try different models. I also need to modify the prompt to make the answers more concise. I will check out RunPod as you mentioned. Thank you.
@jaymatos100 · 9 months ago
Hello again, thanks for your video. I followed the instructions and the ingest.py script works fine. But when I try running run_localGPT_API or run_localGPT I get the following error: pydantic.main.BaseModel.__init__ pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain llm none is not an allowed value (type=type_error.none.not_allowed). I have text-generation-webui working with the TheBloke_WizardCoder-15B-1.0-GPTQ model. I think that works because in Pinokio it is probably a Docker container.
@engineerprompt · 9 months ago
Did you pull the latest changes to the repo?
@jaymatos100 · 9 months ago
@@engineerprompt good morning, thanks for your prompt reply. I did a git pull and some files were updated, but I'm still getting the error: 2023-09-17 08:42:34,983 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models Traceback (most recent call last): File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__ pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain llm none is not an allowed value (type=type_error.none.not_allowed)
@lukepd1256 · 10 months ago
How would this work with OneDrive?
@engineerprompt · 10 months ago
You will have to give it access to read files from your drive. Other than that, it will probably work without many changes.
@ashwanikumarsingh2243 · 8 months ago
Without internet, running python run_... throws an error.
@Dave-nz5jf · 10 months ago
And if anyone knows how to get this running on Apple Silicon, like an M1, please post any advice.
@engineerprompt · 10 months ago
I run it on an M2; follow the steps listed in the repo.
@Dave-nz5jf · 10 months ago
@@engineerprompt Sorry, I should have been clearer; I got distracted. I meant using the GPU and not the CPU. I'll check the repo for those instructions, but I don't remember seeing them.
@Dave-nz5jf · 10 months ago
It looks like privateGPT is checking torch.cuda.is_available(), but I'm using Apple Silicon's MPS. In my case torch.backends.mps.is_available() is True.
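The check the commenter describes can be sketched as a small helper: prefer CUDA, then Apple Silicon's MPS backend, then fall back to CPU. This is an illustrative sketch, not localGPT's own code; in localGPT the supported way to select a device is the --device_type flag.

```python
def pick_device() -> str:
    """Return the best available torch device string: "cuda", "mps", or "cpu"."""
    try:
        import torch
    except ImportError:  # torch not installed: only the CPU fallback is meaningful
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent on older torch builds, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The result could then be passed along, e.g. python run_localGPT.py --device_type mps on an M1/M2 machine.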
@chanduramayanam236 · 10 months ago
You never tell people what you mean by a powerful machine, or what the minimum system requirements are to run these models, because on our laptops they are not running anyway. An educational video about how to set this up on a cloud server would also make your tutorial complete. Otherwise these videos are just for showing your knowledge, and common people like us can't implement it. I hope you understand this feedback.
@jd2161 · 10 months ago
Recplace
@jayantvijay1251 · 10 months ago
What if I have thousands of PDFs that I want to ask questions from?
@engineerprompt · 10 months ago
It will still work; the response time might be a bit slower, but it will work.
@AverageCoderOfficial · 10 months ago
Lol, released 9 sec ago
@MudroZvon · 10 months ago
Are you making mistakes in the previews on purpose to get more comments?
@nomorecookiesuser2223 · 10 months ago
Are you making pointless comments to get more comments? If you found an error, share a solution; otherwise you are just whining.
@far-iz · 10 months ago
Is it free?
@engineerprompt · 10 months ago
Yes
@user-yd3zk4hb1o · 10 months ago
Completely showing errors, unable to ask a single question :(
@serenditymuse · 10 months ago
This is begging to be containerized.
@nomorecookiesuser2223 · 10 months ago
Sure, let's basically black-box an open thing because you do not want to use conda.
@Enju-Aihara · 10 months ago
OpenAI is not open and localGPT is not local; thanks for nothing.
@photon2724 · 10 months ago
What? It is local though...
@mokiloke · 10 months ago
It's not OpenAI.
@Enju-Aihara · 10 months ago
@@photon2724 local means cut off from the internet and it still runs normally
@Dave-nz5jf · 10 months ago
@@mokiloke Look again. I don't see where he says anything about OpenAI. Where do you see that?
@CesarVegaL · 10 months ago
I found the following article to share, "OpenAI, a non-profit artificial intelligence (AI) research company, has announced its closure due to lack of funding from wealthy patrons. To continue its work and attract capital, OpenAI plans to create a new for-profit-related company. This decision is due to the need to invest millions of dollars in cloud computing and attract AI experts to remain competitive. High salaries for AI experts have also contributed to OpenAI's unviability. Although they will continue to be available, OpenAI tools may be affected by this change. Alternatives such as Scikit-learn, Pandas, Azure ML, and OpenCV are presented for Machine Learning projects."
@glitzsiva2484 · 8 months ago
Does it support the "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3" model?