Hello brother, is there an open-source video and image enhancer GitHub project that I can run on Google Colab? And the same question for text-to-speech and voice cloning: anything other than RVC and OpenAI that can rival ElevenLabs quality? I would be really grateful.
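For the voice-cloning half, one open-source option that runs on Colab is Coqui TTS with its XTTS v2 model (Real-ESRGAN is a comparable open-source pick for image/video upscaling). A minimal sketch, assuming `pip install TTS` has been run; the reference clip path is a placeholder:

```python
# Sketch: zero-shot voice cloning with Coqui TTS's XTTS v2 model.
# Assumes `pip install TTS`; the first run downloads the checkpoint.
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Clone the voice in a short, clean reference clip and synthesize new speech.
tts.tts_to_file(
    text="Hello, this is a cloned voice speaking.",
    speaker_wav="reference.wav",  # hypothetical path to your reference sample
    language="en",
    file_path="output.wav",
)
```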
🔥Install SearXNG with Perplexica and Ollama Locally for AI Search Engine ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-LjTIy0FEkAQ.html?si=DBfI1i0JxSsOsIgd
Tool use is a crutch until LLMs are trained to process logic, and AlphaProof seems to crack the code with a silver-medal Olympiad result. For the time being, this is a good crutch.
Thanks Fahd for this great video. I've installed all the packages and I can run the program without errors (after several fights with library conflicts), but it takes too much time to print the response. Why? Is it because I'm running on CPU? I don't have an NVIDIA card in my laptop. Is there a way to make it faster?
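CPU inference with full-precision checkpoints is usually the bottleneck here; a common workaround is a quantized GGUF model served by llama.cpp. A minimal sketch with the llama-cpp-python bindings, assuming a GGUF file has already been downloaded (the path and thread count are placeholders):

```python
# Sketch: faster CPU inference via a quantized GGUF model and llama.cpp.
# Assumes `pip install llama-cpp-python`; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3.1-8b-instruct.Q4_K_M.gguf",  # hypothetical local file
    n_threads=8,   # set to your physical core count
    n_ctx=4096,    # context window
)

out = llm("Explain RAG in one paragraph.", max_tokens=200)
print(out["choices"][0]["text"])
```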
Hey, thank you for the amazing tutorial. What is your PC config? How have you managed to load the 70B-parameter model locally? Also, will a Mac Studio (M3) with 36 GB of unified memory be able to load the model?
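A back-of-the-envelope check on the 36 GB question; these numbers are estimates from parameter counts, not measurements:

```python
# Rough memory estimate for a 70B-parameter model at common precisions.
params = 70e9

for bits, label in [(16, "fp16"), (8, "q8"), (4, "q4")]:
    gb = params * bits / 8 / 1e9
    print(f"{label}: ~{gb:.0f} GB for weights alone (plus KV cache and overhead)")

# fp16: ~140 GB, q8: ~70 GB, q4: ~35 GB. A 36 GB Mac is borderline even at
# 4-bit, since macOS reserves part of unified memory and the KV cache needs room.
```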
How much VRAM does it end up using with the Prompt Guard and Llama Guard models? I can load the Llama 3.1 model, but when I try to run the Mesop app it uses too much memory for my system. Do you know if there is a way to disable the guards so it doesn't load the other models?
The link in the video description is for the Gemma blog entry from Feb 2024 and not the Llama 3 blog entry; can you please also link the Llama 3 RAG blog entry? Thanks for doing this tutorial and all your other interesting videos.
Very cool tool! Are the available test prompts just a few hardcoded ones, or is the repository available to download so you can add and configure your own prompts?
Bro, I just discovered this thing (Transformers and Hugging Face). Now I wonder: when we use a model or a pipeline, we're using their model, right? Does it require the internet, or does it work like an API? If so, could we use the model even on low-end hardware (like running an object-detection task with a huge model on a Raspberry Pi), since it would run over the internet rather than locally?
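To the question above: a Hugging Face pipeline downloads the weights once from the Hub, and all inference then runs locally on your own hardware, so a huge model will not fit on a Raspberry Pi. A minimal sketch; the model name and image path are just small examples:

```python
# Sketch: a Hugging Face pipeline downloads weights once, then runs fully locally.
from transformers import pipeline

# First call downloads the checkpoint to the local cache (~/.cache/huggingface);
# later runs load from disk, and inference itself never touches the network.
detector = pipeline("object-detection", model="facebook/detr-resnet-50")
print(detector("street.jpg"))  # hypothetical local image file

# To confirm it works offline after the first download, set TRANSFORMERS_OFFLINE=1
# in the environment before running, or pass local_files_only=True when loading.
```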
Thank you for the great video! I got an error that says "No text files found in input" even though my input directory clearly does contain a *.txt file. Do you know what the problem could be?
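A quick way to check whether the tool is looking where you think it is; uppercase extensions like `.TXT` and a wrong working directory are the usual culprits, and the directory name here is a guess:

```python
# Quick check that the input directory really contains *.txt files
# as seen from the current working directory (paths here are guesses).
from pathlib import Path

input_dir = Path("input")
print("cwd:", Path.cwd())
print("exists:", input_dir.exists())
print("txt files:", list(input_dir.glob("*.txt")))
print("all files:", list(input_dir.iterdir()) if input_dir.exists() else "n/a")
# A .TXT extension, a nested subfolder, or running from another directory
# would all produce "No text files found in input".
```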
I have no problem installing it. It runs easily, but I would like a shortcut to start it quickly so I don't have to reinstall it every time I need to use it.
What a waste of time. Why did you not just show how to access it publicly instead of rambling about how to use LM Studio? There are plenty of videos around showing how to use the software; I came to watch this one based on your title.
The unhelpful assistant was very unhelpful! Shrugging and remaining silent instead of answering the question: impressively true to form. If I had to guess, I'd say it wasn't the manipulation of the hyperparameters but rather the removal of the word "unhelpful" from the prompt that improved the output here.
Thanks. A good next tutorial would be using Ollama on SageMaker and exposing the model calls to an EC2 instance, say with a Python web app that end users use: the EC2 instance serves the app, while the SageMaker-powered machine runs your machine learning models on its GPUs.
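Roughly the shape that setup could take: a small web app on the EC2 instance forwarding prompts to Ollama's REST API on the GPU machine. A sketch only; the host address is a placeholder, and auth and error handling are omitted:

```python
# Sketch: a FastAPI app on EC2 that proxies prompts to an Ollama server
# running on a GPU-backed machine. The OLLAMA_HOST address is a placeholder.
import requests
from fastapi import FastAPI
from pydantic import BaseModel

OLLAMA_HOST = "http://10.0.0.5:11434"  # hypothetical private IP of the GPU box

app = FastAPI()

class Prompt(BaseModel):
    text: str

@app.post("/ask")
def ask(prompt: Prompt):
    # Ollama's generate endpoint; stream=False returns a single JSON response.
    r = requests.post(
        f"{OLLAMA_HOST}/api/generate",
        json={"model": "llama3.1", "prompt": prompt.text, "stream": False},
        timeout=120,
    )
    return {"answer": r.json()["response"]}
```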
Cool model. Instead of trying your method, I looked for the same models in LM Studio. I was able to successfully download several different quantized versions, but when I tried to load them in LM Studio, I got the following error message: "llama.cpp error: 'done_getting_tensors: wrong number of tensors; expected 292, got 291'". I verified that I have the latest version and that other models are working fine. If you have any thoughts, please let me know. Thank you so much for your videos. I tried the other quantized version listed on the Hugging Face page, which did work but was definitely censored per your example questions ("I can't help you with that").
@fahdmirza The official Meta page actually recommends downloading directly to a local PC without Ollama or Hugging Face. I downloaded the 8B, and there is no clear way to run it; the only option I've seen working is Hugging Face. What do you suggest? I don't want the 15 GB 8B download to go to waste. Can I still run it locally?
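For reference, the raw checkpoint from Meta's direct download (consolidated `.pth` files) isn't directly loadable by most tooling; one common route is converting it to Hugging Face format with the `convert_llama_weights_to_hf.py` script that ships with transformers, then loading the converted directory locally. A sketch under that assumption; the paths are placeholders:

```python
# Sketch: loading Llama weights from a local directory with transformers,
# assuming the raw Meta checkpoint was first converted to HF format
# (transformers ships a convert_llama_weights_to_hf.py script for this).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "./llama-3.1-8b-hf"  # hypothetical path to the converted weights

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```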
llama.cpp has already fixed it; just update your LM Studio to get the fix, and then it should run without any problem. Ollama still throws the same error to date, expected to be fixed in the coming release. Fingers crossed.
Not to mention it runs locally on your computer. Unless you host your website on your own machine, you'd have to install Ollama and a chat interface on the web server where you host your site, or use a paid online chat service.
Please, if you sell courses I would buy them, or a Discord membership for help with code. The 70B Ollama model has huge requirements, and the 405B model even more so to run locally. You're a DevOps engineer; I want to learn that way of running the models and modifying them. Thanks.
Hi, very nice video!! Can you help me with an issue? I am using "lmdeploy serve api_server internlm/internlm-7b" to launch my FastAPI server for a VL model. It is working fine, but the VL endpoint of the API keeps accumulating memory, which eventually crashes my server. How can we stop the memory from accumulating?
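If the growth is coming from the KV cache, lmdeploy exposes a knob for how much GPU memory the cache may claim (the CLI equivalent is the --cache-max-entry-count option on api_server). A sketch of the Python pipeline equivalent; the 0.4 value is illustrative, and whether this addresses your particular leak is an assumption:

```python
# Sketch: capping lmdeploy's KV-cache memory via the Python pipeline API.
# cache_max_entry_count is the fraction of free GPU memory the cache may use
# (0.4 here is an illustrative value, not a recommendation).
from lmdeploy import pipeline, TurbomindEngineConfig

pipe = pipeline(
    "internlm/internlm-7b",
    backend_config=TurbomindEngineConfig(cache_max_entry_count=0.4),
)
print(pipe("Summarize this request in one sentence."))
```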