Need a heavy GPU machine? Check out this video on setting up an AWS EC2 GPU instance. If you like this one, check out my video on setting up a full RAG API with Llama3, Ollama, Langchain and ChromaDB - ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-7VAs22LC7WE.html
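For anyone who wants a quick taste of what that RAG video covers, here is a minimal sketch of that stack, assuming an Ollama server is already running locally on its default port with llama3 pulled, and that the langchain, langchain-community, and chromadb packages are installed. The sample text and model choice are placeholders, not the exact setup from the video.

```python
# Minimal RAG sketch: Ollama (Llama3) + ChromaDB via LangChain.
# Assumes: `pip install langchain langchain-community chromadb` and a local
# Ollama server (`ollama pull llama3` already done).
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Embed a toy document set into Chroma (placeholder content).
embeddings = OllamaEmbeddings(model="llama3")
vectorstore = Chroma.from_texts(
    ["Ollama serves local LLMs over an HTTP API on port 11434."],
    embedding=embeddings,
)

# Wire the retriever and the LLM into a question-answering chain.
llm = Ollama(model="llama3")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.invoke({"query": "What port does Ollama listen on?"}))
```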
OMG!!! I freaking love you, I've been struggling with Llama deployment on AWS and you've made it crystal clear. I'll do anything to support your channel. YOU'RE THE BEST!!!
Thanks a lot for the video!! Question: is it possible to start the instance only when a request hits the server? It could be useful for limiting costs. I think it is feasible with Kubernetes and Docker, but I would enjoy a video about it :)! Thanks again, very good video.
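Not the author, but one low-tech way to do this without Kubernetes is to keep the instance stopped and start it on demand from a small gateway (e.g. a Lambda fronting your requests). A minimal boto3 sketch, where the instance ID and region are placeholders; a real setup would also stop the instance again after an idle period.

```python
# Sketch: start a stopped EC2 instance on demand and wait until it is running.
# INSTANCE_ID and region are placeholders; in practice this would sit behind
# an API Gateway/Lambda that fronts your Ollama requests.
import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder
ec2 = boto3.client("ec2", region_name="us-east-1")

def ensure_running(instance_id: str) -> str:
    desc = ec2.describe_instances(InstanceIds=[instance_id])
    state = desc["Reservations"][0]["Instances"][0]["State"]["Name"]
    if state != "running":
        ec2.start_instances(InstanceIds=[instance_id])
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    # Return the (possibly new) public IP to forward the request to.
    desc = ec2.describe_instances(InstanceIds=[instance_id])
    return desc["Reservations"][0]["Instances"][0].get("PublicIpAddress", "")

print(ensure_running(INSTANCE_ID))
```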
The video was awesome and pretty helpful, but can you cover the security point of view too? Anyone with the IP and port number can access it, so how can we avoid that?
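Not the author, but the usual first step is to lock the EC2 security group down so only your own IP (or a reverse proxy that handles auth) can reach the Ollama port. A minimal boto3 sketch, assuming Ollama's default port 11434; the group ID and CIDR below are placeholders.

```python
# Sketch: allow inbound access to the Ollama port (default 11434) from one
# CIDR only. Group ID and CIDR are placeholders; you would also want to
# revoke any existing 0.0.0.0/0 rule on that port.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # placeholder security group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 11434,
        "ToPort": 11434,
        "IpRanges": [{
            "CidrIp": "203.0.113.7/32",  # your workstation's IP only
            "Description": "allow my workstation",
        }],
    }],
)
```

For anything multi-user, putting Nginx with TLS and an API key in front of Ollama is a sturdier option than IP allow-listing alone.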
@fastandsimpledevelopment If I understand correctly, you can select the base Ubuntu 22.04 image and install everything yourself: NVIDIA driver, CUDA driver, TensorFlow, Python, etc.?
Yes, if the OS has support and you have an AMD or NVIDIA GPU installed, the latest version auto-detects it. You can also configure Ollama to NOT use the GPU, but by default it auto-detects.
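For reference, besides the config-file route, you can also ask Ollama per request not to offload any layers to the GPU via the num_gpu option of its REST API. A small sketch, assuming a local Ollama on the default port with llama3 pulled:

```python
# Sketch: force CPU-only inference for one request by setting num_gpu to 0
# (num_gpu = number of layers offloaded to the GPU) in Ollama's API options.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Say hello in one sentence.",
        "stream": False,
        "options": {"num_gpu": 0},  # 0 layers on GPU -> CPU only
    },
    timeout=300,
)
print(resp.json()["response"])
```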
By itself it is not; you need to add a front end like Nginx and run several Ollama servers behind it. That is the only way I am aware of today, though Ollama gets new updates all the time, so keep track of them.
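To make the "Nginx in front of several Ollama servers" idea concrete without writing out a proxy config: below is a minimal client-side round-robin sketch over multiple Ollama backends, which is the same job an Nginx upstream block does at the proxy layer. The backend addresses are placeholders.

```python
# Sketch: round-robin requests across several Ollama servers, the same idea
# an Nginx upstream block implements at the proxy layer. Hosts are placeholders.
import itertools
import requests

BACKENDS = itertools.cycle([
    "http://10.0.0.11:11434",  # placeholder Ollama server 1
    "http://10.0.0.12:11434",  # placeholder Ollama server 2
])

def generate(prompt: str, model: str = "llama3") -> str:
    base = next(BACKENDS)  # pick the next backend in rotation
    resp = requests.post(
        f"{base}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(generate("Why is the sky blue?"))
```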