Another approach to storing data locally: if you clone the GitHub repo, you can cd into chroma and run `docker-compose up -d --build`. This builds the image and starts a container with a volume, so the data persists even if you delete the container and image.
This is great, but for performance I'd take a look at using named volumes over host volumes (bind mounts). There are pros and cons to the different types of volumes used with Docker, but letting Docker manage its own volumes makes it harder for you to screw things up.
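To make the named-volume suggestion concrete, here is a minimal sketch that builds the `docker run` arguments with a named volume instead of a host path. The volume name `chroma-data` and the helper name are hypothetical; the image name, port, and container path `/chromadb/chroma` are taken from the `docker run` command quoted elsewhere in this thread and may differ for other image versions.

```python
def chroma_run_command(volume="chroma-data", host_port=8000):
    """Build a `docker run` argv that stores Chroma's data in a named volume.

    `volume` is a hypothetical name; the container path comes from the
    command quoted in this thread, not from the official docs.
    """
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:8000",
        # A bare name (no leading slash) on the left of the colon makes
        # Docker treat it as a named volume rather than a host directory.
        "-v", f"{volume}:/chromadb/chroma",
        "chromadb/chroma",
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell.
    print(" ".join(chroma_run_command()))
```

Docker creates the named volume on first use if it does not exist yet, so there is no host folder to create or set permissions on.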
I have a vector DB with embeddings and docs that I stored using the commands below: `from langchain.vectorstores import Chroma` and `db = Chroma.from_documents(docs, embeddings)`. I persisted it into a folder in Google Colab, but in Colab, when the runtime ends, everything is lost. I want to keep my vectorized DB permanently so that I can retrieve data anytime I want. How do I do that?
@aayushchaurasia4727 hi, could you please help? I am trying to connect to ChromaDB like this: `vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)`, but when the runtime ends everything is still lost. In a new session, `vectordb.get()` gives me only `{'ids': [], 'embeddings': None, 'metadatas': [], 'documents': [], 'uris': None, 'data': None}`.
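One possible explanation, offered as an assumption rather than a diagnosis: the Colab VM's local disk is wiped when the runtime ends, so a `persist_directory` on that disk disappears with it, and a `from_documents` call made without a `persist_directory` may never have written anything to disk at all. A rough sketch of one workaround is to copy the persist folder to storage that outlives the runtime, such as a mounted Google Drive. The function names and the Drive path in the comments are hypothetical.

```python
import shutil
from pathlib import Path


def backup_persist_dir(persist_directory, backup_root):
    """Copy the Chroma persist folder into durable storage and return the copy's path."""
    src = Path(persist_directory)
    dst = Path(backup_root) / src.name
    # dirs_exist_ok lets repeated backups overwrite an earlier copy (Python 3.8+).
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


def restore_persist_dir(backup_dir, persist_directory):
    """Copy a backed-up folder back to local disk in a fresh session."""
    shutil.copytree(backup_dir, persist_directory, dirs_exist_ok=True)
    return Path(persist_directory)


# Hypothetical usage in Colab, after mounting Drive with
# `from google.colab import drive; drive.mount('/content/drive')`:
#   backup_persist_dir("/content/chroma_db", "/content/drive/MyDrive/chroma_backups")
# and in a new session, before constructing Chroma(persist_directory=...):
#   restore_persist_dir("/content/drive/MyDrive/chroma_backups/chroma_db", "/content/chroma_db")
```

Pointing `persist_directory` straight at a folder under the mounted Drive may also work and avoids the copy step, at the cost of slower reads and writes.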
I tried running your command to create the local storage on my Mac and got an error saying I needed an argument to run. I changed it to `docker run -p 8000:8000 -v /Users/jim/Desktop/chroma/:/chromadb/chroma chromadb/chroma` and it worked. Do you have an extra / in your command, or did I just type it wrong?
I think it may have been a typo; it is in a CloudFormation template and has been working since I made the video. It's usually that trailing slash before the colon that gets dropped. Regardless, glad it worked!
@jim02377 if you need to manually create it, then it's for sure a permission issue! Make sure the Docker user has permission to write to the storage folder.
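A quick way to rule the permission problem in or out before starting the container is sketched below. It is only a rough check under one assumption: it tests the user running the script, and the process inside the container may run as a different user, so a passing result is a hint rather than a guarantee. The function name is hypothetical.

```python
import os
from pathlib import Path


def check_storage_dir(path):
    """Create the host storage folder if needed and report whether it is writable.

    Checks the current user only; the container's user may differ.
    """
    p = Path(path)
    # Create the folder (and any missing parents) instead of relying on Docker to do it.
    p.mkdir(parents=True, exist_ok=True)
    # Write plus execute permission is needed to create files inside a directory.
    return os.access(p, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Path taken from the docker run command quoted in this thread.
    print(check_storage_dir("/Users/jim/Desktop/chroma"))
```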
Thank you for this wonderful explanation! I am able to run it locally successfully. How do we deploy this on a Kubernetes cluster using persistent volumes?