This was sick. Thank you for so patiently explaining each step. You could have just run a bunch of stuff you pre-wrote in a notebook. Doing it this way instead makes it an accessible entry point for people who might be interested in getting into ML in a more serious way. Very humbled.
I am blown away by your videos and learning every second. You are simply the best out here in this area of computing. I may be starting academic research in computational linguistics regarding semantic change in loanwords. I would love to get in touch with you.
that's really cool to hear, thanks! Sounds like a fascinating topic to research - for getting in touch I usually recommend the discord chat: discord.gg/c5QtDB9RAP Or there is my email in the "About" section of the channel
Thanks for the great video. I am curious as to what kind of performance you get. Obviously the hardware makes a difference, but in general how long does it take to get your results?
What would be outputted if you were to manually select a random point within the vector space? Would it return an incoherent image? Or would it throw an error?
When implementing this I got an error saying the images are on the CPU, so embedding them is not possible. I was embedding the images in my Google Drive using CLIP embeddings. Has anyone reading this comment tried this? Please respond, thanks in advance.
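For what it's worth, errors like this in PyTorch usually mean the model and the input tensors live on different devices. A minimal sketch of the usual fix, with placeholder names (the commented-out `model` calls are hypothetical, not from the video):

```python
import torch

# Pick one device and move BOTH the model and the inputs to it
# before encoding - a device mismatch is what triggers the error.
device = "cuda" if torch.cuda.is_available() else "cpu"

# model = clip_model.to(device)          # hypothetical model handle

images = torch.rand(4, 3, 224, 224)      # dummy batch of preprocessed images
images = images.to(device)               # inputs must be on the same device

# with torch.no_grad():
#     embeddings = model.encode_image(images)
print(images.device)
```

The same `.to(device)` call works in the other direction too: if a downstream step (e.g. numpy conversion) needs CPU tensors, call `.cpu()` on the embeddings first.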
Great video, thank you. Have you ever tried image+text semantic search on an image+text dataset? Is that a good way to interpret the combination of these embeddings? For example (image = 512-dim + text = 512-dim), which is the better way to combine the two embeddings? Can I just concatenate them and search the database with the concatenated vector?
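Two common options for combining the embeddings, sketched below with random placeholder vectors (these are not real CLIP outputs, just illustrative 512-dim arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
img_emb = rng.standard_normal(512)   # placeholder image embedding
txt_emb = rng.standard_normal(512)   # placeholder text embedding

def l2_normalize(v):
    return v / np.linalg.norm(v)

# Option 1: concatenate -> one 1024-dim vector; cosine similarity on it
# behaves like an average of the per-modality similarities.
combined_concat = np.concatenate([l2_normalize(img_emb), l2_normalize(txt_emb)])

# Option 2: (weighted) average -> stays 512-dim, so it remains searchable
# against single-modality vectors in the same index.
alpha = 0.5
combined_avg = l2_normalize(alpha * l2_normalize(img_emb)
                            + (1 - alpha) * l2_normalize(txt_emb))

print(combined_concat.shape, combined_avg.shape)
```

Normalizing each modality before combining matters either way; otherwise whichever embedding has the larger norm dominates the search.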
Thanks for the valuable videos. I have some doubts, kindly reply. 1. Can NER tags be used in semantic search or search engine/information retrieval tasks? Any links would be useful. 2. I have experience using sentence transformers; are OpenAI models heavy, or do they use high-dimensional vectors for similarity search? 3. Can we apply this CLIP approach for mapping text queries to images (like bill images containing text), assisted by OCR results? Thanks in advance.
Hey Venkatesan! 1. You could use NER tags as part of metadata filtering paired with your semantic search - see here www.pinecone.io/learn/vector-search-filtering/ 2. OpenAI models tend to use higher-dimensionality vectors, ranging from (I think) 2048-dimensional to ~10K-dimensional; the out-of-the-box performance of OpenAI models is pretty incredible though 3. I'm not sure about this - vision models often struggle with text, but with a vision transformer I'd imagine this is less of an issue, as the attention mechanism should help the model comprehend the image (and therefore any written text in it) as a whole - this is just speculation though, I'm not very familiar with vision models
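To illustrate point 1, here is a toy sketch of metadata filtering paired with vector search; the records, NER tags, and vectors are all made up for illustration, and a vector database like Pinecone exposes the same idea via a filter argument on its query API:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up records: each has an embedding plus NER tags as metadata.
records = [
    {"id": "a", "vec": rng.standard_normal(8), "ner": ["ORG"]},
    {"id": "b", "vec": rng.standard_normal(8), "ner": ["PERSON"]},
    {"id": "c", "vec": rng.standard_normal(8), "ner": ["ORG", "GPE"]},
]
query = rng.standard_normal(8)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Filter on the metadata first, then rank the survivors by similarity.
candidates = [r for r in records if "ORG" in r["ner"]]
ranked = sorted(candidates, key=lambda r: cosine(query, r["vec"]), reverse=True)
print([r["id"] for r in ranked])
```

In a real vector database the filter is applied inside the index rather than in Python, but the query shape (filter + vector + top-k) is the same.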