
How vector search and semantic ranking improve your GPT prompts 

Microsoft Mechanics
345K subscribers
21K views
Published: Sep 28, 2024

Comments: 15
@CollaborationSimplified · 11 months ago
These are great sessions!! It really does help to better understand what's happening under the hood - well done Pablo, Jeremy and production team!
@acodersjourney · 11 months ago
Your videos make software dev more accessible.
@Dineshkumar-wj8so · 18 days ago
Amazing explanation!!! A must-watch video before grinding through the documentation.
@MSFTMechanics · 14 days ago
Glad you liked it
@rjarpa · 4 months ago
Great video, now it's easy to understand the solution stack.
@timroberts_usa · 11 months ago
Thank you for this clarification, much appreciated.
@TheUsamawahabkhan · 11 months ago
Love it. I'd like to see Llama on Azure with Cognitive Search, and can we also plug in an external vector database with Cognitive Search?
@hughesadam87 · 9 months ago
Are these AI tools available in Azure govcloud or just commercial?
@uploaderfourteen · 11 months ago
Jeremy - Great to see your work on this advancing so well! One of the issues still outstanding with vector- or keyword-based retrieval is that, by only retrieving chunks, you aren't providing the model with the deeper 'train of thought' or 'line of reasoning' that characterises your data source as a whole (the semantic horizon is limited to the chunk size). As a consequence, it seems that you can't get the model to reason over the entire data source. For example, let's imagine that your data source was Moby Dick (and let's pretend this was outside the training data)... neither vector nor keyword search would allow you to ask "what is the moral of the story?", as this requires developing a meta-narrative concept over all possible chunks. The only way current language models can do this is to somehow fit the whole novel in context - but even then there are issues with how attention is dispersed over the text. In time it would be great to see whether Microsoft Mechanics can innovate around this problem somehow, as being able to reason over the full, non-chunked data source would unlock much more intelligent and useful insights.
@fallinginthed33p · 11 months ago
Maybe there could be multiple passes to combine different vector results into one large query that attempts to answer the user's question. That context window limit is a real problem. Human brains remember both tiny details selectively and the overall gist of a document.
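A rough sketch of the multi-pass idea suggested above: retrieve more chunks than fit in one prompt, condense each batch against the question, then answer over the combined summaries. The `vector_search` and `llm` callables are hypothetical placeholders standing in for any retriever and any chat model, not a specific Azure Cognitive Search or OpenAI API.

```python
def summarize_batch(llm, chunks, question):
    # First pass (map): compress one batch of retrieved chunks into a short
    # summary that keeps only the points relevant to the question.
    text = "\n\n".join(chunks)
    return llm(f"Summarize the following passages, keeping only points "
               f"relevant to the question '{question}':\n\n{text}")

def multi_pass_answer(llm, vector_search, question, batch_size=5, num_batches=4):
    # Retrieve more chunks than a single prompt could hold, split into batches.
    chunks = vector_search(question, top_k=batch_size * num_batches)
    batches = [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]
    # Condense each batch separately, then answer over the summaries (reduce).
    summaries = [summarize_batch(llm, batch, question) for batch in batches]
    notes = "\n\n".join(summaries)
    return llm(f"Using only the notes below, answer: {question}\n\n{notes}")
```

This trades extra LLM calls for a wider effective context, but it still can't recover reasoning that never made it into any retrieved chunk.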
@uploaderfourteen · 11 months ago
@@fallinginthed33p Agreed! I'd be interested to see how well combining vector results works. Alternatively, we know that LLMs can determine the 'gist' of a document if it's in their training data. Based on that observation, I'd like to see (a) some deep research into exactly how the model extracts that 'gist' from its training set (I'm not sure this is fully understood yet), (b) that process decomposed into its fundamental steps, and then (c) an attempt to replicate that process through a kind of pseudo-training. My hunch is that there is, somewhere, a relatively easy solution to this... the human brain seems to nail it very easily even with very little training data, so there must be a trick we're missing with respect to LLMs. I can skim-read a small book in a very short time (barely taking in the details) and then make a fairly accurate overall appraisal of its content, purpose, key message, etc... LLMs should in theory be able to outclass this through some fairly straightforward mechanism as yet not understood.
@fallinginthed33p · 11 months ago
@@uploaderfourteen I think in a nutshell, humans are doing both training and inference every time we read. Our context window includes the current document and past documents, and each pass updates the past documents store with new data and weights. LLMs can't do that yet: each inference run is a blank slate that depends heavily on trained weights, but to update those weights through training requires a huge amount of computing power.
@pupthelovemonkey · 10 months ago
@@fallinginthed33p Do the re-ranking steps and human feedback not feed back into the model to update its weights? For example, a conversation on Bing Chat where you successfully drill down into a complex answer through a bit of back and forth, like a coding problem where Bing Chat was giving you a solution with a small error or punctuation mistake.
@fallinginthed33p · 10 months ago
@@pupthelovemonkey It might. It's known that OpenAI uses the chat interactions on its web interface to train its models; I don't know about Microsoft, though. You can already do something similar using LoRA techniques on open-source models. Training doesn't happen immediately, unlike with human brains: you need updated weights or a training dataset, and then spend hours or days running a training job.
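For readers curious what the LoRA approach mentioned above looks like, here is a minimal setup sketch using the Hugging Face `peft` library; the base model name and hyperparameters are illustrative assumptions, not something shown in the video.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # assumed open-weight base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small low-rank adapter matrices alongside frozen base weights,
# which is why it is far cheaper than full fine-tuning, though still a batch
# job of hours rather than an instant update after each conversation.
lora = LoraConfig(
    r=8,                                   # adapter rank
    lora_alpha=16,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of weights train
# The adapted `model` would then go into a normal training loop or Trainer.
```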