Chapters
00:00:00 - Introduction
00:00:18 - Overview of the Azure OpenAI service
00:01:23 - Applying ChatGPT to enterprise-grade applications on the Azure service
00:02:29 - Retrieval Augmented Generation
00:03:06 - Private Knowledge
00:03:32 - Using ChatGPT in an App
00:04:25 - Asking Questions in the App
00:05:49 - Exposing Details of Conversation Turns
00:06:31 - Injecting fragments of documents
00:06:46 - Different approaches for generating responses
00:08:14 - Adapting style of response
00:09:41 - How Information Protection Works
00:10:02 - Demonstration of Document-Level Granular Access Control
00:11:00 - Adding New Information into Search
00:11:20 - Running Scripts to Add New Information
00:12:04 - Code Behind Sample App
00:12:50 - Overview of ChatGPT
00:13:31 - Using Azure OpenAI Studio Playground
00:14:30 - Building Your Own Enterprise-grade ChatGPT-enabled App
This is pure gold. Would it not be reasonable to expect that, in the next few years, every major MSFT cloud storage and development tool like Azure SQL, SharePoint, Dataverse, Power Apps, and Power BI will offer this feature automatically?
I have been playing around with this solution and it's amazing! Nice work from Pablo and the rest of the team! One thing that I still don't get is when we should use Cognitive Search to index the content for later text-based retrieval, versus using embeddings to capture the semantics of each document, storing them in a vector store, and later searching by embedding similarity (e.g., cosine similarity).
Thanks for watching and checking out the demo. It is correct that vector stores can often be useful for this, and it is an active area of research for us. However, from what we have seen, although embeddings (used for vector search) are generally quite good at recalling candidate content, they are not necessarily as good at relevancy. The research we have seen suggests that a hybrid approach (vector search combined with traditional linguistic search such as BM25) generally provides the best results. In this demo, you might have noticed that we leverage our semantic search capability, which first uses linguistic search (BM25) to find good candidate content (L1). Then, as a second stage (L2), this content is automatically passed to an ML model (which, by the way, is from the same family of models that powers Bing.com) to help re-rank these results. Hopefully you will find as you test this demo that it performs quite well. The other advantage here is that you do not have to do your own vectorization of content, which can be both time-consuming and expensive. However, as mentioned earlier, we are continuing to research how vectorization can play a part here.
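The two-stage flow described in that reply (an L1 keyword pass to recall candidates, then an L2 re-ranking pass) can be sketched in plain Python. This is a toy illustration, not the service's implementation: the BM25 below is a minimal version, and the word-overlap re-ranker merely stands in for the semantic re-ranking model.

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Stage 1 (L1): classic BM25 keyword scoring over whitespace-tokenized docs."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = [0.0] * n
    for term in set(query.lower().split()):
        df = sum(1 for toks in tokenized if term in toks)
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for i, toks in enumerate(tokenized):
            tf = toks.count(term)
            if tf:
                scores[i] += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avgdl))
    return scores

def rerank(query, candidates):
    """Stage 2 (L2) stand-in: re-order candidates by word overlap with the query.
    The real service uses an ML re-ranking model here."""
    q = set(query.lower().split())
    return sorted(candidates, key=lambda d: len(q & set(d.lower().split())), reverse=True)

docs = [
    "Employees may roll over unused vacation days into the next year.",
    "The cafeteria menu changes weekly.",
    "The vacation policy requires manager approval for all requests.",
]
query = "vacation rollover policy"
scores = bm25_scores(query, docs)
# L1: keep the top-2 BM25 candidates, then L2: re-rank them
candidates = [docs[i] for i in sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:2]]
top = rerank(query, candidates)
print(top[0])
```

The cafeteria document is filtered out in stage 1, and the two vacation documents are ordered by query overlap in stage 2.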
@RichardsonNascimento I am a complete novice when it comes to these development issues. I am looking for someone to create such a chatbot for my website which uses my own organisation's data / information. Can you maybe help me with this?
Honestly, the more I hear about this tech, the more I want to use it for worldbuilding and lore for games and storytelling because wooooow that seems like a good way to prevent ever making another characterization flub or timeline mistake ever again.
Great application! However, in my experience you would not be able to rely on the current generation of models to avoid flubs or continuity errors. Have a play - see what you find - but I have found that while the response always makes grammatical sense, it doesn't always make logical sense. All it does is estimate a plausible set of words to complete the meaning of your prompt. Most of the time this makes logical sense, but there's nothing forcing it to. So while it would generate a plausible, immersive world which mostly worked, I'm sure every now and again you would still get characters coming back from the dead or teleporting from one place to another or whatever...
@MSFTMechanics I was wondering how you used that giant prompt, passing all the history to the Completion API. I thought there is a limit on the number of tokens the Completion API can digest?
How do you ensure the Cognitive Search results don't exceed the 4096-token limit for ChatGPT? And if they do exceed it (entirely possible with a large amount of corporate data), how do you chunk it for ChatGPT?
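One common pattern for the chunking this question raises (a general RAG practice, not necessarily this demo's exact approach) is to split documents into overlapping chunks at indexing time and then only pass as many retrieved chunks as fit the prompt budget. A minimal sketch, using whitespace word counts as a rough stand-in for tokens (a real app would count actual tokens with a tokenizer such as tiktoken):

```python
def chunk_text(text, max_tokens=200, overlap=20):
    """Split text into overlapping chunks. Word count stands in for tokens here."""
    words = text.split()
    chunks = []
    step = max_tokens - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

def fit_to_budget(ranked_chunks, budget_tokens=3000):
    """Keep the highest-ranked chunks until the prompt budget is spent."""
    kept, used = [], 0
    for c in ranked_chunks:
        cost = len(c.split())
        if used + cost > budget_tokens:
            break
        kept.append(c)
        used += cost
    return kept

doc = " ".join(f"word{i}" for i in range(500))
chunks = chunk_text(doc, max_tokens=200, overlap=20)
print(len(chunks), [len(c.split()) for c in chunks])
print(len(fit_to_budget(chunks, budget_tokens=450)))
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both sides, and the budget cap leaves room in the context window for the question, instructions, and the model's answer.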
It looks really interesting. What about using it to gather insights about structured data? Say, for a set of headlines: what is the top-performing headline (based upon summary data), what the CTA is, and how far above or below a benchmark it performs? Basically gathering insights, through a guided process, from structured data?
Thanks for this great video, really exciting! I have one question: Are prompts (and thus company information) processed exclusively in Azure OpenAI Service, and NOT through OpenAI's API?
We are trying to build a similar solution to enable conversational q&a but using elastic search for indexing with embeddings. 1. How do you decide on the chunk size before indexing? 2. How different would the retrieved chunks based on cosine similarity be when compared with cognitive search?
OMG been looking for this information for 3 weeks. Thank you. Saw it before on another channel but it was very confusing compared to how this video explained things.
Could you share the script to update the data and explain how it works? Do you have an automatic way to update the application, e.g., by running azd deploy, or something else?
Amazing one! I love it. There are so many nuances in this; it's not just simple retrieve-and-generate. A sudden curiosity: if the LLM has so much reasoning and language understanding, why can't we ask it directly to rank the documents and filter out the unnecessary ones via an in-context learning prompt? Why do we need a separate re-ranker component?
Sorry, I didn't get it. Do I understand correctly that if we want to keep our data private, we need to keep it separate from the model, and only add pieces of information during the response-generation process? If we start to teach the model our data, will it become public?
Great questions. Short answer is no. The search is retrieving additional information to add to the prompt. Information from prompts is not stored in the large language model. Also, there are multiple instances of the model running, and the ones used for the Azure OpenAI Service are not public instances.
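The pattern that reply describes (retrieved text is injected into the prompt for a single inference call, never written into the model) can be sketched as follows. The prompt wording, source names, and citation convention below are illustrative assumptions, not the demo's exact template:

```python
def build_prompt(question, sources,
                 system_hint="Answer ONLY from the sources below. Cite the source name."):
    """Assemble a grounded prompt: retrieved snippets are injected as context
    for this one request; nothing is written back into the model's weights."""
    context = "\n".join(f"[{name}] {text}" for name, text in sources)
    return f"{system_hint}\n\nSources:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical documents returned by the search step for this user
sources = [
    ("benefits.pdf", "Employees receive 20 vacation days per year."),
    ("handbook.pdf", "Unused days expire on December 31."),
]
prompt = build_prompt("How many vacation days do I get?", sources)
print(prompt)
```

Because the private content only ever lives inside the prompt of an individual request, keeping data out of the model reduces the question of privacy to who can retrieve which documents, which is where the search layer's access controls come in.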
Please help!!! What advice do you have for where to store the data used to tune the GPT? We have a complex data set in Azure Storage tables, and are wondering what the best database is: Azure SQL? Access? Azure Blob? Something else?
Great presentation! Now that this information is public knowledge, I need to come up with something more creative when communicating with clients who are interested in LLM's : )
Very interesting! Would it also be possible to make an integration between GPT and SAP or MS Dynamics? I am an SAP FI consultant handling incidents and changes submitted by the finance departments. Would it be possible to make a private model in which GPT can read through the SAP system and give instructions on how to solve certain incidents? For example, if a user gets a certain error when performing a payment run, would GPT be able to analyse where in the system this error is coming from and how to solve it? Not just giving recommendations as it does now, when I anonymize the data and submit it in the public GPT environment. And of course all without sharing any information to the outside world.
Are source citations accurate? Because in Bing they are often wrong: when you click on a citation, you realize the referenced website is not the actual source of the information provided.
Impressive; the one I was looking for for a long while. Can anyone suggest which Azure OpenAI language model I can use to compare two PDF documents to check whether the information is available in both documents or not?
Hi, do you have details on the type of RBAC role required to be able to deploy the demo? I am getting: 'the client does not have the necessary permissions to perform the specified action'. I have Cognitive Services Contributor access.
It looks great!! Thanks. Is there a video explaining the coding step by step from start to end? It would be great for those of us who are starting out with AI and Azure.
We cover that in the video. Your data is not used for training the large language model, it's only part of the prompt for inference as demonstrated in the example.
@MSFTMechanics Noted. Thanks for responding! So if that's the case, can an organization be HIPAA-compliant (with respect to not exposing PII and PHI)? I want to make sure that the in-context learning (or 'RAG') paradigm doesn't expose our customers' data to OpenAI / Azure OpenAI or anyone else. That's probably the biggest blocker to implementing any production-grade app for our team. Thanks in advance for a thorough answer.
The question I would have is this one: is the "private data" actually protected? In a chat, ChatGPT said that I should not share private information with it, because it cannot guarantee that the data "will not be used / made public or something".
Where someone has a long session, how does the Azure OpenAI Service deal with token limits, given it has to pass the whole context, especially where previous responses are long?
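A common way applications handle this (an assumption about typical client-side practice, not something the service does automatically; a request over the limit is simply rejected) is to drop the oldest turns until the conversation fits the budget. A sketch, again using word counts as a rough token proxy in place of a real tokenizer:

```python
def trim_history(turns, system_msg, budget=3000):
    """Drop the oldest conversation turns until everything fits the token budget.
    Word count stands in for tokens; use a real tokenizer in practice."""
    def cost(msgs):
        return sum(len(m["content"].split()) for m in msgs)
    kept = list(turns)
    while kept and cost([system_msg] + kept) > budget:
        kept.pop(0)  # the oldest turn is first in the list
    return [system_msg] + kept

system_msg = {"role": "system", "content": "You are a helpful assistant."}
turns = [{"role": "user", "content": " ".join(["hello"] * 100)} for _ in range(50)]
trimmed = trim_history(turns, system_msg, budget=1000)
print(len(trimmed))
```

The system message is always preserved; only the oldest turns are sacrificed. More sophisticated apps summarize the dropped turns instead of discarding them outright.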
Can anyone help me with robust strategy for handling dependent and independent questions during the conversation, including generating a standalone question to provide additional context for dependent questions? Is the strategy used here to augment the user's latest question with prior conversation history robust for all kinds of scenarios?
You would implement the same access controls and permissions as you would now when implementing Azure Cognitive Search. We demonstrate that in the video. The information used to augment the prompt is retrieved based on the individual's permissions.
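Conceptually, the document-level trimming described in that reply works by storing an ACL on each indexed document and filtering search results by the caller's group memberships; in Azure Cognitive Search this is typically expressed as a filter applied at query time. The field names below are assumptions for illustration, not the demo's schema:

```python
def security_trim(results, user_groups):
    """Keep only documents whose ACL (the hypothetical 'groups' field)
    intersects the caller's group memberships."""
    allowed = set(user_groups)
    return [doc for doc in results if allowed & set(doc["groups"])]

# Hypothetical search results, each carrying its ACL
index = [
    {"id": "doc1", "content": "Org-wide benefits overview", "groups": ["all-employees"]},
    {"id": "doc2", "content": "Executive compensation plan", "groups": ["execs"]},
]
visible = security_trim(index, user_groups=["all-employees", "engineering"])
print([d["id"] for d in visible])
```

Because the trimming happens in the retrieval layer, a restricted document never reaches the prompt at all, so the model cannot leak content the user was not entitled to see.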
Hi! I've been trying to recreate this project on my machine and I'm getting an error I don't quite get. I've found a workaround, but I feel like my workaround is reducing the performance of the assistant. I'm using an Azure OpenAI service based on gpt-35-turbo, and when I try to ask a question using RRR or RDA I'm getting an exception saying that gpt-35-turbo does not support the parameters "logprobs, best_of and echo". I've deactivated them in order to make the project work, but as I've said, it feels like the quality of the responses has diminished. Did anybody else encounter this problem?
We show the manual, on-demand process for updating the search index at 11:27, but normally these types of updates would run on a schedule or based on eventing logic.
You can sign up for it as an individual developer, but you do need an Azure subscription. For a "free to use" option, you can also try Bing Chat if you're looking for alternatives to OpenAI chat.