
Fine-Tuning GPT-3 & ChatGPT Transformers: Using OpenAI Whisper

Lucidate

Fine-tuning GPT-3.
In this video we demonstrate how to use APIs and audio to create prompts and completions that can be used to fine-tune Transformers such as GPT-3. We show how to use a News API with Python to extract news articles and create a dataset for training and fine-tuning models.
We explain what prompt and completion datasets are, and the role they play in natural language processing in fine-tuning transformers. We then discuss using video and audio streams to build data pipelines for fine-tuning and transcribing speech to text using OpenAI's Whisper.
With over 500 hours of YouTube video being uploaded every minute, video is a rich source of detailed information with which to train AI.
The video first includes a step-by-step demonstration of using the News API to create a prompt-and-completion dataset, and then shows how to build pipelines that transcribe speech to text using Whisper.
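As a sketch of that first step, assuming newsapi.org's `/v2/everything` endpoint and a placeholder API key (both hypothetical details not shown in the video itself), articles could be pulled and reshaped into prompt/completion records along these lines:

```python
import json
import urllib.parse
import urllib.request

# Endpoint modelled on newsapi.org's /v2/everything; substitute your own key.
NEWS_API_URL = "https://newsapi.org/v2/everything"

def article_to_record(article):
    """Turn one News API article dict into a prompt/completion training record.
    Here the headline is the prompt and the body text the completion; the
    '###' separator is a common prompt-terminator convention."""
    return {
        "prompt": article["title"].strip() + "\n\n###\n\n",
        "completion": " " + article["content"].strip(),
    }

def fetch_articles(query, api_key):
    """Fetch articles matching `query` (network call, shown for illustration)."""
    params = urllib.parse.urlencode({"q": query, "apiKey": api_key})
    with urllib.request.urlopen(f"{NEWS_API_URL}?{params}") as resp:
        return json.load(resp)["articles"]

if __name__ == "__main__":
    # Guarded so importing this module makes no network calls.
    articles = fetch_articles("monetary policy", api_key="YOUR_KEY")
    with open("training.jsonl", "w") as f:
        for a in articles:
            f.write(json.dumps(article_to_record(a)) + "\n")
```

The record shape (prompt/completion keys, one JSON object per line) is the legacy GPT-3 fine-tuning file format.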
Foundation models like GPT-3 can be 'fine-tuned'. This fine-tuning allows them to learn the vocabulary and semantics of particular disciplines. If you want your AI solution to be familiar with options pricing algorithms, then you can show it lots of options pricing papers. If you'd like your AI to be able to converse and write about monetary policy, then you can expose it to policy papers, books on economic theory, central bank meeting minutes and news conferences.
You train transformer AI models in a particular way. You don't simply feed in a file containing the text of the book or article that you are using to train the AI. You break the text up into a series of "Prompts" and "Completions". Prompts are the beginning of a passage of text and completions are what follow a prompt.
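The splitting step described above can be sketched in plain Python. This is a naive sentence-based splitter; the one-sentence prompt followed by a five-sentence completion window is an arbitrary illustrative choice, not a prescribed format:

```python
import re

def split_into_pairs(text, completion_sentences=5):
    """Naively split running text into prompt/completion pairs:
    one sentence becomes the prompt, the next few the completion."""
    # Crude sentence boundary: whitespace after ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pairs = []
    step = completion_sentences + 1
    for i in range(0, len(sentences) - 1, step):
        prompt = sentences[i]
        completion = " ".join(sentences[i + 1 : i + step])
        if completion:
            pairs.append({"prompt": prompt, "completion": completion})
    return pairs
```

Each resulting dict can then be written out as one JSONL line for fine-tuning.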
In this post we leverage the breadth and depth of content on YouTube. We propose a strategy to use selected videos to train our AI language models, chosen because they are rich in expertise in an area we want to train our language model in. The example used in this video is to train our AI in monetary policy by studying content generated by the Federal Reserve. We devise a strategy to use speech-to-text transcription of YouTube videos, using OpenAI's 'Whisper' to perform this transcription. We then break this text up into 'prompts' and 'completions' to train our model.
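A minimal transcription helper might look like the sketch below. It assumes the `openai-whisper` package is installed (with ffmpeg on the PATH), and the audio file name is a placeholder for content you have downloaded yourself:

```python
def transcribe(audio_path, model_size="base"):
    """Transcribe speech to text with OpenAI Whisper.
    Requires `pip install openai-whisper` and ffmpeg."""
    import whisper  # deferred import so the helper below works without Whisper
    model = whisper.load_model(model_size)
    return model.transcribe(audio_path)["text"]

def clean_transcript(text):
    """Collapse stray whitespace in the raw transcript before it is
    chunked into prompts and completions."""
    return " ".join(text.split())

if __name__ == "__main__":
    # Placeholder file name; respect the source's terms of service.
    print(clean_transcript(transcribe("fomc_press_conference.mp3")))
```

`model.transcribe()` returns a dict whose `"text"` key holds the full transcript; larger model sizes ("small", "medium", "large") trade speed for accuracy.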
Following this approach we can easily train our NLP models to have expertise in any field we choose.
#NLP #Python #FineTuning #AI #MachineLearning #NewsAPI #Whisper #Lucidate #chatgpt #gpt3 #openai
Links:
Attention Mechanism in Transformers: • Attention is all you n... Transformer playlist: • Transformers & NLP
Federal Reserve Description: • Fed Functions: The Thr...
FOMC Feb 2023 Press Conference: • FOMC Press Conference ...
Python code for Transcriber class: github.com/mrs...
=========================================================================
Link to introductory series on Neural networks:
Lucidate website: www.lucidate.c....
RU-vid: www.youtube.co....
Link to intro video on 'Backpropagation':
Lucidate website: www.lucidate.c....
RU-vid: • How neural networks le...
'Attention is all you need' paper - arxiv.org/pdf/...
=========================================================================
Transformers are a type of artificial intelligence (AI) used for natural language processing (NLP) tasks, such as translation and summarisation. They were introduced in 2017 by Google researchers, who sought to address the limitations of recurrent neural networks (RNNs), which had traditionally been used for NLP tasks. RNNs had difficulty parallelizing, and tended to suffer from the vanishing/exploding gradient problem, making it difficult to train them with long input sequences.
Transformers address these limitations by using self-attention, a mechanism which allows the model to selectively choose which parts of the input to pay attention to. This makes the model much easier to parallelize and eliminates the vanishing/exploding gradient problem.
Self-attention works by weighting the importance of different parts of the input, allowing the AI to focus on the most relevant information and better handle input sequences of varying lengths. This is accomplished through three matrices: Query (Q), Key (K) and Value (V). The Query matrix can be interpreted as representing the word for which attention is being calculated, while the Key matrix represents the words to which attention may be paid. The attention scores are the scaled dot products of the Query and Key vectors; after a softmax, the resulting weights are applied to the Value matrix to produce the output.
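The mechanism can be sketched in a few lines of plain Python. This is a toy, unbatched version with no learned projection matrices, just to show the score/softmax/weighted-sum pipeline:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention on lists of vectors:
    scores = Q.K^T / sqrt(d_k); weights = softmax(scores); output = weights.V"""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out
```

A query vector that points in the same direction as one of the keys gets a high score for that key, so the output is pulled toward the corresponding value vector.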
=========================================================================
#ai #deeplearning #chatgpt #gpt3 #neuralnetworks #attention #attentionisallyouneed

Published: 7 Sep 2024

Comments: 31
@Takodachiiii · 1 year ago
I cannot emphasize enough how much your video has helped me, both in my studies and my extracurricular stuff, thank you!
@lucidateAI · 1 year ago
Glad it helped! Thank you for your kind words and support. Have you checked out the other videos in this playlist > ru-vid.com/group/PLaJCKi8Nk1hwaMUYxJMiM3jTB2o58A6WY or the other content on Lucidate's RU-vid channel -> ru-vid.com/show-UClqbtKdqcleUqpH-jcNnc8g . I hope you find them equally helpful. - With thanks, Lucidate.
@rafsanjanisamir4221 · 1 year ago
This is a truly remarkable piece of content, filled with insights that rival those of top-tier media outlets like National Geographic and Discovery Channel.
@lucidateAI · 1 year ago
Humbled that you think so. Touched by your very kind words and feedback. There is more to see here -> ru-vid.com/show-UClqbtKdqcleUqpH-jcNnc8g. Any and all feedback sincerely appreciated. Lucidate.
@Dxeus · 1 year ago
Hi Richard, your videos are amazing. I can't wait to connect with you and tell you about my small pet project that I have recently started. I have a full-time job so I usually work on it on weekends and holidays.
@lucidateAI · 1 year ago
Sounds great!
@lorenzoleongutierrez7927 · 1 year ago
Great video, thanks !
@lucidateAI · 1 year ago
Glad you liked it! Thanks for supporting the channel.
@TylerHodges1988 · 1 year ago
I used the requests library to query the Coinstats API to get real-time crypto prices and then dump that into a conversation history that can be re-fed into a prompt to give historical context.
@lucidateAI · 1 year ago
Tyler, thanks for sharing. Sounds intriguing. Do you have any results you are happy to share? Happy to take this offline at info@lucidate.co.uk if you would prefer.
@jpg9750 · 1 year ago
awesome video, thank you so much
@lucidateAI · 1 year ago
And thank you too for your generous comment. Really glad you found the video useful. Have you had a chance to check out any of the other videos in this series? -> ru-vid.com/group/PLaJCKi8Nk1hwaMUYxJMiM3jTB2o58A6WY
@DavidBrown-tv8fx · 1 year ago
Awesome video. I want to know: if I don't provide a prompt and just give many completions, how does it work?
@lucidateAI · 1 year ago
Hi David, thanks for the comment on the video and for your question. The honest answer is "I'm not sure", as this is not something that I've tried. However, my feeling is that if you try to train OpenAI's GPT-3 model using a file that has only completions and no prompts, the model is unlikely to learn anything useful. This is because GPT-3 relies on a prompt to generate context and structure for its responses. The prompts provide GPT-3 with an initial _context_ and guidance for generating text. Without prompts, the model will lack the necessary context to generate meaningful responses. As a result, the completions provided in the file will likely be treated in isolation, without any meaningful connection to the broader context. It is even possible that if you try to upload a completions-only file for fine-tuning it will be rejected by the fine-tuning API (again, I've not tried this, so it is just speculation on my part). Even if OpenAI did accept the file, the GPT-3 model is designed to generate text in a contextually appropriate way, and without prompts it will have no way of determining the appropriate context for the generated text. This could lead to nonsensical or irrelevant responses (although it can give those in its "normal" course of operation too). Therefore, I'd recommend providing both prompts and completions when fine-tuning GPT-3. The prompts should provide a context and structure for the completions, which can then be used to generate coherent and relevant text - assuming that is what you want. - Thanks for your support of the channel - Lucidate.
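It is easy to catch this before uploading. The legacy GPT-3 fine-tuning format is one JSON object per line with `prompt` and `completion` keys, and a quick home-made validator (a sketch, not OpenAI's own tooling) can flag records missing either field:

```python
import json

def validate_jsonl(path):
    """Return (line_number, problem) tuples for any record in a
    GPT-3-style fine-tuning JSONL file that lacks a non-empty
    prompt or completion."""
    problems = []
    with open(path) as f:
        for n, line in enumerate(f, 1):
            rec = json.loads(line)
            if not rec.get("prompt", "").strip():
                problems.append((n, "empty prompt"))
            if not rec.get("completion", "").strip():
                problems.append((n, "empty completion"))
    return problems
```

An empty return value means every line has both fields populated.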
@carloswong5325 · 1 year ago
😊
@lucidateAI · 1 year ago
Thanks!
@roihirshberg398 · 1 year ago
This technique looks great; however, I don't understand the use case 100%. Ultimately, what can we do with a GPT-3 model that was fine-tuned with 500 hours of monetary content? If you fine-tune a model, you are building on one of OpenAI's base models such as davinci, ada, etc., not on ChatGPT. In other words, you are adding more up-to-date content and new vocabulary, but once trained, the model can no longer work as a chat agent like ChatGPT. Could you provide insight into what you can do with the model you fine-tuned in your video, so we can understand the use case? Thank you!
@lucidateAI · 1 year ago
Thanks Roi, great question. A model trained on 500 hours of monetary policy content would have a more nuanced and detailed understanding of the vocabulary and semantics of inflation, interest rates, central banks etc. than a model trained on a general corpus of text (as the base GPT-3 and, by extension, ChatGPT both are). So a fine-tuned model has more relevant detail. If you are a journalist or a blogger you might also wish to have up-to-date content in your NLP model to reflect recent market changes. In a fast-paced, constantly evolving field like finance that is important, though I concede in other areas perhaps less so. Base models can take a considerable amount of time to train, so there is always a risk that they will be somewhat out of date. An advantage of fine-tuning is that the training data can be captured and the fine-tune run completed in a few hours, making the model more contemporary (though to keep it contemporary you need to continue to capture new data and fine-tune).

While ChatGPT is a powerful tool that can perform a wide variety of natural language processing tasks, it has limitations when it comes to certain complex language-related tasks. Fine-tuning GPT-3, on the other hand, provides a more advanced and nuanced level of language processing, which opens up new possibilities. Here are some examples of tasks that can be accomplished with a fine-tuned version of GPT-3 that are either not possible or not as effective with ChatGPT:

1. Text generation: fine-tuning can produce coherent, grammatically correct text that is more advanced and nuanced than what can be achieved with ChatGPT. This is particularly useful in creative writing, where GPT-3 can be fine-tuned to produce high-quality prose and poetry in a particular style.
2. Language translation: while ChatGPT is capable of zero-shot language translation, fine-tuning GPT-3 can produce more accurate and contextually appropriate translations.
3. Writing assistance: a fine-tuned model can identify grammatical errors and suggest corrections, and can suggest improvements in tone, style and word choice to make writing more compelling and effective.
4. Content creation: it can be trained to generate product descriptions for e-commerce websites, create blog posts or articles for media outlets, and even write scripts for video production.
5. Advanced conversational AI: chatbots that understand more complex queries and provide more nuanced responses. This can be particularly useful in customer service, where such a system can provide personalised responses and solutions to customer queries.
6. AI-assisted research: it can be trained to analyse vast amounts of research data and provide contextually appropriate summaries, or even generate new research questions based on its analysis of existing data.

Overall, while ChatGPT can perform a wide range of natural language processing tasks, fine-tuning GPT-3 provides a more advanced and nuanced level of language processing. If you are happy with the out-of-the-box capabilities of ChatGPT, then there is no need to fine-tune, and you can accomplish a lot with zero-shot techniques. But if you want more nuance, more detail or more contemporary information, then it may be worth considering fine-tuning. Many thanks for your question and your support of the channel. - Lucidate.
@1975nikola · 1 year ago
Very good video, despite the annoying background music :) My question is: you are just randomly splitting the text into prompt and completion. Is that logical? Shouldn't we take more care over it? The completion should be a reaction to the prompt; if we just split randomly between prompt and completion, might it miss the point?
@lucidateAI · 1 year ago
Hi Nikola, thanks for the question. I guess the answer is both 'yes' and 'no'. There are two methods that I have found to be successful. The first is structured: here you take titles and section headings - these may be found by tag in Beautiful Soup, or for a more sophisticated approach you can use XPath in Selenium. In this structured approach the prompts are the headlines/titles/paragraph headings and the completions are the content of the tags that follow. This involves a little more work, but it does follow the structured approach that you advocate. The second is, as you say, random and arbitrary: it just takes a sentence as a prompt and the next 5-10 sentences that follow as the completion. Why does this work? Well, it doesn't really work at all for summarisation, NER or text translation (though frankly neither does the structured method). But this unstructured approach really helps the fine-tuned transformer with text generation. You are simply feeding it _much_ more text in that subject matter (finance, medicine, civil engineering - whatever your chosen field of specialisation is). You are allowing it to update its word embeddings and attention heads to be more attuned to the idioms, vocabulary and jargon of that specific area of expertise, and to recent information. So while this might not help text summarisation or classification, I have found it useful for text generation. And I hope you do too! If you do try it, please let me know how you get on. Many thanks for the question, which allowed me to elaborate on this dual structured/unstructured strategy. I appreciate the contribution to the channel and I hope that you and anyone else reading this comment benefit from my response. Appreciated - Lucidate.
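The structured heading-as-prompt approach can be sketched with the standard library alone. This is a stand-in for the Beautiful Soup/XPath versions mentioned in the reply; the tag names and the one-paragraph-per-heading rule are illustrative choices:

```python
from html.parser import HTMLParser

class HeadingPairs(HTMLParser):
    """Collect heading/following-paragraph pairs from HTML, so section
    headings become prompts and their body text completions."""
    def __init__(self):
        super().__init__()
        self.pairs = []
        self._tag = None      # tag currently being captured
        self._heading = None  # most recent unconsumed heading
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "p"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._buf).strip()
        if tag in ("h1", "h2", "h3"):
            self._heading = text
        elif tag == "p" and self._heading and text:
            # Pair each heading with only its first paragraph (a design choice).
            self.pairs.append({"prompt": self._heading, "completion": text})
            self._heading = None
        self._tag = None

def heading_pairs(html):
    parser = HeadingPairs()
    parser.feed(html)
    return parser.pairs
```

Swapping in Beautiful Soup or Selenium would follow the same shape: locate the heading nodes, then collect the sibling text that follows each one.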
@1975nikola · 1 year ago
@@lucidateAI Great reply, I appreciate it. I have started playing around, with a plan to fine-tune the model for basketball. It would be fun if it could make comments before or during games, based on live stats I can feed to it: results, threes, blocks, good defense etc. Until I saw your video, I thought I would need to go the structured way - give it some stats as a pattern and then create a few possible completions for it. That would probably take a lot of time. With your unstructured approach, I could do as you suggested with Whisper (great idea): take basketball highlights from YouTube and other channels and feed the commentators' words into it, so it learns the jargon and vocabulary. Not sure how the combination of the two would work though. With some time, I could try both and report back on the results; it would be a good experiment, but, you know, there is never enough time.
@lucidateAI · 1 year ago
Thank you for your kind words and I'm glad to hear that you found my previous reply helpful. Your plan to fine-tune the model for basketball sounds very interesting, and it's exciting to think about the potential applications for such a model. (I must confess, it is not something that I would have thought of doing, but it does sound like a novel and imaginative use of the technology - I’m sure Sam Altman will be intrigued!) Combining the unstructured approach of learning from basketball highlights with the structured approach of feeding live stats during games could potentially lead to a more comprehensive and accurate model. It would certainly be an interesting experiment to try, although I understand that time is always a limiting factor. If you do decide to try both approaches and report back on the results, I would be very interested to hear about it. Good luck with your project, I’d love to hear how you get on. Lucidate.
@joshuacunningham7912 · 1 year ago
I quite enjoy the background music, one's personal preferences notwithstanding.
@lucidateAI · 1 year ago
Thanks!
@Anonym-ny6wz · 1 year ago
Wouldn't a vector database of the data, used for semantic search with Ada to find context to add to the prompt, be cheaper for a researcher, for example? I would guess you could input data into the system much faster, and you have much more control over the formatting of the data you want answers on. I'm still very confused about all of this, so take what I say with a teaspoon of salt please 😂
@Anonym-ny6wz · 1 year ago
Well, I found a few use cases in another comment of yours… trying to figure out when semantic databases and prompt engineering are better than fine-tuning and vice versa, and I'm not sure I'm mentally qualified to find the answer 😂
@lucidateAI · 1 year ago
Well, that's a very interesting and astute observation! The use of a vector database of data for semantic search, in conjunction with a tool like Ada to search for context for additional information to put in the prompt, could indeed be a more cost-effective and efficient solution for researchers. TBH it is not something that I have tried, so I can't comment on its efficacy or efficiency, but I can't see any reason why you shouldn't try it and compare the different approaches. By inputting data into the system more quickly and with greater control over the formatting of the data, researchers may be able to streamline their workflows and gain more efficient access to the information they need. The use of semantic search can also help to identify relevant information and connections between different pieces of data, which can provide valuable insights for researchers. As with any approach, there are potential limitations and challenges that must be carefully considered and addressed. However, by continuing to explore new ideas and approaches, and by engaging in ongoing experimentation and analysis, we can continue to push the boundaries of what's possible in the field of natural language processing, and unlock new discoveries and insights that will help us to better understand the world around us. I appreciate your willingness to share your thoughts and ideas, if you do pursue this I'd be keen to hear how you get on. Thanks very much for your support of and contribution to the channel. Greatly appreciated - Lucidate.
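The retrieval idea discussed in this thread can be sketched with hand-written toy vectors standing in for real embeddings (in practice they would come from an embeddings model such as Ada, and the top-ranked texts would be pasted into the prompt as context):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus, k=2):
    """Rank documents by cosine similarity of their embedding to the
    query embedding and return the top-k texts for prompt context."""
    ranked = sorted(corpus, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]
```

This retrieval-then-prompt pattern needs no training run at all, which is the cost advantage the commenter is pointing at; fine-tuning instead bakes the domain knowledge into the model's weights.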
@raselhossain8855 · 1 year ago
can you share your presentation ?
@lucidateAI · 1 year ago
Hi Rasel, sorry - but I don't quite understand your question? Can you be more specific? I'm happy to share anything that I can. Thanks for your request and your support of the channel. - Lucidate.
@thecutestcat897 · 1 year ago
Amazing video, thanks a lot!
@lucidateAI · 1 year ago
Glad you liked it!