
Fine Tune Transformers Model like BERT on Custom Dataset. 

Pradip Nichite
32K subscribers
43K views

Learn how to fine-tune BERT on a custom dataset.
In this video, I have explained how to fine-tune transformer models like BERT on a custom dataset: how to use the Hugging Face Trainer API, save and load the fine-tuned model, evaluate the model on a validation dataset, and make a model prediction on a single example.
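For readers without the notebook open, here is a minimal end-to-end sketch of the steps described above, using the public bert-base-uncased checkpoint and a tiny placeholder dataset (the video's own dataset, column names, and paths such as "custombert" are placeholders and may differ):

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# Tiny placeholder dataset; the video fine-tunes on its own custom data.
raw = Dataset.from_dict({"text": ["great product", "terrible service"], "label": [1, 0]})

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Tokenize once up front; the Trainer picks up the "label" column automatically.
tokenized = raw.map(lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
                    batched=True)

args = TrainingArguments(output_dir="./results", num_train_epochs=1,
                         per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args, train_dataset=tokenized)
trainer.train()

# Save the fine-tuned model and tokenizer, then predict on a single example.
trainer.save_model("custombert")
tokenizer.save_pretrained("custombert")
inputs = tokenizer("This was surprisingly good.", return_tensors="pt").to(model.device)
print(model(**inputs).logits.argmax(dim=-1))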
Code: github.com/Pra...
NLP Beginner to Advanced Playlist:
• NLP Beginner to Advanced
I am a Freelance Data Scientist working on Natural Language Processing (NLP) and building end-to-end NLP applications.
I have over 7 years of experience in the industry, including as a Lead Data Scientist at Oracle, where I worked on NLP and MLOps.
I share practical hands-on tutorials on NLP and bite-sized information and knowledge related to Artificial Intelligence.
LinkedIn: / pradipnichite
#machinelearning #artificialintelligence #datascience #nlp #bert #transformers

Published: 12 Oct 2024

Comments: 68
@FutureSmartAI · 1 year ago
📌 Hey everyone! Enjoying these NLP tutorials? Check out my other project, AI Demos, for quick 1-2 min AI tool demos! 🤖🚀 🔗 YouTube: www.youtube.com/@aidemos.futuresmart We aim to educate and inform you about AI's incredible possibilities. Don't miss our AI Demos YouTube channel and website for amazing demos! 🌐 AI Demos Website: www.aidemos.com/ Subscribe to AI Demos and explore the future of AI with us!
@infrared.6130 · 1 year ago
I searched and read a lot to solve one simple company assessment problem but was not able to solve it, as I couldn't find any fine-tuning video. You are a gem.
@ashishmalhotra2230 · 6 months ago
Hey Pradip. Your videos are very informative. Just a suggestion: instead of putting chapter numbers, can you put a small description so that one can jump straight to the desired point in the timeline?
@mansibisht557 · 4 months ago
Great video!!! You just solved a proposed RFP at my work. Thanks Pradeep!!!
@athariqraffi8674 · 2 months ago
Thanks for the video, I can understand easily from your explanation.
@koushik7604 · 2 months ago
It's a nice tutorial brother.
@OnLyhereAlone · 8 months ago
New subscriber here. Thanks for this clear explanation. I have watched a couple of other videos of yours and am still watching, but I have a question that you did not get to in this example because you had only 1 epoch. If I trained for, say, 10 epochs while tracking metrics (e.g., validation loss, accuracy or F1 score), and my best model was reached at the 6th epoch, how do I specify saving that 6th-epoch model? Thank you.
@FutureSmartAI · 7 months ago
This might be helpful. "If you set the option load_best_model_at_end to True, the saves will be done at each evaluation (and the Trainer will reload the best model found during the fine-tuning)." discuss.huggingface.co/t/trainer-save-checkpoint-after-each-epoch/1660
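A minimal sketch of that setting (argument names follow the transformers TrainingArguments API; the "f1" metric, variable names, and output paths are assumptions, reusing the model and datasets from the fine-tuning setup):

from transformers import TrainingArguments, Trainer

# Evaluate and checkpoint every epoch, then reload the best checkpoint
# (e.g. the one from epoch 6 of 10) when training finishes.
args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=10,
    evaluation_strategy="epoch",     # run evaluation at the end of each epoch
    save_strategy="epoch",           # save a checkpoint at the same points
    load_best_model_at_end=True,     # reload the best checkpoint after training
    metric_for_best_model="f1",      # assumes compute_metrics returns an "f1" key
    greater_is_better=True,
)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset,
                  eval_dataset=val_dataset, compute_metrics=compute_metrics)
trainer.train()
trainer.save_model("best-bert")      # writes the best model, not the last epoch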
@jacobpyrett2668 · 1 year ago
GREAT video! solved exactly what I was looking for.. thanks so much!
@FutureSmartAI · 1 year ago
Great to hear!
@FutureSmartAI · 1 year ago
You can join Discord if you need help with any of my videos. discord.gg/teBNbKQ2
@abhijitnayak1639 · 1 year ago
@FutureSmartAI Hello Pradip, thank you for the amazing informational content. I was wondering if you could make some videos on fine-tuning a language model (for instance BERT or RoBERTa) on a dataset using DeepSpeed on multiple GPUs. This would be very helpful for my learning. Thanks in advance.
@bassemgouty9840 · 1 year ago
Very nice video and well explained, well done!
@FutureSmartAI · 1 year ago
Glad you liked it!
@adekunledavidgbenro4823 · 2 years ago
Thanks for this video. Really helpful. Can you do a similar video for a pretrained NMT model for, let's say, the Danish language?
@FutureSmartAI · 2 years ago
Hi Adekunle, if it's a Hugging Face transformer model then the process will be the same.
@183lucrido_ase · 4 months ago
Is this the natural way to create a custom dataset?! Can't believe you have to write a custom class for such a simple task.
@Tiger-Tippu · 1 year ago
Hi Pradip, what's the purpose of creating a PyTorch custom dataset when we already have our own dataset?
@FutureSmartAI · 11 months ago
Hi, the custom Dataset is just a wrapper that makes iterating through your dataset and getting the correct item easy. Check the __getitem__ method.
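A rough sketch of what such a wrapper typically looks like (the class and field names are placeholders, not the notebook's exact code):

import torch
from torch.utils.data import Dataset

class CommentDataset(Dataset):
    """Thin wrapper that pairs tokenized encodings with labels so the
    Trainer/DataLoader can fetch one correctly indexed item at a time."""
    def __init__(self, encodings, labels):
        self.encodings = encodings   # output of tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item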
@matanakhni · 2 years ago
Brilliant, hats off!
@FutureSmartAI · 2 years ago
Thank you for your support
@121_bimandas9 · 1 year ago
Hey Pradip, for a news summarisation project, can I fine-tune BERT with the CNN/DailyMail dataset? Will this perform better than the basic BERT model?
@FutureSmartAI · 1 year ago
Hi, did you first try a pre-trained model directly, like huggingface.co/facebook/bart-large-cnn? What improvement are you looking for? Fine-tuning will definitely improve performance, but first check whether you need fine-tuning. Instead of BERT you can fine-tune other models like T5. Check this: huggingface.co/docs/transformers/tasks/summarization
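For reference, trying the pre-trained checkpoint first takes only a few lines with the pipeline API (a sketch; the article text is a placeholder):

from transformers import pipeline

# Try the off-the-shelf summarizer before investing in fine-tuning.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = "Long news article text goes here ..."   # placeholder input
print(summarizer(article, max_length=60, min_length=20, do_sample=False))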
@victorwang9538 · 10 months ago
Great explanation, and the notebook works! I followed the notebook and fine-tuned a BERT model. I found two ways to use the model: tokenizer = BertTokenizer.from_pretrained('custombert'); model = BertForSequenceClassification.from_pretrained('custombert', num_labels=2), or tokenizer = AutoTokenizer.from_pretrained("custombert"); model = AutoModelForSequenceClassification.from_pretrained("custombert"). Either way, I can't load the tokenizer. Is this because I didn't update the vocabulary? And what's the difference between "AutoModelForSequenceClassification" and "BertForSequenceClassification"? Thanks a lot!
@FutureSmartAI · 10 months ago
AutoModelForSequenceClassification is a generic class that can be used with any model, whereas BertForSequenceClassification is a specific implementation of it.
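In other words, both of these load the same fine-tuned weights (a sketch, assuming the model and tokenizer were saved to a local "custombert" directory):

from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          BertTokenizer, BertForSequenceClassification)

# The Auto* classes read config.json and pick the right architecture automatically...
tokenizer = AutoTokenizer.from_pretrained("custombert")
model = AutoModelForSequenceClassification.from_pretrained("custombert")

# ...while the Bert* classes are the BERT-specific equivalents of the same thing.
bert_tokenizer = BertTokenizer.from_pretrained("custombert")
bert_model = BertForSequenceClassification.from_pretrained("custombert")

# Note: loading the tokenizer only works if it was saved alongside the model,
# e.g. tokenizer.save_pretrained("custombert") after fine-tuning.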
@victorwang9538 · 10 months ago
Got it, thank you! @FutureSmartAI
@vinaykulkarni8948 · 2 years ago
Excellent!!
@FutureSmartAI · 2 years ago
Thank you Vinay for your support. Keep watching and learning.
@Slimshady68356 · 1 year ago
nice explanation dude
@thisjitislegitimatelytripping · 2 months ago
you too dood
@AK-wj5bx · 1 year ago
Hi @Pradip Nichite, thanks for the great explanation :) I have a question: I have machine-generated data which is not natural language (although the sequence of words in the data is important). I do not have any labels in the data; would it be wise to fine-tune BERT and generate word embeddings with it? The idea is to check if BERT would generate more meaningful embeddings compared to word2vec skip-gram. Thanks in advance :)
@josiahadesola · 1 year ago
Wow, thank you so much
@FutureSmartAI · 1 year ago
You are very welcome
@TâmVõMinh-t2k · 1 year ago
Hi Pradip, thank you for this tutorial. I just want to ask: do you have any tutorial on fine-tuning BERT (or BERTology methods) for a GENERATIVE question answering task? Hope you can see my comment. Thanks in advance!
@FutureSmartAI · 1 year ago
Yes. This should clear up the concept and show you the procedure: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-9he4XKqqzvE.html
@rahulgirase78 · 2 years ago
Very Helpful
@FutureSmartAI · 2 years ago
Glad it helped
@AlexXu-cs7bt · 1 year ago
Hi Pradip, thank you for this tutorial. Is it possible to fine-tune the BERT model to predict a multi-class output? For example, emotions rather than a binary classification like in this example.
@FutureSmartAI · 1 year ago
Yes, you can fine-tune a BERT model for multi-class classification. Here is one example that shows multi-class classification using BERT: towardsdatascience.com/text-classification-with-bert-in-pytorch-887965e5820f
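The only required change is the size of the classification head (a minimal sketch, assuming integer-encoded emotion labels):

from transformers import AutoModelForSequenceClassification

# For multi-class classification, set num_labels to the number of classes
# (e.g. 6 emotions); the dataset's label column must hold integers 0..5.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=6)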
@AlexXu-cs7bt · 1 year ago
@FutureSmartAI Thank you so much!
@angduybui7051 · 6 months ago
@FutureSmartAI Hi Pradip. I am a university student and I really appreciate your tutorial and instructions. I also followed the instructions at the link you commented. They already work, but I don't know how to save, test and deploy the model. Hope you can help me. Forgive me for this lack of knowledge!
@ahsanrossi4328 · 2 years ago
Amazing Thanks Man
@FutureSmartAI · 2 years ago
Glad you liked it!
@tehzeebsheikh165 · 5 months ago
Hi, can we use the same code for DistilBERT or RoBERTa as well?
@harrylu4488 · 1 year ago
Hi Pradip, this is a great video. Thanks for your efforts in creating this for us. Could you please give me some advice on tackling data privacy issues when using these pre-trained models from Hugging Face? I understood that when we import these pre-trained models and do training, we might be sending the private data that we are training on through an API? Based on your experience, if we want to keep the data out of the public while still enjoying the benefits of these pre-trained models, what would you recommend? I know Hugging Face is promoting their private hub demo. What do you think about that?
@FutureSmartAI · 1 year ago
Hi Harry, when you use a pre-trained model from Hugging Face and fine-tune it, you are not sending any data to Hugging Face. If you fine-tune a model like GPT-3, then you have to send your data to the OpenAI server.
@harrylu4488 · 1 year ago
@FutureSmartAI Thanks Pradip. So confirming: if we use the Hugging Face Trainer API just like the video tutorial shows, we are sending our data to Hugging Face, correct?
@FutureSmartAI · 1 year ago
@harrylu4488 No, we are not sending it. Though we call it the Trainer API, it's just part of the open-source library. If you use the Hugging Face Inference API, then you need to send data to their server: huggingface.co/inference-api
@saadkhattak7258 · 2 years ago
Hi Pradip, I was following your code and got this error: Target size (torch.Size([8])) must be the same as input size (torch.Size([8, 2])). Can you help me fix it? I was simply running your notebook in Google Colab.
@FutureSmartAI · 2 years ago
Can you share with me on LinkedIn a screenshot of the line where you got that error?
@DivyaPrakashMishra1810 · 8 months ago
Followed the same approach but getting this error from the trainer.train() method: Expected input batch_size (1360) to match target batch_size (16).
@saralasri9129 · 1 year ago
Hi Pradip, how can I solve this problem? InvalidRequestError: The model `curie:ft-wrAQszDv88OVOWOQSjjqLZqe` does not exist
@FutureSmartAI · 1 year ago
How does the curie model come into this?
@Mostafa_Sharaf_4_9 · 9 months ago
If the number of labels is 3, for example [positive, negative, neutral], what changes are needed in the code?
@FutureSmartAI · 9 months ago
Hi, there is a `num_labels` parameter: model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5). You can check this here, where they have 5 labels: huggingface.co/docs/transformers/training
@Mostafa_Sharaf_4_9 · 9 months ago
@FutureSmartAI Thank you.
@cCcs6 · 1 year ago
Hi Pradip, thanks first of all for this great content! One question: I reproduced your code from this tutorial exactly, and the model seems to work like yours in the video; however, it doesn't correctly predict the toxic label for inputs from the training data. For example, for the comment_text from line 14 of train_data the label should be toxic = 1, but the model predicts almost 0 for toxic. Can you explain what is wrong? This is the comment_text from line 14: Hey... what is it.. @ | talk . What is it... an exclusive group of some WP TALIBANS...who are good at destroying, self-appointed purist who GANG UP any one who asks them questions abt their ANTI-SOCIAL and DESTRUCTIVE (non)-contribution at WP? Ask Sityush to clean up his behavior than issue me nonsensical warnings... Is the reason that the model predicts the toxicity "better" than labeled in the train_data, or "worse"?
@cCcs6 · 1 year ago
* I have to add that so far I have only trained the model with epoch=1, not yet with epoch=10.
@FutureSmartAI · 1 year ago
Train for more epochs. Even if you train a great model, there is still a chance it may make mistakes on a few examples. If you find such examples, include them in the training data.
@cCcs6 · 1 year ago
@FutureSmartAI thank you! 😇
@Sarmoung-Biblioteca · 4 months ago
Is this BERT Mobile?
@sachinborse4178 · 5 months ago
It's not working at the cell with # define trainer, args=training_arguments. Please make one more video as soon as possible 🙏🏻
@FutureSmartAI · 5 months ago
Sure. You should check the new syntax.
@MrMadmaggot · 7 months ago
Can you explain the loss metrics, please?
@pulikantijyothi9388 · 1 year ago
👏👏👏👏👏👏👏👏👏👏👏👏👏👏
@Starius2 · 1 year ago
Basically. You wish to limit people's ability to express themselves and arbitrarily label them as "toxic". Gotcha.
@punamsarmah3436 · 9 months ago
Hi Pradeep. Can I please get your email ID?
@FutureSmartAI · 9 months ago
Hi, you can connect with me on LinkedIn.