Fine-Tuning BERT with HuggingFace and PyTorch Lightning for Multilabel Text Classification | Dataset

🔔 Subscribe: bit.ly/venelin-...
🎓 Prepare for the Machine Learning interview: mlexpert.io
📔 Complete tutorial + notebook: curiousily.com...
📖 Get SH*T Done with PyTorch Book: bit.ly/gtd-wit...
🗓️ 1:1 Consultation Session With Me: calendly.com/v...
🔣 GitHub: github.com/cur...
Learn how to use BERT to classify toxic comments from raw text. You'll learn how to prepare a custom dataset and tokenize the text using the Transformers library by HuggingFace. We'll also have a look at PyTorch Lightning and create a data module for our dataset.
#PyTorch #BERT #PyTorchLightning #NLP #Python #MachineLearning #DeepLearning
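
The dataset step described above can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the class name `CommentsDataset`, its parameters, and the default `max_token_len` are hypothetical, and the tokenizer is assumed to be a HuggingFace tokenizer that is passed in and called with `return_tensors="pt"`.

```python
import torch
from torch.utils.data import Dataset


class CommentsDataset(Dataset):
    """Hypothetical dataset pairing raw comments with multi-hot label vectors."""

    def __init__(self, texts, labels, tokenizer, max_token_len=128):
        self.texts = texts            # list of raw comment strings
        self.labels = labels          # list of 0/1 lists, one entry per class
        self.tokenizer = tokenizer    # assumed: a HuggingFace tokenizer
        self.max_token_len = max_token_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, index):
        # Tokenize one comment, padding or truncating to a fixed length.
        encoding = self.tokenizer(
            self.texts[index],
            max_length=self.max_token_len,
            padding="max_length",
            truncation=True,
            return_attention_mask=True,
            return_tensors="pt",
        )
        return {
            # The tokenizer returns shape (1, max_token_len); drop the
            # leading dimension so the DataLoader can stack examples.
            "input_ids": encoding["input_ids"].flatten(),
            "attention_mask": encoding["attention_mask"].flatten(),
            # Float targets, as expected by a multilabel loss.
            "labels": torch.tensor(self.labels[index], dtype=torch.float),
        }
```

With a real tokenizer, for example `BertTokenizer.from_pretrained("bert-base-cased")`, each item is a dictionary of fixed-length tensors that a `DataLoader` can batch. A Lightning data module would then wrap the train and validation instances of such a dataset.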

Published: 12 Oct 2024

Comments: 39

@venelin_valkov 3 years ago

In the next part we'll fine-tune BERT to classify toxic comments and show you a couple of fine-tuning tricks/hacks along the way. Next part video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-UJGxCsZgalA.html Complete tutorial (including Jupyter notebook): curiousily.com/posts/multi-label-text-classification-with-bert-and-pytorch-lightning/ Thanks for watching!

@MMphego 3 years ago

After 5 months of being AWOL, you came back with a bang... Great video, thanks!

@ayoutubechannel7376 1 year ago

I was having such a hard time plugging HF into Lightning. It's much clearer now how they fit together. Thanks!

@tehJimmyy 3 years ago

Thank you, man, you helped me get through my master's thesis.

@VjayVenugopal 3 years ago

Make more regular videos, brother 🙌🏻 Hats off for your efforts 😁

@venelin_valkov 3 years ago

Thanks for the encouragement! I appreciate it 🙏

@fedyaskitsko8698 3 years ago

Thanks for the great tutorial. Where can I find the notebook for this video? It looks like it's missing from the GitHub repo. Thanks

@ellenzou2986 3 years ago

Thanks for the tutorial!! This is so helpful! Would you share the link to the code (the Colab page)? Thanks!!!

@malikrumi1206 3 years ago

Was this supposed to end so suddenly like that? When or where will we get the rest of it? Or is there a 'rest of it'?!

@venelin_valkov 3 years ago

Hey, yes, there is a next part coming up. It will contain the actual model + some hacks/tricks that will make our model train faster and perform better. Thanks for watching!

@saharyarmohammadtoosky 1 year ago

Hi Venelin, thank you for the great information. I have a question: how did you convert the classes into numerical form? I have about 2K classes and they are text labels. Can you give me a hint?
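
One common way to turn text labels into numerical targets for multilabel classification is a multi-hot vector with one 0/1 entry per class. The sketch below uses only plain Python. The helper names and the example labels are hypothetical, and this is not necessarily how the labels were prepared in the video.

```python
def build_label_index(label_lists):
    """Map each distinct text label to an integer column, in sorted order."""
    vocabulary = sorted({label for labels in label_lists for label in labels})
    return {label: column for column, label in enumerate(vocabulary)}


def to_multi_hot(labels, label_index):
    """Turn one example's text labels into a 0/1 vector over all classes."""
    vector = [0.0] * len(label_index)
    for label in labels:
        vector[label_index[label]] = 1.0
    return vector


# Hypothetical examples: each comment carries zero or more text labels.
examples = [["toxic", "insult"], [], ["obscene"], ["toxic"]]
label_index = build_label_index(examples)
targets = [to_multi_hot(labels, label_index) for labels in examples]
```

With about 2K classes each target is simply a vector of length 2K. The same `label_index` must be reused at inference time so that the columns keep their meaning.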

@d3v487 3 years ago

Nice explanation ❤️. I love the way you explain things and go hands-on. It would be very helpful if you uploaded a video on named entity recognition using BERT.

@Deepakkumar-sn6tr 3 years ago

Great video! Looking forward to BERT4Rec fine-tuning.

@mariere2156 3 years ago

Some more questions, if you have the time ^^ Is unsqueeze basically just the opposite of flatten? If so, why did we flatten the data in the first place?
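
The shapes behind this question can be sketched as follows, assuming the tokenizer is called on a single text with `return_tensors="pt"`, which yields tensors with a leading dimension of 1. A zero tensor stands in for the tokenizer output here, and the length of 8 is arbitrary.

```python
import torch

max_token_len = 8  # arbitrary length for illustration

# Stand-in for the tokenizer output for ONE text: shape (1, max_token_len).
encoded = torch.zeros(1, max_token_len, dtype=torch.long)

# flatten() collapses all dimensions into one: shape (max_token_len,).
flat = encoded.flatten()

# Stacking flattened examples (what default DataLoader collation does)
# gives a clean batch of shape (batch_size, max_token_len).
batch = torch.stack([flat, flat, flat])

# Without flattening, stacking would add an extra dimension:
# shape (batch_size, 1, max_token_len), which the model does not expect.
unflattened_batch = torch.stack([encoded, encoded, encoded])

# unsqueeze(0) adds a size-1 dimension back, which is useful when a single
# example is fed to the model outside of a DataLoader.
single = flat.unsqueeze(0)
```

For a `(1, N)` tensor the two calls undo each other, but in general `flatten` collapses every dimension while `unsqueeze` adds exactly one.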

@kvp9553 3 years ago

Hi Venelin!! Great work :) May I know whether continuous retraining is possible with BERT? That is, I have a fine-tuned model. Can I tune it further using an additional dataset without merging the new dataset with the old one?

@mariere2156 3 years ago

Your videos are really helpful! Would your example work just as well with BertForSequenceClassification, or is there a specific reason why you use the 'generic' BERT model?

@yashumahajan7 3 years ago

It's awesome. Where is the second part of this?

@kachrooabhishek 1 year ago

My pretrained BERT model returns an output of shape (batch_size, max_token_length, classes), while the target has shape (batch_size), so I'm not able to calculate the loss.
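
A shape mismatch like this usually means the model emits one prediction per token, while the loss expects one per sequence. The sketch below shows one possible fix under stated assumptions: random tensors stand in for the model output, and position 0 (the `[CLS]` token in BERT) is taken as the sequence representation. The commenter's model may need a different pooling step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, max_token_length, n_classes = 4, 16, 6

# Stand-in for per-token model output: (batch_size, max_token_length, classes).
token_logits = torch.randn(batch_size, max_token_length, n_classes)

# Keep only the first position of each sequence: (batch_size, classes).
sequence_logits = token_logits[:, 0, :]

# Single-label case: targets are class indices of shape (batch_size,).
class_targets = torch.randint(0, n_classes, (batch_size,))
single_label_loss = nn.CrossEntropyLoss()(sequence_logits, class_targets)

# Multilabel case: targets are multi-hot floats of shape (batch_size, classes).
multi_hot_targets = torch.zeros(batch_size, n_classes)
multi_hot_targets[0, 2] = 1.0  # example: first comment has class 2
multilabel_loss = nn.BCEWithLogitsLoss()(sequence_logits, multi_hot_targets)
```

Both losses reduce to a scalar once the logits have shape `(batch_size, classes)`.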

@teetanrobotics5363 3 years ago

Amazing content and tutorials, bro. Thank you so much. Could you please organise all your videos into proper playlists?

@r_pydatascience 3 years ago

Hahahaha, that toxic comment made my day lol

@EarlZMoade 3 years ago

pytorch_lightning has seed_everything() in utilities. Great vid :)

@venelin_valkov 3 years ago

Didn't know that. Thanks!

@semprepi6503 3 years ago

The gdown link is down. Is toxic_comment.csv the train.csv dataset from the toxic comment classification competition on Kaggle?

@testingemailstestingemails4245 2 years ago

How do I train a HuggingFace model on my own dataset? How can I start? I don't know the structure of the dataset. Please help. How do I store voice recordings and link them with their text, and how do I organize that? I am looking for anyone on this planet who can help me. Should I look for the answer on Mars?

@rafiuzzamanbhuiyan 3 years ago

Can you point me to resources or a video on fine-tuning a question answering model with my own dataset?

@venelin_valkov 3 years ago

Haven't played with question answering yet. What type of question answering do you have? Can you give me a sample of your dataset?

@rafiuzzamanbhuiyan 3 years ago

@venelin_valkov The dataset is in the same format as SQuAD.

@noumaaaan 3 years ago

Is this script available somewhere?

@xv0047 1 year ago

This video assumes deep familiarity with PyTorch. Otherwise you're just flying blind.

@eastvalleyreviews24 2 years ago

how do you save the model?

@MMphego 3 years ago

What happened to the rest of the video?

@venelin_valkov 3 years ago

The final part is coming soon. Will show the fine-tuning and inference using the model :) Thanks for watching!

@MrFramue 3 years ago

Hi, when I try to download from your Google Drive with the link, I receive the following error message: Permission denied: drive.google.com/uc?id=1VuQ-U7TtggShMeuRSA_hzC8qGD12LRkr Maybe you need to change the permission to 'Anyone with the link'?

@venelin_valkov 3 years ago

Just tried downloading from another account. It works fine. Use this in Google Colab: !gdown --id 1VuQ-U7TtggShMeuRSA_hzC8qGDl2LRkr Try it out

@MrFramue 3 years ago

Thx a lot, it works :)