How to use the pre-trained VGG16 model for Transfer Learning and Fine-Tuning with the Keras API and TensorFlow. github.com/Hva... This tutorial has been updated to work with TensorFlow 2.1 and possibly later versions.
You deserve more viewers; people believe other, inferior content more easily than yours. Thank you for your wonderful explanation, it helped me a lot with my master's thesis project.
Sir, you are great. I was searching for exactly this for the past few weeks but didn't find a solution, and finally found it here. Thank you so much for such great videos.
This is the best video I have seen on TensorFlow and deep learning. You have really helped us all out by preparing such descriptive documentation. I had read so many projects on GitHub and various other websites, but this is the best one. Keep up the great work; awaiting more tutorials from you.
I would love to see you do an update to this work, especially in the area of object detection! Your channel is a hidden gem on YT and extremely informative. Thanks.
Good timing Magnus! I have been trying for three days to fine tune the Slim Inception_V3 model. I had no idea those weights were frozen. I will try the Keras Inception_V3 model following your tutorial. Thanks again for your great work!
Hello Magnus, I was able to load the InceptionV3 model in the tutorial. It performed the prediction a bit faster than VGG: about 0.13 seconds vs. 0.7 seconds. During this process I noticed a potential improvement in the way the layers are added to the pre-trained model. As it is now in the tutorial, if you save the model and reload it, the pre-trained model becomes a single layer, so you cannot select which layers of the pre-trained model to fine-tune after reloading. I recommend adding the layers to the pre-trained model as follows:

model = InceptionV3(include_top=False, weights='imagenet',
                    input_tensor=Input(shape=(image_shape[0], image_shape[1], 3)),
                    classes=1000)
transfer_layer = model.get_layer('mixed7')
x = Flatten()(transfer_layer.output)
x = Dense(512, activation='relu')(x)
x = Dropout(keep_prob)(x)
x = Dense(256, activation='relu')(x)
x = Dropout(keep_prob)(x)
x = Dense(num_classes, activation='sigmoid')(x)
fin_model = Model(inputs=model.input, outputs=x)

Then after saving fin_model you can access all the layers in the pre-trained model as well.
@Hvass Laboratories Thanks for sharing. Do you have a plan to do this in TensorFlow only (not the Keras API)? It would also be really helpful if you provided a tutorial on how to configure, train, and test a multiple-output (multi-head) VGG16 model in TF, combining 2 new classifiers (with all FC layers reconfigurable in their number of neurons). Fine-tuning classifier 1: Block 1 to Block 5 pre-trained on ImageNet, FC weights and biases randomly initialized. After training classifier 1 we get a new VGG16_1; adding one more head (classifier) gives a two-head model, VGG16_2. We then train only the FC layers of the 2nd classifier of VGG16_2 (transfer learning).
Dear Magnus, thanks for the tutorial! Could you please clarify what will happen if we make all layers 'trainable' and optimize directly? Your tutorial says: 'But once the new classifier has been trained we can try and gently fine-tune some of the deeper layers in the VGG16 model as well. We call this Fine-Tuning.' And from F. Chollet's tutorial, fine-tuning can be done in 3 steps: 1) instantiate the convolutional base of VGG16 and load its weights, 2) add our previously defined fully-connected model on top and load its weights, 3) freeze the layers of the VGG16 model up to the last convolutional block. Both of you say that the classifier should be trained first. That leads (as in your example) to a two-step training process, compiling the model twice, etc. What happens if I load VGG (with or without the top), make all layers trainable, and run training? I'm asking because it has become very time-consuming: first I have to find the right hyper-parameters to train the classifier, then the right combination of learning rate + trainable layers for fine-tuning.
Good question. As I recall, I first trained the entire network (VGG plus new classification layers) because I had forgotten to disable the training for the original network. I think it was possible, but I needed to have quite low learning-rates, maybe 1e-4 or 1e-5. With high learning-rates it didn't work, as I recall. I don't know if it will work for all data-sets and models, though. It may also happen that the training takes much longer because of the lower learning-rate, so it might actually be faster to first train the new classification layers, and then fine-tune the entire model. Please write back here what you find out so others may benefit from your experiments.
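The two-phase approach discussed above can be sketched as below. This is a minimal illustration using a tiny stand-in model and random data instead of VGG16 and a real dataset (all layer sizes, learning rates, and names here are made up for the example):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Toy stand-in for a pre-trained convolutional base (in the tutorial
# this would be the VGG16 layers up to the transfer layer).
conv_base = models.Sequential([
    layers.Conv2D(8, 3, activation='relu', input_shape=(32, 32, 3)),
    layers.GlobalAveragePooling2D(),
], name='conv_base')

# New classifier head stacked on top of the base.
model = models.Sequential([
    conv_base,
    layers.Dense(16, activation='relu'),
    layers.Dense(3, activation='softmax'),
])

# Random stand-in data: 8 images, 3 classes.
x = np.random.rand(8, 32, 32, 3).astype('float32')
y = np.random.randint(0, 3, size=8)

# Phase 1: freeze the base and train only the new head.
conv_base.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss='sparse_categorical_crossentropy')
model.fit(x, y, epochs=1, verbose=0)

# Phase 2: unfreeze the base and fine-tune everything with a much
# lower learning rate; the model must be re-compiled for the change
# to the trainable flag to take effect.
conv_base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss='sparse_categorical_crossentropy')
model.fit(x, y, epochs=1, verbose=0)
```

The key detail is re-compiling after toggling `trainable`, and dropping the learning rate for the fine-tuning phase, which matches the low rates (1e-4 or 1e-5) mentioned above.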
Why are you using generator_test both to evaluate the model and as the validation set during training? In theory there are 3 independent sets: train / validation / test, no?
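For reference, the three-way split the question describes can be sketched like this. The sample count and split ratios below are made up for the illustration; the point is just that the three index sets never overlap:

```python
import numpy as np

# Hypothetical dataset of 1000 samples.
num_samples = 1000
rng = np.random.default_rng(0)
indices = rng.permutation(num_samples)

# 70% train, 15% validation (monitored during training),
# 15% test (held out until the final evaluation).
n_train = int(0.70 * num_samples)
n_val = int(0.15 * num_samples)

train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]

# The three sets must be disjoint.
assert not set(train_idx) & set(val_idx)
assert not set(val_idx) & set(test_idx)
```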
Hi! I have a question regarding the number of steps, which you set to 100 at 21:12. You mentioned in the video that you did this because Keras would otherwise run forever. I have found other scripts where the number of steps is calculated as: the total number of images (in either the validation or training set) divided by the batch size. This would still be a fixed number. What makes the difference between the two methods? Thank you for your efforts!
I can't remember exactly why I chose that number here. It's been a while since I made this tutorial. But as I recall it generates an infinite number of variations of the images in the training-set, so I probably just set the number of steps per epoch to some number that was reasonable.
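The formula from the question can be sketched as follows (the image count and batch size here are made-up example numbers, not from the tutorial):

```python
import math

# Hypothetical numbers: 4170 training images, batches of 20.
num_train_images = 4170
batch_size = 20

# One epoch = one pass over the data; round up so the last
# partial batch is still included.
steps_per_epoch = math.ceil(num_train_images / batch_size)
print(steps_per_epoch)  # 209
```

With an augmenting generator that yields endless variations of the training images, there is no natural "end of epoch", so steps_per_epoch mainly controls how often Keras reports metrics; a fixed round number like 100 works just as well.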
I've made a classifier from scratch to classify yoga poses and got 72% accuracy, so I wanted to improve my model using the pre-trained VGG16 model. I only got 70%, but the training behaviour improved (faster convergence, fewer iterations). Does transfer learning always improve accuracy?
I don't know. That's an unusual task and I imagine your data-set is quite small? Where did you even get the data-set? You could make a Notebook on GitHub Gist and ask a more detailed question on StackOverflow to get more feedback. Please write a link here to the question as I'm curious to see your Notebook.
I've collected the images from Google Images and spent time cleaning them up. I ended up with 7 classes each containing 370 images, so it is fairly small. I will upload my notebook and send it to you. Thank you.
Hvass Laboratories I tried, but I have very little data, i.e. 22 faces per subject and 19 subjects in total. It throws an error, and I found out it was due to the size of the dataset. I am trying FaceNet instead. Thanks for replying.
Your videos help people learn how to do this. Thank you so much for your contribution; I admire you. I hope you can show us how to do localization and custom object detection, together with image captioning, one of these days. These topics are really interesting but not easy for those of us who have only just become familiar with deep learning.
How do I use the trained model afterwards? I tried new_model.save('nm.h5') followed by load_model('nm.h5'), but it won't load, complaining: ValueError: Unknown layer: name
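For what it's worth, the plain save/load round-trip can be sketched as below with a tiny stand-in model. If the "Unknown layer" error comes from a custom or non-standard layer in the saved model, the usual fix is to pass the layer class via the custom_objects argument of load_model (shown as a comment, since the actual layer name in the error is unknown here):

```python
import os
import tempfile
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.models import load_model

# Tiny stand-in model; in the tutorial this would be new_model.
model = models.Sequential([
    layers.Dense(4, activation='relu', input_shape=(8,)),
    layers.Dense(2, activation='softmax'),
])

path = os.path.join(tempfile.mkdtemp(), 'nm.h5')
model.save(path)

# Standard layers reload without extra arguments. A model containing
# a custom layer class must have it passed explicitly, e.g.:
# restored = load_model(path, custom_objects={'MyLayer': MyLayer})
restored = load_model(path)

# The restored model should give identical predictions.
x = np.random.rand(1, 8).astype('float32')
assert np.allclose(model.predict(x, verbose=0),
                   restored.predict(x, verbose=0))
```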
Hi sir, thanks a lot, you really deserve better. I had a little problem and I hope you reply. I used your code, only changing the dataset (I used the AUC distracted driver dataset), and trained for 50 epochs. I got 75% training accuracy (before fine-tuning) but only 30% test accuracy! Any suggestions, please?
Thank you so much for an amazing and helpful tutorial!! I did your tutorial on Google Colab and then tried a different dataset (Fashion-MNIST). I used:

from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train, x_test: uint8 arrays of grayscale image data with shape (num_samples, 28, 28). y_train, y_test: uint8 arrays of labels (integers in the range 0-9) with shape (num_samples,). However, unlike the datasets you used, where all the images are separated into different folders, this is just arrays. How could I use this for transfer learning? Thank you :)
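When the data is already in memory as arrays rather than folders, one option is to convert grayscale to 3-channel images, resize them to the input size the pre-trained model expects, and feed the arrays with ImageDataGenerator.flow instead of flow_from_directory. A sketch, using random arrays as a stand-in for the actual Fashion-MNIST download (the 224x224 target size is just VGG16's default ImageNet input, not a requirement):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in for fashion_mnist.load_data(): (num_samples, 28, 28) uint8.
x_train = np.random.randint(0, 256, size=(16, 28, 28), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=16)

# 1) Add a channel axis and repeat it 3 times (grayscale -> RGB).
x_rgb = np.repeat(x_train[..., np.newaxis], 3, axis=-1)

# 2) Resize to the input size the pre-trained model expects.
x_resized = tf.image.resize(x_rgb.astype('float32'), (224, 224)).numpy()

# 3) Feed the arrays directly instead of reading from folders.
datagen = ImageDataGenerator(rescale=1.0 / 255)
generator = datagen.flow(x_resized, y_train, batch_size=8)

batch_x, batch_y = next(generator)
print(batch_x.shape)  # (8, 224, 224, 3)
```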
For those who get the following error when trying to load the Keras API: ModuleNotFoundError: No module named 'tensorflow.python.keras' — replace 'from tensorflow.python.keras' with 'from keras'. I've tested it with the rest of the code and it works fine.
I think I forgot to put version-numbers in this Notebook. It was made in TensorFlow 1.4 which should include Keras and it is imported like I wrote in the Notebook. I believe the way you import it requires Keras to be installed as a separate package e.g. using pip install.
In the original Inception model we were able to classify a single image after training. From what I understand, we set model = inception.Inception(), then defined classify(image_path) with pred = model.classify(image_path=image_path). Is there a similar way to call the new model we've trained on the forks and knives to make a prediction on a single image, i.e. whether it contains forks, spoons, knives, or none of the three? Currently it only classifies everything in the test set. Thank you very much for your videos; they are such a valuable resource!
I'm glad you like my work! The Inception-code in my previous tutorials was a quick API that I put together myself. Using the Keras API you have to call new_model.predict(image). First you need to load the image as a numpy array, see the Notebook's predict() function for an example. You can also see Tutorial #03-C.
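The single-image prediction described above boils down to getting the image into a batch-shaped float array. A minimal numpy sketch, using a random array as a stand-in for an image actually loaded from disk (e.g. with PIL) and resized to the model's input size:

```python
import numpy as np

# Stand-in for an image loaded and resized to the model's input size,
# e.g. with PIL: np.array(Image.open(path).resize((224, 224)))
img = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Scale to [0, 1] and add a batch dimension, since model.predict()
# expects a batch of images, not a single image.
img_array = np.expand_dims(img.astype('float32') / 255.0, axis=0)
print(img_array.shape)  # (1, 224, 224, 3)

# Then: pred = new_model.predict(img_array)  # shape (1, num_classes)
```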
Wow, Keras makes it much easier to pull predictions from the model. I missed your #03-C file; I will definitely go through it in detail. So far I've gotten the new conv_model to make a prediction on the new image with:

pred = conv_model.predict(img_array)

However, I'm not sure how to edit this part to get the new prediction to make sense. Is decode_predictions a function specifically for the VGG16 model? I've looked in the VGG16.py file and couldn't find a reference to it.

# Decode the output of the VGG16 model.
pred_decoded = decode_predictions(pred)[0]

The error message is: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 7, 7, 512). I'm guessing the new conv_model's predictions are not in the same format as VGG16's?
So far this is what I was able to come up with:

for i in range(len(class_names)):
    print("{0:.2%}".format(pred[0, i]), class_names[i])

I still can't find documentation for pred_decoded = decode_predictions(pred)[0]. I assume it only works for the output of the full VGG16 model?
You may want to watch the tutorial again. conv_model is the part of the pre-trained VGG16 model that we are re-using inside new_model, so you need to call new_model.predict(img_array) to use the output of new_model. I think decode_predictions() is used for all pre-trained models in Keras that were trained on ImageNet. If you use PyCharm you can press ... hmmm ... Ctrl+B or Ctrl+Alt+B or Ctrl+Shift+B (I think, but I'm not sure) and it takes you to the definition of the function: resources.jetbrains.com/storage/products/pycharm/docs/PyCharm_ReferenceCard.pdf But new_model is trained on your new data-set, so it doesn't make sense to use decode_predictions() on its output, because that function is only for ImageNet classes.
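Since decode_predictions() only knows the 1000 ImageNet classes, for a custom model the class names are mapped by hand. A small numpy sketch with made-up probabilities (the class names stand in for whatever the new dataset uses):

```python
import numpy as np

# Hypothetical output of new_model.predict() for one image, 3 classes.
pred = np.array([[0.05, 0.85, 0.10]])
class_names = ['forky', 'knifey', 'spoony']  # illustrative names

# Sort class indices by descending probability and print them,
# mimicking what decode_predictions() does for ImageNet.
for idx in np.argsort(pred[0])[::-1]:
    print('{0:.2%}  {1}'.format(pred[0][idx], class_names[idx]))
# 85.00%  knifey
# 10.00%  spoony
# 5.00%  forky
```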
Hvass Laboratories lol, yes I see the mistake I made; I called the wrong layer. So decode_predictions is just to sort out the 1000 possible outputs from the full model and isn't necessary here for 3 outputs?