If you're completely new to GANs, I recommend you check out the GAN playlist, where there is an introduction video to how GANs work, and then watch this video where we implement the first GAN architecture from scratch. If you have recommendations on GANs you think would make this into an even better resource for people wanting to learn about GANs, let me know in the comments below and I'll try to do it :) I learned a lot and was inspired to make these GAN videos by the GAN specialization on Coursera, which I recommend. Below you'll find both affiliate and non-affiliate links; the pricing for you is the same, but a small commission goes back to the channel if you buy it through the affiliate link.
affiliate: bit.ly/2OECviQ
non-affiliate: bit.ly/3bvr9qy

Here's the outline for the video:
0:00 - Introduction
0:29 - Building Discriminator
2:14 - Building Generator
4:36 - Hyperparameters, initializations, and preprocessing
10:14 - Setup training of GANs
22:09 - Training and evaluation
8:00 - transforms.Normalize((0.1307,), (0.3081,)) will not work, because of the following:
* The nn.Tanh() output of the Generator is (-1, 1).
* MNIST values are [0, 1].
* Normalize does the following for each channel: image = (image - mean) / std
* So transforms.Normalize((0.5,), (0.5,)) converts [0, 1] to [-1, 1], which is ALMOST correct, because the nn.Tanh() output of the Generator is (-1, 1), excluding one and minus one.
* transforms.Normalize((0.1307,), (0.3081,)) converts [0, 1] to ≈ (-0.42, 2.82). But the Generator cannot generate values greater than 0.9999... ≈ 1, so it will not generate 2.8 for white color. That is why transforms.Normalize((0.1307,), (0.3081,)) will not work.
P.S. To use transforms.Normalize((0.1307,), (0.3081,)) you would have to scale the nn.Tanh() output: nn.Tanh() * 2.83 ≈ (-2.83, 2.83)
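For reference, a minimal sketch of the transform pipeline the comment above says does work, assuming the video's MNIST setup with a Tanh generator:

```python
import torchvision.transforms as transforms

# Maps MNIST pixels from [0, 1] to [-1, 1], matching the generator's Tanh output range
transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5,), (0.5,)),  # (x - 0.5) / 0.5 -> [-1, 1]
])
```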
I think at 18:22 using detach is better: for one thing, retain_graph=True costs more memory, and for another, if we don't use detach we also compute gradients for the parameters of G when we train D.
If we use detach, what is the point of disc_fake? disc_fake = disc(fake.detach()).view(-1), and if we do a backward() we get no gradients out of it (because fake.detach() has requires_grad=False), which means no update happens here.
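For what it's worth, here is a rough sketch of the detach variant being discussed (gen, disc, criterion, opt_disc, real, batch_size, z_dim, and device are assumed to be the names from the video's code). Detaching fake only blocks gradients from flowing back into the generator; the discriminator itself still receives gradients and is updated:

```python
# Train Discriminator: max log(D(real)) + log(1 - D(G(z)))
noise = torch.randn(batch_size, z_dim).to(device)
fake = gen(noise)

disc_real = disc(real).view(-1)
lossD_real = criterion(disc_real, torch.ones_like(disc_real))

disc_fake = disc(fake.detach()).view(-1)   # detach: no gradients flow into the generator
lossD_fake = criterion(disc_fake, torch.zeros_like(disc_fake))

lossD = (lossD_real + lossD_fake) / 2
disc.zero_grad()
lossD.backward()                            # no retain_graph needed; the generator's graph is untouched
opt_disc.step()
```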
Hello. Thx for the video. I tried this code exactly, except that I used 400 epochs, but the fake images still look like noise. How did you get these results on TensorBoard? Can you please share the hyperparameters that you used?
If you're using Google Colab, just add these lines before training your model:
%load_ext tensorboard   # load the TensorBoard notebook extension
%tensorboard --logdir logs
In my experience, mixing ReLU with Tanh does not work super well; this is also a point you might add to your final list of possible improvements, e.g. using only Tanh for the whole generator.
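For anyone who wants to try that suggestion, a rough sketch of a generator that uses Tanh throughout instead of mixing LeakyReLU and Tanh (layer sizes are just the ones from the video's simple GAN; whether this actually trains better is not verified here):

```python
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim, img_dim):
        super().__init__()
        self.gen = nn.Sequential(
            nn.Linear(z_dim, 256),
            nn.Tanh(),               # Tanh in the hidden layer instead of LeakyReLU
            nn.Linear(256, img_dim),
            nn.Tanh(),               # output in (-1, 1) to match the normalized images
        )

    def forward(self, x):
        return self.gen(x)
```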
Thank you so much for the material, this is awesome! I have a small question: why would it be `disc.zero_grad()` instead of `opt_disc.zero_grad()`? In general, are these two statements interchangeable?
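As far as I know they do the same thing here: disc.zero_grad() zeroes the gradients of every parameter in the module, while opt_disc.zero_grad() zeroes the gradients of the parameters handed to the optimizer. Since opt_disc is built from exactly disc.parameters(), the two are interchangeable in this setup. A small sketch (the learning rate is a placeholder):

```python
import torch

opt_disc = torch.optim.Adam(disc.parameters(), lr=3e-4)  # lr is a placeholder value

# Equivalent in this setup, because opt_disc holds exactly disc's parameters:
disc.zero_grad()       # zero grads of all parameters registered in the disc module
opt_disc.zero_grad()   # zero grads of all parameters handed to the optimizer
```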
Heyo, awesome vid as always! I wanted to ask if you could do some variational autoencoders in PyTorch, & maybe also cover some of the mathematics of the special variants if you are interested (i.e. as you're doing for GANs)? :)
Hey, so this simple GAN generates any number? What I mean is: the neural networks have not learnt the features of 0, 1, 2, 3, ... individually; they have learnt what features make up a number in general? Then when z, the random sample from a distribution, is plugged into the generator, it generates a random number because of the noise it was given? Hence, the results could be better if you created a GAN pair for each individual number (which would obviously take a lot more training time, and the networks would be mutually exclusive and not random), so you'd have a GAN pair that generates a fake version of every digit.
Hi, can you explain why we would use BCE loss on the generator as well, and why we would compare it to a tensor of 1s? It makes sense to me to use it for the discriminator, as it is a classifier, but is the generator not doing some form of regression?
This video was really helpful, but what if I don't want to use the MNIST dataset and I want to use my own dataset from my local machine? How do I go about it?
I have separate videos on how to use custom datasets; for something written I highly recommend: pytorch.org/tutorials/beginner/data_loading_tutorial.html
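If it helps, a minimal sketch of swapping MNIST for a local image folder (the folder path is a placeholder; ImageFolder expects one sub-folder per class):

```python
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Grayscale(),               # drop this line for RGB data
    transforms.Resize((28, 28)),          # match the 784-dim input of the simple GAN
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

# "path/to/your/images" is a placeholder; ImageFolder wants a root/class_x/img.png layout
dataset = torchvision.datasets.ImageFolder(root="path/to/your/images", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
```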
These are the parameters you can change, according to a known distribution, to make the generator produce images. I guess 64 is way too high for MNIST; maybe you can use 10, so you can blend any of the digits.
At the discriminator we want to max log(D(real)) + log(1 - D(G(z))). Since loss functions work by minimizing error, we can minimize -(log(D(real)) + log(1 - D(G(z)))). The BCE loss is similar to minimizing the loss written above, so it works fine. At the generator we want to max log(D(G(z))). Could you please explain how criterion(output, torch.ones_like(output)) maximizes log(D(G(z)))? Because the loss function is l_n = -w_n [y_n · log(x_n) + (1 - y_n) · log(1 - x_n)]. According to your code, aren't we trying to maximize -log(D(G(z)))? Because there is a negative in the loss function, shouldn't we add a negative in our training phase? Please explain, I am stuck here.
@@madhuvarun2790 Can you please elaborate? As I see it, on the discriminator side, loss_real = -log(D(real)) and loss_fake = -log(1 - D(G(z))), but it's still minimizing, right? I can't understand how that's maximizing the loss; I have the same doubt with the generator loss.
@@asagar60 Yes, it is minimizing the loss; I was wrong. At the discriminator we are minimizing -log(D(real)) and -log(1 - D(G(z))); at the generator we are minimizing -log(D(G(z))).
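A small check of the point made in this thread: with a target of all ones, BCE reduces to -log(D(G(z))), so minimizing it is the same as maximizing log(D(G(z))). The criterion here is assumed to be nn.BCELoss(), as in the video; the output values are made up for illustration:

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()
output = torch.tensor([0.3, 0.8])                    # pretend these are D(G(z)) values

loss = criterion(output, torch.ones_like(output))    # y_n = 1 kills the (1 - y_n) term
manual = -torch.log(output).mean()                   # -log(D(G(z))), averaged over the batch

print(loss.item(), manual.item())                    # both ≈ 0.7136
```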
Question: 18:18, code line 77. Do we have to compute disc(fake) twice? Can't we simply write output = disc_fake? (I thought we added retain_graph=True in order to avoid computing disc(fake) twice.)
We do retain_graph so that we don't have to compute fake twice, i.e. so we can re-use the same image that has been generated. We send it through the discriminator again because we updated the discriminator, and the way I showed in the video is the most common setup I've seen when training GANs. Although it would probably also work if you reused disc_fake from before.
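To illustrate the reply above, a rough sketch of the generator step that follows the discriminator update (names assumed from the video's code): disc(fake) is run a second time because opt_disc.step() has just changed the discriminator, while retain_graph=True on the earlier backward keeps fake's graph alive so the generator does not have to be re-run.

```python
# ... lossD.backward(retain_graph=True) and opt_disc.step() have just run above ...

# Train Generator: min -log(D(G(z))), equivalent to max log(D(G(z)))
output = disc(fake).view(-1)    # re-run through disc: its weights were just updated
lossG = criterion(output, torch.ones_like(output))
gen.zero_grad()
lossG.backward()
opt_gen.step()
```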
You will need to call .detach() on the generator result to ensure that only the discriminator is updated! Line 69 should pass fake.detach() so the generator's weights get removed from the computation graph, and then there is no need for retain_graph on the discriminator's backward pass, since you will not use that graph again, I think.
@@zyctc000 Without detach, the graph used for opt_disc's loss is connected to the generator, so backward() computes gradients for the generator's layers as well; it sees discriminator and generator as one network. The opt_gen optimizer will not affect the discriminator, but the reverse is not true.
Is it normal that this easily takes 1-2 hours for 50 epochs? I first ran it on my computer, which unfortunately has no NVIDIA GPU. Then I tried it on Google Colab, which originally had it running on its CPU too. So I changed the hardware acceleration to GPU, aaaaand... if it's faster, it's not by much. Is that normal? Does this not benefit significantly from GPUs?
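It may simply be that this fully-connected model is small enough that data loading and per-batch Python overhead dominate, so the GPU speedup is modest. One quick thing worth checking is whether the code is actually running on the GPU (gen is assumed to be the generator from the video):

```python
import torch

print(torch.cuda.is_available())        # should print True on a Colab GPU runtime
device = "cuda" if torch.cuda.is_available() else "cpu"

# After the model has been moved with .to(device):
print(next(gen.parameters()).device)    # should show cuda:0, not cpu
```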
Hey, how did you overcome this error in Colab?

TypeError                                 Traceback (most recent call last)
in ()
      1 for epoch in range(num_epochs):
----> 2     for batch_idx, (real, _) in enumerate(loader):
      3         real = real.view(-1, 784).to(device)
      4         batch_sz = real.shape[0]
      5

4 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py in __getitem__(self, index)
    132
    133         if self.transform is not None:
--> 134             img = self.transform(img)
    135
    136         if self.target_transform is not None:

TypeError: 'module' object is not callable
@@drishtisharma3933 Hey Drishti! Yes, I was able to overcome this error, but I do not remember the exact changes I made to the code. I can share my Colab notebook for clarity. Honestly, I didn't try your approach; I was following the video as a code-along. Link: colab.research.google.com/drive/1l1Vt7mcoEQKFxxVbpQOeKZ-UiEHU9ggt?usp=sharing
I'm admittedly a noob at all of this, but I keep getting "TypeError: __init__() takes 1 positional argument but 2 were given" and I can't figure out how to resolve the issue. Any advice would be appreciated.
It seems like, when defining the class, you originally coded a method that takes one argument, but when calling that method / instantiating the object you provided two arguments. E.g.:

def lets_solve(error):
    pass

# Instantiating an object now
solution = lets_solve(error, YOU PROVIDED ONE EXTRA ARGUMENT HERE)

"YOU PROVIDED ONE EXTRA ARGUMENT HERE" denotes the extra argument you shouldn't have provided, going by the original code, which takes just one argument. Hope this makes sense. Good luck!
The generator doesn't identify anything; it only generates. Minimizing its loss means making the generator produce samples very close to the real ones, so that they are not identified as fake by the discriminator.
If you can't follow this, then you're not ready yet. Start with Python basics and work your way up; there are plenty of videos out there. Aladdin's videos are gold, and when you're ready, you'll appreciate them more.
!python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))" - what is the relevance of this to the GAN you have worked on in this video?
Really love this series, man!! Just a quick question though: why did we use fixed_noise and noise differently? In the training part, could we not have used fixed_noise as input to the generator, because noise is noise, right? Does it matter if we start from the same point?
Fixed noise is used to display the images to track the progress of the GAN. Fixed means it doesn't change over time, so if you were to use this in training, you would be feeding the GAN the same vector over and over again, and the GAN would only be able to generate a single image, and the rest of the latent space would remain unexplored.
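In code terms it's roughly this (names such as gen, loader, z_dim, batch_size, and device are assumed from the video):

```python
fixed_noise = torch.randn(32, z_dim).to(device)  # created once, used only for visualization

for epoch in range(num_epochs):
    for batch_idx, (real, _) in enumerate(loader):
        noise = torch.randn(batch_size, z_dim).to(device)  # fresh sample every step for training
        fake = gen(noise)
        # ... discriminator and generator training steps ...

    with torch.no_grad():
        samples = gen(fixed_noise)  # same vectors each epoch -> comparable progress images
```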
Very good explanation of each and every line of code. Can you please make a video on how to optimize GANs with the Whale Optimization Algorithm? I have to do my project on GANs, and my base paper is "Automatic Screening of COVID‑19 Using an Optimized Generative Adversarial Network". I have searched a lot for how to optimize GANs with WOA but couldn't find any related results. Please help me, as you have detailed knowledge of GANs.
Perhaps I didn't show it in the video, but you have to run it through the conda prompt (or a terminal, etc.). I have more info on using TensorBoard in a separate video, so I was kind of assuming that people knew it, but I could've been clearer on that!
@@AladdinPersson this is new for me so I’m still learning all the tools. Please keep doing tutorials btw!! You have been helping me learn AI so much faster due to your pytorch implementations.
Roughly speaking, we want the discriminator to estimate the *probability that its input is real*. Therefore the desired output for disc(real) is 1, and 0 for disc(fake).
The losses in GANs don't really tell us anything (one will go up when the other goes down, and vice versa). The only thing you want to watch out for is the discriminator loss going to 0 or something like that, which would be the case if one of them "takes over".