K-shot learning is a hot topic in research. Let's understand one of the first core algorithms introduced to train meta-models: Model-Agnostic Meta-Learning (MAML).
Can anyone say whether MAML can be applied to binary classification or not? If we have a dataset that contains only Dog vs. Cat images, why would we need to apply MAML...
What I don't understand about MAML is that some methods have a parameter phi representing the inner-loop training. Where in your diagram does the update of phi happen? Thank you very much.
At 6:10, how can I backpropagate? The backpropagation algorithm for a layer's weights needs the input and the output loss that went through that layer, but the query set has never gone through the original model. How can I calculate the gradient? I can't understand...
Hi. Each support set is passed to its respective copy of the model, and then an overall loss is calculated; this I find is easiest to see as a single function for the overall loss at 4:38... Now we can calculate the gradient of the overall loss with respect to the original parameters, because each model copy is initialised with the original model parameters (which are tied/shared across the copies). In other words, there is a differentiable function from the original model parameters to the output loss, so we can backpropagate through it to calculate this gradient (and then update the original parameters)... hope this is helpful?
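To make the inner/outer loop concrete, here is a minimal sketch of a MAML-style update on a toy linear-regression problem. It uses the first-order approximation (FOMAML), where the outer gradient is taken at the adapted parameters rather than differentiating through the inner update; the task setup and hyperparameters are illustrative assumptions, not from the video.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, X, y):
    # Mean-squared error of a linear model y = X @ w, plus its gradient wrt w.
    err = X @ w - y
    return np.mean(err ** 2), 2 * X.T @ err / len(y)

def maml_step(w, tasks, inner_lr=0.05, outer_lr=0.01):
    # One first-order MAML meta-update over a batch of tasks.
    meta_grad = np.zeros_like(w)
    for (X_s, y_s), (X_q, y_q) in tasks:
        # Inner loop: one gradient step on the support set from the shared init.
        _, g = loss_and_grad(w, X_s, y_s)
        w_task = w - inner_lr * g            # the per-task adapted parameters ("phi")
        # Outer loss: evaluate the adapted copy on the query set.
        _, g_q = loss_and_grad(w_task, X_q, y_q)
        meta_grad += g_q
    # Update the original (shared) parameters with the averaged meta-gradient.
    return w - outer_lr * meta_grad / len(tasks)

def make_task(a, n=20):
    # Hypothetical task family: regress y = a * x for a task-specific slope a.
    def draw():
        X = rng.normal(size=(n, 1))
        return X, a * X[:, 0]
    return draw(), draw()   # (support set, query set)

w = np.zeros(1)
tasks = [make_task(a) for a in (0.5, 1.5)]
for _ in range(200):
    w = maml_step(w, tasks)
```

Note how the query set is only ever fed to the adapted copy `w_task`, yet the update still lands on the shared `w`: in full MAML that flows through the chain rule of the inner step, while this first-order version simply applies the query-set gradient to the original parameters.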
Hi! Transfer learning is where a model trained on one task is applied to another (usually similar) task, while meta-learning is generally about learning things like which parts of the data are most valuable, or which approach generates the best predictions on a given dataset (using machine learning techniques).
Could you clarify what you mean by ensemble 'learning'? Do you mean training independent models with different seed initialisations and averaging their predictions?
@TwinEdProductions Yes, I thought about it more and read more about it, and I realized that MAML creates a generalized model that can later be used to learn more specific things, whereas in ensemble learning (of all kinds) we only train on the specific task.