To learn more about Lightning: github.com/PyTorchLightning/pytorch-lightning To learn more about Grid: www.grid.ai/ Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
Amazing! We actually need a video explaining the different normalization methods used in scRNA-seq analysis, especially SCTransform. I appreciate your support, thanks.
Hey Josh, thanks for the videos and the great content! I was wondering if you can make a video about causal inference, that would be great (for me lol), thanks.
Thank you for the great video! One question: What happens if there are multiple closest neighbors with the same distance? Then there will be multiple similarity scores = 1. Then changing sigma might not help to get close to log2(num_neighbors) for the sum of similarities.
I'm not sure what the technical details are exactly, but I would guess it simply finds the value for sigma that gets the sum closest to the ideal value. It doesn't have to be exact.
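A minimal sketch of that guess, assuming the similarity scores are exp(-(d - d_nearest) / sigma) and the target is log2(num_neighbors) as in the video. The names `similarity_sum` and `find_sigma` and the search bounds are made up for illustration and are not taken from the UMAP source code. A binary search gets as close to the target as it can; when tied nearest neighbors already push the sum above the target, sigma shrinks toward the lower bound and the sum stays above the target.

```python
import math

def similarity_sum(distances, sigma):
    """Sum of similarity scores exp(-(d - d_nearest) / sigma) over the neighbors."""
    rho = min(distances)  # distance to the closest neighbor
    return sum(math.exp(-max(0.0, d - rho) / sigma) for d in distances)

def find_sigma(distances, n_iter=64, lo=1e-6, hi=1e3):
    """Binary search for the sigma whose similarity sum is closest to log2(k).

    The bounds lo and hi are arbitrary assumptions for this sketch.
    """
    target = math.log2(len(distances))
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        if similarity_sum(distances, mid) > target:
            hi = mid  # sum too large, so shrink sigma
        else:
            lo = mid  # sum too small, so grow sigma
    return (lo + hi) / 2.0

# Distinct distances: the search can hit the target.
distances = [0.5, 1.0, 2.0]
sigma = find_sigma(distances)
print(sigma, similarity_sum(distances, sigma), math.log2(len(distances)))

# Three tied nearest neighbors: every score is 1, so the sum is 3 for any sigma
# and can never come down to log2(3).
tied = [1.0, 1.0, 1.0]
print(find_sigma(tied), similarity_sum(tied, find_sigma(tied)))
```

So with ties the search still returns a sigma, it just cannot make the sum match the ideal value exactly.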
Hi Josh! Would it be possible for you to make some videos on time series topics like serial correlation or the Box-Jenkins method? And thank you for all the videos you have made over the last few years. They are awesome :-)
Hey, Josh! Can you make video(s) about likelihood and MLE for an unknown distribution that can't be easily approximated, or that is impossible to approximate? Everyone talks about well-known distributions, but there is literally nothing about working with something unknown. Of course it would be better if you decided to make a playlist with everything about unknown distributions, but a couple of videos is also OK.
I'll talk about this topic when we cover Bayesian statistics. That said, even when the distribution is unknown, the central limit theorem ( ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-YAlJCEDH2uY.html ) tells us that the means of reasonably large samples follow an approximately normal (Gaussian) distribution. And that means that, when we are working with means, it doesn't matter what the original distribution is.
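Here is a small simulation of the central limit theorem point, as a sketch only. It assumes an exponential distribution (clearly not Gaussian) as the stand-in for the "unknown" distribution, and the sample size of 50 and the 5000 repetitions are arbitrary choices. The sample means cluster around the true mean of 1 with a spread close to 1/sqrt(50), which is what a normal approximation predicts.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed exponential distribution with mean 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

# Collect many sample means, each from a sample of 50 observations.
means = [sample_mean(50) for _ in range(5000)]

center = statistics.fmean(means)
spread = statistics.stdev(means)

print(center)                   # close to the true mean, 1
print(spread, 1 / math.sqrt(50))  # spread is close to sigma / sqrt(n)
```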
Hi Josh, could you please take some time to explain the Markov chain decision process, its application in ML, and how we code it in R? Thanks
Hello, could you cover the topic of Spectral Embedding? I also have a question regarding UMAP: how is it possible that UMAP is faster than t-SNE, and in which part does it beat t-SNE? Logically, t-SNE moves more points at a time compared to UMAP(?) Correct me if I'm wrong. And BTW, we are hoping for the new updates ❤
Hi Josh, I saw both of your videos on UMAP, but I have a doubt regarding how you made the low-dimensional graph. I know you mention spectral embedding, and what I know about that is that you calculate the Laplacian of the graph, get its eigenvalues and eigenvectors, and then the coordinates in the new dimension are the values of the eigenvector for the lowest eigenvalue (ignoring zero eigenvalues). But when I try that with your data, I am not able to get the values that you showed, so I wanted to know if you did something different. Also, for more than 1 dimension, I would use more than 1 eigenvector, right? For the 2D case, the x-y coordinates would be the values of the first and second eigenvectors with the lowest eigenvalues. Thanks
To be honest, I just drew the low-dimensional graph in a way that I thought would best highlight how UMAP works, rather than stay faithful to how spectral embedding would have projected the points. In other words, I completely ignored spectral embedding when I drew the low-dimensional graph and only took pedagogical aspects into consideration. I'm sorry if this caused confusion. :(
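For anyone who wants to try the procedure described in the question, here is a rough sketch. It assumes a connected graph, uses the unnormalized Laplacian L = D - W, and runs on a made-up 4-point path graph rather than the data from the video. The function name `spectral_embedding` is just for illustration. Libraries may use a normalized Laplacian instead, so hand-computed values can differ from what a package returns.

```python
import numpy as np

def spectral_embedding(W, n_components=2):
    """Embed graph nodes using eigenvectors of the unnormalized Laplacian.

    W is a symmetric matrix of edge weights for a connected graph (assumption).
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    L = np.diag(degree) - W               # graph Laplacian L = D - W
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    # Skip the first eigenvector (eigenvalue 0 for a connected graph) and keep
    # one eigenvector per output dimension.
    return eigvecs[:, 1:1 + n_components]

# Toy path graph: 0 - 1 - 2 - 3
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

print(spectral_embedding(W, n_components=1).ravel())  # 1-D coordinates
print(spectral_embedding(W, n_components=2))          # x-y coordinates
```

The sign of each eigenvector is arbitrary, so the coordinates may come out mirrored from one run or machine to another.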
If you look at the number line that the points are on, you'll see that point 'b' is at about 1.8 and point 'a' is at about 3.9. Now we just do the math: 3.9 - 1.8 = 2.1. Bam.
@@statquest thanks. In fact, that's the main reason why I am watching all your videos on likelihoods, the Gaussian distribution, etc. hahaha (they are great btw)
Hi Josh! Thank you for the info! It's really helpful. What should I do if I have zeros or NAs in my dataset? I couldn't find anything on imputation before UMAP on Google :(
Hi, thank you very much! Just one question. In your case, you could compute the initial distances between the data points as Euclidean distances because you are only working with two features. How are they computed when you have many more features? Do you always start with Euclidean distances?
The Euclidean distance works for more than 2 features ( en.wikipedia.org/wiki/Euclidean_distance ), so there's no problem adding more features. That said, if you wanted to use a different distance metric, it would probably be OK.
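To make that concrete, here is a short sketch of the Euclidean distance for any number of features. The helper name `euclidean_distance` and the example points are made up for illustration; `math.dist` is the built-in equivalent in Python 3.8 and later.

```python
import math

def euclidean_distance(p, q):
    """Square root of the sum of squared differences across all features."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two features
print(euclidean_distance([0, 0], [3, 4]))  # 5.0

# Four features: same formula, just more terms in the sum
p = [1.0, 0.0, 2.0, 5.0]
q = [0.0, 1.0, 2.0, 3.0]
print(euclidean_distance(p, q), math.dist(p, q))  # the two values agree
```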
@@statquest Thank you very much! I also realized there is a little error in the video, in the part where you say the result seems strange to you. To compute the Symmetrical Score, they don't use what you call the "Similarity scores". Instead, they iterate over the nearest neighbours y of x and compute the distance between x and y as the maximum of 0 and (the distance between x and y minus the distance from x to its nearest neighbour), all divided by the previously learnt sigma for x. Then they take exp of the negative of this distance to get a Similarity score between x and y (saved as the probability of this element in the fuzzy set). Once you have the whole fuzzy set, you apply the t-conorm you mention to these Similarity scores to get the Symmetrical score. I hope this is helpful to you, and I also hope I'm not wrong, hehe. Thank you very much!
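The steps in the comment above can be sketched like this. This is only an illustration of the description, not the actual UMAP code: the function names, the distances, and the sigma values are made up, and the t-conorm is assumed to be the probabilistic sum a + b - a*b.

```python
import math

def membership(dist, rho, sigma):
    """Score for a neighbour at distance dist.

    rho is the distance to the nearest neighbour and sigma is the learnt scale.
    """
    return math.exp(-max(0.0, dist - rho) / sigma)

def fuzzy_union(a, b):
    """Probabilistic t-conorm (assumed here): a + b - a*b."""
    return a + b - a * b

# Hypothetical numbers: x and y are 2.0 apart.
# From x's side, its nearest neighbour is 1.0 away and its sigma is 1.0.
score_x_to_y = membership(2.0, rho=1.0, sigma=1.0)
# From y's side, x is its nearest neighbour, so the score is exactly 1.
score_y_to_x = membership(2.0, rho=2.0, sigma=0.5)

print(score_x_to_y)                             # exp(-1), about 0.368
print(score_y_to_x)                             # 1.0
print(fuzzy_union(score_x_to_y, score_y_to_x))  # symmetrical score
```

Because the nearest neighbour always gets a score of 1, the two directed scores for a pair of points can be quite different, which is why a symmetrizing step is needed at all.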
Hi Josh, I was wondering how you feel about using some stills from your channel to explain these types of plots prior to displaying them? This would be done in an educational setting, and I would credit the channel and provide a link, if that is okay?