Welcome to DataMListic (formerly WhyML)! On this channel I explain various machine learning concepts that I encounter in my learning journey. Enjoy the ride! ;)
The best way to support the channel is to share the content. However, if you'd like to also support the channel financially, donating the price of a coffee is always warmly welcomed! (completely optional and voluntary) ► Patreon: www.patreon.com/datamlistic ► Bitcoin (BTC): 3C6Pkzyb5CjAUYrJxmpCaaNPVRgRVxxyTq ► Ethereum (ETH): 0x9Ac4eB94386C3e02b96599C05B7a8C71773c9281 ► Cardano (ADA): addr1v95rfxlslfzkvd8sr3exkh7st4qmgj4ywf5zcaxgqgdyunsj5juw5 ► Tether (USDT): 0xeC261d9b2EE4B6997a6a424067af165BAA4afE1a
Great video, thanks. Still struggling with the origin of the loss of that degree of freedom at the end. You said "if we have the sample mean, we don't need to know all 3 data points but actually only 2 because the 3rd can be estimated using the sample mean and the other 2 points." This makes sense: I agree that knowing the sample mean and 2 of the 3 points gives you the third. Where I'm struggling is how you can know that sample mean unless all three data points are known and free to vary. Put differently, how could you know the mean of a sample of three points if you only know two of them? Don't you need to know all three points in order to get the sample mean that you then use to say the third isn't necessary?
I think I've worked through it. So yes, you obviously need to know all values in your sample in order to calculate the sample mean. Where the N-1 comes in is when we then want to calculate the variance of that sample, whose mean you now know. We know from the proof you linked in your description that the sample variance calculated with N is *not* a good estimate of the population variance, since it's biased towards too small a value. Why is that? Variance of a sample makes no conceptual sense unless the mean of the sample is known and set (variance measures how much the elements in the sample differ from the mean). So to figure out the sample variance, the mean must be set, and once the mean is set, how many elements are free to vary while the sample mean stays the same? The answer is N-1, as the Nth is no longer free to vary if the mean is to remain as it is. So only N-1 elements actually contribute to the variance of a sample with a given sample mean. Calculating the sample variance with N-1 rather than N will therefore give a better estimate of the population variance (which would be calculated with N). It's a bit like hiding a ball under 1 of 3 cups: once you know the ball must be under one of the cups, you only need to check at most 2 cups to know which one it's under (because if it's neither cup A nor cup B, it must be cup C, no need to check it).
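To convince myself, I also ran a tiny NumPy simulation (my own, not from the video): averaging the divide-by-N estimate over many small samples lands below the true variance, while the divide-by-(N-1) estimate lands right on it.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0            # population variance (std = 2)
n, trials = 3, 100_000    # tiny samples make the bias obvious

biased, unbiased = [], []
for _ in range(trials):
    x = rng.normal(0.0, np.sqrt(true_var), size=n)
    m = x.mean()
    biased.append(((x - m) ** 2).sum() / n)          # divide by N
    unbiased.append(((x - m) ** 2).sum() / (n - 1))  # divide by N-1 (Bessel's correction)

print(np.mean(biased))    # ~2.67 = 4 * (3-1)/3, systematically too small
print(np.mean(unbiased))  # ~4.0, matches the population variance on average
```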
Great video. My understanding is that you would almost always use bagging, evaluate the results and, if good enough, stop there. However, you COULD go on to try various boosting methods to see if the model improves even more, but at what cost? If the best boosted model (AdaBoost, XGBoost, etc.) performed 1% better but took 3x longer to compute, then boosting the already-bagged models might not be worth it, right? Still trying to cement in my mind the process flow from a developer standpoint 😉
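Here's roughly how I'd sanity-check that trade-off myself (a rough sketch; the dataset, models and hyperparameters are just placeholders): fit a bagged and a boosted model, then compare cross-validated accuracy against wall-clock time.

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

for name, model in [("bagging", BaggingClassifier(n_estimators=100, random_state=0)),
                    ("boosting", GradientBoostingClassifier(n_estimators=100, random_state=0))]:
    start = time.perf_counter()
    score = cross_val_score(model, X, y, cv=5).mean()   # accuracy averaged over 5 folds
    elapsed = time.perf_counter() - start
    print(f"{name:8s} accuracy={score:.3f} time={elapsed:.1f}s")
```

If the accuracy gap is tiny but the time gap is large, that pretty much answers the question for that dataset.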
Interesting video, I hadn't heard about Google's DataGemma before, really a fascinating concept, thanks! I would also like to add that the first part of the video could have been more dynamic: I felt like, up to and including DataGemma, the images were too "still" and didn't do much to simplify/visualize the concepts. When talking about o1 reasoning, for example, a showcase of its chain-of-thought output could have been helpful. From then on I liked the pace and the information provided, so good job! Waiting for next week's one!
So how does it do open-set object detection exactly? By making the cross-modality features lie close to the textual features in the embedding space using a contrastive loss, it automatically learns to decode the correct bounding boxes for visual objects it hasn't been trained on as well? Seems like magic.
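My rough mental model of the alignment part (just a toy sketch of the general idea, not the actual model) is a CLIP-style contrastive loss between region features and phrase embeddings; once the two spaces are aligned, a new category can be matched at inference simply by embedding its text prompt.

```python
import torch
import torch.nn.functional as F

def region_text_contrastive_loss(region_feats, text_feats, temperature=0.07):
    """region_feats: (N, d) features of N image regions/queries.
    text_feats:   (N, d) embeddings of each region's ground-truth phrase."""
    region_feats = F.normalize(region_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = region_feats @ text_feats.t() / temperature   # (N, N) cosine similarities
    targets = torch.arange(region_feats.size(0))           # i-th region matches i-th phrase
    return F.cross_entropy(logits, targets)

def classify_regions(region_feats, prompt_feats):
    """At inference, score each predicted box against embeddings of arbitrary text prompts,
    so categories never seen during training can still be matched by name."""
    sims = F.normalize(region_feats, dim=-1) @ F.normalize(prompt_feats, dim=-1).t()
    return sims.argmax(dim=-1)   # index of the best-matching prompt for each region
```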
Those are gold! Thank you so much for this wonderful effort! A question out of pure curiosity: How long did it take you to attain such a level of knowledge?? I'm learning on my own at my own pace, revising things that I may come across, and it's just an endless pool of knowledge. Yet... you seem to already know most of these and are even able to teach them in a very intuitive way!
Thanks! Glad you liked the explanation! Regarding your question, I am 1000% not the most knowledgeable person in the ML space, I know many, many people that know more than I do. However, what I can say is that if you study any field long enough, you encounter certain concepts (like the covariance matrix) again and again in different scenarios, and you tend to get a deeper understanding of how it works. Then, it gets easier to explain it.
Thanks for the explanation, there is one thing I hope you can elaborate on. With the weight matrix randomised, why is it easier for the NN to learn the zero matrix compared to the identity matrix?
Good question! IMO mainly because you also usually use weight regularization (e.g. L2) in your final loss, so the NN can easily shrink the weight matrix's values to 0, whereas the identity matrix has to be maintained against that penalty.
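A tiny sketch of what I mean (purely illustrative): with weight decay and no task gradient, every entry of the weight matrix keeps shrinking towards 0, so the zero matrix is the "default" the optimizer drifts to, while the identity matrix would have to be actively held in place.

```python
import torch

W = torch.randn(4, 4)            # randomly initialized weight matrix
lr, weight_decay = 0.1, 0.1
for _ in range(1000):
    # gradient of the L2 penalty 0.5 * weight_decay * ||W||^2 is weight_decay * W,
    # so the decay term alone keeps pulling every entry towards 0
    W = W - lr * weight_decay * W
print(W.abs().max())             # ~0: the zero matrix comes "for free",
                                 # the identity's diagonal would have to resist this pull
```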
Good question! Of course there's always a chance of getting stuck in a local optimum, because you are basically using a greedy algorithm here, and that's why you usually run the search multiple times, to reduce the chance of that happening.
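A toy illustration of the restart trick (not the exact algorithm from the video): run the greedy search from several random starting points and keep the best result.

```python
import math
import random

def hill_climb(start, score, neighbors):
    """Greedy local search: move to the best improving neighbor until none is left."""
    current = start
    while True:
        better = [n for n in neighbors(current) if score(n) > score(current)]
        if not better:
            return current            # local optimum reached
        current = max(better, key=score)

def hill_climb_with_restarts(random_start, score, neighbors, n_restarts=20):
    """Repeat the greedy search from random starts and keep the best local optimum found."""
    runs = [hill_climb(random_start(), score, neighbors) for _ in range(n_restarts)]
    return max(runs, key=score)

# Toy objective with many local maxima: a single run can easily get stuck,
# but the best of 20 restarts is very likely to end up near the global maximum.
score = lambda x: math.sin(3 * x) - 0.1 * (x - 2) ** 2
neighbors = lambda x: [x - 0.05, x + 0.05]
best = hill_climb_with_restarts(lambda: random.uniform(-10, 10), score, neighbors)
print(best, score(best))
```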
I've checked the scripts I used to generate the plots for this video. It seems that in the plot at 2:51 I forgot to normalize the amplitude (i.e. divide by half of the max frequency). Sorry for the confusion this may have caused!
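For reference, the usual single-sided amplitude normalization divides the FFT magnitude by half the number of samples; here's a generic NumPy sketch of it (not the exact script used for the video):

```python
import numpy as np

fs, N = 1000, 1000                     # sampling rate (Hz) and number of samples
t = np.arange(N) / fs
x = 3.0 * np.sin(2 * np.pi * 50 * t)   # 50 Hz sine with amplitude 3

X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(N, d=1 / fs)
amplitude = 2 * np.abs(X) / N          # i.e. divide by N/2
print(freqs[np.argmax(amplitude)], amplitude.max())   # ~50.0 Hz, amplitude ~3.0
```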
If you're interested in learning more about AI, you can check out the following reading list: ru-vid.com/group/PL8hTotro6aVGtPgLJ_TMKe8C8MDhHBZ4W&si=u9Gk38MaQ7VLH3lf