Prince Canuma
Founder of Kulissiwa.com | Ex-ML Engineer at Neptune.ai.
In this channel, you'll learn about:
- MLOps (Machine Learning Operations),
- LLMs (Large Language Models),
- RAG (Retrieval-Augmented Generation) applications
Want AI project support? My in-house team offers customized solutions. Book a call below.
Comments
@haaanhson 19 days ago
Hi bro, I understand that during inference we have seq_len = 1 and use the KV cache, so no causal mask is needed. I wonder why, during training with seq_len > 1, there would be no causal attention mask; that could leak information from the words that follow. Could you explain that?
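For context, training normally does apply a causal mask over the whole sequence; a minimal PyTorch sketch of the idea (illustrative names, not the video's code):

```python
import torch

seq_len = 4
scores = torch.randn(seq_len, seq_len)  # raw attention scores for one head

# Boolean upper-triangular mask: True marks future positions to block.
causal_mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal_mask, float("-inf"))

# After softmax, each row attends only to itself and earlier positions.
weights = torch.softmax(scores, dim=-1)
print(weights)
```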
@aliasad7086 1 month ago
Great content. Really appreciate the step-by-step execution. Makes it a lot easier to understand.
@jacksmith-ih9rm 1 month ago
Great!
@tharunbhaskar6795 2 months ago
Now how do you scale it? Like, how do you run it on multiple GPUs, or on multiple nodes with multiple GPUs each?
@princecanuma 2 months ago
You can use Hugging Face TRL, transformers, accelerate, or Axolotl for that.
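For example, a minimal sketch of the accelerate training-loop pattern (the toy model and data below are stand-ins, not the video's code); launching it with `accelerate launch train.py` distributes the loop across the GPUs in your accelerate config:

```python
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for your model, optimizer, and dataset.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
data = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
dataloader = DataLoader(data, batch_size=8)

accelerator = Accelerator()
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, labels in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)  # handles gradient sync across devices
    optimizer.step()
```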
@skanderbegvictor6487 2 months ago
Subscribed, been following you on Twitter. I am currently trying to write custom kernels for graph machine learning in MLX and am stuck.
@princecanuma 2 months ago
Great to hear 👌🏽 keep up the good work
@vinaypandya7054 16 days ago
@princecanuma I was able to contribute to mlx_graphs because of this. I also created mlx-cluster for faster random walk generation. I will keep working on it, thank you for inspiring us!
@AZisk 2 months ago
Good sound, good video. Looking forward to seeing it in action.
@princecanuma 2 months ago
Thank you very much! ❤️ I'm using a Synco A2 mic.
@MaziyarPanahi 2 months ago
A complete walkthrough! Thank you, king!
@princecanuma 2 months ago
My pleasure @MaziyarPanahi
@gokayfem 2 months ago
Let's go, king!!
@princecanuma 2 months ago
Let’s go 🚀
@Create-The-Imaginable 3 months ago
What did you say in the beginning? 😃 What language was that? 🤔
@princecanuma 3 months ago
I said “hi, my name is Prince Canuma” in Polish 🇵🇱
@liyanan2004 3 months ago
Could you please make a tutorial on VLMs and how they work, from scratch, like this series of videos?
@princecanuma 3 months ago
That’s a great idea! 💡 Will do 👌🏽
@Tuscani2005GT 3 months ago
This channel is pure gold. Keep it up!
@princecanuma 3 months ago
Thank you very much! Glad you enjoy it :) There is a lot more coming soon 🚀
@sharjeel_mazhar 4 months ago
So in this series, you don't use any pre-trained weights? You build and train the model from scratch on a custom dataset?
@marinepower 4 months ago
Removing every other layer or something along those lines would be much more effective. If you think about it, this just means that one layer needs to do the work of two layers (one layer + one missing layer). Whereas if you just lop off half the network you suddenly need to learn 16 layers worth of processing in one fell swoop. And not only that, but your old layers need to be retrained since it is no longer sufficient for them to just do their one layer of work they were doing before. Basically, removing every other layer is a finetune, lopping off half the network is a cataclysmic change that (almost) requires training a brand new model from scratch.
@marinepower 4 months ago
The only thing that saves this technique is using the learned embeddings / the learned output layer, but you get that with strided layer removal too. Wish I had seen this video earlier, I'd have saved you $500 lol.
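To illustrate the difference being described, a hedged sketch with a hypothetical 8-block stack (the layer type is just a stand-in for decoder blocks):

```python
import torch.nn as nn

# Hypothetical 8-block stack standing in for a decoder-only LLM.
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True) for _ in range(8)]
)

# Strided removal: keep blocks 0, 2, 4, 6; each survivor absorbs one neighbour's work.
strided = nn.ModuleList(list(layers)[::2])

# "Lopping off" the top half: a contiguous chunk of processing disappears at once.
lopped = nn.ModuleList(list(layers)[:4])

print(len(strided), len(lopped))  # 4 4
```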
@wilfredomartel7781 4 months ago
😊
@wilfredomartel7781 4 months ago
😊🎉
@RadRebel4 4 months ago
Amazing video! Could you please upload the training scripts as well?
@princecanuma 2 months ago
They are available in the video description. It’s an Axolotl config file
@fliptip 4 months ago
Such a high-quality piece of content!
@princecanuma 2 months ago
Thank you very much!
@sharjeel_mazhar 4 months ago
Can you please make sure that your future videos have higher resolution? Maybe 1440p or above? Other than that, great job! 💯
@linz4213 4 months ago
Well made, Prince! Learned a lot.
@maslaxali8826 4 months ago
CS programmers are vampires. My eeeeyyyes. Great content though.
@sergey_a 4 months ago
Why are there only 3 likes? I put 4 on HF :)
@spkgyk 4 months ago
Why do you use a 32-bit paged optimizer when the model is being fine-tuned with QLoRA? Surely QLoRA stores the weights in 8-bit double-quantized form, so using a 32-bit optimizer makes no difference, and the weight updates need to be converted back to 8-bit anyway? Please help me understand this.
@princecanuma 4 months ago
Additionally, 8-bit optimizer states are dequantized to 32-bit for the update anyway: huggingface.co/docs/bitsandbytes/main/en/explanations/optimizers
@spkgyk 4 months ago
@princecanuma Thank you for the quick response. With 8-bit optimizers, large models can be fine-tuned with 75% less GPU memory without losing any accuracy compared to training with standard 32-bit optimizers. The reduced memory requirements mean 8-bit optimizers are 4x faster than a standard optimizer, and no hyperparameter tuning is required. Surely this means that using 32-bit just wastes compute? Please correct me if I'm wrong, I'm really trying to understand the benefits. Is it because training with 32-bit means that, despite converting to 8-bit for the weight update, the conversion leads to small accuracy gains?
@princecanuma 4 months ago
There are no accuracy gains, only reduced GPU usage and potentially some extra speed. In terms of speed, I personally didn't notice any changes; I tested it yesterday and, besides reduced GPU usage, it took just as long as 32-bit to complete training.
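For reference, swapping between the two in bitsandbytes is a one-line change; a minimal sketch, assuming a recent bitsandbytes, with a toy model standing in for the trainable QLoRA parameters:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(128, 128)  # stand-in for the LoRA adapter weights

# 32-bit paged optimizer vs. the 8-bit variant discussed above.
optim_32bit = bnb.optim.PagedAdamW32bit(model.parameters(), lr=2e-4)
optim_8bit = bnb.optim.PagedAdamW8bit(model.parameters(), lr=2e-4)
```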
@PaoloTshiyole 4 months ago
Your English is nice
@princecanuma 4 months ago
Thank you very much!
@leiray7465 4 months ago
cool
@princecanuma 4 months ago
Awesome, I’m happy you liked it :)
@kishoretvk 4 months ago
Thanks for committing to open source and educating people on cutting-edge knowledge.
@princecanuma 4 months ago
Most welcome, it’s my pleasure!
@yoanijosias 4 months ago
Very good, can’t wait to see updates to it.
@princecanuma 4 months ago
You and me both!
@vivekpadman5248 4 months ago
Bro, how did you train Llama 3 without the paper?
@princecanuma 4 months ago
Could you elaborate?
@vivekpadman5248 4 months ago
@princecanuma As far as I know, there hasn't been an official Llama 3 paper released, and no info on the data either. But I could be wrong... 😅
@princecanuma 4 months ago
@vivekpadman5248 True, they only released a blog post detailing the data, model architecture, and performance. Here is how I did it: Llama-3 has the exact same architecture as Llama-2, which we already covered in this channel: ru-vid.com/group/PLDn_JsyofyfQp4td_ub6LfIg5vxyu6YJK&si=0Gyt9mdaA-ydiWOA Finally, if you understand how these models work, you don't need the paper; the code implementation is more than enough.
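One hedged way to check the claim yourself (the repo ids are examples, and both models are gated on the Hub, so this assumes you have access):

```python
from transformers import AutoConfig

cfg2 = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
cfg3 = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Both resolve to the same decoder-only class; the main deltas are
# vocabulary size, RoPE theta, and grouped-query attention heads.
print(cfg2.architectures, cfg3.architectures)  # ['LlamaForCausalLM'] ['LlamaForCausalLM']
print(cfg2.vocab_size, cfg3.vocab_size)
```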
@vivekpadman5248 4 months ago
@princecanuma Oh, understood. Thanks, I'll check it out, and your video too 💙
@princecanuma 4 months ago
Most welcome :)
@ngamcode2485 5 months ago
This is very impressive and great content. Thank you!
@princecanuma 4 months ago
You're very welcome!
@jihoonjung2776 5 months ago
Best video I've ever seen. Thanks!!!
@princecanuma 5 months ago
Most welcome!
@princecanuma 5 months ago
It’s my pleasure
@sheikhakbar2067 5 months ago
Command-R is one of the best models out there for non-English / non-European languages. I tried it in Arabic and it's almost perfect, not as good as Claude (which is also perfect for Arabic). But as far as I understand, Command-R from Cohere (the community version, I guess) is free. Is that true? (I know Command-R-Plus is not free.)
@kishoretvk 5 months ago
Super impressive, great value! One question: how do I further train the model on my custom content instead of using LoRA? Can we do further full training and add new memory?
@princecanuma 5 months ago
Most welcome! You can do that, but it can be very expensive.
@AC-go1tp 5 months ago
This is a very thoughtful and great initiative! Researchers with enough gray matter but limited means can still be in the game. Thank you, PC 🙏!
@princecanuma 5 months ago
Most welcome! It's my pleasure :) I lived through this so others don't have to.
@ojasvisingh786 6 months ago
🥳🤩👏💐
@philgoddard8606 6 months ago
Thank you for the really nice entry into using Gemma locally! Could you share how to utilize the GPU on a Mac? I just got a Mac Studio and saw you had referenced some code earlier for NVIDIA. Thanks in advance :)
@princecanuma 6 months ago
Most welcome! You can use MLX: github.com/ml-explore/mlx-examples/tree/main/llms
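A minimal sketch with the mlx-lm package from that repo (the model id is an example from the mlx-community Hub):

```python
# pip install mlx-lm  -- runs on the Mac GPU via Metal, Apple silicon only
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-2b-it-4bit")  # example model id
response = generate(model, tokenizer, prompt="Hello, Gemma!", verbose=True)
```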
@sayantan336 6 months ago
Great work 🎉. It would be great if you could make a tutorial on coding GPT and BERT from scratch as well, using only PyTorch, and then show how to do their pre-training on custom data.
@princecanuma 6 months ago
Thank you very much! Llama is pretty close to GPT, so I think BERT is more differentiated. What kind of data would you suggest?
@morningstar3996 6 months ago
Can we have the presentation please?
@princecanuma 6 months ago
Sure, here you go! www.canva.com/design/DAF7MlJ2Zoc/f75ryYIZnLc80NlIFZhS5A/edit?DAF7MlJ2Zoc&
@morningstar3996 6 months ago
@@princecanuma Appreciate it my friend
@girijeshthodupunuri1300 6 months ago
Great video! Learnt a lot.
@princecanuma 6 months ago
Thank you very much! I’m happy you liked it :) There is so much more on the way.
@girijeshthodupunuri1300 6 months ago
@princecanuma Could you go over how to implement a Parent Document retriever?
@princecanuma 6 months ago
@user-vd7im8gc2w Why do you need position ids? You use them to map the input ids to their respective positions in the sequence. Example: input_ids = [100, 20, 4, 50] position_ids = torch.arange(input_ids.shape…) print(position_ids) >> [0, 1, 2, 3]
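A runnable version of that example (the shape index elided above is filled in here as an assumption):

```python
import torch

input_ids = torch.tensor([100, 20, 4, 50])
position_ids = torch.arange(input_ids.shape[0])  # one position per token
print(position_ids)  # tensor([0, 1, 2, 3])
```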
@Frost-Head 7 months ago
Keep up the good work
@princecanuma 7 months ago
Thank you!
@sayantan336 6 months ago
Brilliant 🎉
@princecanuma 6 months ago
Thanks!
@Bebetter11111 7 months ago
First time watching your video. Keep going bro 💪, it's your friend Afzal.
@princecanuma 7 months ago
Thank you very much brother! It's been a while, my friend :)
@RemekKinas 7 months ago
Really great job!
@princecanuma 7 months ago
Thank you very much, Remek! I’m happy you liked it :)
@dossantos4415 7 months ago
Hey, please continue with the Coding Llama 2 from scratch series.
@princecanuma 7 months ago
Hey, thanks for watching and pinging me about part 3. Don't worry, Coding Llama 2 from scratch part 3 should be up soon, potentially tomorrow :) The video has been recorded; however, it was delayed due to my first-ever graduation, which took place today, a very important moment for me. 👨🏾‍🎓
@tharunbhaskar6795 7 months ago
Waiting for the training part.
@princecanuma 7 months ago
Working on it 👌🏽 The video should be out this week.
@banbephanboi4708 7 months ago
Great work! Waiting for your next videos.
@princecanuma 7 months ago
Thank you very much! New videos dropping soon.
@CarlosAntunesZ 7 months ago
Amazing video 🖖🏽
@princecanuma 7 months ago
Thank you very much! I’m happy you enjoy it :)
@shihab-soft 7 months ago
Thank you very much, this was very useful.
@princecanuma 7 months ago
Most welcome :)
@illia_user 7 months ago
Great job! Thank you!
@princecanuma 7 months ago
Hi, thank you very much!
@buddhu9364 7 months ago
Is there a way I could go about doing the same thing on Windows and with Gemma?
@princecanuma 7 months ago
Hi, thanks for watching! Yes, there is and I will cover it in a future video soon. 👌🏽
@NitoKuvell 7 months ago
Congratulations, Prince, it's a source of pride to see what you've become in the world of technology. Onward!
@princecanuma 7 months ago
Thank you very much brother! It means a lot coming from you :) Long time no see, let’s catch up.
@steliomoiane494 8 months ago
Wow, amazing Prince, thanks for sharing this very useful content.
@princecanuma 8 months ago
Most welcome :) Thank you for watching, Stelio!