This is amazing. Thanks for covering it in such detail. Can you also cover how to evaluate a custom dataset, and what kind of data to add to your dataset to improve the performance of models?
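For the evaluation part, a minimal sketch using pycocotools, assuming your annotations and predictions follow the standard COCO format (the file names below are placeholders, not from the video):

```python
# Minimal COCO-style evaluation sketch; requires pycocotools and predictions
# exported in the standard COCO results format. File names are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/custom_val.json")   # ground-truth annotations
coco_dt = coco_gt.loadRes("predictions.json")   # model predictions

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP/AR at the usual IoU thresholds and scales
```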
Followed the whole video and learned more than in my whole semester at university, since it is 1. completely practical, which makes it easier to follow up with the theoretical part in the paper, and 2. diving so deep that it dispels all questions that may arise during your explanation, thanks to your illustrations in OneNote. Thank you very much for this gem, and I hope to watch more of these in the future!
Thanks, I enjoyed it a lot. I watched your video while going through the repository myself, completing your explanations to understand it fully. I really value it when you go into detail, and having my thoughts constantly validated by your explanation was great. In sum, the more you go into detail, the more you help me! Thanks a lot again.
Watched the entire video and followed along with the code on my side. I really like the way you use the VS Code debugger to constantly check shapes and values. I found this method of understanding very helpful. Thanks a lot for preparing this long yet informative video. Keep it up!
Hey! I want to let you know that I passionately love your coding series! I went through the DETR paper myself recently, and some concepts still stayed vague. After watching this one, they became clearer to me, especially the bipartite matching loss :D Now I get why they apply the Hungarian algorithm in the calculation of that loss, which is a brilliant idea! Thank you so much!! Keep going 💪
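For anyone who wants to see the matching step concretely, here is a toy sketch (not the exact DETR matcher, which also adds a GIoU term to the cost):

```python
# Toy illustration of bipartite matching: build a queries-x-targets cost
# matrix, then let the Hungarian algorithm pick the one-to-one assignment
# with minimal total cost.
import torch
from scipy.optimize import linear_sum_assignment

num_queries, num_targets = 5, 2
prob = torch.rand(num_queries, num_targets)   # each query's prob for each target's class
pred_boxes = torch.rand(num_queries, 4)
tgt_boxes = torch.rand(num_targets, 4)

# Simplified cost: -probability + L1 box distance (DETR also adds GIoU)
cost = -prob + torch.cdist(pred_boxes, tgt_boxes, p=1)

query_idx, target_idx = linear_sum_assignment(cost.numpy())
print(list(zip(query_idx, target_idx)))  # each target matched to exactly one query
```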
This video is gold. I've watched it many times, because every time I learn something new. Sometimes I understand the overview of a paper, but then, when I want to try it myself, the implementation is hard to follow; with this video you helped me a lot 🎉
Thank you so much for the coding series. I'm not from a CS background; I got interested in AI, but I don't just want to import a model, I like to go through the original code to understand it better. Often it becomes too obfuscated (like in DeBERTa), and I'm glad you're doing something like this.
Wow! I watched to the end and learned a lot. A few remarks:
* I think it would be better to break it into a series.
* The first hour was very clear, but toward the end it became a bit vague.
* I downloaded the mini COCO and modified the download script as you suggested, but needed ChatGPT's help to generate the appropriate annotation file.
Thanks 💫
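On the annotation file: in case it saves someone else a trip to ChatGPT, here is a rough sketch of the minimal COCO structure that the standard dataset loaders expect; all IDs and values below are placeholders for illustration.

```python
# Minimal COCO-format annotation file; every field value here is a placeholder.
import json

coco = {
    "images": [
        {"id": 1, "file_name": "000000000139.jpg", "height": 426, "width": 640},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 62,
            "bbox": [236.98, 142.51, 24.7, 69.5],  # [x, y, width, height] in pixels
            "area": 1716.65,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 62, "name": "chair", "supercategory": "furniture"},
    ],
}

with open("mini_coco_val.json", "w") as f:
    json.dump(coco, f)
```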
Your video was well-explained and easy to follow, making it accessible for someone like me who is new to this algorithm. I truly appreciate the effort you put into creating such valuable content.
Thank you so much for the video. I appreciate your efforts immensely! I have been interested in DETR for some time and am finally encouraged to adapt it to my use case in medical images thanks to your video ;)
Very detailed video about a very important architecture. You explained every detail in a clear way. Thank you very much, you deserve many more subscribers!
Watched the full video! Learned a lot. Really like the VS Code walkthrough. It helps a lot when we see the shape of the tensors and how they change throughout the model. The explanations with the drawn diagrams are excellent as well. I am a fan of this series :). Thanks for your work and time.
Awesome content. Can you please make a code walkthrough video for NLP tasks like text generation, question answering, etc.? Your way of explaining papers is really unique and interesting.
Thank you for creating such great content. I was wondering whether vision transformers can be used for training on satellite images with more than three channels?
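One common way to handle extra spectral bands is to widen the patch-embedding convolution to the new channel count; a sketch using timm's `in_chans` argument (note that pretrained RGB patch-embedding weights don't directly carry over to the extra channels):

```python
# ViT on a 5-channel input (e.g. RGB + two extra spectral bands).
# timm lets you change the patch-embedding input channels via in_chans.
import timm
import torch

model = timm.create_model("vit_base_patch16_224", pretrained=False, in_chans=5)
x = torch.randn(1, 5, 224, 224)  # (batch, channels, height, width)
out = model(x)
print(out.shape)  # torch.Size([1, 1000])
```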
Hey, in your DETR demo notebook it says that the bounding boxes are normalized to the interval [0, 1] with a sigmoid. When I use the DETR from GitHub and look at the model, there is no sigmoid, but the predicted bounding boxes are still in the interval [0, 1]. Do you know how this is achieved?
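If it helps: in the official facebookresearch/detr repo, the sigmoid is applied functionally inside `DETR.forward` (`outputs_coord = self.bbox_embed(hs).sigmoid()` in models/detr.py) rather than as an `nn.Module`, so it never shows up when you print the module tree. Minimal illustration of the idea:

```python
# The sigmoid is a functional call on the box head's output, not a layer,
# so printing the model won't reveal it. Toy stand-in below.
import torch
import torch.nn as nn

bbox_embed = nn.Linear(256, 4)     # stand-in for DETR's 3-layer box MLP
hs = torch.randn(2, 100, 256)      # (batch, num_queries, hidden_dim)
boxes = bbox_embed(hs).sigmoid()   # normalized (cx, cy, w, h), all in [0, 1]
print(boxes.min().item() >= 0, boxes.max().item() <= 1)  # True True
```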
A question regarding the DETR bipartite matching loss. If I got it right, the loss matches every target to exactly one query. Therefore, if there are 10 targets, the loss guides the model to match 10 queries to the target classes and 290 (out of 300) to the no-object class. Is this correct?
If so, it might happen that a query that was not matched to a target, but was very close to matching it, is penalized by the loss (penalized for not predicting the no-object class). This might confuse the model, especially if in the next epoch the matched query and the unmatched query are flipped. If I understand correctly, the penalty for not predicting the no-object class is low, but such a query still shouldn't be penalized.
I think it would be better to include in the loss all the queries that closely match a target, so the loss could use them to improve the model. This would cause the model to produce multiple predictions per target, but at inference time we could keep only the prediction with the best score and ignore the rest (or use NMS).
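For context, a rough sketch of how the classification target is built after matching. The "low penalty" mentioned above comes from the official repo down-weighting the no-object class with `eos_coef = 0.1`, so near-miss queries are only mildly penalized:

```python
# Rough sketch of DETR's classification target after Hungarian matching:
# matched queries get their target's class, all other queries get the
# no-object class, whose loss weight is scaled down by eos_coef = 0.1.
import torch
import torch.nn.functional as F

num_queries, num_classes = 300, 91       # object classes + one no-object slot
no_object = num_classes                  # index of the no-object class

pred_logits = torch.randn(num_queries, num_classes + 1)
matched_query_idx = torch.tensor([3, 17, 42])   # from the Hungarian matcher
matched_tgt_class = torch.tensor([1, 62, 18])   # classes of the matched targets

target_classes = torch.full((num_queries,), no_object)
target_classes[matched_query_idx] = matched_tgt_class   # 3 matched, 297 no-object

weight = torch.ones(num_classes + 1)
weight[no_object] = 0.1                  # eos_coef: softens the no-object penalty
loss = F.cross_entropy(pred_logits, target_classes, weight=weight)
```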
Thank you for this amazing tutorial video, it really helps me a lot! But I'm still confused about the DETR forward pass at 12:05: when feeding into the transformer, why is there a 0.1 multiplication on the image features? What does it mean?
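For reference, the relevant line in the simplified demo model looks roughly like the sketch below. The 0.1 appears to be a hand-tuned factor that balances the magnitude of the CNN features against the positional encoding before the two are summed (the full DETR instead adds positional encodings inside the attention layers and has no such constant):

```python
# Shape-level sketch of the demo model's encoder input construction.
import torch

h = torch.randn(1, 256, 25, 34)                  # CNN feature map (B, C, H, W)
pos = torch.randn(25 * 34, 1, 256)               # positional encoding per location
src = pos + 0.1 * h.flatten(2).permute(2, 0, 1)  # (H*W, B, C) sequence for the encoder
print(src.shape)  # torch.Size([850, 1, 256])
```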
Can you please code TESTR (CVPR 2022), which is a modified DETR? It would be really helpful. Also, can you tell me where to modify DETR to make it a polygon detection model rather than a rectangular bounding-box detection model?
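Not from the video, but a hypothetical starting point for the polygon question: DETR's box head is a small MLP emitting 4 numbers per query, and one could widen it to 2K sigmoid-normalized coordinates for K polygon vertices (the L1/GIoU box losses would also need replacing with a polygon-appropriate loss, which is roughly what control-point heads in TESTR-style models do):

```python
# Hypothetical polygon head: widen the per-query output from 4 box numbers
# to 2*K normalized (x, y) coordinates. num_points = 16 is a design choice.
import torch
import torch.nn as nn

num_points, hidden_dim = 16, 256

poly_embed = nn.Sequential(            # stand-in for DETR's 3-layer MLP head
    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
    nn.Linear(hidden_dim, 2 * num_points),
)

hs = torch.randn(2, 100, hidden_dim)             # decoder output (B, queries, C)
polygons = poly_embed(hs).sigmoid()              # normalized coords in [0, 1]
polygons = polygons.view(2, 100, num_points, 2)  # (B, queries, K, 2) vertices
print(polygons.shape)
```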