This is amazing. Thanks for covering it in such detail. Can you also cover how to evaluate a custom dataset, and what kind of data to add to your dataset to improve the performance of models?
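For the evaluation part, a minimal sketch using pycocotools, assuming your annotations and predictions follow the standard COCO format (the file names below are placeholders, not from the video):

```python
# Minimal COCO-style evaluation sketch; requires pycocotools and predictions
# exported in the standard COCO results format. File names are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/custom_val.json")   # ground-truth annotations
coco_dt = coco_gt.loadRes("predictions.json")   # model predictions

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP/AR at the usual IoU thresholds and scales
```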
Followed the whole video and learned more than in my whole semester at university, since it is 1. completely practical, which makes it easier to follow up with the theoretical part in the paper, and 2. diving so deep that it dispels all questions that may arise during your explanation, thanks to your illustrations in OneNote. Thank you very much for this gem, and I hope to watch more of these in the future!
Thanks, I enjoyed it a lot. I watched your video while going through the repository myself, completing your explanations to understand it fully. I really value it when you go into detail, and having my thoughts constantly validated by your explanation was great. In sum, the more you go into detail, the more you help me! Thanks a lot again.
Watched the entire video and followed along with the code on my side. I really like the way you use the VS Code debugger to constantly check shapes and values. I found this method of understanding very helpful. Thanks a lot for preparing this long yet informative video. Keep it up!
Hey! I want to let you know that I passionately love your coding series! I went through the DETR paper myself recently, and some concepts still stayed vague. After watching this one, they became clearer to me, especially the bipartite matching loss :D Now I get why they apply the Hungarian algorithm in the calculation of that loss, which is a brilliant idea! Thank you so much!! Keep going 💪
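For anyone who wants to see the matching step concretely, here is a toy sketch (not the exact DETR matcher, which also adds a GIoU term to the cost):

```python
# Toy illustration of bipartite matching: build a queries-x-targets cost
# matrix, then let the Hungarian algorithm pick the one-to-one assignment
# with minimal total cost.
import torch
from scipy.optimize import linear_sum_assignment

num_queries, num_targets = 5, 2
prob = torch.rand(num_queries, num_targets)   # each query's prob for each target's class
pred_boxes = torch.rand(num_queries, 4)
tgt_boxes = torch.rand(num_targets, 4)

# Simplified cost: -probability + L1 box distance (DETR also adds GIoU)
cost = -prob + torch.cdist(pred_boxes, tgt_boxes, p=1)

query_idx, target_idx = linear_sum_assignment(cost.numpy())
print(list(zip(query_idx, target_idx)))  # each target matched to exactly one query
```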
This video is gold. I've watched it many times, because every time I learn something new. Sometimes I understand the overview of a paper, but then, when I want to try it myself, the implementation is hard to follow; with this video you helped me a lot 🎉
Thank you so much for the coding series. I'm not from a CS background; I got interested in AI, but I don't just want to import a model, I like to go through the original code to understand it better. Often it becomes too obfuscated (like in DeBERTa), and I'm glad you're doing something like this.
Wow! I watched to the end and learned a lot. A few remarks:
* I think it would be better to break it into a series.
* The first hour was very clear, but toward the end it became a bit vague.
* I downloaded the mini COCO and modified the download script as you suggested, but needed ChatGPT's help to generate the appropriate annotation file.
Thanks 💫
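On the annotation file: in case it saves someone else a trip to ChatGPT, here is a rough sketch of the minimal COCO structure that the standard dataset loaders expect; all IDs and values below are placeholders for illustration.

```python
# Minimal COCO-format annotation file; every field value here is a placeholder.
import json

coco = {
    "images": [
        {"id": 1, "file_name": "000000000139.jpg", "height": 426, "width": 640},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 62,
            "bbox": [236.98, 142.51, 24.7, 69.5],  # [x, y, width, height] in pixels
            "area": 1716.65,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 62, "name": "chair", "supercategory": "furniture"},
    ],
}

with open("mini_coco_val.json", "w") as f:
    json.dump(coco, f)
```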
Your video was well-explained and easy to follow, making it accessible for someone like me who is new to this algorithm. I truly appreciate the effort you put into creating such valuable content.
Thank you so much for the video. I appreciate your efforts immensely! I have been interested in DETR for some time and am finally encouraged to adapt it to my use case in medical images thanks to your video ;)
Very detailed video about a very important architecture. You explained every detail in a clear way. Thank you very much, you deserve many more subscribers!
Watched the full video! Learned a lot. Really like the VS Code walkthrough. It helps a lot when we see the shape of the tensors and how they change throughout the model. The explanations with the drawn diagrams are excellent as well. I am a fan of this series :). Thanks for your work and time.
Awesome content. Can you please make a code walkthrough video for NLP tasks like text generation, question answering, etc.? Your way of explaining papers is really unique and interesting.
Thank you for creating such great content. I was wondering whether vision transformers can be used for training on satellite images with more than three channels?
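One common way to handle extra spectral bands is to widen the patch-embedding convolution to the new channel count; a sketch using timm's `in_chans` argument (note that pretrained RGB patch-embedding weights don't directly carry over to the extra channels):

```python
# ViT on a 5-channel input (e.g. RGB + two extra spectral bands).
# timm lets you change the patch-embedding input channels via in_chans.
import timm
import torch

model = timm.create_model("vit_base_patch16_224", pretrained=False, in_chans=5)
x = torch.randn(1, 5, 224, 224)  # (batch, channels, height, width)
out = model(x)
print(out.shape)  # torch.Size([1, 1000])
```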
Hey, in your DETR demo notebook it says that the bounding boxes are normalized to the interval [0, 1] with a sigmoid. When I use the DETR from GitHub and look at the model, there is no sigmoid, but the predicted bounding boxes are still in the interval [0, 1]. Do you know how this is achieved?
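If it helps: in the official facebookresearch/detr repo, the sigmoid is applied functionally inside `DETR.forward` (`outputs_coord = self.bbox_embed(hs).sigmoid()` in models/detr.py) rather than as an `nn.Module`, so it never shows up when you print the module tree. Minimal illustration of the idea:

```python
# The sigmoid is a functional call on the box head's output, not a layer,
# so printing the model won't reveal it. Toy stand-in below.
import torch
import torch.nn as nn

bbox_embed = nn.Linear(256, 4)     # stand-in for DETR's 3-layer box MLP
hs = torch.randn(2, 100, 256)      # (batch, num_queries, hidden_dim)
boxes = bbox_embed(hs).sigmoid()   # normalized (cx, cy, w, h), all in [0, 1]
print(boxes.min().item() >= 0, boxes.max().item() <= 1)  # True True
```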
A question regarding the DETR bipartite matching loss. If I got it right, the loss matches every target to exactly one query. Therefore, if there are 10 targets, the loss guides the model to match 10 queries to the target classes and 290 (out of 300) to the no-object class. Is this correct?
If so, it might happen that a query that was not matched to a target, but was very close to matching it, is penalized by the loss (penalized for not predicting the no-object class). This might confuse the model, especially if in the next epoch the matched query and the unmatched query are flipped. If I understand correctly, the penalty for not predicting the no-object class is low, but such a query still shouldn't be penalized.
I think it would be better to include in the loss all the queries that closely match a target, so the loss could use them to improve the model. This would cause the model to produce multiple predictions per target, but at inference time we could keep only the prediction with the best score and ignore the rest (or use NMS).
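For context, a rough sketch of how the classification target is built after matching. The "low penalty" mentioned above comes from the official repo down-weighting the no-object class with `eos_coef = 0.1`, so near-miss queries are only mildly penalized:

```python
# Rough sketch of DETR's classification target after Hungarian matching:
# matched queries get their target's class, all other queries get the
# no-object class, whose loss weight is scaled down by eos_coef = 0.1.
import torch
import torch.nn.functional as F

num_queries, num_classes = 300, 91       # object classes + one no-object slot
no_object = num_classes                  # index of the no-object class

pred_logits = torch.randn(num_queries, num_classes + 1)
matched_query_idx = torch.tensor([3, 17, 42])   # from the Hungarian matcher
matched_tgt_class = torch.tensor([1, 62, 18])   # classes of the matched targets

target_classes = torch.full((num_queries,), no_object)
target_classes[matched_query_idx] = matched_tgt_class   # 3 matched, 297 no-object

weight = torch.ones(num_classes + 1)
weight[no_object] = 0.1                  # eos_coef: softens the no-object penalty
loss = F.cross_entropy(pred_logits, target_classes, weight=weight)
```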
Thank you for this amazing tutorial video, it really helps me a lot! But I'm still confused about the DETR forward pass at 12:05: when feeding into the transformer, why is there a 0.1 multiplication on the image features? What does it mean?
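For reference, the relevant line in the simplified demo model looks roughly like the sketch below. The 0.1 appears to be a hand-tuned factor that balances the magnitude of the CNN features against the positional encoding before the two are summed (the full DETR instead adds positional encodings inside the attention layers and has no such constant):

```python
# Shape-level sketch of the demo model's encoder input construction.
import torch

h = torch.randn(1, 256, 25, 34)                  # CNN feature map (B, C, H, W)
pos = torch.randn(25 * 34, 1, 256)               # positional encoding per location
src = pos + 0.1 * h.flatten(2).permute(2, 0, 1)  # (H*W, B, C) sequence for the encoder
print(src.shape)  # torch.Size([850, 1, 256])
```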
Can you please code TESTR (CVPR 2022), which is a modified DETR? It would be really helpful. Also, can you tell me where to modify DETR to make it a polygon detection model rather than a rectangular bounding-box detection model?
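Not from the video, but a hypothetical starting point for the polygon question: DETR's box head is a small MLP emitting 4 numbers per query, and one could widen it to 2K sigmoid-normalized coordinates for K polygon vertices (the L1/GIoU box losses would also need replacing with a polygon-appropriate loss, which is roughly what control-point heads in TESTR-style models do):

```python
# Hypothetical polygon head: widen the per-query output from 4 box numbers
# to 2*K normalized (x, y) coordinates. num_points = 16 is a design choice.
import torch
import torch.nn as nn

num_points, hidden_dim = 16, 256

poly_embed = nn.Sequential(            # stand-in for DETR's 3-layer MLP head
    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
    nn.Linear(hidden_dim, 2 * num_points),
)

hs = torch.randn(2, 100, hidden_dim)             # decoder output (B, queries, C)
polygons = poly_embed(hs).sigmoid()              # normalized coords in [0, 1]
polygons = polygons.view(2, 100, num_points, 2)  # (B, queries, K, 2) vertices
print(polygons.shape)
```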