Accelerate Image Annotation with SAM and Grounding DINO | Python Tutorial

Subscribers: 37K

Views: 43K

Description:
In this comprehensive tutorial, discover how to speed up your image annotation process using Grounding DINO and Segment Anything Model (SAM). Learn how to convert object detection datasets into instance segmentation datasets, and see the potential of using these models to automatically annotate your datasets for real-time detectors like YOLOv8. Stay tuned for the upcoming release of a Python library that will make this process even more effortless.
Chapters:
00:00 Introduction
00:58 Python Environment Setup
04:13 Load Grounding DINO and Segment Anything Models
05:42 Single Image Mask Autoannotation
08:24 Full Dataset Mask Autoannotation
09:58 Save Labels to Pascal VOC XML
14:17 Upload Annotations to Roboflow
15:23 Review and Refine Annotations in Roboflow UI
17:11 Convert Object Detection to Instance Segmentation Dataset
22:35 Conclusions
23:28 Announcement
Resources:
🌏 Roboflow: roboflow.com
🌌 Roboflow Universe: universe.roboflow.com
📓 Automated Dataset Annotation and Evaluation with Grounding DINO and SAM notebook: colab.research.google.com/git...
📚 Grounding DINO blog post: blog.roboflow.com/grounding-d...
🎬 Detect Anything You Want with Grounding DINO | Zero-Shot Object Detection SOTA video: • Detect Anything You Wa...
📚 How to Use the Segment Anything Model (SAM) blog post: blog.roboflow.com/how-to-use-...
🎬 SAM - Segment Anything Model by Meta AI: Complete Guide | Python Setup & Applications video: • SAM - Segment Anything...
Stay updated with the projects I'm working on at github.com/roboflow and github.com/SkalskiP! ⭐
Don't forget to like, comment, and subscribe for more content on AI, computer vision, and the latest technological breakthroughs! 🚀
#MetaAI #SegmentAnythingModel #SAM #ImageSegmentation #Python #FoundationModels #ComputerVision #ZeroShot #GroundingDINO #ObjectDetection #DataLabeling

Category: Science

Published: Aug 5, 2024

Comments: 98

@harumambaru 11 months ago

Thank you so much for the video explanation. The walkthrough makes all the difference. For example, the prompt engineering explanation at 5:53 is so useful.

@cyberhard 1 year ago

Nice! Looking forward to seeing the new library in action.

@Roboflow 1 year ago

I’ll do my best to not disappoint you ;)

@praveen9083 1 year ago

wow... excited for the auto distill! :)

@Roboflow 1 year ago

That’s what I wanted to hear 💜

@johnpoc6594 1 year ago

Very nice video and explanation, thank you very much!

@_ABDULGHANI 1 year ago

Thank you, this is exactly what I was waiting for.

@Roboflow 1 year ago

I love to hear that! 🔥

@lorenzoleongutierrez7927 1 year ago

Great job as usual!

@Roboflow 1 year ago

Thanks a lot! 🙏 We are not slowing down.

@user-xi9ib1lp9w 11 months ago

You're awesome man, thank you so much

@tomaszbazelczuk4987 1 year ago

Awesome video as usual😮👍

@Roboflow 1 year ago

Thank you very much… doing my best 🙏🏻

@kamaraalhassanshaike1625 1 year ago

Wow, this is fantastic

@Roboflow 1 year ago

Wow, thanks a looot

@body1024 1 year ago

thank you so much 😍

@Roboflow 1 year ago

Thanks for watching! :)

@lorisdeluca610 1 year ago

It's a very cool concept and surely helpful for some segmentation tasks. However, I see this working mainly with clear, uncrowded images. In many of the tests I did, a lot of items were mislabeled. Nonetheless, cool idea, and I love the channel!

@Roboflow 1 year ago

Absolutely! But keep in mind that three years ago this was impossible. We are just trying to highlight the cutting-edge models of 2023. I agree that we are not yet able to get good results for every image.

@adolfusadams4615 1 year ago

Hey Peter, could you do a video showing how to integrate SuperGradients/YOLO-NAS with Roboflow's Autodistill for custom detections on a live, real-time webcam feed? Could you also show, maybe in another video, how to add custom objects to an existing dataset like the COCO dataset? This would be epic. 🔥

@kobic8 1 year ago

I noticed that the awesome supervision package has a method to load datasets in Pascal VOC format. Are you planning to also support COCO format (for export as well)?

@kaisbedioui7456 1 year ago

As always, a very cool video! Really curious to see the Autodistill tool 🎉 Does the Smart Polygon tool leverage SAM as well?

@Roboflow 1 year ago

Yes, it does! We have been running SAM in Smart Polygon since last week 🔥

@user-lt5yt8uz4z 1 year ago

Can it be used to annotate for semantic segmentation, or only for instance segmentation?

@mentarus 1 year ago

Great video and notebook! However, it looks like the supervision install step fails with: groundingdino 0.1.0 requires supervision==0.4.0

@shamukshi 11 months ago

For "solar panel counting from UAV images", which approach is better? 1. Creating bounding boxes (BB) for solar panels with an object detection model and then using the BBs as input for SAM, or 2. segmenting everything in the image with SAM and then classifying each segment as solar panel or non-solar panel?

@patrickwasp 1 year ago

Can you combine separate polygons into a single object?

@bb-andersenaccount9216 1 year ago

I think it would be great to include, in both supervision and autodistill, a feature that gets the bounding box from a polyline segmentation produced by SAM.

@Roboflow 1 year ago

We have that already in supervision: roboflow.github.io/supervision/detection/utils/#mask_to_xyxy
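
The idea behind a mask-to-box utility is simple: the box is spanned by the smallest and largest row and column indices that contain a foreground pixel. The sketch below is a dependency-free illustration of that idea, not supervision's implementation. The function name `mask_to_box` and the nested-list mask format are assumptions made for this example; the library function linked above works on NumPy arrays.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) as pixel indices


def mask_to_box(mask: List[List[bool]]) -> Optional[Box]:
    """Return the tight box around all truthy pixels, or None for an empty mask."""
    # Rows (y) that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    # Column index (x) of every foreground pixel.
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 4x5 mask with a small blob in the middle.
example_mask = [
    [False, False, False, False, False],
    [False, False, True,  False, False],
    [False, True,  True,  True,  False],
    [False, False, False, False, False],
]
print(mask_to_box(example_mask))  # (1, 1, 3, 2)
```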

@alassanesakande8791 1 year ago

Incredible video! I was just reading about Grounded-SAM this morning, and boom, you're making a tutorial on it. Great job! I'm just wondering if I could find ways to use it in a medical imagery task. What do you think?

@Roboflow 1 year ago

Do you want to do full auto or bounding box to mask?

@alassanesakande8791 1 year ago

@Roboflow I would go for automatic segmentation, but I'd also like it to be interactive for the user. So maybe combining the two would be more appreciated.

@Roboflow 1 year ago

@alassanesakande8791 that is our plan for the next stage: allow full auto or human in the loop :) I also think that being able to interact with those labels before you use them to train, for example, YOLOv8 is required.

@ranpinc 1 year ago

Thank you for your work, this is exactly what we urgently need. At the moment, though, it seems to only support saving data in Pascal VOC format. Do you have any plans to provide an API to convert it to COCO format?

@Roboflow 1 year ago

Currently the order is YOLO and then COCO. But it might happen next week.

@ranpinc 1 year ago

@Roboflow that's cool! The sooner the better. Thank you for your work again!

@monkeywrench1951 1 year ago

I wonder if Segment Anything can be accelerated, or even if it would run on the Google Coral edge accelerator.

@Roboflow 1 year ago

I heard you can use OpenVINO to run it on a CPU, as long as it is an Intel CPU.

@sebbecht 1 year ago

Hey there! I really like these videos a lot. Certainly, with fast labelling, the specific task can be trained in a supervised way. But is there an opportunity in using SAM and/or DINO as a teacher for distillation into a smaller (final) model, even before creating an annotated dataset? Would this be competitive with other self-supervised pretraining methods?

@Roboflow 1 year ago

Hi 👋🏻 Do you mean SAM and GDINO would generate training examples on the fly during training?

@Roboflow 1 year ago

@sebbecht we didn't explore that route yet, but it would be awesome to test those theories. Thanks for sharing :) I never run out of ideas thanks to conversations like this.

@sebbecht 1 year ago

@Roboflow my pleasure, I hope you get to explore and share some findings!

@Roboflow 1 year ago

@sebbecht stay tuned :)

@chinnagadilinga5742 1 year ago

Hi Sir, I'm a beginner. I saw your computer vision videos, but they are all combined and merged together. Could you please organize them one by one, in order, so that we can follow them easily? Thank you.

@Roboflow 1 year ago

Hi, it is Peter from the video. Do you mean videos related to zero-shot annotations?

@adriancontrerasgarcia7968 5 months ago

Can I convert a multiclass object detection dataset to a segmentation dataset with this? I have only seen the example with the single-class Blueberries dataset, so I'm not sure.

@Roboflow 5 months ago

You can :)

@hyunseungshin3955 1 year ago

Great tutorial!! Is it possible to run this on real-time video, something like a webcam?

@Roboflow 1 year ago

Thanks a lot! 🙏🏻 The model is too slow to run in real time :/ The whole inference for a single frame can take around 1-2 seconds.

@kategeorge1152 1 year ago

Any chance for a tutorial on SAM and Roboflow and remote sensing of satellite or UAV imagery?

@Roboflow 1 year ago

Please tell me more about the idea. What would you like to see?

@gbo10001 1 year ago

That's really great, I've been waiting for that!! Btw, why is there no support for tracking annotation formats like MOT/MOTS?

@Roboflow 1 year ago

I know it took me a lot of time... But this was possibly the most complicated Jupyter Notebook I ever made.

@gbo10001 1 year ago

@Roboflow that's a really great contribution to the community 😎 Thanks for that.

@Roboflow 1 year ago

@gbo10001 we are working on something even beeeeter! 🔥

@Roboflow 1 year ago

@gbo10001 hahaha better than SAM + DINO

@snehitvaddi 1 year ago

Hey Peter! Can I use the SAM labelling for object detection as well? Or is it only for instance segmentation?

@Roboflow 1 year ago

You can always convert segmentation into detection. It is just a bit, hm... poor usage of resources, as it is super time-consuming. What project do you have in mind?

@snehitvaddi 1 year ago

@Roboflow I'm working on detecting potato quality on a conveyor belt. I labeled some photos using SAM, but I'm not sure if the polygon labeling actually helps object detection or if a basic rectangular boundary will be enough.

@Roboflow 1 year ago

@snehitvaddi yes, for modern models like YOLOv8 it helps: blog.roboflow.com/polygons-object-detection/

@snehitvaddi 1 year ago

@Roboflow cool, thanks

@Roboflow 1 year ago

@snehitvaddi use the one that is faster to annotate? Polygons can be converted to boxes really easily.
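
To illustrate why the polygon-to-box direction is cheap: the enclosing box is just the minimum and maximum of the polygon's x and y coordinates. This is a minimal sketch under assumptions of this example (the function name `polygon_to_xyxy`, the `(x, y)` tuple format, and the sample coordinates are all hypothetical), not code from the video or from any library.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y)


def polygon_to_xyxy(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing a polygon."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical outline of one annotated object.
outline = [(12.0, 30.0), (40.0, 18.0), (55.0, 42.0), (28.0, 60.0)]
print(polygon_to_xyxy(outline))  # (12.0, 18.0, 55.0, 60.0)
```

The reverse direction (box to polygon) cannot recover the object's shape, which is why annotating polygons keeps both options open.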

@olanrewajuatanda533 1 year ago

I keep getting error messages whenever I use some of the images in my dataset.

@kobic8 1 year ago

Thanks to this great vid (and notebook), I have tried using it together with SAM, and I'm curious to know how I can use a labeled dataset I have (of sea objects) to teach the model not only to detect a boat/ship but also to identify the name of the marine vessel.

@Roboflow 1 year ago

Do you have labels for marine-vessel in your dataset? Or only boat/ship?

@kobic8 1 year ago

@Roboflow thanks so much for the reply! I am really trying to figure out how to solve this issue. Yes, I do have a human-labeled dataset with specific classes of marine vessels, e.g., frigate, corvette, and also some ships with their specific names. My question was whether there is a way to fine-tune the Grounding DINO model to identify the objects not as "boat" or "ship" but with more accurate labels.

@Roboflow 1 year ago

@kobic8 yes, it probably is possible, but you would be much better off training a model like YOLOv8. The power of Grounding DINO comes from zero-shot detection: the ability to detect objects that it has never seen. If you already have an annotated dataset, just train a regular object detection model. :)

@kobic8 1 year ago

@Roboflow but would it be "less powerful" compared to G-DINO? I just thought of tuning G-DINO to refine specific labels, so I thought it would be better to somehow get the training code.

@saharabdulalim 1 year ago

Thank you for this incredible vid! 💖 But I have a question. When trying to run the following command, I get this error:

    41 detections.mask = segment(sam_predictor=sam_predictor, image=image, xyxy=filtered_detections.xyxy)
    42
    43 mask_annotator = sv.MaskAnnotator()
    NameError: name 'segment' is not defined

I searched for it in the `__init__` of SAM, but it isn't there. Is this function built into the `segment_anything` module, or should I write it myself?

@saharabdulalim 1 year ago

I replaced this command of yours:

    from tqdm.notebook import tqdm

    for image_name, image in tqdm(object_detection_dataset.images.items()):
        detections = object_detection_dataset.annotations[image_name]
        detections.mask = segment(
            sam_predictor=sam_predictor,
            image=cv2.cvtColor(image, cv2.COLOR_BGR2RGB),
            xyxy=detections.xyxy
        )

@Roboflow 1 year ago

Looks to me like you didn’t run all cells in the notebook. The `segment` function is defined in one of the cells in the notebook. No need to change the code.
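
For readers who hit the same `NameError`: the reliable fix is the one given above, running every notebook cell so that `segment` gets defined. For orientation only, a helper with that call signature could look roughly like the sketch below. It is an assumption about the shape of such a function, not a copy of the notebook cell. It relies only on a predictor object exposing `set_image(image)` and `predict(box=..., multimask_output=True)`, which SAM's `SamPredictor` provides, and it returns a plain list of masks that you would typically stack into a NumPy array before assigning to `detections.mask`.

```python
def segment(sam_predictor, image, xyxy):
    """Prompt the predictor with each box and keep its highest-scoring mask.

    sam_predictor: object with set_image(...) and predict(...) methods
                   (for example segment_anything.SamPredictor).
    image:         RGB image for the predictor.
    xyxy:          iterable of boxes in (x_min, y_min, x_max, y_max) order.
    """
    # Compute the image embedding once; every box prompt below reuses it.
    sam_predictor.set_image(image)

    result_masks = []
    for box in xyxy:
        # With multimask_output=True the predictor proposes several candidate
        # masks per prompt, each with a quality score.
        masks, scores, _logits = sam_predictor.predict(
            box=box,
            multimask_output=True,
        )
        # Keep the candidate with the highest score.
        best_index = max(range(len(scores)), key=lambda i: scores[i])
        result_masks.append(masks[best_index])
    return result_masks
```

Selecting one mask per box keeps the output aligned one-to-one with the input detections, so class labels from the detection dataset carry over unchanged.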

@saharabdulalim 1 year ago

@Roboflow oh I see, thanks, it has been solved. Can I ask another question? My dataset is in COCO format, and it is on my PC rather than on Roboflow, so I converted it to Pascal VOC format to be able to follow your steps for converting to segmentation, but it didn't work at all. Is there a function in supervision to read COCO format like there is for Pascal VOC? I searched, but it gives me errors.

@Roboflow 1 year ago

@saharabdulalim hi! We want to add COCO loading to supervision, but it won't happen too soon :/ If you want to follow those steps now, I'd upload the dataset to Roboflow. That's probably the fastest way for now.
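
If you convert by hand in the meantime, the box geometry is the main difference to watch for: COCO stores a box as `[x_min, y_min, width, height]`, while Pascal VOC stores `xmin, ymin, xmax, ymax`. Below is a minimal sketch of that single step. The function name is hypothetical, reading the COCO JSON and writing the VOC XML are left out, and pixel-offset conventions that differ between tools are ignored.

```python
from typing import Sequence, Tuple


def coco_bbox_to_voc(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert a COCO [x_min, y_min, width, height] box to (xmin, ymin, xmax, ymax)."""
    x_min, y_min, width, height = bbox
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    return (x_min, y_min, x_min + width, y_min + height)


print(coco_bbox_to_voc([10, 20, 30, 40]))  # (10, 20, 40, 60)
```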

@saharabdulalim 1 year ago

@Roboflow is it possible to upload the whole dataset to Roboflow without annotating every image, as I already have the annotation file?

@Aziz-bg4ph 1 year ago

How can I extract the segmented object produced by SAM?

@Roboflow 1 year ago

Masks are stored in `detections.mask`.
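
Once you have a mask, extracting the object amounts to keeping the pixels where the mask is true and blanking the rest. The toy sketch below shows that step on nested lists so it runs without dependencies; the function name `cut_out`, the `fill` parameter, and the sample values are assumptions of this example. In the notebook setting the image and masks are NumPy arrays, where boolean indexing or `np.where` does the same job.

```python
from typing import List


def cut_out(image: List[List[int]], mask: List[List[bool]], fill: int = 0) -> List[List[int]]:
    """Keep pixels where the mask is truthy and replace all others with `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Hypothetical 2x3 grayscale image and a mask of the same shape.
image = [
    [10, 20, 30],
    [40, 50, 60],
]
mask = [
    [False, True, True],
    [False, False, True],
]
print(cut_out(image, mask))  # [[0, 20, 30], [0, 0, 60]]
```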

@heetshah5718 1 year ago

I am currently working on a pollution detection and classification system project. Can I use GDINO and SAM for that?

@Roboflow 1 year ago

What would that be? Images of smoke for example?

@heetshah5718 1 year ago

@Roboflow images of plastic underwater and oil pollution in water.

@kobic8 1 year ago

In your previous video on Grounding DINO, you elaborated on using a text prompt as an input. Can this be implemented here as well? Are you planning on extending this tutorial (or notebook) to show how to implement it? Also, I have noticed that you can also use Stable Diffusion tools for edits such as "change dog to a monkey". Can that also be in the next vid?

@Roboflow 1 year ago

Auto labeling with prompts will be part of the auto-distill package that is coming soon. As for Stable Diffusion, I can't promise anything :/ We have a lot of stuff in the backlog. But maybe I'll play with it on a Twitch stream.

@kobic8 1 year ago

@Roboflow thanks a lot! Any estimate of the release date of auto-distill?

@Roboflow 1 year ago

@kobic8 it is close! Reaaaaaaaly close!

@Roboflow 1 year ago

@kobic8 I don't want to overpromise, but I heard something about today :)

@kobic8 1 year ago

Great tutorial! Can you post the link to the Jupyter notebook in the video description?

@Roboflow 1 year ago

It is in the description. But here is the link: colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb

@user-xq8ik4bf4m 11 months ago

Hi, can this also be applied to custom objects? If so, how?

@Roboflow 10 months ago

What do you mean by custom object?

@dilshodbazarov7682 1 year ago

Awesome tutorial!!! But while running the step at 6:25, I got an error: "NameError: name '_C' is not defined" (after a long error description). Can anyone help?

@Roboflow 1 year ago

Could you give me a bit more info? Are you running it in Google Colab?

@thegodofrotation-animeamvs7204 1 year ago

@Roboflow I have the same error. I ran the Colab from top to bottom and got this error at the first annotation part, on the line `detections = grounding_dino_model.predict_with_classes(..`. Any help would be appreciated!

@Roboflow 1 year ago

@thegodofrotation-animeamvs7204 I'll do my best to take a look at that. Could you submit a new issue here: github.com/roboflow/notebooks/issues