
SAM - Segment Anything Model by Meta AI: Complete Guide | Python Setup & Applications 

Roboflow
37K subscribers
65K views

Description:
Discover the incredible potential of Meta AI's Segment Anything Model (SAM) in this comprehensive tutorial! We dive into SAM, an efficient and promptable model for image segmentation that has revolutionized computer vision tasks. Trained on over 1 billion masks from 11M licensed and privacy-respecting images, SAM delivers zero-shot performance that is often competitive with, or even superior to, prior fully supervised results.
🔍 Explore this in-depth guide as we walk you through setting up your Python environment, loading SAM, generating segmentation masks, and much more. Master the art of converting object detection datasets into segmentation masks and learn how to leverage this powerful tool for your projects.
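For reference, the core setup described above boils down to a few lines. The package name and checkpoint URL below come from the official segment-anything repository; the local file paths are placeholders.

```python
# Install SAM plus the supervision utilities used in the notebook:
#   pip install 'git+https://github.com/facebookresearch/segment-anything.git' supervision
# Download a checkpoint, e.g. the large ViT-H weights:
#   wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

import torch
from segment_anything import sam_model_registry

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load the ViT-H variant of SAM and move it to the GPU if one is available.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(device=DEVICE)
```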
Chapters:
00:00 - Introduction and Overview of SAM by Meta AI
01:00 - Setting up Your Python Environment
02:46 - Loading the Segment Anything Model
03:09 - Automated Mask Generation with SAM
06:36 - Generate Segmentation Mask with Bounding Box
10:02 - Convert Object Detection Dataset into Segmentation Masks
12:01 - Outro
Resources:
🌏 Roboflow: roboflow.com
🌌 Roboflow Universe: universe.roboflow.com
✏️ Roboflow Annotate powered by SAM: blog.roboflow.com/label-data-...
📚 How to Use the Segment Anything Model (SAM) blog post: blog.roboflow.com/how-to-use-...
📓 How to Segment Anything with SAM notebook: colab.research.google.com/git...
🎬 Segment Anything Model (SAM) Breakdown YouTube video: • Segment Anything Model...
🔥 Automated Data Labeling with SAM: • Accelerate Image Annot...
💻 Segment Anything Model repository: github.com/facebookresearch/s...
🌌 Access the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images: segment-anything.com/dataset/...
📚 Segment Anything arXiv paper: arxiv.org/pdf/2304.02643.pdf
On July 29th, 2024, Meta AI released Segment Anything 2 (SAM 2), a new image and video segmentation foundation model. According to Meta, SAM 2 is more accurate and 6x faster than the original SAM model at image segmentation tasks. Learn more: blog.roboflow.com/what-is-seg...
Stay updated with the projects I'm working on at github.com/roboflow and github.com/SkalskiP! ⭐
Don't forget to like, comment, and subscribe for more content on AI, computer vision, and the latest breakthroughs in technology! 🚀
#MetaAI #SegmentAnythingModel #SAM #ImageSegmentation #Python #FoundationModels #ComputerVision #ZeroShot

Science

Published: Aug 5, 2024

Comments: 126
@SS-zq5sc · 1 year ago
This was a great explanation and so was your blog entry. You gained another subscriber today. Thank you!
@Roboflow · 1 year ago
Hi, it is Peter from the video :) That's awesome to hear! Thanks a lot!
@dloperab · 1 year ago
Great video... thanks Piotr and Roboflow for all the great videos you generate. I am resuming my interest in CV thanks to you!
@Roboflow · 1 year ago
This is big! If I managed to convince you even a little bit I am proud of myself.
@samzhu6728 · 9 months ago
Thanks for the wonderful video! Is it possible to annotate specific objects (with labels) in a few frames of a video (fixed perspective) and keep tracking those objects in the entire video?
@shaneable1 · 1 year ago
Great video! Thank you! What hardware are you running this on?
@user-ld8lc4ex4m · 1 year ago
Thank you so much
@gbo10001 · 1 year ago
Wow, that's really great. I've been waiting for that...
@Roboflow · 1 year ago
I’m super happy you like it!
@lorenzoleongutierrez7927 · 1 year ago
Great !
@anestiskastellos4150 · 1 year ago
Very nice video. Next video -> Grounded Segment Anything !! 👏
@Roboflow · 1 year ago
🚨 SPOILER ALERT: That's the plan!
@abdshomad · 1 year ago
Second that! +1
@Roboflow · 1 year ago
@@abdshomad I think we should have something by Friday/Monday
@EkaterinaGolubeva-pr9ih · 1 year ago
Thank you! Can SAM handle 3D images? Any advice on how to approach it?
@froukehermens2176 · 7 months ago
One use case is the annotation of eye tracking data. Per video frame one would like to annotate whether a person is looking at other people or objects in the environment. One could use YOLO and bounding boxes, but these are less precise than regions.
@anandhsaspect4560 · 1 year ago
Great. Thanks.
@Roboflow · 1 year ago
Thanks a lot 🙏
@diyorpardaev · 5 months ago
It is really the best video ever :) I am making a great project using sv
@Roboflow · 5 months ago
This is so kind! Thank you very much!
@mohammedinnat1247 · 1 year ago
Nice. Thanks.
@Roboflow · 1 year ago
Thank you! 🙏
@JenishaThankaraj · 1 year ago
Can we annotate polygon shapes instead of rectangles using SAM?
@alaaalmazroey3226 · 5 months ago
Does SAM segment all objects in the scene very well when there is an occlusion?
@badrinarayanan686 · 2 months ago
Great video!! I do have a question. How do we use MaskAnnotator to annotate only one specific mask instead of the entire set of masks in sam_result?
@kobic8 · 1 year ago
Thanks so much for the clear video! Are you planning to also integrate it with some tools to get an output that will also include labels for each mask?
@Roboflow · 1 year ago
We already did. Take a look here: colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb
@jamtomorrow457 · 1 year ago
Hi thanks for the great tutorial! How can I download the masks created using SAM and upload them into roboflow?
@MrJesOP · 1 year ago
First of all, thank you so much for the content, an amazing contribution to the community! I wonder if it is possible to implement the negative point prompt with the SAM model, similarly to how it can be done on the website, where you can choose several points belonging to the object that you are interested in as well as points that do not belong to it... Some help would be amazing!! Thanks in advance!!
@Roboflow · 1 year ago
Hi, thanks a lot for those kind words :) As for your question - "implement the negative point prompt". I was looking for any project that would implement that functionality, and I didn't find anything :/
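For what it's worth, the official SamPredictor.predict API appears to support negative points directly: point_labels uses 1 for points inside the object and 0 for points to exclude. A minimal sketch, assuming `sam` is the model loaded as in the setup snippet near the top; the coordinates and image path are made-up examples:

```python
import cv2
import numpy as np
from segment_anything import SamPredictor

predictor = SamPredictor(sam)                                      # sam: loaded SAM model

image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)   # placeholder path
predictor.set_image(image)

point_coords = np.array([[320, 240], [400, 260]])   # example pixel coordinates
point_labels = np.array([1, 0])                     # 1 = foreground point, 0 = negative point

masks, scores, _ = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]                 # highest-scoring of the proposals
```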
@geniusxbyofejiroagbaduta8665
I can't wait to see how it can be used for annotations
@Roboflow · 1 year ago
Stay tuned for the RF update ;) We also plan to drop one more video, probably Friday/Monday, where we will dive deep into auto-annotation in Colab.
@Roboflow · 1 year ago
We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!
@iflag9775 · 1 year ago
Great! Could you give a talk on the possibility of object detection with SAM?
@Roboflow · 1 year ago
Hi 👋🏻 Could you please explain a bit more what you mean? It is a segmentation model. Would you like to convert masks into boxes?
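If detection-style output is the goal, one hedged way is to derive boxes from the masks; note that each entry returned by SamAutomaticMaskGenerator.generate already carries a 'bbox' key (XYWH), so the helper below is only needed when starting from raw binary masks:

```python
import numpy as np

def mask_to_xyxy(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Bounding box (x_min, y_min, x_max, y_max) of the True pixels in a binary mask."""
    ys, xs = np.where(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# sam_result: the list of dicts returned by SamAutomaticMaskGenerator.generate(...)
boxes = [mask_to_xyxy(entry["segmentation"]) for entry in sam_result]
```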
@user-rm9ml1th8h · 10 months ago
Nice video! I have one question: can you please suggest which is better, SAM or YOLOv6-v3, for real-time detection in terms of accuracy? My requirement is to detect car parts (e.g. a Michelin tire). Thank you in advance.
@Roboflow · 10 months ago
If you want to run in real time, then you can't use SAM. It will be too slow.
@user-rm9ml1th8h · 10 months ago
@@Roboflow: Thank you very much for your quick response! For our specific requirement to detect car parts (e.g. wheel type - alloy wheels or not, specific accessories, etc.) after the captured image is uploaded (taken from a mobile camera), can you please suggest the best algorithm based on your vast experience in this area? Do you recommend YOLOv6-v3, Grounding DINO, or any other? Tons of thanks to you again in advance!
@user-yw6wf3uu1o · 1 year ago
How can I do semantic segmentation labeling using SAM?
@imimiliades629 · 3 months ago
Hello! Can anybody explain how I can evaluate this model after training? What commands can I run?
@SweetShotGaming · 1 year ago
Is there any way you can create an auto-labeler using SAM? (SAM would take care of everything with no human intervention). My specific need would be to label lane markings, but for an entire dataset of raw images.
@SweetShotGaming · 1 year ago
Forgot to mention, great video! Is there any functionality with SAM where you can give it a few examples of what the label is and then it will assume the labels for the dataset. Thanks!
@Roboflow · 1 year ago
Yes! Stay tuned to our next vid. We are doing full auto dataset generation and generation of masks from boxes. Should be on Monday.
@ifeanyianene6770 · 1 year ago
Thanks so much for this video!! Is there another way to draw the bounding box (in like a single python file format whereby you just run your main function) that doesn't require jupyter widgets? Oh btw, Liked and subscribed you guys are awesome!
@Roboflow · 1 year ago
Thanks for the like and sub ;) As for your question: that was the only interactive way that I could come up with. But if you don't want to do it in an interactive way, then you have plenty of options.
@DTM6559 · 8 months ago
How can I train with different color masks rather than black and white mask??
@ruiteixeira2324 · 1 year ago
Very nice work. How do you see SAM being used in practice? Do you see this as a model to be integrated into a tool that generates training data for your task, or as your final model for a certain task?
@Roboflow · 1 year ago
Hi 👋! It is Peter from the video. I think we will see broad use of SAM in image and video editors. But I also think it will be the default feature in all major annotation tools. It is a bit too slow for real-time usage, but we will transfer the knowledge it provides into the datasets that we use for training real-time models. What is your prediction?
@ruiteixeira2324 · 1 year ago
​@@Roboflow yes, I totally agree with you on the fact that it will be the default tool to annotate data. Since you think this is to slow, what's in your opinion the current state of art model for semantic segmentation for real-time applications?
@Roboflow · 1 year ago
@@ruiteixeira2324 hahaha, hard question. According to Papers with Code, that would be the latest version of YOLOv6.
@Roboflow · 1 year ago
Hi Rui - we have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!
@willemweertman1178 · 1 year ago
You should try it with data taken in an underwater marine context. Lots of models struggle with that.
@Roboflow · 1 year ago
Cool idea! 💡 As I work on the next vid, I'll try to take that into consideration.
@ish694 · 1 year ago
Trying to work on the same use case with coral segmentation
@mnx64 · 1 year ago
You said it’s real time ready - but what IS real time ready is only the prediction for various prompts on the *same* image. Generating the embeddings for each new image is actually really slow (multiple seconds depending on image size and hardware) and cannot be done in browser or in real time. This makes it less useful for live/video analysis, of course, but it’s still great to generate segmentation mask training sets! Thanks for the video
@Roboflow · 1 year ago
Hi 👋🏻 It's Peter from the video. You are right; I really think I shouldn't have said that, because it is confusing. The decoding part is very fast and can be executed in real time, but the encoding is quite slow. It can be faster if you use version B versus version H, but still… I really wish I had been more precise. Apologies for that.
@mattizzle81 · 1 year ago
There is a fully ONNX-optimized and quantized model out there that is faster. Still not ultra fast, but at least 1 FPS on my RTX 2080 Ti, which is not bad. Semi real-time.
@Roboflow · 1 year ago
@@mattizzle81 thanks for that info I was actually not aware of that… but still I think it would be much cleaner without that sentence in the vid :)
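To make the encode/decode split discussed above concrete, here is a rough timing sketch; it assumes `sam` is the loaded model, uses a placeholder image path, and the absolute numbers will vary a lot with hardware and with ViT-B versus ViT-H:

```python
import time

import cv2
import numpy as np
from segment_anything import SamPredictor

predictor = SamPredictor(sam)                                      # sam: loaded SAM model
image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)   # placeholder path

t0 = time.perf_counter()
predictor.set_image(image)          # encoding: runs the heavy ViT backbone once per image
t1 = time.perf_counter()

masks, _, _ = predictor.predict(    # decoding: one prompt, reuses the stored embedding
    point_coords=np.array([[300, 200]]),
    point_labels=np.array([1]),
)
t2 = time.perf_counter()

print(f"encode: {t1 - t0:.2f}s, decode per prompt: {t2 - t1:.3f}s")
```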
@EkaterinaGolubeva-pr9ih · 11 months ago
Can you make a video on MedLSAM ( medical localize and segment anything model) ?
@TheArkLade · 1 year ago
For [Single Image Bounding Box to Mask] what should be changed if we have more than 1 class and want to see detection for all classes?
@Roboflow · 1 year ago
Hi it is Peter from the video 👋🏻 So you want to have multiple boxes converted into multiple masks?
@TheArkLade · 1 year ago
@@Roboflow Yes. so two-part question: (1) current script returns one mask at a time. How can I change it so it returns all detected masks? (2) Let's suppose I have 5 classes. How should I do so that all detections for all 5 classes are shown?
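A hedged sketch for both parts of the question above: prompt SAM once per box against the same image embedding, then stack everything into a single sv.Detections so masks for all classes are drawn together. `boxes_xyxy` and `class_ids` stand in for your dataset's annotations; `sam` is the loaded model.

```python
import cv2
import numpy as np
import supervision as sv
from segment_anything import SamPredictor

predictor = SamPredictor(sam)                                      # sam: loaded SAM model
image_bgr = cv2.imread("image.jpg")                                # placeholder path
predictor.set_image(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))

boxes_xyxy = np.array([[50, 60, 220, 240], [300, 80, 420, 260]])   # example dataset boxes
class_ids = np.array([0, 1])                                       # example class ids

masks = []
for box in boxes_xyxy:
    mask, _, _ = predictor.predict(box=box, multimask_output=False)
    masks.append(mask[0])                                          # (H, W) boolean mask

detections = sv.Detections(
    xyxy=boxes_xyxy.astype(float),
    mask=np.array(masks),
    class_id=class_ids,
)
annotated = sv.MaskAnnotator().annotate(scene=image_bgr.copy(), detections=detections)
```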
@MisterWealth · 1 year ago
How do we switch it out with our own videos?
@tomtouma · 1 year ago
Can it be used to segment and label objects in a video stream from a live camera? I've been reading a lot of the feedback and people are saying it's computationally heavy and will run too slow at a meaningful refresh rate. I noticed the Meta advertisement was doing it in realtime and labeling and tracking stuff. What are your thoughts on this? Is it easier to stick to OpenCV/Yolo for a live video feed?
@Roboflow · 1 year ago
> I noticed the Meta advertisement was doing it in realtime
Could you point me to the resource?
@tomtouma · 1 year ago
@@Roboflow At about 0:06 in your video. Looks like real time?
@Roboflow · 1 year ago
@@tomtouma I wish! That was just a lot of work online to produce it. It is not real time.
@tomtouma · 1 year ago
​@@Roboflow Could you explain what you mean by "work online"? Do you mean the Meta team recorded a video and then post-processed the video offline? Also, is there a way to pass video frames (even if it is very slow like 1Hz or slower) to SAM and have it segment the image frames? I want to then run some python scripts to get me geometry information about these segmented masks.
@tomtouma · 1 year ago
Just thought I'd bump this.
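On feeding video frames to SAM: nothing prevents doing it offline at a low rate, e.g. roughly one frame per second; expect seconds per frame rather than real time. A hedged sketch, with "input.mp4" as a placeholder path and `sam` the loaded model:

```python
import cv2
from segment_anything import SamAutomaticMaskGenerator

mask_generator = SamAutomaticMaskGenerator(sam)   # sam: loaded SAM model

cap = cv2.VideoCapture("input.mp4")               # placeholder path
frame_idx, stride = 0, 30                         # ~one processed frame per second at 30 fps

while True:
    ok, frame_bgr = cap.read()
    if not ok:
        break
    if frame_idx % stride == 0:
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        sam_result = mask_generator.generate(frame_rgb)
        # ...compute geometry from each result["segmentation"] boolean mask here...
    frame_idx += 1

cap.release()
```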
@saharabdulalim · 1 year ago
Thank you for that ❤ I have a question: when I make annotations for more than one object in an image and pass them to SAM, it only masks one object, like the brain cell in your project, and doesn't mask the other objects in the same image. I made it one class for all.
@saharabdulalim · 1 year ago
I think it's because of xyxy[0], but what if I want to pass multiple masks for multiple objects?
@Roboflow · 1 year ago
@@saharabdulalim I'm actually working on the next vid right now, and we will cover more general auto-annotation use cases. The vid should be out tomorrow.
@GohOnLeeds · 1 year ago
Since SAM is trained on photos, any idea how well it does on synthetic images, like artwork or games? Cheers
@Roboflow · 10 months ago
Great question. Unfortunately I didn’t experiment with those :/ sorry
@user-jo5pw5kn1e · 1 year ago
How can I tag objects in a picture using SAM? For example, in the picture that was used in the video (a man holding a dog), I want to identify all the objects in the picture, like the man, the dog, the building, etc.
@Roboflow · 1 year ago
Stay tuned for our video tomorrow. 🔥 I’m going to show how to auto annotate images with Grounding DINO and SAM
@javier_medel · 1 year ago
Great video! Do you think you can share the Jupyter notebook?
@Roboflow · 1 year ago
It is linked in the description ;) All our demo notebooks are open-sourced.
@unknown-wm9ru · 1 year ago
This still doesn't work in live action, does it? Like if I connected it to a camera or a VR headset like the Meta Quest Pro / Pico 4 and used their cameras for AR powered by SAM. That would definitely be awesome!
@Roboflow · 1 year ago
You should get a few fps. But if you want 30 fps, then we are not there yet.
@unknown-wm9ru · 1 year ago
@@Roboflow Hmm I see, but the fact that it's there already is awesome in itself! The future is here and It's really exciting I love it
@Roboflow · 1 year ago
@@unknown-wm9ru true that!
@drm8164 · 1 year ago
Help please, I need to learn computer vision, but I struggle a lot. Is the OpenCV certificate worth it? It's around 1200 US dollars. Thanks
@Roboflow · 1 year ago
Take a look here: github.com/SkalskiP/courses. In general, the Internet is full of free resources. It is not worth paying 1200 USD for a course like that.
@mithilanavishka4531 · 3 months ago
Hi, I am in the process of learning the SAM model following your video; it is very helpful. I am planning to use this model to segment historical document characters. In your opinion, will it be possible or a waste of time?
@Roboflow · 3 months ago
SAM is not really good at document segmentation
@shamukshi · 11 months ago
Do you do freelancing? My academic project is "solar panel detection and counting using SAM."
@Roboflow · 10 months ago
Nope. We do not do freelancing. :/
@yatinarora9650 · 1 year ago
Great video! Please create a video explaining how SAM works internally.
@user-lv5rd3ck2p · 11 months ago
Is it possible to get a segmented image without passing a bounding box?
@Roboflow · 10 months ago
You don't need to pass a box. If you don't pass any prompt, the whole image gets segmented.
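For reference, a short sketch of that prompt-free, whole-image mode; it assumes `sam` is the loaded model, and sv.Detections.from_sam is the converter available in recent supervision releases (adjust if your version differs):

```python
import cv2
import supervision as sv
from segment_anything import SamAutomaticMaskGenerator

mask_generator = SamAutomaticMaskGenerator(sam)            # sam: loaded SAM model

image_bgr = cv2.imread("image.jpg")                        # placeholder path
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

sam_result = mask_generator.generate(image_rgb)            # one dict per detected mask
detections = sv.Detections.from_sam(sam_result=sam_result)

annotated = sv.MaskAnnotator().annotate(scene=image_bgr.copy(), detections=detections)
```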
@chinnagadilinga5742 · 1 year ago
Hi sir, I'm a beginner. I watched your computer vision videos, but they are all combined and mixed together. Can you please list the videos in order, one by one, so that we can follow them easily? Thank you.
@Roboflow · 1 year ago
Hi :) Are you only interested in auto annotation videos? Or all of them?
@Aziz-bg4ph · 1 year ago
How can I extract the segmented object produced by SAM?
@Roboflow · 1 year ago
You can find masks inside `sv.Detections` object. `detections.masks`
@darinkumarnsit4780 · 1 year ago
@@Roboflow Could you show me how to use 'detections.masks', please? I tried to use it and got AttributeError: 'Detections' object has no attribute 'masks'
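A hedged note on the error above: in the supervision releases I am aware of, the attribute is mask (singular), holding an (N, H, W) boolean array:

```python
import supervision as sv

# sam_result: output of SamAutomaticMaskGenerator.generate(...)
detections = sv.Detections.from_sam(sam_result=sam_result)

masks = detections.mask        # (N, H, W) bool array, or None if no masks are present
first_mask = masks[0]
```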
@kgylsd · 1 year ago
Please let us know anytime when the SAM/ Roboflow integration is accomplished 😊
@Roboflow · 1 year ago
It will for sure be part of our weekly newsletter! ;)
@Roboflow · 1 year ago
We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!
@itayhilel2168 · 1 year ago
Let's do this on a superstore dataset
@Roboflow · 1 year ago
send a link :D I'll take a look
@cedricvillani8502 · 1 year ago
So how many SAMs Dics are you expecting to come in and out of your model? You just seem to really enjoy SAMs Dics, but I suppose using research from Michael J Black, about 10 years ago, and it makes sense why you really enjoy utilizing SAMs Dics
@husseinali-yx7uf · 1 year ago
Thanks for your video, I have learnt a lot from you, but this time, each time I try to follow your steps, I encounter this error: OutOfMemoryError: CUDA out of memory. Tried to allocate 14.40 GiB (GPU 0; 15.90 GiB total capacity; 6.53 GiB already allocated; 7.95 GiB free; 7.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
@Roboflow · 1 year ago
Is that happening in notebook?
@saharabdulalim · 1 year ago
try using parallel gpu
@husseinali-yx7uf · 1 year ago
@@Roboflow Yes on both google colab and kaggle
@husseinali-yx7uf · 1 year ago
@@saharabdulalim how to do that?
@Roboflow · 1 year ago
@@husseinali-yx7uf Is that happening with your own image?
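One hedged workaround for the CUDA out-of-memory error above is to switch to the much smaller ViT-B checkpoint (less GPU memory, somewhat lower quality); the checkpoint name and URL come from the segment-anything repository:

```python
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
import torch
from segment_anything import sam_model_registry

torch.cuda.empty_cache()       # release cached memory left over from previous runs

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth").to("cuda")
```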
@cedricvillani8502 · 1 year ago
Seriously, you don't need to pay these people for API garbage. This is old; look up panoptic segmentation. Stop giving these people your money and your data. That's all an API is: an application programming interface. In other words, you're just paying them when you don't need to.
@lalamax3d · 1 year ago
The first tutorial using SAM should be a mask of yourself, which you showed at the 36-second mark of this video. But thanks anyway.
@Roboflow · 1 year ago
Just convince my boss to do it and you’ll have it ;)
@lalamax3d · 1 year ago
@@Roboflow I like your boss. Now, when you have time for that, (IMHO) please keep a few pointers in mind: a) do it with an image sequence or a video, or both; b) show a progress bar / status (tqdm / queue); c) give the area/prompt in one go; d) how to achieve consistency, etc., if the subject moves out or something comes in front (if possible).
@swipeshark5311 · 5 months ago
The code doesn't work, dislike.
@fintech1378 · 1 year ago
How can this model be used to detect the type and quantity of inventory in a shop?
@Roboflow · 1 year ago
Hi, it is Peter from the video. I think that if you are looking for the type and quantity of inventory in a shop, you will be much better off using detection models like YOLOv8 or YOLO-NAS.
@aarontan5434 · 9 months ago
I got this error when running mask_annotator = sv.MaskAnnotator(color_map='index'): TypeError: MaskAnnotator.__init__() got an unexpected keyword argument 'color_map'. May I know what went wrong?
@Roboflow · 9 months ago
Hi! I just fixed the notebook. Feel free to try it. :)
@karlvanbree4882 · 9 months ago
guess he meant: sv.MaskAnnotator(color_lookup = "index")
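For completeness, the call as I understand the newer supervision API: the argument was renamed from color_map to color_lookup, and passing the ColorLookup enum should be safe across recent versions:

```python
import supervision as sv

mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)
```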