You went pretty quickly over the “treat missing data as ignore” option, but it’s one of the most useful when you have a mix of a lot of missing data points and a lot of over-threshold data points and are using something like “average”. “Ignore” basically means the alarm keeps whatever state it was in when the missing data point arrived, so the missing data point can effectively count as either “alarm” or “ok”. If you are in an alarm state, move to the next time period, and there is missing data, the missing data is treated as above the threshold rather than below it. Treating it as below would drop your average, potentially below the threshold, and flip your alarm state to ok even though the system most likely still should be in alarm. The same is true for the inverse: if the alarm is in the “ok” state, the missing data point won’t be treated as above the threshold, which could otherwise kick your average above it. Basically, it’s Schrödinger’s cat.
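To make the “ignore” behaviour above concrete, here is a toy model in Python. It is only a sketch: it assumes one evaluation period, a greater-than threshold, and that `None` stands for a period with no data. The function name `simulate_alarm` is made up for illustration, and the real CloudWatch evaluation logic is more involved than this.

```python
from typing import List, Optional, Sequence

VALID_TREATMENTS = {"ignore", "breaching", "notBreaching", "missing"}


def simulate_alarm(
    datapoints: Sequence[Optional[float]],
    threshold: float,
    treat_missing: str,
    initial_state: str = "OK",
) -> List[str]:
    """Toy model (not CloudWatch's real algorithm): one state per period."""
    if treat_missing not in VALID_TREATMENTS:
        raise ValueError(f"unknown treat_missing value: {treat_missing}")

    state = initial_state
    history = []
    for value in datapoints:
        if value is None:
            # A period with no data: the outcome depends on the setting.
            if treat_missing == "breaching":
                state = "ALARM"
            elif treat_missing == "notBreaching":
                state = "OK"
            elif treat_missing == "missing":
                state = "INSUFFICIENT_DATA"
            # "ignore": keep whatever state the alarm is already in.
        else:
            state = "ALARM" if value > threshold else "OK"
        history.append(state)
    return history


# None marks a period with no data.
samples = [90, None, None, 10, None]
print(simulate_alarm(samples, threshold=80, treat_missing="ignore"))
print(simulate_alarm(samples, threshold=80, treat_missing="notBreaching"))
```

With “ignore” the alarm stays in ALARM through the two empty periods and stays OK through the last one, while “notBreaching” drops back to OK as soon as data goes missing.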
Agreed, it depends on the purpose and source of the metric. In some systems, no data means no error, while in others, no data could mean something (such as a canary) stopped working.
Did you put the link to your "Anomaly detection" CloudWatch video in your description (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-lHWrAAzoxJA.html)?
What if you only want the email notification to be sent once a day, even if the alarm goes into the alarm state more than once that day? (Asking so as to not clutter up recipients' inboxes if we expect the alarm to be triggered multiple times throughout the day while devs are troubleshooting some issue.)
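One possible approach, sketched in Python: put a small piece of your own code (for example a Lambda subscribed to the alarm's SNS topic) between the alarm and the email, and have it remember when it last sent. This is an assumption about how you might build it, not a built-in CloudWatch setting. The function name `should_send_email` is hypothetical, and where you store `last_sent` is left open.

```python
from datetime import datetime, timezone
from typing import Optional


def should_send_email(now: datetime, last_sent: Optional[datetime]) -> bool:
    """Allow at most one email per calendar day (hypothetical helper).

    `last_sent` is the timestamp of the previous email, or None if no
    email has been sent yet. Both datetimes are assumed to be in the
    same timezone.
    """
    if last_sent is None:
        return True
    return now.date() > last_sent.date()


morning = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)
afternoon = datetime(2024, 1, 15, 16, 30, tzinfo=timezone.utc)
next_day = datetime(2024, 1, 16, 8, 0, tzinfo=timezone.utc)

print(should_send_email(morning, None))        # first alarm of the day
print(should_send_email(afternoon, morning))   # same day, suppressed
print(should_send_email(next_day, morning))    # new day, send again
```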
When you set a 5-minute period with 2 out of 3, you said we have a 15-minute window, and then you said we need to be above the threshold for two 5-minute periods in a row. I don't understand the "two 5-minute periods in a row" part.
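A small Python sketch may help with the "2 out of 3" question. With a 5-minute period and 3 evaluation periods, the alarm looks at the last three 5-minute data points (15 minutes), and as far as I understand the CloudWatch docs, the 2 breaching points only need to fall inside that window; they do not have to be back to back. The function name `breaches_m_of_n` is made up, and the sketch assumes a greater-than threshold and no missing data.

```python
from typing import Sequence


def breaches_m_of_n(
    values: Sequence[float],
    threshold: float,
    datapoints_to_alarm: int,
    evaluation_periods: int,
) -> bool:
    """Return True if at least M of the last N data points exceed the threshold."""
    if not 1 <= datapoints_to_alarm <= evaluation_periods:
        raise ValueError("datapoints_to_alarm must be between 1 and evaluation_periods")

    # Only the most recent N periods are considered.
    window = values[-evaluation_periods:]
    breaching = sum(1 for v in window if v > threshold)
    return breaching >= datapoints_to_alarm


# One value per 5-minute period, threshold 80, "2 out of 3".
print(breaches_m_of_n([90, 10, 95], 80, 2, 3))  # True: two breaches, not consecutive
print(breaches_m_of_n([10, 90, 10], 80, 2, 3))  # False: only one breach in the window
```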
Is it important to know how often data points appear on a graph (metric resolution) when setting period + evaluation periods + data points to alarm values?
A cool thing about CloudWatch alarms is that you can integrate them with your own services, so that a red alarm can trigger things in your own monitoring/paging/ticketing system.
Hey, thanks for this awesome video. But I got confused at one point: when we are using the additional configurations, the threshold value has no significance... am I right here?
Fantastic video. Do you have a followup where you set up alarms for error status and for OK status? I want to use this for an app healthcheck. I want to trigger a lambda when the alarm goes off for errors, and trigger another lambda when it goes back to OK status, as I need to update some SSM params using this. Or, if you have a tutorial on how to set up a 'healthcheck' for an app/API using alarms, then that would be amazing too! Thank you.
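For the error/OK question, one alarm can carry separate actions for each state. Below is a minimal Python sketch that only builds the parameters you would pass to boto3's `put_metric_alarm`. The alarm name, namespace, metric name, and topic ARNs are all placeholders, and it assumes each SNS topic already has the matching Lambda subscribed.

```python
from typing import Any, Dict


def build_healthcheck_alarm(alarm_topic_arn: str, ok_topic_arn: str) -> Dict[str, Any]:
    """Build put_metric_alarm parameters with separate ALARM and OK actions.

    All names below (alarm, namespace, metric) are hypothetical placeholders.
    """
    return {
        "AlarmName": "app-healthcheck-errors",
        "Namespace": "MyApp",
        "MetricName": "HealthCheckErrors",
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        # Invoked when the alarm moves into the ALARM state.
        "AlarmActions": [alarm_topic_arn],
        # Invoked when the alarm moves back to the OK state.
        "OKActions": [ok_topic_arn],
    }


params = build_healthcheck_alarm(
    "arn:aws:sns:us-east-1:123456789012:healthcheck-failed",     # placeholder ARN
    "arn:aws:sns:us-east-1:123456789012:healthcheck-recovered",  # placeholder ARN
)
print(params["AlarmActions"], params["OKActions"])

# With AWS credentials configured, the alarm could then be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```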
Brilliant video, thanks! I've got my alert set up, and have it in an "alarm state" for testing, but I'm not getting emails. The address is verified, but I'm not sure what to do. One thing I don't think I heard in your video: how often (once triggered) will the alert be sent? Is it based on the "period" interval? So if the interval is 5 mins, is the alert sent that often... or is the alert only sent once regardless of the interval, once it enters that state? Hopefully that makes sense?
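On the "how often" question: as far as I know, notification actions run when the alarm changes state, not once per period while it stays in that state. The Python sketch below just illustrates that idea on a list of per-period states. The function name `alarm_notification_indices` is made up, and it assumes the first state in the list is the starting point rather than a transition.

```python
from typing import List, Sequence


def alarm_notification_indices(states: Sequence[str]) -> List[int]:
    """Return the positions where the state changes into ALARM."""
    sent = []
    previous = None
    for index, state in enumerate(states):
        # Notify only on a transition into ALARM, not while it stays there.
        if previous is not None and state != previous and state == "ALARM":
            sent.append(index)
        previous = state
    return sent


# One entry per 5-minute period.
states = ["OK", "ALARM", "ALARM", "ALARM", "OK", "ALARM"]
print(alarm_notification_indices(states))  # [1, 5]
```

So an alarm that sits in the alarm state for three periods produces one email, and a second email only arrives after it has gone back to OK and tripped again.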