Ruby self-throttling Nagios alerts

19 Mar 2010

If you use Nagios, no doubt you've experienced a situation where a network issue triggers an alert on every service on every server. There are a few ways to deal with this in Nagios, but after spending 10 minutes deleting SMS messages on my iPhone I decided to implement an alert system that throttles itself and does not allow a gigantic flood of alerts.

In Nagios I created two notify commands, one for host notifications and another for service notifications.
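
As a rough sketch, two such commands might look something like the fragment below. The command names, the script path, and the argument order are assumptions for illustration, not the original definitions. It also assumes the contact's pager field holds the SMS address and the email field holds the non-SMS address. The $...$ values are standard Nagios macros.

```
# Hypothetical names and path; adjust to your own installation.
define command {
    command_name    notify-host-throttled
    command_line    /usr/local/bin/throttle_alert.rb "$CONTACTPAGER$" "$CONTACTEMAIL$" "$NOTIFICATIONTYPE$: $HOSTNAME$ is $HOSTSTATE$"
}

define command {
    command_name    notify-service-throttled
    command_line    /usr/local/bin/throttle_alert.rb "$CONTACTPAGER$" "$CONTACTEMAIL$" "$NOTIFICATIONTYPE$: $HOSTNAME$/$SERVICEDESC$ is $SERVICESTATE$"
}
```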

throttle_alert.rb then uses a control file to determine how long it has been since the previous alert. If an alert arrives inside the throttle window (90 seconds in this example), it is redirected to a secondary (non-SMS) email address, and the alert itself notes that it was throttled.
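
A minimal sketch of that throttling logic might look like the following. It is not the original script: the method names, the keyword arguments, the "[THROTTLED]" subject prefix, and the choice to store the last alert time as an epoch timestamp in the control file are all assumptions. Actually delivering the message (for example by piping it to a mail command) is left out.

```ruby
# Hypothetical sketch of the throttling decision in throttle_alert.rb.
THROTTLE_WINDOW = 90 # seconds, matching the window described above

# Seconds elapsed since the alert recorded in the control file,
# or nil if no alert has been recorded yet.
def seconds_since_last_alert(control_file, now)
  return nil unless File.exist?(control_file)
  now.to_i - File.read(control_file).to_i
end

# Decides where an alert should go and what its subject should be.
# Returns [recipient, subject]. Every alert, throttled or not, updates
# the control file, so the window is measured from the previous alert.
def route_alert(subject, control_file:, primary:, secondary:,
                now: Time.now, window: THROTTLE_WINDOW)
  elapsed = seconds_since_last_alert(control_file, now)
  throttled = !elapsed.nil? && elapsed < window

  # Record this alert as the new "previous alert".
  File.write(control_file, now.to_i.to_s)

  if throttled
    # Inside the window: send to the non-SMS address and say so.
    [secondary, "[THROTTLED] #{subject}"]
  else
    [primary, subject]
  end
end
```

Because every alert refreshes the control file in this sketch, a sustained flood stays throttled until there has been a quiet gap of at least 90 seconds. Updating the file only on unthrottled alerts would instead let one SMS through every 90 seconds.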

Not rocket science, and not perfect, but it keeps a single incident from draining my iPhone battery.