Day 24 of the #100DaysToOffload Series:
Tonight, while I was writing something else for #100DaysToOffload, I got an “emergency” call from work that something was wrong. This happens a whole lot more than I'd like it to, but tonight it got me thinking about alerting strategy. I have some thoughts on the matter.
We have a couple different “official” methods for notification.
First, let's get this out of the way right away. Email sucks as a means of emergency notification. When you're getting literally hundreds of emails a day, trying to pick out the “emergency” emails is like trying to find a needle in a stack of other needles. It's just useless.
Of course that doesn't stop people from doing it because everybody has email. Other stuff has to be set up, but everybody has email. For the record, that doesn't make it better. It still sucks.
I'm not sure how many people are familiar with MIR3. It's actually not horrible. It's a cloud service whose alerts can be kicked off through a variety of different methods.
Most often, we use an email to the system. When an alert gets kicked off, an individual or group of individuals gets alerted in a variety of ways that they can configure in their profiles. Personally, I get alerted via work email, personal email, a phone call (yes, voice), a text message, and an alert via their little app. Most often I answer the call, hang up immediately without listening, and then respond via the app or the text message.
There are some other methods that we use that are a little “less official” than the ones listed above. We've used Boxcar and Pushover. We've used Slack channels and the dearly departed HipChat. Most of the time we could handle these alerts through MIR3 if we wanted, but it often feels like using a jackhammer to hang a picture on your wall.
Sometimes you want to tell people something happened, and you don't want those alerts to get lost in the deluge, but you also don't want to make a huge deal of it either.
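For what it's worth, here's roughly how I picture that middle tier. This is just a tiny Python sketch, not anything we actually run. The severity levels, the channel names, and the `route` function are all made up for illustration.

```python
from enum import Enum


class Severity(Enum):
    INFO = 1       # nice to know, can wait until morning
    NOTICE = 2     # should be seen today, but nobody gets woken up
    EMERGENCY = 3  # wake somebody up


# Hypothetical mapping of severity to channels; the names are placeholders,
# not real integrations.
ROUTES = {
    Severity.INFO: ["email"],
    Severity.NOTICE: ["chat", "push"],
    Severity.EMERGENCY: ["page", "push", "chat"],
}


def route(severity):
    """Return the list of channels an alert of this severity should go to."""
    return list(ROUTES[severity])


print(route(Severity.NOTICE))
```

The point is that the "tell people, but don't make a huge deal of it" alerts get their own lane: they skip the inbox and they skip the phone call.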
For some reason, alerts always feel like they're in flux. Tonight's incident spawned three new alerts I've got to address tomorrow in my copious amounts of free time (note: that was pretty blatant sarcasm there). I'd really love to hear how you're handling your alerts. I could really use some ideas on how to reduce the overhead.