Scott Becker's Blog

Tech startup experiences and lessons learned

Startup Tip #1: Measure, Monitor, Alert


I can’t stress enough the importance of good monitoring.  At Invite Media, monitoring and alerting were indispensable.  We monitored everything we could: server, business, and application metrics.  We knew about problems long before they became serious.

Some sample questions you should be able to answer about your app:

  • How many requests are we serving?
  • How many failed requests are happening?
  • How many exceptions, warnings, and errors occur per minute?
  • What is our resource utilization per customer?
  • What is our response time?
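As a rough illustration of how an app might collect the numbers behind these questions, here is a minimal Python sketch. It counts requests and failures, tracks average response time, and formats the results in Graphite's plaintext protocol (`metric.path value timestamp`, normally sent to carbon on port 2003). The class name, the metric prefix, the choice to treat 5xx status codes as failures, and the host name in the usage note are all assumptions for illustration, not anything from Invite Media's setup.

```python
import socket
import time
from collections import Counter


class RequestStats:
    """Accumulates simple per-interval request metrics (hypothetical helper)."""

    def __init__(self):
        self.counts = Counter()
        self.total_response_ms = 0.0

    def record(self, status_code, response_ms):
        """Record one served request and its response time in milliseconds."""
        self.counts["requests"] += 1
        if status_code >= 500:  # assumption: 5xx responses count as failures
            self.counts["failed"] += 1
        self.total_response_ms += response_ms

    def to_graphite_lines(self, prefix, timestamp=None):
        """Format the metrics as Graphite plaintext lines: 'path value timestamp'."""
        ts = int(timestamp if timestamp is not None else time.time())
        requests = self.counts["requests"]
        avg_ms = self.total_response_ms / requests if requests else 0.0
        return [
            f"{prefix}.requests {requests} {ts}",
            f"{prefix}.failed {self.counts['failed']} {ts}",
            f"{prefix}.avg_response_ms {avg_ms:.1f} {ts}",
        ]


def send_to_graphite(lines, host, port=2003):
    """Send plaintext lines to a carbon listener (host is supplied by the caller)."""
    payload = "\n".join(lines) + "\n"
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload.encode("ascii"))
```

In use, the app would call `record()` once per request and periodically flush with something like `send_to_graphite(stats.to_graphite_lines("app.web"), "graphite.example.com")`, where the host name is a placeholder.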

Here is a sample Graphite graph from our war console showing test campaigns from a few years back.

I recommend making a webpage that lists all your important graphs.  Bonus points for having it constantly refresh on a big monitor for everyone to see (i.e. a “war console”).
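One simple way to build such a page is to generate static HTML that embeds Graphite render URLs and reloads itself on a timer. The sketch below is only an illustration: the Graphite host, the metric names, and the 60-second refresh are assumptions, and a real console would likely want grouping and nicer layout.

```python
import html
from urllib.parse import urlencode

GRAPHITE_URL = "http://graphite.example.com"  # hypothetical host


def graph_url(target, window="-24h", width=600, height=300):
    """Build a Graphite render URL that returns a graph image for one target."""
    query = urlencode(
        {"target": target, "from": window, "width": width, "height": height}
    )
    return f"{GRAPHITE_URL}/render?{query}"


def war_console_html(targets, refresh_seconds=60):
    """Return an HTML page showing one graph per target, auto-refreshing."""
    images = "\n".join(
        f'<img src="{html.escape(graph_url(t))}" alt="{html.escape(t)}">'
        for t in targets
    )
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{refresh_seconds}">'
        "<title>War console</title></head>\n"
        f"<body>\n{images}\n</body></html>"
    )
```

Writing the result of `war_console_html([...])` to a file served by any web server is enough to put it on the big monitor.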

Recommended Tools:

  • Zenoss
    • Standard monitoring and alerting software.  Great for server-level monitoring: disk, memory, server up/down.  It's a bit tricky to learn.
  • PagerDuty
    • A useful companion to a tool like Zenoss for making sure that critical alerts get resolved.  It supports “on-duty” rotations as well as escalation rules for alerts that are not resolved within a fixed time period.
  • Pingdom / Webmetrics / Gomez
    • An easy way to monitor URL up/down status and response time.  Pingdom is cheaper and best for the up/down status of simple requests.  Gomez and Webmetrics are good for response-time monitoring of full pages.  Besides monitoring your own site, I recommend also adding partner URLs to Pingdom.  This helps in debugging the root cause of latency on pages.  Gomez is overpriced but, if your partners are using it, you have to use it.  Otherwise, you won’t be able to tell whether a Gomez datacenter is to blame instead of your service (yes, partners will blame Gomez issues on you unless you can prove otherwise).

You can’t keep an eye on everything all the time, of course, so setting up plenty of thresholds and alerts is also important.  Zenoss lets you set these up directly.  For metrics stored in Graphite, a cron script that checks thresholds will do the job.
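A cron-driven threshold check against Graphite might look roughly like the following. It is a sketch under stated assumptions: the Graphite host and metric name are placeholders, and it relies on the render API's `format=json` output, which returns a list of series, each with a `target` and a list of `[value, timestamp]` datapoints where the value can be null.

```python
import json
import urllib.parse
import urllib.request

GRAPHITE_URL = "http://graphite.example.com"  # hypothetical host


def fetch_series(target, window="-10min"):
    """Fetch recent datapoints for a target from Graphite's render API as JSON."""
    query = urllib.parse.urlencode(
        {"target": target, "from": window, "format": "json"}
    )
    with urllib.request.urlopen(f"{GRAPHITE_URL}/render?{query}", timeout=10) as resp:
        return json.load(resp)


def latest_value(series):
    """Return the most recent non-null value in a series, or None if all are null."""
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def breaches(series_list, threshold):
    """Return (target, value) pairs whose latest value exceeds the threshold."""
    found = []
    for series in series_list:
        value = latest_value(series)
        if value is not None and value > threshold:
            found.append((series["target"], value))
    return found
```

A script run from cron every minute could call `breaches(fetch_series("app.web.failed"), 10)` and pass any results to whatever sends your alerts.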

I recommend breaking alerts into critical (must be fixed immediately) and non-critical (can wait until morning) subgroups.  You can then set PagerDuty to send SMS messages and make phone calls for the critical group and only send emails for the non-critical group.
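If you route alerts yourself before handing them to a paging service, the split can be as small as a lookup table. This Python sketch is illustrative only: the `Alert` class, the severity names, and the channel names are made up, and it does not call any real PagerDuty API. Sending unknown severities to the critical channels is my own assumption, chosen so that a mislabeled alert is not silently dropped.

```python
from dataclasses import dataclass

# Hypothetical mapping from severity group to notification channels.
CHANNELS = {
    "critical": ["sms", "phone"],  # must be fixed immediately
    "non_critical": ["email"],     # can wait until morning
}


@dataclass
class Alert:
    name: str
    severity: str
    message: str


def channels_for(alert):
    """Pick notification channels by severity; unknown severities page someone."""
    return CHANNELS.get(alert.severity, CHANNELS["critical"])
```

For example, `channels_for(Alert("disk_full", "non_critical", "disk at 85%"))` selects email only, while a critical alert selects SMS and phone.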




Written by scottb

August 31, 2010 at 11:42 am

17 Responses


  1. Scott J. Becker is my hero.

    I am proud that my name is J. Scott.

    Jordan

    August 31, 2010 at 6:13 pm

  2. I totally get the monitor, measure, alert message in this article but as start up tip, it confused me a little. My expectations of the article were different, plus I read tip 2 before I read tip 1 so kinda confused myself a bit :/

    Inbound Tweet Rating 9
    Facebook Like = Yes
    Article Rating 8
    Overall Rating 5.5 lol

    Good stuff Scott. Thanks for sharing.

    Neil

    September 12, 2010 at 9:51 pm


