Review of “Icinga Network Monitoring”

I am a fan of monitoring. And of (implementing) Alfresco. Too many organizations could do better with some monitoring, for a whole bunch of business reasons. Especially at the start of a project, until their IT department can catch up using their own tools, or from the starting point onward. I roughly know what Icinga can do, and I know why an organization needs monitoring. I got the Icinga Virtual Appliance up and running, monitoring Alfresco on some 30-something parameters (thanks for the plugin, Tony de la Fuente; also note his Lightning Talk at the Alfresco DevCon 2012). Now I discovered Ansible, and I have a dream… Then this book arrived in my inbox…

The Content
Icinga is about monitoring: checking hosts/services, evaluating thresholds, and alerting contact persons or reporting availability. The Icinga core orchestrates just that, and a few web-based UIs can visualize ‘things’ for you.

The first chapter is about installation and configuration. Icinga is a Linux-based tool, and there are pre-packaged installers for Debian/Ubuntu (apt-get/dpkg) and CentOS/RedHat (yum), and of course you can compile from source. None of the options is a pain, if you don’t fear the command line. In this chapter a basic monitoring system is created that monitors itself. The configuration of how to monitor what and when is fully text-file based. Actually, the core of the tool just controls when to invoke a measurement. The actual measuring of services is done by scripts, plugins or the native Linux tooling.

Chapter 2 is all about configuration of objects: servers, services, inheritance… groups of servers, groups of services, configuring notification (you want to know if your serv[er|ice] is down, right??). One of the powers of the tool is the simplicity of configuration, and the complex configuration you can create from that…

In chapter 3 the difference between active and passive checks is explained, as well as the option of using add-ons on the remote system(s) to delegate check execution to (versus running everything over ssh from the Icinga server). There are arguments for both, so it will depend on the situation at hand. At least the concepts and config are clear now…
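To make the passive side a bit more concrete: with a passive check the remote side does the measuring and hands the result to Icinga, classically as a line written to the external command file. Below is a minimal sketch that only formats such a line, assuming the PROCESS_SERVICE_CHECK_RESULT command that Icinga 1.x inherited from Nagios. The host name, service name and the helper function itself are made up for illustration, and where the command file lives depends on your installation.

```python
import time

# Numeric states follow the usual plugin convention.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_service_result(host, service, state, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    `host` and `service` must match names known to the monitoring server;
    the values used in the example below are hypothetical.
    """
    timestamp = int(time.time() if now is None else now)
    # The command is a single line, so fold newlines in the free text.
    text = output.replace("\n", " ")
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{STATES[state]};{text}"
    )


# Example with a fixed timestamp so the result is reproducible.
line = passive_service_result(
    "alfresco01", "Active Tickets", "WARNING", "tickets=950", now=1392800000
)
```

An active check is the opposite direction: the Icinga core decides when to run the plugin and reads its exit code and output itself.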

The definitions of hosts, hostgroups and hostdependencies are discussed in Chapter 4. These define the network topology, and by defining clever dependencies, Icinga can alert on the ‘first’ system generating a failure (e.g. a switch), instead of on all systems behind that bottleneck too (all servers behind that switch, from Icinga’s point of view). Usually, alerting on this switch solves the problem… You do need to model your network/service topology in Icinga though. The same concepts apply to services, servicegroups and service dependencies. This chapter nicely illustrates the possibilities using examples and clear diagrams.
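The idea of alerting only on the ‘first’ failing system can be sketched in a few lines. This is not how Icinga implements it internally, just a simplified model under my own assumptions: every host lists its parents, and a failing host whose parents are all failing too is treated as merely unreachable. The function name and the host names are hypothetical.

```python
def root_causes(parents, down):
    """Return the failing hosts that are not explained by failed parents.

    parents: dict mapping host -> list of parent hosts
             (an empty list means directly attached to the monitoring server)
    down:    set of hosts whose checks currently fail
    """
    causes = set()
    for host in down:
        host_parents = parents.get(host, [])
        # If every parent is failing too, the host is only unreachable.
        # With no parents, or at least one healthy parent, it is a real failure.
        if not host_parents or any(p not in down for p in host_parents):
            causes.add(host)
    return causes


topology = {"switch1": [], "web1": ["switch1"], "db1": ["switch1"]}

# The switch dies and takes everything behind it along: one alert, not three.
alerts = root_causes(topology, {"switch1", "web1", "db1"})
```

With only web1 failing and the switch healthy, the same function reports web1 itself, which is exactly the case you do want to be woken up for.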

One joy of monitoring is that you get notified if stuff is broken. Chapter 5 covers just that. There is a load of possibilities, all relatively simple definitions actually. It makes sense to notify a person/group only after so many units of time of failure (let’s wait until it recovers by itself…), and again when normal operation is restored. And there are automatic escalations to notify other people/groups if some unit of time has passed and the failure is still there… Nice!
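As a rough mental model of ‘wait a bit, then notify, then escalate’, here is a small sketch. It is deliberately simplified and time-based; the function, the thresholds (10 and 60 minutes) and the group names are my own assumptions for illustration, not Icinga configuration or syntax from the book.

```python
def contacts_to_notify(minutes_down, first_delay=10, escalate_after=60):
    """Pick who should hear about a failure that has lasted `minutes_down` minutes."""
    if minutes_down < first_delay:
        # Too early: give the service a chance to recover by itself.
        return []
    if minutes_down < escalate_after:
        # Regular notification to the primary contact group.
        return ["admins"]
    # Still broken after the escalation threshold: widen the audience.
    return ["admins", "managers"]
```

So a 5-minute blip notifies nobody, a 15-minute outage reaches the admins, and after 90 minutes the managers are included as well.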

Chapter 6 describes the rules to obey if you create your own plugins: what calling syntax can be expected, and what return codes and output can be processed. A nice primer if needed.
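For a feel of what those rules look like, here is a minimal plugin sketch following the common Nagios-style convention: exit code 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, and one line of output with optional performance data after a ‘|’. The check itself (a heap percentage passed in as an argument, the name check_heap) is hypothetical; a real plugin would measure something instead.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measurement to a plugin state (higher values are worse)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def run(args):
    """Return (exit_code, output_line) for the arguments VALUE WARN CRIT."""
    try:
        value, warn, crit = (float(a) for a in args)
    except ValueError:
        # Wrong number of arguments or non-numeric input.
        return UNKNOWN, "UNKNOWN - usage: check_heap VALUE WARN CRIT"
    state = evaluate(value, warn, crit)
    # Human-readable text, then '|' and perfdata as label=value;warn;crit
    line = (
        f"HEAP {LABELS[state]} - {value:g}% used "
        f"| heap={value:g}%;{warn:g};{crit:g}"
    )
    return state, line


# As a command-line plugin, the last step would be:
#     code, line = run(sys.argv[1:]); print(line); sys.exit(code)
```

The monitoring core only looks at the exit code to decide the state; the text is for humans, and the part after the ‘|’ is what graphing add-ons can pick up.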

The final chapter describes the web UIs on top of Icinga. The core functions well without a UI. Icinga can be used with the ‘Icinga Classic’ web UI, or the more ‘modern’ ‘Icinga Web’ interface. This chapter describes in enough detail what packages to install on Debian/RedHat, and what dependencies are needed. Screenshots of both web-based UIs are provided. Thruk is mentioned as a third user interface, with the additional advantage that it can aggregate and combine multiple Icinga instances in a single UI.

The Verdict
I like the condensed explanation of how and why. Using the online documentation and the ‘localhost sample’ provided, I managed to gain 90% of the insights without the book… up to and including chapter 2… And that took me quite some experimenting and reverse engineering. Having the book makes the trip more comfortable and way more time efficient.

Starting in chapter 3, I am learning more and more new stuff, and the book becomes valuable to me. After finishing chapter 4, I start to feel the urge to get stuff up and running again. I like monitoring to detect failures early. From my experience, Icinga delivers a lot of options and configurations. Especially if you can rely on email. I have to figure out affordable SMS/text services though 🙂 These options have an easy syntax to configure, and offer an extremely powerful way of covering complex scenarios using these ‘simple’ building blocks of configuration. This book describes how you can technically realize that.

More than being reactive (detect warnings and failures, and notify the contact persons that can fix the problem) I also like to be proactive in monitoring networked applications. I want to see the trend of the (growing?) number of login tickets versus memory and IO usage, as well as performance statistics. If the one in context of the other(s) has a trend that leads to future failure, I want to be aware as soon as possible, so action can be planned, instead of fixed overnight. (Planning a fix takes way more time than a quick fix overnight.) The growth or decrease of parameters relative to each other has a meaning for applications. Therefore I have a need for storing the numbers historically, and to be able to display these historical values. Take a look at PNP4Nagios (thanks again, Tony de la Fuente) to get graphics in Icinga too. I wonder if there are other options too, possibly better?
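What I mean by spotting a trend can be sketched as follows: take the stored history of one metric, fit a straight line through it, and estimate when it will hit a limit. This is my own simplified illustration (plain least squares, one metric, hypothetical function name and numbers), not something the book or Icinga provides out of the box.

```python
def hours_until_limit(samples, limit):
    """Estimate the hours until a metric reaches `limit`, using a least-squares line.

    samples: list of (hour, value) pairs, oldest first
    Returns None when the trend is flat or falling, or cannot be determined.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples at the same moment: no slope to compute
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # not growing, so no limit in sight
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    # Time at which the fitted line crosses the limit, relative to the last sample.
    return max(0.0, (limit - intercept) / slope - last_t)


# Login tickets growing by 10 per hour, with a (made-up) limit of 200.
eta = hours_until_limit([(0, 100), (1, 110), (2, 120)], limit=200)
```

In this example the line crosses 200 at hour 10, so eta is 8 hours after the last sample: enough time to plan a fix instead of doing it overnight.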

In general, this is the book you need to quick-start monitoring, and to quick-start getting Icinga up and running. It describes the recipes to configure the monitoring system, explains the basic concepts, and describes how to build many simple things into a complex monitoring solution. If you want insight into where your topology is going wrong and what host/service to fix: this book helps you out. If you need to spot trends: you need a little bit more than is provided in this initial version of the book.

But maybe my focus on monitoring a networked application is not quite the average use of Icinga. That is exactly why Icinga rocks for me, though. It is a monitoring solution that can monitor many JMX-based hosts. And that is exactly ‘my world’ if there are multiple Alfresco environments around…

2 Responses to “Review of “Icinga Network Monitoring””

  1. 1 Douglas C. R. Paes February 19, 2014 at 16:25

    What a great review…

    If it relies on JMX to monitor the application, then it’s not compatible with Alfresco Community, right?

    Thank you.

    • 2 Tjarda Peelen February 20, 2014 at 08:21

      Hi Douglas,

      Thanks for your feedback.
      This monitoring sure is applicable to Community. Java/Tomcat provide quite some metrics themselves. Not the full-blown set from Enterprise (like active tickets), but Java memory and CPU are still available. And the OS-level statistics of course (like, on Windows, whether a service is running, or disk space), as well as port availability. Still quite useful I think…

