Unconventional Zabbix: Part 1 — Extended External Monitoring

Zabbix 2.2 will be the first version to allow web scenario templating. But in the days of 1.6, 1.8 and 2.0, I had (and still have) serious shortcomings in web monitoring to overcome. I had hundreds of URLs that needed monitoring, most of them nearly identical, or at least with identical expected test responses. I had perhaps 5 basic URL types, and one type alone accounted for over 80% of the pool.

Using Zabbix’s native URL monitoring, I would have spent hours, perhaps days, entering each URL, each grep string, and each potential response code, just to create the items. After that I’d still have to create triggers (though this part would be largely copy/clone-able). There had to be another way, one that would save time and preserve my sanity. Enter custom external monitoring.

Zabbix allows customized external monitoring, but with a warning attached: “Do not overuse external checks! It can decrease performance of the Zabbix system a lot.” I did some digging, found the reasoning behind this warning, and found a work-around for the aforementioned “decrease” in performance.

In ordinary Zabbix checks, after a database call for parameters, the call to the remote machine is initiated from within the Zabbix binary, which is already running. It’s just a network call. For an external check, Zabbix must spawn a shell and wait for that process to finish before it can free up resources and return the value. Imagine what happens if you have hundreds of these per minute! Zabbix could, and likely would, get held up. The simplest response is to have Zabbix call a wrapper script as the external script. The wrapper does little or no processing: it passes its arguments to the actual script and runs it with an appended ampersand (&), which backgrounds the “actual” process and gives a near-immediate return. The “actual” script then sends its values back to trapper-type items using zabbix_sender. Using this method, I am able to process hundreds of new values per minute in setups that rely 100% on external scripts.
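To make the wrapper idea concrete, here is a minimal sketch in Python rather than shell. It is not the script I actually run. The worker path, the function names and the printed value are all assumptions made for illustration. It relies on the fact that Zabbix takes an external script's standard output as the item value.

```python
import subprocess
import sys

# Assumption: a hypothetical path to the slow "actual" check script.
WORKER = "/usr/local/bin/url_check_worker"


def build_command(args, worker=WORKER):
    """Pass the item's arguments through to the worker untouched."""
    return [worker, *args]


def launch_detached(command):
    """Start the worker without waiting for it (the analogue of `cmd &`)."""
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # keep worker output out of the item value
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # detach from the wrapper's session
    )


def main(argv):
    """Entry point: fire off the worker and answer Zabbix immediately."""
    launch_detached(build_command(argv[1:]))
    print(0)  # the value Zabbix stores for the wrapper item
    return 0
```

Saved as an executable in the server's external scripts directory, the file would end by calling `sys.exit(main(sys.argv))`. The wrapper never waits on the worker, so the Zabbix process that invoked it is freed almost at once.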

The only downside to this setup that I can ascertain is that I am tracking a Zabbix item whose only purpose is to initiate the external script (the wrapper). I technically don’t need a trigger watching this item, since it will almost always return zero (the work happens in a backgrounded shell). This logic can be extended to a number of other clever tests, such as SSL certificate expiration watching and SSL certificate chain validation. I will share these in a different post, as this post is more about the methodology than the implementation.
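The split between the wrapper item and the trapper items that carry the real data may be easier to see from the worker's side. The sketch below shows, under assumptions, how the "actual" script might hand a result to zabbix_sender. The server name, host name and item key in the usage note are placeholders, and the function names are my own. The `-z`, `-s`, `-k` and `-o` options are the standard zabbix_sender flags for server, host, key and value.

```python
import subprocess


def build_sender_command(server, host, key, value, sender="zabbix_sender"):
    """Build the zabbix_sender call that delivers one value to a trapper item."""
    return [sender, "-z", server, "-s", host, "-k", key, "-o", str(value)]


def send_value(server, host, key, value):
    """Run zabbix_sender and report whether it exited successfully."""
    result = subprocess.run(
        build_sender_command(server, host, key, value),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,  # a failed send should not crash the worker
    )
    return result.returncode == 0
```

A worker would call something like `send_value("zabbix.example.com", "web01", "url.check[home]", 1)` once its check finishes. The trapper item keyed `url.check[home]` is the one worth putting triggers on, not the wrapper item.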