Nagios Plug-In Architecture

The power of Nagios NMS is in its plug-in architecture. All check commands are external utilities that can be written in any language (C, Python, Ruby, Perl, and so on). The plug-ins communicate with the Nagios system through OS return codes and standard output. In other words, Nagios has a predefined set of return codes that the check scripts must return, and the return code dictates what the new service state should be set to. All return codes and the corresponding service states are listed in Table 8-1.

Table 8-1. Nagios Plug-In Return Codes

Return Code    Service State
0              OK. The service is in a perfectly healthy condition.
1              WARNING. The service is available but is dangerously close to the critical condition.
2              CRITICAL. The service is not available.
3              UNKNOWN. It's not possible to determine the state of the service.
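As a rough illustration of Table 8-1, the following Python sketch maps a measured value to one of the four return codes. The function names, the sample measurement, and the threshold values are hypothetical and chosen only for this example; the sketch also assumes that higher values are worse, which is not true of every check.

```python
import sys

# Return codes from Table 8-1.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(value, warn, crit):
    """Map a measurement to a return code, assuming higher values are worse."""
    if value is None:
        # The check could not obtain a measurement at all.
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def main():
    # Hypothetical measurement; a real plug-in would perform the actual check here.
    response_time = 1.2
    return evaluate(response_time, warn=2.0, crit=5.0)


# A real check script would finish by handing the code to the OS:
#     sys.exit(main())
```

The final `sys.exit()` call is what Nagios actually sees: the exit status of the process becomes the new service state.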

In addition to the return code, a plug-in should also print at least one line to the standard output. This printed string should contain a mandatory status text followed by the optional performance data string. So a simple one-line report example can be:

WebSite OK

This text will be appended to the status report message in the Nagios GUI. Similarly, with the performance data appended, it would look like this:

WebSite OK | response_time=1.2

The performance data is then available through the built-in Nagios macros and can be used to plot graphs. More information about using the performance data parameter is available at http://nagios.sourceforge.net/docs/3_0/perfdata.html.
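The output line described above can be assembled with a small helper. This is a minimal sketch, not a required structure: the name format_output and the dictionary used to hold the metrics are assumptions made for this example.

```python
def format_output(status_text, perfdata=None):
    """Build the one-line report: status text, then optional performance data."""
    if not perfdata:
        return status_text
    # Each metric is written as label=value; multiple metrics are space-separated.
    metrics = " ".join(f"{label}={value}" for label, value in perfdata.items())
    return f"{status_text} | {metrics}"


print(format_output("WebSite OK"))
print(format_output("WebSite OK", {"response_time": 1.2}))
```

The two calls print the two example lines shown earlier, first without and then with the performance data part.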

When you write a new plug-in, you must first register it in the configuration files, so that Nagios knows where to find it. Conventionally, all plug-ins are stored in /usr/lib/nagios/plugins.

Once you've written a check script, you must define it in the command.cfg configuration file, which can be found in /etc/nagios/objects/. The actual location may differ depending on how you installed Nagios. Here is an example of a check definition:

define command {
    command_name    check_local_disk
    command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

When you define a service or a host, you now can refer to this check with the check_local_disk name. The actual executable is $USER1$/check_disk and accepts three arguments. Following is an example of the service definition that uses this check and passes all three parameters to it:

define service {
    use                    local-service
    host_name              localhost
    service_description    Root Partition
    check_command          check_local_disk!10%!5%!/
}

The $USER1$ macro that you saw earlier in the command-line definition simply refers to the plug-ins directory and is defined in /etc/nagios/private/resource.cfg as $USER1$=/usr/lib/nagios/plugins.
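To make the relationship between the check_command value, the $ARGn$ macros, and $USER1$ more concrete, here is a simplified Python sketch of the substitution. It is only an illustration of the idea and not how Nagios itself is implemented; the function name expand_command and the dictionary of user macros are hypothetical.

```python
import re


def expand_command(command_line, check_command, user_macros):
    """Return the command name and the command line with macros filled in."""
    # The check_command value is the command name followed by "!"-separated arguments.
    name, *args = check_command.split("!")
    macros = dict(user_macros)
    for position, arg in enumerate(args, start=1):
        macros[f"ARG{position}"] = arg

    def substitute(match):
        # Leave unknown macros untouched rather than guessing a value.
        return macros.get(match.group(1), match.group(0))

    return name, re.sub(r"\$(\w+)\$", substitute, command_line)


name, expanded = expand_command(
    "$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$",
    "check_local_disk!10%!5%!/",
    {"USER1": "/usr/lib/nagios/plugins"},
)
print(name)
print(expanded)
```

With the definitions from this section, the expanded command line is /usr/lib/nagios/plugins/check_disk -w 10% -c 5% -p /, which is the command that ends up being executed for the Root Partition service.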

If you want, you can define a new macro and use it with your check scripts. This way, you'll separate the packaged scripts from your own, and it becomes easier to maintain. I recommend doing this for check scripts that have a complicated structure with external configuration files or other dependencies.
