Disclaimer
Articles in the "Unsupported Community Documents" space are not supported by ITRS Support.
Compatibility
This article was written for version 6.2.2 of OP5 Monitor. It could work on both lower and higher versions if nothing else is stated.
Most plugins shipped with OP5 Monitor always print the same number of labels in the performance data part of the check result, which is why the default storage type for rrd files is single. This means that each service has one rrd file containing a fixed number of data sources. That number is determined the first time the check runs, either by a special template for the particular check command used or by the default template.
You could compare it to creating a MySQL table for each check, with a fixed number of columns because you know that the number of columns will never increase.
The result is one rrd file per service, for example a simple HTTP check, which in this case would contain two data sources (one for time, and one for size):
HTTP_Server.rrd
ds[1].index = 0
ds[1].type = "GAUGE"
ds[1].minimal_heartbeat = 8460
...
ds[2].index = 1
ds[2].type = "GAUGE"
ds[2].minimal_heartbeat = 8460
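To check how many data sources an existing rrd file holds, the `ds[...]` lines printed by `rrdtool info` can be counted. The following is a minimal sketch: the function name `count_data_sources` is made up for illustration, and it assumes input lines of the `ds[name].key = value` form shown above.

```bash
#!/bin/bash
# Count the distinct data sources in `rrdtool info` style output read from stdin.
# count_data_sources is a hypothetical helper, not part of rrdtool or OP5 Monitor.
count_data_sources() {
    # Keep only the leading "ds[<name>]" part of each matching line,
    # reduce to unique names, and count them.
    grep -o '^ds\[[^]]*]' | sort -u | wc -l | tr -d ' '
}

# Example usage (assumes rrdtool is installed and the file exists):
#   rrdtool info /opt/monitor/op5/pnp/perfdata/host/service.rrd | count_data_sources
```

For the HTTP example above, this would be expected to print 2.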
However, this becomes a problem when a check plugin returns a variable number of labels in its performance output. For example, a plugin that checks the size of all databases may return a separate performance data label for every database it finds. The number of databases can both increase and decrease, and so can the number of resulting performance data labels (data sources).
In that case, the number of data sources the rrd file can hold is determined the first time the check runs. If the number of labels later changes because of the dynamic nature of the plugin's response, an error message is written to the system logs and the rrd file is left without updates. The log messages below are typical of this type of rrd issue:
rrdcached[1708]: queue_thread_main: rrd_update_r (/opt/monitor/op5/pnp/perfdata/host/service.rrd) failed with status -1. (/opt/monitor/op5/pnp/perfdata/host/service.rrd: expected 3 data source readings (got 1) from 1397107364)
rrdcached[1708]: queue_thread_main: rrd_update_r (/opt/monitor/op5/pnp/perfdata/host/service.rrd) failed with status -1. (/opt/monitor/op5/pnp/perfdata/host/service.rrd: found extra data on update argument: 0:0)
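To find out which services are affected, the rrd paths can be pulled out of these log messages. This is a minimal sketch, assuming log lines in the format shown above and rrd paths without spaces; `failed_rrd_files` is a hypothetical helper name, and the log file location in the usage comment is an assumption that depends on the system's syslog setup.

```bash
#!/bin/bash
# Print "<count> <path>" for every rrd file named in rrdcached
# "rrd_update_r (<path>) failed" log lines read from stdin.
# failed_rrd_files is a hypothetical helper name.
failed_rrd_files() {
    # Extract the "rrd_update_r (<path>) failed" fragment, strip the wrapper
    # text so only the path remains, then tally the failures per path.
    grep -o 'rrd_update_r ([^)]*) failed' \
        | sed -e 's/^rrd_update_r (//' -e 's/) failed$//' \
        | sort | uniq -c | awk '{print $1, $2}'
}

# Example usage (the log path is an assumption):
#   grep rrdcached /var/log/messages | failed_rrd_files
```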
The solution is to enable RRD_STORAGE_TYPE MULTIPLE. Instead of storing a fixed number of data sources in one rrd file, the system then creates one rrd file per data source, so the set of data sources can grow or shrink over time. This would be comparable to creating a new MySQL table for each data point, which gives the flexibility needed when the labels in the performance data output increase or decrease dynamically.
Using the same example as mentioned above, this would result in the following two rrd files, extended by label name:
HTTP_Server_time.rrd
HTTP_Server_size.rrd
Each only containing one data source.
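The naming pattern can be illustrated with a small helper that prints the file names to expect for a given service and set of labels. This is only a sketch: `sanitize_name` and `multiple_rrd_names` are hypothetical names, and the rule that every character outside `A-Za-z0-9_` becomes an underscore is an assumption that matches the examples in this article but may not cover every naming rule PNP applies.

```bash
#!/bin/bash
# Replace every character outside A-Za-z0-9_ with an underscore.
# Assumption: this approximates how PNP cleans names; it matches the
# examples in this article only.
sanitize_name() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9_' '_'
}

# Print one expected rrd file name per label for the given service description.
# Usage: multiple_rrd_names "<service description>" <label> [<label> ...]
multiple_rrd_names() {
    local service
    local label
    service=$(sanitize_name "$1")
    shift
    for label in "$@"; do
        printf '%s_%s.rrd\n' "$service" "$(sanitize_name "$label")"
    done
}

# Example usage:
#   multiple_rrd_names "HTTP Server" time size
```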
Enabling RRD_STORAGE_TYPE MULTIPLE can be done on a per check command basis. For the purpose of this exercise, the dummy check plugin script below demonstrates the behavior of a plugin with a dynamic number of performance data labels. Each time it runs, the script generates a random number of data points (between one and ten, labelled label_0 through at most label_9). It prints a check output of OK followed by those data points, each with a value between 0 and 19.
#!/bin/bash
echo -n "OK | "
for (( c=0; c<=$((RANDOM % 10)); c++ ))
do
  echo -n "label_$c=$((RANDOM % 20));"
done
echo ""
exit 0
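To confirm that the number of labels really varies between runs, the labels in each line of plugin output can be counted. This is a minimal sketch: `count_perf_labels` is a hypothetical helper that treats every `name=value` token after the `|` as one label, splitting on spaces and semicolons, which is a simplification of real performance data parsing.

```bash
#!/bin/bash
# Count the performance data labels in one line of plugin output.
# count_perf_labels is a hypothetical helper; it only counts tokens that
# contain "=", so threshold fields such as ";1;2" are not counted as labels.
count_perf_labels() {
    local output="$1"
    local perfdata="${output#*|}"
    # If nothing was stripped, the output has no "|" and no performance data.
    if [ "$perfdata" = "$output" ]; then
        echo 0
        return 0
    fi
    # One token per line, then count the lines that contain "=".
    printf '%s\n' "$perfdata" | tr ' ;' '\n\n' | grep -c '='
    return 0
}

# Example usage (assumes the script above is saved as ./generate_multiple.sh):
#   for i in 1 2 3 4 5; do count_perf_labels "$(./generate_multiple.sh)"; done
```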
The check command object configuration:
Parameter | Value |
---|---|
command_name | my_multi_test |
command_line | $USER1$/custom/generate_multiple.sh |
And most importantly, the PNP configuration file corresponding to this particular check command using the same name as the command, followed by .cfg:
/opt/monitor/etc/pnp/check_commands/my_multi_test.cfg
CUSTOM_TEMPLATE = 1
RRD_STORAGE_TYPE = MULTIPLE
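If the same settings are needed for several check commands, the file can be generated from the shell. The following is a sketch only: `write_multiple_cfg` is a hypothetical helper, it writes exactly the two settings shown above, and it refuses to overwrite an existing file so that a hand-edited configuration is not lost.

```bash
#!/bin/bash
# Write a PNP check command config that enables one rrd file per data source.
# write_multiple_cfg is a hypothetical helper name.
# Usage: write_multiple_cfg <config directory> <check command name>
write_multiple_cfg() {
    local cfg_dir="$1"
    local command_name="$2"
    local cfg_file="$cfg_dir/$command_name.cfg"
    # Do not clobber a config file that already exists.
    if [ -e "$cfg_file" ]; then
        echo "refusing to overwrite $cfg_file" >&2
        return 1
    fi
    printf 'CUSTOM_TEMPLATE = 1\nRRD_STORAGE_TYPE = MULTIPLE\n' > "$cfg_file"
}

# Example usage:
#   write_multiple_cfg /opt/monitor/etc/pnp/check_commands my_multi_test
```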
Now it is simply a matter of creating a service check that uses the my_multi_test check command. In this case the service is called Graph Multiple:
Parameter | Value |
---|---|
service_description | Graph Multiple |
check_command | my_multi_test |
After the check has run for a while, the result is ten different rrd files, each containing the data for one data point. Data is populated into each graph whenever it is available, without problems or interruptions.
-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:25 Graph_Multiple_label_0.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:25 Graph_Multiple_label_1.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:25 Graph_Multiple_label_2.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:20 Graph_Multiple_label_3.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:20 Graph_Multiple_label_4.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:20 Graph_Multiple_label_5.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:20 Graph_Multiple_label_6.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:20 Graph_Multiple_label_7.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:09 Graph_Multiple_label_8.rrd-rw-rw-r-- 1 monitor apache 384952 Apr 8 12:09 Graph_Multiple_label_9.rrd
And, the resulting graphs created:
It may take some time before all graphs become available, because the xml file for the service is only updated every 15 minutes. Allow the check to run for a while before expecting every graph to appear.