The check_distribution service reports an error.
Example of error:
ERROR: There are 11 expired checks
Solution - fix active_checks_enabled setting
In a distributed setup, expired checks can be the result of active_checks_enabled settings that diverge between nodes.
To find out which services have diverging settings, download the script attached at the bottom of this article (mon_node_output_parse_diff.pl).
perl mon_node_output_parse_diff.pl
Usage: my-program <input-file-name>
Use case to check services active_checks_enabled
Get data for service checks:
# mon node ctrl --self --all "mon query ls services -c host_name,description,active_checks_enabled" > mon_node_services.txt
Parse data with the script:
perl mon_node_output_parse_diff.pl mon_node_services.txt
Use case to check hosts active_checks_enabled
Get data for host checks:
# mon node ctrl --self --all "mon query ls hosts -c name,active_checks_enabled" > mon_node_hosts.txt
Parse data with the script:
perl mon_node_output_parse_diff.pl mon_node_hosts.txt
The script outputs the checks whose settings differ and the nodes on which they differ.
If there is no output, the settings do not differ between nodes.
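To illustrate what "differing between nodes" means, here is a minimal sketch that compares two per-node lists with standard tools. It is not the attached script and does not reproduce its logic. The file names, the sample rows, and the semicolon-separated format (host_name;description;active_checks_enabled) are all assumptions made for the example.

```shell
# Sort and compare in the C locale so sort and comm agree on ordering.
export LC_ALL=C

# Hypothetical sample data, one file per node (made-up rows).
workdir=$(mktemp -d)
cat > "$workdir/master1.txt" <<'EOF'
web01;HTTP;1
web01;Disk usage;1
db01;MySQL;0
EOF
cat > "$workdir/poller1.txt" <<'EOF'
web01;HTTP;1
web01;Disk usage;0
db01;MySQL;0
EOF

# comm needs sorted input.
sort "$workdir/master1.txt" > "$workdir/master1.sorted"
sort "$workdir/poller1.txt" > "$workdir/poller1.sorted"

# comm -3 hides rows common to both files. What remains:
#   unindented rows   = present only on the first node
#   tab-indented rows = present only on the second node
diverging=$(comm -3 "$workdir/master1.sorted" "$workdir/poller1.sorted")
printf '%s\n' "$diverging"

rm -rf "$workdir"
```

In this sample only the "Disk usage" service on web01 is reported, once per node, because its active_checks_enabled value is 1 on one node and 0 on the other.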
Review the services that have active checks disabled. Passive checks, including business services, should have active checks disabled.
Run this command to list all service checks that have active checks disabled:
# mon query ls services -c host_name,description,active_checks_enabled | grep "0\$"
And this for the hosts:
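A sketch of the host variant, assuming it mirrors the service command and uses the host columns from the earlier host query (name, active_checks_enabled). The sample rows are made up, and the semicolon-separated output format is an assumption. The filter itself is run on the sample rows so that its effect is visible without a monitor node.

```shell
# Presumed command on a monitor node (assumption, mirroring the service command):
#   mon query ls hosts -c name,active_checks_enabled | grep "0\$"

# Made-up sample rows in the assumed name;active_checks_enabled format.
sample='web01;1
db01;0
sw10;1'

# Keep only rows whose last character is 0, i.e. active checks disabled.
disabled=$(printf '%s\n' "$sample" | grep '0$')
printf '%s\n' "$disabled"
```

Because the pattern is anchored to the end of the line, a host name that ends in 0 (such as sw10) is not matched unless its active_checks_enabled value is also 0.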
Next, choose the node whose settings will serve as the master data for updating the others.
Run the one-liner below on the chosen master server to propagate its settings to the other master and the pollers. It logs all commands to the file propagate_active_checks.txt and runs in the background.
Propagate settings for host checks: