Sometimes you need to troubleshoot the argument parsing or the results of misbehaving check plugins, notification scripts, and other executables that are run from within op5 Monitor. This article explains how to install a generic command wrapper script, pwrap, which simplifies this kind of troubleshooting.
The pwrap command wrapper script
- Installed in a way that temporarily replaces the executable file subject to troubleshooting.
- Behaves as if the real script/binary was executed.
- Logs received arguments and environment variables (prior to executing the real script/binary).
- Logs the stdout and stderr streams of the executed command (while it's running, just like tee).
- Logs the return code of the executed command (given that the script was not prematurely terminated).
In the instructions found below, the /opt/plugins/check_nrpe check plugin is used, but this can be replaced with any other executable in the system, such as the /opt/monitor/op5/notify/notify.pl script responsible for sending alert notifications.
Creating a backup of the executable file
Prior to installing the pwrap script, the executable file subject to troubleshooting must be placed in a backup location. The pwrap script supports three different path name variations:
- original_executable_name.bak (a .bak filename extension appended)
- .bak_original_executable_name (a .bak_ string prepended to the filename)
- bak/original_executable_name (the executable file placed in a subdirectory called bak)
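The lookup of these three variations could look roughly like the sketch below. It is not taken from the pwrap source: the helper name find_backup and the order in which the variations are tried are assumptions made for illustration.

```shell
# Hypothetical helper (not part of pwrap): print the first existing backup
# path for the executable given as $1, or return 1 if none is found.
# The search order is an assumption.
find_backup() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    for candidate in "$1.bak" "$dir/.bak_$base" "$dir/bak/$base"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example:
# find_backup /opt/plugins/check_nrpe
```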
The executable file can be moved to the backup location directly, but making a copy is recommended. For example, if the executable is a check plugin that op5 Monitor runs periodically, copying the file instead of moving it avoids temporary "file not found" errors that could otherwise occur until the next step is completed.
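As a minimal sketch of the copy approach, assuming the .bak variation is used, the backup could be made as follows. The helper name backup_executable is hypothetical.

```shell
# Hypothetical helper: copy the executable given as $1 to "$1.bak".
# -p keeps the mode, ownership and timestamps of the original file.
backup_executable() {
    cp -p "$1" "$1.bak"
}

# Example, run as a user allowed to write to /opt/plugins:
# backup_executable /opt/plugins/check_nrpe
```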
Installing the wrapper script
The pwrap script should be installed at the original location of the executable that is subject to troubleshooting, overwriting the original file if it was copied (not moved) in the previous step. The pwrap script will automatically find the original executable in one of the bak path variations.
Download the attached pwrap.sh.gz file and upload it to your server before installing it.
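One possible way to put the wrapper in place is sketched below. The exact installation command is not shown in this article, so the helper name install_wrapper, the temporary file name and the 755 mode are all assumptions.

```shell
# Hypothetical helper: unpack the gzipped wrapper given as $1 and put it in
# place of the executable given as $2. These steps are an assumption, not
# the article's original command.
install_wrapper() {
    tmpfile="$2.pwrap.$$"
    gunzip -c "$1" > "$tmpfile" &&
        chmod 755 "$tmpfile" &&
        mv -f "$tmpfile" "$2"
}

# Example, after the backup copy has been made:
# install_wrapper pwrap.sh.gz /opt/plugins/check_nrpe
```

The wrapper is written to a temporary name next to the target and then renamed, so the executable path is never missing or half-written while op5 Monitor keeps running checks.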
Executing the wrapped command
The log directory
The log files are stored in a directory tree under /tmp, created like this:
/tmp/pwrap/<basename of wrapped executable>/<year>-<month>/<day>/<hour>/<minute><second>.<nanosecond>/
For example, if check_nrpe was executed June 19th 2014 16:33:56, the resulting directory ends up like this:
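As a rough illustration, the layout can be reproduced with GNU date. This assumes that the date fields are zero-padded and that GNU date is available (%N is a GNU extension). The nanosecond part of a real run will differ.

```shell
# Assumes GNU date and zero-padded fields. With a fixed timestamp %N is
# 000000000; a real run has the actual nanoseconds at the time of execution.
logdir=$(date -d '2014-06-19 16:33:56' +'/tmp/pwrap/check_nrpe/%Y-%m/%d/%H/%M%S.%N/')
echo "$logdir"
# prints /tmp/pwrap/check_nrpe/2014-06/19/16/3356.000000000/
```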
This means that a new directory is created for each run (unless two commands are executed in the same nanosecond of course...), containing 7 or 8 files.
- The standard output (stdout) data generated by the executed command.
- The standard error (stderr) data generated by the executed command, if any.
- All command line arguments, except the name of the called command (the arg file). Each argument is null terminated, just like the /proc/*/cmdline files.
- The name of the running command, just like it was called, such as /opt/plugins/check_nrpe, ./check_nrpe or check_nrpe.
- The return code of the executed command.
- All environment variables, also null terminated, just like the arg file.
- All collected information, except the stdout/stderr data, formatted in a human readable manner.
- The path to the actual executable file that is run and wrapped (the real file).
The standard input (stdin) stream is not collected, but commands executed by Nagios/Naemon are not fed anything on stdin anyway.
Contents of the log files
Thanks to the log files, it's easy to determine what the command line arguments actually looked like...
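Because the arg file is null terminated, it is easier to read when each item is printed on its own line. A small sketch follows. The helper name show_null_separated is hypothetical, and the same approach should work for the environment variable file.

```shell
# Hypothetical helper: print a null-terminated file, such as arg, with one
# item per line. An item that itself contains a newline spans several lines.
show_null_separated() {
    tr '\0' '\n' < "$1"
}

# Example, from within a log directory:
# show_null_separated arg
```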
Re-executing the command
In some cases it could be useful to re-execute the command which was previously executed by the wrapper script. Perhaps some system issue has been resolved and now you would like to find out if the command works better this time around.
The following example shows how to use the xargs tool to execute the command again, exactly the same way as before, but without the wrapping. The -0 argument means that the list of arguments read on stdin (from the arg file) is null separated. The -t argument means that the resulting command line executed by xargs is displayed.
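A sketch of such a re-execution, wrapped in a hypothetical helper named rerun that takes the log directory as its argument:

```shell
# Hypothetical helper: re-run the command logged in the directory given as $1.
# The executable path is read from the real file and the arguments from arg.
rerun() {
    xargs -0 -t "$(cat "$1/real")" < "$1/arg"
}

# Example:
# rerun /tmp/pwrap/check_nrpe/<year>-<month>/<day>/<hour>/<minute><second>.<nanosecond>
```

In bash, $(<real) is a shorthand for $(cat real). The portable form is used here so that the sketch also runs in a plain POSIX shell.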
Using the real file this way means that the actual backed up executable file will be run. The $(<real) part can be replaced with another executable file path to run some other program with the same argument set.
Restoring the wrapped executable
Once the troubleshooting is complete, simply move the executable from the backup location back to its original location (overwriting the wrapper script).
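For instance, assuming the .bak variation was used for the backup, the restore could be done as sketched below. The helper name restore_executable is hypothetical.

```shell
# Hypothetical helper: move "$1.bak" back to "$1", replacing the wrapper.
# Assumes the .bak variation was used for the backup.
restore_executable() {
    mv -f "$1.bak" "$1"
}

# Example:
# restore_executable /opt/plugins/check_nrpe
```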