Preface
ist-collect is a tool that collects logs, configuration files, and other system data for analysis by Client Services or Engineering as part of a Support ticket. This article describes how ist-collect collects these files and, to some extent, what they contain. It also details how you can manually select what to collect.
For more information on how to inspect the data collected by ist-collect, see this article.
Types of collected data
ist-collect collects data in several steps, outlined below. There are two main types of collection processes: file collection and data collection. For a single node, the workflow is as follows:
File collection
⤷ File list generation (based on modules) → Discovery → Collection
Data collection
⤷ Cluster data
⤷ General system data
⤷ File system data
⤷ PHP data
File collection
Two types of files are collected during the file collection stage: log files and configuration files. First, the node is scanned to determine which files to collect; this is done once per node in the cluster.
File list generation
This is done once per ist-collect run. The file collection in ist-collect is based on different modules. A list of available modules can be found with ist-collect -m. For each module, ist-collect has a list of files and/or directories divided into the categories logs and configs. The current list of modules as of ist-collect version 0.9.8 is:
cron
httpd
livestatus
lmd
logging
mayi
merlin
mysql
nachos
nacoma
naemon
nagvis
network
ninja
php
pnp
postfix
queryhandler
secure
smsd
snmpscan
synergy
syslog
trapper
yum
Let's inspect one of these more closely; the httpd module will serve as our example. The following are the files and directories for httpd:
httpd:
logs:
"/var/log/httpd"
configs:
"/etc/httpd/conf"
"/etc/httpd/conf.d"
"/etc/httpd/conf.modules.d"
In this case, only directories are listed. During the file list generation stage, these paths are therefore passed on as a single list to the file discovery stage:
"/var/log/httpd"
"/etc/httpd/conf"
"/etc/httpd/conf.d"
"/etc/httpd/conf.modules.d"
File discovery
The file list is generated only once, but file discovery is performed individually for each node. During this step, every entry in the file list passed on from the file list generation stage is checked as follows:
- Whether it exists and is readable
- Whether it is a directory
- In each existing directory, all files (but not subdirectories) are marked for collection.
- If there are archived log files, only the five most recent are marked for collection.
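The archived-log rule can be sketched in shell. This is an illustration only, not ist-collect's actual code: the directory and file names are made up, and selecting archives by modification time is an assumption.

```shell
# Demo: given an active log plus rotated archives, mark only the five
# most recent archives (by modification time) for collection.
logdir=$(mktemp -d)
for day in 2020-03-01 2020-03-08 2020-03-15 2020-03-22 2020-03-29 2020-04-05 2020-04-12; do
    # Create fake rotated logs with old timestamps (GNU touch -d).
    touch -d "$day" "$logdir/access_log-$(echo "$day" | tr -d '-')"
done
touch "$logdir/access_log"                # the active log itself
ls -t "$logdir"/access_log-* | head -n 5  # the five newest archives
```

The active log is always collected; only the rotated copies are subject to the five-file limit.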
The result of the file discovery is a new list containing the actual files that will be collected. For our example above, the following list is produced:
/etc/httpd/conf/httpd.conf
/etc/httpd/conf/magic
/etc/httpd/conf.d/README
/etc/httpd/conf.d/autoindex.conf
/etc/httpd/conf.d/httpd-autodiscovery.conf
/etc/httpd/conf.d/httpd-nachos.conf
/etc/httpd/conf.d/httpd-portal.conf
/etc/httpd/conf.d/monitor-api.conf
/etc/httpd/conf.d/monitor-backup.conf
/etc/httpd/conf.d/monitor-ninja.conf
/etc/httpd/conf.d/nagvis.conf
/etc/httpd/conf.d/op5-dokuwiki.conf
/etc/httpd/conf.d/op5-header-options.conf
/etc/httpd/conf.d/op5-method-trace-track.conf
/etc/httpd/conf.d/op5-rewrite-rule.conf
/etc/httpd/conf.d/php.conf
/etc/httpd/conf.d/pnp.conf
/etc/httpd/conf.d/ssl.conf
/etc/httpd/conf.d/userdir.conf
/etc/httpd/conf.d/welcome.conf
/etc/httpd/conf.modules.d/00-base.conf
/etc/httpd/conf.modules.d/00-dav.conf
/etc/httpd/conf.modules.d/00-lua.conf
/etc/httpd/conf.modules.d/00-mpm.conf
/etc/httpd/conf.modules.d/00-proxy.conf
/etc/httpd/conf.modules.d/00-ssl.conf
/etc/httpd/conf.modules.d/00-systemd.conf
/etc/httpd/conf.modules.d/01-cgi.conf
/etc/httpd/conf.modules.d/10-php.conf
/var/log/httpd/access_log
/var/log/httpd/access_log-20200405
/var/log/httpd/access_log-20200412
/var/log/httpd/access_log-20200419
/var/log/httpd/access_log-20200426
/var/log/httpd/error_log
/var/log/httpd/error_log-20200405
/var/log/httpd/error_log-20200412
/var/log/httpd/error_log-20200419
/var/log/httpd/error_log-20200426
/var/log/httpd/ssl_access_log
/var/log/httpd/ssl_access_log-20200405
/var/log/httpd/ssl_access_log-20200412
/var/log/httpd/ssl_access_log-20200419
/var/log/httpd/ssl_access_log-20200426
/var/log/httpd/ssl_error_log
/var/log/httpd/ssl_error_log-20200405
/var/log/httpd/ssl_error_log-20200412
/var/log/httpd/ssl_error_log-20200419
/var/log/httpd/ssl_error_log-20200426
/var/log/httpd/ssl_request_log
/var/log/httpd/ssl_request_log-20200405
/var/log/httpd/ssl_request_log-20200412
/var/log/httpd/ssl_request_log-20200419
/var/log/httpd/ssl_request_log-20200426
This new list is then passed on to the file collection stage.
File collection
Finally, the actual files are copied from the host, either the local host or a remote host, to a temporary directory for later compression. The copying is performed with the rsync utility.
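A minimal sketch of this copy-then-compress flow is shown below. All paths are throwaway demo paths; the exact rsync flags, directory names, and archive format that ist-collect uses are not documented here, so treat them as assumptions. The demo falls back to cp when rsync is not installed.

```shell
# Copy a "collected" file into a temporary directory, then compress
# the result. ist-collect drives rsync from its discovered file list
# (including pulls from remote nodes); this is only an approximation.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'sample log line\n' > "$src/error_log"
if command -v rsync >/dev/null 2>&1; then
    rsync -a "$src/" "$dst/"
else
    cp -a "$src/." "$dst/"
fi
tar -czf "$dst.tar.gz" -C "$dst" .
tar -tzf "$dst.tar.gz"   # lists ./ and ./error_log
```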
Log files
Log files containing Unix timestamps have their timestamps converted to ISO 8601 after collection. This behavior can be disabled with the -u (--keep-unix-timestamps) option.
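The conversion can be reproduced with GNU date. The timestamp below is just an example (it appears in the postfix sample output later in this article), and the exact ISO 8601 format ist-collect writes may differ:

```shell
# Convert a Unix timestamp to an ISO 8601 (UTC) string with GNU date.
date -u -d @1587938413 "+%Y-%m-%dT%H:%M:%SZ"
# 2020-04-26T22:00:13Z
```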
Configuration files
The configuration files are not changed during the collection.
Data collection
During the data collection stage, four types of data are gathered.
Cluster information
This file is generated only once per cluster, whether the cluster comprises one node or a hundred, because the output should not differ significantly across nodes and the third command already concatenates version information from all nodes. The file contains the output of the following commands:
# asmonitor mon node tree
# asmonitor mon node status
# asmonitor mon node ctrl --self --all "cat /etc/op5-monitor-release" | grep VERSION | uniq
# asmonitor mon node show
# mysql nacoma -e 'SELECT * FROM changelog\G'
From our example above, the full file looks like this:
Cluster information generated for OP5 by ITRS Support Tool.
================================================================================
$ asmonitor mon node tree:
+-----+ +----------+
| ipc |----| master02 |
+-----+ +----------+
|
|
| HOSTGROUP: se-gbg +----------+
= --------------------| poller01 |
| +----------+
|
|
| HOSTGROUP: unix-servers +--------+
= --------------------------| slim01 |
+--------+
--------------------------------------------------------------------------------
$ asmonitor mon node status:
Total checks (host / service): 11 / 166
#00 1/1:1 local ipc: ACTIVE - 0.000s latency
Uptime: 1d 7h 48m 11s. Connected: 1d 7h 48m 10s. Last alive: 0s ago
Host checks (handled, expired, total) : 3, 0, 6 (50.00% : 27.27%)
Service checks (handled, expired, total): 9, 0, 19 (47.37% : 5.42%)
#01 0/1:1 peer master02: ACTIVE - 0.000s latency
Uptime: 1d 7h 56m 13s. Connected: 1d 7h 48m 10s. Last alive: 5s ago
Host checks (handled, expired, total) : 3, 0, 6 (50.00% : 27.27%)
Service checks (handled, expired, total): 10, 0, 19 (52.63% : 6.02%)
#02 0/0:0 poller poller01: ACTIVE - 0.000s latency
Uptime: 1d 7h 56m 14s. Connected: 1d 7h 48m 10s. Last alive: 1s ago
Host checks (handled, expired, total) : 3, 0, 3 (100.00% : 27.27%)
Service checks (handled, expired, total): 126, 0, 126 (100.00% : 75.90%)
#03 0/0:0 poller slim01: ACTIVE - 0.000s latency
Uptime: 4w 2d 22h 10m 25s. Connected: 1d 7h 48m 7s. Last alive: 0s ago
Host checks (handled, expired, total) : 2, 0, 2 (100.00% : 18.18%)
Service checks (handled, expired, total): 21, 0, 21 (100.00% : 12.65%)
--------------------------------------------------------------------------------
$ asmonitor mon node ctrl --self --all "cat /etc/op5-monitor-release" | grep VERSION | uniq:
VERSION=8.1.2
--------------------------------------------------------------------------------
$ asmonitor mon node show:
# master02
ADDRESS=master02
TYPE=peer
PORT=15551
NAME=master02
# poller01
ADDRESS=poller01
TYPE=poller
PORT=15551
HOSTGROUP=se-gbg
NAME=poller01
# slim01
NAME=slim01
HOSTGROUP=unix-servers
TAKEOVER=no
CONNECT=no
ADDRESS=192.168.0.4
TYPE=poller
PORT=15551
--------------------------------------------------------------------------------
$ mysql nacoma -e 'SELECT * FROM changelog\G':
--------------------------------------------------------------------------------
The output of each command above is followed by a divider line of "-" characters. Note that the last command did not generate any output.
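The layout of these report files (a "$ command:" header, the command's output, then an 80-character divider) can be mimicked with a small shell helper. The name run_and_log is hypothetical, not something ist-collect provides:

```shell
# Hypothetical helper mimicking the report layout: print the command
# as a "$ ...:" header, run it, then print an 80-character divider.
run_and_log() {
    printf '$ %s:\n' "$1"
    sh -c "$1"
    printf '%80s\n' '' | tr ' ' '-'
}
run_and_log "echo VERSION=8.1.2"
```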
General system data
The 'sysinfo' file is created once for each node in the cluster and contains the output from the following commands:
# grep -D skip . /etc/*-release
# rpm -qf /etc/op5-monitor-release
# uname -a
# uptime
# id
# date
# locale
# iostat -x
# df -hP
# df -iP
# free -mh
# lscpu
# lspci | sed -r 's/^[^ ]+ //' | sort | uniq -c
# ps aux | grep 'merlind\|naemon\|lmd\|httpd'
Besides the above, the status of the following services is also checked:
merlind
naemon
nachos
lmd
httpd
collector
postfix
synergy
smsd
mysqld / mariadb
And this is what the complete file from our example looks like:
Sysinfo generated for OP5 by ITRS Support Tool.
================================================================================
$ grep -D skip . /etc/*-release:
/etc/centos-release:CentOS Linux release 7.7.1908 (Core)
/etc/op5-monitor-release:VERSION=8.1.2
/etc/op5-release:VERSION=2019.a.2-op5.1.20190130130201.el7
/etc/os-release:NAME="CentOS Linux"
/etc/os-release:VERSION="7 (Core)"
/etc/os-release:ID="centos"
/etc/os-release:ID_LIKE="rhel fedora"
/etc/os-release:VERSION_ID="7"
/etc/os-release:PRETTY_NAME="CentOS Linux 7 (Core)"
/etc/os-release:ANSI_COLOR="0;31"
/etc/os-release:CPE_NAME="cpe:/o:centos:centos:7"
/etc/os-release:HOME_URL="https://www.centos.org/"
/etc/os-release:BUG_REPORT_URL="https://bugs.centos.org/"
/etc/os-release:CENTOS_MANTISBT_PROJECT="CentOS-7"
/etc/os-release:CENTOS_MANTISBT_PROJECT_VERSION="7"
/etc/os-release:REDHAT_SUPPORT_PRODUCT="centos"
/etc/os-release:REDHAT_SUPPORT_PRODUCT_VERSION="7"
/etc/redhat-release:CentOS Linux release 7.7.1908 (Core)
/etc/system-release:CentOS Linux release 7.7.1908 (Core)
--------------------------------------------------------------------------------
$ rpm -qf /etc/op5-monitor-release:
op5-monitor-2020.c.1-op5.1.20200330032245.el7.noarch
--------------------------------------------------------------------------------
$ uname -a:
Linux master01 3.10.0-1062.18.1.el7.x86_64 #1 SMP Tue Mar 17 23:49:17 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
--------------------------------------------------------------------------------
$ uptime:
11:36:19 up 25 days, 23:47, 1 user, load average: 0.28, 0.16, 0.14
--------------------------------------------------------------------------------
$ id:
uid=0(root) gid=0(root) groups=0(root)
--------------------------------------------------------------------------------
$ date:
Mon Apr 27 11:36:19 CEST 2020
--------------------------------------------------------------------------------
$ locale:
LANG=C
LC_CTYPE="C"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="C"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="C"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=
--------------------------------------------------------------------------------
$ iostat -x:
Linux 3.10.0-1062.18.1.el7.x86_64 (master01) 04/27/2020 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
2.66 0.00 1.16 0.65 0.00 95.53
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.30 0.25 21.34 8.22 123.82 12.23 0.71 32.90 42.03 32.80 2.59 5.59
dm-0 0.00 0.00 0.00 0.02 0.00 0.06 8.10 0.01 521.57 32.27 543.65 3.89 0.01
dm-1 0.00 0.00 0.25 21.62 8.21 123.75 12.07 0.77 34.87 42.35 34.79 2.56 5.60
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 37.08 0.00 4.04 1.53 123.25 2.49 0.00
--------------------------------------------------------------------------------
$ df -hP:
Filesystem Size Used Avail Use% Mounted on
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 2.0G 201M 1.8G 11% /run
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/mapper/centos-root 42G 19G 24G 45% /
/dev/sda1 497M 233M 265M 47% /boot
/dev/mapper/centos-home 21G 33M 21G 1% /home
tmpfs 396M 0 396M 0% /run/user/0
--------------------------------------------------------------------------------
$ df -iP:
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 502739 353 502386 1% /dev
tmpfs 505679 1 505678 1% /dev/shm
tmpfs 505679 578 505101 1% /run
tmpfs 505679 16 505663 1% /sys/fs/cgroup
/dev/mapper/centos-root 44019712 145792 43873920 1% /
/dev/sda1 512000 358 511642 1% /boot
/dev/mapper/centos-home 21491712 7 21491705 1% /home
tmpfs 505679 1 505678 1% /run/user/0
--------------------------------------------------------------------------------
$ free -mh:
total used free shared buff/cache available
Mem: 3.9G 758M 2.1G 237M 1.0G 2.6G
Swap: 1.0G 123M 932M
--------------------------------------------------------------------------------
$ lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
Stepping: 1
CPU MHz: 1999.998
BogoMIPS: 3999.99
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0,1
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm 3dnowprefetch ssbd ibrs ibpb stibp fsgsbase smep arat spec_ctrl intel_stibp flush_l1d arch_capabilities
--------------------------------------------------------------------------------
$ lspci | sed -r 's/^[^ ]+ //' | sort | uniq -c:
1 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
1 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
1 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
1 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 08)
1 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01)
32 PCI bridge: VMware PCI Express Root Port (rev 01)
1 PCI bridge: VMware PCI bridge (rev 02)
1 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
1 System peripheral: VMware Virtual Machine Communication Interface (rev 10)
1 VGA compatible controller: VMware SVGA II Adapter
--------------------------------------------------------------------------------
$ ps aux | grep 'merlind\|naemon\|lmd\|httpd':
root 1270 0.0 0.4 521124 19464 ? Ss Apr01 2:13 /usr/sbin/httpd -DFOREGROUND
apache 1732 0.0 0.5 630652 22164 ? S Apr26 0:03 /usr/sbin/httpd -DFOREGROUND
apache 1733 0.0 0.5 629868 21732 ? S Apr26 0:01 /usr/sbin/httpd -DFOREGROUND
apache 1734 0.0 0.3 521820 13924 ? S Apr26 0:00 /usr/sbin/httpd -DFOREGROUND
apache 1735 0.0 0.6 632668 24284 ? S Apr26 0:01 /usr/sbin/httpd -DFOREGROUND
apache 1736 0.0 0.3 521820 13864 ? S Apr26 0:00 /usr/sbin/httpd -DFOREGROUND
monitor 1737 0.0 0.0 137920 3816 ? Ss Apr26 0:06 /usr/bin/merlind --config /opt/monitor/op5/merlin/merlin.conf --debug
monitor 1805 0.1 0.2 311632 9244 ? Ss Apr26 2:09 /usr/bin/naemon --daemon /opt/monitor/etc/naemon.cfg
monitor 1806 0.0 0.0 20332 1380 ? S Apr26 0:07 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
monitor 1807 0.0 0.0 20332 1364 ? S Apr26 0:07 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
monitor 1808 0.0 0.0 20332 1364 ? S Apr26 0:07 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
monitor 1809 0.0 0.0 20332 1368 ? S Apr26 0:07 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
monitor 1810 0.0 0.1 114496 5368 ? S Apr26 0:09 /usr/bin/naemon --daemon /opt/monitor/etc/naemon.cfg
monitor 2024 0.1 0.4 456476 17596 ? Ssl Apr01 49:02 /usr/bin/lmd --config=/etc/op5/lmd/lmd.ini --logfile=/var/log/op5/lmd.log &
apache 9148 0.0 0.5 632652 24224 ? S Apr26 0:00 /usr/sbin/httpd -DFOREGROUND
apache 13373 0.0 0.3 521928 13588 ? S 00:00 0:00 /usr/sbin/httpd -DFOREGROUND
apache 13380 0.0 0.2 521260 10792 ? S 00:00 0:00 /usr/sbin/httpd -DFOREGROUND
root 27213 0.0 0.0 113156 1196 pts/0 S+ 11:36 0:00 bash -c ps aux | grep 'merlind\|naemon\|lmd\|httpd'
root 27215 0.0 0.0 112688 928 pts/0 S+ 11:36 0:00 grep merlind\|naemon\|lmd\|httpd
--------------------------------------------------------------------------------
$ service merlind status:
* merlind.service - Merlin
Loaded: loaded (/usr/lib/systemd/system/merlind.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2020-04-26 03:48:05 CEST; 1 day 7h ago
Main PID: 1737 (merlind)
CGroup: /system.slice/merlind.service
`-1737 /usr/bin/merlind --config /opt/monitor/op5/merlin/merlin.conf --debug
Apr 26 03:48:05 master01 systemd[1]: Started Merlin.
Apr 26 03:48:05 master01 merlind[1737]: Initializing IPC socket '/var/lib/merlin/ipc.sock' for daemon
Apr 26 03:48:05 master01 merlind[1737]: Logging to '/var/log/op5/merlin/daemon.log'
Apr 26 03:48:05 master01 merlind[1737]: Merlin daemon 2020.c.2 successfully initialized
Apr 26 03:48:07 master01 merlind[1737]: Accepting inbound connection on ipc socket
Apr 26 03:48:07 master01 merlind[1737]: NODESTATE: ipc: STATE_NONE -> STATE_NEGOTIATING: Accepted
Apr 26 03:48:07 master01 merlind[1737]: NODESTATE: ipc: STATE_NEGOTIATING -> STATE_CONNECTED: Connected
--------------------------------------------------------------------------------
$ service naemon status:
* naemon.service - Naemon Monitoring Daemon
Loaded: loaded (/usr/lib/systemd/system/naemon.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/naemon.service.d
`-op5-monitor.conf
Active: active (running) since Sun 2020-04-26 03:48:07 CEST; 1 day 7h ago
Docs: http://naemon.org/documentation
Process: 1798 ExecStart=/usr/bin/naemon --daemon /opt/monitor/etc/naemon.cfg (code=exited, status=0/SUCCESS)
Process: 1755 ExecStartPre=/usr/bin/su --login monitor --shell /bin/sh --command /usr/bin/naemon --verify-config --precache-objects /opt/monitor/etc/naemon.cfg (code=exited, status=0/SUCCESS)
Process: 1753 ExecStartPre=/usr/bin/chown --recursive monitor:apache /var/cache/naemon (code=exited, status=0/SUCCESS)
Process: 1751 ExecStartPre=/usr/bin/mkdir --parents /var/cache/naemon (code=exited, status=0/SUCCESS)
Process: 1749 ExecStartPre=/usr/bin/sh /etc/sysconfig/naemon (code=exited, status=0/SUCCESS)
Main PID: 1805 (naemon)
CGroup: /system.slice/naemon.service
|-1805 /usr/bin/naemon --daemon /opt/monitor/etc/naemon.cfg
|-1806 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
|-1807 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
|-1808 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
|-1809 /usr/bin/naemon --worker /opt/monitor/var/rw/nagios.qh
`-1810 /usr/bin/naemon --daemon /opt/monitor/etc/naemon.cfg
Apr 27 05:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
Apr 27 06:00:00 master01 naemon[1805]: TIMEPERIOD TRANSITION: early_workday;0;1
Apr 27 06:25:00 master01 naemon[1805]: HOST DOWNTIME ALERT: openbsdLab;STOPPED; Host has exited from a period of scheduled downtime
Apr 27 06:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
Apr 27 07:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
Apr 27 08:00:00 master01 naemon[1805]: TIMEPERIOD TRANSITION: nonworkhours;1;0
Apr 27 08:00:00 master01 naemon[1805]: TIMEPERIOD TRANSITION: workhours;0;1
Apr 27 08:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
Apr 27 09:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
Apr 27 10:48:07 master01 naemon[1805]: Auto-save of retention data completed successfully.
--------------------------------------------------------------------------------
$ service nachos status:
* nachos.service - A Naemon configuration management service
Loaded: loaded (/usr/lib/systemd/system/nachos.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:49:03 CEST; 3 weeks 4 days ago
Main PID: 1998 (nachos-server)
CGroup: /system.slice/nachos.service
|-1998 /opt/monitor/nachos/venv/bin/python3.4 /opt/monitor/nachos/venv/bin/nachos-server --server api --config-file /opt/monitor/nachos/nachos.cfg
`-2078 /opt/monitor/nachos/venv/bin/python3.4 /opt/monitor/nachos/venv/bin/nachos-server --server api --config-file /opt/monitor/nachos/nachos.cfg
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
--------------------------------------------------------------------------------
$ service lmd status:
* lmd.service - op5 lmd integration
Loaded: loaded (/usr/lib/systemd/system/lmd.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:49:20 CEST; 3 weeks 4 days ago
Main PID: 2024 (lmd)
CGroup: /system.slice/lmd.service
`-2024 /usr/bin/lmd --config=/etc/op5/lmd/lmd.ini --logfile=/var/log/op5/lmd.log &
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
--------------------------------------------------------------------------------
$ service httpd status:
* httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:49:00 CEST; 3 weeks 4 days ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 1715 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 1270 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
CGroup: /system.slice/httpd.service
|- 1270 /usr/sbin/httpd -DFOREGROUND
|- 1732 /usr/sbin/httpd -DFOREGROUND
|- 1733 /usr/sbin/httpd -DFOREGROUND
|- 1734 /usr/sbin/httpd -DFOREGROUND
|- 1735 /usr/sbin/httpd -DFOREGROUND
|- 1736 /usr/sbin/httpd -DFOREGROUND
|- 9148 /usr/sbin/httpd -DFOREGROUND
|-13373 /usr/sbin/httpd -DFOREGROUND
`-13380 /usr/sbin/httpd -DFOREGROUND
Apr 21 08:34:08 master01 sudo[21752]: apache : TTY=unknown ; PWD=/opt/monitor/op5/nacoma ; USER=monitor ; COMMAND=/usr/bin/mon node show
Apr 26 03:48:01 master01 systemd[1]: Reloading The Apache HTTP Server.
Apr 26 03:48:03 master01 systemd[1]: Reloaded The Apache HTTP Server.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
--------------------------------------------------------------------------------
$ service collector status:
* collector.service - op5 trap collector collects incomming SNMP traps
Loaded: loaded (/usr/lib/systemd/system/collector.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:49:03 CEST; 3 weeks 4 days ago
Main PID: 1997 (collector)
CGroup: /system.slice/collector.service
`-1997 /opt/trapper/bin/collector -On -c /opt/trapper/etc/collector.conf -A -Lsd -f
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
--------------------------------------------------------------------------------
$ service postfix status:
* postfix.service - Postfix Mail Transport Agent
Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:48:55 CEST; 3 weeks 4 days ago
Main PID: 1742 (master)
CGroup: /system.slice/postfix.service
|- 1742 /usr/libexec/postfix/master -w
|- 1744 qmgr -l -t unix -u
`-25350 pickup -l -t unix -u
Apr 27 00:00:13 master01 postfix/pickup[12087]: 49455400604A: uid=299 from=<ITRSOP5Monitor@master01.int.op5.com>
Apr 27 00:00:13 master01 postfix/cleanup[13347]: 49455400604A: message-id=<1587938413.5ea6046d2d8a6@swift.generated>
Apr 27 00:00:13 master01 postfix/qmgr[1744]: 49455400604A: from=<ITRSOP5Monitor@master01.int.op5.com>, size=94141, nrcpt=1 (queue active)
Apr 27 00:00:13 master01 postfix/smtp[13354]: 49455400604A: to=<jthoren@op5.com>, relay=eu-smtp-inbound-2.mimecast.com[195.130.217.236]:25, delay=0.76, delays=0.18/0/0.36/0.22, dsn=5.0.0, status=bounced (host eu-smtp-inbound-2.mimecast.com[195.130.217.236] said: 550 zen.mimecast.org https://www.spamhaus.org/sbl/query/SBLCSS. - https://community.mimecast.com/docs/DOC-1369#550 [P4NbsQUBPuOny0bUlP9L3Q.uk58] (in reply to RCPT TO command))
Apr 27 00:00:14 master01 postfix/cleanup[13347]: 025FA40D7ED6: message-id=<20200426220014.025FA40D7ED6@master01.localdomain>
Apr 27 00:00:14 master01 postfix/qmgr[1744]: 025FA40D7ED6: from=<>, size=2741, nrcpt=1 (queue active)
Apr 27 00:00:14 master01 postfix/bounce[13361]: 49455400604A: sender non-delivery notification: 025FA40D7ED6
Apr 27 00:00:14 master01 postfix/qmgr[1744]: 49455400604A: removed
Apr 27 00:00:14 master01 postfix/smtp[13354]: 025FA40D7ED6: to=<ITRSOP5Monitor@master01.int.op5.com>, relay=none, delay=0.01, delays=0/0/0/0, dsn=5.4.4, status=bounced (Host or domain name not found. Name service error for name=master01.int.op5.com type=AAAA: Host not found)
Apr 27 00:00:14 master01 postfix/qmgr[1744]: 025FA40D7ED6: removed
--------------------------------------------------------------------------------
$ service synergy status:
* synergy.service - synergy processor
Loaded: loaded (/usr/lib/systemd/system/synergy.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:48:51 CEST; 3 weeks 4 days ago
Main PID: 1285 (synergy)
CGroup: /system.slice/synergy.service
`-1285 /usr/bin/lua /opt/synergy/bin/synergy --monitor --no-daemon
Apr 27 10:48:28 master01 synergy[1285]: Livestatus socket:
Apr 27 10:53:28 master01 synergy[1285]: Livestatus socket:
Apr 27 10:58:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:03:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:08:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:13:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:18:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:23:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:28:28 master01 synergy[1285]: Livestatus socket:
Apr 27 11:33:28 master01 synergy[1285]: Livestatus socket:
--------------------------------------------------------------------------------
$ service smsd status:
* smsd.service - smsd
Loaded: loaded (/usr/lib/systemd/system/smsd.service; disabled; vendor preset: disabled)
Active: active (running) since Sun 2020-04-26 03:48:12 CEST; 1 day 7h ago
Process: 1849 ExecStartPre=/bin/chown smstools:smstools /var/run/smsd.working (code=exited, status=0/SUCCESS)
Process: 1847 ExecStartPre=/bin/touch /var/run/smsd.working (code=exited, status=0/SUCCESS)
Process: 1845 ExecStartPre=/bin/chown smstools:smstools /var/run/smsd.pid (code=exited, status=0/SUCCESS)
Process: 1843 ExecStartPre=/bin/touch /var/run/smsd.pid (code=exited, status=0/SUCCESS)
Main PID: 1852 (smsd)
CGroup: /system.slice/smsd.service
|-1852 /usr/sbin/smsd -t
`-1856 /usr/sbin/smsd -t
Apr 26 03:48:12 master01 systemd[1]: Stopped smsd.
Apr 26 03:48:12 master01 systemd[1]: Starting smsd...
Apr 26 03:48:12 master01 systemd[1]: Started smsd.
--------------------------------------------------------------------------------
$ service mysqld status || systemctl status mariadb:
* mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-01 11:49:03 CEST; 3 weeks 4 days ago
Main PID: 1404 (mysqld_safe)
CGroup: /system.slice/mariadb.service
|-1404 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
`-1682 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mariadb/mariadb.log --pid-file=/var/run/mariadb/mariadb.pid --socket=/var/lib/mysql/mysql.sock
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
--------------------------------------------------------------------------------
File system data
To troubleshoot file permissions and similar issues, a file containing ls output from several directories of interest is included in the collection. The following commands are run:
# ls -alh /opt
# ls -alh -R /opt/monitor/
# ls -alh -R /opt/plugins/
The output is too long to reproduce here; try the commands on your own server.
PHP data
The final file is a dump of the current PHP configuration settings, based on this command:
# php -r 'print_r(ini_get_all(null, false));'
Sensitive files
To some extent, all log files and configuration files contain sensitive data, and user data is scattered across multiple locations. Some files, however, are more sensitive than others. ist-collect therefore has a special category of modules that are only included in the file collection stage when the -A option is used, or when the modules are specified explicitly as arguments to ist-collect. These modules are:
- mayi
- secure
- smsd
Mayi
This module contains log files and configuration files related to authorization, including user data.
Secure
This module contains the log file /var/log/secure, which holds sensitive authentication data.
Smsd
This module contains log files and configuration files with telephone numbers, names, and other user data.
Selecting what to collect
By default, ist-collect will perform file collection of non-sensitive files and a full data collection on the local node. Data from other nodes in the cluster will only be collected when using the option -a.
To skip file collection altogether, you can use the option -s. With this option, only data collection is performed. This may be combined with -a to perform the data collection on all nodes in the cluster.
It's currently not possible to skip data collection.
The recommended way to limit the files to collect is to manually specify what modules to collect. Remember that the currently available modules can be seen with -m.
Modules must be listed at the end of the command, after any options. See examples below:
Collect files relating to merlin, naemon, and ninja from all nodes in the cluster:
# ist-collect -a merlin naemon ninja
Only collect files related to livestatus:
# ist-collect livestatus