Related to:
OP5 Monitor email or SMS notifications are not being sent correctly.
Problem
Email or SMS notifications are expected for hosts and/or services with Warning or Critical status but none have been received.
Possible Cause(s)
For both email and SMS notifications
- The objects could be in downtime. Objects in downtime do not send notifications.
- Notifications may be disabled system-wide or for the specific object. The object may also not have notifications enabled for specific states.
- No contact is associated with the object that is trying to send a notification.
- The object has been in a persistent warning or down state, causing notifications to be suppressed.
For email notifications
- The email service has errors.
- The email service is inactive.
For SMS notifications
- The SMS service has errors.
- The SMS service is inactive.
Possible Solution(s)
For both email and SMS notifications
- Install and run ist diagnose to get details on the status of your system.
- Check that the Naemon service is up.
On the command line, run:
systemctl status naemon
If the status of the service shows inactive or dead, start the service with the following command:
systemctl start naemon
You can also scan the Naemon logs for errors about notifications. In some instances, the logs can provide clues on when and why a notification was or was not sent. The Naemon log file is /opt/monitor/var/naemon.log. An example of this can be seen below:
[root@op5-system ~]# cat /opt/monitor/var/naemon.log | grep SUPPRESSED:
[1631794051] HOST NOTIFICATION SUPPRESSED: switche977ca;No notification sent, because no contacts were found for notification purposes.
Read on for further details on the notification suppression.
When raising tickets with this kind of issue, a copy of the naemon.log file will be helpful.
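When the log holds many suppression entries, grouping them by reason can show the dominant cause at a glance. The following is a minimal sketch, not a tool shipped with OP5 Monitor: the function name summarise_suppressed is hypothetical, and it assumes the reason is the text after the last semicolon on each line, as in the example above.

```shell
# Hypothetical helper: count suppressed notifications per reason.
# Assumption: the reason is everything after the last ';' on the log line.
# Usage: summarise_suppressed [logfile]
summarise_suppressed() {
    grep 'NOTIFICATION SUPPRESSED:' "${1:-/opt/monitor/var/naemon.log}" \
        | sed 's/.*;//' \
        | sort | uniq -c | sort -rn
}
```

Called without an argument, the function reads /opt/monitor/var/naemon.log; pass another path to inspect a copy of the log.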
- Check that the host or service is not in a state of scheduled downtime.
Objects that are currently in downtime may have their notifications suppressed. To check if an object is in downtime, you can navigate to the object's UI page. There will be text indicating that the object is in downtime.
- Check that notifications are enabled system-wide.
Navigate through Manage > Process information on the UI to see if notifications are enabled or not.
Notifications can also be toggled in the Naemon configuration file. The enable_notifications directive in /opt/monitor/etc/naemon.cfg controls whether notifications are active (1) or inactive (0).
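To read the directive without opening the file in an editor, a small wrapper such as the one below may help. It is a sketch only: the function name check_global_notifications is hypothetical, it assumes the usual key=value layout of the file, and it reads the file alone, so it does not reflect a change made to the running process through the UI.

```shell
# Hypothetical helper: report the enable_notifications value from naemon.cfg.
# Assumption: the directive is written as "enable_notifications=0" or "=1".
# Usage: check_global_notifications [cfgfile]
check_global_notifications() {
    cfg="${1:-/opt/monitor/etc/naemon.cfg}"
    # Keep only the last occurrence in case the directive appears twice.
    value=$(sed -n 's/^[ \t]*enable_notifications[ \t]*=[ \t]*\([01]\).*/\1/p' "$cfg" | tail -n 1)
    case "$value" in
        1) echo "notifications enabled" ;;
        0) echo "notifications disabled" ;;
        *) echo "enable_notifications not found in $cfg"; return 1 ;;
    esac
}
```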
- Check that the host or service has notifications enabled for the state you are expecting.
Navigate through the UI to the object's configuration page and look for the notifications_enabled and notification_options fields.
Each ticked box under notification_options is a state for which a notification is sent. With every box ticked, notifications are sent for all states that the object enters. If you only need notifications for Critical, Warning, and Recovery, then only tick those boxes.
If the configuration page does not show these by default, click the Advanced button on the upper right and then locate them.
- Check that the host or service has contact information associated with it.
Navigate through the UI to the object's configuration page and look for the contacts and contact_groups fields.
For example, if contacts lists "Test Contact 1" and contact_groups lists "Test Group 1", then any notification the object sends goes to "Test Contact 1" and to all members of the "Test Group 1" contact group.
- If the object in question has been in a down state for a prolonged period of time, check the notification interval.
In some instances, you may be expecting a notification every x minutes for as long as an object is in a non-OK state. This is configured on the object's configuration page by specifying a value for the notification_interval field.
If you set this to 0, only one notification will be sent out; this is considered best practice. If you set this to 10, a notification will be sent out every 10 minutes.
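The fields checked in the steps above (notifications_enabled, notification_options, contacts, contact_groups, and notification_interval) can also be read from the command line. The sketch below is an illustration under stated assumptions, not an OP5 utility: the function name show_notification_settings is hypothetical, it handles host definitions only, and it assumes the hosts are stored as plain "define host { ... }" blocks in a file you pass in (for example /opt/monitor/etc/hosts.cfg, which is itself an assumption about your installation). Values inherited from a template will not appear.

```shell
# Hypothetical helper: print notification-related directives for one host.
# Assumption: hosts are defined in "define host { ... }" blocks in the file.
# Usage: show_notification_settings <cfgfile> <host_name>
show_notification_settings() {
    awk -v host="$2" '
        # Start of a host block: reset the per-block state.
        /^[ \t]*define[ \t]+host[ \t]*[{]/ { in_block = 1; wanted = 0; buf = ""; next }
        # End of a block: print what was collected if it was the right host.
        in_block && /^[ \t]*[}]/ { if (wanted) printf "%s", buf; in_block = 0; next }
        in_block {
            if ($1 == "host_name" && $2 == host) wanted = 1
            if ($1 ~ /^(notifications_enabled|notification_options|notification_interval|contacts|contact_groups)$/) {
                line = $0
                sub(/^[ \t]+/, "", line)   # drop leading indentation
                buf = buf line "\n"
            }
        }
    ' "$1"
}
```

An empty result means either that the host was not found in the file or that none of these directives are set directly on it.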
For email notifications
- Check that the Postfix service is active.
On the command line, issue:
systemctl status postfix
If the status of the service shows inactive or dead, start the service with the following command:
systemctl start postfix
- Check the mail service log file.
The default log file for a standard installation of OP5 Monitor is /var/log/maillog. In some instances, you may see errors. An example can be found below:
Aug 13 12:13:26 op5-system postfix/pickup[1512]: 48EF514E8E1: uid=0 from=<root>
Aug 13 12:13:26 op5-system postfix/cleanup[2670]: 48EF514E8E1: message-id=20210813064326.48EF514E8E1@op5-system.woodstock.ac.in
Aug 13 12:13:26 op5-system postfix/qmgr[1513]: 48EF514E8E1: from=root@op5-system, size=393, nrcpt=1 (queue active)
Aug 13 12:13:27 op5-system postfix/smtp[2676]: 514C814E8DF: to=receiver@domain.com, relay=mail.smtp2go.com[xx.xx.xx.xx]:2525, delay=7.7, delays=0.11/0.05/3.1/4.4, dsn=5.0.0, status=bounced (host mail.smtp2go.com[xx.xx.xx.xx] said: 550-Verification failed for root@op5-system 550-Unrouteable address 550 unable to verify sender address. (in reply to RCPT TO command)
The reply code in the maillog is a good place to start checking. In the example above, the relay mail.smtp2go.com rejected the message with a permanent 550 error because it could not verify the sender address root@op5-system ("Unrouteable address"). This points to a mail configuration issue, such as a sender address or domain that the relay does not accept. In this case, check the mail configuration with your System Administrator.
When raising tickets with this kind of issue, a copy of the maillog file will be helpful.
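On a busy system, it can be quicker to list only the deliveries that did not succeed. The following is a minimal sketch, not part of OP5 Monitor or Postfix: the function name mail_failures is hypothetical, and it assumes Postfix delivery lines of the form shown in the example above, where to=, dsn=, and status= appear in that order.

```shell
# Hypothetical helper: list bounced or deferred deliveries from the mail log.
# Prints one line per failure: <status> <dsn> <recipient>
# Usage: mail_failures [logfile]
mail_failures() {
    grep -E 'status=(bounced|deferred)' "${1:-/var/log/maillog}" \
        | sed -E 's/.* to=<?([^<>, ]+)>?,.* dsn=([0-9.]+), status=([a-z]+).*/\3 \2 \1/'
}
```

For the bounced message in the example above, the function prints the status, the DSN code 5.0.0, and the recipient address on a single line.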
For SMS notifications
- Check that the smsd service is active.
On the command line, issue:
systemctl status smsd
If the status of the service shows inactive or dead, start the service with the following command:
systemctl start smsd
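The same check-then-start sequence is used in this article for naemon, postfix, and smsd, so it can be wrapped in one small function. This is a sketch under the assumption that all three run as systemd units with those names, as the commands above imply; the function name ensure_active is hypothetical.

```shell
# Hypothetical helper: start a systemd unit only if it is not already active.
# Usage: ensure_active <unit>
ensure_active() {
    unit="$1"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit: active"
    else
        echo "$unit: not active, starting"
        systemctl start "$unit"
    fi
}
```

For example, calling ensure_active smsd leaves a running service untouched and starts a stopped one. If the start fails, the function returns the exit status of systemctl start, and the service log is the next place to look.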
- If smsd cannot be started, check that its run directory exists.
Check whether the /var/run/smsd directory exists. If it does not, run the following command:
mkdir -p /var/run/smsd && chown smstools: /var/run/smsd
The smsd service is also not configured to start automatically at boot. Run the following command to enable it:
systemctl enable smsd
Then restart the service:
systemctl restart smsd
- Check the smsd service log.
The log file for the smsd service is /var/log/smsd/smsd.log. It typically shows errors from starting the service or from sending out a notification. An example problem can be seen below:
GSM1: Command is sent, waiting for the answer. (5)
GSM1: No answer, put_command expected (OK)|(ERROR)|(0)|(4), timeout occurred. 22.
GSM1: <-
GSM1: Modem is not ready to answer commands (Timeouts: 22)
GSM1: Failed to initialize modem GSM1. Stopping.
In this instance, the modem itself is having a problem. You can try restarting it. For further modem-related issues, you may contact your modem provider.
When raising tickets with this kind of issue, a copy of the smsd.log file will be helpful.
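To pull only the modem failures out of a long smsd log, a filter like the one below may be enough. It is a minimal sketch: the function name modem_failures is hypothetical, and it matches only the two failure messages that appear in the example above, so other error texts will not be caught.

```shell
# Hypothetical helper: show modem failure lines from the smsd log.
# Assumption: failures use the two messages seen in the example above.
# Usage: modem_failures [logfile]
modem_failures() {
    grep -E 'Modem is not ready to answer commands|Failed to initialize modem' \
        "${1:-/var/log/smsd/smsd.log}"
}
```

The function returns a non-zero exit status when no failure lines are found, which makes it usable in a simple if test.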
Related Articles
- Manage Notifications documentation
- Naemon notifications documentation
- How to send outgoing notifications via SMTP relay
- Configure Postfix
- How do I troubleshoot the GSM Modem / GSM Gateway / GSM Terminal? (SMS)
- Configure an SMS modem
- How to fix the smsd service not starting due to missing directory
If the issue persists
- Please contact our Client Services team via the chat service box available on any of our websites or via email to support@itrsgroup.com.
- Make sure you tell us:
- Any troubleshooting steps from this article that you have already tried.