When setting data_timeout in the merlin.conf on a system that is peered, remember to also set the same value on the peer.

When running a command like mon node status, peers and pollers in your cluster may sometimes show up as INACTIVE. This happens when Merlin has been unable to verify that the node is alive, and you can see Merlin actively trying to verify this in neb.log.
There are, however, situations where you know that pollers or peers will be unstable, and where you would rather have Merlin be more tolerant of a node not responding for a while. Perhaps you have a peer or poller that is geographically far away, connected via an unreliable network, or both. For cases like these, you can use the data_timeout setting in Merlin to indicate that Merlin should be more lenient with this node and allow it a longer time to reconnect before marking it as INACTIVE.
Example configuration
Every Merlin configuration tells Merlin about all other nodes except itself, essentially from its own perspective. For a cluster with 4 nodes, the primary master's configuration will list 3 nodes, since it excludes itself. If you wish to increase the number of seconds that must pass before the master classifies a poller as INACTIVE, the following value is added to the master's merlin.conf:
poller poller01 {
    data_timeout = 600
    hostgroup = foo
    address = poller01
    port = 15551
    takeover = no
    notifies = no
}
The data_timeout value means that this master will wait 600 seconds before actually marking poller01 as INACTIVE.
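For the peered case mentioned at the top of this article, the same setting would go into the peer block on both sides, since each node's merlin.conf describes the other node from its own perspective. The following is a minimal sketch, not a verified configuration: the node names master01 and master02 are hypothetical, the port is simply reused from the poller example, and it assumes data_timeout is accepted in a peer block in the same way as in a poller block.

```
# merlin.conf on master01 (hypothetical name): describes its peer, master02
peer master02 {
    data_timeout = 600
    address = master02
    port = 15551
}

# merlin.conf on master02 (hypothetical name): describes its peer, master01
peer master01 {
    data_timeout = 600
    address = master01
    port = 15551
}
```

Keeping the value identical in both files means neither peer marks the other as INACTIVE sooner than its counterpart would.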