Overview
Some processes within a Collector cluster replicate data to each other (i.e. `opsview-datastore` and `opsview-messagequeue`). In clusters of 4 or more nodes, high-latency networks can cause problems with this replication, so we recommend that at most 3 nodes within a cluster run these components (note: use either 1 node or 3 nodes, otherwise the cluster could be affected by 'lack of majority' issues).
This page documents how to amend the datastore and messagequeue configuration within an existing Opsview Deploy managed cluster (rather than a "Remote Collector" managed cluster).
Process
Note: There will be a monitoring outage while this process is being followed.
1) Identify which 3 nodes in the cluster will become the Datastore/MessageQueue-only (DS/MQ) nodes
2) On the Orchestrator, edit /opt/opsview/deploy/etc/opsview_deploy.yml, locate the section for the cluster and add the following lines at the end of that section:
datastore_hosts: &repl_hosts_for_de_infra
  ov-de-infra-1: { ip: 10.12.0.31 }
  ov-de-infra-2: { ip: 10.12.0.32 }
  ov-de-infra-3: { ip: 10.12.0.33 }
messagequeue_hosts: *repl_hosts_for_de_infra
For example, if the configuration originally looked like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
then it should be amended to look like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
    datastore_hosts: &repl_hosts_for_cluster
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
    messagequeue_hosts: *repl_hosts_for_cluster
No configuration changes are required in user_vars.yml.
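Before moving on, it can be worth sanity-checking that the amended file is still valid YAML. A minimal check, assuming python3 and the PyYAML module are available on the Orchestrator (they usually are wherever Ansible runs, but adjust the interpreter if not), is:
python3 -c "import yaml; yaml.safe_load(open('/opt/opsview/deploy/etc/opsview_deploy.yml'))"
If the file parses cleanly, the command prints nothing and exits 0.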
3) In the UI, on the Configuration -> Monitoring Collectors -> Clusters page, edit the cluster and unmark the DS/MQ nodes and submit the changes
4) In the UI, on the Configuration -> Monitoring Collectors -> Collectors page, delete the 3 DS/MQ nodes
5) In the UI, on the Configuration -> Hosts page, edit the 3 DS/MQ nodes and remove the following host templates:
- Opsview - Component - Cache Manager
- Opsview - Component - Executor
- Opsview - Component - Results Forwarder
- Opsview - Component - Results Sender
- Opsview - Component - SNMP Traps
- Opsview - Component - SNMP Traps Collector
- Opsview - Component - Scheduler
6) In the UI, on the Configuration -> Hosts page, edit the other nodes in the cluster and remove the following host templates:
- Opsview - Component - Datastore
- Opsview - Component - Messagequeue
Also, on the Variables tab, remove all OPSVIEW_LOADBALANCER_PROXY variables
7) In the UI, run Apply Changes
8) On the Orchestrator, remove the datastore/messagequeue components from *all* collector nodes in the cluster (NOTE: the 'opsview_cluster_cluster' group name is derived from the cluster name in opsview_deploy.yml, with spaces changed to underscores, characters lowercased and 'opsview_cluster_' prepended; a quick way to confirm the group membership is shown after the removal commands below):
cd /opt/opsview/deploy
. bin/rc.ansible
ansible -m shell -a "rpm -e opsview-datastore opsview-messagequeue" opsview_cluster_cluster
ansible -m shell -a "rm -rf /opt/opsview/datastore /opt/opsview/messagequeue" opsview_cluster_cluster
9) On the Orchestrator, remove the components that are no longer required from the cluster's datastore/messagequeue hosts (NOTE: the single quotes are required; the limit selects the nodes within the specified cluster that are in the datastore group, so only amend the 'opsview_cluster_cluster' part to match your cluster's group name):
ansible -m shell -a "yum erase -y opsview-executor opsview-scheduler opsview-cache-manager opsview-*-scanner opsview-flow-* opsview*collector opsview-snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
ansible -m shell -a "rm -rf /opt/opsview/executor /opt/opsview/scheduler /opt/opsview/cachemanager /opt/opsview/*scanner /opt/opsview/*collector /opt/opsview/snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
10) On the Orchestrator, reinstall the required components on all cluster nodes by running the appropriate playbooks against the collector cluster group:
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/loadbalancer-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/datastore-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/messagequeue-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/setup-monitoring.yml
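Once the playbooks have completed, a quick way to confirm the expected processes are running on each node is to query the Opsview watchdog; this sketch assumes the standard watchdog location under /opt/opsview/watchdog:
ansible -m shell -a "/opt/opsview/watchdog/bin/opsview-monit summary" opsview_cluster_cluster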
11) In the UI, run Apply Changes