Overview
Some processes within a Collector cluster, namely Datastore and MessageQueue, replicate data between the nodes. In clusters of 4 or more nodes, high-latency networks can cause problems with this replication, so we recommend that only 3 nodes within a cluster run these components (note: use either 1 node or 3 nodes, otherwise the cluster could be affected by 'lack of majority' issues).
This page documents how to amend the Datastore and MessageQueue configuration within an existing cluster.
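If you want to see which nodes are currently taking part in MessageQueue replication before making any changes, the bundled rabbitmqctl can report the cluster membership. This is a minimal sketch only; the path assumes a default /opt/opsview layout and may differ on your installation:
# Run on any node in the cluster; may need to be run as the user that owns the messagequeue processes
/opt/opsview/messagequeue/sbin/rabbitmqctl cluster_status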
Process
Note: There will be a monitoring outage while this process is being followed
1) Identify which 3 nodes in the cluster will become the Datastore/MessageQueue (DS/MQ) only nodes
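If it helps with identifying the nodes, the deploy inventory group for the cluster can be listed from the Orchestrator without making any changes (the group name follows the 'opsview_cluster_' naming convention described in step 8):
cd /opt/opsview/deploy
. bin/rc.ansible
ansible --list-hosts opsview_cluster_cluster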
2) On the Orchestrator, edit /opt/opsview/deploy/etc/opsview_deploy.yml, locate the section for the cluster, remove the chosen DS/MQ nodes from the collector_hosts list and add the following lines at the end of the section (the '&' defines a YAML anchor which 'messagequeue_hosts' reuses via the '*' alias, so both components are configured on the same 3 nodes):
datastore_hosts: &repl_hosts_for_de_infra
  ov-de-infra-1: { ip: 10.12.0.31 }
  ov-de-infra-2: { ip: 10.12.0.32 }
  ov-de-infra-3: { ip: 10.12.0.33 }
messagequeue_hosts: *repl_hosts_for_de_infra
For example, if the configuration originally looked like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
should be amended to look like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
    datastore_hosts: &repl_hosts_for_cluster
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
    messagequeue_hosts: *repl_hosts_for_cluster
There are no configuration changes required for user_vars.yml.
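Before moving on, it can be worth checking that the edited file still parses as valid YAML. This is a generic sanity check rather than an Opsview-specific tool, and assumes python3 with PyYAML is available on the Orchestrator:
# Prints "YAML parses OK" only if the file loads cleanly
python3 -c "import yaml; yaml.safe_load(open('/opt/opsview/deploy/etc/opsview_deploy.yml'))" && echo "YAML parses OK"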
3) In the UI, on the Configuration -> Monitoring Collectors -> Clusters page, edit the cluster, unmark the DS/MQ nodes and submit the changes
4) In the UI, on the Configuration -> Monitoring Collectors -> Collectors page, delete the 3 DS/MQ nodes
5) In the UI, on the Configuration -> Hosts page, edit the 3 DS/MQ nodes and remove the following host templates:
- Opsview - Component - Cache Manager
- Opsview - Component - Executor
- Opsview - Component - Results Forwarder
- Opsview - Component - Results Sender
- Opsview - Component - SNMP Traps
- Opsview - Component - SNMP Traps Collector
- Opsview - Component - Scheduler
6) In the UI, on the Configuration -> Hosts page, edit the other nodes in the cluster and remove the following host templates:
- Opsview - Component - Datastore
- Opsview - Component - Messagequeue
Also, on the Variables tab, remove all OPSVIEW_LOADBALANCER_PROXY variables
7) Within the UI run Apply Changes
8) On the Orchestrator, remove the datastore/messagequeue components from *all* collector nodes in the cluster (NOTE: the 'opsview_cluster_cluster' group name is derived from the cluster name in opsview_deploy.yml; the name is lowercased, spaces are changed to underscores and 'opsview_cluster_' is prepended, so the cluster 'Cluster' in the example above becomes 'opsview_cluster_cluster'):
cd /opt/opsview/deploy
. bin/rc.ansible
ansible -m shell -a "rpm -e opsview-datastore opsview-messagequeue" opsview_cluster_cluster
ansible -m shell -a "rm -rf /opt/opsview/datastore /opt/opsview/messagequeue" opsview_cluster_cluster
9) On the Orchestrator, remove the components that are no longer required from the cluster's datastore/messagequeue hosts (NOTE: the single quotes around the host pattern are required; the limit works by identifying all nodes within the specific cluster that have the datastore installed, so only amend the 'opsview_cluster_cluster' part to the correct cluster group name):
ansible -m shell -a "yum erase -y opsview-executor opsview-scheduler opsview-cache-manager opsview-*-scanner opsview-flow-* opsview*collector opsview-snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
ansible -m shell -a "rm -rf /opt/opsview/executor /opt/opsview/scheduler /opt/opsview/cachemanager /opt/opsview/*scanner /opt/opsview/*collector /opt/opsview/snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
10) On the Orchestrator, reinstall the required components on all cluster nodes by running the appropriate playbooks against the collector cluster group:
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/loadbalancer-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/datastore-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/messagequeue-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/setup-monitoring.yml
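Once the playbooks have completed, the component status on each node can be checked before applying changes; this sketch assumes the default watchdog location under /opt/opsview/watchdog:
ansible -m shell -a "/opt/opsview/watchdog/bin/opsview-monit summary" opsview_cluster_cluster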
11) Within the UI run Apply Changes