Overview
Some processes within a Collector cluster, namely Datastore and MessageQueue, replicate data between the nodes. In clusters of 4 or more nodes, high-latency networks can cause problems with this replication, so we recommend that only 3 nodes within a cluster run these components (note: use either 1 node or 3 nodes, otherwise the cluster could be affected by 'lack of majority' issues).
This page documents how to amend the Datastore and MessageQueue configuration within an existing cluster.
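If you want to see which nodes are currently taking part in MessageQueue replication before making any changes, the bundled rabbitmqctl can report the cluster membership. This is a minimal sketch only; the path assumes a default /opt/opsview layout and may differ on your installation:
# Run on any node in the cluster; may need to be run as the user that owns the messagequeue processes
/opt/opsview/messagequeue/sbin/rabbitmqctl cluster_status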
Process
Note: There will be a monitoring outage while this process is being followed
1) Identify which 3 nodes in the cluster will become the Datastore/MessageQueue (DS/MQ) only nodes
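If it helps with identifying the nodes, the deploy inventory group for the cluster can be listed from the Orchestrator without making any changes (the group name follows the 'opsview_cluster_' naming convention described in step 8):
cd /opt/opsview/deploy
. bin/rc.ansible
ansible --list-hosts opsview_cluster_cluster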
2) On the Orchestrator, edit /opt/opsview/deploy/etc/opsview_deploy.yml, locate the section for the cluster, remove the chosen DS/MQ nodes from the collector_hosts list and add the following lines at the end of the section (the '&' defines a YAML anchor which 'messagequeue_hosts' reuses via the '*' alias, so both components are configured on the same 3 nodes):
datastore_hosts: &repl_hosts_for_de_infra
  ov-de-infra-1: { ip: 10.12.0.31 }
  ov-de-infra-2: { ip: 10.12.0.32 }
  ov-de-infra-3: { ip: 10.12.0.33 }
messagequeue_hosts: *repl_hosts_for_de_infra
For example, if the configuration originally looked like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
should be amended to look like:
collector_clusters:
  Cluster:
    collector_hosts:
      infra-col-4.openstack.opsview.local:
        ip: 10.140.2.81
        ssh_user: opsviewdeploy
      infra-col-5.openstack.opsview.local:
        ip: 10.140.4.205
        ssh_user: opsviewdeploy
      infra-col-6.openstack.opsview.local:
        ip: 10.140.2.175
        ssh_user: opsviewdeploy
    datastore_hosts: &repl_hosts_for_cluster
      infra-col-1.openstack.opsview.local:
        ip: 10.140.3.37
        ssh_user: opsviewdeploy
      infra-col-2.openstack.opsview.local:
        ip: 10.140.2.5
        ssh_user: opsviewdeploy
      infra-col-3.openstack.opsview.local:
        ip: 10.140.4.185
        ssh_user: opsviewdeploy
    messagequeue_hosts: *repl_hosts_for_cluster
There are no configuration changes required for user_vars.yml.
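Before moving on, it can be worth checking that the edited file still parses as valid YAML. This is a generic sanity check rather than an Opsview-specific tool, and assumes python3 with PyYAML is available on the Orchestrator:
# Prints "YAML parses OK" only if the file loads cleanly
python3 -c "import yaml; yaml.safe_load(open('/opt/opsview/deploy/etc/opsview_deploy.yml'))" && echo "YAML parses OK"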
3) In the UI, on the Configuration -> Monitoring Collectors -> Clusters page, edit the cluster, unmark the DS/MQ nodes and submit the changes
4) In the UI, on the Configuration -> Monitoring Collectors -> Collectors page, delete the 3 DS/MQ nodes
5) In the UI, on the Configuration -> Hosts page, edit the 3 DS/MQ nodes and remove the following host templates:
- Opsview - Component - Cache Manager
- Opsview - Component - Executor
- Opsview - Component - Results Forwarder
- Opsview - Component - Results Sender
- Opsview - Component - SNMP Traps
- Opsview - Component - SNMP Traps Collector
- Opsview - Component - Scheduler
6) In the UI, on the Configuration -> Hosts page, edit the other nodes in the cluster and remove the following host templates:
- Opsview - Component - Datastore
- Opsview - Component - Messagequeue
Also, on the Variables tab, remove all OPSVIEW_LOADBALANCER_PROXY variables
7) Within the UI run Apply Changes
8) On the Orchestrator, remove the datastore/messagequeue components from *all* collector nodes in the cluster (NOTE: the 'opsview_cluster_cluster' group name is derived from the cluster name in opsview_deploy.yml; the name is lowercased, spaces are changed to underscores and 'opsview_cluster_' is prepended, so the cluster 'Cluster' in the example above becomes 'opsview_cluster_cluster'):
cd /opt/opsview/deploy
. bin/rc.ansible
ansible -m shell -a "rpm -e opsview-datastore opsview-messagequeue" opsview_cluster_cluster
ansible -m shell -a "rm -rf /opt/opsview/datastore /opt/opsview/messagequeue" opsview_cluster_cluster
9) On the Orchestrator, remove the components that are no longer required from the cluster's datastore/messagequeue hosts (NOTE: the single quotes around the host pattern are required; the limit works by identifying all nodes within the specific cluster that have the datastore installed, so only amend the 'opsview_cluster_cluster' part to the correct cluster group name):
ansible -m shell -a "yum erase -y opsview-executor opsview-scheduler opsview-cache-manager opsview-*-scanner opsview-flow-* opsview*collector opsview-snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
ansible -m shell -a "rm -rf /opt/opsview/executor /opt/opsview/scheduler /opt/opsview/cachemanager /opt/opsview/*scanner /opt/opsview/*collector /opt/opsview/snmp*" 'opsview_cluster_cluster:&opsview_collector_datastore'
10) On the Orchestrator, reinstall the required components on all cluster nodes by running the appropriate playbooks against the collector cluster group:
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/loadbalancer-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/datastore-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/messagequeue-install.yml
/opt/opsview/deploy/bin/opsview-deploy -l opsview_cluster_cluster /opt/opsview/deploy/lib/playbooks/setup-monitoring.yml
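Once the playbooks have completed, the component status on each node can be checked before applying changes; this sketch assumes the default watchdog location under /opt/opsview/watchdog:
ansible -m shell -a "/opt/opsview/watchdog/bin/opsview-monit summary" opsview_cluster_cluster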
11) Within the UI run Apply Changes