Geneos - How to collect Standard Operating Profile metrics from a gateway

Why we need to collect and benchmark gateway profiles

Any piece of software running on an operating system uses a certain amount of that OS's resources. As its state, configuration and load change, so does its resource footprint. In addition to the OS resources, the software may expose a number of metrics that highlight its internal state and any application-specific resources it uses and needs. For example, a JVM uses the memory and CPU of the OS, while it also exposes dozens of metrics that show the internal state of the virtual machine. Applications that run low on, or run out of, resources will ultimately fail or act unpredictably, a situation that the end user will perceive as a defect.

Using resources is normal for software: it should have access to what it needs, and we should be able to understand what normal usage looks like. That is the purpose of the operational profile system detailed on this page.

A step in the right direction is to have better tools and support for understanding what the normal profile of any given (specific instance of an) ITRS component looks like. Then:

We can start collecting data for specific software components, in given configurations, in given environments.
We can identify the effect on the software of any given change to the software binaries, configuration, load and so on.
We can identify whether the environment the software is running in is fit for purpose (i.e. does it have enough resources, or is it resource starved?).

The solution provided here is a template for building the Standard Operating Profile. Guidelines exist for understanding the content, along with recommendations for specific conditions (including alerts when known issues are likely to occur, or have occurred). While this may not solve the recommendation challenges, it should help to identify the health of a component, to start building a real-world profile of our software components' resource usage under specific conditions, and to show anyone using the template the effect of any given change to the software (whether from internal or external factors).

Getting the gateway operational Profile

The following allows you to measure the operating profile of a gateway

Prerequisites

Gateway and Netprobe versions must be GA3.6.0 or later.
- Download the latest Gateway and Netprobe binaries here
- Netprobe installation steps
- Gateway installation steps
The ability to add samplers to at least one physical and one virtual Netprobe on the component host (for example, the server that the gateway is running on).
Working database logging for the target gateway.
The ability to add an include file and modify the database logging on the target gateway.

Setup files

You will need the following file:

operating_profile.xml - The include file that contains the Operational Profile configuration
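
Before loading the include, it can help to confirm that the file is well-formed XML and to locate the ChangeMe probe placeholder that the installation steps on this page ask you to replace. The following is a minimal Python sketch under stated assumptions: the function name `find_placeholders` is hypothetical, and the sample XML is invented for illustration and is not the real content or schema of operating_profile.xml.

```python
import xml.etree.ElementTree as ET


def find_placeholders(root, placeholders=("ChangeMe",)):
    """Return (tag, location, value) for each attribute or text node
    under `root` that still contains one of the placeholder strings."""
    hits = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        for placeholder in placeholders:
            if placeholder in text:
                hits.append((elem.tag, "text", text))
            for name, value in elem.attrib.items():
                if placeholder in value:
                    hits.append((elem.tag, name, value))
    return hits


# Hypothetical sample, NOT the real Geneos include file layout.
SAMPLE = """<gateway>
  <managedEntity name="Host Hardware Profile">
    <probe ref="ChangeMe"/>
  </managedEntity>
</gateway>"""

sample_root = ET.fromstring(SAMPLE)  # raises ParseError if malformed
for tag, location, value in find_placeholders(sample_root):
    print(f"<{tag}> {location}: {value}")
```

For the real file, `ET.parse("operating_profile.xml").getroot()` would supply the root element. A malformed file raises `xml.etree.ElementTree.ParseError`, which is a useful early warning before the include is pasted into the GSE.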

Step-by-step installation instructions

1. Open the target gateway in the GSE.
2. Install the Standard Operating Profile include file.
   To create an include file (does not require host access):
   1. In the Includes section of the gateway settings, right-click and select "New Include".
   2. Name it "operating_profile.xml" and give it a priority number greater than 1 to avoid conflict with the main file (e.g. 999).
   3. Click Load include file. If prompted to create a new one, select Yes.
   4. In XML mode, copy and paste the contents of "operating_profile.xml" (see link above) into the new include.
   If you have access to the host command line:
   1. Copy the include XML file to the host, in your desired path.
   2. In the gateway settings, enter that file name and path in a new include file.
3. Edit the Database Logging connection details.
   1. Go to the Database Logging section and tick the "Enabled" checkbox to enable Database Logging.
   2. Fill in the appropriate database connection details.
4. Important: Replace the probe (ChangeMe) used by the "Host Hardware Profile" entity with a physical probe running on the gateway's host. You only need to do this once per host; see step 6 for multiple-gateway setups.
5. Edit the Gateway Process details (optional; only needed if the gateway process search returns a count other than 1). The gateway name is included in the gateway process search string.
   1. Go to the Standard Operating Environment (in the operating_profile.xml include file, "Environments" section).
   2. Replace the value of the "gateway_process" variable with the target gateway process string.
6. For multiple gateways running on one host, enable Gateway Sharing in the Imported data section by editing the hostname and port of the target gateway host where a physical probe is attached.
7. Save your changes.
8. Verify that the Standard Operating Profile entities appear with all the included data views (see below).
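
The optional Gateway Process step above applies when the process search string matches more or fewer than one process. The following is a minimal Python sketch of that check, assuming plain substring matching against process command lines. The real matching rules of the process sampler are not described in this article, and the function name `count_matching_processes` and the command lines below are hypothetical.

```python
def count_matching_processes(command_lines, search_string):
    """Count the command lines that contain the search string.
    Assumption: simple substring matching, for illustration only."""
    return sum(1 for line in command_lines if search_string in line)


# Hypothetical command lines for a host running two gateways and a probe.
command_lines = [
    "/opt/geneos/gateway/gateway2.linux_64 MyGateway -port 7039",
    "/opt/geneos/gateway/gateway2.linux_64 OtherGateway -port 7040",
    "/opt/geneos/netprobe/netprobe.linux_64 -port 7036",
]

# A search string without the gateway name matches both gateways...
print(count_matching_processes(command_lines, "gateway2.linux_64"))
# ...while including the gateway name narrows it to the target gateway.
print(count_matching_processes(command_lines, "gateway2.linux_64 MyGateway"))
```

On a Linux host the command lines could be gathered with `ps -eo args`. If the count for your chosen string is not 1, that is the case in which the "gateway_process" variable needs editing.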

How the content will appear in the gateway

Expected Data views

Two Managed Entities should appear, "Standard Operating Profile" and "Host Hardware Profile", attached to the virtual and physical probes respectively. See below for the expected dataviews.

Figure 1.1 Expected View for Standard Operating Profile Managed Entity with Virtual Probe

Figure 1.2 Expected View for Host Hardware Profile Managed Entity with Physical Probe

Definition of Terms

Spec	Unit Value	Originating Sampler	Description
conflationTime	conflation time / total (processing) time	orbStats	A measure of the time spent waiting for conflation to complete. Conflation means that the gateway deals with a backlog in its data queues by discarding out-of-date cell updates and processing and publishing only the latest cell values. Conflation works best when it prevents stale data from building up rather than clearing large backlogs (not only are there fewer backlogged messages to process, but fewer updates are conflated away).
cpuUtilisation	percent Utilisation of the Host	hardwareProfile	A measure of the total CPU utilisation of the Host
dbLogging	dbLogging time / total (processing) time	gatewayComponents	Ratio of CPU time spent on database logging to the total CPU processing time; time units vary per platform.
directory	directory time / total (processing) time	gatewayComponents	Ratio of CPU time spent on directory-related operations to the total CPU processing time; time units vary per platform. Includes constructing and modifying the state tree, among other tasks.
freeSpacePct	sum of free disk space / total disk space	diskProfile	Percentage of total free disk space on all mounted partitions on the host. Some partitions are excluded (see the "diskProfile" sampler in operating_profile.xml for the list of excluded partitions).
maxDataAge	headline max data age in milliseconds	probeData	The maximum age of backlogged updates (as displayed by the probeData plugin), normalised to milliseconds.
memoryAvailablePct	memoryAvailable / totalPhysicalMemory	hardwareProfile	Memory on the host that is available to all applications, relative to total physical memory.
messagesQueued	sum of messages queued in the gateway	connectionStats	Total count of messages queued across all connections to the gateway, where each connection is a unique IP address and port.
percentCPU	gateway CPU usage percentage of total	gatewayProcess	Percentage of total CPU used up by the gateway as reported by the process sampler.
percentMemory	gateway Memory usage percentage of total	gatewayProcess	Percentage of total Memory used up by the gateway as reported by the process sampler.
probeManagement	probeManagement time / total (processing) time	gatewayComponents	Ratio of CPU time spent on probe management to the total CPU processing time; time units vary per platform. Includes establishing and maintaining communication with Netprobes.
queueMem	sum of Mem (in KB)	connectionStats	Total memory used by the messages in the queue, summed across the unique connections of the gateway.
roles	roles time / total (processing) time	gatewayComponents	Ratio of CPU time spent on roles to the total CPU processing time; time units vary per platform. Includes the time spent on Hot-Standby functionality.
rules	rules time / total (processing) time	gatewayComponents	Ratio of CPU time spent on rules to the total CPU processing time; time units vary per platform.
schema	schema time / total (processing) time	gatewayComponents	Ratio of CPU time spent on schema validation and changes to the total CPU processing time; time units vary per platform.
swapUsed	swapUsed percentage of the Host	hardwareProfile	Percentage of total swap Memory used by the OS.
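
To make three of the definitions above concrete, the following is a minimal Python sketch, not the gateway's implementation. It shows conflation as keeping only the newest queued value per cell, a component ratio as component time divided by total processing time, and freeSpacePct as summed free space over summed total space with some partitions excluded. The function names, cell names, mount points and figures are all hypothetical, and scaling freeSpacePct by 100 is an assumption based on the word "Percentage" in the table.

```python
def conflate(backlog):
    """Collapse a backlog of (cell, value) updates, oldest first,
    to the latest value per cell (illustrative only)."""
    latest = {}
    for cell, value in backlog:
        latest[cell] = value  # a newer update overwrites the stale one
    return latest


def time_ratio(component_time, total_time):
    """Component time / total processing time, e.g. for dbLogging or rules."""
    if total_time == 0:
        return 0.0  # assumption: report 0 when nothing has been processed
    return component_time / total_time


def free_space_pct(partitions, excluded=()):
    """`partitions` maps mount point -> (free, total) in the same unit.
    Returns 100 * sum(free) / sum(total) over non-excluded partitions."""
    kept = [ft for mount, ft in partitions.items() if mount not in excluded]
    total = sum(t for _, t in kept)
    if total == 0:
        return 0.0
    return 100.0 * sum(f for f, _ in kept) / total


# Hypothetical sample data.
backlog = [("cpu.row1", 10), ("mem.row1", 55), ("cpu.row1", 12), ("cpu.row1", 17)]
partitions = {"/": (20, 100), "/data": (180, 300), "/boot": (1, 1)}

print(conflate(backlog))
print(time_ratio(250, 1000))
print(free_space_pct(partitions, excluded={"/boot"}))
```

In this sample, four queued updates conflate to two published values, so two updates are conflated away. The smaller the backlog when conflation runs, the fewer updates are lost this way.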

Viewing the data via a dashboard

When the operational profile data is displayed in the data views, it shows only the values at that point in time. The primary motivation and benefit, however, is to see the profile of the gateway over time, from busy trading days to calmer weekends. Assuming you have not changed the names of the managed entities, samplers and data views, you can use the following steps to get that historical data quickly and easily.

1. Download the dashboard ADB file and, in the Active Console, use the File -> Import function (selecting either the default or the target dashboard dockable of your choice).
2. ENSURE YOU ARE CONNECTED ONLY to the gateway you want the stats for. You may need to disconnect your other gateways while performing step 3, though you can reconnect once you are looking at the data.
3. On the imported dashboard, right-click and select 'Repopulate all charts with historic data --> Week' (or Day etc., based on your requirement). This populates all the charts with data for a common time period. An example is shown below.

The scrollbars at the top of the charts can be manipulated to adjust the time period that you can see the data for (so you can zoom in on specific events).
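
As a complement to the dashboard, the contrast between trading days and weekends described above can also be summarised numerically from logged samples. The following is a minimal Python sketch under stated assumptions: the samples are supplied as in-memory (timestamp, value) pairs, the function name `summarise_profile` and the sample values are hypothetical, and no particular database logging schema is assumed.

```python
from datetime import datetime
from statistics import mean


def summarise_profile(samples):
    """Group (datetime, value) samples into weekday and weekend buckets
    and return count, min, mean and max for each non-empty bucket."""
    buckets = {"weekday": [], "weekend": []}
    for timestamp, value in samples:
        # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6
        key = "weekend" if timestamp.weekday() >= 5 else "weekday"
        buckets[key].append(value)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
        }
        for key, values in buckets.items()
        if values
    }


# Hypothetical percentCPU samples: Friday, Saturday, Sunday, Monday.
samples = [
    (datetime(2024, 1, 5, 10, 0), 40.0),
    (datetime(2024, 1, 6, 10, 0), 5.0),
    (datetime(2024, 1, 7, 10, 0), 15.0),
    (datetime(2024, 1, 8, 10, 0), 60.0),
]

for period, stats in summarise_profile(samples).items():
    print(period, stats)
```

Comparing such summaries before and after a change to binaries, configuration or load gives a simple numeric view of the effect of that change on the profile.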
