Geneos Gateway load monitoring statistics can be complicated to interpret, and it can be hard to choose what further action to take.
- Having enabled Gateway load statistics, you now need to interpret them so that you can take action to optimise your Geneos monitoring performance.
- The data returned by the Gateway-Load plugin can be hard to understand
- First, make sure that you are collecting performance data with the Gateway-Load plugin for a known period of time.
- If load stats collection is enabled at gateway start-up with the -stats command line option and your gateway has been running for a long time, any current, real-time load data may be lost in the aggregate values averaged over the life of the current gateway process. So, assuming that the load on the gateway is not so high that you cannot issue commands, right-click on the gateway icon, open the Load Monitoring sub-menu and select Reset Stats, so that the values you then see cover a known period.
- If load stats data collection is not enabled at gateway start-up then you have to enable it manually. Right-click on the gateway and, in the Load Monitoring sub-menu, select the Start Stats Collection menu. Then choose either Now for ongoing collection, or For Time Period if you want to collect data for a fixed period, such as the next hour or day.
- The Gateway-Load plugin has a number of options. If you want to collect data for the different subsystems, you have to instantiate multiple samplers, each with a different configuration for its area of interest.
- Looking at some common Categories:
- Components: If you sort by the time column, largest values first, you will typically see Rules and perhaps SetupManagement as the top consumers. In a typical environment this is to be expected: Rules run constantly as monitoring data is processed, and the SetupManagement component can take up significant system resources each time a Geneos administrator saves a configuration and the gateway has to rebuild its internal representation of the monitored estate.
- DirectoryStats - Rules: Assuming you have a Gateway-Load plugin configured to collect Rule stats (Category DirectoryStats, Grouping -> Rules), you will be presented with a number of columns to sort the list by. What you are looking for, in general, are large values that jump out at you. Sort by each of the numeric columns in turn, review the largest values and judge whether any are unexpectedly high or very much out of proportion to the others. You may find that one or two Rules are being executed too often or are taking up too much processing time. This could be for a variety of reasons, usually related to "dependent" data items, such as Path Aliases being too general or updating too often.
- XPathStats: Closely related to the Rule stats above, the XPathStats category presents a couple of numeric columns, invocations and time. These can again be used to look for outliers and to check whether specific forms of XPath in your configuration are being used more often than you expect.
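The "sort each numeric column and look for values out of proportion to the others" check described for the Rules and XPathStats categories can also be done offline. The following is a minimal sketch only, assuming you have copied the dataview rows into CSV text yourself. The `name` column, the sample rows and the `factor` threshold are all hypothetical and chosen for illustration; they are not real gateway output or anything the plugin defines.

```python
import csv
import io
from statistics import median


def find_outliers(rows, column, factor=5.0):
    """Return (name, value) pairs whose value exceeds factor * median
    of the given column, largest first. Rows with a missing or
    non-numeric value in that column are skipped."""
    values = []
    for row in rows:
        try:
            values.append((row["name"], float(row[column])))
        except (KeyError, ValueError):
            continue
    if not values:
        return []
    threshold = factor * median(v for _, v in values)
    flagged = [(n, v) for n, v in values if v > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up sample rows; the column names are assumptions, not plugin output.
SAMPLE_CSV = """name,invocations,time
ruleA,120,0.8
ruleB,95,0.6
ruleC,110,0.7
ruleD,9800,42.5
ruleE,130,0.9
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for column in ("invocations", "time"):
    print(column, find_outliers(rows, column))
```

A median-based threshold is used here because a single very large value would drag an average upwards and hide itself. Treat anything the sketch flags as a prompt to look at that Rule or XPath more closely, not as a verdict.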
- Gateway Load Plugin Reference
- Gateway Performance Tuning
- How to collect Standard Operating Profile data collection metrics from a gateway