This page describes the AC2 and GSE diagnostics file feature: when it is useful, how to create one, what it contains and how to interpret the contents. It also summarises the times a diagnostics file has been requested and analyses how useful it proved to be.
Why did we introduce diagnostics files?
In the absence of a diagnostics function an issue could still be diagnosed, but there would have to be a series of questions, answers and files bounced backwards and forwards between the client and ITRS. A certain amount of guesswork would also have to be employed about where to start the search for a problem. The lead time to finding a solution or raising a change request would likely be considerably longer, and the client would have to be more involved in the process.
When is a diagnostic file useful?
A diagnostic file is useful under the following conditions:
- A client reports a perceived defect with the AC or GSE
- We want to review a user's AC configuration, generally to suggest improvements and changes
Note: A diagnostics file is useful even after a defect has been witnessed and can no longer be observed (as in the case of a crash), because the logs, exceptions and core file data remain on the hard disk of the affected PC. This means that restarting the AC and taking the diagnostics is still valuable even though a new session has been started.
How to generate a diagnostics file?
See here for how to create a diagnostics file
What is in a diagnostics file?
An AC2 diagnostics file contains the following resources:
- Diagnosticsworkspace.awx: a copy of the workspace that the user is currently running with.
- *.gci: the Active Console, GSE and Express Reports .gci files, which determine the start-up configuration of these applications.
- about.txt: a summary of the run-time environment. It includes the version of the console; the versions of the libraries that the application needs, including the JVM; a report on the Java and system memory states at the time the diagnostics was created; the Java and system paths; and the state of the JVM.
- activeconsole<date>.log: all the available logs that the application has written to the working directory. These are not cleaned up by default, so they may go back many days or weeks. A new log is started each day, assuming the application is started at least once.
- *.txt: all text files from the install and working directories. Users tend to create .txt files for various reasons, such as listing gateways. These files are not always immediately relevant but are included for completeness.
- config.txt: an old file required by the AC1 GSE. It contains no configuration information useful to the AC2, but changing its contents can result in errors, so from time to time we need to review it.
- exception<date>.log: all the uncaught exceptions thrown by the application. One of these is created for each day on which at least one exception is thrown.
- hs_err_pid<number>.log: generated when the Activeconsole.dll fails (crashes). When this occurs the whole AC crashes, so these files are an important part of the debug process.
- summary.txt: how many gateways the console was connected to when the diagnostics was taken, and how many and what type of data items those gateways contained. It is a useful insight into the size of the system the user is connecting to.
- version.txt: details of the gateway and netprobe versions that the console is connected to.
- gatewaylist.txt: if the user is using gateway list files, copies are made and included in the diagnostics.
- apmsummary.txt and apmdetail.txt: a summary of the Active Path model, which handles all updates from the connected gateways. This is a core component of the AC, so its configuration (which is defined by the workspace) is a key consideration.
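When a diagnostics file arrives, a quick inventory of what it actually contains can help decide where to start. The following is a minimal Python sketch, not an ITRS tool. It assumes the diagnostics has already been extracted to a local directory, and the category labels and the function names `classify` and `inventory` are hypothetical. The file name patterns are taken from the list above.

```python
import fnmatch
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# File name patterns from the list above. The labels are illustrative only.
# Order matters: the specific patterns must come before the generic "*.txt".
PATTERNS = [
    ("workspace", "diagnosticsworkspace.awx"),
    ("startup config", "*.gci"),
    ("application log", "activeconsole*.log"),
    ("exception log", "exception*.log"),
    ("JVM crash log", "hs_err_pid*.log"),
    ("text file", "*.txt"),
]


def classify(name: str) -> str:
    """Return the first matching category for a file name, else 'other'."""
    lowered = name.lower()  # compare case-insensitively on every platform
    for label, pattern in PATTERNS:
        if fnmatch.fnmatchcase(lowered, pattern):
            return label
    return "other"


def inventory(diag_dir: str) -> Dict[str, List[str]]:
    """Group every file found under diag_dir by category."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(diag_dir).rglob("*")):
        if path.is_file():
            groups[classify(path.name)].append(path.name)
    return dict(groups)
```

Calling `inventory("extracted_diagnostics")` (a hypothetical path) returns a dictionary keyed by category, so the presence of any "JVM crash log" or "exception log" entries is visible at a glance.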
Running through the basic sanity checks
Independently of the specific issue that prompted the diagnostics file, there are a few basic checks that should almost always be performed to confirm the health of the AC installation. They are covered here:
Check | Description | Likely observables | Action |
---|---|---|---|
Ensure that all files named FILEVERSION GAXXX.txt have the same value for XXX | The installation of the console is a simple extract of a zip file, but users have been known to extract a copy over the top of an existing version. This is easily spotted because there are multiple copies of the FILEVERSION GAXXX.txt file. | This is something of an all-bets-are-off situation. The AC ships with a JVM as well as the application, so all kinds of unexplained behaviour could be caused by this. | Install a fresh version of the console in its own directory. The version file also tells you which version of the console the user is running; if it is more than 2 major versions back, an upgrade will be required. |
Presence of hs_err_pidXXX_XXX.log files | These are log files produced when Java has crashed, taking the AC process with it. For the most part someone with access to the code base will have to review these logs, so sending them on to dev is best. If the user has reported a crash, make sure that the date of the file corresponds with the time of the crash. | The Active Console crashes, meaning the process is no longer running. | Attempt to pin down the steps which lead to the crash. |
Review the ActiveConsole.gci file for a modified memory setting | It is commonplace for users to add user-specific flags to the ActiveConsole.gci file, but there are also lines that we strongly recommend are not changed. This is the standard file for all 3.X and later consoles. The lines marked blue are sometimes changed for specific problems, but as a rule we would strongly advise against this. ########################### ########################### #### The JVM to use and arguments ########################### | Some clients increase the memory allocated to Java, but in reality this just covers up a problem. In some cases increasing memory (notably to 756 MB) has been proven to cause significant instability. | A better solution is to work out why the memory is high. One option is to start a new workspace, confirm that memory is low, then add the dockables one by one from the workspace that exhibited high memory. Tools such as VisualVM (bundled with JDK 6 to 8 and available as a separate download for later versions) can give detailed insight into where the memory is being used. |
Review the ActiveConsole.gci file for the use of the -wsp flag | Check whether the user has added the -wsp flag to the ActiveConsole.gci file. This flag changes the working directory of the console, and can be desirable when the user is having issues with read/write access, problems with workspaces, or slow load and write times. | Assuming the specified directory is local, this flag should make the console run faster and workspace management more reliable. However, you lose the benefit of all your configuration being available during an upgrade. | Use this flag where there are problems with workspaces or with load and save times, and also where the diagnostics cannot be created easily or its creation fails. |
Load up the Diagnosticsworkspace.awx | When the diagnostics is created the console performs a 'Save copy as' and creates the Diagnosticsworkspace.awx. This means it is a save of the console workspace at the moment the button is pressed, not the workspace that was loaded when the console was opened. | For errors that occur on load, it may also be worth asking the client for the source workspace. | |
Check the contents of the workspace | Check the 'View' menu and make sure that there are no rogue Metric Overviews. Users sometimes create these views using the 'Metric overview wizard' in the toolbar, which can leave plenty of views using resources in the background. | Large numbers of Metric Overviews, or very general ones (all CPU samplers, for example), can cause high CPU and high memory use. Both of these situations may also lead to exceptions being thrown. | More often than not the user did not need to create these views as permanent entries in the workspace, and may not even be aware they are present, so removing them is the best option. |
Review the about.txt | There is plenty of information in here, but it is probably worth checking the location of the working directories. If they are located on mapped network drives, this can slow the console down, notably where the auto save period is > 0. | Slow down and lock ups | Consider using the -wsp flag to use a local drive, and store the workspaces locally. |
Review the summary.txt | This gives you some idea of the number of data items the client is requesting to view. Once you start moving into the millions of cells the load may become prohibitive, depending on the client's PC. | Slow down and lock ups | Review whether there is a need to stay connected to all gateways all the time. |
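Some of the file-based checks in the table above can be scripted as a first pass before a manual review. The following Python sketch is illustrative only and rests on several assumptions that the source does not confirm: the diagnostics has been extracted to a local directory, memory is changed in the .gci file with the standard JVM `-Xmx` option, and lines beginning with `#` in a .gci file are comments. All function names are hypothetical.

```python
import re
from pathlib import Path


def fileversion_values(root):
    """Collect the XXX part of every 'FILEVERSION GAXXX.txt' file name."""
    pattern = re.compile(r"^FILEVERSION\s*GA(.+)\.txt$", re.IGNORECASE)
    found = set()
    for path in Path(root).rglob("*.txt"):
        match = pattern.match(path.name)
        if match:
            found.add(match.group(1))
    return sorted(found)


def crash_logs(root):
    """List the JVM crash logs (hs_err_pid*.log) found under root."""
    return sorted(p.name for p in Path(root).rglob("hs_err_pid*.log"))


def gci_flags(gci_text):
    """Report whether -wsp is present and the value of any -Xmx setting.

    Assumption: '#' starts a comment line and -Xmx is the flag used to
    change the Java heap size in the .gci file.
    """
    active = [line for line in gci_text.splitlines()
              if not line.lstrip().startswith("#")]
    body = "\n".join(active)
    heap = re.search(r"-Xmx(\d+[kKmMgG]?)", body)
    return {
        "wsp": re.search(r"(?<!\S)-wsp\b", body) is not None,
        "xmx": heap.group(1) if heap else None,
    }


def sanity_report(root):
    """Run the file-based checks and return the findings as a dictionary."""
    versions = fileversion_values(root)
    report = {
        "versions": versions,
        # More than one version suggests one install extracted over another.
        "mixed_install": len(versions) > 1,
        "crash_logs": crash_logs(root),
        "gci": {},
    }
    for gci in sorted(Path(root).rglob("*.gci")):
        report["gci"][gci.name] = gci_flags(gci.read_text(errors="replace"))
    return report
```

A result with `mixed_install` set to true, a non-empty `crash_logs` list, or an `xmx` value in the ActiveConsole.gci entry points at the corresponding row of the table. The workspace, about.txt and summary.txt checks still need a manual review.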