JazzMon 1.4.0 Monitor Stops Collecting CounterContextServer.html Files for a Repository
We recently switched our JazzMon tasks to version 1.4.0. We are noticing that the monitor process for this version collects maybe one or two days of CounterContextServer.html files and then stops, but only for one of the 5 RTC servers requested in the jm.properties file. The context name for this particular RTC application is unusual for this repository: it is "ccm04" instead of "ccm" or "jts". I do list this context name in the properties file, e.g.:
# Comma-separated list of applications to aggregate for cluster based monitoring
ANALYSIS_AGGREGATE_LIST=jts,ccm,ccm04
Does this look correct?
One answer
The ANALYSIS_AGGREGATE_LIST is only used in the post-collection analysis phase, when it is used to aggregate or combine data from applications with the mentioned suffixes in common. It shouldn't have anything to do with stopping data collection during the monitoring phase.
The monitoring process creates a separate thread for each server-application being monitored, and each thread is given the same amount of time or number of iterations to run, so it shouldn't be treating different targets differently. But if a given thread hits too many exceptions or problems while running, it may shut down before the others.
I would start to investigate what's happening by looking in the JazzMon Runtime.log file where it records a few lines every time it takes a snapshot on each thread. This file is found in c:\temp\JazzMonRuntime or /var/tmp/JazzMonRuntime or wherever you've set PATH_OUTPUT_DIR to. You will find entries like this one:
2012.12.06,12:02:03,499 [CounterServiceSequence_1:ServerMonitorSequence_3] Writing counters to c:\temp\JazzMonRuntime.140.2\jazzdev.torolab.ibm.com.qm\CounterContentServer2.html
- "jazzdev.torolab.ibm.com.qm": find a line that mentions the server-application you want
- "ServerMonitorSequence_3" is the unique thread-id monitoring that server-application
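If the log is large, a small script can do this lookup for you. The following is only a rough sketch, not part of JazzMon: the line layout is inferred from the single sample entry above, and the function names (`last_write_per_thread`, `entries_for_thread`) are made up for illustration. It finds the thread-id that writes counters for a given server-application, reports the last time it did so, and lets you pull every line logged by that thread so you can see what happened just before it stopped.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the one sample entry in this thread:
#   YYYY.MM.DD,HH:MM:SS,mmm [sequence-id:thread-id] message
LINE_RE = re.compile(
    r"^(\d{4}\.\d{2}\.\d{2},\d{2}:\d{2}:\d{2}),\d+ "
    r"\[([^\]:]+):([^\]]+)\] (.*)$"
)


def last_write_per_thread(lines, target):
    """Map thread-id -> (timestamp, message) of the most recent
    'Writing counters' entry whose message mentions `target`."""
    latest = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        stamp, _sequence_id, thread_id, message = match.groups()
        if "Writing counters" not in message or target not in message:
            continue
        when = datetime.strptime(stamp, "%Y.%m.%d,%H:%M:%S")
        if thread_id not in latest or when >= latest[thread_id][0]:
            latest[thread_id] = (when, message)
    return latest


def entries_for_thread(lines, thread_id):
    """Return every line logged by `thread_id`, in file order."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == thread_id:
            found.append(line.rstrip("\n"))
    return found
```

To use it, read the log from your PATH_OUTPUT_DIR into a list of lines, call `last_write_per_thread(lines, target)` with the directory name of the server-application that stopped (for the sample entry that would be "jazzdev.torolab.ibm.com.qm"), then pass the returned thread-id to `entries_for_thread` and look at the last few lines for exceptions or shutdown messages.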