JazzMon usage in a production environment
Hi,
Some questions on JazzMon usage in a production environment:
1. What's the recommended or best schedule for monitoring? A 60 minute gap for a total of 8 hours?
2. Are there any negatives seen with the usage of JazzMon in a live production environment? Like any performance related issues when monitoring is ON?
3. How is it generally used in a production env? Do people use it all the time, or switch it ON for a week, gather the data, analyze and switch it OFF?
Accepted answer
Hi Srikanth,
The best place to start for your questions would be:
https://jazz.net/wiki/bin/view/Deployment/JazzMonFAQ
To answer your questions specifically:
1) The timing interval very much depends on why you're running JazzMon in the first place. If you are creating a baseline of overall load for future comparison, then 60 minutes may well be sufficient.
If you need the snapshots to record peak times with more granularity then you could reduce this to 15 minutes. It is important not to collect too frequently unless specifically requested to do so as part of an investigation, say with Rational Client Support.
The duration also can be longer than 8 hours and typically the longer you leave it the more representative the data will be. Typically 3 days would be recommended.
2) The overhead of JazzMon is light and certainly below the 3-5% that monitoring tools in general aim for. Of course, if you're at the limit of your resources to start with, 3-5% suddenly becomes important, so you may wish to monitor the JVM and system before you start JazzMon. If you have no other monitoring in place already, you could consider another low-cost (although not historical) monitoring option: https://jazz.net/wiki/bin/view/Deployment/JazzMonitoringThroughJMX
3) Jazzmon is a tool to help you see how your system is running in general and over time. Therefore it is envisaged that you'd initially create a baseline for future comparisons (over say a 3 day period) and then periodically re-run the tool and analyse against the original baseline.
We certainly wouldn't expect or think you would require constant monitoring.
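To put the interval and duration from point 1 into numbers, here is a small Python sketch. It assumes one snapshot at the end of each interval (the exact count JazzMon produces may differ by one if it also takes an initial snapshot), and the function name is just for illustration.

```python
def snapshot_count(interval_minutes: int, duration_hours: float) -> int:
    """Snapshots taken over a run, assuming one per completed interval."""
    return int(duration_hours * 60 // interval_minutes)


# The schedules mentioned in this thread.
schedules = [
    ("60 min interval over 8 hours", 60, 8),
    ("60 min interval over 3 days", 60, 72),
    ("15 min interval over 3 days", 15, 72),
]

for label, interval, hours in schedules:
    print(f"{label}: {snapshot_count(interval, hours)} snapshots")
```

Going from 60 to 15 minutes quadruples the amount of data to analyse for the same duration, which is one reason not to collect more often than you need to.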
I hope that this information helps. If not then please let us know what other thoughts and concerns you have.
3 other answers
the counters that JazzMon uses are always updated, so the actual JazzMon overhead is tiny: it just captures the counts on some cycle and resets the counters.
I built some extensive Excel macros that could help us quickly view the massive data and build trend reports.
we were able to use this data to identify specific areas of the deployment to watch: Source code retrieve operations, RSS/Event feed activity,
here are a couple of those reports, screen shots.
the table is daily, by repository (ccm server):
col A and E: from the service_etCnt.csv file, sorted from most to least active (by the average over the sample period)
col B: rank in the top 10 of what was most consuming on the server, from the service_etTot.csv file
col C: average elapsed time vs the IBM baseline file (8 hour), from the service_etAvg.csv file
col D: relative percentage compared to the IBM baseline; green is better, red is worse
col E and F: counts; col G is the counts for one 15 minute sample period
col A has specific entries highlighted to indicate Source Code related operations.
this was produced daily by server
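For anyone who wants to build a similar daily table without Excel, here is a rough Python sketch of the same idea. It is not the macro described above: the data is made up and held in plain dicts rather than read from the JazzMon csv files (whose exact layout I am not assuming here), only scm.common.iVersionedContent.Get is a real operation name from this thread, and "relative percentage" is assumed to mean average elapsed time divided by the baseline average.

```python
# Hypothetical values for one day on one server, keyed by operation name.
counts = {"scm.common.iVersionedContent.Get": 9000, "example.feed.Get": 4000, "example.build.Poll": 500}
total_ms = {"scm.common.iVersionedContent.Get": 270000, "example.feed.Get": 200000, "example.build.Poll": 300000}
avg_ms = {"scm.common.iVersionedContent.Get": 30.0, "example.feed.Get": 50.0, "example.build.Poll": 600.0}
baseline_avg_ms = {"scm.common.iVersionedContent.Get": 60.0, "example.feed.Get": 40.0}


def build_report(counts, total_ms, avg_ms, baseline_avg_ms, top_n=10):
    """Rows sorted from most to least active, with a rank by total elapsed
    time and a percentage relative to the baseline average."""
    # Rank 1 is the operation that consumed the most total elapsed time.
    by_total = sorted(total_ms, key=total_ms.get, reverse=True)
    rank = {op: i + 1 for i, op in enumerate(by_total[:top_n])}

    rows = []
    for op in sorted(counts, key=counts.get, reverse=True):
        base = baseline_avg_ms.get(op)
        # Below 100 means faster than the baseline ("green"), above means slower ("red").
        pct = round(100.0 * avg_ms[op] / base, 1) if base else None
        rows.append({
            "operation": op,
            "rank_by_total": rank.get(op),
            "avg_ms": avg_ms[op],
            "pct_of_baseline": pct,
            "count": counts[op],
        })
    return rows


for row in build_report(counts, total_ms, avg_ms, baseline_avg_ms):
    print(row)
```

Operations that have no entry in the baseline get no percentage rather than a guessed one.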
then to make more sense of the daily, I produced a consolidated set of graphs from the daily data that displayed one item each.
here is 9 months of source code file retrieve counts. (scm.common.iVersionedContent.Get, our most active operation) across all the servers.
Red is the overall trend line.
if you look carefully, you can see when users started doing these operations on a particular server.. (as we add capacity of more ccm servers)
the green line comes off the base axis about midway thru the chart. The blue line much later.
the top purple line is the sum of all the servers for this operation.
the peaks and valleys are roughly weekly cycles, as you might expect, where the workforce has saturday and sunday off
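The sum-of-all-servers line and the trend line in a graph like this can be reproduced with a few lines of Python. This is only a sketch with made-up daily counts; the server names are hypothetical, and the trend is a plain least-squares straight line, which may not be the same fit that Excel was configured to draw.

```python
def linear_trend(values):
    """Least-squares straight line through (0, v0), (1, v1), ...
    Returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples for a trend line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical daily counts for one operation; a server added later
# simply reports 0 until users start working on it.
daily_counts = {
    "ccm1": [10, 12, 14, 16],
    "ccm2": [0, 0, 2, 4],
}

# Sum across servers, day by day (the "top line" of the chart).
total_per_day = [sum(day) for day in zip(*daily_counts.values())]

slope, intercept = linear_trend(total_per_day)
print("total per day:", total_per_day)
print("trend: slope per day =", slope, "intercept =", intercept)
```

With real data covering months, the weekly peaks and valleys average out and the slope shows whether overall usage of the operation is growing.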
using the daily reports I could build graphs that helped identify specific behaviors that impacted our users.
(we had a UI response time performance problem, which was caused by an accidental concurrence of multiple software builds).
the jazzmon team has since provided some additional excel macros that can provide some of this data in graphical form.
[1] In case the performance issues are erratic, you might just want to run the monitoring for a longer duration [a week or so]. The default interval is sufficient. When you are monitoring some event that causes bottlenecks once every couple of hours, just choosing a sample duration of 1d would suffice.
[2] JazzMon only collects snapshot data and makes it available for statistical study. If you go to the jts/admin internal Counters, you will see this data anyway, so the JazzMon collection is not going to put extra stress on the server.
[3] I usually recommend using JazzMon to figure out which components/services are being used the most when a huge number of users are on the server [usually during a performance investigation]. So the best time to schedule the monitoring is when you expect the issue to occur.
But then if you find your server performing badly on Mondays after a weekend, it's worth keeping tabs on what has been going on over the weekend via JazzMon, rather than spending the weekend monitoring the server manually ;)
Comments
Thanks KK. That answers most of it.
One more question.
Is there a predefined Baseline for X number of users to compare my results with?
If not, what is the best approach to create a baseline? Is the baseline (of a "good health" server) purely a user choice or are there any guidelines?
It would be nearly impossible to find a setup identical to yours [unless we compare against your own setup at regular intervals]. This is because, even if you have 2 servers with an identical number of users, their activities would not always be the same, so you would need to compare against how your own server was behaving at some earlier point in time.
If you are referring to the baseline argument that is available with JazzMon, we have a similar disclaimer in the JazzMon documentation.
To choose a good baseline for yourself, I would say monitor your setup for a while to get a sense of the user load and the load on the server, and base your baseline on that.
Oh one other thing that I wanted to mention for the last part of your question:
[3] If you set JazzMon to run for 5 days (let's say starting on Monday morning), it would run for 5 days, stop at the end of the final day and then perform the analysis of these logs.
Suppose you have set the monitoring for 2 weeks and need to analyze the reports at the end of each day: you could manually run the JazzMon.jar -analyze command in another command prompt to check the interim status too.
[1] We switch on JazzMon when we need to monitor the performance of a Jazz repository. We run it for seven days, with a time interval of an hour. At the end of the seven days we perform the gather and analyse steps and then restart the next 7 day run. This continues for as long as needed.
[3] We use it when we need to assess performance. Whilst I said in [1] above that we switch it ON when we want to assess the performance of a given repository, wherever we have done so it has remained ON, and we haven't yet stopped any analysis of our production servers.