JazzMon usage in a production environment
Hi, some questions on JazzMon usage in a production environment:
1. What is the recommended or best time schedule for monitoring? A 60-minute gap for a total of 8 hours?
2. Are there any downsides to using JazzMon in a live production environment, such as performance issues while monitoring is on?
3. How is it generally used in production? Do people run it all the time, or switch it on for a week, gather the data, analyse it, and switch it off?
Accepted answer
Hi Srikanth,

The best place to start for your questions would be: https://jazz.net/wiki/bin/view/Deployment/JazzMonFAQ

To answer your questions specifically:

1) The timing interval very much depends on why you're running JazzMon in the first place. If you are creating a baseline of overall load for future comparison, then 60 minutes may well be sufficient. If you need the snapshots to record peak times with more granularity, then you could reduce this to 15 minutes. It is important not to collect too frequently unless specifically asked to do so as part of an investigation, say with Rational Client Support. The duration can also be longer than 8 hours, and typically the longer you leave it, the more representative the data will be. Three days would typically be recommended.
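As a rough worked example (not part of the original answer), the snippet below simply counts how many snapshots each of the schedules discussed above would produce, assuming one snapshot per interval; it only illustrates the granularity versus data volume trade-off.

public class SnapshotCount {
    public static void main(String[] args) {
        // {interval in minutes, duration in hours} for the schedules discussed above:
        // 60-minute samples over 8 hours, and 60- or 15-minute samples over 3 days.
        int[][] schedules = { {60, 8}, {60, 72}, {15, 72} };
        for (int[] s : schedules) {
            int snapshots = s[1] * 60 / s[0];   // one snapshot per interval
            System.out.printf("%2d-minute interval over %2d hours -> %3d snapshots%n",
                    s[0], s[1], snapshots);
        }
    }
}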
2) The overhead of JazzMon is light and certainly below the 3-5% target for monitoring tools in general. Of course, if you're at the limit of your resources to start with, 3-5% suddenly becomes important. Therefore you may wish to monitor the JVM and system before you initiate JazzMon. If you have no other monitoring in place already, you could consider another low-cost (although not historical) monitoring option: https://jazz.net/wiki/bin/view/Deployment/JazzMonitoringThroughJMX
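The wiki page above describes the supported JMX approach; purely as an illustrative sketch (the service URL, host, and port below are placeholder assumptions, not values taken from that page), reading the JVM heap usage of a remote server over a standard JMX connector might look like this:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JvmHeapCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint: the real host/port depend on how your
        // application server exposes JMX (see the wiki page linked above).
        String url = "service:jmx:rmi:///jndi/rmi://localhost:9875/jmxrmi";
        try (JMXConnector connector =
                 JMXConnectorFactory.connect(new JMXServiceURL(url))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            long used = memory.getHeapMemoryUsage().getUsed();
            long max  = memory.getHeapMemoryUsage().getMax();   // may be -1 if undefined
            System.out.printf("Heap used: %d MB of %d MB%n",
                    used / (1024 * 1024), max / (1024 * 1024));
        }
    }
}

The same MBeanServerConnection can also be pointed at the other platform MXBeans (threads, garbage collection, class loading) if you want a slightly broader low-cost picture of the JVM before enabling JazzMon.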
3) JazzMon is a tool to help you see how your system is running, in general and over time. Therefore it is envisaged that you'd initially create a baseline for future comparisons (over, say, a 3-day period) and then periodically re-run the tool and analyse the results against the original baseline. We certainly wouldn't expect or think you would require constant monitoring.
I hope that this information helps. If not then please let us know what other thoughts and concerns you have.

Srikanth Bhushan selected this answer as the correct answer
3 other answers
In my prior company, we used a 15-minute sample cycle, recording 24 hours a day, with daily reports and weekly restarts.
The counters that JazzMon uses are always updated, so the actual JazzMon overhead is tiny: it captures the counts on some cycle and resets the counters. I built some extensive Excel macros that helped us quickly view the massive amount of data and build trend reports. We were able to use this data to identify specific areas of the deployment to watch: source code retrieve operations and RSS/event feed activity. Here are a couple of those reports as screenshots.

The table is a daily view, by repository (CCM server), using the service_etCnt.csv file, sorted from most to least active (for the average of the sample period). Columns A and E display which operations were most consuming on the server (ranked in the top 10 in column B, using the service_etTot.csv file). Column C shows average elapsed time versus the IBM baseline file (8 hour), using the service_etAvg.csv file. Column D shows the relative percentage compared to the IBM baseline (green is better, red is worse). Columns E and F show counts, and column G shows the counts for one 15-minute sample period. Column A has specific entries highlighted to indicate source-code-related operations. This was produced daily, per server.

Then, to make more sense of the daily data, I produced a consolidated set of graphs that displayed one item each. Here is nine months of source code file retrieve counts (scm.common.iVersionedContent.Get, our most active operation) across all the servers. Red is the overall trend line. If you look carefully, you can see when users started doing these operations on a particular server (as we added capacity with more CCM servers): the green line comes off the base axis about midway through the chart, and the blue line much later. The top purple line is the sum of all the servers for this operation. The peaks and valleys are roughly weekly cycles, as you might expect, since the workforce has Saturday and Sunday off.

Using the daily reports I could build graphs that helped identify specific behaviours that impacted our users (we had a UI response time performance problem, which was caused by an accidental concurrence of multiple software builds). The JazzMon team has since provided some additional Excel macros that can present some of this data in graphical form.
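To give a concrete (hypothetical) flavour of the kind of post-processing described above, here is a small sketch that ranks the most active services from a service_etCnt.csv export. The column layout is an assumption on my part (service name in the first column, per-sample counts in the remaining columns); the real JazzMon output may differ, so treat this purely as an illustration of the "sort most to least active by average count" step, not as a drop-in script.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class TopServices {
    public static void main(String[] args) throws IOException {
        // Assumed layout: column 0 = service name, columns 1..n = per-sample call counts.
        List<String> lines = Files.readAllLines(Paths.get("service_etCnt.csv"));
        lines.stream()
             .skip(1)                                   // skip the header row
             .map(line -> line.split(","))
             .map(cols -> Map.entry(cols[0],            // average count over the sample period
                     Arrays.stream(cols, 1, cols.length)
                           .mapToDouble(Double::parseDouble)
                           .average()
                           .orElse(0.0)))
             .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
             .limit(10)                                 // the ten most active services
             .forEach(e -> System.out.printf("%-60s %12.1f%n", e.getKey(), e.getValue()));
    }
}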
Hello Srikanth
[1] In case the performance issues are erratic, you might just want to run the monitoring for a longer duration (a week or so). The default interval is sufficient; when you are monitoring some event that causes bottlenecks once every couple of hours, choosing a sample duration of one day would suffice.

[2] JazzMon only collects snapshot data and makes it available for statistical study; if you go to the internal counters under jts/admin, you will see this data anyway. So the JazzMon collection is not going to otherwise stress the server.

[3] I usually recommend using JazzMon to figure out which components/services are being used the most when there is a huge number of users on the server (usually during a performance investigation). So the best time to schedule the monitoring is when you feel the issue is going to occur. That said, if you find your server performing badly on Mondays after a weekend, it's worth keeping a tab on what has been going on over the weekend via JazzMon, rather than spending the weekend monitoring it manually ;)

Comments
Srikanth Bhushan
commented Jul 29 '13, 5:46 a.m.
Thanks KK. That answers most of it. One more question: is there a predefined baseline for X number of users to compare my results with? If not, what is the best way and approach to create a baseline? Is a baseline (of a "good health" server) purely a user choice, or are there any guidelines?

It would be nearly impossible to have an identical setup to yours (unless we really compare it against your own setup at regular intervals). This is because, even if you have two servers with an identical number of users, their activities would not always be the same, so your comparisons would need to be with how your own server was behaving at an earlier point in time.
Oh one other thing that I wanted to mention for the last part of your question:
[1] We switch on JazzMon when we need to monitor the performance of a Jazz repository. We run it for seven days, with a time interval of an hour. At the end of the seven days we perform the gather and analyse steps and then restart the next 7 day run. This continues for as long as needed.
[3] We use it when we need to assess performance. Whilst I said in [1] above that we switch it ON when we want to assess the performance of a given repository, where we have done so it has remained ON, and we haven't yet stopped any analysis of our production servers.