Typical questions raised during a Technical Deployment Planning Workshop

Authors: TimFeeney, GrantCovell, StevenBeard, MikeDelargy
Build basis: N/A

The following are typical questions that can come up during a Technical Deployment Planning Workshop, along with some of the information we like to see in advance.

Prerequisites

Current topology with detailed server specifications (cores, memory, disk, OS, etc.)
JVM Settings
System Monitoring reports for a typical 24-hour period
TBD Standard Topology Survey results

Integrated view of software development lifecycle tool landscape

Which Rational products have you deployed?
What other IBM tools/versions are integrated into your Rational environment?
What 3rd party tools/versions are integrated into your Rational environment?
What open source tools/versions are integrated into your Rational environment?
What home grown tools are integrated into your Rational environment?
1. If using RTC, what Eclipse/Visual Studio clients do you integrate with? What difficulties do you experience keeping them in sync with a deployed/supported version of RTC/CLM? Do you have or need a mixed client version environment (n-1 compatibility)?

Current deployment topology and characteristics

Which standard topology is your topology closest to? see Standard deployment topologies overview: Recommended and alternate topologies
How many JTS, RM, CCM, QM and DM instances?
Are you using a Reverse Proxy server?
Do you have a load balancer?
1. Note that we have limited hands-on experience with web tier appliances such as F5 BIG-IP
Do you have a content caching proxy server setup for remote sites or your build farm?
What is the network bandwidth for each site?
1. Rule of thumb: latency between servers 1-5 ms, <50 ms to the license server, and server to client <200 ms 90% of the time. See Sizing Strategy
What other applications are running on the same server as the CLM applications?
Do you have separate environments for Development, Test and Production? How are they the same or different?
Do you have a DR environment?
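
The latency rule of thumb above can be turned into a quick check against measured round-trip times. The following is a minimal sketch, not an official sizing tool: the function names and the idea of feeding it lists of samples in milliseconds are assumptions made for illustration, and only the thresholds come from the rule of thumb.

```python
# Thresholds from the rule of thumb above, in milliseconds.
SERVER_TO_SERVER_MAX_MS = 5.0
LICENSE_SERVER_MAX_MS = 50.0
CLIENT_MAX_MS = 200.0
CLIENT_FRACTION = 0.90  # share of client samples that must be under the limit


def fraction_under(samples_ms, limit_ms):
    """Return the fraction of samples strictly below limit_ms."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    return sum(1 for s in samples_ms if s < limit_ms) / len(samples_ms)


def check_latency(server_ms, license_ms, client_ms):
    """Compare measured latencies (hypothetical inputs) with the rule of thumb.

    Each argument is a list of round-trip times in milliseconds.
    Returns a dict mapping each check to True (within the rule) or False.
    """
    return {
        "server_to_server": max(server_ms) <= SERVER_TO_SERVER_MAX_MS,
        "license_server": max(license_ms) < LICENSE_SERVER_MAX_MS,
        "client": fraction_under(client_ms, CLIENT_MAX_MS) >= CLIENT_FRACTION,
    }


if __name__ == "__main__":
    result = check_latency(
        server_ms=[1.2, 3.0],
        license_ms=[20.0],
        client_ms=[50.0] * 9 + [400.0],
    )
    print(result)
```

A failed check is a prompt for the network bandwidth and topology questions above rather than a verdict on the deployment.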

Existing infrastructure technology choices

How many cores for each server?
How much RAM for each server?
Are your servers virtual? see Principles of good virtualization
1. What is your hypervisor technology?
2. What is affinity/entitlement for the VM resources? Are the CPU, memory, and network resources dedicated and uncapped?
  1. We recommend anti-affinity rules so that no two servers (VMs) are on the same physical server; otherwise recovery is more difficult and complex
3. What are the overcommitment rules in place for the VM resources?
  1. memory should never be overcommitted
4. Do the allocated CPUs for each VM correspond to an actual physical CPU?
Which operating system / platform are you using for the application tier (servers hosting your Rational applications)?
What browser(s) do you use?
Which database and version are you using for Jazz?
1. How long have you been using your Jazz database?
2. How large is your Jazz database?
Which application server edition/version are you using?
1. What are the JVM heap min/max/nursery settings for each application server?
What storage type/technology do you use?
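
The overcommitment questions above come down to simple arithmetic over the VM inventory of each physical host. As a rough sketch under stated assumptions (the `VM` type, the `overcommit_report` function, and the sample figures are all hypothetical, and hypervisor overhead is ignored), the check might look like this:

```python
from dataclasses import dataclass


@dataclass
class VM:
    """One virtual machine on a physical host (illustrative type)."""
    name: str
    vcpus: int
    memory_gb: float


def overcommit_report(host_cores, host_memory_gb, vms):
    """Summarize CPU and memory allocation against one physical host.

    A ratio above 1.0 means more has been allocated to VMs than the
    host physically has. Hypervisor overhead is not accounted for.
    """
    total_vcpus = sum(vm.vcpus for vm in vms)
    total_memory = sum(vm.memory_gb for vm in vms)
    return {
        "cpu_ratio": total_vcpus / host_cores,
        "memory_ratio": total_memory / host_memory_gb,
        "cpu_overcommitted": total_vcpus > host_cores,
        "memory_overcommitted": total_memory > host_memory_gb,
    }


if __name__ == "__main__":
    vms = [VM("jts", 4, 16), VM("ccm", 8, 32), VM("qm", 8, 32)]
    print(overcommit_report(host_cores=16, host_memory_gb=64, vms=vms))
```

In this sample, memory is overcommitted, which the note above says should never happen, so the conversation would turn to moving a VM or adding memory to the host.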

Current/planned usage model

What is the current/planned number of registered users?
1. Take care when the total gets into the 4000-5000 user range. The rule of thumb is 3000 per JTS (more from a management/failure perspective than a performance one)
What is the current/planned number of concurrent users?
What are the current/planned registered/concurrent user breakdown by role and geography?
What is the timeline for the current and future state deployment?
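
The 3000-users-per-JTS rule of thumb above gives a first-pass count of JTS instances to discuss in the workshop. This is only a sketch of that arithmetic; the function name is an assumption, and the result is a conversation starter rather than a sizing recommendation.

```python
import math

USERS_PER_JTS = 3000  # rule of thumb from the note above


def estimate_jts_count(registered_users, users_per_jts=USERS_PER_JTS):
    """Return a first-pass number of JTS instances for a registered user count."""
    if registered_users < 0:
        raise ValueError("registered_users must not be negative")
    # At least one JTS is always needed, even for a very small deployment.
    return max(1, math.ceil(registered_users / users_per_jts))


if __name__ == "__main__":
    for users in (800, 3000, 4500, 9000):
        print(users, "registered users ->", estimate_jts_count(users), "JTS")
```

A deployment in the 4000-5000 range lands on two instances here, which matches the caution in the note that this is the range where a single JTS deserves a closer look.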

Understand CLM configuration and typical CLM usage scenarios/patterns by role

Be aware of application limits
In cases of multiple Jazz application servers of a given kind (e.g. multiple CCM servers), how are projects assigned to given server? Are there cross-server relationships between these projects?
What traceability relationships exist between the application artifacts?
Is RTC distributed SCM in use?
Is RTC process sharing in use?
What is your typical release pattern/schedule?
Have you implemented a continuous integration practice?
What is your volume of builds per day? week?
Have you implemented a continuous deployment practice?
What is your volume of deployments per day? week?
What are the typical application usage patterns by each CLM application and each role?

Top end user issues and concerns

What performance concerns do you have at this time?
What are the primary end user issues and concerns at the moment? Have you logged PMRs?
What are the primary application administration issues and concerns at the moment? Have you logged PMRs?

Operational context for project deliverables

Public vs Private vs Hybrid Cloud
Release and Deploy process/strategy

High Availability and Disaster Recovery

Background
1. What is your required level of availability for your Rational environment for supported hours? see Availability - "number of nines"
  1. Typical development environments should be 99.9%, anything more is unnecessary
2. What are your supported business operating hours for your Rational environment?
3. What is your current level of availability? see Measuring availability
4. How do you measure your availability?
5. Do you have planned outages? How often/long? What do you do during those times?
6. What is your requirement for Mean Time to Recovery (MTTR) for high availability scenarios in hours? see Mean Time to Recovery (MTTR). What is your typical/actual MTTR?
  1. 1 hour or more is reasonable. Anything less will be difficult to achieve.
7. What is your required Recovery Point Objective (RPO) for disaster recovery scenarios in hours? see Recovery point objective (RPO). What is your typical/actual recovery point?
  1. 24 hours is fairly standard. Less is very aggressive and difficult to achieve.
8. What is your required Recovery Time Objective (RTO) for disaster recovery scenarios in hours? see Recovery time objective (RTO). What is your typical/actual recovery time?
  1. 2 days is reasonable
Approach
1. What is your strategy for backup?
  1. What do you back up? see CLM Backup (db conf files, proxy, virtual servers)
  2. How often do you back up your development environment?
  3. How often do you test your approach to back-up (and restore)? How?
2. Do you perform root cause analysis of your failures and what are they telling you?
3. How are failures detected? How are you notified of your failures? How are they triaged?
4. Do you monitor for failures? Which levels?
5. How long do you try to recover from an HA failure before you decide to treat it as a DR failure?
6. What approach and technology do you use to support application tier high availability?
7. What approach and technology do you use to support application tier disaster recovery?
8. How often do you test your approach to high availability? How? see Chaos Monkey.
9. How often do you test your approach to disaster recovery? How?
10. Do you have a backup for every person involved in the process from failure to recovery?
11. Do you have the author or someone else test the procedures?
12. How do you verify that the environment has recovered properly?
Failure scenarios
1. Jazz application failures
2. WAS failures
3. Single application server failure
  1. If you use virtualization, is it set up for auto restart on server failure?
4. Database failure
  1. Do you perform database log shipping between primary and secondary data center?
5. Web server failure
6. Jazz indexes corruption
7. Primary data center failure
8. Network failure
9. Storage failure
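
To make the availability and MTTR questions in the Background list concrete, it can help to convert a "number of nines" target into a downtime budget over the supported hours. The sketch below assumes a simple model (availability measured only over supported hours, 52 weeks per year); the function names and the 12x5 support schedule are illustrative, not taken from the source.

```python
import math


def allowed_downtime_hours(availability_pct, supported_hours_per_week=168.0, weeks=52):
    """Return the unplanned downtime budget, in hours per year.

    availability_pct is the target (e.g. 99.9), measured only over the
    supported hours per week.
    """
    total_supported_hours = supported_hours_per_week * weeks
    return total_supported_hours * (1 - availability_pct / 100.0)


def outages_within_budget(budget_hours, mttr_hours):
    """Return how many outages of length mttr_hours fit into the budget."""
    return math.floor(budget_hours / mttr_hours)


if __name__ == "__main__":
    # 99.9% over 24x7 support, and over an assumed 12x5 schedule (60 h/week).
    for hours_per_week in (168.0, 60.0):
        budget = allowed_downtime_hours(99.9, hours_per_week)
        print(f"{hours_per_week:.0f} h/week: {budget:.2f} h/year budget, "
              f"{outages_within_budget(budget, 1.0)} outages at a 1 hour MTTR")
```

Running the numbers this way shows why the supported-hours question matters: the same 99.9% target leaves a much smaller absolute budget when the supported window is narrower.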

Capacity planning, performance and monitoring

Explain why these three topics are grouped together (see Monitoring: Where to Start?)
1. Monitoring a system provides insight into trends and behavior based on real production data
2. Understanding what monitoring data tells us can enable capacity planning
3. Performance capability and problems can be best analyzed by understanding the normal behavior of a system as documented through monitoring data
Introduce the ideas of
1. Tactical monitoring, or “Monitoring in a time of war”, i.e., during a critical situation when logs are needed
2. Strategic monitoring, or “Monitoring in a time of peace”, i.e., when things are good and systems are usually left alone
What does the client monitor today?
1. App, App Server, System/OS, VM/LPAR, DB, Network, availability, end-user performance benchmarks, end to end user transaction performance, users/license, reverse proxy, caching proxy
What would the client like to monitor in addition?
1. App, App Server, System/OS, VM/LPAR, DB, Network, availability, end-user performance benchmarks, end to end user transaction performance, users/license, reverse proxy, caching proxy
What tools does the client use to monitor today?
1. IBM tools, jtsmon, homegrown, 3rd-party?
2. And at what tiers do these tools monitor?
What does the client do with the data that they monitor?
1. Tactical (monitoring > triggers)
  1. Introduce, or acknowledge the “detect, decide, do” model
  2. Are there triggers which cause alarms, e.g. JVM > X GB, CPU > 90%, end-user ping > 10 sec, etc.?
  3. Do they have flowcharts, engagement matrix?
  4. How do they do triage?
  5. Do they do any postmortems or analysis?
2. Strategic (monitoring > planning)
  1. Is data consolidated into a dashboard?
  2. Are reports created?
  3. How much data is created? How long do they save it?
  4. Can they use their monitoring to formulate any trend data?
  5. Can this trend data be used to forecast capacity planning?
3. What other groups might view monitoring data and what might they see?
  1. IT looks at servers; Finance looks at licenses; Team admins look at usage; DBAs look at databases
Introduce the 5 (actually 6) things we suggest they monitor at a minimum
1. JVM size, CPU %, license usage, db size, data moved up/down into app server and db server, uptime
2. Do they monitor these 6 things?
What do they think is missing from our tools with regard to monitoring?
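
The trigger question under tactical monitoring (JVM > X GB, CPU > 90%, end-user ping > 10 sec) can be illustrated with a small "detect" step that compares one monitoring sample against configured limits. This is a sketch only: the metric names and the function are hypothetical, and the JVM limit is a placeholder because the source leaves X unspecified.

```python
def evaluate_triggers(sample, thresholds):
    """Return the sorted names of metrics whose value exceeds its threshold.

    sample and thresholds are dicts keyed by metric name. Metrics that are
    missing from the sample are skipped rather than treated as alarms.
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in sample and sample[name] > limit
    )


if __name__ == "__main__":
    thresholds = {
        "jvm_heap_gb": 12.0,        # placeholder for "X GB"; choose per server
        "cpu_pct": 90.0,            # from the example trigger above
        "end_user_ping_sec": 10.0,  # from the example trigger above
    }
    sample = {"jvm_heap_gb": 13.1, "cpu_pct": 95.0, "end_user_ping_sec": 2.0}
    print(evaluate_triggers(sample, thresholds))
```

In the "detect, decide, do" model this covers only detection; who is notified and what they do next are the subject of the triage and engagement matrix questions.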

Security, audit and compliance

What security export regulations must you comply with?
1. local, regional, national, international
2. people, facility, corporate
3. database, application, web tiers
4. Are there multiple levels of security to work within e.g. unclassified, restricted, confidential, secret, top secret?
What confidentiality standards must you comply with?
1. teams within organization, between organization and with customer, between partners/subcontractors
2. people, facility, corporate
3. database, application, web tiers
Is there a need for multi-tenancy, e.g. host environment for multiple customers/contractors and need to segregate them?
Are there any needs to restrict what a given role, project, application, etc. has access to?
Do you have any corporate, national, etc. audit requirements (e.g. UK FSA, SOX)? What impact does that have on data retention and access control?
Are there any other standards that impact the construction and use of your development environment?
What remote access/VPN/VDI requirements do you have?
Do you permit BYOD/mobile access? What standards/policies govern their use?
Do you have any corporate SSO standards and preferred technologies?
What web tier security do you have in place?
Do you do SSL offloading at the web tier (HTTPS outside data center and HTTP inside)?

Administration, configuring, tuning

Do you use your corporate help desk for capturing tickets against your environment?
Do you use RTC for your own infrastructure/tools team planning?
What are the administrative responsibilities of the CLM tools team?
What are the administrative responsibilities of the infrastructure team?
What are the administrative responsibilities of the project teams?
What are the support hours for the CLM tools team vs infrastructure team?
Do you have a regularly scheduled maintenance window?
What administrative procedures do you have documented in detail (e.g. backup, upgrade, maintenance)? How often are they reviewed or tested? Can they be performed by multiple team members?
Do you have a central knowledgebase (e.g. wiki) for capturing and publishing your procedures?
Where do your admins/end users go for documentation (external to IBM and/or internal to their site)?
Do you version control any part of your procedures or automation scripts?
For the responsibilities of the CLM tools team, do you have at least two administrators capable of performing each responsibility?
What approval process is required to make changes to the project, application, server, etc?
What training do you provide? How? When?
What additional administrative or application skills does your team need?

Environment Enablement and support - Development Environment as a Service

Related topics: Deployment web home