r17 - 2014-04-02 - 15:10:34 - Main.gcovell

Collaborative Lifecycle Management performance report: Extract, Transform, and Load (ETL) 4.0.6 release

Authors: Peng Peng Wang
Last updated: Jan 23rd, 2014
Build basis: CLM 4.0.6

Page contents

Introduction
- What our tests measure
Findings
- Performance goals
- Findings
Topology
- Data volume and shape
- Network connectivity
Methodology
Results
Appendix A
Appendix B
Appendix C

Introduction

This article presents the results of our "Extract, Transform, and Load" (ETL) performance testing for the 4.0.6 release of the Rational solution for Collaborative Lifecycle Management (CLM). The tests cover both ETL types, Java ETL and Data Manager (DM) ETL, and both load types, full load and delta load. The article focuses on comparing ETL performance between the 4.0.6 release and the 4.0.5 release.

Disclaimer

The information in this document is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites. Any performance data contained in this document was determined in a controlled environment, and therefore, the results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multi-programming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

This testing was done as a way to compare and characterize the differences in performance between different versions of the product. The results shown here should thus be looked at as a comparison of the contrasting performance between different versions, and not as an absolute benchmark of performance.

What our tests measure

We use predominantly automated tooling such as Rational Performance Tester (RPT) to simulate a workload normally generated by client software such as the Eclipse client or web browsers. All response times listed are those measured by our automated tooling and not a client.

The diagram below shows, at a very high level, which aspects of the end-to-end experience (human end user to server and back again) our performance tests simulate. The tests described in this article simulate the segment of the end-to-end transaction indicated in the middle of the diagram. Performance tests are server-side and capture response times for this segment of the transaction.

Findings

Performance goals

Verify that there are no performance regressions between the current release (4.0.6) and the prior release (4.0.5).

Findings

For JTS, CCM, and the star job, DM and JAVA ETL throughput in 4.0.6 is similar to that in 4.0.5.

DM and JAVA ETL of QM now take longer because a new feature that includes QM and RM links (95244) adds two new ETL builds. With the performance team's test data, these two new builds increase the QM ETL duration by approximately 18% (about 15 minutes) compared with the 4.0.5 tests.

A design change in RM 4.0.6 results in all data artifacts being loaded properly. RM 4.0.5 and prior releases inadvertently loaded only the first 100 shared public module views. This defect was fixed in RM 4.0.6 and is tracked as "RRC REST Service only get 100 shared public module view" (82515). Consequently, the 4.0.5 DM and JAVA ETL of RM appear faster, but only because their loads were incomplete. The 4.0.6 DM ETL of RM appears approximately 30% slower than 4.0.5, and the JAVA ETL approximately 50% slower, but both loads are now complete and more accurate. See Appendix C for more details.
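As a rough sanity check on the QM figures in the findings above, the two stated numbers (about 18% and about 15 minutes) can be related to each other. The sketch below is an inference, not a measurement from this report: the 4.0.5 QM ETL duration is not stated here, and the function name `implied_baseline` is a hypothetical helper introduced for illustration.

```python
def implied_baseline(added_minutes: float, percent_increase: float) -> float:
    """Baseline duration implied by an absolute and a relative increase.

    If a duration grew by `added_minutes`, and that growth equals
    `percent_increase` percent of the baseline, then the baseline is
    added_minutes / (percent_increase / 100).
    """
    return added_minutes * 100.0 / percent_increase


# Figures quoted in the findings: about 15 minutes, about 18%.
qm_405_minutes = implied_baseline(15.0, 18.0)   # inferred, not measured
qm_406_minutes = qm_405_minutes + 15.0          # inferred, not measured

print(f"Implied 4.0.5 QM ETL duration: {qm_405_minutes:.0f} min")
print(f"Implied 4.0.6 QM ETL duration: {qm_406_minutes:.0f} min")
```

Because both inputs are rounded ("approximately" and "about"), the implied durations should be read as order-of-magnitude estimates only.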

Topology

The topology under test is based on Standard Topology (E1) Enterprise - Distributed / Linux / DB2.

Server Overview

The specifications of the machines under test are listed in the table below. Server tuning details are listed in Appendix A.

This report used the same test environment and same test data to test the ETL performance for CLM 4.0.4, 4.0.5 and 4.0.6. Test data was generated using automation. The test environment for the latest release was upgraded from the earlier one by using the CLM upgrade process. To create four VMs, one X3550 M3 (at 2.67 GHz, 48 GB RAM, and 12 physical cores) was used. In the topology, four CLM applications (JTS, CCM, QM, and RM) were installed on VM1; the CLM repository was installed on VM2; the Data Manager ETL tool was installed on VM3; and the data warehouse was installed on VM4.

The same software configuration was used for both CLM 4.0.5 and CLM 4.0.6. The WebSphere Application Server was version 8.5.1, 64-bit. The database server was IBM DB2 10.1, 64-bit. The Rational Reporting for Development Intelligence tool was version 2.0.6. The Jazz Team Server, CCM, QM, and RM applications co-existed in the same WebSphere Application Server profile. The JVM was set to use an 8 GB heap with a 1 GB nursery. Server tuning details are listed in Appendix A.

IBM Tivoli Directory Server was used for managing user authentication.

Function	Number of Machines	Machine Type	CPU / Machine	Total # of CPU Cores/Machine	Memory/Machine	Disk	Disk capacity	Network interface	OS and Version
ESX Server1	1	IBM X3550 M3 7944J2A	1 x Intel Xeon E5-2640 2.5 GHz (six-core)	12 vCPU	36GB	RAID0 SAS x3 300G 10k rpm	900G	Gigabit Ethernet	ESXi4.1
JTS/RM Server	1	VM on IBM System x3550 M3	ESX Server 1	4 vCPU	16GB		120G	Gigabit Ethernet	Red Hat Enterprise Linux Server release 6.2
Database Server	1	VM on IBM System x3550 M3	ESX Server 1	4 vCPU	16GB		120G	Gigabit Ethernet	Red Hat Enterprise Linux Server release 6.2
RRDI Development Tool	1	VM on IBM System x3550 M3	ESX Server 1	2 vCPU	4GB		120G	Gigabit Ethernet	Windows 2008 Enterprise R2
ESX Server2	1	IBM X3550 M3 7944J2A	1 x Intel Xeon E5-2640 2.5 GHz (six-core)	12 vCPU	36GB	RAID0 SAS x3 300G 10k rpm	900G	Gigabit Ethernet	ESXi4.1
CCM Server	1	VM on IBM System x3550 M3	ESX Server 2	4 vCPU	16GB		120G	Gigabit Ethernet	Red Hat Enterprise Linux Server release 6.2
QM Server	1	VM on IBM System x3550 M3	ESX Server 2	4 vCPU	16GB		120G	Gigabit Ethernet	Red Hat Enterprise Linux Server release 6.2
Data Warehouse Server	1	VM on IBM System x3550 M3	ESX Server 1	4 vCPU	16GB		120G	Gigabit Ethernet	Red Hat Enterprise Linux Server release 6.2

Data volume and shape

The data volume is listed in Appendix B.

Network connectivity

All server machines and test clients are located on the same subnet. The LAN has 1000 Mbps of maximum bandwidth and less than 0.3 ms latency in ping.

Methodology

Each release is tested by loading the same test data set in the same test environment. The test data is migrated from the older release through the CLM migration process, which helps keep the performance data comparable among releases. Each test first runs the initial (full) load into the data warehouse, and then runs the delta ETL load against incremental data that amounts to about 10% of the initial data set.
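The release-to-release comparison described above boils down to a percent change per ETL job. The sketch below shows one way to express that comparison. The function names and the job labels are hypothetical, and the durations are placeholder values chosen for illustration; they are not measurements from this report.

```python
from typing import Dict


def percent_change(previous: float, current: float) -> float:
    """Percent change of `current` relative to `previous` (positive = slower)."""
    return (current - previous) * 100.0 / previous


def compare_releases(previous: Dict[str, float],
                     current: Dict[str, float]) -> Dict[str, float]:
    """Percent change for every ETL job that was measured in both releases."""
    return {
        job: percent_change(previous[job], current[job])
        for job in previous
        if job in current
    }


# Placeholder durations in minutes (NOT results from this report).
durations_405 = {"JTS full": 10.0, "CCM full": 40.0}
durations_406 = {"JTS full": 10.0, "CCM full": 42.0}

changes = compare_releases(durations_405, durations_406)
for job, change in changes.items():
    print(f"{job}: {change:+.1f}%")
```

Comparing only jobs present in both releases keeps builds that are new in a release, such as the two QM builds added in 4.0.6, from being mistaken for a regression in an existing job.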

Results

DM ETL Full Load

The figure below shows no degradation in the performance of JTS and CCM DM ETL.

RM DM ETL appears to take about 30% longer for the full ETL and about 16% longer for the delta ETL, because a defect fix causes RM to now properly load all the data. The RM defect is tracked as "RRC REST Service only get 100 shared public module view" (82515). Consequently, the 4.0.5 DM ETL of RM appears faster, but only because its load was incomplete. The 4.0.6 DM ETL of RM appears slower than 4.0.5, but its load is now complete and more accurate. See Appendix C for more details.

DM and JAVA ETL of QM now take longer because a new feature includes QM and RM links (95244) which adds two new ETL builds. Using the performance team's test data and comparing with the 4.0.5 tests, these two new builds increase the QM ETL duration approximately 18% (about 15 minutes).

DM ETL Delta Load

NOTE: The star job performs the same calculation over all the operational data in the data warehouse, so the performance of the delta star job is very similar to that of the full star job. For that reason, we use only the full ETL load to evaluate star job performance.

JAVA ETL Full Load

Precondition: CCM ETL has a build named Workitembaseline, which records the latest information for each work item (WI) by getting its latest work item history record. When the Workitembaseline ETL build runs, it gets the latest information (status, state, priority, severity, and so on) by requesting the latest WI history record whose change time is earlier than the ETL build start time. If no ETL is scheduled on a given day, the latest information for each work item on that day is not loaded into the workitembaseline table of the data warehouse (DW). JAVA ETL fills in the latest WI information for the days on which no ETL ran. For example, if no JAVA ETL runs on Jan 1st, the workitembaseline table has no chance to load the latest work item information for that day. However, the next JAVA ETL run, on Jan 2nd, inserts the missing data for Jan 1st.

The performance team always uses the same data set for ETL performance testing so that the results are comparable. Because of the fill-in behavior, the JAVA full ETL load picks up more workitembaseline data as time passes: the baseline data for 4.0.6 is slightly larger than that for 4.0.5, which in turn is slightly larger than that for 4.0.4, and so on. To improve the comparability of performance data from release to release, we insert one pseudo record so that the ETL inserts only a single day's baseline. This way, the Workitembaseline ETL build inserts the same amount of data in each test.
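The fill-in behavior described above can be sketched as a small date calculation. This is an illustration under assumptions, not the actual ETL implementation: the function name `days_to_backfill` is hypothetical, the year is arbitrary, and treating the pseudo record as a baseline dated the day before the run is one interpretation of the text.

```python
from datetime import date, timedelta
from typing import List


def days_to_backfill(last_baseline_day: date, run_day: date) -> List[date]:
    """Days for which a run on `run_day` would insert baseline data.

    Every day after the most recent baseline, up to and including the
    run day, is returned, so days without an ETL run are filled in.
    """
    gap = (run_day - last_baseline_day).days
    return [last_baseline_day + timedelta(days=n) for n in range(1, gap + 1)]


# No ETL ran on Jan 1st: the Jan 2nd run covers Jan 1st as well as Jan 2nd.
missed_day = days_to_backfill(date(2013, 12, 31), date(2014, 1, 2))

# Assumed effect of the pseudo record: a baseline dated the day before
# the run leaves exactly one day to insert.
single_day = days_to_backfill(date(2014, 1, 1), date(2014, 1, 2))

print(len(missed_day), len(single_day))
```

The longer the gap since the last baseline, the more days a run has to fill in, which is why the same data set yields more baseline data as time passes unless the gap is pinned to a single day.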

The figure below shows no degradation in the performance of JTS and CCM JAVA ETL.

RM JAVA ETL appears to take about 50% longer for the full ETL and about 12% longer for the delta ETL, because a defect fix causes RM to now properly load all the data. The RM defect is tracked as "RRC REST Service only get 100 shared public module view" (82515). Consequently, the 4.0.5 JAVA ETL of RM appears faster, but only because its load was incomplete. The 4.0.6 JAVA ETL of RM appears slower than 4.0.5, but its load is now complete and more accurate. See Appendix C for more details.

A new feature which adds QM and RM links into ETLs (95244) adds two new ETL builds which cause QM ETL duration to increase. Using the Performance Team's test data and comparing with the 4.0.5 tests, these two new builds increase the QM ETL duration approximately 18% (about 15 minutes).

JAVA ETL Delta Load

Precondition: The Workitembaseline precondition described under JAVA ETL Full Load applies to the delta load as well.

Appendix A

Product	Version	Highlights for configurations under test
IBM WebSphere Application Server	8.5.0.1	JVM settings: GC policy and arguments, max and init heap sizes: -verbose:gc -XX:+PrintGCDetails -Xverbosegclog:gc.log -Xgcpolicy:gencon -Xmx8g -Xms8g -Xmn1g -Xcompressedrefs -Xgc:preferredHeapBase=0x100000000 -XX:MaxDirectMemorySize=1g
DB2	DB2 10.1.1	Transaction log setting of data warehouse: * Transaction log size changed to 40960 db2 update db cfg using LOGFILSIZ=40960
LDAP server	IBM Tivoli Directory Server 6.3
License server		Hosted locally by JTS server
Network		Shared subnet within test lab
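The JVM arguments in the table above can be checked against the heap description in the Topology section (an 8 GB heap with a 1 GB nursery). The sketch below parses the size options from the argument string copied from the table. The helper name `size_option` is hypothetical, and the parser only handles the simple `<flag><number><k|m|g>` form used here.

```python
import re
from typing import Optional

# JVM arguments copied from the WebSphere Application Server row.
JVM_ARGS = (
    "-verbose:gc -XX:+PrintGCDetails -Xverbosegclog:gc.log "
    "-Xgcpolicy:gencon -Xmx8g -Xms8g -Xmn1g -Xcompressedrefs "
    "-Xgc:preferredHeapBase=0x100000000 -XX:MaxDirectMemorySize=1g"
)

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
GIB = 1024 ** 3


def size_option(args: str, flag: str) -> Optional[int]:
    """Return the size in bytes of an option such as -Xmx8g, or None."""
    pattern = rf"(?:^|\s){re.escape(flag)}(\d+)([kmg])(?=\s|$)"
    match = re.search(pattern, args, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)) * UNITS[match.group(2).lower()]


max_heap = size_option(JVM_ARGS, "-Xmx")   # maximum heap
init_heap = size_option(JVM_ARGS, "-Xms")  # initial heap
nursery = size_option(JVM_ARGS, "-Xmn")    # nursery (new area) under gencon

print(max_heap // GIB, init_heap // GIB, nursery // GIB)
```

Setting `-Xms` equal to `-Xmx`, as in this configuration, fixes the heap size for the whole run so that heap resizing does not add variation to the measurements.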

Appendix B

	Record type	Initial load	Delta load
CCM	APT_ProjectCapacity	1	1
	APT_TeamCapacity	0	0
	Build	0	0
	Build Result	0	0
	Build Unit Test Result	0	0
	Build Unit Test Events	0	0
	Complex CustomAttribute	0	0
	Custom Attribute	0	0
	File Classification	3	3
	First Stream Classification	3	3
	History Custom Attribute	0	0
	SCM Component	2	0
	SCM WorkSpace	2	1
	WorkItem	100026	10000
	WorkItem Approval	100000	10000
	WorkItem Dimension Approval Description	100000	10000
	WorkItem Dimension	3	0
	WorkItem Dimension Approval Type	3	0
	WorkItem Dimension Category	2	0
	WorkItem Dimension Deliverable	0	0
	WorkItem Dimension Enumeration	34	0
	WorkItem Dimension Resolution	18	0
	Dimension	68	0
	WorkItem Dimension Type	8	0
	WorkItem Hierarchy	0	0
	WorkItem History	242926	20100
	WorkItem History Complex Custom Attribute	0	0
	WorkItem Link	112000	10000
	WorkItem Type Mapping	4	0
RM	CrossAppLink	605658	88293
	Custom Attribute	422710	51073
	Requirement	424760	51393
	Collection Requirement Lookup	163110	37200
	Module Requirement Lookup	206000	20000
	Implemented BY	100	0
	Request Affected	5988	0
	Request Tracking	0	0
	REQUICOL_TESTPLAN_LOOKUP	0	0
	REQUIREMENT_TESTCASE_LOOKUP	0	0
	REQUIREMENT_SCRIPTSTEP_LOOKUP	24000	2400
	REQUIREMENT_HIERARCHY	12626	2328
	REQUIREMENT_EXTERNAL_LINK	0	0
	RequirementsHierarchyParent	6184	0
	Attribute Define	10	10
	Requirement Link Type	176	176
	Requirement Type	203	203

Record type		Initial load	Delta load
TestScript		0	0
BuildRecord		2000	200
Category		55	12
CategoryType		12	0
Current log of Test Suite		600	60
EWICustomAttribute		0	0
EWIRelaLookup
	CONFIG_EXECUTIONWORKITM_LOOKUP	0	0
	EXECWORKITEM_REQUEST_LOOKUP	0	0
	EXECWORKITEM_ITERATION_LOOKUP	18000	1800
	EXECWORKITEM_CATEGORY_LOOKUP	0	0
ExecResRelaLookup
	EXECRES_EXECWKITEM_LOOKUP	54000	5400
	EXECRES_REQUEST_LOOKUP	6001	0
	EXECRESULT_CATEGORY_LOOKUP	0	0
	EXECUTION_STEP_RESULT	0	0
ExecStepResRequestLookup		0	0
ExecutionResult		54000	5400
ExecutionStepResult		0	0
ExecutionWorkItem		18000	1800
Job		0	0
JobResult		0	0
KeyWord		0	0
KeyWordTestScriptLookup		0	0
LabRequestChangeState		0	0
LabRequest		252	25
LabResource		2400	2640
Objective		0	0
Priority		4	0
RemoteScript		0	0
Requirement		0	0
Reservation		3199	320
ReservationRequestLookup		3	12
ResourceGroup		0	0
ScriptStep_Rela_Lookup		24000	2397
State		24	0
StateGroup		6	0
TestCase		6000	600
TestCaseCustomAttribute		0	0
TestCaseRelaLookup
	TESTCASE_RemoteTESTSCRIPT_LOOKUP	0	0
	TESTCASE_TESTSCRIPT_LOOKUP	6000	600
	TESTCASE_CATEGORY_LOOKUP	16106	1598
	REQUIREMENT_TESTCASE_LOOKUP	6000	0
	REQUEST_TESTCASE_LOOKUP	6000	0
	TestCase RelatedRequest Lookup	0	0
TestEnvironment		400	0
TestPhase		120	0
TestPlan		11	1
TestPlanObjectiveStatus		0	0
TestPlanRelaLookup
	REQUIREMENT_TESTPLAN_LOOKUP	0	0
	TESTSUITE_TESTPLAN_LOOKUP	600	0
	TESTPLAN_CATEGORY_LOOKUP	0	2
	TESTPLAN_TESTCASE_LOOKUP	6000	600
	TESTPLAN_OBJECTIVE_LOOKUP	0	0
	REQUIREMENT COLLECTION_TESTPLAN_LOOKUP	32	0
	TESTPLAN_TESTPLAN_HIERARCHY	0	0
	TESTPLAN_ITERATION_LOOKUP	120	12
	REQUEST_TESTPLAN_LOOKUP	0	0
TestScript		6000	1200
TestScriptRelaLookup _ Manual
	TESTSCRIPT_CATEGORY_LOOKUP	0	0
	REQUEST_TESTSCRIPT_LOOKUP	0	0
TestScriptRelaLookup _ Remote		0	0
TestScriptStep		24000	2397
TestSuite		600	60
TestSuite_CusAtt		0	0
TestSuiteElement		9000	900
TestSuiteExecutionRecord		600	60
TestSuiteLog		3000	300
TestSuiteRelaLookup
	TESTSUITE_CATEGORY_LOOKUP	1595	155
	REQUEST_TESTSUITE_LOOKUP	0	0
TestSuLogRelaLookup
	TESTSUITE_TESTSUITELOG_LOOKUP	3000	300
	TESTSUITELOG_EXECRESULT_LOOKUP	21303	2106
	TESTSUITELOG_CATEGORY_LOOKUP	0	0
TestSuiteExecutionRecord_CusAtt		600	60
TSERRelaLookup		0	0
	TSTSUITEXECREC_CATEGORY_LOOKUP	0	0
Total		299682	31094

N/A: Not applicable.
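The Methodology section states that the delta load covers about 10% of the initial data. The sketch below checks that ratio for a few rows copied from the tables above. The selection of rows is arbitrary and the helper name `delta_ratio` is hypothetical. For these rows the ratio falls roughly between 8% and 12%.

```python
# (record type, initial load, delta load), copied from Appendix B.
ROWS = [
    ("WorkItem", 100026, 10000),
    ("WorkItem History", 242926, 20100),
    ("Requirement", 424760, 51393),
    ("TestCase", 6000, 600),
    ("Total (second table)", 299682, 31094),
]


def delta_ratio(initial: int, delta: int) -> float:
    """Delta load as a percentage of the initial load."""
    return delta * 100.0 / initial


ratios = {name: delta_ratio(initial, delta) for name, initial, delta in ROWS}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}%")
```

Rows with an initial load of 0 are left out on purpose, because the ratio is undefined for them.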

Appendix C

A new QM and RM integration feature, "As a report author, I want test script step including its index and its requirement links to be ETLed" (95244), was added in 4.0.6. The QM team added two ETL builds to implement the feature. The performance team was asked to create QM/RM link test data to evaluate the performance impact of the new feature, and created 24K links between QM test script steps and RM requirements. Compared with 4.0.5, the two new ETL builds take about 15 minutes in the full ETL, an increase in duration of about 18%.

The defect tracked as "RRC REST Service only get 100 shared public module view" (82515) was fixed in RM 4.0.6. The defect caused the RRC ETL to load only the first 100 shared module views. The lookup data between requirements and module views was lost as well, because the ETL build failed to look up the shared module views. Before the fix, the ETL could load 100 shared module views and 9K rows of requirement and module view lookup data, whereas the performance test data contains 2000 shared module views and 180K rows of requirement and module view lookup data. Consequently, the 4.0.5 ETL of RM appears faster, but only because its load was incomplete. With the defect fixed, the 4.0.6 DM ETL of RM appears approximately 30% slower than 4.0.5, and the 4.0.6 JAVA ETL of RM approximately 50% slower, but both loads are now complete and more accurate.
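To put the RM defect in proportion, the counts quoted above give the share of the test data that the ETL loaded before the fix. The sketch below uses only those four counts; the helper name `coverage` is hypothetical.

```python
def coverage(loaded: int, available: int) -> float:
    """Percentage of the available rows that were actually loaded."""
    return loaded * 100.0 / available


# Counts quoted in this appendix for the ETL before the defect fix.
view_coverage = coverage(100, 2000)          # shared module views
lookup_coverage = coverage(9_000, 180_000)   # requirement/module view lookups

print(f"Shared module views loaded before the fix: {view_coverage:.0f}%")
print(f"Lookup rows loaded before the fix: {lookup_coverage:.0f}%")
```

Both ratios work out to 5%, so the pre-fix RM timings covered only a small part of this data in the test set. This is consistent with the report's point that the 4.0.5 and 4.0.6 RM durations do not measure the same amount of work.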

For more information

Collaborative Lifecycle Management 2012 Sizing Report (Standard Topology E1)

About the authors

PengPengWang

Questions and comments:

What other performance information would you like to see here?
Do you have performance scenarios to share?
Do you have scenarios that are not addressed in documentation?
Where are you having problems in performance?
