ETL long time running


simon park (1312) | asked Feb 02 '16, 9:52 p.m.
edited May 18 '22, 10:46 a.m. by Shailee Sinha (113)


I have been using RTC 4.0.4 and RRDI 2.0.1 since 2013.
The repository holds about 340,000 work items, 200 users, and 3 projects.

The ETL runs every night without errors,
but it takes too long. Today it took 7 hours.
According to the log, most of that time is spent on the work item history attributes (see the log below).
I know that this fetch (WI history) is a full fetch every time,
but I don't use any history data reports in RRDI, so I don't need the WI history there.
Can I skip the work item history in the ETL?
Or how else can I reduce the ETL running time in my case?




2016-02-03 01:39:35,426 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Started Build WorkItemHisCustomAttr at 16. 2. 3      1:39
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Selected: 3482899
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Inserted: 3345588
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Updated: 0
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Ignored: 137311
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Inserting: 49 minutes 27 seconds 
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Updating: 0ms
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Looking Up: 23 minutes 54 seconds 
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Fetching Data: 9 minutes 36 seconds 
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Running: 3 hours 5 minutes 16 seconds 
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Finished Build WorkItemHisCustomAttr at 16. 2. 3      4:44. The build was successful
2016-02-03 04:44:52,021 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Started Build WorkItemComplexCustomAttr at 16. 2. 3      4:44
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Selected: 3880
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Inserted: 919
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Updated: 34001
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Records Ignored: 2294
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Inserting: 1 second 
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Updating: 10 seconds 
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Looking Up: Less than 1ms
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Fetching Data: 12 seconds 
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Time Running: 1 minute 24 seconds 
2016-02-03 04:46:16,932 [ccm: AsynchronousTaskRunner-3 @@ 00:13] DEBUG e.workitem.internal.WorkItemsRemoteSnapshotService  - ETL: Finished Build WorkItemComplexCustomAttr at 16. 2. 3      4:46. The build was successful
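
To see where the time goes in a log like the one above, the "ETL: Time ..." lines can be totalled per build. The following is a minimal sketch, not part of RTC or RRDI: the function names (`parse_duration`, `summarize`, `unaccounted`) are hypothetical, and the sample holds only the message part of lines copied from the log above, on the assumption that other builds report their timings in the same wording.

```python
import re
from collections import defaultdict

# Matches "ETL: Started Build <name> at ..." lines.
STARTED = re.compile(r"ETL: Started Build (\w+) at")
# Matches "ETL: Time <phase>: <duration>" lines.
TIMING = re.compile(r"ETL: Time ([A-Za-z ]+): (.+?)\s*$")
# Seconds per unit, using the unit names that appear in the log text.
UNIT_SECONDS = {"hour": 3600, "minute": 60, "second": 1}

# Message parts copied from the log above (timestamp and logger prefix dropped).
SAMPLE_LOG = """\
ETL: Started Build WorkItemHisCustomAttr at 16. 2. 3      1:39
ETL: Time Inserting: 49 minutes 27 seconds
ETL: Time Updating: 0ms
ETL: Time Looking Up: 23 minutes 54 seconds
ETL: Time Fetching Data: 9 minutes 36 seconds
ETL: Time Running: 3 hours 5 minutes 16 seconds
ETL: Started Build WorkItemComplexCustomAttr at 16. 2. 3      4:44
ETL: Time Running: 1 minute 24 seconds
"""


def parse_duration(text):
    """Convert text such as '3 hours 5 minutes 16 seconds' to seconds.

    Sub-second values ('0ms', 'Less than 1ms') are counted as 0.
    """
    total = 0
    for amount, unit in re.findall(r"(\d+) (hour|minute|second)s?", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return total


def summarize(lines):
    """Return {build name: {phase: seconds}} for every build in the log."""
    builds = defaultdict(dict)
    current = None
    for line in lines:
        started = STARTED.search(line)
        if started:
            current = started.group(1)
            continue
        timing = TIMING.search(line)
        if timing and current is not None:
            builds[current][timing.group(1)] = parse_duration(timing.group(2))
    return dict(builds)


def unaccounted(phases):
    """Seconds of 'Running' that the other listed phases do not cover."""
    other = sum(secs for name, secs in phases.items() if name != "Running")
    return phases.get("Running", 0) - other


if __name__ == "__main__":
    for build, phases in summarize(SAMPLE_LOG.splitlines()).items():
        print(build, phases, "unaccounted seconds:", unaccounted(phases))
```

For the WorkItemHisCustomAttr build in the log above, inserting, looking up, and fetching add up to 4,977 seconds of the 11,116 seconds of running time, so about 102 minutes are not attributed to any of the listed phases. To process a full log file, pass the open file object to `summarize` instead of the sample.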

Accepted answer


Francesco Chiossi (5.7k11119) | answered Feb 03 '16, 4:16 a.m.
Hello Simon,

there is no way to make this kind of customization with the default data collection jobs in CLM.
It is only possible with Rational Insight, where you can use Cognos Data Manager to customize the ETL to your needs.

However, since you are using an old CLM version, the easiest way to improve the performance of your ETL would be to upgrade to a recent version. Starting with CLM 5.x, the Data Collection Component (DCC) is available to handle data collection, and it offers significant performance improvements over the standard CLM ETL.

See some references:

Collaborative Lifecycle Management performance report: Data Collection Component Performance (DCC) 5.0.2 release
https://jazz.net/wiki/bin/view/Deployment/CLMDCCPerformanceReport502

Performance summary and guidance for the Data Collection Component in Rational Reporting for Development Intelligence
https://jazz.net/library/article/1433

Best Regards,

Francesco Chiossi
simon park selected this answer as the correct answer

Comments
simon park commented Feb 03 '16, 5:01 a.m.

Thanks, Francesco.

I will review your guide.
Could you help me upgrade to a recent version?
If possible, please let me know how to upgrade from 4.x to 5.x.


Francesco Chiossi commented Feb 03 '16, 5:28 a.m.

Hello Simon,

this is the interactive upgrade guide for version 5.0.2 (last release of version 5):

Upgrading to version 5.0.2
http://www-01.ibm.com/support/knowledgecenter/SSYMRC_5.0.2/com.ibm.jazz.install.doc/topics/roadmap_clm_upgrade.html

You can also look at the following section of the deployment wiki for more general information about the CLM upgrade:

Installing and upgrading
https://jazz.net/wiki/bin/view/Deployment/DeploymentInstallingUpgradingAndMigrating

Note: If you want to migrate to a version 6.x release, you first need to upgrade to the latest release and ifix level of version 5, as it's not possible to migrate directly from version 4.x to 6.x.

Best Regards,

Francesco Chiossi

2 other answers



Charlie Seo (22127) | answered Feb 02 '16, 10:09 p.m.
Hi Simon,

It seems the ETL runs without errors but takes a long time to complete. I would guess it is common for ETL jobs to slow down as the data grows. You could check on the DB side whether there is any room to tune performance. Otherwise, the following two articles might help, particularly the second one, which covers removing historical data from RICALM.REQUEST_BASELINE. However, you may want to check your DB first to confirm that this is a real contributor to the delay before proceeding.

https://jazz.net/wiki/bin/view/Deployment/LongRunningETLNoError
https://jazz.net/wiki/bin/view/Deployment/WhyDoMyETLsTakeSoLongToRun


simon park (1312) | answered Feb 02 '16, 11:59 p.m.
Hi, Charlie

Thanks for your answer. I already knew the articles you referenced.

Our DBA checked our database (DB2 v9.5) last year:
 1) Deleted data from RICALM.REQUEST_BASELINE.
 2) Increased LOGBUFSZ from 1024 to 2048.
 3) Made sure reorg and runstats were run.

But the running time keeps increasing, because this is history data.
As long as we use RTC, the data grows every day,
and the ETL takes longer and longer because the history data is fetched in full each time.
So I want to change the configuration so that the specific ETL process (Build WorkItemHisCustomAttr) does not start (see the log above).
Can I change it?

We need a way to keep the running time from increasing even as the history data grows.
Is that possible?
