Problem with ETL job scheduling


Tayane Fernandes (251216) | asked Apr 03 '14, 2:09 p.m.
edited Apr 03 '14, 2:10 p.m.
The “Data Warehouse Snapshot Time” property is set to 12.
But the ETL does not run at 12:00; the next run is scheduled for 10:20.
The timezone on server is correct.

What could be wrong? 

Thanks! 

Accepted answer


Francesco Chiossi (5.7k11119) | answered Sep 26 '14, 4:39 a.m.
Hello Tayane,

if you need to run the ETL multiple times a day, you can consider the new Data Collection Component (DCC).
Note: this requires CLM 5.X.

For details see:

Data collection with Data Collection Component
http://www-01.ibm.com/support/knowledgecenter/SSYMRC_5.0.1/com.ibm.rational.dcc.doc/topics/c_ovr_process_etl_rrdi.html

Best Regards,

Francesco Chiossi
Tayane Fernandes selected this answer as the correct answer

9 other answers



Kevin Ramer (4.5k9186201) | answered Apr 03 '14, 4:58 p.m.
I have observed that the start time of some DW jobs has crept forward.  For example the (in)famous Star job on the JTS can run anywhere from 3-10 hours on one of our setups.  The overall start hour is 24, but see this:
[ inverse sorted with respect to Reports / DW Job Status ]
Status     Job           Start Time            End Time              Duration
Succeeded  Common        Apr 1, 2014 11:59 PM  Apr 2, 2014 12:00 AM  14 seconds
Succeeded  Repository    Apr 2, 2014 12:00 AM  Apr 2, 2014 12:01 AM  59 seconds
Succeeded  Requirements  Apr 2, 2014 7:03 AM   Apr 2, 2014 7:03 AM   24 seconds
Succeeded  Star          Apr 2, 2014 7:05 AM   Apr 2, 2014 5:11 PM   10 hours, 5 minutes

I don't see any real consistency beyond the first couple of jobs. For example, why does Requirements start at 7 AM? The Star start time also varies wildly and is creeping into our "primetime" use.

Francesco Chiossi (5.7k11119) | answered Apr 04 '14, 5:31 a.m.
Hello Kevin,

you see the gap between Repository and Requirements because the jobs are run in the following order:
  • Jazz Team Server/Common and Repository
  • Rational® Team Concert/Common and Repository
  • Rational Quality Manager/Common and Repository
  • Rational Team Concert/APT and SCM
  • Rational Team Concert/Work Items and Build
  • Rational Quality Manager/Work Items
  • Rational Quality Manager/Quality Management
  • Jazz Team Server/Requirements
  • Jazz Team Server/Star
See Running the data collection jobs
http://pic.dhe.ibm.com/infocenter/clmhelp/v4r0m6/topic/com.ibm.rational.reporting.admin.doc/topics/t_running_the_data_collection_jobs.html

So I expect the RTC and RQM jobs to be running during that empty interval.
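To put a number on the empty interval in Kevin's listing, here is a minimal Python sketch. The start and end times are copied from the job status rows posted above; the helper name idle_gaps is made up for illustration and is not part of any CLM tool.

```python
from datetime import datetime

# (job, start, end) copied from the JTS job status rows posted in this thread
jts_jobs = [
    ("Common", datetime(2014, 4, 1, 23, 59), datetime(2014, 4, 2, 0, 0)),
    ("Repository", datetime(2014, 4, 2, 0, 0), datetime(2014, 4, 2, 0, 1)),
    ("Requirements", datetime(2014, 4, 2, 7, 3), datetime(2014, 4, 2, 7, 3)),
    ("Star", datetime(2014, 4, 2, 7, 5), datetime(2014, 4, 2, 17, 11)),
]


def idle_gaps(jobs):
    """Return (previous job, next job, gap) for consecutive jobs in start order."""
    ordered = sorted(jobs, key=lambda job: job[1])
    return [
        (prev[0], nxt[0], nxt[1] - prev[2])
        for prev, nxt in zip(ordered, ordered[1:])
    ]


gaps = idle_gaps(jts_jobs)
for prev_name, next_name, gap in gaps:
    print(f"{prev_name} -> {next_name}: {gap}")
```

The only large gap is the one between Repository and Requirements (a little over 7 hours), which is where the jobs of the other applications would fit if they run in the order listed.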

Best Regards,

Francesco Chiossi

Comments
Kevin Ramer commented Apr 04 '14, 8:47 a.m.

Thanks for that, Francesco. So, taken to the next logical step: all the "other" DW jobs for those applications in the same JTS are run. I guess it's time to concentrate on Insight once we get our new hardware. Then that Star job can be omitted (as well as others).


Tayane Fernandes (251216) | answered Apr 04 '14, 1:31 p.m.
Hi Francesco!

But I only have Rational Team Concert, and yet my jobs look like this:

Status     Data Collection Job  Start Time        End Time          Duration
Succeeded  Work Items           04/04/2014 10:19  04/04/2014 10:21  2 minutes, 40 seconds
Succeeded  Build                04/04/2014 10:19  04/04/2014 10:19  1 second
Succeeded  SCM                  04/04/2014 10:18  04/04/2014 10:18  3 seconds
Succeeded  APT                  04/04/2014 10:18  04/04/2014 10:18  22 seconds
Succeeded  Repository           04/04/2014 10:17  04/04/2014 10:17  39 seconds
Succeeded  Common               04/04/2014 10:17  04/04/2014 10:17  6 seconds
Succeeded  Work Items           03/04/2014 22:18  03/04/2014 22:21  2 minutes, 32 seconds
Succeeded  Build                03/04/2014 22:18  03/04/2014 22:18  1 second
Succeeded  SCM                  03/04/2014 22:18  03/04/2014 22:18  2 seconds
Succeeded  APT                  03/04/2014 22:17  03/04/2014 22:18  22 seconds



Francesco Chiossi (5.7k11119) | answered Apr 07 '14, 2:16 a.m.
edited Apr 07 '14, 2:17 a.m.
Hello Tayane,

even if you only have RTC installed, the schedule of the ETL is controlled by JTS.
You should check https://server:port/jts/admin > Reports > Data Collection Jobs and see the value set for Job Scheduling > Start Time.

See:
http://pic.dhe.ibm.com/infocenter/clmhelp/v4r0m6/topic/com.ibm.rational.reporting.admin.doc/topics/t_running_the_data_collection_jobs.html#etl_start_time

Best Regards,

Francesco Chiossi


Tayane Fernandes (251216) | answered Apr 07 '14, 7:39 a.m.
edited Apr 07 '14, 7:40 a.m.
Hello Francesco,

The Start Time is already set to 12 as well.
But the JTS jobs also run at 10:20 and 22:20:
  

Status     Data Collection Job  Start Time        End Time          Duration
Succeeded  Star                 06/04/2014 22:20  06/04/2014 22:23  2 minutes, 43 seconds
Succeeded  Repository           06/04/2014 22:16  06/04/2014 22:17  9 seconds
Succeeded  Common               06/04/2014 22:16  06/04/2014 22:16  2 seconds
Succeeded  Star                 06/04/2014 10:20  06/04/2014 10:23  2 minutes, 39 seconds
Succeeded  Repository           06/04/2014 10:16  06/04/2014 10:17  8 seconds
Succeeded  Common               06/04/2014 10:16  06/04/2014 10:16  2 seconds
Succeeded  Star                 05/04/2014 22:21  05/04/2014 22:25  3 minutes, 38 seconds


 

Francesco Chiossi (5.7k11119) | answered Apr 07 '14, 8:04 a.m.
edited Apr 07 '14, 8:05 a.m.
Hello Tayane,

by default the ETL should only run every 24 hours, but in your case it seems to be running every 12 hours.

This should be regulated by the following entry in JTS advanced properties:
com.ibm.team.datawarehouse.service.internal.RemoteSnapshotManagerTask > Task delay
The default value is 86400 (24 hours in seconds).

While the starting time should be set by
com.ibm.team.datawarehouse.service.internal.MetaDataService > Data Warehouse Snapshot Time.
The default value is 24 (starting at midnight).
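As a rough illustration of how these two properties combine, here is a small Python sketch. It only does the arithmetic implied by the property descriptions above; it is not the product's actual scheduler. The function name expected_runs and the 43200-second delay used in the example are assumptions made for the illustration.

```python
from datetime import datetime, timedelta


def expected_runs(day, snapshot_hour, task_delay_seconds):
    """List the start times one would expect within a single day.

    snapshot_hour stands in for "Data Warehouse Snapshot Time" and
    task_delay_seconds for "Task delay" (both values are illustrative).
    """
    # Hour 24 is treated as midnight, as described for the default value.
    first = day + timedelta(hours=snapshot_hour % 24)
    step = timedelta(seconds=task_delay_seconds)
    # Walk backwards so that earlier runs on the same day are included.
    while first - step >= day:
        first -= step
    runs = []
    current = first
    while current < day + timedelta(days=1):
        runs.append(current)
        current += step
    return runs


day = datetime(2014, 4, 4)
default_runs = expected_runs(day, 24, 86400)      # defaults: one run at midnight
twice_daily_runs = expected_runs(day, 12, 43200)  # assumed 12-hour delay
print([run.strftime("%H:%M") for run in default_runs])
print([run.strftime("%H:%M") for run in twice_daily_runs])
```

With the defaults this gives a single run at midnight; with a snapshot time of 12 and an assumed 12-hour delay it gives runs at 00:00 and 12:00, which is the schedule one would expect to see if nothing else shifted the start time.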

Best Regards,

Francesco Chiossi

Comments
Tayane Fernandes commented Apr 07 '14, 8:10 a.m.

Hello Francesco,


But it is running every 12 hours because my superiors want it that way: they want to see updated reports sooner. I know this is not recommended by IBM, but that is what they want.
The default value is 24, but I set it to 12, because I want it to run at midnight and midday.


Francesco Chiossi (5.7k11119) | answered Apr 07 '14, 9:01 a.m.
Hello Tayane,

thanks for the clarification.

If you look at the Data Collection Job Status page and set a larger number for "Number of previous jobs to display data for", can you see when the start time changed?

Has the server been restarted since then?

Could it be that for one day the ETL took longer than 12 hours, or it has been delayed for other reasons (like a manual execution) and after that the server just continued to run it every 12 hours?

Best Regards,

Francesco Chiossi



Comments
Tayane Fernandes commented Apr 07 '14, 10:18 a.m.
since when the starting time changed? 
The time has been different since February 17th.

Has the server been restarted since then? 
No

Could it be that for one day the ETL took longer than 12 hours, or it has been delayed for other reasons (like a manual execution) and after that the server just continued to run it every 12 hours? 

No 


Francesco Chiossi commented Apr 07 '14, 11:08 a.m.

Did anything happen on your system around February 17th?

I would also be curious to see if the correct schedule is picked up after your next server restart, as all the configuration seems good.


Tayane Fernandes commented Apr 08 '14, 9:10 a.m. | edited Apr 08 '14, 9:24 a.m.

Daylight saving time (summer time) ended around that date. Could that be the cause?


Tayane Fernandes commented Apr 08 '14, 9:11 a.m. | edited Apr 08 '14, 9:26 a.m.
How often should I restart the server? We don't usually restart it very often.

Francesco Chiossi (5.7k11119) | answered Apr 08 '14, 9:44 a.m.
edited Apr 08 '14, 9:55 a.m.
Hello Tayane,

this is a very good point. If your server is using Brazilian time, the clock moved back 1 hour on Sunday, Feb 16th, correct? (As far as I have found, Brazil is the only country ending DST on the third Sunday of February.)

This is a bit less than the schedule difference that you are seeing but is consistent with the change, assuming that the server counts 12 hours between each ETL run.

16 Feb 00:00 + 12h schedule -1 for DST end = 17 Feb 11:00

There are still 45 minutes unaccounted for, but given the coincidence with the dates I think this could definitely be related to the DST end.

It's possible that, if the ETL schedule is not a full day, the server keeps counting the intervals in order to accommodate schedules that don't fit a full day.

I think that after the next server restart the ETL should get back to the schedule you configured.
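The DST reasoning above can be sketched in Python. This is only a model of a scheduler that counts a fixed 12 hours of elapsed time between runs, which is an assumption about the server's behavior. The UTC offsets (UTC-2 during summer time, UTC-3 afterwards) and the transition at 00:00 on Feb 16, 2014 are also stated as assumptions, and all names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets for Brazilian time (labels are illustrative)
BRST = timezone(timedelta(hours=-2), "BRST")  # summer time
BRT = timezone(timedelta(hours=-3), "BRT")    # standard time

# Assumed instant at which DST ended: Feb 16, 2014 00:00 summer time
DST_END = datetime(2014, 2, 16, 0, 0, tzinfo=BRST)


def local_wall_clock(instant):
    """Show an instant as local wall-clock time, switching offset at DST_END."""
    zone = BRST if instant < DST_END else BRT
    return instant.astimezone(zone)


def fixed_interval_runs(first_run, interval, count):
    """Start times for a scheduler that just adds a fixed elapsed interval."""
    return [local_wall_clock(first_run + i * interval) for i in range(count)]


# Last midday run before the change, followed by three more runs 12 hours apart
runs = fixed_interval_runs(
    datetime(2014, 2, 15, 12, 0, tzinfo=BRST), timedelta(hours=12), 4
)
for run in runs:
    print(run.strftime("%Y-%m-%d %H:%M"), run.tzname())
```

Under these assumptions the wall-clock start times move from 12:00 to 23:00 and 11:00 once DST ends, and they stay there because nothing re-aligns the counter to the configured hour. This accounts for one hour of the shift only; the remaining 45 minutes are not explained by this model.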

Best Regards,

Francesco Chiossi

Tayane Fernandes (251216) | answered Sep 25 '14, 3:05 p.m.
In a meeting with the engineering staff, they told me that when I run the jobs twice a day, I cannot set a schedule for them.
