JTS Common ETL consistently fails while all other ETL jobs run successfully
We are running Jazz 4.0.3 with /ccm, /qm, /rm, and /jts on Linux SLES 10 boxes. We are also using WebSphere and integrating with LDAP. Our ETL job named Common consistently fails, while all other ETL jobs for /jts, /ccm, /qm, and /rm run successfully. I've seen several postings in the forums, as well as technotes, and none seem applicable to us.
1. The ETL user ID is in the JazzAdmins security group and is assigned the Data Collector license
2. I have logged into the Jazz tools using the ETL user ID and password
3. The ETL user ID and password are set correctly on the JTS Common data collection job
4. Our Linux boxes have the following settings in the /etc/security/limits.conf files:
# the following two lines added for Jazz Rational Tools installation:
* hard nofile 76800
* soft nofile 76800
Note that 76800 is greater than the recommended value of 65535 (one way to verify the effective limit is sketched below).
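For reference, one way to confirm the limit the service account actually receives is to run ulimit as that ID ("wasadmin" below is only a placeholder for whichever ID runs WebSphere on your boxes):

# Show the soft and hard open-file limits as seen by the service account
# ("wasadmin" is a placeholder; substitute the actual WebSphere ID)
su - wasadmin -c 'ulimit -Sn'   # soft limit
su - wasadmin -c 'ulimit -Hn'   # hard limit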
The jts-etl.log application log from the time of the failure shows that the Common job runs for a short while before failing with a "Too many open files" error while executing the Build ReadAccess portion of the Common ETL:
2013-10-23 13:32:41,902 [WebContainer : 2 @@ 13:32 IBMRational /jts/service/com.ibm.team.reports.common.internal.service.IReportRestService/updateSnapshotData] DEBUG ervice.internal.common.CommonRemoteSnapshotService - ETL: Time Running: Less than 1ms
2013-10-23 13:32:41,902 [WebContainer : 2 @@ 13:32 IBMRational /jts/service/com.ibm.team.reports.common.internal.service.IReportRestService/updateSnapshotData] DEBUG ervice.internal.common.CommonRemoteSnapshotService - ETL: ***Finished Build IterationParent at 10/23/13 1:32 PM. The build was successful***
2013-10-23 13:32:41,902 [WebContainer : 2 @@ 13:32 IBMRational /jts/service/com.ibm.team.reports.common.internal.service.IReportRestService/updateSnapshotData] DEBUG ervice.internal.common.CommonRemoteSnapshotService - ETL: ***Started Build ReadAccess at 10/23/13 1:32 PM***
2013-10-23 13:32:43,323 [WebContainer : 2 @@ 13:32 IBMRational /jts/service/com.ibm.team.reports.common.internal.service.IReportRestService/updateSnapshotData] ERROR ervice.internal.common.CommonRemoteSnapshotService - java.net.SocketException: Too many open files
java.sql.SQLException: java.net.SocketException: Too many open files
Any ideas? Anyone encounter this issue before?
One answer
This technote resolved the issue:
http://www-01.ibm.com/support/docview.wss?uid=swg21403391
In particular, the thing that resolved it for us was this excerpt:
If Rational Team Concert is running on IBM WebSphere, the WebSphere startup script may need to be edited to include this configuration.
- Navigate to the ../etc/init.d/was startup script
- Add ulimit -n 65536 to the startup script (sketched below)
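For context, the resulting edit looked roughly like this. This is only a sketch: the actual was init script on your system will differ, and the startServer.sh path below assumes a default profile layout:

#!/bin/sh
# /etc/init.d/was (illustrative excerpt, not the shipped script)
# Raise the open-file limit before the WebSphere JVM is launched
ulimit -n 65536

case "$1" in
  start)
    # Path assumes a default AppSrv01 profile; adjust for your install
    /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/bin/startServer.sh server1
    ;;
  stop)
    /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/bin/stopServer.sh server1
    ;;
esac

The key point is that ulimit -n runs in the same shell that launches the JVM, so the raised limit is inherited by the WebSphere process.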
Comments
Kevin Ramer
Oct 25 '13, 3:29 p.m.
When 'ulimit -a' is executed by the ID running WebSphere, does it actually show 76800 for open files? On some systems I've seen a disparity between what's configured in /etc/security and the result of ulimit -a.
Are all your applications under a single WebSphere? If so, they're all scrambling for the same pool of resources.
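A quick way to see the limit actually in effect for the running JVM, rather than what limits.conf claims, is to read /proc. Note that /proc/<pid>/limits requires a 2.6.24 or later kernel, so on SLES 10 itself you may instead have to su to the service ID and run ulimit -n; the pgrep pattern below is just a guess at what your process line contains:

# Inspect the live open-file limit of the running WebSphere JVM
# (requires kernel 2.6.24+; the 'java.*WebSphere' pattern is an assumption)
PID=$(pgrep -f 'java.*WebSphere' | head -n 1)
grep 'Max open files' /proc/$PID/limits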