Frequently Asked Rational Team Concert/Build Forge Integration Questions
This article will go through frequently asked questions about the Rational Team Concert (RTC)/Build Forge (BF) integration and answer them. It will also get into how to troubleshoot the problem.
Build Forge builds get stuck in <insert build state here>!
To understand this problem it is important to understand how the Build Forge integration works. The integration has two task loops that run continuously, a build loop and a event loop. The build loop’s responsibility is to continuously check for a RTC build request with a Build Forge contributor to pick up. When such a request is found, RTC will flip the build request from the Pending state to the Running state. This will also connect to Build Forge and start the corresponding build there. The other event loop will poll Build Forge looking for RTC events that have been generated in Build Forge. When those events are found it will parse the event and change the state of the build correspondingly.
So knowing all that, how do builds become stuck in a state? If you are using RTC 3.0.1.2 or earlier, it is likely you are hitting a defect with how the connection logic was handled. There was an issue with uncaught exceptions bypassing the event logic, which caused builds to never process the appropriate events. The events only persist in Build Forge for 20 minutes and if they are not picked up by then the event is gone for good and there is nothing that will change the state of the RTC build. The build is effectively ‘stuck’ in whatever state it is in. This case could also occur if the RTC server is brought down for more than 20 minutes with active Build Forge integrated builds running. In either case, the events that would change the state of the build would be gone, leaving the RTC build in stasis forever. This assumes that the build is out of the pending state, if the builds are stuck in pending state, and the build is not started in Build Forge, then the problem is the build loop. It could be that the loop is not running due to some system setting (e.g. Migration Mode Enabled set to true), or some kind of service error with the build loop task.
To drill deeper into the Build and Sync Loop, enable the debugging in the CCM log4j properties by changing the line log4j.logger.com.ibm.rational.buildforge=WARN to log4j.logger.com.ibm.rational.buildforge=ALL. The CCM log will start filling up with Build Forge integration debug data.
This log shows the build getting picked up initially by RTC:
2012-06-01 15:23:27,569 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - request: com.ibm.team.build.internal.common.model.impl.BuildRequestImpl@524b524b (stateId: [UUID _Q4gRYKwfEeGcvc2t1C2Ecw], itemId: [UUID _Otr_oKwfEeGcvc2t1C2Ecw], origin: <unset>, immutable: true) (contextId: [UUID _T042wKseEeGb9qUE-ET0UA], modified: 2012-06-01 15:23:27.478, workingCopy: false) (mergePredecessor: null, workingCopyPredecessor: null, workingCopyMergePredecessor: null, predecessor: [UUID _Ott07KwfEeGcvc2t1C2Ecw]) (processed: true, created: 2012-06-01 15:23:12.923) 2012-06-01 15:23:27,570 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - buildDefInst: com.ibm.team.build.internal.common.model.impl.BuildDefinitionInstanceImpl@52885288 (internalId: [UUID _Ott00qwfEeGcvc2t1C2Ecw]) (buildDefinitionId: BF Project build, label: <unset>, description: , ignoreWarnings: true) 2012-06-01 15:23:27,570 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - configurationData: Build definition: BF Project build Project name: EchoTest Project uuid: e4e1bf2b0c591000d700ef692ab32ab3 Host name: localhost Secure connection: true Port: 3966 Server key: localhost:3966 User id: null All logs: false All logs not passed or skipped: true Custom BOM enabled: true Num first logs:: 5 First logs enabled: true Num last logs: 5 Last logs enabled: true 2012-06-01 15:23:27,570 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - engineConfigurationData: Engine ID: BF Project engineEngine UUID: _kXrToKseEeGb9qUE-ET0UA Has configuration: true Host name: 9.37.26.16 Secure connection: false Port: 3966 Server key: 9.37.26.16:3966 User id: rtc_user 2012-06-01 15:23:27,654 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - buildPropsMap: {personalBuild=false, buildResultUuid=default, engineUUID=default, scheduledBuild=false, test=my_value, echoValue=${test}, buildEngineHostName=default, com.ibm.team.build.internal.template.id=com.ibm.rational.connector.buildforge.ui.buildDefinitionTemplate, new_property=my_new_value, buildResultUUID=default, buildRequesterUserId=default, buildEngineId=default} 2012-06-01 15:23:27,654 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - IBuildResult: com.ibm.team.build.internal.common.model.impl.BuildResultImpl@4eee4eee (stateId: null, itemId: [UUID _OttNwKwfEeGcvc2t1C2Ecw], origin: <unset>, immutable: false) (contextId: [UUID _T042wKseEeGb9qUE-ET0UA], modified: null, workingCopy: true) (mergePredecessor: null, workingCopyPredecessor: [UUID _Q5ZpQKwfEeGcvc2t1C2Ecw], workingCopyMergePredecessor: null, predecessor: null) (buildStatus: OK, buildState: IN_PROGRESS, label: , buildTimeTaken: -1, buildStartTime: 1338578607570, summary: , ignoreWarnings: true, tags: , deleteAllowed: true, personalBuild: false)
This part of the log shows that five build events were found in Build Forge. One line reports that Couldn't find result handles for:job.failed, which means that the event did not match a RTC build result. That is normal, it just means the build happens apart from RTC, in Build Forge directly. The next few lines describe the RTC build retrieving a build start event and what build definition that event applies to.
2012-06-01 15:23:45,046 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - RTC events found: 5 2012-06-01 15:23:45,046 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Time to get rtc events:73 2012-06-01 15:23:45,119 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - RTC Event Type:job.failed:98da258d0c5a1000c0c3ef695b295b29 2012-06-01 15:23:45,119 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Couldn't find result handles for:job.failed 2012-06-01 15:23:45,119 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Time to process event [1] null: 73 msecs 2012-06-01 15:23:45,192 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - RTC Event Type:job.started:98e0757b0c5a10008747ef695bb45bb4 2012-06-01 15:23:45,192 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - resultHandles: 1 2012-06-01 15:23:45,192 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Found event:[UUID _OttNwKwfEeGcvc2t1C2Ecw] 2012-06-01 15:23:45,193 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Build definition: BF Project build Project name: EchoTest Project uuid: e4e1bf2b0c591000d700ef692ab32ab3 Host name: localhost Secure connection: true Port: 3966 Server key: localhost:3966 User id: null All logs: false All logs not passed or skipped: true Custom BOM enabled: true Num first logs:: 5 First logs enabled: true Num last logs: 5 Last logs enabled: true 2012-06-01 15:23:45,193 [ ccm: AsynchronousTaskRunner-1] DEBUG eam.internal.service.BuildForgeEventPollerRunnable - Time to process event [1] job.started: 74 msecs
Build Forge Agent or Build Forge engine in warning state!
The warning state is there primarily for the Jazz Build Engine (JBE) which works on a polling model. The JBE will start, connect to the Jazz repository, and start polling for available build requests to run. Build Forge and the Build Forge Agent do not work on a polling model. They have no idea about RTC or Jazz, but rather are just sitting waiting for a build request to be sent to them! So there is nothing about Build Forge or the Build Forge agent that will contact RTC to let it know that it is waiting for a request. So why does the warning appear? Why does it only appear sometimes? The warning is cleared within the code of the build loop processes. So if the task is running then the engine will be marked as having contacted RTC, even though in reality it has not. UPDATE: This behavior has changed in 4.0, where now the connection is verified before the warning is cleared. The first thing to check then is make sure that the services are running without errors. If something is interfering with the services then it will not mark the engine as having contacted RTC.
If having this warning is problematic for your environment, you can disable having the engine track the contact time. This option appears on the Overview tab of the engine definition, under Build Request Processing and the option is called Build engine process polls for requests. Disabling that option will turn off the warning state.
My RTC build properties are not getting into Build Forge! The RTC build property X just shows up as an invalid value!
When a project is selected in the Build Forge build definition, it will synchronize its properties into Build Forge environment. Below is an example of what that would look like.
When a build is started in RTC, the Build Loop task will pull up the Build Forge project environment and iterate through it replacing the RTC defined properties, e.g. buildResultUuid. So without the variables defined in Build Forge, RTC will not be able swap in the correct values for those RTC defined variables. This swapping in of RTC properties into Build Forge variables only affects those Build Forge variables at the project environment level. Other environments set a lower levels, e.g. step level, can still overwrite these variables again, setting them into another value that will likely be an invalid value. Make sure that the RTC specific variables are only set ONCE in the Build Forge project, at the project environment level!
If we turn on the log4j debug as detailed above we can see the Build Forge values getting swapped out for the RTC properties, like so:
2012-06-01 15:23:28,491 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - Checking build environment name=value: buildResultUUID=default, action=SET, vartype=STANDARD, mode=NORMAL, include uuid=null 2012-06-01 15:23:28,491 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - Build property value: default 2012-06-01 15:23:28,491 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - buildResultUUID value updated with: _OttNwKwfEeGcvc2t1C2Ecw 2012-06-01 15:23:28,544 [ ccm: AsynchronousTaskRunner-2] DEBUG .team.internal.service.BuildForgeBuildLoopRunnable - Updating build environment name=value: buildResultUUID=_OttNwKwfEeGcvc2t1C2Ecw, entry uuid=98e0756f0c5a10008739ef695bb45bb4
If you determine that the build environments are getting interrupted in the middle of the update with an exception stack like what is below, then the problem is with Build Forge sessions thrashing. If you have multiple RTC instances pointing to the same Build Forge instance, but use the same Build Forge userid between the RTC instances, then the Build Forge sessions will thrash. This can cause problems where sometimes RTC will attempt to update Build Forge with an invalidated session that causes the environment update to cease. This logic has been reinforced in 4.0 and 3.0.1.5, but we still recommend that for all integrations that all RTC instances have a unique Build Forge userid to prevent the thrash from occuring.
at $Proxy409.saveWithoutProcess(Unknown Source) at com.ibm.rational.buildforge.team.internal.service.BuildForgeBuildLoopRunnable.handleEnvironmentUpdates(BuildForgeBuildLoopRunnable.java:518) at com.ibm.rational.buildforge.team.internal.service.BuildForgeBuildLoopRunnable.handleEnvironmentUpdates(BuildForgeBuildLoopRunnable.java:326) at com.ibm.rational.buildforge.team.internal.service.BuildForgeBuildLoopRunnable.startBuildForgeProject(BuildForgeBuildLoopRunnable.java:233) at com.ibm.rational.buildforge.team.internal.service.BuildForgeBuildLoopRunnable.run(BuildForgeBuildLoopRunnable.java:171) at com.ibm.rational.buildforge.team.internal.service.BuildForgeBuildLoopScheduledTask.runTask(BuildForgeBuildLoopScheduledTask.java:57)
My RTC builds are staying in pending for a long time! Why are they not getting picked up.
There is a known defect in 4.0 and 3.0.1.4 and earlier, particularly with big server machines. The problem is a defect with how we assign servers in the round robin fashion. It is possible that if two or more engines happen to have the same timestamp as to when they were last used, only one engine will be entered as available.
Check the ccm.log file with the build debug turned on and see if you are frequently encountering the error below. If you verify that the Build Forge build engine is valid and Build Forge is running, then it is likely you are encountering this problem and will have to upgrade.
2012-07-27 15:11:41,176 [ ccm: AsynchronousTaskRunner-2] ERROR .team.internal.service.BuildForgeBuildLoopRunnable - There are no valid build engines to satisfy the request or Build Forge is not running.
For more information:
About the author
Spencer Murata currently works in the RTC Level 3 Team in Littleton MA. He can be contacted at murata@us.ibm.com.
Copyright © 2012 IBM Corporation