lscm command fails randomly on AIX
I have a build shop that uses AIX 5.3.0.0 on a 2-processor PowerPC machine with 3 GB of memory. I am getting intermittent failures using the lscm command on AIX. I need help debugging and resolving this issue so that I can finish an evaluation of RTC.

I have followed suggestions on the RTC forums to use the Linux client on AIX (a Sun how-to). I am connecting to RTC 3.0 servers, so I used the RTC-Client-Linux-3.0iFix1.zip file.

* I unzipped the RTC-Client-Linux-3.0iFix1.zip file under the /home/dons/RTC-Client-Linux-3.0iFix1/ directory
* mv /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse/jdk /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse/jdk.orig
* cd /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse
* unzipped an AIX java.zip
  /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse/jdk/jre/bin/java -version gives: java version "1.6.0"
* mv /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse/java /home/dons/RTC-Client-Linux-3.0iFix1/jazz/client/eclipse/jdk
* cp /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/lscm /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/lscm.orig
* changed /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/lscm as follows:
  CHANGED: SCM_DAEMON_PATH="${PRGPATH}/scm"
  TO: SCM_DAEMON_PATH="${PRGPATH}/scm.sh"

This works every time without error (other than slowness):
* /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/scm.sh login -r https://SERVERNAME.com:9443/jazz/ -u dons -P password

This command usually works the first 3 or 4 times, but then starts failing randomly about 75 percent of the time:
* /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/lscm login -r https://SERVERNAME.com:9443/jazz/ -u dons -P password

When the command fails, it always gives the following error:
java.net.ConnectException: A remote host did not respond within the timeout period.
The same RTC-Client-Linux-3.0iFix1 package, used with the included jdk and unchanged commands on a RHEL AS v3 u4 Linux box (3.20 GHz Intel Xeon, 2048 MB memory), works fine and fast. Both machines are on the same network. The failure happens the same way on all 3 AIX servers I have tried. The AIX servers are functioning fine for all other services (DB2, http, and many other programs).

I have searched with Google and found some tidbits, but none have helped.
* http://www.ibm.com/developerworks/forums/thread.jspa?threadID=271715&tstart=-1 I tried this out, setting AIXTHREAD_SCOPE to "S", but it had no effect on my results.
* Most others talk about a hard failure where nothing else works.
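To put a number on the "about 75 percent" failure rate and compare it between scm.sh and lscm, a small loop can run the same command repeatedly and count the failures. This is only a sketch: count_failures is a hypothetical helper name, and it assumes that lscm exits with a non-zero status when it prints the ConnectException, which has not been confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: run a command RUNS times and report how many runs
# ended with a non-zero exit status.
# Assumption: the failing lscm invocation returns a non-zero exit status.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs failed"
}

# Example call (not executed here), using the command from the post:
# count_failures 20 /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/lscm \
#     login -r https://SERVERNAME.com:9443/jazz/ -u dons -P password
```

Running the same loop against scm.sh and lscm, and on both the AIX and Linux machines, would give comparable counts rather than an impression.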
2 answers
It would (potentially) be helpful if we could get some additional debugging info out of the client about what exactly is going on.
Does the client produce any logs? Is there any way to change/augment what gets logged?
I am still looking for any insights on how to debug this problem. I have done a few things in the last few days, but I am no closer to a solution.

Here are a couple of questions I received outside of this post:

1) As I understand, if you run 'lscm login' subcommand continuously (I am calling out 'login' just for this discussion), the subcommand fails with the below mentioned error. Once you get this error, does it work for further runs of the login subcommand? Or does it always fail once you have encountered this error?

2) Are you running the lscm subcommands from within the same sandbox? The reason I ask is that whenever you run the lscm commands, it will launch a daemon process and register the current working directory with that daemon process. If you run again from some other directory, it will launch a new daemon process.

3) Is the daemon still alive? You could check this by running 'scm list daemon'. It will list the description and port number, and you could probably also confirm by running netstat with the port number.

We have additional logging for the daemon process, but I don't think it will help the cause because it fails at the very first step of connecting to the daemon process. You could try launching the daemon manually and then executing the subcommands: scm daemon start -vmargs -Ddaemon.log.file=/log.txt. The 'daemon start' subcommand has other options, such as --port and --connection-timeout, that could be set. The log files will be located at /,jazz-scm/scratch/{Number} directory. The config_dir is usually the user's home directory unless specifically set using the --config option.

I have gotten the debug file, though it did not work for me exactly as you had it listed. Since I am on AIX, I had to edit the scm.sh file and add -Ddaemon.log.file=/home/dons/debuglog.txt to the java line itself. This did produce the file in my home directory. It looks like the log gets 3 lines for every successful command, and nothing is logged when the command fails.
I have given all the commands invoked, and after each command I have run cat on the log file to show its contents. The contents of the debuglog.txt file are only added to on a successful command: cat /home/dons/RTC-Client-Linux-3.0iFix1/jazz/scmtools/eclipse/debuglog.txt

It was also suggested that I try using the same JRE that was shipped with the RTC AIX server. I downloaded the RTC 3.0.1 Server install image for AIX and used its JRE, but the results did not change.
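Question 3 above suggests confirming with netstat that the daemon's port is still open. A minimal sketch of that check follows. port_is_listening is a hypothetical helper name, the port number has to be read by hand from the 'scm list daemon' output (its exact format is not shown in this thread), and the filter assumes the usual 'netstat -an' layout where the local address is the fourth field, written as "*.PORT" on AIX or "ADDR:PORT" on Linux.

```shell
#!/bin/sh
# Hypothetical helper: read 'netstat -an' output on stdin and succeed if
# some socket is in LISTEN state on the given local port.
# Assumption: the local address is field 4 and the state is the last field.
# Usage: netstat -an | port_is_listening PORT
port_is_listening() {
    port=$1
    awk -v port="$port" '
        # Match the port only at the end of the local address, after "." or ":"
        $NF == "LISTEN" && $4 ~ ("[.:]" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example call (not executed here); 12345 is a placeholder port number:
# if netstat -an | port_is_listening 12345; then
#     echo "daemon port is listening"
# else
#     echo "nothing is listening on that port"
# fi
```

Running this check right after a failed lscm call would show whether the daemon's port had disappeared or was still open but not answering.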