New DCC/JRS 6.0.2 install cannot register, randomly gives 503 errors
Hello,
My client upgraded from JTS and RTC 5.0.2 to 6.0.2 ifix002. Everything went well in sandbox, so we installed JRS and DCC on another server using the Jython scripts, and that worked as usual. We did the same on prod, but when we got to JRS/DCC, they broke during the registration process in jts/setup. When they break, they show either a 400 or a 503 error. The logs only ever contain two errors, with no Java dumps, and they repeat every minute or so. The log starts with the root services error, then moves to the storage areas error after a failed registration.
CRRCD9011E Storage areas have not been created. Please run set up to complete application finalization.
or
CRRCD9004E The Root Services are not defined.
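To see which of the two messages dominates, and whether that changes after a registration attempt, a small tally of message IDs may help. This is only a sketch: it assumes the IDs appear in the log text exactly as quoted above (`CRRCD` plus four digits and `E`), and the function name `count_error_ids` is made up for illustration.

```python
import re
from collections import Counter

# Matches message IDs shaped like the two quoted above, e.g. CRRCD9011E.
# Assumption: the log writes the ID verbatim somewhere on the line.
ERROR_ID = re.compile(r"\bCRRCD\d{4}E\b")


def count_error_ids(lines):
    """Return a Counter of CRRCD message IDs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ERROR_ID.findall(line))
    return counts
```

For example, `count_error_ids(open(path, errors="replace"))` run on a copy of the DCC log before and after a registration attempt would show whether the errors shift from CRRCD9004E to CRRCD9011E; `path` here is whatever log file your install writes.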
The failed registration may seem to work and make DCC appear in the jts/setup steps, but when you step into the first one there's an authentication issue: it complains that your user is not authorized and ADMIN is disabled. After that, DCC disappears from jts/setup and won't reappear until we delete the app/OAuth keys from JTS, shut down the DCC WebSphere instance, and reinstall DCC (or just switch the teamserver.properties file back to stock and try again).
- Tried registering through the registered applications section of JTS admin as well; it registered fine but then wouldn't complete setup, same as before
- teamserver.properties on DCC is getting updated during registration and has a pending OAuth request, but since we can never get to the part where we specify and create databases and finalize the app, it never accepts it (can we do all that with repotools and bypass the setup wizard?)
- When we first install DCC and JRS we can go to the /dcc/scr or /dcc/rootservices and see that the server is running; after a failed registration attempt it tends to go into a 503 state. It then randomly flips between giving us the scr/rootservices documents (which have the correct URIs at this point and reference the jts/friends etc) or a 503.
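The flipping between a valid document and a 503 can be measured rather than eyeballed. Below is a minimal polling sketch using only the Python standard library. It assumes the URLs are reachable without authentication (as /dcc/scr and /dcc/rootservices appear to be above) and that the server certificate is trusted by Python; otherwise the probe records `None`. The helper names and any host names you pass in are placeholders, not anything from the product.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including 400 and 503) land here.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, and similar.
        return None


def summarize(statuses):
    """Count how often each status value appears in a list of probe results."""
    counts = {}
    for status in statuses:
        counts[status] = counts.get(status, 0) + 1
    return counts


def poll(urls, rounds=10, delay=5.0, probe_fn=probe):
    """Probe every URL once per round and return {url: {status: count}}."""
    results = {url: [] for url in urls}
    for i in range(rounds):
        for url in urls:
            results[url].append(probe_fn(url))
        if i < rounds - 1:
            time.sleep(delay)
    return {url: summarize(statuses) for url, statuses in results.items()}
```

One possible use is to call `poll` with the same path twice, once through the IIS front end and once directly against the application server's own port. If only the IIS URL shows 503s, that would point at the web server tier rather than at DCC itself.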
We've investigated and ensured that our WAS and IIS settings match our sandbox and match the way the functioning CCM server is set up, and it all seems good. I'm having people double-check, but I'm not sure what the issue could be here. I think the random lockouts are what prevent proper registration/setup.
Configuration:
- WebSphere 8.5.5.4, 3 app server profiles on their own physical machines managed by a single deployment manager; one app server each for jts, ccm, and rrdi (the URI was already set up from a previous failed rrdi attempt, so that's our dcc/jrs server)
- AIX 7.1 for all the above servers; ulimits are unlimited, so that shouldn't be an issue (unless it hard-checks for the exact values given in the setup, which I have never seen it do)
- IIS 7.5 on Windows Server 2008 R2 as a web server, WAS plugin forwards connections to the application servers
- DB2 10.something but we never get that far anyways
One answer
In the IIS website's application pool, change the Maximum Concurrent Connections setting to a huge number.
That will prevent DCC from collapsing the web server with 503 and 400 errors. Good luck.