Migrating many years of history from Dimensions into RTC
Hello,
I'm after some general opinions on the feasibility of migrating all the history from one change control tool to another. I have already written a script that migrates a set of work items and their interconnections from Dimensions, mapping the Dimensions fields, owners/creators and statuses into RTC, along with the final baseline of files in Dimensions, with the various metadata attached as file properties (Ralph's various blogs were a great help with this).

What I'm now being asked to do is migrate all of the history as well (all file revisions and all baselines, which could be a decade's worth or more), with the exact revisions of an item mapped to the relevant work item and the right baseline of a component created at the right time, so that the history of Dimensions can be navigated in RTC. The reason is that an auditor might come in and want to go back to certain milestones, and the Dimensions tool will cease to exist in any form in this organisation, so it cannot be used to navigate historical data.

This is a scary proposition for me. I think my utility would have to fetch files and information from one system in chronological order, cache them, wait for each operation to finish, and then deliver the changes from the filesystem into RTC in the right order, along with annotations and baselines.

Has anyone done anything this ambitious? Is this something that IBM could be asked to do for a customer? I'm not sure I want to touch this task with a barge pole: too many individual operations, too many processes, too many things that could go wrong and too much responsibility. Thanks for any pointers!
Accepted answer
For SCM: Pretty much the same, really. Ideally take one or two baselines.
However, we have projects in our labs, and we have customers, that migrated all the change history from one tool/repository to another. They used the Java API or existing importers. I am not aware of a migration from Dimensions, so you would have to do this yourself.
https://rsjazz.wordpress.com/2013/09/30/delivering-change-sets-and-baselines-to-a-stream-using-the-plain-java-client-libraries/
and
https://rsjazz.wordpress.com/2013/10/15/extracting-an-archive-into-jazz-scm-using-the-plain-java-client-libraries/
should be useful. The software described in the links basically creates some history on a stream, by sharing the initial code and then delivering the code as modified for newer versions of it.
I am not a big fan of that approach. You will create a huge amount of data in the SCM system, and it might be very hard and expensive to develop.
I would rather keep the old system and reduce accessibility. You will likely see that after half a year you can back it up and shut it down, and no one will notice.
Comments
Thanks, both are useful articles that I have read. I would like to give this customer several baselines, or preferably just the latest one, and could do this with a minor modification to my existing scripts, but the customer wants more. I can see how the item revision history and the baselines could be recreated, how it could be automated, and how a work item could be related to the right change set for the right delivery, but the sheer amount of data and number of operations scares me. How long would it take to migrate 10 years of data, and how reliable would it be? It would be nice to pick the brains of those who have done a similarly ambitious migration, and to have a look at their scripts, but I doubt this is possible.
Cheers
Richard
I can only share what I have heard; I don't know how much data was migrated or how long it took. They had to throttle their API usage down, otherwise the server could not keep up with all the changes. You want to test this on a test system anyway, to be sure it works, and that would give you a hint about the time it will take. You could actually bring it over to a test system, e.g. up to the last baseline, and only bring over the last changes from there in the final migration. Then do the same with the production system.
I would also be concerned about all the data: the amount of history could also have an impact on performance.
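The throttling and the "bulk pass first, then only the last changes" idea above can be sketched roughly as follows. This is only an illustration in Python, not code from any of the migrations mentioned: `replay_throttled`, its parameters and the `apply` callback are hypothetical names, and `apply` stands in for whatever actually delivers one change set or baseline (for example, code built on the Plain Java Client Libraries).

```python
import time
from typing import Callable, Iterable, Optional


def replay_throttled(
    operations: Iterable[tuple[int, str]],
    apply: Callable[[str], None],
    min_interval_s: float,
    resume_after: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[int]:
    """Apply (sequence_number, payload) operations in order at a capped rate.

    Operations with sequence_number <= resume_after are skipped, so a later
    run can bring over only what changed since an earlier pass.
    Returns the last sequence number applied (or resume_after if none were).
    """
    last_done = resume_after
    last_start = None
    for seq, payload in sorted(operations):
        if resume_after is not None and seq <= resume_after:
            continue  # already migrated in a previous pass
        if last_start is not None:
            wait = min_interval_s - (clock() - last_start)
            if wait > 0:
                sleep(wait)  # give the server time to keep up
        last_start = clock()
        apply(payload)  # hypothetical: deliver one change set or baseline
        last_done = seq
    return last_done


# First pass: everything up to the last baseline known at that point.
delivered = []
checkpoint = replay_throttled(
    [(1, "changeset-1"), (2, "baseline-A")], delivered.append, min_interval_s=0.0
)
# Final pass: same export plus later changes; only the new ones are applied.
checkpoint = replay_throttled(
    [(1, "changeset-1"), (2, "baseline-A"), (3, "changeset-2")],
    delivered.append,
    min_interval_s=0.0,
    resume_after=checkpoint,
)
```

Timing a pass like this on a test system, with a realistic `min_interval_s`, is also one way to get the hint about total duration mentioned above.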
Thanks for your updated answer.
I take it that this is not something we could contract out to IBM, getting one of the RTC developers to do it for us? Getting the data out is not the problem, as I can query the Dimensions database tables directly. The issues are getting the data into RTC: syncing the processes, managing the sandbox state, adding the metadata, associating the work item at the right time, other things I haven't thought of yet, and making it all robust (it needs to be very robust). I think that migrating many years of data is a bad idea and something the customer should avoid, but my opinion alone is not sufficient for the customer. It would help if other people who understand RTC gave their opinion; this would give my position more credibility.
No matter how much metadata you try to bring across, there still will be configurations and/or queries that you can create/perform in the old system that you cannot create/perform in the new system. So if they really care about all of the information in their old system, they will have to keep the old system available. And if they do that, they should just bring across the "important" baselines, and not clutter the new system with rarely used information that can be retrieved from the old system.
Thanks for that, Geoff. I am aware that I cannot take everything across and have tried to communicate that. I can easily bring across a number of baselines using my existing system. Can you guesstimate how long it would take to update my existing scripts? They are essentially a two-stage process: the first stage updates a temporary filesystem staging area with all the files, work item details and relationships; the second uses that staging area to write the files to a component, put the work items into a team area and connect the work items together. The update would be a new, more powerful script that syncs Dimensions writing to a sandbox with RTC reading and delivering from that sandbox, does this many thousands of times, and keeps track so that it creates baselines at the correct time with the correct collection of files. Note that the person doing this would be an outside contractor (not me), who would hopefully be very clever and know Java and C#, but would probably have no knowledge of Dimensions or RTC.
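To make the "keeps track so that baselines are created at the correct time" part more concrete, here is a minimal Python sketch of just the planning step, with no Dimensions or RTC calls in it. Everything in it is an assumption for illustration: the `Revision` and `Baseline` records, the `build_plan` name, the rule that consecutive revisions for the same work item form one change set, and the rule that a baseline includes revisions carrying the same timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    timestamp: int   # assumed: e.g. epoch seconds exported from the old tool
    path: str
    revision: str
    work_item: str   # ID of the work item this revision was made for


@dataclass(frozen=True)
class Baseline:
    timestamp: int
    name: str


def build_plan(revisions, baselines):
    """Return ordered steps: ("changeset", work_item, [revisions]) or ("baseline", name)."""
    # Kind 0 = revision, 1 = baseline, so a baseline sorts after revisions
    # with the same timestamp and therefore includes them (an assumption).
    events = [(r.timestamp, 0, i, r) for i, r in enumerate(revisions)]
    events += [(b.timestamp, 1, i, b) for i, b in enumerate(baselines)]
    events.sort(key=lambda e: e[:3])

    plan = []
    for _, kind, _, item in events:
        if kind == 1:
            plan.append(("baseline", item.name))
        elif plan and plan[-1][0] == "changeset" and plan[-1][1] == item.work_item:
            plan[-1][2].append(item)  # extend the open change set
        else:
            plan.append(("changeset", item.work_item, [item]))
    return plan


revisions = [
    Revision(30, "src/a.c", "3", "WI-2"),
    Revision(10, "src/a.c", "1", "WI-1"),
    Revision(20, "src/b.c", "1", "WI-1"),
    Revision(50, "src/b.c", "2", "WI-2"),
]
baselines = [Baseline(40, "REL_1")]
plan = build_plan(revisions, baselines)
for step in plan:
    print(step[0], step[1])
```

Building the whole plan up front, before touching the sandbox, means it can be reviewed and counted (number of deliveries, number of baselines) before any of the many thousands of operations are run.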
Unfortunately, I couldn't really even guesstimate how long that would take.
OK - thanks to everyone for their replies, I'll mark this as the answer and point a couple of chaps at this thread. I also have an issue guesstimating how long this task would take me, let alone some unknown from a consultancy.
One important point to consider: if you are successful you will have the sequence of the history, but NOT the dates. Everything will be 'now'. The dates are an important part of the history, and help guide future changes based on the relative impact of the change.