single vs. multiple repository strategy


David Sedlock (16122012) | asked Aug 11 '09, 6:30 a.m.
I would really appreciate some thoughts about this important topic for our possible migration from ClearCase to RTC.

1) My understanding is that moving to a new RTC version at least sometimes involves downtime, and that the larger the repository, the longer the downtime. Would someone please comment on when a migration is necessary and when it is not (every major release?). Would someone also comment on the downtime (if any) for regular backups, assuming an Oracle database?

2) We cannot tolerate long downtimes during which all development simply stops. So we have to think very carefully about the obvious first idea of putting everything into one repository. It is helpful here to compare CC VOBs with the RTC repository. VOBs are not monolithic, since they can be freely combined in a view. I have the impression RTC tends to assume that everything is in one repository.

3) We have a lot of data, consisting of chip designs, firmware, software and all sorts of documentation. We have two sorts of projects: (1) Chip projects, which tend to be relatively self-contained and generate a lot of data (e.g. GB-size binary files). These tend to have definite lifespans of 1-2 years. (2) Firmware and software projects, which do not have such large files, but have many files and last for a long time; hence they accumulate a lot of data over time.

4) Given what I know or guess so far, I tend to think we should create a separate RTC repository for each chip project and one repository for the FW/SW/Doc. Is there any real experience here? Has RTC been deployed in such an area already? One problem here is that there is some reuse between the different areas. The least we would need is the ability to reuse components read-only from other repositories (i.e. the CC UCM "add component from other PVOB"). I am also worried about work-item tracking and the various project planning and execution functions of RTC. The chip, FW, SW and Doc development come together in system projects. A defect or task may have implications for all four; the project planning involves all four. Will this really work with artifacts split into different repositories, or will we find real limitations?

5) If the best decision functionally is one repository, we would need something like an archive feature, which would allow us to "carve out" a project and its data into a separate repository. We could then execute a chip project to completion in the main repository and archive it into another repository, where it would be available (with functional limitations) if we ever needed to revisit it. This would, with luck, ensure that the main repository does not grow at an unacceptable rate. Any ideas about the feasibility of this sort of feature?

Feel free to expand the discussion if I have missed relevant considerations.

-David

3 answers



Geoffrey Clemm (30.1k33035) | answered Aug 12 '09, 10:13 p.m.
FORUM ADMINISTRATOR / FORUM MODERATOR / JAZZ DEVELOPER
Currently, RTC has only limited capabilities for combining source code
from multiple RTC repositories. You can, for example, have multiple
repository workspaces loaded into a single Eclipse workspace, but you
cannot create cross-repository snapshots, and you cannot have change
sets in one repository linked to work items in another repository.

So I'd suggest putting all the software in one repository, and filing
requests against RTC for any problems that arise (excessive backup time
would be a bug, and inability to free up space by deletion or archiving
is a highly requested feature).

Repository Team: Would you give different advice?

Cheers,
Geoff


Scott Rich (57136) | answered Aug 31 '09, 4:22 p.m.
JAZZ DEVELOPER
Great questions, David, thanks for posting them, and thanks for your patience while we got our thoughts together for a more complete response.

First, a couple of answers to your specific questions. Repository migrations are only ever required at major versions; maintenance releases are compatible with the existing data for that version. Maintenance releases may require running a "repotools -addTables" command to create new tables that enable new functionality, but that operation runs in a few minutes, independent of repository size.

Not surprisingly, the downtime to perform a migration depends on the speed of your server hardware (especially storage performance) and the size of the repo. For example, our main repository on jazz.net is somewhere around 40GB and takes most of a day to perform a complete export/import migration. There's an article here which describes how to estimate migration time: http://jazz.net/library/article/206
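As a rough illustration of the size dependence, here is a minimal sketch that scales one measured migration to another repository size. It assumes comparable hardware and a purely linear relationship, which this thread does not confirm; the linked article remains the place to look for a proper estimate. The function name and the 20-hour figure (a stand-in for "most of a day") are hypothetical.

```python
def estimate_migration_hours(repo_gb: float, ref_gb: float, ref_hours: float) -> float:
    """Scale a measured migration time linearly with repository size.

    Assumption (not from the thread): downtime grows in direct proportion
    to repository size on comparable server and storage hardware.
    """
    if repo_gb < 0 or ref_gb <= 0 or ref_hours < 0:
        raise ValueError("sizes and times must be non-negative, ref_gb positive")
    return ref_hours * (repo_gb / ref_gb)


# Reference point from the post: roughly 40 GB takes "most of a day".
# 20 hours is a hypothetical stand-in for that phrase, not a measured value.
REF_GB = 40.0
REF_HOURS = 20.0

for size_gb in (10, 40, 200):
    hours = estimate_migration_hours(size_gb, REF_GB, REF_HOURS)
    print(f"{size_gb:>4} GB -> roughly {hours:.0f} h of downtime")
```

Replacing the reference pair with a timing from a trial migration on your own hardware would give a more meaningful number than the jazz.net data point.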

We know the current migration solution isn't ideal, and are starting design discussions about how to provide a more scalable approach, which may involve an incremental solution, or an in-place migration.

Now to your real question: how many repositories do I need? In general, your instinct is right: a single repository provides the most functional approach today. You get the most capability from RTC when your SCM artifacts, work items, and plans are in a single repository. Pushing against this is the increasing cost of managing that single repository, and the difficulty of finding a maintenance window for a single worldwide repository.

Because of this tension, we are working on ways to be able to change your mind later on. Being locked into a single repository forever is a daunting idea. We're starting work for RTC 3.0 to allow you to move projects between repositories, and revisiting the ability to work across SCM repositories in a distributed manner. Having that kind of move capability would let you run a repository to house the "archive" projects you mention. In addition, we're looking at enabling better linkage across repositories, so you could operate your chip/FW/SW projects on separate servers, maybe in different locales, with slightly different qualities of service, but still have common processes and timelines.

We're currently in the process of planning our releases for next year. You should see our plans for 2010 taking shape starting now, and see us making commitments on things like migration performance, project move, and cross-repository projects.

Hope this helps.

Scott Rich
Jazz Foundation Team

David Sedlock (16122012) | answered Sep 16 '09, 3:18 a.m.

Scott Rich wrote:
Not surprisingly, the downtime to perform a migration depends on the speed of your server hardware (especially storage performance) and the size of the repo. For example, our main repository on jazz.net is somewhere around 40GB and takes most of a day to perform a complete export/import migration.

I'd appreciate more detail about this statement.

I would expect that a migration that requires major changes in the schema would require exporting and importing only the tables that have changed. Now it's understandable that the representation of things such as metadata, work items, plans, etc. - the things that implement the distinctive functionality of RTC - will undergo significant changes in a major release. But the bulk of the data is usually going to be the files, which I would expect to be stored in very simple tables that are unlikely to change even with major releases (or to change in only minor ways - e.g. new attributes). Therefore they would not need to be exported/imported, but could be migrated in place.

If this reasoning is correct, then the downtime for the migration should be unaffected by the size of the file storage in the repository. Would you please comment?
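To make the reasoning concrete, here is a minimal sketch comparing how much data each approach would have to move. The table names, sizes, and "schema changed" flags are entirely hypothetical and say nothing about how RTC actually lays out its database; the sketch only models the argument above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str             # hypothetical table name, not an actual RTC table
    size_gb: float        # hypothetical size
    schema_changed: bool  # whether the new release alters this table's schema


def full_export_import_gb(tables: list[Table]) -> float:
    """Data moved when every table is exported and re-imported."""
    return sum(t.size_gb for t in tables)


def selective_migration_gb(tables: list[Table]) -> float:
    """Data moved when only tables with a changed schema are exported/imported."""
    return sum(t.size_gb for t in tables if t.schema_changed)


# Hypothetical repository: file content dominates and its schema is unchanged.
tables = [
    Table("file_content", 35.0, schema_changed=False),
    Table("work_items", 3.0, schema_changed=True),
    Table("plans", 2.0, schema_changed=True),
]

print(f"full export/import moves {full_export_import_gb(tables):.1f} GB")
print(f"selective migration moves {selective_migration_gb(tables):.1f} GB")
```

Under these assumed numbers the selective approach touches only the small, changed tables, so its cost would not grow with the amount of file content.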
