Practicing source control archaeology with Rational Team Concert
By Jeff Hanson & Jean-Michel Lemieux
Last updated: May 6, 2010
Build basis: Rational Team Concert 2.x
A good source control system will leave a few bones for the source archaeologist to dig up and piece together. A great source control system will leave well-preserved, whole skeletons, thus removing the guesswork about what happened in the past. It will tell you the what, who, and when about your past. If you use it properly it can also provide the why. The what are the physical changes represented by file versions contained in change sets. The why are the descriptive work items and comments associated with change sets. You are responsible for providing the why. So what is the purpose of this stuff? To help answer questions as easily and quickly as possible, of course! How did my file evolve to its current state? What changes went into release 1 versus release 2? Who made that change and when? Did my bug fix for a maintenance release get migrated to the other development streams? That's source control archaeology!
The goal of this article is to shed some more light on how Team Concert keeps track of all the great work you do during your development. For those of you who have parallel development needs this article will be very helpful. We'll first review how Team Concert handles history, then discuss investigation techniques and wrap up with some insights into interpreting merge graphs.
There are three kinds of histories in Team Concert:
- File based history: the revisions of a file
- Change set history: the change sets in a component
- Baseline history: the baselines in a component
The history view shows all three kinds of histories. File histories and change set histories are very similar and look very much alike. File histories show the subset of change sets which affect that file whereas change set histories show all change sets in a component.
The example in Figure 1 shows a component, with its corresponding code, containing three baselines, each of which contains five change sets. Each change set contains a different set of file and directory changes. Building the example out further, the file "main.h" has been changed three times, in three different change sets, over the course of the component's change history. So when you look at the file history for "main.h" you would see three changes; looking at the change set history for the component you would see 15 change sets; and looking at the baseline history you would see three baselines.
Figure 1: Histories Overview
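To make the arithmetic in this example concrete, here is a minimal Python sketch of the three histories. It is a toy model of our own, not Team Concert's data model or API: the class names, function names, and the "src/fileN.c" paths are invented for illustration, and each baseline is modeled as holding exactly its own five change sets, as in Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSet:
    comment: str
    files: frozenset  # paths touched by this change set


@dataclass(frozen=True)
class Baseline:
    name: str
    change_sets: tuple  # change sets captured by this baseline, oldest first


def change_set_history(baselines):
    """Every change set in the component, oldest first."""
    return [cs for b in baselines for cs in b.change_sets]


def file_history(baselines, path):
    """Only the change sets that touched `path`."""
    return [cs for cs in change_set_history(baselines) if path in cs.files]


def build_example():
    """Three baselines of five change sets each; main.h changes once per baseline."""
    baselines, n = [], 0
    for b in range(1, 4):
        sets = []
        for i in range(5):
            n += 1
            files = {f"src/file{n}.c"}  # hypothetical file names
            if i == 0:
                files.add("main.h")
            sets.append(ChangeSet(f"change set {n}", frozenset(files)))
        baselines.append(Baseline(f"baseline {b}", tuple(sets)))
    return baselines


component = build_example()
print(len(file_history(component, "main.h")))  # 3 (file history)
print(len(change_set_history(component)))      # 15 (change set history)
print(len(component))                          # 3 (baseline history)
```

The point of the sketch is that a file history is simply a filtered view of the change set history, which is why the two look so similar in the History view.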
Change sets can exist everywhere, as suspended changes, work items, or discarded changes. This means that a particular file can have many revisions scattered around. Some are more interesting than others. Instead of showing everything in the repository by default, Team Concert shows the history in the context of something. By default, file history is shown for a file in the context of its current workspace or stream. You are able to see a file's history in the context of a workspace, a stream or the entire repository. In Figure 2 we point out the information describing the file's context in the History view including its merge graph and a rich hover dialog for a selected change set.
Figure 2: File history view in a stream
Since a change set can come from anywhere (i.e. a stream, workspace or work item), we don't record where it came from. We do record when it was "added" to a change history and who added it. This is significant in that you are now able to distinguish between who created a change set and who actually added it to the change history of a workspace or a stream.
You see in the rich hover dialog in Figure 3 that Jeff created the change set but Jean-Michel was responsible for delivering it to the stream. This is a great example of the easy collaboration that Team Concert makes possible between colleagues. Jeff created the change set in his workspace as part of a feature he's working on with Jean-Michel. Jean-Michel was able to conveniently accept Jeff's change set into his own workspace and integrate it with his part of the feature implementation. Once satisfied with the result he delivered the completed feature, composed of two change sets, to the main code stream. So you see, as changes get propagated throughout your code base you will always be able to figure out who propagated them.
Figure 3: Rich hover highlighting 'Creator' vs 'Added by'
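The difference between 'Creator' and 'Added by' can be sketched with a small Python model. This is an illustration under our own assumptions, not how Team Concert stores history: the `ChangeSet`, `HistoryEntry`, and `Flow` names are hypothetical, and the dates are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class ChangeSet:
    comment: str
    creator: str  # fixed when the change set is created


@dataclass(frozen=True)
class HistoryEntry:
    change_set: ChangeSet
    added_by: str       # who added it to this particular history
    added_on: datetime  # when it was added


@dataclass
class Flow:
    """A stream or repository workspace with its own change history."""
    name: str
    history: list = field(default_factory=list)

    def add(self, change_set, who, when):
        # There is no "came from" field: the origin is not recorded.
        self.history.append(HistoryEntry(change_set, who, when))


feature = ChangeSet("Feature part 1", creator="Jeff")

jm_workspace = Flow("Jean-Michel's workspace")
main_stream = Flow("main stream")

# Jean-Michel accepts Jeff's change set, then delivers it to the stream.
jm_workspace.add(feature, "Jean-Michel", datetime(2010, 5, 3, 10, 0))
main_stream.add(feature, "Jean-Michel", datetime(2010, 5, 4, 16, 30))

entry = main_stream.history[0]
print(entry.change_set.creator, "/", entry.added_by)  # Jeff / Jean-Michel
```

The creator travels with the change set, while "added by" and "added on" belong to each history the change set lands in, so the same change set can show different values for them in different streams and workspaces.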
By now you've said to yourself, "That colorful Team Concert merge graph must be the branch diagram or version tree for the file." Well, that's not the case and we'll explain in more detail later in the article. For now when you look at the history view with the merge graph you are not seeing where changes flowed from, only who flowed them and when.
Additionally we record the merge history between files/folders as change sets so that we can easily identify the merge points. A merge point is where conflict resolution was required during a deliver or accept operation. As a result of how Team Concert tracks change you are provided a clear picture of all the changes that went into making a file what it is and how much parallel work occurred on that file.
Looking at a file's history in the context of the entire repository (Figure 4) will show changes that have been suspended, discarded, attached to work items but not delivered, etc.
Figure 4: File history view in a repository
History View Actions
The History view itself is very dynamic, giving you a great deal of access to, and interaction with, the individual changes it shows. This makes it very convenient for performing your investigations. Rather than discuss all the actions available from the History view, we'll cover only the few investigation capabilities that pertain to this article.
Figure 5: History view actions
Choosing Open in Change Explorer on a change set will show you all the file versions and directories involved in that change set. Decorators and clarifying comments are displayed to help you understand what occurred:
- File or directory changed:
- File or directory added:
- File or directory deleted:
Figure 6: Change Explorer comments for file
From the History view or the Change Explorer view you can initiate file comparisons to either the previous version (Compare with Previous) or with the file you have loaded locally (Compare with Local File). This is standard source control compare functionality that you should all be familiar with.
Figure 7: Text Compare with previous version
By choosing the Annotate action you can see an annotated view (Figure 8) of any file in the repository. Annotations associated with a line or group of lines in the file show who added or changed the lines, when the change was made, and any comments or work items associated with the change. At the left margin of the displayed file, vertical bars of different colors indicate lines associated with the changes within the file. Move your mouse cursor over a bar to see a rich hover showing the related change set information.
Figure 8: Annotated file display
Now that we've covered some of the basic functions for digging into the past, it's time to lay out the scenario to help put this all in context.
Parallel Development Interpretation Scenario
To aid our archaeological lesson, Jeff expanded on a parallel development scenario provided to him by one of his customers. Figure 9 shows five parallel streams of development, with nodes representing change events. In most cases these change events resulted in the creation of a change set. In a couple of events where non-conflicting accepts (i.e. merges) occurred, no change set was required. To rephrase this for clarity: merges that do not require your intervention will not generate a change set because there is no conflict resolution performed. This would be considered a trivial merge.
Let's take a moment to understand the scenario shown in Figure 9. Each change event has a step number to indicate its historical sequence. Each change event is labeled with the work item name, which also doubles as the name of the snapshot created following each event. For example, the work item name FA-T4 represents Task #4 on the Feature A Development stream. Each stream has an associated workspace. Laying out the scenario this way will help you easily see where change sets came from at any point during the scenario.
The change types and situations incorporated into the change events include:
- Merge conflicts between files
- No merge conflicts between files
- Evil twin files & directories
- File add, delete, rename & move
- Directory add, delete, rename, move
"Evil Twins" is a commonly used phrase to describe a situation in which two files or directories, of the same name, are created in two different versions of the same directory. Evil Twins are often created when two people add the same file to source control at the same time.
These are the normal situations that you deal with on a regular basis. For more specific information on merging and conflict resolution, check out the article "Jazz Source Control Resolving Conflicts" (http://jazz.net/library/article/39). This article discusses some of the fun stuff you do that makes the history so interesting to dig into!
Figure 9: Parallel development scenario
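The evil twin situation described above can be sketched in a few lines of Python. The sketch assumes that every file and directory carries its own identity independent of its path, so two items created separately at the same path are different items; the identifiers below ("id-20" and so on) and the function name are invented for illustration, while the paths are the ones used in this scenario.

```python
def find_evil_twins(ours, theirs):
    """Return paths present on both sides that refer to different items.

    `ours` and `theirs` map a path to a (hypothetical) item identifier.
    Same path with the same identifier is the same item; same path with
    different identifiers is an evil twin.
    """
    shared_paths = ours.keys() & theirs.keys()
    return sorted(p for p in shared_paths if ours[p] != theirs[p])


# The "code" directory is shared; everything below it was created
# independently on each stream, so the identifiers differ.
technology_exploration = {
    "code": "id-1",
    "code/evilTwinFile.txt": "id-20",
    "code/evilTwinDir": "id-21",
    "code/evilTwinDir/inEvilTwinDir.txt": "id-22",
}
feature_a = {
    "code": "id-1",
    "code/evilTwinFile.txt": "id-30",
    "code/evilTwinDir": "id-31",
    "code/evilTwinDir/inEvilTwinDir.txt": "id-32",
}

for path in find_evil_twins(technology_exploration, feature_a):
    print(path)
```

Comparing by identity rather than by name is what separates an evil twin from an ordinary conflicting edit to one shared file.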
Now let's take a look at how Team Concert represents that completed scenario. The best way to show it is on a stream-by-stream basis shown in Figure 10 through Figure 14. In the History view the change sets are ordered from the oldest at the bottom to the latest at the top. If you take a close look at what Team Concert is showing in Figure 10 and Figure 11, you will see that work items 1.0-T7 (step 24) and FA-T3 (step 16) are not shown. The merges for these change events did not have merge conflicts so no change sets were created. Another point to make is that when merge conflicts do occur Team Concert will automatically create a change set named 'Merges' which is visible in the Comment column of the History view highlighted in Figure 10.
Figure 10: Stream - 1.0 Integration
Figure 11: Stream - Feature A Development
Figure 12: Stream - Feature B Development
Figure 13: Stream - 2.0 Integration
Figure 14: Stream - Technology Exploration
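The rule that only conflicting merges leave a 'Merges' change set behind can be illustrated with a simplified Python sketch. It is not Team Concert's merge algorithm: here a conflict is assumed to mean "the same path was changed on both sides with different content", conflict resolution is faked by concatenating both sides, and "neverConflicts.txt" is a hypothetical file name.

```python
def accept(local, incoming):
    """Accept `incoming` file changes into a workspace holding `local` changes.

    Both arguments map a file path to its new content since the common
    ancestor. Returns (merged, merge_change_set); merge_change_set is None
    when nothing conflicted, i.e. the merge was trivial.
    """
    merged = dict(local)
    conflicts = []
    for path, content in incoming.items():
        if path in local and local[path] != content:
            conflicts.append(path)
            # Stand-in for manual conflict resolution: keep both sides.
            merged[path] = local[path] + content
        else:
            merged[path] = content
    if not conflicts:
        return merged, None
    return merged, {"comment": "Merges", "files": sorted(conflicts)}


# Trivial merge: the two sides touched different files.
_, trivial = accept({"alwaysConflicts.txt": "local\n"},
                    {"neverConflicts.txt": "incoming\n"})
print(trivial)  # None

# Conflicting merge: both sides changed the same file.
_, conflicting = accept({"alwaysConflicts.txt": "local\n"},
                        {"alwaysConflicts.txt": "incoming\n"})
print(conflicting["comment"], conflicting["files"])
```

This is why steps such as 1.0-T7 and FA-T3 leave no trace in the stream histories: nothing had to be resolved, so there was nothing to record.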
Glancing at the different stream-based histories of the same file, "alwaysConflicts.txt", you can see how the change for that file on a given stream maps to the scenario in Figure 9 and is unique in the context of each stream. The 1.0 Integration stream ends up with all changes from all streams.
Working with the Change Explorer
Okay so far? Now let's dig into how Team Concert represents the change history at key change events. We ran into two evil twin situations during the merge at TE-T2 (step 14) on the Technology Exploration stream. The first was that the same file, "evilTwinFile.txt", had been created in the same directory, "code", on both the Technology Exploration and Feature A Development streams. The second situation was that the directory "evilTwinDir" containing the file "inEvilTwinDir.txt" had also been created independently on the two streams. For each case during the change event Team Concert prompted for user intervention to resolve the merge. To begin our investigation we open the Change Explorer for TE-T2 which is shown in Figure 15.
Figure 15: Change Explorer & text compare of evil twin merges
You can see both the 'evilTwinFile.txt' files, each with a different decorator, indicating that one file was removed and the other was modified (i.e. merged). Based on the user's choice Team Concert had facilitated the merge of the two different files during the change event so one of the files was not kept (i.e. left behind) as part of the merge. If needed, we could dig deeper into the origin of each of those files by showing their histories relative to this point in time (i.e. snapshot TE-T2).
The evil twin directories, "evilTwinDir", also contained an evil twin file, "inEvilTwinDir.txt". During the merge one directory was kept, its contents were merged into the new directory and the evil twin files were also merged.
From the Change Explorer you can double-click any file to compare the current version against the previous version. At the top of Figure 15 you see the comparison of the file "evilTwinFile.txt" that was removed as part of the merge. The Text Compare view shows that the current version (i.e. after) doesn't exist and displays the file contents of the previous (i.e. before) version. Comparing the current and previous versions of the "evilTwinFile.txt" file that was kept will show that the contents of the file left behind were merged into the kept file.
So far we've talked about adding, deleting and modifying the content of files and directories. We can't forget about name changes. This is a fact of life in the development community, and again not all source control solutions handle this satisfactorily. Tracking a name change is important so that you don't lose the history of the file or directory. A name change is simply a part of the change history for a file or directory. In Figure 16 you see that Team Concert displays explicit information regarding both the relocation (i.e. move) of a folder and the renaming (i.e. rename) of a folder. The same applies to files.
Figure 16: Directory move & rename tracking
Comparing Snapshots and Baselines
The investigation capabilities that we've shown you so far have been at a micro level, considering only a single file or a single change set. Many times the question you need to answer is "what got into a release?" A question like that is best answered with a list of the reasons for the change, the work items, instead of a list of files. From the History view for component baselines or the Search view for stream snapshots you can choose the comparisons you wish. The usual questions revolve around what has changed since the last significant release of the code, which might be used as guidance for the testing team or as release notes to customers. Here's an example of comparing snapshots from the 1.0_Integration stream. A snapshot is a baseline of the components contained in a stream at a particular time. Defined relative to a stream, a snapshot represents the collection of component baselines at the time it was set. From the desired stream you would choose to show snapshots, which opens the Search view to display the stream's snapshots.
Figure 17: Initiating a snapshot compare
From the first snapshot (Figure 17) to compare, select "Compare With" -> "Snapshot" which gives you the choice of where the second snapshot (Figure 18) will be picked from. You are able to compare with snapshots from other streams if needed. In this example (Figure 19) let's choose an earlier snapshot, 1.0_Int-T6, on the same stream.
Figure 18: Choosing the location of the second snapshot
Figure 19: Choosing the second snapshot
At this point the Change Explorer (Figure 20) will display all the change sets, with their related work items, that were delivered between the 1.0_Int-T6 and 1.0_Int-T11 snapshots. You can see where all the changes came from because of the naming convention used in this example. You will also notice the change sets appended with "Merges" that were automatically created by Team Concert when they were required to track merge conflict resolutions.
Figure 20: Snapshot comparison results in Change Explorer
To share this list of changes with others you can save the change log (Figure 21) to a text file.
Figure 21: Exporting list of changes from snapshot comparison
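Conceptually, the snapshot comparison above boils down to "which change sets are in the later snapshot but not in the earlier one, grouped by work item". The following Python sketch shows that idea under our own assumptions. It does not reproduce Team Concert's change log format, the function names are hypothetical, and the change sets listed for the two snapshots are invented sample data that merely follows this article's naming convention.

```python
def compare_snapshots(earlier, later):
    """Change sets present in `later` but not in `earlier`, in delivery order."""
    seen = {cs["id"] for cs in earlier}
    return [cs for cs in later if cs["id"] not in seen]


def change_log(change_sets):
    """Render change sets as plain text, grouped by work item."""
    by_item = {}
    for cs in change_sets:
        by_item.setdefault(cs["work_item"], []).append(cs["comment"])
    lines = []
    for item, comments in by_item.items():
        lines.append(item)
        lines.extend(f"  {comment}" for comment in comments)
    return "\n".join(lines)


# Invented sample data standing in for snapshots 1.0_Int-T6 and 1.0_Int-T11.
snapshot_t6 = [
    {"id": 1, "work_item": "1.0-T1", "comment": "Initial code"},
    {"id": 2, "work_item": "1.0-T2", "comment": "Add parser"},
]
snapshot_t11 = snapshot_t6 + [
    {"id": 3, "work_item": "FA-T4", "comment": "Feature A changes"},
    {"id": 4, "work_item": "FA-T4", "comment": "Merges"},
    {"id": 5, "work_item": "TE-T2", "comment": "Exploration changes"},
]

delta = compare_snapshots(snapshot_t6, snapshot_t11)
print(change_log(delta))
```

Grouping by work item keeps the focus on the why of each change, which is what a testing team or a set of release notes usually needs.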
Understanding Merge Graphs
Last but not least is the visualization of the changes related to files. Some systems provide version trees, others provide branch diagrams, others let you figure it out for yourself on a whiteboard; Team Concert provides a merge graph. The merge graph, shown in the History view's Merges column, is an attribute of the history of a file. It provides a graphical linkage of the change sets related to a file for a selected stream or repository, and represents how much parallel development has occurred on a file. The changes may have been made on multiple streams, by multiple developers collaborating on a single stream, or by direct collaboration between the repository workspaces of multiple developers; it matters not. A file may have been modified on multiple streams, but if its progression of changes was linear, including merges to successive streams, then it will have a single path in the merge graph.
Before we get too far let's review some terminology:
- Fork: The point (i.e. change set or file version) at which a file begins a parallel development path
- Merge: The point at which two different versions of a file are merged
- Dead-end: A file change that has never been flowed back to a stream
- Common Path: The linear (i.e. non-parallel) sequence of changes to a file. A merge graph will have one or more common paths.
Figure 22: Merge graph terms
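One way to internalize these terms is to treat a file's versions as a small graph in which each version points at the version or versions it was derived from. The Python sketch below classifies forks, merges, and dead-ends in such a graph. It is only a mental model with invented version names; it says nothing about how Team Concert computes or draws its merge graph.

```python
def classify(parents, head):
    """Classify versions of one file.

    `parents` maps each version to the list of versions it was derived from
    (two parents means a merge). `head` is the latest version in the context
    being viewed. Returns (forks, merges, dead_ends) as sorted lists.
    """
    children = {version: [] for version in parents}
    for version, version_parents in parents.items():
        for parent in version_parents:
            children[parent].append(version)

    forks = sorted(v for v, kids in children.items() if len(kids) > 1)
    merges = sorted(v for v, ps in parents.items() if len(ps) > 1)
    dead_ends = sorted(v for v, kids in children.items()
                       if not kids and v != head)
    return forks, merges, dead_ends


# Hypothetical history: v2 forks into two parallel changes that are merged
# in v4, plus a change (vX) that was never flowed back anywhere.
history = {
    "v1": [],
    "v2": ["v1"],
    "v3a": ["v2"],
    "v3b": ["v2"],
    "v4": ["v3a", "v3b"],
    "vX": ["v2"],
}

forks, merges, dead_ends = classify(history, head="v4")
print("forks:", forks)          # forks: ['v2']
print("merges:", merges)        # merges: ['v4']
print("dead-ends:", dead_ends)  # dead-ends: ['vX']
```

In this model a common path is any run of versions with exactly one parent and one child, such as v1 to v2. A file whose changes were all sequential yields no forks and no merges, however many streams were involved.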
The merge graph is always displayed in the context of a stream, workspace or repository. Consequently, the merge graphs for a file may differ widely when viewed from any of the streams or workspaces it is related to. By maintaining this context you are only presented with the changes relevant to that file version. The graph helps to connect the dots for the progression of change of a file. By default Team Concert focuses the merge graph in the context of a stream so you will only see the relevant changes. You have already noticed this in Figure 10 through Figure 14. You quickly get an idea of how much parallel development has occurred on a file relative to the stream of interest. Looking at the merge graph in Figure 23 for a file in the context of the entire repository paints yet another picture. This graph includes all changes to a file, including undelivered changes. One point to make as you look at the graphs is that the colors are there only to provide visual differentiation. Please do not mistake them for stream or workspace relationships.
Figure 23: History relative to entire repository
On closer inspection of Figure 23 you will notice a dead-ended change set which will only be seen in the context of the entire repository. Let's take a closer look at this in Figure 24. The dead-end is an undelivered change. In this case a change set had been created, associated with a work item (Task 1.0-T3) and then subsequently discarded. Another change set was created, associated with the same work item and then delivered. Even though the change set was discarded it still exists in Team Concert as a change that had occurred. We've all been in the situation where colleagues come and go or accidents happen with computers and work never gets completed or delivered. With Team Concert you won't lose that work as long as it was checked into a repository workspace. It will always be locatable and recoverable.
Figure 24: Dead-end fork for an undelivered change set
How about one more example before we wrap up? You noticed in Figure 13 that the merge graph has only one common path. The traditional conclusion for that type of diagram would be that the changes all occurred on a single stream, which in this case is incorrect. What the merge graph is telling us is that there was no parallel development related to this file in the context of the 2.0 Integration stream. Figure 25 provides the comparison of how the changes to this file were made in two different streams but are represented as a linear progression since they occurred in a sequential manner. This merge graph concisely tells us what we want to know about the file's overall change history relative to the latest version on the 2.0 Integration stream. We see the why conveniently listed in the History view's Comment column and we understand that there were no parallel changes that contributed to the file version we are looking at.
Figure 25: Common path shows no parallel development
People generally don't appreciate the value of change history until they've been asked to dig it up for some reason (compliance, resurfacing bugs, build failures, etc). Once you have experienced the challenge of source control archaeology, you take a more critical look at the tools you use to manage your code. Ultimately you would like to minimize the amount of time you spend digging so you can spend more time building.
As a community, our context for thinking of change is being altered by Team Concert. Traditionally we have thought of change happening in a "branch" so our vision has always been constrained to that. The need of the development community has always been to understand what has been done in the code and why, but we have been limited by the source control solutions we have employed. Traditional source control tools have forced us to rely heavily on where changes were made in order to begin any investigation. With Team Concert the natural focus is on what has changed (i.e. the change sets) and why (i.e. the work items linked to change sets) as opposed to relying on where (i.e. the streams) it came from. With Team Concert we are finally able to elevate this conversation above the artificial restrictions of a tool's particular parallel development architecture.
If you haven't already, take some time to play with Team Concert. We know you'll find that becoming a source control archaeologist won't require a PhD!
Copyright © 2011 IBM Corporation