Source/ported changesets and determining newest version of a file
We have code that takes a collection of IChangeSet objects and tries to determine which version of a part is the newest in that set. The collection is obtained by comparing one stream or workspace with another, so it is basically the outgoing list of changes, though not necessarily all of them; sometimes it is only a subset, e.g. the ones linked to a particular work item or list of work items. To date, we've always used the changeset last-modified timestamp to determine which change to a part is newest.
However, we have discovered a wrinkle: the "real" changeset may have been sourced from another one (getSourceChangeSet() returns non-null). In these cases the "real" changeset can have a newer timestamp than other changesets that technically contain newer versions of the part.
So if we alter our code to check if getOriginalChangeSet() returns non-null, and if so use that changeset instead for timestamp comparisons, we seem to get what we expect. This method is documented to return the changeset itself when it was not sourced, or the ultimate original changeset if it's sourced by a source by a source...
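To make the amended logic concrete, here is a minimal sketch of the timestamp lookup. It does not use the RTC API: ChangeSetInfo is a hypothetical stand-in for a change set, and its original field (null when the change set was not sourced) is an assumption made only for illustration.

```java
import java.util.Date;

// Hypothetical model for illustration; this is NOT the RTC IChangeSet interface.
class OriginalTimestampSketch {
    static final class ChangeSetInfo {
        final String name;
        final Date lastModified;
        final ChangeSetInfo original; // assumed: null when not sourced from another change set

        ChangeSetInfo(String name, Date lastModified, ChangeSetInfo original) {
            this.name = name;
            this.lastModified = lastModified;
            this.original = original;
        }
    }

    // Returns the timestamp used for "newest" comparisons: the ultimate
    // original's timestamp when the change set was sourced, otherwise its own.
    static Date effectiveTimestamp(ChangeSetInfo cs) {
        ChangeSetInfo basis = cs;
        // Follow the chain in case a source was itself sourced.
        while (basis.original != null) {
            basis = basis.original;
        }
        return basis.lastModified;
    }
}
```

With this model, a change set ported today from a year-old source compares as a year old, which is the behavior described above.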
A question on the safety of using this new logic:
Can the "real" changeset ever contain additional changes beyond the one it's sourced from? I've only seen cases so far where the source might have additional changes, but that case is fine (we would process the source changeset separately if it was also part of the outgoing history). It would only be if the "real" one had more changes in it, in which case we would have to figure out how to use the "real" timestamp for the additional parts (instead of the source's timestamp).
One answer
I'd like to start by pointing out one thing: when a change set was created due to 'gaps', or if it has 'source change sets' (per the API you mentioned), this is usually an indication of 'parallel streams', or 'branches' if you will. So the whole talk about which is the 'real' change set, or the 'latest' change set, doesn't really make sense, as the answer might be relative to which stream you are talking about.
Ex: I do a fix in CS1 in the 'main stream', but realize I need to backport that change set to a 'maintenance' stream. I try to accept/deliver that change set to the 'maintenance stream', but it says I cannot due to gaps (as normally expected). So I use the Gap editor and hand-craft what I would consider a 'semantically equivalent' change set, CS1Prime, that is applicable to the 'maintenance stream' context.
It is possible for CS1 and CS1Prime to contain different files; ex: CS1 could have more files, or CS1Prime could have more files. This could happen because their 'fixes' were applied to different streams, which may have had slightly different 'requirements' for applying the same semantic fix. Also, the actual files in CS1 and CS1Prime could be at different states/versions (which is normally the case, actually), and even the 'fix content' (i.e. the file diff) for a specific file may differ (i.e. it may have been necessary to do something slightly different in the maintenance stream to apply the same semantic fix as the one in the 'main stream').
Comments
Well, bear in mind that the list of changesets that we gather is specific to a single workspace or stream. For example, we get the current outgoing history (relative to another given stream, e.g. a specific outgoing flow target) for that workspace/stream, and then filter the list of changesets down to just the ones that are linked to work item(s) which the user wants to build/compile. So one work item might have an older change to a given part, while another work item might have a newer change, and the user wishes to include both work items in the build...and so we must determine the newest change to that part. To date, we have always done that by using the changeset timestamp to make that determination, since the file's timestamp and version are both completely unreliable for that purpose.
As we recently discovered, one or more of the changesets might be sourced/ported (the source appears in the Pending Changes view inside the "actual" changeset, mixed in among the work items it's linked to, but with the delta/changeset icon). In that case the timestamp of the "actual" changeset (the one contained directly in the source workspace/stream) was not correct for determining the newest version of the part. The "actual" changeset was fairly new, presumably dated from when it was ported over, but its content was in effect just the sourced changeset, which was rather old. So we picked the change from the "actual" changeset when we should have taken it from another, unsourced changeset that was newer than the first changeset's source. To make the right timestamp comparison, we had to use the Source/OriginalChangeSet.
Hence why I'm curious whether this is the best approach given the use case described in the first paragraph. The concern is determining which changeset's timestamp a given file should be associated with, and thus which one (the "actual" changeset or its linked Source) is correct.
Sorry for splitting across multiple comments; this forum has a surprisingly small maximum length per comment.
In all honesty, I'm having difficulty understanding the requirements. Hopefully someone else can chime in.
It sounds like you are using the change set mod time (which is basically the date it was closed), but note that this time is not a proper indication of the 'ordered history of change sets in a given stream'. Ex: you could resume a change set from a long time ago, and even though it would be the 'most recent' change set in history, it would have an older mod time. Also, if a change set contains merges, the user could have chosen 'resolve with mine' or 'resolve with proposed'. I'm not sure if this is what you want to do, but a logical timestamp for 'resolve with mine' would be that of the 'mine' change set involved in the merge (or of the proposed change set, if a 'resolve with proposed' was applied).
I also have doubts about considering the timestamp of a change set in a different stream (if one was a 'backport' or 'original').
You also asked: "Can the "real" changeset ever contain additional changes beyond the one it's sourced from?" The answer was "yes": both can have more or fewer, depending on how the merge was done.
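Since either change set can contain files the other lacks, one possible per-file rule is sketched below: use the original's timestamp for files that also appear in the original, and the change set's own timestamp for files it added. This is only an illustration of the idea raised in the question, not a recommended or confirmed approach. ChangeSetInfo, its files set, and its original field are hypothetical names, not RTC types.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration; this is NOT the RTC API.
class PerFileTimestampSketch {
    static final class ChangeSetInfo {
        final Date lastModified;
        final ChangeSetInfo original; // assumed: null when not sourced
        final Set<String> files;      // paths changed by this change set

        ChangeSetInfo(Date lastModified, ChangeSetInfo original, String... files) {
            this.lastModified = lastModified;
            this.original = original;
            this.files = new HashSet<>(Arrays.asList(files));
        }
    }

    // Picks the timestamp to associate with one file in one change set.
    static Date timestampFor(ChangeSetInfo cs, String file) {
        if (!cs.files.contains(file)) {
            throw new IllegalArgumentException("change set does not contain " + file);
        }
        // The file came over from the original: treat it as being as old as the original.
        if (cs.original != null && cs.original.files.contains(file)) {
            return cs.original.lastModified;
        }
        // The file exists only in this change set (or it is unsourced): use its own time.
        return cs.lastModified;
    }
}
```

Note that this rule still compares timestamps across streams and ignores the case where the content of a shared file differs between the two change sets.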
"Resolve With Mine" is actually the case that originally led us to using the changeset timestamp instead of the file timestamp to make "newest" determination, and that works fine. Because that results in an older file time/version, but a newer changeset time. So to date, the changeset timestamp has been very reliable, and was recommended by RTC SCM developers in the past. It's only this "sourced" use case (which I'd never seen/heard of before) that has led to an amendment of the logic.
Regardless, the changeset timestamp does seem to be the only way RTC offers to determine the newest file change given two random changesets in a stream. That's really the distillation of the requirement: given any two random changesets A and B from a stream/workspace's outgoing history that both contain a change for file foo.c, which version of file foo.c is newer?
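The requirement can be sketched as a "newest change set per file" selection. The code below is a minimal illustration under assumptions: ChangeSetInfo is a hypothetical stand-in rather than an RTC type, and effectiveTime is whatever timestamp the caller has decided to compare (for example, the original change set's time for sourced change sets).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration; this is NOT the RTC API.
class NewestPerFileSketch {
    static final class ChangeSetInfo {
        final String name;
        final Date effectiveTime; // timestamp chosen for comparison (an assumption)
        final List<String> files; // paths changed by this change set

        ChangeSetInfo(String name, Date effectiveTime, String... files) {
            this.name = name;
            this.effectiveTime = effectiveTime;
            this.files = Arrays.asList(files);
        }
    }

    // For each file, keeps the change set with the latest effective time.
    // On an exact tie, the change set encountered first is kept.
    static Map<String, ChangeSetInfo> newestPerFile(Collection<ChangeSetInfo> changeSets) {
        Map<String, ChangeSetInfo> newest = new HashMap<>();
        for (ChangeSetInfo cs : changeSets) {
            for (String file : cs.files) {
                ChangeSetInfo current = newest.get(file);
                if (current == null || cs.effectiveTime.after(current.effectiveTime)) {
                    newest.put(file, cs);
                }
            }
        }
        return newest;
    }
}
```

As pointed out earlier in the thread, a timestamp is not the same as the ordered history of a stream, so this selection is only as reliable as the timestamps fed into it.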
Comments
Ernest Crvich
Feb 05 '18, 3:03 p.m.
bump
Anyone on the RTC SCM dev team willing to comment?