Discussion:
[git] repo-layout question: cross-repo cherry-picking?
Stephan Herrmann
2011-11-21 17:12:10 UTC
Hi,

I'm working on migrating the Object Teams SVN to git and given that git is
meant to ease branching and merging I'm trying to solve the following:

Our repo contains a fork of org.eclipse.jdt.core, from which I periodically
fetch all changes and apply them to our fork. So far I'm doing this by manually
creating and applying patches, but in git times I figure I should just set up
the original jdt.core repo as a remote and use commands like cherry pick
for updating.
(Side issue: can I cherry pick (in EGit) many commits in one operation?)

Unfortunately, the two repos don't match in structure; the jdt.core plugin
resides in these locations:

repo: eclipse.jdt.core
path: /org.eclipse.jdt.core
vs.
repo: org.eclipse.objectteams
path: /plugins/org.eclipse.jdt.core

At least from EGit, I could not find a way to merge changes
across these two.

Is there a way to tell (E)Git that the two paths above match?


Alternatively, I've been playing with moving our fork to this location
(I did the move in git after the migration):

org.eclipse.objectteams
/org.eclipse.jdt.core

Now cherry-picking works, but EGit's history shows the moved version
as having no history. Is there a way to tell EGit to follow moves? If that's
possible, is it advisable to move a whole project so every history lookup
will require matching all those files across different paths?
To me this sounds more fragile than a move was in svn - am I missing
anything?

Next I'm thinking of deep-diving into some variant of svn2git to see if
it can do the move during migration so we won't lose history.
Does anybody have experience with re-shuffling the directory structure
during svn2git? If so, which "svn2git" should I use, given there are several
different tools of the same name.

many thanks,
Stephan
Paul Webster
2011-11-21 17:21:35 UTC
I know we re-arranged our CVS repo
git://git.eclipse.org/gitroot/platform/eclipse.platform.ui.git before we
converted them to git (using cvs2git tool in
http://cvs2svn.tigris.org/cvs2git.html). The cvs2git tool only cared about
the versioning information in CVS' ,v files, not anything in the CVSROOT
itself, so that made it easy to re-position the modules. But this might
have been easy for us to do because of the file-based nature of CVS.

PW
--
Paul Webster
Hi floor. Make me a sammich! - GIR
Stephan Herrmann
2011-11-21 17:34:00 UTC
Post by Paul Webster
I know we re-arranged our CVS repo
git://git.eclipse.org/gitroot/platform/eclipse.platform.ui.git before we
converted them to git (using cvs2git tool in
http://cvs2svn.tigris.org/cvs2git.html). The cvs2git tool only cared about
the versioning information in CVS' ,v files, not anything in the CVSROOT
itself, so that made it easy to re-position the modules. But this might
have been easy for us to do because of the file-based nature of CVS.
Yes, since svn uses a db I would need to use "svn move" for preparing,
but this would only affect HEAD, all older revisions would still appear
under their original path. To me this sounds like I'll again lose the
direct history connection when moving to git.

thanks anyway,
Stephan
Chris Aniszczyk
2011-11-21 17:52:56 UTC
On Mon, Nov 21, 2011 at 9:12 AM, Stephan Herrmann
Post by Stephan Herrmann
Now cherry-picking works, but EGit's history shows the moved version
as having no history. Is there a way to tell EGit to follow moves? If that's
possible, is it advisable to move a whole project so every history lookup
will require matching all those files across different paths?
To me this sounds more fragile than a move was in svn - am I missing
anything?
Cherry picking doesn't include anything from the original history.

If you cherry pick with the '-x' parameter you'll get a message that
includes the SHA1 where it came from. But the cherry picked commit of
course will get a new SHA1.

-x
    When recording the commit, append to the original commit message a
    note that indicates which commit this change was cherry-picked from.
    Append the note only for cherry picks without conflicts. Do not use
    this option if you are cherry-picking from your private branch
    because the information is useless to the recipient. If on the other
    hand you are cherry-picking between two publicly visible branches
    (e.g. backporting a fix to a maintenance branch for an older release
    from a development branch), adding this information can be useful.
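
For reference, a minimal command-line sketch of what -x records, run in a throwaway repository. It also touches the side question about picking many commits in one operation: command-line git accepts a revision range for cherry-pick. The branch name, file name and commit messages are made up for illustration, and nothing here says whether EGit exposes the same options.

```shell
#!/bin/sh
# Sketch: cherry-pick two commits in one operation and record their origin with -x.
# All names (demo, upstream-work, file.txt) are illustrative assumptions.
set -e
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # default branch name varies between git versions

# Two commits on a side branch, standing in for upstream work.
git checkout -q -b upstream-work
echo "fix 1" >> file.txt
git commit -q -am "first fix"
echo "fix 2" >> file.txt
git commit -q -am "second fix"
last=$(git rev-parse HEAD)

# Back on the main branch, pick both commits at once.
# -x appends "(cherry picked from commit <sha>)" to each new commit message.
git checkout -q "$main"
git cherry-pick -x "$main..upstream-work" >/dev/null

picked_msg=$(git log -1 --format=%B)
picked_sha=$(git rev-parse HEAD)
echo "$picked_msg"
```

The new commits carry the original SHA-1 only as text in the message; they are still distinct commits with their own SHA-1s, so the two histories stay separate.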

On a side note, I think it would be in your best interest to take the
JDT fork and apply your changes on top of it so you share the same
history. I hope you don't end up in a case where there's a different
history up to the first commit. This may be a bit more painful on your
end but will help you in the long run to catch up with JDT.
--
Cheers,

Chris Aniszczyk
http://aniszczyk.org
+1 512 961 6719
Stephan Herrmann
2011-11-21 20:08:44 UTC
Hi Chris,
Post by Chris Aniszczyk
On Mon, Nov 21, 2011 at 9:12 AM, Stephan Herrmann
Post by Stephan Herrmann
Now cherry-picking works, but EGit's history shows the moved version
as having no history. Is there a way to tell EGit to follow moves? If that's
possible, is it advisable to move a whole project so every history lookup
will require matching all those files across different paths?
To me this sounds more fragile than a move was in svn - am I missing
anything?
Cherry picking doesn't include anything from the original history.
Thanks for your answer, but it seems I didn't sufficiently describe my
issue: I wasn't worried about missing connection due to cherry picking,
but BEFORE I can use cherry picking I had to MOVE a project to a different
folder within the repo and by that move it lost all its history, at least as
seen via EGit.

In short:
If I don't move, I can't cherry pick.
If I move I don't see all my existing history any more.
Post by Chris Aniszczyk
If you cherry pick with the '-x' parameter you'll get a message that
includes the SHA1 where it came from. But the cherry picked commit of
course will get a new SHA1.
That's good to know anyways, thanks. (Is -x available via EGit ? :) )
Post by Chris Aniszczyk
On a side note, I think it would be in your best interest to take the
JDT fork and apply your changes on top of it so you share the same
history.
That would be lovely, but the differences are immense, it would
take weeks or months to re-apply our changes. In the same vein
attempts to rebase in the setup you suggest will probably result
again in myriads of conflicts to be resolved manually.
Each rebase would have to reapply many years of work :)
No, I think I'm fine with applying changes from jdt.core in small
chunks and otherwise keeping separate histories.

It's really just the directory structure that I need to solve at this point.

Anyone with experience of restructuring directories in SVN during
the migration?

thanks,
Stephan
Andrew Overholt
2011-11-21 21:38:44 UTC
Post by Stephan Herrmann
If I don't move, I can't cherry pick.
If I move I don't see all my existing history any more.
It's not a silver bullet and I had to do a lot of manual work afterwards
but I combined several SVN repositories using this tool:

http://search.cpan.org/~book/Git-FastExport-0.07/script/git-stitch-repo

It allowed me to keep the SVN history as Git history and also let me
change the paths.

Like I said, it wasn't perfect, I had to do a lot of manual work
afterwards, and it was probably error-prone. But it got me most of the
way there :)

HTH,

Andrew
James Blackburn
2011-11-21 22:50:09 UTC
Hi Stephan,
Post by Stephan Herrmann
Our repo contains a fork of org.eclipse.jdt.core, from which I periodically
fetch all changes and apply them to our fork. So far I'm doing this by manually
creating and applying patches, but in git times I figure I should just set up
the original jdt.core repo as a remote and use commands like cherry pick
for updating.
(Side issue: can I cherry pick (in EGit) many commits in one operation?)
Unfortunately, both repos don't match in structure, the jdt.core plugin
repo: eclipse.jdt.core
path: /org.eclipse.jdt.core
vs.
repo: org.eclipse.objectteams
path: /plugins/org.eclipse.jdt.core
The subtree merge does precisely this. If there's a foreign git repo,
whose content corresponds to a subdirectory of your main repo, then you can
mark the point where the repos were most recently in sync (using a graft,
say), and then use git merge -s subtree to merge in more recent changes.
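
A minimal sketch of such a merge in two throwaway repositories follows. Several things in it are assumptions rather than part of the suggestion above: the sync point is created by an initial import with read-tree instead of a graft, the explicit form -X subtree=plugins is used instead of -s subtree so that git does not have to guess the directory shift, and --allow-unrelated-histories is only understood by git 2.9 and later. File names and contents are made up.

```shell
#!/bin/sh
# Sketch: merge an upstream repo whose content lives at /org.eclipse.jdt.core
# into a fork that keeps it at /plugins/org.eclipse.jdt.core.
set -e
work=$(mktemp -d)

# "upstream" stands in for the eclipse.jdt.core repo.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Demo User"
git config user.email "demo@example.org"
mkdir org.eclipse.jdt.core
echo "v1" > org.eclipse.jdt.core/A.java
git add .
git commit -q -m "upstream: initial"
up_branch=$(git symbolic-ref --short HEAD)

# "fork" stands in for the org.eclipse.objectteams repo.
git init -q "$work/fork"
cd "$work/fork"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "Object Teams" > README
git add README
git commit -q -m "fork: initial"
git remote add upstream "$work/upstream"
git fetch -q upstream

# Assumed sync point: import the upstream tree under plugins/ and record
# upstream as a parent, so later merges have a common ancestor.
git merge -q -s ours --no-commit --allow-unrelated-histories "upstream/$up_branch"
git read-tree --prefix=plugins/ -u "upstream/$up_branch"
git commit -q -m "Import upstream under plugins/"

# Upstream moves on.
cd "$work/upstream"
echo "v2" > org.eclipse.jdt.core/A.java
git commit -q -am "upstream: update A"

# Merge the new upstream changes, shifted into plugins/.
cd "$work/fork"
git fetch -q upstream
git merge -q -X subtree=plugins -m "Merge upstream into plugins/" "upstream/$up_branch"
cat plugins/org.eclipse.jdt.core/A.java
```

With a fork that already contains diverged content, the import step would not apply as written; marking the last sync point (for example with a graft) would take its place.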
Post by Stephan Herrmann
Now cherry-picking works, but EGit's history shows the moved version
as having no history. Is there a way to tell EGit to follow moves? If that's
possible, is it advisable to move a whole project so every history lookup
will require matching all those files across different paths?
To me this sounds more fragile than a move was in svn - am I missing
anything?
Next I'm thinking of deep-diving into some variant of svn2git to see if
it can do the move during migration so we won't lose history.
Does anybody have experience with re-shuffling the directory structure
during svn2git? If so, which "svn2git" should I use, given there are several
different tools of the same name.
CGit can track renames and moves without problem (and git-blame tends to
track the origin of lines well).

If you want to move content around, changing history, then once the
conversion is done you can use git-filter-branch to rewrite branches and
tags to the layout of your choice.

Cheers,
James
Matthias Sohn
2011-11-21 23:52:38 UTC
Post by James Blackburn
Hi Stephan,
Post by Stephan Herrmann
Our repo contains a fork of org.eclipse.jdt.core, from which I periodically
fetch all changes and apply them to our fork. So far I'm doing this by manually
creating and applying patches, but in git times I figure I should just set up
the original jdt.core repo as a remote and use commands like cherry pick
for updating.
(Side issue: can I cherry pick (in EGit) many commits in one operation?)
Unfortunately, both repos don't match in structure, the jdt.core plugin
repo: eclipse.jdt.core
path: /org.eclipse.jdt.core
vs.
repo: org.eclipse.objectteams
path: /plugins/org.eclipse.jdt.core
The subtree merge does precisely this. If there's a foreign git repo,
whose content corresponds to a subdirectory of your main repo, then you can
mark the point where the repos were most recently in sync (using a graft,
say), and then use git merge -s subtree to merge in more recent changes.
Post by Stephan Herrmann
Now cherry-picking works, but EGit's history shows the moved version
as having no history. Is there a way to tell EGit to follow moves? If that's
possible, is it advisable to move a whole project so every history lookup
will require matching all those files across different paths?
To me this sounds more fragile than a move was in svn - am I missing
anything?
Next I'm thinking of deep-diving into some variant of svn2git to see if
it can do the move during migration so we won't lose history.
Does anybody have experience with re-shuffling the directory structure
during svn2git? If so, which "svn2git" should I use, given there are several
different tools of the same name.
CGit can track renames and moves without problem (and git-blame tends to
track the origin of lines well).
If you want to move content around, changing history, then having done the
conversion you can git-filter-branch to change branches and tags to the
layout of your choice.
There were a couple of EGit fixes recently regarding following moves, so
you may try a recent nightly build
and enable "Preferences > Team > Git > Views > History View > Follow
Renames"
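
For comparison, the closest command-line analogue is probably git log --follow, which only works for a single file. The sketch below uses a throwaway repository with made-up file contents; whether EGit's preference behaves identically is an assumption.

```shell
#!/bin/sh
# Sketch: history of a moved file with and without rename following.
set -e
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"

mkdir -p plugins/org.eclipse.jdt.core
echo "class A {}" > plugins/org.eclipse.jdt.core/A.java
git add .
git commit -q -m "add A under plugins/"

# Move the whole project directory to the repo root.
git mv plugins/org.eclipse.jdt.core org.eclipse.jdt.core
git commit -q -m "move jdt.core to the repo root"

# Plain path-limited log stops at the move; --follow continues across it.
plain=$(git log --format=%s -- org.eclipse.jdt.core/A.java | wc -l | tr -d ' ')
followed=$(git log --follow --format=%s -- org.eclipse.jdt.core/A.java | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s); with --follow: $followed commit(s)"
```

Git does not store the move itself; it detects it by comparing file contents, which is why the history is still there even when a viewer does not show it by default.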
--
Matthias
Stephan Herrmann
2011-11-24 15:52:37 UTC
Hi James,
Post by James Blackburn
Post by Stephan Herrmann
...
Unfortunately, both repos don't match in structure, the jdt.core plugin
repo: eclipse.jdt.core
path: /org.eclipse.jdt.core
vs.
repo: org.eclipse.objectteams
path: /plugins/org.eclipse.jdt.core
The subtree merge does precisely this. If there's a foreign git repo,
whose content corresponds to a subdirectory of your main repo, then you can
mark the point where the repos were most recently in sync (using a graft,
say), and then use git merge -s subtree to merge in more recent changes.
Thanks for mentioning "graft", that looks like an important piece in the puzzle!

Also subtree merge could be good, but that might still impose some
limitations:
- not available via the UI of EGit (?)
- restricted to merge (vs. cherry-pick) (?)
- needs to guess directory correspondence on each operation
So your other suggestion looks more promising to me ...
Post by James Blackburn
If you want to move content around, changing history, then having done the
conversion you can git-filter-branch to change branches and tags to the
layout of your choice.
OK, I'm currently running git filter-branch using a script with a few lines like:
test -d plugins/org.eclipse.jdt.core && mv plugins/org.eclipse.jdt.core . || echo "jdt.core not found"
When done I should hopefully have a repo where everything is in the most
convenient location. To me, a one-time repo-conversion looks safer than relying
on *every* subsequent command to apply magic to find corresponding bits.
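
A self-contained sketch of that kind of rewrite on a throwaway repository, using the filter line quoted above. The file names, contents and commit messages are made up, and FILTER_BRANCH_SQUELCH_WARNING only matters on newer git versions, which otherwise pause with a warning before running filter-branch.

```shell
#!/bin/sh
# Sketch: rewrite all history so plugins/org.eclipse.jdt.core lives at the repo root.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"

mkdir -p plugins/org.eclipse.jdt.core
echo "v1" > plugins/org.eclipse.jdt.core/A.java
echo "Object Teams" > README
git add .
git commit -q -m "initial layout"
echo "v2" > plugins/org.eclipse.jdt.core/A.java
git commit -q -am "change A"

# The tree filter runs once per commit in a temporary checkout of that commit.
git filter-branch --tree-filter \
  'test -d plugins/org.eclipse.jdt.core && mv plugins/org.eclipse.jdt.core . || echo "jdt.core not found"' \
  -- --all >/dev/null 2>&1

# Every rewritten commit now has the project at the new path.
paths=$(git ls-tree -r --name-only HEAD | tr '\n' ' ')
touching=$(git log --format=%s -- org.eclipse.jdt.core/A.java | wc -l | tr -d ' ')
echo "paths: $paths"
echo "commits touching the new path: $touching"
```

The rewrite changes every commit SHA-1, and filter-branch keeps the old refs under refs/original/ as a backup, so this is best done once before anyone clones the converted repo.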

thanks,
Stephan
Post by Matthias Sohn
There were a couple of EGit fixes recently regarding following moves, so
you may try a recent nightly build
and enable "Preferences > Team > Git > Views > History View > Follow
Renames"
I'm sure there will still be plenty of occasions where I will use this. Thanks!