Git subtree allows you to insert any repository as a sub-directory of another one. It is one of several ways Git projects can manage project dependencies. People with good memory will remember I wrote about the usage and the advantages of the command in an earlier piece on Git submodule alternatives.
The basics of Git subtree
Let’s review the basics so that you can decide if git subtree is useful for you. Imagine you want to add an external project to your own repository, but you do not want to add much overhead to your daily process or that of your peers. The subtree command works well in this case.
For example, to inject a vim extension into a repository that stores your vim setup, you could do:
git subtree add --prefix .vim/bundle/fireplace https://github.com/tpope/vim-fireplace.git master --squash
This command will squash the entire history of the vim-fireplace project into your folder .vim/bundle/fireplace, recording the SHA-1 of master at the time for future reference. The result of a squashed “git subtree add” is two commits:
commit 8d6089b3faea64e1e31f8d7eb5e1bc82e3876e07
Merge: 96fa982 ce87dab
Author: Bob Marley <bob@mahrleey.com>
Date: Tue May 12 13:37:03 2015 +0200
Merge commit 'ce87dab198fecdff6043d88a26c55d7cd95e8bf9' as '.vim/bundle/fireplace'
commit ce87dab198fecdff6043d88a26c55d7cd95e8bf9
Author: Bob Marley <bob@mahrleey.com>
Date: Tue May 12 13:37:03 2015 +0200
Squashed '.vim/bundle/fireplace/' content from commit b999b09
git-subtree-dir: .vim/bundle/fireplace
git-subtree-split: b999b09cd9d69f359fa5668e81b09dcfde455cca
If after a while you want to update that sub-folder to the latest version of the child repository, you can issue a “subtree pull” with the same parameters:
git subtree pull --prefix .vim/bundle/fireplace https://github.com/tpope/vim-fireplace.git master --squash
That’s it for the basic usage. If you want to be more careful and structured, you can add or pull only tagged revisions (e.g. v1.0) of your child project. This prevents you from importing code from a master that might not be stable yet.
Note: git-subtree stores sub-project commit ids, not refs, in the meta-data. That is rarely an issue: given a commit id (SHA-1), you can find the symbolic names associated with it with a command like ls-remote, as long as a branch or tag on the remote still points at that commit:
git ls-remote https://github.com/tpope/vim-fireplace.git | grep <sha-1>
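If you would rather not eyeball the grep output, here is a minimal sketch of a helper that prints only the ref names. The function name refs_for_sha is my own invention for illustration and is not part of Git. It assumes the usual ls-remote output of one "<sha><TAB><ref>" pair per line and accepts an abbreviated commit id.

```shell
#!/bin/sh
# refs_for_sha (hypothetical helper, not a Git command):
# reads `git ls-remote` output ("<sha><TAB><ref>" per line) on stdin and
# prints every ref name whose commit id starts with the given prefix.
refs_for_sha() {
    awk -v sha="$1" 'index($1, sha) == 1 { print $2 }'
}

# Usage (needs network access):
#   git ls-remote https://github.com/tpope/vim-fireplace.git | refs_for_sha b999b09
```

Keep in mind that this only finds refs that point at the commit right now. If master has moved on since you added the subtree, nothing is printed for that branch. For annotated tags, the commit id shows up on the line ending in ^{}.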
Git subtree aliases
If you use subtree commands often, you can shorten and streamline them with a couple of simple aliases in your $HOME/.gitconfig:
[alias]
    # the acronym stands for "subtree add"
    sba = "!f() { git subtree add --prefix $2 $1 master --squash; }; f"
    # the acronym stands for "subtree update"
    sbu = "!f() { git subtree pull --prefix $2 $1 master --squash; }; f"
The aliases I use flip the original order of parameters because I like to think of adding a subtree a little bit like an scp command (scp <remote src> <dest>). You use them like this:
git sba <repository uri> <destination folder>
git sba https://bitbucket.org/vim-plugins-mirror/vim-surround.git .vim/bundle/tpope-vim-surround
Under the hood of git subtree
I recently had a look at the implementation of git-subtree and boy is it clever! The first insight – deep I know – is that Git subtree is implemented as a shell script and it’s nicely readable.
The core technique of the command is the following: git-subtree stores extra meta-data about the code it is importing directly in the commits. For squashed pulls, for example, it stores these two values in the commit message before the merge:
git-subtree-dir: .vim/bundle/scrooloose-nerdcommenter
git-subtree-split: 0b3d928dce8262dedfc2f83b9aeb59a94e4f0ae4
The “git-subtree-split” field records the commit id (SHA-1) of the subproject that has been injected at folder “git-subtree-dir”. Simple enough! Using this information, the subsequent git subtree pull can retrieve the previous integration point as the base for the next squash/merge.
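To make the idea concrete, here is a rough sketch of how you could look up that previous integration point yourself with stock tools. This is not the code from git-subtree.sh. The helper name last_split is hypothetical, and the sketch assumes that each squash commit lists git-subtree-dir before git-subtree-split (as in the messages above) and that the prefix contains no spaces.

```shell
#!/bin/sh
# last_split (hypothetical helper, not part of git-subtree):
# reads commit messages on stdin, newest first, and prints the
# git-subtree-split id of the first commit whose git-subtree-dir
# equals the prefix given as the first argument.
last_split() {
    awk -v dir="$1" '
        $1 == "git-subtree-dir:"                     { current = $2 }
        $1 == "git-subtree-split:" && current == dir { print $2; exit }
    '
}

# Usage inside a repository that contains subtrees:
#   git log --grep='^git-subtree-dir:' --format='%B' | last_split .vim/bundle/fireplace
```

Because git log prints the newest commits first, the first match is the most recent integration point for that folder.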
Rebase after a git subtree
How do you rebase a repository with sub-trees mixed in? From what I could derive from this Stack Overflow discussion, there is no silver bullet.
A workable process seems to be a manual git rebase --interactive in which you remove the add commits, followed by git rebase --continue, and then re-executing the git subtree add command once the rebase is done.
Hacking on git-subtree
One tiny thing that I found missing from the defaults of the command is that it does not store the URL of the original repository you are adding. I was reminded of this recently as I was trying to update all the vim extensions I track: I had forgotten all the source repository URLs I had previously injected using git subtree add.
Since attending Git Merge 2015 I’ve been energized to find ways to contribute to the project and so I said to myself: “instead of complaining about this, I can fix it!”.
So I’ve started tweaking the git-subtree.sh script to do something extra.
I changed git subtree add to annotate the squash commit with an extra field, git-subtree-repo. So issuing:
git-subtree.sh add --prefix .vim/bundle/fireplace https://github.com/tpope/vim-fireplace.git master --squash
Results in a commit with that extra field:
commit ce87dab198fecdff6043d88a26c55d7cd95e8bf9
Author: Bob Marley <bob@mahrleey.com>
Date: Tue May 12 13:37:03 2015 +0200
Squashed '.vim/bundle/fireplace/' content from commit b999b09
git-subtree-dir: .vim/bundle/fireplace
git-subtree-split: b999b09cd9d69f359fa5668e81b09dcfde455cca
git-subtree-repo: https://github.com/tpope/vim-fireplace.git
With this relatively small addition I can now write a new subtree command to list all the folders which have been injected from other repositories:
git subtree list
Which helpfully outputs:
.vim/bundle/fireplace https://github.com/tpope/vim-fireplace.git b999b0
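If you want something similar without my patched script, the same information can be scraped from the commit messages. The following is only an approximation under stated assumptions, not the implementation of the list command. The function name subtree_list is made up for this sketch, the git-subtree-repo field is only present in commits created with the modified git-subtree.sh described above (a "-" is printed when it is absent), and prefixes containing spaces are not handled.

```shell
#!/bin/sh
# subtree_list (hypothetical helper, not the patched "git subtree list"):
# reads commit messages on stdin, newest first, and prints one line per
# subtree folder: "<dir> <repo or -> <split id>". Only the newest entry
# for each folder is kept.
subtree_list() {
    awk '
        function flush() {
            if (dir != "" && !(dir in seen)) {
                seen[dir] = 1
                print dir, (repo == "" ? "-" : repo), split_id
            }
            dir = ""; repo = ""; split_id = ""
        }
        $1 == "git-subtree-dir:"   { flush(); dir = $2 }
        $1 == "git-subtree-split:" { split_id = $2 }
        $1 == "git-subtree-repo:"  { repo = $2 }
        END { flush() }
    '
}

# Usage inside a repository that contains subtrees:
#   git log --grep='^git-subtree-dir:' --format='%B' | subtree_list
```

A new record starts whenever a git-subtree-dir line appears, so the sketch does not care whether the repo field comes before or after the split field.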
Update 11th March 2016: As the “list” command finds commit ids for subtrees injected into the checked-out branch, the --resolve flag tries to look up the repositories at git-subtree-repo and retrieve the symbolic refs associated with the commit ids found. Example:
$ git subtree list --resolve
vim-airline https://repo/bling/vim-airline.git 4fa37e5e[...]
vim-airline https://repo/bling/vim-airline.git HEAD
vim-airline https://repo/bling/vim-airline.git refs/heads/master
The above changes and the “list” command implementation have been submitted to the Git mailing list for review and are currently sitting on my Git fork if you want to try them out.
Conclusions
As soon as I have proper and solid tests for this change I’ll submit a patch to the core Git mailing list and see if they find this addition useful. Hopefully yes! In any case I hope you enjoyed the above knowledge dump. Ping me @durdn and @atlassiandev for more Git shenanigans.