What are they?

Well, as with everything git related, there is usually a helpful man-page. So let’s check that out…

Submodules allow foreign repositories to be embedded within a dedicated subdirectory of the source tree, always pointed at a particular commit.

For those that speak man-page, feel free to skip the rest. For those that want to know what a submodule is, how they are useful, and when to use them, read on!

An example

The best way to understand the purpose of submodules is to see them in action.

Let’s start with an example using jQuery. jQuery is a widely used JavaScript library that helps with DOM interaction. It allows for very powerful strings, called selectors, to aid in pulling HTML elements out of the DOM (the in-memory representation of your HTML). jQuery defers to a library called Sizzle for this power. Sizzle is freely available to download and use in applications, and it doesn’t require jQuery. jQuery depends on Sizzle, and it’s strictly a one-way relationship. How do we express this in our version control system?

Typically we would just copy the source into our own project. This is not always desirable though, as we lose the ability to track the history of our project separately from the history of the copied source. Any commit to jQuery would flood the Sizzle history and vice versa. Since the two are only tangentially related, copying is not ideal, and any time we want to bring in upstream changes we have to do a merge (SVN and merges always send me running to the hills). In SVN you could avoid the copy with svn:externals, but with one big caveat: an external follows HEAD, the latest revision in the repository, unless you remember to pin each definition to a specific revision.

Git submodules to the rescue

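A git submodule is simply a reference to another repository at a particular snapshot in time. The reference has two parts: the parent repo records the exact commit of the submodule, and a config entry inside .gitmodules records where the submodule lives. For the jQuery example above, the entry would look something like this: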

[submodule "sizzle"]
path = sizzle/
url = git://foo.com/git/sizzle.git

Path is the location that the submodule checks out to, and URL is the location of the remote repository. Simple huh?
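
Because .gitmodules uses the ordinary git config format, you can read it with git config rather than parsing it by hand. Here is a minimal sketch, assuming a scratch directory created with mktemp and the same illustrative sizzle entry as above (the variable names are mine, not part of git):

```shell
# Sketch only: the scratch directory and the "sizzle" entry are illustrative.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Write a .gitmodules file with the same shape as the entry shown above.
cat > .gitmodules <<'EOF'
[submodule "sizzle"]
	path = sizzle/
	url = git://foo.com/git/sizzle.git
EOF

# -f points git config at a specific file, so no repository is needed here.
sub_path=$(git config -f .gitmodules --get submodule.sizzle.path)
sub_url=$(git config -f .gitmodules --get submodule.sizzle.url)
echo "path=$sub_path url=$sub_url"
```

Each submodule gets its own section, so the keys always take the form submodule.NAME.path and submodule.NAME.url.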

So how do I go about setting one of these up? Surely I don’t have to edit a config file by hand, right? You are correct, astute reader!

Working with submodules

First, to create a submodule, head to the existing git repo that you wish to add it to. Once there, adding the submodule (this is a real git repo, so feel free to follow along at home) is as simple as:

git submodule add https://bitbucket.org/jaredw/awesomelibrary awesomelibrary

To confirm this, type:

ls -la

You should see a listing similar to

drwxr-xr-x 5 jwyles staff 170 6 Dec 20:20 .
drwxr-xr-x 141 jwyles staff 4794 6 Dec 20:17 ..
drwxr-xr-x 10 jwyles staff 340 6 Dec 20:20 .git
-rw-r--r-- 1 jwyles staff 103 6 Dec 20:20 .gitmodules
drwxr-xr-x 5 jwyles staff 170 6 Dec 20:20 awesomelibrary

We now have our standard .git folder, the awesomelibrary we just added, and a new .gitmodules file which simply contains a reference to the repository we just added. git submodule add stages these changes for you, but they still need to be committed to take effect. If we take a look inside .gitmodules, we see that the reference has been created.

[submodule "awesomelibrary"]
path = awesomelibrary
url = https://bitbucket.org/jaredw/awesomelibrary
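
If you would rather try this offline first, the flow can be sketched with a throwaway local repository standing in for the remote library. Everything below is an assumption for illustration: the scratch directory, the library stand-in, its greeting.txt file and the Demo identity. The protocol.file.allow override is only there because recent git versions refuse local-path submodules by default.

```shell
# Sketch only: a local stand-in for the remote awesomelibrary repository.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical library with a single commit.
git init -q library
echo "hello wrld" > library/greeting.txt
git -C library add greeting.txt
git -C library -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m "initial library commit"

# The parent project that will embed the library.
git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$workdir/library" awesomelibrary

# Both .gitmodules and the submodule path are already staged.
staged=$(git status --short)
echo "$staged"

# The index stores the submodule as a gitlink: mode 160000 plus one commit id.
gitlink=$(git ls-files --stage awesomelibrary)
echo "$gitlink"
```

The mode 160000 entry is what ties the parent to one exact commit of the library; .gitmodules only says where to fetch it from.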

The .gitmodules file holds no commit reference. Instead, the parent repo records the exact commit the submodule had checked out when it was added, which here is the HEAD of the library’s default branch. However, the latest version of awesomelibrary has a bug. It alerts “hello wrld” instead of “hello world”, and we need to ship (yesterday). Luckily there exists a branch that gives us what we need, so let’s point our submodule to that branch.

This is where the flexibility of submodules shows: a submodule is simply a git repo inside another git repo. It is tracked separately, however, so it does not update when you update your main repo (with a git pull, for example). If we head into awesomelibrary and type:

git checkout -b version1 origin/version1

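We now track against a known good version. Once the parent repo stages and commits the new submodule pointer (git add awesomelibrary, followed by a commit), everyone else gets the same version, and we can ship on time! Hurray!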
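
To see the whole pinning step end to end, here is a rough local sketch. The library stand-in, its greeting.txt file and the Demo identity are all assumptions for illustration; only the checkout command is the one from this post.

```shell
# Sketch only: a local library whose default branch has the typo
# and whose version1 branch has the fix.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q library
cd library
echo "hello wrld" > greeting.txt
git add greeting.txt
git commit -q -m "greeting with a typo"
git checkout -q -b version1
echo "hello world" > greeting.txt
git commit -q -am "fix the greeting"
good_commit=$(git rev-parse HEAD)
git checkout -q -          # back to the default branch, which still has the typo
cd ..

git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$workdir/library" awesomelibrary
git commit -q -m "add awesomelibrary"

# The same checkout as above, run inside the submodule.
git -C awesomelibrary checkout -q -b version1 origin/version1

# The parent sees the submodule at a new commit; record that pointer.
git add awesomelibrary
git commit -q -m "pin awesomelibrary to version1"

# The third field of ls-tree output is the commit the parent now points at.
pinned=$(git ls-tree HEAD awesomelibrary | awk '{print $3}')
echo "pinned to $pinned"
```

Note that the parent stores the commit, not the branch name, so later commits on version1 will not arrive until someone updates the submodule and commits the pointer again.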

More with submodules

Cloning
A few things change when you are using submodules in your repository, specifically cloning. Clones should now include --recursive (newer git versions document the same option as --recurse-submodules) to ensure that they include all the submodules. If you do not clone in this way, you will end up with empty submodule directories.

git clone git://myawesomefeature/repo.git --recursive

Init
If you have already cloned a repo, you can initialize the submodules by running:

git submodule update --init --recursive
Submodule 'awesomelibrary' (https://bitbucket.org/jaredw/awesomelibrary) registered for path 'awesomelibrary'
Cloning into awesomelibrary...
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 8 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.
Submodule path 'awesomelibrary': checked out 'eb962bffbf2fe0238bb22c4f3933a1287cfc9a09'

This is equivalent to running the following two commands in the parent repo (with --recursive, they are repeated inside each nested submodule).

git submodule init
Submodule 'awesomelibrary' (https://bitbucket.org/jaredw/awesomelibrary) registered for path 'awesomelibrary'

git submodule update
Cloning into awesomelibrary...
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 8 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.
Submodule path 'awesomelibrary': checked out 'eb962bffbf2fe0238bb22c4f3933a1287cfc9a09'
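
The difference between the two ways of cloning is easy to check locally. The sketch below is only an illustration under assumptions: the library and parent repositories, the greeting.txt file and the Demo identity are stand-ins, and the protocol.file.allow override is needed only because the submodule URL is a local path.

```shell
# Sketch only: compare a plain clone with a recursive one.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library and a parent project that embeds it.
git init -q library
echo "hello world" > library/greeting.txt
git -C library add greeting.txt
git -C library commit -q -m "initial library commit"

git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$workdir/library" awesomelibrary
git -C parent commit -q -m "add awesomelibrary"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q parent plain
contents_before=$(ls -A plain/awesomelibrary)

# Initializing afterwards fills it in.
git -C plain -c protocol.file.allow=always submodule --quiet update --init --recursive
after_init=$(cat plain/awesomelibrary/greeting.txt)

# Cloning with --recursive does both steps in one go.
git -c protocol.file.allow=always clone -q --recursive parent full
after_clone=$(cat full/awesomelibrary/greeting.txt)

echo "plain clone, before init: '${contents_before}'"
echo "plain clone, after init:  ${after_init}"
echo "recursive clone:          ${after_clone}"
```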

The Gotcha

This all sounds too good to be true! But there is one small catch. As stated previously, the source is only tangentially related to your project, yet it lives in a directory on your file system just like any other. You will need to stay vigilant to ensure that developers do not simply edit the source, thinking the submodule is part of their project. This becomes especially important because a git push in the parent will not push the submodule’s commits. If you push your repo with references to commits that do not exist in the submodule’s remote repository, everyone else will run into an error: the parent will still clone, but the submodule cannot be checked out at the recorded commit. To avoid this, make sure that changes are pushed from the submodule before the parent project.
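
Git can guard against this ordering mistake for you: git push accepts --recurse-submodules=check, which refuses to push the parent while it references submodule commits that are not on the submodule’s remote (there is also --recurse-submodules=on-demand, which pushes the submodule first). The sketch below is a rough local demonstration. The bare repositories library.git and parent.git, the seed repository and the Demo identity are all stand-ins I made up for the example.

```shell
# Sketch only: local bare repositories play the role of the remotes.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library remote, seeded with one commit.
git init -q seed
echo "hello world" > seed/greeting.txt
git -C seed add greeting.txt
git -C seed commit -q -m "initial library commit"
git clone -q --bare seed library.git

# Hypothetical remote for the parent project.
git init -q --bare parent.git

git init -q parent
cd parent
git remote add origin "$workdir/parent.git"
git -c protocol.file.allow=always submodule --quiet add "$workdir/library.git" awesomelibrary
git commit -q -m "add awesomelibrary"

# Someone edits the submodule in place and commits the new pointer in the parent.
echo "local fix" >> awesomelibrary/greeting.txt
git -C awesomelibrary commit -q -am "local fix"
git add awesomelibrary
git commit -q -m "use the local fix"

# The submodule commit has not been pushed, so the check refuses the parent push.
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
    first_push=accepted
else
    first_push=refused
fi

# Push the submodule first, then the parent goes through.
git -C awesomelibrary push -q origin HEAD
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
    second_push=accepted
else
    second_push=refused
fi

echo "before pushing the submodule: $first_push"
echo "after pushing the submodule:  $second_push"
```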

Supported

Git submodules are an awesome way to leverage the power of git to manage your dependencies on source at a particular snapshot in time, all while maintaining its own history and avoiding the pain of merging an in-house fork. Weigh the pros and cons of git submodules before using them, as they are an advanced feature.

If you do choose to go down this path, Atlassian has you covered. Bamboo 3.4 and Sourcetree 1.3 Beta ship with git submodules support, making the management and building of projects that leverage this feature much simpler.
