Subdirectory Checkouts with git sparse-checkout

If there is one thing I miss about SVN since switching to git (and trust me, it’s the only thing), it is the ability to check out only a sub-tree of a repository. As of version 1.7, git can check out just a sub-tree as well! Not only does git now support checking out sub-directories, it does it better than Subversion!

New Repository

There is a bit of a catch-22 when doing a sub-tree checkout for a new repository. To check out only a sub-tree, you’ll need the core.sparsecheckout option set to true. Of course, you need a git repository before you can set that option. So, rather than doing a git clone, you’ll need to start with git init.

  1. Create and initialize your new repository:

    mkdir  && cd 
    git init
    git remote add -f  
  2. Enable sparse-checkout:

    git config core.sparsecheckout true
  3. Configure sparse-checkout by listing your desired sub-trees in .git/info/sparse-checkout:

    echo some/dir/ >> .git/info/sparse-checkout
    echo another/sub/tree >> .git/info/sparse-checkout
  4. Checkout from the remote:

    git pull  
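
To try the four steps above without touching a real server, here is a minimal sketch. The throwaway "upstream" repository, the remote name origin, the branch name main, and the file names are all placeholders I made up for the demo; substitute your own remote and branch.

```shell
set -e

# Hypothetical throwaway "upstream" repository standing in for a real remote.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main   # pin the branch name regardless of local defaults
mkdir -p some/dir another/sub/tree unrelated
echo a > some/dir/a.txt
echo b > another/sub/tree/b.txt
echo c > unrelated/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial import"

# Step 1: create and initialize the new repository, then add and fetch the remote.
mkdir "$work/sparse" && cd "$work/sparse"
git init -q
git symbolic-ref HEAD refs/heads/main
git remote add -f origin "$work/upstream"

# Step 2: enable sparse-checkout.
git config core.sparsecheckout true

# Step 3: list the desired sub-trees.
mkdir -p .git/info                      # normally already present after git init
echo some/dir/ >> .git/info/sparse-checkout
echo another/sub/tree >> .git/info/sparse-checkout

# Step 4: check out from the remote; only the listed sub-trees should appear.
git pull -q origin main
ls
```

The unrelated directory exists in the upstream history but should not show up in the working tree of the sparse repository.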

Existing Repository

If you already have a repository, simply enable and configure sparse-checkout as above, then run git read-tree to update your working tree.

  1. Enable sparse-checkout:

    git config core.sparsecheckout true
  2. Configure sparse-checkout by listing your desired sub-trees in .git/info/sparse-checkout:

    echo some/dir/ >> .git/info/sparse-checkout
    echo another/sub/tree >> .git/info/sparse-checkout
  3. Update your working tree:

    git read-tree -mu HEAD
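
As a rough illustration of the three steps above, this sketch builds a small repository with a full working tree and then narrows it. The directory and file names are invented for the demo and are not part of any real project.

```shell
set -e

# Hypothetical existing repository with everything checked out.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p some/dir another/sub/tree unrelated
echo a > some/dir/a.txt
echo b > another/sub/tree/b.txt
echo c > unrelated/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial import"

# Step 1: enable sparse-checkout.
git config core.sparsecheckout true

# Step 2: list the sub-trees to keep.
mkdir -p .git/info                      # normally already present after git init
echo some/dir/ >> .git/info/sparse-checkout
echo another/sub/tree >> .git/info/sparse-checkout

# Step 3: update the working tree; paths outside the list are removed from disk
# but stay in the index and in history.
git read-tree -mu HEAD
ls
```

Nothing is deleted from the repository itself here; the files under unrelated are only dropped from the working tree.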

Modifying sparse-checkout sub-trees

If you later decide to change which directories you would like checked out, simply edit the sparse-checkout file and run git read-tree again as above.

Be sure to read the documentation on read-tree/sparse-checkout. The sparse-checkout file accepts file patterns similar to .gitignore. It also accepts negations, enabling you to specify certain directories or files not to check out.
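
Here is a small sketch of both ideas: changing the selection after the fact, and using a negated pattern. The docs, src, and vendor directories are hypothetical names chosen for the demo.

```shell
set -e

# Hypothetical repository with three top-level directories.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p docs src vendor/huge
echo d > docs/readme.txt
echo s > src/main.c
echo v > vendor/huge/lib.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial import"

git config core.sparsecheckout true
mkdir -p .git/info                      # normally already present after git init

# First selection: only the docs directory.
echo 'docs/' > .git/info/sparse-checkout
git read-tree -mu HEAD
ls

# Changed selection: everything at the top level, minus vendor/.
# The second line is a negation; the last matching pattern decides.
printf '%s\n' '/*' '!/vendor/' > .git/info/sparse-checkout
git read-tree -mu HEAD
ls
```

After the second read-tree, src should be back in the working tree while vendor stays absent.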

Now there isn’t anything that svn does better than git!


8 thoughts on “Subdirectory Checkouts with git sparse-checkout”

  1. Thanks for the nice blog post. Even though it’s a bit dated now, I think you should add “/” to the sparse-checkout directories to avoid checking out nested directories as well. Might be obvious to some but not all.

  2. I, like you Jason, found this to be the largest weakness in Git. I’m glad I found your post on how to do it, as it explains it clearly in concise terms.

    I have to disagree with your estimation that Git does this better than SVN. I can do this in 1 or 2 commands in SVN (depending on what I’m trying to accomplish). I also don’t have to mess around with configuration or other files. That I have to do that in Git is a negative, and it means extra steps to remember and perform.

    Example — add a new vendor library to my SVN repo without checking out the existing huge vendor subtree:

    % svn co -N
    % cd vendor
    % git clone fluentpdo
    % svn add fluentpdo
    % svn commit -m "Add new vendor library."

    1. Yes, I would agree that SVN wins *if* you want the entire sub-tree. However, with git’s configuration files, you can define the subtree to be sparse at various levels and check it out at once. With SVN, you would need to manually recreate the desired directory structure and do a 0-depth checkout for each directory. Obviously this depends on your situation. Sub-tree checkouts are common in SVN because they are so easy. But they are usually unnecessary in git because such a directory structure would normally be broken into smaller repos. However, in the rare situations where one needs crazy multi-level sparse checkouts, git’s implementation is much more powerful (but complicated).

  3. Thanks. I can use this.
    I’m curious how other people deal with project management documentation and their source code.

    Anyone care to share what they’re doing?

    I want a version history for my docs, but i don’t need it to branch. And when i’m switching branches in my source code, i don’t want a different version of my notes!

    I imagine some people are using submodules, but that seems too much. Having two separate github repositories for the one project just doesn’t seem right. And submodules seem to be a bit more complicated.

    Sparse checkout will help with what I’ve been doing. I have two separate working directories (local repositories) that update the same github (remote) repository. I keep my docs on a separate branch (DocMaster) and edit it from one working directory, and my source code does whatever it wants in the other directory.

    It’s a nuisance to have all the source code in two different places, and there’s a chance i’ll edit in the wrong working directory. Sparse checkout will mean i only have what i need in each directory. Now I just have to remember to check both sets of changes in.

    What are other people doing for project management docs?
