Check Out Only Selected Paths from a Git Repository

In some organizations, it's common practice to put everything related to a project in a single git repository. As the project goes on, more and more files get added, and the repository can grow large. In such a case, you may want to check out only a particular path to reduce checkout time. Checking out only selected paths also makes sense when you are running a continuous integration build, since it reduces the overall build time. Even though git is very fast, small improvements can add up to be significant.

Fortunately, git provides this functionality through the concept of sparse checkout. In this blog post, we’ll learn how to use it.

First, we need to clone the git repository in question without checking out the master branch:

git clone -n {path to git repo}

The above command clones the repository without checking out HEAD.

Once the repository is cloned, we need to tell git to allow checking out only selected paths. This can be done by running the command below from inside the cloned directory:

git config core.sparsecheckout true

At this point, if you look inside the cloned directory, it appears empty, as we have not checked out any files yet.

After this, we need to store the paths we want in a file named .git/info/sparse-checkout:

echo some/sub-folder/you/want >> .git/info/sparse-checkout

Do note that the file must be an ANSI-encoded file with UNIX-style line endings for git to parse it correctly. If you are using Windows, use PowerShell and the Out-File cmdlet with the encoding set to ASCII. For example, in the case below, we are interested in checking out only the samples/features/* and samples/demos/* paths:

[Screenshot: setting sparsecheckout true and specifying files to be checked out]
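
On a UNIX-like shell, this step can be sketched roughly as follows. printf ends every line with a plain LF, which takes care of the line-ending requirement. The helper name write_sparse_patterns is our own, made up for illustration (it is not a git command), and it assumes its first argument is the root directory of the clone.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): append sparse-checkout patterns,
# one per line, to <repo-dir>/.git/info/sparse-checkout.
# Usage: write_sparse_patterns <repo-dir> <pattern>...
write_sparse_patterns() {
    repo_dir=$1
    shift
    # .git/info normally exists after a clone; mkdir -p is only a safety net.
    mkdir -p "$repo_dir/.git/info" || return 1
    # printf repeats the format for each remaining argument,
    # so every pattern ends up on its own LF-terminated line.
    printf '%s\n' "$@" >> "$repo_dir/.git/info/sparse-checkout"
}
```

From the root of the clone, this could be called as write_sparse_patterns . 'samples/features/*' 'samples/demos/*'. The single quotes keep the shell from expanding the * before the pattern reaches the file.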

Now, we just need to check out the branch we are interested in. For this, we can run the command below:

git checkout {branch-name}

Generally, this would be the master branch, so the command would be git checkout master. If you look at the contents of the directory, you will see only the files that were checked out. In our case, this would be:

[Screenshot: viewing directory structure after git checkout]
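
Putting the steps so far together, a minimal sketch could look like the one below. sparse_clone is a hypothetical helper name of our own, and the repository, branch, target directory and patterns are placeholders you would supply yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of git) that strings the steps together.
# Usage: sparse_clone <repo-url-or-path> <branch> <target-dir> <pattern>...
sparse_clone() {
    repo=$1
    branch=$2
    target=$3
    shift 3
    # 1. Clone without checking out HEAD.
    git clone -q -n "$repo" "$target" || return 1
    # 2. Enable sparse checkout for this clone only.
    git -C "$target" config core.sparseCheckout true || return 1
    # 3. Store the wanted paths, one per line, with LF line endings.
    mkdir -p "$target/.git/info" || return 1
    printf '%s\n' "$@" >> "$target/.git/info/sparse-checkout" || return 1
    # 4. Check out the branch; only matching paths reach the working tree.
    git -C "$target" checkout -q "$branch"
}
```

For the example in this post, the call would be along the lines of sparse_clone {path to git repo} master my-project 'samples/features/*' 'samples/demos/*', where my-project is just an illustrative directory name.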

Let’s say that, after some time, we realize we need to check out more files from a different path. For this, we can add more paths to the same file. For example:

[Screenshot: add a new checkout path to sparse checkout configuration]

And then check out the current working branch again:

[Screenshot: running git checkout again to checkout files]

You should now be able to see the files from the new path:

[Screenshot: viewing directory structure after adding new path]
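
Since the exact commands from the screenshots are not reproduced here, the following is only a sketch of what adding a path might look like. add_sparse_path is a hypothetical helper name of our own, and it assumes the clone already has sparse checkout enabled as described earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): add one more pattern to an
# existing sparse checkout and refresh the working tree.
# Usage: add_sparse_path <repo-dir> <branch> <pattern>
add_sparse_path() {
    repo_dir=$1
    branch=$2
    pattern=$3
    # Append, so the paths configured earlier are kept.
    printf '%s\n' "$pattern" >> "$repo_dir/.git/info/sparse-checkout" || return 1
    # Checking out the current branch again makes git re-read the patterns
    # and populate the newly matching files.
    git -C "$repo_dir" checkout -q "$branch"
}
```

For instance, add_sparse_path . master 'docs/*' would add a docs folder to the working tree, where docs is only an illustrative path and not one taken from the example repository.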
