Understanding Git Rebase and when to use it

This blog post is part of an in-depth blog series on Git and Visual Studio. You can find the previous blog post here. In this post, we’ll look at what git rebase is, how it differs from git merge, and when to use it. The git rebase command can work magic for the future development of a product by simplifying the git history, but it can be disastrous if used carelessly. Essentially, git merge and git rebase solve the same problem: both bring the contents of two branches together. However, they go about it in entirely different ways.

Concept of Git Rebase

Problem Statement

Consider that our product team has just finished a production release from the master branch. They have now started working on an entirely new feature in a dedicated branch named dev-feature01. However, someone found a bug in the production release, so one of the developers created a quickfix branch to fix the bug and merged it into the master branch. Now we need to bring the master branch and the dev-feature01 branch together.

Below is a graphical summary of what the commit tree looks like:

Problem Statement – Git branches with different commit history
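
To experiment with this situation safely, the scenario can be recreated in a throwaway repository. The following is a minimal sketch, not the product team's actual history: the commit letters, the file names and the small commit helper function are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: rebuild the problem statement in a temporary repository.
# Commit letters (A..F) and the commit() helper are hypothetical.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master

# Create one file per commit so that the branches never conflict
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B                        # production release on master

git checkout -q -b dev-feature01          # new feature work
commit E; commit F

git checkout -q master
git checkout -q -b quickfix               # bug fix for the release
commit C; commit D

git checkout -q master
git merge -q quickfix                     # fast-forwards master to the fix

# master and dev-feature01 now have different histories
git log --graph --oneline --all
```

The graph printed at the end should show the two branches forking after commit B.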

Solution 01: Git Merge

Merging is the most widely used method. To incorporate changes from the master branch into the dev-feature01 branch, we first check out dev-feature01 and then merge master (see the previous post in this series for more details):

git checkout dev-feature01
git merge master

Or, we can condense this to a one-liner by chaining the two commands:

git checkout dev-feature01 && git merge master

Once it succeeds, the resulting commit tree looks something like this:

The resultant commit tree after git merge

Merging is nice because it’s a non-destructive operation: the existing commits on both branches are not changed in any way. However, it adds an extra commit, called a merge commit, every time we need to incorporate changes from another branch into this one.
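
Both properties can be checked in a scratch repository. The sketch below assumes the same hypothetical commit letters as the problem statement (C and D stand in for the merged quickfix) and uses a small illustrative commit helper; it is not taken from a real project.

```shell
#!/bin/sh
# Sketch: merging master into dev-feature01 keeps old commits and adds one merge commit.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b dev-feature01; commit E; commit F
git checkout -q master; commit C; commit D     # the quickfix, already on master

# Remember where both branches pointed before the merge
master_before=$(git rev-parse master)
feature_before=$(git rev-parse dev-feature01)

git checkout -q dev-feature01
git merge -q -m "Merge master into dev-feature01" master

# Number of merge commits on the feature branch
git rev-list --merges --count dev-feature01

# The merge commit's parents are the two untouched branch tips
git rev-parse HEAD^1 HEAD^2
echo "$feature_before $master_before"
```

The two hashes printed by git rev-parse should match the tips recorded before the merge, which is what non-destructive means here.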

Solution 02: Git Rebase

As an alternative to merging, we can rebase the dev-feature01 branch onto the master branch using the following commands:

git checkout dev-feature01
git rebase master

This moves the entire dev-feature01 branch so that it begins at the tip of the master branch, effectively incorporating all of the new commits in master. But instead of using a merge commit, rebasing rewrites the project history by creating a brand new commit for each commit in the original branch.

So our commit tree would look something like below:

The resultant commit tree after rebasing

So instead of commits E and F, we have the commits E’ and F’. Note that the original commits E and F still exist in the repository (they can be found through the reflog until Git eventually garbage-collects them), but they are no longer reachable from any branch.
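
The following sketch shows this in a scratch repository: the branch tip gets a new hash after the rebase, while the old tip is still present as an object. The commit letters and the commit helper are illustrative assumptions, and the exact reflog output may vary between Git versions.

```shell
#!/bin/sh
# Sketch: rebasing creates new commits (E', F'); the old ones stay in the repository.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b dev-feature01; commit E; commit F
git checkout -q master; commit C; commit D

git checkout -q dev-feature01
old_tip=$(git rev-parse dev-feature01)    # the original commit F

git rebase -q master

new_tip=$(git rev-parse dev-feature01)    # the rewritten commit F'
echo "before: $old_tip"
echo "after:  $new_tip"

# The old commit is no longer on the branch, but the object still exists
git cat-file -t "$old_tip"

# The branch reflog keeps a record of where the branch used to point
git reflog show dev-feature01
```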

Benefits of Git Rebase

The major benefit of rebasing is that we get a much cleaner project history. First, it eliminates the unnecessary merge commits required by git merge. Second, as we can see in the above diagram, rebasing results in a perfectly linear project history: we can follow the tip of dev-feature01 all the way back to the beginning of the project without any forks. This makes it easier to navigate the git history of the project.
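
One way to see the difference is to apply both approaches to copies of the same branch and count the merge commits afterwards. This is only a sketch: the branch names via-merge and via-rebase, the commit letters and the commit helper are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the history produced by merge and by rebase.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b dev-feature01; commit E; commit F
git checkout -q master; commit C; commit D

# Copy 1 of the feature branch: bring in master with a merge
git checkout -q -b via-merge dev-feature01
git merge -q -m "Merge master" master

# Copy 2 of the feature branch: bring in master with a rebase
git checkout -q -b via-rebase dev-feature01
git rebase -q master

echo "merge commits after merging:  $(git rev-list --merges --count via-merge)"
echo "merge commits after rebasing: $(git rev-list --merges --count via-rebase)"

# The rebased branch is a single straight line of commits
git log --graph --oneline via-rebase
```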

The Golden Rule of Git Rebase

Since the git rebase command essentially rewrites git history, it should never be used on a branch that is shared with another developer (unless both developers are Git experts). Or, as it’s often put: never rebase public branches.

To see why, think about what would happen if you rebased master onto your dev-feature01 branch. The rebase replays all of the commits in master on top of the tip of dev-feature01. The problem is that this only happens in your copy of the repository. All of the other developers are still working with the original master branch. Since rebasing results in brand new commits, Git will think that your master branch’s history has diverged from everybody else’s. The only way to synchronize the two master branches is to merge them back together, resulting in an extra merge commit and two sets of commits that contain the same changes (the original ones, and the ones from your rebased branch).

To understand this better, let’s consider the following state of the commit tree for the same product on developer A’s machine, on the central server (GitHub, Bitbucket, VSTS, etc.) and on developer B’s machine:

git commit history comparison – 01

We can see that all three have the same history.

Now, consider that developer A breaks the golden rule by rebasing the master branch onto the dev-feature01 branch. In this case, the resulting trees would look like this:

git commit history comparison – 02

As you can see, developer A’s commit tree is now different from the others. Meanwhile, developer B has gone ahead and made another commit, E, in his/her copy of the repository. Neither of them has synced with the central server yet.

Let’s say developer A tries to push his/her master branch to the central server. The push is rejected because developer A’s master branch has diverged from the master branch on the central server. One way out is to push the branch forcefully using git push --force, which essentially overwrites the remote branch with developer A’s version.

Now developer B has to resync his/her master branch before he/she can push. A git pull first fetches the remote branch and then merges it into the local one, and developer B has to resolve any conflicts along the way. This is what it looks like once it’s complete:

git commit history comparison – 03

Developer B is now finally able to push back to the central server, and then developer A has to resync to get the latest version. In the end, the commit trees would look like this:

git commit history comparison – 04

As you can see, it has become hard to understand the flow of commits from one branch to another branch.
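
The whole sequence can be replayed locally without touching a real server. In the sketch below, a bare repository in a temporary directory stands in for the central server, and the directories devA and devB stand in for the two developers. The commit letters, directory names and the commit helper are assumptions for illustration and may not match the diagrams exactly.

```shell
#!/bin/sh
# Sketch: what happens when a shared master branch is rebased and force-pushed.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Central server (a local bare repository) whose default branch is master
git init -q --bare "$work/central.git"
git -C "$work/central.git" symbolic-ref HEAD refs/heads/master

# Developer A publishes A, B, C on master and keeps D on a local feature branch
git init -q "$work/devA"
cd "$work/devA"
git symbolic-ref HEAD refs/heads/master
commit A
git checkout -q -b dev-feature01; commit D
git checkout -q master; commit B; commit C
git remote add origin "$work/central.git"
git push -q -u origin master

# Developer B clones the shared history
git clone -q "$work/central.git" "$work/devB"

# Developer A breaks the golden rule: master is rebased onto dev-feature01
git rebase -q dev-feature01

# Meanwhile developer B adds commit E on top of the original master
cd "$work/devB"; commit E

# Developer A's normal push is rejected, so A overwrites the server copy
cd "$work/devA"
if git push -q origin master 2>/dev/null; then
  echo "unexpected: push accepted"
else
  echo "push rejected: master has diverged"
fi
git push -q --force origin master

# Developer B has to merge the rewritten master into the local one
cd "$work/devB"
git pull -q --no-rebase --no-edit origin master

git log --graph --oneline master
# Commits B and C now appear twice: the originals and the rebased copies
git log --format=%s master | sort | uniq -c
```

In developer B’s final history, the subjects B and C each show up twice, joined by an extra merge commit, which is exactly the clutter the golden rule is meant to prevent.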

When to use Git Rebase

The illustration above shows why git rebase is a poor fit for public branches. However, git rebase can also be used within a single local branch. By periodically performing an interactive rebase on a local branch, you can make sure each commit in the git history is focused and meaningful. This lets you write your code without worrying about breaking it up into isolated commits, because you can fix it up after the fact. This process is sometimes known as local cleanup.

In the next couple of posts, we’ll see how to use git rebase at the command line and in Visual Studio.
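
As a small taste of local cleanup, the sketch below folds a late fix into the commit it belongs to, using a fixup commit and an interactive rebase with the --autosquash option. The file names and commit messages are made up, and GIT_SEQUENCE_EDITOR is set to true only so that the example runs without opening an editor; in day-to-day use you would review the todo list yourself.

```shell
#!/bin/sh
# Sketch: clean up a local branch with an interactive rebase before sharing it.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "readme" > README.txt; git add README.txt; git commit -q -m "Initial commit"

git checkout -q -b dev-feature01
echo "login form" > login.txt;   git add login.txt;  git commit -q -m "Add login form"
echo "logout link" > logout.txt; git add logout.txt; git commit -q -m "Add logout link"

# A late fix that really belongs to "Add login form" (one commit back)
echo "validate password" >> login.txt
git add login.txt
git commit -q --fixup=HEAD~1

echo "before cleanup:"
git log --oneline master..dev-feature01

# --autosquash moves the fixup commit next to its target and squashes it;
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

echo "after cleanup:"
git log --oneline master..dev-feature01
```

After the rebase, the branch should contain only the two meaningful commits, with the fix folded into the login commit. Because the branch was never pushed, nobody else is affected by the rewritten history.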

 

 
