Rebasing is Editing Commits

git

Mon Jun 30 17:19:00 -0700 2008

Rebase is one of Git’s most alluring and yet most difficult-to-comprehend features. Rebasing is editing commits. When you rebase, you’re rewriting history.

This is possible with Git because of the separation between a commit and a push. A commit is a changeset with some attached metadata like the commit message. A push publishes your commits to a remote repository. In Subversion, these steps are inseparable, both part of the commit.

Because of this separation, you can rewrite your commits before you push them. Just as you can edit your source files as many times as you want before you commit the results, so too can you edit your commits as many times as you want before you push the results.

Rebasing comes in two main forms. One is the interactive rebase. git rebase -i HEAD~5 pops you into an editor showing your last five commits, where you can change the order of the commits, delete entire commits, or squash commits together. You can also mark a commit for editing: the rebase stops at that commit and drops you back to the shell, where you can work on it in your working tree, record the result with git commit --amend, and then resume with git rebase --continue.
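To make the interactive form concrete, here is a rough sketch that squashes the last two commits of a throwaway repository. Everything in it is made up for illustration: the temporary directory, the file name notes.txt, and the commit messages. In real use your editor opens the todo list; here sed stands in for the editor (via GIT_SEQUENCE_EDITOR) so the script can run unattended.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Three small commits to play with.
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "commit $n"
done

# Normally `git rebase -i HEAD~2` opens the todo list in your editor.
# Here sed plays the editor and changes the second "pick" to "squash";
# GIT_EDITOR=true accepts the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~2

count=$(git rev-list --count HEAD)
echo "commits after squash: $count"
```

The file still contains all three lines afterwards; only the way the history is sliced into commits has changed.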

The other type of rebase is rebasing your local commits on top of some changes you’re pulling from a remote source, or against a local branch. (Hence the term “rebase”: you’re creating a new base for your patches.) The action takes some new commits from a branch and slips them in underneath yours, at the point the two branches diverge. Commits prior to the divergence point are unaffected.
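A small sketch of that second form, rebasing against a local branch. The repository, the branch name topic, and the file names are all invented for the example; the only assumption about your setup is a reasonably recent git.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of git defaults
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "shared base"

# Hypothetical topic branch with one local commit.
git checkout -q -b topic
echo mine > mine.txt
git add mine.txt
git commit -q -m "my change"

# Meanwhile master gains a commit of its own, so the two branches diverge.
git checkout -q master
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "their change"

# Re-create "my change" on top of the new tip of master.
git checkout -q topic
git rebase -q master

base_now=$(git merge-base topic master)
master_tip=$(git rev-parse master)
git log --format=%s
```

After the rebase, the point where topic leaves master is the tip of master itself, which is the "new base" the name refers to.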

I use this kind of rebasing instead of git pull. You’ll notice that pull creates a merge commit whenever both you and the remote have new commits, and a merge commit is one of these things:

commit c4110e1fb1aa50c4f876716bde07f6a982a1f31c
Merge: 296a0db... cb6050b...
Author: Joe <joe@example.com>
Date:   Tue Jun 24 14:46:41 2008 -0700

    Merge branch 'master' of example.com:repo.git

You might wonder why a merge commit is needed. Subversion doesn’t have that, after all. But that’s because the merge commit in svn is always implicit. Did you ever find yourself working on an active project, and then when you went to commit, you needed to svn up and got a whole boatload of changes, including some conflicts? Sure you did. And then you had to sit there and perform the merge on your source, finally making a big commit which included both your changes and the merge.

Git encourages discrete changesets, so it makes sense to break apart regular changes (new feature, bugfix, etc) from merges. But on an active project with lots of contributors, there’s always merging going on. So you end up with lots of ugly merge commits cluttering up your logs.

Rebasing lets us have our cake and eat it too. Now you can make atomic commits as you’re working, regardless of whether you are ready to share those commits with your team. But when you want to pull down the latest work from your team and merge it with your work, you can instead use rebase to reapply your patches on top of theirs. If you’re not working on the same areas of the code, then this takes almost no work. Just type git fetch && git rebase origin/master and you’re done. (Notice that the output of git log then shows your recent changes at the top, regardless of the timestamps.)
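Here is one way to try the fetch-then-rebase workflow without a real server. It is only a sketch: the "remote" is a bare repository in a temporary directory, the clones mine and teammate and the identify helper are hypothetical, and the remote-tracking branch is written as origin/master.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: give the current repository a commit identity.
identify() {
    git config user.name "$1"
    git config user.email "$1@example.com"
}

# A seed repository becomes the shared "origin", then two people clone it.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
identify seed
echo base > base.txt
git add base.txt
git commit -q -m "shared base"
cd ..
git clone -q --bare seed origin.git
git clone -q origin.git mine
git clone -q origin.git teammate

# The teammate publishes a commit.
cd teammate
identify teammate
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "their change"
git push -q origin master
cd ..

# Meanwhile you commit locally, then replay your work on top of theirs.
cd mine
identify me
echo mine > mine.txt
git add mine.txt
git commit -q -m "my change"
git fetch -q && git rebase -q origin/master

merges=$(git rev-list --merges --count HEAD)
git log --format=%s
```

The log lists "my change" first, then "their change", then "shared base", and the history contains no merge commits.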

Occasionally there are merge conflicts during the rebase, and this will drop you out into a shell with some rather intimidating messages. Don’t worry though, all you need to do is take a look at the conflicted files, choose the part that you want (just like resolving a regular merge conflict), and then git add them. When you’re done, run git rebase --continue. It does this merge step separately for each commit, so if there are a lot of commits in the difference between you and the remote source, this could get time-consuming. In that case you may want to git rebase --abort and then run a regular git merge or git pull. Next time, resolve to rebase more frequently, to make the merge job less of a headache.
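The conflict loop is easier to follow once you have walked through it on a toy case. The sketch below forces a conflict in a throwaway repository and resolves it; the file settings.txt, the branch topic, and the choice to keep "our" side are all assumptions made for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "shared base"

# Both branches change the same line, which guarantees a conflict.
git checkout -q -b topic
echo "color = blue" > settings.txt
git commit -q -a -m "my change"

git checkout -q master
echo "color = green" > settings.txt
git commit -q -a -m "their change"

git checkout -q topic
# The rebase stops at the conflict and exits non-zero, hence the `|| true`.
git rebase master >/dev/null 2>&1 || true

# Resolve by hand (here we simply keep our side), then mark the file resolved.
echo "color = blue" > settings.txt
git add settings.txt

# GIT_EDITOR=true skips the commit-message prompt that some git versions show.
GIT_EDITOR=true git rebase --continue >/dev/null

result=$(cat settings.txt)
echo "resolved to: $result"
```

With several commits in flight, the resolve, git add, git rebase --continue cycle repeats once per conflicting commit.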

Since rebasing is editing of commits, it doesn’t make much sense to rebase things that have already been pushed. You can do it, but as soon as you go to merge with another repo that had the unedited commit history, you’ll bump into weirdness (and probably invalidate your whole reason for rebasing, which was to clean up the history). So as a general rule, I recommend never rebasing things that have already been pushed.

If you push something and then realize five seconds later that you shouldn’t have, it is possible to rebase your local branch and then git push --force, which will overwrite the remote branch with your rewritten history. This won’t help if someone else has already pulled the commits, since the old commits will come back the next time they push, so only use it when you’re certain that no one else has pulled.
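A sketch of that rescue, again against a bare repository in a temporary directory rather than a real remote. For brevity it rewrites the pushed commit with git commit --amend as a stand-in for a rebase; the directory names, file name, and commit messages are hypothetical.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin ../origin.git

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "add notes (oops, pushed too early)"
git push -q origin master

# Rewrite the commit locally; amending is the simplest form of editing history.
echo "second draft" > notes.txt
git commit -q -a --amend -m "add notes"

# A plain push is now rejected because the remote tip is not an ancestor of ours.
if git push -q origin master 2>/dev/null; then
    echo "unexpected: plain push succeeded"
fi

# Forcing replaces the remote branch with the rewritten history.
git push -q --force origin master

remote_tip=$(git --git-dir=../origin.git rev-parse master)
local_tip=$(git rev-parse HEAD)
echo "remote now at: $remote_tip"
```

The old commit is gone from the remote branch after this, which is exactly why it is only safe when nobody else has fetched it.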