Git and clean history - tidying up the wrong thing

Published 2025-03-10

tag(s): #overblown-minor-annoyances #programming #yell-at-cloud

I just made a "merge and squash" commit, as per the wishes of the team that owns that repository, and that reminded me how much I dislike the idea that your Git history is supposed to look "clean" and linear.
Why? I find the reasons justifying this banal, usually rooted in aesthetic benefits. And while squashing is not the same as rebasing, they both come from the same place of "keeping a tidy history". Why? History is messy; it's meant to be.

I know I am in the minority here, but I also know I am not the only one. And I don't have that much to say on this topic that hasn't been said before, so I will quote other people.

Rebase Considered Harmful

This is arguably the most famous post on the topic. While I don't agree with every single point in Rebase Considered Harmful (some things are overblown minor annoyances even by my standards!), there are a few great points in there:

One of the oft-cited advantages of rebasing in Git is that it lets you collapse multiple check-ins down to a single check-in to make the development history "clean." The intent is that development appear as though every feature were created in a single step: no multi-step evolution, no back-tracking, no false starts, no mistakes. This ignores actual developer psychology: ideas rarely spring forth from fingers to files in faultless finished form. A wish for collapsed, finalized check-ins is a wish for a counterfactual situation.

The common counterargument is that collapsed check-ins represent a better world, the ideal we're striving for. What that argument overlooks is that we must throw away valuable information to get there.

And that's what I consider bad about squashing and rebasing (again, yes, they are very different operations). Every little change, false start, and misdirection present in your "messy" commit history informs the final design of the feature.
Often, the remnants of those false starts are a source of bugs. Or they help explain, for example, why certain code is duplicated but slightly different in two locations. Or why a certain function has an unused parameter, or some unreachable code.
This is valuable information that is discarded when it is all collapsed into a single commit, or into a series of (allegedly) cleaner commits after a rebase.
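
To make the loss concrete, here is a minimal sketch that builds a throwaway repository and integrates the same feature branch twice: once with a squash, once with a regular merge commit. The branch names, file name, and commit messages are all made up for illustration, and it assumes a reasonably recent `git` on the `PATH`.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below (branches, file, messages) is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
main=$(git symbolic-ref --short HEAD)   # default branch name depends on local config

echo "v0" > feature.txt
git add feature.txt
git commit -q -m "Initial commit"

# A feature branch with a false start followed by a rework.
git checkout -q -b feature
echo "first attempt" > feature.txt
git commit -q -am "First attempt at feature"
echo "second attempt" > feature.txt
git commit -q -am "Rework after missing a corner case"

# Option A: squash the branch onto a copy of the main branch.
git checkout -q -b squashed "$main"
git merge -q --squash feature
git commit -q -m "Add feature (squashed)"
squash_count=$(git rev-list --count "$main"..squashed)

# Option B: integrate the same branch with a regular merge commit.
git checkout -q -b merged "$main"
git merge -q --no-ff -m "Merge feature" feature
merge_count=$(git rev-list --count "$main"..merged)

echo "commits added by squash: $squash_count"
echo "commits added by merge:  $merge_count"
```

The squashed branch gains a single commit, while the merged branch gains the two original commits plus the merge commit. The "First attempt" commit, and whatever it would have explained later, only survives in the second case.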

Why you should stop using Git rebase

A bit less inflammatory, maybe because it isn't pushing an agenda, is Why you should stop using Git rebase. It is the article I tend to share when this topic comes up.

[...] Git merge. It's a simple, one-step process, where all conflicts are resolved in a single commit. The resulting merge commit clearly marks the integration point between our branches, and our history depicts what actually happened, and when it happened.
[...]
The importance of keeping your history true should not be underestimated. By rebasing, you are lying to yourself and to your team. You pretend that the commits were written today, when they were in fact written yesterday, based on another commit. You've taken the commits out of their original context, disguising what actually happened. Can you be sure that the code builds? Can you be sure that the commit messages still make sense? You may believe that you are cleaning up and clarifying your history, but the result may very well be the opposite.

Most comments in the highlights argue that proper testing and continuous integration can help you avoid a lot of the situations described. Well, even if you have perfect tests and CI, there will still be bugs; that is just the nature of building software. So sooner or later you will need to look at the project's history.
And even for unit tests, seeing the tests' history can also help you identify when and why a test stopped making sense :)

Contradiction

What I find weird about the interest in keeping a "clean and tidy" history is that it runs counter to the value of openness.
Git and related tools enabled us to do development in the open, publicly. And then people go to great lengths to hide "the ugly parts" of that development work, as if they were something to be ashamed of.
I need you to know my initial design was off, that I missed a corner case, and how I modified the code to make that work. I need you to know that I misspelled a variable, so when 6 months later you catch the same mistake in another file, the history lets you know what the correct spelling is and that I missed that one instance. And so on.
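
As a rough sketch of that misspelled-variable scenario: Git's pickaxe search (`git log -S`) lists the commits that changed how many times a string occurs, and `git grep` shows where it still lives. The repository, file names, and variable name below are invented for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the variable name are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Commit 1: a misspelled name lands in two files.
echo 'recieve_count = 0' > a.py
echo 'recieve_count += 1' > b.py
git add a.py b.py
git commit -q -m "Add counters"

# Commit 2: the spelling is fixed in a.py only; b.py is missed.
echo 'receive_count = 0' > a.py
git commit -q -am "Fix spelling of receive_count"

# Pickaxe search: commits that added or removed occurrences of the misspelling.
hits=$(git log --format=%s -S recieve_count)
echo "$hits"

# Files that still carry the old spelling today.
leftover=$(git grep -l recieve_count)
echo "still misspelled in: $leftover"
```

Because the fix was kept as its own commit, the search turns up both the commit that introduced the typo and the one that corrected it, and the commit message tells you which spelling was intended. Had the two commits been squashed together, `a.py` would never have contained the misspelling as far as the history is concerned.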
