
Our analysis of git hatred begins with a discussion of gitology, which I now define:

gitology (n. from git, a contemptible fellow or DVCS): The pernicious study of the inner workings of git and how to apply them.

Thus gitology refers to the theoretical framework and mental model one must absorb as a prerequisite to becoming fully competent with git’s commonly-used features. Here a distinction must be made, because although there are many gitological concepts that only apply to git, many others are part of general DVCS theory. Gitology, however, sometimes interprets general DVCS concepts in its own way.

The following gitological concepts are not particular to git:

repositories (repos)
commits or changesets (csets)
directed acyclic graph (DAG)
branches
pushing and pulling changes
whole-repo tracking, not individual files
rebasing csets

The following are purely gitological and add unnecessary complexity, in addition to eventually being unavoidable:

Exposing the index/staging area
Exposing other implementation details: blobs, trees, commits, refs
refs and refspecs
Branches are refs
Detached HEADs (a.k.a. “not on a branch”)
Distinguishing remote and local tracking branches
Choosing which branch to pull onto
Bare repos
Hard, soft, mixed resets
Porcelain vs plumbing

I will say more of each in turn further down.

Unavoidable gitology

If you are a user of git, I invite you to try to describe the purely gitological concepts above. If you have used git more than casually, you probably know what most of them are. If you don’t know most of them, it is likely because you haven’t used git for very long.

It is clear that gitology is unavoidable. A user of git must quickly become a student of gitology, for git will make no attempt to hide its ugly guts from said user. Consider for example one of the first tasks that a user of git will encounter, making a commit. This is how git’s manpage describes this operation (and manpage it is, in the classic style of nerd-only Unix documentation):

Stores the current contents of the index in a new commit along with a log message from the user describing the changes.

Immediately the first gitological concept pops up, the index. Now, the index is not unique to git, nor even to a DVCS. Practically every VCS must implement an index in one way or another. However, it is an implementation detail, which a VCS might decide to not implement for whatever reason. The great gitological stroke of genius was to gleefully expose and moreover force the user to manually handle the index. Any other common VCS only optionally makes the user handle it. Moreover, this is touted as a frequently-loved feature of git.

Let me pause here for a moment to describe why this is characteristic of git’s pathological design choices.

The index is an implementation detail. Git refers to implementation details as “plumbing” and the user interface as “porcelain”, in order to make it clear that git’s designers think of git with the same reverence with which I think of the instrument that handles the organic waste that my body produces. Git’s makers (I hesitate to suggest that they actually consciously designed this, so I won’t call them “designers” again) refer to the index as “porcelain”, whereas it should be “plumbing”, as evidenced by hg, darcs, bzr, and yes, even crufty ol’ svn, all of which handle the index automatically.
They used manpages to document them. The manpage is a Unix developer format that tends to lead to terse or obscure documentation. It doesn’t have to do this (e.g. BSD tends to have great manpages), but as the name indicates, it used to be just a single page, really, just a cheat sheet with mnemonics to remind people what they already should have known about a program. The UNIX Haters’ Handbook explores this documentation problem in more detail. In this case, if you didn’t already know what an index was, the manpage is going to make it difficult for you to figure it out. For programs that are intended to be used only by nerdy Unix developers, this wouldn’t be a problem; however, git is primarily a tool for collaboration, very widespread collaboration, and expecting all collaborators to be nerdy Unix developers is unrealistic and hinders the very collaboration it’s supposed to encourage. They could be Windows users, they could be less technical contributors like translators or graphic artists (yes, it makes sense to put graphics under source control, even a DVCS, with some care). Manpages are inadequate for these people.
Exposing the index is frequently touted. Git encourages micro-management, and git’s users end up loving this micro-management (and blog about it, and write books, and have conferences, and so on ad nauseam). This is characteristic of the perversion that git promotes, focussing on details instead of getting work done. Of course it’s hard work to understand gitology, so of course people feel accomplished once they complete this work, but it’s work that shouldn’t exist. It’s not that it’s wrong to expose an implementation detail if it allows more advanced use. It’s wrong to have no way to hide this detail completely. There are workarounds, like passing options to commit, or using a front-end to git, but remember that I specifically do not hate front-ends to git (poor things are just doing the best they can to fix a horrible UI). It should be the other way around, though. It should be abstracted away from the user, and should the user want more advanced use, there should be a developer API upon which we can build more advanced tools.
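
For anyone who has not yet had the pleasure, here is a rough Python sketch of what the manpage sentence quoted earlier means in practice. It is a toy model and nothing like git’s real implementation: the class ToyRepo and its working, index, and commits attributes are names I made up purely for illustration. All it tries to show is that a commit records what was staged, not what is currently sitting in your files.

```python
# Toy model, not git's actual implementation: plain dictionaries stand in
# for the working directory, the index, and the recorded commits.

class ToyRepo:
    def __init__(self):
        self.working = {}   # path -> contents as currently edited
        self.index = {}     # path -> contents staged for the next commit
        self.commits = []   # list of (message, snapshot) tuples

    def add(self, path):
        """Copy one file from the working directory into the index."""
        self.index[path] = self.working[path]

    def commit(self, message):
        """Record the index, not the working directory, as a new commit."""
        snapshot = dict(self.index)
        self.commits.append((message, snapshot))
        return snapshot


repo = ToyRepo()
repo.working["a.txt"] = "first draft"
repo.add("a.txt")
repo.working["a.txt"] = "second draft"   # edited again, but never re-added
snapshot = repo.commit("save my work")
print(snapshot["a.txt"])  # prints "first draft": the staged version
```

The commit holds the first draft even though the file on disk has moved on, which is exactly the surprise the index springs on newcomers.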

Why gitology?

So why does git do the things that it does? Why expose all the plumbing? The situation is basically the following:

While the makers of git are so happy about all the flexibility that git offers due to exposing its plumbing, its users that are not also of the same persuasion are aghast that it requires learning about the path that human refuse follows through git.

The ability of a tool to be so flexible as to let me shoot myself in the foot never did much for me other than leave me with injured feet.

A brief excursion into gitology

The gitological concepts I enumerated above are all deserving of a more thorough study in forthcoming blog posts. For the moment, I will briefly describe the remaining ones and why they are not necessary in order to have a working DVCS.

Blobs, trees, commits, refs

Git’s simple storage model is simple enough to be exposed to the user, so why not make everyone learn it? The more internal details we can expose, the better. It is a rite of passage for any serious student of gitology to read Git For Computer Scientists or equivalent.
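
To give a flavour of what the student is expected to absorb: in a repository using the SHA-1 object format, a blob’s name is the SHA-1 hash of a short header followed by the file’s contents. Here is a minimal Python sketch of that rule; the function name git_blob_id is my own invention, and trees and commits (which are hashed in a similar spirit) are left out.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    # The header is the object type, a space, the content length in bytes
    # as decimal text, and a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

That value is the id every empty file gets in such a repository, which is the sort of trivia one picks up on the way to merely saving one’s work.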

refs and refspecs

Refs are part of git’s storage model. They’re basically pointers, and just like carelessly manipulating C pointers results in segfaults, carelessly manipulating git’s refs results in lost data. Most of the time it’s not a problem, so naturally they should also be exposed to the user. Refspecs are a general class of ways to specify a ref, and some of their syntax is based on what a certain hacker learned to type when he was inspecting the results when a certain kernel took a core dump on him.

Branches are refs

A branch should be a simple concept independent of any implementation detail: a line of development in the repo’s DAG. For the most part, this is what they are in git, except that git identifies branches with refs, so amongst other complications this leads to

Detached HEADs

When you check out an earlier commit in git, you end up with the cryptic “not in a branch” message… then where the hell are you? Isn’t the DAG made up exclusively of branches? No, a branch is a ref, so if you’re not at a commit (recursively) pointed to by a ref, you’re not anywhere and you might as well not exist and git will eventually garbage collect you.
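
The rule at work can be sketched in a few lines of Python. This is a toy model under loud assumptions: the commit names, the parents and refs dictionaries, and the reachable function are all hypothetical, and real git also keeps such commits around for a while via the reflog, which this model ignores. It only illustrates that a commit made while detached is reachable from no ref.

```python
# Toy model: commits name their parents, refs name commits.
parents = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],   # tip of the only branch
    "d1": ["c2"],   # committed while HEAD was detached at c2
}
refs = {"master": "c3"}


def reachable(refs, parents):
    """Return every commit found by walking parent links from any ref."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


kept = reachable(refs, parents)
orphaned = set(parents) - kept
print(sorted(orphaned))  # prints ['d1']
```

Note that c2 itself is safe, because master still leads back to it; it is the work done on top of it while detached that no ref remembers.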

Distinguishing remote and local tracking branches

A DVCS really shouldn’t care where branches are, and for symmetry (because symmetry is beautiful and simple) a branch shouldn’t change its nature if it’s here or there, or at least it shouldn’t appear this way to the user. Git, of course, disagrees and makes you remember the distinction between your local copy of the remote branch and the remote branch. It’s not such a big deal in the end, except when you run into

Choosing which branch to pull onto

In git, you can’t just say, “get over here whatever differences are over there” (i.e. “pull”, a common DVCS operation), because you have to remember where to pull it to. “Here” isn’t always enough for git, because the sad developmentally stunted self-described stupid content tracker can’t know which “here” you mean. You have to specify a branch, because your branch isn’t the same as the branch over there, so it can’t always automatically go on whatever branch here corresponds to the branch there. This is all symptomatic of the asymmetry of remote and local tracking branches, as are

Bare repos

Because in git you can’t treat remote and local symmetrically (i.e. push and pull are not symmetric operations), you also can’t (or shouldn’t, rather) push to any repos willy-nilly. So for example, you can’t make local clones in git and push and pull amongst them, because git can’t figure out what to do with the working directory of each. The gitological solution to not knowing what to do with the current working directory is to remove it altogether in what is known as a bare repo. There are workarounds, of course (create a zillion branches, which is what git users love doing), but git’s design choices really hinder the naturality of cloning. There is also a deeper design problem lurking here that seeps through the stupidity of the content tracking.

Hard, soft, mixed resets

This one is admittedly more esoteric. There are several ways to do a “reset”, and really they all mean something completely different, but I chose this gitological example to point out another boneheaded aspect of git’s, shall we say, evolution. Gitology teaches that it’s best when a single command does too much and does completely different things depending on which options you choose but actually sort of the same thing if you realise that the underlying implementation is the same. It’s like having to “move” files if you want to “rename” them because that’s how filesystems are implemented, oh, wait, I start to see where these gitologists are getting their ideas from…
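
For the curious, the three flavours can be caricatured in Python. This is a sketch under stated assumptions, not git’s implementation: the commits dictionary, the state layout with head, index, and working keys, and the reset function are all hypothetical. It models only the commonly documented difference, namely that soft moves HEAD, mixed also resets the index, and hard also overwrites the working files.

```python
# Toy snapshots: commit id -> {path: contents}
commits = {
    "c1": {"a.txt": "v1"},
    "c2": {"a.txt": "v2"},
}


def reset(state, commit_id, mode="mixed"):
    """Return a new toy state after resetting to commit_id in the given mode."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(f"unknown mode: {mode}")
    new_state = {
        "head": commit_id,                  # every mode moves HEAD
        "index": dict(state["index"]),
        "working": dict(state["working"]),
    }
    if mode in ("mixed", "hard"):
        new_state["index"] = dict(commits[commit_id])    # also reset the index
    if mode == "hard":
        new_state["working"] = dict(commits[commit_id])  # also overwrite files
    return new_state


start = {
    "head": "c2",
    "index": {"a.txt": "v2"},
    "working": {"a.txt": "v3, not yet committed"},
}
for mode in ("soft", "mixed", "hard"):
    print(mode, reset(start, "c1", mode))
```

Only the hard variant throws away the uncommitted edit in the working files, which is rather a lot of consequence to hang on one option.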

Coming up…

So this is just a taste of my git hate, because the fans were clamouring for more. I have so much more to hate about git, but I already feel a little useless having devoted this much breath to git, so I will probably take a while to write the next installment. It will come, though. It doesn’t look like git is getting any less hateful any time soon…

When in the course of human events it becomes necessary for a software user to disparage a thoroughly hostile DVCS, there is no recourse but to blog about it. Thus, software diplomacy has failed, and we must face the fact that I irredeemably hate git. To prove this, let these facts be submitted to a candid world.

git has destroyed our data.
git has destroyed our data remotely.
git has forced us to learn gitology.
git has coerced us to memorise --many --unrelated --options. For supposedly the same command.
git has made us use a thoroughly unintuitive user interface with badly named commands.
git has produced horrible documentation.
git has written and suggested we read Git for Computer Scientists. Several times.
git has created the largest unavoidable worldwide gang of users, and we must use git if we wish to collaborate with them.
git has obscured safer, easier, and just as powerful alternatives.

We, therefore, appealing to the wellbeing of all software users, do in the name and by the authority of all distributed version control system decency, solemnly publish and declare that git really fucking sucks, that the previous obscene intensifier was fully necessary, and that we should do all that is within our power to reduce and minimise the proliferation of git usage.

Why I must write this

Before I proceed with my study of git hate, I must first present an apologia for why I am writing this.

You may tell me, “fine, you hate git, don’t use it. It’s just your personal preference.” Normally, for any other software I use out of personal preference, I would agree. I am a student of the Church of Emacs, but this does not mean I think everyone should use Emacs. As long as you produce text, I don’t care how you do it. You can use your text editor of choice, I can use mine, and we will both be happy doing so.

Unfortunately, the same does not hold for git. Although git can be used in isolation without ever collaborating with anyone else (after all, it is a distributed version control system, so this makes it easy to use without other people or a remote server), this is not its primary use case. If everyone around me is using git, then I too am coerced to learn git if I want to collaborate with their software development. You may argue that I can use another DVCS because everything can convert back and forth to git, but interoperability can only go so far. Eventually someone uses a feature of git that another DVCS doesn’t implement, and I will have to use git anyway.

Moreover, it is clear that git is creating a community of people who are faffing around learning gitology and feeling good about themselves for understanding abstruse concepts which are completely orthogonal to actually getting work done. This is evidenced by all the blog posts written by people being frustrated with git. As the renowned 20th century mathematician G. H. Hardy once wrote, “it is a melancholy experience for a professional mathematician to find himself writing about mathematics. The function of a mathematician is to do something, to prove new theorems, to add to mathematics, and not to talk about what he or other mathematicians have done.” In a similar vein, it’s a melancholy experience that we spend so much time blogging about how to use git, reading blogs about how to use git, and joking about using git, instead of getting work done. Without git.

Because everyone else is wasting their time praising and discussing git, the aptly-named stupid content tracker, I must now waste my own time to point out how blockheaded it is that we spend so much time learning a tool that should be tangential to our actual work.

What will not be hated on

I want to make it perfectly clear that I will direct my hatred at git and at git only. In particular, none of the following will receive any direct dosage of hatred:

github
magit
git Tower
gitg
TortoiseGit
Your Favourite non-commandline Interface to Git
Whatever other tool for tracking content that happens to use git as a database or whatever

I want to make this clear, because these tools above are the reason many people claim to love git. But just because good tools have been built on a horrible core doesn’t mean the core is good. As an example I will frequently come back to, C++ is a horrible language, but many good tools and frameworks have been built on top of it.

To all the hardcore git lovers out there, git is neither necessary nor sufficient for building all of the tools you probably love that have been built on top of git. The facilities git provides to build these tools on top of it could have been provided by any other DVCS. We can and should do better than git without sacrificing any of the things you love about git, other than whatever tool has been built on top of it.

Furthermore, I also want to make this very clear: the ultimate enemy is centralised version control, not git. CVCSes are what’s really slowing our collaboration down. git is only the major enemy within the DVCS camp, but if you’re forced to choose git over any CVCS, and git is really the only choice for a DVCS for you (this is rarely true, but I’m speaking hypothetically here), then, yes, choose git. It is by far the lesser of two evils in this hypothetical Catch-22.

What will be offered instead

I will frequently cite Mercurial, abbreviated hg, as an alternative to git. I contend that Mercurial is a safer, saner, better designed, and no less powerful alternative to git. While it may be true that Bitbucket, the major non-free provider of Mercurial hosting often compared to Github, may not fulfill the requirements you have come to expect from Github, this is not due to any limitation of hg itself. As I said, I want to only argue that git as a building block is rotten, but this is not directly related to how people have managed to polish this particular turd.

I do not, however, want to be taken as simply an admirer of hg who is used to an hg workflow and thus hates git simply because git is not hg. I will make a thorough effort to present sound arguments for the myriad reasons why git sucks that will stand on their own without comparing them to hg. When I am finished expounding my revulsion for git, I will offer alternatives that hg or perhaps another DVCS offers that demonstrate that git’s ugliness is not necessary.

Necessary. This is also a word I will come back to often. A central theme of my attack is that all the complications that git has created are not necessary. In order to demonstrate this, I need to offer examples of where git’s equivalent functionality exists without the parts of git that are not necessary. This is why I will frequently cite hg.

So, enough with the meta. Let’s get on to the hating itself.

The Scribbler of the Rueful Countenance

On gitology