Giving distributed SCM systems a try

Posted: February 28th, 2006 | Filed under: Uncategorized

Open source SCM (version control) systems are somewhat like buses: you wait hours for one to come along, and then three arrive all at once. For years everyone used (and complained about) CVS, and then in the space of little over a year, four or more really good contenders sprang into existence. Actually it was a little more complicated than this: probably 2-3 years ago Subversion and GNU Arch started to gain a significant userbase. Subversion has frequently been criticised (rightly or wrongly, I’m not going to debate that now) for not being forward thinking or ambitious enough. In terms of architecture it’s basically no more advanced than CVS with atomic changelists / commits, although the underlying technologies are definitely saner. GNU Arch took a big step forward by switching to a decentralized architecture, but its command syntax is incredibly obtuse & the implementation has some questionable technical decisions, such as ignoring POSIX/UNIX standard APIs from (G)LibC and writing them from scratch.

Getting back to my main point, however, the combination of the great Linux Kernel BitKeeper debacle and dissatisfaction with the tradeoffs between Subversion & GNU Arch appears to have accelerated the development of open source SCM to an even faster rate. We now have a choice of GIT, Monotone, Mercurial, Bazaar-NG, CodeVille, and Darcs. One thing these SCM systems all have in common is a distributed architecture. Secondly, having learnt from the problems of GNU Arch, they also strive to be as easy to learn & use as CVS.

When starting work on the OLPC project we needed to pick an SCM system & we were determined that it would not be CVS. Subversion was also discounted due to its inability to work offline, so we required a distributed SCM system. I’d like to say we did a thorough analysis of the distributed SCM systems listed above, but we didn’t. David & I both just came to agreement that Mercurial looked like a good tool and so it came to be. It turned out to be a very good choice: all day-to-day operations are incredibly fast; you can trivially work offline (as with any distributed SCM); the core commands were pretty easy to learn, being pretty much the regular CVS commands with the addition of “clone”, “push” & “pull” for synchronizing with remote repositories; it has minimal pre-requisite dependencies; publishing read-only anonymous repositories is trivial; did I mention it is fast yet? After a month and a half of using it, I’ve only found one aspect that irritates me. When you pull down changes from a remote repository & you have local changes not yet pushed, it temporarily creates two HEADs, which you then have to merge, even if the changelists don’t conflict. Still, this merge is a trivial process, so it’s nothing to be worried about.
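To make the two-HEADs behaviour a bit more concrete, here is a toy Python model of a changeset graph. It is only a sketch of the general idea: the `Repo` class, its methods and the changeset names are all made up for illustration, and it says nothing about how Mercurial actually stores history or implements pull.

```python
# Toy model of a changeset graph. This is an illustration only and
# is NOT how Mercurial stores or exchanges history.
class Repo:
    def __init__(self):
        # Maps changeset id -> tuple of parent ids.
        self.parents = {}

    def commit(self, cid, *parents):
        # Record a new changeset; every parent must already be known.
        for p in parents:
            if p not in self.parents:
                raise KeyError(f"unknown parent: {p}")
        self.parents[cid] = tuple(parents)

    def pull(self, other):
        # Copy over any changesets we do not have yet.
        for cid, parents in other.parents.items():
            self.parents.setdefault(cid, parents)

    def heads(self):
        # A head is a changeset that no other changeset names as a parent.
        has_child = {p for ps in self.parents.values() for p in ps}
        return sorted(c for c in self.parents if c not in has_child)


def demo():
    remote = Repo()
    remote.commit("a")

    local = Repo()
    local.pull(remote)            # roughly what a clone gives you

    remote.commit("b", "a")       # someone else's change lands remotely
    local.commit("c", "a")        # a local change, not yet pushed

    local.pull(remote)
    before = local.heads()        # both "b" and "c" are now heads

    local.commit("m", "b", "c")   # the merge changeset has two parents
    after = local.heads()
    return before, after
```

In this model `demo()` returns `(["b", "c"], ["m"])`: the pull leaves two heads because neither change knows about the other, and the merge changeset joins them back into one, whether or not the underlying edits touch the same files.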

I’ve been so impressed with using it for OLPC that I’ve decided to switch all my personal projects over to Mercurial for future development. For this process I chose to use Tailor, a sub-project of Darcs that provides a general purpose, bi-directional change history converter for a large number of SCM systems. After a little playing around to optimize the conversion, I now have 20 projects up and running, with 5 years of history intact. Very impressive.