Michael DeHaan yesterday posted an example using gource to visualize Cobbler development history. Development on Cobbler started in April 2006, making it of a similar vintage to libvirt, whose development started in November 2005. So I thought it would be interesting to produce a visualization of libvirt development as a comparison.
Head over to the YouTube page for this video if this embedded viewer doesn’t show the option to watch in high definition. HD makes it much easier to make out the names.
Until July last year, libvirt was using CVS for source control. Among the great many disadvantages of CVS is that it does not track author attribution at all, so the first 3½ years show an inaccurately small contributor base. Watching the video, it is clear when the switch to GIT happened, as the number of authors explodes. Even with the inaccuracies from the CVS history, the video makes clear just how much development of libvirt has been expanding over the past 4 years, particularly with the expansion to cover VirtualBox and VMWare ESX server as hypervisor targets. This video was generated on Fedora 12 using gource.
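A typical way to render a gource visualization of a GIT checkout to a video file is to pipe its PPM output stream into ffmpeg, along these lines (the flags and encoder settings here are illustrative, not necessarily the ones used for the video above):

```shell
$ gource --seconds-per-day 0.1 -1280x720 --stop-at-end \
    --output-ppm-stream - /path/to/libvirt | \
  ffmpeg -y -f image2pipe -vcodec ppm -i - \
    -vcodec libx264 libvirt-gource.mp4
```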
gource isn’t the only source code visualization application around. Last year a project called code_swarm came along too. It has a rather different & simpler physics model from gource, not showing the directory structure explicitly. As a comparison I produced a visualization of libvirt using code_swarm too:
Head over to the YouTube page for this video if this embedded viewer doesn’t show the option to watch in high definition. HD makes it much easier to make out the names.
In this video the libvirt files are coloured into four groups: source code, test cases, documentation and i18n data (ie translated .po files). Each coloured dot represents a file, and each developer exerts a gravitational pull on files they have modified. For the years in which libvirt used CVS there were just a handful of developers who committed changes. This results in a visualization where developers have largely overlapping spheres of influence over the files. In the last 6 months with GIT, changes have correct author attribution, so the visualization spreads out, more accurately reflecting who is changing what. In the end, I think I rather prefer gource’s results, because it has a less abstract view of the source tree and better illustrates the rate of change over time.
Finally, can anyone recommend a reliable online video hosting service that’s using HTML5 + Ogg Theora yet? I can easily encode these videos in Ogg Theora, but don’t want to host the 200 MB files on my own webserver since it doesn’t have the bandwidth to cope.
In writing the Capa photo capture application, one of the things I wanted to support was some form of plugin engine to allow 3rd parties to easily extend its functionality. The core application code itself is designed to have a formal separation of backend and frontend logic. The backend is focused on providing the core object model & operations, typically wrapping external libraries like HAL, libgphoto and lcms in GObject classes, with no use of GTK allowed here. The primary frontend builds on this backend to produce a GTK based user interface. The intention is to eventually build another frontend that provides a GIMP plugin.
Adding support for introspection
That’s the sales pitch; how about the reality? The Capa code is based on GObject and was thus ready & willing to be introspected. The first step in adding introspection support is to add some m4 magic to configure.ac to look for the introspection tools & library. This is simple boilerplate code that will be identical for every application using GObject + autoconf:
AC_ARG_ENABLE([introspection],
    AS_HELP_STRING([--enable-introspection], [enable GObject introspection]),
    [], [enable_introspection=check])

if test "x$enable_introspection" != "xno" ; then
    PKG_CHECK_MODULES([GOBJECT_INTROSPECTION],
                      [gobject-introspection-1.0 >= $GOBJECT_INTROSPECTION_REQUIRED],
                      [enable_introspection=yes],
                      [if test "x$enable_introspection" = "xcheck"; then
                           enable_introspection=no
                       else
                           AC_MSG_ERROR([gobject-introspection is not available])
                       fi])
    if test "x$enable_introspection" = "xyes" ; then
        AC_DEFINE([WITH_GOBJECT_INTROSPECTION], , [enable GObject introspection support])
        AC_SUBST([G_IR_SCANNER], [$($PKG_CONFIG --variable=g_ir_scanner gobject-introspection-1.0)])
        AC_SUBST([G_IR_COMPILER], [$($PKG_CONFIG --variable=g_ir_compiler gobject-introspection-1.0)])
    fi
fi
AM_CONDITIONAL([WITH_GOBJECT_INTROSPECTION], [test "x$enable_introspection" = "xyes"])
The next step is to add Makefile.am rules to extract the introspection data. This is a two-step process. The first step runs g-ir-scanner across all the source code and the actual compiled binary / library to generate a .gir file, an XML representation of the introspection data. The second step runs g-ir-compiler to turn the XML data into a machine-usable binary format so it can be accessed efficiently. When running g-ir-scanner on a binary, as opposed to a library, the binary must support an extra command line flag called --introspect-dump. I added code to the main.c source file to support that.
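The standard pattern for supporting this flag is to hand its argument straight to g_irepository_dump() and exit before any UI setup, roughly like this (a sketch of the usual approach, not Capa’s exact code):

```c
#include <string.h>
#include <girepository.h>

/* At the top of main(): if g-ir-scanner invoked this binary with
 * --introspect-dump=ARG, dump the introspection data and exit
 * instead of starting the application */
if (argc == 2 &&
    g_str_has_prefix(argv[1], "--introspect-dump=")) {
    GError *error = NULL;
    if (!g_irepository_dump(argv[1] + strlen("--introspect-dump="),
                            &error)) {
        g_printerr("Failed to dump introspection data: %s\n",
                   error->message);
        return 1;
    }
    return 0;
}
```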
Back to the Makefile.am rules. g-ir-scanner has quite a few arguments you need to set. The --include args provide the names of introspection metadata files for any libraries depended on. The -I args provide the CPP include paths to the application’s header files. The --pkg args provide the names of any pkg-config files the code builds against. There are a few others too which I won’t cover – they’re all in the man page. The upshot is that the Makefile.am gained rules.
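A sketch of the shape such rules take, using the flags described above (the namespace, source lists and package names here are illustrative rather than Capa’s real ones):

```make
if WITH_GOBJECT_INTROSPECTION
Capa-0.1.gir: capa
	$(G_IR_SCANNER) \
		--namespace Capa \
		--nsversion 0.1 \
		--include GObject-2.0 \
		--include Gtk-2.0 \
		--program=$(builddir)/capa \
		-I$(top_srcdir)/src/backend \
		-I$(top_srcdir)/src/frontend \
		--pkg=gobject-2.0 \
		--pkg=gtk+-2.0 \
		--output $@ \
		$(capa_SOURCES)

Capa-0.1.typelib: Capa-0.1.gir
	$(G_IR_COMPILER) -o $@ $<

girdir = $(datadir)/gir-1.0
gir_DATA = Capa-0.1.gir

typelibdir = $(libdir)/girepository-1.0
typelib_DATA = Capa-0.1.typelib
endif
```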
After making those changes & rebuilding, it is wise to check the .gir file, since g-ir-scanner doesn't always get everything correct. It may be necessary to provide annotations in the source files to help it out. For example, it got object ownership wrong on some getters, requiring annotations on the return values such as
/**
 * capa_app_get_plugin_manager: Retrieve the plugin manager
 *
 * Returns: (transfer none): the plugin manager
 */
The final step was to add rules to the RPM specfile, which are fairly self-explanatory.
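In outline, such specfile rules just enable introspection at configure time and package the generated files (the paths here are assumed for illustration, not copied from Capa’s real spec):

```
%build
%configure --enable-introspection
make %{?_smp_mflags}

%files
%{_bindir}/capa
%{_datadir}/gir-1.0/Capa-0.1.gir
%{_libdir}/girepository-1.0/Capa-0.1.typelib
```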
if test "x$enable_introspection" != "xno" ; then
    PKG_CHECK_MODULES(GJS, gjs-1.0 >= $GJS_REQUIRED)
    PKG_CHECK_MODULES(GJS_GI, gjs-gi-1.0 >= $GJS_REQUIRED)
fi
And decided to (temporarily) abuse that until a better way could be found. I have an object instance of the CapaApp class which I wanted to pass into the activate method. The first step was to set this in the global namespace of the script being evaluated. Gjs comes with an API for converting a GObject instance into a JSObject instance which the runtime needs. Thus I wrote a simple helper
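With the Gjs of that era, such a helper would look roughly like the following, combining gjs_object_from_g_object() with the underlying SpiderMonkey API to bind the wrapped object into the global namespace (a sketch only – the exact entry points varied between Gjs releases, and the helper name is hypothetical):

```c
/* Hypothetical helper: expose a GObject instance to scripts as a
 * global variable called 'name' */
static void
set_script_global(JSContext *cx, const char *name, GObject *obj)
{
    /* Wrap the GObject in a JSObject the runtime can use */
    JSObject *jsobj = gjs_object_from_g_object(cx, obj);

    /* Bind it read-only into the script's global namespace */
    JS_DefineProperty(cx, JS_GetGlobalObject(cx), name,
                      OBJECT_TO_JSVAL(jsobj),
                      NULL, NULL, JSPROP_READONLY | JSPROP_ENUMERATE);
}
```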
There was one little surprise in this though. The gjs_object_from_g_object method will only succeed if the current Gjs context has the introspection data for that object loaded. So it was necessary to import my application's introspection data by eval'ing const Capa = imports.gi.Capa. That done, it was now possible to pass variables into the plugin. The complete revised plugin loading code looks like
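The sequence just described can be sketched as follows (the file layout, the global variable name and the plugin entry point are assumptions for illustration, and error handling is trimmed):

```c
/* Hypothetical sketch of the plugin loading sequence */
static gboolean
capa_load_plugin(GObject *app, const char *plugindir)
{
    GjsContext *js = gjs_context_new();
    char *path = g_build_filename(plugindir, "main.js", NULL);
    char *script = NULL;
    int status;
    GError *err = NULL;

    /* Import the application's own introspection data, so that
     * GObject instances of its classes can be wrapped as JS objects */
    if (!gjs_context_eval(js, "const Capa = imports.gi.Capa;", -1,
                          "<import>", &status, &err))
        return FALSE;

    /* ...bind the CapaApp instance into the global namespace as
     * 'capa' here, via the GObject-to-JSObject helper... */

    /* Evaluate the plugin's own source */
    if (!g_file_get_contents(path, &script, NULL, &err) ||
        !gjs_context_eval(js, script, -1, path, &status, &err))
        return FALSE;

    /* Finally invoke the plugin's entry point, passing in the app */
    return gjs_context_eval(js, "plugin.activate(capa);", -1,
                            "<activate>", &status, &err);
}
```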
This code is slightly simplified, omitting error handling, for the purposes of this blog post, but the real thing is not much harder. Looking at the code again, there is really very little (if anything) about it which is specific to my application. It would be quite easy to pull the code which finds & loads plugins out into a library (eg "libgplugin"). This would make it possible for any existing GTK application to be retrofitted with plugin support, simply by generating introspection data for its internal APIs and then instantiating a "PluginManager" object instance.
Since I started developing the Capa photo capture application, I’ve been following development of gphoto much more closely. Unfortunately gphoto is using subversion for source control. There are many things wrong with subversion in comparison to modern SCM systems like Mercurial or GIT. In this particular case though, the main problem is speed, or lack thereof. gphoto uses sourceforge as its hosting service, and sf.net subversion servers are slower than you can possibly imagine. As an example, run ‘svn log’ to browse changes and you’ll be waiting 30 seconds for it to even start to give you an answer. Then run ‘svn diff’ to look at the contents of a change and you’ll be waiting another 30 seconds or more. Totally unacceptable. Once you’ve used a distributed SCM system like Mercurial or GIT, you cease to have tolerance for any operation which takes longer than 2-3 seconds.
Fortunately, GIT has the ability to check out directly from an SVN repository. The gphoto SVN repository actually contains many separate sub-projects and I didn’t want to import them all into my local GIT repository. This meant I couldn’t make use of the branch / tag tracking support directly and had to do things the long way. The good news is that the long way has already been blogged about and it isn’t hard.
There were two projects I was interested in getting, libgphoto (the main library) & gphoto (the command line frontend), and I wanted each to end up in its own GIT repository. For both, I wanted the trunk and the 2.4.x branch. Starting with gphoto, since it has much less history, the first step was to clone the trunk.
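For example (the sourceforge repository URL here is illustrative):

```shell
$ git svn clone https://gphoto.svn.sourceforge.net/svnroot/gphoto/trunk/gphoto2
$ cd gphoto2
```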
And the local ‘master’ branch is connected to the ‘git-svn’ remote.
$ git branch -a
Anytime further changes are made in the SVN repository, those can be pulled down to the local GIT repository using git svn fetch git-svn. At this point it is possible to add in the branches. Simply edit the .git/config file and add another ‘svn-remote’ entry, this time pointing at the branch path.
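Such an entry would look something like this (the branch path is assumed for illustration):

```
[svn-remote "svn-2.4"]
	url = https://gphoto.svn.sourceforge.net/svnroot/gphoto/branches/gphoto2-2_4
	fetch = :refs/remotes/git-svn-2.4
```

Then fetch from the new remote and create a local branch tracking it:

```shell
$ git svn fetch svn-2.4
$ git checkout -b v2.4 git-svn-2.4
```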
This leaves a local branch ‘v2.4’ and a remote branch ‘git-svn-2.4’
$ git branch -a
That takes care of the gphoto2 frontend command line app codebase. It is then a simple matter to repeat the same thing, substituting libgphoto2 into the SVN paths, to check out the library codebase, though this takes a little longer because it has much, much more history. This little upfront pain to clone the SVN repo to GIT will be paid back many hundreds of times over, thanks to the speed that GIT brings to SCM operations.
The moral of the story is simple: Don’t ever choose subversion. If you have the choice, use GIT. If you don’t have the choice, then mirror SVN to GIT anyway.
Edit: One thing I forgot to mention is that after setting up all the branches, run a git gc on the repo. This will dramatically reduce the disk usage & speed up GIT operations further.
$ du -h -c -s .
$ git gc
Counting objects: 3695, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3663/3663), done.
Writing objects: 100% (3695/3695), done.
Total 3695 (delta 3081), reused 0 (delta 0)
$ du -h -c -s .