Visualizing libvirt development history using gource and code swarm

Posted: January 16th, 2010 | Filed under: libvirt, Virt Tools | Tags: | 3 Comments »

Michael DeHaan yesterday posted an example using gource to visualize Cobbler development history. Development on Cobbler started in April 2006, making it a similar vintage to libvirt whose development started in November 2005. So I thought it would be interesting to produce a visualization of libvirt development as a comparison.

Head over to the YouTube page for this video if it doesn’t show the option watch in highdef in this embedded viewer. HD makes it much easier to make out the names

Until July last year, libvirt was using CVS for source control. Among a great many disadvantages of CVS is that it does not track author attribution at all, so the first 3 & 1/2 years show an inaccurately small contributor base. Watching the video it is clear when the switch to GIT happened as the number of authors explodes. Even with the inaccuracies from the CVS history, it is clear from the video just how much development of libvirt has been expanding over the past 4 years, particularly with the expansion to cover VirtualBox and VMWare ESX server as hypervisor targets. This video was generated on Fedora 12 using

 # gource -s 0.07 --auto-skip-seconds 0.1 \
          --file-idle-time 500 --disable-progress \
          --output-framerate 25 --highlight-all-users \
          -1280x720 --stop-at-end --output-ppm-stream - \
 | ffmpeg -y -b 15000K -r 17 -f image2pipe -vcodec ppm \
          -i - -vcodec mpeg4 libvirt-2010-01-15-alt.mp4

gource isn’t the only source code visualization application around. Last year a project called code swarm came along too. It has a rather different & simpler physics model to gource, not showing the directory structure explicitly. As a comparison I produced a visualization of libvirt using code_swarm too:

Head over to the YouTube page for this video if it doesn’t show the option to watch in highdef in this embedded viewer. HD makes it much easier to make out the names

In this video the libvirt files are coloured into four groups, source code, test cases, documentation and i18n data (ie translated .po files). Each coloured dot represents a file, and each developer exerts a gravitional pull on files they have modified. For the years in which libvirt used CVS there were just a handful of developers who committed changes.This results in a visualization where developers have largely overlapping spheres of influence on files. In the last 6 months with GIT, changes have correct author attribution, so the visualization spreads out more accurately reflecting who is changing what. In the end, I think I rather prefer gource’s results because it has a less abstract view of the source tree and better illustrates the rate of change over time.

Finally, can anyone recommend a reliable online video hosting service that’s using HTML5 + Ogg Theora yet ? I can easily encode these videos in Ogg Theora, but don’t want to host the 200 MB files on my own webserver since it doesn’t have the bandwidth to cope.

Using GObject Introspection + Gjs to provide a JavaScript plugin engine

Posted: January 10th, 2010 | Filed under: Entangle | 8 Comments »

In writing the Capa photo capture application, one of the things I wanted to support was some form of plugin engine to allow 3rd parties to easily extend its functionality. The core application code itself is designed to have a formal separation of backend and frontend logic. The backend is focused on providing the core object model & operation, typically wrapping external libraries like HAL, libgphoto, lcms in GObject classes, with no use of GTK allowed here. The primary frontend builds on this backend, to produce a GTK based user interface. It is also intended to build another frontend that provides a GIMP plugin.

Back to the question of plugins for the main frontend. If the goal is to allow people to easily write extensions, a plugin engine based on writing C code is not really very desirable. Firefox uses JavaScript for its plugin engine and this has been hugely successful in lowering the bar for contributors. Wouldn’t it be nice if any GTK application could provide a JavaScript plugin engine ? Yes, indeed and thanks to the recent development of GObject introspection this is incredibly easy.

GObject introspection provides a means to query the GObject type system and discover all classes, interfaces, methods, properties, signals, all data types associated with their parameters and any calling conventions. This is an incredibly powerful capability with far reaching implications, the most important being that you will never again have to write a language binding for any GObject based library. There is enough metadata available in the GObject introspection system to provide language bindings in a 100% automated fashion. Notice I said “provide”, rather than “generate” because if targetting a dynamic language (Perl, Python JavaScript) it won’t even be necessary to auto-generate code ahead of time – everything can and will happen at runtime based on the introspection data. Say goodbye to hand written language bindings. Say goodbye to Swig. Say goodbye to any other home grown code generators.

Adding support for introspection

That’s the sales pitch, how about the reality ? The Capa code is based on GObject and was thus ready & willing to be introspected. The first step in adding introspection support is to add some m4 magic to the configure.ac to look for the introspection tools & library. This is simple boilerplate code that will be identical for every application using GObject + autoconf

GOBJECT_INTROSPECTION_REQUIRED=0.6.2
AC_SUBST(GOBJECT_INTROSPECTION_REQUIRED)

AC_ARG_ENABLE([introspection],
        AS_HELP_STRING([--enable-introspection], [enable GObject introspection]),
        [], [enable_introspection=check])

if test "x$enable_introspection" != "xno" ; then
        PKG_CHECK_MODULES([GOBJECT_INTROSPECTION],
                          [gobject-introspection-1.0 >= $GOBJECT_INTROSPECTION_REQUIRED],
                          [enable_introspection=yes],
                          [
                             if test "x$enable_introspection" = "xcheck"; then
                               enable_introspection=no
                             else
                               AC_MSG_ERROR([gobject-introspection is not available])
                             fi
                          ])
        if test "x$enable_introspection" = "xyes" ; then
          AC_DEFINE([WITH_GOBJECT_INTROSPECTION], [1], [enable GObject introspection support])
          AC_SUBST(GOBJECT_INTROSPECTION_CFLAGS)
          AC_SUBST(GOBJECT_INTROSPECTION_LIBS)
          AC_SUBST([G_IR_SCANNER], [$($PKG_CONFIG --variable=g_ir_scanner gobject-introspection-1.0)])
          AC_SUBST([G_IR_COMPILER], [$($PKG_CONFIG --variable=g_ir_compiler gobject-introspection-1.0)])
        fi
fi
AM_CONDITIONAL([WITH_GOBJECT_INTROSPECTION], [test "x$enable_introspection" = "xyes"])

The next step is to add Makefile.am rules to extract the introspection data. This is a two step process, the first step runs g-ir-scanner across all the source code and the actual compiled binary / library to generate a .gir file. This is an XML representation of the introspection data. The second step runs g-ir-compiler to turn the XML data into a machine usable binary format so it can be efficiently accessed. When running g-ir-scanner on a binary, as opposed to a library, it is necessary for that binary to support an extra command line flag called --introspect-dump. I add this code the main.c source file to support that

#if WITH_GOBJECT_INTROSPECTION
    static gchar *introspect = NULL;
#endif

    static const GOptionEntry entries[] = {
        ...snip other options...
#if WITH_GOBJECT_INTROSPECTION
        { "introspect-dump", 'i', 0, G_OPTION_ARG_STRING, &introspect;, "Dump introspection data", NULL },
#endif
        { NULL, 0, 0, 0, NULL, NULL, NULL },
    };

    ...parse command line args...

#if WITH_GOBJECT_INTROSPECTION
    if (introspect) {
        g_irepository_dump(introspect, NULL);
        return 0;
    }
#endif

Back to the Makefile.am rules. g-ir-scanner has quite a few arguments you need to set. The --include args provide the names of introspection metadata files for any libraries depended on. The -I args provide the CPP include paths to the application’s header files. The --pkg args provide the names of any pkg-config files that code builds against. There are a few others too which I won’t cover – they’re all in the man page. The upshot is that the Makefile.am gained rules

if WITH_GOBJECT_INTROSPECTION
Capa-0.1.gir: capa $(G_IR_SCANNER) Makefile.am
        $(G_IR_SCANNER) -v \
                --namespace Capa \
                --nsversion 0.1 \
                --include GObject-2.0 \
                --include Gtk-2.0 \
                --include GPhoto-2.0 \
                --program=$(builddir)/capa \
                --add-include-path=$(srcdir) \
                --add-include-path=$(builddir) \
                --output $@ \
                -I$(srcdir)/backend \
                -I$(srcdir)/frontend \
                --verbose \
                --pkg=glib-2.0 \
                --pkg=gthread-2.0 \
                --pkg=gdk-pixbuf-2.0 \
                --pkg=gobject-2.0 \
                --pkg=gtk+-2.0 \
                --pkg=libgphoto2 \
                --pkg=libglade-2.0 \
                --pkg=hal \
                --pkg=dbus-glib-1 \
                $(libcapa_backend_la_SOURCES:%=$(srcdir)/%) \
                $(libcapa_frontend_la_SOURCES:%=$(srcdir)/%) \
                $(capa_SOURCES:%=$(srcdir)/%)

girdir = $(datadir)/gir-1.0
gir_DATA = Capa-0.1.gir

typelibsdir = $(libdir)/girepository-1.0
typelibs_DATA = Capa-0.1.typelib

%.typelib: %.gir
        g-ir-compiler \
                --includedir=$(srcdir) \
                --includedir=$(builddir) \
                -o $@ $<

CLEANFILES += Capa-0.1.gir $(typelibs_DATA)

endif # WITH_GOBJECT_INTROSPECTION

After making those changes & rebuilding, it is wise to check the .gir file, since the g-ir-scanner doesn't always get everything correct. It may be necessary to provide annotations in the source files to help it out. For example, it got object ownership wrong on some getters, requiring annotations n the return values such as

/**
 * capa_app_get_plugin_manager: Retrieve the plugin manager
 *
 * Returns: (transfer none): the plugin manager
 */

The final step was add rules to the RPM specfile, which are fairly self-explanatory

%define with_introspection 0

%if 0%{?fedora} >= 12
%define with_introspection 1
%endif
%if 0%{?rhel} >= 6
%define with_introspection 1
%endif

%if %{with_introspection}
BuildRequires: gobject-introspection-devel
BuildRequires: gir-repository-devel
%endif


%prep
....
%if %{with_introspection}
%define introspection_arg --enable-introspection
%else
%define introspection_arg --disable-introspection
%endif

%configure %{introspection_arg}

%files
....
%if %{with_introspection}
%{_datadir}/gir-1.0/Capa-0.1.gir
%{_libdir}/girepository-1.0/Capa-0.1.typelib
%endif

That is all. The entire API is now accessible from Perl, JavaScript, Python without ever having written a line of code for those languages. It is also possible to generate a .jar file to make it accessible from Java.

Adding support for a JavaScript plugin engine

Since the API is now accessible from JavaScript, adding a JavaScript plugin engine ought to be easy at this point. There are in fact 2 competing JavaScript engines supporting GObject introspection, Gjs and Seed. Seed looks more advanced, documented & polished, but Gjs was what's in Fedora currently, so I used that. Again the first step was checking for it in configure.ac

AC_ARG_WITH([javascript],
      AS_HELP_STRING([--with-javascript],[enable JavaScript plugins]),
      [], [with_javascript=check])

if test "x$with_javascript" != "xno" ; then
  if test "x$enable_introspection" = "xno" ; then
    if test "x$with_javascript" = "xyes"; then
      AC_MSG_ERROR([gobject-introspection is requird for javascript plugins])
    fi
  fi

  PKG_CHECK_MODULES(GJS, gjs-1.0 >= $GJS_REQUIRED)
  AC_SUBST(GJS_CFLAGS)
  AC_SUBST(GJS_LIBS)

  PKG_CHECK_MODULES(GJS_GI, gjs-gi-1.0 >= $GJS_REQUIRED)
  AC_SUBST(GJS_GI_CFLAGS)
  AC_SUBST(GJS_GI_LIBS)

  with_javascript=yes
  AC_DEFINE([WITH_JAVASCRIPT], [1], [enable JavaScript plugins])
fi
AM_CONDITIONAL([WITH_JAVASCRIPT], [test "x$with_javascript" = "xyes"])

I won't go into any details on the way Capa scans for plugins (it uses $HOME/.local/share/capa/plugins//main.js), merely illustrate how to execute a plugin once it has been located. The important object in the Gjs API is GjsContext, providing the execution context for the javascript code. It is possible to have multiple contexts, so each plugin is independent and potentially able to be sandboxed. The JavaScript file to be invoked is main.js in the plugin's base directory. The first step is to setup the context's search path to point to the plugin base directory:

void runplugin(const gchar *plugindir) {
    const gchar *searchpath[2];
    GjsContext *context;

    searchpath[0] = plugindir;
    searchpath[1] = NULL;

    context = gjs_context_new_with_search_path((gchar **)searchpath);

The context is now ready to execute some javascript code. The Capa plugin system expects the main.js file to contain a method called activate. To start the plugin, we can thus simply evaluate const Main = imports.main; Main.activate();

   const gchar *script = "const Main = imports.main; Main.activate();";

   gjs_context_eval(context,
                     script,
                     -1,
                     "main.js",
                     &status;,
                     NULL);

   if (status !=0) {
     fprintf(stderr, "Loading plugin failed\n");
   }

Presto, you now have a javascript plugin running, having written no JavaScript at any point in the process. There is one slight issue in this though - how does the plugin get access to the application instance ? One way would be to provide a static method in your API to get hold of the application's main object, but I really wanted to pass the object into the plugin's activate method. This is where I hit Gjs's limitations - there appears to be no official API to set any global variable except for ARGV. After much poking around in the Gjs code though I discovered an exported method, which wasn't in the header files

JSContext* gjs_context_get_context(GjsContext *js_context);

And decided to (temporarily) abuse that until a better way could be found. I have an object instance of the CapaApp class which I wanted to pass into the activate method. The first step was to set this in the global namespace of the script being evaluated. Gjs comes with an API for converting a GObject instance into a JSObject instance which the runtime needs. Thus I wrote a simple helper

static void set_global(GjsContext *context,
                       const char *name,
                       GObject *value)
{
    JSContext *jscontext;
    JSObject *jsglobal;
    JSObject *jsvalue;

    jscontext = gjs_context_get_context(context);
    jsglobal = JS_GetGlobalObject(jscontext);
    JS_EnterLocalRootScope(jscontext);
    jsvalue = gjs_object_from_g_object(jscontext, value);
    JS_DefineProperty(jscontext, jsglobal,
                      name, OBJECT_TO_JSVAL(jsvalue),
                      NULL, NULL,
                      JSPROP_READONLY | JSPROP_PERMANENT);
    JS_LeaveLocalRootScope(jscontext);
}

There was one little surprise in this though. The gjs_object_from_g_object method will only succeed if the current Gjs context has the introspection data for that object loaded. So it was necessary to import my application's introspection data by eval'ing const Capa = imports.gi.Capa. That done, it was now possible to pass variables into the plugin. The complete revised plugin loading code looks like

void runplugin(CapaApp *application, const gchar *plugindir) {
    const gchar *script = "const Main = imports.main; Main.activate(app);";
    const gchar *searchpath[2];
    GjsContext *context;

    searchpath[0] = plugindir;
    searchpath[1] = NULL;

    context = gjs_context_new_with_search_path((gchar **)searchpath);

    gjs_context_eval(context,
                     "const Capa = imports.gi.Capa",
                     -1,
                     "dummy.js",
                     &status;,
                     NULL);

    set_global(context, plugin, "app", application);

    gjs_context_eval(context,
                     script,
                     -1,
                     "main.js",
                     &status;,
                     NULL);

    if (status !=0) {
      fprintf(stderr, "Loading plugin failed\n");
    }

This code is slightly simplified, omitting error handling, for purposes of this blog post, but the real thing is not much harder. Looking at the code again, there is really very little (if anything) about the code which is specific to my application. It would be quite easy to pull out the code which finds & loads plugins into a library (eg "libgplugin"). This would make it possible for any existing GTK applications to be retrofitted with support plugins simply by generating introspection data for their internal APIs, and then instantiating a "PluginManager" object instance.

In summary, GObject Introspection is an incredibly compelling addition to GLib. With a mere handful of additions to configure.ac and Makefile.am, it completely solves "language bindings" problem for you. I'd go as far as to say that this is a single most compelling reason to write any new C libraries using GLib/GObject. Furthermore if there are existing C libraries not using GObject, then provide a GObject wrapper for them as a top priority. Don't ever write or auto-generate a language binding again. Writing GTK applications either entirely in JavaScript, or in a mix of C + JavaScript plugins is also a really nice development, avoiding the issue of "clashing runtime environments" seen when using Python + GTK. The Gjs/Seed/GObject developers deserve warm praise for these great enhancements.

Following gphoto SVN development with GIT

Posted: January 10th, 2010 | Filed under: Entangle | No Comments »

Since I started developing the Capa photo capture application, I’ve been following development of gphoto much more closely. Unfortunately gphoto is using subversion for source control. There are many things wrong with subversion in comparison to modern SCM systems like Mercurial or GIT. In this particular case though, the main problem is speed, or lack thereof. gphoto uses sourceforge as its hosting service and sf.net subversion servers are slower than you can possibly imagine. As an example, run ‘svn log’ to browse changes and you’ll be waiting 30 seconds for it to even start to give you an answer. Then run ‘svn diff’ to look at the contents of a change and you’ll be waiting another 30 seconds or more. Totally unacceptable. Once you’ve used a distributed SCM system like Mercurial or GIT, you cease to have tolerance for any operations which take longer than a 2-3 seconds.

Fortunately, GIT has the ability to checkout directly from SVN repository. The gphoto SVN repository actually contains many separate sub-projects in it and I didn’t want to import them all to my local GIT repository. This meant I couldn’t make use of the branch / tag tracking support directly and had todo things the long way. The good news is that the long way has already been blogged about and it isn’t hard.

There were two projects I was interested in getting, libgphoto (the main library) & gphoto (the command line frontend) and I wanted each to end up in their own GIT repository. For both, I wanted the trunk and 2.4.x branch. Starting with gphoto, since it has much less history, the first step was to clone the trunk

# git svn clone https://gphoto.svn.sourceforge.net/svnroot/gphoto/trunk/gphoto2 gphoto2

This takes a fairly long time because it pulls down every single SVN changeset in the repository. Once that’s complete though, the .git/config contains

[svn-remote "svn"]
        url = https://gphoto.svn.sourceforge.net/svnroot/gphoto/trunk/gphoto2
        fetch = :refs/remotes/git-svn

And the local ‘master’ branch is connected to the ‘git-svn’ remote.

$ git branch -a
* master
  remotes/git-svn

Anytime further changes are made in the SVN repository, those can be pulled down to the local GIT repository using git svn fetch git-svn. At this point it is possible to add in the branches. Simply edit the .git/config file and add another ‘svn-remote’ entry, this time pointing at the branch path.

[svn-remote "svn24"]
        url = https://gphoto.svn.sourceforge.net/svnroot/gphoto/branches/libgphoto2-2_4/gphoto2
        fetch = :refs/remotes/git-svn-2.4

And then pull down all the changes for that branch, and create a local branch for this

# git svn fetch svn24
# git checkout -b v2.4 git-svn-2.4

This leaves a local branch ‘v2.4’ and a remote branch ‘git-svn-2.4’

$ git branch -a
  master
* v2.4
  remotes/git-svn
  remotes/git-svn-2.4

That takes care of the gphoto2 frontend command line app codebase. It is then a simply matter to repeat the same thing substituting libgphoto2 into the SVN paths to checkout the library codebase. Though this takes a little longer because it has much much more history. This little upfront pain to clone the SVN repo to GIT will be paid back many hundreds of times over thanks to the speed that GIT brings to SCM operation.

The moral of the story is simple: Don’t ever choose subversion. If you have the choice, use GIT. If you don’t have the choice, then mirror SVN to GIT anyway.

Edit: One thing I forgot to mention is that after setting up all branches, run a git gc on the repo. This will dramatically reduce the disk usage & speed up GIT operations further

$ du -h -c -s .
45M .
45M total
$ git gc
Counting objects: 3695, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3663/3663), done.
Writing objects: 100% (3695/3695), done.
Total 3695 (delta 3081), reused 0 (delta 0)
$ du -h -c -s .
5.0M .
5.0M total

Going from 45 MB to 5 MB is quite impressive !