Watching the libvirt RPC protocol using SystemTAP

Posted: November 30th, 2011 | Filed under: Fedora, libvirt, Virt Tools

A couple of releases back I completely re-structured all the RPC handling code inside libvirt to make sure it could be properly shared between the client and server, as well as decoupling the RPC handling code from the implementation of the RPC functions. As part of this work I introduced a fairly comprehensive set of DTrace static probe points into the libvirt RPC code. While one could write a WireShark plugin that is able to decode the libvirt RPC protocol (oh look Michal already has written one), that would not be able to examine encrypted libvirt connections – which is pretty much all of them. By using static probes in the libvirt RPC code we can see the RPC messages being sent and received before/after encryption has been applied.

The observant will notice that I said I inserted DTrace static probes, while this blog subject line says SystemTAP. Well, the SystemTAP developers had the good sense to make their userspace probing infrastructure support the DTrace static probe marker syntax. So inserting DTrace static probes into userspace code trivially enables support for both DTrace and SystemTAP. I previously added DTrace probe support to QEMU/KVM and was very happy when Bryan Cantrill told me (at the recent KVM forum) that the DTrace probe support I added to KVM only needed minor build system tweaks to work on Solaris, despite my only ever having tested with Linux + SystemTAP.

Along with adding the DTrace markers to the libvirt RPC code, I also created two SystemTAP tapset files to make it simpler to use the probes from SystemTAP scripts. The first, /usr/share/systemtap/tapset/libvirt_probe.stp, contains the actual probe points, grouped by functional area, while the second, /usr/share/systemtap/tapset/libvirt_functions.stp, contains a bunch of helper functions for converting enum values into human friendly strings. The idea is that instead of seeing “Procedure 53”, a sysadmin would much rather see “Procedure domain_dump_core”. I won’t go into detail about what is in those two files here; instead I’ll just illustrate their use.

Tracing the RPC client

Let’s say we first want to see what messages the client is sending and receiving. There are two interesting probes here, “libvirt.rpc.client_msg_rx” and “libvirt.rpc.client_msg_tx_queue”. The former is triggered when a complete RPC message has been read off the wire, while the latter is triggered when an RPC message is queued for transmission. Ideally we would also have another probe triggered when an RPC message has been completely transmitted – that’s a future todo item. Simple usage of these two probes would be:

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("client=%p len=%d program=%d version=%d procedure=%d type=%d status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
}
probe libvirt.rpc.client_msg_tx_queue {
  printf("client=%p len=%d program=%d version=%d procedure=%d type=%d status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
}
EOF
# stap demo.stp
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=66 type=0 status=0 serial=0
client=0x7f827c3b1010 len=36 program=536903814 version=1 procedure=66 type=1 status=0 serial=0
client=0x7f827c3b1010 len=40 program=536903814 version=1 procedure=1 type=0 status=0 serial=1
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=1 type=1 status=0 serial=1
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=110 type=0 status=0 serial=2
client=0x7f827c3b1010 len=48 program=536903814 version=1 procedure=110 type=1 status=0 serial=2
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=2 type=0 status=0 serial=3
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=2 type=1 status=0 serial=3

The example shows the results of running “virsh domname vm1”. There are 4 RPC calls made here: 66 (authenticate), 1 (open), 110 (get uri) and 2 (close). Each call shows up as a pair of lines, the call itself (type=0) and its reply (type=1).
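Once a trace like this has been saved to a file, a little post-processing makes it easier to see where the traffic goes. The following is only a sketch: summarise_rpc_trace is a hypothetical helper name, and it assumes the raw key=value output format printed by the demo.stp script above.

```shell
# Summarise raw demo.stp output read from stdin: one line per procedure
# number, with the number of messages seen and the total bytes.
# (summarise_rpc_trace is a hypothetical helper, not part of libvirt.)
summarise_rpc_trace() {
  awk '
    {
      proc = ""; len = 0
      # Pick the procedure= and len= fields out of the key=value pairs.
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == "procedure") proc = kv[2]
        if (kv[1] == "len") len = kv[2]
      }
      if (proc != "") { count[proc]++; bytes[proc] += len }
    }
    END {
      for (p in count)
        printf "procedure=%s messages=%d bytes=%d\n", p, count[p], bytes[p]
    }
  ' | sort -t= -k2,2n
}
```

Typical usage would be to run “stap demo.stp > trace.log”, stop it with Ctrl-C once the interesting activity is over, and then run “summarise_rpc_trace < trace.log”.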

Tracing the client with friendly output

Unless you have memorized libvirt RPC enums, this isn’t a very friendly way to trace the code. This is where the aforementioned libvirt_functions.stp tapset comes into play.

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("R client=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
probe libvirt.rpc.client_msg_tx_queue {
  printf("T client=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
EOF
# stap demo.stp
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
R client=0x7f3e3dec0010 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
T client=0x7f3e3dec0010 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
R client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
R client=0x7f3e3dec0010 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
R client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3

Much more friendly !
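Conceptually, the tapset helpers are just lookup tables from an enum value to a name. The shell sketch below illustrates the idea using only the values that appear in the two traces above. The function names are hypothetical, and the real libvirt_functions.stp covers the whole protocol rather than this handful of entries.

```shell
# Map a procedure number to its name. Only the four procedures seen in
# the example traces are listed; anything else is reported as unknown.
rpc_procedure_name() {
  case "$1" in
    1)   echo open ;;
    2)   echo close ;;
    66)  echo auth_list ;;
    110) echo get_uri ;;
    *)   echo "unknown($1)" ;;
  esac
}

# Map a message type number to its name, as seen in the traces above.
rpc_type_name() {
  case "$1" in
    0) echo call ;;
    1) echo reply ;;
    *) echo "unknown($1)" ;;
  esac
}
```

For example, “rpc_procedure_name 110” prints get_uri, which lines up with the third pair of messages in both traces.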

Tracing the server at the same time

It might be desirable to see when the server itself receives the message, independently of when the client transmitted it. There is an identical set of probes available in the server, just replace ‘client’ with ‘server_client’ in the above examples. Thus the demo script can trivially be extended to show server messages at the same time:

# cat >> demo.stp << EOF
probe libvirt.rpc.server_client_msg_rx {
  printf("R server=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
probe libvirt.rpc.server_client_msg_tx_queue {
  printf("T server=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
EOF
# stap demo.stp
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
R server=0x17a2070 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
T server=0x17a2070 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
R client=0x7ff3c4855010 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
T client=0x7ff3c4855010 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
R server=0x17a2070 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
T server=0x17a2070 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
R client=0x7ff3c4855010 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
R server=0x17a2070 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
T server=0x17a2070 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
R client=0x7ff3c4855010 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
R server=0x17a2070 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
T server=0x17a2070 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3
R client=0x7ff3c4855010 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3

If the server is running on a different host than the client, just copy the demo.stp script to the other host and run a second copy there.
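With client and server messages interleaved, a call that never got a reply is easy to miss. As a rough sketch, the helper below reads a saved trace in the friendly format shown above and lists client calls with no matching reply, matching on client pointer plus serial. The name unanswered_calls is hypothetical, and the parsing assumes exactly the key=value layout printed by demo.stp.

```shell
# Print every client call in a friendly demo.stp trace (read from stdin)
# that has no matching reply. Calls are matched on client pointer + serial.
# (unanswered_calls is a hypothetical helper name for this sketch.)
unanswered_calls() {
  awk '
    # Return the value of a key=value field on the current line.
    function field(name,   i, kv) {
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == name) return kv[2]
      }
      return ""
    }
    # Only look at client lines; server lines start with server=...
    $2 ~ /^client=/ {
      key = field("client") " serial=" field("serial")
      if ($1 == "T" && field("type") == "call")
        pending[key] = field("procedure")
      else if ($1 == "R" && field("type") == "reply")
        delete pending[key]
    }
    END {
      for (key in pending)
        printf "client=%s procedure=%s\n", key, pending[key]
    }
  ' | sort
}
```

Run against the trace above it prints nothing, since every call was answered. A line such as “client=0x7ff3c4855010 serial=2 procedure=get_uri” would point at a call still waiting for its reply.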

Further extensions

There are many further improvements that can be made to this script

  • Display a timestamp on each message
  • Associate each server side message with an individual socket
  • Display payload length
  • Display a message when the script is actually ready to run

To simplify life, we are maintaining a nice feature demonstration of the RPC SystemTAP probes in the libvirt GIT repository in the examples/systemtap/rpc-monitor.stp file.

Here is what it can print out

  0.000 begin
  2.632 C + 0x7f1ea57dc010   local=127.0.0.1;0 remote=127.0.0.1;0
  2.632 C > 0x7f1ea57dc010   msg=remote.1.auth_list(call, ok, 0) len=28
  2.632 + S 0x1c1f710        local=127.0.0.1;0 remote=127.0.0.1;0
  2.632 > S 0x1c1f710        msg=remote.1.auth_list(call, ok, 0) len=28
  2.633 < S 0x1c1f710        msg=remote.1.auth_list(reply, ok, 0) len=36
  2.633 C < 0x7f1ea57dc010   msg=remote.1.auth_list(reply, ok, 0) len=36
  2.633 C > 0x7f1ea57dc010   msg=remote.1.open(call, ok, 1) len=40
  2.633 > S 0x1c1f710        msg=remote.1.open(call, ok, 1) len=40
  2.639 < S 0x1c1f710        msg=remote.1.open(reply, ok, 1) len=28
  2.639 C < 0x7f1ea57dc010   msg=remote.1.open(reply, ok, 1) len=28
  2.639 C > 0x7f1ea57dc010   msg=remote.1.get_uri(call, ok, 2) len=28
  2.639 > S 0x1c1f710        msg=remote.1.get_uri(call, ok, 2) len=28
  2.639 < S 0x1c1f710        msg=remote.1.get_uri(reply, ok, 2) len=48
  2.640 C < 0x7f1ea57dc010   msg=remote.1.get_uri(reply, ok, 2) len=48
  2.640 C > 0x7f1ea57dc010   msg=remote.1.domain_lookup_by_id(call, ok, 3) len=32
  2.640 > S 0x1c1f710        msg=remote.1.domain_lookup_by_id(call, ok, 3) len=32
  2.640 < S 0x1c1f710        msg=remote.1.domain_lookup_by_id(reply, error, 3) len=180
  2.641 C < 0x7f1ea57dc010   msg=remote.1.domain_lookup_by_id(reply, error, 3) len=180
  2.641 C > 0x7f1ea57dc010   msg=remote.1.close(call, ok, 4) len=28
  2.641 > S 0x1c1f710        msg=remote.1.close(call, ok, 4) len=28
  2.641 < S 0x1c1f710        msg=remote.1.close(reply, ok, 4) len=28
  2.641 C < 0x7f1ea57dc010   msg=remote.1.close(reply, ok, 4) len=28
  2.641 C - 0x7f1ea57dc010   local= remote=
  2.641 - S 0x1c1f710        local=127.0.0.1;0 remote=127.0.0.1;0

Tracing other areas of libvirt code

The RPC code is not the only place with SystemTAP/DTrace probe markers in libvirt. We have also instrumented our main event loop and provide an examples/systemtap/events.stp demo that prints out info like this

  0.000 begin
  2.359 18185 + handle 1 4 1
  2.360 18185 + handle 2 6 1
  2.360 18185 * handle 2 0
  2.360 14370 > handle 3 1
  2.360 14370 + handle 33 16 1
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 33 1
  2.361 14370 * handle 33 1
  2.361 14370 * handle 33 1
  2.361 14370 * handle 33 3
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 1 1
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 33 2
  2.361 14370 * handle 33 1
  2.361 14370 ~ 7 -1
  2.361 18185 * handle 2 1
  2.362 18185 * handle 2 0

And finally we have instrumented our code which talks to the QEMU monitor, again providing a demo examples/systemtap/qemu-monitor.stp which prints out info like this

  0.000 begin
  3.848 ! 0x7f2dc00017b0 {"timestamp": {"seconds": 1319466931, "microseconds": 187755}, "event": "SHUTDOWN"}
  5.773 > 0x7f2dc0007960 {"execute":"qmp_capabilities","id":"libvirt-1"}
  5.774 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-1"}
  5.774 > 0x7f2dc0007960 {"execute":"query-commands","id":"libvirt-2"}
  5.777 < 0x7f2dc0007960 {"return": [{"name": "quit"}, {"name": ....snip....
  5.777 > 0x7f2dc0007960 {"execute":"query-chardev","id":"libvirt-3"}
  5.778 < 0x7f2dc0007960 {"return": [{"filename": ....snip....
  5.779 > 0x7f2dc0007960 {"execute":"query-cpus","id":"libvirt-4"}
  5.780 < 0x7f2dc0007960 {"return": [{"current": true, "CPU": 0, "pc": 1048560, "halted": false, "thread_id": 13299}], "id": "libvirt-4"}
  5.780 > 0x7f2dc0007960 {"execute":"set_password","arguments":{"protocol":"vnc","password":"123456","connected":"keep"},"id":"libvirt-5"}
  5.782 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-5"}
  5.782 > 0x7f2dc0007960 {"execute":"expire_password","arguments":{"protocol":"vnc","time":"never"},"id":"libvirt-6"}
  5.783 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-6"}
  5.783 > 0x7f2dc0007960 {"execute":"balloon","arguments":{"value":224395264},"id":"libvirt-7"}
  5.785 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-7"}
  5.785 > 0x7f2dc0007960 {"execute":"cont","id":"libvirt-8"}
  5.789 ! 0x7f2dc0007960 {"timestamp": {"seconds": 1319466933, "microseconds": 129980}, "event": "RESUME"}
  5.789 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-8"}
  7.537 ! 0x7f2dc0007960 {"timestamp": {"seconds": 1319466934, "microseconds": 881214}, "event": "SHUTDOWN"}

Conclusion

The introduction of static probes into the libvirt code has been enormously helpful in understanding the operation of libvirt. While we have comprehensive debug logging present in libvirt, it is hard to tailor the output to show the precise data desired. Traditional debuggers like GDB are not very practical when trying to understand the live operation of a heavily multi-threaded system crossing multiple processes, and while strace is useful in some scenarios, it is too low level to be useful in most. SystemTAP userspace probing provides the kind of debugging experience / tool that really suits understanding the complex interactions in a system like libvirt. It is no coincidence that the first set of probes we have written have focused on the libvirt event loop, RPC code and QEMU monitor – three of the areas in libvirt which are both very critical operationally, and exceptionally hard to debug with traditional approaches. We will certainly be expanding our use of static probe markers in systemtap in the future. My real immediate wishlist is for systemtap to get better at providing userspace stack traces, since it fails to provide a useful trace far too often, as compared to GDB.

Update: Mark Wielaard showed me what I had to do to get nice stack traces from SystemTAP. Apparently it is not getting enough memory space to deal with stack traces with its default settings. Telling it to use a little more memory makes it work nicely:

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("client=%p len=%d program=%d version=%d procedure=%d type=%d status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
  print_ustack(ubacktrace())
}
EOF
# stap -DTASK_FINDER_VMA_ENTRY_ITEMS=7680 demo.stp
client=0x7f775cf62010 len=36 program=536903814 version=1 procedure=66 type=1 status=0 serial=0
 0x3c57f0b3dd : virNetClientIOHandleInput+0x87d/0x890 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0b9a0 : virNetClientIOEventLoop+0x5b0/0x630 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0cb23 : virNetClientSend+0x2b3/0x590 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0d47c : virNetClientProgramCall+0x26c/0x8a0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef091e : callWithFD+0xce/0x120 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef099c : call+0x2c/0x40 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57efee80 : doRemoteOpen+0x890/0x20f0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0341b : remoteOpen+0x9b/0x290 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec2133 : do_open+0x1f3/0x1100 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec4616 : virConnectOpenAuth+0x76/0xb0 [/usr/lib64/libvirt.so.0.9.7]
 0x40ceb1 [/usr/bin/virsh+0xceb1/0x40000]
client=0x7f775cf62010 len=28 program=536903814 version=1 procedure=1 type=1 status=0 serial=1
 0x3c57f0b3dd : virNetClientIOHandleInput+0x87d/0x890 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0b9a0 : virNetClientIOEventLoop+0x5b0/0x630 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0cb23 : virNetClientSend+0x2b3/0x590 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0d47c : virNetClientProgramCall+0x26c/0x8a0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef091e : callWithFD+0xce/0x120 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef099c : call+0x2c/0x40 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57eff57a : doRemoteOpen+0xf8a/0x20f0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0341b : remoteOpen+0x9b/0x290 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec2133 : do_open+0x1f3/0x1100 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec4616 : virConnectOpenAuth+0x76/0xb0 [/usr/lib64/libvirt.so.0.9.7]
 0x40ceb1 [/usr/bin/virsh+0xceb1/0x40000]
....

This makes me very happy :-)

Announce: Release of Entangle v0.3.0 – An app for tethered camera control & capture

Posted: November 29th, 2011 | Filed under: Entangle, Fedora, Photography

Over a year has gone by since I released Entangle 0.2.0, so I am very pleased to be able to announce that I have now just released Entangle 0.3.0, available from the usual download place which also contains the concise list of changes.

There have in fact been a great many changes in this release, but many of them will not be immediately obvious from looking at the updated screenshots.

First of all, there has been a big effort to port to the latest best practice desktop application libraries

  • Port Gtk2 to GTK3. To enable use of many of the new desktop features, in particular, libpeas, Entangle now targets Gtk3 instead of Gtk2
  • Port LibGlade to GtkBuilder. With the use of Gtk3 there is no longer any point in using the external Glade library for UI building. Instead Entangle now uses the GtkBuilder infrastructure that is part of regular Gtk
  • Port GConf to GSettings. With the use of Gtk3, a newer Glib2 is required, which in turn brings in the GSettings APIs. With these present, there is no longer any point in using the external GConf library
  • Port Unique/StartupNotifications to GApplication. Again, since a newer Glib2 is required, it is possible to take advantage of the GApplication APIs, to avoid using the external Unique/StartupNotification libraries.

Next up was a major internal rewrite of the way the UI handles camera operations. In previous releases, most operations were handed off to a camera scheduler thread. The design of this was overly complicated and not friendly to future extension. Having recently gained experience with the way asynchronous operations are done by the GIO library, I decided that this would be an effective approach for Entangle. So all the internal thread scheduling code was ripped out, and the GIO style asynchronous APIs were added in its place. This work was the major blocker that made 0.3.0 take so long to release. Now that it is all complete, I am very pleased with the way it has turned out. The code for dealing with the camera is so much simpler & more flexible at the same time.

With that out of the way, there are the general user visible feature improvements

  • Config refresh. For Nikon cameras, Entangle automatically updates the UI whenever any camera configuration setting changes
  • Continuous monitoring. Instead of having to explicitly start/stop monitoring, Entangle now monitors the camera for new images at all times, and auto-downloads them as they are captured
  • Continuous preview mode. Previously preview would be stopped after an image was captured. Now it is possible to capture many images in sequence, while remaining in preview mode
  • Folder preserved. Previously when connecting to a camera, the session folder would be reset to a default location based on camera model name. Now Entangle simply always remembers the user’s last folder
  • Avoid delete after download. The default behaviour is to delete files from the camera after downloading. This can now be disabled, to allow images to remain on the memory card.
  • Config UI improvements. The UI for displaying camera settings has been improved & simplified. The “Other PTP properties” panel has been removed to improve UI performance. The ‘Camera Status’ panel now just uses labels, instead of readonly text fields for a more compact display.
  • Image metadata summary. When moving the mouse over the Entangle window, a summary of the image metadata (extracted with GExiv2) will be displayed, showing the aperture, focal length, shutter speed, ISO and resolution.

With such large changes in the basic infrastructure, there are bound to be new weird bugs introduced, but overall this release should be a good foundation for ongoing incremental development of Entangle.

GNOME-3 desktop virtualization support from GNOME Boxes (and the future for virt-manager)

Posted: November 22nd, 2011 | Filed under: Fedora, libvirt, Virt Tools

For many years now, the virt-manager application has been the primary open source tool for managing virtual machines under libvirt for Fedora/Linux hosts, attempting to satisfy both server and desktop users alike, with the result that often neither userbase were really too happy. The decision to use libvirt as the foundation of OpenStack, OpenNebula and various other cloud projects has been a great validation of libvirt’s capabilities. More recently, the open sourcing of the RHEV-M product to create the oVirt community project, has seen another step forward for open source data center virtualization management based upon libvirt. Finally, with today’s very first release of GNOME Boxes, the same step forward is also happening for Linux desktop virtualization. No longer will desktop virtualization (or remote desktop access) feel like an afterthought, but rather it will be a seamless part of the GNOME-3 desktop experience. This is coming to a Fedora release near you soon….the target is Fedora 17.

What does this mean for virt-manager you might wonder? Well first of all let me reassure people that virt-manager isn’t going away anytime in the foreseeable future. There will always be people who prefer straightforward, directly controllable applications which do not try to impose clever policies on their usage. virt-manager, virsh, virt-install, etc all fill this gap and we don’t want to take that control away from people. With the growth in usage of OpenStack for cloud, oVirt for data center management, and GNOME Boxes for desktop virtualization, I think it is clear though, that virt-manager will have a diminished role / userbase in the future. I don’t consider this to be a bad thing, on the contrary, it shows just how strong & diverse the open source virtualization community has become. Where once there was only virt-manager, today we have a wide choice of applications providing highly effective virtualization solutions targeted towards the needs of their respective userbases.

Introducing the libvirt-glib, a mapping of the libvirt API and XML to GLib/GObject

Posted: November 22nd, 2011 | Filed under: Fedora, libvirt, Virt Tools

The historical philosophy of libvirt is for all our core libraries to be written in C and then create bindings to other programming languages or mappings to alternative object models. Thus far we have bindings to Python, Perl, Ruby, OCaml, PHP, C#, Java and mappings to the QMF (Matahari), CIM and SNMP object models. The virt-install and virt-manager applications use the python binding to libvirt, but even very early in development of virt-manager it was clear that the libvirt python API is not a natural fit for an application using GTK, since it does not integrate with GObject and in particular GObject signals. Thus virt-manager wraps the libvirt python objects to create real GObjects it then works with. This has been quite successful, but because all the virt-manager code is in python, other applications have not been able to take advantage of the higher level libvirt API virt-manager has evolved. In addition the virt-install code (which is called internally by virt-manager) contains a set of Python objects which represent the various libvirt XML schemas as plain old objects with properties and setters/getters. If you’ve developed applications against libvirt, you’ll likely appreciate just how useful such an API would be. Again though, because the API is in Python and (technically) internal to the virt-install codebase, it is not accessible to many other applications.

There was clearly space for an independent library mapping the libvirt API and XML schemas to GObject, which could then be used by any application. The task of creating a libvirt GObject library API is large enough, without considering the task of also ensuring it is accessible from all the non-C programming languages. Fortunately, with the release of GNOME-3,  GObject introspection has now matured to the point where it can really be used in anger for real application development. The upshot is that it is now feasible to attempt development of a proper libvirt GObject API.

The libvirt-glib package is the result, and it actually contains three related libraries

  • libvirt-glib – non-object based glue code between GLib and libvirt. In particular this has APIs to convert libvirt virErrorPtr instances into GError instances, and provides an implementation of the libvirt event loop contract, using the GLib GMain APIs.
  • libvirt-gconfig – object based APIs which map libvirt XML documents/schemas into GObject classes. This library explicitly has no direct link to the libvirt API, solely concerning itself with XML management. This is to allow use of libvirt-gconfig from applications which are using one of the object mappings like QMF/CIM/SNMP, instead of the direct libvirt API. This is where the current virt-install XML handling objects will be replicated
  • libvirt-gobject – object based APIs which map libvirt types and APIs into GObject classes. This library depends on libvirt-glib and libvirt-gconfig, and is where the current virt-manager object mapping APIs will be replicated. This library is also adopting the GIO paradigm for allowing asynchronous API invocation & completion, for long running applications. This eliminates much of the need for applications to explicitly use threads (thread usage is hidden behind the async API impl).

From day 1, all the APIs are being developed with GObject introspection in mind, so all methods are fully annotated, and we are generating the glue layer for Vala bindings as standard in order to support the GNOME Boxes application. It is still very early days for development: very little of the libvirt API has been mapped into GObject thus far, and work is only just starting on the XML object mappings. The overall target, however, is to develop the library to the state where it can support the aforementioned GNOME Boxes application in Fedora 17, as well as an application sandbox framework I am developing for Fedora 17 (more on that in a later blog post).

For more information

Live yum upgrading from Fedora 15 to Fedora 16

Posted: November 11th, 2011 | Filed under: Fedora

With Fedora 16 out, I have to perform upgrades on my Fedora machines, two laptops, one mac mini and five servers. Officially the only supported way to upgrade Fedora is to use the regular Anaconda installer, or use pre-upgrade. In the years I’ve been using Fedora (since Fedora Core 5), I have never used Anaconda for upgrades because I rarely have direct access to the machine to boot CDs/DVDs, and whenever I have tried pre-upgrade it always fails. This time it failed on one server because /boot was on software-RAID, and failed on the other two servers because /boot was too small. So instead I’ve always ended up doing live upgrades using yum directly. The first few times I did this, there were some surprises, but in the end I’ve settled into a recipe that has a high success rate.

I’m NOT encouraging people to follow my approach, but if you have ended up in a situation where a live yum upgrade is your only option, these notes might help you avoid some pitfalls.

  • Tip 1: Only attempt to upgrade 1 release at a time, don’t try to skip over releases. This reduces the chances of problems with yum calculating a suitable upgrade path
  • Tip 2: Avoid having any 3rd party repositories configured with yum, unless you know they already support both the current and target distro release. In practice this means you want to avoid anything except official Fedora, RPM Fusion and Livna repositories.

The actual upgrade process I follow is this:

  • Step 1: Ensure the current install is fully updated.
    # yum -y update
  • Step 2: Remove any orphaned packages, ie locally installed packages which are not present in any of your active YUM repositories. Orphaned packages are the most common cause for unresolvable RPM dependencies during upgrade. If you choose to skip this step be aware that you will almost certainly need to revisit it, if yum fails to resolve an upgrade path.
    # package-cleanup --orphans
    # rpm -e ...for each orphan you want to remove...
  • Step 3: Purge all YUM cached data about current repos. This just frees up space by removing cached data files that won’t be needed anymore
    # yum clean all
  • Step 4: Install the fedora-release, fedora-release-rawhide and fedora-release-notes RPMs for the new release.
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-16-1.noarch.rpm
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-notes-16.1.0-1.fc16.noarch.rpm
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-rawhide-16-1.noarch.rpm
    # rpm -Uvh fedora-release*16*.rpm
  • Step 5: Perform the actual upgrade.
    # yum update

Sometimes even after removing all orphan packages and 3rd party packages, the last step will still fail with unresolvable dependencies. This can happen if an RPM was dropped from the latest Fedora and thus has no upgrade path. When this happens, it is usually sufficient to just remove the obsolete package and retry the upgrade.

One significant change between Fedora 15 and 16 that is important for the upgrade process, is the switch from Grub1 to Grub2. While the live YUM upgrade will result in grub2 getting installed, it does not update your bootsector. So there are two post-upgrade tasks that must be done before reboot.

  • Generate the master grub2 config file
    # grub2-mkconfig > /etc/grub2.cfg
  • Install grub2 into the boot sector
    # grub2-install /dev/sda

    As mentioned earlier, on one of the machines I had software RAID configured, so I wanted grub2 installed on both disks

    # grub2-install /dev/sda
    # grub2-install /dev/sdb
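With more than a couple of disks, a small loop saves typing and makes it harder to forget one. This is only a sketch: install_grub2_on and the DRY_RUN variable are conventions invented here, not grub2 features, and the disk names are just the ones from the example above.

```shell
# Run grub2-install for each disk given as an argument, stopping at the
# first failure. Set DRY_RUN=1 to print the commands instead of running
# them (DRY_RUN is a convention of this sketch, not a grub2 option).
install_grub2_on() {
  for disk in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "grub2-install $disk"
    else
      grub2-install "$disk" || return 1
    fi
  done
}
```

Running it first with DRY_RUN=1, for example “DRY_RUN=1 install_grub2_on /dev/sda /dev/sdb”, shows exactly which commands would be executed before anything touches a boot sector.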

After rebooting into the new kernel, I once again run ‘package-cleanup --orphans’ to find any obsolete packages which can be killed off.

That’s all there was to it. Where I have followed these instructions I have a 100% success rate in live yum upgrades. The only upgrade which failed was the one where I forgot to generate the grub2 config file before rebooting :-)