Watching the libvirt RPC protocol using SystemTAP

Posted: November 30th, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , , | 1 Comment »

A couple of releases back I completely re-structured all the RPC handling code inside libvirt to make sure it could be properly shared between the client and server, as well as decoupling the RPC handling code from the implementation of the RPC functions. As part of this work I introduced a fairly comprehensive set of DTrace static probe points into the libvirt RPC code. While one could write a WireShark plugin that is able to decode the libvirt RPC protocol (oh look Michal already has written one), that would not be able to examine encrypted libvirt connections – which is pretty much all of them. By using static probes in the libvirt RPC code we can see the RPC messages being sent and received before/after encryption has been applied.

The observant will notice that I said I inserted DTrace static probes, while this blog subject line says SystemTAP. Well the SystemTAP developers had the good sense to make their userspace probing infrastructure support the DTrace static probe marker syntax. So inserting DTrace static probes into userspace code, trivially enables support for both DTrace and SystemTAP. I previously added DTrace probe support to QEMU/KVM and was very happy when Bryan Cantrill told me (at the recent KVM forum) that the DTrace probe support I added to KVM only needed minor build system tweaks to work on Solaris, despite my only ever having tested with Linux + SystemTAP.

Along with adding the DTrace markers to the libvirt RPC code, I also created two SystemTAP tapset files to make it simpler to use the probes from SystemTAP scripts. The first, /usr/share/systemtap/tapset/libvirt_probe.stp, contains the actual probe points, grouped by functional area, while the second, /usr/share/systemtap/tapset/libvirt_functions.stp, contains a bunch of helper functions for converting enum values into human friendly strings. The idea is that instead of seeing “Procedure 53”, a sysadmin would much rather see “Procedure domain_dump_core”. I won’t go into detail about what is in those two files here, instead I’ll just illustrate their use

Tracing the RPC client

Lets says we first want to see what messages the client is sending and receiving. There are two interesting probes here, “libvirt.rpc.server_client_msg_rx” and “libvirt.rpc.server_client_msg_tx_queue“. The former is triggered when a complete RPC message has been read off the wire, while the latter is triggered when an RPC message is queued for transmission. Ideally we would also have another probe triggered when an RPC message has been completely transmitted – that’s a future todo item. Simple usage of these two probes would be

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("client=%p len=%d program=%d version=%d procedure=%d type=%d status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
}
probe libvirt.rpc.client_msg_tx_queue {
  printf("client=%p len=%d program=%s version=%d procedure=%s type=%s status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
}
EOF
# stap demo.stp
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=66 type=0 status=0 serial=0
client=0x7f827c3b1010 len=36 program=536903814 version=1 procedure=66 type=1 status=0 serial=0
client=0x7f827c3b1010 len=40 program=536903814 version=1 procedure=1 type=0 status=0 serial=1
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=1 type=1 status=0 serial=1
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=110 type=0 status=0 serial=2
client=0x7f827c3b1010 len=48 program=536903814 version=1 procedure=110 type=1 status=0 serial=2
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=2 type=0 status=0 serial=3
client=0x7f827c3b1010 len=28 program=536903814 version=1 procedure=2 type=1 status=0 serial=3

The example shows the results of running “virsh domname vm1”. There are 4 RPC calls made here, 66 (authenticate), 1 (open), 110 (get uri), 2 (close).

Tracing the client with friendly output

Unless you have memorized libvirt RPC enums, this isn’t a very friendly way to trace the code. This is where the aforementioned libvirt_functions.stp tapset comes into play.

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("R client=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
probe libvirt.rpc.client_msg_tx_queue {
  printf("T client=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
EOF
# stap demo.stp
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
R client=0x7f3e3dec0010 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
T client=0x7f3e3dec0010 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
R client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
R client=0x7f3e3dec0010 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
T client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
R client=0x7f3e3dec0010 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3

Much more friendly !

Tracing the server at the same time

It might desirable to see when the server itself receives the message, independently of when the client transmitted it. There are an identical set of probes available in the server, just replace ‘client’ with ‘server_client’ in the above examples. Thus the demo script can trivially be extended to show server messages at the same time:

# cat >> demo.stp << EOF
probe libvirt.rpc.server_client_msg_rx {
  printf("R server=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
probe libvirt.rpc.server_client_msg_tx_queue {
  printf("T server=%p len=%d program=%s version=%d procedure=%s type=%s status=%s serial=%d\n",
         client, len,
         libvirt_rpc_program_name(prog, 0),
         vers,
         libvirt_rpc_procedure_name(prog, vers, proc, 0),
         libvirt_rpc_type_name(type, 0),
         libvirt_rpc_status_name(status, 0),
         serial);
}
# stap demo.stp
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
R server=0x17a2070 len=28 program=remote version=1 procedure=auth_list type=call status=ok serial=0
T server=0x17a2070 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
R client=0x7ff3c4855010 len=36 program=remote version=1 procedure=auth_list type=reply status=ok serial=0
T client=0x7ff3c4855010 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
R server=0x17a2070 len=40 program=remote version=1 procedure=open type=call status=ok serial=1
T server=0x17a2070 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
R client=0x7ff3c4855010 len=28 program=remote version=1 procedure=open type=reply status=ok serial=1
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
R server=0x17a2070 len=28 program=remote version=1 procedure=get_uri type=call status=ok serial=2
T server=0x17a2070 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
R client=0x7ff3c4855010 len=48 program=remote version=1 procedure=get_uri type=reply status=ok serial=2
T client=0x7ff3c4855010 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
R server=0x17a2070 len=28 program=remote version=1 procedure=close type=call status=ok serial=3
T server=0x17a2070 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3
R client=0x7ff3c4855010 len=28 program=remote version=1 procedure=close type=reply status=ok serial=3

If the server is running on a different host than the client, just copy the demo.stp script to the other host and run a second copy there.

Further extensions

There are many further improvements that can be made to this script

  • Display a timestamp on each message
  • Associate each server side message with an individual socket
  • Display payload length
  • Display a message when the script is actually ready to run

To simplify life, we are maintaining a nice feature demonstration of the RPC SystemTAP probes in the libvirt GIT repository in the
examples/systemtap/rpc-monitor.stp file.

Here is what it can print out

  0.000 begin
  2.632 C + 0x7f1ea57dc010   local=127.0.0.1;0 remote=127.0.0.1;0
  2.632 C > 0x7f1ea57dc010   msg=remote.1.auth_list(call, ok, 0) len=28
  2.632 + S 0x1c1f710        local=127.0.0.1;0 remote=127.0.0.1;0
  2.632 > S 0x1c1f710        msg=remote.1.auth_list(call, ok, 0) len=28
  2.633 < S 0x1c1f710        msg=remote.1.auth_list(reply, ok, 0) len=36
  2.633 C < 0x7f1ea57dc010   msg=remote.1.auth_list(reply, ok, 0) len=36   2.633 C > 0x7f1ea57dc010   msg=remote.1.open(call, ok, 1) len=40
  2.633 > S 0x1c1f710        msg=remote.1.open(call, ok, 1) len=40
  2.639 < S 0x1c1f710        msg=remote.1.open(reply, ok, 1) len=28
  2.639 C < 0x7f1ea57dc010   msg=remote.1.open(reply, ok, 1) len=28   2.639 C > 0x7f1ea57dc010   msg=remote.1.get_uri(call, ok, 2) len=28
  2.639 > S 0x1c1f710        msg=remote.1.get_uri(call, ok, 2) len=28
  2.639 < S 0x1c1f710        msg=remote.1.get_uri(reply, ok, 2) len=48
  2.640 C < 0x7f1ea57dc010   msg=remote.1.get_uri(reply, ok, 2) len=48   2.640 C > 0x7f1ea57dc010   msg=remote.1.domain_lookup_by_id(call, ok, 3) len=32
  2.640 > S 0x1c1f710        msg=remote.1.domain_lookup_by_id(call, ok, 3) len=32
  2.640 < S 0x1c1f710        msg=remote.1.domain_lookup_by_id(reply, error, 3) len=180
  2.641 C < 0x7f1ea57dc010   msg=remote.1.domain_lookup_by_id(reply, error, 3) len=180   2.641 C > 0x7f1ea57dc010   msg=remote.1.close(call, ok, 4) len=28
  2.641 > S 0x1c1f710        msg=remote.1.close(call, ok, 4) len=28
  2.641 < S 0x1c1f710        msg=remote.1.close(reply, ok, 4) len=28
  2.641 C < 0x7f1ea57dc010   msg=remote.1.close(reply, ok, 4) len=28
  2.641 C - 0x7f1ea57dc010   local= remote=
  2.641 - S 0x1c1f710        local=127.0.0.1;0 remote=127.0.0.1;0

Tracing other areas of libvirt code

The RPC code is not the only place with SystemTAP/DTrace probe markers in libvirt. We have also instrumented our main event loop and provide an examples/systemtap/events.stp demo that prints out info like this

  0.000 begin
  2.359 18185 + handle 1 4 1
  2.360 18185 + handle 2 6 1
  2.360 18185 * handle 2 0
  2.360 14370 > handle 3 1
  2.360 14370 + handle 33 16 1
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 33 1
  2.361 14370 * handle 33 1
  2.361 14370 * handle 33 1
  2.361 14370 * handle 33 3
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 1 1
  2.361 14370 ~ 7 -1
  2.361 14370 > handle 33 2
  2.361 14370 * handle 33 1
  2.361 14370 ~ 7 -1
  2.361 18185 * handle 2 1
  2.362 18185 * handle 2 0

And finally we have instrumented our code which talks to the QEMU monitor, again providing a demo examples/systemtap/qemu-monitor.stp which prints out info like this

  0.000 begin
  3.848 ! 0x7f2dc00017b0 {"timestamp": {"seconds": 1319466931, "microseconds": 187755}, "event": "SHUTDOWN"}
  5.773 > 0x7f2dc0007960 {"execute":"qmp_capabilities","id":"libvirt-1"}
  5.774 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-1"}   5.774 > 0x7f2dc0007960 {"execute":"query-commands","id":"libvirt-2"}
  5.777 < 0x7f2dc0007960 {"return": [{"name": "quit"}, {"name": ....snip....   5.777 > 0x7f2dc0007960 {"execute":"query-chardev","id":"libvirt-3"}
  5.778 < 0x7f2dc0007960 {"return": [{"filename": ....snip....   5.779 > 0x7f2dc0007960 {"execute":"query-cpus","id":"libvirt-4"}
  5.780 < 0x7f2dc0007960 {"return": [{"current": true, "CPU": 0, "pc": 1048560, "halted": false, "thread_id": 13299}], "id": "libvirt-4"}   5.780 > 0x7f2dc0007960 {"execute":"set_password","arguments":{"protocol":"vnc","password":"123456","connected":"keep"},"id":"libvirt-5"}
  5.782 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-5"}   5.782 > 0x7f2dc0007960 {"execute":"expire_password","arguments":{"protocol":"vnc","time":"never"},"id":"libvirt-6"}
  5.783 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-6"}   5.783 > 0x7f2dc0007960 {"execute":"balloon","arguments":{"value":224395264},"id":"libvirt-7"}
  5.785 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-7"}   5.785 > 0x7f2dc0007960 {"execute":"cont","id":"libvirt-8"}
  5.789 ! 0x7f2dc0007960 {"timestamp": {"seconds": 1319466933, "microseconds": 129980}, "event": "RESUME"}
  5.789 < 0x7f2dc0007960 {"return": {}, "id": "libvirt-8"}
  7.537 ! 0x7f2dc0007960 {"timestamp": {"seconds": 1319466934, "microseconds": 881214}, "event": "SHUTDOWN"}

Conclusion

The introduction of static probes into the libvirt code has been enormously helpful in understanding the operation of libvirt. While we have comprehensive debug logging present in libvirt is it hard to tailor the output to show the precise data desired. Traditional debuggers like GDB are not very practical when trying to understand the live operation of a heavily multi-threaded system crossing multiple processes, and while strace is useful in some scenarios it is too low level to be useful in most scenarios. SystemTAP userspace probing provides the kind of debugging experience / tool that really suits understanding the complex interactions in a system like libvirt. It is no co-incidence that the first set of probes we have written have focused on the libvirt event loop, RPC code and QEMU monitor – three of the areas in libvirt which are both very critical operationally, and exceptionally hard to debug with traditional approaches. We will certainly be expanding our use of static probe markers in systemtap in the future. My real immediate wishlist is for systemtap to get better at providing userspace stack traces, since it fails to provide a useful trace far too often, as compared to GDB.

Update: Mark Wielaard showed me what I had todo to get nice stack traces from SystemTAP. Apparently it is not getting enough memory space to deal with stack traces with its default settings. Telling it to use a little more memory makes it work nicely:

# cat > demo.stp <<EOF
probe libvirt.rpc.client_msg_rx {
  printf("client=%p len=%d program=%d version=%d procedure=%d type=%d status=%d serial=%d\n",
         client, len, prog, vers, proc, type, status, serial);
  print_ustack(ubacktrace())
}
# stap -DTASK_FINDER_VMA_ENTRY_ITEMS=7680 demo.stp
client=0x7f775cf62010 len=36 program=536903814 version=1 procedure=66 type=1 status=0 serial=0
 0x3c57f0b3dd : virNetClientIOHandleInput+0x87d/0x890 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0b9a0 : virNetClientIOEventLoop+0x5b0/0x630 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0cb23 : virNetClientSend+0x2b3/0x590 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0d47c : virNetClientProgramCall+0x26c/0x8a0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef091e : callWithFD+0xce/0x120 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef099c : call+0x2c/0x40 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57efee80 : doRemoteOpen+0x890/0x20f0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0341b : remoteOpen+0x9b/0x290 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec2133 : do_open+0x1f3/0x1100 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec4616 : virConnectOpenAuth+0x76/0xb0 [/usr/lib64/libvirt.so.0.9.7]
 0x40ceb1 [/usr/bin/virsh+0xceb1/0x40000]
client=0x7f775cf62010 len=28 program=536903814 version=1 procedure=1 type=1 status=0 serial=1
 0x3c57f0b3dd : virNetClientIOHandleInput+0x87d/0x890 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0b9a0 : virNetClientIOEventLoop+0x5b0/0x630 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0cb23 : virNetClientSend+0x2b3/0x590 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0d47c : virNetClientProgramCall+0x26c/0x8a0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef091e : callWithFD+0xce/0x120 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ef099c : call+0x2c/0x40 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57eff57a : doRemoteOpen+0xf8a/0x20f0 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57f0341b : remoteOpen+0x9b/0x290 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec2133 : do_open+0x1f3/0x1100 [/usr/lib64/libvirt.so.0.9.7]
 0x3c57ec4616 : virConnectOpenAuth+0x76/0xb0 [/usr/lib64/libvirt.so.0.9.7]
 0x40ceb1 [/usr/bin/virsh+0xceb1/0x40000]
....

This makes me very happy :-)

GNOME-3 desktop virtualization support from GNOME Boxes (and the future for virt-manager)

Posted: November 22nd, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , , | 4 Comments »

For many years now, the virt-manager application has been the primary open source tool for managing virtual machines under libvirt for Fedora/Linux hosts, attempting to satisfy both server and desktop users alike, with the result that often neither userbase were really too happy. The decision to use libvirt as the foundation of OpenStack, OpenNebula and various other cloud projects has been a great validation of libvirt’s capabilities. More recently, the open sourcing of the RHEV-M product to create the oVirt community project, has seen another step forward for open source data center virtualization management based upon libvirt. Finally, with today’s very first release of GNOME Boxes, the same step forward is also happening for Linux desktop virtualization. No longer will desktop virtualization (or remote desktop access) feel like an afterthought, but rather it will be a seamless part of the GNOME-3 desktop experience. This is coming to a Fedora release near you soon….the target is Fedora 17.

What does this mean for virt-manager you might wonder ? Well first of all let me reassure people that virt-manager isn’t going away anytime in the forseeable future. There will always be people who prefer straightforward, directly controllable applications which do not try to impose clever policies on their usage. virt-manager, virsh, virt-install, etc all fill this gap and we don’t want to take that control away from people. With the growth in usage of OpenStack for cloud, oVirt for data center management, and GNOME Boxes for desktop virtualization, I think it is clear though, that virt-manager will have a diminished role / userbase in the future. I don’t consider this to be bad thing, on the contrary, it shows just how strong & diverse the open source virtualization community has become. Where once there was only virt-manager, today we have a wide choice of applications providing highly effective virtualization solutions targeted towards the needs of their respective userbases.

Introducing the libvirt-glib, a mapping of the libvirt API and XML to GLib/GObject

Posted: November 22nd, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , , | No Comments »

The historical philosophy of libvirt is for all our core libraries to be written in C and then create bindings to other programming languages or mappings to alternative object models. Thus far we have bindings to Python, Perl, Ruby, OCaml, Php, C#, Java and mappings to the QMF (Matahari), CIM and SNMP object models. The virt-install and virt-manager applications use the python binding to libvirt, but even very early in development of virt-manager it was clear that the libvirt python API is not a natural fit for an application using GTK, since it does not integrate with GObject and in particular GObject signals. Thus virt-manager wraps the libvirt python objects to create real GObjects it then works with. This has been quite successful, but because all the virt-manager code is in python other applications have not been able to take advantage of the higher level libvirt API virt-manager has evolved. In addition the virt-install code (which is called internally by virt-manager) contains a set of Python objects which represent the various libvirt XML schemas as plain old objects with properties and setters/getters. If you’ve developed applications against libvirt, you’ll likely appreciate just how useful such an API would be. Again though, because the API is in Python and (technically) internal to the virt-install codebase, it is not accessible to many other applications

There was clearly space for an independent library mapping the libvirt API and XML schemas to GObject, which could then be used by any application. The task of creating a libvirt GObject library API is large enough, without considering the task of also ensuring it is accessible from all the non-C programming languages. Fortunately, with the release of GNOME-3,  GObject introspection has now matured to the point where it can really be used in anger for real application development. The upshot is that it is now feasible to attempt development of a proper libvirt GObject API.

The libvirt-glib package is the result, and it actually contains three related libraries

  • libvirt-gib – non-object based glue code between GLib and libvirt. In particular this has APIs to convert libvirt virErrorPtr instances into GError instances, and provides an implementation of the libvirt event loop contract, using the GLib GMain APIs.
  • libvirt-gconfig – object based APIs which map libvirt XML documents/schemas into GObject classes. This library explicitly has no direct link to the libvirt API, solely concerning itself with XML management. This is to allow use of libvirt-gconfig from applications which are using one of the object mappings like QMF/CIM/SNMP, instead of the direct libvirt API. This where the current virt-install XML handling objects will be replicated
  • libvirt-gobject – object based APIs which map libvirt types and APIs into GObject classes. This library depends on libirt-glib and libvirt-gconfig, and is where the current virt-manager object mapping APIs will be replicated. This library is also adopting the GIO paradigm for allowing asynchronous API invocation & completion, for long running applications. This eliminates much of the need for applications to explicitly use threads (thread usage is hidden behind the async API impl).

From day 1, all the APIs are being developed with GObject introspection in mind., so all methods are fully annotated, and we are generating the glue layer for Vala bindings as standard in order to support the GNOME Boxes application. It is still very early days for development and very little of the libvirt API has been mapped into GObject thus far and work is only just starting on the XML object mappings. The overall target, however, is to develop the library to the state where it can support the aforementioned GNOME Boxes application in Fedora 17, as well as an application sandbox framework I am developing for Fedora 17 (more on that in a later blog post).

For more information

Using URI aliases with libvirt 0.9.7 (forthcoming release)

Posted: October 28th, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , | 2 Comments »

In the next week or two, we will be releasing libvirt 0.9.7, which comes with a new URI alias feature, which will be particularly useful for users of tools like virsh and virt-install.

In the same way that SSH allows you to setup hostname aliases in $HOME/.ssh/config, libvirt will now allow you to setup URI aliases in $HOME/.libvirt/libvirt.conf (if you are running unprivileged) or /etc/libvirt/libvirt.conf (if you are running as root). NB do not confuse this file with libvirtd.conf which is a server side libvirtd daemon config file.

The new libvirt.conf configuration file will start with a single ‘uri_aliases’ parameter, which takes a list of entries. Each entry is an alias map in the format “ALIAS=URI”. To avoid potential overlap with otherwise valid URIs, the ALIAS part is restricted to the characters a-z, A-Z, 0-9, -, _. The config file format is fairly self-explanatory if you look at an example:

$ cat $HOME/.libvirt/libvirt.conf
uri_aliases = [
  "hail=qemu+ssh://root@hail.cloud.example.com/system",
  "sleet=qemu+tls://sleet.cloud.example.com/system",
]

Once this config has been created, I can then replace any commands like

$ virsh -c qemu+ssh://root@hail.cloud.example.com/system list
$ virt-install -c qemu+tls://sleet.cloud.example.com/system ...some args...

with much simpler commands like

$ virsh -c hail list
$ virt-install -c sleet ...some arg...

Application that use libvirt do not have to do anything special to support URI aliases, as they are automatically handled by the virConnectOpen family of APIs. If, however, an application wishes to prevent the use of URI aliases for some reason, it can do so by passing VIR_CONNECT_NO_ALIASES constant as a flag to the virConnectOpenAuth API.

For further documentation on libvirt URIs consult the URI reference and the remote driver reference pages on the libvirt website.

Debugging early startup of KVM with GDB, when launched by libvirtd

Posted: October 12th, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , | 1 Comment »

Earlier today I was asked how one would go about debugging early startup of KVM under GDB, when launched by libvirtd. It was not possible to simply attach to KVM after it had been launched by libvirtd, since that was too late. In addition running the same KVM command outside libvirt did not exhibit the problem that was being investigated.

Fortunately, with a little cleverness, it is actually possible to debug a KVM guest launched by libvirtd, right from the start. The key is to combine a couple of breakpoints with use of follow-fork-mode. When libvirtd starts up a KVM guest, it runs QEMU a couple of times in order to detect which command line arguments are supported. This means the follow-fork-mode setting cannot be changed too early, otherwise GDB will end up following the wrong process.

I happen to know that there is only one place in the libvirt code which calls virCommandSetPreExecHook, and that is immediately before launching the real QEMU process. A nice thing about GDB is that when following forked/exec’d children, it will apply any existing breakpoints in the child, even if it is a new binary. So a break point set on ‘main’, while still in libvirtd will happily catch ‘main’ in the QEMU process. The only remaining problem is that if QEMU does not setup and activate the monitor quickly enough, libvirtd will try to kill it off again. Fortunately GDB lets you ignore SIGTERM, and even SIGKILL :-)

The start of the trick is this:

# pgrep libvirtd
12345
# gdb
(gdb) attach 12345
(gdb) break virCommandSetPreExecHook
(gdb) cont

Now in a separate shell

# virsh start $GUESTNAME

Back in the GDB shell the breakpoint should have triggered, allowing the trick to be finished:

(gdb) break main
(gdb) handle SIGKILL nopass noprint nostop
Signal        Stop	Print	Pass to program	Description
SIGKILL       No	No	No		Killed
(gdb) handle SIGTERM nopass noprint nostop
Signal        Stop	Print	Pass to program	Description
SIGTERM       No	No	No		Terminated
(gdb) set follow-fork-mode child
(gdb) cont
process 3020 is executing new program: /usr/bin/qemu-kvm
[Thread debugging using libthread_db enabled]
[Switching to Thread 0x7f2a4064c700 (LWP 3020)]
Breakpoint 2, main (argc=38, argv=0x7fff71f85af8, envp=0x7fff71f85c30)
    at /usr/src/debug/qemu-kvm-0.14.0/vl.c:1968
1968	{
(gdb) 

Bingo, you can now debug QEMU startup at your leisure