Improving QEMU security part 1: crypto code consolidation

Posted: March 31st, 2016 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Security, Virt Tools | Tags: cryotography, qemu, tls | No Comments »

This blog is part 1 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.

Many years ago I wrote patches for QEMU to enable use of TLS with the VNC server via the VeNCrypt protocol extension. In those patches I modified the VNC server code to directly call out to gnutls in various places to perform the TLS handshake, validate certificates and encrypt/decrypt data. Fast-forward 8 years and I’m once again looking at QEMU with a view to adding TLS encryption support to many other QEMU network services, in particular character device backends, migration and NBD. The TLS certificate handling code is complex enough that I really didn’t fancy repeating it in multiple different areas of the QEMU codebase, so I started thinking about extracting the TLS code from the VNC server for the purpose of easier reuse.

Aside from VNC with TLS, QEMU uses cryptographic routines in a number of other areas: AES for qcow2 native encryption (which is horribly broken btw), single DES (yes, really single DES) in the VNC server for the awful VNC password authentication, SHA256 hashing in the quorum block driver, SHA1 hashing in the VNC websockets handshake, and AES in many of its CPU emulation backends for the various architecture specific AES acceleration instructions. QEMU actually has its own built-in impl of AES and DES that it uses, rather than calling out to a 3rd party crypto library, since the emulated CPU instructions need to run distinct internal steps of the AES algorithm, not merely consume the final output.

Looking to the future, as well as the expanded use of TLS, it was clear that use of cryptography will only ever increase in QEMU. For example, support of a LUKS encryption driver in the block layer will need access to countless encryption ciphers and hashes. It would be possible to get access to ciphers and hashes via the gnutls APIs, but sadly it doesn’t expose all the possible algorithms supported by the underlying libraries it uses. For added fun gnutls can be using either libgcrypt or nettle depending on what version of gnutls you have. So if QEMU wanted to get access to algorithms not exposed by gnutls, it would ideally have to support use of two different libraries. It was clear that QEMU would benefit from a consolidated internal API for dealing with anything related to encryption, to isolate the main bulk of the code from needing to directly deal with whatever 3rd party crypto libraries QEMU linked to. Thus I created a new top level directory in the QEMU codebase crypto/ and associated headers include/crypto/ which will contain all the code for interfacing with gnutls, libgcrypt, nettle, and whatever other cryptographic libraries we might need in the future. First of all the existing AES and DES implementations were moved into this directory. Then I created APIs for dealing with hash and cipher algorithms.

The cipher APIs are written to preferentially use either nettle or libgcrypt depending on which one gnutls linked to, though this can be overridden via arguments to configure to force a particular choice. For those who really want to build without these 3rd party libraries, the APIs can be built to use the internal AES or DES impls as a fallback. A short example of encrypting data using AES-128 in CBC mode would look like this:

  QCryptoCipher *cipher;
  uint8_t *key = ....;
  size_t keylen = 16;    /* AES-128 key size in bytes */
  uint8_t *iv = ....;
  size_t ivlen = 16;     /* AES block size in bytes */
 
  /* Optional up-front check, purely to give a friendlier error */
  if (!qcrypto_cipher_supports(QCRYPTO_CIPHER_ALG_AES_128)) {
     error_setg(errp, "Feature <blah> requires AES cipher support");
     return -1;
  }
 
  cipher = qcrypto_cipher_new(QCRYPTO_CIPHER_ALG_AES_128,
                              QCRYPTO_CIPHER_MODE_CBC,
                              key, keylen,
                              errp);
  if (!cipher) {
     return -1;
  }
 
  if (qcrypto_cipher_setiv(cipher, iv, ivlen, errp) < 0) {
     qcrypto_cipher_free(cipher);
     return -1;
  }
 
  /* datalen must be a multiple of the cipher block size */
  if (qcrypto_cipher_encrypt(cipher, rawdata, encdata, datalen, errp) < 0) {
     qcrypto_cipher_free(cipher);
     return -1;
  }
 
  qcrypto_cipher_free(cipher);

The hash algorithms still use the gnutls APIs, though that will change in the 2.7 series to directly use libgcrypt or nettle. The hash APIs are slightly simpler since QEMU doesn’t (currently at least) need the ability to incrementally hash data, so the current APIs just support one-shot hashing of buffers.

  char *digest = NULL;
 
  /* Optional up-front check, purely to give a friendlier error */
  if (!qcrypto_hash_supports(QCRYPTO_HASH_ALG_SHA256)) {
     error_setg(errp, "Feature <blah> requires sha256 hash support");
     return -1;
  }
 
  if (qcrypto_hash_digest(QCRYPTO_HASH_ALG_SHA256,
                          buf, len, &digest,
                          errp) < 0) {
     return -1;
  }
 
  /* ... use digest, then release it ... */
  g_free(digest);

The qcrypto_hash_digest() method outputs the hash as printable hex characters. There is also qcrypto_hash_bytes() which returns the raw bytes, and qcrypto_hash_base64() which base64 encodes the result. As well as passing a single buffer, it is possible to provide a list of buffers in a ‘struct iovec’.
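To make the three output forms concrete, here is a small illustration using Python's standard hashlib rather than the QEMU C API. It is only an analogy: hash_buffers is a made-up helper name, and the list of buffers stands in for the ‘struct iovec’ array.

```python
import base64
import hashlib


def hash_buffers(buffers, alg="sha256"):
    """One-shot hash over a list of buffers, returned in three encodings.

    Hypothetical helper for illustration only; not part of QEMU.
    """
    h = hashlib.new(alg)
    for buf in buffers:
        # Feed each buffer in turn, much like walking an iovec array
        h.update(buf)
    raw = h.digest()
    return {
        "bytes": raw,                                     # raw hash bytes
        "digest": raw.hex(),                              # printable hex
        "base64": base64.b64encode(raw).decode("ascii"),  # base64 text
    }


result = hash_buffers([b"hello ", b"world"])
```

All three values carry the same hash; only the encoding handed back to the caller differs, and splitting the input across several buffers does not change the result.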

The calls to qcrypto_cipher_supports() and qcrypto_hash_supports() are entirely optional – errors will be raised by other methods if needed, but they offer the opportunity to emit friendly error messages in the code. For example the VNC server can explicitly say which feature it can’t support due to missing DES support. Just converting the existing code in QEMU to use these new cipher/hash APIs already had significant benefit, because it allowed for many #ifdef CONFIG_GNUTLS statements to be removed from across the codebase, particularly the VNC server. The other benefit is that the internal AES and DES implementations are no longer used by any QEMU code, except for the CPU instruction emulation, which is not even used if running with KVM. So modern KVM accelerated guests will be using well supported, audited & certified cipher & hash implementations, which is often important to enterprise distribution vendors. This first stage of consolidation was completed and merged for the QEMU 2.4 release series, but it has been invisible to users, mostly just benefiting the QEMU & distro maintainers.

In this blog series:

Part 1: crypto code consolidation
Part 2: generic TLS support
Part 3: securely passing in credentials
Part 4: generic I/O channel framework to simplify TLS
Part 5: TLS support for NBD server & client
Part 6: TLS support for character devices
Part 7: TLS support for migration

Announce: libvirt-sandbox “Dashti Margo” 0.6.0 release – an application sandbox toolkit

Posted: July 1st, 2015 | Filed under: Fedora, libvirt, Security, Virt Tools | Tags: application, containers, docker, kvm, lxc, sandbox | 2 Comments »

I am pleased to announce that a new public release of libvirt-sandbox, version 0.6.0, is now available from:

http://sandbox.libvirt.org/download/

The packages are GPG signed with

  Key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF (4096R)

The libvirt-sandbox package provides an API layer on top of libvirt-gobject which facilitates the creation of application sandboxes using virtualization technology. An application sandbox is a virtual machine or container that runs a single application binary, directly from the host OS filesystem. In other words there is no separate guest operating system install to build or manage.

At this point in time libvirt-sandbox can create sandboxes using either LXC or KVM, and should in theory be extendable to any libvirt driver.

This release contains a mixture of new features and bugfixes.

The first major feature is the ability to provide block devices to sandboxes. Most of the time sandboxes only want/need filesystems, but there are some use cases where block devices are useful. For example, some applications (like databases) can directly use raw block devices for storage. Another one is where a tool actually wishes to be able to format filesystems and have this done inside the container. The complexity with exposing block devices is giving the sandbox tools a predictable path for accessing the device which does not change across hypervisors. To solve this, instead of allowing users of virt-sandbox to specify a block device name, they provide an opaque tag name. The block device is then made available at a path /dev/disk/by-tag/TAGNAME, which symlinks back to whatever hypervisor specific disk name was used.

The second major feature is the ability to provide a custom root filesystem for the sandbox. The original intent of the sandbox tool was that it provide an easy way to confine and execute applications that are installed on the host filesystem, so by default the host / filesystem is mapped to the sandbox / filesystem read-only. There are some use cases, however, where the user may wish to have a completely different root filesystem. For example, they may wish to execute applications from some separate disk image. So virt-sandbox now allows the user to map in a different root filesystem for the sandbox.

Both of these features were developed as part of a Google Summer of Code 2015 project which is aiming to enhance libvirt-sandbox so that it is capable of executing images distributed by the Docker container image repository service. The motivation for this goes back to the original reason for creating the libvirt-sandbox project in the first place, which was to provide a hypervisor agnostic framework for sandboxing applications, as a higher level above the libvirt API. Once this work is complete it’ll be possible to launch Docker images via libvirt QEMU, KVM or LXC, with no need for the Docker toolchain itself.

The detailed list of changes in this release is:

API/ABI incompatible change, soname increased
Prevent use of virt-sandbox-service as non-root upfront
Fix misc memory leaks
Block SIGHUP from the dhclient binary to prevent accidental death if the controlling terminal is closed & reopened
Add support for re-creating libvirt XML from sandbox config to facilitate upgrades
Switch to standard gobject introspection autoconf macros
Add ability to set filters on network interfaces
Search /usr/lib instead of /lib for systemd unit files, as the former is the canonical location even when / and /usr are merged
Only set SELinux labels on hosts that support SELinux
Explicitly link to selinux, instead of relying on indirect linkage
Update compiler warning flags
Fix misc docs comments
Don’t assume use of SELinux in virt-sandbox-service
Fix path checks for SUSE in virt-sandbox-service
Add support for AppArmor profiles
Mount /var after other FS to ensure host image is available
Ensure state/config dirs can be accessed when QEMU is running non-root for qemu:///system
Fix mounting of host images in QEMU sandboxes
Mount images as ext4 instead of ext3
Allow use of non-raw disk images as filesystem mounts
Check if required static libs are available at configure time to prevent silent fallback to shared linking
Require libvirt-glib >= 0.2.1
Add support for loading lzma and gzip compressed kmods
Check for supported libvirt URIs when starting guests to ensure clear error message upfront
Add LIBVIRT_SANDBOX_INIT_DEBUG env variable to allow debugging of kernel boot messages and sandbox init process setup
Add support for exposing block devices to sandboxes with a predictable name under /dev/disk/by-tag/TAGNAME
Use devtmpfs instead of tmpfs for auto-populating /dev in QEMU sandboxes
Allow setup of sandbox with custom root filesystem instead of inheriting from host’s root.
Allow execution of apps from non-matched ld-linux.so / libc.so, eg executing F19 binaries on F22 host
Use passthrough mode for all QEMU filesystems

QEMU QCow2 built-in encryption: just say no. Deprecated now, to be deleted soon

Posted: March 17th, 2015 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Security | Tags: deprecated, luks, qcow2 | 6 Comments »

A little over 5 years ago now, I wrote about how libvirt introduced support for QCow2 built-in encryption. The use cases for built-in qcow2 encryption were compelling back then, and remain so today. In particular while LUKS is fine if your disk backend is already a kernel visible block device, it is not a generically usable alternative for QEMU since it requires privileged operation to set it up, would require yet another I/O layer via a loopback or qemu-nbd device, and finally is entirely Linux specific. The direction QEMU has taken over the past few years has in fact been to take the kernel out of the equation for more & more functionality. For example, QEMU can now natively connect to RBD, Gluster, iSCSI and NFS servers with no kernel assistance – the client code is implemented entirely within the QEMU block driver layer, which precludes the use of LUKS there.

At the time I wrote that blog post, no one had seriously looked at the QCow2 encryption design to see if it was in any way sane from a security POV. At least if they had, AFAIK, they didn’t make their analysis public. Over time though, various QEMU maintainers did eventually look at the QCow2 encryption code and their conclusions were not positive. The consensus opinion amongst QEMU maintainers today is that QCow2 encryption is terminally flawed in a number of serious ways, including but not limited to:

The AES-CBC cipher is used with predictable initialization vectors based on the sector number. This makes it vulnerable to chosen plaintext attacks which can reveal the existence of encrypted data.
The user passphrase is directly used as the encryption key.
- A poorly chosen or short passphrase will compromise the security of the encryption.
- In the event of the passphrase being compromised there is no way to change the passphrase to protect data in any qcow images.
- It is difficult to make the data permanently inaccessible upon file deletion – at best you can try to destroy data with shred, though even this is ineffective with many modern storage technologies.

By comparison the LUKS encryption format does not suffer from any of these problems. With LUKS the initialization vectors typically use ESSIV to ensure unpredictability; the passphrase is only indirectly used to unlock the master encryption key material, so can be changed at will; the passphrase is put through a PBKDF2 function to mitigate the effects of short sequences of letters; the size of the master key material is artificially inflated with an anti-forensic algorithm to increase the difficulty of recovering the key from deleted volumes.
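The indirect use of the passphrase is what makes passphrase changes cheap, and it can be sketched in a few lines of Python. This is a toy model under loud assumptions: make_slot and open_slot are hypothetical names, the XOR "wrapping" stands in for the real cipher LUKS uses to protect the key material, and the anti-forensic splitting and passphrase verification steps are omitted entirely. Only hashlib.pbkdf2_hmac reflects a real building block.

```python
import hashlib
import os


def derive_kek(passphrase, salt, iterations=100_000):
    # Stretch the passphrase so short inputs are costlier to brute force
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, dklen=32)


def xor_bytes(a, b):
    # Toy stand-in for encrypting the master key; NOT secure in practice
    return bytes(x ^ y for x, y in zip(a, b))


def make_slot(master_key, passphrase):
    # Wrap the master key under a key derived from the passphrase
    salt = os.urandom(16)
    return {"salt": salt,
            "wrapped": xor_bytes(master_key, derive_kek(passphrase, salt))}


def open_slot(slot, passphrase):
    # Recover the master key (garbage comes back if the passphrase is wrong)
    return xor_bytes(slot["wrapped"], derive_kek(passphrase, slot["salt"]))


# The disk data is encrypted with master_key, never with the passphrase
master_key = os.urandom(32)
slot = make_slot(master_key, b"old passphrase")

# Changing the passphrase only rewrites the small key slot;
# the bulk data stays encrypted under the same master key.
slot = make_slot(open_slot(slot, b"old passphrase"), b"new passphrase")
```

With the QCow2 scheme there is no equivalent of the key slot, since the passphrase is the key, so changing it would mean re-encrypting every sector.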

The QCow2 encryption scheme is a prime example of why merely using a well known standard algorithm (AES) is not sufficient to guarantee a secure implementation. In January 2014, I submitted an update for the QEMU docs to explicitly warn users about the security limitations of QCow2 encryption, which made it into the 1.5.0 release of QEMU. This week Markus has gone one step further and explicitly deprecated use of QCow2 encryption for the forthcoming 2.3.0 release of QEMU. Any attempt to use an encrypted QCow2 file with the QEMU system emulator will now result in a warning being printed to stderr, which in turn ends up in the libvirt logfile for that guest. As well as the security issues, Markus’ other motivation for deprecating this is that the way it is integrated into QEMU block driver layer causes a number of technical & usability problems. So even if we want encrypted block devices in QEMU, the internals for encryption need a complete rewrite from scratch.

In the 2.4.0 release, the intention is to go one step further and actually delete support for QCow2 encryption from the QEMU system emulator entirely, as well as all the infrastructure for block device encryption. We will keep support for decrypting images in the qemu-img program only, to provide a way for users to get their previously encrypted data out into a supported format.

In the immediate future, the recommendation is that users who need encryption for virtual disks should use LUKS on the host, despite the limitations that I noted earlier. At some point in the next 6 months my intention is to start working on a QEMU block driver implementation of the LUKS format, which will enable QEMU to add encryption to any of its virtual disk backends, not merely QCow2. This will require designing new infrastructure for handling decryption keys inside QEMU too, to replace the unsatisfactory approach used today. By using the LUKS format directly though, QEMU will benefit from the security knowledge of those who designed and analysed this format over many years to identify its strengths & weaknesses. It will also provide good interoperability. For example, an encrypted qcow2-luks file will be able to be converted to/from a block device for access by the kernel’s LUKS driver with no need to re-encrypt the data, which is very desirable as it lets users decide whether to use in-QEMU or in-kernel block device backends at the flick of a switch.

So, just to sum up: do not ever use QCow2 built-in encryption as it exists today. It is unfixably broken by design. It is deprecated in QEMU 2.3.0 and is likely to be deleted in QEMU 2.4.0.

Nova and its use of Oslo Incubator Guru Meditation Reports

Posted: February 19th, 2015 | Filed under: Coding Tips, Fedora, OpenStack, Security, Virt Tools | Tags: error, guru, meditation, nova, oslo, reports | 6 Comments »

This blog describes an error reporting / troubleshooting feature added to Nova a while back which people are probably not generally aware of.

One of the early things I worked on in the Nova libvirt driver was integration of support for the libvirt event notification system. This allowed Nova to get notified when instances are shutdown by the guest OS administrator, instead of having to wait for the background power state sync task to run. Supporting libvirt events was a theoretically simple concept, but in practice there was a nasty surprise. The key issue was that we needed to have a native thread running the libvirt event loop, while the rest of Nova uses green threads. The code kept getting bizarre deadlocks, which were eventually traced to use of the python logging APIs in the native thread. Since eventlet had monkeypatched the thread mutex primitives, the logging APIs couldn’t be safely called from the native thread as they’d try to obtain a green mutex from a native thread context.

Eventlet has a concept of a backdoor port, which lets you connect to the python process using telnet and get an interactive python prompt. After turning this on, I got a stack trace of all green and native threads and eventually figured out the problem, which was great. Sadly the eventlet backdoor is not something anyone sane would ever enable out of the box on production systems – you don’t really want to allow remote command execution to anyone who can connect to a TCP port :-) Another debugging option is to use Python’s native debugger, but this is again something you have to enable ahead of time and won’t be something admins enable out of the box on production systems. It is possible to connect to a running python process with GDB and get a crude stack trace, but that’s not great either as it requires python debuginfo libraries installed. It would be possible to build an administrative debugging API for Nova using the REST API, but that only works if the REST API & message bus are working – not something that’s going to be much use when Nova itself has deadlocked the entire python interpreter.

After this debugging experience I decided to propose something that I’ve had on previous complex systems: a facility that allows an admin to trigger a live error report. Crucially this facility must not require any kind of deployment setup tasks and thus be guaranteed available at all times, especially on production systems where debugging options are limited by the need to maintain service to users. I called the proposal the “Guru Meditation Report” in reference to old Amiga crash reports. I did a quick proof of concept to test the idea, but Solly Ross turned the idea into a complete feature for OpenStack, adding it to the Oslo Incubator in the openstack.common.report namespace and integrating with Nova. This shipped with the Icehouse release of Nova.

Service integration & usage

Integration into projects is quite straightforward: the openstack-common.conf file needs to list the relevant modules to import from oslo-incubator

$ grep report openstack-common.conf
module=report
module=report.generators
module=report.models
module=report.views
module=report.views.json
module=report.views.text
module=report.views.xml

Then each service process needs to initialize the error report system. This just requires a single import line and method call from the main method

$ cat nova/cmd/nova-compute
...snip...
from nova.openstack.common.report import guru_meditation_report as gmr
...snip...
def main():
    gmr.TextGuruMeditation.setup_autorun(version)

    ...run eventlet service or whatever...

The setup_autorun method installs a signal handler connected to SIGUSR1 which will dump an error report to stderr when triggered.
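As a rough idea of what such a handler involves, here is a minimal sketch using only the Python standard library. It is not the oslo-incubator implementation: format_thread_report and install_report_handler are hypothetical names, it assumes a POSIX system where SIGUSR1 exists, and it only covers native threads (green threads would need eventlet/greenlet introspection, which is left out).

```python
import signal
import sys
import threading
import traceback


def format_thread_report():
    """Return a text dump of the stack of every native thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("------ Thread #%d (%s) ------"
                     % (ident, names.get(ident, "unknown")))
        # format_stack() yields strings that already end in a newline
        lines.extend(entry.rstrip("\n")
                     for entry in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)


def install_report_handler(signum=signal.SIGUSR1, stream=None):
    """Dump a thread report to stream (default stderr) on signum.

    Must be called from the main thread, as signal.signal() requires.
    """
    def handler(received_signum, frame):
        out = stream if stream is not None else sys.stderr
        out.write(format_thread_report() + "\n")
        out.flush()

    signal.signal(signum, handler)
```

A service would call install_report_handler() once at startup. Writing straight to stderr, rather than going through the logging module, avoids depending on locks that eventlet may have monkeypatched.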

So from Icehouse onwards, if any Nova process is mis-behaving you can simply run something like

$ kill -USR1 `pgrep nova-compute`

to get a detailed error report of the process state sent to stderr. On RHEL-7 / Fedora systems this data should end up going into the systemd journal for that service. On other systems you may be lucky enough for the init service to have redirected stderr to a log file, or unlucky enough to have it sent to /dev/null. Did I mention the systemd journal is a really excellent feature when troubleshooting service problems? :-)

Error report information

In the oslo-incubator code there are 5 standard sections defined for the error report

Config – dump of all configuration settings loaded by oslo.config – useful because the config settings loaded in memory don’t necessarily match what is currently stored in /etc/nova/nova.conf on disk – eg admin may have modified the config and forgotten to reload the services.
Package – information about the software package, as passed into the setup_autorun method previously. This lets you know the openstack release version number, the vendor who packaged it and any vendor specific version info such as the RPM release number. This is again key, because what’s installed on the host currently may not match the version that’s actually running. You can’t always trust the admins to give you correct info when reporting bugs, so having software report itself is more reliable.
Process – information about the running process including the process ID, parent process ID, user ID, group ID and scheduler state.
Green Threads – stack trace of every eventlet green thread currently in existence
Native Threads – stack trace of every native thread currently in existence

The report framework is modular, so it is possible to register new generator functions which add further data to the error report. This is useful if there is application specific data that is useful to include, that would not otherwise be suitable for inclusion in oslo-incubator directly. The data model is separated from the output formatting code, so it is possible to output the report in a number of different data formats. The reports which get sent to stderr are using a crude plain text format, but it is possible to have reports generated in XML, JSON, or another completely custom format.
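The split between generators, data model and views can be pictured with a small sketch. The names below (Report, register_section, text_view, json_view) are invented for illustration and are not the oslo-incubator API; the point is only that sections produce plain data and the choice of output format is made separately.

```python
import json
import os


class Report:
    """Collects named sections, each produced by a generator callable."""

    def __init__(self):
        self._generators = []  # list of (section title, callable -> dict)

    def register_section(self, title, generator):
        self._generators.append((title, generator))

    def collect(self):
        # Run every generator and build a plain data model
        return {title: generator() for title, generator in self._generators}


def text_view(data):
    """Render the data model as crude plain text."""
    lines = []
    for title, section in data.items():
        lines.append("=" * 72)
        lines.append("====" + title.center(64) + "====")
        lines.append("=" * 72)
        for key, value in section.items():
            lines.append("%s = %s" % (key, value))
    return "\n".join(lines)


def json_view(data):
    """Render the same data model as JSON."""
    return json.dumps(data, indent=2, sort_keys=True)


report = Report()
report.register_section(
    "Package", lambda: {"product": "Example Service", "version": "0.1"})
# An application specific section is just one more generator
report.register_section(
    "Process", lambda: {"pid": os.getpid(), "ppid": os.getppid()})
```

Calling text_view(report.collect()) or json_view(report.collect()) renders the same data in either format, and adding a section never requires touching the formatting code.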

Future improvements

Triggering from a UNIX signal and printing to stderr is a very simple and reliable approach that we can guarantee will almost always work no matter what operational state the OpenStack deployment as a whole is in. It should not be considered the only possible approach though. I can see that it may be desirable to also wire this up to the RPC messaging bus, so a cloud admin can remotely generate an error report for a service and get the response back over the message bus in an XML or JSON format. This wouldn’t replace the SIGUSR1 based stderr dumps, but rather augment them, as we want to retain the ability to trigger reports even if the rabbitmq bus connection is dead/broken for some reason.

AFAIK, this error report system is only wired up into the Nova project at this time. It is desirable to bring this feature over to projects like Neutron, Cinder, Glance and Keystone too, so it can be considered an openstack wide standard system for admins to collect data for troubleshooting. As explained above, this is no more difficult than adding the modules to openstack-common.conf and then adding a single method call to the service startup main method. Those projects might like to register extra error report sections to provide further data, but that’s by no means required for initial integration.

Having error reports triggered on demand by the admin is nice, but I think there is also value in having error reports triggered automatically in response to unexpected error conditions. For example if a RPC request to boot a new VM instance fails, it could be desirable to save a detailed error report, rather than just having an exception hit the logs with no context around it. In such a scenario you would extend the error report generator so that the report included the exception information & stack trace, and also include the headers and/or payload of the RPC request that failed. The error report would probably be written to a file instead of stderr, using JSON or XML. Tools could then be written to analyse error reports and identify commonly recurring problems.
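A sketch of what that might look like is below. Everything here is hypothetical: save_error_report, boot_instance and the shape of the request dict are invented for illustration and do not correspond to existing Nova or Oslo code. A real implementation would also need to mask secret values before writing the request to disk.

```python
import json
import tempfile
import traceback


def save_error_report(exc, request, directory=None):
    """Write exception details plus the failed request to a JSON file."""
    report = {
        "exception": {
            "type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exception(
                type(exc), exc, exc.__traceback__),
        },
        "request": request,
    }
    with tempfile.NamedTemporaryFile(
            "w", prefix="error-report-", suffix=".json",
            dir=directory, delete=False) as f:
        json.dump(report, f, indent=2)
        return f.name


def boot_instance(request):
    # Hypothetical RPC handler that fails, standing in for real work
    raise RuntimeError("no valid host found")


request = {"method": "boot_instance", "args": {"instance": "demo"}}
report_path = None
try:
    boot_instance(request)
except Exception as exc:
    # Capture context at the point of failure instead of only logging
    report_path = save_error_report(exc, request)
```

Because the output is structured JSON, a later tool could load a directory of these files and group them by exception type to spot recurring failures.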

Even with the current level of features for the error report system, it has proved its worth in facilitating the debugging of a number of problems in Nova, where things like the eventlet backdoor or python debugger were impractical to use. I look forward to its continued development and broader usage across openstack.

Example report

What follows below is an example error report from the nova-compute service running in one of my development hosts. Notice that oslo.config configuration parameters that were declared with the ‘secret’ flag have their value masked. This is primarily aiming to prevent passwords making their way into the error report, since the expectation is that users may attach these reports to public bugs.

========================================================================
====                        Guru Meditation                         ====
========================================================================
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||


========================================================================
====                            Package                             ====
========================================================================
product = OpenStack Nova
vendor = OpenStack Foundation
version = 2015.1
========================================================================
====                            Threads                             ====
========================================================================
------                  Thread #140157298652928                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157307045632                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157734876928                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158288500480                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158416287488                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158424680192                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/home/berrange/src/cloud/nova/nova/virt/libvirt/host.py:113 in _native_thread
    `libvirt.virEventRunDefaultImpl()`

/usr/lib64/python2.7/site-packages/libvirt.py:340 in virEventRunDefaultImpl
    `ret = libvirtmod.virEventRunDefaultImpl()`

------                  Thread #140157684520704                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158296893184                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158305285888                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157709698816                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158322071296                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157726484224                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158330464000                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157332223744                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157701306112                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157692913408                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157315438336                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158537955136                   ------

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:346 in run
    `self.wait(sleep_time)`

/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py:85 in wait
    `presult = self.do_poll(seconds)`

/usr/lib/python2.7/site-packages/eventlet/hubs/epolls.py:62 in do_poll
    `return self.poll.poll(seconds)`

------                  Thread #140158313678592                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157718091520                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140157323831040                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

------                  Thread #140158338856704                   ------

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/eventlet/tpool.py:72 in tworker
    `msg = _reqq.get()`

/usr/lib64/python2.7/Queue.py:168 in get
    `self.not_empty.wait()`

/usr/lib64/python2.7/threading.py:339 in wait
    `waiter.acquire()`

========================================================================
====                         Green Threads                          ====
========================================================================
------                        Green Thread                        ------

/usr/bin/nova-compute:9 in <module>
    `load_entry_point('nova==2015.1.dev352', 'console_scripts', 'nova-compute')()`

/home/berrange/src/cloud/nova/nova/cmd/compute.py:74 in main
    `service.wait()`

/home/berrange/src/cloud/nova/nova/service.py:444 in wait
    `_launcher.wait()`

/home/berrange/src/cloud/nova/nova/openstack/common/service.py:187 in wait
    `status, signo = self._wait_for_exit_or_signal(ready_callback)`

/home/berrange/src/cloud/nova/nova/openstack/common/service.py:170 in _wait_for_exit_or_signal
    `super(ServiceLauncher, self).wait()`

/home/berrange/src/cloud/nova/nova/openstack/common/service.py:133 in wait
    `self.services.wait()`

/home/berrange/src/cloud/nova/nova/openstack/common/service.py:473 in wait
    `self.tg.wait()`

/home/berrange/src/cloud/nova/nova/openstack/common/threadgroup.py:145 in wait
    `x.wait()`

/home/berrange/src/cloud/nova/nova/openstack/common/threadgroup.py:47 in wait
    `return self.thread.wait()`

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:175 in wait
    `return self._exit_event.wait()`

/usr/lib/python2.7/site-packages/eventlet/event.py:121 in wait
    `return hubs.get_hub().switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

No Traceback!

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/green/thread.py:40 in __thread_body
    `func(*args, **kwargs)`

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/usr/lib64/python2.7/threading.py:764 in run
    `self.__target(*self.__args, **self.__kwargs)`

/usr/lib/python2.7/site-packages/qpid/selector.py:126 in run
    `rd, wr, ex = select(self.reading, self.writing, (), timeout)`

/usr/lib/python2.7/site-packages/eventlet/green/select.py:83 in select
    `return hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:214 in main
    `result = function(*args, **kwargs)`

/home/berrange/src/cloud/nova/nova/openstack/common/service.py:492 in run_service
    `done.wait()`

/usr/lib/python2.7/site-packages/eventlet/event.py:121 in wait
    `return hubs.get_hub().switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:214 in main
    `result = function(*args, **kwargs)`

/home/berrange/src/cloud/nova/nova/virt/libvirt/host.py:124 in _dispatch_thread
    `self._dispatch_events()`

/home/berrange/src/cloud/nova/nova/virt/libvirt/host.py:228 in _dispatch_events
    `_c = self._event_notify_recv.read(1)`

/usr/lib64/python2.7/socket.py:380 in read
    `data = self._sock.recv(left)`

/usr/lib/python2.7/site-packages/eventlet/greenio.py:464 in recv
    `self._trampoline(self, read=True)`

/usr/lib/python2.7/site-packages/eventlet/greenio.py:439 in _trampoline
    `mark_as_closed=self._mark_as_closed)`

/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py:162 in trampoline
    `return hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/tpool.py:55 in tpool_trampoline
    `_c = _rsock.recv(1)`

/usr/lib/python2.7/site-packages/eventlet/greenio.py:325 in recv
    `timeout_exc=socket.timeout("timed out"))`

/usr/lib/python2.7/site-packages/eventlet/greenio.py:200 in _trampoline
    `mark_as_closed=self._mark_as_closed)`

/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py:162 in trampoline
    `return hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:214 in main
    `result = function(*args, **kwargs)`

/usr/lib/python2.7/site-packages/oslo_utils/excutils.py:92 in inner_func
    `return infunc(*args, **kwargs)`

/usr/lib/python2.7/site-packages/oslo_messaging/_executors/impl_eventlet.py:93 in _executor_thread
    `incoming = self.listener.poll()`

/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:121 in poll
    `self.conn.consume(limit=1, timeout=timeout)`

/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_qpid.py:755 in consume
    `six.next(it)`

/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_qpid.py:685 in iterconsume
    `yield self.ensure(_error_callback, _consume)`

/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_qpid.py:602 in ensure
    `return method()`

/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_qpid.py:669 in _consume
    `timeout=poll_timeout)`

<string>:6 in next_receiver
    (source not found)

/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py:674 in next_receiver
    `if self._ecwait(lambda: self.incoming, timeout):`

/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py:50 in _ecwait
    `result = self._ewait(lambda: self.closed or predicate(), timeout)`

/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py:580 in _ewait
    `result = self.connection._ewait(lambda: self.error or predicate(), timeout)`

/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py:218 in _ewait
    `result = self._wait(lambda: self.error or predicate(), timeout)`

/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py:197 in _wait
    `return self._waiter.wait(predicate, timeout=timeout)`

/usr/lib/python2.7/site-packages/qpid/concurrency.py:59 in wait
    `self.condition.wait(timeout - passed)`

/usr/lib/python2.7/site-packages/qpid/concurrency.py:96 in wait
    `sw.wait(timeout)`

/usr/lib/python2.7/site-packages/qpid/compat.py:53 in wait
    `ready, _, _ = select([self], [], [], timeout)`

/usr/lib/python2.7/site-packages/eventlet/green/select.py:83 in select
    `return hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/home/berrange/src/cloud/nova/nova/openstack/common/loopingcall.py:90 in _inner
    `greenthread.sleep(-delay if delay < 0 else 0)`

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:34 in sleep
    `hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

------                        Green Thread                        ------

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:214 in main
    `result = function(*args, **kwargs)`

/home/berrange/src/cloud/nova/nova/openstack/common/loopingcall.py:133 in _inner
    `greenthread.sleep(idle)`

/usr/lib/python2.7/site-packages/eventlet/greenthread.py:34 in sleep
    `hub.switch()`

/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`

========================================================================
====                           Processes                            ====
========================================================================
Process 27760 (under 27711) [ run by: berrange (1000), state: running ]

========================================================================
====                         Configuration                          ====
========================================================================

cells: 
  bandwidth_update_interval = 600
  call_timeout = 60
  capabilities = 
    hypervisor=xenserver;kvm
    os=linux;windows
  cell_type = compute
  enable = False
  manager = nova.cells.manager.CellsManager
  mute_child_interval = 300
  name = nova
  reserve_percent = 10.0
  topic = cells

cinder: 
  cafile = None
  catalog_info = volumev2:cinderv2:publicURL
  certfile = None
  cross_az_attach = True
  endpoint_template = None
  http_retries = 3
  insecure = False
  keyfile = None
  os_region_name = None
  timeout = None

conductor: 
  manager = nova.conductor.manager.ConductorManager
  topic = conductor
  use_local = False
  workers = None

database: 
  backend = sqlalchemy
  connection = ***
  connection_debug = 0
  connection_trace = False
  db_inc_retry_interval = True
  db_max_retries = 20
  db_max_retry_interval = 10
  db_retry_interval = 1
  idle_timeout = 3600
  max_overflow = None
  max_pool_size = None
  max_retries = 10
  min_pool_size = 1
  mysql_sql_mode = TRADITIONAL
  pool_timeout = None
  retry_interval = 10
  slave_connection = ***
  sqlite_db = nova.sqlite
  sqlite_synchronous = True
  use_db_reconnect = False
  use_tpool = False

default: 
  allow_migrate_to_same_host = True
  allow_resize_to_same_host = True
  allow_same_net_traffic = True
  amqp_auto_delete = False
  amqp_durable_queues = False
  api_paste_config = /etc/nova/api-paste.ini
  api_rate_limit = False
  auth_strategy = keystone
  auto_assign_floating_ip = False
  backdoor_port = None
  bandwidth_poll_interval = 600
  bindir = /usr/bin
  block_device_allocate_retries = 60
  block_device_allocate_retries_interval = 3
  boot_script_template = /home/berrange/src/cloud/nova/nova/cloudpipe/bootscript.template
  ca_file = cacert.pem
  ca_path = /home/berrange/src/cloud/data/nova/CA
  cert_manager = nova.cert.manager.CertManager
  client_socket_timeout = 900
  cnt_vpn_clients = 0
  compute_available_monitors = 
    nova.compute.monitors.all_monitors
  compute_driver = libvirt.LibvirtDriver
  compute_manager = nova.compute.manager.ComputeManager
  compute_monitors = 
  compute_resources = 
    vcpu
  compute_stats_class = nova.compute.stats.Stats
  compute_topic = compute
  config-dir = None
  config-file = 
    /etc/nova/nova.conf
  config_drive_format = iso9660
  config_drive_skip_versions = 1.0 2007-01-19 2007-03-01 2007-08-29 2007-10-10 2007-12-15 2008-02-01 2008-09-01
  console_host = mustard.gsslab.fab.redhat.com
  console_manager = nova.console.manager.ConsoleProxyManager
  console_topic = console
  consoleauth_manager = nova.consoleauth.manager.ConsoleAuthManager
  consoleauth_topic = consoleauth
  control_exchange = nova
  create_unique_mac_address_attempts = 5
  crl_file = crl.pem
  db_driver = nova.db
  debug = True
  default_access_ip_network_name = None
  default_availability_zone = nova
  default_ephemeral_format = ext4
  default_flavor = m1.small
  default_floating_pool = public
  default_log_levels = 
    amqp=WARN
    amqplib=WARN
    boto=WARN
    glanceclient=WARN
    iso8601=WARN
    keystonemiddleware=WARN
    oslo.messaging=INFO
    qpid=WARN
    requests.packages.urllib3.connectionpool=WARN
    routes.middleware=WARN
    sqlalchemy=WARN
    stevedore=WARN
    suds=INFO
    urllib3.connectionpool=WARN
    websocket=WARN
  default_notification_level = INFO
  default_publisher_id = None
  default_schedule_zone = None
  defer_iptables_apply = False
  dhcp_domain = novalocal
  dhcp_lease_time = 86400
  dhcpbridge = /usr/bin/nova-dhcpbridge
  dhcpbridge_flagfile = 
    /etc/nova/nova.conf
  dmz_cidr = 
  dmz_mask = 255.255.255.0
  dmz_net = 10.0.0.0
  dns_server = 
  dns_update_periodic_interval = -1
  dnsmasq_config_file = 
  ebtables_exec_attempts = 3
  ebtables_retry_interval = 1.0
  ec2_listen = 0.0.0.0
  ec2_listen_port = 8773
  ec2_private_dns_show_ip = False
  ec2_strict_validation = True
  ec2_timestamp_expiry = 300
  ec2_workers = 2
  enable_new_services = True
  enabled_apis = 
    ec2
    metadata
    osapi_compute
  enabled_ssl_apis = 
  fake_call = False
  fake_network = False
  fatal_deprecations = False
  fatal_exception_format_errors = False
  firewall_driver = nova.virt.libvirt.firewall.IptablesFirewallDriver
  fixed_ip_disassociate_timeout = 600
  fixed_range_v6 = fd00::/48
  flat_injected = False
  flat_interface = p1p1
  flat_network_bridge = br100
  flat_network_dns = 8.8.4.4
  floating_ip_dns_manager = nova.network.noop_dns_driver.NoopDNSDriver
  force_config_drive = always
  force_dhcp_release = True
  force_raw_images = True
  force_snat_range = 
  forward_bridge_interface = 
    all
  gateway = None
  gateway_v6 = None
  heal_instance_info_cache_interval = 60
  host = mustard.gsslab.fab.redhat.com
  image_cache_manager_interval = 2400
  image_cache_subdirectory_name = _base
  injected_network_template = /home/berrange/src/cloud/nova/nova/virt/interfaces.template
  instance_build_timeout = 0
  instance_delete_interval = 300
  instance_dns_domain = 
  instance_dns_manager = nova.network.noop_dns_driver.NoopDNSDriver
  instance_format = [instance: %(uuid)s] 
  instance_name_template = instance-%08x
  instance_usage_audit = False
  instance_usage_audit_period = month
  instance_uuid_format = [instance: %(uuid)s] 
  instances_path = /home/berrange/src/cloud/data/nova/instances
  internal_service_availability_zone = internal
  iptables_bottom_regex = 
  iptables_drop_action = DROP
  iptables_top_regex = 
  ipv6_backend = rfc2462
  key_file = private/cakey.pem
  keys_path = /home/berrange/src/cloud/data/nova/keys
  keystone_ec2_insecure = False
  keystone_ec2_url = http://10.33.8.112:5000/v2.0/ec2tokens
  l3_lib = nova.network.l3.LinuxNetL3
  linuxnet_interface_driver = nova.network.linux_net.LinuxBridgeInterfaceDriver
  linuxnet_ovs_integration_bridge = br-int
  live_migration_retry_count = 30
  lockout_attempts = 5
  lockout_minutes = 15
  lockout_window = 15
  log-config-append = None
  log-date-format = %Y-%m-%d %H:%M:%S
  log-dir = None
  log-file = None
  log-format = None
  logging_context_format_string = %(asctime)s.%(msecs)03d %(color)s%(levelname)s %(name)s [%(request_id)s %(user_name)s %(project_name)s%(color)s] %(instance)s%(color)s%(message)s
  logging_debug_format_suffix = from (pid=%(process)d) %(funcName)s %(pathname)s:%(lineno)d
  logging_default_format_string = %(asctime)s.%(msecs)03d %(color)s%(levelname)s %(name)s [-%(color)s] %(instance)s%(color)s%(message)s
  logging_exception_prefix = %(color)s%(asctime)s.%(msecs)03d TRACE %(name)s %(instance)s
  max_age = 0
  max_concurrent_builds = 10
  max_header_line = 16384
  max_local_block_devices = 3
  maximum_instance_delete_attempts = 5
  memcached_servers = None
  metadata_host = 10.33.8.112
  metadata_listen = 0.0.0.0
  metadata_listen_port = 8775
  metadata_manager = nova.api.manager.MetadataManager
  metadata_port = 8775
  metadata_workers = 2
  migrate_max_retries = -1
  mkisofs_cmd = genisoimage
  monkey_patch = False
  monkey_patch_modules = 
    nova.api.ec2.cloud:nova.notifications.notify_decorator
    nova.compute.api:nova.notifications.notify_decorator
  multi_host = True
  multi_instance_display_name_template = %(name)s-%(count)d
  my_block_storage_ip = 10.33.8.112
  my_ip = 10.33.8.112
  network_allocate_retries = 0
  network_api_class = nova.network.api.API
  network_device_mtu = None
  network_driver = nova.network.linux_net
  network_manager = nova.network.manager.FlatDHCPManager
  network_size = 256
  network_topic = network
  networks_path = /home/berrange/src/cloud/data/nova/networks
  non_inheritable_image_properties = 
    bittorrent
    cache_in_nova
  notification_driver = 
  notification_topics = 
    notifications
  notify_api_faults = False
  notify_on_state_change = None
  novncproxy_base_url = http://10.33.8.112:6080/vnc_auto.html
  null_kernel = nokernel
  num_networks = 1
  osapi_compute_listen = 0.0.0.0
  osapi_compute_listen_port = 8774
  osapi_compute_workers = 2
  ovs_vsctl_timeout = 120
  password_length = 12
  pci_alias = 
  pci_passthrough_whitelist = 
  periodic_enable = True
  periodic_fuzzy_delay = 60
  policy_default_rule = default
  policy_dirs = 
    policy.d
  policy_file = policy.json
  preallocate_images = none
  project_cert_subject = /C=US/ST=California/O=OpenStack/OU=NovaDev/CN=project-ca-%.16s-%s
  public_interface = br100
  publish_errors = False
  pybasedir = /home/berrange/src/cloud/nova
  qpid_heartbeat = 60
  qpid_hostname = 10.33.8.112
  qpid_hosts = 
    10.33.8.112:5672
  qpid_password = ***
  qpid_port = 5672
  qpid_protocol = tcp
  qpid_receiver_capacity = 1
  qpid_sasl_mechanisms = 
  qpid_tcp_nodelay = True
  qpid_topology_version = 1
  qpid_username = 
  quota_cores = 20
  quota_driver = nova.quota.DbQuotaDriver
  quota_fixed_ips = -1
  quota_floating_ips = 10
  quota_injected_file_content_bytes = 10240
  quota_injected_file_path_length = 255
  quota_injected_files = 5
  quota_instances = 10
  quota_key_pairs = 100
  quota_metadata_items = 128
  quota_ram = 51200
  quota_security_group_rules = 20
  quota_security_groups = 10
  quota_server_group_members = 10
  quota_server_groups = 10
  reboot_timeout = 0
  reclaim_instance_interval = 0
  remove_unused_base_images = True
  remove_unused_original_minimum_age_seconds = 86400
  report_interval = 10
  rescue_timeout = 0
  reservation_expire = 86400
  reserved_host_disk_mb = 0
  reserved_host_memory_mb = 512
  resize_confirm_window = 0
  resize_fs_using_block_device = False
  resume_guests_state_on_host_boot = False
  rootwrap_config = /etc/nova/rootwrap.conf
  routing_source_ip = 10.33.8.112
  rpc_backend = qpid
  rpc_conn_pool_size = 30
  rpc_response_timeout = 60
  rpc_thread_pool_size = 64
  run_external_periodic_tasks = True
  running_deleted_instance_action = reap
  running_deleted_instance_poll_interval = 1800
  running_deleted_instance_timeout = 0
  scheduler_available_filters = 
    nova.scheduler.filters.all_filters
  scheduler_default_filters = 
    AvailabilityZoneFilter
    ComputeCapabilitiesFilter
    ComputeFilter
    ImagePropertiesFilter
    RamFilter
    RetryFilter
    ServerGroupAffinityFilter
    ServerGroupAntiAffinityFilter
  scheduler_manager = nova.scheduler.manager.SchedulerManager
  scheduler_max_attempts = 3
  scheduler_topic = scheduler
  scheduler_weight_classes = 
    nova.scheduler.weights.all_weighers
  security_group_api = nova
  send_arp_for_ha = True
  send_arp_for_ha_count = 3
  service_down_time = 60
  servicegroup_driver = db
  share_dhcp_address = False
  shelved_offload_time = 0
  shelved_poll_interval = 3600
  shutdown_timeout = 60
  snapshot_name_template = snapshot-%s
  ssl_ca_file = None
  ssl_cert_file = None
  ssl_key_file = None
  state_path = /home/berrange/src/cloud/data/nova
  sync_power_state_interval = 600
  syslog-log-facility = LOG_USER
  tcp_keepidle = 600
  teardown_unused_network_gateway = False
  tempdir = None
  transport_url = None
  until_refresh = 0
  update_dns_entries = False
  use-syslog = False
  use-syslog-rfc-format = False
  use_cow_images = True
  use_forwarded_for = False
  use_ipv6 = False
  use_network_dns_servers = False
  use_project_ca = False
  use_single_default_gateway = False
  use_stderr = True
  user_cert_subject = /C=US/ST=California/O=OpenStack/OU=NovaDev/CN=%.16s-%.16s-%s
  vcpu_pin_set = None
  vendordata_driver = nova.api.metadata.vendordata_json.JsonFileVendorData
  verbose = True
  vif_plugging_is_fatal = True
  vif_plugging_timeout = 300
  virt_mkfs = 
  vlan_interface = 
  vlan_start = 100
  vnc_enabled = True
  vnc_keymap = en-us
  vncserver_listen = 127.0.0.1
  vncserver_proxyclient_address = 127.0.0.1
  volume_api_class = nova.volume.cinder.API
  volume_usage_poll_interval = 0
  vpn_flavor = m1.tiny
  vpn_image_id = 0
  vpn_ip = 10.33.8.112
  vpn_key_suffix = -vpn
  vpn_start = 1000
  wsgi_default_pool_size = 1000
  wsgi_keep_alive = True
  wsgi_log_format = %(client_ip)s "%(request_line)s" status: %(status_code)s len: %(body_length)s time: %(wall_seconds).7f
  xvpvncproxy_base_url = http://10.33.8.112:6081/console

ephemeral_storage_encryption: 
  cipher = aes-xts-plain64
  enabled = False
  key_size = 512

glance: 
  allowed_direct_url_schemes = 
  api_insecure = False
  api_servers = 
    http://10.33.8.112:9292
  host = 10.33.8.112
  num_retries = 0
  port = 9292
  protocol = http

guestfs: 
  debug = False

keymgr: 
  api_class = nova.keymgr.conf_key_mgr.ConfKeyManager

libvirt: 
  block_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED, VIR_MIGRATE_NON_SHARED_INC
  checksum_base_images = False
  checksum_interval_seconds = 3600
  connection_uri = 
  cpu_mode = none
  cpu_model = None
  disk_cachemodes = 
  disk_prefix = None
  gid_maps = 
  glusterfs_mount_point_base = /home/berrange/src/cloud/data/nova/mnt
  hw_disk_discard = None
  hw_machine_type = None
  image_info_filename_pattern = /home/berrange/src/cloud/data/nova/instances/_base/%(image)s.info
  images_rbd_ceph_conf = 
  images_rbd_pool = rbd
  images_type = default
  images_volume_group = None
  inject_key = False
  inject_partition = -2
  inject_password = False
  iscsi_iface = None
  iscsi_use_multipath = False
  iser_use_multipath = False
  live_migration_bandwidth = 0
  live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED
  live_migration_uri = qemu+ssh://berrange@%s/system
  mem_stats_period_seconds = 10
  nfs_mount_options = None
  nfs_mount_point_base = /home/berrange/src/cloud/data/nova/mnt
  num_aoe_discover_tries = 3
  num_iscsi_scan_tries = 5
  num_iser_scan_tries = 5
  qemu_allowed_storage_drivers = 
  quobyte_client_cfg = None
  quobyte_mount_point_base = /home/berrange/src/cloud/data/nova/mnt
  rbd_secret_uuid = None
  rbd_user = None
  remove_unused_kernels = False
  remove_unused_resized_minimum_age_seconds = 3600
  rescue_image_id = None
  rescue_kernel_id = None
  rescue_ramdisk_id = None
  rng_dev_path = None
  scality_sofs_config = None
  scality_sofs_mount_point = /home/berrange/src/cloud/data/nova/scality
  smbfs_mount_options = 
  smbfs_mount_point_base = /home/berrange/src/cloud/data/nova/mnt
  snapshot_compression = False
  snapshot_image_format = None
  snapshots_directory = /home/berrange/src/cloud/data/nova/instances/snapshots
  sparse_logical_volumes = False
  sysinfo_serial = auto
  uid_maps = 
  use_usb_tablet = False
  use_virtio_for_bridges = True
  virt_type = kvm
  volume_clear = zero
  volume_clear_size = 0
  volume_drivers = 
    aoe=nova.virt.libvirt.volume.LibvirtAOEVolumeDriver
    fake=nova.virt.libvirt.volume.LibvirtFakeVolumeDriver
    fibre_channel=nova.virt.libvirt.volume.LibvirtFibreChannelVolumeDriver
    glusterfs=nova.virt.libvirt.volume.LibvirtGlusterfsVolumeDriver
    gpfs=nova.virt.libvirt.volume.LibvirtGPFSVolumeDriver
    iscsi=nova.virt.libvirt.volume.LibvirtISCSIVolumeDriver
    iser=nova.virt.libvirt.volume.LibvirtISERVolumeDriver
    local=nova.virt.libvirt.volume.LibvirtVolumeDriver
    nfs=nova.virt.libvirt.volume.LibvirtNFSVolumeDriver
    quobyte=nova.virt.libvirt.volume.LibvirtQuobyteVolumeDriver
    rbd=nova.virt.libvirt.volume.LibvirtNetVolumeDriver
    scality=nova.virt.libvirt.volume.LibvirtScalityVolumeDriver
    sheepdog=nova.virt.libvirt.volume.LibvirtNetVolumeDriver
    smbfs=nova.virt.libvirt.volume.LibvirtSMBFSVolumeDriver
  wait_soft_reboot_seconds = 120
  xen_hvmloader_path = /usr/lib/xen/boot/hvmloader

osapi_v3: 
  enabled = True
  extensions_blacklist = 
  extensions_whitelist = 

oslo_concurrency: 
  disable_process_locking = False
  lock_path = /home/berrange/src/cloud/data/nova

rdp: 
  enabled = False
  html5_proxy_base_url = http://127.0.0.1:6083/

remote_debug: 
  host = None
  port = None

serial_console: 
  base_url = ws://127.0.0.1:6083/
  enabled = False
  listen = 127.0.0.1
  port_range = 10000:20000
  proxyclient_address = 127.0.0.1

spice: 
  agent_enabled = True
  enabled = False
  html5proxy_base_url = http://10.33.8.112:6082/spice_auto.html
  keymap = en-us
  server_listen = 127.0.0.1
  server_proxyclient_address = 127.0.0.1

ssl: 
  ca_file = None
  cert_file = None
  key_file = None

upgrade_levels: 
  baseapi = None
  cells = None
  compute = None
  conductor = None
  console = None
  consoleauth = None
  network = None
  scheduler = None

workarounds: 
  disable_libvirt_livesnapshot = True
  disable_rootwrap = False

Usage of the libvirt virCommand APIs for process spawning

Posted: October 2nd, 2014 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Security, Virt Tools | Tags: bash, commands, exec, fork, popen, shellshock, spawn, system, virCommand | 4 Comments »

The previous blog post looked at the history of libvirt APIs for spawning processes, up to the current day where there is a single virCommand object + APIs for spawning processes in a very flexible manner. This blog post will now look at the key features of this API and how it is used in practice.

Example usage

Before going into the dry details, let’s consider a couple of real world examples where libvirt uses these APIs.

“system” replacement

As a first example, the “virNodeSuspendSetNodeWakeup” method is a place where libvirt might have traditionally used the ‘system’ call.

The goal is to suspend the host, setting a pre-defined wakeup alarm, for which libvirt needs to run the ‘rtcwake’ command. The wakeup time is provided to the API as an unsigned long long, which needs to be converted to a string.

If attempting this with the ‘system’ call the code might look like this:

static int virNodeSuspendSetNodeWakeup(unsigned long long alarmTime)
{
    char *setAlarmCmd;
    int ret = -1;

    if (asprintf(&setAlarmCmd, "rtcwake -m no -s %llu", alarmTime) < 0)
        goto cleanup;

    if (system(setAlarmCmd) < 0)
        goto cleanup;

    ret = 0;

 cleanup:
    free(setAlarmCmd);
    return ret;
}

Now consider what this would look like when using the virCommand APIs:

static int virNodeSuspendSetNodeWakeup(unsigned long long alarmTime)
{
    virCommandPtr setAlarmCmd;
    int ret = -1;

    setAlarmCmd = virCommandNewArgList("rtcwake", "-m", "no", "-s", NULL);
    virCommandAddArgFormat(setAlarmCmd, "%llu", alarmTime);

    if (virCommandRun(setAlarmCmd, NULL) < 0)
        goto cleanup;

    ret = 0;

 cleanup:
    virCommandFree(setAlarmCmd);
    return ret;
}

The difference in code complexity is negligible, but the difference in the quality of the implementation is significant.
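To make the quality difference a little more concrete, here is a minimal sketch in plain POSIX C of what an argv-based spawn boils down to. It is not libvirt’s implementation, and the helper name run_argv is made up for this illustration. The point it shows is that the arguments go straight to execvp() with no shell in between, so a value containing shell metacharacters reaches the child as one literal argument.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments, without a
 * shell. Returns the child's exit status, or -1 if the child could not
 * be spawned or waited for, or was killed by a signal. */
static int run_argv(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: argv is handed to the program verbatim, never re-parsed */
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With ‘system’, a string such as "x; exit 7" spliced into the command line would be interpreted by the shell. Here it is just the bytes of a single argument.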

“popen” replacement

As a second example, the “virStorageBackendIQNFound” method is a place where libvirt might have traditionally used the ‘popen’ call.

The goal this time is to run the iscsiadm command with a number of arguments and parse its stdout to look for a particular iSCSI target.

First consider what this might look like when using ‘popen’

static int virStorageBackendIQNFound(const char *initiatoriqn,
                                          char **ifacename)
{
    int ret = -1;
    FILE *fp = NULL;
    char line[4096];

    if ((fp = popen("iscsiadm --mode iface", "r")) == NULL)
        goto cleanup;

    while (fgets(line, 4096, fp) != NULL) {
       ...analyse line for a match... 
    }

    ret = 0;

cleanup:
    if (fp != NULL)
        pclose(fp);

    return ret;
}

Now consider the re-write to use virCommand APIs

static int virStorageBackendIQNFound(const char *initiatoriqn,
                                          char **ifacename)
{
    int ret = -1;
    FILE *fp = NULL;
    int fd = -1;
    char line[4096];
    virCommandPtr cmd = virCommandNewArgList("iscsiadm", "--mode", "iface", NULL);

    virCommandSetOutputFD(cmd, &fd);
    if (virCommandRunAsync(cmd, NULL) < 0)
        goto cleanup;

    if ((fp = fdopen(fd, "r")) == NULL)
        goto cleanup;

    while (fgets(line, 4096, fp) != NULL) {
       ...analyse line for a match... 
    }
 
    if (virCommandWait(cmd, NULL) < 0)
        goto cleanup;
 
    ret = 0;

cleanup:
    if (fp != NULL)
        fclose(fp);  /* also closes the underlying fd */
    else if (fd != -1)
        close(fd);
    virCommandFree(cmd);

    return ret;
}

There is a little more work to do for virCommand in terms of initial setup in this example. Technically the call to ‘virCommandWait’ could have been omitted here, since we don’t care about the exit status, but it is good practice to include it. If we had extra dynamic arguments to be provided to ‘iscsiadm’ that needed string formatting, the two examples would have been nearer parity in terms of complexity. Even with the slightly longer code for virCommand, the result is a clear win from the quality POV, avoiding the many flaws of popen’s implementation.
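For readers curious what sits underneath the output file descriptor in the example above, the following is a minimal sketch using only POSIX calls. It is an assumption-laden illustration rather than libvirt’s code, and the helper name spawn_read_stdout is hypothetical. It creates a pipe, forks, wires the write end to the child’s stdout and hands the parent a FILE for the read end, all without a shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawn argv without a shell and return a stream
 * connected to the child's stdout. The child's pid is stored in *pid so
 * the caller can reap it with waitpid() once it has finished reading. */
static FILE *spawn_read_stdout(char *const argv[], pid_t *pid)
{
    int pipefd[2];
    FILE *fp;

    assert(argv != NULL && argv[0] != NULL && pid != NULL);

    if (pipe(pipefd) < 0)
        return NULL;

    *pid = fork();
    if (*pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return NULL;
    }

    if (*pid == 0) {
        /* Child: make the pipe's write end our stdout, then exec */
        close(pipefd[0]);
        if (pipefd[1] != STDOUT_FILENO) {
            if (dup2(pipefd[1], STDOUT_FILENO) < 0)
                _exit(126);
            close(pipefd[1]);
        }
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep only the read end and wrap it in a FILE */
    close(pipefd[1]);
    if ((fp = fdopen(pipefd[0], "r")) == NULL) {
        close(pipefd[0]);
        waitpid(*pid, NULL, 0); /* avoid leaving a zombie behind */
    }
    return fp;
}
```

The caller reads lines with fgets(), then calls fclose() followed by waitpid() on the stored pid, mirroring the fdopen + virCommandWait sequence shown earlier.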

Detailed API examination

Now that the two examples have given a taste of what the virCommand APIs can do to replace popen/system, let’s consider the full set of features exposed. After this it should be clear that the flexibility of virCommand means there is never any need to delve into fork+exec anymore, let alone popen/system.

Constructing the command arguments

Probably the first task when spawning a command is to construct the array of arguments and environment variables. In simple cases this can be done immediately when allocating the new virCommand object instance, for example using var-args:

virCommandPtr cmd = virCommandNewArgList("touch", "/tmp/foo", NULL);

There is no need to check the return value of virCommandNew* for NULL. The later virCommandRun() API will look for a NULL pointer and report the OOM error at that point instead. The same is true for all error reporting in these APIs: virCommandRun is generally the only place where an error check is needed. This simplification in error handling is a major contributor to making these APIs hard to misuse in calling code, and thus to minimizing errors.
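The deferred error reporting idea can be sketched in a few lines of standalone C. This is a simplified illustration, not libvirt’s real data structure: the type cmd_t and the functions cmd_new, cmd_add_arg, cmd_check and cmd_free are all hypothetical names. Each builder call silently becomes a no-op once something has gone wrong, and a single check at the end (standing in for the run step) reports the failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified command object */
typedef struct {
    char **args;     /* NULL-terminated argument array */
    size_t nargs;    /* number of arguments, excluding the terminator */
    bool has_error;  /* set once any earlier step has failed */
} cmd_t;

/* Append one argument. On a NULL or already-failed command this does
 * nothing, so callers can chain calls without checking each one. */
static void cmd_add_arg(cmd_t *cmd, const char *arg)
{
    char **tmp;

    if (cmd == NULL || cmd->has_error)
        return;

    /* Invariant: the array, if allocated, is always NULL-terminated */
    assert(cmd->args == NULL || cmd->args[cmd->nargs] == NULL);

    if (arg == NULL) {
        cmd->has_error = true; /* remember the problem for later */
        return;
    }

    tmp = realloc(cmd->args, (cmd->nargs + 2) * sizeof(*tmp));
    if (tmp == NULL) {
        cmd->has_error = true;
        return;
    }
    cmd->args = tmp;

    if ((cmd->args[cmd->nargs] = strdup(arg)) == NULL) {
        cmd->has_error = true;
        return;
    }
    cmd->args[++cmd->nargs] = NULL;
}

/* Returns NULL on allocation failure; callers need not check for it */
static cmd_t *cmd_new(const char *binary)
{
    cmd_t *cmd = calloc(1, sizeof(*cmd));

    cmd_add_arg(cmd, binary); /* safe even if calloc() failed */
    return cmd;
}

/* Stand-in for the "run" step: the single place where errors surface */
static int cmd_check(const cmd_t *cmd)
{
    return (cmd == NULL || cmd->has_error) ? -1 : 0;
}

static void cmd_free(cmd_t *cmd)
{
    size_t i;

    if (cmd == NULL)
        return;
    for (i = 0; i < cmd->nargs; i++)
        free(cmd->args[i]);
    free(cmd->args);
    free(cmd);
}
```

The design choice is that calling code only ever needs one error branch, at the point where the command is executed.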

Sometimes the list of arguments is not so simple that it can be initialized in one go via var-args. To deal with this it is possible to add arguments to an existing constructed command in a variety of ways

virCommandAddArgFormat(cmd, "--size=%d", 1025);
virCommandAddArgPair(cmd, "--user", "fred");
virCommandAddArgList(cmd, "some", "extra", "args", NULL);

This is handy because for complex command lines (eg those used with QEMU) it allows construction of the virCommand to be split up across multiple functions, each adding their own piece of the command line.

Setting up the environment

By default a spawned process will inherit the full environment of the parent process (almost always libvirtd in libvirt code). With things like QEMU though, libvirt wants to be in complete control of the environment it runs under, so it will filter the environment to a subset of names. There are a handful of env variables that it is always desirable to pass down: LD_PRELOAD, LD_LIBRARY_PATH, PATH, USER, HOME, LOGNAME & TMPDIR. If this set is desired there is a convenient method to request passthrough of it:

virCommandAddEnvPassCommon(cmd);

Additional environment variables can be set for passthrough from libvirtd. When passing through environment variables libvirt requires an explicit decision on whether the env variable is safe to pass when running setuid. If an env variable is considered unsafe for a setuid application, there is the option of passing a default value to substitute. The “PATH” variable is unsafe to pass when setuid, and should be set to a known safe value when running setuid:

virCommandAddEnvPassBlockSUID(cmd, "PATH", "/bin:/usr/bin");

The “LD_LIBRARY_PATH” variable is also unsafe when running setuid and should simply be dropped from the environment entirely:

virCommandAddEnvPassBlockSUID(cmd, "LD_LIBRARY_PATH", NULL);

Finally, “LOGNAME” is fine to allow even when setuid, so it can be passed through unchanged:

virCommandAddEnvPassAllowSUID(cmd, "LOGNAME");

It is not always sufficient to just pass through existing environment variables, so there are of course APIs to set them directly:

virCommandAddEnvPair(cmd, "LOGNAME", "fred");
virCommandAddEnvFormat(cmd, "LOGNAME=%s", "fred");
virCommandAddEnvString(cmd, "LOGNAME=fred");
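The environment filtering described in this section can be approximated with standard C library calls. The sketch below is an illustration under stated assumptions, not libvirt’s implementation: the helpers env_filter and env_free are hypothetical, and the setuid-specific handling is left out. It builds a fresh "NAME=value" array containing only an allow-list of variables, which could then be given to execve() as the child’s environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Free an array previously returned by env_filter() */
static void env_free(char **env)
{
    size_t i;

    if (env == NULL)
        return;
    for (i = 0; env[i] != NULL; i++)
        free(env[i]);
    free(env);
}

/* Hypothetical helper: build a NULL-terminated "NAME=value" array holding
 * only those variables in names[] that are set in the current environment.
 * Returns NULL on allocation failure. */
static char **env_filter(const char *const names[], size_t nnames)
{
    char **env;
    size_t i, n = 0;

    assert(names != NULL || nnames == 0);

    env = calloc(nnames + 1, sizeof(*env));
    if (env == NULL)
        return NULL;

    for (i = 0; i < nnames; i++) {
        const char *val = getenv(names[i]);
        size_t len;

        if (val == NULL)
            continue; /* unset in the parent: nothing to pass through */

        len = strlen(names[i]) + 1 + strlen(val) + 1;
        if ((env[n] = malloc(len)) == NULL) {
            env_free(env);
            return NULL;
        }
        snprintf(env[n++], len, "%s=%s", names[i], val);
    }

    return env; /* env[n] is already NULL thanks to calloc() */
}
```

Anything not named in the allow-list simply never reaches the child, which is the opposite default from inheriting the parent’s whole environment.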

Setting security attributes

Under UNIX a program will inherit process limits, umask and working directory from the parent. It is thus desirable to be able to override this, for example:

virCommandSetMaxFiles(cmd, 65536);

In a similar vein, the child’s umask can also be set:

virCommandSetUmask(cmd, 0007);

If the current process’ working directory is unknown, it is a good idea to force an explicit working directory:

virCommandSetWorkingDirectory(cmd, "/");

It may sometimes be desirable to control what capabilities bits a child process has, to override the default behaviour. In such cases the command would need to be initialized with the empty set and then the desired bits whitelisted.

virCommandClearCaps(cmd);
virCommandAllowCap(cmd, CAP_NET_RAW);

Finally, when interacting with mandatory access control systems like SELinux or AppArmor, it is possible to configure an explicit label for the child:

virCommandSetSELinuxLabel(cmd, "system_u:system_r:svirt_t:s0:c135,c275");
virCommandSetAppArmorProfile(cmd, "2cb0e828-e6f6-40d1-b0f5-c50cdf34f5c9");
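The file limit, umask and working directory overrides above all have to be applied in the child, after fork() but before exec(). A minimal sketch of that ordering in plain POSIX C follows. It is not how libvirt implements it, the names child_attrs and run_with_attrs are hypothetical, and capabilities and MAC labels are deliberately left out since they need platform-specific libraries.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bundle of per-child overrides */
typedef struct {
    rlim_t max_files;     /* soft RLIMIT_NOFILE for the child */
    mode_t mask;          /* umask for the child */
    const char *workdir;  /* working directory for the child */
} child_attrs;

/* Run argv with the overrides applied; returns the exit status or -1 */
static int run_with_attrs(char *const argv[], const child_attrs *attrs)
{
    struct rlimit lim;
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL && attrs != NULL);

    if ((pid = fork()) < 0)
        return -1;

    if (pid == 0) {
        /* Child: apply each override and refuse to exec if one fails */
        if (getrlimit(RLIMIT_NOFILE, &lim) < 0)
            _exit(126);
        lim.rlim_cur = attrs->max_files; /* hard limit left untouched */
        if (setrlimit(RLIMIT_NOFILE, &lim) < 0)
            _exit(126);
        umask(attrs->mask);
        if (chdir(attrs->workdir) < 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: its own limits, umask and cwd are unaffected */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Failing closed matters here: if an override cannot be applied, the child exits rather than running the program with attributes inherited from the parent.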

Interacting with stdio

A common requirement when spawning processes is to be able to interact with the child’s stdio in some manner. By default with the virCommand APIs, a process will get its stdin, stdout & stderr connected to /dev/null. For stdin there is a choice of feeding it a fixed length string, or connecting it up to the read end of an existing file descriptor, typically a pipe

virCommandSetInputBuffer(cmd, "Feed me brains");
virCommandSetInputFD(cmd, pipefd);

There is a similar pair of choices for receiving the stdout/stderr from the child. It is possible to supply a pointer to a ‘char *’ which will be filled with the child’s output upon exit. Alternatively a pointer to an ‘int’ can be provided, which can either specify an existing file descriptor, or if ‘-1’ a new anonymous pipe will be created.

char *child_out, *child_err;
virCommandSetOutputBuffer(cmd, &child_out);
virCommandSetErrorBuffer(cmd, &child_err);

int out_fd = -1, err_fd = -1;
virCommandSetOutputFD(cmd, &out_fd);
virCommandSetErrorFD(cmd, &err_fd);

Passing file descriptors

Aside from any associated with stdio, all file descriptors will be closed when the child process is launched. This is generally a good thing since it prevents any leakage of file descriptors to the child. Such leakage can be a security flaw, and unless using glibc extensions to POSIX, it is not possible to avoid the race condition with setting O_CLOEXEC, so an explicit mass close is a very desirable approach. There can be times when it is necessary to pass other arbitrary file descriptors to a child process. When passing a file descriptor it may also be desirable to close it in the parent process.

virCommandPassFD(cmd, 8, 0);
virCommandPassFD(cmd, 8, VIR_COMMAND_PASS_FD_CLOSE_PARENT);

Pre-exec callback hooks

Despite its wide range of features, there are times when the virCommand API is not sufficient for the job. In these cases there is the ability to request that a callback function be invoked immediately before exec’ing the child binary. This allows the caller to do arbitrary extra work, though of course bearing in mind POSIX’s rules about which functions are safe to use between fork+exec in a threaded application.

int my_hook(void *data)
{
    if (sometask(data) < 0)
        return -1;
    return 0;
}

virCommandSetPreExecHook(cmd, my_hook, "thefilename");

Executing the command

Everything until this point has been about setting up the command args and the constraints under which it will execute. None of the APIs shown so far require any kind of error checking of return values. Only now that it is time to execute the command will errors be reported and checked by the caller. The simplest way to execute is to use ‘virCommandRun’ which will block until the command finishes running and report the exit status

int status;
if (virCommandRun(cmd, &status) < 0) {
    virCommandFree(cmd);
    return -1;
}

virCommandFree(cmd);

It is possible to leave the ‘status’ arg as NULL in which case the API will turn any non-zero exit status into a fatal error. If the intention is to interact with the command via one or more file descriptors connected to stdio, then a slightly more flexible ‘virCommandRunAsync’ API is required. This call will only block until the process is actually running. The parent can then interact with it and when ready call ‘virCommandWait’.

int status;
pid_t child;
if (virCommandRunAsync(cmd, &child) < 0) {
    virCommandFree(cmd);
    return -1;
}

...interact with child via stdio or other means...
if (virCommandWait(cmd, &status) < 0) {
    virCommandFree(cmd);
    return -1;
}

virCommandFree(cmd);

If something goes wrong during interaction, it is possible to terminate the process with prejudice by using ‘virCommandAbort’ instead of ‘virCommandWait’.

Synchronizing with the child

Normally, once a child has fork’d off, the child & parent will continue execution in parallel, with the parent having no idea at which point the final exec() will have been performed. There can be cases where the parent needs to do some work in between the fork and exec. A pre-exec hook can often be used for this, but if the work needs to take place in the parent process another solution is required. To deal with this it is possible to request that a handshake take place with the child process. Before the child exec’s its binary, it will notify the parent process that it wants to handshake and wait for a reply. The parent process meanwhile will wait to be notified, then do its work and finally reply to the child again:

virCommandRequireHandshake(cmd);

if (virCommandRunAsync(cmd, NULL) < 0)
    return -1;

virCommandHandshakeWait(cmd);
...do setup work...
virCommandHandshakeNotify(cmd);
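One way such a handshake can be built is with a pair of pipes, one in each direction. The following is a minimal sketch of that idea in POSIX C. It is an assumption about the general technique rather than a copy of libvirt’s code, the names handshake_t, spawn_with_handshake, handshake_wait and handshake_notify are hypothetical, and EINTR handling is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handshake state: one pipe in each direction */
typedef struct {
    pid_t pid;
    int from_child; /* parent reads the child's "ready" byte here */
    int to_child;   /* parent writes its "go ahead" byte here */
} handshake_t;

static int spawn_with_handshake(char *const argv[], handshake_t *hs)
{
    int c2p[2], p2c[2];
    char byte = '1';

    assert(argv != NULL && argv[0] != NULL && hs != NULL);

    if (pipe(c2p) < 0)
        return -1;
    if (pipe(p2c) < 0) {
        close(c2p[0]);
        close(c2p[1]);
        return -1;
    }

    hs->pid = fork();
    if (hs->pid < 0) {
        close(c2p[0]);
        close(c2p[1]);
        close(p2c[0]);
        close(p2c[1]);
        return -1;
    }

    if (hs->pid == 0) {
        close(c2p[0]);
        close(p2c[1]);
        /* Child: announce we are about to exec, then block for the reply.
         * If the parent goes away, read() returns 0 and we never exec. */
        if (write(c2p[1], &byte, 1) != 1 || read(p2c[0], &byte, 1) != 1)
            _exit(126);
        close(c2p[1]);
        close(p2c[0]);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep the ends it needs, close the child's ends */
    close(c2p[1]);
    close(p2c[0]);
    hs->from_child = c2p[0];
    hs->to_child = p2c[1];
    return 0;
}

/* Block until the child reports it has reached the handshake point */
static int handshake_wait(handshake_t *hs)
{
    char byte;

    return read(hs->from_child, &byte, 1) == 1 ? 0 : -1;
}

/* Release the child so it can go on to exec its binary */
static int handshake_notify(handshake_t *hs)
{
    char byte = '1';
    int ret = write(hs->to_child, &byte, 1) == 1 ? 0 : -1;

    close(hs->from_child);
    close(hs->to_child);
    return ret;
}
```

Between handshake_wait() and handshake_notify() the child exists as a process with a known pid but has not yet run any code from the target binary, which is the window the parent uses for its setup work.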

Simplified command execution

The examples above are very powerful, but in the simplest use cases it is possible to combine the virCommandNew + virCommandRun + virCommandFree calls into a single API call

const char *args[] = { "/bin/program", "arg1", "arg2", NULL };
if (virRun(args, NULL) < 0)
  return -1;

This is pretty much equivalent to ‘system’ in terms of complexity, but much safer as it avoids the shell and many other problems mentioned previously.

Integration with unit tests

One of the particularly interesting features of the virCommand APIs is the ability to do unit testing of code that otherwise spawns external commands. The test suite can define a callback that will be invoked any time an attempt is made to run a command. This callback can analyse the stdin string, fill in the stdout/stderr strings and set the exit status. In most cases this is sufficient to avoid the need to run the real command in the context of unit tests.

static void testCommandCallback(const char *const*args,
                                const char *const*env,
                                const char *input,
                                char **output,
                                char **error,
                                int *status,
                                void *opaque)
{
    ....fake the exit status and fill in **output or **error...
}

virCommandSetDryRun(NULL, testCommandCallback, NULL);
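The mechanism behind a dry-run mode like this can be sketched as a globally installed function pointer that the run function consults before spawning anything. The code below is a cut-down illustration with hypothetical names (dry_run_fn, set_dry_run, run_command, fake_iscsiadm) and a much smaller callback signature than the one shown above. It is not libvirt’s implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type, a reduced version of the one shown above */
typedef void (*dry_run_fn)(const char *const *args, int *status, void *opaque);

static dry_run_fn dry_run_cb;   /* NULL means "really run commands" */
static void *dry_run_opaque;

/* Install (or, with cb == NULL, remove) the dry-run callback */
static void set_dry_run(dry_run_fn cb, void *opaque)
{
    dry_run_cb = cb;
    dry_run_opaque = opaque;
}

/* Hypothetical run function: when a dry-run callback is installed it is
 * invoked instead of spawning a process. */
static int run_command(const char *const *args, int *status)
{
    assert(args != NULL && args[0] != NULL && status != NULL);

    if (dry_run_cb != NULL) {
        *status = 0; /* default exit status unless the callback changes it */
        dry_run_cb(args, status, dry_run_opaque);
        return 0;
    }

    /* A real implementation would fork + exec here; omitted in this sketch */
    return -1;
}

/* Example test callback: record which binary was requested and pretend
 * that it exited with status 3. */
static void fake_iscsiadm(const char *const *args, int *status, void *opaque)
{
    const char **seen = opaque;

    *seen = args[0];
    *status = 3;
}
```

A test would call set_dry_run(fake_iscsiadm, &seen) before exercising the code under test, then inspect ‘seen’ and the reported status afterwards, and finally call set_dry_run(NULL, NULL) to restore normal behaviour.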

The End

That completes our whirlwind tour of libvirt’s APIs for spawning child processes. It should be clear that a lot of thought & effort has gone into designing a set of APIs that maximize safety without compromising on ease of use. There can really be no excuse for using either the popen or system calls for spawning programs & thus leaving yourself vulnerable to flaws like shellshock. The libvirt code described in this post is all available under the terms of the LGPLv2+ should anyone wish to pull out & adapt the virCommand APIs for their own programs. I look forward to the day when it is possible to use a Linux system with no reliance on shell by any program. Shell should be exclusively for use by interactive login sessions and administrator local scripting work, not a part of applications where it only ever leads to misery & insecurity.

Daniel P. Berrangé

Writing about open source software, virtualization & more