Easier QEMU live tracing using systemtap

Posted: February 18th, 2019 | Filed under: Coding Tips, Fedora, Virt Tools

QEMU is able to leverage a number of live tracing systems, with the choice configurable at build time between

  • log – printf formatted string for each event sent into QEMU’s logging system which writes to stderr
  • syslog – printf formatted string for each event sent via syslog
  • simple – binary data stream for each event written to a file or fifo pipe
  • ftrace – printf formatted string for each event sent to kernel ftrace facility
  • dtrace – user space probe markers dynamically enabled via dtrace or systemtap
  • ust – user space probe markers dynamically enabled via LTT-ng

Upstream QEMU enables the “log” trace backend by default since it is cross-platform portable and very simple to use by adding “-d trace:PATTERN” on the QEMU command line. For example, to enable logging of all trace events in the QEMU I/O subsystem (aka “qio“) we can run

$ qemu -d trace:qio* ...some args...
23266@1547735759.137292:qio_channel_socket_new Socket new ioc=0x563a8a39d400
23266@1547735759.137305:qio_task_new Task new task=0x563a891d0570 source=0x563a8a39d400 func=0x563a86f1e6c0 opaque=0x563a89078000
23266@1547735759.137326:qio_task_thread_start Task thread start task=0x563a891d0570 worker=0x563a86f1ce50 opaque=0x563a891d9d90
23273@1547735759.137491:qio_task_thread_run Task thread run task=0x563a891d0570
23273@1547735759.137503:qio_channel_socket_connect_sync Socket connect sync ioc=0x563a8a39d400 addr=0x563a891d9d90
23273@1547735759.138108:qio_channel_socket_connect_fail Socket connect fail ioc=0x563a8a39d400
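
Since the “log” backend sends its output through QEMU’s logging subsystem, it can also be redirected away from stderr into a file with the “-D” option, which is handy when stderr is not easily captured:

$ qemu -d trace:qio* -D /tmp/qemu-trace.log ...some args...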

This is very simple and surprisingly effective much of the time, but it is not without its downsides

  • Inactive probes have non-negligible performance impact on hot codepaths
  • It is targeted at human consumption, so it is not easy to reliably process with machines
  • It requires adding arguments to QEMU’s command line so is not easy to enable in many cases
  • It is custom to QEMU so does not facilitate getting correlated traces across the whole system

For these reasons, some downstreams chose not to use the default “log” backend. Both Fedora and RHEL have instead enabled the “dtrace” backend which can be leveraged via systemtap on Linux. This provides a very powerful tracing system, but the cost is that the previous simple task of printing a formatted string when a probe point fires has become MUCH more complicated. For example to get equivalent output to that seen with QEMU’s log backend would require

# cat > trace.stp <<EOF
probe qemu.system.x86_64.qio_task_new {
    printf("%d@%d qio_task_new Task new task=%p source=%p func=%p opaque=%p\n", 
           pid(), gettimeofday_ns(), task, source, func, opaque)
}
EOF
# stap trace.stp
22806@1547735341399862570 qio_task_new Task new task=0x56135cd66eb0 source=0x56135d1d7c00 func=0x56135af746c0 opaque=0x56135bf06400

Repeat that code snippet for every qio* probe point you want to watch, figuring out the set of args it has available to print. This quickly becomes tedious for what should be a simple logging job, especially if you need to reference null terminated strings from userspace.
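
For instance, string arguments need an explicit user_string() call to copy the null terminated data out of the QEMU process’ address space. A sketch, using a hypothetical probe with a char * argument named “filename” (run “stap -L 'qemu.system.x86_64.*'” to list the real probe names and their args):

# cat > trace-str.stp <<EOF
# hypothetical probe & argument name, purely for illustration
probe qemu.system.x86_64.example_file_open {
    printf("%d@%d example_file_open filename=%s\n",
           pid(), gettimeofday_ns(), user_string(filename))
}
EOF
# stap trace-str.stp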

After cursing this difficulty one time too many, it occurred to me that QEMU could easily do more to make life easier for systemtap users. The QEMU build system is already auto-generating all the trace backend specific code from a generic description of probes in the QEMU source tree. It has a format string which is used in the syslog, log and ftrace backends, but this is ignored for the dtrace backend. It did not take much to change the code generator so that it can use this format string to generate a convenient systemtap tapset representing the above manually written probe:

probe qemu.system.x86_64.log.qio_task_new = qemu.system.x86_64.qio_task_new ?
{
    printf("%d@%d qio_task_new Task new task=%p source=%p func=%p opaque=%p\n",
           pid(), gettimeofday_ns(), task, source, func, opaque)
}

This can be trivially executed, requiring only minimal knowledge of the systemtap tapset language

# stap -e "probe qemu.system.x86_64.log.qio_task_new {}"
22806@1547735341399862570 qio_task_new Task new task=0x56135cd66eb0 source=0x56135d1d7c00 func=0x56135af746c0 opaque=0x56135bf06400

Even better, we have now gained the ability to use wildcards too

# stap -e "probe qemu.system.x86_64.log.qio* {}"
23266@1547735759.137292:qio_channel_socket_new Socket new ioc=0x563a8a39d400
23266@1547735759.137305:qio_task_new Task new task=0x563a891d0570 source=0x563a8a39d400 func=0x563a86f1e6c0 opaque=0x563a89078000
23266@1547735759.137326:qio_task_thread_start Task thread start task=0x563a891d0570 worker=0x563a86f1ce50 opaque=0x563a891d9d90
23273@1547735759.137491:qio_task_thread_run Task thread run task=0x563a891d0570
23273@1547735759.137503:qio_channel_socket_connect_sync Socket connect sync ioc=0x563a8a39d400 addr=0x563a891d9d90
23273@1547735759.138108:qio_channel_socket_connect_fail Socket connect fail ioc=0x563a8a39d400

Users still, however, need to be aware of the naming convention for QEMU’s systemtap tapsets and how it maps to the particular QEMU binary that is used, and must not forget the trailing “{}“. Thus I decided to go one step further and ship a small helper tool to make it even easier to use

$ qemu-trace-stap run qemu-system-x86_64 'qio*'
22806@1547735341399856820 qio_channel_socket_new Socket new ioc=0x56135d1d7c00
22806@1547735341399862570 qio_task_new Task new task=0x56135cd66eb0 source=0x56135d1d7c00 func=0x56135af746c0 opaque=0x56135bf06400
22806@1547735341399865943 qio_task_thread_start Task thread start task=0x56135cd66eb0 worker=0x56135af72e50 opaque=0x56135c071d70
22806@1547735341399976816 qio_task_thread_run Task thread run task=0x56135cd66eb0

The second argument to this tool is the QEMU binary filename to be traced, which can be relative (to search $PATH) or absolute. What is clever is that it will set the SYSTEMTAP_TAPSET env variable to point to the right location to find the corresponding tapset definition. This is very useful when you have multiple copies of QEMU on the system and need to make sure systemtap traces the right one.
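
Behind the scenes this is roughly equivalent to exporting the variable by hand before invoking stap (the tapset path shown is illustrative; it depends on the QEMU installation prefix):

$ export SYSTEMTAP_TAPSET=/usr/share/systemtap/tapset
$ stap -e "probe qemu.system.x86_64.log.qio* {}"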

The ‘qemu-trace-stap‘ script takes a verbose arg so you can understand what it is running behind the scenes:

$ qemu-trace-stap run /home/berrange/usr/qemu-git/bin/qemu-system-x86_64 'qio*'
Using tapset dir '/home/berrange/usr/qemu-git/share/systemtap/tapset' for binary '/home/berrange/usr/qemu-git/bin/qemu-system-x86_64'
Compiling script 'probe qemu.system.x86_64.log.qio* {}'
Running script, <Ctrl>-c to quit
...trace output...

It can enable multiple probes at once

$ qemu-trace-stap run qemu-system-x86_64 'qio*' 'qcrypto*' 'buffer*'

By default it monitors all existing running processes and all future launched processes. This can be restricted to a specific PID using the --pid arg

$ qemu-trace-stap run --pid 2532 qemu-system-x86_64 'qio*'

Finally, if you can’t remember what probes are valid, it can tell you

$ qemu-trace-stap list qemu-system-x86_64
ahci_check_irq
ahci_cmd_done
ahci_dma_prepare_buf
ahci_dma_prepare_buf_fail
ahci_dma_rw_buf
ahci_irq_lower
...snip...

This new functionality was merged into QEMU upstream a short while ago and will be included in the QEMU 4.0 release coming at the end of April.

CPU model configuration for QEMU/KVM on x86 hosts

Posted: June 29th, 2018 | Filed under: Fedora, libvirt, OpenStack, Security, Virt Tools

With the various CPU hardware vulnerabilities reported this year, guest CPU configuration is now a security critical task. This blog post contains content I’ve written that is on its way to becoming part of the QEMU documentation.

QEMU / KVM virtualization supports two ways to configure CPU models

Host passthrough
This passes the host CPU model, features and stepping exactly to the guest. Note that KVM may filter out some host CPU model features if they cannot be supported with virtualization. Live migration is unsafe when this mode is used, as libvirt / QEMU cannot guarantee a stable CPU is exposed to the guest across hosts. This is the recommended CPU mode to use, provided live migration is not required.
Named model
QEMU comes with a number of predefined named CPU models, that typically refer to specific generations of hardware released by Intel and AMD. These allow the guest VMs to have a degree of isolation from the host CPU, allowing greater flexibility in live migrating between hosts with differing hardware.

In both cases, it is possible to optionally add or remove individual CPU features, to alter what is presented to the guest by default.

Libvirt supports a third way to configure CPU models known as “Host model”. This uses the QEMU “Named model” feature, automatically picking a CPU model that is similar to the host CPU, and then adding extra features to approximate the host model as closely as possible. This does not guarantee the CPU family, stepping, etc will precisely match the host CPU, as they would with “Host passthrough”, but gives much of the benefit of passthrough, while making live migration safe.

Recommendations for KVM CPU model configuration on x86 hosts

The information that follows provides recommendations for configuring CPU models on x86 hosts. The goals are to maximise performance, while protecting the guest OS against various CPU hardware flaws, and optionally enabling live migration between hosts with heterogeneous CPU models.

Preferred CPU models for Intel x86 hosts

The following CPU models are preferred for use on Intel hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

Skylake-Server
Skylake-Server-IBRS
Intel Xeon Processor (Skylake, 2016)
Skylake-Client
Skylake-Client-IBRS
Intel Core Processor (Skylake, 2015)
Broadwell
Broadwell-IBRS
Broadwell-noTSX
Broadwell-noTSX-IBRS
Intel Core Processor (Broadwell, 2014)
Haswell
Haswell-IBRS
Haswell-noTSX
Haswell-noTSX-IBRS
Intel Core Processor (Haswell, 2013)
IvyBridge
IvyBridge-IBRS
Intel Xeon E3-12xx v2 (Ivy Bridge, 2012)
SandyBridge
SandyBridge-IBRS
Intel Xeon E312xx (Sandy Bridge, 2011)
Westmere
Westmere-IBRS
Westmere E56xx/L56xx/X56xx (Nehalem-C, 2010)
Nehalem
Nehalem-IBRS
Intel Core i7 9xx (Nehalem Class Core i7, 2008)
Penryn
Intel Core 2 Duo P9xxx (Penryn Class Core 2, 2007)
Conroe
Intel Celeron_4x0 (Conroe/Merom Class Core 2, 2006)
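
To see which named models a particular QEMU binary or libvirt installation actually knows about, both can be queried directly, for example:

$ qemu-system-x86_64 -cpu help
$ virsh cpu-models x86_64

In addition “virsh domcapabilities” reports which of the models are usable on the current host.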

Important CPU features for Intel x86 hosts

The following are important CPU features that should be used on Intel x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”. A combined example of enabling them follows the list.

pcid
Recommended to mitigate the cost of the Meltdown (CVE-2017-5754) fix. Included by default in Haswell, Broadwell & Skylake Intel CPU models. Should be explicitly turned on for Westmere, SandyBridge, and IvyBridge Intel CPU models. Note that some desktop/mobile Westmere CPUs cannot support this feature.
spec-ctrl
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in Intel CPU models with -IBRS suffix. Must be explicitly turned on for Intel CPU models without -IBRS suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any Intel CPU model. Must be explicitly turned on for all Intel CPU models. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any Intel CPU model. Should be explicitly turned on for all Intel CPU models. Note that not all CPU hardware will support this feature.
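
As a sketch, these features are layered onto a named model with the “+” prefix on the QEMU command line (whether they can actually be enabled depends on the host CPU and its microcode):

$ qemu-system-x86_64 -cpu IvyBridge,+pcid,+spec-ctrl,+ssbd,+pdpe1gb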

Preferred CPU models for AMD x86 hosts

The following CPU models are preferred for use on AMD hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

EPYC
EPYC-IBPB
AMD EPYC Processor (2017)
Opteron_G5
AMD Opteron 63xx class CPU (2012)
Opteron_G4
AMD Opteron 62xx class CPU (2011)
Opteron_G3
AMD Opteron 23xx (Gen 3 Class Opteron, 2009)
Opteron_G2
AMD Opteron 22xx (Gen 2 Class Opteron, 2006)
Opteron_G1
AMD Opteron 240 (Gen 1 Class Opteron, 2004)

Important CPU features for AMD x86 hosts

The following are important CPU features that should be used on AMD x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”. A combined example of enabling them follows the list.

ibpb
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in AMD CPU models with -IBPB suffix. Must be explicitly turned on for AMD CPU models without -IBPB suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
virt-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility. Note for some QEMU / libvirt versions, this must be force enabled when using “Host model”, because this is a virtual feature that doesn’t exist in the physical host CPUs.
amd-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This provides higher performance than virt-ssbd so should be exposed to guests whenever available in the host. virt-ssbd should nonetheless also be exposed for maximum guest compatibility as some kernels only know about virt-ssbd.
amd-no-ssb
Recommended to indicate the host is not vulnerable to CVE-2018-3639. Not included by default in any AMD CPU model. Future CPU hardware generations will not be vulnerable to CVE-2018-3639, and thus the guest should be told not to enable its mitigations, by exposing amd-no-ssb. This is mutually exclusive with virt-ssbd and amd-ssbd.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any AMD CPU model. Should be explicitly turned on for all AMD CPU models. Note that not all CPU hardware will support this feature.
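
The equivalent sketch for an AMD host, layering the features onto the base EPYC model (again subject to host CPU and microcode support):

$ qemu-system-x86_64 -cpu EPYC,+ibpb,+virt-ssbd,+pdpe1gb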

Default x86 CPU models

The default QEMU CPU models are designed such that they can run on all hosts. If an application does not wish to perform any host compatibility checks before launching guests, the default is guaranteed to work.

The default CPU models will, however, leave the guest OS vulnerable to various CPU hardware flaws, so their use is strongly discouraged. Applications should follow the earlier guidance to setup a better CPU configuration, with host passthrough recommended if live migration is not needed.

qemu32
qemu64
QEMU Virtual CPU version 2.5+ (32 & 64 bit variants). qemu64 is used for x86_64 guests and qemu32 is used for i686 guests, when no -cpu argument is given to QEMU, or no <cpu> is provided in libvirt XML.

Other non-recommended x86 CPUs

The following CPU models are compatible with most AMD and Intel x86 hosts, but their usage is discouraged, as they expose a very limited featureset, which prevents guests from having optimal performance.

kvm32
kvm64
Common KVM processor (32 & 64 bit variants). Legacy models just for historical compatibility with ancient QEMU versions.
486
athlon
phenom
coreduo
core2duo
n270
pentium
pentium2
pentium3
Various very old x86 CPU models, mostly predating the introduction of hardware assisted virtualization, that should thus not be required for running virtual machines.

Syntax for configuring CPU models

The examples below illustrate the approach to configuring the various CPU models / features in QEMU and libvirt

QEMU command line

Host passthrough
   $ qemu-system-x86_64 -cpu host

With feature customization:

   $ qemu-system-x86_64 -cpu host,-vmx,...
Named CPU models
   $ qemu-system-x86_64 -cpu Westmere

With feature customization:

   $ qemu-system-x86_64 -cpu Westmere,+pcid,...

Libvirt guest XML

Host passthrough
   <cpu mode='host-passthrough'/>

With feature customization:

   <cpu mode='host-passthrough'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Host model
   <cpu mode='host-model'/>

With feature customization:

   <cpu mode='host-model'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Named model
   <cpu mode='custom'>
       <model>Westmere</model>
   </cpu>

With feature customization:

   <cpu mode='custom'>
       <model>Westmere</model>
       <feature name="pcid" policy="require"/>
       ...
   </cpu>


The Fedora virtualization software archive (aka virt-ark)

Posted: February 9th, 2018 | Filed under: Coding Tips, Fedora, libvirt, Virt Tools

With libvirt releasing 11 times a year and QEMU releasing three times a year, there is a quite large set of historical releases available by now. Both projects have a need to maintain compatibility across releases in varying areas. For QEMU the most important thing is that versioned machine types present the same guest ABI across releases, i.e. a ‘pc-2.0.0’ machine on QEMU 2.0.0 should be identical to a ‘pc-2.0.0’ machine on QEMU 2.5.0. If this rule is violated, the ability to live migrate and save/restore is doomed. For libvirt the most important thing is that a given guest configuration should be usable across many QEMU versions, even if the command line arguments required to achieve the configuration in QEMU have changed. This is key to libvirt’s promise that upgrading either libvirt or QEMU will not break either previously running guests, or future operations by the management tool. Finally, management applications using libvirt may promise that they’ll operate with any version of libvirt or QEMU from a given starting version onwards. This is key to ensuring a management application can be used on a wide range of distros each with different libvirt/QEMU versions. To achieve this the application must be confident it hasn’t unexpectedly made use of a feature that doesn’t exist in a previous version of libvirt/QEMU that is intended to be supported.

The key to all this is of course automated testing. Libvirt keeps a record of capabilities associated with each QEMU version in its GIT repo along with various sample pairs of XML files and QEMU arguments. This is good for unit testing, but there’s some stuff that’s only really practical to validate well by running functional tests against each QEMU release. For live migration compatibility, it is possible to produce reports specifying the guest ABI for each machine type, on each QEMU version and compare them for differences. There are a huge number of combinations of command line args that affect ABI though, so it is useful to actually have the real binaries available for testing, even if only to dynamically generate the reports.

The COPR repository

With the background motivation out of the way, lets get to the point of this blog post. A while ago I created a Fedora copr repository that contained many libvirt builds. These were created in a bit of a hacky way making it hard to keep it up to date as new releases of libvirt come out, or as new Fedora repos need to be targeted. So in the past week, I’ve done a bit of work to put this on a more sustainable footing and also integrate QEMU builds.

As a result, there is now a copr repo called ‘virt-ark‘ that currently targets Fedora 26 and 27, containing every QEMU version since 1.4.0 and every libvirt version since 1.2.0. That is 46 versions of libvirt dating back to Dec 2013, and 36 versions of QEMU dating back to Feb 2013. For QEMU I included all bugfix releases, which is why there are so many when there are only 3 major releases a year compared to libvirt’s 11 major versions a year.

# rpm -qa | grep -E '(libvirt|qemu)-ark' | sort
libvirt-ark-1_2_0-1.2.0-1.x86_64
libvirt-ark-1_2_10-1.2.10-2.fc27.x86_64
libvirt-ark-1_2_11-1.2.11-2.fc27.x86_64
...snip...
libvirt-ark-3_8_0-3.8.0-2.fc27.x86_64
libvirt-ark-3_9_0-3.9.0-2.fc27.x86_64
libvirt-ark-4_0_0-4.0.0-2.fc27.x86_64
qemu-ark-1_4_0-1.4.0-3.fc27.x86_64
qemu-ark-1_4_1-1.4.1-3.fc27.x86_64
qemu-ark-1_4_2-1.4.2-3.fc27.x86_64
...snip....
qemu-ark-2_8_1-2.8.1-3.fc27.x86_64
qemu-ark-2_9_0-2.9.0-2.fc27.x86_64
qemu-ark-2_9_1-2.9.1-3.fc27.x86_64

Notice how the package name includes the version string. Each package version installs into /opt/$APP/$VERSION, e.g. /opt/libvirt/1.2.0 or /opt/qemu/2.4.0, so they can all be installed at once and happily co-exist.
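
To try this out, enable the copr repo and install whichever version pairs are interesting (the OWNER name below is a placeholder; substitute the actual copr namespace the repo lives under):

$ sudo dnf copr enable OWNER/virt-ark
$ sudo dnf install libvirt-ark-1_2_20 qemu-ark-2_0_0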

Using the custom versions

To launch a particular version of libvirtd

$ sudo /opt/libvirt/1.2.20/sbin/libvirtd

The libvirt builds store all their configuration in /opt/libvirt/$VERSION/etc/libvirt, and create UNIX sockets in /opt/libvirt/$VERSION/var/run, so will (mostly) not conflict with the main Fedora installed libvirt. As a result though, you need to use the corresponding virsh binary to connect to it

$ /opt/libvirt/1.2.20/bin/virsh

To test building or running another app against this version of libvirt, set some environment variables

export PKG_CONFIG_PATH=/opt/libvirt/1.2.20/lib/pkgconfig
export LD_LIBRARY_PATH=/opt/libvirt/1.2.20/lib
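
For example, a trivial libvirt client could then be compiled and run against that specific version (a sketch, assuming a test.c that uses the libvirt API):

$ gcc -o test test.c $(pkg-config --cflags --libs libvirt)
$ ./test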

For libvirtd to pick up a custom QEMU version, it must appear in $PATH before the QEMU from /usr when libvirtd is started, e.g.

$ su -
# export PATH=/opt/qemu/2.0.0/bin:$PATH
# /opt/libvirt/1.2.20/sbin/libvirtd

Alternatively just pass in the custom QEMU binary path in the guest XML (if the management app being tested supports that).
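
In libvirt guest XML that means pointing the <emulator> element at the versioned binary, for example:

   <devices>
       <emulator>/opt/qemu/2.0.0/bin/qemu-system-x86_64</emulator>
       ...
   </devices>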

The build mechanics

When managing so many different versions of a software package you don’t want to be doing lots of custom work to create each one. Thus I have tried to keep everything as simple as possible. There is a Pagure hosted GIT repo containing the source for the builds. There are libvirt-ark.spec.in and qemu-ark.spec.in RPM specfile templates which are used for every version. No attempt is made to optimize the dependencies for each version; instead BuildRequires will just be the union of dependencies required across all versions. To keep build times down, for QEMU only the x86_64 architecture system emulator is enabled. In future I might enable the system emulators for other architectures that are commonly used (ppc, arm, s390), but won’t be enabling all the other ones QEMU has.

The only trouble comes when newer Fedora releases include a change which breaks the build. This has happened a few times for both libvirt and QEMU. The ‘patches/‘ subdirectory thus contains a handful of patches acquired from upstream GIT repos to fix the builds. Essentially though I can run

$  make copr APP=libvirt ARCHIVE_FMT=xz DOWNLOAD_URL=https://libvirt.org/sources/ VERSION=1.3.0

Or

$  make copr APP=qemu ARCHIVE_FMT=xz DOWNLOAD_URL=https://download.qemu.org/ VERSION=2.6.0

And it will download the pristine upstream source, write a spec file including any patches found locally, create a src.rpm and upload this to the copr build service. I’ll probably automate this a little more in future to avoid having to pass so many args to make, by keeping a CSV file with all metadata for each version.


Improving QEMU security part 7: TLS support for migration

Posted: August 16th, 2016 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Security, Virt Tools

This blog is part 7 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.

The live migration feature in QEMU allows a running VM to be moved from one host to another with no noticeable interruption in service and minimal performance impact. The live migration data stream will contain a serialized copy of the state of all emulated devices, along with all the guest RAM. In some versions of QEMU it is also used to transfer disk image content, but in modern QEMU use of the NBD protocol is preferred for this purpose. The guest RAM in particular can contain sensitive data that needs to be protected against any would-be attackers on the network between source and target hosts.

There are a number of ways to provide such security using external tools/services, including VPNs, IPsec, and SSH/stunnel tunnelling. The libvirtd daemon often already has a secure connection between the source and destination hosts for its own purposes, so many years back support was added to libvirt to automatically tunnel the live migration data stream over libvirt’s own secure connection. This solved both the encryption and authentication problems at once, but there are some downsides to this approach.

Tunnelling the connection means extra data copies for the live migration traffic, and when we look at guests with RAM many GB in size, the number of data copies will start to matter. The libvirt tunnel only supports tunnelling of a single data connection, and in future QEMU may well wish to use multiple TCP connections for the migration data stream to improve performance of post-copy. The use of NBD for storage migration is not supported with tunnelling via libvirt, since it would require extra connections too. IOW, while tunnelling over libvirt was a useful short term hack to provide security, it has outlived its practicality.

It is clear that QEMU needs to support TLS encryption natively on its live migration connections. The QEMU migration code has historically had its own distinct I/O layer called QEMUFile, which mixes up tracking of migration state with the connection establishment and I/O transfer support. As mentioned in a previous blog post, QEMU now has a general purpose I/O channel framework, so the bulk of the work involved converting the migration code over to use the QIOChannel classes and APIs, which greatly reduced the amount of code in the QEMU migration/ sub-folder as well as simplifying it somewhat. The TLS support involves the addition of two new parameters to the migration code. First, the “tls-creds” parameter provides the ID of a previously created TLS credential object, thus enabling use of TLS on the migration channel. This must be set on both the source and target QEMUs involved in the migration.

On the target host, QEMU would be launched with a set of TLS credentials for a server endpoint:

$ qemu-system-x86_64 -monitor stdio -incoming defer \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=server,id=tls0 \
    ...other args...

To enable incoming TLS migration 2 monitor commands are then used

(qemu) migrate_set_str_parameter tls-creds tls0
(qemu) migrate_incoming tcp:myhostname:9000

On the source host, QEMU is launched in a similar manner but using client endpoint credentials

$ qemu-system-x86_64 -monitor stdio \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=client,id=tls0 \
    ...other args...

To enable outgoing TLS migration 2 monitor commands are then used

(qemu) migrate_set_str_parameter tls-creds tls0
(qemu) migrate tcp:otherhostname:9000

The migration code supports a number of different protocols besides just “tcp:“. In particular it allows an “fd:” protocol to tell QEMU to use a passed-in file descriptor, and an “exec:” protocol to tell QEMU to launch an external command to tunnel the connection. It is desirable to be able to use TLS with these protocols too, but when using TLS the client QEMU needs to know the hostname of the target QEMU in order to correctly validate the x509 certificate it receives. Thus, a second “tls-hostname” parameter was added to allow QEMU to be informed of the hostname to use for x509 certificate validation when using a non-tcp migration protocol. This can be set on the source QEMU prior to starting the migration using the “migrate_set_str_parameter” monitor command

(qemu) migrate_set_str_parameter tls-hostname myhost.mydomain
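
Putting the two parameters together for an “exec:” migration, where the target hostname cannot be inferred from the URI (the tunnel command here is just a stand-in for whatever relays the migration stream to the target host):

(qemu) migrate_set_str_parameter tls-creds tls0
(qemu) migrate_set_str_parameter tls-hostname otherhostname
(qemu) migrate exec:some-tunnel-command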

This feature has been under development for a while and finally merged into QEMU GIT early in the 2.7.0 development cycle, so will be available for use when 2.7.0 is released in a few weeks. With the arrival of the 2.7.0 release there will finally be TLS support across all QEMU host services where TCP connections are commonly used, namely VNC, SPICE, NBD, migration and character devices.

Improving QEMU security part 6: TLS support for character devices

Posted: August 16th, 2016 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Security, Virt Tools

This blog is part 6 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.

A number of QEMU device models and objects use character devices for providing connectivity with the outside world, including the QEMU monitor, serial ports, parallel ports, virtio serial channels, RNG EGD object, CCID smartcard passthrough, IPMI device, USB device redirection and vhost-user. While some of these will only ever need a character device configured with local connectivity, some will certainly need to make use of TCP connections to remote hosts. Historically these connections have always been entirely in clear text, which is unacceptable in the modern hostile network environment where even internal networks cannot be trusted. Clearly the QEMU character device code requires the ability to use TLS for encrypting sensitive data and providing some level of authentication on connections.

The QEMU character device code was mostly using GLib’s GIOChannel framework for doing I/O, but this has a number of unsatisfactory limitations. It cannot do vectored I/O, is not easily extensible and does not concern itself at all with initial connection establishment. These are all reasons why the QIOChannel framework was added to QEMU. So the first step in supporting TLS on character devices was to convert the code over to use QIOChannel instead of GIOChannel. With that done, adding in support for TLS was quite straightforward, merely requiring addition of a new configuration property (“tls-creds“) to set the desired TLS credentials.

For example to run a QEMU VM with a serial port listening on IP 10.0.0.1, port 9000, acting as a TLS server:

$ qemu-system-x86_64 \
      -object tls-creds-x509,id=tls0,endpoint=server,dir=/home/berrange/qemutls \
      -chardev socket,id=s0,host=10.0.0.1,port=9000,tls-creds=tls0,server \
      -device isa-serial,chardev=s0 \
      ...other QEMU options...

It is possible to test connectivity to this TLS server using the gnutls-cli tool

$ gnutls-cli --priority=NORMAL -p 9000 \
      --x509cafile=/home/berrange/security/qemutls/ca-cert.pem \
      127.0.0.1
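
Note that the directory passed to tls-creds-x509 is expected to contain PEM files with fixed names: ca-cert.pem, plus server-cert.pem and server-key.pem for a server endpoint, or client-cert.pem and client-key.pem for a client endpoint. So for the setup above, something like:

$ ls /home/berrange/qemutls
ca-cert.pem  client-cert.pem  client-key.pem  server-cert.pem  server-key.pem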

In the above example, QEMU was running as a TCP server, and acting as the TLS server endpoint, but this matching is not required. It is valid to configure it to run as a TLS client if desired, though this would be somewhat uncommon.

Of course you can connect 2 QEMU VMs together, both using TLS. Assuming the above QEMU is still running, we can launch a second QEMU connecting to it with

$ qemu-system-x86_64 \
      -object tls-creds-x509,id=tls0,endpoint=client,dir=/home/berrange/qemutls \
      -chardev socket,id=s0,host=10.0.0.1,port=9000,tls-creds=tls0 \
      -device isa-serial,chardev=s0 \
      ...other QEMU options...

Notice we’ve changed the “endpoint” and removed the “server” option, so this second QEMU runs as a TCP client and acts as the TLS client endpoint.

This feature is available since the QEMU 2.6.0 release a few months ago.
