CPU model configuration for QEMU/KVM on x86 hosts

Posted: June 29th, 2018 | Filed under: Fedora, libvirt, OpenStack, Security, Virt Tools | 15 Comments

With the various CPU hardware vulnerabilities reported this year, guest CPU configuration is now a security-critical task. This blog post contains content I’ve written that is on its way to becoming part of the QEMU documentation.

QEMU / KVM virtualization supports two ways to configure CPU models:

Host passthrough
This passes the host CPU’s features, model and stepping to the guest exactly as they are. Note that KVM may filter out some host CPU model features if they cannot be supported with virtualization. Live migration is unsafe when this mode is used, as libvirt / QEMU cannot guarantee that a stable CPU is exposed to the guest across hosts. This is the recommended CPU to use, provided live migration is not required.
Named model
QEMU comes with a number of predefined named CPU models, that typically refer to specific generations of hardware released by Intel and AMD. These allow the guest VMs to have a degree of isolation from the host CPU, allowing greater flexibility in live migrating between hosts with differing hardware.

In both cases, it is possible to optionally add or remove individual CPU features, to alter what is presented to the guest by default.

Libvirt supports a third way to configure CPU models known as “Host model”. This uses the QEMU “Named model” feature, automatically picking a CPU model that is similar to the host CPU, and then adding extra features to approximate the host model as closely as possible. This does not guarantee the CPU family, stepping, etc. will precisely match the host CPU, as they would with “Host passthrough”, but gives much of the benefit of passthrough, while making live migration safe.

Recommendations for KVM CPU model configuration on x86 hosts

The information that follows provides recommendations for configuring CPU models on x86 hosts. The goals are to maximise performance, while protecting the guest OS against various CPU hardware flaws, and optionally enabling live migration between hosts with heterogeneous CPU models.

Preferred CPU models for Intel x86 hosts

The following CPU models are preferred for use on Intel hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

Skylake-Server
Skylake-Server-IBRS
Intel Xeon Processor (Skylake, 2016)
Skylake-Client
Skylake-Client-IBRS
Intel Core Processor (Skylake, 2015)
Broadwell
Broadwell-IBRS
Broadwell-noTSX
Broadwell-noTSX-IBRS
Intel Core Processor (Broadwell, 2014)
Haswell
Haswell-IBRS
Haswell-noTSX
Haswell-noTSX-IBRS
Intel Core Processor (Haswell, 2013)
IvyBridge
IvyBridge-IBRS
Intel Xeon E3-12xx v2 (Ivy Bridge, 2012)
SandyBridge
SandyBridge-IBRS
Intel Xeon E312xx (Sandy Bridge, 2011)
Westmere
Westmere-IBRS
Westmere E56xx/L56xx/X56xx (Nehalem-C, 2010)
Nehalem
Nehalem-IBRS
Intel Core i7 9xx (Nehalem Class Core i7, 2008)
Penryn
Intel Core 2 Duo P9xxx (Penryn Class Core 2, 2007)
Conroe
Intel Celeron_4x0 (Conroe/Merom Class Core 2, 2006)
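The "newest CPU model that is compatible across all desired hosts" rule can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the ordering is copied from the list above and treated as strictly newest-to-oldest (a simplification, particularly for the two Skylake variants), and the per-host lists of usable models are a hypothetical input that would need to be gathered from each host by some other means.

```go
package main

import "fmt"

// intelModels lists the base Intel model names from the list above,
// newest first. Treating this as a strict ordering is a simplification.
var intelModels = []string{
	"Skylake-Server", "Skylake-Client", "Broadwell", "Haswell",
	"IvyBridge", "SandyBridge", "Westmere", "Nehalem", "Penryn", "Conroe",
}

// NewestCommonModel returns the newest model that every host can run.
// The hosts map (host name -> usable model names) is a hypothetical input;
// how it gets populated is outside the scope of this sketch.
func NewestCommonModel(hosts map[string][]string) (string, bool) {
	if len(hosts) == 0 {
		return "", false
	}
	for _, model := range intelModels {
		if usableOnAll(model, hosts) {
			return model, true
		}
	}
	return "", false
}

// usableOnAll reports whether model appears in every host's list.
func usableOnAll(model string, hosts map[string][]string) bool {
	for _, usable := range hosts {
		found := false
		for _, m := range usable {
			if m == model {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	hosts := map[string][]string{
		"host-a": {"Haswell", "IvyBridge", "SandyBridge", "Westmere"},
		"host-b": {"IvyBridge", "SandyBridge", "Westmere"},
	}
	model, ok := NewestCommonModel(hosts)
	fmt.Println(model, ok) // prints "IvyBridge true"
}
```

The same approach would apply to the -IBRS and -noTSX variants, which are left out here to keep the sketch short.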

Important CPU features for Intel x86 hosts

The following are important CPU features that should be used on Intel x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

pcid
Recommended to mitigate the cost of the Meltdown (CVE-2017-5754) fix. Included by default in Haswell, Broadwell & Skylake Intel CPU models. Should be explicitly turned on for Westmere, SandyBridge, and IvyBridge Intel CPU models. Note that some desktop/mobile Westmere CPUs cannot support this feature.
spec-ctrl
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in Intel CPU models with -IBRS suffix. Must be explicitly turned on for Intel CPU models without -IBRS suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any Intel CPU model. Must be explicitly turned on for all Intel CPU models. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any Intel CPU model. Should be explicitly turned on for all Intel CPU models. Note that not all CPU hardware will support this feature.
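As a rough illustration of the rules above, the following Go sketch builds the value for QEMU's -cpu argument from a named Intel model. The function name is hypothetical, and the sketch assumes the host CPU and microcode actually support every feature it adds; a real tool would check that first rather than enabling features blindly.

```go
package main

import (
	"fmt"
	"strings"
)

// IntelCPUArg returns a "-cpu" value for a named Intel model, adding the
// features that the guidance above says are not included by default.
// It assumes (without checking) that the host supports each feature.
func IntelCPUArg(model string) string {
	var features []string

	// pcid is included by default in Haswell and newer models, but must be
	// turned on explicitly for Westmere, SandyBridge and IvyBridge.
	switch strings.TrimSuffix(model, "-IBRS") {
	case "Westmere", "SandyBridge", "IvyBridge":
		features = append(features, "pcid")
	}

	// spec-ctrl is only included in models carrying the -IBRS suffix.
	if !strings.HasSuffix(model, "-IBRS") {
		features = append(features, "spec-ctrl")
	}

	// ssbd and pdpe1gb are not included in any named Intel model.
	features = append(features, "ssbd", "pdpe1gb")

	arg := model
	for _, f := range features {
		arg += ",+" + f
	}
	return arg
}

func main() {
	fmt.Println(IntelCPUArg("Westmere"))     // Westmere,+pcid,+spec-ctrl,+ssbd,+pdpe1gb
	fmt.Println(IntelCPUArg("Haswell-IBRS")) // Haswell-IBRS,+ssbd,+pdpe1gb
}
```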

Preferred CPU models for AMD x86 hosts

The following CPU models are preferred for use on AMD hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

EPYC
EPYC-IBPB
AMD EPYC Processor (2017)
Opteron_G5
AMD Opteron 63xx class CPU (2012)
Opteron_G4
AMD Opteron 62xx class CPU (2011)
Opteron_G3
AMD Opteron 23xx (Gen 3 Class Opteron, 2009)
Opteron_G2
AMD Opteron 22xx (Gen 2 Class Opteron, 2006)
Opteron_G1
AMD Opteron 240 (Gen 1 Class Opteron, 2004)

Important CPU features for AMD x86 hosts

The following are important CPU features that should be used on AMD x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

ibpb
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in AMD CPU models with -IBPB suffix. Must be explicitly turned on for AMD CPU models without -IBPB suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
virt-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility. Note that for some QEMU / libvirt versions, this must be force enabled when using “Host model”, because this is a virtual feature that doesn’t exist in the physical host CPUs.
amd-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This provides higher performance than virt-ssbd so should be exposed to guests whenever available in the host. virt-ssbd should nonetheless also be exposed for maximum guest compatibility, as some kernels only know about virt-ssbd.
amd-no-ssb
Recommended to indicate the host is not vulnerable to CVE-2018-3639. Not included by default in any AMD CPU model. Future hardware generations of CPU will not be vulnerable to CVE-2018-3639, and thus the guest should be told not to enable its mitigations, by exposing amd-no-ssb. This is mutually exclusive with virt-ssbd and amd-ssbd.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any AMD CPU model. Should be explicitly turned on for all AMD CPU models. Note that not all CPU hardware will support this feature.
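The interaction between the Speculative Store Bypass related features can be easy to get wrong, so here is a minimal Go sketch of the selection logic described above. The function and its two boolean inputs are hypothetical; they stand in for whatever mechanism reports the host's capabilities. pdpe1gb is left out for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// AMDFeatures returns the extra guest CPU features suggested by the guidance
// above for a named AMD model. hostHasAMDSSBD and hostNotVulnerable are
// hypothetical inputs describing what the host CPU reports.
func AMDFeatures(model string, hostHasAMDSSBD, hostNotVulnerable bool) []string {
	var features []string

	// ibpb is only included by default in models with the -IBPB suffix.
	if !strings.HasSuffix(model, "-IBPB") {
		features = append(features, "ibpb")
	}

	// amd-no-ssb is mutually exclusive with virt-ssbd and amd-ssbd.
	if hostNotVulnerable {
		return append(features, "amd-no-ssb")
	}

	// virt-ssbd is always exposed for guest compatibility; amd-ssbd is
	// added on top when the host provides it, as it performs better.
	features = append(features, "virt-ssbd")
	if hostHasAMDSSBD {
		features = append(features, "amd-ssbd")
	}
	return features
}

func main() {
	fmt.Println(strings.Join(AMDFeatures("EPYC", true, false), ","))      // ibpb,virt-ssbd,amd-ssbd
	fmt.Println(strings.Join(AMDFeatures("EPYC-IBPB", false, true), ",")) // amd-no-ssb
}
```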

Default x86 CPU models

The default QEMU CPU models are designed such that they can run on all hosts. If an application does not wish to perform any host compatibility checks before launching guests, the default is guaranteed to work.

The default CPU models will, however, leave the guest OS vulnerable to various CPU hardware flaws, so their use is strongly discouraged. Applications should follow the earlier guidance to set up a better CPU configuration, with host passthrough recommended if live migration is not needed.

qemu32
qemu64
QEMU Virtual CPU version 2.5+ (32 & 64 bit variants). qemu64 is used for x86_64 guests and qemu32 is used for i686 guests, when no -cpu argument is given to QEMU, or no <cpu> is provided in libvirt XML.

Other non-recommended x86 CPUs

The following CPU models are compatible with most AMD and Intel x86 hosts, but their usage is discouraged, as they expose a very limited feature set, which prevents guests from achieving optimal performance.

kvm32
kvm64
Common KVM processor (32 & 64 bit variants). Legacy models just for historical compatibility with ancient QEMU versions.
486
athlon
phenom
coreduo
core2duo
n270
pentium
pentium2
pentium3
Various very old x86 CPU models, mostly predating the introduction of hardware assisted virtualization, that should thus not be required for running virtual machines.

Syntax for configuring CPU models

The examples below illustrate the approach to configuring the various CPU models / features in QEMU and libvirt.

QEMU command line

Host passthrough
   $ qemu-system-x86_64 -cpu host

With feature customization:

   $ qemu-system-x86_64 -cpu host,-vmx,...
Named CPU models
   $ qemu-system-x86_64 -cpu Westmere

With feature customization:

   $ qemu-system-x86_64 -cpu Westmere,+pcid,...

Libvirt guest XML

Host passthrough
   <cpu mode='host-passthrough'/>

With feature customization:

   <cpu mode='host-passthrough'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Host model
   <cpu mode='host-model'/>

With feature customization:

   <cpu mode='host-model'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Named model
   <cpu mode='custom'>
       <model>Westmere</model>
   </cpu>

With feature customization:

   <cpu mode='custom'>
       <model>Westmere</model>
       <feature name="pcid" policy="require"/>
       ...
   </cpu>
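For applications that generate libvirt guest XML programmatically, the named model syntax above can be produced with Go's encoding/xml package. The struct types below are hypothetical and written only for this illustration; they are not the types provided by any libvirt binding, and they cover just the mode attribute, the model element and feature elements.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// CPUFeature and CPU are hypothetical types for this sketch only.
type CPUFeature struct {
	Name   string `xml:"name,attr"`
	Policy string `xml:"policy,attr"`
}

type CPU struct {
	XMLName  xml.Name     `xml:"cpu"`
	Mode     string       `xml:"mode,attr"`
	Model    string       `xml:"model,omitempty"`
	Features []CPUFeature `xml:"feature"`
}

// RenderCPU formats the CPU configuration as an indented XML fragment.
func RenderCPU(cpu CPU) (string, error) {
	out, err := xml.MarshalIndent(cpu, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	doc, err := RenderCPU(CPU{
		Mode:     "custom",
		Model:    "Westmere",
		Features: []CPUFeature{{Name: "pcid", Policy: "require"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

Note that encoding/xml writes `<feature ...></feature>` rather than a self-closing tag and uses double quotes for attributes; both forms are equivalent XML.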

 

ANNOUNCE: gtk-vnc 0.7.2 release

Posted: March 23rd, 2018 | Filed under: Fedora, Gtk-Vnc, libvirt, Virt Tools | No Comments »

I’m pleased to announce a new release of GTK-VNC, version 0.7.2. The release focus is on bug fixing, and addresses an important regression in TLS handling from the previous release.

  • Deprecated the manual python2 binding in favour of GObject introspection. It will be deleted in the next release.
  • Emit led state notification on connect
  • Fix incorrect copyright notices
  • Simplify shifted-tab key handling
  • Don’t short circuit TLS credential request
  • Improve check for keymap under XWayland
  • Update doap description of project
  • Modernize RPM specfile

Thanks to all those who reported bugs and provided patches that went into this new release.

The Fedora virtualization software archive (aka virt-ark)

Posted: February 9th, 2018 | Filed under: Coding Tips, Fedora, libvirt, Virt Tools | 1 Comment

With libvirt releasing 11 times a year and QEMU releasing three times a year, there is quite a large set of historical releases available by now. Both projects have a need to maintain compatibility across releases in varying areas. For QEMU the most important thing is that versioned machine types present the same guest ABI across releases, i.e. a ‘pc-2.0.0’ machine on QEMU 2.0.0 should be identical to a ‘pc-2.0.0’ machine on QEMU 2.5.0. If this rule is violated, the ability to live migrate and save/restore is doomed. For libvirt the most important thing is that a given guest configuration should be usable across many QEMU versions, even if the command line arguments required to achieve the configuration in QEMU have changed. This is key to libvirt’s promise that upgrading either libvirt or QEMU will not break either previously running guests, or future operations by the management tool. Finally management applications using libvirt may promise that they’ll operate with any version of libvirt or QEMU from a given starting version onwards. This is key to ensuring a management application can be used on a wide range of distros each with different libvirt/QEMU versions. To achieve this the application must be confident it hasn’t unexpectedly made use of a feature that doesn’t exist in a previous version of libvirt/QEMU that is intended to be supported.

The key to all this is of course automated testing. Libvirt keeps a record of capabilities associated with each QEMU version in its GIT repo along with various sample pairs of XML files and QEMU arguments. This is good for unit testing, but there’s some stuff that’s only really practical to validate well by running functional tests against each QEMU release. For live migration compatibility, it is possible to produce reports specifying the guest ABI for each machine type, on each QEMU version and compare them for differences. There are a huge number of combinations of command line args that affect ABI though, so it is useful to actually have the real binaries available for testing, even if only to dynamically generate the reports.

The COPR repository

With the background motivation out of the way, let’s get to the point of this blog post. A while ago I created a Fedora copr repository that contained many libvirt builds. These were created in a bit of a hacky way, making it hard to keep it up to date as new releases of libvirt come out, or as new Fedora repos need to be targeted. So in the past week, I’ve done a bit of work to put this on a more sustainable footing and also integrate QEMU builds.

As a result, there is now a copr repo called ‘virt-ark’ that currently targets Fedora 26 and 27, containing every QEMU version since 1.4.0 and every libvirt version since 1.2.0. That is 46 versions of libvirt dating back to Dec 2013, and 36 versions of QEMU dating back to Feb 2013. For QEMU I included all bugfix releases, which is why there are so many when there are only 3 major releases a year compared to libvirt’s 11 major versions a year.

# rpm -qa | grep -E '(libvirt|qemu)-ark' | sort
libvirt-ark-1_2_0-1.2.0-1.x86_64
libvirt-ark-1_2_10-1.2.10-2.fc27.x86_64
libvirt-ark-1_2_11-1.2.11-2.fc27.x86_64
...snip...
libvirt-ark-3_8_0-3.8.0-2.fc27.x86_64
libvirt-ark-3_9_0-3.9.0-2.fc27.x86_64
libvirt-ark-4_0_0-4.0.0-2.fc27.x86_64
qemu-ark-1_4_0-1.4.0-3.fc27.x86_64
qemu-ark-1_4_1-1.4.1-3.fc27.x86_64
qemu-ark-1_4_2-1.4.2-3.fc27.x86_64
...snip....
qemu-ark-2_8_1-2.8.1-3.fc27.x86_64
qemu-ark-2_9_0-2.9.0-2.fc27.x86_64
qemu-ark-2_9_1-2.9.1-3.fc27.x86_64

Notice how the package name includes the version string. Each package version installs into /opt/$APP/$VERSION, eg /opt/libvirt/1.2.0 or /opt/qemu/2.4.0, so you can have them all installed at once and happily co-exist.

Using the custom versions

To launch a particular version of libvirtd

$ sudo /opt/libvirt/1.2.20/sbin/libvirtd

The libvirt builds store all their configuration in /opt/libvirt/$VERSION/etc/libvirt, and create UNIX sockets in /opt/libvirt/$VERSION/var/run, so they will (mostly) not conflict with the main Fedora installed libvirt. As a result though, you need to use the corresponding virsh binary to connect to it:

$ /opt/libvirt/1.2.20/bin/virsh

To test building or running another app against this version of libvirt set some environment variables

export PKG_CONFIG_PATH=/opt/libvirt/1.2.20/lib/pkgconfig
export LD_LIBRARY_PATH=/opt/libvirt/1.2.20/lib

For libvirtd to pick up a custom QEMU version, it must appear in $PATH before the QEMU from /usr, when libvirtd is started eg

$ su -
# export PATH=/opt/qemu/2.0.0/bin:$PATH
# /opt/libvirt/1.2.20/sbin/libvirtd

Alternatively just pass in the custom QEMU binary path in the guest XML (if the management app being tested supports that).

The build mechanics

When managing so many different versions of a software package you don’t want to be doing lots of custom work to create each one. Thus I have tried to keep everything as simple as possible. There is a Pagure hosted GIT repo containing the source for the builds. There are libvirt-ark.spec.in and qemu-ark.spec.in RPM specfile templates which are used for every version. No attempt is made to optimize the dependencies for each version; instead BuildRequires will just be the union of dependencies required across all versions. To keep build times down, for QEMU only the x86_64 architecture system emulator is enabled. In future I might enable the system emulators for other architectures that are commonly used (ppc, arm, s390), but won’t be enabling all the other ones QEMU has. The only trouble comes when newer Fedora releases include a change which breaks the build. This has happened a few times for both libvirt and QEMU. The ‘patches/’ subdirectory thus contains a handful of patches acquired from upstream GIT repos to fix the builds. Essentially though I can run

$  make copr APP=libvirt ARCHIVE_FMT=xz DOWNLOAD_URL=https://libvirt.org/sources/ VERSION=1.3.0

Or

$  make copr APP=qemu ARCHIVE_FMT=xz DOWNLOAD_URL=https://download.qemu.org/ VERSION=2.6.0

And it will download the pristine upstream source, write a spec file including any patches found locally, create a src.rpm and upload this to the copr build service. I’ll probably automate this a little more in future to avoid having to pass so many args to make, by keeping a CSV file with all metadata for each version.

 

Full coverage of libvirt XML schemas achieved in libvirt-go-xml

Posted: December 7th, 2017 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Virt Tools | No Comments

In recent times I have been aggressively working to expand the coverage of libvirt XML schemas in the libvirt-go-xml project. Today this work has finally come to a conclusion, when I achieved what I believe to be effectively 100% coverage of all of the libvirt XML schemas. More on this later, but first some background on Go and XML….

For those who aren’t familiar with Go, the standard library’s encoding/xml package provides a very easy way to consume and produce XML documents in Go code. You simply define a set of struct types and annotate their fields to indicate what elements & attributes each should map to. For example, given the Go structs:

type Person struct {
    XMLName xml.Name `xml:"person"`
    Name    string   `xml:"name,attr"`
    Age     string   `xml:"age,attr"`
    Home    *Address `xml:"home"`
    Office  *Address `xml:"office"`
}

type Address struct {
    Street string `xml:"street"`
    City   string `xml:"city"`
}

You can parse/format XML documents looking like

<person name="Joe Blogs" age="24">
  <home>
    <street>Some where</street><city>London</city>
  </home>
  <office>
    <street>Some where else</street><city>London</city>
  </office>  
</person>
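To make this concrete, here is a small self-contained program that parses the document above into those structs using xml.Unmarshal. The ParsePerson helper is just a convenience wrapper introduced for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type Person struct {
	XMLName xml.Name `xml:"person"`
	Name    string   `xml:"name,attr"`
	Age     string   `xml:"age,attr"`
	Home    *Address `xml:"home"`
	Office  *Address `xml:"office"`
}

type Address struct {
	Street string `xml:"street"`
	City   string `xml:"city"`
}

// ParsePerson is a helper for this example: it decodes one <person> document.
func ParsePerson(doc string) (*Person, error) {
	var p Person
	if err := xml.Unmarshal([]byte(doc), &p); err != nil {
		return nil, err
	}
	return &p, nil
}

func main() {
	doc := `<person name="Joe Blogs" age="24">
  <home>
    <street>Some where</street><city>London</city>
  </home>
  <office>
    <street>Some where else</street><city>London</city>
  </office>
</person>`

	p, err := ParsePerson(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name, p.Age)   // Joe Blogs 24
	fmt.Println(p.Office.Street) // Some where else
}
```

An element that is absent from the document, such as a missing office, simply leaves the corresponding pointer field nil.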

Other programming languages I’ve used required a great deal more work when dealing with XML. For parsing, there’s typically a choice between an XML stream based parser where you have to react to tokens as they’re parsed and stuff them into structs, or a DOM object hierarchy from which you then have to pull data out into your structs. For outputting XML, apps either build up a DOM object hierarchy again, or dynamically format the XML document incrementally. Whichever approach is taken, it generally involves writing a lot of tedious & error prone boilerplate code. In most cases, the Go encoding/xml package eliminates all the boilerplate code, only requiring the data type definitions. This really makes dealing with XML a much more enjoyable experience, because you effectively don’t deal with XML at all! There are some exceptions to this though, as the simple annotations can’t capture every nuance of many XML documents. For example, integer values are always parsed & formatted in base 10, so extra work is needed for base 16. There’s also no concept of unions in Go, or the XML annotations. In these edge cases custom marshalling / unmarshalling methods need to be written. BTW, this approach to XML is also taken for other serialization formats including JSON and YAML, with one struct field able to have many annotations so it can be serialized to a range of formats.
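As an illustration of the base 16 case, the sketch below defines an integer type with custom attribute marshalling methods, using the MarshalXMLAttr and UnmarshalXMLAttr hooks that encoding/xml looks for. The HexUint and Address types and the "slot" attribute are hypothetical names chosen for this example, not types taken from any existing library.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
	"strings"
)

// HexUint is a hypothetical integer type stored in XML as "0x..." text.
type HexUint uint64

// UnmarshalXMLAttr parses the attribute value as base 16.
func (h *HexUint) UnmarshalXMLAttr(attr xml.Attr) error {
	v, err := strconv.ParseUint(strings.TrimPrefix(attr.Value, "0x"), 16, 64)
	if err != nil {
		return err
	}
	*h = HexUint(v)
	return nil
}

// MarshalXMLAttr formats the value as base 16 with a "0x" prefix.
func (h HexUint) MarshalXMLAttr(name xml.Name) (xml.Attr, error) {
	return xml.Attr{Name: name, Value: fmt.Sprintf("0x%x", uint64(h))}, nil
}

// Address is a hypothetical element carrying one hexadecimal attribute.
type Address struct {
	XMLName xml.Name `xml:"address"`
	Slot    HexUint  `xml:"slot,attr"`
}

func ParseAddress(doc string) (Address, error) {
	var a Address
	err := xml.Unmarshal([]byte(doc), &a)
	return a, err
}

func FormatAddress(a Address) (string, error) {
	out, err := xml.Marshal(a)
	return string(out), err
}

func main() {
	a, err := ParseAddress(`<address slot="0x1f"/>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(uint64(a.Slot)) // 31

	out, err := FormatAddress(a)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <address slot="0x1f"></address>
}
```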

Back to the point of the blog post, when I first started writing Go code using libvirt it was immediately obvious that everyone using libvirt from Go would end up re-inventing the wheel for XML handling. Thus about 1 year ago, I created the libvirt-go-xml project whose goal is to define a set of structs that can handle documents in every libvirt public XML schema. Initially the level of coverage was fairly light, and over the past year 18 different contributors have sent patches to expand the XML coverage in areas that their respective applications touched. It was clear, however, that taking an incremental approach would mean that libvirt-go-xml is forever trailing what libvirt itself supports. It needed an aggressive push to achieve 100% coverage of the XML schemas, or as near as practically identifiable.

Alongside each set of structs we had also been writing unit tests with a set of structs populated with data, and a corresponding expected XML document. The idea for writing the tests was that the author would copy a snippet of XML from a known good source, and then populate the structs that would generate this XML. In retrospect this was not a scalable approach, because there is an enormous range of XML documents that libvirt supports. A further complexity is that Go doesn’t generate XML documents in the exact same manner. For example, it never generates self-closing tags, instead always outputting a full opening & closing pair. This is semantically equivalent, but makes a plain string comparison of two XML documents impractical in the general case.
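The self-closing tag behaviour is easy to see with a tiny example. The Pages type here is hypothetical and exists only to show what xml.Marshal emits for an element with no content.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Pages is a hypothetical element with one attribute and no content.
type Pages struct {
	XMLName xml.Name `xml:"pages"`
	Unit    string   `xml:"unit,attr"`
}

// MarshalPages formats a Pages value as XML text.
func MarshalPages(p Pages) (string, error) {
	out, err := xml.Marshal(p)
	return string(out), err
}

func main() {
	out, err := MarshalPages(Pages{Unit: "KiB"})
	if err != nil {
		panic(err)
	}
	// A hand-written source document would typically say <pages unit="KiB"/>.
	fmt.Println(out) // <pages unit="KiB"></pages>
}
```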

Considering the need to expand the XML coverage, and provide a more scalable testing approach, I decided to change approach. The libvirt.git tests/ directory currently contains 2739 XML documents that are used to validate libvirt’s own native XML parsing & formatting code. There is no better data set to use for validating the libvirt-go-xml coverage than this. Thus I decided to apply a round-trip testing methodology. The libvirt-go-xml code would be used to parse each sample XML document from libvirt.git, and then immediately serialize it back into a new XML document. Both the original and new XML documents would then be parsed generically to form a DOM hierarchy which can be compared for equivalence. Any place where documents differ would cause the test to fail and print details of where the problem is. For example:

$ go test -tags xmlroundtrip
--- FAIL: TestRoundTrip (1.01s)
	xml_test.go:384: testdata/libvirt/tests/vircaps2xmldata/vircaps-aarch64-basic.xml: \
            /capabilities[0]/host[0]/topology[0]/cells[0]/cell[0]/pages[0]: \
            element in expected XML missing in actual XML

This shows the filename that failed to correctly roundtrip, and the position within the XML tree that didn’t match. Here the NUMA cell topology has a ‘<pages>’ element expected but not present in the newly generated XML. Now it was simply a matter of running the roundtrip test over & over & over & over & over & over & over……….& over & over & over, adding structs / fields for each omission that the test identified.
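The generic comparison step can be sketched as follows. This is not the actual test code from libvirt-go-xml; it is a simplified illustration, assuming a hypothetical Node type that captures any element, that children are compared in document order, and that the index in the reported path is the position among all siblings. It covers only the comparison half of the round trip, taking the original and regenerated documents as strings.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a hypothetical generic tree node able to hold any XML element.
type Node struct {
	XMLName  xml.Name
	Attrs    []xml.Attr `xml:",any,attr"`
	Content  string     `xml:",chardata"`
	Children []Node     `xml:",any"`
}

// sameAttrs compares two attribute lists, ignoring their order.
func sameAttrs(a, b []xml.Attr) bool {
	if len(a) != len(b) {
		return false
	}
	want := make(map[xml.Name]string, len(a))
	for _, attr := range a {
		want[attr.Name] = attr.Value
	}
	for _, attr := range b {
		if v, ok := want[attr.Name]; !ok || v != attr.Value {
			return false
		}
	}
	return true
}

// diff returns a description of the first mismatch, or "" if the trees match.
func diff(want, got Node, path string) string {
	if want.XMLName != got.XMLName {
		return path + ": element name differs"
	}
	if !sameAttrs(want.Attrs, got.Attrs) {
		return path + ": attributes differ"
	}
	if strings.TrimSpace(want.Content) != strings.TrimSpace(got.Content) {
		return path + ": text content differs"
	}
	for i, child := range want.Children {
		childPath := fmt.Sprintf("%s/%s[%d]", path, child.XMLName.Local, i)
		if i >= len(got.Children) {
			return childPath + ": element in expected XML missing in actual XML"
		}
		if d := diff(child, got.Children[i], childPath); d != "" {
			return d
		}
	}
	if len(got.Children) > len(want.Children) {
		return path + ": extra element in actual XML"
	}
	return ""
}

// RoundTripDiff parses both documents generically and compares the trees.
func RoundTripDiff(wantDoc, gotDoc string) (string, error) {
	var want, got Node
	if err := xml.Unmarshal([]byte(wantDoc), &want); err != nil {
		return "", err
	}
	if err := xml.Unmarshal([]byte(gotDoc), &got); err != nil {
		return "", err
	}
	return diff(want, got, fmt.Sprintf("/%s[0]", want.XMLName.Local)), nil
}

func main() {
	d, err := RoundTripDiff(`<cell id="0"><pages/></cell>`, `<cell id="0"></cell>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}
```

Because both documents go through the same generic parser, differences such as self-closing versus paired tags, attribute order and indentation do not register as mismatches.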

After doing this for some time, libvirt-go-xml now has 586 structs defined containing 1816 fields, and has certified 100% coverage of all libvirt public XML schemas. Of course when I say 100% coverage, this is probably a lie, as I’m blindly assuming that the libvirt.git test suite has 100% coverage of all its own XML schemas. This is certainly a goal, but I’m confident there are cases where libvirt itself is missing test coverage. So if any omissions are identified in libvirt-go-xml, these are likely omissions in libvirt’s own testing.

On top of this, the XML roundtrip test is set to run in the libvirt jenkins and travis CI systems, so as libvirt extends its XML schemas, we’ll get build failures in libvirt-go-xml and thus know to add support there to keep up.

In expanding the coverage of XML schemas, a number of non-trivial changes were made to existing structs defined by libvirt-go-xml. These were mostly in places where we have to handle a union concept defined by libvirt. Typically with libvirt an element will have a “type” attribute, whose value then determines what child elements are permitted. Previously we had been defining a single struct, whose fields represented all possible children across all the permitted type values. This did not scale well and gave the developer no clue what content is valid for each type value. In the new approach, for each distinct type attribute value, we now define a distinct Go struct to hold the contents. This will cause API breakage for apps already using libvirt-go-xml, but on balance it is worth it to get a better structure over the long term. There were also cases where a child XML element previously represented a single value and this was mapped to a scalar struct field. Libvirt then added one or more attributes on this element, meaning the scalar struct field had to turn into a struct field that points to another struct. These kinds of changes are unavoidable in any nice manner, so while we endeavour not to gratuitously change current structs, if the libvirt XML schema gains new content, it might trigger further changes in the libvirt-go-xml structs that are not 100% backwards compatible.
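The "one struct per type value" idea can be sketched like this. The types below are hypothetical and heavily simplified for illustration; they are not the actual libvirt-go-xml definitions. The example uses a disk element whose "type" attribute decides how the child source element is interpreted, and handles parsing only.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// One hypothetical struct per permitted value of the "type" attribute.
type DiskSourceFile struct {
	Path string `xml:"file,attr"`
}

type DiskSourceNetwork struct {
	Protocol string `xml:"protocol,attr"`
	Name     string `xml:"name,attr"`
}

// DiskSource acts as the union: at most one field is non-nil.
type DiskSource struct {
	File    *DiskSourceFile
	Network *DiskSourceNetwork
}

type Disk struct {
	Source DiskSource
}

// UnmarshalXML inspects the "type" attribute and decodes the children
// into the struct that matches it.
func (d *Disk) UnmarshalXML(dec *xml.Decoder, start xml.StartElement) error {
	typ := ""
	for _, attr := range start.Attr {
		if attr.Name.Local == "type" {
			typ = attr.Value
		}
	}
	switch typ {
	case "file":
		var body struct {
			Source DiskSourceFile `xml:"source"`
		}
		if err := dec.DecodeElement(&body, &start); err != nil {
			return err
		}
		d.Source = DiskSource{File: &body.Source}
	case "network":
		var body struct {
			Source DiskSourceNetwork `xml:"source"`
		}
		if err := dec.DecodeElement(&body, &start); err != nil {
			return err
		}
		d.Source = DiskSource{Network: &body.Source}
	default:
		return fmt.Errorf("unsupported disk type %q", typ)
	}
	return nil
}

// ParseDisk decodes a single <disk> document.
func ParseDisk(doc string) (Disk, error) {
	var d Disk
	err := xml.Unmarshal([]byte(doc), &d)
	return d, err
}

func main() {
	d, err := ParseDisk(`<disk type="file"><source file="/var/lib/images/demo.img"/></disk>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(d.Source.File.Path) // /var/lib/images/demo.img
}
```

With this shape, code consuming the struct can tell from the non-nil field which variant it has, and each variant only exposes the fields that are valid for it.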

Since we are now tracking libvirt.git XML schemas, going forward we’ll probably add tags in the libvirt-go-xml repo that correspond to each libvirt release. So for app developers we’ll encourage use of Go vendoring to pull in a precise version of libvirt-go-xml instead of blindly tracking master all the time.

Full colour emojis in virtual machine names in Fedora 27

Posted: December 1st, 2017 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Virt Tools | 1 Comment

Quite by chance today I discovered that Fedora 27 can display full colour glyphs for unicode characters that correspond to emojis, when the terminal displaying my mutt mail reader displayed someone’s name with a full colour glyph showing stars:

Mutt in GNOME terminal rendering color emojis in sender name

Chatting with David Gilbert on IRC I learnt that this is a new feature in Fedora 27 GNOME, thanks to recent work in the GTK/Pango stack. David then pointed out this works in libvirt, so I thought I would illustrate it.

Virtual machine name with full colour emojis rendered

No special hacks were required to do this, I simply entered the emojis as the virtual machine name when creating it from virt-manager’s wizard

Virtual machine name with full colour emojis rendered

As mentioned previously, GNOME terminal displays colour emojis, so these virtual machine names appear nicely when using virsh and other command line tools

Virtual machine name rendered with full colour emojis in terminal commands

The more observant readers will notice that the command line args have a bug as the snowman in the machine name is incorrectly rendered in the process listing. The actual data in /proc/$PID/cmdline is correct, so something about the “ps” command appears to be mangling it prior to output. It isn’t simply a font problem because other commands besides “ps” render properly, and if you grep the “ps” output for the snowman emoji no results are displayed.