CPU model configuration for QEMU/KVM on x86 hosts

Posted: June 29th, 2018 | Filed under: Fedora, libvirt, OpenStack, Security, Virt Tools | Tags: cpu, kvm, libvirt, meltdown, qemu, spectre, ssbd, virtual machine

With the various CPU hardware vulnerabilities reported this year, guest CPU configuration is now a security-critical task. This blog post contains content I’ve written that is on its way to becoming part of the QEMU documentation.

QEMU / KVM virtualization supports two ways to configure CPU models:

Host passthrough: This passes the host CPU model, stepping and features exactly to the guest. Note that KVM may filter out some host CPU features if they cannot be supported with virtualization. Live migration is unsafe when this mode is used, as libvirt / QEMU cannot guarantee that a stable CPU is exposed to the guest across hosts. This is the recommended CPU to use, provided live migration is not required.
Named model: QEMU comes with a number of predefined named CPU models that typically refer to specific generations of hardware released by Intel and AMD. These give guest VMs a degree of isolation from the host CPU, allowing greater flexibility in live migrating between hosts with differing hardware.

In both cases, it is possible to optionally add or remove individual CPU features, to alter what is presented to the guest by default.

Libvirt supports a third way to configure CPU models known as “Host model”. This uses the QEMU “Named model” feature, automatically picking a CPU model that is similar to the host CPU, and then adding extra features to approximate the host model as closely as possible. This does not guarantee the CPU family, stepping, etc. will precisely match the host CPU, as they would with “Host passthrough”, but gives much of the benefit of passthrough, while making live migration safe.
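The trade-offs between the three approaches can be summarised as a small decision helper. This is only a sketch of the guidance above: the function name choose_cpu_mode and its two parameters are invented for illustration, and real deployments will usually have more constraints than these two booleans.

```python
def choose_cpu_mode(need_live_migration, using_libvirt):
    """Pick a CPU configuration approach following the guidance above.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not need_live_migration:
        # Passthrough is the recommended choice when guests never move hosts.
        return "host passthrough"
    if using_libvirt:
        # Host model approximates the host CPU but keeps migration safe.
        return "host model"
    # Without libvirt, a named model is the migration-safe option.
    return "named model"
```

For example, choose_cpu_mode(need_live_migration=True, using_libvirt=True) selects the libvirt-only “Host model” approach.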

Recommendations for KVM CPU model configuration on x86 hosts

The information that follows provides recommendations for configuring CPU models on x86 hosts. The goals are to maximise performance, while protecting the guest OS against various CPU hardware flaws, and optionally enabling live migration between hosts with heterogeneous CPU models.

Preferred CPU models for Intel x86 hosts

The following CPU models are preferred for use on Intel hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

Skylake-Server, Skylake-Server-IBRS: Intel Xeon Processor (Skylake, 2016)
Skylake-Client, Skylake-Client-IBRS: Intel Core Processor (Skylake, 2015)
Broadwell, Broadwell-IBRS, Broadwell-noTSX, Broadwell-noTSX-IBRS: Intel Core Processor (Broadwell, 2014)
Haswell, Haswell-IBRS, Haswell-noTSX, Haswell-noTSX-IBRS: Intel Core Processor (Haswell, 2013)
IvyBridge, IvyBridge-IBRS: Intel Xeon E3-12xx v2 (Ivy Bridge, 2012)
SandyBridge, SandyBridge-IBRS: Intel Xeon E312xx (Sandy Bridge, 2011)
Westmere, Westmere-IBRS: Westmere E56xx/L56xx/X56xx (Nehalem-C, 2010)
Nehalem, Nehalem-IBRS: Intel Core i7 9xx (Nehalem Class Core i7, 2008)
Penryn: Intel Core 2 Duo P9xxx (Penryn Class Core 2, 2007)
Conroe: Intel Celeron_4x0 (Conroe/Merom Class Core 2, 2006)
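The advice to use the newest CPU model that is compatible across all hosts can be sketched as follows. This is an illustration, not how libvirt or any management tool actually computes a baseline: the function name is made up, Skylake-Client and the noTSX / -IBRS variants are left out for brevity, and it assumes that a host can run its own generation plus every older generation in the list, which is a simplification.

```python
# Intel models from the list above, newest first (simplified subset).
INTEL_GENERATIONS = [
    "Skylake-Server", "Broadwell", "Haswell", "IvyBridge",
    "SandyBridge", "Westmere", "Nehalem", "Penryn", "Conroe",
]


def newest_common_model(host_generations):
    """Return the newest model usable on every host in the pool.

    Assumption (illustrative only): each host supports its own generation
    and all older generations listed in INTEL_GENERATIONS.
    """
    if not host_generations:
        raise ValueError("at least one host is required")
    # The oldest host limits the pool; a larger index means an older model.
    # An unknown model name raises ValueError from list.index().
    return max(host_generations, key=INTEL_GENERATIONS.index)
```

With hosts reported as Skylake-Server, Haswell and Broadwell, the helper returns Haswell, since that is the newest model all three can offer under the stated assumption.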

Important CPU features for Intel x86 hosts

The following are important CPU features that should be used on Intel x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

pcid: Recommended to mitigate the cost of the Meltdown (CVE-2017-5754) fix. Included by default in Haswell, Broadwell & Skylake Intel CPU models. Should be explicitly turned on for Westmere, SandyBridge, and IvyBridge Intel CPU models. Note that some desktop/mobile Westmere CPUs cannot support this feature.
spec-ctrl: Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in Intel CPU models with -IBRS suffix. Must be explicitly turned on for Intel CPU models without -IBRS suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
ssbd: Required to enable the CVE-2018-3639 fix. Not included by default in any Intel CPU model. Must be explicitly turned on for all Intel CPU models. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
pdpe1gb: Recommended to allow guest OS to use 1GB size pages. Not included by default in any Intel CPU model. Should be explicitly turned on for all Intel CPU models. Note that not all CPU hardware will support this feature.
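The rules in the list above can be encoded as a small lookup that reports which features need to be added explicitly for a given named Intel model. This is a sketch under stated assumptions: the helper name intel_extra_features is invented, and it does not check whether the host CPU or its microcode actually supports each feature.

```python
def intel_extra_features(model):
    """Features to add explicitly for a named Intel model, per the list above.

    Hypothetical helper: host CPU / microcode support is NOT verified here.
    """
    # Strip the -IBRS suffix to find the base model name.
    base = model[:-len("-IBRS")] if model.endswith("-IBRS") else model
    features = []
    if base in ("Westmere", "SandyBridge", "IvyBridge"):
        features.append("pcid")        # not included in these models by default
    if not model.endswith("-IBRS"):
        features.append("spec-ctrl")   # only the -IBRS models include it
    features += ["ssbd", "pdpe1gb"]    # never included by default
    return features
```

For a plain Westmere model this yields pcid, spec-ctrl, ssbd and pdpe1gb, while Haswell-IBRS only needs ssbd and pdpe1gb added.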

Preferred CPU models for AMD x86 hosts

EPYC, EPYC-IBPB: AMD EPYC Processor (2017)
Opteron_G5: AMD Opteron 63xx class CPU (2012)
Opteron_G4: AMD Opteron 62xx class CPU (2011)
Opteron_G3: AMD Opteron 23xx (Gen 3 Class Opteron, 2009)
Opteron_G2: AMD Opteron 22xx (Gen 2 Class Opteron, 2006)
Opteron_G1: AMD Opteron 240 (Gen 1 Class Opteron, 2004)

Important CPU features for AMD x86 hosts

The following are important CPU features that should be used on AMD x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

ibpb: Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in AMD CPU models with -IBPB suffix. Must be explicitly turned on for AMD CPU models without -IBPB suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
virt-ssbd: Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility. Note that for some QEMU / libvirt versions, this must be force-enabled when using “Host model”, because this is a virtual feature that doesn’t exist in the physical host CPUs.
amd-ssbd: Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This provides higher performance than virt-ssbd so should be exposed to guests whenever available in the host. virt-ssbd should nonetheless also be exposed for maximum guest compatibility, as some kernels only know about virt-ssbd.
amd-no-ssb: Recommended to indicate the host is not vulnerable to CVE-2018-3639. Not included by default in any AMD CPU model. Future hardware generations of CPU will not be vulnerable to CVE-2018-3639, and thus the guest should be told not to enable its mitigations, by exposing amd-no-ssb. This is mutually exclusive with virt-ssbd and amd-ssbd.
pdpe1gb: Recommended to allow guest OS to use 1GB size pages. Not included by default in any AMD CPU model. Should be explicitly turned on for all AMD CPU models. Note that not all CPU hardware will support this feature.
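The AMD rules have one extra wrinkle, the mutual exclusion between amd-no-ssb and the two ssbd flags, which the following sketch makes explicit. The helper name and its two boolean parameters are hypothetical; how a tool learns whether the host offers amd-ssbd, or is not vulnerable, is outside the scope of this example.

```python
def amd_extra_features(model, host_has_amd_ssbd=False, host_not_vulnerable=False):
    """Features to add explicitly for a named AMD model, per the list above.

    Hypothetical helper: the two host_* flags must be supplied by the
    caller; nothing here probes the real host CPU.
    """
    features = []
    if not model.endswith("-IBPB"):
        features.append("ibpb")           # only the -IBPB models include it
    if host_not_vulnerable:
        # Mutually exclusive with virt-ssbd and amd-ssbd.
        features.append("amd-no-ssb")
    else:
        features.append("virt-ssbd")      # always, for guest compatibility
        if host_has_amd_ssbd:
            features.append("amd-ssbd")   # faster, when the host offers it
    features.append("pdpe1gb")
    return features
```

For example, amd_extra_features("EPYC-IBPB", host_has_amd_ssbd=True) asks for virt-ssbd, amd-ssbd and pdpe1gb.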

Default x86 CPU models

The default QEMU CPU models are designed such that they can run on all hosts. If an application does not wish to perform any host compatibility checks before launching guests, the default is guaranteed to work.

The default CPU models will, however, leave the guest OS vulnerable to various CPU hardware flaws, so their use is strongly discouraged. Applications should follow the earlier guidance to set up a better CPU configuration, with host passthrough recommended if live migration is not needed.

qemu32, qemu64: QEMU Virtual CPU version 2.5+ (32 & 64 bit variants). qemu64 is used for x86_64 guests and qemu32 is used for i686 guests, when no -cpu argument is given to QEMU, or no <cpu> is provided in libvirt XML.

Other non-recommended x86 CPUs

The following CPU models are compatible with most AMD and Intel x86 hosts, but their usage is discouraged, as they expose a very limited feature set, which prevents guests from having optimal performance.

kvm32, kvm64: Common KVM processor (32 & 64 bit variants). Legacy models just for historical compatibility with ancient QEMU versions.
486, athlon, phenom, coreduo, core2duo, n270, pentium, pentium2, pentium3: Various very old x86 CPU models, mostly predating the introduction of hardware assisted virtualization, that should thus not be required for running virtual machines.
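A management application could flag the default and legacy models before launching a guest. The sketch below is one possible shape for such a check; the function name and the warning strings are made up for illustration, and the model names are simply those listed in the two sections above.

```python
DEFAULT_MODELS = {"qemu32", "qemu64"}
LEGACY_MODELS = {
    "kvm32", "kvm64", "486", "athlon", "phenom", "coreduo",
    "core2duo", "n270", "pentium", "pentium2", "pentium3",
}


def check_cpu_model(cpu_arg):
    """Return a warning for a discouraged CPU model, or None if it looks fine.

    Hypothetical helper: cpu_arg is the value given to -cpu, or None when
    no -cpu argument is used at all (so QEMU falls back to its default).
    """
    if cpu_arg is None:
        return "default CPU model leaves the guest exposed to CPU hardware flaws"
    # Ignore any ",+feature" / ",-feature" suffixes after the model name.
    name = cpu_arg.split(",")[0]
    if name in DEFAULT_MODELS:
        return "default CPU model leaves the guest exposed to CPU hardware flaws"
    if name in LEGACY_MODELS:
        return "legacy CPU model exposes a very limited feature set"
    return None
```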

Syntax for configuring CPU models

The examples below illustrate the approach to configuring the various CPU models / features in QEMU and libvirt.

QEMU command line

Host passthrough

   $ qemu-system-x86_64 -cpu host

With feature customization:

   $ qemu-system-x86_64 -cpu host,-vmx,...

Named CPU models

   $ qemu-system-x86_64 -cpu Westmere

With feature customization:

   $ qemu-system-x86_64 -cpu Westmere,+pcid,...
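Applications that launch QEMU directly usually assemble the -cpu value programmatically. A minimal sketch of doing so, using the +feature / -feature syntax shown above, might look like the following. The helper name qemu_cpu_arg is an assumption for this example, and a real command line would of course need many more arguments than -cpu.

```python
def qemu_cpu_arg(model, enable=(), disable=()):
    """Build the value for QEMU's -cpu option.

    Hypothetical helper: 'model' is "host" or a named model; features in
    'enable' are prefixed with '+', those in 'disable' with '-'.
    """
    parts = [model]
    parts += ["+" + name for name in enable]
    parts += ["-" + name for name in disable]
    return ",".join(parts)


# Argument vector only; it is not executed here.
argv = ["qemu-system-x86_64", "-cpu", qemu_cpu_arg("Westmere", enable=["pcid"])]
```

Passing a list to something like subprocess.run() rather than a single shell string avoids any quoting issues with the comma-separated value.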

Libvirt guest XML

Host passthrough

   <cpu mode='host-passthrough'/>

With feature customization:

   <cpu mode='host-passthrough'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>

Host model

   <cpu mode='host-model'/>

With feature customization:

   <cpu mode='host-model'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>

Named model

   <cpu mode='custom'>
       <model>Westmere</model>
   </cpu>

With feature customization:

   <cpu mode='custom'>
       <model>Westmere</model>
       <feature name="pcid" policy="require"/>
       ...
   </cpu>
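The same XML fragments can be generated with Python's standard xml.etree.ElementTree module instead of string concatenation. This is a sketch only: the function name libvirt_cpu_xml is invented, it covers just the mode attribute, the model element and the require / disable feature policies used in the examples above, and it does not validate names against what libvirt actually accepts.

```python
import xml.etree.ElementTree as ET


def libvirt_cpu_xml(mode, model=None, require=(), disable=()):
    """Render a libvirt <cpu> element as a string.

    Hypothetical helper: 'mode' is one of the values shown above
    ('host-passthrough', 'host-model' or 'custom').
    """
    if mode == "custom" and model is None:
        raise ValueError("mode='custom' needs a named CPU model")
    cpu = ET.Element("cpu", {"mode": mode})
    if model is not None:
        ET.SubElement(cpu, "model").text = model
    for name in require:
        ET.SubElement(cpu, "feature", {"name": name, "policy": "require"})
    for name in disable:
        ET.SubElement(cpu, "feature", {"name": name, "policy": "disable"})
    return ET.tostring(cpu, encoding="unicode")


xml_fragment = libvirt_cpu_xml("custom", model="Westmere", require=["pcid"])
```

The resulting string can be placed inside the guest's domain XML; building it through an XML library ensures that names are escaped correctly.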

Announce: libvirt-sandbox “Dashti Margo” 0.6.0 release – an application sandbox toolkit

Posted: July 1st, 2015 | Filed under: Fedora, libvirt, Security, Virt Tools | Tags: application, containers, docker, kvm, lxc, sandbox

I am pleased to announce that a new public release of libvirt-sandbox, version 0.6.0, is now available from:

http://sandbox.libvirt.org/download/

The packages are GPG signed with

  Key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF (4096R)

The libvirt-sandbox package provides an API layer on top of libvirt-gobject which facilitates the creation of application sandboxes using virtualization technology. An application sandbox is a virtual machine or container that runs a single application binary, directly from the host OS filesystem. In other words there is no separate guest operating system install to build or manage.

At this point in time libvirt-sandbox can create sandboxes using either LXC or KVM, and should in theory be extendable to any libvirt driver.

This release contains a mixture of new features and bugfixes.

The first major feature is the ability to provide block devices to sandboxes. Most of the time sandboxes only want/need filesystems, but there are some use cases where block devices are useful. For example, some applications (like databases) can directly use raw block devices for storage. Another one is where a tool actually wishes to be able to format filesystems and have this done inside the container. The complexity with exposing block devices is giving the sandbox tools a predictable path for accessing the device which does not change across hypervisors. To solve this, instead of allowing users of virt-sandbox to specify a block device name, they provide an opaque tag name. The block device is then made available at a path /dev/disk/by-tag/TAGNAME, which symlinks back to whatever hypervisor specific disk name was used.
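From inside the sandbox, an application only needs to know the tag to find its disk. A minimal sketch of resolving a tag back to the hypervisor specific device node is shown below. The function name and the base parameter are made up for this example (base exists only so the lookup can be exercised outside a real sandbox); the /dev/disk/by-tag location is the one described above.

```python
import os


def resolve_tagged_disk(tag, base="/dev/disk/by-tag"):
    """Return the real device path that a sandbox block device tag points at.

    Hypothetical helper: 'tag' is the opaque name given when the sandbox
    was created; 'base' defaults to the directory described above.
    """
    link = os.path.join(base, tag)
    if not os.path.islink(link):
        raise FileNotFoundError(f"no block device tagged {tag!r} under {base}")
    # Follow the symlink to the hypervisor specific disk name.
    return os.path.realpath(link)
```

Applications that simply open the device do not need this step at all, since opening /dev/disk/by-tag/TAGNAME follows the symlink automatically; resolving it is mainly useful for logging or diagnostics.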

The second major feature is the ability to provide a custom root filesystem for the sandbox. The original intent of the sandbox tool was that it provide an easy way to confine and execute applications that are installed on the host filesystem, so by default the host / filesystem is mapped to the sandbox / filesystem read-only. There are some use cases, however, where the user may wish to have a completely different root filesystem. For example, they may wish to execute applications from some separate disk image. So virt-sandbox now allows the user to map in a different root filesystem for the sandbox.

Both of these features were developed as part of a Google Summer of Code 2015 project which is aiming to enhance libvirt sandbox so that it is capable of executing images distributed by the Docker container image repository service. The motivation for this goes back to the original reason for creating the libvirt-sandbox project in the first place, which was to provide a hypervisor agnostic framework for sandboxing applications, as a higher level above the libvirt API. Once this work is complete it’ll be possible to launch Docker images via libvirt QEMU, KVM or LXC, with no need for the Docker toolchain itself.

The detailed list of changes in this release is:

API/ABI incompatible change, soname increased
Prevent use of virt-sandbox-service as non-root upfront
Fix misc memory leaks
Block SIGHUP from the dhclient binary to prevent accidental death if the controlling terminal is closed & reopened
Add support for re-creating libvirt XML from sandbox config to facilitate upgrades
Switch to standard gobject introspection autoconf macros
Add ability to set filters on network interfaces
Search /usr/lib instead of /lib for systemd unit files, as the former is the canonical location even when / and /usr are merged
Only set SELinux labels on hosts that support SELinux
Explicitly link to selinux, instead of relying on indirect linkage
Update compiler warning flags
Fix misc docs comments
Don’t assume use of SELinux in virt-sandbox-service
Fix path checks for SUSE in virt-sandbox-service
Add support for AppArmor profiles
Mount /var after other FS to ensure host image is available
Ensure state/config dirs can be accessed when QEMU is running non-root for qemu:///system
Fix mounting of host images in QEMU sandboxes
Mount images as ext4 instead of ext3
Allow use of non-raw disk images as filesystem mounts
Check if required static libs are available at configure time to prevent silent fallback to shared linking
Require libvirt-glib >= 0.2.1
Add support for loading lzma and gzip compressed kmods
Check for supported libvirt URIs when starting guests to ensure clear error message upfront
Add LIBVIRT_SANDBOX_INIT_DEBUG env variable to allow debugging of kernel boot messages and sandbox init process setup
Add support for exposing block devices to sandboxes with a predictable name under /dev/disk/by-tag/TAGNAME
Use devtmpfs instead of tmpfs for auto-populating /dev in QEMU sandboxes
Allow setup of sandbox with custom root filesystem instead of inheriting from host’s root.
Allow execution of apps from non-matched ld-linux.so / libc.so, eg executing F19 binaries on F22 host
Use passthrough mode for all QEMU filesystems

Announce: libvirt-sandbox “Cholistan” 0.5.1 release – an application sandbox toolkit

Posted: November 19th, 2013 | Filed under: Fedora, libvirt, Security, Virt Tools | Tags: application, containers, kvm, libvirt, lxc, sandbox

I am pleased to announce that a new public release of libvirt-sandbox, version 0.5.1, is now available from:

http://sandbox.libvirt.org/download/

The packages are GPG signed with

  Key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF (4096R)

At this point in time libvirt-sandbox can create sandboxes using either LXC or KVM, and should in theory be extendable to any libvirt driver.

This release focused exclusively on bugfixing.

Changed in this release:

Fix path to systemd binary (prefers dir /lib/systemd not /bin)
Remove obsolete commands from virt-sandbox-service man page
Fix delete of running service container
Allow use of custom root dirs with ‘virt-sandbox --root DIR’
Fix ‘upgrade’ command for virt-sandbox-service generic services
Fix logrotate script to use virsh for listing sandboxed services
Add ‘inherit’ option for virt-sandbox ‘-s’ security context option, to auto-copy calling process’ context
Remove non-existent ‘-S’ option from virt-sandbox-service man page
Fix line break formatting of man page
Mention LIBVIRT_DEFAULT_URI in virt-sandbox-service man page
Check some return values in libvirt-sandbox-init-qemu
Remove unused variables
Fix crash with partially specified mount option string
Add man page docs for ‘ram’ mount type
Avoid close of un-opened file descriptor
Fix leak of file handles in init helpers
Log a message if sandbox cleanup fails
Cope with domain being missing when deleting container
Improve stack trace diagnostics in virt-sandbox-service
Fix virt-sandbox-service content copying code when faced with non-regular files.
Improve error reporting if kernel does not exist
Allow kernel version/path/kmod to be set with virt-sandbox
Don’t overmount ‘/root’ in QEMU sandboxes by default
Fix nosuid / nodev mount options for tmpfs
Force 9p2000.u protocol version to avoid QEMU bugs
Fix cleanup when failing to start interactive sandbox
Create copy of kernel from /boot to allow relabelling
Bulk re-indent of code
Avoid crash when gateway is missing in network options
Fix symlink target created in multi-user.target.wants
Add ‘-p PATH’ option for virt-sandbox-service clone/delete to match ‘create’ command option.
Only allow ‘lxc:///’ URIs with virt-sandbox-service until further notice
Rollback state if cloning a service sandbox fails
Add more kernel modules instead of assuming they are all builtins
Don’t complain if some kmods are missing, as they may be builtins
Allow --mount to be repeated with virt-sandbox-service

Thanks to everyone who contributed to this release

Creating a “head outline” image for team photographs with Fedora and GIMP

Posted: November 26th, 2012 | Filed under: Fedora, libvirt, Photography, Virt Tools | Tags: barcelona, fedora, gimp, kvm, kvm forum, outline

Two weeks back, I was in Barcelona for LinuxCon Europe / KVM Forum 2012. While there Jeff Cody acquired a photo of many of the KVM community developers. Although already visible on Google+, along with tags to identify all the faces, I wanted to put up an outline view of the photo too, mostly so that I could then write this blog post describing how to create the head outline :-) The steps on this page were all performed using Fedora 17 and GIMP 2.8.2, but this should work with pretty much every version of GIMP out there since there’s nothing fancy going on.

The master photo

The master photo that we’ll be working with is the group photo of KVM community developers mentioned above.

Step 1: Edge detect

I expected that one of the edge detection algorithms available in GIMP would be a good basis for a head outline. After a little trial & error, I picked ‘Filters -> Edge-detect -> Edge…’, then chose the ‘Laplace’ algorithm.

This resulted in an image with light outlines on a mostly dark background.

Step 2: Invert colours

The previous image shows the outlines quite effectively, but my desire is for a primarily white image, with black outlines. This is easily achieved using the menu option ‘Colours -> Invert’

Step 3: Desaturate

The edge detection algorithm leaves some colour artifacts in the images, which are trivially dealt with by desaturating the image using ‘Colours -> Desaturate…’ and any one of the desaturation algorithms GIMP offers.

Step 4: Boost contrast

The outline looks pretty good, but there is still a fair amount of fine detail “noise”. There are a few ways we might get rid of this – in particular some of GIMP’s noise removal filters. I went for the easy option of simply boosting the overall image contrast, using ‘Colours -> Brightness/Contrast…’

For this image, setting the contrast to ‘40’ worked well; vary this according to the particular characteristics of your image.
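For anyone curious what steps 1 to 4 do to the pixels, here is a rough pure-Python sketch of the same pipeline on a tiny RGB image held as nested lists. It is not GIMP’s implementation: the 4-neighbour Laplace kernel, the channel-average desaturation and the contrast formula are simplified stand-ins chosen for this example, and the factor parameter does not correspond to GIMP’s contrast value of 40.

```python
def laplace(img):
    """4-neighbour Laplace edge response per channel (border pixels set to 0)."""
    h, w = len(img), len(img[0])
    out = [[(0, 0, 0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = tuple(
                min(255, abs(4 * img[y][x][c]
                             - img[y - 1][x][c] - img[y + 1][x][c]
                             - img[y][x - 1][c] - img[y][x + 1][c]))
                for c in range(3)
            )
    return out


def invert(img):
    """Swap light and dark so edges become dark on a white background."""
    return [[tuple(255 - v for v in px) for px in row] for row in img]


def desaturate(img):
    """Drop colour by averaging the three channels into one grey value."""
    return [[sum(px) // 3 for px in row] for row in img]


def boost_contrast(grey, factor=1.5):
    """Push grey values away from mid-grey, clamping to the 0..255 range."""
    return [[max(0, min(255, round(128 + (v - 128) * factor))) for v in row]
            for row in grey]


def head_outline(img, factor=1.5):
    """Edge detect, invert, desaturate, then boost contrast (steps 1 to 4)."""
    return boost_contrast(desaturate(invert(laplace(img))), factor)
```

Flat regions of the input come out white and sharp transitions come out black, which is the outline effect the GIMP steps above produce on the real photo.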

Step 5: Add numbers

The outline view is where we want to be, but the whole point of the exercise is to make it easy to put names to faces. Thus the final step is to simply number each head. GIMP’s text tool is the perfect way to do this, just click on each face in turn and type in a number.

No need to worry about perfect placement, since each piece of text becomes a new layer. Once done, the layer positions can be moved around to fit well.

And that’s the final image completed. In the page I created on the KVM website, a little JavaScript handled swapping between the original & outline views on mouse over, but that’s all there is to it. The hardest part of the whole exercise is actually remembering who everyone is :-P

KVM Forum: building application sandboxes on top of KVM or LXC using libvirt

Posted: November 8th, 2012 | Filed under: Fedora, libvirt, Virt Tools | Tags: kvm, libvirt, libvirt-sandbox, lxc, sandbox

This week I have spent my time at LinuxCon Europe and KVM Forum 2012. I gave a talk titled “Building application sandboxes on top of KVM or LXC using libvirt”. For those who enquired afterwards, the slides are now available.

Daniel P. Berrangé

Writing about open source software, virtualization & more
