I’m pleased to announce a new release of GTK-VNC, vesion 0.7.0. The release focus is on bug fixing and includes fixes for two publically reported security bugs which allow a malicious server to exploit the client. Similar bugs were recently reported & fixed in other common VNC clients too.
- CVE-2017-5884 – fix bounds checking for RRE, hextile and copyrect encodings
- CVE-2017-5885 – fix color map index bounds checking
- Add API to allow smooth scaling to be disabled
- Workaround to help SPICE servers quickly drop VNC clients which mistakenly connect, by sending “RFB ” signature bytes early
- Don’t accept color map entries for true-color pixel formats
- Add missing vala .deps files for gvnc & gvncpulse
- Avoid crash if host/port is NULL
- Add precondition checks to some public APIs
- Fix link to home page in README file
- Fix misc memory leaks
- Clamp cursor hot-pixel to within cursor region
Thanks to all those who reported bugs and provides patches that went into this new release.
This post is to announce the existence of a new small project to provide a websockets proxy explicitly targeting virtual machines serial consoles and SPICE/VNC graphical displays.
Virtual machines will typically expose one or more consoles for the user to interact on, whether plain text consoles or graphical consoles via VNC/SPICE. With KVM, the network server for these consoles is run as part of the QEMU process on the compute node. It has become common practice to not directly expose these network services to the end user. Instead large scale deployments will typically run some kind of proxy service sitting between the end user and their VM’s console(s). Typically this proxy will tunnel the connection over the websockets protocol too. This has a number of advantages to users and admins alike. By using websockets, it is possible to use an HTTP parameter or cookie to identify which VM to connect to. This means that a single public network port can multiplex access to all the virtual machines, dynamically figuring out which to connect to. The use of websockets also makes it possible to write in-browser clients (noVNC, SPICE-HTML5) to display the console, avoiding the need to install local desktop apps.
There are already quite a few implementations of the websockets proxy idea out there used by virt/cloud management apps, so one might wonder why another one is needed. The existing implementations generally all work at the TCP layer, meaning they treat the content being carried as opaque data. This means that while they provide security between the end user & proxy server by use of HTTPS for the websockets connection, the internal communication between the proxy and QEMU process is still typically running in cleartext. For SPICE and serial consoles it is possible to add security between the proxy and QEMU by simply layering TLS over the entire protocol, so again the data transported can be considered opaque. This isn’t possible for VNC though. To get encryption between the proxy server and QEMU requires interpreting the VNC protocol to intercept the authentication scheme negotiation, turning on TLS support. IOW, the proxy server cannot treat the VNC data stream as opaque.
The libvirt-console-proxy project was started specifically to address this requirement for VNC security. The proxy will interpret the VNC protocol handshake, responding to the client user’s auth request (which is for no auth at the VNC layer, since the websockets layer already handled auth from the client’s POV). The proxy will start a new auth scheme handshake with QEMU and request the VeNCrypt scheme be used, which enables x509 certificate mutual validation and TLS encryption of the data stream. Once the security handshake is complete, the proxy will allow the rest of the VNC protocol to run as a black-box once more – it simply adds TLS encryption when talking to QEMU. This VNC MITM support is already implemented in the libvirt-console-proxy code and enabled by default.
As noted earlier, SPICE is easier to deal with, since it has separate TCP ports for plain vs TLS traffic. The proxy simply needs to connect to the TLS port instead of the TCP port, and does not need to speak the SPICE protocol. There is, however, a different complication with SPICE. A logical SPICE connection from a client to the server actually consists of many network connections as SPICE uses a separate socket for each type of data (keyboard events, framebuffer updates, audio, usb tunnelling, etc). In fact there can be as many as 10 network connections for a single SPICE connection. The websockets proxy will normally expect some kind of secret token to be provided by the client both as authentication, and to identify which SPICE server to connect to. It can be highly desirable for such tokens to be single-use only. Since each SPICE connection requires multiple TCP connections to be established, the proxy has no way of knowing when it can invalidate further use of the token. To deal with this it needs to insert itself into the SPICE protocol negotiation. This allows it to see the protocol message from the server enumerating which channels are available, and also see the unique session identifier. When the additional SPICE network connections arrive, the proxy can now validate that the session ID matches the previously identified session ID, and reject re-use of the websockets security token if they don’t match. It can also permanently invalidate the security token, once all expected secondary connections are established. At time of writing, this SPICE MITM support is not yet implemented in the libvirt-console-proxy code, but is is planned.
The libvirt-console-proxy project is designed to be reusable by any virtualization management project and so is split into two parts. The general purpose “
virtconsoleproxyd” daemon exposes the websockets server and actually runs the connections to the remote QEMU server(s). This daemon is designed to be secure by default, so it mandates configuration of x509 certificates for both the websockets server and the internal VNC, SPICE & serial console connections to QEMU. Running in cleartext requires explicit “
--listen-insecure” and “
--connect-insecure” flags to make it clear this is a stupid thing to be doing.
virtconsoleproxyd” still needs some way to identify which QEMU server(s) it should connect to for a given incoming websockets connection. To address this there is the concept of an external “token resolver” service. This is a daemon running somewhere (either co-located, or on a remote host), providing a trivial REST service. The proxy server does a GET to this external resolver service, passing along the token value. The resolver validates the token, determines what QEMU server it corresponds to and passes this info back to “
virtconsoleproxyd“. This way, each project virt management project can deploy the “
virtconsoleproxyd” service as is, and simply provide a custom “token resolver” service to integrate with whatever application specific logic is needed to identify QEMU.
There is also a reference implementation of the token resolver interface called “
virtconsoleresolverd“. This resolver has a connection to one or more
libvirtd daemons and monitors for running virtual machines on each one. When a virtual machine starts, it’ll query the libvirt XML for that machine and extract a specific piece of metadata from the XML under the namespace
http://libvirt.org/schemas/console-proxy/1.0. This metadata identifies a libvirt secret object that provides the token for connecting to that guest console. As should be expected, the “
virtconsoleresolverd” requires use of HTTPS by default, refusing to run cleartext unless the “
--listen-insecure” option is given.
For example on a guest which has a SPICE graphics console, a serial ports and two virtio-console devices, the metadata would look like
$ virsh metadata demo http://libvirt.org/schemas/console-proxy/1.0
<console token="999f5742-2fb5-491c-832b-282b3afdfe0c" type="spice" port="0" insecure="no"/>
<console token="6a92ef00-6f54-4c18-820d-2a2eaf9ac309" type="serial" port="0" insecure="no"/>
<console token="3d7bbde9-b9eb-4548-a414-d17fa1968aae" type="console" port="0" insecure="no"/>
<console token="393c6fdd-dbf7-4da9-9ea7-472d2f5ad34c" type="console" port="1" insecure="no"/>
Each of those tokens refers to a libvirt secret
$ virsh -c sys secret-dumpxml 999f5742-2fb5-491c-832b-282b3afdfe0c
<secret ephemeral='no' private='no'>
<description>Token for spice console proxy domain d54df46f-1ab5-4a22-8618-4560ef5fac2c</description>
And that secret has a value associated which is the actual security token to be provided when connecting to the websockets proxy
$ virsh -c sys secret-get-value 999f5742-2fb5-491c-832b-282b3afdfe0c | base64 -d
So in this example, to access the SPICE console, the remote client would provide the security token string “
750d78ed-8837-4e59-91ce-eef6800227fd“. It happens that I’ve used UUIDs as the actual security token, but they can actually be in any format desired – random UUIDs just happen to be reasonably good high entropy secrets.
Setting up this metadata is a little bit tedious, so there is also a help program that can be used to automatically create the secrets, set a random value and write the metadata into the guest XML
$ virtconsoleresolveradm enable demo
Enabled access to domain 'demo'
Or to stop it again
$ virtconsoleresolveradm disable demo
Disabled access to domain 'demo'
If you noticed my previous announcements of libvirt-go and libvirt-go-xml it won’t come as a surprise to learn that this new project is also written in Go. In fact the integration of libvirt support in this console proxy project is what triggered my work on those other two projects. Having previously worked on adding the same kind of VNC MITM protocol support to the OpenStack nova-novncproxy server, I find I’m liking Go more and more compared to Python.
Libvirt has long supported use of TLS for its remote API service, using the gnutls library as its backend. When negotiating a TLS session, there are a huge number of possible algorithms that could be used and the client & server need to decide on the best one, where “best” is commonly some notion of “most secure”. The preference for negotiation is expressed by simply having an list of possible algorithms, sorted best to worst, and the client & server choose the first matching entry in their respective lists. Historically libvirt has not expressed any interest in the handshake priority configuration, simply delegating the decision to the gnutls library on that basis that its developers knew better than libvirt developers which are best. In gnutls terminology, this means that libvirt has historically used the “
DEFAULT” priority string.
The past year or two has seen a seemingly never ending stream of CVEs related to TLS, some of them particular to specific algorithms. The only way some of these flaws can be addressed is by discontinuing use of the affected algorithm. The TLS library implementations have to be fairly conservative in dropping algorithms, because this has an effect on consumers of the library in question. There is also potentially a significant delay between a decision to discontinue support for an algorithm, and updated libraries being deployed to hosts. To address this Fedora 21 introduced the ability to define the algorithm priority strings in host configuration files, outside of the library code. This system administrators can edit a file
/etc/crypto-policies/config to change the algorithm priority for all apps using TLS on the host. After editting this file, the
update-crypto-policies command is run to generate the library specific configuration files. For example, it populates
/etc/crypto-policies/back-ends/gnutls.config In gnutls use of this file is enabled by specifying that an application wants to use the “
@SYSTEM” priority string.
This is a good step forward, as it takes the configuration out of source code and into data files, but it has limited flexibility because it applies to all apps on the host. There can be two apps on a host which have mutually incompatible views about what the best algorithm priority is. For example, a web browser will want to be fairly conservative in dropping algorithms to avoid breaking access to countless websites. An application like libvirtd though, where there is a well known set of servers and clients to connect in any site, can be fairly aggressive in only supporting the very best algorithms. What is desired is a way to override the algorithm priorty per application. Now of course this can easily be done via the application’s own configuration file, and so libvirt has added a new parameter “
The downside of using the application’s own configuration, is that the system administrator has to go hunting through many different files to update each application. It is much nicer to have a central location where the TLS priority settings for all applications can be controlled. What is desired is a way for libvirt to be built such that it can tell gnutls to first look for a libvirt specific priority string, and then fallback to the global priority string. To address this patches were written for GNUTLS to extend its priority string syntax. It is now possible to for libvirt to pass “
@LIBVIRT,SYSTEM” to gnutls as the priority. It will thus read
/etc/crypto-policies/back-ends/gnutls.config first looking for an entry matching “
LIBVIRT” and then looking for an entry matching “
SYSTEM“. To go along with the gnutls change, there is also an enhancement to the update-crypto-policies tool to allow application specific entries to be included when generating the
/etc/crypto-policies/back-ends/gnutls.config file. It is thus possible to configure the libvirt priority string by simply creating a file
/etc/crypto-policies/local.d/gnutls-libvirt.config containing the desired string and re-running
In summary, the libvirt default priority settings are now:
- RHEL-6/7 –
NORMAL – a string hard coded in gnutls at build time
- Fedora < 25 -
@SYSTEM – a priority level defined by sysadmin based on
- Fedora >= 25 –
@LIBVIRT,SYSTEM – a raw priority string defined in
/etc/crypto-policies/local.d/gnutls-libvirt.config, falling back to
/etc/crypto-policies/config if not present.
In all cases it is still possible to customize in
/etc/libvirt/libvirtd.conf via the
tls_priority setting, but it is is recommended to use the global system
/etc/crypto-policies facility where possible.
This blog is part 7 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.
The live migration feature in QEMU allows a running VM to be moved from one host to another with no noticeable interruption in service and minimal performance impact. The live migration data stream will contain a serialized copy of state of all emulated devices, along with all the guest RAM. In some versions of QEMU it is also used to transfer disk image content, but in modern QEMU use of the NBD protocol is preferred for this purpose. The guest RAM in particular can contain sensitive data that needs to be protected against any would be attackers on the network between source and target hosts. There are a number of ways to provide such security using external tools/services including VPNs, IPsec, SSH/stunnel tunnelling. The libvirtd daemon often already has a secure connection between the source and destination hosts for its own purposes, so many years back support was added to libvirt to automatically tunnel the live migration data stream over libvirt’s own secure connection. This solved both the encryption and authentication problems at once, but there are some downsides to this approach. Tunnelling the connection means extra data copies for the live migration traffic and when we look at guests with RAM many GB in size, the number of data copies will start to matter. The libvirt tunnel only supports a tunnelling of a single data connection and in future QEMU may well wish to use multiple TCP connections for the migration data stream to improve performance of post-copy. The use of NBD for storage migration is not supported with tunnelling via libvirt, since it would require extra connections too. IOW while tunnelling over libvirt was a useful short term hack to provide security, it has outlived its practicality.
It is clear that QEMU needs to support TLS encryption natively on its live migration connections. The QEMU migration code has historically had its own distinct I/O layer called QEMUFile which mixes up tracking of migration state with the connection establishment and I/O transfer support. As mentioned in previous blog post, QEMU now has a general purpose I/O channel framework, so the bulk of the work involved converting the migration code over to use the QIOChannel classes and APIs, which greatly reduced the amount of code in the QEMU
migration/ sub-folder as well as simplifying it somewhat. The TLS support involves the addition of two new parameters to the migration code. First the “
tls-creds” parameter provides the ID of a previously created TLS credential object, thus enabling use of TLS on the migration channel. This must be set on both the source and target QEMU’s involved in the migration.
On the target host, QEMU would be launched with a set of TLS credentials for a server endpoint:
$ qemu-system-x86_64 -monitor stdio -incoming defer \
-object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=server,id=tls0 \
To enable incoming TLS migration 2 monitor commands are then used
(qemu) migrate_set_str_parameter tls-creds tls0
(qemu) migrate_incoming tcp:myhostname:9000
On the source host, QEMU is launched in a similar manner but using client endpoint credentials
$ qemu-system-x86_64 -monitor stdio \
-object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=client,id=tls0 \
To enable outgoing TLS migration 2 monitor commands are then used
(qemu) migrate_set_str_parameter tls-creds tls0
(qemu) migrate tcp:otherhostname:9000
The migration code supports a number of different protocols besides just “
tcp:“. In particular it allows an “
fd:” protocol to tell QEMU to use a passed-in file descriptor, and an “
exec:” protocol to tell QEMU to launch an external command to tunnel the connection. It is desirable to be able to use TLS with these protocols too, but when using TLS the client QEMU needs to know the hostname of the target QEMU in order to correctly validate the x509 certificate it receives. Thus, a second “
tls-hostname” parameter was added to allow QEMU to be informed of the hostname to use for x509 certificate validation when using a non-tcp migration protocol. This can be set on the source QEMU prior to starting the migration using the “
migrate_set_str_parameter” monitor command
(qemu) migrate_set_str_parameter tls-hostname myhost.mydomain
This feature has been under development for a while and finally merged into QEMU GIT early in the 2.7.0 development cycle, so will be available for use when 2.7.0 is released in a few weeks. With the arrival of the 2.7.0 release there will finally be TLS support across all QEMU host services where TCP connections are commonly used, namely VNC, SPICE, NBD, migration and character devices.
In this blog series:
This blog is part 6 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.
A number of QEMU device models and objects use a character devices for providing connectivity with the outside world, including the QEMU monitor, serial ports, parallel ports, virtio serial channels, RNG EGD object, CCID smartcard passthrough, IPMI device, USB device redirection and vhost-user. While some of these will only ever need a character device configured with local connectivity, some will certainly need to make use of TCP connections to remote hosts. Historically these connections have always been entirely in clear text, which is unacceptable in the modern hostile network environment where even internal networks cannot be trusted. Clearly the QEMU character device code requires the ability to use TLS for encrypting sensitive data and providing some level of authentication on connections.
The QEMU character device code was mostly using GLib’s GIOChannel framework for doing I/O but this has a number of unsatisfactory limitations. It can not do vectored I/O, is not easily extensible and does not concern itself at all with initial connection establishment. These are all reasons why the QIOChannel framework was added to QEMU. So the first step in supporting TLS on character devices was to convert the code over to use QIOChannel instead of GIOChannel. With that done, adding in support for TLS was quite straightforward, merely requiring addition of a new configuration property (“
tls-creds“) to set the desired TLS credentials.
For example to run a QEMU VM with a serial port listening on IP 10.0.01, port 9000, acting as a TLS server:
$ qemu-system-x86_64 \
-object tls-creds-x509,id=tls0,endpoint=server,dir=/home/berrange/qemutls \
-chardev socket,id=s0,host=10.0.0.1,port=9000,tls-creds=tls0,server \
...other QEMU options...
It is possible test connectivity to this TLS server using the gnutls-cli tool
$ gnutls-cli --priority=NORMAL -p 9000 \
In the above example, QEMU was running as a TCP server, and acting as the TLS server endpoint, but this matching is not required. It is valid to configure it to run as a TLS client if desired, though this would be somewhat uncommon.
Of course you can connect 2 QEMU VMs together, both using TLS. Assuming the above QEMU is still running, we can launch a second QEMU connecting to it with
$ qemu-system-x86_64 \
-object tls-creds-x509,id=tls0,endpoint=client,dir=/home/berrange/qemutls \
-chardev socket,id=s0,host=10.0.0.1,port=9000,tls-creds=tls0 \
...other QEMU options...
Notice, we’ve changed the “endpoint” and removed the “server” option, so this second QEMU runs as a TCP client and acts as the TLS client endpoint.
This feature is available since the QEMU 2.6.0 release a few months ago.
In this blog series: