Improving QEMU security part 4: generic I/O channel framework to simplify TLS
This blog is part 4 of a series I am writing about work I’ve completed over the past few releases to improve QEMU security related features.
Part 2 of this series described the creation of a general purpose API for simplifying TLS session handling inside QEMU, particularly with a view to hiding the complexity of the handshake and x509 certificate validation. The VNC server was converted to use this API, which was a big benefit, but there was still a need to add extra code to support TLS in the I/O paths. Specifically, every place where the VNC server reads or writes on the network socket had to be made TLS-aware, so that it would use either the plain POSIX send/recv functions or the TLS-wrapped send/recv functions as appropriate. For the VNC server it is actually even more complex, because it also supports websockets, so each I/O point had to choose between plain, TLS, websockets and websockets plus TLS. As TLS support extends to other areas of QEMU, this pattern would continue to complicate the I/O paths in each backend.
Clearly there was a need for some form of I/O channel abstraction that would allow TLS to be enabled in each QEMU network backend without having to add conditional logic at every I/O send/recv call. A look around the QEMU subsystems that would ultimately need TLS support showed a variety of approaches currently in use:
- Character devices use a combination of POSIX sockets APIs to establish connections and GIOChannel for performing I/O on them
- Migration has a QEMUFile abstraction which provides read/write facilities for a number of underlying transports: TCP sockets, UNIX sockets, STDIO, external command, in-memory buffer and RDMA. The various QEMUFile implementations all use the plain POSIX sockets APIs and, for TCP/UNIX sockets, the sendmsg/recvmsg functions for I/O
- NBD client & server use plain POSIX sockets APIs and sendmsg/recvmsg for I/O
- VNC server uses plain POSIX sockets APIs and sendmsg/recvmsg for I/O
The GIOChannel APIs used by the character device backend theoretically provide an extensible framework for I/O, and there is even a TLS implementation of the GIOChannel API. The two limitations of GIOChannel for QEMU, though, are that it does not support scatter / gather / vectored I/O APIs and that it does not support file descriptor passing over UNIX sockets. The latter is not a show stopper, since you can still access the socket handle directly to send/recv file descriptors. The lack of vectored I/O, though, would be a significant issue for migration and NBD servers, where performance is very important. While we could potentially extend GIOChannel with new callbacks to do vectored I/O, by the time you've done that most of the original GIOChannel code isn't going to be used, limiting the benefit of starting from GIOChannel as a base.
It is also clear that GIOChannel is really not something that is going to get any further development from the GLib maintainers, since their focus is on the new and much better GIO library. GIO supports file descriptor passing and TLS encryption, but again lacks support for vectored I/O. The bigger show stopper, though, is that getting access to the TLS support requires depending on a version of GLib that is much newer than what QEMU is willing to use.
The existing QEMUFile APIs could form the basis of a general purpose I/O channel system if they were untangled & extracted from the migration codebase. One limitation is that QEMUFile only concerns itself with I/O, not the initial channel establishment, which is left to the migration core code to deal with, so it did not actually provide very much of a foundation on which to build.
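To make the vectored I/O requirement mentioned above more concrete, here is a minimal sketch using the POSIX writev() call. It sends a header and a payload held in separate buffers with a single system call, avoiding both a second write and a copy into a temporary buffer. The helper names send_framed and demo_roundtrip are made up for this illustration and are not part of QEMU; a pipe stands in for a socket.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not a QEMU function): write a header and a
 * payload from two separate buffers using one writev() call. */
static ssize_t send_framed(int fd, const void *hdr, size_t hdrlen,
                           const void *payload, size_t payloadlen)
{
    struct iovec iov[2];

    assert(hdr != NULL && payload != NULL);

    /* iov_base is not const-qualified, hence the casts; writev() only
     * reads from these buffers. */
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = hdrlen;
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len = payloadlen;

    /* Real code must also cope with short writes and EINTR/EAGAIN. */
    return writev(fd, iov, 2);
}

/* Demo: push one framed message through a pipe and read it back.
 * Returns 0 if the reader sees header and payload back to back. */
static int demo_roundtrip(void)
{
    int fds[2];
    char out[16] = { 0 };
    ssize_t sent, got;

    if (pipe(fds) < 0) {
        return -1;
    }

    sent = send_framed(fds[1], "HDR:", 4, "payload", 7);
    got = read(fds[0], out, sizeof(out) - 1);

    close(fds[0]);
    close(fds[1]);

    return (sent == 11 && got == 11 &&
            strcmp(out, "HDR:payload") == 0) ? 0 : -1;
}
```

An I/O abstraction that only offers a single-buffer read/write callback forces callers in this situation either to issue two calls or to copy the data first, which is the cost the post is concerned about for migration and NBD.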
After looking through the various approaches in use in QEMU, and potentially available from GLib, it was decided that QEMU would be best served by creating a new general purpose I/O channel API. Thus a new QEMU subsystem was added in the io/ and include/io/ directories to provide a set of classes for I/O over a variety of different data channels. The core design aims were to use the QEMU object model (QOM) framework to provide a standard pattern for extending / subclassing, to use the QEMU Error object for all error reporting, and to support file descriptor passing, main loop watch integration and coroutine integration. Overall the new design took many of its elements from GIOChannel and the GIO library, and blended them with QEMU's own codebase design. The initial goal was to provide enough functionality to convert the VNC server as a proof of concept. To this end the following classes were created:
- QIOChannel – the abstract base defining the overall interface for the I/O framework
- QIOChannelSocket – implementation targeting TCP, UDP and UNIX sockets
- QIOChannelTLS – layer that can provide a TLS session over any other channel
- QIOChannelWebsock – layer that can run the websockets protocol over any other channel
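The layering idea behind these classes can be sketched in plain C, without QOM. The sketch below is only an illustration of the pattern, not QEMU's implementation: the names Channel, MemChannel, XorChannel and channel_write are invented here, an in-memory buffer stands in for a socket, and a trivial XOR transform stands in for a real protocol layer such as TLS (it is not encryption).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical abstract base: callers only ever use this interface. */
typedef struct Channel Channel;
struct Channel {
    ssize_t (*write)(Channel *c, const char *buf, size_t len);
};

static ssize_t channel_write(Channel *c, const char *buf, size_t len)
{
    assert(c != NULL && c->write != NULL);
    return c->write(c, buf, len);
}

/* Transport: appends to an in-memory buffer (stand-in for a socket). */
typedef struct {
    Channel parent;      /* first member, so the pointer casts are valid */
    char data[64];
    size_t used;
} MemChannel;

static ssize_t mem_write(Channel *c, const char *buf, size_t len)
{
    MemChannel *m = (MemChannel *)c;

    if (len > sizeof(m->data) - m->used) {
        len = sizeof(m->data) - m->used;    /* short write when full */
    }
    memcpy(m->data + m->used, buf, len);
    m->used += len;
    return (ssize_t)len;
}

/* Layer: transforms the data, then forwards it to any other channel. */
typedef struct {
    Channel parent;
    Channel *inner;
    unsigned char key;
} XorChannel;

static ssize_t xor_write(Channel *c, const char *buf, size_t len)
{
    XorChannel *x = (XorChannel *)c;
    char tmp[64];
    size_t i;

    if (len > sizeof(tmp)) {
        len = sizeof(tmp);                  /* handle one chunk per call */
    }
    for (i = 0; i < len; i++) {
        tmp[i] = (char)(buf[i] ^ x->key);
    }
    return channel_write(x->inner, tmp, len);
}

/* Demo: the caller's write call is identical with and without the layer. */
static int demo_layering(void)
{
    MemChannel mem = { { mem_write }, { 0 }, 0 };
    XorChannel layer = { { xor_write }, &mem.parent, 0x20 };
    Channel *ioc = &mem.parent;

    if (channel_write(ioc, "ab", 2) != 2) {
        return -1;
    }
    ioc = &layer.parent;                    /* swap in the layered channel */
    if (channel_write(ioc, "ab", 2) != 2) {
        return -1;
    }
    /* XOR with 0x20 flips ASCII case, so the second write lands as "AB". */
    return (mem.used == 4 && memcmp(mem.data, "abAB", 4) == 0) ? 0 : -1;
}
```

The point of the design is the last function: once the handle is swapped, none of the code that performs I/O needs to know which layers sit underneath it.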
To avoid making this blog posting even larger, I won't go into details of these (the code is available in QEMU git for anyone who's really interested), but instead illustrate it with a comparison of the VNC code before & after. First consider the original code in the VNC server for dealing with writing a buffer of data over a plain socket or websocket, with or without TLS enabled. The following functions existed in the VNC server code to handle all the combinations:
    ssize_t vnc_tls_push(const char *buf, size_t len, void *opaque)
    {
        VncState *vs = opaque;
        ssize_t ret;

     retry:
        ret = send(vs->csock, buf, len, 0);
        if (ret < 0) {
            if (errno == EINTR) {
                goto retry;
            }
            return -1;
        }
        return ret;
    }

    ssize_t vnc_client_write_buf(VncState *vs, const uint8_t *data, size_t datalen)
    {
        ssize_t ret;
        int err = 0;
        if (vs->tls) {
            ret = qcrypto_tls_session_write(vs->tls, (const char *)data, datalen);
            if (ret < 0) {
                err = errno;
            }
        } else {
            ret = send(vs->csock, (const void *)data, datalen, 0);
            if (ret < 0) {
                err = socket_error();
            }
        }
        return vnc_client_io_error(vs, ret, err);
    }

    long vnc_client_write_ws(VncState *vs)
    {
        long ret;
        vncws_encode_frame(&vs->ws_output, vs->output.buffer, vs->output.offset);
        buffer_reset(&vs->output);
        return vnc_client_write_buf(vs, vs->ws_output.buffer, vs->ws_output.offset);
    }

    static void vnc_client_write_locked(void *opaque)
    {
        VncState *vs = opaque;

        if (vs->encode_ws) {
            vnc_client_write_ws(vs);
        } else {
            vnc_client_write_plain(vs);
        }
    }
After conversion to use the new QIOChannel classes for sockets, websockets and TLS, all of the VNC server code above turned into:
    ssize_t vnc_client_write_buf(VncState *vs, const uint8_t *data, size_t datalen)
    {
        Error *err = NULL;
        ssize_t ret;
        ret = qio_channel_write(vs->ioc, (const char *)data, datalen, &err);
        return vnc_client_io_error(vs, ret, &err);
    }
It is clearly a major win for maintainability of the VNC server code to have all the TLS and websockets I/O support handled by the QIOChannel APIs. The VNC server I/O paths no longer contain any code specific to TLS or websockets. The only place where there is new code is the point where the TLS or websockets session is initiated, and this now only requires instantiating a suitable QIOChannel subclass and registering a callback to be run when the session handshake completes (or fails):
    tls = qio_channel_tls_new_server(vs->ioc,
                                     vs->vd->tlscreds,
                                     vs->vd->tlsaclname,
                                     &err);
    if (!tls) {
        vnc_client_error(vs);
        return 0;
    }

    object_unref(OBJECT(vs->ioc));
    vs->ioc = QIO_CHANNEL(tls);

    qio_channel_tls_handshake(tls, vnc_tls_handshake_done, vs, NULL);
Notice that the code is simply replacing the current QIOChannel handle ‘vs->ioc’ with an instance of the QIOChannelTLS class. The vnc_tls_handshake_done function is invoked when the TLS handshake has completed or failed, and lets the VNC server continue with the next part of its authentication protocol, or drop the client connection as appropriate. So adding TLS session support to the VNC server comes in at about 10 lines of code now.
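The completion callback pattern described here can be sketched as follows. This is a simplified, self-contained illustration and does not use the real QEMU callback signature: the Client type, the HandshakeDone typedef and the fake_handshake function are all invented for the example, and the fake handshake calls back immediately, whereas a real one would return straight away and invoke the callback later from the main loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical client states, loosely mirroring "continue with
 * authentication" versus "drop the connection". */
typedef enum {
    CLIENT_HANDSHAKING,
    CLIENT_AUTHENTICATING,
    CLIENT_DROPPED
} ClientState;

typedef struct {
    ClientState state;
} Client;

/* Hypothetical callback type: 'failed' is non-zero on handshake error,
 * 'opaque' is whatever pointer the caller registered. */
typedef void (*HandshakeDone)(int failed, void *opaque);

/* Stand-in for an asynchronous handshake; invokes the callback at once. */
static void fake_handshake(int simulate_failure, HandshakeDone cb,
                           void *opaque)
{
    assert(cb != NULL);
    cb(simulate_failure, opaque);
}

/* Callback: move on to authentication, or drop the client on failure. */
static void client_handshake_done(int failed, void *opaque)
{
    Client *c = opaque;

    c->state = failed ? CLIENT_DROPPED : CLIENT_AUTHENTICATING;
}

/* Demo: start a handshake and report the state the client ends up in. */
static ClientState demo_handshake(int simulate_failure)
{
    Client c = { CLIENT_HANDSHAKING };

    fake_handshake(simulate_failure, client_handshake_done, &c);
    return c.state;
}
```

Passing the client through an opaque pointer is what lets one generic handshake routine serve any caller, which is the same role the ‘vs’ argument plays in the qio_channel_tls_handshake call above.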
In this blog series:
- Part 1: crypto code consolidation
- Part 2: generic TLS support
- Part 3: securely passing in credentials
- Part 4: generic I/O channel framework to simplify TLS
- Part 5: TLS support for NBD server & client
- Part 6: TLS support for character devices
- Part 7: TLS support for migration