More than you (or I) ever wanted to know about virtual keyboard handling

Posted: July 4th, 2010 | Filed under: Gtk-Vnc, Virt Tools | Tags: , , , , , , , | 6 Comments »

As a general rule, people using virtual machines have only one requirement when it comes to keyboard handling: any key they press should generate the same output in the host OS and the guest OS. Unfortunately, this is a surprisingly difficult requirement to satisfy. For a long time when using either Xen or KVM, to get workable keyboard handling it was necessary to configure the keymap in three places, 1. the VNC client OS, 2. the guest OS, 3. QEMU itself. The third item was a particular pain because it meant that a regular guest OS user would need administrative access to the host to change the keymap of their guest. Not good for delegation of control. In Fedora 11 we introduced a special VNC extension which allowed us to remove the need to configure keymaps in QEMU, so now it is merely necessary to configure the guest OS to match the client OS. One day when we get a general purpose guest OS agent, we might be able to automatically set the guest OS keymap to match client OS, whenever connecting via VNC, removing the last manual step. This post aims to give background on how keyboards work, what we done in VNC to improve the guest keyboard handling and what problems we still have.

Keyboard hardware scan codes

Between the time of pressing the physical key and text appearing on the screen, there are several steps in processing the input with data conversions along the way. The keyboard generates what are known as scan codes, a sequence of one or more bytes, which uniquely identifies the physical key and whether it is a make or break (press or release) event. Scan codes are invariant across all keyboards of the same type, regardless of what label is printed on the key. In the PC world, IBM defined the first scan code set with their IBM XT, followed later by AT scan codes and PS/2 scan codes. Other manufacturers adopted the IBM scan codes for their own products to ensure they worked out of the box. In the USB world, the HID specification defines the standard scan codes that manufacturers must use.

Operating system key codes

For operating systems wanted to support more than one different type of keyboard, scan codes are not a particularly good representation. They are also often unwieldly as a result of encoding both the key & its make/break state into the same byte(s). Thus operating systems typically define their own standard set of key codes, which is able to represent any possible keys on all known keyboards. They will also track the make/break state separately, now using press/release or up/down as terminology. Thus the first task of the keyboard driver is to convert from the hardware specific scan codes to the operating system specific key code. This is an easily reverseable, lossless mapping.

Display / toolkit key symbols & modifiers

Key codes still aren’t a concept that is particularly useful for (most) applications, which would rather known what user’s intended symbol was, rather than the physical key. Thus the display service (X11, Win32 GUI, etc) or application toolkit (GTK) define what are known as key symbols. To convert from key codes to key symbols, a key map is required for the language specific keyboard layout. The key map declares modifier keys (shift, alt, control, etc) and provides a list of key symbols that are associated with each key code. This mapping is only reverseable if you know the original key map. This is also a lossy mapping, because it is possible for several different key codes to map to the same key symbol.

Considering an end-to-end example, starting with the user pressing the key in row 4, column 2 which is labelled ‘A’ in a US layout, XT compatible keyboard. The operating system keyboard driver receives XT scan code 0x1e, which it converts to Linux key code 30 (KEY_A), Xorg server keyboard driver further converts to X11 key symbol 0x0061 (XK_a), GTK toolkit converts this to GDK key symbol 0x0061 (GDK_a), and finally the application displays the character ‘a’ in the text entry field on screen. There are actually a couple of conversions I’ve left out here, specifically how X11 gets the key codes from the OS and how X11 handles key codes internally, which I’ll come back to another time.

The problem with virtualization

For 99.99% of applications all these different steps / conversions are no problem at all, because they are only interested in text entry. Virtualization, as ever, introduces fun new problems where ever it goes. The problem occurs at the interface between the host virtual desktop client and the hardware emulation. The virtual desktop may be a local fat client using a toolkit like SDL, or it may be a remote network client using a protocol like VNC, RFB or SPICE. In both cases, a naive virtual desktop client will be getting key symbols with their key events. The hardware emulation layer will usually want to provide something like a virtualizated PS/2 or USB keyboard. This implies that there needs to be a conversion from key symbols back to hardware specific scan codes. Remember a couple of paragraphs ago where it was noted that the key code -> key symbol conversion is lossy. That is a now a big problem. In the case of a network client it is even worse, because the virtualization host does not even know what language specific keymap was used.

Faced with these obstacles, the initial approach QEMU took was to just add a command line parameter ‘-k $KEYMAP’. Without this parameter set, it will assume the virtuall desktop client is using a US layout, otherwise it will use the specified keymap. There is still the problem that many key codes can map to the same key symbol. It is impossible to get around this problem – QEMU just has to pick one of the many possible reverse mappings & use it. This means hat, even if the user configures matching keymaps on their client, QEMU and the guest OS, there may be certain keys that will never work in the guest OS. There is also a burden for QEMU to maintain a set of keymaps for every language layout. This set is inevitably incomplete.

The solution with virtualization

Fortunately, there is a get out of jail free card available. When passing key events to an application, all graphical windowing systems will provide both the key symbol and layout independent key code. Remember from many paragraphs earlier that scan code -> key code conversion is lossless and easily reverseable. For local fat clients, the only stumbling block is knowing what key code set is in use. Windows and OS-X both define a standard virtual key set, but sadly X11 does not :-( Instead the key codes are specific to the X11 keyboard driver that is in use, with a Linux based Xorg this is typically ‘kbd’ or ‘evdev’. QEMU has to use heuristics on X11 to decide which key codes it is receiving, but at least once this is identified, the conversion back to scan codes is trivial.

For the VNC network client though, there was one additional stumbling block. The RFB protocol encoding of key events only includes the key symbol, not the key code. So even though the VNC client has all the information the VNC server needs, there is no way to send it. Leading upto the development of Fedora 11, the upstream GTK-VNC and QEMU communities collaborated to define an official extension to the RFB protocol for an extended key event, that includes both key symbol and key code. Since every windowing system has its own set of key codes & the RFB needs to be platform independent, the protocol extension defined that the keycode set on the wire will be a special 32-bit encoding of the traditional XT scancodes. It is not a coincidence that QEMU already uses a 32-bit encoding of traditional XT scan codes internally :-) With this in place the RFB client, merely has to identify what operating system specific key codes it is receiving and then apply the suitable lossless mapping back to XT scancodes.

Considering an end-to-end example, starting with the user pressing the key in row 4, column 2 which is labelled ‘A’ in a US layout, XT compatible keyboard. The operating system keyboard driver receives XT scan code 0x1e, which it converts to Linux key code 30 (KEY_A), Xorg server keyboard driver further converts to X11 key symbol 0x0061 (XK_a) and an evdev key code, GTK toolkit converts the key symbol to GDK key symbol 0x0061 (GDK_a) but passes the evdev key code unchanged. The VNC client converts the evdev key code to the RFB key code and sends it to the QEMU VNC server along with the key symbol. The QEMU VNC server totally ignores the key symbol and does the no-op conversion from the RFB key code to the QEMU key code. The QEMU PS/2 emulation converts the QEMU keycode to either the XT scan code, AT scan code or PS/2 scan code, depending on how the guest OS has configured the keyboard controller. Finally the conversion mentioned much earlier takes place in the guest OS and the letter ‘a’ appears in a text field in the guest. The important bit to note is that although the key symbol was present, it was never used in the host OS or remote network client at any point. The only conversions performed were between scan codes and key codes all of which were lossless. The user is able to generate all the same key sequences in the guest OS as they can in the host OS. The user is happy their keyboard works as expected; the virtualization developer is happy at lack of bug reports about broken keyboard handling.

New GTK-VNC and work on OpenGL accelerated scaling

Posted: February 5th, 2008 | Filed under: Gtk-Vnc, Virt Tools | No Comments »

The new GTK-VNC 0.3.3 release was made available over the weekend. This was primarily a bug fix release dealing with yet more UltraVNC compatability issues, a few crash scenarios, improved key-state tracking to deal with annoying GTK behaviour where it fails to send you ‘release’ events during key-repeat, and ability to reset modifier state when the widget looses focus to avoid the server thinking Alt/Ctrl are stuck ‘on’.

We also have an EXPERIMENTAL web browser plugin for firefox. Before you go and install this, remember I’m saying this is EXPERIMENTAL. We’re not recommending anyone use this outside the lab yet because there needs to be a proper security audit of our protocol handling, and some more thought about how plugin security should work. Needless to say the build is disabled by default for now.

More interestingly though, is that the dev tree of GTK-VNC now has support for scaling the VNC display. We didn’t fancy writing pixmap scaling code, so for this we’re integrating with GtkGExt and OpenGL to get hardware accelerated scaling. The performance is pretty decent – Anthony tells me he’s been able to watch a DVD over VNC with scaling active. Sadly I’m stuck with the non-accelerated radeon driver in my laptop for now, so CPU usage is considerable. Still, its nice to be able to go “full screen” and have it use all screen real estate even when the remote desktop resolution doesn’t match the local desktop resolution.

New GTK-VNC release 0.3.2

Posted: December 31st, 2007 | Filed under: Gtk-Vnc, Virt Tools | No Comments »

Things have been progressing well with the GTK-VNC widget. The 0.3.0 release a few weeks back fixed a couple of co-routine race conditions, fixed portability to Solaris and added compatability for UltraVNC brokenness – it claims support for RFB version 3.4 which doesn’t technically exist. 0.3.1 was a brown paper bag release a day later, due to the ‘make dist’ process going wrong with 0.3.0; say no more. Today Anthony released version 0.3.2 which adds a GThread based co-routine implementation to provide portability to platforms lacking ucontext support (yes I’m looking at you Windows/cygwin). It also adds support for the RRE server encoding which is a zlib compressed format, although not commonly used its in the spec so its worth supporting.

For the next releases we’ll have support for the Tight encoding as used by TightVNC – this is a more advanced variant on RRE, in some cases using JPEG as its compression method which is interesting. We’re also in communication with the VMWare team to see if we can write code to support the RFB extensions, for which they recently got official extension numbers assigned. We decided to apply the ‘release early, release often’ mentality with earnest, and thus we’re aiming to have regular monthly releases for the forseeable future. Meanwhile John is continuing to develop Vinagre, a long overdue modern VNC client taking full advantage of the GNOME infrastructure & desktop integration points.

VNC sessions on a 3-d spinning cube

Posted: November 5th, 2007 | Filed under: Gtk-Vnc, Virt Tools | 1 Comment »

In his recent blog posting on redirected direct renderering, Kristian happened to mention Clutter, a toolkit which allows you to do 3d graphics without first needing to acquire a PhD in OpenGL. I made a mental note to give it a try sometime.

On saturday I spent a few hours doing the major upgrade from Fedora 6 to Fedora 8 on my laptop & desktop (skipping F7). I did it the crazy way, just changing my YUM config files & letting it upgrade everything. I can’t say it was a painless process, but I do finally have two working boxes on F8 now. I also took the opportunity to switch my IBM T60p laptop over to the Avivo driver, instead of the VESA driver & which worked without a hitch.

Back on topic. After the upgrade to F8, I poked at the repos and found that (nearly) all the Clutter related bits are packaged & available to Fedora users already. There is just an annoying buildrequires missing in the python bindings spec file, which meant the RPM is missing the GTK & GST APIs for clutter. A quick rebuild fixed that issue. You may remember I’ve been working on a GTK widget for displayed VNC sessions. One evil thought, led to another even more evil thought, and thus I ended up writing a VNC viewer program which displays multiple VNC sessions on a spinning cube. To make it even more evil, I decided to not restrict it to a cube, and instead generalized to an arbitrary number of sides.

The results are spectacular, though a static screenshot doesn’t really do it justice…
VNC sessions on a 3d spinning hexagon

Ctrl+Alt+PageUp/PageDown lets you rotate to the previous/next session respectively. The mouse & keyboard events are fully plumbed in so you can actually interact with each session. In this example I just started 6 VNC server instances running TWM at 500×500 pixels, but they could have been real GNOME desktops too. The only issue is that I’ve not yet figured out how todo correct depth sorting of each desktops. You don’t notice this problem in the screenshot, since I’ve just got them all rendering at 50% opacity ;-) It is impressively slow & impressively fast at the same time. Slow because I’m using Avivo which doesn’t have any real 3d rendering support yet; fast because I’m amazed it is actually working at all :-)

Now to hook this all up into the next version of virt-manager….. just kidding ;-P

Remotely setting up remote access to a GNOME session

Posted: September 5th, 2007 | Filed under: Gtk-Vnc, Virt Tools | 5 Comments »

I’ve got many boxes for testing purposes and while often I can run graphical apps over SSH, every so often I really do need to run the app within a full GNOME session. For example, the incredible new PolicyKit app in Fedora 8 enables desktop applications to authenticate to gain extra privileges. PolicyKit uses ConsoleKit for its session tracking & the ConsoleKit sessions are created by GDM when you initially login. Thus to test an application using PolicyKit you really do need to login via GDM and run a full GNOME session, not merely a X tunnel over SSH.

Now of course the critical times when I need to do this testing are when I’m not physically anywhere near the machine I need to test on. And invariably I’ve not left a login session active, nor even GNOME’s ‘remote desktop’ access enabled. Traditionally I’ve just created a suitable VNC server startup file containing

$ cat $HOME/.vnc/xstartup

[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &

eval `dbus-launch --sh-syntax --exit-with-session`
exec  gnome-session

This gets me a full GNOME login session. Unfortunately there’s no ConsoleKit session associated with this & thus no possibility of using PolicyKit. GNOME itself though does come with VINO which can export your regular X session using the VNC protocol. If only I were logged into X on the machine’s console & running VINO. Argh.

After much poking around I finally figured out a solution. First off, SSH to the box in question as your regular desktop user. Now we can use gconftool-2 to enable VINO. We need to enable it, enable authentication, set a password, turn off incoming connection prompts and possibily set an explicit port (if you have something else on the regular port 5900 – eg a Xen guest).

# Disable local confirmation dialog for incoming connections
gconftool-2 --type bool --set /desktop/gnome/remote_access/prompt_enabled false

# Change VNC port to :9 instead of :0
gconftool-2 --type bool --set /desktop/gnome/remote_access/use_alternative_port true
gconftool-2 --type int --set /desktop/gnome/remote_access/alternative_port 5909

# Enable password auth
gconftool-2 --type list --list-type string --set /desktop/gnome/remote_access/authentication_methods '[vnc]'
PW=`echo 'mypassword' | base64`
gconftool-2 --type string --set /desktop/gnome/remote_access/vnc_password $PW

# Enable the VINO server
gconftool-2 --type bool --set /desktop/gnome/remote_access/enabled true

So that has the VINO server configured to run when I’m logged in, but as I mentioned already – I’m typically not logged in on the console when I need to be. For this challenge GDM comes to the rescue. It is possible change its config file to specify that a particular user will be automatically logged in the moment GDM starts. To do this edit /etc/gdm/custom.conf and add


A quick restart of GDM later, and I’m automatically logged into the remote box with a full GNOME session, including all the neccessary ConsoleKit magic. I can now connect with VNC and properly test virt-manager / PolicyKit integration. Yay.