Capa: a desktop application for photo capture via a digital camera

Posted: December 21st, 2009 | Filed under: Entangle

A couple of months ago I attended an LPMG event on the subject of off-camera flash. The talk was quite interactive with the presenters using a wide array of camera and flash equipment on stage to demonstrate the techniques they were covering. The cameras were connected to a laptop, in turn connected to a projector, allowing the audience to view the photos on the big screen as soon as they were captured.

I’m not entirely certain, but I believe the presenters were using Nikon Camera Control Pro on the laptop for control of the digital SLR. Watching it, I couldn’t help wondering if it was possible to do remote camera control & capture on a Linux laptop using only open source software. After a short investigation I discovered that, in addition to its image file download/management capabilities, gphoto allows for triggering the shutter and changing settings on digital cameras. Applications providing graphical frontends to gphoto, though, only appeared to expose its image file download/management capabilities. Thus I decided to write a graphical frontend myself. I’m calling it Capa.

Before continuing with more wordy stuff, here is what everyone really wants: a screenshot.

The goal of the Capa application is to provide a simple and efficient interface for tethered shooting and controlling camera settings. It will leave all file management tasks to other applications such as the gphoto plugins for GVFS or FUSE. The two main libraries on which the application is built are gphoto and GTK. The source code is being licensed under the terms of the GPLv3+

The code is at a very early stage of development with no formal releases yet, but it is at the point where it might be useful to people beyond myself, hence this blog posting. At this point it is capable of either triggering the camera shutter directly, or event monitoring where it detects photos shot on the camera. In both cases, as soon as a new image is detected it is downloaded & displayed. The images are not left on the memory card. The current session defines a directory on the host computer where all images are saved, defaulting to a directory under $HOME/Pictures (or wherever your XDG preferences point). The UI for changing tunable settings is rather crude. If you’re lucky it may work, but don’t count on it yet :-)
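For comparison, the same capture-and-download cycle can be roughly sketched with the gphoto2 command line tool. This is only an illustration: Capa talks to the gphoto library directly rather than running the CLI, and the session name "Capa-demo" and the filename pattern are made up for the example.

```shell
#!/bin/sh
# Print the directory where captured images should be stored. Uses
# xdg-user-dir when it is installed, otherwise falls back to $HOME/Pictures.
pictures_dir() {
    if command -v xdg-user-dir >/dev/null 2>&1; then
        xdg-user-dir PICTURES
    else
        echo "$HOME/Pictures"
    fi
}

# Trigger the shutter and download the resulting image into a session
# directory. "Capa-demo" is a hypothetical session name for this sketch.
capture_one() {
    session="$(pictures_dir)/Capa-demo"
    mkdir -p "$session" &&
        gphoto2 --capture-image-and-download \
                --filename "$session/%Y%m%d-%H%M%S.%C"
}
```

With a supported camera plugged in, running capture_one should fire the shutter once and leave the new file in the session directory.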

The interface is fully colour management aware. It is capable of automatically detecting the monitor’s current configured ICC profile in accordance with the X11 ICC profile specification. All images displayed by the application, whether full size or as thumbnails, will have the necessary colour profile transformation applied. GNOME Colour Manager is an excellent new app for configuring your display profiles in the necessary manner for them to work with Capa. Integration with HAL allows immediate automatic detection of any newly plugged in USB cameras and similar support for UDev is planned.

In the very near future it is intended that GObject introspection and GJS will be used to support a JavaScript plugin engine. The codebase has a strict separation between its object model and UI model specifically designed to facilitate plugins. This will allow end user customization & scripting of the UI to best suit their needs. For example, timer triggered shooting, motion detection and many other neat ideas could be provided via plugins.

For a little more information visit the Capa website.

Routed subnets without NAT for libvirt managed virtual machines in Fedora

Posted: December 13th, 2009 | Filed under: libvirt, Virt Tools

There are a huge number of ways of configuring networking for virtual machines when running libvirt. The two most common options, and our default recommendations for people, are either NAT (also known as “virtual networking”) or bridging (also known as “shared physical device”). The NAT option has the advantage that it can be made to work out of the box for pretty much all machines, even those with only wifi or dialup networking, but it only allows outbound access from VMs, with no incoming connections from outside the host. The bridging option has the advantage that machines on the wider LAN can access guests on the host, but it does not work with wifi.

This post is going to quickly describe a third way of providing network connectivity to VMs, which we call the ‘routed’ option. In terms of its implementation in libvirt, it is actually just a variant on the NAT option, but without the NAT. For the purposes of this discussion I am going to describe my home network setup, which is entirely wireless.

WLAN router: This is a LinkSys WRT54GL wireless router which of course runs Linux in the form of OpenWRT Kamikaze. This provides a DHCP service on the wireless LAN for the subnet 192.168.254.0/24.
Mini server: This is a Mac Mini running Fedora 12, primarily acting as a server for running SqueezeCenter. While it has an ethernet port, its location in the house means wifi access is the only option.
Random laptops: This is mostly the IBM Thinkpad I do most of my day-to-day work on. Again it only ever connects over wifi.

The requirement is to run a number of virtual machines on the mini server, and be able to have unrestricted access to them from the laptop. Since the mini server is wireless, bridging is out of the question. Similarly, since I need unrestricted access to the VMs, the NAT option is also not viable. Hence this post about setting up routed networking.

Configuring the virtualization host

The libvirt virtual networking service normally gives you a ‘virbr0’ configured to do NAT. The XML format, however, allows you to specify that any virtual network be set up without NAT. It is perfectly acceptable to have many virtual networks on the same host, some using NAT, some not. Thus leave the default network alone, and simply define a new one using a subnet of 192.168.200.0/24 on the mini server.

# cat > vms.xml <<EOF
<network>
  <name>vms</name>
  <forward mode='route'/>
  <bridge name='virbr1' />
  <ip address='192.168.200.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.200.2' end='192.168.200.254' />
    </dhcp>
  </ip>
</network>
EOF

# virsh net-define vms.xml
Network vms defined from vms.xml

With the configuration for the network defined, it can be started, and indeed set to start automatically upon system boot:

# virsh net-start vms
Network vms started

# virsh net-autostart vms
Network vms marked as autostarted
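If you expect to define more than one routed network, the XML can be generated rather than typed by hand. The following is a minimal sketch, assuming each network is a plain /24 with the host on .1 and DHCP covering the rest; the function name is my own invention, not anything libvirt provides.

```shell
#!/bin/sh
# Emit libvirt network XML for a routed /24 network.
#   $1 = network name, $2 = bridge device, $3 = first three octets of the subnet
routed_network_xml() {
    cat <<EOF
<network>
  <name>$1</name>
  <forward mode='route'/>
  <bridge name='$2' />
  <ip address='$3.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='$3.2' end='$3.254' />
    </dhcp>
  </ip>
</network>
EOF
}

# Reproduces the 'vms' network defined above.
routed_network_xml vms virbr1 192.168.200
```

Redirect the output to a file and feed it to virsh net-define exactly as before. Remember that every extra subnet defined this way also needs its own static route on the LAN router.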

If you look at the iptables output you will see libvirt has defined a series of iptables rules in the FORWARD chain to allow traffic to pass from the virtual network to the LAN & vice versa. You must, however, ensure that the sysctl net.ipv4.ip_forward is enabled, otherwise the kernel won’t even attempt forwarding!

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 2856  203K ACCEPT     udp  --  virbr1 *       0.0.0.0/0            0.0.0.0/0           udp dpt:53
    0     0 ACCEPT     tcp  --  virbr1 *       0.0.0.0/0            0.0.0.0/0           tcp dpt:53
    2   656 ACCEPT     udp  --  virbr1 *       0.0.0.0/0            0.0.0.0/0           udp dpt:67
    0     0 ACCEPT     tcp  --  virbr1 *       0.0.0.0/0            0.0.0.0/0           tcp dpt:67


Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 453K  672M ACCEPT     all  --  *      virbr1  0.0.0.0/0            192.168.200.0/24
 245K   13M ACCEPT     all  --  virbr1 *       192.168.200.0/24     0.0.0.0/0
    0     0 ACCEPT     all  --  virbr1 virbr1  0.0.0.0/0            0.0.0.0/0
    0     0 REJECT     all  --  *      virbr1  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr1 *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable

The 4 rules in the INPUT chain allow DHCP/DNS requests to the dnsmasq instance running on virbr1. The first rule in the FORWARD chain allows traffic from the host’s WLAN to pass to the VM subnet only. The second rule allows traffic from VMs to the WLAN. The third rule allows traffic between VMs. The final two rules block everything else, mostly to protect against IP spoofing.
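Since none of these rules do anything unless the kernel is forwarding packets, it is worth checking the net.ipv4.ip_forward setting directly. A small sketch follows; the optional file argument is only there so the check can be tried against a sample file.

```shell
#!/bin/sh
# Return success if the given ip_forward file reports forwarding as enabled.
# On a real host the default /proc path is the one that matters.
forwarding_enabled() {
    file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled; enable it (as root) with:"
    echo "  sysctl -w net.ipv4.ip_forward=1"
fi
```

The sysctl -w change only lasts until reboot. Adding the line "net.ipv4.ip_forward = 1" to /etc/sysctl.conf makes it persistent.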

That really is all that is needed on the virtualization host to setup a routed network. The next step takes place on the LAN router.

Configuring the LAN/WLAN router

The virtualization host is now all set to forward traffic from 192.168.200.0/24 to & from the WLAN 192.168.254.0/24. This on its own though is not sufficient, because no other hosts on the WLAN know where the 192.168.200.0/24 subnet is! It is thus necessary to configure a static route on the LAN/WLAN router, in this case my OpenWRT box.
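To see why the router has to be involved, consider how a WLAN client decides where to send a packet: it compares the destination against its own subnet, and anything that does not match goes to the default gateway. The sketch below reproduces that comparison in plain shell arithmetic. The address 192.168.254.50 and the guest address 192.168.200.17 are hypothetical examples.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# What a laptop on 192.168.254.0/24 does with two example destinations.
for dest in 192.168.254.50 192.168.200.17; do
    if in_subnet "$dest" 192.168.254.0 24; then
        echo "$dest: on the local WLAN, delivered directly"
    else
        echo "$dest: not local, handed to the default gateway (the router)"
    fi
done
```

The guest address falls outside the WLAN subnet, so the laptop hands the packet to the router, and the router is the machine that needs to know the virtualization host is the next hop.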

To be able to route to the new subnet, the virtualization host needs to have a static IP address, even if it is being configured via DHCP. I fixed my virt host to use the IP 192.168.254.223, so enabling routing to the subnet containing the VMs merely requires adding one static route. Using the ‘ip’ command this could be done with:

# ip route add 192.168.200.0/24 via 192.168.254.223

OpenWRT Kamikaze of course comes with a config file that lets you do that in a way that is persistent across reboots.

# cat >> /etc/config/network <<EOF
config 'route' 'minivms'
        option 'interface' 'lan'
        option 'target' '192.168.200.0'
        option 'netmask' '255.255.255.0'
        option 'gateway' '192.168.254.223'
EOF

# /etc/init.d/network restart

Depending on the precise way the network interfaces on the router are configured, and the current iptables setup, it might be necessary to add a rule to the FORWARD chain. In my case I had to allow traffic for the 192.168.200.0/24 subnet to be forwarded over the ‘br-lan’ device, by adding this to /etc/firewall.user:

# iptables -I FORWARD 1 -i br-lan -o br-lan --dest 192.168.200.0/24 -j ACCEPT

With that in place my laptop can now ping guests running on the mini server’s virtual network, and vice versa. No NAT or bridging involved and all playing nicely with wifi.

Configuring the guests

With the router and virtual host both configured the stage is set to provision virtual machines. If using virt-manager, then in the last step of the “New VM wizard”, expand the ‘Advanced options’ panel and simply select the ‘vms’ network from the drop down list of choices. The guest will then be connected to the newly defined virtual network, instead of the default NAT based one.
If provisioning using virt-install on the command line, then use the argument ‘--network network:vms’ to tell it to use the new virtual network.

Other ways of providing routed networking

While the above setup only requires three configuration steps (1. define the network on the virt host, 2. add a static route on the WLAN router, 3. add iptables rules), the obvious pain point here is that you might not have the ability to add static routes on the WLAN router. If that is the case, then you could instead provide the static route on all the client machines on the WLAN (ie add it to the laptop itself). This is sub-optimal too, for rather obvious scalability reasons.

What we really want is to be able to provide routed networking without having to define a new IP subnet. In other words we need to figure out how to make libvirt’s virtual networking capability support Proxy ARP either of individual IPs or by subnetting. Patches on a postcard please… :-)

Taking screenshots on colour managed displays with GIMP

Posted: December 12th, 2009 | Filed under: Fedora, Photography

Last week I mentioned how I had started running F12 with a colour managed desktop wherever the applications allow it. Today I had need to capture some screenshots of a new (& as yet unannounced) application I’m working on. There are two ways I normally capture screenshots. The first is to just press the ‘PrintScreen’ button, let GNOME save the PNG image and then crop it in GIMP or something like that. The second way is to use GIMP’s own screen capture function (File -> Create -> Screenshot), useful if you want a capture of a specific window instead of the whole desktop.

Today I acquired the screenshot using GIMP since I already had it open. And the colours in the screenshot looked like complete garbage. It shouldn’t be hard to understand what went wrong here. The default ICC profile for new images created in GIMP is the sRGB colourspace. In image windows, GIMP applies a transformation to the image colours, going from the sRGB profile to the monitor’s calibrated profile. Except that since this image was created from a screenshot of a colour managed display, the colours have already been transformed according to the monitor’s profile. GIMP is in essence applying a duplicate conversion. It is no wonder the result looks awful.

Having realized that a duplicate conversion was taking place, the solution is easy. Tell GIMP that the image is in the monitor’s colourspace, rather than the default sRGB. This is done using the menu Image -> Mode -> Assign Color Profile. With the ‘Assign colour profile’ operation, you are not changing the pixel values in the source image, merely telling GIMP how to interpret them. Since it now knows the image is already in the monitor’s colourspace, the transformation becomes a no-op, and the image displays in sensible colours again.

It is possible to leave it at that, save the image and do whatever you were going to do with it. This is sub-optimal if you intend to distribute the image to other people. The sRGB colourspace is intended as a generic colourspace which has reasonable display characteristics even on monitors which are not calibrated / colour managed. If uploading to the web, most people viewing the image are not going to have colour managed displays. Thus, if you want the image to look reasonable for them, it is wise to now convert it to the sRGB colourspace. This is done using the menu Image -> Mode -> Convert to Color Profile. In contrast to the ‘Assign’ operation, the ‘Convert’ operation does change the actual pixel values in the source image. Depending on the overlap between the monitor’s colourspace and the sRGB colourspace, and the rendering intent chosen, this will be a slightly lossy process. The image colours won’t display in quite the same way as before, but it will display better on other people’s monitors.

In summary, if you are taking screenshots of a colour management aware application on a colour managed display, you need to first assign the monitor profile to the captured image, and then convert it to the sRGB profile. Oh and remember that, depending on the source of the data, this assign+convert step may also be required when pasting image data from the clipboard.

Colour management in firefox on Fedora 12

Posted: December 6th, 2009 | Filed under: Fedora, Photography

It has been a long time coming, but the Linux desktop is finally getting to the point where colour management is widely available in applications. At a low level ArgyllCMS is providing support for many colour calibration devices and lCMS provides a nice library for applying colour profile transformations to images. At a high level, the graphics/photos tools DigiKam, GIMP, UFRaw, InkScape, Phatch and XSane are all able to do colour management. Most are even following the X colour management spec to automatically obtain the current monitor profile. In the last few weeks Richard Hughes has filled in another missing piece, writing gnome-color-manager to provide a UI for driving ArgyllCMS and setting up monitor profiles upon login.

It is great to be able to do photo/graphics work on a fully colour managed Linux desktop….and then you upload the photos to Flickr and they go back to looking awful. After a little googling though, it turns out all is not lost. Firefox does in fact contain some colour management support, hidden away in its truly awful about:config page. If you go to that page and filter on ‘gfx’, you’ll find a couple of settings with ‘color_management’ in their name

gfx.color_management.display_profile
gfx.color_management.mode
gfx.color_management.rendering_intent

The first, display_profile, takes the full path to an ICC profile for your monitor, while mode controls where colour management is applied. A value of ‘2’ will make firefox only apply profiles to images explicitly tagged with a profile. A value of ‘1’ will make firefox apply profiles to CSS and images, assuming an sRGB profile if the image is not tagged. rendering_intent takes values 0, 1, 2, 3 corresponding to ‘perceptual’, ‘relative colourimetric’, ‘saturation’ and ‘absolute colourimetric’ respectively. I configured my firefox for mode=1, set a profile and restarted. Browsing to Flickr showed an immediate improvement, with my images actually appearing in the correct colours, matching those I see during editing in GIMP/UFRaw/etc. There’s a little more info about these settings at the mozilla developer notes on ICC.
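If you would rather not click through about:config, the same three settings can be written as user_pref lines in a user.js file in the firefox profile directory. Here is a small sketch that prints those lines. The profile path is a placeholder, and the rendering intent of 0 (perceptual) is just an example value rather than a recommendation.

```shell
#!/bin/sh
# Print user.js lines for Firefox's colour management preferences.
#   $1 = path to the monitor's ICC profile
#   $2 = mode (1 = CSS and all images, 2 = tagged images only)
#   $3 = rendering intent (0-3, as listed above)
firefox_cm_prefs() {
    printf 'user_pref("gfx.color_management.display_profile", "%s");\n' "$1"
    printf 'user_pref("gfx.color_management.mode", %s);\n' "$2"
    printf 'user_pref("gfx.color_management.rendering_intent", %s);\n' "$3"
}

# The profile path is a placeholder; use the profile created for your monitor.
firefox_cm_prefs /path/to/monitor.icc 1 0
```

Append the output to user.js in your profile directory and restart firefox for the settings to take effect.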

While it is nice to have colour management in firefox, its implementation is rather sub-optimal since it requires the user to manually configure the display ICC profile path. Each display profile is only valid with the monitor against which it was created. So the moment I switch my laptop from its built-in LCD to an external LCD all the colours in firefox will go to hell. If firefox followed the X ICC profile spec it would be able to automatically apply the correct profiles for each monitor. Hopefully someone will be motivated to fix this soon, since the spec is rather easy to comply with, only needing a quick look at a particular named property on the root window.
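The property in question is called _ICC_PROFILE, and you can check whether your session has one set using xprop. The sketch below assumes xprop prints a present property in its usual "NAME(TYPE) = values" form; the helper function name is my own.

```shell
#!/bin/sh
# Succeed if the xprop output on stdin shows an _ICC_PROFILE property.
has_icc_profile() {
    grep -q '^_ICC_PROFILE(CARDINAL) = '
}

# Only query the X server when xprop and a display are actually available.
# -len 8 keeps xprop from dumping the whole profile to the terminal.
if command -v xprop >/dev/null 2>&1 && [ -n "$DISPLAY" ]; then
    if xprop -root -len 8 _ICC_PROFILE 2>/dev/null | has_icc_profile; then
        echo "root window carries a monitor profile"
    else
        echo "no _ICC_PROFILE property set on the root window"
    fi
fi
```

An application following the spec reads the full property contents and uses them as the display profile, which is all firefox would need to do.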

Using CGroups with libvirt and LXC/KVM guests in Fedora 12

Posted: December 3rd, 2009 | Filed under: libvirt, Virt Tools

In my previous post I discussed the new disk encryption capabilities available with libvirt in Fedora 12. That was not the only new feature we quietly introduced without telling anyone. The second was integration between libvirt and the kernel’s CGroups functionality. In fact we have had this around for a while, but prior to this point it has only been used with our LXC driver. It is now also available for use with the QEMU driver.

What are CGroups?

You might be wondering at this point what CGroups actually are? At a high level, it is a generic mechanism the kernel provides for grouping of processes and applying controls to those groups. The grouping is done via a virtual filesystem called “cgroup”. Within this filesystem, each directory defines a new group. Thus groups can be arranged to form an arbitrarily nested hierarchy simply by creating new sub-directories.

Tunables within a cgroup are provided by what the kernel calls ‘controllers’, with each controller able to expose one or more tunables or controls. When mounting the cgroups filesystem it is possible to indicate what controllers are to be activated. This makes it possible to mount the filesystem several times, with each mount point having a different set of (non-overlapping) controllers. Why might separate mount points be useful? The key idea is that this allows the administrator to construct differing group hierarchies for different sets of controllers/tunables.

memory: Memory controller: Allows for setting limits on RAM and swap usage and querying cumulative usage of all processes in the group
cpuset: CPU set controller: Binding of processes within a group to a set of CPUs and controlling migration between CPUs
cpuacct: CPU accounting controller: Information about CPU usage for a group of processes
cpu: CPU scheduler controller: Controlling the prioritization of processes in the group. Think of it as a more advanced nice level
devices: Devices controller: Access control lists on character and block devices
freezer: Freezer controller: Pause and resume execution of processes in the group. Think of it as SIGSTOP for the whole group
net_cls: Network class controller: Control network utilization by associating processes with a ‘tc’ network class

This isn’t the blog post to go into fine details about each of these controllers & their capabilities; the high level overview will do. Suffice to say that at this time, the libvirt LXC driver (container based virtualization) will use all of these controllers except for net_cls and cpuset, while the libvirt QEMU driver will only use the cpu and devices controllers.
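Which controllers you actually have depends on how the kernel was built. The kernel lists them in /proc/cgroups, one per line, with the last column saying whether the controller is enabled. A minimal sketch for pulling out the enabled ones follows; the optional file argument is just there so it can be tried on sample data.

```shell
#!/bin/sh
# List the cgroup controllers the running kernel has enabled.
# Columns in /proc/cgroups: subsys_name, hierarchy, num_cgroups, enabled.
enabled_controllers() {
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

if [ -r /proc/cgroups ]; then
    enabled_controllers
fi
```

Any controller that libvirt wants to use but which does not appear in this list cannot be mounted, so it is a quick sanity check before editing any configuration.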

Activating CGroups on a Fedora 12 system

CGroups are a system-wide resource and libvirt doesn’t presume that it can dictate how CGroup controllers are mounted, nor in what hierarchy they are arranged. It will leave mount point & directory setup entirely to the administrator’s discretion. Unfortunately though, it is not just a matter of adding some mount points to /etc/fstab. It is necessary to set up the directory hierarchy and decide how processes get placed within it. Fortunately the libcg project provides an init service and set of tools to assist in host configuration. On Fedora 12 this is packaged in the libcgroup RPM. If you don’t have that installed, install it now!

The primary configuration file for CGroups is /etc/cgconfig.conf. There are two interesting things to configure here. Step 1 is declaring what controllers are mounted where. Should you wish to keep things very simple it is possible to mount many controllers in one single location with a snippet that looks like

mount {
       cpu = /dev/cgroups;
       cpuacct = /dev/cgroups;
       memory = /dev/cgroups;
       devices = /dev/cgroups;
}

This will allow a hierarchy of process cgroups rooted at /dev/cgroups. If you get more advanced though, you might wish to have one hierarchy just for CPU scheduler controls, another for device ACLs, and a third for memory management. That could be accomplished using a configuration that looks like this

mount {
       cpu = /dev/cgroups/cpu;
       cpuacct = /dev/cgroups/cpu;
       memory = /dev/cgroups/memory;
       devices = /dev/cgroups/devices;
}

Going with this second example, save the /etc/cgconfig.conf file with these mount rules, and then activate the configuration by running

# service cgconfig start

Looking in /dev/cgroups, there should be a number of top level directories.

# ls /dev/cgroups/
cpu  devices  memory

# ls /dev/cgroups/cpu
cpuacct.stat          cpu.rt_period_us   notify_on_release  tasks
cpuacct.usage         cpu.rt_runtime_us  release_agent
cpuacct.usage_percpu  cpu.shares         sysdefault

# ls /dev/cgroups/memory/
memory.failcnt                   memory.stat
memory.force_empty               memory.swappiness
memory.limit_in_bytes            memory.usage_in_bytes
memory.max_usage_in_bytes        memory.use_hierarchy
memory.memsw.failcnt             notify_on_release
memory.memsw.limit_in_bytes      release_agent
memory.memsw.max_usage_in_bytes  sysdefault
memory.memsw.usage_in_bytes      tasks

# ls /dev/cgroups/devices/
devices.allow  devices.list       release_agent  tasks
devices.deny   notify_on_release  sysdefault

Now that the basic cgroups controllers are mounted in the locations we want, there is a choice of how to proceed. If starting libvirtd at this point, it will end up in the sysdefault group seen here. This may be satisfactory for some people, in which case they can skip right ahead to the later notes on how KVM virtual machines use cgroups. Other people may want to move the libvirtd daemon (and thus everything it runs) into a separate cgroup first.

Placing the libvirtd daemon in a dedicated group

Let’s say we wish to place an overall limit on the amount of memory that can be used by the libvirtd daemon and all guests that it launches. For this it will be necessary to define a new group, and specify a limit using the memory controller. Back at the /etc/cgconfig.conf configuration file, this can be achieved using the ‘group’ statement:

group virt {
       memory {
               memory.limit_in_bytes = 7500M;
       }
}

This says that no matter what all the processes in this group do, their combined usage will never be allowed above 7.5 GB. Any usage above this limit will cause stuff to be pushed out to swap. Newer kernels can even let you control how much swap can be used, before the OOM killer comes out of hiding. This particular example is chosen to show how cgroups can be used to protect the virtualization host against accidental overcommit. For example, on this server with 8 GB RAM, no matter how crazy & out of control the virtual machines get, I have reasonable reassurance that I’ll always be able to get to an SSH / console login prompt because we’ve left a guaranteed 500 MB for other cgroups (ie the rest of the system) to use.

Now that the desired custom group has been defined it is necessary to actually tell the system that the libvirtd daemon (or any other interesting daemons) needs to be placed in this virt group. If the daemon in question has a well designed initscript, it will be using the shared functions from /etc/init.d/functions, in particular the ‘daemon’ function. libvirtd is of course well designed :-P Placing libvirtd into a cgroup requires adding one line to its configuration file /etc/sysconfig/libvirtd.

CGROUP_DAEMON="memory:/virt"

If we wanted to place it in several cgroups, those would be listed in that same parameter, using a space to separate each. At this point a (re-)start of the cgconfig and libvirtd services will complete the host setup. There is a magic /proc file which can show you at a glance what cgroup any process is living in:

# service cgconfig restart
# service libvirtd restart
# PID=`pgrep libvirtd`
# cat /proc/$PID/cgroup
32:devices:/sysdefault
16:memory:/virt
12:cpuacct,cpu:/sysdefault

Our config didn’t say anything about the devices or cpuacct groups, even though we’d asked for them to be mounted. Thus libvirtd got placed in the sysdefault group for those controllers.
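Each line of that /proc file has the form "hierarchy-id:controllers:group", with controllers that share a mount point listed together, separated by commas. If you want to script a check that a daemon landed in the group you intended, a small awk filter will do it. This is a sketch and the function name is made up.

```shell
#!/bin/sh
# Print the group a process belongs to for one controller, given the
# contents of its /proc/<pid>/cgroup file on stdin.
cgroup_of() {
    awk -F: -v want="$1" '{
        n = split($2, names, ",")
        for (i = 1; i <= n; i++)
            if (names[i] == want) print $3
    }'
}
```

For example, running cgroup_of memory < /proc/$PID/cgroup against the output shown above would print /virt, while asking for cpu or devices would print /sysdefault.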

Controlling KVM guests

The libvirtd daemon has drivers for many virtualization technologies, and at time of writing, its LXC and QEMU drivers integrate with CGroups. For maximum flexibility of administration, libvirt will create its own mini hierarchy of groups in to which guests will be placed. This mini hierarchy will be rooted at whatever location the libvirtd daemon starts in.

$ROOT
 |
 +-  libvirt    (all virtual machines/containers run by libvirtd)
       |
       +- lxc   (all LXC containers run by libvirtd)
       |   |
       |   +-  guest1    (LXC container called 'guest1')
       |   +-  guest2    (LXC container called 'guest2')
       |   +-  guest3    (LXC container called 'guest3')
       |   +-  ...       (LXC container called ...)
       |
       +- qemu  (all QEMU/KVM machines run by libvirtd)
           |
           +-  guest1    (QEMU machine called 'guest1')
           +-  guest2    (QEMU machine called 'guest2')
           +-  guest3    (QEMU machine called 'guest3')
           +-  ...       (QEMU machine called ...)

Remember that we actually configured 3 mount points, /dev/cgroups/cpu, /dev/cgroups/devices and /dev/cgroups/memory. libvirt will detect whatever you mounted, and replicate its mini hierarchy at the appropriate place. So in the above example $ROOT will expand to 3 locations

/dev/cgroups/cpu/sysdefault
/dev/cgroups/memory/virt
/dev/cgroups/devices/sysdefault

As an administrator you should not need to concern yourself with directly accessing the tunables within libvirt’s mini cgroups hierarchy. libvirt will add/remove the child groups for each guest when the guest is booted/destroyed respectively. The libvirt API, and/or virsh command line tool provide mechanisms to set tunables on guests. For the QEMU and LXC drivers in libvirt, the virsh schedinfo command provides access to CPU scheduler prioritization for a guest:

# virsh schedinfo demo
Scheduler      : posix
cpu_shares     : 1024

The “cpu_shares” value is a relative prioritization. All guests start out with a cpu_shares of 1024. If you halve a guest’s “cpu_shares” value, it will get 1/2 the CPU time as compared to other guests. This is applied to the guest as a whole, regardless of how many virtual CPUs it has. This last point is an important benefit over simple ‘nice’ levels which operate per-thread. With the latter it is very hard to set relative prioritization between guests unless they all have exactly the same number of virtual CPUs. The cpu_shares tunable can be set with the same virsh command:

# virsh schedinfo --set cpu_shares=500 demo
Scheduler      : posix
cpu_shares     : 500
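Because the shares are relative, what a guest actually gets depends on what everything else in the same group is set to. A quick way to get a feel for the numbers is to add up the shares and work out each guest's fraction, which is what it would receive when all guests are competing for CPU at once. The guest names 'web' and 'db' below are invented for the example.

```shell
#!/bin/sh
# Print each guest's share of CPU time under contention, as a whole
# percentage, given "name=shares" arguments.
cpu_share_percent() {
    total=0
    for pair in "$@"; do
        total=$(( total + ${pair#*=} ))
    done
    for pair in "$@"; do
        echo "${pair%%=*} $(( ${pair#*=} * 100 / total ))%"
    done
}

cpu_share_percent demo=512 web=1024 db=1024
```

Here 'demo' ends up with 20% against 40% for each of the others, ie half of what its peers get. When the other guests are idle, a guest with low shares is still free to use the spare CPU time.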

For LXC containers, the configured guest memory limit is implemented via the ‘memory’ controller, and container CPU time accounting is done with the ‘cpuacct’ controller.

If the “devices” controller is mounted, libvirt will attempt to use that to control what a guest has access to. It will set up the ACLs so that all guests can access things like /dev/null, /dev/random, etc, but deny all access to block devices except for those explicitly configured in its XML. This is done for both QEMU and LXC guests.
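The ACL entries in the devices controller are written as a device type, a major:minor pair and the permitted access, for example "c 1:3 rwm" for /dev/null. libvirt takes care of writing these, but if you want to see what the rule for a given device node would look like, the following sketch derives it. It assumes GNU stat, and the function name is hypothetical.

```shell
#!/bin/sh
# Print a devices.allow style rule ("type major:minor access") for a
# device node. Relies on GNU stat, which reports the numbers in hex.
device_rule() {
    if [ -c "$1" ]; then kind=c
    elif [ -b "$1" ]; then kind=b
    else return 1
    fi
    major=$(( 0x$(stat -c %t "$1") ))
    minor=$(( 0x$(stat -c %T "$1") ))
    echo "$kind $major:$minor ${2:-rwm}"
}

device_rule /dev/null
```

The access field defaults to rwm (read, write, mknod). Pass a second argument such as r to produce a read-only rule.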

Future work

There is quite a lot more that libvirt could do to take advantage of CGroups. We already have three feature requests that can be satisfied with the use of CGroups:

Enforcement of guest memory usage: QEMU guests are supplied with the VirtIO Balloon device. This allows the host OS to request that the guest OS release memory it was previously allocated back to the host. This allows administrators to dynamically adjust memory allocation of guests on the fly. The obvious downside is that it relies on co-operation of the guest OS. If the guest ignores the request to release memory there is nothing the host can do. Almost nothing. The cgroups memory controller allows a limit on both RAM and swap usage to be set. Since each guest is placed in its own cgroup, we can use this control to enforce the balloon request, by updating the cgroup memory controller limit whenever changing the balloon limit. If the guest ignores the balloon request, then the cgroups controller will force part of the guest’s RAM allocation out to swap. This gives us both a carrot and a stick.
Network I/O policy on guests: The cgroups net_cls controller allows a cgroup to be associated with a tc traffic control class (see the tc(8) manpage). One obvious example usage would involve setting a hard bandwidth cap, but there are plenty more use cases beyond that.
Disk I/O policy on guests: A number of kernel developers have been working on a new disk I/O bandwidth control mechanism for a while now, targeting virtualization as a likely consumer. The problem is not quite as simple as it sounds though. While you might put caps on bandwidth of guests, this is ignoring the impact of seek time. If all the guest I/O patterns are sequential and high throughput, it might be great, but a single guest doing excessive random access I/O and causing too many seeks can easily destroy the throughput of all others. None the less, there appears to be a strong desire for disk I/O bandwidth controls. This is almost certainly going to end up as another cgroup controller that libvirt can then take advantage of.

There are probably more things that can be done with cgroups, but this is plenty to keep us busy for a while.

Daniel P. Berrangé

Writing about open source software, virtualization & more