Announce: libvirt-sandbox “Cholistan” 0.5.1 release – an application sandbox toolkit

Posted: November 19th, 2013 | Filed under: Fedora, libvirt, Security, Virt Tools

I am pleased to announce that a new public release of libvirt-sandbox, version 0.5.1, is now available from:

The packages are GPG signed with

  Key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF (4096R)

The libvirt-sandbox package provides an API layer on top of libvirt-gobject which facilitates the creation of application sandboxes using virtualization technology. An application sandbox is a virtual machine or container that runs a single application binary, directly from the host OS filesystem. In other words, there is no separate guest operating system install to build or manage.

At this point in time libvirt-sandbox can create sandboxes using either LXC or KVM, and should in theory be extendable to any libvirt driver.

This release focused exclusively on bugfixing.

Changed in this release:

  • Fix path to systemd binary (prefers dir /lib/systemd not /bin)
  • Remove obsolete commands from virt-sandbox-service man page
  • Fix delete of running service container
  • Allow use of custom root dirs with ‘virt-sandbox --root DIR’
  • Fix ‘upgrade’ command for virt-sandbox-service generic services
  • Fix logrotate script to use virsh for listing sandboxed services
  • Add ‘inherit’ option for virt-sandbox ‘-s’ security context option, to auto-copy calling process’ context
  • Remove non-existent ‘-S’ option from virt-sandbox-service man page
  • Fix line break formatting of man page
  • Mention LIBVIRT_DEFAULT_URI in virt-sandbox-service man page
  • Check some return values in libvirt-sandbox-init-qemu
  • Remove unused variables
  • Fix crash with partially specified mount option string
  • Add man page docs for ‘ram’ mount type
  • Avoid close of un-opened file descriptor
  • Fix leak of file handles in init helpers
  • Log a message if sandbox cleanup fails
  • Cope with domain being missing when deleting container
  • Improve stack trace diagnostics in virt-sandbox-service
  • Fix virt-sandbox-service content copying code when faced with non-regular files.
  • Improve error reporting if kernel does not exist
  • Allow kernel version/path/kmod to be set with virt-sandbox
  • Don’t overmount ‘/root’ in QEMU sandboxes by default
  • Fix nosuid / nodev mount options for tmpfs
  • Force 9p2000.u protocol version to avoid QEMU bugs
  • Fix cleanup when failing to start interactive sandbox
  • Create copy of kernel from /boot to allow relabelling
  • Bulk re-indent of code
  • Avoid crash when gateway is missing in network options
  • Fix symlink target created in
  • Add ‘-p PATH’ option for virt-sandbox-service clone/delete to match ‘create’ command option.
  • Only allow ‘lxc:///’ URIs with virt-sandbox-service until further notice
  • Rollback state if cloning a service sandbox fails
  • Add more kernel modules instead of assuming they are all builtins
  • Don’t complain if some kmods are missing, as they may be builtins
  • Allow --mount to be repeated with virt-sandbox-service

Thanks to everyone who contributed to this release

A new (configurable) cgroups layout for libvirt with QEMU, KVM & LXC

Posted: May 13th, 2013 | Filed under: Fedora, libvirt, OpenStack, Virt Tools

Several years ago I wrote a bit about libvirt and cgroups in Fedora 12. Since that time, much has changed, and we’ve learnt a lot about the use of cgroups, not all of it good.

Perhaps the biggest change has been the arrival of systemd, which has brought cgroups to the attention of a much wider audience. One of the biggest positive impacts of systemd on cgroups has been a formalization of how to integrate with cgroups as an application developer. Libvirt of course follows these cgroups guidelines, has had input into their definition & continues to work with the systemd community to improve them.

One of the things we’ve learnt the hard way is that the kernel implementation of control groups is not without cost, and the way applications use cgroups can have a direct impact on the performance of the system. The kernel developers have done a great deal of work to improve the performance and scalability of cgroups but there will always be a cost to their usage which application developers need to be aware of. In broad terms, the performance impact is related to the number of cgroups directories created and particularly to their depth.

To cut a long story short, it became clear that the directory hierarchy layout libvirt used with cgroups was seriously sub-optimal, or even outright harmful. Thus in libvirt 1.0.5, we introduced some radical changes to the layout created.

Historically libvirt would create a cgroup directory for each virtual machine or container, at a path $LOCATION-OF-LIBVIRTD/libvirt/$DRIVER-NAME/$VMNAME. For example, if libvirtd was placed in /system/libvirtd.service, then a QEMU guest named “web1” would live at /system/libvirtd.service/libvirt/qemu/web1. That’s 5 levels deep already, which is not good.

As of libvirt 1.0.5, libvirt will create a cgroup directory for each virtual machine or container, at a path /machine/$VMNAME.libvirt-$DRIVER-NAME. First, notice how this is now completely disassociated from the location of libvirtd itself. This allows the administrator greater flexibility in controlling resources for virtual machines independently of system services. Second, notice that the directory hierarchy is only 2 levels deep by default, so a QEMU guest named “web1” would live at /machine/web1.libvirt-qemu.
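To make the difference between the two layouts concrete, here is a small sketch in plain shell. It only does string manipulation on the path templates quoted above; the function names are my own invention for illustration and are not part of libvirt.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not libvirt commands.

# Old layout: $LOCATION-OF-LIBVIRTD/libvirt/$DRIVER-NAME/$VMNAME
old_cgroup_path() {
    # $1 = cgroup holding libvirtd, $2 = driver name, $3 = guest name
    printf '%s/libvirt/%s/%s\n' "$1" "$2" "$3"
}

# New layout (libvirt 1.0.5 onwards): /machine/$VMNAME.libvirt-$DRIVER-NAME
new_cgroup_path() {
    # $1 = driver name, $2 = guest name
    printf '/machine/%s.libvirt-%s\n' "$2" "$1"
}

# Number of directory levels in an absolute cgroup path.
cgroup_depth() {
    # Drop the leading slash, then count the '/'-separated components.
    printf '%s\n' "${1#/}" | awk -F/ '{ print NF }'
}

old=$(old_cgroup_path /system/libvirtd.service qemu web1)
new=$(new_cgroup_path qemu web1)
echo "$old ($(cgroup_depth "$old") levels)"
echo "$new ($(cgroup_depth "$new") levels)"
```

Running it prints the two example paths for the “web1” guest along with their depth, which is the number that matters for the performance cost discussed earlier.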

The final important change is that the location of the virtual machine / container cgroup can now be configured on a per-guest basis in the XML configuration, to override the default of /machine. So if the guest config says

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

then libvirt will create the guest cgroup directory /virtualmachines.partition/production.partition/web1.libvirt-qemu. Notice that there will always be a .partition suffix on these user defined directories. Only the default top level directories /machine, /system and /user will be without a suffix. The suffix ensures that user defined directories can never clash with anything the kernel will create. The systemd PaxControlGroups will be updated with this & a few escaping rules soon.
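The suffix rule can be sketched in a few lines of shell. This is an illustration of the naming convention only, under my reading of it: the function name is hypothetical, it does not touch the cgroup filesystem, and it ignores the escaping rules mentioned above.

```shell
#!/bin/sh
# Sketch only: map a partition path from the guest config to the cgroup
# directory name, adding ".partition" to every user defined component.
partition_cgroup_path() (
    # $1 = partition path, $2 = driver name, $3 = guest name
    set -f      # no glob expansion while splitting the path
    IFS=/       # split the partition path on '/'
    path=
    level=0
    for part in ${1#/}; do
        level=$((level + 1))
        case "$level:$part" in
            # Default top level directories keep their plain names.
            1:machine|1:system|1:user) path="$path/$part" ;;
            *) path="$path/$part.partition" ;;
        esac
    done
    printf '%s/%s.libvirt-%s\n' "$path" "$3" "$2"
)

partition_cgroup_path /virtualmachines/production qemu web1
partition_cgroup_path /machine qemu web1
```

The body runs in a subshell, so the changes to IFS and globbing do not leak into the calling shell.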

There is still more we intend to do with cgroups in libvirt, in particular adding APIs for creating & managing these partitions for grouping VMs, so you don’t need to go to a tool outside libvirt to create the directories.

One final thing, libvirt now has a bit of documentation about its cgroups usage which will serve as the base for future documentation in this area.

Installing a 4 node Fedora 18 OpenStack Folsom cluster with PackStack

Posted: March 1st, 2013 | Filed under: Fedora, libvirt, OpenStack, Virt Tools

For a few months now Derek has been working on a tool called PackStack, which aims to facilitate & automate the deployment of OpenStack services. Most of the time I’ve used DevStack for deploying OpenStack, but this is not at all suitable for doing production quality deployments. I’ve also done production deployments from scratch following the great Fedora instructions. The latter work but require the admin to do far too much tedious legwork and know too much about OpenStack in general. This is where PackStack comes in. It starts from the assumption that the admin knows more or less nothing about how the OpenStack tools work. All they need do is decide which services they wish to deploy on each machine. With that answered PackStack goes off and does the work to make it all happen.

Under the hood PackStack does its work by connecting to each machine over SSH, and using Puppet to deploy/configure the services on each one. By leveraging Puppet, PackStack itself is mostly isolated from the differences between various Linux distros. Thus although PackStack has been developed on RHEL and Fedora, it should be well placed for porting to other distros like Debian & I hope we’ll see that happen in the near future. It will be better for the OpenStack community to have a standard tool that is portable across all target distros, than the current situation where pretty much every distributor of OpenStack has reinvented the wheel building their own private tooling for deployment. This is why PackStack is being developed as an upstream project, hosted on StackForge, rather than as a Red Hat only private project.

Preparing the virtual machines

Anyway back to the point of this blog post. Having followed PackStack progress for a while I decided it was time to actually try it out for real. While I have a great many development machines, I don’t have enough free to turn into an OpenStack cluster, so straight away I decided to do my test deployment inside a set of Fedora 18 virtual machines, running on a Fedora 18 KVM host.

The current PackStack network support requires that you have 2 network interfaces. For an all-in-one box deployment you only actually need one physical NIC for the public interface – you can use ‘lo’ for the private interface on which VMs communicate with each other. I’m doing a multi-node deployment though, so my first step was to decide how to provide networking to my VMs. A standard libvirt install will provide a default NAT based network, using the virbr0 bridge device. This will serve just fine as the public interface over which we can communicate with the OpenStack services & their REST / Web APIs. For VM traffic, I decided to create a second libvirt network on the host machine.

# cat > openstackvms.xml <<EOF
<network>
  <name>openstackvms</name>
  <bridge name='virbr1' stp='off' delay='0' />
</network>
EOF
# virsh net-define openstackvms.xml
Network openstackvms defined from openstackvms.xml

# virsh net-start openstackvms
Network openstackvms started


Next up, I installed a single Fedora 18 guest machine, giving it two network interfaces, the first attached to the ‘default’ libvirt network, and the second attached to the ‘openstackvms’ virtual network.

# virt-install  --name f18x86_64a --ram 1000 --file /var/lib/libvirt/images/f18x86_64a.img \
    --location \
    --noautoconsole --vnc --file-size 10 --os-variant fedora18 \
    --network network:default --network network:openstackvms

In the installer, I used the defaults for everything with two exceptions. I selected the “Minimal install” instead of “GNOME Desktop”, and I reduced the size of the swap partition from 2 GB to 200 MB – if the VM ever needed more than a few 100 MB of swap, then it is pretty much game over for responsiveness of that VM. A minimal install is very quick, taking only 5 minutes or so to completely install the RPMs – assuming good download speeds from the install mirror chosen.

Now I need to turn that one VM into 4 VMs. For this I looked to the virt-clone tool. This is a fairly crude tool which merely does a copy of each disk image, and then updates the libvirt XML for the guest to give it a new UUID and MAC address. It doesn’t attempt to change anything inside the guest, but for a F18 minimal install this is not a significant issue.

# virt-clone  -o f18x86_64a -n f18x86_64b -f /var/lib/libvirt/images/f18x86_64b.img 
Allocating 'f18x86_64b.img'                                                                                    |  10 GB  00:01:20     

Clone 'f18x86_64b' created successfully.
# virt-clone  -o f18x86_64a -n f18x86_64c -f /var/lib/libvirt/images/f18x86_64c.img 
Allocating 'f18x86_64c.img'                                                                                    |  10 GB  00:01:07     

Clone 'f18x86_64c' created successfully.
# virt-clone  -o f18x86_64a -n f18x86_64d -f /var/lib/libvirt/images/f18x86_64d.img 
Allocating 'f18x86_64d.img'                                                                                    |  10 GB  00:00:59     

Clone 'f18x86_64d' created successfully.
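The three invocations differ only in the guest name, so they could equally be driven from a loop. The sketch below uses the same virt-clone options as above; the clone_guest wrapper and the DRY_RUN switch are my own additions for illustration. By default it only prints the commands, and setting DRY_RUN=0 would run them.

```shell
#!/bin/sh
# Sketch only: wrapper name and DRY_RUN variable are hypothetical.
IMAGE_DIR=/var/lib/libvirt/images

clone_guest() {
    # $1 = original guest, $2 = name of the clone
    set -- virt-clone -o "$1" -n "$2" -f "$IMAGE_DIR/$2.img"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"       # just show what would be run
    else
        "$@"            # actually invoke virt-clone
    fi
}

for name in f18x86_64b f18x86_64c f18x86_64d; do
    clone_guest f18x86_64a "$name"
done
```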

I don’t fancy having to remember the IP address of each of the virtual machines I installed, so I decided to set up some fixed IP address mappings in the libvirt default network, and add aliases to /etc/hosts.

# virsh net-destroy default
# virsh net-edit default
...changing the following...
  <ip address='' netmask=''>
    <dhcp>
      <range start='' end='' />
    </dhcp>
  </ip>

...to this...

  <ip address='' netmask=''>
    <dhcp>
      <range start='' end='' />
      <host mac='52:54:00:fd:e7:03' name='f18x86_64a' ip='' />
      <host mac='52:54:00:c4:b7:f6' name='f18x86_64b' ip='' />
      <host mac='52:54:00:81:84:d6' name='f18x86_64c' ip='' />
      <host mac='52:54:00:6a:9b:1a' name='f18x86_64d' ip='' />
    </dhcp>
  </ip>
# cat >> /etc/hosts <<EOF f18x86_64a f18x86_64b f18x86_64c f18x86_64d
# virsh net-start default

Now we’re ready to actually start the virtual machines

# virsh start f18x86_64a
Domain f18x86_64a started

# virsh start f18x86_64b
Domain f18x86_64b started

# virsh start f18x86_64c
Domain f18x86_64c started

# virsh start f18x86_64d
Domain f18x86_64d started

# virsh list
 Id    Name                           State
 25    f18x86_64a                     running
 26    f18x86_64b                     running
 27    f18x86_64c                     running
 28    f18x86_64d                     running

Deploying OpenStack with PackStack

All of the above is really nothing to do with OpenStack or PackStack – it is just about me getting some virtual machines ready to act as the pretend “physical servers”. The interesting stuff starts now. PackStack doesn’t need to be installed in the machines that will receive the OpenStack install, but rather on any client machine which has SSH access to the target machines. In my case I decided to run packstack from the physical host running the VMs I just provisioned.

# yum -y install openstack-packstack

While PackStack is happy to prompt you with questions, it is far simpler to just use an answer file straight away. It lets you see upfront everything that is required and will make it easy for you to repeat the exercise later.

$ packstack --gen-answer-file openstack.txt

The answer file tries to fill in sensible defaults, but there’s not much it can do for IP addresses. So it just fills in the IP address of the host on which it was generated. This is suitable if you’re doing an all-in-one install on the current machine, but not for doing a multi-node install. So the next step is to edit the answer file and customize at least the IP addresses. I have decided that f18x86_64a will be the Horizon frontend and host the user facing APIs from glance/keystone/nova/etc, f18x86_64b will provide QPid, MySQL and the Nova scheduler, and f18x86_64c and f18x86_64d will be compute nodes and swift storage nodes (though I haven’t actually enabled swift in the config).

$ emacs openstack.txt
...make IP address changes...
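If you would rather script the edits than make them in an editor, a small helper that rewrites one KEY=VALUE line at a time is enough. The sketch below is just that: set_answer is a hypothetical helper, and the key names and addresses in the demo are assumptions for illustration, so check them against the keys in your own generated answer file.

```shell
#!/bin/sh
# Sketch only: rewrite "KEY=..." lines in a KEY=VALUE answer file.
# Values are assumed not to contain '|' or '&' (both special to sed here).
set_answer() (
    # $1 = answer file, $2 = key, $3 = new value
    tmp=$(mktemp) || exit 1
    sed "s|^$2=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    exit "$status"
)

# Demo on a throwaway file; key names and addresses are made up.
demo=$(mktemp)
cat > "$demo" <<EOF
CONFIG_MYSQL_HOST=10.0.0.1
CONFIG_NOVA_COMPUTE_HOSTS=10.0.0.1
EOF
set_answer "$demo" CONFIG_MYSQL_HOST 192.168.122.3
set_answer "$demo" CONFIG_NOVA_COMPUTE_HOSTS 192.168.122.4,192.168.122.5
cat "$demo"
rm -f "$demo"
```

Lines whose key does not match are left untouched, so comments and unrelated settings in the answer file survive.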

So you can see what I changed, here is the unified diff

--- openstack.txt	2013-03-01 12:41:31.226476407 +0000
+++ openstack-custom.txt	2013-03-01 12:51:53.877073871 +0000
@@ -4,7 +4,7 @@
 # been installed on the remote servers the user will be prompted for a
 # password and this key will be installed so the password will not be
 # required again

 # Set to 'y' if you would like Packstack to install Glance
@@ -34,7 +34,7 @@

 # The IP address of the server on which to install MySQL

 # Username for the MySQL admin user
@@ -43,10 +43,10 @@

 # The IP address of the server on which to install the QPID service

 # The IP address of the server on which to install Keystone

 # The password to use for the Keystone to access DB
@@ -58,7 +58,7 @@

 # The IP address of the server on which to install Glance

 # The password to use for the Glance to access DB
@@ -83,25 +83,25 @@

 # The IP address of the server on which to install the Nova API
 # service

 # The IP address of the server on which to install the Nova Cert
 # service

 # The IP address of the server on which to install the Nova VNC proxy

 # A comma separated list of IP addresses on which to install the Nova
 # Compute services

 # Private interface for Flat DHCP on the Nova compute servers

 # The IP address of the server on which to install the Nova Network
 # service

 # The password to use for the Nova to access DB
@@ -116,14 +116,14 @@

 # IP Range for Flat DHCP

 # IP Range for Floating IP's

 # The IP address of the server on which to install the Nova Scheduler
 # service

 # The overcommitment ratio for virtual to physical CPUs. Set to 1.0
 # to disable CPU overcommitment
@@ -131,20 +131,20 @@

 # The overcommitment ratio for virtual to physical RAM. Set to 1.0 to
 # disable RAM overcommitment

 # The IP address of the server on which to install the OpenStack
 # client packages. An admin "rc" file will also be installed

 # The IP address of the server on which to install Horizon

 # To set up Horizon communication over https set this to "y"

 # The IP address on which to install the Swift proxy service

 # The password to use for the Swift to authenticate with Keystone
@@ -155,7 +155,7 @@
 # on as a swift storage device(packstack does not create the
 # filesystem, you must do this first), if /dev is omitted Packstack
 # will create a loopback device for a test setup

 # Number of swift storage zones, this number MUST be no bigger than
 # the number of storage devices configured
@@ -223,7 +223,7 @@

 # The IP address of the server on which to install the Nagios server

 # The password of the nagiosadmin user on the Nagios server

The current version of PackStack in Fedora mistakenly assumes that ‘net-tools’ is installed by default in Fedora. This used to be the case, but as of Fedora 18 it is no longer installed. Upstream PackStack git has switched from using ifconfig to ip, to avoid this. So for F18 we temporarily need to make sure the ‘net-tools’ RPM is installed in each host. In addition the SELinux policy has not been finished for all OpenStack components, so we need to set it to permissive mode.

$ ssh root@f18x86_64a setenforce 0
$ ssh root@f18x86_64b setenforce 0
$ ssh root@f18x86_64c setenforce 0
$ ssh root@f18x86_64d setenforce 0
$ ssh root@f18x86_64a yum -y install net-tools
$ ssh root@f18x86_64b yum -y install net-tools
$ ssh root@f18x86_64c yum -y install net-tools
$ ssh root@f18x86_64d yum -y install net-tools
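With more than a handful of nodes, typing these out gets tedious, so the same preparation can be generated from a host list. This sketch only prints the commands shown above (the HOSTS variable and prep_commands name are mine); piping the output to sh would execute them.

```shell
#!/bin/sh
# Sketch only: emit the per-host preparation commands for review.
HOSTS="f18x86_64a f18x86_64b f18x86_64c f18x86_64d"

prep_commands() {
    for host in $HOSTS; do
        echo "ssh root@$host setenforce 0"
        echo "ssh root@$host yum -y install net-tools"
    done
}

prep_commands
```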

Assuming that’s done, we can now just run packstack

# packstack --answer-file openstack-custom.txt
Welcome to Installer setup utility

Clean Up...                                              [ DONE ]
Setting up ssh keys...                                   [ DONE ]
Adding pre install manifest entries...                   [ DONE ]
Adding MySQL manifest entries...                         [ DONE ]
Adding QPID manifest entries...                          [ DONE ]
Adding Keystone manifest entries...                      [ DONE ]
Adding Glance Keystone manifest entries...               [ DONE ]
Adding Glance manifest entries...                        [ DONE ]
Adding Cinder Keystone manifest entries...               [ DONE ]
Checking if the Cinder server has a cinder-volumes vg... [ DONE ]
Adding Cinder manifest entries...                        [ DONE ]
Adding Nova API manifest entries...                      [ DONE ]
Adding Nova Keystone manifest entries...                 [ DONE ]
Adding Nova Cert manifest entries...                     [ DONE ]
Adding Nova Compute manifest entries...                  [ DONE ]
Adding Nova Network manifest entries...                  [ DONE ]
Adding Nova Scheduler manifest entries...                [ DONE ]
Adding Nova VNC Proxy manifest entries...                [ DONE ]
Adding Nova Common manifest entries...                   [ DONE ]
Adding OpenStack Client manifest entries...              [ DONE ]
Adding Horizon manifest entries...                       [ DONE ]
Preparing servers...                                     [ DONE ]
Adding post install manifest entries...                  [ DONE ]
Installing Dependencies...                               [ DONE ]
Copying Puppet modules and manifests...                  [ DONE ]
Applying Puppet manifests...
Applying :                                       [ DONE ] :                                       [ DONE ] :                                       [ DONE ] :                                       [ DONE ]
Applying :                                           [ DONE ] :                                            [ DONE ]
Applying :                                        [ DONE ] :                                          [ DONE ] :                                          [ DONE ]
Applying :                                        [ DONE ]
Applying :                                            [ DONE ] :                                            [ DONE ] :                                        [ DONE ] :                                         [ DONE ] :                                            [ DONE ] :                                            [ DONE ]
Applying :                                      [ DONE ] :                                      [ DONE ] :                                      [ DONE ] :                                      [ DONE ]
[ DONE ]

**** Installation completed successfully ******

(Please allow Installer a few moments to start up.....)

Additional information:
* Time synchronization installation was skipped. Please note that unsynchronized time on server instances might be problem for some OpenStack components.
* Did not create a cinder volume group, one already existed
* To use the command line tools you need to source the file /root/keystonerc_admin created on
* To use the console, browse to
* The installation log file is available at: /var/tmp/packstack/20130301-135443-qbNvvH/openstack-setup.log

That really is it – you didn’t need to touch any config files for OpenStack, QPid, MySQL or any other service involved. PackStack just worked its magic and there is now a 4 node OpenStack cluster up and running. One of the nice things about PackStack using Puppet for all its work is that if something goes wrong halfway through, you don’t need to throw it all away – just fix the issue and re-run packstack and it’ll do whatever work was left over from before.

The results

Let’s see what’s running on each node. First the frontend user facing node

$ ssh root@f18x86_64a ps -ax
1 ?        Ss     0:03 /usr/lib/systemd/systemd --switched-root --system --deserialize 14
283 ?        Ss     0:00 /usr/lib/systemd/systemd-udevd
284 ?        Ss     0:07 /usr/lib/systemd/systemd-journald
348 ?        S      0:00 /usr/lib/systemd/systemd-udevd
391 ?        Ss     0:06 /usr/bin/python -Es /usr/sbin/firewalld --nofork
392 ?        S<sl   0:00 /sbin/auditd -n
394 ?        Ss     0:00 /usr/lib/systemd/systemd-logind
395 ?        Ssl    0:00 /sbin/rsyslogd -n -c 5
397 ?        Ssl    0:01 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
403 ?        Ss     0:00 login -- root
411 ?        Ss     0:00 /usr/sbin/crond -n
417 ?        S      0:00 /usr/lib/systemd/systemd-udevd
418 ?        Ssl    0:01 /usr/sbin/NetworkManager --no-daemon
452 ?        Ssl    0:00 /usr/lib/polkit-1/polkitd --no-debug
701 ?        S      0:00 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/ -lf /var/lib/dhclient/ -cf /var/run/nm-dhclient-eth0.conf eth0
769 ?        Ss     0:00 /usr/sbin/sshd -D
772 ?        Ss     0:00 sendmail: accepting connections
792 ?        Ss     0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
800 tty1     Ss+    0:00 -bash
8702 ?        Ss     0:00 /usr/bin/python /usr/bin/glance-registry --config-file /etc/glance/glance-registry.conf
8745 ?        S      0:00 /usr/bin/python /usr/bin/glance-registry --config-file /etc/glance/glance-registry.conf
8764 ?        Ss     0:00 /usr/bin/python /usr/bin/glance-api --config-file /etc/glance/glance-api.conf
10030 ?        Ss     0:01 /usr/bin/python /usr/bin/keystone-all --config-file /etc/keystone/keystone.conf
10201 ?        S      0:00 /usr/bin/python /usr/bin/glance-api --config-file /etc/glance/glance-api.conf
13096 ?        Ss     0:01 /usr/bin/python /usr/bin/nova-api --config-file /etc/nova/nova.conf --logfile /var/log/nova/api.log
13103 ?        S      0:00 /usr/bin/python /usr/bin/nova-api --config-file /etc/nova/nova.conf --logfile /var/log/nova/api.log
13111 ?        S      0:00 /usr/bin/python /usr/bin/nova-api --config-file /etc/nova/nova.conf --logfile /var/log/nova/api.log
13120 ?        S      0:00 /usr/bin/python /usr/bin/nova-api --config-file /etc/nova/nova.conf --logfile /var/log/nova/api.log
13484 ?        Ss     0:05 /usr/bin/python /usr/bin/nova-consoleauth --config-file /etc/nova/nova.conf --logfile /var/log/nova/consoleauth.log
20354 ?        Ss     0:00 python /usr/bin/nova-novncproxy --web /usr/share/novnc/
20429 ?        Ss     0:03 /usr/bin/python /usr/bin/nova-cert --config-file /etc/nova/nova.conf --logfile /var/log/nova/cert.log
21035 ?        Ssl    0:00 /usr/bin/memcached -u memcached -p 11211 -m 922 -c 8192 -l -U 11211 -t 1
21311 ?        Ss     0:00 /usr/sbin/httpd -DFOREGROUND
21312 ?        Sl     0:00 /usr/sbin/httpd -DFOREGROUND
21313 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND
21314 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND
21315 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND
21316 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND
21317 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND
21632 ?        S      0:00 /usr/sbin/httpd -DFOREGROUND

Now the infrastructure node

$ ssh root@f18x86_64b ps -ax
1 ?        Ss     0:02 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
289 ?        Ss     0:00 /usr/lib/systemd/systemd-udevd
290 ?        Ss     0:05 /usr/lib/systemd/systemd-journald
367 ?        S      0:00 /usr/lib/systemd/systemd-udevd
368 ?        S      0:00 /usr/lib/systemd/systemd-udevd
408 ?        Ss     0:04 /usr/bin/python -Es /usr/sbin/firewalld --nofork
409 ?        S<sl   0:00 /sbin/auditd -n
411 ?        Ss     0:00 /usr/lib/systemd/systemd-logind
412 ?        Ssl    0:00 /sbin/rsyslogd -n -c 5
414 ?        Ssl    0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
419 tty1     Ss+    0:00 /sbin/agetty --noclear tty1 38400 linux
429 ?        Ss     0:00 /usr/sbin/crond -n
434 ?        Ssl    0:01 /usr/sbin/NetworkManager --no-daemon
484 ?        Ssl    0:00 /usr/lib/polkit-1/polkitd --no-debug
717 ?        S      0:00 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/ -lf /var/lib/dhclient/ -cf /var/run/nm-dhclient-eth0.conf eth0
766 ?        Ss     0:00 sendmail: accepting connections
792 ?        Ss     0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
805 ?        Ss     0:00 /usr/sbin/sshd -D
8531 ?        Ss     0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
8884 ?        Sl     0:15 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/ --socket=/var/lib/mysql/mysql.sock --port=3306
9778 ?        Ssl    0:01 /usr/sbin/qpidd --config /etc/qpidd.conf
10004 ?        S<     0:00 [loop2]
13381 ?        Ss     0:02 /usr/sbin/tgtd -f
14831 ?        Ss     0:00 /usr/bin/python /usr/bin/cinder-api --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/api.log
14907 ?        Ss     0:04 /usr/bin/python /usr/bin/cinder-scheduler --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/scheduler.log
14956 ?        Ss     0:02 /usr/bin/python /usr/bin/cinder-volume --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/volume.log
15516 ?        Ss     0:06 /usr/bin/python /usr/bin/nova-scheduler --config-file /etc/nova/nova.conf --logfile /var/log/nova/scheduler.log
15609 ?        Ss     0:08 /usr/bin/python /usr/bin/nova-network --config-file /etc/nova/nova.conf --logfile /var/log/nova/network.log

And finally one of the 2 compute nodes

$ ssh root@f18x86_64c ps -ax
  1 ?        Ss     0:02 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
315 ?        Ss     0:00 /usr/lib/systemd/systemd-udevd
317 ?        Ss     0:04 /usr/lib/systemd/systemd-journald
436 ?        Ss     0:05 /usr/bin/python -Es /usr/sbin/firewalld --nofork
437 ?        S<sl   0:00 /sbin/auditd -n
439 ?        Ss     0:00 /usr/lib/systemd/systemd-logind
440 ?        Ssl    0:00 /sbin/rsyslogd -n -c 5
442 ?        Ssl    0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
454 ?        Ss     0:00 /usr/sbin/crond -n
455 tty1     Ss+    0:00 /sbin/agetty --noclear tty1 38400 linux
465 ?        S      0:00 /usr/lib/systemd/systemd-udevd
466 ?        S      0:00 /usr/lib/systemd/systemd-udevd
470 ?        Ssl    0:01 /usr/sbin/NetworkManager --no-daemon
499 ?        Ssl    0:00 /usr/lib/polkit-1/polkitd --no-debug
753 ?        S      0:00 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/ -lf /var/lib/dhclient/ -cf /var/run/nm-dhclient-eth0.conf eth0
820 ?        Ss     0:00 sendmail: accepting connections
834 ?        Ss     0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
846 ?        Ss     0:00 /usr/sbin/sshd -D
9749 ?        Ssl    0:13 /usr/sbin/libvirtd
16060 ?        Sl     0:01 /usr/bin/python -Es /usr/sbin/tuned -d
16163 ?        Ssl    0:03 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova.conf --logfile /var/log/nova/compute.log

All-in-all PackStack exceeded my expectations for such a young tool – it did a great job with a minimum of fuss and was nice & reliable at what it did too. The only problem I hit was forgetting to set SELinux permissive first, which was not its fault – this is a bug in Fedora policy we will be addressing – and it recovered from that just fine when I re-ran it after setting permissive mode.

What DevStack does to your host when setting up OpenStack on Fedora 17

Posted: November 20th, 2012 | Filed under: Fedora, libvirt, OpenStack, Virt Tools

As I mentioned in my previous post, I’m not really a fan of giant shell scripts which ask for unrestricted sudo access without telling you what they’re going to do. Unfortunately DevStack is one such script :-( So I decided to investigate just what it does to a Fedora 17 host when it is run. The general idea I had was

  • Install a generic Fedora 17 guest
  • Create a QCow2 image using the installed image as its backing file
  • Reconfigure the guest to use the QCow2 image as its disk
  • Run DevStack in the guest
  • Compare the contents of the original installed image and the DevStack processed image

It sounded like libguestfs ought to be able to help out with the last step, and after a few words with Rich, I learnt about use of virt-ls for exactly this purpose. After trying this once, it quickly became apparent that just comparing the lists of files is quite difficult because DevStack installs a load of extra RPMs with many thousands of files. So to take this out of the equation, I grabbed the /var/log/yum.log file to get a list of all RPMs that DevStack had installed, and manually added them into the generic Fedora 17 guest base image. Now I could re-run DevStack again and do a file comparison which excluded all the stuff installed by RPM.
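The comparison itself boils down to a set difference between two file listings. As a rough sketch, assuming you already have one path per line for the base image and for the DevStack processed image (however you produced them), something like this picks out the files that only exist afterwards. The new_files name and the demo paths are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print paths present in the second listing but not the first.
new_files() (
    # $1 = listing of the base image, $2 = listing after running DevStack
    LC_ALL=C; export LC_ALL          # same collation for sort and comm
    before=$(mktemp) after=$(mktemp)
    sort "$1" > "$before"
    sort "$2" > "$after"
    comm -13 "$before" "$after"      # keep lines unique to the second file
    rm -f "$before" "$after"
)

# Demo with two tiny, made-up listings.
base=$(mktemp); stacked=$(mktemp)
printf '/etc/hosts\n/usr/bin/python\n' > "$base"
printf '/usr/bin/python\n/opt/stack/nova\n/etc/hosts\n' > "$stacked"
new_files "$base" "$stacked"
rm -f "$base" "$stacked"
```

Swapping the two arguments would show files that were removed instead; files that were modified in place need a listing that includes sizes or checksums, which this sketch does not attempt.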

RPM packages installed (with YUM)

  • apr-1.4.6-1.fc17.x86_64
  • apr-util-1.4.1-2.fc17.x86_64
  • apr-util-ldap-1.4.1-2.fc17.x86_64
  • augeas-libs-0.10.0-3.fc17.x86_64
  • binutils-
  • boost-1.48.0-13.fc17.x86_64
  • boost-chrono-1.48.0-13.fc17.x86_64
  • boost-date-time-1.48.0-13.fc17.x86_64
  • boost-filesystem-1.48.0-13.fc17.x86_64
  • boost-graph-1.48.0-13.fc17.x86_64
  • boost-iostreams-1.48.0-13.fc17.x86_64
  • boost-locale-1.48.0-13.fc17.x86_64
  • boost-program-options-1.48.0-13.fc17.x86_64
  • boost-python-1.48.0-13.fc17.x86_64
  • boost-random-1.48.0-13.fc17.x86_64
  • boost-regex-1.48.0-13.fc17.x86_64
  • boost-serialization-1.48.0-13.fc17.x86_64
  • boost-signals-1.48.0-13.fc17.x86_64
  • boost-system-1.48.0-13.fc17.x86_64
  • boost-test-1.48.0-13.fc17.x86_64
  • boost-thread-1.48.0-13.fc17.x86_64
  • boost-timer-1.48.0-13.fc17.x86_64
  • boost-wave-1.48.0-13.fc17.x86_64
  • ceph-0.44-5.fc17.x86_64
  • check-0.9.8-5.fc17.x86_64
  • cloog-ppl-0.15.11-3.fc17.1.x86_64
  • cpp-4.7.2-2.fc17.x86_64
  • curl-7.24.0-5.fc17.x86_64
  • Django-1.4.2-1.fc17.noarch
  • django-registration-0.7-3.fc17.noarch
  • dmidecode-2.11-8.fc17.x86_64
  • dnsmasq-utils-2.63-1.fc17.x86_64
  • ebtables-2.0.10-5.fc17.x86_64
  • euca2ools-2.1.1-2.fc17.noarch
  • gawk-4.0.1-1.fc17.x86_64
  • gcc-4.7.2-2.fc17.x86_64
  • genisoimage-1.1.11-14.fc17.x86_64
  • git-
  • glusterfs-3.2.7-2.fc17.x86_64
  • glusterfs-fuse-3.2.7-2.fc17.x86_64
  • gnutls-utils-2.12.17-1.fc17.x86_64
  • gperftools-libs-2.0-5.fc17.x86_64
  • httpd-2.2.22-4.fc17.x86_64
  • httpd-tools-2.2.22-4.fc17.x86_64
  • iptables-1.4.14-2.fc17.x86_64
  • ipxe-roms-qemu-20120328-1.gitaac9718.fc17.noarch
  • iscsi-initiator-utils-
  • kernel-headers-3.6.6-1.fc17.x86_64
  • kpartx-0.4.9-26.fc17.x86_64
  • libaio-0.3.109-5.fc17.x86_64
  • libcurl-7.24.0-5.fc17.x86_64
  • libmpc-0.9-2.fc17.2.x86_64
  • libunwind-1.0.1-3.fc17.x86_64
  • libusal-1.1.11-14.fc17.x86_64
  • libvirt-
  • libvirt-client-
  • libvirt-daemon-
  • libvirt-daemon-config-network-
  • libvirt-daemon-config-nwfilter-
  • libvirt-python-
  • libwsman1-2.2.7-5.fc17.x86_64
  • lzop-1.03-4.fc17.x86_64
  • m2crypto-0.21.1-8.fc17.x86_64
  • mod_wsgi-3.3-2.fc17.x86_64
  • mx-3.2.3-1.fc17.x86_64
  • mysql-5.5.28-1.fc17.x86_64
  • mysql-libs-5.5.28-1.fc17.x86_64
  • MySQL-python-1.2.3-5.fc17.x86_64
  • mysql-server-5.5.28-1.fc17.x86_64
  • netcf-libs-0.2.2-1.fc17.x86_64
  • numad-0.5-4.20120522git.fc17.x86_64
  • numpy-1.6.2-1.fc17.x86_64
  • parted-3.0-10.fc17.x86_64
  • perl-AnyEvent-5.27-7.fc17.noarch
  • perl-AnyEvent-AIO-1.1-8.fc17.noarch
  • perl-AnyEvent-BDB-1.1-7.fc17.noarch
  • perl-Async-MergePoint-0.03-7.fc17.noarch
  • perl-BDB-1.88-5.fc17.x86_64
  • perl-common-sense-3.5-1.fc17.noarch
  • perl-Compress-Raw-Bzip2-2.052-1.fc17.x86_64
  • perl-Compress-Raw-Zlib-2.052-1.fc17.x86_64
  • perl-Config-General-2.50-6.fc17.noarch
  • perl-Coro-6.07-3.fc17.x86_64
  • perl-Curses-1.28-5.fc17.x86_64
  • perl-DBD-MySQL-4.020-2.fc17.x86_64
  • perl-DBI-1.617-1.fc17.x86_64
  • perl-Encode-Locale-1.02-5.fc17.noarch
  • perl-Error-0.17016-7.fc17.noarch
  • perl-EV-4.03-8.fc17.x86_64
  • perl-Event-1.20-1.fc17.x86_64
  • perl-Event-Lib-1.03-16.fc17.x86_64
  • perl-Git-
  • perl-Glib-1.241-2.fc17.x86_64
  • perl-Guard-1.022-1.fc17.x86_64
  • perl-Heap-0.80-10.fc17.noarch
  • perl-HTML-Parser-3.69-3.fc17.x86_64
  • perl-HTML-Tagset-3.20-10.fc17.noarch
  • perl-HTTP-Date-6.00-3.fc17.noarch
  • perl-HTTP-Message-6.03-1.fc17.noarch
  • perl-IO-AIO-4.15-1.fc17.x86_64
  • perl-IO-Async-0.29-7.fc17.noarch
  • perl-IO-Compress-2.052-1.fc17.noarch
  • perl-IO-Socket-SSL-1.66-1.fc17.noarch
  • perl-IO-Tty-1.10-5.fc17.x86_64
  • perl-LWP-MediaTypes-6.01-4.fc17.noarch
  • perl-Net-HTTP-6.02-2.fc17.noarch
  • perl-Net-LibIDN-0.12-8.fc17.x86_64
  • perl-Net-SSLeay-1.48-1.fc17.x86_64
  • perl-POE-1.350-2.fc17.noarch
  • perl-Socket6-0.23-8.fc17.x86_64
  • perl-Socket-GetAddrInfo-0.19-1.fc17.x86_64
  • perl-TermReadKey-2.30-14.fc17.x86_64
  • perl-TimeDate-1.20-6.fc17.noarch
  • perl-URI-1.60-1.fc17.noarch
  • ppl-0.11.2-8.fc17.x86_64
  • ppl-pwl-0.11.2-8.fc17.x86_64
  • pylint-0.25.1-1.fc17.noarch
  • python-amqplib-1.0.2-3.fc17.noarch
  • python-anyjson-0.3.1-3.fc17.noarch
  • python-babel-0.9.6-3.fc17.noarch
  • python-BeautifulSoup-3.2.1-3.fc17.noarch
  • python-boto-2.5.2-1.fc17.noarch
  • python-carrot-0.10.7-4.fc17.noarch
  • python-cheetah-2.4.4-2.fc17.x86_64
  • python-cherrypy-3.2.2-1.fc17.noarch
  • python-coverage-3.5.1-0.3.b1.fc17.x86_64
  • python-crypto-2.6-1.fc17.x86_64
  • python-dateutil-1.5-3.fc17.noarch
  • python-devel-2.7.3-7.2.fc17.x86_64
  • python-docutils-0.8.1-3.fc17.noarch
  • python-eventlet-0.9.17-1.fc17.noarch
  • python-feedparser-5.1.2-2.fc17.noarch
  • python-gflags-1.5.1-2.fc17.noarch
  • python-greenlet-0.3.1-11.fc17.x86_64
  • python-httplib2-0.7.4-6.fc17.noarch
  • python-iso8601-0.1.4-4.fc17.noarch
  • python-jinja2-2.6-2.fc17.noarch
  • python-kombu-1.1.3-2.fc17.noarch
  • python-lockfile-0.9.1-2.fc17.noarch
  • python-logilab-astng-0.23.1-1.fc17.noarch
  • python-logilab-common-0.57.1-2.fc17.noarch
  • python-lxml-2.3.5-1.fc17.x86_64
  • python-markdown-2.1.1-1.fc17.noarch
  • python-migrate-0.7.2-2.fc17.noarch
  • python-mox-0.5.3-4.fc17.noarch
  • python-netaddr-0.7.5-4.fc17.noarch
  • python-nose-1.1.2-2.fc17.noarch
  • python-paramiko-
  • python-paste-deploy-1.5.0-4.fc17.noarch
  • python-paste-script-1.7.5-4.fc17.noarch
  • python-pep8-1.0.1-1.fc17.noarch
  • python-pip-1.0.2-2.fc17.noarch
  • python-pygments-1.4-4.fc17.noarch
  • python-qpid-0.18-1.fc17.noarch
  • python-routes-1.12.3-3.fc17.noarch
  • python-setuptools-0.6.27-2.fc17.noarch
  • python-sphinx-1.1.3-1.fc17.noarch
  • python-sqlalchemy-0.7.9-1.fc17.x86_64
  • python-suds-0.4.1-2.fc17.noarch
  • python-tempita-0.5.1-1.fc17.noarch
  • python-unittest2-0.5.1-3.fc17.noarch
  • python-virtualenv-
  • python-webob-1.1.1-2.fc17.noarch
  • python-wsgiref-0.1.2-8.fc17.noarch
  • pyxattr-0.5.1-1.fc17.x86_64
  • PyYAML-3.10-3.fc17.x86_64
  • qemu-common-1.0.1-2.fc17.x86_64
  • qemu-img-1.0.1-2.fc17.x86_64
  • qemu-system-x86-1.0.1-2.fc17.x86_64
  • qpid-cpp-client-0.18-5.fc17.x86_64
  • qpid-cpp-server-0.18-5.fc17.x86_64
  • radvd-1.8.5-3.fc17.x86_64
  • screen-4.1.0-0.9.20120314git3c2946.fc17.x86_64
  • scsi-target-utils-1.0.24-6.fc17.x86_64
  • seabios-bin-1.7.1-1.fc17.noarch
  • sg3_utils-1.31-2.fc17.x86_64
  • sgabios-bin-0-0.20110622SVN.fc17.noarch
  • spice-server-0.10.1-5.fc17.x86_64
  • sqlite-3.7.11-3.fc17.x86_64
  • tcpdump-4.2.1-3.fc17.x86_64
  • vgabios-0.6c-4.fc17.noarch
  • wget-1.13.4-7.fc17.x86_64
  • xen-libs-4.1.3-5.fc17.x86_64
  • xen-licenses-4.1.3-5.fc17.x86_64

Python packages installed (with PIP)

These all ended up in /usr/lib/python2.7/site-packages :-(

  • WebOb
  • amqplib
  • boto (splattering over existing boto RPM package with older version)
  • cinderclient
  • cliff
  • cmd2
  • compressor
  • django_appconf
  • django_compressor
  • django_openstack_auth
  • glance
  • horizon
  • jsonschema
  • keyring
  • keystoneclient
  • kombu
  • lockfile
  • nova
  • openstack_auth
  • pam
  • passlib
  • prettytable
  • pyparsing
  • python-cinderclient
  • python-glanceclient
  • python-novaclient
  • python-openstackclient
  • python-quantumclient
  • python-swiftclient
  • pytz
  • quantumclient
  • suds
  • swiftclient
  • warlock
  • webob

Files changed

  • /etc/group (added $USER to ‘libvirtd’ group)
  • /etc/gshadow (as above)
  • /etc/httpd/conf/httpd.conf (changes Listen 80 to Listen
  • /usr/lib/python2.7/site-packages/boto (due to overwriting RPM provided boto)
  • /usr/bin/cq (as above)
  • /usr/bin/elbadmin (as above)
  • /usr/bin/list_instances (as above)
  • /usr/bin/lss3 (as above)
  • /usr/bin/route53 (as above)
  • /usr/bin/s3multiput (as above)
  • /usr/bin/s3put (as above)
  • /usr/bin/sdbadmin (as above)

Files created

  • /etc/cinder/*
  • /etc/glance/*
  • /etc/keystone/*
  • /etc/nova/*
  • /etc/httpd/conf.d/horizon.conf
  • /etc/polkit-1/localauthority/50-local.d/50-libvirt-remote-access.pkla
  • /etc/sudoers.d/50_stack_sh
  • /etc/sudoers.d/cinder-rootwrap
  • /etc/sudoers.d/nova-rootwrap
  • $DEST/cinder/*
  • $DEST/data/*
  • $DEST/glance/*
  • $DEST/horizon/*
  • $DEST/keystone/*
  • $DEST/noVNC/*
  • $DEST/nova/*
  • $DEST/python-cinderclient/*
  • $DEST/python-glanceclient/*
  • $DEST/python-keystoneclient/*
  • $DEST/python-novaclient/*
  • $DEST/python-openstackclient/*
  • /usr/bin/cinder*
  • /usr/bin/glance*
  • /usr/bin/keystone*
  • /usr/bin/nova*
  • /usr/bin/openstack
  • /usr/bin/quantum
  • /usr/bin/swift
  • /var/cache/cinder/*
  • /var/cache/glance/*
  • /var/cache/keystone/*
  • /var/cache/nova/*
  • /var/lib/mysql/cinder/*
  • /var/lib/mysql/glance/*
  • /var/lib/mysql/keystone/*
  • /var/lib/mysql/mysql/*
  • /var/lib/mysql/nova/*
  • /var/lib/mysql/performance_schema/*

Thoughts on installation

As we can see from the details above, DevStack does a very significant amount of work as root using sudo. I had fully expected that it would install RPMs as root, but I had not counted on it adding extra Python modules into /usr/lib/python2.7, nor on the addition of files in /etc/, /var or /usr/bin. I had set the $DEST environment variable for DevStack, naively assuming that it would cause it to install everything possible under that location. In fact the $DEST variable was only used to control where the GIT checkouts of each OpenStack component went, along with a few misc files in $DEST/files/.

IMHO a development environment setup tool should do as little as humanly possible as root. From the above list of changes, the only things that I believe justify use of sudo privileges are:

  • Installation of RPMs from standard YUM repositories
  • Installation of /etc/sudoers.d/ config files
  • Installation of /etc/polkit file to grant access to libvirtd

Everything else is capable of being 100% isolated from the rest of the OS, under the $DEST directory location. Taking that into account my preferred development setup would be

 +- nova (GIT checkout)
 +- ...etc... (GIT checkout)
 +- vroot
     +- bin
     |   +- nova
     |   +- ...etc...
     +- etc
     |   +- nova
     |   +- ...etc...
     +- lib
     |   +- python2.7
     |       +- site-packages
     |           +- boto
     |           +- ...etc...
     +- var
         +- cache
         |   +- nova
         |   +- ...etc...
         +- lib
             +- mysql

This would imply running a private copy of qpid, mysql and httpd, ideally all inside the same screen session as the rest of the OpenStack services, using unprivileged ports. Even if we relied on the system instances of qpid, mysql and httpd and did a little bit more privileged config, 95% of the rest of the stuff DevStack does as root could still be kept unprivileged. I am also aware that current OpenStack code may not be amenable to installation in locations outside of / by default, but the code is all there to be modified to cope with arbitrary install locations if desired/required.
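The vroot layout shown above could be created without any root privileges at all. Here is a minimal sketch, assuming only the directories from the tree; the function names and the choice of PATH and PYTHONPATH as the environment variables to adjust are my own illustration, not something DevStack provides.

```python
import os
import tempfile

# Directories of the proposed private root, relative to $DEST/vroot.
VROOT_DIRS = [
    "bin",
    "etc/nova",
    "lib/python2.7/site-packages",
    "var/cache/nova",
    "var/lib/mysql",
]


def create_vroot(dest):
    """Create the vroot tree under dest and return the path to the vroot."""
    vroot = os.path.join(dest, "vroot")
    for rel in VROOT_DIRS:
        os.makedirs(os.path.join(vroot, rel), exist_ok=True)
    return vroot


def vroot_environment(vroot):
    """Environment additions pointing tools at the vroot, not the host OS."""
    return {
        "PATH": os.path.join(vroot, "bin") + os.pathsep + os.environ.get("PATH", ""),
        "PYTHONPATH": os.path.join(vroot, "lib", "python2.7", "site-packages"),
    }


if __name__ == "__main__":
    # A throwaway directory stands in for $DEST here.
    dest = tempfile.mkdtemp(prefix="devstack-dest-")
    vroot = create_vroot(dest)
    print(vroot)
    print(vroot_environment(vroot)["PYTHONPATH"])
```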

My other wishlist item for DevStack would be for it to print output that is meaningful to the end user running it. Simply printing a verbose list of every single shell command executed is one of the most unkind things you can do to a user. I’d like to see

 # ./
 * Cloning GIT repositories
     - 1/10 nova
     - 2/10 cinder
     - 3/10 quantum
 * Installing RPMs using YUM
     - 1/30 python-pip
     - 2/30 libvirt
     - 3/30 libvirt-client
 * Installing Python packages to $DEST/vroot/lib/python2.7/site-packages using PIP
     - 1/24 WebOb
     - 2/24 amqplib
     - 3/24 boto
 * Creating database schemas
     - 1/10 nova
     - 2/10 cinder
     - 3/10 quantum

By all means still save the full list of every shell command and their output to a ‘devstack.log’ file for troubleshooting when things go wrong.
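To show that this kind of output is cheap to produce, here is a rough sketch of a stage runner that prints one summary line per item and sends the verbose details only to a log. The function name and the idea of the action returning its verbose output as a string are assumptions made for the example.

```python
import io
import sys


def run_stage(title, items, action, out=sys.stdout, log=None):
    """Run action(item) for each item, printing one summary line per item.

    Whatever action returns is treated as its verbose output and is written
    only to the log, never to the user's terminal.
    """
    out.write(" * %s\n" % title)
    total = len(items)
    for index, item in enumerate(items, start=1):
        out.write("     - %d/%d %s\n" % (index, total, item))
        details = action(item)
        if log is not None and details:
            log.write("[%s] %s: %s\n" % (title, item, details))


if __name__ == "__main__":
    # Stand-in action: a real tool would run git, yum or pip here.
    def fake_clone(name):
        return "git clone of %s finished" % name

    # An in-memory buffer stands in for the devstack.log file.
    log_buffer = io.StringIO()
    run_stage("Cloning GIT repositories", ["nova", "cinder", "quantum"],
              fake_clone, log=log_buffer)
    print(log_buffer.getvalue(), end="")
```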

What was new in libvirt for the OpenStack Nova Folsom release

Posted: November 16th, 2012 | Filed under: Fedora, libvirt, OpenStack, Virt Tools

The Folsom release of OpenStack has been out for a few weeks now, and I had intended to write this post much earlier, but other things (moving house, getting married & travelling to LinuxCon Europe / KVM Forum all in the space of a month) got in the way. There are highlighted release notes, but I wanted to give a little more detail on some of the changes I was involved with making to the libvirt driver and what motivated them.

XML configuration

First off was a change in the way Nova generates libvirt XML configurations. Previously the libvirt driver in Nova used the Cheetah templating system to generate its XML configurations. The problem with this is that there was a lot of information that needed to be passed into the template as parameters, so Nova was inventing an ad hoc configuration format for libvirt guest config internally, which was then further translated into proper guest config in the template. The resulting code was hard to maintain and understand, because the logic for constructing the XML was effectively spread across both the template file and the libvirt driver code with no consistent structure. Thus the first big change that went into the libvirt driver during Folsom was to introduce a formal set of Python configuration objects to represent the libvirt XML config. The libvirt driver code now directly populates these config objects with the required data, and then simply serializes the objects to XML. The use of Cheetah has been completely eliminated, and the code structure is clarified significantly as a result. There is a wiki page describing this in a little more detail.
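To give a flavour of the config object approach, here is a much simplified sketch. The class and method names are hypothetical and are not the ones used in Nova; it uses the standard library ElementTree module and covers only a handful of elements from the libvirt domain XML.

```python
import xml.etree.ElementTree as ET


class GuestDiskConfig(object):
    """Hypothetical config object for one file-backed <disk> element."""

    def __init__(self, source_path, target_dev, driver_format="qcow2"):
        self.source_path = source_path
        self.target_dev = target_dev
        self.driver_format = driver_format

    def format_dom(self):
        disk = ET.Element("disk", type="file", device="disk")
        ET.SubElement(disk, "driver", name="qemu", type=self.driver_format)
        ET.SubElement(disk, "source", file=self.source_path)
        ET.SubElement(disk, "target", dev=self.target_dev, bus="virtio")
        return disk


class GuestConfig(object):
    """Hypothetical config object for the top level <domain> element."""

    def __init__(self, name, memory_kib, vcpus, virt_type="kvm"):
        self.name = name
        self.memory_kib = memory_kib
        self.vcpus = vcpus
        self.virt_type = virt_type
        self.disks = []

    def format_dom(self):
        domain = ET.Element("domain", type=self.virt_type)
        ET.SubElement(domain, "name").text = self.name
        ET.SubElement(domain, "memory").text = str(self.memory_kib)
        ET.SubElement(domain, "vcpu").text = str(self.vcpus)
        devices = ET.SubElement(domain, "devices")
        for disk in self.disks:
            devices.append(disk.format_dom())
        return domain

    def to_xml(self):
        """Serialize the whole object tree to an XML string."""
        return ET.tostring(self.format_dom(), encoding="unicode")


if __name__ == "__main__":
    # The driver code populates plain objects, then serializes once.
    guest = GuestConfig("demo", memory_kib=524288, vcpus=1)
    guest.disks.append(GuestDiskConfig("/var/lib/demo/disk.qcow2", "vda"))
    print(guest.to_xml())
```

The point of the design is that all knowledge of the XML structure lives in the format_dom methods, so the driver logic never has to deal with string templates.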

CPU model configuration

The primary downside of the removal of the Cheetah templating is that it is no longer possible for admins deploying Nova to make ad hoc changes to the libvirt guest XML that is used. Personally I’d actually argue that this is a good thing, because the ability to make ad hoc changes meant that there was less motivation for directly addressing the missing features in Nova, but I know plenty of people would disagree with this view :-) It was quickly apparent that the one change a great many people were making to the libvirt XML config was to specify a guest CPU model. If no explicit CPU model is requested in the guest config, KVM will start with a generic, lowest common denominator model that will typically work everywhere. As can be expected, this generic CPU model is not going to offer optimal performance for the guests. For example, if your host has shiny new CPUs with builtin AES encryption instructions, the guest is not going to be able to take advantage of them. Thus the second big change in the Nova libvirt driver was to introduce explicit support for configuring the CPU model. This involves two new Nova config parameters. The first, libvirt_cpu_mode, chooses between “host-model”, “host-passthrough” and “custom”. If the mode is set to “custom”, then the second, libvirt_cpu_model, is used to specify the name of the custom CPU that is required. Again there is a wiki page describing this in a little more detail.

Once the ability to choose CPU models was merged, it was decided that the default behaviour should also be changed. Thus if Nova is configured to use KVM as its hypervisor, then it will use the “host-model” CPU mode by default. This causes the guest CPU model to be an (almost) exact copy of the host CPU model, offering maximum out of the box performance. There turned out to be one small wrinkle in this choice when using nested KVM though. Due to a combination of problems in libvirt and KVM, use of “host-model” fails for nested KVM. Thus anyone using nested KVM needs to set libvirt_cpu_mode=“none” as a workaround for now. If you’re using KVM on bare metal everything should be fine, which is of course the normal scenario for production deployments.
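As a rough illustration of how the two parameters map onto the libvirt guest XML, the sketch below builds the cpu element for each mode. The function name is hypothetical, the match="exact" attribute for custom models is my choice for the example, and treating “none” as a value that suppresses the element entirely is an assumption based on the nested KVM workaround.

```python
import xml.etree.ElementTree as ET

VALID_MODES = ("host-model", "host-passthrough", "custom", "none")


def cpu_element(cpu_mode, cpu_model=None):
    """Return the libvirt <cpu> element for the given settings, or None.

    A mode of "none" is assumed to mean that no <cpu> element is emitted,
    leaving the hypervisor's default CPU model in place.
    """
    if cpu_mode not in VALID_MODES:
        raise ValueError("unknown CPU mode: %r" % (cpu_mode,))
    if cpu_mode == "none":
        return None
    if cpu_mode == "custom":
        if not cpu_model:
            raise ValueError("custom mode requires a CPU model name")
        cpu = ET.Element("cpu", mode="custom", match="exact")
        ET.SubElement(cpu, "model").text = cpu_model
        return cpu
    if cpu_model:
        raise ValueError("a CPU model name only makes sense with custom mode")
    return ET.Element("cpu", mode=cpu_mode)


if __name__ == "__main__":
    print(ET.tostring(cpu_element("host-model"), encoding="unicode"))
    print(ET.tostring(cpu_element("custom", "Westmere"), encoding="unicode"))
    print(cpu_element("none"))
```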

Time keeping policy

Again on the performance theme, the libvirt Nova driver was updated to set time keeping policies for KVM guests. Virtual machines on x86 have a number of timers available including the PIT, RTC, PM-Timer and HPET. Reliable timers are one of the hardest problems to solve in full machine virtualization platforms, and KVM is no exception. It all comes down to the question of what to do when the hypervisor cannot inject a timer interrupt at the correct time, because a different guest is running. There are a number of policies available: inject the missed tick as soon as possible, merge all missed ticks into one and deliver it as soon as possible, temporarily inject missed ticks at a higher rate than normal to “catch up”, or simply discard the missed tick entirely. It turns out that Windows 7 is particularly sensitive to timers and the default KVM policies for missing ticks were causing frequent crashes, while older Linux guests would often experience severe time drift. Research validated by the oVirt project team has previously identified an optimal set of policies that should keep the majority of guests happy. Thus the libvirt Nova driver was updated to set explicit policies for time keeping with the PIT and RTC timers when using KVM, which should make everything time related much more reliable.
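In libvirt guest XML these policies are expressed with the tickpolicy attribute on timer elements inside the clock element. The sketch below shows the general shape. The function name is hypothetical, the pairing of libvirt's four tickpolicy values with the behaviours listed above is my reading of them, and the particular policy chosen for each timer in the example is an assumption for illustration rather than a statement of what Nova sets.

```python
import xml.etree.ElementTree as ET

# libvirt tickpolicy values, paired with the behaviours described in the text.
TICK_POLICIES = {
    "delay": "inject the missed tick as soon as possible",
    "merge": "merge all missed ticks into one and deliver it as soon as possible",
    "catchup": "inject missed ticks at a higher rate until caught up",
    "discard": "drop the missed tick entirely",
}


def clock_element(timer_policies, offset="utc"):
    """Build a <clock> element from (timer name, tick policy) pairs."""
    clock = ET.Element("clock", offset=offset)
    for name, policy in timer_policies:
        if policy not in TICK_POLICIES:
            raise ValueError("unknown tick policy: %r" % (policy,))
        ET.SubElement(clock, "timer", name=name, tickpolicy=policy)
    return clock


if __name__ == "__main__":
    # Illustrative choice of policies for the PIT and RTC timers.
    clock = clock_element([("pit", "delay"), ("rtc", "catchup")])
    print(ET.tostring(clock, encoding="unicode"))
```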

Libvirt authentication

The libvirtd daemon can be configured with a number of different authentication schemes. Out of the box it will use PolicyKit to authenticate clients, and thus Nova packages on Fedora / RHEL / EPEL include a PolicyKit configuration file which grants Nova the ability to connect to libvirt. Administrators may, however, decide to use a different authentication scheme, for example, SASL. If the scheme chosen required a username+password, there was no way for Nova’s libvirt driver to provide these authentication credentials. Fortunately the libvirt client has the ability to look up credentials in a local file. Unfortunately the way Nova connected to libvirt prevented this from working. Thus the way the Nova libvirt driver used openAuth() was fixed to allow the default credential lookup logic to work. It is now possible to require authentication between Nova and libvirt thus:

# augtool -s set /files/etc/libvirt/libvirtd.conf/auth_unix_rw sasl
Saved 1 file(s)

# saslpasswd2 -a libvirt nova
Password: XYZ
Again (for verification): XYZ

# su - nova -s /bin/sh
$ mkdir -p $HOME/.config/libvirt
$ cat > $HOME/.config/libvirt/auth.conf <<EOF


Other changes

Obviously I was not the only person working on the libvirt driver in Folsom; many others contributed work too. Leander Beernaert provided an implementation of the ‘nova diagnostics’ command that works with the libvirt driver, showing the virtual machine CPU, memory, disk and network interface utilization statistics. Pádraig Brady improved the performance of migration, by sending the qcow2 image between hosts directly, instead of converting it to a raw file, sending that, and then converting it back to qcow2. Instead of transferring 10 GB of raw data, it can now send just the data actually used, which may be as little as a few hundred MB. In his test case, this reduced the time to migrate from 7 minutes to 30 seconds, which I’m sure everyone will like to hear :-) Pádraig also optimized the file injection code so that it only mounts the guest image once to inject all data, instead of mounting it separately for each injected item. Boris Filippov contributed support for storing VM disk images in LVM volumes, instead of qcow2 files, while Ben Swartzlander contributed support for using NFS files as the backing for virtual block volumes. Vish updated the way libvirt generates XML configuration for disks, to include the “serial” property against each disk, based on the nova volume ID. This allows the guest OS admin to reliably identify the disk in the guest, using the /dev/disk/by-id/virtio-<volume id> paths, since the /dev/vdXXX device numbers are pretty randomly assigned by the kernel.

Not directly part of the libvirt driver, but Jim Fehlig enhanced the Nova VM scheduler so that it can take account of the hypervisor, architecture and VM mode (paravirt vs HVM) when choosing what host to boot an image on. This makes it much more practical to run mixed environments of, say, Xen and KVM, or Xen fullvirt vs Xen paravirt, or Arm vs x86, etc. When uploading an image to glance, the admin can tag it with properties specifying the desired hypervisor/architecture/vm_mode. The compute drivers then report what combinations they can support, and the scheduler computes the intersection to figure out which hosts are valid candidates for running the image.
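The intersection logic can be illustrated with a small sketch. This is not the scheduler's actual code: the function names, the property key names and the representation of each host's capabilities as (architecture, hypervisor, vm_mode) tuples are all assumptions made for the example.

```python
def host_supports_image(image_props, supported_instances):
    """Return True if any supported triple satisfies the image properties.

    supported_instances is a list of (architecture, hypervisor, vm_mode)
    tuples. A property missing from image_props places no constraint.
    """
    wanted = (
        image_props.get("architecture"),
        image_props.get("hypervisor_type"),
        image_props.get("vm_mode"),
    )
    for triple in supported_instances:
        if all(want is None or want == have
               for want, have in zip(wanted, triple)):
            return True
    return False


def candidate_hosts(image_props, hosts):
    """Return the sorted names of hosts able to run the image."""
    return sorted(
        name for name, supported in hosts.items()
        if host_supports_image(image_props, supported)
    )


if __name__ == "__main__":
    # Hypothetical hosts and the combinations their drivers report.
    hosts = {
        "kvm-host": [("x86_64", "kvm", "hvm"), ("i686", "kvm", "hvm")],
        "xen-host": [("x86_64", "xen", "xen"), ("x86_64", "xen", "hvm")],
        "arm-host": [("armv7l", "kvm", "hvm")],
    }
    print(candidate_hosts({"hypervisor_type": "xen", "vm_mode": "xen"}, hosts))
    print(candidate_hosts({"architecture": "x86_64", "vm_mode": "hvm"}, hosts))
    print(candidate_hosts({}, hosts))
```

An image with no properties set matches every host, which keeps untagged images working as before.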