Walk through of running OpenStack on Fedora 17 using DevStack

Posted: November 19th, 2012 | Filed under: Fedora, libvirt, OpenStack, Virt Tools

When first getting involved in the OpenStack project as a developer, most people will probably recommend use of DevStack. When I first started hacking, I skipped this because it wasn’t reliable on Fedora at that time, but these days it works just fine and there are even basic instructions for DevStack on Fedora. Last week I decided to finally give DevStack a go, since my hand-crafted dev environment was getting kind of nasty. The front page on the DevStack website says it is only supported on Fedora 16, but don’t let that put you off; aside from one bug which does not appear to be distro-specific, it all seemed to work correctly. What follows is an overview of what I did and learnt.

Setting up the virtual machine

I don’t really like letting scripts like DevStack mess around with my primary development environment, particularly when there is little to no documentation about what changes they will be making and they ask for unrestricted sudo (sigh) privileges! Thus running DevStack inside a virtual machine was the obvious way to go. Yes, this means actual VMs run by Nova will be forced to use plain QEMU emulation (or nested KVM if you are brave), but for dev purposes this is fine, since the VMs don’t need to do anything except boot. My host is Fedora 17, and for simplicity I decided that my guest dev environment would also be Fedora 17. With that decided, installing the guest was a simple matter of running virt-install on the host as root

# virt-install --name f17x86_64 --ram 2000 --file /var/lib/libvirt/images/f17x86_64.img --file-size 20 --accelerate --location http://mirror2.hs-esslingen.de/fedora/linux//releases/17/Fedora/x86_64/os/ --os-variant fedora17

I picked the defaults for all installer options, except for reducing the swap file size down to a more sensible 500 MB (rather than the 4 G it suggested). NB if copying this, you probably want to change the URL used to point to your own best mirror location.

Once installation completed, run through the firstboot wizard, creating yourself an unprivileged user account, then login as root. First add the user to the wheel group, to enable it to run sudo commands:

# gpasswd -a YOURUSERNAME wheel

The last step before getting onto DevStack is to install GIT

# yum -y install git

Setting up DevStack

The recommended way to use DevStack is to simply check it out of GIT and run the latest code available. I like to keep all my source code checkouts in one place, so I’m using $HOME/src/openstack for this project

$ mkdir -p $HOME/src/openstack
$ cd $HOME/src/openstack
$ git clone git://github.com/openstack-dev/devstack.git

You could just kick off the stack.sh script at this point, but a few modifications are a good idea first. These go in a “localrc” file in the top level directory of the DevStack checkout

$ cd devstack
$ cat > localrc <<EOF
# Stop DevStack polluting /opt/stack
DESTDIR=$HOME/src/openstack

# Switch to use QPid instead of RabbitMQ 
disable_service rabbit
enable_service qpid

# Replace with your primary interface name
HOST_IP_IFACE=eth0
PUBLIC_INTERFACE=eth0
VLAN_INTERFACE=eth0
FLAT_INTERFACE=eth0

# Replace with whatever password you wish to use
MYSQL_PASSWORD=badpassword
SERVICE_TOKEN=badpassword
SERVICE_PASSWORD=badpassword
ADMIN_PASSWORD=badpassword

# Pre-populate glance with a minimal image and a Fedora 17 image
IMAGE_URLS="http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-uec.tar.gz,http://berrange.fedorapeople.org/images/2012-11-15/f17-x86_64-openstack-sda.qcow2"
EOF
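
One detail of the heredoc above is worth knowing: because the EOF delimiter is unquoted, the shell expands $HOME while writing the file, so localrc ends up containing the absolute path rather than the literal text "$HOME". A minimal sketch of the difference, using a throwaway directory (the demo_dir variable and both file names are made up for illustration):

```shell
# Throwaway directory so the real localrc is not touched
demo_dir=$(mktemp -d)

# Unquoted delimiter: $HOME is expanded at the time the file is written
cat > "$demo_dir/localrc.expanded" <<EOF
DESTDIR=$HOME/src/openstack
EOF

# Quoted delimiter: the text is written literally, "$HOME" and all
cat > "$demo_dir/localrc.literal" <<'EOF'
DESTDIR=$HOME/src/openstack
EOF

# Show both results side by side
cat "$demo_dir/localrc.expanded" "$demo_dir/localrc.literal"
```

Either form should behave the same as long as the file is later read by a shell running as the same user, but the expanded form is easier to sanity check by eye.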

With the localrc created, now just kick off the stack.sh script

$ ./stack.sh

At the time of writing there is a bug in DevStack which causes the first run to fail: it checks for the existence of paths before it has created them. Fortunately, simply tearing down and running it a second time is an easy workaround

$ ./unstack.sh
$ ./stack.sh
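
If you rebuild the environment often, the "run it, clean up, run it again" dance can be wrapped in a small helper. This is only a sketch: run_with_one_retry is a made-up name, not part of DevStack, and it assumes the failure really is the transient first-run bug described above.

```shell
# Run a command; if it fails, run a cleanup command and then try exactly
# once more. The exit status is that of the last attempt.
run_with_one_retry() {
    main_cmd=$1
    cleanup_cmd=$2
    if "$main_cmd"; then
        return 0
    fi
    # Ignore cleanup failures: a half-built stack may not tear down cleanly
    "$cleanup_cmd" || true
    "$main_cmd"
}

# Intended use from the DevStack checkout:
#   run_with_one_retry ./stack.sh ./unstack.sh
```

A single retry is deliberate: if the second attempt also fails, something else is wrong and it is better to read the output than to loop.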

From a completely fresh Fedora 17 desktop install, stack.sh will take a while to complete, as it installs a large number of pre-requisite RPMs and downloads the appliance images. Once it has finished it should tell you what URL the Horizon web interface is running on. Point your browser to it and log in as “admin” with the password provided in your localrc file earlier.

Because we told DevStack to use $HOME/src/openstack as the base directory, a small permissions tweak is needed to allow QEMU to access disk images that will be created during testing.

$ chmod o+rx $HOME

Note that SELinux can be left enforcing, as it will just “do the right thing” with the VM disk image labelling.

UPDATE: if you want to use the Horizon web interface, then you do in fact need to set SELinux to permissive mode, since Apache won’t be allowed to access your GIT checkout where the Horizon files live.

$ sudo su -
# setenforce 0
# vi /etc/sysconfig/selinux
...change to permissive...
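
The manual edit can also be scripted. A minimal sketch, assuming GNU sed and the usual SELINUX=... line in the config file; set_selinux_permissive is a made-up helper name:

```shell
# Switch the persistent SELinux mode to permissive in the given config file.
# --follow-symlinks (a GNU sed option) matters because /etc/sysconfig/selinux
# is normally a symlink, which plain "sed -i" would replace with a regular file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is untouched.
set_selinux_permissive() {
    sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=permissive/' "$1"
}

# As root on the guest:
#   set_selinux_permissive /etc/sysconfig/selinux
```

The setenforce command changes the running system immediately, while the config file change only takes effect from the next boot, so both steps are needed.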

UPDATE: If you want to use Horizon, you must also manually install Node.js from a 3rd party repository, because it is not yet included in the Fedora package repositories:

# yum localinstall --nogpgcheck http://nodejs.tchol.org/repocfg/fedora/nodejs-stable-release.noarch.rpm
# yum -y install nodejs nodejs-compat-symlinks
# systemctl restart httpd.service

Testing DevStack

Before going any further, it is a good idea to make sure that things are operating somewhat normally. DevStack has created a file containing the environment variables required to communicate with OpenStack, so load that first

$ . openrc

Now check what images are available in glance. If you used the IMAGE_URLS example above, glance will have been pre-populated

$ glance image-list
+--------------------------------------+---------------------------------+-------------+------------------+-----------+--------+
| ID                                   | Name                            | Disk Format | Container Format | Size      | Status |
+--------------------------------------+---------------------------------+-------------+------------------+-----------+--------+
| 32b06aae-2dc7-40e9-b42b-551f08e0b3f9 | cirros-0.3.0-x86_64-uec-kernel  | aki         | aki              | 4731440   | active |
| 61942b99-f31c-4155-bd6c-d51971d141d3 | f17-x86_64-openstack-sda        | qcow2       | bare             | 251985920 | active |
| 9fea8b4c-164b-4f54-8e74-b53966e858a6 | cirros-0.3.0-x86_64-uec-ramdisk | ari         | ari              | 2254249   | active |
| ec3e9b72-0970-44f2-b442-58d0042448f7 | cirros-0.3.0-x86_64-uec         | ami         | ami              | 25165824  | active |
+--------------------------------------+---------------------------------+-------------+------------------+-----------+--------+

Incidentally the glance sort ordering is less than helpful here: it appears to be sorting based on the UUID strings rather than the image names :-(

Before booting an instance, Nova likes to be given an SSH public key, which it will inject into the guest filesystem to allow admin login

$ nova keypair-add --pub-key $HOME/.ssh/id_rsa.pub mykey

Finally an image can be booted

$ nova boot --key-name mykey --image f17-x86_64-openstack-sda --flavor m1.tiny f17demo1
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| OS-DCF:diskConfig      | MANUAL                               |
| OS-EXT-STS:power_state | 0                                    |
| OS-EXT-STS:task_state  | scheduling                           |
| OS-EXT-STS:vm_state    | building                             |
| accessIPv4             |                                      |
| accessIPv6             |                                      |
| adminPass              | NsddfbJtR6yy                         |
| config_drive           |                                      |
| created                | 2012-11-19T15:00:51Z                 |
| flavor                 | m1.tiny                              |
| hostId                 |                                      |
| id                     | 6ee509f9-b612-492b-b55b-a36146e6833e |
| image                  | f17-x86_64-openstack-sda             |
| key_name               | mykey                                |
| metadata               | {}                                   |
| name                   | f17demo1                             |
| progress               | 0                                    |
| security_groups        | [{u'name': u'default'}]              |
| status                 | BUILD                                |
| tenant_id              | dd3d27564c6043ef87a31404aeb01ac5     |
| updated                | 2012-11-19T15:00:55Z                 |
| user_id                | 72ae640f50434d07abe7bb6a8e3aba4e     |
+------------------------+--------------------------------------+

Since we’re running QEMU inside a KVM guest, booting the image will take a little while – several minutes or more. Just keep running the ‘nova list’ command to keep an eye on it, until it shows up as ACTIVE

$ nova list
+--------------------------------------+----------+--------+------------------+
| ID                                   | Name     | Status | Networks         |
+--------------------------------------+----------+--------+------------------+
| 6ee509f9-b612-492b-b55b-a36146e6833e | f17demo1 | ACTIVE | private=10.0.0.2 |
+--------------------------------------+----------+--------+------------------+
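
Rather than re-running the command by hand, the status column can be picked out of the table and polled. A rough sketch, assuming the table layout shown above (ID, Name, Status, Networks); instance_status and wait_for_active are made-up helper names, not nova subcommands:

```shell
# Read "nova list" style table output on stdin and print the Status
# column for the instance whose Name column matches $1.
instance_status() {
    awk -F'|' -v name="$1" '
        { n = $3; gsub(/^[ \t]+|[ \t]+$/, "", n) }
        n == name { s = $4; gsub(/^[ \t]+|[ \t]+$/, "", s); print s }
    '
}

# Poll every 10 seconds until the named instance reports ACTIVE.
# Note this loops forever if the instance ends up in an error state,
# so interrupt it with Ctrl-C if it runs for too long.
wait_for_active() {
    until [ "$(nova list | instance_status "$1")" = "ACTIVE" ]; do
        sleep 10
    done
}

# Usage, after sourcing openrc:
#   wait_for_active f17demo1
```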

Just to prove that it really is working, log in to the instance with SSH

$ ssh ec2-user@10.0.0.2
The authenticity of host '10.0.0.2 (10.0.0.2)' can't be established.
RSA key fingerprint is 9a:73:e5:1a:39:e2:f7:a5:10:a7:dd:bc:db:6e:87:f5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.0.2' (RSA) to the list of known hosts.
[ec2-user@f17demo1 ~]$ sudo su -
[root@f17demo1 ~]# 

Working with DevStack

The DevStack setup runs all the python services under a screen session. To stop/start individual services, attach to the screen session with the ‘rejoin-stack.sh’ script. Each service is running under a separate screen “window”. Switch to the window containing the service to be restarted, and just Ctrl-C it and then use bash history to run the same command again.

$ ./rejoin-stack.sh

Sometimes the entire process set will need to be restarted. In this case, just kill the screen session entirely, which causes all the OpenStack services to go away. Then the same ‘rejoin-stack.sh’ script can be used to start them all again.

One annoyance is that unless you have the screen session open, the debug messages from Nova don’t appear to end up anywhere useful. I’ve taken to editing the file “stack-screen” to make each service log to a local file in its checkout, e.g. I changed

stuff "cd /home/berrange/src/openstack/nova && sg libvirtd /home/berrange/src/openstack/nova/bin/nova-compute"

to

stuff "cd /home/berrange/src/openstack/nova && sg libvirtd /home/berrange/src/openstack/nova/bin/nova-compute 2>&1 | tee nova-compute.log"
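
The redirection in that command is the important part, and it is easy to try out in isolation. A small sketch with a stand-in service (noisy_service and log_file are made up for the demo); it also uses tee -a, which is a variation on the command above, since plain tee truncates the log each time the service is restarted:

```shell
# Stand-in for a service that writes to both stdout and stderr
noisy_service() {
    echo "normal output"
    echo "debug output" >&2
}

log_file=$(mktemp)

# 2>&1 merges stderr into stdout before the pipe, so tee sees both streams;
# -a appends, so earlier log contents survive a restart
noisy_service 2>&1 | tee -a "$log_file"
```

Without the 2>&1, only stdout would reach tee and the stderr messages would still go only to the screen window.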

Troubleshooting libvirt with the KVM and LXC drivers

Posted: October 3rd, 2011 | Filed under: Fedora, libvirt, Virt Tools

In “fantasy island” the libvirt and KVM/LXC code is absolutely perfect and always does exactly what you want it to do. Back in the real world, however, there may be annoying bugs in libvirt, KVM/LXC, the kernel and countless other parts of the OS that conspire to cause you great pain and suffering. This blog post contains a very quick introduction to debugging/troubleshooting libvirt problems, particularly focusing on the KVM and LXC drivers.

libvirt logging capabilities

The libvirt code is full of logging statements which can be instrumental in understanding where a problem might lie.

Configuring libvirtd logging

Current releases of libvirt will log problems occurring in libvirtd at level WARNING/ERROR to a dedicated log file /var/log/libvirt/libvirtd.log, while older releases would send them to syslog, typically ending up in /var/log/messages. The libvirtd configuration file has two parameters that can be used to increase the amount of logging information printed.

log_filters="...filter string..."
log_outputs="...destination config..."

The logging documentation describes these in some detail. If you just want to quickly get started though, it suffices to understand that each filter has the form “level:match”, where the level is the minimum priority to log (1 = debug, 2 = info, 3 = warning, 4 = error) and the match string is simply substring-matched against libvirt source filenames. So to enable all debug information from ‘src/util/event.c’ (the libvirt event loop) you would set

log_filters="1:event"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

If you wanted to enable logging for everything in ‘src/util’, except for ‘src/util/event.c’ you would set

log_filters="3:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
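
The ordering in that last example matters: as far as I understand the matching, filters are tried left to right and the first one whose match string occurs in the source filename wins, which is why the more specific “3:event” has to come before “1:util”. The following is a toy emulation of that rule in shell, not libvirt's own code; filter_level is a made-up name and the fallback level of 3 is an assumption standing in for the global log level:

```shell
# Print the log level that would apply to a source file, given a
# space-separated filter string. $3 is the fallback level when no
# filter matches (assumed to be 3, i.e. warning, if not given).
filter_level() {
    filters=$1
    source_file=$2
    # Unquoted on purpose: split the filter string on whitespace
    for f in $filters; do
        level=${f%%:*}      # text before the first colon
        match=${f#*:}       # text after the first colon
        case $source_file in
            *"$match"*) echo "$level"; return 0 ;;
        esac
    done
    echo "${3:-3}"
}

filter_level "3:event 1:util" src/util/event.c        # prints 3
filter_level "3:event 1:util" src/util/command.c      # prints 1
filter_level "3:event 1:util" src/qemu/qemu_driver.c  # prints 3 (fallback)
```

Swapping the order to "1:util 3:event" would make the first call print 1, since ‘src/util/event.c’ also contains "util".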

Configuring libvirt client logging

On the client side of libvirt there is no configuration file to put log settings in, so instead, there are a couple of environment variables. These take exactly the same type of strings as the libvirtd configuration file

LIBVIRT_LOG_FILTERS="...filter string..."
LIBVIRT_LOG_OUTPUTS="...destination config..."
export LIBVIRT_LOG_FILTERS LIBVIRT_LOG_OUTPUTS

One thing to be aware of is that with the KVM and LXC drivers in libvirt, very little code is ever run on the libvirt client. The only interesting pieces are the RPC code, event loop and main API entrypoints. To enable debugging of the RPC code you might use

LIBVIRT_LOG_FILTERS="1:rpc" LIBVIRT_LOG_OUTPUTS="1:stderr" virsh list

Useful log filter settings for KVM and LXC

The following are some useful values for logging wrt the KVM and LXC drivers

All libvirt public APIs invoked
    1:libvirt
All external commands run by libvirt
    1:command
Cgroups management
    1:cgroup
All QEMU driver code
    1:qemu
QEMU text monitor commands
    1:qemu_monitor_text
QEMU JSON/QMP monitor commands
    1:qemu_monitor_json
All LXC driver code
    1:lxc
All lock management code
    1:locking
All security manager code
    1:security

QEMU driver logfiles

Every QEMU process run by libvirt has a dedicated log file /var/log/libvirt/qemu/$VMNAME.log which captures any data that QEMU writes to stderr/stdout. It also contains timestamps written by libvirtd whenever the QEMU process is started, and exits. Finally, prior to starting a guest, libvirt will write out the full set of environment variables and command line arguments it intends to launch QEMU with.

If you are running libvirtd with elevated log settings, there is also the possibility that some of the logging output will end up in the per-VM logfile, instead of the location set by the log_outputs configuration parameter. This is because a little bit of libvirt code will run in the child process between the time it is forked and QEMU is exec()d.

LXC driver logfiles

Every LXC container run by libvirt has a dedicated log file /var/log/libvirt/lxc/$VMNAME.log which captures any data that the container’s controller process writes to stderr/stdout. As with QEMU it will also contain the command line args libvirt uses, though these are much less interesting in the LXC case. The LXC logfile is mostly useful for debugging the initial container bootstrap process.

Troubleshooting SELinux / sVirt

On a RHEL or Fedora host, the out of the box configuration will run all guests under confined SELinux contexts. One common problem that may affect developers running libvirtd straight from the source tree is that libvirtd itself will run under the wrong context, which in turn prevents guests from running correctly. This can be addressed in two ways: first, by manually labelling the libvirtd binary after each rebuild

chcon system_u:object_r:virtd_exec_t:s0 $SRCTREE/daemon/.libs/lt-libvirtd

Or by specifying a label when executing libvirtd

runcon system_u:system_r:virtd_t:s0-s0:c0.c1023 $SRCTREE/daemon/libvirtd

Another problem might be with libvirt not correctly labelling some device needed by the QEMU process. The best way to see what’s going on here, is to enable libvirtd logging with a filter of “1:security_selinux”, which will print out a message for every single file path that libvirtd labels. Then look at the log to see that everything expected is present:

14:36:57.223: 14351: debug : SELinuxGenSecurityLabel:284 : model=selinux label=system_u:system_r:svirt_t:s0:c669,c903 imagelabel=system_u:object_r:svirt_image_t:s0:c669,c903 baselabel=(null)
14:36:57.350: 14351: info : SELinuxSetFilecon:402 : Setting SELinux context on '/var/lib/libvirt/images/f16x86_64.img' to 'system_u:object_r:svirt_image_t:s0:c669,c903'
14:36:57.350: 14351: info : SELinuxSetFilecon:402 : Setting SELinux context on '/home/berrange/boot.iso' to 'system_u:object_r:virt_content_t:s0'
14:36:57.551: 14351: debug : SELinuxSetSecurityDaemonSocketLabel:1129 : Setting VM f16x86_64 socket context unconfined_u:unconfined_r:unconfined_t:s0:c669,c903
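
With debug logging enabled the log gets large quickly, so it can help to reduce it to just the path/label pairs. A small sketch, assuming the message wording shown in the sample above; labelled_paths is a made-up helper name:

```shell
# Read libvirtd log lines on stdin and print "PATH LABEL" for every
# "Setting SELinux context on ... to ..." message; other lines are dropped.
labelled_paths() {
    sed -n "s/.*Setting SELinux context on '\([^']*\)' to '\([^']*\)'.*/\1 \2/p"
}

# Usage:
#   labelled_paths < /var/log/libvirt/libvirtd.log
```

Any disk image or device that the guest needs but which does not show up in this output is a good candidate for the cause of the failure.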

If a guest is failing to start, then there are two ways to double check if it really is SELinux related. SELinux can be put into permissive mode on the virtualization host

setenforce 0

Or the sVirt driver can be disabled in libvirt entirely

# vi /etc/libvirt/qemu.conf
...set security_driver="none"...
# service libvirtd restart

Troubleshooting cgroups

When libvirt runs guests on modern Linux systems, cgroups will be used to control aspects of the guests’ execution. If any cgroups are mounted on the host when libvirtd starts up, it will create a basic hierarchy

$MOUNT_POINT
 |
 +- libvirt
     |
     +- qemu
     +- lxc

When starting a KVM or LXC guest, further directories will be created, one per guest, so that after a while the tree will look like

$MOUNT_POINT
 |
 +- libvirt
     |
     +- qemu
     |    |
     |    +- VMNAME1
     |    +- VMNAME2
     |    +- VMNAME3
     |    +- ...
     |    ...
     +- lxc
          |
          +- VMNAME1
          +- VMNAME2
          +- VMNAME3
          +- ...
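
To see which guests currently have cgroup directories, the tree can simply be walked with find. A sketch under the assumption that the hierarchy looks as drawn above; list_guest_cgroups is a made-up name, and the mount point in the usage line is only an example of where a controller might be mounted:

```shell
# List the per-guest cgroup directories below a controller mount point,
# i.e. everything exactly two levels below $MOUNT_POINT/libvirt
# (driver directory first, then guest name).
# -mindepth/-maxdepth are GNU/BSD find extensions rather than POSIX.
list_guest_cgroups() {
    find "$1/libvirt" -mindepth 2 -maxdepth 2 -type d | sort
}

# Usage, once per mounted controller:
#   list_guest_cgroups /sys/fs/cgroup/devices
```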

Assuming the host administrator has not changed the policy in the top level cgroups, there should be no functional change to operation of the guests with this default setup. There are possible exceptions though if you are trying something unusual. For example, the ‘devices’ cgroups controller will be used to set up a whitelist of block / character devices that QEMU is allowed to access. So if you have modified QEMU to access a funky new device, libvirt will likely block this via the cgroups device ACL. Due to various kernel bugs, some of the cgroups controllers have also had a detrimental performance impact on both the QEMU guest and the host OS as a whole.

libvirt will never try to mount any cgroups itself, so the quickest way to stop libvirt using cgroups is to stop the host OS from mounting them. This is not always desirable though, so there is a configuration parameter in /etc/libvirt/qemu.conf, ‘cgroup_controllers’, which can be used to restrict which cgroups controllers libvirt will use.

Running from the GIT source tree

Sometimes when troubleshooting a particularly hard problem it might be desirable to build libvirt from the latest GIT source and run that. When doing this it is a good idea not to overwrite your distro-provided installation with a GIT build, but instead to run libvirt directly from the source tree. The first thing to be careful of is that the custom build uses the right installation prefix (i.e. /etc, /usr, /var and not /usr/local). To simplify this, libvirt provides an ‘autogen.sh’ script to run all the right libtool commands and set the correct prefixes. So to build libvirt from GIT in a way that is compatible with a typical distro build, use:

./autogen.sh --system --enable-compile-warnings=error
make

Hint: use make -j 4 (or larger) to significantly speed up the build on multi-core systems

To run libvirtd from the source tree, as root, stop the existing daemon and invoke the libtool wrapper script

# service libvirtd stop
# ./daemon/libvirtd

Or to run with SELinux contexts

# service libvirtd stop
# runcon system_u:system_r:virtd_t:s0-s0:c0.c1023 ./daemon/libvirtd

virsh can easily be run from the source tree in the same way

# ./tools/virsh ....normal args...

Running python programs against a non-installed libvirt gets a little harder, but that can be overcome too

$ export PYTHONPATH=$SOURCETREE/python:$SOURCETREE/python/.libs
$ export LD_LIBRARY_PATH=$SOURCETREE/src/.libs
$ python virt-manager --no-fork

When running the LXC driver, it is necessary to make a change to the guest XML to point it to a different emulator. Run ‘virsh edit $GUEST’ and change

<emulator>/usr/libexec/libvirt_lxc</emulator>

to

<emulator>$SOURCETREE/src/libvirt_lxc</emulator>

(expand $SOURCETREE to be the actual path of the GIT checkout, since libvirt won’t interpret env vars in the XML)

Two small improvements to sVirt guest configuration flexibility with KVM+libvirt

Posted: September 29th, 2011 | Filed under: Fedora, libvirt, Virt Tools

sVirt has been available in the libvirt KVM driver for a few years now, both for SELinux and more recently for AppArmor. When using it with SELinux there has been a choice of two different configurations

Dynamic configuration
libvirt takes the default base label (“system_u:system_r:svirt_t:s0”), generates a unique MCS label for the guest (“c123,c465”) and combines them to form the complete security label for the virtual machine process. libvirt takes the same MCS label and combines it with the default image base label (“system_u:object_r:svirt_image_t:s0”) to form the image label. libvirt will then automatically apply the image label to all host OS files that the VM is required to access. These can be disk images, disk devices, PCI devices (we label the corresponding sysfs files), USB devices (we label the /dev/bus/usb files), kernel/initrd files, and a few more things. When the VM shuts down again, we reverse the labelling. This mode was originally intended for general usage where the management application is not aware of the existence of sVirt.
Static configuration
The guest XML provides the full security label, including the MCS part. libvirt simply assigns this security label to the virtual machine process without trying to alter/interpret it any further. libvirt does not change the labels of any files on disk. The administrator/application using libvirt is expected to have done all the resource file labelling ahead of time. This mode was originally intended for locked down MLS environments, where even libvirtd itself is not trusted to perform relabelling.
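
The way the dynamic mode builds its labels is just string concatenation of a base label and the per-guest MCS pair. A tiny illustration, not libvirt code: make_label is a made-up name, the MCS value is the example one from the text, and the image base label uses the object_r role as in the XML examples further down:

```shell
# Join a base label and an MCS category pair into a full security label
make_label() {
    echo "$1:$2"
}

# The same MCS pair is used for the process label and the image label,
# which is what ties a guest to its own disk images
mcs="c123,c465"
make_label "system_u:system_r:svirt_t:s0" "$mcs"
make_label "system_u:object_r:svirt_image_t:s0" "$mcs"
```

In static mode none of this happens: the complete label, MCS part included, comes straight from the guest XML.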

These two configurations have worked well enough for the two use cases they were designed to satisfy. As sVirt has become an accepted part of the libvirt/KVM ecosystem, application developers have started wanting to do more advanced things which are currently harder than they should be. In particular some applications want to have full control over the security label generation (e.g. to ensure cluster-wide unique labels, instead of per-host uniqueness), but still want libvirt to take care of resource relabelling. This is sort of a hybrid between our static & dynamic configuration. Other applications would like to be able to choose a different base label (“system_u:system_r:svirt_custom_t:s0”) but still have libvirt assign the MCS suffix and perform relabelling. This is another variant on dynamic labelling.

To satisfy these use cases we have extended the syntax for sVirt labelling in recent libvirt. The “seclabel” element gained a ‘relabel’ attribute to control whether resource relabelling is attempted. A new “baselabel” element was introduced to override the default base security label in dynamic mode. So there are now 4 possible styles of configuration:

  • Dynamic configuration (the default out of the box usage)

    <seclabel type='dynamic' model='selinux' relabel='yes'>
      <label>system_u:system_r:svirt_t:s0:c192,c392</label>                  (output only element)
      <imagelabel>system_u:object_r:svirt_image_t:s0:c192,c392</imagelabel>  (output only element)
    </seclabel>
  • Dynamic configuration, with base label

    <seclabel type='dynamic' model='selinux' relabel='yes'>
      <baselabel>system_u:system_r:svirt_custom_t:s0</baselabel>
      <label>system_u:system_r:svirt_custom_t:s0:c192,c392</label>           (output only element)
      <imagelabel>system_u:object_r:svirt_image_t:s0:c192,c392</imagelabel>  (output only element)
    </seclabel>
  • Static configuration, no resource labelling (primarily for MLS/strictly controlled environments)

    <seclabel type='static' model='selinux' relabel='no'>
      <label>system_u:system_r:svirt_custom_t:s0:c192,c392</label>
    </seclabel>
  • Static configuration, with dynamic resource labelling

    <seclabel type='static' model='selinux' relabel='yes'>
      <label>system_u:system_r:svirt_custom_t:s0:c192,c392</label>
    </seclabel>