Running a full Fedora OS inside a libvirt LXC guest

Posted: August 12th, 2013 | Filed under: libvirt, Virt Tools | Tags: , , , | 2 Comments »

Historically, running a Linux OS inside an LXC guest, has required a execution of a set of hacky scripts which do a bunch of customizations to the default OS install to make it work in the constrained container environment. One of the many benefits to Fedora, of the switch over to systemd, has been that a default Fedora install has much more sensible behaviour when run inside containers. For example, systemd will skip running udev inside a container since containers do not get given permission to mknod – the /dev is pre-populated with the whitelist of devices the container is allowed to use. As such running Fedora inside a container is really not much more complicated than invoking yum to install desired packages into a chroot, then invoking virt-install to configure the LXC guest.

As a proof of concept, on Fedora 19 I only needed to do the following to setup a Fedora 19 environment suitable for execution inside LXC

 # yum -y --releasever=19 --nogpg --installroot=/var/lib/libvirt/filesystems/mycontainer \
          --disablerepo='*' --enablerepo=fedora install \
          systemd passwd yum fedora-release vim-minimal openssh-server procps-ng
 # echo "pts/0" >> /var/lib/libvirt/filesystems/mycontainer/etc/securetty
 # chroot /var/lib/libvirt/filesystems/mycontainer /bin/passwd root

It would be desirable to avoid the manual editing of /etc/securetty. LXC guests get their default virtual consoles backed by a /dev/pts/0 device, which isn’t listed in the securetty file by default. Perhaps it is a simple as just adding that device node unconditionally. Just have to think about whether there’s a reason to not do that which would impact bare metal. With the virtual root environment ready, now virt-install can be used to configure the container with libvirt

# virt-install --connect lxc:/// --name mycontainer --ram 800 \
              --filesystem /var/lib/libvirt/filesystems/mycontainer,/

virt-install will create the XML config libvirt wants, and boot the guest, opening a connection to the primary text console. This should display boot up messages from the instance of systemd running as the container’s init process, and present a normal text login prompt.

If attempting this with systemd-nspawn command, login would fail because the PAM modules audit code will reject all login attempts. This is really unhelpful behaviour by PAM modules which can’t be disabled by any config, except for booting the entire host with audit=0 which is not very desirable. Fortunately, however, virt-install will configure a separate network namespace for the container by default, which will prevent the PAM module from talking to the kernel audit service entirely, giving it a ECONNREFUSED error. By a stoke of good luck, the PAM modules treat ECONNREFUSED as being equivalent to booting with audit=0, so everything “just works”. This is nice case of two bugs cancelling out to leave no bug :-)

While the above commands are fairly straightforward, it is a goal of ours to simplify live even further, into a single command. We would like to provide a command that looks something like this:

# virt-bootstrap --connect lxc:/// --name mycontainer --ram 800 \
                 --root /var/lib/libvirt/filesystems/mycontainer \
                 --osid fedora19

The idea is that the ‘–osid’ value will be looked up in the libosinfo database. This will have details of the software repository for that OS, and whether it uses yum/apt/ebuild/somethingelse. virt-bootstrap will then invoke the appropriate packaging tool to populate the root filesystem, and then boot the container in one single step.

One final point is that LXC in Fedora still can’t really be considered to be secure without the use of SELinux. The commands I describe above don’t do anything to enable SELinux protection of the container at this time. This is obviously something that ought to be fixed. Separate from this, upstream libvirt now has support for the kernel user namespace feature. This enables plain old the DAC framework to provide a secure container environment. Unfortunately this kernel feature is still not available in Fedora kernel builds. It is blocked on upstream completion of patches for XFS. Fortunately this work seems to be moving forward again, so if we’re lucky it might just be possible to enable user namespaces in Fedora 20, finally making LXC reasonably secure by default even without SELinux.