Fine grained access control in libvirt using polkit

Posted: August 12th, 2013 | Filed under: Fedora, libvirt, OpenStack, Security, Virt Tools

Historically access control to libvirt has been very coarse, with only three privilege levels: “anonymous” (only authentication APIs are allowed), “read-only” (only querying information is allowed) and “read-write” (anything is allowed). Over the past few months I have been working on infrastructure inside libvirt to support fine grained access control policies. The initial code drop arrived in libvirt 1.1.0, and the wiring up of authorization checks in drivers was essentially completed in libvirt 1.1.1 (with the exception of a handful of APIs in the legacy Xen driver code). We did not wish to tie libvirt to any single access control system, so the framework inside libvirt is modular, allowing multiple plugins to be developed. The only plugin provided at this time makes use of polkit for its access control checks. There was a second proof of concept plugin that used SELinux to provide MAC, but there are a number of design issues still to be resolved with that, so it has not been merged at this time.

The basic framework integration

The libvirt library exposes a number of objects (virConnectPtr, virDomainPtr, virNetworkPtr, virNWFilterPtr, virNodeDevicePtr, virInterfacePtr, virSecretPtr, virStoragePoolPtr, virStorageVolPtr), with a wide variety of operations defined in the public API. Right away it was clear that we did not wish to describe access controls based on the names of the APIs themselves. For each object there are a great many APIs which all imply the same level of privilege, so it made sense to collapse those APIs onto single permission bits. At the same time, some individual APIs could have multiple levels of privilege depending on the flags set in parameters, so would expand to multiple permission bits. Thus the first task was to come up with a list of permission bits which were able to cover all APIs. This was encoded in the internal viraccessperm.h header file. With the permissions defined, the next big task was to define a mapping between permissions and APIs. This mapping was encoded as magic comments in the RPC protocol definition file. This in turn allows the code for performing access control checks to be automatically generated, thus minimizing scope for coding errors, such as forgetting to perform checks in a method, or performing the wrong checks.

The final coding step was for the automatically generated ACL check methods to be inserted into each of the libvirt driver APIs. Most of the ACL checks validate the input parameters to ensure the caller is authorized to operate on the object in question. In a number of methods, the ACL checks are used to restrict / filter the data returned. For example, if asking for a list of domains, the returned list must be filtered to only those the client is authorized to see. While the code for checking permissions was auto-generated, it is not practical to automatically insert the checks into each libvirt driver. It was, however, possible to write scripts to perform static analysis on the code to validate that each driver had the full set of access control checks present. Of course it helps to tell developers / administrators which permissions apply to each API, so the code which generates the API reference documentation was also enhanced so that the API reference lists the permissions required in each circumstance.
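To make the filtering case concrete, here is a minimal sketch in JavaScript. It is not libvirt code: checkPermission, listVisibleDomains and the shape of the identity object are all hypothetical names invented for illustration. The only idea taken from the text above is that a listing call drops the objects the caller is not authorized to see, rather than failing.

```javascript
// Hypothetical sketch: none of these names come from libvirt itself.

// Stand-in for an access control driver: decides whether `identity`
// holds `permission` on the given object.
function checkPermission(identity, object, permission) {
  const granted = identity.grants[object.name] || [];
  return granted.includes(permission);
}

// A "list domains" style call filters its result instead of failing:
// the caller only gets back the domains on which they hold "getattr".
function listVisibleDomains(identity, domains) {
  return domains.filter((dom) => checkPermission(identity, dom, "getattr"));
}

// Made-up data for the example.
const domains = [{ name: "demo" }, { name: "web" }, { name: "db" }];
const berrange = { grants: { demo: ["getattr", "start"], db: ["getattr"] } };

const visibleNames = listVisibleDomains(berrange, domains).map((d) => d.name);
console.log(visibleNames); // [ 'demo', 'db' ]
```

The "web" domain simply does not appear in the result, so the caller cannot distinguish a hidden domain from one that does not exist.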

The polkit access control driver

Libvirt has long made use of polkit for authenticating connections over its UNIX domain sockets. It was thus natural to expand on this work to make use of polkit as a driver for the access control framework. Historically this would not have been practical, because the polkit access control rule format did not provide a way for the admin to configure access control checks on individual object instances, only on object classes. In polkit 0.106, however, a new engine was added which allowed admins to use javascript to write access control policies. The libvirt polkit driver takes object class names and permission names to form polkit action names. For example, the “getattr” permission on the virDomainPtr class maps to the polkit org.libvirt.api.domain.getattr permission. When performing an access control check, libvirt then populates the polkit authorization “details” map with one or more attributes which uniquely identify the object instance. For example, the virDomainPtr object gets “connect_driver” (libvirt driver name), “domain_uuid” (globally unique UUID), and “domain_name” (host local unique name) details set. These details can be referenced in the javascript policy to scope rules to individual object instances.
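As a rough illustration of that naming scheme, the following sketch builds an action name and a details map the way the paragraph above describes. The buildAction helper and its argument shapes are hypothetical and the UUID is a placeholder; only the org.libvirt.api prefix and the three detail keys come from the text.

```javascript
// Illustrative only: buildAction and its argument shapes are hypothetical.
const ACTION_PREFIX = "org.libvirt.api";

// Combine an object class name and a permission name into a polkit
// action name, and attach the attributes that identify the instance.
function buildAction(objectClass, permission, attributes) {
  return {
    id: `${ACTION_PREFIX}.${objectClass}.${permission}`,
    details: { ...attributes },
  };
}

const domainAction = buildAction("domain", "getattr", {
  connect_driver: "QEMU",
  domain_uuid: "00000000-0000-0000-0000-000000000000", // placeholder UUID
  domain_name: "demo",
});

console.log(domainAction.id); // org.libvirt.api.domain.getattr
console.log(Object.keys(domainAction.details));
```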

Consider a local user berrange who has been granted permission to connect to libvirt in full read-write mode. The goal is to only allow them to use the QEMU driver and not the Xen or LXC drivers which are also available in libvirtd. To achieve this we need to write a rule which checks whether the connect_driver attribute is QEMU, and match on an action name of org.libvirt.api.connect.getattr. Using the javascript rules format, this ends up written as

polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.api.connect.getattr" &&
        subject.user == "berrange") {
          if (action.lookup("connect_driver") == 'QEMU') {
            return polkit.Result.YES;
          } else {
            return polkit.Result.NO;
          }
    }
});

As another example, consider a local user berrange who has been granted permission to connect to libvirt in full read-write mode. The goal is to only allow them to see the domain called demo on the LXC driver. To achieve this we need to write a rule which checks whether the connect_driver attribute is LXC and the domain_name attribute is demo, and match on an action name of org.libvirt.api.domain.getattr. Using the javascript rules format, this ends up written as

polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.api.domain.getattr" &&
        subject.user == "berrange") {
          if (action.lookup("connect_driver") == 'LXC' &&
              action.lookup("domain_name") == 'demo') {
            return polkit.Result.YES;
          } else {
            return polkit.Result.NO;
          }
    }
});
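To try a rule like this without a running polkit daemon, one option is to mock just enough of the environment to call it. The sketch below does that in plain JavaScript. The polkit object, makeAction and evaluate are stand-ins written for this example and are not the real polkit implementation; the assumption that a rule returning nothing leaves the decision to later rules reflects my understanding of polkit and should be checked against its documentation.

```javascript
// Minimal stand-in for the polkit rules environment, for offline
// experiments only. The Result values are assumptions of this mock.
const rules = [];
const polkit = {
  Result: { YES: "yes", NO: "no" },
  addRule(fn) { rules.push(fn); },
};

// The domain rule from the example above.
polkit.addRule(function (action, subject) {
  if (action.id == "org.libvirt.api.domain.getattr" &&
      subject.user == "berrange") {
    if (action.lookup("connect_driver") == "LXC" &&
        action.lookup("domain_name") == "demo") {
      return polkit.Result.YES;
    } else {
      return polkit.Result.NO;
    }
  }
});

// Hypothetical helper: an action object exposing lookup() over a details map.
function makeAction(id, details) {
  return { id, lookup: (key) => details[key] };
}

// Run the rules in order; the first one that returns a value decides.
function evaluate(action, subject) {
  for (const rule of rules) {
    const result = rule(action, subject);
    if (result !== undefined && result !== null) return result;
  }
  return undefined; // no rule had an opinion
}

const getattr = "org.libvirt.api.domain.getattr";
const user = { user: "berrange" };

console.log(evaluate(makeAction(getattr, { connect_driver: "LXC", domain_name: "demo" }), user)); // yes
console.log(evaluate(makeAction(getattr, { connect_driver: "QEMU", domain_name: "demo" }), user)); // no
console.log(evaluate(makeAction(getattr, { connect_driver: "LXC", domain_name: "demo" }), { user: "fred" })); // undefined
```

Note that the rule has no opinion about other users or other actions, so those requests fall through to whatever other rules or defaults are configured.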

Further work

While the access control support in libvirt 1.1.1 provides a useful level of functionality, there is still more that can be done in the future. First of all, the polkit driver needs to have some performance optimization work done. It currently relies on invoking the ‘pkcheck’ binary to validate permissions, which means spawning a separate process for every check. While this is fine for hosts with small numbers of objects, it will quickly become too costly as the number of objects grows. The solution here is to directly use the DBus API from inside libvirt.

The latest polkit framework is fairly flexible in terms of letting us identify object instances via the details map it associates with every access control check. It is far less flexible in terms of identifying the client user. It is fairly locked into the idea of identifying users via remote PID or DBus service name, and then exposing the username/groupnames to the javascript rules files. While this works fine for local libvirt connections over UNIX sockets, it is pretty much useless for connections arriving on libvirt’s TCP sockets. In the latter case the libvirt user is identified by a SASL username (typically a Kerberos principal name), or via an x509 certificate distinguished name (when using client certs with TLS). There’s no official way to feed the SASL username or x509 dname down to the polkit javascript authorization rules files. Requests upstream to allow extra identifying attributes to be provided for the authorization subject have not been productive, so I’m considering (ab-)using the “details” map to provide identifying info for the users, alongside the identifying info for the object.

As mentioned earlier, there was a proof of concept SELinux driver written, that is yet to be finished. The work there is around figuring out / defining what the SELinux context is for each object to be checked and doing some work on SELinux policy. I think of this work as providing a capability similar to that done in PostgreSQL to enable SELinux MAC checks. It would be very nice to have a system which provides end-to-end MAC. I refer to this as sVirt 2.0 – the first (current) version of sVirt protected the host from guests – the second (future) version would also protect the host from management clients.

The legacy XenD based Xen driver has a couple of methods which lack access control, due to the inability to get access to the identifying attributes for the objects being operated upon. While we encourage people to use the new libxl based Xen driver, it is desirable to have the legacy Xen driver fully controlled for those people using legacy virtualization hosts. Some code refactoring will be required to fix the legacy Xen driver, likely at the cost of making some methods less efficient.

If there is user demand, work may be done to write an access control driver which is natively implemented entirely within libvirt. While the polkit javascript engine is fairly flexible, I’m not much of a fan of having administrators write code to define their access control policy. It would be preferable to have a way to describe the policy that was entirely declarative. With a libvirt native access control driver, it would be possible to create a simple declarative policy file format tailored to our precise needs. This would let us solve the problem of providing identifying info about the subject being checked. It would also have the potential to be more scalable by avoiding the need to interact with any remote authorization daemons over DBus. The latter could be a big deal when an individual API call needs to check thousands of permissions at once. The flipside of course, is that a libvirt specific access control driver is not good for interoperability across the broader system, whereas the standardized use of polkit is good in that respect. There’s no technical reason why we can’t support multiple access control drivers to give the administrator choice / flexibility.
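To give a feel for what "declarative" could mean here, the sketch below expresses the earlier domain example as plain data plus one generic matcher. Everything in it is invented for illustration: no such policy format exists in libvirt, and the field names (user, action, match, result) and the default-deny fallback are assumptions of this sketch rather than a proposed design.

```javascript
// Purely hypothetical policy format: nothing here is a libvirt file format.
// Entries are checked in order and the first matching entry decides.
const policy = [
  { user: "berrange", action: "org.libvirt.api.domain.getattr",
    match: { connect_driver: "LXC", domain_name: "demo" }, result: "allow" },
  { user: "berrange", action: "org.libvirt.api.domain.getattr",
    match: {}, result: "deny" },
];

// Generic matcher: the administrator writes only data, never code.
function decide(policyEntries, user, action, details) {
  for (const entry of policyEntries) {
    if (entry.user !== user || entry.action !== action) continue;
    const matches = Object.entries(entry.match)
      .every(([key, value]) => details[key] === value);
    if (matches) return entry.result;
  }
  return "deny"; // default-deny is an assumption of this sketch
}

const act = "org.libvirt.api.domain.getattr";
console.log(decide(policy, "berrange", act, { connect_driver: "LXC", domain_name: "demo" })); // allow
console.log(decide(policy, "berrange", act, { connect_driver: "QEMU", domain_name: "demo" })); // deny
```

Because the subject is just another field in the data, such a format could match on a SASL username or x509 distinguished name as easily as on a local user name.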

Finally, this work is all scheduled to arrive in Fedora 20, so anyone interested in testing it should look at current rawhide, or keep an eye out for the Fedora 20 virtualization test day.

EDITED: Aug 15th: Changed example use of 'action._detail_connect_driver' to 'action.lookup("connect_driver")'