Full coverage of libvirt XML schemas achieved in libvirt-go-xml

Posted: December 7th, 2017 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Virt Tools

In recent times I have been aggressively working to expand the coverage of libvirt XML schemas in the libvirt-go-xml project. Today this work has finally come to a conclusion, when I achieved what I believe to be effectively 100% coverage of all of the libvirt XML schemas. More on this later, but first some background on Go and XML….

For those who aren’t familiar with Go, the standard library’s encoding/xml package provides a very easy way to consume and produce XML documents in Go code. You simply define a set of struct types and annotate their fields to indicate what elements & attributes each should map to. For example, given the Go structs:

type Person struct {
    XMLName xml.Name `xml:"person"`
    Name    string   `xml:"name,attr"`
    Age     string   `xml:"age,attr"`
    Home    *Address `xml:"home"`
    Office  *Address `xml:"office"`
}
type Address struct {
    Street string `xml:"street"`
    City   string `xml:"city"`
}

You can parse/format XML documents looking like

<person name="Joe Blogs" age="24">
  <home>
    <street>Some where</street><city>London</city>
  </home>
  <office>
    <street>Some where else</street><city>London</city>
  </office>  
</person>
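
As a minimal sketch of how these structs are used in practice, the program below parses the sample document and formats it back out again. The helper names parsePerson and formatPerson are made up for illustration; only xml.Unmarshal and xml.MarshalIndent come from the standard library.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type Address struct {
	Street string `xml:"street"`
	City   string `xml:"city"`
}

type Person struct {
	XMLName xml.Name `xml:"person"`
	Name    string   `xml:"name,attr"`
	Age     string   `xml:"age,attr"`
	Home    *Address `xml:"home"`
	Office  *Address `xml:"office"`
}

// parsePerson (hypothetical helper) decodes an XML document into a Person.
func parsePerson(doc string) (*Person, error) {
	p := &Person{}
	if err := xml.Unmarshal([]byte(doc), p); err != nil {
		return nil, err
	}
	return p, nil
}

// formatPerson (hypothetical helper) encodes a Person as indented XML.
// Nil pointer fields such as a missing Office are simply left out.
func formatPerson(p *Person) (string, error) {
	out, err := xml.MarshalIndent(p, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	doc := `<person name="Joe Blogs" age="24">
  <home><street>Some where</street><city>London</city></home>
  <office><street>Some where else</street><city>London</city></office>
</person>`
	p, err := parsePerson(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name, "lives in", p.Home.City)
	out, err := formatPerson(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```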

Other programming languages I’ve used required a great deal more work when dealing with XML. For parsing, there’s typically a choice between an XML stream based parser, where you have to react to tokens as they’re parsed and stuff them into structs, or a DOM object hierarchy from which you then have to pull data out into your structs. For outputting XML, apps either build up a DOM object hierarchy again, or dynamically format the XML document incrementally. Whichever approach is taken, it generally involves writing a lot of tedious & error prone boilerplate code. In most cases, the Go encoding/xml package eliminates all the boilerplate code, only requiring the data type definitions. This really makes dealing with XML a much more enjoyable experience, because you effectively don’t deal with XML at all!

There are some exceptions to this though, as the simple annotations can’t capture every nuance of many XML documents. For example, integer values are always parsed & formatted in base 10, so extra work is needed for base 16. There’s also no concept of unions in Go, or in the XML annotations. In these edge cases custom marshalling / unmarshalling methods need to be written. BTW, this approach to XML is also taken for other serialization formats including JSON and YAML, with one struct field able to have many annotations so it can be serialized to a range of formats.

Back to the point of the blog post, when I first started writing Go code using libvirt it was immediately obvious that everyone using libvirt from Go would end up re-inventing the wheel for XML handling. Thus about 1 year ago, I created the libvirt-go-xml project whose goal is to define a set of structs that can handle documents in every libvirt public XML schema. Initially the level of coverage was fairly light, and over the past year 18 different contributors have sent patches to expand the XML coverage in areas that their respective applications touched. It was clear, however, that taking an incremental approach would mean that libvirt-go-xml is forever trailing what libvirt itself supports. It needed an aggressive push to achieve 100% coverage of the XML schemas, or as near as practically identifiable.

Alongside each set of structs we had also been writing unit tests with a set of structs populated with data, and a corresponding expected XML document. The idea for writing the tests was that the author would copy a snippet of XML from a known good source, and then populate the structs that would generate this XML. In retrospect this was not a scalable approach, because there is an enormous range of XML documents that libvirt supports. A further complexity is that Go doesn’t generate XML documents in exactly the same manner as libvirt. For example, it never generates self-closing tags, instead always outputting a full opening & closing pair. This is semantically equivalent, but makes a plain string comparison of two XML documents impractical in the general case.

Considering the need to expand the XML coverage, and provide a more scalable testing approach, I decided to change approach. The libvirt.git tests/ directory currently contains 2739 XML documents that are used to validate libvirt’s own native XML parsing & formatting code. There is no better data set to use for validating the libvirt-go-xml coverage than this. Thus I decided to apply a round-trip testing methodology. The libvirt-go-xml code would be used to parse each sample XML document from libvirt.git, and then immediately serialize it back into a new XML document. Both the original and new XML documents would then be parsed generically to form a DOM hierarchy which can be compared for equivalence. Any place where the documents differ would cause the test to fail and print details of where the problem is. For example:

$ go test -tags xmlroundtrip
--- FAIL: TestRoundTrip (1.01s)
	xml_test.go:384: testdata/libvirt/tests/vircaps2xmldata/vircaps-aarch64-basic.xml: \
            /capabilities[0]/host[0]/topology[0]/cells[0]/cell[0]/pages[0]: \
            element in expected XML missing in actual XML

This shows the filename that failed to correctly roundtrip, and the position within the XML tree that didn’t match. Here the NUMA cell topology has a ‘<pages>’ element expected but not present in the newly generated XML. Now it was simply a matter of running the roundtrip test over & over & over & over & over & over & over……….& over & over & over, adding structs / fields for each omission that the test identified.

After doing this for some time, libvirt-go-xml now has 586 structs defined containing 1816 fields, and has certified 100% coverage of all libvirt public XML schemas. Of course when I say 100% coverage, this is probably a lie, as I’m blindly assuming that the libvirt.git test suite has 100% coverage of all its own XML schemas. This is certainly a goal, but I’m confident there are cases where libvirt itself is missing test coverage. So if any omissions are identified in libvirt-go-xml, these are likely omissions in libvirt’s own testing.

On top of this, the XML roundtrip test is set to run in the libvirt Jenkins and Travis CI systems, so as libvirt extends its XML schemas, we’ll get build failures in libvirt-go-xml and thus know to add support there to keep up.

In expanding the coverage of XML schemas, a number of non-trivial changes were made to existing structs defined by libvirt-go-xml. These were mostly in places where we have to handle a union concept defined by libvirt. Typically with libvirt an element will have a “type” attribute, whose value then determines what child elements are permitted. Previously we had been defining a single struct, whose fields represented all possible children across all the permitted type values. This did not scale well and gave the developer no clue what content is valid for each type value. In the new approach, for each distinct type attribute value, we now define a distinct Go struct to hold the contents. This will cause API breakage for apps already using libvirt-go-xml, but on balance it is worth it to get a better structure over the long term. There were also cases where a child XML element previously represented a single value and this was mapped to a scalar struct field. Libvirt then added one or more attributes on this element, meaning the scalar struct field had to turn into a struct field that points to another struct. These kinds of changes are unavoidable in any nice manner, so while we endeavour not to gratuitously change current structs, if the libvirt XML schema gains new content, it might trigger further changes in the libvirt-go-xml structs that are not 100% backwards compatible.

Since we are now tracking libvirt.git XML schemas, going forward we’ll probably add tags in the libvirt-go-xml repo that correspond to each libvirt release. So for app developers we’ll encourage use of Go vendoring to pull in a precise version of libvirt-go-xml instead of blindly tracking master all the time.

Commenting out XML snippets in libvirt guest config by stashing it as metadata

Posted: February 8th, 2017 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Virt Tools

Libvirt uses XML as the format for configuring objects it manages, including virtual machines. Sometimes when debugging / developing it is desirable to comment out sections of the virtual machine configuration to test some idea. For example, one might want to temporarily remove a secondary disk. It is not always desirable to just delete the configuration entirely, as it may need to be re-added immediately after. XML has support for comments <!-- .... some text --> which one might try to use to achieve this. Using comments in XML fed into libvirt, however, will result in an unwelcome surprise – the commented out text is thrown into /dev/null by libvirt.

This is an unfortunate consequence of the way libvirt handles XML documents. It does not consider the XML document to be the master representation of an object’s configuration – a series of C structs are the actual internal representation. XML is simply a data interchange format for serializing structs into a text format that can be interchanged with the management application, or persisted on disk. So when receiving an XML document libvirt will parse it, extracting the pieces of information it cares about, which are then stored in memory in some structs, while the XML document is discarded (along with the comments it contained). Given this way of working, to preserve comments would require libvirt to add 100’s of extra fields to its internal structs and extract comments from every part of the XML document that might conceivably contain them. This is totally impractical to do in reality. The alternative would be to consider the parsed XML DOM as the canonical internal representation of the config. This is what the libvirt-gconfig library in fact does, but it means you can no longer just do simple field accesses to access information – getter/setter methods would have to be used, which quickly becomes tedious in C. It would also involve refactoring almost the entire libvirt codebase, so such a change in approach would realistically never be done.

Given that it is not possible to use XML comments in libvirt, what other options might be available?

Many years ago libvirt added the ability to store arbitrary user defined metadata in domain XML documents. The caveat is that it has to be located in a specific place in the XML document, as a child of the <metadata> tag, in a private XML namespace. This metadata facility can be used as a hack to temporarily stash some XML out of the way. Consider a guest which contains a disk to be “commented out”:

<domain type="kvm">
  ...
  <devices>
    ...
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/home/berrange/VirtualMachines/demo.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>
    ...
  </devices>
</domain>

To stash the disk config as a piece of metadata requires changing the XML to

<domain type="kvm">
  ...
  <metadata>
    <s:disk xmlns:s="http://stashed.org/1" type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/home/berrange/VirtualMachines/demo.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </s:disk>
  </metadata>
  ...
  <devices>
    ...
  </devices>
</domain>

What we have done here is

– Added a <metadata> element at the top level
– Moved the <disk> element to be a child of <metadata> instead of a child of <devices>
– Added an XML namespace to <disk> by giving it an ‘s:’ prefix and associating a URI with this prefix

Libvirt only allows a single top level metadata element per namespace, so if there are multiple things to be stashed, just give them each a custom namespace, or introduce an arbitrary wrapper. Aside from mandating the use of a unique namespace, libvirt treats the metadata as entirely opaque and will not try to interpret or parse it in any way. Any valid XML construct can be stashed in the metadata, even invalid XML constructs, provided they are hidden inside a CDATA block. For example, if you’re using virsh edit to make some changes interactively and want to get out before finishing them, just stash the changes in a CDATA section, avoiding the need to worry about correctly closing the elements.

<domain type="kvm">
  ...
  <metadata>
    <s:stash xmlns:s="http://stashed.org/1">
    <![CDATA[
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw'/>
        <source file='/home/berrange/VirtualMachines/demo.qcow2'/>
        <target dev='vda' bus='virtio'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
      </disk>
      <disk>
        <driver name='qemu' type='raw'/>
        ...i'll finish writing this later...
    ]]>
    </s:stash>
  </metadata>
  ...
  <devices>
    ...
  </devices>
</domain>

Admittedly this is a somewhat cumbersome solution. In most cases it is probably simpler to just save the snippet of XML in a plain text file outside libvirt. This metadata trick, however, might just come in handy sometimes.

As an aside, the real, intended, usage of the <metadata> facility is to allow applications which interact with libvirt to store custom data they may wish to associate with the guest. As an example, the recently announced libvirt websockets console proxy uses it to record which consoles are to be exported. I know of few other real world applications using this metadata feature, however, so it is worth remembering it exists :-) System administrators are free to use it for local book keeping purposes too.

ANNOUNCE: New libvirt project Go XML parser model

Posted: January 5th, 2017 | Filed under: Coding Tips, Fedora, libvirt, OpenStack, Virt Tools

Shortly before Christmas, I announced the availability of new Go bindings for the libvirt API. This post announces a companion package for dealing with XML parsing/formatting in Go. The master repository is available on the libvirt git server, but it is expected that Go projects will consume it via an import of the github mirror, since the Go ecosystem is heavily github focused (e.g. godoc.org can’t produce docs for stuff hosted on libvirt.org git).

import (
  libvirtxml "github.com/libvirt/libvirt-go-xml"
  "encoding/xml"
)

domcfg := &libvirtxml.Domain{Type: "kvm", Name: "demo",
                             UUID: "8f99e332-06c4-463a-9099-330fb244e1b3",
                             ....}
xmldoc, err := xml.Marshal(domcfg)

API documentation is available on the godoc website.

When dealing with the libvirt API, most applications will find themselves needing to either parse or format XML documents describing configuration of various libvirt objects. Traditionally this task has been left up to the application to deal with, and as a result most applications end up creating some kind of structure / object model to represent the XML document in a more easily accessible manner. To try to reduce this duplicate effort, libvirt has already created the libvirt-glib package, which contains a libvirt-gconfig library mapping libvirt XML documents into the GObject world. This library is accessible to many programming languages via the magic of GObject Introspection, and while there is some work to support this in Go, it is not particularly mature at this time.

In the Go world, there is a package “encoding/xml” which is able to transform between XML documents and Go structs, given suitable annotations on the struct fields. It is very easy to deal with, simply requiring someone to define a big set of structs with annotated fields to map to the XML document. There’s no real “code” to write as it is really a data definition task. Looking at applications using libvirt in Go, we see quite a few have already gone down this route for dealing with libvirt XML. It should be readily apparent that every application using libvirt in Go is essentially going to end up writing an identical set of structs to deal with the XML handling. This duplication of effort makes no sense at all, and as such, we have started this new libvirt-go-xml package to provide a standard set of Go structs to deal with libvirt XML. The current level of schema support is pretty minimal, supporting the capabilities XML, secrets XML and a small amount of the domain XML, so we’d encourage anyone interested in this to contribute patches to expand the XML schema coverage.

The following illustrates a further example of its usage in combination with the libvirt-go library (with error checking omitted for brevity):

import (
  libvirt "github.com/libvirt/libvirt-go"
  libvirtxml "github.com/libvirt/libvirt-go-xml"
  "encoding/xml"
  "fmt"
)

conn, err := libvirt.NewConnect("qemu:///system")
dom, err := conn.LookupDomainByName("demo")
xmldoc, err := dom.GetXMLDesc(0)

domcfg := &libvirtxml.Domain{}
err = xml.Unmarshal([]byte(xmldoc), domcfg)

fmt.Printf("Virt type %s", domcfg.Type)