Better living through API design: low level memory allocation

Posted: May 23rd, 2008 | Filed under: Coding Tips | No Comments »

The libvirt library provides a core C API upon which language specific bindings are also built. Being written in C of course means we have to do our memory management / allocation. Historically we’ve primarily used the standard malloc/realloc/calloc/free APIs that even knows and loves/hates, although we do have an internal variable length string API (more on that in a future post). Of course there are many other higher level memory management APIs (obstacks/memory pools, garbage collectors, etc) you can write C code with too, but in common with the majority of other other apps/code we happen to be using the general purpose malloc/free pattern everywhere. I’ve recently been interested in improving our code quality, reliability and testing, and found myself thinking about whether there was a way to make our use of these APIs less error prone, and verifiably correct at build time.

If you consider the standard C library malloc/realloc/calloc/free APIs, they in fact positively encourage application coding errors. Off the top of my head there are at least 7 common problems, probably more….

  1. malloc() – pass incorrect number of bytes
  2. malloc() – fail to check the return value
  3. malloc() – forget to fully initialize returned memory
  4. free() – double free by not setting pointer to NULL
  5. realloc() – fail to check the return value
  6. realloc() – fail to re-assign the pointer with new address
  7. realloc() – leaking memory upon failure to resize

A great many libraries will create wrapper functions around malloc/free/realloc but they generally only attempt to address one or two of these points. As an example, consider GLib, which has a wrapper around malloc() attempting to address point 2. It does this by making it call abort() on failure to allocate, but then re-introduces the risk by also adding a wrapper which doesn’t abort()

  gpointer g_malloc         (gulong        n_bytes) G_GNUC_MALLOC;
  gpointer g_try_malloc     (gulong        n_bytes) G_GNUC_MALLOC;

It also wraps realloc() for the same reason, and adds an annotation to make the compiler warn if you don’t use the return value

  gpointer g_realloc        (gpointer      mem,
                             gulong        n_bytes) G_GNUC_WARN_UNUSED_RESULT;
  gpointer g_try_realloc    (gpointer      mem,
                             gulong        n_bytes) G_GNUC_WARN_UNUSED_RESULT;

This at least addresses point 6, ensuring that you update your variable with the new pointer, but can’t protect against failure to check for NULL. And the free() wrapper doesn’t address the double-free issue at all.

You can debate which of “checking for NULL” vs “calling abort()” is the better approach – Havoc has some compelling points – but that’s not the point of this blog posting. I was interested in whether I could create wrappers for libvirt which kept the choice in the hands of the caller, while still protecting from the common risks.

  1. For any given pointer type the compiler knows how many bytes need to be allocated for a single instance of it – it sizeof(datatype) or sizeof(*mptr). Both options are a little tedious to write out, eg

    mytype *foo;
    foo = malloc(sizeof(*foo));
    foo = malloc(sizeof(mytype));
    

    Since C has no data type representing a data type, you cannot simply pass a data type to a function as you might like to:

    mytype *foo = malloc(mytype);
    

    You do, however, have access to the preprocessor and so you can achieve the same effect via a trivial macro.

    #define MALLOC(ptr) malloc(sizeof(*(ptr)))
    or
    #define MALLOC(mytype) malloc(sizeof(mytype))
    
  2. GCC has a number of ways to annotate APIs, one of which is __attribute__((__warn_unused_result__)). This causes emit a warning if return value is not used / checked. Add -Werror and you can get guarenteed compile time verification (the rest of your code was 100% warning free, wasn’t it – it ought to be :-). This doesn’t actually help with the NULL check for malloc, because you always do something with the return value – assign it to a variable. The core problem here is that the API encodes the error condition as a special case value in the returned “data”. The obvious answer to this is to separate the two cases, keeping the return value soley as a bi-state success of fail flag.
    The data can be passed by via a output parameter. This might look like:

    myptr *foo;
    mymalloc(&foo;, sizeof(*foo));
    

    And the compiler would be able to warn that you failed to check for failure. It looks rather ugly to write this way, but consider that a preprocessor macro is already being used for the previous issue, so we can merely extend its use

    #define MALLOC(ptr) mymalloc(&(foo), sizeof(*(foo))
    myptr *foo;
    if (MALLOC(foo) < 0)
       ... do error handling...
    
  3. The memory allocated by malloc() is not initialized to zero. For that a separate calloc() function is provided. This leaves open the possiblity of an app mistakenly using the wrong variant. Or consider an app using malloc() and explicitly initializing all struct members. Some time later a new member is added and now the developer needs to find all malloc() calls wrt to that struct and explicitly initilize the new member. It would be safer to always have malloc zero all memory - even though it has an obvious performance impact, the net win in safety is probably worth it for most applications. Dealing with this in realloc() is much harder though, because you don't know how many extra bytes (if any) you need to initialize without knowing the original size.

  4. Since free() takes the pointer to be freed directly it cannot reset the caller's original pointer variable back to NULL. So the application is at risk of mistakenly calling free() a second time, leading to corruption or a crash. free() will happily ignore a NULL pointer though. If free() were to accept a pointer to a pointer instead, it could automatically NULL-ify it. Again this has extra computational cost from the often redundant assignment to NULL, but it is negligable in the context of the work done to actually free the memory region, and easily offset by the safety benefits.

  5. As with point 2, this is caused by the fact that the error condition is special case of the return data value.

  6. This is a fun one which may escape detection for ages because most of the time the returned pointer address is the same as the original address. realloc() doesn't often need to relocate the data to another address. Given that solving the previous point required changing the return data to be an output-parameter, we've already basically solved this issue to. Pass in a pointer to the pointer to be reallocated and it can update the address directly.

  7. What todo on failure of a realloc() is an interesting question. The chances are that most applications will immediately free() the block and go into cleanup code - indeed I don't recall coming across code that would ever continue its work if realloc() failed. Thus one could simply define that if realloc fails it automatically free's the original data too.

Having considered this all its possible to define a set of criteria for the design of a low level memory allocation API that is considerably safer than the standard C one, while still retaining nearly all its flexibility and avoiding the imposition of policy such as calling abort() on failure.

  • Use return values only for a success/fail error condition flag
  • Annotate all the return values with __warn_unused_result__
  • Pass a pointer to the pointer into all functions
  • Define macros around all functions to automatically calculate datatype sizes

So the primary application usage would be via a set of macros:

    VIR_ALLOC(ptr)
    VIR_ALLOC_N(ptr, count)
    VIR_REALLOC_N(ptr, count)
    VIR_FREE(ptr)

These call through to the underlying APIs:

    int virAlloc(void *ptrptr, size_t bytes)
    int virAllocN(void *ptrptr, size_t bytes, size_t count)
    int virReallocN(void *ptrptr, size_t bytes, size_t count)
    int virFree(void *ptrptr)

The only annoying thing here is that although we're passing a pointer to a pointer into all these, the first param is still just 'void *' and not 'void **'. This works because 'void *' is defined to hold any type of pointer, and in addition using 'void **' causes the compiler to complain bitterly about strict aliasing violations. Internally the impls of these methods can still safely cast to 'void **' when deferencing the pointer.

All 3 of Alloc/Realloc functions will have __warn_unused_result__ annotation so the caller is forced to check the return value for failure, validated at compile time generating a warning (or fatal compile error with -Werror).

And finally to wire up the macros to the APIs:

  #define VIR_ALLOC(ptr)            virAlloc(&(ptr), sizeof(*(ptr)))
  #define VIR_ALLOC_N(ptr, count)   virAllocN(&(ptr), sizeof(*(ptr)), (count))
  #define VIR_REALLOC_N(ptr, count) virReallocN(&(ptr), sizeof(*(ptr)), (count))
  #define VIR_FREE(ptr)             virFree(&(ptr))

If this is all sounding fairly abstract, an illustration of usage should clear things up. These snippets are taken from libvirt, showing before and after

Allocating a new variable:

Before

      virDomain *domain;
      if ((domain = malloc(sizeof(*domain))) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
      }

After

      virDomain *domain;

      if (VIR_ALLOC(domain) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
      }
Allocating an array of domains

Before

       virDomain *domains;
       int ndomains = 10;

       if ((domains = malloc(sizeof(*domains) * ndomains)) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }

After

       virDomain *domains;
       int ndomains = 10;

       if (VIR_ALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Allocating an array of domain pointers

Before

       virDomain **domains;
       int ndomains = 10;

       if ((domains = malloc(sizeof(*domains) * ndomains)) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }

After

       virDomain **domains;
       int ndomains = 10;

       if (VIR_ALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Re-allocate the array of domains to be longer

Before

       virDomain **tmp;
       ndomains = 20

       if ((tmp = realloc(domains, sizeof(*domains) * ndomains)) == NULL {
         free(domains);
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
       domains = tmp;

After

       ndomains = 20

       if (VIR_REALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Free the domain

Before

       free(domain);
       domain = NULL;

After

       VIR_FREE(domain);

As these short examples show the number of lines of code hasn't changed much, but the clarity of them has - particularly the realloc() usage, and of course there is now compile time verification of usage. The main problem remaining is the classic memory leak, by forgetting to call free() at all. If you want to use a low level memory allocation API this problem is essentially unavoidable. Fixing it really requires a completely different type of API (eg, obstacks/pools) or a garbage collector. And there's always valgrind to identify leaks which works very nicely, particularly if you have extensive test suite coverage

The original mailing thread from which this post is derived is on the libvirt mailing list