Using C unions in high-level code

I have always considered the C union a highly underappreciated ugly child that nobody cares about in high-level programming. It is not really meant for persistent storage (specific cases excluded), so it can be hard to see a good use for it, especially for beginner programmers. Unless you work on compilers or close-to-the-metal code, chances are you have rarely used or even spotted a union in an application’s runtime code. But unions can come in handy, sometimes in quite unexpected ways. I often forget about their applications, so hopefully this post will help me remember in the future.

So what can we do with them? Consider a situation where we would like to pack one data type into another. Specifically, assume that we want to pack the four low-precision uint8_t coordinates of a 4D vector into a single uint32_t variable. What most people think of first is taking each coordinate and ORing it into the target with bit shifts. That works fine, but a union gives us cleaner code for the same purpose:

    uint32_t packVector(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
    {
        // both members share the same 4 bytes of storage
        union
        {
            uint32_t m_packed;
            uint8_t  m_unpacked[4];
        } u; // 4 bytes

        // m_unpacked[0] is the lowest-addressed byte of m_packed
        u.m_unpacked[0] = x;
        u.m_unpacked[1] = y;
        u.m_unpacked[2] = z;
        u.m_unpacked[3] = w;

        return u.m_packed;
    }
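
For comparison, here is a minimal sketch of the shift-and-OR approach mentioned above, together with its inverse. The names packVectorShift and unpackVectorShift are made up for this illustration. One difference worth knowing: the shift version always puts x in the lowest 8 bits of the result, while the union version puts x in the lowest-addressed byte, so the two produce the same value on little-endian CPUs only.

```c
#include <stdint.h>

/* Hypothetical name, for illustration only.
   Shift-and-OR packing: x always ends up in bits 0..7,
   regardless of the CPU's byte order. */
uint32_t packVectorShift(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
    return  (uint32_t)x
         | ((uint32_t)y << 8)
         | ((uint32_t)z << 16)
         | ((uint32_t)w << 24);
}

/* Hypothetical inverse: extract the four coordinates again. */
void unpackVectorShift(uint32_t packed, uint8_t out[4])
{
    out[0] = (uint8_t)( packed        & 0xFFu); /* x */
    out[1] = (uint8_t)((packed >> 8)  & 0xFFu); /* y */
    out[2] = (uint8_t)((packed >> 16) & 0xFFu); /* z */
    out[3] = (uint8_t)((packed >> 24) & 0xFFu); /* w */
}
```

With this layout, packVectorShift(1, 2, 3, 4) yields 0x04030201 on any platform.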

A completely different problem that a union can easily solve is floating-point endian swapping. You usually run into this when dealing with different CPU architectures and binary data stored in a file. Depending on the supported platforms, you may need to juggle between little and big endian, and for floats this can incur performance issues if you reuse the integer endian swap through “classic” type casting. All those problems disappear if you use a union instead of a cast:

    // assuming that float is 32 bit for simplicity
    float swapFloat(float value)
    {
        // m_integer and m_float share the same 4 bytes
        union
        {
            uint32_t m_integer;
            float    m_float;
        } u;

        u.m_float   = value;
        u.m_integer = integerEndianSwap(u.m_integer); // no penalty

        return u.m_float;
    }
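
The snippet above relies on an integerEndianSwap helper that the post does not show. Below is a minimal sketch of how the pieces could fit together, assuming a plain shift-and-mask byte swap; the helper body, hostIsLittleEndian and bigEndianFloatToHost are my own illustrative additions, not part of any library. swapFloat is repeated only so that the listing compiles on its own.

```c
#include <stdint.h>

/* The post assumes a 32-bit float; fail the build if that does not hold. */
typedef char floatMustBe32Bit[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

/* Assumed implementation of the helper used in the post. */
uint32_t integerEndianSwap(uint32_t v)
{
    return  (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         |  (v << 24);
}

/* Same union trick as in the post. */
float swapFloat(float value)
{
    union
    {
        uint32_t m_integer;
        float    m_float;
    } u;

    u.m_float   = value;
    u.m_integer = integerEndianSwap(u.m_integer);

    return u.m_float;
}

/* Hypothetical helper: yet another union, used to detect host byte order. */
int hostIsLittleEndian(void)
{
    union
    {
        uint32_t m_integer;
        uint8_t  m_bytes[4];
    } u;

    u.m_integer = 1u;
    return u.m_bytes[0] == 1; /* lowest address holds the low byte */
}

/* Hypothetical helper: convert a float stored big-endian to host order. */
float bigEndianFloatToHost(float stored)
{
    return hostIsLittleEndian() ? swapFloat(stored) : stored;
}
```

Swapping twice should give back the original value, which makes for an easy sanity check when wiring this into a file loader.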

See this article for a detailed explanation on why this is generally a better approach.