Do you know how much memory your structs and classes require? Unless you plan carefully, structs and classes may require as much as twice the memory actually needed, due to the compiler’s default packing alignment. The pack directive can be used to define the best packing alignment for your code.

The Waste Problem

When compiling code with any modern compiler, the compiler determines the default packing alignment of structures, unions and classes. This alignment is set for an entire compilation unit, and can usually be controlled with compiler switches like /Zp (in VC’s cl.exe).

The default packing alignment varies between compilers, but on most of them it is 4 or 8.
As an example, observe the following struct:

#include <cstdio>

struct myStruct
{
    char m_char;
    int m_int;
    char m_boolean:1;
};

int main()
{
    // sizeof yields a size_t, so it is printed with %zu
    printf("%zu bytes required for myStruct\n", sizeof(myStruct));
    return 0;
}

myStruct holds 41 ( = 8 + 32 + 1) useful bits of information, which round up to 6 bytes of memory actually needed. Surprisingly, the program above prints “12 bytes”. This is because under the default packing alignment of 4, m_int must start at an offset that is a multiple of 4 and the total size is rounded up to a multiple of 4, so myStruct actually requires 12 bytes, as illustrated in the following chart:

[m_char , _, _, _, m_int, m_int, m_int, m_int, m_boolean, _, _, _]

(every position in the chart represents one byte)
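
One way to check a chart like this on your own compiler is to ask for the member offsets directly. The following is only a minimal sketch, assuming a typical platform with a 4-byte int. The names DefaultLayout and printDefaultLayout are made up for this illustration, and since offsetof cannot be applied to a bit-field, only the first two members are queried.

```cpp
#include <cstddef>  // offsetof
#include <cstdio>   // std::printf

// Same members as the struct above, compiled with the default packing alignment.
// DefaultLayout is a hypothetical name used only for this sketch.
struct DefaultLayout
{
    char m_char;
    int m_int;
    char m_boolean:1;
};

// Prints where each addressable member starts and how large the struct is.
void printDefaultLayout()
{
    std::printf("m_char starts at byte %zu\n", offsetof(DefaultLayout, m_char));
    std::printf("m_int starts at byte %zu\n", offsetof(DefaultLayout, m_int));
    std::printf("total size is %zu bytes\n", sizeof(DefaultLayout));
}
```

Calling printDefaultLayout() from main should report that m_int starts at byte 4 on such a platform, which corresponds to the three unused bytes after m_char in the chart.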

This inefficient allocation gets worse when declaring an array of myStruct:

struct myStructArray
{
    myStruct structArr[256][256];
};

The size of myStructArray is 256 * 256 * sizeof(myStruct) = 786432 bytes, whereas only 41 bits * 256 * 256 = 335872 bytes are actually used!
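
The arithmetic can be spelled out as compile-time constants. This is a sketch of the calculation only: the constant names are invented for the example, and the padded size of 12 bytes is the assumption made above, not something every compiler guarantees. The useful size counts 8 + 32 + 1 = 41 bits per element.

```cpp
#include <cstddef>

// All names below are hypothetical and exist only for this calculation.
constexpr std::size_t kElements = 256 * 256;     // entries in structArr
constexpr std::size_t kPaddedSize = 12;          // assumed sizeof(myStruct) with default packing
constexpr std::size_t kUsefulBits = 8 + 32 + 1;  // m_char + m_int + m_boolean

// Bytes the array occupies versus bytes that carry information.
constexpr std::size_t kAllocatedBytes = kElements * kPaddedSize;
constexpr std::size_t kUsefulBytes = kElements * kUsefulBits / 8;

static_assert(kAllocatedBytes == 786432, "65536 elements of 12 bytes each");
static_assert(kUsefulBytes == 335872, "65536 elements of 41 bits each");
```

Under these assumptions, more than half of the allocated memory is padding.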

In most cases this would not bother anyone, as memory is not always among your program’s limitations. However, it is easy to think of problems that this waste of memory can cause when using large structs like myStructArray. They start with performance issues: more page faults occur when addressing that memory, so your program could be slower than necessary. They end with storage and hardware issues: the padding is written out as well when large struct buffers are serialized into files or sent to hardware devices.

Changing the Packing Alignment

Note: It is important to mention here that the default packing alignment is set to 4 or 8 mainly to match the sizes the CPU reads and writes most efficiently. Unless you have a special reason like those mentioned above, your program would probably run faster with the default alignment.

Not very surprisingly, a simple pragma directive can set the compiler’s packing alignment to the required size.
When using #pragma pack(n), the compiler uses n bytes as the packing alignment from that directive until the next pack directive or the end of the compilation unit. Under Microsoft’s compiler and GCC, n can be 1, 2, 4, 8 or 16.
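
To get a feeling for how n changes the result, the same members can be declared under several values. This is a rough sketch assuming a 4-byte int. The struct names are hypothetical, and the empty form #pragma pack(), which Microsoft’s compiler and GCC accept for restoring the default, is not otherwise used in this article.

```cpp
#include <cstddef>

#pragma pack(1)   // no padding at all
struct PackedTo1
{
    char m_char;
    int m_int;
    char m_boolean:1;
};

#pragma pack(2)   // members start on even offsets at most
struct PackedTo2
{
    char m_char;
    int m_int;
    char m_boolean:1;
};

#pragma pack(4)   // members start on multiples of 4 at most
struct PackedTo4
{
    char m_char;
    int m_int;
    char m_boolean:1;
};

#pragma pack()    // back to the compiler's default for the rest of the file
struct DefaultPacked
{
    char m_char;
    int m_int;
    char m_boolean:1;
};
```

On such a platform the four structs should measure 6, 8, 12 and 12 bytes respectively. A member is never aligned more strictly than its own type requires, so values of n larger than 4 change nothing for this particular struct.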

To preserve the default packing alignment for all declarations other than the one you want packed, the following syntax can be used to control the compiler’s internal stack of packing alignments:

#pragma pack(push, 1)
struct myStruct
{
    char m_char;
    int m_int;
    char m_boolean:1;
};
#pragma pack(pop)

myStruct would now be compiled with a 1 byte alignment. You can relax, as sizeof(myStruct) is now only 6 bytes instead of 12, and of course myStructArray is about 393KB instead of 786KB.
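
If you would rather have the compiler confirm these numbers than trust them, the expectations can be written down as static assertions. A minimal sketch, assuming a 4-byte int; PackedStruct and PackedStructArray are hypothetical stand-ins for myStruct and myStructArray so that the block compiles on its own.

```cpp
#include <cstddef>  // offsetof

#pragma pack(push, 1)
struct PackedStruct
{
    char m_char;
    int m_int;
    char m_boolean:1;
};
#pragma pack(pop)

struct PackedStructArray
{
    PackedStruct structArr[256][256];
};

// The build fails if the layout ever differs from what is described above.
static_assert(offsetof(PackedStruct, m_int) == 1, "m_int directly follows m_char");
static_assert(sizeof(PackedStruct) == 6, "packed struct takes 6 bytes");
static_assert(sizeof(PackedStructArray) == 393216, "256 * 256 * 6 bytes");
```

Note that m_int now starts at byte 1, an address that is generally not a multiple of 4.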

Note that myStruct keeps its 1 byte packing wherever it is used: anyone who uses myStruct after the pop directive, including myStructArray, gets a size of only 6 bytes, even if myStruct is used inside a region with another packing alignment. However, any other struct or class declared before the push or after the pop is compiled with the default packing alignment, unless stated otherwise.
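
The scoping rule can be illustrated by embedding a packed struct in one that is declared after the pop. This is a sketch under the usual assumption of a 4-byte int, and Inner and Outer are names invented for the example.

```cpp
#include <cstddef>  // offsetof

#pragma pack(push, 1)
struct Inner            // packed: 6 bytes, may start at any address
{
    char m_char;
    int m_int;
    char m_boolean:1;
};
#pragma pack(pop)

// Declared after the pop, so Outer's own members use the default alignment.
struct Outer
{
    char m_tag;         // byte 0
    Inner m_inner;      // bytes 1 to 6, no padding needed in front of it
    int m_tail;         // padded up to byte 8, as usual for an int
};
```

Inner stays at 6 bytes inside Outer, while m_tail is still aligned to 4, so Outer should come out at 12 bytes.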

The pack syntax even lets you declare identifiers and use them to verify that the compiler’s stack is in the state you wanted it to be. For example:

#pragma pack(push, beforeInclude)

// an evil include file that uses #pragma pack(push) itself without popping
#include "IDontTrustThis.h"

#pragma pack(pop, beforeInclude)

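When popping with an identifier, the compiler unwinds its stack down to the record labeled beforeInclude, so the packing alignment that was in effect before the include is restored even if the include file is missing a pop. If the identifier is not found on the stack at all, the compiler complains, and this should alert you to check that include file.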
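
To try the labeled form without a real header, the untrusted include can be simulated inline. The sketch below assumes a compiler that accepts identifiers in pack(push) and pack(pop), such as Microsoft’s compiler, GCC or Clang, and a 4-byte int. FromHeader and AfterInclude are hypothetical names.

```cpp
#include <cstddef>

#pragma pack(push, beforeInclude)   // remember the current state under a label

// Stand-in for the contents of the untrusted header: a push with no matching pop.
#pragma pack(push, 1)
struct FromHeader
{
    char m_char;
    int m_int;
};

// Popping by label also discards the record the "header" left behind.
#pragma pack(pop, beforeInclude)

struct AfterInclude
{
    char m_char;
    int m_int;
};
```

With these compilers FromHeader should be packed into 5 bytes, while AfterInclude gets the default layout of 8 bytes, which suggests the labeled pop restored the state from before the simulated include.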

The pack pragma is not supported on all architectures (PowerPC, for example). GCC supports a less powerful but platform-independent alternative: the __attribute__((packed)) attribute. The equivalent of the code above using the GCC alternative would be:

struct myStruct
{
	char m_char;
	int m_int;
	char m_boolean:1;
} __attribute__((packed));

In this case, the compiler does not pad the struct, and packs it into 6 bytes.

Warnings

Using the pack directive does have some dark sides:

  • Beware of excessive use of the pack directive. The default packing alignment is usually set to 4 or 8 mainly because this is the size the CPU works with, and as mentioned above, you have to consider carefully whether changing the packing alignment would really help.
  • Older compilers might behave unexpectedly when using the pack directive. For example, older compilers have been seen to assume that a pointer to the inner m_int in the structure actually points to an int with a packing alignment of 4, and this could lead to unwanted behavior.
  • Remember that when a program dumps data to files or shared memory in binary form, it does so according to the data’s packing alignment, so if one program packs its struct, any other program that wishes to read the data must also make sure the struct is defined with the same packing alignment.
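
The last two warnings can be turned into a small defensive pattern. The following is only a sketch, assuming both the writer and the reader agree on a 6 byte record and a 4-byte int. WireRecord, decodeRecord and readInt are hypothetical names, and byte order is deliberately left out of the picture.

```cpp
#include <cstddef>
#include <cstring>  // std::memcpy

#pragma pack(push, 1)
struct WireRecord
{
    char m_char;
    int m_int;
    char m_boolean:1;
};
#pragma pack(pop)

// Fails the build if this program's layout drifts from the agreed 6 bytes.
static_assert(sizeof(WireRecord) == 6, "record layout must match the writer");

// Copies one record out of a raw buffer that holds at least sizeof(WireRecord) bytes.
WireRecord decodeRecord(const unsigned char* buffer)
{
    WireRecord record;
    std::memcpy(&record, buffer, sizeof(record));
    return record;
}

// Returns m_int by value, so callers never hold an int* to a misaligned address.
int readInt(const WireRecord& record)
{
    return record.m_int;
}
```

Reading the member through the struct lets the compiler generate an access that is safe for a packed field, whereas a plain int* taken from &record.m_int loses that information.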

Posted by Tomer Margolin

1 Comment »

  1. As mentioned in the article, changing the packing will definitely decrease the performance and might increase the code size, which might not be acceptable.

    It could be worth mentioning that a simpler and more efficient practice is simply to order the members of the structure from the bigger storage size to the smaller. Most of the time, the amount of wasted space will only be at most once the size of the bigger type minus the size of the smaller type.

    On the example given, reordering the members this way:
    struct myPack
    {
    int m_int;
    char m_char;
    char m_boolean:1;
    };
    gives the following memory layout:

    [m_int, m_int, m_int, m_int, m_char, m_boolean, _, _]

    That’s 8 bytes instead of 12, and everything is still aligned as it should be to maximize performance.

    Comment by Cedric Perthuis — August 2, 2008 @ 10:33 pm
