In C++, why does a derived class that just contains a union with an instance of its base class take more memory than the size of the union?


More specifically, a class, inheriting from an empty class, containing just a union whose members include an instance of the base data-less class, takes up more memory than just the union. Why does this happen and is there any way to avoid spending the extra memory?

The following code illustrates my question:

#include <iostream>

class empty_class { };

struct big : public empty_class
{
    union
    {
        int data[3];
        empty_class a;
    };
};

struct small
{
    union
    {
        int data[3];
        empty_class a;
    };
};

int main()
{
    std::cout << sizeof(empty_class) << std::endl;
    std::cout << sizeof(big)         << std::endl;
    std::cout << sizeof(small)       << std::endl;
}

The output of this code, when compiled with gcc version 7.3.0 using -std=c++17 (I get the same result with c++11 and c++14), is:

1
16
12

I would expect big and small to be the same size; strangely, however, big takes up more memory than small even though both seemingly contain the same data.

Also, even if the size of the array in the union is changed, the difference between the sizes of big and small stays a constant 4 bytes.

Edit:

It seems that this behavior is not specific to classes containing a union. The same thing happens in other situations where a derived class has a member of the base class type. Thanks to those who pointed this out.
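
A minimal sketch of that more general case, with no union involved. The names with_base, without_base and check_sizes are made up for this illustration, and the exact sizes depend on the compiler and ABI, so only the relative comparison is checked:

```cpp
#include <cassert>

class empty_class { };

// Hypothetical type: derives from empty_class AND holds a member of that type.
struct with_base : public empty_class
{
    empty_class a;
};

// Hypothetical type: the same member, but no base class.
struct without_base
{
    empty_class a;
};

// The base subobject and the member a are two distinct empty_class objects,
// so with_base needs room for two different addresses.
void check_sizes()
{
    assert(sizeof(with_base) > sizeof(without_base));
}
```

Calling check_sizes() from main should pass without complaint on the compilers I would expect people to try this with.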


This is because of what I call the "unique identity rule" of C++. Every (live) object in C++ of a particular type T must always have a different address from every other live object of type T. The compiler therefore cannot lay out a type in a way that violates this rule, that is, with two distinct subobjects of the same type T at the same offset.

Class big contains two subobjects of note: a base class empty_class and an anonymous union containing a member empty_class.

The empty base optimization is based on aliasing the "storage" for an empty base class with other types. Typically, this is done by giving it the same address as the parent class, which means the address will typically be the same as the first non-empty base or first member subobject.
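
As a rough sketch of what the empty base optimization looks like when nothing prevents it, consider a derived class that has no member of the base type. The names derived_plain, base_shares_address and check_ebo are invented for this example, and the size check assumes a typical ABI where the optimization is applied (the standard permits it here but the layout is up to the implementation):

```cpp
#include <cassert>

class empty_class { };

// Hypothetical type: derives from the empty class, but none of its members
// has type empty_class, so nothing stops the base from sharing storage.
struct derived_plain : public empty_class
{
    int data[3];
};

// True when the base subobject has the same address as the whole object.
bool base_shares_address(const derived_plain& d)
{
    const void* whole = &d;
    const void* base = static_cast<const empty_class*>(&d);
    return whole == base;
}

// Assumption: an ABI that applies the empty base optimization here.
void check_ebo()
{
    derived_plain d{};
    assert(base_shares_address(d));
    assert(sizeof(derived_plain) == sizeof(int[3]));
}
```

Here the base costs nothing: the derived object is exactly as large as its array.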

If the compiler gave the base class empty_class the same address as the union member, then you would have two distinct subobjects of the class (big::empty_class and big::a) which have the same address but are different objects.

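Such a layout would violate the unique identity rule, so the compiler cannot employ the empty base optimization here. With gcc, the base stays at offset 0 and the union is moved to the next offset that satisfies its alignment, which is why the difference is a constant 4 bytes no matter how large the array is. That's also why big is not standard layout.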
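
To see this on your own compiler, something along these lines should work. The helper functions member_offset and base_and_member_distinct are names I made up for the sketch; big and empty_class are taken from the question. I deliberately do not state the offset you will get, since that depends on the ABI:

```cpp
#include <cassert>
#include <cstddef>

class empty_class { };

struct big : public empty_class
{
    union
    {
        int data[3];
        empty_class a;
    };
};

// Byte distance from the start of a big object to its union member a.
std::ptrdiff_t member_offset(const big& b)
{
    const char* start = reinterpret_cast<const char*>(&b);
    const char* member = reinterpret_cast<const char*>(&b.a);
    return member - start;
}

// True when the base subobject and the union member a sit at different addresses.
bool base_and_member_distinct(const big& b)
{
    const empty_class* base = &b;      // implicit conversion to the base subobject
    const empty_class* member = &b.a;  // address of the union member
    return base != member;
}
```

If base_and_member_distinct returns true, the two empty_class subobjects were kept apart, and sizeof(big) has to be larger than the union alone to make room for that.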
