What is the rationale behind the strict aliasing rule?


I am currently wondering about the rationale behind the strict aliasing rule. I understand that certain aliasing is not allowed in C and that the intention is to allow optimizations, but I am surprised that this was the preferred solution over tracing type casts when the standard was defined.

So, apparently the following example violates the strict aliasing rule:

    uint64_t swap(uint64_t val) {
        uint64_t copy = val;
        uint32_t *ptr = (uint32_t*)&copy; // strict aliasing violation
        uint32_t tmp = ptr[0];
        ptr[0] = ptr[1];
        ptr[1] = tmp;
        return copy;
    }

I might be wrong, but as far as I can see, a compiler should easily be able to trace the type casts and avoid optimizations on objects that are explicitly cast (just as it avoids such optimizations on same-type pointers), as well as on anything called with the affected values.

So, which problems did I miss that a compiler cannot easily solve on its own in order to detect where such optimizations are possible?

 


Since, in this example, all the code is visible to a compiler, a compiler can, hypothetically, determine what is requested and generate the desired assembly code. However, demonstration of one situation in which a strict aliasing rule is not theoretically needed does nothing to prove there are not other situations where it is needed.

Consider if the code instead contains:

foo(&val, ptr) 

where the declaration of foo is void foo(uint64_t *a, uint32_t *b);. Then, inside foo, which may be in another translation unit, the compiler would have no way of knowing that a and b point to (parts of) the same object.

Then there are two choices: One, the language may permit aliasing, in which case the compiler, while translating foo, cannot make optimizations relying on the fact that *a and *b are different. For example, whenever something is written to *b, the compiler must generate assembly code to reload *a, since it may have changed. Optimizations such as keeping a copy of *a in registers while working with it would not be allowed.

Two, the language may prohibit aliasing (specifically, it may leave the behavior undefined if a program does it). In this case, the compiler can make optimizations relying on the fact that *a and *b are different.
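To make the two choices concrete, here is a minimal sketch. The function name read_store_read and its body are assumptions made for illustration; only the parameter types are taken from the declaration of foo above.

```c
#include <stdint.h>

/* Hypothetical function with the same parameter types as foo.
 * It reads *a, stores through b, then reads *a again. */
uint64_t read_store_read(uint64_t *a, uint32_t *b)
{
    uint64_t first = *a;   /* load *a, possibly into a register */
    *b = 1;                /* store through a pointer to a different type */
    uint64_t second = *a;  /* if aliasing were permitted, this would need a reload;
                              with aliasing prohibited, the compiler may reuse `first` */
    return first + second;
}
```

When a and b point to distinct objects, both choices give the same result: with *a equal to 5, the function returns 10. The difference only shows up when b points into *a, which is exactly the case whose behavior is left undefined under the second choice.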

The C committee chose option two because it offers better performance while not unduly restricting programmers.
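As an illustration of why the restriction need not be a burden, the swap from the question can probably be written without accessing the uint64_t object through a uint32_t pointer, by copying its bytes with memcpy instead. This is only a sketch under stated assumptions: the name swap_halves is hypothetical and not from the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rewrite of the question's swap: exchange the two
 * 32-bit halves of val by copying bytes rather than casting pointers. */
uint64_t swap_halves(uint64_t val)
{
    uint32_t halves[2];

    memcpy(halves, &val, sizeof val);  /* copy the 8 bytes out of val */

    uint32_t tmp = halves[0];
    halves[0] = halves[1];
    halves[1] = tmp;

    memcpy(&val, halves, sizeof val);  /* copy the swapped halves back */
    return val;
}
```

Calling swap_halves(0x0000000100000002) yields 0x0000000200000001 regardless of byte order, because the two halves are exchanged as whole units.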
