Can an uninitialized bool crash a program?


I know that "undefined behaviour" in C++ pretty much allows the compiler to do anything it wants. However, I had a crash that surprised me, as I had assumed the code was safe enough. In this case, the problem happened only on a specific platform with a specific compiler, and only when optimization was enabled.

I tried several things in order to reproduce the problem and simplify it as much as possible. Here's an extract of a function called Serialize, which takes a bool parameter and copies the string "true" or "false" to an existing destination buffer. If this function came up in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter was an uninitialized value.

    // Zero-filled global buffer of 16 characters
    char destBuffer[16];

    void Serialize(bool boolValue) {
        // Determine which string to print based on boolValue
        const char* whichString = boolValue ? "true" : "false";

        // Compute the length of the string we selected
        const size_t len = strlen(whichString);

        // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
        memcpy(destBuffer, whichString, len);
    }

If this code is compiled with clang 5.0.0 with optimizations enabled, it can crash.

The ternary expression boolValue ? "true" : "false" looked safe enough to me. I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have set up a Compiler Explorer example that shows the problem in the disassembly:

    #include <iostream>
    #include <cstring>

    // Simple struct, with an empty constructor that doesn't initialize anything
    struct FStruct {
        bool uninitializedBool;

        __attribute__ ((noinline))  // Note: the constructor must be declared noinline to trigger the problem
        FStruct() {};
    };

    int main() {
        // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
        FStruct structInstance;

        // Output "true" or "false" to stdout
        Serialize(structInstance.uninitializedBool);
        return 0;
    }

The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" differ in length by only 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, like this:

    const size_t len = strlen(whichString); // original code
    const size_t len = 5 - boolValue;       // clang clever optimization
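To make the failure mode concrete, here is a minimal sketch of that arithmetic. The helper name optimizedLength is hypothetical, and it only models the `5 - boolValue` computation on a raw byte (assuming the byte is zero-extended before the subtraction); it is not the code clang actually emits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original post): models the optimized
// length computation `5 - boolValue` when the byte backing the bool holds
// an arbitrary raw value instead of 0 or 1.
// Assumption: the raw byte is zero-extended to size_t before subtracting.
std::size_t optimizedLength(std::uint8_t rawByte) {
    // Unsigned arithmetic wraps around, so any rawByte above 5 yields a
    // value close to SIZE_MAX rather than a small length.
    return static_cast<std::size_t>(5) - static_cast<std::size_t>(rawByte);
}
```

With a raw byte of 0 or 1 this gives 5 or 4, the lengths of "false" and "true". With any byte above 5 the subtraction wraps around, and a memcpy of that length would run far past the 16-byte destBuffer.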

While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way? Or is this a case of implementation-defined behaviour, in which the implementation assumes that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?

 


The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.

The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your Compiler Explorer link, gcc 8.2 does detect the bug. (Maybe you could file a bug report against clang for not finding the problem.)
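One way to fix the caller, as a sketch only: give the bool member a default member initializer so it can never be read uninitialized. The names FStructFixed, initializedBool and serializeDefault are hypothetical and do not appear in the question; Serialize and destBuffer are repeated from the question so the snippet is self-contained.

```cpp
#include <cstddef>
#include <cstring>

// Zero-filled global buffer of 16 characters, as in the question
char destBuffer[16];

// Serialize as given in the question, unchanged in behaviour
void Serialize(bool boolValue) {
    const char* whichString = boolValue ? "true" : "false";
    const std::size_t len = std::strlen(whichString);
    std::memcpy(destBuffer, whichString, len);
}

// Hypothetical fixed variant of FStruct: the default member initializer
// guarantees the bool holds a valid value even with an empty constructor.
struct FStructFixed {
    bool initializedBool = false;
    FStructFixed() {}
};

// Hypothetical helper: builds a struct on the stack, serializes its
// (now initialized) bool, and returns the destination buffer.
const char* serializeDefault() {
    FStructFixed structInstance;
    Serialize(structInstance.initializedBool);
    return destBuffer;
}
```

Calling serializeDefault() leaves "false" in destBuffer, and the call to Serialize no longer reads an indeterminate value.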

Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.

NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.
