Intriguing assembly for comparing std::optional of primitive types

  • A+
Category:Languages

Valgrind picked up a flurry Conditional jump or move depends on uninitialised value(s) in one of my unit tests.

Inspecting the assembly, I realized that the following code:

bool operator==(MyType const& left, MyType const& right) {     // ... some code ...     if (left.getA() != right.getA()) { return false; }     // ... some code ...     return true; } 

Where MyType::getA() const -> std::optional<std::uint8_t>, generated the following assembly:

   0x00000000004d9588 <+108>:   xor    eax,eax    0x00000000004d958a <+110>:   cmp    BYTE PTR [r14+0x1d],0x0    0x00000000004d958f <+115>:   je     0x4d9597 <... function... +123> x  0x00000000004d9591 <+117>:   mov    r15b,BYTE PTR [r14+0x1c] x  0x00000000004d9595 <+121>:   mov    al,0x1     0x00000000004d9597 <+123>:   xor    edx,edx    0x00000000004d9599 <+125>:   cmp    BYTE PTR [r13+0x1d],0x0    0x00000000004d959e <+130>:   je     0x4d95ae <... function... +146> x  0x00000000004d95a0 <+132>:   mov    dil,BYTE PTR [r13+0x1c] x  0x00000000004d95a4 <+136>:   mov    dl,0x1 x  0x00000000004d95a6 <+138>:   mov    BYTE PTR [rsp+0x97],dil     0x00000000004d95ae <+146>:   cmp    al,dl    0x00000000004d95b0 <+148>:   jne    0x4da547 <... function... +4139>     0x00000000004d95b6 <+154>:   cmp    r15b,BYTE PTR [rsp+0x97]    0x00000000004d95be <+162>:   je     0x4d95c8 <... function... +172>      => Jump on uninitialized     0x00000000004d95c0 <+164>:   test   al,al    0x00000000004d95c2 <+166>:   jne    0x4da547 <... function... +4139> 

Where I marked with x the statements that are not executed (jumped over) in the case where the optional is NOT set.

The member A here is at offset 0x1c into MyType. Checking the layout of std::optional we see that:

  • +0x1d corresponds to bool _M_engaged,
  • +0x1c corresponds to std::uint8_t _M_payload (inside an anonymous union).

The code of interest for std::optional is:

constexpr explicit operator bool() const noexcept { return this->_M_is_engaged(); }  // Comparisons between optional values. template<typename _Tp, typename _Up> constexpr auto operator==(const optional<_Tp>& __lhs, const optional<_Up>& __rhs) -> __optional_relop_t<decltype(declval<_Tp>() == declval<_Up>())> {     return static_cast<bool>(__lhs) == static_cast<bool>(__rhs)          && (!__lhs || *__lhs == *__rhs); } 

Here, we can see that gcc transformed the code quite substantially; if I understand it correctly, in C this gives:

char rsp[0x148]; // simulate the stack  /* comparisons of prior data members */  /* 0x00000000004d9588 <+108>:   xor    eax,eax 0x00000000004d958a <+110>:   cmp    BYTE PTR [r14+0x1d],0x0 0x00000000004d958f <+115>:   je     0x4d9597 <... function... +123> 0x00000000004d9591 <+117>:   mov    r15b,BYTE PTR [r14+0x1c] 0x00000000004d9595 <+121>:   mov    al,0x1 */  int eax = 0; if (__lhs._M_engaged == 0) { goto b123; } bool r15b = __lhs._M_payload; eax = 1;  b123: /* 0x00000000004d9597 <+123>:   xor    edx,edx 0x00000000004d9599 <+125>:   cmp    BYTE PTR [r13+0x1d],0x0 0x00000000004d959e <+130>:   je     0x4d95ae <... function... +146> 0x00000000004d95a0 <+132>:   mov    dil,BYTE PTR [r13+0x1c] 0x00000000004d95a4 <+136>:   mov    dl,0x1 0x00000000004d95a6 <+138>:   mov    BYTE PTR [rsp+0x97],dil */  int edx = 0; if (__rhs._M_engaged == 0) { goto b146; } rdi = __rhs._M_payload; edx = 1; rsp[0x97] = rdi;  b146: /* 0x00000000004d95ae <+146>:   cmp    al,dl 0x00000000004d95b0 <+148>:   jne    0x4da547 <... function... +4139> */  if (eax != edx) { goto end; } // return false  /* 0x00000000004d95b6 <+154>:   cmp    r15b,BYTE PTR [rsp+0x97] 0x00000000004d95be <+162>:   je     0x4d95c8 <... function... +172> */  //  Flagged by valgrind if (r15b == rsp[097]) { goto b172; } // next data member  /* 0x00000000004d95c0 <+164>:   test   al,al 0x00000000004d95c2 <+166>:   jne    0x4da547 <... function... +4139> */  if (eax == 1) { goto end; } // return false  b172:  /* comparison of following data members */  end:     return false; 

Which is equivalent to:

//  Note how the operands of || are inversed. return static_cast<bool>(__lhs) == static_cast<bool>(__rhs)          && (*__lhs == *__rhs || !__lhs); 

I think that the assembly is correct, if strange. That is, as far as I can see, the result of the comparison between uninitialized values does not actually influence the result of the function (and unlike C or C++, I do expect comparing junk in x86 assembly NOT to be UB):

  1. If one optional is nullopt and the other is set, then the conditional jump at +148 jumps to end (return false), OK.
  2. If both optionals are set, then the comparison reads initialized values, OK.

So the only case of interest is when both optionals are nullopt:

  • if the values compare equal, then the code concludes that the optionals are equal, which is true since they are both nullopt,
  • otherwise, the code concludes that the optionals are equal if __lhs._M_engaged is false, which is true.

In either case, the code therefore concludes that both optionals are equal when both are nullopt; CQFD.


This is the first instance I see of gcc generating apparently "benign" uninitialized reads, and therefore I have a few questions:

  1. Are uninitialized reads OK in assembly (x84_64)?
  2. Is this the syndrome of a failed optimization (reversing ||) which could trigger in non-benign circumstances?

For now, I am leaning toward annotating the few functions with optimize(1) as a work-around to prevent optimizations from kicking in. Fortunately, the identified functions are not performance critical.


Environment:

  • compiler: gcc 7.3
  • compile flags: -std=c++17 -g -Wall -Werror -O3 -flto (+ appropriate includes)
  • link flags: -O3 -flto (+ appropriate libraries)

Note: can appear with -O2 instead of -O3, but never without -flto.


Fun facts

In the full code, this pattern appears 32 times in the function outlined above, for various payloads : std::uint8_t, std::uint32_t, std::uint64_t and even a struct { std::int64_t; std::int8_t; }.

It only appears in a few big operator== comparing types with ~40 data members, not in smaller ones. And it does not appear for the std::optional<std::string_view> even in those specific functions (which call into std::char_traits for the comparison).

Finally, infuriatingly, isolating the function in question in its own binary makes the "problem" disappear. The mythical MCVE is proving elusive.

 


There are no trap values in x86 integer formats, so reading and comparing uninitialized values generates unpredictable truth/false values and no other direct harm.

In a cryptographic context, the state of the uninitialized values causing a different branch to be taken could leak into timing information leaking or other side-channel attacks. But cryptographic hardening is probably not what you are worried about.

The fact that gcc does uninitialized reads when it doesn't matter if the read gives the wrong value doesn't mean it will do it when it matters.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: