Different linkage for extern "C" vs. extern "C" { } in C++ [duplicate]

I realize that, at first sight, my question might seem an obvious duplicate of one of the many questions here about the extern keyword, but I was unable to find any answer that discusses the difference between extern "C" and extern "C" { }. On the contrary, I've found several people stating that the two constructs are equivalent, which I believe is a reasonable thing to expect. Unfortunately, empirical evidence shows that they are not equivalent.

Here is an example:

    extern "C" { const int my_var1 = 21; }
    extern "C" const int my_var2 = 42;
    const int my_var3 = 121;

    int main() { }

After compiling it with GCC 7 (g++ externC.cpp), I see a remarkable difference:

    $ readelf -s ./a.out | grep my_var
        34: 0000000000000694     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var1
        35: 000000000000069c     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var3
        59: 0000000000000698     4 OBJECT  GLOBAL DEFAULT   15 my_var2

my_var1 and my_var3 both have local binding and a C++ mangled name, while my_var2 has global binding and actual C linkage. So it looks like extern "C" { } was completely ignored, while the similar extern "C" without {} did have an effect. That is super weird to me.

Things get even more interesting if I remove the const and just try to read the variables:

    #include <cstdio>

    extern "C" { int my_var1; }
    extern "C" int my_var2;
    int my_var3;

    int main() {
        printf("%d, %d, %d\n", my_var1, my_var2, my_var3);
    }

When I try to compile this 2nd program, the linker complains that it cannot resolve the reference to my_var2:

    /tmp/ccfs9cis.o: In function `main':
    externC.cpp:(.text+0xc): undefined reference to `my_var2'
    collect2: error: ld returned 1 exit status

And that means that in this case two things happened:

  1. extern "C" { int my_var1; } defined, in this translation unit, a variable called my_var1 with C linkage.

  2. extern "C" int my_var2; only declared an extern variable, where by extern I mean the traditional sense (like extern int x;), but with "C" linkage.

From my point of view, this is inconsistent with the behavior in the 1st case above, using const. In other words:

  • In the 1st program, with const:

    • extern "C" behaved like I expected extern "C" {} to behave [change the linkage]

    • extern "C" {} instead, did nothing

  • In the 2nd program, without const:

    • extern "C" {} behaved like I originally expected [change the linkage] BUT

    • extern "C" behaved like: extern "C" { extern int my_var2; } which is the way to declare an extern variable with C linkage (and unfortunately in C++ the keyword extern has been reused).

In conclusion, my question is: can anyone (maybe a compiler expert?) explain the theory behind why extern "C" and extern "C" {} behave so differently and in such an inconsistent (at least to me) way? In my experience with C++, once you understand a given concept in deep detail, even its tricky and complex corner cases start to look pretty reasonable and consistent. You just need to see the whole picture very clearly. I believe this is such a case.

Thanks a lot to everybody, in advance.


Edit[1]

[In the end it turned out that a similar question did exist here, I was just unable to find it. Sorry for that.]

Thanks to the answers so far, I now understand the subtle difference between extern "C" {} and extern "C", even if I'd still be curious to understand how we (the C++ developers/ISO committee) ended up with such a solution. It's kind of like making if (x) foo(); behave slightly differently from if (x) { foo(); }. Anyway, given this new knowledge, I have a few (hopefully) interesting observations to make:

Given that the transformation: extern "C" X => extern "C" { extern X } is always correct

It follows that:

  • The only way to define (instantiate) a const variable with C linkage in the current translation unit is to make it extern, even if we don't want that: the compiler decides whether we are defining the variable or just declaring it depending on whether we gave it an initializer. With an initializer it is a definition; without one it is only a declaration.

  • The same logic (extern + const) applies to regular const variables with C++ linkage as well. A const variable with C linkage is no different except for the lack of name mangling.

  • From the statements above it follows that, since const implies internal linkage in C++ (but not in C!), the extern when used for a const does not mean extern, but just less internal or more extern than static.

In other words:

  • const int var = 23; creates a global variable with internal linkage, like static int var = 23; would, except that it is typically placed in a read-only segment.
  • extern const int var = 23; creates a global variable with regular (external) linkage. The extern neutralizes the implicit static. The result is the same as int var = 23; except that, being const, it is typically placed in a read-only segment.
  • extern const int var; declares a proper extern variable in a foreign read-only segment.
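
The three forms above can be put side by side in one translation unit. This is only a sketch: the variable and function names are hypothetical and do not come from the original programs, and the third variable is assumed to be defined in some other translation unit that is not shown.

```cpp
// Hypothetical names, chosen only to illustrate the three forms listed above.

// Internal linkage: const at namespace scope behaves as if it were static.
const int internal_const = 23;

// External linkage and a definition: the initializer makes it a definition,
// the explicit extern overrides the implicit internal linkage of const.
extern const int external_const = 23;

// Declaration only: no storage is reserved here. The variable is assumed to
// be defined in another translation unit, so it must not be read in this
// file unless that definition is linked in.
extern const int declared_elsewhere;

// Reads only the two constants that are defined in this translation unit.
int sum_of_defined_constants() {
    return internal_const + external_const;
}
```

Because declared_elsewhere is never used, the file links on its own; reading it without a matching definition should produce the same kind of undefined reference error shown earlier for my_var2.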

 


See here:

[extern "C" { ... }] Applies the language specification string-literal to all function types, function names with external linkage and variables with external linkage declared in declaration-seq.

Since const int my_var1 = 21; has internal linkage, wrapping extern "C" { } around it has no effect.
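
As a minimal sketch of the consequence, assuming the goal is to give my_var1 C linkage while keeping the braces: an explicit extern inside the block gives the const variable external linkage, so the linkage specification then applies to it. The helper function name is hypothetical and is only there so the values can be read back.

```cpp
extern "C" {
    // The explicit extern gives the const variable external linkage,
    // so the surrounding extern "C" { } now applies to it.
    extern const int my_var1 = 21;
}

// The brace-less form already implies extern, as in the first program.
extern "C" const int my_var2 = 42;

// Hypothetical helper: returns the sum of the two C-linkage constants.
int sum_of_c_linkage_constants() {
    return my_var1 + my_var2;
}
```

With this change both symbols should appear unmangled and with GLOBAL binding in the readelf listing, although the exact output depends on the toolchain.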

Also:

[extern "C" ...] Applies the language specification string-literal to a single declaration or definition.

and

A declaration directly contained in a language linkage specification is treated as if it contains the extern specifier for the purpose of determining the linkage of the declared name and whether it is a definition.

    extern "C" int x; // a declaration and not a definition
    // The above line is equivalent to extern "C" { extern int x; }

    extern "C" { int x; } // a declaration and definition

This explains why for extern "C" const int my_var2 = 42; the variable has external linkage and an unmangled name. It also explains why you're seeing an undefined reference to my_var2 in your second code example.
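
One possible way to make the second example link, sketched under the assumption that my_var2 is meant to be defined in this translation unit: use the braced form, which does not imply extern, so the declaration is also a definition. Giving the brace-less declaration an initializer would be another option, since a declaration with an initializer is a definition. The print_vars function is a hypothetical stand-in for the body of main in the question.

```cpp
#include <cstdio>

extern "C" { int my_var1; }  // definition with C linkage
extern "C" { int my_var2; }  // braces imply no extern, so this is a definition too
int my_var3;                 // definition with C++ linkage

// Hypothetical helper standing in for main() from the question:
// prints the three globals, which have static storage duration
// and are therefore zero-initialized.
void print_vars() {
    std::printf("%d, %d, %d\n", my_var1, my_var2, my_var3);
}
```

Calling print_vars() from main should then link without the undefined reference error.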
