In Visual Studio, `thread_local` variables' destructor not called when used with std::async, is this a bug?


The following code

    #include <chrono>
    #include <iostream>
    #include <future>
    #include <thread>
    #include <mutex>

    std::mutex m;

    struct Foo {
        Foo() {
            std::unique_lock<std::mutex> lock{m};
            std::cout << "Foo Created in thread " << std::this_thread::get_id() << "\n";
        }

        ~Foo() {
            std::unique_lock<std::mutex> lock{m};
            std::cout << "Foo Deleted in thread " << std::this_thread::get_id() << "\n";
        }

        void proveMyExistance() {
            std::unique_lock<std::mutex> lock{m};
            std::cout << "Foo this = " << this << "\n";
        }
    };

    int threadFunc() {
        static thread_local Foo some_thread_var;

        // Prove the variable initialized
        some_thread_var.proveMyExistance();

        // The thread runs for some time
        std::this_thread::sleep_for(std::chrono::milliseconds{100});

        return 1;
    }

    int main() {
        auto a1 = std::async(std::launch::async, threadFunc);
        auto a2 = std::async(std::launch::async, threadFunc);
        auto a3 = std::async(std::launch::async, threadFunc);

        a1.wait();
        a2.wait();
        a3.wait();

        // Give the worker threads time to exit before main returns
        std::this_thread::sleep_for(std::chrono::milliseconds{1000});

        return 0;
    }

Compiled and run with clang on macOS:

    clang++ test.cpp -std=c++14 -pthread
    ./a.out

This produced the following output:

    Foo Created in thread 0x70000d9f2000
    Foo Created in thread 0x70000daf8000
    Foo Created in thread 0x70000da75000
    Foo this = 0x7fd871d00000
    Foo this = 0x7fd871c02af0
    Foo this = 0x7fd871e00000
    Foo Deleted in thread 0x70000daf8000
    Foo Deleted in thread 0x70000da75000
    Foo Deleted in thread 0x70000d9f2000

Compiled and run in Visual Studio 2015 Update 3:

    Foo Created in thread 7180
    Foo this = 00000223B3344120
    Foo Created in thread 8712
    Foo this = 00000223B3346750
    Foo Created in thread 11220
    Foo this = 00000223B3347E60

The destructors are not called.

Is this a bug or some undefined grey zone?

P.S.

If the final sleep, std::this_thread::sleep_for(std::chrono::milliseconds{1000});, is not long enough, you may sometimes not see all 3 "Foo Deleted" messages.

When using std::thread instead of std::async, the destructors get called on both platforms, and all 3 "Foo Deleted" messages are always printed.

 


I think @super has hit the nail on the head there. And if you think that through you will realise that you have to be a bit careful here.

cppreference has this to say (emphasis mine):

The template function async runs the function f asynchronously (potentially in a separate thread which may be part of a thread pool).

(Where 'potentially' here means 'depending on the parameters you pass to it', which is not what I thought on first reading).

Now, seasoned Win32 programmers will know all about Windows' built-in support for ThreadPools and I'm pretty confident that's what's happening here.

I'm sure (and you have indeed pretty-much proved) that the compiler is emitting code to destroy the relevant thread_local Foo when the thread that created it terminates, but, on Windows, that thread probably never terminates when you use std::async because the thread pool will keep it alive for use by the next customer. That's its job, basically - to recycle threads to reduce overhead (since creating a thread is relatively expensive) and perhaps to limit the maximum number running at any one time (which is a separate issue).

So that code is basically flawed, because it assumes that each invocation of threadFunc() will run in its own unique thread and that your thread_local objects will be destroyed immediately when threadFunc() returns, whereas in fact the standard offers no such (explicit) guarantee; see my second edit below.

Other stuff:

  1. The static keyword does not affect the storage duration of a thread_local object (see Explanation 5 here); it only determines whether or not the variable, if declared at global scope, has external linkage. A block-scope thread_local variable such as this one is constructed the first time a thread's execution passes through its declaration, and it is destroyed when that thread exits.

  2. Replace static thread_local Foo some_thread_var; with Foo some_thread_var; and all three destructors are duly called (obviously), even when using std::async. Which is, of course, by far the best way to do things like this.

Edit: Turns out that MSVC's implementation of std::async is actually built on top of PPL, see @DavidHaim's comments here, which, in turn, is built on top of the native ThreadPool API, so, same difference in the end.

Layers upon layers upon layers, no wonder we need so much RAM these days.

Edit 2: Interesting. With reference to my original post, I just read the above cppreference link a bit more carefully and it also says this (emphasis mine):

If the async flag is set (i.e. (policy & std::launch::async) != 0), then async executes the callable object f on a new thread of execution (with all thread-locals initialized) as if spawned by std::thread(std::forward<F>(f), std::forward<Args>(args)...)

So my initial analysis was not quite right (I have added a note above) and those destructors should get called when the thread eventually gets recycled, but you have no control over exactly when (and the standard certainly offers no guarantee that this will happen at the exact time threadFunc() exits, although it would obviously be nice if it did). Sorry for the confusion but it sounds like you're still in a bit of a hole.

So where does that leave us? OK, so:

  1. I will try to devise a test to see if MSVC correctly manages the object lifetimes of thread_local objects when a thread is recycled. Enquiring minds want to know. I'll post back.

  2. If you want to safely use std::async - on any platform - you will have to manage your own object lifetimes more carefully. I see no way of getting away from that, sorry, you'll have to have a bit of a rethink. Oooh boy, you have to be so careful with this stuff!
