How to implement a thread-safe container with natural-looking syntax?


Preface

The code below results in undefined behaviour if used as is:

vector<int> vi;
...
vi.push_back(1);  // thread-1
...
vi.pop_back();    // thread-2

The traditional approach is to fix it with a std::mutex:

vi_mutex.lock();
vi.push_back(1);
vi_mutex.unlock();

Encapsulating this in a macro simplifies it slightly. However, as the code grows, such things become cumbersome, because a separate mutex may have to be maintained for every object.
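
For illustration, a minimal sketch of such a macro (the LOCKED name and the `obj##_mutex` naming convention are assumptions, not part of the original code):

#include <mutex>
#include <vector>

// Hypothetical helper: assumes every shared object `obj` has a companion mutex named `obj_mutex`
#define LOCKED(obj, call)                              \
    do {                                               \
        std::lock_guard<std::mutex> lock(obj##_mutex); \
        obj.call;                                      \
    } while (0)

std::vector<int> vi;
std::mutex vi_mutex;

void Producer () { LOCKED(vi, push_back(1)); }  // thread-1
void Consumer () { LOCKED(vi, pop_back());   }  // thread-2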

Objective

Without compromising the syntax for accessing an object, and without declaring an explicit mutex, I would like to create a template that does all the boilerplate work, e.g.

Concurrent<vector<int>> vi;  // a mutex specific to `vi` is auto-declared in this wrapper
...
vi.push_back(1);  // thread-1: locks `vi` only while `push_back()` is performed
...
vi.pop_back();    // thread-2: locks `vi` only while `pop_back()` is performed

In current C++, this exact syntax is impossible to achieve. However, I have attempted an implementation where, if you just change vi. to vi->, things work as described in the comments above.

Code

// The `Class` member is accessed via `->` instead of `.` operator
// For `const` object, it's assumed only for read purpose; hence no mutex lock
template<class Class,
         class Mutex = std::mutex>
class Concurrent : private Class
{
  public: using Class::Class;

  private: class Safe
           {
             public: Safe (Concurrent* const this_,
                           Mutex& rMutex) :
                     m_This(this_),
                     m_rMutex(rMutex)
                     { m_rMutex.lock(); }
             public: ~Safe () { m_rMutex.unlock(); }

             public: Class* operator-> () { return m_This; }
             public: const Class* operator-> () const { return m_This; }
             public: Class& operator* () { return *m_This; }
             public: const Class& operator* () const { return *m_This; }

             private: Concurrent* const m_This;
             private: Mutex& m_rMutex;
           };

  public: Safe ScopeLocked () { return Safe(this, m_Mutex); }
  public: const Class* Unsafe () const { return this; }

  public: Safe operator-> () { return ScopeLocked(); }
  public: const Class* operator-> () const { return this; }
  public: const Class& operator* () const { return *this; }

  private: Mutex m_Mutex;
};

Test

#include <algorithm>   // std::copy, std::sort
#include <iostream>
#include <iterator>    // std::ostream_iterator
#include <mutex>
#include <thread>
#include <vector>

std::mutex mutex;
#define PRINT(VALUE) std::cout << __FUNCTION__ << "(" << std::this_thread::get_id() << "): " << VALUE << "\n"

/* above utility code here */

Concurrent<std::vector<int>> vi{8,9};  // Test: constructors

void Thread ()
{
  PRINT("started...");
  vi->push_back(2); // Test: insertion
  PRINT((*vi)[0]);  // Test: array with * indirection

  auto viLocked = vi.ScopeLocked();  // Test: locking for a while
  std::copy(viLocked->begin(), viLocked->end(), std::ostream_iterator<int>(std::cout, ", "));
  std::sort(viLocked->begin(), viLocked->end());
  PRINT("copied...");
}

int main ()
{
  std::thread run(Thread);
  PRINT("started...");

  vi->push_back(1);  // Test: insertion
  {
    PRINT("looping...");
    auto viLocked = vi.ScopeLocked();  // Test: locking for a short block
    for(auto it = viLocked->begin(); it != viLocked->end(); ++it)
      PRINT(*it);
  }

  run.join();

  auto viLocked = *vi.ScopeLocked();  // Test: locking again
  auto it = std::begin(viLocked);
  PRINT(*++it);
}

Demo

Questions

  • Does using a temporary object to call a function through the overloaded operator->() lead to undefined behaviour in C++?
  • The range-based for loop syntax does not work in this case; it gives a compilation error. What is the correct way to fix it? (A possible workaround is sketched after this list.)
  • Does this small utility class provide thread-safety for an encapsulated object in all cases?
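
Regarding the range-based for loop, one workaround that fits the wrapper above is to bind the result of ScopeLocked() to a named variable, so the lock outlives the loop (a sketch, not a verified fix for every case):

auto viLocked = vi.ScopeLocked();  // named object: the mutex stays locked for the whole loop
for (int x : *viLocked)            // iterates over the underlying vector<int>
  std::cout << x << ", ";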

Clarifications

There are some productive comments here; however, I see those as trivial issues.

For longer locking, a ScopeLocked() method is introduced. It is the equivalent of std::lock_guard, except that the mutex for a given object is maintained internally, so it is still syntactically cleaner.
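
As a rough illustration of that syntactic difference (the plainVi and plainVi_mutex names below are hypothetical; vi is the Concurrent<> object from above):

// Traditional: the caller must know which mutex guards which object
{
  std::lock_guard<std::mutex> lock(plainVi_mutex);
  plainVi.push_back(1);
  plainVi.push_back(2);
}

// With the wrapper: the mutex lives inside `vi`; ScopeLocked() holds it for the scope
{
  auto viLocked = vi.ScopeLocked();
  viLocked->push_back(1);
  viLocked->push_back(2);
}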

The code has now been further simplified by eliminating an intermediate class that was meant specifically for begin()/end(). Some const methods have been added as well.

Update: the latest version of the code contains a small change so that a const Concurrent<> object does not lock the mutex. This could, however, be made configurable if read access should also be locked.
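
To illustrate that behaviour (the Reader function below is only an example, not part of the posted code): for a const reference to Concurrent<>, operator->() and operator*() return the wrapped object directly, without taking the lock:

void Reader (const Concurrent<std::vector<int>>& cvi)
{
  // Both calls resolve to the const overloads, which do not lock the mutex
  if (!cvi->empty())
    std::cout << (*cvi)[0] << "\n";
}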

 


This endeavor is fraught with peril and performance problems. Iterators generally depend on the state of the whole data structure and will usually be invalidated if the data structure changes in certain ways. This means that iterators either need to hold a mutex on the whole data structure when they're created, or you'll need to define a special iterator that carefully locks only the stuff it's depending on in the moment, which is likely more than the state of the node/element it's currently pointing at. And this would require internal knowledge of the implementation of what's being wrapped.

As an example, think about how this sequence of events might play out:

Thread 1:

 void thread1_func(Concurrent<vector<int>> &cq)
 {
     cq.push_back(1);
     cq.push_back(2);
 }

Thread 2:

 void thread2_func(Concurrent<vector<int>> &cq)
 {
     ::std::copy(cq.begin(), cq.end(), ostream_iterator<int>(cout, ", "));
 }

How do you think that would play out? Even if every member function is nicely wrapped in a mutex so they're all serialized and atomic, you're still invoking undefined behavior as one thread changes a data structure another is iterating over.

You could make creating an iterator also lock a mutex. But then, if the same thread creates another iterator, it should be able to grab the mutex, so you'll need to use a recursive mutex.
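
For instance, with the wrapper from the question, an expression like the one below locks the mutex twice from the same thread (once per cq-> temporary), so it presumably only behaves if the Mutex parameter is something like std::recursive_mutex (this instantiation is an assumption, not part of the posted code):

Concurrent<std::vector<int>, std::recursive_mutex> cq;

void PrintAll ()
{
  // Each `cq->` creates its own temporary lock; with std::mutex a second
  // lock() from the same thread is undefined behaviour, while
  // std::recursive_mutex allows the same thread to re-lock
  std::copy(cq->begin(), cq->end(), std::ostream_iterator<int>(std::cout, ", "));
}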

And, of course, that means that your data structure can't be touched by any other threads while one thread is iterating over it, significantly decreasing concurrency opportunities.

It's also very prone to race conditions. One thread makes a call and discovers some fact about the data structure that it's interested in. Then, assuming that fact is still true, it makes another call. But, of course, the fact is no longer true, because some other thread has poked its nose in between the first thread learning the fact and using it. Using size and then deciding whether or not to iterate is just one example.
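
A concrete sketch of that kind of check-then-act race, written against the question's wrapper (the function name is assumed):

void MaybeReadFront (Concurrent<std::vector<int>>& cq)
{
  if (!cq->empty())                        // lock taken and released within this expression
  {
    // another thread may clear the vector right here
    int front = (*cq.ScopeLocked())[0];    // lock re-taken; the index may now be out of range
    std::cout << front << "\n";
  }
}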
