How to efficiently use asyncio when calling a method on a BaseProxy?


I'm working on an application that uses LevelDB and that uses multiple long-lived processes for different tasks.

Since LevelDB only allows a single process to hold a database connection, all our database access is funneled through a dedicated database process.

To access the database from another process, we use a BaseProxy. But since we are using asyncio, our proxy shouldn't block on the APIs that call into the db process and eventually read from the db. Therefore, we implement the APIs on the proxy using an executor.

    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(
        thread_pool_executor,
        self._callmethod,
        method_name,
        args,
    )

And while that works just fine, I wonder if there's a better alternative to wrapping the _callmethod call of the BaseProxy in a ThreadPoolExecutor.

The way I understand it, the BaseProxy calling into the DB process is the textbook example of waiting on IO, so using a thread for this seems unnecessarily wasteful.

In a perfect world, I'd assume an async _acallmethod to exist on the BaseProxy but unfortunately that API does not exist.

So, my question basically boils down to: when working with BaseProxy, is there a more efficient alternative to running these cross-process calls in a ThreadPoolExecutor?

 


Unfortunately, the multiprocessing library is not suited to conversion to asyncio. What you have is the best you can do if you must use BaseProxy to handle your IPC (inter-process communication).
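For reference, the executor approach can be packaged as a small reusable helper. The following is only a minimal sketch: `AsyncProxyWrapper`, `acall`, and `_StubProxy` are hypothetical names invented for illustration, and the stub merely stands in for a real BaseProxy so the example runs without a manager process. The only thing assumed about the proxy is that it exposes `_callmethod(methodname, args, kwds)`.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class AsyncProxyWrapper:
    """Hypothetical helper: runs a proxy's blocking _callmethod in an executor."""

    def __init__(self, proxy, executor=None):
        self._proxy = proxy
        self._executor = executor  # None means the loop's default executor

    async def acall(self, method_name, *args, **kwargs):
        loop = asyncio.get_running_loop()
        # run_in_executor only forwards positional arguments, so bind
        # everything up front with functools.partial.
        call = partial(self._proxy._callmethod, method_name, args, kwargs)
        return await loop.run_in_executor(self._executor, call)


class _StubProxy:
    """Stand-in for a real BaseProxy (assumption, for demonstration only)."""

    def _callmethod(self, methodname, args=(), kwds={}):
        time.sleep(0.05)  # simulate the blocking round trip to the db process
        return (methodname, args)


async def main():
    with ThreadPoolExecutor(max_workers=4) as pool:
        db = AsyncProxyWrapper(_StubProxy(), pool)
        # Both calls block in worker threads; the event loop stays free.
        return await asyncio.gather(db.acall("get", "a"), db.acall("get", "b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each in-flight call still occupies one worker thread, so the size of the pool caps how many database requests can be outstanding at once.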

While it is true that the library uses blocking I/O here, you can't easily reach in and rework the blocking parts to use non-blocking primitives instead. If you insisted on going this route, you'd have to patch or rewrite internal implementation details of the library. Because these are internal, they can differ from one Python point release to the next, which makes any patch fragile and prone to breaking with minor Python upgrades. The _callmethod method is part of a deep hierarchy of abstractions involving threads, socket or pipe connections, and serializers. See multiprocessing/connection.py and multiprocessing/managers.py.

So your options here are to stick with your current approach (using a thread pool executor to shove BaseProxy._callmethod() to another thread) or to implement your own IPC solution using asyncio primitives. Your central database-access process would act as a server that your other processes connect to as clients, over either sockets or named pipes, using an agreed-upon serialisation scheme for client requests and server responses. This is what multiprocessing implements for you, but you'd implement your own (simpler) version, using asyncio streams and whatever serialisation scheme best suits your application patterns (e.g. pickle, JSON, Protocol Buffers, or something else entirely).
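To give a rough idea of the second option, here is a minimal sketch of a request/response protocol built on asyncio streams. Everything in it is an assumption made for illustration: the 4-byte length prefix, the JSON message shape (`op`, `key`, `value`, `result`), the names `serve_db` and `DBClient`, and the plain dict standing in for LevelDB. A real version would run the server in the database process and would need error handling and a serialisation format suited to binary keys and values.

```python
import asyncio
import json
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix


async def read_frame(reader):
    (length,) = HEADER.unpack(await reader.readexactly(HEADER.size))
    return json.loads(await reader.readexactly(length))


def write_frame(writer, message):
    payload = json.dumps(message).encode("utf-8")
    writer.write(HEADER.pack(len(payload)) + payload)


async def serve_db(store, host="127.0.0.1", port=0):
    """Start the database-side server; `store` is a dict standing in for LevelDB."""

    async def handle(reader, writer):
        try:
            while True:
                request = await read_frame(reader)
                if request["op"] == "get":
                    result = store.get(request["key"])
                elif request["op"] == "put":
                    store[request["key"]] = request["value"]
                    result = True
                else:
                    result = None  # unknown operation
                write_frame(writer, {"result": result})
                await writer.drain()
        except asyncio.IncompleteReadError:
            pass  # the client closed the connection
        finally:
            writer.close()

    return await asyncio.start_server(handle, host, port)


class DBClient:
    """Hypothetical client used by the other processes."""

    def __init__(self, reader, writer):
        self._reader = reader
        self._writer = writer
        self._lock = asyncio.Lock()  # one request in flight per connection

    @classmethod
    async def connect(cls, host, port):
        return cls(*await asyncio.open_connection(host, port))

    async def call(self, op, **params):
        async with self._lock:
            write_frame(self._writer, {"op": op, **params})
            await self._writer.drain()
            return (await read_frame(self._reader))["result"]

    async def close(self):
        self._writer.close()
        await self._writer.wait_closed()


async def demo():
    # Server and client share one process here only to keep the sketch runnable.
    server = await serve_db({})
    host, port = server.sockets[0].getsockname()[:2]
    client = await DBClient.connect(host, port)
    await client.call("put", key="a", value="1")
    value = await client.call("get", key="a")
    await client.close()
    server.close()  # stop accepting new connections
    return value


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this design a client waiting on the database only suspends a coroutine, so no thread is tied up per call. The lock keeps requests and responses paired on a single connection; allowing several requests in flight would require tagging each message with an id.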
