Asynchronous, Bi-Directional ONC-RPC: status + input requested

Derek Atkins warlord@MIT.EDU
02 Jan 2001 15:51:55 -0500


Hi.  Happy New Year.

[ Note: This is probably a bit rambling, as I've got a lot floating
around in my head and I wanted to get it down in text.  If, after
reading through this a few times you are still confused, please let me
know and I'll try to explain what I mean. -derek ]

As I've alluded to in previous messages, I've been working on a
multiplexed ONC-RPC implementation that allows you to provide
bi-directional call-reply semantics across a single (client-created)
TCP stream.  The goal was to create a means for a client to connect to
a server, make RPC requests, get RPC responses, and also enable the
server to make RPC requests (callbacks) to the connected client (maybe
even getting an RPC callback response, but maybe not).

The ONC RPC protocol definition allows such behavior, however the
standard implementation does not implement it.  The reasons are
obvious if you think about it.  Most RPCs are Synchronous calls (the
same is true of CORBA).  The standard RPC implementation makes the
call out to the network and then blocks waiting for the response.

A standard call (e.g. Split ** xaccQueryGetSplits(Query *)) would
look like this:

 Client                  Server
   call (arg)-------------->
   blocks                        makes local call()
  for reply                      gets local results
     <------------- (repl) response

The call has a timeout (so your client doesn't hang forever).  You can
specify an Asynchronous call by setting an explicit timeout of zero.
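In code, those blocking semantics look roughly like the sketch below. This is not the real clnt_call() machinery; a socketpair stands in for the TCP stream, rpc_call() and the wire format are made up for illustration, and poll() plays the role of the timeout:

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch only: send a request, then block waiting for the reply, up to
 * timeout_ms.  Returns 0 and fills *reply on success, -1 on timeout.
 * A zero timeout makes the call asynchronous: poll() returns at once
 * and we never look for the reply (real ONC-RPC reports this case as
 * RPC_TIMEDOUT). */
static int rpc_call(int fd, int arg, int timeout_ms, int *reply)
{
    if (write(fd, &arg, sizeof arg) != sizeof arg)  /* call (arg) ------> */
        return -1;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    if (poll(&pfd, 1, timeout_ms) <= 0)   /* block, waiting for the reply */
        return -1;                        /* timed out (or async call)    */
    if (read(fd, reply, sizeof *reply) != sizeof *reply)
        return -1;                        /* <--------- (repl) response   */
    return 0;
}
```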

As I said, the problem here is obvious: If you're sitting there in the
client waiting for a response, how do you handle an incoming callback?
A related problem is how to make a client callback from within a
service call (it's the same question posed from the other side).

 Client                 Server
   call (arg)------------>
  block                         makes local call.
 for reply                      makes callback
    <-----------------(arg) call
processes callback?             block, waiting for callback response
 response (repl) -------->
                                gets local results
     <------------- (repl) response
     
There are a few approaches that we can take:

  1) Use threads.  Assume you have a background thread whose job it
     is to read from the network and then "process" the incoming
     message.  You have one thread per side per connection.  In other
     words, the client would have one thread per server (normally 1),
     and the server would have one thread per client.

     A client writes out its request to the network and then
     blocks waiting for its response.  The background thread reads
     from the network connection, and will wakeup the client call
     thread when its reply arrives.  If an incoming message is a call
     request, the background thread calls out to the service dispatch
     to handle the call.

     This approach allows you to multiplex multiple requests and
     responses, but there is still a problem.  You see, a call request
     needs to block waiting for its response.  That's ok, except if
     you make a callback from a service, you wind up blocking the
     service (background) thread.  This means your background "reader"
     thread is now blocked waiting for someone to wake it up, however
     it's the thread that normally does the wakeup.  So you've got a
     deadlock.

     There are a number of solutions to this problem:

     a) Run each service call in its own thread.  However, this means
        we are even more tightly tied to threads than before, and we
        are 'forking' threads left and right for every call.  A bit
        expensive in my opinion, but certainly an option.

     b) Require all callbacks to be purely asynchronous.  This implies
        that a callback from a service never needs to block.

     c) Make sure you don't call back to your own client.  You can
        successfully call another client, just not your own, because
        each client has its own reader thread on the server.  You won't
        necessarily deadlock calling out to another client.

     d) Don't allow callbacks from within a service call.  This is a
        harsh restriction, and means the server process needs some
        other event-based model for calling back to clients.  Most
        likely the main process thread would need to watch a variable
        and then make callbacks when that variable changes.

     Unfortunately there is no way to really enforce b, c, or d.  An
     implementor may still shoot themselves in the foot.  However
     there is another option:

  2) Use a purely event-based model.  All calls are considered
     asynchronous.  There are no "replies."  This implies that there
     is no blocking of a call, because replies come in via callback
     'events' (i.e., a call from the other side).  The problem with
     this is that you cannot implement something like the existing
     xaccQueryGetSplits() function because you cannot block waiting
     for the response.

     This option is certainly much more difficult for someone to use,
     because you have to be aware that a function is an RPC (and not
     local).  This implies you have to split your processing into two
     parts, the setup/request processor and the response processor.
     (E.g., one function to setup your Query* and make your request,
     and then another function that processes the Split** that
     returns).  However, this model is also much more like XWindows,
     in that everything is an event.

So, I'd like to get opinions from people here.  I don't really know
what the "right thing" to do is.  Option 2 would be simple from an
infrastructure aspect, and would certainly be easy to implement.
However it would impose a relatively large architectural change to
GnuCash.  Most of the suboptions of Option 1 are a bit more difficult
to implement, but would require fewer changes to the core of GnuCash.
My personal preference would be option 1b or 1c, or option 2 (these
being the simplest to do).

I have no idea how hard it would be to change the GnuCash architecture
to have an event-based data retrieval model.

Comments?  Suggestions?

-derek
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available