[MUD-Dev] TECH: Single process v.s. multi process?

Tue May 28 07:47:32 CEST 2002

From: Philip Mak [mailto:pmak at animeglobe.com]

>  (1) Single process: A single process that listens for player
>  connections on a TCP port and handles everything.

>  (2) Multi process: Run xinetd/tcpserver to spawn a program for
>  each connection to a TCP port. These programs interact with some
>  centralized database/message queue.

>  (3) Multi thread: A single program that listens for player
>  connections and spawns a thread for each of them.

I'd like to add a (4) to that list if I may:

  (4) Multi-path, not "path-per-connection": It sounds like the
  desire to use multiple paths through the code is entirely "to
  defend against malicious softcode that consumes too much
  resources." Since a few users acting in a complex, mprog-heavy
  area could well be consuming _much_ of the system's resources, it
  would be nice if, as designers, we could make only _those_ users
  pay the lag-time penalty for their resource-heavy behavior. So a
  natural idea is to tie each user's resource usage to a given path
  of control through the code (a thread or process). However, the
  independence of those paths is limited: _every_ action that
  changes the state of the world (as visible to other players)
  _must_ update the database (whatever its representation) and in
  many (most?) cases display those changes to other users (the
  side effects of most database changes being messages to the other
  users who can 'see' the action).

So there is a definite limit on how fine-grained the resource use
may be.

On the other hand, some resource use is likely to be mostly
independent of changes to the database, and the classically
"expensive" actions likely fall in this category (script
interpretation; input-output handling--including things like dns
name resolution; memory reclamation--possibly, depending on garbage
collection & the language; & the ever infamous etc....). It would be
_nice_ then to parallelize some of these expensive areas, especially
in ways that make the most use of the hardware.

I'd argue for parallelization of the input/output processing using
either a reactive approach (the reactor pattern, most specifically a
thread-pooling reactor
http://www.cs.wustl.edu/~schmidt/PDF/reactor-siemens.pdf describes
it reasonably); or, if the design and the underlying OS/platform
allows it, a proactive approach (the proactor pattern, which can be
viewed as a reactive pattern with the reactor provided by the OS and
communication/notification of events given to the application by
asynchronous completion tokens
http://www.cs.wustl.edu/~schmidt/PDF/proactor.pdf describes it
reasonably).

In the first approach, n threads divide the active socket descriptor
list and provide event (de)multiplexing for the sockets, usually
using the select mechanism to determine whether data is available or
whether a socket is writable. These events get dispatched (usually
via callbacks that the application registers with the reactor).
Thus n threads share the input-output load. This is good
because it is possible to spread the load across as many paths of
control as there are processors in the machine (usually a good
number for best efficiency), and because expensive operations that
don't tie up the input-output threads often will allow at least some
input-output processing to continue (avoiding for instance
flow-controlling your players).
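
A minimal sketch of that first approach, in Python for brevity. The
names (Reactor, run_reactor_demo, on_readable) are made up for
illustration, socket pairs stand in for player connections, and the
standard selectors module stands in for a raw select call. Each
reactor owns its own slice of the descriptor list and runs in its
own thread:

```python
import selectors
import socket
import threading


class Reactor:
    """Demultiplexes events for its own slice of the descriptor list."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def register(self, sock, callback):
        # The application registers a callback for "readable" events.
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def handle_events(self, timeout=None):
        # Wait for ready sockets, then dispatch each event to its callback.
        events = self.selector.select(timeout)
        for key, _mask in events:
            key.data(key.fileobj)
        return len(events)

    def close(self):
        self.selector.close()


def run_reactor_demo(n_threads=2, n_players=4):
    received = []
    lock = threading.Lock()

    def on_readable(sock):
        data = sock.recv(1024)
        with lock:
            received.append(data)

    # n reactors divide the descriptor list round-robin.
    reactors = [Reactor() for _ in range(n_threads)]
    pairs = [socket.socketpair() for _ in range(n_players)]
    counts = [0] * n_threads
    for i, (server_side, client_side) in enumerate(pairs):
        server_side.setblocking(False)
        reactors[i % n_threads].register(server_side, on_readable)
        counts[i % n_threads] += 1
        client_side.sendall(b"look %d\n" % i)  # pretend a player typed

    def loop(reactor, expected):
        handled = 0
        for _ in range(20):  # bounded so the demo cannot hang
            if handled >= expected:
                break
            handled += reactor.handle_events(timeout=0.5)

    threads = [threading.Thread(target=loop, args=(r, c))
               for r, c in zip(reactors, counts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    for reactor in reactors:
        reactor.close()
    for server_side, client_side in pairs:
        server_side.close()
        client_side.close()
    return sorted(received)
```

Calling run_reactor_demo() returns the four "look" lines, each read
by whichever reactor thread owned that socket. A real server would
loop forever and hand expensive work to other threads instead of
doing it inside the callback.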

However, the drawback is that the socket descriptor list cannot
usually be shared well, so while expensive post-select operations
can be performed by one of the reactor's thread-pooled threads, if
the socket descriptor list is large, the selection operation itself
becomes a sticking point. This leads to the second approach.

In the second approach, an underlying OS mechanism (POSIX AIO and
Win32's asynchronous I/O are the only two examples I know of) undertakes to
provide an asynchronous operation processor, providing asynchronous
operations to the application, and notifying the application's
asynchronous operation completion dispatcher (sorry, I know, but
those are the terms used in the second paper linked above--page 6
shows the pattern--and I didn't want to get off-topic too far). So
the grunt work of event (de)multiplexing is undertaken by the OS and
the events themselves are all that is reported to the application.

Where possible, this second approach often provides the greatest
flexibility and performance; at the cost of being evil and
un-intuitive to debug.
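
To show just the shape of that second approach, here is a rough
Python sketch. Big assumption, stated loudly: a thread pool plays the
part of the OS's asynchronous operation processor, so this is a
simulation of the pattern's roles and not real POSIX AIO or Win32
asynchronous I/O. All names (AsyncOperationProcessor,
CompletionDispatcher, the token dictionary) are hypothetical. The
point is that the application starts an operation, gets control back
immediately, and later is told "this finished, here is the data"
rather than "this socket is ready":

```python
import queue
import socket
from concurrent.futures import ThreadPoolExecutor


class CompletionDispatcher:
    """Hands finished operations back to the application's handlers."""

    def __init__(self):
        self._completions = queue.Queue()

    def post(self, token, result):
        self._completions.put((token, result))

    def dispatch_one(self, timeout=1.0):
        # The token carries whatever the handler needs to resume work.
        token, result = self._completions.get(timeout=timeout)
        token["handler"](token["who"], result)


class AsyncOperationProcessor:
    """Stand-in for the OS (assumption): runs reads off the caller's thread."""

    def __init__(self, dispatcher, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._dispatcher = dispatcher

    def async_read(self, sock, nbytes, token):
        # Start the read and return at once; completion is posted later.
        future = self._pool.submit(sock.recv, nbytes)
        future.add_done_callback(
            lambda f: self._dispatcher.post(token, f.result()))

    def shutdown(self):
        self._pool.shutdown(wait=True)


def run_proactor_demo():
    completed = []

    def on_read_complete(who, data):
        completed.append((who, data))

    dispatcher = CompletionDispatcher()
    processor = AsyncOperationProcessor(dispatcher)
    ours, theirs = socket.socketpair()
    try:
        token = {"who": "alice", "handler": on_read_complete}
        processor.async_read(ours, 1024, token)  # returns immediately
        theirs.sendall(b"north\n")               # the "player" types
        dispatcher.dispatch_one()                # a completion, not readiness
    finally:
        processor.shutdown()
        ours.close()
        theirs.close()
    return completed
```

run_proactor_demo() hands back the one completed read for "alice".
With a real OS facility the thread pool disappears and the kernel
does the waiting, which is where the performance win would come
from.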

With the IO parallelized, the next question is the database access.
Now, while this probably doesn't mean database as in dbms, it
probably does mean something like "large memory space devoted to
user data structures".  I'd argue that if this is the case, then
providing threads to access it is a bad idea; rather, providing a
good set of locking mechanisms (fine and coarse grained) and
possibly a few coarse-grained traversal type functions would be
plenty of work. For instance, many 3D graphics engines work from the
concept of a "scene graph" that is much like the runtime object data
maintained by most mu*'s. Traversals of the graph for updates and
rendering (which in the case of a mu* would be something perhaps
like per-pulse processing) may require a certain variety of lock on
all or a portion of the database. Investing some time in creating or
adapting a balanced Readers/Writers locking mechanism (where usually
n readers, or 1 writer may freely access data, and awaiting threads
are assigned either "on the coattails" to gain access--for instance
an arriving reader sneaks in on the coattails of the readers
currently holding the lock--or in a first-come-first-serve
situation) would probably provide most of the locking types needed
for this kind of use.
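
As a sketch of what such a lock might look like (Python again, names
hypothetical), the class below allows n readers or 1 writer. With
coattails=True an arriving reader joins the readers already holding
the lock; with coattails=False arriving readers queue behind any
waiting writer, which only approximates first-come-first-serve and
is a simplification on my part:

```python
import threading
from contextlib import contextmanager


class ReadersWriterLock:
    def __init__(self, coattails=True):
        self._cond = threading.Condition()
        self._readers = 0           # readers currently inside
        self._writer = False        # True while a writer is inside
        self._waiting_writers = 0
        self._coattails = coattails

    @contextmanager
    def reading(self):
        with self._cond:
            # Without coattails, readers defer to any waiting writer.
            while self._writer or (
                    not self._coattails and self._waiting_writers):
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def writing(self):
        with self._cond:
            self._waiting_writers += 1
            try:
                while self._writer or self._readers:
                    self._cond.wait()
            finally:
                self._waiting_writers -= 1
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


def run_lock_demo(writers=4, increments=50):
    lock = ReadersWriterLock(coattails=False)
    world = {"gold": 0}  # stand-in for the runtime object database
    seen = []

    def writer():
        for _ in range(increments):
            with lock.writing():
                current = world["gold"]
                world["gold"] = current + 1

    def reader():
        for _ in range(increments):
            with lock.reading():
                seen.append(world["gold"])

    threads = [threading.Thread(target=writer) for _ in range(writers)]
    threads += [threading.Thread(target=reader) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return world["gold"]
```

A per-pulse traversal would take lock.reading() around the walk,
while anything that mutates the graph would take lock.writing().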

After those two points, what's left is loosely the "business logic"
of the mu*. Things like the script interpreting, the different "game
loops", output generation, & etc live in this world. And it's here
that I'd say is the best place to use parallelization to combat the
types of resource-consumption that were the focus of the original
post (sorry, it's way too early to write things to the point, not
enough caffeine). In these cases, I'd say there are some rules of
thumb: parallelize the game loops; try to use dedicated threads or
a thread pool to do some of the expensive things (like script
interpreting); and try always to keep in mind that you _want_ the
expensive operations to cause pauses for only those users really
dependent on the output of the expensive operation. An active-object
patterned approach (message queues to communicate into the thread,
asynchronous completion tokens to wait on results) is a good means
to some of the design, and gets a pretty pdf here
(http://www.cs.wustl.edu/~schmidt/PDF/Act-Obj.pdf).
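
A minimal sketch of the active-object idea, assuming a dedicated
script-interpreter thread. ScriptInterpreter, CompletionToken and the
toy _interpret function (it just sums integers) are all invented for
illustration and are not from the paper. Callers drop a request on
the message queue and get a token back; only the caller who actually
needs the result waits on it:

```python
import queue
import threading


class CompletionToken:
    """Lets one caller wait for one result without blocking anyone else."""

    def __init__(self):
        self._done = threading.Event()
        self._value = None
        self._error = None

    def complete(self, value=None, error=None):
        self._value, self._error = value, error
        self._done.set()

    def wait(self, timeout=None):
        if not self._done.wait(timeout):
            raise TimeoutError("script still running")
        if self._error is not None:
            raise self._error
        return self._value


class ScriptInterpreter:
    """Active object: requests go into a queue, one thread executes them."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def evaluate(self, player, script):
        # Returns immediately; the work happens on the interpreter thread.
        token = CompletionToken()
        self._requests.put((token, player, script))
        return token

    def stop(self):
        self._requests.put(None)  # sentinel: shut the thread down
        self._thread.join()

    def _run(self):
        while True:
            request = self._requests.get()
            if request is None:
                break
            token, player, script = request
            try:
                token.complete(value=self._interpret(player, script))
            except Exception as exc:
                token.complete(error=exc)

    @staticmethod
    def _interpret(player, script):
        # Placeholder for real softcode: add up the integers in the script.
        return sum(int(word) for word in script.split())
```

For example, token = interp.evaluate("alice", "1 2 3") returns at
once and token.wait(timeout=2.0) later yields 6, while other players'
input keeps flowing in the meantime. A runaway script stalls only the
players waiting on that interpreter's tokens.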

Above all, two points (which I guess I should have put first, but
like I said, not enough trimethylxanthine (caffeine) yet):

  1) Parallelization is hard to get to its 'most efficient' and
  threads/processes are expensive. Any situation where you find the
  design 'calling for' a thread-per-user is suspect.

  2) Parallelization framework code is ugly, platform specific, and
  a nightmare to write/maintain; for this reason it is a _very_ good
  idea to find a third-party framework that suits your needs instead
  of "rolling your own". Fortunately, some good frameworks exist,
  and a few are free (as in beer). In five years of maturation (mine
  and its) I haven't found a better freely available framework for
  multithreading & communications than the ACE Libraries (Adaptive
  Communications Environment:
  http://www.cs.wustl.edu/~schmidt/ACE.html); but, of course ymmv.

Good luck, and I hope something in this rambling mess helped.

-Dave

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev