[MUD-Dev] Distributed Processing

Gaffney Gaffney
Tue Feb 27 19:29:49 CET 2001


> -----Original Message-----
> From: John Buehler [mailto:johnbue at msn.com]
> Sent: Monday, February 26, 2001 4:01 AM
> To: MUD-Dev
> Subject: [MUD-Dev] Distributed Processing
> 
> 
> Here's a new screwball idea for folks to munch on from the 'JB Files'.
> 
> Organize a game's server farm along skill lines.
> 
> This means that the algorithms and data devoted to a certain skill in
> the game world are all on a machine or machines used only for that
> skill.  For example, archery, or tracking, or baking, or some the
> blacksmithing skills, etc.
> 
> Such an approach seems to have interesting features:
 
Distributed processing on MUDs/MMPs is a generically interesting
proposition, but what you will find after examining the problem for a
while is that your theoretical bottleneck is provably not CPU load,
memory usage, or any per-machine spec, but rather inter-server
bandwidth (especially user-MUX-DEMUX) for any interesting application.
Unfortunately, doing a split by skills/algorithms means that you will
rapidly have a case that is intractable in its usage of bandwidth
(since presumably any geographical or character result of the skill
usage will need to be sent to the server owning the area/character).
This implies that it will contribute to your worst case.  Of course,
it's worth noting that providing content to satisfy 10,000 users is a
harder problem than providing a server architecture to support 10,000
users, so it's possible most folks don't care about colliding with
theoretical maxima which become an issue at around that point.

The inter-server bandwidth bottleneck may be non-obvious - especially
since there are several games out there not currently bound by the
theoretical/global maxima but are rather by local ones.  As a simple
explanation of why; consider that there is nothing in theory to keep
you from adding more CPUs, memory, hard drives or other hardware.
However, each new piece of hardware needs to communicate with some
subset of the other CPUs/boxes/etc (in implementation, this isn't
always via network, but could be by shared busses, files, or other
things) Thus, this communication (which in worst case is an O(n^2)
proposition) is the one bottleneck which will prevent arbitrary
growth.  In an optimally arranged setting, where each world server
needs to talk to some small finite number (especially zero!) of other
servers, if the communications path is shared with other servers
communicating between themselves, it should become obvious that the
shared pipe will become the single point of limitation - and, in fact,
in every system I know running today, this would be the limit.  If the
communications path is not shared (i.e. each server has a devoted
connection to each other server with which it needs to communicate)
then how to distribute events or results to the clients of a given
server is an interesting (though not intractable) proposition.
Usually this is done through a central router, or through a set of
annex/entry point processes - thus my contention above, that after
arriving at an optimal distribution of daisy-chained servers, getting
users or packets to or from the server will still be an O(f(N))
operation.

I'm not privy to EQ's architecture (though I can extrapolate much of
it from having /played time of 60 days or so - thanks for setting AC
and UO2/UWOO back, Brad & co.;) but their zone structure is a great
one for minimizing inter-server bandwidth; unless there is some design
inefficiency in their packet routing (i.e. a centralized router or
annex server setup) or shared file/database access, they should have
very little to keep them from expanding a given server farm
arbitrarily, since the most a server would need to communicate is when
a player crosses from one to the next.  In AC/UO (to which I'm more
privy) where you can see across server boundaries, you'd see
inter-server bandwidth be the limitation if you tried to throw an
arbitrary amount of hardware at a server farm to try and get your
per-server user count up above the 10000 person limit or so (I could
guess at some earlier issues based on file distribution and routing in
UO, on AC...  I can't recall any particular design limitation from the
architecture, but my memory has faded over the last few years.)

Of course, a more challenging project is coming up with content to
satisfy that many users.  Or a system of socialization which provides
sufficient organization to take us from our current "village"
population levels up to a reasonable "city" population level without
the game feeling too impersonal - it could be argued that social
organization could be one of the big areas of near-future development
for MMPs, ala AC's Allegiance Hierarchy or Shadowbane or others'
squad, battalion, or company level fighting dynamics.  The allegiance
system was designed initially to encourage large-scale
communication/interaction, all the better to take advantage of the
Turbine Engine's scaled architecture - in practice, it didn't turn out
quite that way, but that doesn't mean it's impossible, just hard.

Since the reason you'd look to a distributed system is to either get
more people in a single shard, or more efficiently spread the load
across an existing shard, I couldn't resist commenting - it's a
fascinating problem, but unfortunately one that really warrants (IMO)
examination after we solve the social/ content dilemmi.  If anyone
invents a time machine, please go pop back to '94 and let me (and the
other Turbinites of the time) know that - would have helped a lot. :)

For those not dissuaded, you'll find that the dynamic load
distribution problem reduces (if approached as a geometrical
distribution) to a weighted edge graph cutting algorithm (weight being
potential interserver traffic).  This is an NP-complete algorithm,
with some polynomial time approximations currently being examined in
the distributed systems literature (based on what your edge weight
approximation metric is, of course, which warrants a research paper in
and of itself).  Given that you can have quite successful games
without doing this (nasty, multi-year, bleeding-edge) research...  I
might advise against doing it unless there is grant money involved.
Please do prove me wrong, however :)

-jg

(Er, to summarize the above if it is unreadable technobabble (my
fault, not yours!) - Dynamic load balancing is hard, both in CPU cost
and brain strain.  But cool.  If you wanted to make a Gibson-style
Cyberspace (i.e.  geographically distributed virtual space) you'll
find it to be an identical problem, which is rather interesting.  So
if you ignore me, and solve this, go make your MUD be the basis for
the next rev of the WWW.:)

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev



More information about the mud-dev-archive mailing list