[MUD-Dev] Architecture (Cell Rebalancing)

J C Lawrence claw at kanga.nu
Thu Jul 3 09:04:56 CEST 2003


On Thu, 3 Jul 2003 12:06:38 +0200 
Rossmann Peter <peter.rossmann at siemens.com> wrote:
> On Wed, 02 Jul 2003 13:59:40 -0700 ceo <ceo at grexengine.com> wrote:

> I haven't been thinking that deeply about your words (because I was
> confused by why you mentioned working sets) until now:

>   My idea of working sets is that I would never see any page paged...

Page swapping is not the only impact of working sets.  The rate of
working set migration is a primary factor in the effectiveness of the
dcache.  In the multi-processor cases the rate of working set collision
and the rate of working set migration are primary factors in memory bus
saturation and cache invalidation (this is one of the more common
reasons that shared memory based systems can run dramatically slower on
SMP systems).

All memory is not equal.  Take a simple example: 

  1) Allocate a block of, say, 2 GiB.

  2) Traverse the block, writing one byte to every memory page in the
  block.  (If you're unsure of the page alignment of your block, just
  write a byte every N bytes, where N is a bit smaller than your system's
  memory page size.)  Note: it's better if you randomise the order of
  pages touched, while ensuring that all pages are touched, rather than
  hitting proximal pages in sequence.

  3) Time how long that takes.

  4) Now allocate a block which is a little less (128 bytes?) than one
  memory page long (due to random alignment it will likely cross two
  pages).  The slightly smaller than page size is to accommodate sbrk()
  overhead which could stretch the block across three pages.

  5) Write the same number of bytes as you wrote in #2 to this new
  block, but to (random?) locations in this smaller block.

  6) Time how long that takes.

Don't waste your TLB or CPU caches.

> Using shared memory, you need to optimize data layout so as to
> minimize cache misses.

This is tough.  In the SMP case the simple fact of tracking and
maintaining coherency for a shared memory segment between the CPU caches
is a non-trivial overhead.  (Intel systems do particularly badly in this
space.)

> Otherwise, your program, even though it is expressed in a concurrent
> fashion, could run very very slow.

Quite.

--
J C Lawrence                
---------(*)                Satan, oscillate my metallic sonatas. 
claw at kanga.nu               He lived as a devil, eh?		  
http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live.

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev
