Thanks! All new results added.
joerg, did you see no benefit from compiling with -IPA and/or -LNO? I did, but only sometimes. Depended on the system.
Btw, interesting to see just how much communications overhead is involved for the shorter scene test on your system.
Compared to linear scaling over a single R14K/600, your O3400 is only 2.8% slower than ideal for the longer high-res
scene test (the one test that involves some main RAM access), 7.8% slower for complex sphfract, 10% slower for normal
sphfract, but a much larger 25% slower than ideal for the short scene test.
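For anyone wanting to work out the same kind of "slower than ideal" figure for their own system, here is a rough Python sketch. It assumes the figure means how much longer the measured time is than the single-CPU time divided by the number of CPUs. The function name and the timings are made up for illustration and are not taken from the results table.

```python
def percent_slower_than_ideal(single_cpu_seconds, n_cpus, measured_seconds):
    """Percentage by which a measured run time exceeds perfect linear scaling.

    Assumption: 'ideal' is the single-CPU time divided by the CPU count.
    """
    if n_cpus < 1 or single_cpu_seconds <= 0:
        raise ValueError("need at least one CPU and a positive reference time")
    ideal_seconds = single_cpu_seconds / n_cpus
    return (measured_seconds - ideal_seconds) / ideal_seconds * 100.0


# Hypothetical timings, NOT real benchmark results:
# one CPU takes 320 s, a 32-CPU system takes 12.5 s.
print(percent_slower_than_ideal(320.0, 32, 12.5))
```

A result of 0 would mean perfect linear scaling. Short tests tend to give a larger figure because the fixed startup and communications cost is a bigger share of the total run time.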
This suggests the overall throughput of a system like O3400 would be better demonstrated by a more real-world test,
ie. a more complex scene render, or multiple frames. Thus, definitely a good idea to try the Blender test; full details on how
to run the test are on my site, but briefly: download Blender 2.44 and the test scene file, install Blender in /usr/local/bin,
make sure the Blender directory is added to your path, turn off all unnecessary background daemons (I shut down mediad,
httpd, nsd, ipaliases, nfs, timed, etc.), if the system has no gfx then rlogin from elsewhere (gbit link is best of course), run
up Blender, load the scene file, change the number of threads to 8, press F12 to begin the render. Blender's renderer is a
little odd - you won't see the multiple threads running until part or all of the way through the processing of the first
subsection of the scene. Once finished, note the full description of the elapsed time and PM it to me (don't post here, that
would be off-topic). Since your system has 32 CPUs and the test uses 8 threads, it could in principle run four such renders
at once, so its effective throughput would be about 4X the final result, ie. when rendering multiple frames for a complete animation.
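The 4X figure is just 32 CPUs divided by 8 threads per render. A rough Python sketch of that arithmetic follows, assuming each 8-thread render takes the same time whether or not other renders are running alongside it (in practice memory and I/O contention may reduce this). The function names, frame count and per-frame time are hypothetical.

```python
def concurrent_renders(total_cpus, threads_per_render):
    """Number of independent renders that fit on the machine at once."""
    return total_cpus // threads_per_render


def animation_seconds(n_frames, seconds_per_frame, total_cpus, threads_per_render):
    """Estimated wall-clock time for an animation rendered in parallel batches.

    Assumption: concurrent renders do not slow each other down.
    """
    slots = concurrent_renders(total_cpus, threads_per_render)
    if slots < 1:
        raise ValueError("not enough CPUs for even one render")
    batches = -(-n_frames // slots)  # ceiling division
    return batches * seconds_per_frame


# Hypothetical example: 100 frames at 60 s each, 32 CPUs, 8 threads per render.
print(concurrent_renders(32, 8))
print(animation_seconds(100, 60.0, 32, 8))
```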