Blender rendering speed tips (for multi cpu)

fzalfa
Posts: 739
Joined: Sun Jun 19, 2005 11:38 am
Location: avignon ,provence, france

Post by fzalfa » Mon Jan 01, 2007 5:06 am

Hello

I tried the latest Blender build provided in the previous topic; support for up to 8 CPUs is great.

I could not resist running some benchmarks, and I observed some strange multithreading behaviour.

I used the same test scene on the Octane2 (dual R12k) and the O2k (8x R10k 250).

The first run on the O2k took about 1m17s75... slower than my Octane2?!

The parameters were: 8 threads, render tiles 4x4.

At the beginning of the render all the CPUs are loaded, and as the render progresses the number of CPUs in use gradually drops. Why?

At the beginning the render was very fast, but once most of the tiles were done it slowed down seriously.

So I found a solution: I set the render tile grid to a multiple of the number of CPUs. Here is the result with this change:

8 threads, render tiles 8x8: done in 25s63... awesome, no?

From the start to the end of the render, all the CPUs are used.

regards & happy new year

Laurent
SGI or die !!!
:O2: :Octane2: :Octane: :Indigo2IMP: :Indigo2IMP: :Indigo: :Indigo: :Indy: :PI: :Crimson: :PWRSeries: :Onyx: :O2000R:
HP proliant DL 585 Quad Opteron dual core 2.5Ghz 16Gb

squeen
Moderator
Posts: 2932
Joined: Fri May 09, 2003 6:10 am
Location: Maryland, USA

Post by squeen » Mon Jan 01, 2007 5:30 am

Cool workaround. Have you told the Blender team?

fzalfa
Posts: 739
Joined: Sun Jun 19, 2005 11:38 am
Location: avignon ,provence, france

Post by fzalfa » Mon Jan 01, 2007 9:53 am

Hmm, not for now...

I think it should be possible to make a version for more CPUs, no?
I'm sure I can get some more speed with more than 8 threads on 8 CPUs...

I do this on dual (or more) CPU configurations: you can push the thread count to twice the real CPU count.

I gain some 10-20% in rendering time on a dual-CPU Octane with 4 threads, because the Blender rendering engine assigns fixed image parts to each thread, and once a part is finished the thread is not reassigned to an unfinished part. So the CPU sits idle, what a waste of power... so more threads keep your CPU(s) fed.
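
A toy model of the behaviour described above, offered as a sketch only: the thread does not confirm that Blender's renderer really assigns parts this way. It compares a fixed up-front assignment of tiles to threads with a shared queue where each free thread takes the next tile. The function names and the per-tile costs are hypothetical and chosen purely for illustration.

```python
import heapq

def static_makespan(costs, workers):
    """Finish time when tiles are pre-assigned round-robin and never reassigned."""
    loads = [0.0] * workers
    for i, cost in enumerate(costs):
        loads[i % workers] += cost
    return max(loads)

def dynamic_makespan(costs, workers):
    """Finish time when each worker pulls the next tile as soon as it is free."""
    free_at = [0.0] * workers  # min-heap of the times at which workers go idle
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)

# Hypothetical render cost per tile (seconds): a few expensive tiles among cheap ones.
costs = [8, 1, 1, 1] * 4

print(static_makespan(costs, 4))   # 32.0: one worker is stuck with every expensive tile
print(dynamic_makespan(costs, 4))  # 14.0: idle workers keep picking up remaining tiles
```

In this model the total work is 36 seconds, so 4 workers could at best finish in 9 seconds. The fixed assignment leaves three workers idle for most of the run, which matches the idle CPUs described above.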

laurent

Vagabondo
Posts: 337
Joined: Tue Sep 06, 2005 7:05 pm
Location: Saint Louis

Post by Vagabondo » Mon Jan 01, 2007 3:35 pm

I concur...just finished installing the new blender...

Using "test.blend" as instructed on the benchmarking site, but running 8 threads and 8*8 matrix, I achieved a 0:59.77 for a 32p O3800...

Top score on the benchmarking list was 2:01...

This is very promising. I don't know how difficult it is to increase the number of threads... hope it can be done relatively easily... I would definitely like to see what would happen if I could run 64 threads simultaneously!! Not sure where the point of diminishing returns from scheduling is, but it would definitely bring new life to those of us with >8p iron!!

Kudos for achieving this level of multi-threading. I don't want to appear demanding, since this is a great step forward. Thanks to all who have worked to make this possible :D :D

Never mind... looks as though the New Year's rebound got me: a quad 3GHz (dual dual-core) with 16GB RAM posted 2.XX SECONDS! Oh well... still room for improvement though :wink:
Last edited by Vagabondo on Mon Jan 01, 2007 3:46 pm, edited 1 time in total.

Vagabondo
Posts: 337
Joined: Tue Sep 06, 2005 7:05 pm
Location: Saint Louis

Post by Vagabondo » Mon Jan 01, 2007 3:38 pm

@fzalfa...what file were you using to benchmark your test??

joerg
Posts: 2223
Joined: Thu Jan 08, 2004 6:57 am
Location: In an origin rack - Germany

Post by joerg » Mon Jan 01, 2007 4:29 pm

Vagabondo wrote:@fzalfa...what file were you using to benchmark your test??


I think the file is from this website: http://www.eofw.org/bench/

regards
Joerg

fzalfa
Posts: 739
Joined: Sun Jun 19, 2005 11:38 am
Location: avignon ,provence, france

Post by fzalfa » Tue Jan 02, 2007 1:49 am

I posted the O2k test result on that site...
but I did the Octane2 / O2k comparison with a personal scene.


Laurent

Tartiflette
Posts: 5
Joined: Tue Jan 02, 2007 12:32 pm
Location: Uzes, France

Post by Tartiflette » Tue Jan 02, 2007 12:40 pm

joerg wrote:
Vagabondo wrote:@fzalfa...what file were you using to benchmark your test??


I think the file is from this website: http://www.eofw.org/bench/

regards
Joerg


WOW !!

The O2k is very impressive, if you ask me! 8)

If you consider as well that the processors in there are very old now, it's even more impressive!
The R10k is about twice as fast as the so-called "revolutionary" Core Duo on a per-MHz basis!!! :shock:

It's even more depressing now to think where SGI is...
BTW, the fact that Blender still runs on SGI is a good thing !

Regards,
Laurent aka Tartiflette :)

P.S. : And hello to all the members of the board, it's my first post here ! :D

Vagabondo
Posts: 337
Joined: Tue Sep 06, 2005 7:05 pm
Location: Saint Louis

Post by Vagabondo » Tue Jan 02, 2007 12:41 pm

Well, considering that the O3800 is only running 8 threads when it could easily handle 4x that at one per CPU, and it came in equivalent to a quad-core Opteron running at 2.6GHz... not too shabby... plus, I can use it to dry my hair in the morning :wink: .

Can't wait to see whether the 8-thread capability was a proof of principle that can simply be extended... just hand the threads to the OS and let it take care of scheduling and resources.

Welcome Tartiflette!! The more the merrier :D

Tartiflette
Posts: 5
Joined: Tue Jan 02, 2007 12:32 pm
Location: Uzes, France

Post by Tartiflette » Tue Jan 02, 2007 1:07 pm

Vagabondo wrote: Well, considering that the O3800 is only running 8 threads when it could easily handle 4x that at one per CPU, and it came in equivalent to a quad-core Opteron running at 2.6GHz... not too shabby... plus, I can use it to dry my hair in the morning :wink: .

Can't wait to see whether the 8-thread capability was a proof of principle that can simply be extended... just hand the threads to the OS and let it take care of scheduling and resources.

Welcome Tartiflette!! The more the merrier :D


From what I've seen digging in the Blender code while making my own build on Mac OS X, you can raise the number of threads in the code as you wish, but I've read that the scaling isn't that good past a certain number of threads...

But I'm not even a programmer, so I don't know for sure whether that's true. :?:

But it would be cool to see your O3800 run the bench at full throttle, just to know what to expect from such a beast! :D

It's good to see that those processors and systems were really so good in the past, and that they can still compete with state-of-the-art computers! It says a lot about what was possible back in those days... 8)

Anyway, I hope that Blender will live for a long time and that it will still be available for the SGI platform.


And thank you for the welcome! :wink:


Regards,
Laurent aka Tartiflette :D

Jarndyce911
Posts: 240
Joined: Thu Jan 15, 2004 2:00 pm
Location: Silicon Valley

Post by Jarndyce911 » Sat Jan 06, 2007 10:34 am

Here is the newly compiled version with a maximum of 1024 threads:

http://www.jarndyce.org/blender-2.43RC1 ... ps.tar.bz2

I also updated the other topic, so check it out:

viewtopic.php?t=11942&start=15

Let me know if you have comments or questions!

Thanks
Charles

joerg
Posts: 2223
Joined: Thu Jan 08, 2004 6:57 am
Location: In an origin rack - Germany

Post by joerg » Sat Jan 06, 2007 12:28 pm

Sorry... but it's not working with more than 8 threads. It uses at most 790% CPU time.


But I have some questions.

Why does it always create /tmp/0001.jpg when I specify -F PNG?

How do I set the tile grid when using only the command line?

Code:

./blender -t 24 -F PNG -g  -b test.blend -f 1


On a multi-CPU system you can create a cpuset and combine it with a simple shell wrapper:

Code:

[o2k]:~/blender-2.43RC1-irix-6.5-mips $ cpuset -q blender -A ./bench.sh 8
guessing './blender' == '/usr/people/beh/blender-2.43RC1-irix-6.5-mips/./blender'
Compiled with Python version 2.4.
Checking for installed Python... got it!
Fra:1 Mem:3.54M Sce: Scene Ve:998 Fa:985 La:1
Fra:1 Mem:23.25M | Part 1-16
Fra:1 Mem:22.82M | Part 5-16
Fra:1 Mem:23.25M | Part 2-16
Fra:1 Mem:23.25M | Part 6-16
Fra:1 Mem:23.19M | Part 4-16
Fra:1 Mem:23.19M | Part 3-16
Fra:1 Mem:23.25M | Part 7-16
Fra:1 Mem:23.25M | Part 9-16
Fra:1 Mem:23.25M | Part 8-16
Fra:1 Mem:21.84M | Part 11-16
Fra:1 Mem:20.42M | Part 10-16
Fra:1 Mem:19.00M | Part 13-16
Fra:1 Mem:17.59M | Part 12-16
Fra:1 Mem:16.11M | Part 14-16
Fra:1 Mem:14.69M | Part 15-16
Fra:1 Mem:13.27M | Part 16-16
Saved: /tmp/0001.jpg Time: 02:12.91


regards
Joerg
Last edited by joerg on Sat Jan 06, 2007 11:25 pm, edited 1 time in total.

Jarndyce911
Posts: 240
Joined: Thu Jan 15, 2004 2:00 pm
Location: Silicon Valley

Post by Jarndyce911 » Sat Jan 06, 2007 1:30 pm

Hmmm, I will have to look into it more tomorrow (the computer is shut down until tomorrow morning). There must be a setting for it in another place.

As for the command line tile set, I will have to check on that as well.

Thanks for the feedback, I appreciate it.
Charles

Vagabondo
Posts: 337
Joined: Tue Sep 06, 2005 7:05 pm
Location: Saint Louis

Post by Vagabondo » Sat Jan 06, 2007 6:55 pm

Unfortunately, I can confirm that the max of 8 threads still remains.

Installed the newest compile... It does allow you to SET >8 threads, but the setting does not carry through to the process.

One step in the right direction though.

Thanks Jarndyce for the extra effort :wink:

deBug
Posts: 840
Joined: Mon Feb 27, 2006 1:44 pm
Location: Sweden

Re: Blender rendering speed tips (for multi cpu)

Post by deBug » Sat Jan 06, 2007 7:12 pm

fzalfa wrote: Hello

I tried the latest Blender build provided in the previous topic; support for up to 8 CPUs is great.

I could not resist running some benchmarks, and I observed some strange multithreading behaviour.

I used the same test scene on the Octane2 (dual R12k) and the O2k (8x R10k 250).

The first run on the O2k took about 1m17s75... slower than my Octane2?!

The parameters were: 8 threads, render tiles 4x4.

At the beginning of the render all the CPUs are loaded, and as the render progresses the number of CPUs in use gradually drops. Why?

At the beginning the render was very fast, but once most of the tiles were done it slowed down seriously.

So I found a solution: I set the render tile grid to a multiple of the number of CPUs. Here is the result with this change:

8 threads, render tiles 8x8: done in 25s63... awesome, no?

From the start to the end of the render, all the CPUs are used.

regards & happy new year

Laurent


Am I missing something deeper here?
Because to me it looks obvious what's going on.

When there are fewer than 8 tiles left, some CPUs will be unused because there are no more tiles for them to pick up, right?
So at the end there will always be unused CPUs.
As the tiles complete, more and more CPUs go idle until just one CPU is rendering the last tile.
So the total amount of processing power goes down once fewer than 8 tiles are left, right?

So the trick is to make enough tiles in the job that the last 8 tiles are a very small percentage of the total, and so affect the job the least.

In your first example with 4x4 tiles (16), the last 8 tiles are half of the entire job, which has a big impact.
But in your second example with 8x8 tiles (64), the last 8 tiles are just 1/8 of the entire job, which has less impact.

So you're right that more tiles are better, but I don't believe the count has to be a multiple of the number of CPUs. For example, a job with 6x10 tiles is probably as fast as a job with 8x8 tiles.
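
The arithmetic above can be checked with a short sketch. It computes the share of the job made up by the last 8 tiles for each grid, and then runs a toy simulation in which each free CPU takes the next tile. Everything in it is an assumption made for illustration: the 240-pixel image, the cost model (left half of the image three times as expensive as the right) and the function names are hypothetical, and none of it is taken from Blender itself.

```python
import heapq

SIZE = 240  # hypothetical image edge in pixels; divisible by 4, 6, 8 and 10

def pixel_cost(x, y):
    """Toy cost model (an assumption): the left half of the image costs 3x as much."""
    return 3 if x < SIZE // 2 else 1

def tile_costs(nx, ny):
    """Render cost of each tile of an nx-by-ny grid, in row-major order."""
    w, h = SIZE // nx, SIZE // ny
    return [
        sum(pixel_cost(x, y)
            for y in range(ty * h, (ty + 1) * h)
            for x in range(tx * w, (tx + 1) * w))
        for ty in range(ny)
        for tx in range(nx)
    ]

def tail_fraction(nx, ny, workers):
    """Share of all tiles represented by the last `workers` tiles."""
    return min(workers, nx * ny) / (nx * ny)

def utilization(nx, ny, workers):
    """Fraction of available CPU time spent rendering when free workers pull tiles."""
    costs = tile_costs(nx, ny)
    free_at = [0] * workers  # min-heap of the times at which workers go idle
    heapq.heapify(free_at)
    for cost in costs:
        heapq.heappush(free_at, heapq.heappop(free_at) + cost)
    makespan = max(free_at)
    return sum(costs) / (workers * makespan)

for nx, ny in [(4, 4), (8, 8), (6, 10)]:
    print(f"{nx}x{ny}: last 8 tiles = {tail_fraction(nx, ny, 8):.3f} of the job, "
          f"utilization = {utilization(nx, ny, 8):.2f}")
```

Under this particular cost model the 4x4 grid keeps the 8 CPUs busy 80% of the time and the 8x8 grid keeps them fully busy. Real scenes have far less regular costs, so the absolute numbers mean little; only the trend toward better utilization with smaller tiles is the point.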

