C-Ray FP/CPU Benchmark Test Results

mapesdhs
Posts: 2516
Joined: Mon Nov 10, 2003 4:17 pm
Location: Edinburgh, Scotland

Re: C-Ray FP/CPU Benchmark Test Results

by mapesdhs » Fri Apr 22, 2011 3:41 am

Thanks for the info!! 8) I do want to try this out at some point, just never seem to have the time. I have a number of systems I'd like to
test - i3 540, i5 670 (hoping for near 5GHz if possible), i5 760, i7 870 (currently at 4.3), Q6600, etc. I'll look into the bootable business.

Ian.
I'm working on a charitable PC build for the Learn Engineering YouTube channel. Please PM/email/call if you'd like to contribute!
Donations of any kind of item I can sell to provide funds are also most welcome.
mapesdhs@yahoo.com
+44 (0)7434 635 121

jwhat
Posts: 316
Joined: Sat Aug 09, 2003 6:25 pm
Location: Australia

Re: C-Ray FP/CPU Benchmark Test Results

by jwhat » Mon Nov 20, 2017 3:21 am

Hi Ian & Co,

I came across the c-ray benchmark and thought it would be interesting to see how my recently revived Onyx 4 did, and while I was at it I also wanted to see how old SGIs perform relative to a classic Mac Pro (Late 2012, with 12 cores at 3.46GHz).

In summary, I ran all tests with a modified "RUN.full" script, so every test was done with 1, 32, 64, 128, 256 and 512 threads.
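For anyone wanting to repeat the sweep, here is a minimal sketch of how the per-thread-count command lines could be generated. The contents of the modified RUN.full are not shown in this thread, so the binary name `./c-ray-mt`, the `-t`, `-s`, `-i` and `-o` options, and the output file naming are all assumptions based on the usual c-ray-mt usage text; adjust them to whatever your copy actually accepts.

```python
import shlex

# Thread counts used for the first round of results in this post.
THREAD_COUNTS = [1, 32, 64, 128, 256, 512]


def build_commands(scene, size, binary="./c-ray-mt", threads=THREAD_COUNTS):
    """Return one argv list per thread count for a given scene file and resolution.

    The option letters are assumed, not taken from RUN.full:
    -t threads, -s WIDTHxHEIGHT, -i input scene, -o output image.
    """
    return [
        [binary, "-t", str(n), "-s", size, "-i", scene, "-o", f"out_{n}.ppm"]
        for n in threads
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is executed by accident.
    for argv in build_commands("sphfract", "1024x760"):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to eyeball them before pasting them into a shell script.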

Relative performance compared to the Mac Pro:
Onyx 4 with 8 CPUs (4x1GHz/4x800MHz) - best results approximately 800 to 900% slower ;-)
Octane2 with 2 x 600MHz - best results approximately 6,000 to 10,000% slower :-(

So the good news is that the Onyx 4 with 8 CPUs is roughly an order of magnitude faster than the Octane2 with 2 CPUs.
But the bad news is that the new Mac Pro is light years faster than the 8-CPU Onyx 4.

Here is a tabular summary of the longest-running benchmark (sphfract - 1024 x 760):

All times in millisec. MacPro = 12 Core 3.46GHZ, Octane = 2x600, Onyx4 = 8 CPU (4x1GHz+4x800MHz). The Total row is the sum over all thread counts.

Threads       MacPro      Octane       Onyx4
1              78014      589674      353940
32              5149      297261       49901
64              5027      296804       49334
128             5000      296705       49090
256             4993      296846       49249
512             5007      297105       49364
Total         103190     2074395      600878


For the very simple test (scene - 800 x 600):

All times in millisec. MacPro = 12 Core 3.46GHZ, Octane = 2x600, Onyx4 = 8 CPU (4x1GHz+4x800MHz). The Total row is the sum over all thread counts.

Threads       MacPro      Octane       Onyx4
1                250        1709        1013
32                19        1679         215
64                17        1693         166
128               17        1683         274
256               17        1697         155
512              316        1685         154
Total            636       10146        1977
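The percentages quoted near the top of this post can be checked against the best (lowest) times in the two tables. A minimal sketch, assuming "X% slower" means the extra run time relative to the Mac Pro's best time; the function and dictionary names are only illustrative:

```python
# Best (lowest) times in milliseconds, copied from the two result tables.
BEST_MS = {
    "sphfract 1024x760": {"MacPro": 4993, "Octane": 296705, "Onyx4": 49090},
    "scene 800x600": {"MacPro": 17, "Octane": 1679, "Onyx4": 154},
}


def percent_slower(t_ms, ref_ms):
    """Extra run time of t_ms over ref_ms, as a percentage of ref_ms."""
    return (t_ms - ref_ms) / ref_ms * 100.0


def slowdown_table(best=BEST_MS, ref="MacPro"):
    """Map (test, machine) to the percentage by which it is slower than ref."""
    rows = {}
    for test, times in best.items():
        for machine, t in times.items():
            if machine != ref:
                rows[(test, machine)] = percent_slower(t, times[ref])
    return rows


if __name__ == "__main__":
    for (test, machine), pct in slowdown_table().items():
        print(f"{machine:7s} {test:18s} {pct:8.0f}% slower")
```

With these inputs the Onyx 4 lands between roughly 800% and 900% slower, and the Octane2 between roughly 5,800% and 9,800% slower, in line with the rounded figures quoted earlier.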


Compiler-wise, the SGI benchmarks were compiled on the Octane2 with MIPSpro 7.3 and the binary was copied over to the Onyx 4.
The Mac Pro used the Xcode cc compiler, which I believe is Clang.

I was very keen to try compiling with Open64 (what was historically MIPSpro), but the latest version (5.x, I think) does not have MIPS IRIX target support, so I gave up, as I suspect getting that compiler up and running might be a significant effort.

Also, here is the hinv of the Onyx 4 as it currently stands. It now shows the Fibre Channel card (as I have gone up to IRIX 6.5.29, which has driver support for this). I also added an Adaptec 3 x USB / 2 x FW PCI card (reports as DM10), which required me to put a systune config in to allow it to work (as per the Fuel Aggregator posting):

Location: /hw/module/001c01/node
       IP59_4CPU Board: barcode RAG401     part 030-1989-003 rev -C
Location: /hw/module/001c01/IXbrick/xtalk/15
       2U_INT_53 Board: barcode REF634     part 030-1809-006 rev -D
Location: /hw/module/001c01/IXbrick/xtalk/15/pci-x/0/1/ioc4
             IO9 Board: barcode NZT098     part 030-1771-006 rev -A
Location: /hw/module/001c02/node
       IP53_4CPU Board: barcode NRG676     part 030-1956-003 rev -A
Location: /hw/module/001c02/IXbrick/xtalk/12
      ODY128B1_2 Board: barcode RCZ635     part 030-1909-008 rev -B
Location: /hw/module/001c02/IXbrick/xtalk/15
       2U_INT_53 Board: barcode RHM011     part 030-1809-006 rev -D
Location: /hw/module/001c02/IXbrick/xtalk/15/pci-x/0/1/ioc4
             IO9 Board: barcode NFV336     part 030-1771-005 rev -A
Processor 0: 1.0 GHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
Processor 1: 1.0 GHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
Processor 2: 1.0 GHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
Processor 3: 1.0 GHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
Processor 4: 800 MHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 2.2
FPU: MIPS R16010 Floating Point Chip Revision: 2.2
Processor 5: 800 MHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 2.2
FPU: MIPS R16010 Floating Point Chip Revision: 2.2
Processor 6: 800 MHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 2.2
FPU: MIPS R16010 Floating Point Chip Revision: 2.2
Processor 7: 800 MHZ IP35
CPU: MIPS R16000 Processor Chip Revision: 2.2
FPU: MIPS R16010 Floating Point Chip Revision: 2.2
CPU 0 at Module 001c01/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
  Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 1 at Module 001c01/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
  Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 2 at Module 001c01/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
  Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 3 at Module 001c01/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
  Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 4 at Module 001c02/Slot 0/Slice A: 800 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
CPU 5 at Module 001c02/Slot 0/Slice B: 800 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
CPU 6 at Module 001c02/Slot 0/Slice C: 800 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
CPU 7 at Module 001c02/Slot 0/Slice D: 800 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
Main memory size: 16384 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 16 Mbytes
Secondary unified instruction/data cache size: 16 Mbytes
Secondary unified instruction/data cache size: 16 Mbytes
Secondary unified instruction/data cache size: 16 Mbytes
Secondary unified instruction/data cache size: 4 Mbytes
Secondary unified instruction/data cache size: 4 Mbytes
Secondary unified instruction/data cache size: 4 Mbytes
Secondary unified instruction/data cache size: 4 Mbytes
Memory at Module 001c01/Slot 0: 8192 MB (enabled)
  Bank 0 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 1 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 2 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 3 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 4 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 5 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 6 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 7 contains 1024 MB (Premium) DIMMS (enabled)
Memory at Module 001c02/Slot 0: 8192 MB (enabled)
  Bank 0 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 1 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 2 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 3 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 4 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 5 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 6 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 7 contains 1024 MB (Premium) DIMMS (enabled)
Integral SCSI controller 2: Version IDE (ATA/ATAPI) IOC4
  CDROM: unit 0 on SCSI controller 2
Integral SCSI controller 3: Version IDE (ATA/ATAPI) IOC4
Integral SCSI controller 0: Version QL12160, low voltage differential
  Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL12160, low voltage differential
Integral SCSI controller 7: Version Fibre Channel LS949X Port 1
Integral SCSI controller 8: Version IEEE1394 SBP2
Integral SCSI controller 4: Version QL12160, low voltage differential
Integral SCSI controller 5: Version QL12160, low voltage differential
Integral SCSI controller 6: Version Fibre Channel LS949X Port 0
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3/IOC4 serial port: tty5
IOC3/IOC4 serial port: tty6
IOC3/IOC4 serial port: tty7
IOC3/IOC4 serial port: tty8
IOC3/IOC4 serial port: tty9
IOC3/IOC4 serial port: tty10
Graphics board: V12
Integral Gigabit Ethernet: tg0, module 001c01, PCI bus 1 slot 4
Gigabit Ethernet: tg1, module 001c02, PCI bus 1 slot 4
Iris Audio Processor: version MAD revision 1, number 3
Iris Audio Processor: version RAD revision 13.0, number 2
  PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
  PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
  PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
  PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
  PCI Adapter ID (vendor 0x3388, device 0x0021) PCI slot 2
  PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
  PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
  PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 1
  PCI Adapter ID (vendor 0x10a9, device 0x0005) PCI slot 2
  PCI Adapter ID (vendor 0x1033, device 0x0035) PCI slot 8
  PCI Adapter ID (vendor 0x1033, device 0x0035) PCI slot 8
  PCI Adapter ID (vendor 0x1033, device 0x00e0) PCI slot 8
  PCI Adapter ID (vendor 0x104c, device 0x8024) PCI slot 12
  PCI Adapter ID (vendor 0x1000, device 0x0640) PCI slot 1
  PCI Adapter ID (vendor 0x1000, device 0x0640) PCI slot 1
IOC4 firmware revision 83
IOC4 firmware revision 79
IOC3/IOC4 external interrupts: 1
IOC3/IOC4 external interrupts: 2
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
HUB in Module 001c02/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
Dual Channel Display
IP35prom in Module 001c01/Slot n0: Revision 6.210
IP35prom in Module 001c02/Slot n0: Revision 6.210
DMediaPro DM10 FW option: unit 0, revision 1.1.0
USB controller: type OHCI
USB controller: type OHCI


I would be interested to hear any advice on getting the Open64 compiler going.
Also, I have loaded the full result set into a spreadsheet with graphs...

Updated result set (with thread counts 1,2,4,8,16,32,64,128,256,512): spreadsheet 2

Cheers from Oz,


jwhat.
Last edited by jwhat on Tue Nov 21, 2017 12:48 am, edited 1 time in total.
jwhat - ask questions, provide answers

mapesdhs
Posts: 2516
Joined: Mon Nov 10, 2003 4:17 pm
Location: Edinburgh, Scotland

Re: C-Ray FP/CPU Benchmark Test Results

by mapesdhs » Mon Nov 20, 2017 4:14 pm

Thanks for the data! I'm tied up atm, will try and add the numbers to my site later this week.


jwhat wrote: But the bad news is that the new Mac Pro is light years faster than the 8-CPU Onyx 4.


Sadly, as it should be given the huge differences in time and tech between the two. :D I have some PCs which would just make the SGI results look embarrassing if I added them to the tables. :D

Ian.

jwhat
Posts: 316
Joined: Sat Aug 09, 2003 6:25 pm
Location: Australia

Re: C-Ray FP/CPU Benchmark Test Results

by jwhat » Mon Nov 20, 2017 4:52 pm

Hi Ian,

After reviewing the results I thought I should try additional thread counts, so I have updated the script to do: 1,2,4,8,16,32,64,128,256,512.

Hold off loading updates until I upload a more comprehensive set of thread counts; I expect this will show the Octane2 thread behaviour variation better.

Here are the updated results with thread counts (1,2,4,8,16,32,64,128,256,512): spreadsheet 2

And for visual view here is graph of longest running case (scene @ 7500x3500):

[Image: graph of results for scene @ 7500x3500]

Cheers,

jwhat.

mapesdhs
Posts: 2516
Joined: Mon Nov 10, 2003 4:17 pm
Location: Edinburgh, Scotland

Re: C-Ray FP/CPU Benchmark Test Results

by mapesdhs » Tue Nov 21, 2017 3:59 am

Good heavens, you're going to town with this one. :D Btw, the Mac Pro would be more optimal with 12 threads, 24, etc.

What always amuses me about these results is the way those who initially appear shocked at the speed of newer systems fail to appreciate how comparatively well the SGIs are doing in these tests, if one scales for clocks, ie. MIPS was pretty efficient really; just a shame SGI never went multi-core (was planned, killed off by the IA64 fiasco).

Bearing in mind the MIPS CPUs in the Onyx4 have no media instructions, AVX or anything like that, nor any of the general arch improvements that x86 has benefited from in all the years since, and are using old/slow RAM, a very blunt scaling would suggest that an equivalent "amount" of MIPS MHz would score approx. 8500 for the sphfract 1024x760 test, ie. the Mac would only be about a third quicker, and indeed the 32-CPU O350 result shows this approximation is very close. Pretty good IMO for a CPU design that's more than 20 years old. 8)
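One way to arrive at a figure near 8500 is sketched below. It assumes the "amount" of MHz is simply cores times clock summed per machine, ignores Hyper-Threading, cache and memory differences, and uses jwhat's best sphfract times from the first results table (49090 ms for the Onyx4, 4993 ms for the Mac Pro). The helper names are made up for illustration, and this may not be exactly the arithmetic used above.

```python
# (count, MHz) pairs per machine, taken from the descriptions in this thread.
ONYX4_CPUS = [(4, 1000), (4, 800)]
MACPRO_CPUS = [(12, 3460)]

# Best sphfract 1024x760 times in milliseconds from the first results table.
ONYX4_BEST_MS = 49090
MACPRO_BEST_MS = 4993


def aggregate_mhz(cpus):
    """Sum of cores x clock, a deliberately blunt measure of 'amount of MHz'."""
    return sum(count * mhz for count, mhz in cpus)


def scaled_time(t_ms, from_cpus, to_cpus):
    """Estimate the time on to_cpus, assuming time is inversely proportional to total MHz."""
    return t_ms * aggregate_mhz(from_cpus) / aggregate_mhz(to_cpus)


if __name__ == "__main__":
    estimate = scaled_time(ONYX4_BEST_MS, ONYX4_CPUS, MACPRO_CPUS)
    print(f"Onyx4 total MHz:  {aggregate_mhz(ONYX4_CPUS)}")
    print(f"MacPro total MHz: {aggregate_mhz(MACPRO_CPUS)}")
    print(f"MIPS time scaled to the Mac's MHz: {estimate:.0f} ms")
    print(f"Measured MacPro best time:         {MACPRO_BEST_MS} ms")
```

The scaled estimate comes out at about 8513 ms, against the measured 4993 ms for the Mac Pro.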

Ian.

jwhat
Posts: 316
Joined: Sat Aug 09, 2003 6:25 pm
Location: Australia

Re: C-Ray FP/CPU Benchmark Test Results

by jwhat » Tue Nov 21, 2017 1:50 pm

Hi Ian,

yes it is a bit of fun running these... and I am pretty good at spreadsheets, so easy for me to do.

Agree, SGI and MIPS are a great case study in what could have been.

Realised that I put scene/7500x3500 on graph when I should have put sphfract/1024x760, so here are numbers for that set of tests:

All times in millisec. MacPro = 12 Core 3.46GHZ, Octane = 2x600, Onyx4 = 8 CPU (4x1GHz+4x800MHz). The Total row is the sum over all thread counts.

Threads       MacPro      Octane       Onyx4
1              77566      588741      441093
2              39918      301881      184439
4              21676      303173       99155
8              11163      297934       60385
16              6517      298242       50710
32              5162      420788       49221
64              5022      296796       48986
128             5012      296651       49089
256             5003      296581       49581
512             5025      296896       49613
Total         182064     3397683     1082272


Interesting performance on the Octane2 on this one, as it shows pretty much no improvement after 2 threads... and even gets worse at 32 (maybe that was due to me jiggling the mouse to stop the screen saver ;-) ).
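To put numbers on the scaling, the speedup over the single-thread run (T1 / Tn) can be computed straight from the sphfract table. A minimal sketch; the dictionary layout and function name are only illustrative:

```python
# sphfract 1024x760 times in milliseconds, keyed by thread count.
TIMES_MS = {
    "MacPro": {1: 77566, 2: 39918, 4: 21676, 8: 11163, 16: 6517,
               32: 5162, 64: 5022, 128: 5012, 256: 5003, 512: 5025},
    "Octane": {1: 588741, 2: 301881, 4: 303173, 8: 297934, 16: 298242,
               32: 420788, 64: 296796, 128: 296651, 256: 296581, 512: 296896},
    "Onyx4": {1: 441093, 2: 184439, 4: 99155, 8: 60385, 16: 50710,
              32: 49221, 64: 48986, 128: 49089, 256: 49581, 512: 49613},
}


def speedup(times):
    """Speedup relative to the single-thread run: T(1) / T(n) for each n."""
    t1 = times[1]
    return {n: t1 / t for n, t in times.items()}


if __name__ == "__main__":
    for machine, times in TIMES_MS.items():
        s = speedup(times)
        best_n = max(s, key=s.get)
        print(f"{machine}: best speedup {s[best_n]:.2f}x at {best_n} threads")
        for n in sorted(s):
            print(f"  {n:4d} threads: {s[n]:5.2f}x")
```

On the Octane2 the speedup stays just under 2x from 2 threads onwards, which is what you would expect from two CPUs, with the 32-thread run as the one outlier.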

Also, as per your README I stopped mediad and sgi_apache on the SGI boxes prior to running the benchmarks, but did not bother stopping anything else, including the X11 display manager.

Anyhow, if you have time it is easy to add additional results to the spreadsheet and get graphs. It would be interesting to see a scatter diagram of result vs era...
Last edited by jwhat on Tue Nov 21, 2017 6:59 pm, edited 1 time in total.

mapesdhs
Posts: 2516
Joined: Mon Nov 10, 2003 4:17 pm
Location: Edinburgh, Scotland

Re: C-Ray FP/CPU Benchmark Test Results

by mapesdhs » Tue Nov 21, 2017 3:53 pm

Thanks for the extra info! Alas I don't have time for anything more than the web page, and that needs a rewrite. Leaving it until next year. Main thing I want to do is move the sphfract test up to be the main test; the scene test is far too short.

Btw, did you know Phoronix adopted c-ray as one of their standard tests? It sometimes crops up in CPU reviews on Anandtech and elsewhere. :D

Ian.

jan-jaap
Donor
Posts: 4938
Joined: Thu Jun 17, 2004 11:35 am
Location: Wijchen, The Netherlands

Re: C-Ray FP/CPU Benchmark Test Results

by jan-jaap » Wed Nov 22, 2017 12:51 am

mapesdhs wrote: Btw, did you know Phoronix adopted c-ray as one of their standard tests? It sometimes crops up in CPU reviews on Anandtech and elsewhere. :D

Hmmm. c-ray is a rather trivial program that runs entirely in L2 cache of any CPU made in the last 20 years or so. So while it may say something about CPU performance, it's entirely useless as a system benchmark which is obvious from these results:

 102   2 x MIPS R10000 195MHz (4MB)                2600      32     mapesdhs     MIPS Pro 7.3         onyx2           SGI Onyx2, IRIX 6.5.26m, Compiler Ref 16.
 132   MIPS R10000 195 MHz (1MB)                   5119       1     mapesdhs     MIPS Pro 7.3         fire            SGI Octane, IRIX 6.5.26, Compiler Ref 7.
 133   MIPS R10000 195 MHz (2MB)                   5151       1     mapesdhs     MIPS Pro 7.3         onyx            SGI Onyx, IRIX 6.5.22, Compiler Ref 14.
 134   MIPS R10000 195 MHz (1MB)                   5231       1     mapesdhs     MIPS Pro 7.3         indigo2         SGI Indigo2, IRIX 6.5.22, Compiler Ref 11.
 136   MIPS R10000 195 MHz (1MB)                   5298       1     mapesdhs     MIPS Pro 7.3         o2              SGI O2, IRIX 6.5.26, Compiler Ref 6.

Something else: the single-CPU systems perform best with a single render thread, which makes sense (no context switching between threads, no L1 cache flushes). But if the best performance for the dual-CPU system is found with 16 threads competing for one execution unit, there's got to be something wrong. Could be the code or what the compiler makes of it, but that's not right.
To accentuate the special identity of the IRIS 4D/70, Silicon Graphics' designers selected a new color palette. The machine's coating blends dark grey, raspberry and beige colors into a pleasing harmony. (IRIS 4D/70 Superworkstation Technical Report)

mapesdhs
Posts: 2516
Joined: Mon Nov 10, 2003 4:17 pm
Location: Edinburgh, Scotland

Re: C-Ray FP/CPU Benchmark Test Results

by mapesdhs » Wed Nov 22, 2017 1:22 am

jan-jaap writes:
> Hmmm. c-ray is a rather trivial program that runs entirely in L2 cache of any CPU made in the last
> 20 years or so. So while it may say something about CPU performance, it's entirely useless as
> a system benchmark which is obvious from these results:

Yes I know, that's what I told Phoronix, and that's what John told me. :D Phoronix responded by saying this is why they wanted to use it: they were looking for a near-pure CPU test. It's also why I created the more complex tests, to do something that wouldn't fit in cache (at least on SGIs), and why I want to move the scene test to be secondary, because it's so small and completes too fast to be meaningful on larger systems.


> ... Could be the code or what the compiler makes of it, but that's not right.

Beats me, but it's not specific to c-ray, the same effect happens with the Blender test, and not just on SGIs.

Ian.

