Onyx2 and Origin2K experiment.

foetz
Moderator
Posts: 6542
Joined: Mon Apr 14, 2003 4:34 am

foetz » Sat Sep 10, 2005 4:18 pm

well, equip all machines with full routers and just try it :D

krafty
Posts: 142
Joined: Fri Sep 09, 2005 1:07 am
Location: West Palm Beach, FL

krafty » Sun Sep 11, 2005 7:11 am

foetz wrote: well, equip all machines with full routers and just try it :D

I have, and that's when I get all of the "Bad structure" or "Premium memory required for this configuration" messages. That's why I think that some sort of manual manipulation of the routing table might be necessary, but how?
There once was a woman named Bright,
Whose speed was much faster than light.
She set out one day- in a relative way,
And returned on the previous night!

TeeTylerToe
Posts: 913
Joined: Mon Sep 13, 2004 11:56 pm

TeeTylerToe » Sun Sep 11, 2005 7:28 am

is there any reason to think that only having one router, and two node boards on the origin 2000 could help?

krafty » Sun Sep 11, 2005 8:52 am

TeeTylerToe wrote: is there any reason to think that only having one router, and two node boards on the origin 2000 could help?


I've tried that too- it gives me a "Too few non-express links" error.

TeeTylerToe » Sun Sep 11, 2005 11:33 am

well, did you try making that one link an express link by using two cables between the o2K & the deskside onyx?

krafty » Sun Sep 11, 2005 1:09 pm

TeeTylerToe wrote: well, did you try making that one link an express link by using two cables between the o2K & the deskside onyx?


Here's the configuration I'm trying now:

Origin2000- 2 node boards in slots n1 and n2, and one router in the left hand slot.

Onyx2- 1 node board in slot n1, and one full router board.

Here's the output I get when I try just one connection from port 3 on one router board to port 3 on the other:

Code:

IP27 PROM SGI Version 6.156  built 11:27:56 AM Nov 18, 2003
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             DONE
promlog: Flash PROM write error: Cannot program a 0 into a 1
Discovering local IO ......................             DONE
Discovering NUMAlink connectivity .........             DONE
Found 5 objects (3 hubs, 2 routers) in 184941 usec
Waiting for peers to complete discovery....             DONE
Recognized 390 MHz midplane
Global master is /hw/module/1/slot/n1

route: error: get_route: bad routing table -- probably bad system structure!
*** /hw/module/1/slot/n1: Router table calculation failed
*** metaid(2)==-1!
*** This configuration requires all nodes to have premium DIMMS.
*** /hw/module/1/slot/n1 not premium
*** /hw/module/2/slot/n2 not premium
*** /hw/module/2/slot/n1 not premium
Going to die...

1A 000: *** Failure LEDs interrupted on node 0


(At this point, I tried an initalllogs from cac mode of the pod, and it hangs the system)

Now, if I try using two connections (Port 3 to port 3 and port 2 to port 2), I get this output:

Code:

IP27 PROM SGI Version 6.156  built 11:27:56 AM Nov 18, 2003
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             DONE
promlog: Flash PROM write error: Cannot program a 0 into a 1
Discovering local IO ......................             DONE
Discovering NUMAlink connectivity .........             DONE
Found 5 objects (3 hubs, 2 routers) in 200914 usec
Waiting for peers to complete discovery....             DONE
Recognized 390 MHz midplane
Global master is /hw/module/1/slot/n1

*** Configuration Error at module 1, slot r0, port(s) 0x6:
        multiple express links (max 1 allowed).
        Note: misconfigured ports will have their amber LEDs set.

*** Configuration Error at module 2, slot r1, port(s) 0x6:
        multiple express links (max 1 allowed).
        Note: misconfigured ports will have their amber LEDs set.
*** /hw/module/1/slot/n1: NASID calculation failed
*** metaid(1)==-1!
*** metaid(2)==-1!
*** This configuration requires all nodes to have premium DIMMS.
*** /hw/module/1/slot/n1 not premium
*** /hw/module/2/slot/n2 not premium
*** /hw/module/2/slot/n1 not premium
Going to die...

1A 000: *** Failure LEDs interrupted on node 0
1A 000: POD IOC3 Dex>


Again, trying to initalllogs from pod hangs the system.
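
A side note on the `port(s) 0x6` field in the output above: if that value is a bitmask in which bit 0 stands for port 1 (an assumption, nothing in the PROM output confirms it), then 0x6 decodes to ports 2 and 3, which would match the two cables used in this test. A minimal Python sketch of that decoding follows; `ports_from_mask` and `first_port` are hypothetical names made up for illustration, not anything from the PROM.

```python
def ports_from_mask(mask: int, first_port: int = 1) -> list[int]:
    """Return the port numbers whose bits are set in `mask`.

    Assumption: bit 0 of the mask corresponds to port `first_port`.
    The default of 1 (1-based port numbering) is a guess.
    """
    ports = []
    bit = 0
    # Walk the mask from the least significant bit until no set bits remain.
    while mask >> bit:
        if (mask >> bit) & 1:
            ports.append(first_port + bit)
        bit += 1
    return ports


print(ports_from_mask(0x6))  # [2, 3] under the 1-based assumption
```

With 0-based numbering (`first_port=0`) the same mask would instead mean ports 1 and 2, so the reading only holds if the router ports are counted from 1.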

foetz » Sun Sep 11, 2005 5:19 pm

sounds like your config needs all machines to have the premium ram.
like octanes to go beyond 4gb in some configs.

krafty » Sun Sep 11, 2005 8:36 pm

foetz wrote: sounds like your config needs all machines to have the premium ram.
like octanes to go beyond 4gb in some configs.


I've got some directory memory kicking around here somewhere- I'll give it a shot.

In the mean time, does anybody know what that "promlog: Flash PROM write error: Cannot program a 0 into a 1" error is? I get it periodically on my systems (Both the Onyx2 and the Origin2K) and there doesn't seem to be any reason for it other than randomness.

foetz » Mon Sep 12, 2005 2:22 pm

krafty wrote:
foetz wrote: sounds like your config needs all machines to have the premium ram.
like octanes to go beyond 4gb in some configs.


I've got some directory memory kicking around here somewhere- I'll give it a shot.

In the mean time, does anybody know what that "promlog: Flash PROM write error: Cannot program a 0 into a 1" error is? I get it periodically on my systems (Both the Onyx2 and the Origin2K) and there doesn't seem to be any reason for it other than randomness.


some vars can have 0 or 1 for on and off.
according to the message the system is not able to set these vars. this could be a corrupted flash or ram issue.
try to set a var by hand and reset. it should still be there. if not the flash is not okay.
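
For context on the wording of that message: on flash memory in general, a program operation can only change bits from 1 to 0, and getting a bit back to 1 takes an erase of the whole sector. Whether the IP27 promlog code follows exactly this model is an assumption, but it would explain the phrase "Cannot program a 0 into a 1". A minimal Python sketch of the rule, with `can_program` and `ERASED` as hypothetical names for illustration only:

```python
ERASED = 0xFF  # an erased flash byte reads back as all ones


def can_program(old: int, new: int) -> bool:
    """True if `new` can be written over `old` without an erase first.

    Programming can only clear bits (1 -> 0), so every bit that is set
    in `new` must already be set in `old`.
    """
    return (new & ~old) == 0


print(can_program(ERASED, 0x5A))  # True: only clears bits of an erased byte
print(can_program(0x5A, 0x5B))    # False: bit 0 would have to go from 0 to 1
```

Under that reading, the error would mean the log write landed on a location that was not in the erased state, which fits with clearing the logs being a possible cure.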

artherd
Posts: 108
Joined: Fri Sep 03, 2004 11:45 pm
Location: SF Bay Area, CA

artherd » Tue Sep 13, 2005 9:19 pm

I think you need directory memory in all node boards to go beyond 8 procs (or 2 chassis).

Or was that to go beyond *32* procs in 4 chassis? I forget.

Pretty certain you will need 2 craylinks; I don't think one is a valid config.
My first Indy is still my favourite SGI.
CDglobal Networks: http://www.cdglobal.net/

artherd » Tue Sep 13, 2005 9:20 pm

Also I think you must have an even number of node boards.

archaic
Posts: 354
Joined: Mon Mar 08, 2004 9:28 pm
Location: Houston, TX

Origin Error

archaic » Wed Sep 14, 2005 11:05 am

krafty wrote:
foetz wrote: sounds like your config needs all machines to have the premium ram.
like octanes to go beyond 4gb in some configs.


I've got some directory memory kicking around here somewhere- I'll give it a shot.

In the mean time, does anybody know what that "promlog: Flash PROM write error: Cannot program a 0 into a 1" error is? I get it periodically on my systems (Both the Onyx2 and the Origin2K) and there doesn't seem to be any reason for it other than randomness.



Go into the POD from the PROM monitor and run:

"go cac"
"clearalllogs"
"clear"
"flush"
"reset"


and then go back into the PROM monitor and execute:

"enableall"
"update"
"reset"


This should take care of it. I have to do it all the time when I am testing boards and memory. It will take longer to boot after you do the first set of commands, as it has to re-discover everything in the chassis. Have fun. If I can get my Onyx2 I can give it a try.

It looks to me like you need 16MB directory DIMMs for the 128MB banks of RAM and 64MB directory DIMMs for the 512MB banks of RAM. Cheers.
Patrick

- You have got to be kidding... -

foetz » Wed Sep 14, 2005 8:09 pm

just to be sure:

you have 2 router boards with 3 connectors each and linked with 2 craylink cables.
connect the cables straight to the 2 top connectors.
if it's still not working do what archaic said and try again.
if it's still not working then i would guess some hardware is corrupt.

loonvf
Posts: 341
Joined: Wed May 05, 2004 12:15 am
Location: Netherlands

loonvf » Fri Sep 16, 2005 1:26 pm

Hi all,

Since you guys seem to know more about Origin 2000 systems, I have a question:

When powering up my (half-rack) Origin 2000 I get the following error:

Code:

Waiting for peers to complete recovery... Reading link 0 ( addr 0x92000000  00000004) failed
DONE
*** Global master /hw/module/1/slot/n1 does not have a console
Global master is /hw/module/1/slot/n1
Testing/Initializing all memory......... DONE
Checking partition information......... DONE
*** Partition master /hw/module/1/slot/n1 does not have a console
Local slave entering slave loop
HubIO Link is down. Never came out of reset
Cannot talk to IO board. ^C to enter POD


My serial console is connected to the serial port down below at the fan metal and not to the front MMSC diagnostics bus or the official IO6 console port.

The O2K currently has 1 dual 180 MHz CPU board, some memory and 1 full router in the left slot (the right router slot is not populated).

Please help me to resurrect my Origin 2000.

Thanks in advance,

TeeTylerToe » Fri Sep 16, 2005 3:09 pm

need a null router?

