Nekochan Net


All times are UTC - 8 hours


Posted: Wed Sep 30, 2009 3:18 am
Moderator

Joined: Tue Nov 25, 2003 12:09 pm
Posts: 799
Location: Europe
I was getting a "firmware too old" message from numastatd at startup, so I thought it would be a good idea to update the L1 firmware to 1.44.0, which came with IRIX 6.5.30.

It turns out that this combination of -003 motherboard and 1.44.0 L1 firmware renders the system unbootable. It won't respond to the power button, verbal abuse, nothing.

The procedure to reverse this is quite simple, though :)

Remove the machine's side panel. Somewhere near the SCSI connector on the motherboard you will find an RS232 port; attach a null-modem cable to it and connect the other end to a second machine.
On the other machine, start a terminal emulator ('cu' works well), set it to 38400 8N1 (e.g. 'cu -l /dev/your_serial_port -s 38400').
As soon as you plug in the power cable on the Fuel, you should be greeted by a prompt:
Code:
ALERT: Error reading the display I/O expander, no acknowledge


SGI SN1 L1 Controller
Firmware Image A: Rev. 1.44.0, Built 07/17/2006 18:19:54


001?01-L1>


The following entries in the log appeared at the time of the update:
Code:
09/29/09 21:21:28 L1 booting 1.44.0
09/29/09 21:21:28 vram checksum error - initializing core data.
09/29/09 21:21:28 ALERT: Error reading the display I/O expander, no acknowledge
09/29/09 21:21:28 ** fixing invalid SSN value


If you try to issue a power up command, you will receive this lovely message:
Code:
001?01-L1>pwr up
ERROR: no power supplies available.


Resetting the NVRAM didn't help, so I decided to boot the other L1 image.

Code:
001?01-L1>flash status
Flash image A currently booted

Image      Status        Revision    Built
-----   -------------   ----------   -----
  A     default         1.44.0       07/17/2006 18:19:54
  B     valid           1.10.12      02/01/2002 14:40:22
001?01-L1>flash default b

(If your L1 booted from image B, enter "flash default a" instead.)
Code:
001?01-L1>reboot_l1
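
For anyone who wants to script the "which image do I switch to" decision, here is a minimal sketch in Python. It only parses text shaped like the `flash status` output above and builds the matching `flash default` command; it does not talk to the serial port. The function names `parse_flash_status` and `fallback_command` are made up for this illustration, and the parsing assumes the column layout shown in this thread.

```python
import re

# Sample text copied from the `flash status` output shown in this post.
SAMPLE = """\
Flash image A currently booted

Image      Status        Revision    Built
-----   -------------   ----------   -----
  A     default         1.44.0       07/17/2006 18:19:54
  B     valid           1.10.12      02/01/2002 14:40:22
"""


def parse_flash_status(text):
    """Return (booted_image, {image: (status, revision)}).

    Hypothetical helper: assumes the layout seen in this thread.
    """
    booted = re.search(r"Flash image ([AB]) currently booted", text)
    if booted is None:
        raise ValueError("no 'currently booted' line found")
    images = {}
    for line in text.splitlines():
        # Rows look like: "  B     user default    1.10.12      02/01/2002 14:40:22"
        m = re.match(r"\s*([AB])\s+(.+?)\s+(\d+(?:\.\d+)+)\s+\d{2}/\d{2}/\d{4}", line)
        if m:
            images[m.group(1)] = (m.group(2), m.group(3))
    return booted.group(1), images


def fallback_command(text):
    """Build the 'flash default' command for the image that is NOT booted."""
    booted, images = parse_flash_status(text)
    other = "B" if booted == "A" else "A"
    if other not in images:
        raise ValueError("the other image is not listed")
    return "flash default " + other.lower()


print(fallback_command(SAMPLE))
```

With the sample above this prints `flash default b`, the same command typed by hand earlier. You would still issue `reboot_l1` yourself afterwards.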


After this, everything works normally:
Code:
SGI SN1 L1 Controller
Firmware Image B: Rev. 1.10.12, Built 02/01/2002 14:40:22


001a01-L1>flash status
Flash image B currently booted

Image      Status        Revision    Built
-----   -------------   ----------   -----
  A     valid           1.44.0       07/17/2006 18:19:54
  B     user default    1.10.12      02/01/2002 14:40:22


If anyone has the newest 1.48.0 L1 image, I could give it a try to see whether they've fixed this or not :)


Posted: Wed Sep 30, 2009 3:29 am

Joined: Thu Jun 17, 2004 10:35 am
Posts: 3894
Location: Wijchen, The Netherlands
ShadeOfBlue wrote:
If you try to issue a power up command, you will receive this lovely message:
Code:
001?01-L1>pwr up
ERROR: no power supplies available.


Damn, that happened to me too, but in my case the system was a Fuel prototype!
I used the same procedure to recover the system.

ShadeOfBlue wrote:
If anyone has the newest 1.48.0 L1 image, I could give it a try to see whether they've fixed this or not :)

Forget it. That's what I used:
Code:
Flash image B currently booted

Image      Status        Revision    Built
-----   -------------   ----------   -----
  A     valid           1.48.1       01/22/2007 11:33:34
  B     user default    1.9.15       12/04/2001 16:21:34

_________________
Now this is a deep dark secret, so everybody keep it quiet :)
It turns out that when reset, the WD33C93 defaults to a SCSI ID of 0, and it was simpler to leave it that way... -- Dave Olson, in comp.sys.sgi

Currently in commercial service: :Onyx2: (2x) :O3x02L:
In the museum: almost every MIPS/IRIX system.
Wanted: GM1 board for Professional Series GT graphics (030-0076-003, 030-0076-004)


Posted: Wed Sep 30, 2009 3:35 am
Moderator

Joined: Tue Nov 25, 2003 12:09 pm
Posts: 799
Location: Europe
jan-jaap wrote:
Forget it. That's what I used:
Code:
Flash image B currently booted

Image      Status        Revision    Built
-----   -------------   ----------   -----
  A     valid           1.48.1       01/22/2007 11:33:34
  B     user default    1.9.15       12/04/2001 16:21:34

Oh well, I'll stick with 1.10.12 then and do a 'chkconfig numastatd off' :)


Posted: Wed Sep 30, 2009 3:56 pm

Joined: Wed Jul 19, 2006 7:37 am
Posts: 5751
Location: Renton, WA
Question (not having an IP35-derived system yet):

Why haven't you re-flashed the first image back so you have a failsafe again?

_________________
Damn the torpedoes, full speed ahead!

There are those who say I'm a bit of a curmudgeon. To them I reply: "GET OFF MY LAWN!"

:Indigo: :Octane: :Indigo2: :Indigo2IMP: :Indy: :PI: :O3x0: :ChallengeL: :O2000R: (single-CM)


Posted: Wed Sep 30, 2009 10:10 pm

Joined: Tue Feb 24, 2004 4:10 pm
Posts: 9644
ShadeOfBlue wrote:
I was getting a "firmware too old" message from numastatd at startup, so I thought it would be a good idea to update it to 1.44.0 which came with 6.5.30.

It turns out that this combination of -003 motherboard and 1.44.0 L1 firmware renders the system unbootable. Won't respond to power button, verbal abuse, nothing.

The procedure to reverse this is quite simple, though :)

Remove the machine's side panel. Somewhere near the SCSI connector on the motherboard you will find an RS232 port, attach a null-modem cable to it.
On the other machine, start a terminal emulator ('cu' works well), set it to 38400 8N1 (e.g. 'cu -l /dev/your_serial_port -s 38400').
As soon as you plug in the power cable on the Fuel, you should be greeted by a prompt:
Code:
ALERT: Error reading the display I/O expander, no acknowledge


SGI SN1 L1 Controller
Firmware Image A: Rev. 1.44.0, Built 07/17/2006 18:19:54


001?01-L1>


The following entries in the log appeared at the time of the update:
Code:
09/29/09 21:21:28 L1 booting 1.44.0
09/29/09 21:21:28 vram checksum error - initializing core data.
09/29/09 21:21:28 ALERT: Error reading the display I/O expander, no acknowledge
09/29/09 21:21:28 ** fixing invalid SSN value


If you try to issue a power up command, you will receive this lovely message:
Code:
001?01-L1>pwr up
ERROR: no power supplies available.


This is interesting for an odd reason: I have a graphics card that is flaky. If the machine does boot, it runs fine. But when it doesn't, I get the exact symptoms you describe. It appears that whatever on the graphics card does the 'acknowledge' during POST is headed south. Dr. Dave, where are you?


Posted: Wed Sep 30, 2009 11:09 pm

Joined: Fri Feb 13, 2004 10:37 pm
Posts: 2311
Location: Ottawa, Canada >burp<
Good question. Probably U11, though I don't have a Fuel video card handy. It's the Philips part, and should be basically an I/O chip with an I2C interface - I tried looking at the pics posted a while back but can't read the part number. Philips calls these parts "I/O Expanders" so no bowdlerization there.

So... it looks like in the cases where there are issues, the I2C interface to the graphics card is not working, or at least the I2C peripherals on the card are not responding. I'd bet the later revisions of the L1 firmware have issues initializing the I2C controller on the motherboard, and thus (depending on the severity of the problem) may cause *none* of the I2C peripherals to be detected. This of course leads to the inevitable problem...

Hamei, have a look at U11 closely and see if there is anything like a cold solder joint. There is a cluster of I2C chips around there, including the Dallas environment monitor and an Atmel flash chip, as well as the Philips part and whatever else. Check them all. The address is latched internally at reset if I remember correctly, so basically if it's flaky and you reset it enough times it will eventually 'latch' the correct value and everything is then hunky-dory until the next reset/powerup.

I2C is a bidirectional serial protocol. All the peripheral chips are wired in parallel, and a unique 'address' is usually hard-strapped to each device by tying pins high and low. For the flaky board, it could be that one of the chips is getting an incorrect address strapped to it (which would interfere with the address decoding mechanism), or again the mainboard I2C (or whatever I2C controller the L1 has access to, probably on-chip) is not being set up, initialized, or run properly by the later L1 firmware.
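
To make the strapping idea concrete, here is a rough sketch in Python of how a 7-bit I2C address is commonly built from a fixed device prefix plus three strap pins, and how one misread strap can land two chips on the same address. Everything specific here is an assumption for illustration: the `0x20` prefix, the device names, and the pin values are hypothetical and not read off a Fuel board, and real parts of different types usually have different fixed prefixes.

```python
def strapped_address(base, pins):
    """Combine a fixed prefix with strap pins into a 7-bit I2C address.

    `pins` is (A2, A1, A0); each is 0 (tied low) or 1 (tied high).
    """
    a2, a1, a0 = pins
    return base | (a2 << 2) | (a1 << 1) | a0


def find_conflicts(devices):
    """Return {address: [names]} for every address claimed by more than one device."""
    seen = {}
    for name, addr in devices.items():
        seen.setdefault(addr, []).append(name)
    return {addr: names for addr, names in seen.items() if len(names) > 1}


BASE = 0x20  # hypothetical fixed prefix, not taken from the Fuel hardware

# Healthy board: each chip is strapped to its own address.
good = {
    "expander": strapped_address(BASE, (0, 0, 1)),
    "monitor": strapped_address(BASE, (0, 1, 0)),
}

# Flaky board: the expander's straps are read as (0, 1, 0) instead of (0, 0, 1),
# for example because of a bad solder joint on an address pin.
flaky = {
    "expander": strapped_address(BASE, (0, 1, 0)),
    "monitor": strapped_address(BASE, (0, 1, 0)),
}

print(find_conflicts(good))
print(find_conflicts(flaky))
```

The first call finds no clashes; the second reports both chips on the same address, which is the kind of situation where the bus master may get no clean acknowledge from either device.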

_________________
:O3000: <> :O3000: :O2000: :Tezro: :Fuel: x2+ :Octane2: :Octane: x3 :1600SW: x2 :O2: x2+ :Indigo2IMP: :Indigo2: x2 :Indigo: x3 :Indy: x2+

Once you step up to the big iron, you learn all about physics, electrical standards, and first aid - usually all in the same day


Posted: Thu Oct 01, 2009 12:09 am

Joined: Thu Jun 17, 2004 10:35 am
Posts: 3894
Location: Wijchen, The Netherlands
SAQ wrote:
Why haven't you re-flashed the first image back so you have a failsafe again?

That would be the rational thing to do. But the damn thing gave me a heart attack when it pulled that stunt on me (it wasn't my system) so I decided not to mess with it anymore :oops:

Posted: Thu Oct 01, 2009 1:41 am
Moderator

Joined: Tue Nov 25, 2003 12:09 pm
Posts: 799
Location: Europe
SAQ wrote:
Why haven't you re-flashed the first image back so you have a failsafe again?

The system currently boots from the good image, so it can't be overwritten. If for some reason the system decides to boot off the second image, I can always attach the cable again and re-do the procedure.
It is unlikely that the machine would suddenly start ignoring the "user default" flash setting, so I'm just going to leave it as it is :)

SGI did a really crappy job at testing these IP35 updates... With such a small set of possible hardware combinations I'd expect them to test everything, at least to see if it powers up.


Posted: Fri May 11, 2012 1:50 am

Joined: Tue Sep 14, 2004 2:03 am
Posts: 482
Location: Hampshire, UK
I know this is an old post, but to save anyone else from making the same mistake by being too keen to update firmware, I thought I should post this. SGI did test this sort of situation before releasing updates, and I think somewhere there is a release note that says not to jump straight from IRIX 6.5.11 or similar to 6.5.30, for a number of reasons.

If you have an older-series Fuel that gives you these "firmware too old" messages, simply do a flashsc update to a mid-life version such as 1.22.0, found in IRIX 6.5.21, followed by another update to 1.44.0, found in IRIX 6.5.30.
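
As a rough illustration of the stepping-stone idea, here is a small Python sketch that lists which of the two revisions named above are still ahead of the L1 revision you are currently running. It is only a planning aid under an assumption of mine: that anything older than 1.22.0 should take the intermediate step. I have not seen an official cut-off, and the names `parse_rev`, `STAGES` and `update_plan` are invented for the example.

```python
def parse_rev(rev):
    """Turn '1.10.12' into (1, 10, 12) so revisions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


# Stepping stones named in the post: 1.22.0 (IRIX 6.5.21), then 1.44.0 (IRIX 6.5.30).
STAGES = ["1.22.0", "1.44.0"]


def update_plan(current, stages=STAGES):
    """List the stages newer than `current`, in the order they should be flashed.

    Assumption: every stage newer than the current revision is applied in turn.
    """
    cur = parse_rev(current)
    return [stage for stage in stages if parse_rev(stage) > cur]


print(update_plan("1.10.12"))
print(update_plan("1.22.0"))
```

Comparing tuples of integers matters here: as plain strings, "1.9.15" would sort after "1.10.12", which is the wrong way round for the two old revisions seen in this thread.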

This avoids the problems mentioned above and would stop people incorrectly bad-mouthing SGI's tech guys.

I have plenty of Fuel systems with 030-1707-003 motherboards that originally ran very old firmware and now run 1.44.0 successfully. It just needs some patience and 5 minutes of your time to check the procedure before running it.

Hope this helps anyhow.

_________________
In order of use at the moment..... :Fuel: :O3000:

Currently looking to buy good :Fuel: and :O2: :O2+: machine.


Posted: Sat May 12, 2012 9:36 am

Joined: Sat Jun 26, 2010 4:40 pm
Posts: 243
Location: Oslo, Norway
More detailed info on the wiki: http://www.nekochan.net/wiki/L1_Controller_Updates

_________________
Torfinn


Posted: Sun May 13, 2012 2:04 am
Moderator

Joined: Tue Nov 25, 2003 12:09 pm
Posts: 799
Location: Europe
tjsgifan wrote:
This avoids the problems mentioned above and would stop people incorrectly bad-mouthing SGI's tech guys.

A firmware update should never render a system unbootable.

It wouldn't have been that hard to add a check to the flashsc program to prevent such a situation from occurring, or to actually handle non-incremental updates properly. For a system that used to cost many thousands of dollars, I'd expect them to get at least this right.

But this is hardly the only problem with the 1.44.0 update; the update for the O3000's L2 controller is actually missing a file in the firmware image, breaking a large part of the L2 controller's functionality (including the ability to reflash it to an earlier version) -- how does this even get past testing?

I usually have a lot of respect for SGI engineers, but whoever was responsible for quality assurance on the 1.44.0 update did a bad job.

