Nekochan Net


All times are UTC - 8 hours [ DST ]


Posted: Thu Jan 12, 2012 6:09 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
... since a memory upgrade - thanks to ramq :) - my tezro constantly crashes after ~4 hours with a system panic.
:shock:
In the SYSLOG I get:

Code:
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: Serial #: P1000651
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: HARDWARE ERROR STATE:
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +  Errors on node Nasid 0x0 (0)
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +    IP35 in /hw/module/001c01/node [serial number NCG031]
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +      BEDROCK signalled following errors.
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB1_A Entry Register: 0x817400ae2227d1
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          00<->00: IO Write operation
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          39<->02: SN0Net address 0x2b8889f4
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          44<->40: TNUM of XTalk req. 0x14
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          48<->45: SIDN of XTalk req. 0xb
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          54<->52: CRB error code 0x0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          55<->55: CRB has an error
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB1_B Entry Register: 0x280302400000000
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB4_A Entry Register: 0x817400ae222781
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          00<->00: IO Write operation
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          39<->02: SN0Net address 0x2b8889e0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          44<->40: TNUM of XTalk req. 0x14
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          48<->45: SIDN of XTalk req. 0xb
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          54<->52: CRB error code 0x0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          55<->55: CRB has an error
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB4_B Entry Register: 0x300304400000000
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB7_A Entry Register: 0xf77400ae222741
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          00<->00: IO Write operation
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          39<->02: SN0Net address 0x2b8889d0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          44<->40: TNUM of XTalk req. 0x14
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          48<->45: SIDN of XTalk req. 0xb
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          49<->49: Error set in incoming XTalk request
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          50<->50: CRB entry has been marked
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          54<->52: CRB error code 0x7
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          55<->55: CRB has an error
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB7_B Entry Register: 0x300304400000000
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB9_A Entry Register: 0x817400ae2227c1
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          00<->00: IO Write operation
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          39<->02: SN0Net address 0x2b8889f0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          44<->40: TNUM of XTalk req. 0x14
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          48<->45: SIDN of XTalk req. 0xb
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          54<->52: CRB error code 0x0
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          55<->55: CRB has an error
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        ICRB9_B Entry Register: 0x280302400000000
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +        BEDROCK NI Port Error Register: 0xff
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: +          07<->00: Number of LLP SN errors 0xff
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: End Hardware Error State
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: ++FRU ANALYSIS BEGIN
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: No rules triggered:  Insufficient data
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: 
Jan 12 14:35:41 5D:tezro sn0log: Timeout Histogram is empty.
Jan 12 14:35:41 5D:tezro sn0log: 
Jan 12 14:35:41 5D:tezro sn0log: A Fatal: ++FRU ANALYSIS END
Jan 12 14:35:41 6D:tezro sn0log: End of flashlog for /hw/module/001c01/node/hub/mon



What could that be?
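
The bit ranges printed next to each ICRB entry in the dump above are enough to decode the raw register values by hand. The following is a minimal Python sketch of that decoding, assuming the ranges shown in the log (for example `39<->02` for the SN0Net address) are inclusive high and low bit positions. The field names are made up for illustration, and only the fields that the dump itself prints are covered.

```python
# Field layout taken from the bit ranges printed in the sn0log dump:
# (high bit, low bit, label). The labels are illustrative names only.
ICRB_A_FIELDS = [
    (0, 0, "io_write"),
    (39, 2, "sn0net_address"),
    (44, 40, "xtalk_tnum"),
    (48, 45, "xtalk_sidn"),
    (49, 49, "xtalk_error_in_request"),
    (50, 50, "marked"),
    (54, 52, "error_code"),
    (55, 55, "has_error"),
]


def decode_icrb_a(value):
    """Return {label: field value} for one ICRBn_A entry register."""
    fields = {}
    for high, low, label in ICRB_A_FIELDS:
        width = high - low + 1
        fields[label] = (value >> low) & ((1 << width) - 1)
    return fields


# Register values copied from the log above.
icrb1 = decode_icrb_a(0x817400AE2227D1)
icrb7 = decode_icrb_a(0xF77400AE222741)

for name, entry in (("ICRB1_A", icrb1), ("ICRB7_A", icrb7)):
    print(name, {label: hex(v) for label, v in entry.items()})
```

Decoded this way, all four entries are IO writes from the same XTalk source (SIDN 0xb, TNUM 0x14) to neighbouring addresses, and only ICRB7_A carries a non-zero error code (0x7). The sketch only reproduces what the log already prints; it does not say what error code 0x7 means.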


Last edited by hhoffman on Thu Jan 12, 2012 10:15 am, edited 1 time in total.

Posted: Thu Jan 12, 2012 6:28 am
Joined: Tue Aug 21, 2007 10:12 pm
Posts: 2859
Location: Fantasyland
Did you clean out the sockets, clean the contacts, try moving the sticks around, all that good stuff? I would start with blowing all the dust out and re-plugging every stick, extra firm.


If problems continue: the last time I heard of a tezro memory upgrade failing, it was because of a VRM pushed beyond its limits. It could be power related, but normally the diagnostic is smart enough to tell you more-or-less exactly what is failing in the power systems.

_________________
"If no one comes from the future to stop you from doing it then how bad of a decision can it really be?"


Posted: Thu Jan 12, 2012 7:13 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
... well, they have been cleaned of dust several times now. I mean both the sockets and the modules .....
All RAM is showing up in hinv.


Posted: Thu Jan 12, 2012 7:16 am
Joined: Fri Apr 02, 2010 9:14 am
Posts: 59
Location: NC - USA
When it crashes, does it report the same DIMM, "IP35 in /hw/module/001c01/node [serial number NCG031]"?
I'm assuming that is the DIMM location. If so, as suggested earlier, move it around, and if it crashes again see whether the same serial number or hardware location is reported.

_________________
5/11/11 12:58:19 AM gfxCardStatus[268] AMD Radeon HD 6750M in use. Bummer! Less battery life for you.
5/11/11 12:58:20 AM gfxCardStatus[268] Intel HD Graphics 3000 in use. Sweet deal! More battery life.
MacBook Pro 17inch 2011
Mac Mini 2010


Posted: Thu Jan 12, 2012 8:22 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
hmm, with
/usr/sbin/l1cmd --scdev /hw/module/001c01/L1/controller serial all

I get:

Code:
EEPROM      Product Name    Serial         Part Number           Rev  T/W   
----------  --------------  -------------  --------------------  ---  ------
NODE        IP53_4CPU       NCG031         030_1868_001          C    00


and

Code:
EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI     
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000C6E9900 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 2     CE000000000000000CC84D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 4     CE000000000000000CBB4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 6     7F94FFFFFFFFFFFFCE79C80D SM57264DSGI100C2   00FF   8.0  N/A     
DIMM 1     CE000000000000000C309D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 3     CE000000000000000CBF4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 5     CE000000000000000CC34D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 7     CE000000000000000C9CAF00 M3 47L6423DT3-CA0   3D   10.0  N/A 


... so it looks more like a cpu problem?


Posted: Thu Jan 12, 2012 9:00 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
any ideas? also nothing useful (for me) in the controller log:

Code:
/usr/sbin/l1cmd --scdev /hw/module/001c01/L1/controller log     
01/11/12 11:32:39 SCC WR 6 (len=97) - UART:UART_TIMEOUT
01/11/12 11:32:41 power up (PANEL)
01/11/12 11:32:45 Node 0 XTalk clock 88
01/11/12 11:32:48 reset again MIPS
01/11/12 11:32:52 Node 0 XTalk clock 88
01/11/12 15:59:46 Power Down: issue 'pwr d' again to immediately power down.
01/11/12 15:59:47 power down (PANEL)
01/11/12 15:59:47 SCC WR 6 (len=97) - UART:UART_TIMEOUT
01/12/12 00:52:41 L1 booting 1.30.11
01/12/12 00:52:43 USB0: waiting on open
01/12/12 01:40:24 L1 booting 1.30.11
01/12/12 01:40:27 USB0: waiting on open
01/12/12 01:40:56 power up (PANEL)
01/12/12 01:41:01 Node 0 XTalk clock 88
01/12/12 01:41:03 reset again MIPS
01/12/12 01:41:06 Cooling system stabilized
01/12/12 01:41:07 Node 0 XTalk clock 88
01/12/12 05:53:19 Power Down: issue 'pwr d' again to immediately power down.
01/12/12 05:53:20 SCC WR 6 (len=97) - UART:UART_TIMEOUT
01/12/12 05:55:44 power down (PANEL)
01/12/12 05:55:45 power up (PANEL)
01/12/12 05:55:50 Node 0 XTalk clock 88
01/12/12 05:55:52 reset again MIPS
01/12/12 05:55:57 Node 0 XTalk clock 88
01/12/12 08:16:55 Power Down: issue 'pwr d' again to immediately power down.
01/12/12 08:16:56 SCC WR 6 (len=97) - UART:UART_TIMEOUT
01/12/12 08:16:57 power down (PANEL)
01/12/12 08:18:47 L1 booting 1.30.11
01/12/12 08:18:49 USB0: waiting on open
01/12/12 08:19:02 power up (PANEL)
01/12/12 08:19:07 Node 0 XTalk clock 88
01/12/12 08:19:09 Cooling system stabilized
01/12/12 08:19:09 reset again MIPS
01/12/12 08:19:14 Node 0 XTalk clock 88
01/12/12 08:37:12 Power Down: issue 'pwr d' again to immediately power down.
01/12/12 08:37:13 power down (PANEL)
01/12/12 08:37:13 SCC WR 6 (len=97) - UART:UART_TIMEOUT
01/12/12 08:45:58 L1 booting 1.30.11
01/12/12 08:46:00 USB0: waiting on open
01/12/12 08:46:51 power up (PANEL)
01/12/12 08:46:56 Node 0 XTalk clock 88
01/12/12 08:46:58 reset again MIPS
01/12/12 08:47:00 Cooling system stabilized
01/12/12 08:47:02 Node 0 XTalk clock 88


The tezro crashed again. I have now removed all the new memory modules.
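
To check the "~4 hours" pattern against the controller log, the time between each `power up` and the next `Power Down` entry can be computed. This is a rough Python sketch, assuming the timestamps are in month/day/year order and that the first "power down" line after a power-up marks the end of that run. The sample lines are copied from the log above; `run_durations` is a hypothetical helper name.

```python
from datetime import datetime

# A few lines copied from the L1 controller log quoted above.
SAMPLE_LOG = """\
01/11/12 11:32:41 power up (PANEL)
01/11/12 11:32:45 Node 0 XTalk clock 88
01/11/12 15:59:46 Power Down: issue 'pwr d' again to immediately power down.
01/11/12 15:59:47 power down (PANEL)
01/12/12 01:40:56 power up (PANEL)
01/12/12 01:41:06 Cooling system stabilized
01/12/12 05:53:19 Power Down: issue 'pwr d' again to immediately power down.
01/12/12 05:55:44 power down (PANEL)
"""


def run_durations(log_text):
    """Return (start, duration) for each power-up to power-down interval."""
    runs = []
    started = None
    for line in log_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1],
                                      "%m/%d/%y %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line (e.g. the command itself)
        message = parts[2].lower()
        if message.startswith("power up"):
            started = stamp
        elif message.startswith("power down") and started is not None:
            runs.append((started, stamp - started))
            started = None
    return runs


for start, duration in run_durations(SAMPLE_LOG):
    print(start, "ran for", duration)
```

For the two sample runs this gives 4:27:05 and 4:12:23, which fits the roughly four-hour pattern. The later runs in the full log were ended much sooner, so they say little about the crash interval.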


Posted: Thu Jan 12, 2012 10:13 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
... still no luck after removing all the *new* RAM modules; the system is still crashing (with panic).
:cry:

any ideas?


Posted: Thu Jan 12, 2012 10:59 am
Joined: Fri Apr 02, 2010 9:14 am
Posts: 59
Location: NC - USA
I'm going to guess that the CPU at /hw/module/001c01/node [serial number NCG031] caught the error and reported it. I would treat "DIMM 6 7F94FFFFFFFFFFFFCE79C80D SM57264DSGI100C2 00FF 8.0 N/A" as suspect. It doesn't match the rest; the speed is different. If you had another one like it, you could have paired them.
Try removing it and see how the system runs. Also take a look at it to see whether it is physically different.
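
The "doesn't match the rest" check can also be done mechanically. Here is a small Python sketch that parses the DIMM rows of the `serial all` output quoted earlier in the thread and lists the slots whose part number or speed differs from the most common combination. The column layout is assumed from that one printout, and `odd_dimms` is a hypothetical helper name.

```python
import re
from collections import Counter

# DIMM rows copied from the first `serial all` printout in the thread.
SAMPLE = """\
DIMM 0     CE000000000000000C6E9900 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 2     CE000000000000000CC84D00 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 4     CE000000000000000CBB4D00 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 6     7F94FFFFFFFFFFFFCE79C80D SM57264DSGI100C2   00FF   8.0  N/A
DIMM 1     CE000000000000000C309D00 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 3     CE000000000000000CBF4D00 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 5     CE000000000000000CC34D00 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 7     CE000000000000000C9CAF00 M3 47L6423DT3-CA0   3D   10.0  N/A
"""

# slot, SPD info, part number (may contain a space), rev, speed, SGI column
ROW = re.compile(
    r"^DIMM\s+(\d+)\s+([0-9A-F]+)\s+(.+?)\s+([0-9A-F]{2,4})"
    r"\s+(\d+\.\d+)\s+(\S+)\s*$"
)


def odd_dimms(table_text):
    """Return sorted slot numbers that differ from the majority part/speed."""
    rows = []
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((int(match.group(1)), match.group(3), match.group(5)))
    if not rows:
        return []
    counts = Counter((part, speed) for _, part, speed in rows)
    majority = counts.most_common(1)[0][0]
    return sorted(slot for slot, part, speed in rows
                  if (part, speed) != majority)


print(odd_dimms(SAMPLE))  # prints [6, 7]
```

On that printout the check flags DIMM 7 as well as DIMM 6: its part number (M3 47L6423DT3-CA0) differs from the other Samsung-style entries even though the speed column matches. A differing part number is not proof of a fault, only a reason to look at that module first.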



Posted: Thu Jan 12, 2012 11:07 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
That printout came from a reinstall of the *old* 512 modules. This is the current printout:

Code:
EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI     
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000C6E9900 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 2     CE000000000000000CBB4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 4     CE000000000000000CBF4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 6     CE000000000000000CD74D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 1     CE000000000000000C309D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 3     CE000000000000000CC84D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 5     CE000000000000000CC34D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 7     CE000000000000000CD34D00 M3 46L2820DT2-CA0   2D   10.0  N/A 


Posted: Thu Jan 12, 2012 11:53 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
does anyone know where the runalldiags script went in irix 6.5.28?


Posted: Thu Jan 12, 2012 3:23 pm
Moderator
Joined: Sun Jun 06, 2004 5:55 pm
Posts: 5219
Location: NC - USA
Doesn't look like you've found a ready solution. I'd suggest some basic diagnostic troubleshooting to see if you can narrow down cause and effect.

First, a little additional background info might be helpful. Could you post an hinv -vm, an L1 serial all, and an L1 pci? (before and after your recent upgrade if they're available)

Then I'd recommend reducing your Tezro to a minimal configuration. If it runs in that configuration without generating the crash/panic, re-add the components one at a time to see whether re-installing any particular item causes the problem to reappear.

So, remove:
  • All PCI boards,
  • All but the memory in the first bank (two DIMMs),
  • the DM3
  • the secondary hard drive (unit 2 in your hinv)
Boot the system, stop in the PROM, and clear the Power-on Diagnostics (POD) logs (so you don't propagate existing diagnostic failures). To access POD mode, stop at the PROM command line (item 5 in the PROM menu list), and sequentially run the following commands from the command line:
  • pod
  • go cac
  • log <Note: I'd suggest recording the output of the pod log command>
  • clearalllogs
  • initalllogs
  • flush
  • reset (the system will restart)
After the system restarts, go back into the PROM monitor and execute:
  • update
Then boot the system and test to see if the panic re-occurs.

_________________
***********************************************************************
Welcome to ARMLand - 0/0x0d00
running...(sherwood-root 0607201829)
* InfiniteReality/Reality Software, IRIX 6.5 Release *
***********************************************************************


Posted: Fri Jan 13, 2012 3:18 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
When I call pod from the command line in the PROM, nothing happens ....
:roll:

Meanwhile I have a nullmodemcable, but not a genderchanger to connect the console port of the tezro to the serial port of an o2. Is there a wiki somewhere about how to 'talk' to the tezro L1 from another irix workstation?
Or maybe a wiki on how to connect from linux through the usb L1 connector?

How do I find faulty memory modules?

well that is the system:

Code:
Location: /hw/module/001c01/node
       IP53_4CPU Board: barcode NCG031     part 030-1868-001 rev -C
Location: /hw/module/001c01/IXbrick/xtalk/11
       WS_INT_53 Board: barcode NBY613     part 030-1881-006 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/12
      ODY128B1_2 Board: barcode MZD531     part 030-1884-004 rev -D
Location: /hw/module/001c01/IXbrick/xtalk/13
       XT-DIGVID Board: barcode NCB425     part 030-1927-003 rev  B
Location: /hw/module/001c01/IXbrick/xtalk/15
       WS_INT_53 Board: barcode NBY613     part 030-1881-006 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/15/pci-x/0/1/ioc4
             IO9 Board: barcode NAF962     part 030-1771-005 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/15/pci-x/1/2
     PCI_SIO_UFC Board: barcode MYS662     part 030-1657-003 rev  A
4 700 MHZ IP35 Processors
CPU: MIPS R16000 Processor Chip Revision: 2.1
FPU: MIPS R16010 Floating Point Chip Revision: 2.1
CPU 0 at Module 001c01/Slot 0/Slice A: 700 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.1. Scache: Size 4 MB Speed 350 Mhz  Tap 0xc
CPU 1 at Module 001c01/Slot 0/Slice B: 700 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.1. Scache: Size 4 MB Speed 350 Mhz  Tap 0xc
CPU 2 at Module 001c01/Slot 0/Slice C: 700 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.1. Scache: Size 4 MB Speed 350 Mhz  Tap 0xc
CPU 3 at Module 001c01/Slot 0/Slice D: 700 Mhz MIPS R16000 Processor Chip (enabled)
  Processor revision: 2.1. Scache: Size 4 MB Speed 350 Mhz  Tap 0xc
Main memory size: 8192 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 4 Mbytes
Memory at Module 001c01/Slot 0: 8192 MB (enabled)
  Bank 0 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 1 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 2 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 3 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 4 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 5 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 6 contains 1024 MB (Premium) DIMMS (enabled)
  Bank 7 contains 1024 MB (Premium) DIMMS (enabled)
Integral SCSI controller 3: Version Fibre Channel QL2342 Port 1, 133 MHz PCI-X
  Disk drive: unit 0 on SCSI controller 3 (unit 0)
  Disk drive: unit 1 on SCSI controller 3 (unit 1)
  Disk drive: unit 2 on SCSI controller 3 (unit 2)
  Disk drive: unit 3 on SCSI controller 3 (unit 3)
  Disk drive: unit 4 on SCSI controller 3 (unit 4)
  Disk drive: unit 5 on SCSI controller 3 (unit 5)
  Disk drive: unit 6 on SCSI controller 3 (unit 6)
  Disk drive: unit 7 on SCSI controller 3 (unit 7)
  Disk drive: unit 8 on SCSI controller 3 (unit 8)
  Disk drive: unit 9 on SCSI controller 3 (unit 9)
  Disk drive: unit 10 on SCSI controller 3 (unit 10)
  Disk drive: unit 11 on SCSI controller 3 (unit 11)
  Disk drive: unit 12 on SCSI controller 3 (unit 12)
  Disk drive: unit 13 on SCSI controller 3 (unit 13)
  Disk drive: unit 14 on SCSI controller 3 (unit 14)
Integral SCSI controller 4: Version Fibre Channel QL2342 Port 2, 133 MHz PCI-X
  Disk drive: unit 0 on SCSI controller 4 (unit 0)
  Disk drive: unit 1 on SCSI controller 4 (unit 1)
  Disk drive: unit 2 on SCSI controller 4 (unit 2)
  Disk drive: unit 3 on SCSI controller 4 (unit 3)
  Disk drive: unit 4 on SCSI controller 4 (unit 4)
  Disk drive: unit 5 on SCSI controller 4 (unit 5)
  Disk drive: unit 6 on SCSI controller 4 (unit 6)
  Disk drive: unit 7 on SCSI controller 4 (unit 7)
  Disk drive: unit 8 on SCSI controller 4 (unit 8)
  Disk drive: unit 9 on SCSI controller 4 (unit 9)
  Disk drive: unit 10 on SCSI controller 4 (unit 10)
  Disk drive: unit 11 on SCSI controller 4 (unit 11)
  Disk drive: unit 12 on SCSI controller 4 (unit 12)
  Disk drive: unit 13 on SCSI controller 4 (unit 13)
  Disk drive: unit 14 on SCSI controller 4 (unit 14)
Integral SCSI controller 5: Version Fibre Channel QL2342 Port 1, 133 MHz PCI-X
Integral SCSI controller 6: Version Fibre Channel QL2342 Port 2, 133 MHz PCI-X
Integral SCSI controller 2: Version IDE (ATA/ATAPI) IOC4
  CDROM: unit 0 on SCSI controller 2
Integral SCSI controller 0: Version QL12160, low voltage differential
  Disk drive: unit 1 on SCSI controller 0 (unit 1)
  Disk drive: unit 2 on SCSI controller 0 (unit 2)
Integral SCSI controller 1: Version QL12160, low voltage differential
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3/IOC4 serial port: tty5
IOC3/IOC4 serial port: tty6
Graphics board: V12
Gigabit Ethernet: tg1, module 001c01, PCI bus 3 slot 1 port 0
Gigabit Ethernet: tg2, module 001c01, PCI bus 3 slot 1 port 1
Integral Gigabit Ethernet: tg0, module 001c01, PCI bus 1 slot 4
Iris Audio Processor: version MAD revision 1, number 1
Iris Audio Processor: version RAD revision 13.0, number 1
  PCI Adapter ID (vendor 0x1077, device 0x2312) PCI slot 1
  PCI Adapter ID (vendor 0x1077, device 0x2312) PCI slot 1
  PCI Adapter ID (vendor 0x14e4, device 0x1648) PCI slot 1
  PCI Adapter ID (vendor 0x14e4, device 0x1648) PCI slot 1
  PCI Adapter ID (vendor 0x1077, device 0x2312) PCI slot 2
  PCI Adapter ID (vendor 0x1077, device 0x2312) PCI slot 2
  PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
  PCI Adapter ID (vendor 0x104c, device 0xac28) PCI slot 2
  PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
  PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
  PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 2
  PCI Adapter ID (vendor 0x10a9, device 0x0005) PCI slot 1
  PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 2
XT-DIGVID Multi-standard Digital Video: controller 0, unit 0, version 0x0
IOC4 firmware revision 79
IOC3/IOC4 external interrupts: 1
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
Dual Channel Display
IP35prom in Module 001c01/Slot n0: Revision 6.210


and

Code:
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM         P1000651
Reference System Serial Number  NVRAM         P1000651
Local Brick Serial Number       EEPROM        NBY613
Reference Brick Serial Number   NVRAM         NBY613


EEPROM      Product Name    Serial         Part Number           Rev  T/W   
----------  --------------  -------------  --------------------  ---  ------
INTERFACE   WS_INT_53       NBY613         030_1881_006          A    00   
IO9         IO9             NAF962         030_1771_005          A    00   
ODYSSEY     ODY128B1_2      MZD531         030_1884_004          D    00   
SNOWBALL    SNOWBALL_EDGE   NCB425         030_1927_003          B    00   
NODE        IP53_4CPU       NCG031         030_1868_001          C    00   
IO DGHTR    CHWS_IO_DAUG    NAM759         030_1875_003          A    00   

EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI     
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000C6E9900 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 2     CE000000000000000CBB4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 4     CE000000000000000CBF4D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 6     CE000000000000000CD74D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 1     CE000000000000000C309D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 3     CE000000000000000CC84D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 5     CE000000000000000CC34D00 M3 46L2820DT2-CA0   2D   10.0  N/A     
DIMM 7     CE000000000000000CD34D00 M3 46L2820DT2-CA0   2D   10.0  N/A 


and

Code:
Bus Slot Slot Stat Bus Stat  Power Mode/Speed
--- ---- --------- --------- ----- ----------
  1    1 0x80 0x01      0x04   15W PCI  66MHz
  2    1 0x00 0x00      0x02  7.5W PCI  33MHz
  2    2 0x00 0x00      0x02  7.5W PCI  33MHz
  2    3 0x00 0x0f      0x02  none PCI  33MHz
  3    1 0x00 0x0d      0x2c   15W PCIX 133MHz
  3    2 0x00 0x0d      0x2c   15W PCIX 133MHz
  4    1 0x00 0x0d      0x6c   15W PCIX 133MHz
  4    2 0x00 0x0f      0x6c  none PCIX 133MHz


Posted: Fri Jan 13, 2012 3:39 pm
Moderator
Joined: Sun Jun 06, 2004 5:55 pm
Posts: 5219
Location: NC - USA
hhoffman wrote:
Meanwhile I have a nullmodemcable, but not a genderchanger to connect the console port of the tezro....
That should be cheap (and easy) to solve.
hhoffman wrote:
How do I find faulty memory modules?
As guardian452 mentioned, there's already been a tezro-with-issues thread that ended up being attributed to the power draw on the system. Yours looks pretty heavily loaded - probably more so than the one in the other thread. You might indeed have faulty memory, but it might also be possible that maxing out your Tezro's memory caused just enough additional power load to cause problems.

If that's the case, hopefully no permanent damage was done. Testing your system with a minimal configuration and working your way back towards your original config will probably tell the story.
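
As a rough way to put a number on "heavily loaded", the Power column of the `L1 pci` table posted above can be summed. This Python sketch assumes that column is the power class each populated slot declares (with `none` meaning an empty slot), which the thread does not confirm. It covers the PCI slots only, not the DIMMs, graphics, or disks, so treat the total as a lower bound on a worst-case figure rather than a measurement.

```python
import re

# Rows copied from the `L1 pci` output posted earlier in the thread.
SAMPLE_PCI = """\
  1    1 0x80 0x01      0x04   15W PCI  66MHz
  2    1 0x00 0x00      0x02  7.5W PCI  33MHz
  2    2 0x00 0x00      0x02  7.5W PCI  33MHz
  2    3 0x00 0x0f      0x02  none PCI  33MHz
  3    1 0x00 0x0d      0x2c   15W PCIX 133MHz
  3    2 0x00 0x0d      0x2c   15W PCIX 133MHz
  4    1 0x00 0x0d      0x6c   15W PCIX 133MHz
  4    2 0x00 0x0f      0x6c  none PCIX 133MHz
"""

# bus, slot, three hex status fields, power, mode, speed
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def declared_pci_watts(table_text):
    """Sum the Power column over all rows that list a wattage."""
    total = 0.0
    for line in table_text.splitlines():
        match = ROW.match(line)
        if match and match.group(3).endswith("W"):
            total += float(match.group(3)[:-1])
    return total


print(declared_pci_watts(SAMPLE_PCI))  # prints 75.0
```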

Considering your mention that your Tezro would crash after approximately four hours of use, there's also the possibility the additional load might have raised the temperature on some of the internal components just enough to cause heat-related issues. That superficially fits with your Bedrock ASIC being prominently mentioned in the error log you posted - the Bedrock is usually the hottest running part of a Tezro. As you start the add-to-the-minimal-config troubleshooting process, during each step I'd suggest allowing your Tezro to run as long as it took for it to crash before, and then record the output of an L1 env so you can see what effect adding additional load has on system temperatures (and stability).
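
One possible way to keep those per-step `env` readings together is a small script that appends a labelled, timestamped record to a text file. The Python sketch below assumes that `env` can be passed through `l1cmd` in the same way `serial all` and `log` were earlier in the thread; that is an assumption, not something confirmed here. `env_snapshot` and `append_snapshot` are hypothetical helper names, and the output format of `env` is deliberately not parsed.

```python
import subprocess
import time

# Same l1cmd invocation style as used earlier in the thread. Passing "env"
# through it is an assumption, not something the thread confirms.
L1_ENV_CMD = ["/usr/sbin/l1cmd", "--scdev",
              "/hw/module/001c01/L1/controller", "env"]


def env_snapshot(step_label, run=subprocess.check_output, clock=time.strftime):
    """Return one labelled, timestamped record of the raw L1 env output."""
    raw = run(L1_ENV_CMD)
    text = raw.decode(errors="replace") if isinstance(raw, bytes) else raw
    stamp = clock("%Y-%m-%d %H:%M:%S")
    return "=== %s | %s ===\n%s\n" % (stamp, step_label, text)


def append_snapshot(path, step_label):
    """Append a snapshot to a plain-text log, one record per test step."""
    with open(path, "a") as handle:
        handle.write(env_snapshot(step_label))
```

A call such as `append_snapshot("tezro_env.log", "bank 0 only, no PCI cards")` after each soak period would then leave one record per configuration to compare afterwards.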



Posted: Sun Jan 15, 2012 7:42 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
OK, meanwhile I got a genderchanger and can now connect an o2 (/dev/ttyd1) to the console port of the tezro. What's next? How do I get the L1 env output of the tezro displayed on the o2?

Thank you for your help, recondas!
:D


Posted: Sun Jan 15, 2012 9:08 am
Joined: Fri Apr 01, 2011 7:45 am
Posts: 71
... btw, the tezro has now been running for 4:30 with the full setup and 8GB RAM in idle mode without a crash.
:roll:

