Discussion:
Fuel shuts down due to the high fault limit
Daichi Kawahata
2006-11-20 15:19:59 UTC
Hi,

I have a problem that started last Monday (11/13): after a
series of WARNINGs (see *1 below), my Fuel shut itself down
and would not boot (it does power on, but before reaching the
PROM screen it shuts down again with a red LED).

Booting is not 100% reproducible, though: this mail is sent from
the same Fuel, which booted happily this time. However, the
WARNINGs are still being logged, and I bet my Fuel will shut
down again within the next 24 hours.

From what I've found by searching, it might be related to an old
motherboard. Do you think my board has that problem too, or is
another factor involved? (I really have no clue whether the
warnings are false positives or not.)

I'm willing to provide any information needed, and would
appreciate any comments.

(*1) From SYSLOG

Nov 13 19:43:48 4A:fuel unix: |$(0x159)WARNING: 001a01 ATTN: PIMM 5V AUX high warning limit reached 5.512V.
Nov 13 19:44:08 4A:fuel unix: |$(0x160)WARNING: 001a01 ATTN: PIMM 5V AUX level stabilized @ 5.460V.
[...]
Nov 13 22:40:43 4A:fuel unix: |$(0x157)WARNING: 001a01 ATTN: PIMM 5V AUX high fault limit reached @ 6.032V.
Nov 13 22:40:43 4A:fuel unix: WARNING: Auto power down will be delayed until shutdown is complete.
[...]
Nov 13 22:40:52 3A:fuel INFO: The system is shutting down.
Nov 13 22:40:52 3A:fuel INFO: Please wait.
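
For anyone tracking the same symptom, the voltage readings can be pulled out of SYSLOG lines with a short script. This is only a rough sketch: the regular expression is inferred from the three message forms quoted above and may not match other L1 messages, and the names `EVENT_RE`, `parse_env_events` and `SAMPLE` are made up for illustration.

```python
import re

# Pattern inferred from the three message forms quoted in this post
# (an assumption, not an official description of the L1 log format):
#   ... ATTN: PIMM 5V AUX high warning limit reached 5.512V.
#   ... ATTN: PIMM 5V AUX level stabilized @ 5.460V.
#   ... ATTN: PIMM 5V AUX high fault limit reached @ 6.032V.
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s.*?"
    r"ATTN:\s+(?P<rail>.+?)\s+"
    r"(?P<event>high warning limit reached"
    r"|high fault limit reached"
    r"|level stabilized)"
    r"\s+(?:@\s+)?(?P<volts>\d+\.\d+)V"
)


def parse_env_events(lines):
    """Return (timestamp, rail, event, volts) for each matching line."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            events.append(
                (m["stamp"], m["rail"], m["event"], float(m["volts"]))
            )
    return events


# Sample lines copied from the SYSLOG excerpt above.
SAMPLE = [
    "Nov 13 19:43:48 4A:fuel unix: |$(0x159)WARNING: 001a01 ATTN: "
    "PIMM 5V AUX high warning limit reached 5.512V.",
    "Nov 13 19:44:08 4A:fuel unix: |$(0x160)WARNING: 001a01 ATTN: "
    "PIMM 5V AUX level stabilized @ 5.460V.",
    "Nov 13 22:40:43 4A:fuel unix: |$(0x157)WARNING: 001a01 ATTN: "
    "PIMM 5V AUX high fault limit reached @ 6.032V.",
    "Nov 13 22:40:52 3A:fuel INFO: The system is shutting down.",
]

if __name__ == "__main__":
    for stamp, rail, event, volts in parse_env_events(SAMPLE):
        print(f"{stamp}  {rail}: {event} at {volts:.3f} V")
```

Lines that are not environmental events (such as the shutdown notice) are simply skipped, so the same function could be fed a whole log file line by line.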

Regards,
--
Daichi
Daichi Kawahata
2006-11-20 17:38:59 UTC
Hi,

I should have included my hinv output:

$ hinv -vm
Location: /hw/module/001c01/node
IP34 Board: barcode MMN913 part 030-1707-004 rev -C
Location: /hw/module/001c01/Ibrick/xtalk/13
ASTODYB Board: barcode MDB830 part 030-1725-001 rev -F
Location: /hw/module/001c01/Ibrick/xtalk/14
IP34 Board: barcode MMN913 part 030-1707-004 rev -C
Location: /hw/module/001c01/Ibrick/xtalk/15
IP34 Board: barcode MMN913 part 030-1707-004 rev -C
1 500 MHZ IP35 Processor
CPU: MIPS R14000 Processor Chip Revision: 2.3
FPU: MIPS R14010 Floating Point Chip Revision: 2.3
CPU 0 at Module 001c01/Slot 0/Slice A: 500 Mhz MIPS R14000 Processor Chip (enabled)
Processor revision: 2.3. Scache: Size 2 MB Speed 250 Mhz Tap 0xa
Main memory size: 2048 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 2 Mbytes
Memory at Module 001c01/Slot 0: 2048 MB (enabled)
Bank 0 contains 512 MB (Standard) DIMMS (enabled)
Bank 1 contains 512 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
Integral SCSI controller 2: Version QL12160, low voltage differential
Integral SCSI controller 3: Version QL12160, low voltage differential
Integral SCSI controller 0: Version QL12160, low voltage differential
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL12160, single ended
CDROM: unit 4 on SCSI controller 1
IOC3/IOC4 serial port: tty1
IOC3/IOC4 serial port: tty2
IOC3 parallel port: plp1
Graphics board: V10
Integral Fast Ethernet: ef0, version 1, module 001c01, pci 4
Iris Audio Processor: version MAD revision 1, number 1
PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 1
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 2
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 1
PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 4
PCI Adapter ID (vendor 0x11c1, device 0x5802) PCI slot 5
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
IP35prom in Module 001c01/Slot n0: Revision 6.170
USB controller: type OHCI
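
Since the problem is suspected to depend on the motherboard, it may help to compare board part numbers and revisions between machines. The following is a minimal sketch that extracts them from `hinv -vm` text; the regular expression is based only on the board lines shown above, the names `BOARD_RE`, `parse_boards` and `SAMPLE_HINV` are hypothetical, and nothing in this thread says which revisions are affected.

```python
import re

# Pattern based only on the "... Board:" lines in the output above
# (an assumption; other systems may print these lines differently).
BOARD_RE = re.compile(
    r"^(?P<name>\S+) Board: barcode (?P<barcode>\S+)\s+"
    r"part (?P<part>\S+) rev (?P<rev>\S+)"
)


def parse_boards(hinv_text):
    """Return one dict per board line, tagged with the preceding Location."""
    boards = []
    location = None
    for raw in hinv_text.splitlines():
        line = raw.strip()
        if line.startswith("Location:"):
            # Remember the location so it can be attached to the next board.
            location = line.split(":", 1)[1].strip()
            continue
        m = BOARD_RE.match(line)
        if m:
            boards.append({"location": location, **m.groupdict()})
    return boards


# First four lines of the hinv output above.
SAMPLE_HINV = """\
Location: /hw/module/001c01/node
IP34 Board: barcode MMN913 part 030-1707-004 rev -C
Location: /hw/module/001c01/Ibrick/xtalk/13
ASTODYB Board: barcode MDB830 part 030-1725-001 rev -F
"""

if __name__ == "__main__":
    for board in parse_boards(SAMPLE_HINV):
        print(board["name"], board["part"], board["rev"], board["location"])
```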


Joerg Behrens
2006-11-20 17:44:13 UTC
Post by Daichi Kawahata
From what I've searched, it might be related to an old
motherboard.....

Then you have also read the solution.

1. Call SGI Service, who will have to replace the motherboard.
2. If you don't have a service contract, disable the L1 environmental monitoring.


regards
Joerg

PS: I chose option 2 for my FUEL system :|
--
TakeNet GmbH http://www.takenet.de
97080 Wuerzburg Tel: +49 931 903-2243
Alfred-Nobel-Straße 20 Fax: +49 931 903-3025
Daichi Kawahata
2006-11-21 15:06:52 UTC
Hello, Joerg,

On Mon, 20 Nov 2006 18:44:13 +0100
Post by Joerg Behrens
Then you have also read the solution.
1. Call SGI Service, who will have to replace the motherboard.
2. If you don't have a service contract, disable the L1 environmental monitoring.

Thanks for your advice,

For those who are not sure how to disable environmental
monitoring, try (as root):

# l1cmd env off

Regards,
--
Daichi
Joerg Behrens
2006-11-21 15:29:23 UTC
Post by Daichi Kawahata
For those who are not sure how to disable environmental
monitoring, try (as root):
# l1cmd env off

My FUEL has been running for more than a year now with L1 env
monitoring disabled. I expect that someday the machine will refuse
to boot because something breaks.

Updating the firmware to the latest revision doesn't help, so this
isn't a software-only problem. I know from some nekochan members that
SGI changed the hardware when this problem occurred.

regards
Joerg