[tech] Downtime & R.I.P. Maltair
Felix von Perger
frekk at ucc.asn.au
Wed Aug 8 23:51:23 AWST 2018
Dear tech subscribers,
For those of you who have not been following the committee discussions
of the last week or so, there was a total service outage this morning
between 8:00 and 10:00 due to RCD testing in Cameron Hall.
Apologies for any inconvenience.
Sadly, in the process of turning things back on after the power was
restored, an IMM2 firmware bug on Maltair seems to have rendered it
permanently unbootable (see
https://support.lenovo.com/au/en/solutions/ht118532). [CFE] upgraded the
firmware this evening from v4.3 to the latest version (v6.8); however,
it seems the damage has already been done, and either the entire
motherboard or the built-in 5V voltage regulator will need to be
replaced or repaired.
With Maltair presently out of action, certain services that were
previously hosted on it may experience additional downtime.
Since Maltair accounted for most of our RAM availability, member VMs
with large RAM requirements may remain powered off for the time being or
have their maximum RAM reduced.
Any suggestions for replacement hardware for Maltair are welcome. The
existing server is a 1RU IBM System x3550 M4 (7914/7915), and it is
likely that the majority of its parts (CPU, RAM, RAID, 10Gb NIC, PSUs)
are still functional despite the system board being fried.
Best regards,
Felix von Perger [FVP]
UCC Secretary & Wheel Member