[tech] Downtime & R.I.P. Maltair

Felix von Perger frekk at ucc.asn.au
Wed Aug 8 23:51:23 AWST 2018


Dear tech subscribers,

For those of you who have not been following the committee discussions
of the last week or so: there was a total service outage this morning
between 8:00 and 10:00 due to RCD testing in Cameron Hall.
Apologies for any inconvenience.

Sadly, in the process of turning things back on after the power was 
restored, an IMM2 firmware bug on Maltair seems to have rendered it 
permanently unbootable (see 
https://support.lenovo.com/au/en/solutions/ht118532). [CFE] performed a
firmware upgrade this evening from v4.3 to the latest version (v6.8);
however, it seems the damage has already been done, and either the
entire motherboard or the built-in 5V voltage regulator will need to be
repaired or replaced.

Due to Maltair being presently out of action, additional downtime may be 
experienced for certain services that were previously hosted on Maltair. 
Since Maltair accounted for most of our RAM availability, member VMs 
with large RAM requirements may remain powered off for the time being or 
have their maximum RAM reduced.

Any suggestions for replacement hardware for Maltair are welcome. The 
existing server is a 1RU IBM System x3550 M4 (7914/7915), and it is 
likely that the majority of its parts (CPU, RAM, RAID, 10Gb NIC, PSUs) 
are still functional despite the system board being fried.

Best regards,

Felix von Perger [FVP]
UCC Secretary & Wheel Member
