[tech] Manbo downtime, /services offline
David Adam
zanchey at ucc.gu.uwa.edu.au
Thu Jan 17 19:27:26 WST 2008
Probably due to extreme weather conditions, Manbo has been having severe
operating difficulty, so it has been shut down for an unknown period of
time.
It spent most of this afternoon trying to reboot but failing due to a ZFS
misconfiguration (Adrian, I had to mark one of your shares as
unmountable), and once this problem was fixed the network interface
refused to initialise. Additionally, the disk arrays were reporting
multiple problems.
As UCC is currently locked, we turned Manbo off remotely in order to avoid
further damage.
This has taken /services offline, which means (among other things) no
main UCC website, wiki or forums. User web space is still working. Windows
machine logins may not work, and /away accesses (including Windows
home directories) will certainly not.
We believe this is due to high temperatures in the machine room caused by
the failure of one of the air conditioners (which has been doing funny
things for a while). It's still under warranty but getting hold of the
manufacturer is proving difficult.
There are a couple of things we're doing to try and restore service:
- the cables for our Fibre Channel disk array have arrived, so we're
trying to move the files located on Manbo over to Musundo, the V480
which is hosting the FC arrays. Musundo runs much cooler (and faster).
- if we can't get the aircon fixed in a reasonable time, we'll buy a new
one (last time it took less than six hours from "we need a new aircon"
to its installation).
If you have any problems or questions please reply to the list or contact
us directly on wheel at ucc.gu.uwa.edu.au
Thanks,
David Adam
UCC Wheel Member
zanchey at ucc.gu.uwa.edu.au