[Resolved] xn2 issues

The RAID array on xn2 has become inoperable, despite being optimal with all disks present only a few days ago.

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
-----------------------------------------------------------------------------
u0    RAID-10   INOPERABLE     -       -       64K     596.025   OFF    ON
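
For anyone who wants to watch for this kind of status automatically, here is a rough sketch of how a monitoring script might pick out a non-healthy unit from output shaped like the table above. The function names are made up for illustration, and treating "OK" as the only healthy status is an assumption, so adjust it to whatever your controller actually reports.

```python
def parse_unit_rows(text):
    """Parse unit rows from controller output shaped like the table above."""
    units = []
    for line in text.splitlines():
        fields = line.split()
        # Unit rows start with an identifier such as "u0"; this skips the
        # header line, the separator line and blank lines.
        if len(fields) >= 3 and fields[0].startswith("u") and fields[0][1:].isdigit():
            units.append({"unit": fields[0], "type": fields[1], "status": fields[2]})
    return units


def unhealthy_units(text, ok_statuses=("OK",)):
    """Return units whose status is not in ok_statuses (assumed healthy set)."""
    return [u for u in parse_unit_rows(text) if u["status"] not in ok_statuses]


SAMPLE = """\
Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
-----------------------------------------------------------------------------
u0    RAID-10   INOPERABLE     -       -       64K     596.025   OFF    ON
"""

if __name__ == "__main__":
    for unit in unhealthy_units(SAMPLE):
        print(f"{unit['unit']} ({unit['type']}) is {unit['status']}")
```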

We are hoping this is a malfunction of the RAID controller and not an actual array issue. We are currently waiting on the datacenter for updates.

Update: 8.12PM – Just called the DC to chase up the reboot request and it’s being done now. Apparently they are very busy tonight.
Update: 8.26PM – Machine rebooted but is not responding to ping and there is no output on the KVM. It is being checked, and we are also preparing to go onsite!
Update: 8.35PM – System is now booting up and the RAID array appears intact! Will do some sanity checks once it’s up.

Update: 8.39PM – It appears that there were multiple drive failures. This machine has a 4-disk RAID10 set.
Drive p0 – Hot-swapped a few days ago and is OK
Drive p1 – Failed, completely dead/undetected
Drive p2 – OK
Drive p3 – Rebuilding
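
As a rough illustration of why multiple drive failures matter here: RAID10 stripes data across mirrored pairs, so the array stays usable only while every pair still has at least one good drive. The sketch below models that rule. The pairing of p0 with p1 and p2 with p3 is purely hypothetical (we have not stated which drives mirror each other), and a rebuilding drive is counted as not yet holding a complete copy.

```python
def raid10_usable(drive_ok, mirror_pairs):
    """Return True if every mirror pair has at least one working drive."""
    return all(any(drive_ok[d] for d in pair) for pair in mirror_pairs)


# Hypothetical pairing, for illustration only.
PAIRS = [("p0", "p1"), ("p2", "p3")]

# State from the list above: p1 is dead and p3 is still rebuilding,
# so neither is counted as holding a complete copy of its mirror.
current = {"p0": True, "p1": False, "p2": True, "p3": False}

# What would happen if both members of one pair were lost.
pair_lost = {"p0": False, "p1": False, "p2": True, "p3": True}

if __name__ == "__main__":
    print("current state usable:", raid10_usable(current, PAIRS))
    print("whole pair lost usable:", raid10_usable(pair_lost, PAIRS))
```

Under this assumed pairing the array survives the current state with no redundancy left on either pair, which is why the rebuilds need to finish before anything else goes wrong.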

We will let drive p3 finish rebuilding. Once that is complete we will replace drive p1, and when that rebuild finishes we will also replace drive p3 as a precaution. Please can we ask that you avoid any disk-intensive tasks for the next 48 hours, so we can restore full redundancy and performance to the array in a timely fashion.

** At this point ALL VPS should be online. If yours is having issues, log a ticket **

Update: 10.11PM – Drive p3 is 90% rebuilt. Will replace the failed p1 disk shortly.
Update: 11.13PM – Drive p3 has been fully rebuilt. Drive p1 has been hot-swapped out and its replacement is rebuilding.

Update: 8.59PM 26/04/12 – Full redundancy and performance have now been restored to the RAID array on xn2. We don't expect any further issues, but as always we will monitor this server carefully for the time being.

Thanks