Categories
Planned Maintenance

[Resolved] VZ1 Drive failure

A drive has failed in the RAID-10 array on vz1 and needs to be replaced.

20:35: The drive has been hot-swapped and the array is rebuilding. Once this completes we will reboot the machine.
00:20: The server is now restarting.
00:30: Automatic forced FSCK of /vz is now running and 17% complete.
00:40: Now 31% complete
00:50: Now 74% complete
01:00: FSCK has finished and passed. Now rebooting back into the OpenVZ kernel, and VMs will boot up.
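
For anyone wanting to gauge a check like this themselves, the progress readings above can be extrapolated into a rough finish time. This is only a sketch: it assumes the rate between the last two readings holds, and the calendar date is a placeholder because the post does not give one.

```python
from datetime import datetime, timedelta

def estimate_finish(readings):
    """Extrapolate a finish time from the last two (time, percent) readings."""
    (t0, p0), (t1, p1) = readings[-2], readings[-1]
    rate = (p1 - p0) / (t1 - t0).total_seconds()  # percent per second
    if rate <= 0:
        return None  # no forward progress, so no estimate is possible
    return t1 + timedelta(seconds=(100 - p1) / rate)

# Placeholder date (assumption); only the times of day come from the updates.
day = datetime(2011, 1, 1)
readings = [
    (day.replace(hour=0, minute=30), 17),
    (day.replace(hour=0, minute=40), 31),
    (day.replace(hour=0, minute=50), 74),
]

eta = estimate_finish(readings)
print(eta.strftime("%H:%M"))  # 00:56
```

FSCK progress is not linear (different passes run at different speeds), so an estimate like this is a guide only.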

Categories
Outages

[Resolved] xn11 disk failure

A drive failed in XN11 and has been replaced. The server is booting up now.

Update: The server has failed to come back up. Sorry about this, folks. We are waiting for our KVM to be moved to it now so we can see what the hold-up is.

Update 2: There are some filesystem issues which we are repairing. Service will be restored ASAP.

Update 3: The machine is now online and all VPS are running. We are going through each VPS to complete an FSCK of their individual filesystems to ensure they are clean. If your VPS goes down, please do not try to restart it. It will be restarted automatically once the FSCK completes. Thanks.

Categories
Outages

[Resolved] Network Issues

We are currently seeing some intermittent packet loss affecting all of our UK infrastructure. As soon as we have more information from the datacentre we will post it here.

Apologies for the inconvenience.

Update: Although the outage was only a few minutes, we are seeing some sporadic DNS issues which are being worked on.

Update: Total downtime was less than 5 minutes. This was caused by some issues with one of our datacentre’s upstream providers.

Categories
Outages

[Complete] XN7 Issues Continued

We are sorry to inform you that the issues on XN7 continue despite a set of four new disks.

At this point we have decided it’s in our best interest to scrap the machine for the time being, and only after extensive testing will it be put back into production. We are going to start migrating all VPS to a new Supermicro/Xeon Lynnfield host machine. Unfortunately this will require IP changes, but it’s the best option available to us at this point.

You will be contacted by email once your VPS has been moved, with the new IP address(es).

All but two VPS on XN7 have now been moved and issued new IP addresses. We are working on those and customers will be notified ASAP.

Categories
Outages

[Complete] XN7 Issues

XN7 is currently having issues affecting some VPS customers in the 95.154.207.xx range. It’s very rare that this happens, but both drives in one half of the RAID-10 array are showing signs of failure and it’s causing intermittent filesystem issues.

We are working to resolve this as quickly as possible, while maintaining integrity of all customer data.

Update: Our datacentre is being very slow doing anything for us at the moment. Apologies for the delay.
Update 2: We are still waiting on our datacentre…
Update 3: It looks like the motherboard may have failed in the machine. We do have a spare onsite but are confirming this now.
Update 4: The system is back online and VPS are booting up. The RAID array is rebuilding and we will continue to monitor this. Hopefully the rebuild completes and we can then swap out the failing drives immediately.

Update 5: The rebuild has failed. The server has been restored for now with the rebuild stopped. We are shortly going to back up all data, put in new disks and reload this server.

Update 6: Senior staff will begin maintenance on this server at the datacentre, starting at approximately 10:00PM (GMT+1).

Update 7: 1:53AM. 12 of the 18 VPS on this node have now been backed up. This gives us a rough ETA of 4AM, in around 2 hours’ time, for the backups to complete.
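
The ETA above is a simple linear estimate, and it can be reproduced roughly as follows. This is a sketch under assumptions: backups are taken to have started at about 10:00PM (the maintenance start time from Update 6; the actual backup start is not stated), every VPS is assumed to take about the same time, and the calendar dates are placeholders.

```python
from datetime import datetime

def backup_eta(started, now, done, total):
    """Linear ETA: assume each remaining VPS takes the average time so far."""
    per_vps = (now - started) / done  # average duration per completed VPS
    return now + per_vps * (total - done)

# Placeholder dates (assumption); times of day are taken from the updates.
started = datetime(2011, 1, 1, 22, 0)  # assumed backup start, about 10:00PM
now = datetime(2011, 1, 2, 1, 53)      # 1:53AM, 12 of 18 VPS backed up

eta = backup_eta(started, now, done=12, total=18)
print(eta.strftime("%H:%M"))  # 03:49
```

Under those assumptions the remaining six VPS need just under two hours, which lines up with the rough 4AM figure.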

Update 8: 3:21AM. The last VPS is backing up now, then we will reload the machine.

Update 9: We are restoring data.

Update 10: We are about to boot all VMs.

XN7 has now been restored and all VPS are booting up. We will continue to monitor closely.

Categories
Planned Maintenance

[Complete] Planned Datacentre Maintenance

Our datacentre, RapidSwitch/iomart, has taken the decision to relocate all full-rack customers into a new, more resilient area of the datacentre. As mentioned in the emails sent out earlier in the week, this means we will be relocating all of our servers into new racks.

This maintenance will start at midnight (00:00 British Summer Time, GMT+1) on Thursday 21/04/11, and we aim to have it completed before the start of the business day.

Updates will be provided here as often as possible.

10:10 – All servers are online and we are working through any remaining issues now.
09:50 – Just arrived back and it looks like XN8 has lost network connectivity; investigating now.
08:06 – This has now been completed.

00:00 – Work has now begun.

20:18 – The new racks are now fully cabled up and ready to go. We will now be taking a well-earned break and returning to start the move at 00:00 BST (3hrs 42 minutes from now).

17:37 – The preparation and cabling of the new racks is almost complete. We are finishing up soon and will then return at midnight (00:00) to begin migration of hardware.

Categories
Outages

[Resolved] xn14 outage

xn14 is currently down and is being checked.

Update: A disk had failed in the RAID array and the controller failed to migrate I/O away from that disk, causing it to hang. We have now replaced the disk and it’s in the process of rebuilding. VPS are booting up.

Please note that approx 5 VPS are doing a mandatory FSCK of their filesystem, as it has been a set number of days since their last reboot. Until these complete there is going to be high I/O wait on this node. Please be patient and do not reboot your VPS, as this will only cause more load.

Categories
News

[Resolved] LINX Network Issues

We have just been notified of an issue at LINX (London Internet Exchange): they have just dropped 600GB/s of traffic, so UK clients may notice some interruptions to service. However, this is completely outside of our control.

Update: Traffic flow appears to have returned to normal again now.

Categories
Outages

[Resolved] CP1 MySQL Issue

We are currently facing an issue with CP1: the MySQL service has put the machine under heavy load, and we are looking into it now.

Update: This has now been resolved. Apologies for the short MySQL downtime.

Categories
Outages

[Resolved] Network Issues

We currently have malicious network traffic incoming affecting connectivity to the below servers:

xn4, xn5, xn6, xn8

It seems only 109.169.51.0/24 is targeted; systems on other subnets are fine.
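
If you are unsure whether your VPS sits in the affected subnet, a quick check along these lines will tell you. This is a minimal sketch using Python's standard `ipaddress` module; the host addresses in the example are made up for illustration and are not real customer IPs.

```python
import ipaddress

# The subnet named in this post.
TARGETED = ipaddress.ip_network("109.169.51.0/24")

def is_targeted(address):
    """Return True if the given IPv4 address falls inside the affected subnet."""
    return ipaddress.ip_address(address) in TARGETED

# Hypothetical example addresses (assumptions, for illustration only).
print(is_targeted("109.169.51.20"))  # True
print(is_targeted("95.154.207.10"))  # False
```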

Update: After some process of elimination, this looks like a possible malfunction with the NIC on XN4, which is causing a traffic storm on the subnet. XN5, XN6 and XN8 are restored.

Update 2: We are going to move to the second NIC on XN4 with a new cable and switchport.

Update 3: The traffic has returned to XN5/XN6/XN8 and we are working to mitigate.

Update 4: XN5/XN6/XN8 restored. XN4 is being worked on.

Update 5: All systems are now up and we will be following this up with our datacentre in the morning.