[Resolved] cp1 load issues

cp1 is currently under high load, which means some customers might see their services down.

We are working to restore service ASAP.

Additional Info: The current CP1 will be retired in the next 2 weeks, as we have a replacement machine on order, based on the latest Xeon Woodcrest platform, to put a stop to these load issues. We really appreciate your patience.

Update: We are still waiting on the DC for our KVM to be moved. Apologies for the delay.

Update 2: The machine is failing to boot. We are waiting for further information from onsite staff; on-call staff have been notified and are ready to go to site if required.

Update 3: Things are now back under control and the load will return to normal shortly.

[Resolved] xn7 performance issues

xn7 is currently having some I/O performance issues.

In order to resolve this we are installing the latest kernel and Xen stack. We will be rebooting it shortly.

Update: Apologies for the delay; / needed an fsck, so we let it run. VPS are starting up one by one now.

[Resolved] xn9 Issues

xn9 appears to be suffering some filesystem issues. Currently getting some remote eyes.

Update: For some reason disk 3 decided it would belong to a new array, which caused the filesystem to go read-only. This has been corrected and the RAID is rebuilding. We will continue to monitor.
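
For anyone curious what monitoring a rebuild like this can look like, here is a minimal sketch. It assumes Linux software RAID (md), which this post does not confirm for xn9, and the function name `degraded_arrays` is purely illustrative. It scans text in the /proc/mdstat format and reports arrays whose member map shows a missing disk (an underscore, as in `[U_]`).

```python
import re
from typing import List


def degraded_arrays(mdstat_text: str) -> List[str]:
    """Return names of md arrays whose member map contains '_' (a missing disk).

    Assumes the usual /proc/mdstat layout: an 'mdN : ...' line followed by a
    line ending in a member map such as [UU] (healthy) or [U_] (degraded).
    """
    degraded = []
    current = None  # name of the array whose status line we are waiting for
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None  # status consumed; ignore later lines for this array
    return degraded


# Hypothetical sample input, not taken from any of our machines.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1]
      104320 blocks [2/1] [_U]

md1 : active raid1 sda2[0] sdb2[1]
      488279488 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live md host the input would come from reading /proc/mdstat; hardware RAID controllers need their vendor tools instead.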

[Resolved] xn7 issues

It appears XN7 has crashed, although it is responding to ping and as such was not picked up by our monitoring.

We will update you ASAP.

Update: We have rebooted the machine and VPS are now starting up one by one. We are still investigating the cause of the crash.
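
A host can keep answering ping while nothing on it is actually working, which is why this crash slipped past a ping-only check. Below is a minimal sketch of a service-level check that attempts a real TCP connection instead. It is an illustration only, not a description of our monitoring; the function name `tcp_service_up` and the host and port in the usage lines are hypothetical.

```python
import socket


def tcp_service_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Unlike an ICMP ping, this needs the target to complete a TCP handshake
    on the given port, so a refused or timed-out connection shows as down.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical target: check SSH on a local machine.
    print(tcp_service_up("127.0.0.1", 22, timeout=1.0))
```

A check like this still only proves the port accepts connections; a fuller check would also make a real request to the service.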

[Resolved] Shared / Reseller Server Issues

There are currently issues with our Shared and Reseller server CP1, which are causing sites to fail to load.

This is being investigated and all updates will be posted here.

Update @ 15:46: This is looking like a GRUB issue; it is being corrected now.

Update @ 15:58: This is now resolved.

[Resolved] xn11 down

xn11 has crashed and is being checked.

Update: It looks OK on the console but is unreachable via the network. It is being rebooted.

Update 2: It turns out there was a bit of fat finger syndrome and the port was taken out of the VLAN. The server is now up and VPS are starting up as well.

Sorry folks!

[Resolved] Network Issues

We are currently seeing packet loss affecting all UK servers.

Update: This looks like a potential issue with LINX (London Internet Exchange), as many UK sites are slowing to a crawl. We will update you as we have further information.

Update 2: We have confirmed the issues are due to LINX.

Update 3: We are now routing past LINX, so most ISPs should find access to our services is back to normal speeds. Unfortunately, some ISPs will still go straight to LINX.

Update 4: We believe this issue is now resolved, as can be seen from the LINX graph below.

[Resolved] Planned maintenance on cp1

As per the email sent out today, we will be taking cp1 down for approximately 15 minutes at 9PM, in order to restore full redundancy to the RAID array after the events on Saturday.

Update: This is currently in progress.

Update 2: Actual downtime was 5 minutes. The machine is up and services are starting.

Update 3: The RAID rebuild is now starting. It’s going to be 10-15 minutes before the load stabilizes. Unfortunately, there is going to be substantially increased I/O wait until the rebuild completes.

Update 4: Load is now coming down and the rebuild is chugging along nicely. We are marking this resolved and will continue to monitor until full redundancy and performance have been restored to the array.

[Resolved] cp1 down

cp1 has crashed due to what appears to be load issues. We are currently waiting on some remote eyes.

Update: It looks like a possible primary hard disk failure. We are giving it 5 minutes on the console to see if it boots; if it fails, we will need to run an fsck over the RAID array and take it from there.

Update 2: I can confirm that /dev/sda (the primary hard disk) has failed. We are inspecting the data on the second disk. Stand by for updates.

Update 3: /dev/sdb is OK. We have repaired the filesystem and re-installed GRUB. The machine is starting up now. Please note that CP1 is now running with one fewer disk in the RAID set. Expect increased I/O wait and higher than normal loads. We will be replacing the disk shortly.

Update 4: Some IPs failed to come online properly. This has been fixed and everything is looking OK. It’s going to be 10 minutes or so before the machine stabilizes with normal levels of load.

Update 5: We have made a secondary backup of the machine onto our NAS as a precaution. We will be restoring full redundancy to the RAID array with a new disk in due course.

Because Uptime Matters