cp1 is currently experiencing high load, which means some customers may see their services go down.
We are working to restore service ASAP
Additional Info: The current CP1 will be retired in the next 2 weeks; we have a replacement machine on order, based on the latest Xeon Woodcrest platform, which should put a stop to these load issues. We really appreciate your patience.
Update: We are still waiting on the DC to move our KVM. Apologies for the delay.
Update 2: The machine is failing to boot. We are waiting for further information from onsite staff; on-call staff have been notified and are ready to go to site if required.
Update 3: Things are now back under control and the load will return to normal shortly.
xn7 is currently having some I/O performance issues.
In order to resolve this we are installing the latest kernel and Xen stack. We will be rebooting it shortly.
Update: Apologies for the delay; / needed an FSCK, so we let it run. VPS are now starting up one by one.
xn9 appears to be suffering some filesystem issues. Currently getting some remote eyes.
Update: For some reason, disk 3 decided it belonged to a new array, which caused the filesystem to go read-only. This has been corrected and the RAID is rebuilding. We will continue to monitor.
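For anyone following along, rebuild progress on a Linux software RAID node like xn9 can be read from /proc/mdstat. The snippet below is a sketch only: it parses a sample capture rather than the live file, and the device names and figures are illustrative, not taken from the actual incident.

```shell
# Illustrative only: on the live node you would read /proc/mdstat directly.
# A sample capture is used here so the commands are self-contained.
cat > /tmp/mdstat.sample <<'EOF'
md0 : active raid5 sdc1[3] sdb1[1] sda1[0]
      976562176 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
      [=>...................]  recovery =  7.5% (36630400/488281088) finish=89.1min speed=84432K/sec
EOF
# Pull out the rebuild percentage from the recovery line
grep -o 'recovery = *[0-9.]*%' /tmp/mdstat.sample
```

The `[UU_]` marker shows one member of the three-disk set is missing; once recovery reaches 100% it returns to `[UUU]`.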
It appears XN7 has crashed. It is still responding to ping, however, and as such was not picked up by our monitoring.
We will update you ASAP.
Update: We have rebooted the machine and VPS are now starting up one by one. We are still investigating the cause of the crash.
There are currently issues with our Shared and Reseller Server CP1 which are causing sites to fail to load.
This is being investigated and all updates will be posted here.
Update @ 15:58: This is now resolved
Update @ 15:46: This looks like a GRUB issue; it is being corrected now.
xn11 has crashed and is being checked
Update: The machine looks OK on the console but is unreachable via the network. It is being rebooted.
Update 2: It turns out there was a bit of fat-finger syndrome and the port had been taken out of its VLAN. The server is now up and VPS are starting as well.
We are currently seeing packet loss affecting all UK servers
Update: This looks like a potential issue with LINX (the London Internet Exchange), as many UK sites are slowing to a crawl. We will update you as we have further information.
Update 2: We have confirmed the issues are due to LINX.
Update 3: We are now routing around LINX, so most ISPs should find access to our services is back to normal speeds. Unfortunately, some ISPs will still route straight through LINX.
Update 4: We believe this issue is now resolved, as can be seen from the LINX graph below.
As per the email sent out today, we will be taking cp1 down for approximately 15 minutes at 9 PM in order to restore full redundancy to the RAID array after the events on Saturday.
Update: This is currently in progress
Update 2: Actual downtime was 5 minutes. The machine is up and services are starting.
Update 3: The RAID rebuild is now starting. It's going to be 10-15 minutes before the load stabilizes. Unfortunately, there will be substantially increased I/O wait until the rebuild completes.
Update 4: Load is now coming down and the rebuild is chugging along nicely. We are marking this resolved and will continue to monitor until full redundancy and performance have been restored to the array.
cp1 has crashed due to what appears to be load issues. We are currently waiting on some remote eyes.
Update: It looks like a possible primary hard disk failure. We are giving it 5 minutes on the console to see if it boots; if that fails, we will need to run an FSCK over the RAID array and take it from there.
Update 2: I can confirm that /dev/sda (the primary hard disk) has failed. We are inspecting the data on the second disk. Stand by for updates.
Update 3: /dev/sdb is OK. We have repaired the filesystem and re-installed GRUB. The machine is starting up now. Please note that CP1 is now running with one less disk in the RAID set, so expect increased I/O wait and higher than normal loads. We will be replacing the disk shortly.
Update 4: Some IPs failed to come online properly. This has been fixed and everything is looking OK. It's going to be 10 minutes or so before the machine stabilizes at normal levels of load.
Update 5: We have made a secondary backup of the machine onto our NAS as a precaution. We will be restoring full redundancy to the RAID array with a new disk in due course.
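For the curious, the repair described in Updates 2-3 roughly corresponds to the sketch below. This is not the exact procedure run on cp1: the device names (md0 for the array, sdb for the surviving disk) are illustrative, and a dry-run guard is included so the destructive commands print rather than execute.

```shell
# Hypothetical sketch of a degraded-array recovery from a rescue environment.
# DRY_RUN=1 makes each command print instead of run, since fsck and
# grub-install are destructive on real devices.
DRY_RUN=1
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run fsck -y /dev/md0        # repair the filesystem on the degraded RAID set
run mount /dev/md0 /mnt     # mount it so the bootloader files are reachable
run grub-install --boot-directory=/mnt/boot /dev/sdb   # reinstall GRUB on the surviving disk
```

Set DRY_RUN=0 only on the actual rescue system, after double-checking the device names against the output of `lsblk` or similar.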
Some of our xen nodes are currently the target of a denial of service attack. While no servers are down, you may notice increased latency and a reduction in network speed.
Update: This is now resolved.