[Resolved] Network Issues

We are currently seeing packet loss affecting all UK servers.

Update: This looks like a potential issue with LINX (London Internet Exchange), as many UK sites are slowing to a crawl. We will update you as we have further information.

Update 2: We have confirmed the issues are due to LINX.

Update 3: We are now routing past LINX, so most ISPs should find that access to our services is back to normal speeds. Unfortunately, some ISPs will still route straight to LINX.

Update 4: We believe this issue is now resolved, as can be seen from the LINX graph below.

[Resolved] Planned maintenance on cp1

As per the email sent out today, we will be taking cp1 down for approximately 15 minutes at 9PM in order to restore full redundancy to the RAID array after the events on Saturday.

Update: This is currently in progress.

Update 2: Actual downtime was 5 minutes. The machine is up and services are starting.

Update 3: The RAID rebuild is now starting. It’s going to be 10-15 minutes before the load stabilizes. Unfortunately, there will be substantially increased I/O wait until the rebuild completes.

Update 4: Load is now coming down and the rebuild is chugging along nicely. We are marking this resolved and will continue to monitor until full redundancy and performance have been restored to the array.

[Resolved] cp1 down

cp1 has crashed due to what appears to be load issues. We are currently waiting on some remote eyes.

Update: It looks like a possible primary hard disk failure. We are giving it 5 minutes on the console to see if it boots. If it fails, we will need to run an fsck over the RAID array and take it from there.

Update 2: I can confirm that /dev/sda (the primary hard disk) has failed. We are inspecting the data on the second disk. Stand by for updates.

Update 3: /dev/sdb is OK. We have repaired the filesystem and re-installed GRUB. The machine is starting up now. Please note that cp1 is now running with one fewer disk in the RAID set. Expect increased I/O wait and higher than normal loads. We will be replacing the disk shortly.

Update 4: Some IPs failed to come online properly. This has been fixed and everything is looking OK. It’s going to be 10 minutes or so before the machine stabilizes with normal levels of load.

Update 5: We have made a secondary backup of the machine onto our NAS as a precaution. We will be restoring full redundancy to the RAID array with a new disk in due course.

[Completed] Maintenance Notification 11:00AM GMT Monday 1st March

UPDATE: This maintenance is now underway. All machines have been cleanly powered down. We are currently double-checking that all systems are powered down and will then swap the PDU.

UPDATE2: Due to a delay with onsite staff, this is taking longer than expected. Apologies for the inconvenience.

UPDATE3: PDU has been replaced and we have connectivity on Edge #1. Servers are now being power cycled.

UPDATE5: All systems are now online and have passed sanity checks. If your VPS is offline, please reboot it via SolusVM or open a ticket.

This is a reminder of the maintenance below, which will begin in approximately 1 hour 45 minutes.

Planned Maintenance Notification for 11:00AM GMT Monday 1st March.

This email is to inform you of some planned maintenance which will be service affecting.

Maintenance:

The main power bar (PDU) in our rack, which is supplied by our data centre, has failed. The failure does not affect the power delivered to our servers. What has failed is the LCD display that shows current power usage, which is very important for ensuring that the rack isn’t overloaded.

Planned for:

11:00AM GMT on Monday 1st March 2010

Expected Downtime:

The expected downtime is approximately 10 – 15 minutes. We will power all machines down cleanly, then replace the PDU and power each machine back on.

Note: VPSs may take longer to come back online because of the way they start up.

We apologize for any inconvenience this may cause.