Categories
Outages

[Resolved] Node 2 – Slow Network Connectivity

We are currently investigating slow network connectivity on Node 2. Updates will be posted here.

Update: This has now been resolved and was caused by an incorrect port profile.

[Resolved] Node03 Drive Failure

A drive failed on Node03 and has now been replaced. The RAID array is rebuilding, which may cause some disk wait.

We will update once this is complete.

Update:

  • Complete
  • Rebuilding is currently at 45%
  • Rebuilding is currently at 30%
  • Rebuilding is currently at 15%
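
For readers who want to watch a rebuild like this themselves, here is a minimal sketch. It assumes Linux software RAID (md), where progress appears in /proc/mdstat; the post does not say which RAID implementation Node03 uses, and a hardware controller would need its vendor tool instead. The sample text and the function name `rebuild_progress` are illustrative only, not output captured from Node03.

```python
import re

# Sample text in the style of Linux /proc/mdstat. This is an assumption for
# illustration, not output captured from Node03.
SAMPLE_MDSTAT = """\
md0 : active raid1 sdb1[2] sda1[0]
      976630464 blocks super 1.2 [2/1] [U_]
      [=========>...........]  recovery = 45.0% (439483712/976630464) finish=72.3min speed=123456K/sec
"""

# md reports rebuilds as "recovery = NN.N%" and full resyncs as "resync = NN.N%".
PROGRESS_RE = re.compile(r"(recovery|resync)\s*=\s*([0-9]+(?:\.[0-9]+)?)%")


def rebuild_progress(mdstat_text):
    """Return the rebuild percentage as a float, or None if no rebuild is running."""
    match = PROGRESS_RE.search(mdstat_text)
    if match is None:
        return None
    return float(match.group(2))


if __name__ == "__main__":
    print(rebuild_progress(SAMPLE_MDSTAT))
```

On a real md host the same function could be fed the contents of /proc/mdstat and polled every few minutes.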

[Resolved] Support system down – license issue

We are currently facing an issue with the license for our support system. We are in contact with the vendor’s support team, however things are taking longer than we would like to get resolved.

We can only apologise for this; however, as we hope you can appreciate, this is completely outside our control.

All updates will be posted here.

This has now been resolved.

Update – You can now submit tickets via our billing system. Simply log in to your account and click Open Ticket in the menu bar.

Update – We are currently setting up a temporary support system as we’re not getting a response regarding the license issue.

[Resolved] xn9 outage

xn9 appears to have become unresponsive. We are currently waiting on remote hands to check the server.

Update 1: It appears the underlying partition of the LVM volume group containing VPS filesystems has gone away, despite the RAID controller, drives and array being healthy. We are currently investigating recovery options.

Update 2: We have been able to manually reconstruct the underlying partition and LVM metadata. However, after several attempts we are unable to assemble it in such a way that the VM filesystems are accessible. The root cause of the partition’s disappearance is unclear; we suspect the size of the volume may have changed due to a bug or defect in the RAID controller. With further examination we may be able to recover complete or partial data, but we cannot make any guarantees, and at this time no data is available. If there is any data of particular importance and you can provide the complete filename, we will do our best to recover it through alternative methods.

We will now begin a recovery operation to re-create the VPSs hosted on XN9 on alternative host machines. Managed servers will include restoration of backups where available.

We sincerely apologise for this inconvenience and will continue to work with our customers to restore service as quickly as possible.

Update 3: All VPS have been migrated to alternate hardware.

[Resolved] XN10 Packet loss

Starting at approximately 22:20 GMT+1, xn10 experienced high packet loss. Due to limited access to the server, we gracefully shut down all VMs and brought them back up to make some configuration changes. Apologies for the reboot.

We have now identified that one VM is the target of a low-bandwidth, high-concurrency DoS attack. The VM has been null routed and we continue to monitor.
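
As an illustration of how a low-bandwidth, high-concurrency attack can be spotted, the sketch below counts concurrent connections per destination address rather than measuring traffic volume. It is a simplified example under our own assumptions, not the tooling used during this incident: the function name `busiest_destinations`, the threshold and the sample addresses (taken from the reserved documentation ranges) are all hypothetical.

```python
from collections import Counter


def busiest_destinations(connections, threshold):
    """Return (destination, count) pairs with at least `threshold` connections.

    `connections` is an iterable of (source_ip, destination_ip) pairs, for
    example taken from a connection-tracking table. Results are ordered
    busiest first.
    """
    counts = Counter(dst for _src, dst in connections)
    return [(dst, n) for dst, n in counts.most_common() if n >= threshold]


# Hypothetical sample data using documentation-only address ranges.
SAMPLE_CONNECTIONS = [
    ("198.51.100.1", "192.0.2.10"),
    ("198.51.100.2", "192.0.2.10"),
    ("198.51.100.3", "192.0.2.10"),
    ("198.51.100.4", "192.0.2.10"),
    ("198.51.100.5", "192.0.2.10"),
    ("198.51.100.6", "192.0.2.20"),
]

if __name__ == "__main__":
    for dst, n in busiest_destinations(SAMPLE_CONNECTIONS, threshold=3):
        print(dst, n)
```

A destination that holds far more open connections than its neighbours, while moving little data, is a candidate for null routing.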

[Resolved] Web1 Outage

We are aware of an issue with Web1 and are investigating.

[Resolved] XN1 Outage

We are aware of an issue with our XN1 node. We have identified the cause and are working to resolve it as quickly as possible.

Updates will be provided here.

Update @ 16:52: DC staff are running slow. This is frustrating, but unfortunately there is nothing we can do to speed things up.

Update: This is now resolved.

[Resolved] xn3 outage

xn3.pcsmarthosting.co.uk currently has issues with the RAID controller which we are working to resolve. Further updates will follow.

Update: We’ve traced this to a bug in the upstream Linux 4.9 kernel. The RAID controller and array appear to be healthy. We’ll bring the server back up on a secondary kernel to restore service. Further investigations will be carried out in our test environment.

[Resolved] XN10 Outage

Our monitoring has alerted us to an issue with XN10, which we are currently investigating. Further updates will be provided here when we have them.

Update: This has now been resolved.

[Resolved] Web1 Read Only

Our web1 server has gone into a read-only state and our technicians are currently investigating.

Updates will be provided here when they’re available.

Update: Onsite staff are currently hooking up a crash cart to this machine.

Update: It appears a hard disk in the RAID10 array has failed and caused the controller to hang. This is rare but it can happen. There are some filesystem inconsistencies which we are working to repair, after which the server will be brought back online.

Update: The damage has been repaired and didn’t look too bad. We are doing a second pass now to be absolutely sure, and will then boot the server.

Update: Server is now back online and the RAID array is rebuilding.
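
For anyone who wants to check for the read-only condition described at the start of this post, here is a minimal sketch. It assumes a Linux host where mounted filesystems are listed in /proc/mounts; the function name `read_only_mounts` and the sample text are illustrative and were not taken from web1.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains the 'ro' flag.

    `mounts_text` is text in the /proc/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Match the exact 'ro' flag, not options such as 'errors=remount-ro'.
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts format.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 ro,relatime,errors=remount-ro 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE_MOUNTS))
```

A filesystem that is normally mounted read-write but shows up in this list has usually been remounted read-only by the kernel after an I/O or filesystem error.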