Categories
Outages

[Resolved] Xn5 Outage

XN5 appears to be down and is being checked.

Update: Looks like a VPS is under attack and is causing high packet loss. Working to get this under control ASAP.

Update 2: Service has been restored and we will continue to monitor. The cause was actually a FreeBSD VM which had run out of disk space and was maxing out CPU and disk, which is very unusual!

Downtime: Approx. 4 minutes
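
For anyone wanting to catch a full disk inside their own VM before it gets this far, here is a minimal sketch of a usage check. The function names and the 90% threshold are our own assumptions for illustration, not part of any tooling we run on the nodes.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path="/", threshold=90.0):
    """True when usage at `path` is at or above `threshold` percent.

    The 90% default is an arbitrary example value; pick one that suits the VM.
    """
    return disk_usage_percent(path) >= threshold


if __name__ == "__main__":
    print(f"/ is {disk_usage_percent('/'):.1f}% full")
    if is_nearly_full("/"):
        print("Warning: root filesystem is nearly full")
```

Run from cron inside the VM, something like this could send a warning well before the disk fills completely.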

Categories
Planned Maintenance

[Completed] Disk Replacement on VZ2

Below is an email sent to all users on vz2 at approx. 2PM on 12/11/11:

You are receiving this email because you have one or more VPS on the OpenVZ Node, vz2.pcsmarthosting.co.uk.

A drive has failed in the RAID10 array on this machine and requires replacement. Due to previous issues we have had on this node following hot-swap disk replacements, we are going to shut down the machine cleanly, replace the drive and power it back up again.

We are sorry for any inconvenience this may cause, especially as we only replaced two other drives a couple of months ago. This machine is running a 4x disk RAID10 array and it’s not uncommon for drives to fail around the same time as they are all from the same batch.

Outage Details:

Task: Replace failed drive in the RAID array.

Scheduled Time: 11.00PM UK Time (GMT) on Sunday 13/11/2011

Expected Downtime: Approx 20 Minutes, though as VPS are started individually it may vary.

If you have any questions please don’t hesitate to open a ticket.

Update: The server halted much faster than expected and is now down. We are waiting for remote hands to swap the drive out for us and power on.

Update 2: The drive has been replaced but is not being picked up. Looks like a backplane/RAID controller issue. We will arrange another maintenance in due course to resolve this.
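
To illustrate why drives failing close together matters on the 4x disk RAID10 array mentioned in the email above, here is a small sketch that counts which two-disk failures such an array can survive. The disk numbering and the pairing of disks into mirrors are assumptions made for illustration; the real layout depends on the controller.

```python
from itertools import combinations

# Assumed layout: disks 0 and 1 form one mirror, disks 2 and 3 the other.
MIRRORS = [(0, 1), (2, 3)]


def array_survives(failed):
    """RAID10 stays online while every mirror keeps at least one working disk."""
    failed = set(failed)
    return all(not set(mirror) <= failed for mirror in MIRRORS)


def two_disk_failure_outcomes():
    """Return (total two-disk failure pairs, pairs that take the array offline)."""
    disks = [disk for mirror in MIRRORS for disk in mirror]
    pairs = list(combinations(disks, 2))
    fatal = [pair for pair in pairs if not array_survives(pair)]
    return len(pairs), len(fatal)


if __name__ == "__main__":
    total, fatal = two_disk_failure_outcomes()
    print(f"{fatal} of {total} two-disk failures take the array offline")
```

Under these assumptions, 2 of the 6 possible two-disk failures (both disks in the same mirror) take the array offline, which is why a failed drive is replaced promptly rather than left in place.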

Categories
Planned Maintenance

[Completed] Planned maintenance on xn15

Below is a copy of an email sent to customers on 20/11/2011:

This email is to inform you about some upcoming planned maintenance, affecting a small number of customers on the machine xn15.pcsmarthosting.co.uk. To check if you have a VPS on this machine, log into SolusVM at https://solusvm.dns4vps.com:5656 and check the Location, or just open a ticket with your IP address and we will check for you.

If your VPS is not hosted on XN15, please discard this email as it does not affect you.

If your VPS is hosted on XN15, our datacenter will be moving this node onto the new network setup with increased resilience, as we have done with our other servers a few months back.

This will be taking place between 1AM and 3AM (UK Time, GMT+1) on 10/11/2011 and the server will be down for approximately 30 minutes while it is moved to another rack and powered up.

If you have any questions please let us know.

Regards,

The PCSmart Team

Categories
Outages

[Resolved] Xn2 Outage

xn2 appears to be down and is being checked.

The server is now back up and VPS are starting.

Categories
Outages

[Resolved] VZ2 Issue

One of the new Western Digital drives fitted only last week has failed in VZ2 and caused the host filesystem to go read-only. We are working on this and service will be restored ASAP.

Update: We have been waiting over 15 minutes so far with no response from our datacenter (iomart). Further updates will follow…
Update 2: We now have a USB DVD drive connected at last and are working to restore the machine.
Update 3: Since /lib is missing and a few other parts of the OS are resolving to the wrong files, we are going to take the fastest route, which is to reload the OS on the host. Customer data and configs are on a separate partition so your data should be safe. The reload is currently at 50%, so we expect to be up in around 20 minutes.
Update 4: OS reload complete. We are now doing a quick OS update, installing OpenVZ, rebooting and getting VPS up. SolusVM will be restored shortly after.
Update 5: Server is now booting into OpenVZ.
Update 6: VPS are now booting up one by one and we are restoring SolusVM access to this node.
Update 7: SolusVM now restored, VPS are booting up.

Categories
Planned Maintenance

[Resolved] VZ2 and XN2 Outages

From 8PM on 02/09/11 I am going to be onsite doing some work on a few machines.

VZ2 will be down while we troubleshoot a RAID controller issue and replace two faulty disks.
XN2 will be rebooted as the RAID controller has not correctly detected a new disk which was hot-swapped in.

If your VPS is currently down we apologise for the inconvenience, but it’s important we return your host machine to its fully working state with an Optimal RAID10 disk array.

Thanks

Categories
Planned Maintenance

[Completed] UK Network Maintenance/Upgrades 27/08/2011

As per an email sent out to all customers last week, we are carrying out some network upgrades on Saturday 27th August at 10.00PM (GMT+1). This is to reconfigure our network with a Cisco HSRP setup for improved redundancy in case of any router issues.

There will be a short outage of between 10 and 20 minutes affecting all of our UK servers, except XN15 and XN16 which are in different racks.

Updates will be posted here throughout.

21.50: We are now preparing for the upgrade
22.03: Upgrade has started. Network is now down
22.11: Connectivity is restored. All finished!

Categories
Outages

[Resolved] XN12 Issue

Two drives have failed in XN12 at the same time, causing the RAID array to go offline and the system to crash.

Update 1: We have forced the array back online and will attempt repair and recovery of data.

Update 2: The machine will not boot due to filesystem issues on the host filesystem. We are currently working to restore service ASAP.

Update 3: We have almost finished repairing the damage to / on the host machine and will be rebooting shortly.

Update 4: It seems there are more damaged files, so we are going back into rescue mode to repair and replace them.

Update 5: Attempts to repair the host filesystem have failed; the system boots but we are unable to log in. We are going to back up the VM configuration files and reinstall the machine. From there we will assess what data is available.

Update 6: The host OS is now reinstalled. We are going to check the RAID status and begin checking the VPS filesystems.

Update 7: VPS are being restored one by one.

Categories
Outages

[Resolved] XN11 Issue

A disk has failed in XN11 and caused the host root filesystem (/) to go read-only. The server is rebooting to restore service.
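
For reference, a filesystem that has gone read-only like this can be spotted on a Linux machine from the mount options listed in /proc/mounts. Below is a minimal sketch of such a check; the function names are our own and are only meant as an illustration, not the monitoring we actually run.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose options include 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


def root_is_read_only(mounts_path="/proc/mounts"):
    """True if / is currently mounted read-only (Linux only)."""
    with open(mounts_path) as handle:
        return "/" in read_only_mounts(handle.read())


if __name__ == "__main__":
    print("root read-only:", root_is_read_only())
```

The options field is split on commas so that only an exact `ro` flag matches, rather than any option that merely contains those letters.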

Categories
Outages

[Resolved] XN2 Issues

XN2 crashed at approx. 8.42PM and is being checked.

Update 8.45PM: Currently waiting on getting a KVM attached to the machine.
Update 8.56PM: Onsite staff are currently working on the machine.
Update 9.03PM: The server is now up and VPS are booting. The RAID array on this node is currently verifying, so please be patient and allow your VPS to boot up rather than just rebooting. I/O performance will be a little slow until the rebuild completes.

Thanks