2015.10.08 Service outage

Update: 2015.10.08 11:30 PST

Resolution: All systems operational.

Scope: All services affected as the cabinet experienced complete power loss.

Summary of issue: The outage primarily affected one cabinet; however, numerous businesses, service providers and network peers were affected by the power failure. Due to the distributed nature of the hosted platform, only a portion of clients experienced disruption, and the failover systems in place minimized those disruptions. There were two significant power events; the second occurred in the early afternoon. Because of hardware failures caused by the power disruption, and other factors such as the load surge on restoration, client equipment and software requiring restart, and queue clearing time, restoration of services varied but was completed at approximately 15:00 PST. Delayed mail should be delivered, and all backlog should be processed.

More details, including a timeline, will follow. At this time we have an outline of the events that occurred today, including the initial actions taken to mitigate the risk of recurrence. Some details are still missing, as some equipment is not directly under our control and we are still reviewing data.

  • This morning, around 09:00 PST, a malfunction or failure occurred in a UPS in the Vancouver data center.
  • Engineering staff were immediately aware of the issue and started investigating and working to restore services to affected clients.
  • As the severity of the failure became immediately obvious, the decision was made to send staff to the site for hands-on operations while other staff remained to work on the issues remotely and handle incoming calls.
  • A UPS shut down or failed - the details, cause and scope of the failure will follow as we learn them.
  • This UPS is part of the datacenter infrastructure and delivers power to equipment located in cabinets in the data center.
  • The affected equipment has a secondary power feed, but when the primary power failed, the backup power could not handle the load and failed as well.
  • The UPS was bypassed to return power to the equipment, and both primary and backup power were restored.
  • Maintenance was done to restore operations and test for failed equipment.
  • Damaged equipment was isolated and replaced using on-site spares.
  • Additional spares have been brought to the facility.
  • As part of the phone provisioning, phones would have transferred back to the primary nodes. (N.B. Transfers between nodes may appear as brief disruptions.)
  • During this time, difficulty in communicating with some upstream carriers was observed - likely a result of the power issue and its effects on other equipment outside our cabinet. This may have resulted in some difficulty completing external calls.
  • While we were working on repairing equipment, the power loss repeated - first the primary and then the secondary.
  • At this time we determined that something was causing the power to fail at what appears to be below rated load (we have load monitoring in place).
  • In response, we have moved equipment from the affected cabinet and redistributed load to reduce the risk of recurrence.
  • UPS maintenance contractors have attended the site and are involved in repairs to the primary UPS.
  • The backup service has significantly lower load than before and should not be adversely impacted by a further failure on the primary service.

Actions already taken:

  • equipment repaired
  • power load reduced
  • spares replenished

We will continue to post updates as information becomes available.

Please forward any comments or concerns; we will address all issues.

Thank you,

BitBlock Systems, Inc.


Original posting.

Start of issue: Approximately 2015.10.08 09:00 PST

Resolution: 2015.10.08

Description: There were multiple power issues at Vancouver. More details to come later.

Phone servers have been restored.

Some mail servers are restored. 

 
