Linode Toronto Data Center Outage

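Well, after roughly nineteen hours of downtime, Linode's Toronto data center is back. This is the longest outage I have run into since I started using Linode. I happened to be updating my site and rebuilding pages when the job failed midway with a timeout error. I logged in to the Linode control panel, which looked fine, but connecting through Linode's own Glish console did not work. After a few tests, I found that the server itself was unreachable.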

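I first checked Linode's status page, which showed no incident, so I opened a support ticket. About 7 minutes later, the reply confirmed that there was a fault and that it was being investigated.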

After that, the Linode status page began recording the incident.

I was probably among the first users to notice the outage, since I happened to be using the server at that very moment.

A provider that shares information openly when something breaks has always earned more of my trust.

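About three or four hours later, the incident page was updated: a fiber cable had been cut. That left me a bit speechless. Another four or five hours after that, an update explained that the fiber cut was caused by a fire in the Toronto metropolitan area and that multiple routes were affected. At that point there was nothing left to be angry about; disasters and accidents happen.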

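Linode's staff kept working on it the whole time, or so the reports said; I have no way of knowing the details. Six or seven hours later, IPv4 connectivity was restored, and within a few more hours everything, including IPv6, was back to normal.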

Everything is fine now. The VPS hosting this site sits in that data center, so if you had trouble reaching the site two days ago, this was the reason.

Looking back at the updates on Linode's incident page:

Connectivity Issues - Toronto
Resolved - Connectivity in our Toronto data center has remained stable, and we're confident this matter has been resolved. If you are still experiencing additional issues, please contact our Customer Support Team for assistance.
Jun 1, 00:07 UTC
Monitoring - We have been able to correct the issues affecting our Toronto data center, including IPv6 connectivity, and will continue monitoring to ensure our services remain stable.
May 31, 22:26 UTC
Update - We've been able to restore IPv4 connectivity to our Toronto data center. We're still working to fully restore IPv6 connectivity, and will continue to provide updates here until this issue is resolved.
May 31, 20:07 UTC
Update - We're continuing to work with our Toronto data center to have connectivity fully restored. We've identified that the fiber cut was the result of a fire occurring in the Toronto Metropolitan area and our team is working closely with data center staff to restore connectivity as soon as possible. While we have multiple fiber paths from our data centers, it appears that the fire is impacting multiple routes. We will continue to provide additional updates as we progress in correcting the issue.
May 31, 13:53 UTC
Update - We have identified this issue as being related to a fiber cut near our Toronto data center, and we do not currently have a time frame as to when this issue will be corrected. Connectivity with our peering partners continues to remain stable and is not impacted by this issue. We will continue to provide updates on our progress as we move forward.
May 31, 09:06 UTC
Update - Some connectivity has been restored and we are still continuing to work as quickly as possible to restore full connectivity to the Toronto data center.
May 31, 07:21 UTC
Identified - We have identified the connectivity issues affecting our Toronto data center. Our team is working as quickly as possible to have connectivity restored. We will provide additional updates as the situation develops.
May 31, 05:47 UTC
Investigating - We are aware of connectivity issues affecting our Toronto data center and are currently investigating. We will continue to provide additional updates as this incident develops.
May 31, 05:21 UTC
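
To line the log up against the rough hour counts in the narrative, here is a small sketch that computes how long after the first "Investigating" entry each update was posted. The timestamps are copied from the log above; the labels are my own shorthand, and since the status page does not show a year, the `YEAR` value is an arbitrary placeholder that does not affect the differences.

```python
from datetime import datetime, timezone

# The log shows no year. Only differences between timestamps matter here,
# so this value is an arbitrary placeholder, not the real incident year.
YEAR = 2001

# (label, timestamp) pairs from the status log, oldest first.
# Labels are shorthand summaries, not Linode's wording.
EVENTS = [
    ("Investigating", "May 31, 05:21"),
    ("Identified", "May 31, 05:47"),
    ("Some connectivity restored", "May 31, 07:21"),
    ("Fiber cut identified", "May 31, 09:06"),
    ("Fire identified as cause", "May 31, 13:53"),
    ("IPv4 restored", "May 31, 20:07"),
    ("Monitoring, IPv6 restored", "May 31, 22:26"),
    ("Resolved", "Jun 1, 00:07"),
]


def parse(stamp: str) -> datetime:
    """Parse a 'May 31, 05:21' style UTC timestamp using the placeholder year."""
    parsed = datetime.strptime(f"{YEAR} {stamp}", "%Y %b %d, %H:%M")
    return parsed.replace(tzinfo=timezone.utc)


def hours_since_start(events):
    """Return (label, hours elapsed since the first event) for each event."""
    start = parse(events[0][1])
    return [
        (label, (parse(stamp) - start).total_seconds() / 3600)
        for label, stamp in events
    ]


if __name__ == "__main__":
    for label, hours in hours_since_start(EVENTS):
        print(f"{hours:6.2f} h  {label}")
```

By this measure, IPv4 came back a little under fifteen hours after the first entry, and the incident was marked resolved just under nineteen hours in, which matches the "roughly nineteen hours" figure.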

This is the tweet I posted at the time.

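The relevant clause in Linode's terms of service is a 99.9% uptime guarantee: if it is not met, you can request a pro-rated refund. By that measure, the outage covered about 2.5% of the month, and the pro-rated refund would come to only a few dimes, so never mind.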

Uptime Guarantee

Linode.com provides a 99.9% uptime guarantee on all Linode hardware, and on network connectivity. In any given month, if your Linode is down for more than 0.1%, you may request a pro-rated credit for the down-time.
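
The 2.5% figure can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions: it treats the whole outage as roughly 19 hours falling in a 31-day month, reads "pro-rated" as price times the downtime fraction (the clause does not spell out the formula), and uses a made-up monthly price, since the post does not say which plan the VPS is on.

```python
HOURS_DOWN = 19       # approximate outage length from the post
DAYS_IN_MONTH = 31    # May has 31 days
SLA_UPTIME = 0.999    # the 99.9% guarantee quoted above

# Hypothetical plan price in dollars; the post does not state the real one.
EXAMPLE_MONTHLY_PRICE = 5.00


def downtime_fraction(hours_down: float, days_in_month: int) -> float:
    """Fraction of the month during which the server was unreachable."""
    return hours_down / (days_in_month * 24)


def prorated_credit(monthly_price: float, hours_down: float, days_in_month: int) -> float:
    """Assumed reading of 'pro-rated': price times the downtime fraction."""
    return monthly_price * downtime_fraction(hours_down, days_in_month)


if __name__ == "__main__":
    fraction = downtime_fraction(HOURS_DOWN, DAYS_IN_MONTH)
    allowed = 1 - SLA_UPTIME
    credit = prorated_credit(EXAMPLE_MONTHLY_PRICE, HOURS_DOWN, DAYS_IN_MONTH)
    print(f"downtime: {fraction:.2%} of the month (allowed: {allowed:.1%})")
    print(f"credit on a ${EXAMPLE_MONTHLY_PRICE:.2f} plan: ${credit:.2f}")
```

Nineteen hours out of 744 is about 2.55% of the month, far beyond the 0.1% allowance, yet on a small plan the credit still works out to pocket change.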