November 1st, 2007

Nov 1 network outage [update]

For those of you just tuning in, there was another network outage from 0900 to 1100 GMT. We’re still piecing together exactly what happened, but here is what we know so far:

  • there was a non-impacting maintenance scheduled by our datacenter in preparation for the core upgrade taking place in about a week. They were inserting a switch in parallel with the existing core switches.
  • the outage started at 0900 GMT, when the maintenance was completed.
  • after working with Cisco, the issue was resolved around 1100 GMT.

This is an unacceptable level of service, and people have expressed concerns in the chatroom about the recent outages. We agree completely and cannot offer more than our sincerest apologies at the moment. We are gathering more information and reviewing our options; updates will follow. In the meantime, if you have any questions or feel like venting, you can contact us via email or call me directly (314.266.3502). If I don’t answer, leave a message and one of us will get back to you shortly.

19 Comments

  1. Thanks for the update!

  2. Outages are not good but honest and quick communication about them is. Thanks for the information.

  3. Agreed. I am getting bored with the outages, but at least you guys are very quick to tell us what is going on.

  4. Thanks for the update

  5. While network outages are always bad, this level of communication is very appreciated. Far superior to Dreamhost’s “Stuff’s down, S—t happens” approach.

  6. Thanks for your quick status update. As others wrote, quick updates are very good, and network outages can always happen, although I was anything but happy not to be able to reach my server (and yours…) this morning.

    Thanks for your service :)

  7. At first I thought this post was about the DDoS outage. Just re-read the title – didn’t know there was another outage today. Thx for the heads up. Seems like Datotel needs to get their poops together. Sucks to have to rely on others for your own reputation.

  8. Your DC guys don’t seem to be terribly good at planning and communicating maintenance works, do they? It’s not the first time they have done something which shouldn’t have had any impact but turned out to be a major outage.

    Thanks for the update. I agree with everyone else – your honest updates and quick reaction to such events are very important to us.

  9. We were notified of the maintenance; it was just categorized as non-impacting (therefore we did not pass this along).

  10. While I dislike downtime as much as anyone, my monitoring showed less than 2 hours of downtime. This is only the second (non self-induced) downtime I’ve had since opening my account. With a quick notification (Twitter) and explanation (here), coupled with the rarity of the event, and especially since it’s not something related to Slicehost’s competence, I’m a very happy slicer.

  11. I appreciate your fast response and communication, but I think you guys really need to let the DC guys know how much this is affecting your business (and ours).

  12. Thanks for the information.

    Sounds like they pressed ahead and tried to fix the change (judging by the time taken to resolve). If this is indeed the case, Slicehost need to remind these people of the concept of “roll-back”. If a change doesn’t work, roll back.

  13. Do you have a date/time for the “core upgrade taking place in about a week”? Please give us advance notice so we can plan for this outage.

  14. Maybe it’s time to move to a new data center.

    I suppose this isn’t a realistic expectation, but you know, if you guys were in a city more wired than St Louis, you’d have much better options. In the valley, there are tons of data centers. Lots of competition is good for customers. IPv6 is available in most.

  15. Well, look at the bright side. This will buy you a few more days before you have to provide an update on the wait list (which was promised a few days ago).

  16. Datotel again? Off with their heads!

  17. Is there any chance we can get notified of these types of maintenance (or even just an RSS feed)? I know it’s not your fault, but it helps to be aware of things like this in the past. Also concerning for us are the maintenance windows: it is early evening for us, right when our target customers are trying to use our site.

  18. I agree, an RSS feed or a post to Twitter for all maintenance, “non-impacting” or not, would be a decent idea. At least then we can half expect downtime and have an immediate explanation for it if it occurs.

  19. I personally didn’t notice the outage, but my montastic bot did. Pretty lame to have more of these outages. However, we are all affected, you guys as much as any.
