Author Topic: Website & Forum Outage 10th Dec 2016  (Read 7830 times)

Offline FrankBuss

  • Supporter
  • ****
  • Posts: 1890
  • Country: de
    • Frank Buss
Re: Website & Forum Outage 10th Dec 2016
« Reply #25 on: December 15, 2016, 07:24:57 pm »
Down for over 12 hours.
It was the fault of Hostgator:
https://forums.hostgator.com/12-09-16-15-00-multiple-t345474.html
Cue the people who say "switch to a real host" in 3, 2, 1...
Maybe if you switched to a high-end hosting provider like AWS.... oh wait...  :-DD

Bingo.
Everyone thinks that their hosting provider's shit doesn't stink.
Well, that was a 2 hour outage at AWS because of a power outage. Hostgator was a 16 hour outage because of a configuration problem. Bigger datacenters like 1and1 have UPS, backup UPS and diesel generators for longer power outages.
So Long, and Thanks for All the Fish
 

Offline djos

  • Supporter
  • ****
  • Posts: 778
  • Country: au
Re: Website & Forum Outage 10th Dec 2016
« Reply #26 on: December 15, 2016, 07:34:26 pm »

Well, that was a 2 hour outage at AWS because of a power outage. Hostgator was a 16 hour outage because of a configuration problem. Bigger datacenters like 1and1 have UPS, backup UPS and diesel generators for longer power outages.

The outage was caused by poor configuration of the DC BMS and UPS integration.

The BMS didn't know how to handle a grid voltage sag (aka a brownout) and just ran the UPS batteries flat because it didn't know to call up the generators.

I managed a tier 3 spec Colo DC up till a few years ago and this sort of Charlie Foxtrot just boggles the mind.
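
To make that failure mode concrete, here is a minimal sketch of the decision a BMS would need to make. It is purely illustrative: the voltage figures, the time limit, and the names `decide` and `Action` are assumptions made for this example and are not taken from any real BMS or from the incident report. The point is only that a sag has to be treated as a reason to start the generators, not just a total loss of grid power.

```python
from enum import Enum


class Action(Enum):
    STAY_ON_GRID = "stay_on_grid"
    RIDE_ON_UPS = "ride_on_ups"
    START_GENERATORS = "start_generators"


# All three values below are assumptions chosen for illustration only.
NOMINAL_VOLTS = 230.0    # assumed nominal grid voltage
SAG_THRESHOLD = 0.90     # assumed: below 90% of nominal counts as a sag
MAX_SAG_SECONDS = 10.0   # assumed: ride on batteries this long before calling generators


def decide(grid_volts: float, seconds_below_threshold: float) -> Action:
    """Pick an action from the grid voltage and how long it has been low."""
    if grid_volts >= NOMINAL_VOLTS * SAG_THRESHOLD:
        return Action.STAY_ON_GRID
    # Voltage is low: a short dip is bridged by the UPS batteries.
    if seconds_below_threshold < MAX_SAG_SECONDS:
        return Action.RIDE_ON_UPS
    # A sustained sag is handled like an outage, so the batteries
    # are not simply run flat while the grid is "sort of" present.
    return Action.START_GENERATORS


if __name__ == "__main__":
    for volts, seconds in ((230.0, 0.0), (190.0, 2.0), (190.0, 30.0), (0.0, 30.0)):
        print(volts, seconds, decide(volts, seconds).value)
```

A configuration that only started the generators when the grid voltage reached zero would sit in the middle branch indefinitely during a brownout, which matches the behaviour described above.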
The impossible often has a kind of integrity which the merely improbable lacks.
 

Offline Ice-Tea

  • Frequent Contributor
  • **
  • Posts: 913
  • Country: be
    • Freelance Hardware Engineer
Re: Website & Forum Outage 10th Dec 2016
« Reply #27 on: December 15, 2016, 07:55:00 pm »
Just wondering, can't get to the "show new replies" page for a few days now... I get this:

Code:
502 - Bad Gateway

We are sorry for the inconvenience that this error may be causing you. We are aware of the issue and are working to resolve it, please be patient.

There is no need to report this error.

Thank you for your patience,
Dave & gnif

Is that part of this issue or something else?
An engineer never has a problem. He just needs more time.

FS: TTi TSX1820P, TDS754A (w SPC fail), Agilent Infinium 54815A, 54825A, R&S CMU 200 (multiple units, various options) UPL, SMIQ03,Tektronix CSA8000B, HP 8594E, 8595E, Marconi 6201B (8GHz), IFR 2390A (22GHz), 2383 (4GHz)
 

Online MK14

  • Super Contributor
  • ***
  • Posts: 1786
  • Country: gb
Re: Website & Forum Outage 10th Dec 2016
« Reply #28 on: December 15, 2016, 07:56:31 pm »
Just wondering, can't get to the "show new replies" page for a few days now... I get this:

Code:
502 - Bad Gateway

We are sorry for the inconvenience that this error may be causing you. We are aware of the issue and are working to resolve it, please be patient.

There is no need to report this error.

Thank you for your patience,
Dave & gnif

Is that part of this issue or something else?

I had the same problem and solved it (today) by clicking on the web browser's refresh button.

http://www.eevblog.com/forum/chat/show-new-reply-to-your-posts-causes-502-forum-error/msg1091765/?topicseen#msg1091765
« Last Edit: December 15, 2016, 07:58:03 pm by MK14 »
 

Online sleemanj

  • Super Contributor
  • ***
  • Posts: 2092
  • Country: nz
  • Professional tightwad.
    • The electronics hobby components I sell.
Re: Website & Forum Outage 10th Dec 2016
« Reply #29 on: December 15, 2016, 08:04:39 pm »

Maybe if you switched to a high-end hosting provider like AWS.... oh wait...  :-DD

Yes, but with AWS you have a lot of power at your direct command and you can structure your systems to assume the sort of risk you want, from "availability zone in this region goes down, site is dead until it's fixed" to "this thing will keep running unless every Amazon region around the world has died".

But of course, you have to be able to do that setup. If it's not your day job, it's better to leave somebody else to do it for you, if for no other reason than that there are a lot of assholes and botnets out there who will stop at nothing to try and kill your site/server/network.  Some are smarter than others.

Not all of them are as kind as one I caught today, which helpfully announced its UserAgent as "WebFuck V2.1 T0PHackTeam www.t0p.xyz" as it searched through some sites for exploits.
~~~
EEVBlog Members - get yourself 10% discount off all my electronic components for sale just use the Buy Direct links and use Coupon Code "eevblog" during checkout.  Shipping from New Zealand, international orders welcome :-)
 

Offline djos

  • Supporter
  • ****
  • Posts: 778
  • Country: au
Re: Website & Forum Outage 10th Dec 2016
« Reply #30 on: December 15, 2016, 08:07:13 pm »

Maybe if you switched to a high-end hosting provider like AWS.... oh wait...  :-DD

Yes, but with AWS you have a lot of power at your direct command and you can structure your systems to assume the sort of risk you want, from "availability zone in this region goes down, site is dead until it's fixed" to "this thing will keep running unless every Amazon region around the world has died".

But of course, you have to be able to do that setup. If it's not your day job, it's better to leave somebody else to do it for you, if for no other reason than that there are a lot of assholes and botnets out there who will stop at nothing to try and kill your site/server/network.  Some are smarter than others.

Not all of them are as kind as one I caught today, which helpfully announced its UserAgent as "WebFuck V2.1 T0PHackTeam www.t0p.xyz" as it searched through some sites for exploits.

You clearly didn't click on the link: they lost the whole Sydney DC due to a poorly configured BMS not handling a grid voltage sag.

Moral of the story: it can happen to anyone.
The impossible often has a kind of integrity which the merely improbable lacks.
 

Online sleemanj

  • Super Contributor
  • ***
  • Posts: 2092
  • Country: nz
  • Professional tightwad.
    • The electronics hobby components I sell.
Re: Website & Forum Outage 10th Dec 2016
« Reply #31 on: December 15, 2016, 08:22:41 pm »
You clearly didn't click on the link: they lost the whole Sydney DC due to a poorly configured BMS not handling a grid voltage sag.

My point was that the people who were using the Sydney DC could have used the AWS services at their disposal to make their systems resilient to the failure of an entire region, if they wanted to invest the time, effort and expense.

~~~
EEVBlog Members - get yourself 10% discount off all my electronic components for sale just use the Buy Direct links and use Coupon Code "eevblog" during checkout.  Shipping from New Zealand, international orders welcome :-)
 

Offline djos

  • Supporter
  • ****
  • Posts: 778
  • Country: au
Re: Website & Forum Outage 10th Dec 2016
« Reply #32 on: December 15, 2016, 09:05:49 pm »
You clearly didn't click on the link: they lost the whole Sydney DC due to a poorly configured BMS not handling a grid voltage sag.

My point was that the people who were using the Sydney DC could have used the AWS services at their disposal to make their systems resilient to the failure of an entire region, if they wanted to invest the time, effort and expense.

Geographic diversity is no guarantee of 99.9999% uptime. I've managed 2 Salesforce outages in 6 months caused by their Japanese DC suffering networking failures.
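
For a sense of scale, here is a rough sketch of the arithmetic behind a figure like 99.9999%. It assumes a 365-day year and reuses the 2-hour and 16-hour outage durations quoted earlier in the thread; the function names are made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def allowed_downtime_seconds(availability_percent, period_seconds=SECONDS_PER_YEAR):
    """Downtime budget implied by an availability target over a period."""
    return period_seconds * (1 - availability_percent / 100)


def achieved_availability_percent(downtime_seconds, period_seconds=SECONDS_PER_YEAR):
    """Availability over a period given the total downtime in that period."""
    return 100 * (1 - downtime_seconds / period_seconds)


if __name__ == "__main__":
    budget = allowed_downtime_seconds(99.9999)
    print(f"99.9999% allows {budget:.1f} s of downtime per year")
    for label, hours in (("2-hour outage", 2), ("16-hour outage", 16)):
        pct = achieved_availability_percent(hours * 3600)
        print(f"{label}: {pct:.4f}% availability for that year")
```

Under those assumptions, six nines leaves a budget of roughly half a minute per year, so a single outage measured in hours uses up that budget many times over no matter where the servers are.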
The impossible often has a kind of integrity which the merely improbable lacks.
 

