Author Topic: Forum Outage  (Read 55794 times)

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Re: Forum Outage
« Reply #25 on: November 17, 2013, 05:47:46 pm »
Letting an untrained, unmonitored, unattended cleaner mess up the server room is like letting an ex-rapist clean the bedroom unguarded while your mom/wife/girlfriend/daughter is sleeping on the bed.  :palm:

Offline SeanB

  • Super Contributor
  • ***
  • Posts: 16281
  • Country: za
Re: Forum Outage
« Reply #26 on: November 17, 2013, 06:23:44 pm »
Quote
Letting an untrained, unmonitored, unattended cleaner mess up the server room is like letting an ex-rapist clean the bedroom unguarded while your mom/wife/girlfriend/daughter is sleeping on the bed.  :palm:

This was on a feed from the UPS to the workstations on another floor. Central UPS and dedicated wiring for power to the computers.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1676
  • Country: au
Re: Forum Outage
« Reply #27 on: November 18, 2013, 06:46:52 am »
The outage was caused by a power failure at the data center, and yes, the db corruption was due to the cold reboot.
 

Offline eendje

  • Newbie
  • Posts: 9
Re: Forum Outage
« Reply #28 on: November 18, 2013, 11:58:07 am »
Hi Dave,

Glad it's working again. There are lots of different penguins, but I never heard about a repair penguin,

but it seems to be a cool bird  :clap: :clap: :clap:

Eendje
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1676
  • Country: au
Re: Forum Outage
« Reply #29 on: November 19, 2013, 12:20:04 am »
Quote
The outage was caused by a power failure at the data center, and yes, the db corruption was due to the cold reboot.

Quote
Should I be surprised that they are not protected by a UPS and diesel generators? Is that an extra cost option? Did you have to restore the DB from backups or just run uncommitted updates from a journal?

One would expect them to have that kind of equipment, but they have not provided a reason for the outage. We were able to simply repair the tables, as it was just uncommitted updates.
 

Offline xrunner

  • Super Contributor
  • ***
  • Posts: 7516
  • Country: us
  • hp>Agilent>Keysight>???
Re: Forum Outage
« Reply #30 on: November 19, 2013, 12:22:54 am »
You know SMF is at v 2.0.6 now right? (this forum is still at 2.0.4)
I told my friends I could teach them to be funny, but they all just laughed at me.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1676
  • Country: au
Re: Forum Outage
« Reply #31 on: November 19, 2013, 03:42:20 am »
Quote
You know SMF is at v 2.0.6 now right? (this forum is still at 2.0.4)

Normally that is for Dave to handle, but I believe this is 2.0.6 and that, due to the reboot, it is a little confused about its version. When I find some time I will have a look into it.
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37734
  • Country: au
    • EEVblog
Re: Forum Outage
« Reply #32 on: November 19, 2013, 03:43:51 am »
Quote
You know SMF is at v 2.0.6 now right? (this forum is still at 2.0.4)

Quote
Normally that is for Dave to handle, but I believe this is 2.0.6 and that, due to the reboot, it is a little confused about its version. When I find some time I will have a look into it.

I just upgraded to 2.0.6
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37734
  • Country: au
    • EEVblog
Re: Forum Outage
« Reply #33 on: November 21, 2013, 02:16:14 am »
Looks like another power failure or whatever happened to the server again, along with another database corruption.
All fixed now.
 

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Re: Forum Outage
« Reply #34 on: November 21, 2013, 02:20:58 am »
Is there any SLA or anything similar on the server uptime? This is getting worse.  :-\

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37734
  • Country: au
    • EEVblog
Re: Forum Outage
« Reply #35 on: November 21, 2013, 02:51:03 am »
And it just happened AGAIN 30 min later!
Two database tables this time.
The data centre really has some serious issues at present...
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37734
  • Country: au
    • EEVblog
Re: Forum Outage
« Reply #36 on: November 21, 2013, 02:57:07 am »
Quote
Is there any SLA or anything similar on the server uptime? This is getting worse.  :-\

Like most, it's like 99.9% or something. But when shit happens, what do you do, change hosts? That would be foolish.
This server has been very good to me reliability-wise.
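
For a rough sense of what a figure like 99.9% allows, here is a minimal Python sketch. The 99.9% value is only the approximate number mentioned above, not a confirmed SLA term, and the function name is made up for illustration.

```python
# Sketch: downtime permitted by an uptime percentage over a given period.
# 99.9% is the rough figure quoted in the post, not a confirmed SLA term.

def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over period_days."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        minutes = allowed_downtime_minutes(99.9, days)
        print(f"99.9% over a {label}: {minutes:.1f} min ({minutes / 60:.2f} h)")
```

That works out to roughly 43 minutes per 30-day month, or a little under 9 hours per year.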
 

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Re: Forum Outage
« Reply #37 on: November 21, 2013, 03:00:47 am »
Yep, just saw it again; it said something like the forum member list is corrupted and crashed.  :o

Edit :
This -> "Table './eevblog_smp01/smf_members' is marked as crashed and should be repaired"
« Last Edit: November 21, 2013, 03:05:34 am by BravoV »
 

Offline digsys

  • Supporter
  • ****
  • Posts: 2209
  • Country: au
    • DIGSYS
Re: Forum Outage
« Reply #38 on: November 21, 2013, 03:07:55 am »
OUCH! Yup, saw something similar to that too .... maybe there's a dead forum member stuck in a post :-) ???
Hello <tap> <tap> .. is this thing on?
 

Offline nanofrog

  • Super Contributor
  • ***
  • Posts: 5446
  • Country: us
Re: Forum Outage
« Reply #39 on: November 21, 2013, 03:08:56 am »
FWIW, just got the following error message not long ago.

Quote
Table './eevblog_smp01/smf_members' is marked as crashed and should be repaired
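
That message is MySQL's standard complaint about a damaged MyISAM table, and the usual remedy is a `REPAIR TABLE` statement. The thread does not say which tool the admins actually used, so the following is only a sketch under that assumption: a hypothetical Python helper that pulls the schema and table name out of the error text and builds the statement an admin could run.

```python
import re
from typing import Optional

# Matches the error text quoted above, e.g.
# "Table './eevblog_smp01/smf_members' is marked as crashed and should be repaired"
CRASHED_RE = re.compile(
    r"Table '\./(?P<schema>[^/']+)/(?P<table>[^']+)' is marked as crashed"
)


def repair_statement(error_message: str) -> Optional[str]:
    """Build a REPAIR TABLE statement from a 'marked as crashed' error.

    Returns None if the message does not match. This helper is hypothetical;
    it only builds the SQL text and does not connect to any database.
    """
    match = CRASHED_RE.search(error_message)
    if match is None:
        return None
    return "REPAIR TABLE `{schema}`.`{table}`;".format(**match.groupdict())


if __name__ == "__main__":
    msg = ("Table './eevblog_smp01/smf_members' is marked as crashed "
           "and should be repaired")
    print(repair_statement(msg))
```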
 

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Re: Forum Outage
« Reply #40 on: November 21, 2013, 03:11:30 am »
Not an expert on this matter, just wondering: if it keeps crashing and being recovered again and again, will this affect the overall integrity of the SMF data in the long run?

Offline vk6zgo

  • Super Contributor
  • ***
  • Posts: 7586
  • Country: au
Re: Forum Outage
« Reply #41 on: November 21, 2013, 03:14:40 am »
Glad to see it is back up...
I come here when things are slow over on QRZ (The Zed).

Hi Sue!
 

Offline strangelovemd12

  • Regular Contributor
  • *
  • Posts: 102
  • Country: 00
Re: Forum Outage
« Reply #42 on: November 21, 2013, 03:18:44 am »
I just found this place a few days ago and I love it in every way, except the vBulletin flashbacks I'm getting at every corner.  The good news is that a Google search on the outage informed me that Dave tweets like a canary in orgasm.  Followed!
Please hit my ignorance with a big stick.
 

Offline Anks

  • Frequent Contributor
  • **
  • Posts: 252
  • Country: gb
    • www.krisanks.wordpress.com
Re: Forum Outage
« Reply #43 on: November 21, 2013, 03:28:43 am »
Quote
Is there any SLA or anything similar on the server uptime? This is getting worse.  :-\

Quote
Like most, it's like 99.9% or something. But when shit happens, what do you do, change hosts? That would be foolish.
This server has been very good to me reliability-wise.

I'm with you on this, Dave. Changing hosts, in my experience, generally brings different issues, and this forum isn't the worst for outages I've seen.
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
Re: Forum Outage
« Reply #44 on: November 21, 2013, 08:10:33 am »
Quote
Like most, it's like 99.9% or something. But when shit happens, what do you do, change hosts? That would be foolish.
This server has been very good to me reliability-wise.

A reasonable stance, but it might be worth putting some thought into hardening the forum against server outages, since the next time it goes down the database might not be repairable.

I'm sure you have good backups - have you investigated turning on full (data as well as metadata) journalling (might be a performance hit) on your filesystems or using a more robust database (as some suggested earlier in the thread)?
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Forum Outage
« Reply #45 on: November 21, 2013, 08:50:45 am »
Quote
And it just happened AGAIN 30 min later!
Two database tables this time.
The data centre really has some serious issues at present...

Maybe it's time to get serious about software and hardware. Back in the old days when I was a sysadmin, I more or less had this checklist for anything that needed to be reliable:
- Dell server or HP ProLiant server with ECC memory
- Database system with journaling / automatic recovery (preferably a real database like PostgreSQL and most certainly not MySQL with MyISAM tables)

I can imagine the database is tuned for performance rather than reliability, but as this forum is your income you had better tune for reliability, even if that means setting up a second server to handle the load. Maybe even hire an expert to harden the database, but I'd check the quality of the server hardware first.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline arekm

  • Supporter
  • ****
  • Posts: 165
  • Country: pl
Re: Forum Outage
« Reply #46 on: November 21, 2013, 08:54:56 am »
Really try InnoDB instead of MyISAM (and there is always a way to go back if there are any problems).
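
In MySQL, switching a table's storage engine is a single `ALTER TABLE ... ENGINE=...` statement, and the same statement with `ENGINE=MyISAM` goes back. As a sketch only: the helper below is hypothetical, `smf_members` is the table named in the error earlier in this thread, and `smf_messages` is assumed purely for illustration. It builds the SQL text without touching a database.

```python
from typing import Iterable, List

# Engines this sketch will generate statements for.
ALLOWED_ENGINES = {"InnoDB", "MyISAM"}


def engine_change_statements(tables: Iterable[str], engine: str) -> List[str]:
    """Return one ALTER TABLE ... ENGINE=<engine> statement per table name.

    Hypothetical helper: it only builds SQL strings. Note that tables with
    FULLTEXT indexes need MySQL 5.6 or later to be converted to InnoDB.
    """
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    return [f"ALTER TABLE `{name}` ENGINE={engine};" for name in tables]


if __name__ == "__main__":
    # Table names are assumptions for illustration, not a full SMF table list.
    for statement in engine_change_statements(["smf_members", "smf_messages"], "InnoDB"):
        print(statement)
```

Take a backup first and try it on a copy of the database, since the conversion rewrites each table.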
 

Offline Clint

  • Regular Contributor
  • *
  • Posts: 119
  • Country: gb
Re: Forum Outage
« Reply #47 on: November 21, 2013, 09:04:45 am »
Move the whole thing to Amazon, it's all I am working on now with superb results, never any worry about hardware :)
=-=-=-=-=-=-=-=-=
g33K5 L1k3 80085
 

alm

  • Guest
Re: Forum Outage
« Reply #48 on: November 21, 2013, 09:06:06 am »
Until recently MySQL didn't support full-text search in InnoDB tables, and I believe MariaDB still doesn't support it. This might be the reason for the MyISAM tables. But these crashes have been a good demonstration why ignoring the consistency from ACID, like MyISAM does, may be a bad idea for real databases.
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37734
  • Country: au
    • EEVblog
Re: Forum Outage
« Reply #49 on: November 21, 2013, 10:14:45 am »
Quote
Maybe it's time to get serious about software and hardware.

How much more serious can I get about hardware? I already run a multi-hundred dollar/month dedicated top shelf server box with a top provider.
Arguing over which host provider is best, and which hardware is best is pretty pointless, ask 100 experts (I have!), and you get 100 different answers.
IME, no matter which host provider you go with, even the "redundant cloud" type, your site can go down.
I do daily automated full site and database backups to a remote site (need a new solution for this, as autositebackup.com are ceasing)
And really, if the forum is down for a few hours (rare), or even a few days (never happened), is it the end of the world?

Quote
- Database system with journaling / automatic recovery (preferably a real database like PostgreSQL and most certainly not MySQL with MyISAM tables)

I believe I can only run what the SMF forum software uses?
« Last Edit: November 21, 2013, 10:16:26 am by EEVblog »
 

