Saw this earlier,
Table './eevblog_smp01/smf_members' is marked as crashed and should be repaired
Hope it's not hackers or something
That was another that popped up too
Thanks for that, I will correct it now, that is a problem from the prior crash.
Edit: This has been done; it should not appear again.
Edit 2: I noted that even though we set up memcached way back, for some reason it is not in use. I have enabled it in SMF and have already noted quite a substantial reduction in server I/O. The forum also seems to be quite a bit faster for me, but I have not done any testing to properly confirm this.
There are two issues at the moment, that have been persistent over the last two days.
The first, and most common, is the error message listed above. I believe this is caused because the MySQL database shat itself for some reason, and when that happens there is a chance that any table that was being accessed will be corrupted; hence the different error messages you see here. This requires a simple but manual table repair to fix, and it seems to be happening every few hours at present.
The second issue is a complete MySQL lockup. In this case the MySQL server has to be restarted, which brings down not only the forum but the WordPress blog too.
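For reference, a table repair like this is usually done with mysqlcheck from the shell. The invocation below is only a sketch: the credentials and options are assumptions, not the actual commands used on this server.

```shell
# Repair the specific table named in the error message
# (prompts for the MySQL root password).
mysqlcheck --repair eevblog_smp01 smf_members -u root -p

# Or check and auto-repair every table in the database in one pass:
mysqlcheck --check --auto-repair --databases eevblog_smp01 -u root -p
```

The same repair can be done from a MySQL prompt with REPAIR TABLE smf_members; which applies to MyISAM tables like the ones this error typically comes from.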
I've disabled Cloudflare until all this gets fixed.
I haven't fiddled with memcached myself, although I did just update to SMF 2.0.9 yesterday.
Cloudflare has nothing to do with this issue, but with regard to the other issues people have had with it, I do not think it is a bad idea to disable it anyway.
I've had constant problems with it myself.
Leave it off then, I will keep a close eye on the server and see how things perform.
I would agree with gnif that CloudFlare should have nothing to do with a MySQL issue. Currently CloudFlare caching is disabled and only the DNS queries are resolved by them. Let's see if that helps at all (which I doubt). I presume that you have checked the MySQL logs (/var/log/mysql/error.log || /var/log/mysqld.log || etc.) for errors?
Besides gnif, do you have sudoers (or any other support person) in other time zones? Maybe someone in the UK or US you trust in person, someone who could restart the bloody MySQL service if needed!
The value of the posts on the forum is substantial. Every time the forum crashes, I panic. Would you confirm that the MySQL databases and httpd data are backed up regularly (cron, rsync, etc.)?
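A backup setup along the lines being asked about might look like the sketch below. The paths, database name, and backup host are illustrative assumptions, not the server's actual layout.

```shell
# Hypothetical /etc/cron.d entry: run the backup script nightly at 03:30.
#   30 3 * * * root /usr/local/bin/forum-backup.sh

# forum-backup.sh: dump the forum database and sync web data off-box.
mysqldump eevblog_smp01 | gzip > /backup/smf-$(date +%F).sql.gz
rsync -a /home/eevblog/public_html/ backuphost:/backup/httpd/
```

Note that a plain mysqldump locks MyISAM tables while it runs, so on a busy forum it is usually scheduled for the quietest hour of the day.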
With regard to RAM, it looks like the server may be overcommitted. If it is not a huge secret, what is the output of free -m after a fresh restart? I am curious to know how much RAM is installed on the server.
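For anyone following along, free -m reports memory in megabytes; the second and third columns of the "Mem:" line are total and used. The sample values below are invented for illustration, since we don't have the real server's numbers. A tiny sketch of pulling them out with awk:

```shell
# A sample line in the format `free -m` prints (values are made up):
sample="Mem:          3951       3700        251          0        120        900"

# Column 2 is total RAM in MB, column 3 is used RAM in MB.
total=$(echo "$sample" | awk '{print $2}')
used=$(echo "$sample" | awk '{print $3}')
echo "RAM used: ${used}MB of ${total}MB"
```

With used this close to total, the kernel would be leaning hard on swap, which would explain heavy disk I/O.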
FYI: Had more of this early this morning:
504 - Gateway Timeout
We are sorry for the inconvenience that this error may be causing you. We are aware of the issue and are working to resolve it, please be patient.
There is no need to report this error.
Thank you for your patience,
Dave & gnif
Apparently the FTDI thing got picked up on Hacker News and I presume the server got overloaded.
Or FTDI decided to nuke these forums too...
More again for a short while today.
Even though I am not entirely convinced they always DO know about the problem, despite this canned response, it is nonetheless clear there is no need to report it.
You are sort of correct. The error page claiming we know about it dates from a prolonged window when we expected this to occur (server changes), and it was never fixed/updated when we were done.
I have been monitoring the server closely to try to determine where the performance bottleneck is in the current configuration. We initially assumed these issues were due to the Slashdot traffic, but it is becoming apparent that there is something more going on.
Yep for the last few weeks, sometimes for some hours.
Today....infrequent short outages.
I have just changed the website to store sessions in memcached instead of in the database on disk; this seems to be the bulk of the I/O occurring and should make a noticeable improvement to performance. If you get timeouts again please don't hesitate to PM me, as I will notice that first. The next stage might just be a simple matter of increasing the number of Apache processes, since the gateway timeout occurs when Nginx cannot communicate with the Apache backend (a better solution yet would be to remove Apache from the equation, but I would prefer not to, as that starts to get outside of what cPanel will support).
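One quick way to confirm memcached is actually taking the session traffic is to watch its hit counters. The host and port below are memcached's defaults, which may not match the server's actual configuration.

```shell
# Ask memcached for its stats over the default port; if SMF sessions are
# really landing in the cache, cmd_get/get_hits climb steadily on reload.
echo stats | nc -w 1 127.0.0.1 11211 | grep -E 'cmd_get|get_hits|curr_items'
```

Running this a few seconds apart and comparing the numbers shows whether the cache is being hit at all, which is cheaper than inferring it from disk I/O graphs.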
Been getting a "session timeout, please attempt to reply again" message lately. Yes, I write overly long posts, but still...
Tim
I sometimes get "session timeout" when marking a board as read, so it's not the long posts.
It went down again a few hours ago.
Even the director of marketing at HostGator emailed me out of the blue and said they noticed it, and asked if it was due to FTDIgate
I guess he's a viewer
So I think it is simply a spike in traffic every time FTDIgate gets linked somewhere.
gnif has put in place caching that seems to be working well, but I guess there is only so much you can do with a single dedicated server.
A proper engineer does not think but measures. Do you monitor RAM usage, disk I/O, and CPU load? There are many utilities in Penguin for both real-time and logged measurement of system metrics (free/top/htop/iftop/iotop/etc.). Maybe it is time for an upgrade?
I'm consistently getting a "504-Gateway timeout" when I try to "Show unread posts…" Anyone else having this problem, or could it be something hosed on my PC? All the other links to various posts seem to work just fine.
I'm getting this 504 Gateway Timeout when I'm trying to enter "www.eevblog.com/forum/testgear/".
Yep, I get that too, but only on my machine at the lab, and only on Firefox. Not other browsers or computers.
https://www.eevblog.com/forum/index.php?action=unread works fine though and does the same thing, but the https://www.eevblog.com/forum/unread/ link does not work.
This is a server side issue and has almost nothing to do with your browser (Firefox) or internet connection (@lab).
The reason you were able to get it through index.php is that you effectively bypassed browser caching by using an alternative URL, so you got a fresh copy. The fresh copy might work fine, or it might be another 504 Gateway Timeout page.
Hey gnif: maybe it would be better to add <META HTTP-EQUIV="Pragma" CONTENT="no-cache"> to the head section of the default 504 Gateway Timeout HTML page. Maybe add a <meta http-equiv="refresh" content="300"> tag too, so the page retries automatically after five minutes.