-
Return of 502- Bad gateway Error
Posted by
Kevin.D
on 01 Feb, 2015 12:03
-
The
"502 - Bad Gateway
We are sorry for the inconvenience that this error may be causing you. We are aware of the issue and are working to resolve it, please be patient.
There is no need to report this error.
Thank you for your patience,
Dave & gnif
"
The old error above has returned since yesterday on certain forum pages, after being OK for about 8 months.
Has the forum web server been updated or something?
Regards
-
#1 Reply
Posted by
macboy
on 01 Feb, 2015 19:09
-
Clear your cookies. Works for me every time.
-
#2 Reply
Posted by
tautech
on 05 Apr, 2015 03:46
-
Anybody else having problems today with this?
~ the last 6 hrs
Main forum: bad gateway
Main topic boards: bad gateway
Topics posted in: OK
All sub boards: OK
Any type of Reply in any thread: OK
-
-
Yes, I saw it this morning, one of the general chat threads,
it was a two page thread, first page was fine, second page
was bad gateway, nothing to do with cookies on my pc or tv.
I will go fetch it and let you know which one.
Muttley
-
-
Found one, test equipment, Keithley 2000 page two,
see if anyone else can open it, reply was server error.
Muttley
-
#5 Reply
Posted by
cs.dk
on 05 Apr, 2015 09:01
-
Found one, test equipment, Keithley 2000 page two,
see if anyone else can open it, reply was server error.
Muttley
I just get a blank page.. Win7 64/FF browser.
-
-
Looks like miguelvp was the last one in there,
so we can all point the finger at him for leaving it a mess..
Muttley
-
#7 Reply
Posted by
miguelvp
on 05 Apr, 2015 10:29
-
Nope, I had the same problem, so I hit reply and saw that Dave had posted "test" or something like that.
So what I posted in my reply was:
somewhere in page two of this thread something went really wrong, I just get a blank page and the only way to read it is by pressing reply and reading the messages below
Edit: I'm sure it wasn't Dave that broke it, but he was looking into why it was happening. The bad post is probably the one before his. All you have to do is hit reply on page 1, in case you didn't figure that out
-
#8 Reply
Posted by
miguelvp
on 05 Apr, 2015 10:50
-
Here are the captures of the posts missing, maybe posting more will make it jump to page 3.
Some of the screenshots have overlaps, didn't want to spend too much time doing it just right
Edit: or maybe Dave was testing something that broke the thread? Hey he could have been testing some new forum feature that went totally wrong!
-
#9 Reply
Posted by
amyk
on 05 Apr, 2015 13:23
-
That Keithley 2000 thread gives a 500 Internal Server Error for me on the second page. First page is readable.
-
#10 Reply
Posted by
uncle_bob
on 16 Jun, 2016 18:42
-
Hi
I realize a few things:
1) This is an old thread
2) Stuff happens and generally gets fixed
3) In general things work very well
4) I have no desire at all to spend time debugging somebody else's code. I have plenty of bugs in my own ....
At what point does the "Bad Gateway" become a reportable / correctable error? It has been hitting me on the "show replies" section for several days solid. Is the trip point a week?
Bob
-
#11 Reply
Posted by
Back2Volts
on 17 Jun, 2016 04:56
-
I experienced a bunch of 502s a while back, maybe 7-10 days ago. For a while it was unusable. Lately it has behaved for me.
knock on wood !
-
#12 Reply
Posted by
uncle_bob
on 17 Jun, 2016 11:56
-
Hi
I'm back up and running. It's just sort of weird how there is no real way to know if it's on my end (as in a cached error page) or something else.
Bob
-
#13 Reply
Posted by
RoGeorge
on 21 Mar, 2017 17:56
-
Not always, but sometimes - like once a week or so - I get this 502 error.
It happens ONLY for https://www.eevblog.com/forum/...any_topic_here/; I have never seen this error on other sites/forums. Reloading the page fixes the error, but it's annoying, especially if it happens when you want to post a reply. At least I can still find the reply text and the edit form if I hit Back in the browser.
Win10/Firefox
-
#14 Reply
Posted by
Neganur
on 21 Mar, 2017 18:03
-
-
#15 Reply
Posted by
Benta
on 21 Mar, 2017 20:02
-
-
#16 Reply
Posted by
tautech
on 21 Mar, 2017 20:12
-
-
#17 Reply
Posted by
jpanhalt
on 22 Mar, 2017 23:15
-
This thread is more than a year old, and administrator "gnif" seems to have no other response than to blame the users for the bad gateway fail. Why hasn't it been fixed?
For at least the 6th time since I have been participating, I got this message (Bad Gateway message attached below). It is a PITA to get fixed by the user.
Second, please update your attachments/images process to the 21st century. Why can't we use ctrl+V to paste a png snippet in line with a comment?
PC = Intel-based home made
OS = Win 7 Pro
Browser = Chrome (current)
John
-
#18 Reply
Posted by
tooki
on 23 Mar, 2017 00:51
-
This thread is more than a year old, and administrator "gnif" seems to have no other response than to blame the users for the bad gateway fail. Why hasn't it been fixed?
For at least the 6th time since I have been participating, I got this message (Bad Gateway message attached below). It is a PITA to get fixed by the user.
Where is gnif blaming the user? He said it's high CPU load of unknown origin. He merely commented that force-refreshing isn't solving, but exacerbating the problem.
-
#19 Reply
Posted by
RoGeorge
on 24 Mar, 2017 20:56
-
I am constantly getting the 502 Bad Gateway error when I'm accessing the links on the right of my avatar, the "Show unread posts since last visit." and "Show new replies to your posts."
-
#20 Reply
Posted by
jpanhalt
on 24 Mar, 2017 21:09
-
I am constantly getting the 502 Bad Gateway error when I'm accessing the links on the right of my avatar, the "Show unread posts since last visit." and "Show new replies to your posts."
Have you tried a refresh? That is F5 or CTRL+F5 (assuming you are Windows).
John
-
#21 Reply
Posted by
EEVblog
on 24 Mar, 2017 21:43
-
-
-
Yes this forum was almost unusable yesterday. Gave me time to do something productive with my life, like personal hygiene.
-
#23 Reply
Posted by
gnif
on 25 Mar, 2017 02:45
-
This thread is more than a year old, and administrator "gnif" seems to have no other response than to blame the users for the bad gateway fail. Why hasn't it been fixed?
For a long time gnif was a volunteer. As far as I know he still is. But that may have changed recently. I also missed where he blamed users. Last I saw he said it was an issue with record locking during DB maintenance.
I did not blame users. I stated that forcing a full refresh, clearing the browser cache, was pointless and exacerbated the issue. I never stated the issue was caused by users.
-
#24 Reply
Posted by
jpanhalt
on 25 Mar, 2017 04:09
-
From yesterday morning (March 24 at 1:48 AM Eastern DST, USA) (https://www.eevblog.com/forum/news/forum-database-upgrade/msg1168976/#msg1168976 ):
Depends on time of day, to be honest this morning at 4:00AM (ADST) was the first time I had ever seen the 502 error personally, which was a lucky break as I was able to inspect the cause while it was occurring.
I have been using a refresh (F5 or CTRL+F5), and it has worked for me. I interpreted your prior statement that such action "exacerbated the issue" as blaming those users, like myself, who use that workaround for making the issue worse.
If that is not the case, can you explain why it seems to work, and more important, how you knew that it made the "situation" worse before you had actually observed the problem? How does it make the situation worse for other users?
John
-
#25 Reply
Posted by
SeanB
on 25 Mar, 2017 09:37
-
Refresh would force a reload, and generally a transient overload on the server, which caused the original failure, will be gone. However, the continuing reloads only add to the load on the server.
A much better thing instead is to stop, get up from your device, and go either have a drink of water, a short walk around outside, or something else that does not involve going to the website. This serves many functions. Firstly you are not adding to the load on the single site, so the backlog will be done, and the next time you come there, a few minutes later, you will be successful. Second it gets you off your backside, and moving about a little, alleviating the risk of DVT and at least getting your circulation going better. The drink of water will hydrate you as well, and if you go for a short walk it will clear your brain.
Note none of this has any cost, and also is going to both relax you, be good for your general well being and help you in reducing stress in your life.
-
#26 Reply
Posted by
jpanhalt
on 25 Mar, 2017 09:55
-
Refresh would force a reload, and generally a transient overload on the server, which caused the original failure, will be gone. However, the continuing reloads only add to the load on the server.
A much better thing instead is to stop, get up from your device, and go either have a drink of water, a short walk around outside, or something else that does not involve going to the website. This serves many functions. Firstly you are not adding to the load on the single site, so the backlog will be done, and the next time you come there, a few minutes later, you will be successful. Second it gets you off your backside, and moving about a little, alleviating the risk of DVT and at least getting your circulation going better. The drink of water will hydrate you as well, and if you go for a short walk it will clear your brain.
Note none of this has any cost, and also is going to both relax you, be good for your general well being and help you in reducing stress in your life.
Neglecting your smartass advice and addressing your content...
It seems clear you don't know the reason for the error, as your suggested solution simply didn't work. First time it happened to me, I tried accessing over two days, including at least one complete shutdown of my system. Although I had not had that exact error before, my assumption was something at the server, which for most sites that I visit is fixed within a few hours. I then cleared my shortcut and re-entered the URL. That worked.
Next time it happened, I did a very quick search on Google, found the refresh option in a stackexchange thread on the very subject, and have been using that option since.
As best I can tell, this problem on eevblog was first reported in 2014 (https://www.eevblog.com/forum/chat/%27recent-posts%27-502-bad-gateway-error/ ). That and several related threads have had no response from administration. As they say, if you are not part of the solution, you are part of the problem.
John
-
#27 Reply
Posted by
SeanB
on 25 Mar, 2017 10:10
-
Neglecting your smartass advice and addressing your content...
It seems clear you don't know the reason for the error, as your suggested solution simply didn't work. First time it happened to me, I tried accessing over two days, including at least one complete shutdown of my system. Although I had not had that exact error before, my assumption was something at the server, which for most sites that I visit is fixed within a few hours. I then cleared my shortcut and re-entered the URL. That worked.
Next time it happened, I did a very quick search on Google, found the refresh option in a stackexchange thread on the very subject, and have been using that option since.
As best I can tell, this problem on eevblog was first reported in 2014 (https://www.eevblog.com/forum/chat/%27recent-posts%27-502-bad-gateway-error/ ). That and several related threads have had no response from administration. As they say, if you are not part of the solution, you are part of the problem.
John
Does work for me though.
-
#28 Reply
Posted by
PA0PBZ
on 25 Mar, 2017 10:16
-
What seems to be going on and where the confusion comes from I think is the following:
Whenever I get/got this 502 error it always takes some time before it shows up, but then when I go back and click the link again the error shows up instantly, so my guess is that the error gets cached somehow. You can click the link 10 times if you want and the error stays, but one time <Ctrl-F5> will clear it. So I don't care how many people say that it doesn't work, it works for me and I'm not the only one.
-
#29 Reply
Posted by
tooki
on 25 Mar, 2017 10:44
-
OK, I see where you're coming from.
Just to make sure we're on the same page, we know that the error itself is a server-side problem.
I guess the link in the error page causes the browser to use its cache. So when the server-side problem resolves itself, the browser isn't even attempting to reload.
Force refreshing doesn't make the server-side problem go away, but in fact exacerbates it. But without a refresh, your browser will continue to show the error message instead of the page (at least, until the expiration time of the cached page elapses.) So we've got the issue where refreshing actually delays the root issue being fixed, but is nonetheless necessary for the user-facing problem to go away.
I've normally simply walked away when I've gotten the error, done something else, and 5 minutes later it works. That's unwittingly given the server enough time to catch its breath, at which point the refresh succeeds.
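The "walk away and try again later" approach described above can be sketched as a retry loop with exponential backoff. This is only an illustration: `fetch` is a hypothetical callable that returns an HTTP status code, and the attempt count and delays are made-up defaults, not anything the forum or its admins recommend.

```python
import random
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch() until it returns a non-5xx status, waiting longer each time.

    fetch is a hypothetical callable returning an HTTP status code (int).
    Returns the list of statuses seen, ending with the final one.
    """
    statuses = []
    for attempt in range(max_attempts):
        status = fetch()
        statuses.append(status)
        if status < 500:
            break  # success or a client error: retrying will not help
        if attempt < max_attempts - 1:
            # Double the wait after each failure, capped at max_delay,
            # with random jitter so many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
    return statuses
```

Compared with hitting F5 repeatedly, each failed attempt here makes the client wait longer before asking again, which is the behaviour that gives an overloaded server room to recover.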
-
#30 Reply
Posted by
tautech
on 25 Mar, 2017 18:23
-
To add to others' experience, and as mentioned before: the Unread Replies link might not work, but immediately removing the forward slash from the URL does. Thereafter refreshing the page works without incident indefinitely, but using the link again produces errors again no matter how long one waits.
I've had the same problem with the main forum page off and on for a year or more; again, tweaking the URL by removing .php gets it working without exception.
It's typical to need multiple tabs open to be able to use the forum and keep up to date.
-
#31 Reply
Posted by
Monkeh
on 25 Mar, 2017 18:41
-
To add to others experience and mentioned before, Unread Replies link might not work but immediately removing the forward slash from the URL does. Thereafter refreshing the page works without incident indefinitely but using the link again produces errors again no matter how long one waits.
I've had same problem with the main forum page off and on for a year or more, again tweaking the URL by removing .php gets it working without exception.
It's typical to need multiple tabs open to be able to use the forum and keep up to date.
This seems far more likely to be your browser being exceedingly badly behaved. That or your ISP is running a really, really broken caching proxy.
-
#32 Reply
Posted by
Monkeh
on 25 Mar, 2017 19:05
-
How about Cloudflare being the broken caching proxy?
If that were the case there'd be more than one or two people reporting such substantial issues. I'm no fan of Cloudflare, but they're not to blame here. Just for a dozen other things.
-
#33 Reply
Posted by
hendorog
on 25 Mar, 2017 19:28
-
OK, I see where you're coming from.
Just to make sure we're on the same page, we know that the error itself is a server-side problem.
I guess the link in the error page causes the browser to use its cache. So when the server-side problem resolves itself, the browser isn't even attempting to reload.
Force refreshing doesn't make the server-side problem go away, but in fact exacerbates it. But without a refresh, your browser will continue to show the error message instead of the page (at least, until the expiration time of the cached page elapses.) So we've got the issue where refreshing actually delays the root issue being fixed, but is nonetheless necessary for the user-facing problem to go away.
^^^ This.
The browser is caching the error page for days without a force refresh - I just cleared the Database Maintenance error message dated 24th March with a forced refresh.
-
#34 Reply
Posted by
tooki
on 25 Mar, 2017 21:06
-
But that's weird. It shouldn't stay cached for anywhere near that long! (And on my computers/devices, it doesn't.)
-
#35 Reply
Posted by
jpanhalt
on 25 Mar, 2017 22:03
-
But that's weird. It shouldn't stay cached for anywhere near that long! (And on my computers/devices, it doesn't.)
Shouldn't, wouldn't, couldn't ...
I have heard that too many times to count. In my business, such theories didn't hold a candle to what actually happens. I know of at least one IT "expert" who got fired because he didn't understand that difference.
If this blog wants to be PC-specific, then it needs to tell us what we need to do to get access. Otherwise, it needs to be robust enough to work with many platforms, including at least Windows and Chrome.
John
-
#36 Reply
Posted by
xrunner
on 26 Mar, 2017 00:42
-
Man this error is really bad here this evening. Not sure why ...
-
-
But that's weird. It shouldn't stay cached for anywhere near that long! (And on my computers/devices, it doesn't.)
There's some strange caching going on now that wasn't before.
'Shift-Reload'.
-
#38 Reply
Posted by
xrunner
on 26 Mar, 2017 02:31
-
I just got a "524 Error"
I tried to get a screen capture but failed. I'll try next time.
-
#39 Reply
Posted by
bktemp
on 26 Mar, 2017 05:51
-
I just got a "524 Error"
I had one too yesterday. It was the first time I had seen this error on the forum.
First it took a long time loading the page (maybe 30s) then the 524 timeout error showed up (it was from cloudflare). After pressing F5 it turned into the 502 error page. Pressing F5 again finally loaded the page but very slowly (>10s).
It looks like there is still something wrong and occasionally the server is busy servicing all the requests.
-
#40 Reply
Posted by
tautech
on 26 Mar, 2017 07:24
-
To add to others experience and mentioned before, Unread Replies link might not work but immediately removing the forward slash from the URL does. Thereafter refreshing the page works without incident indefinitely but using the link again produces errors again no matter how long one waits.
I've had same problem with the main forum page off and on for a year or more, again tweaking the URL by removing .php gets it working without exception.
It's typical to need multiple tabs open to be able to use the forum and keep up to date.
This seems far more likely to be your browser being exceedingly badly behaved. That or your ISP is running a really, really broken caching proxy.
I've often wondered the same but no other websites repeatedly throw errors like the forum does.
Chrome, latest version and Win 7 64 bit.
I consistently have ~20 tabs open as I have slow internet and it's simpler/faster to refresh those I work with than reopen them.
-
#41 Reply
Posted by
hendorog
on 26 Mar, 2017 07:43
-
To add to others experience and mentioned before, Unread Replies link might not work but immediately removing the forward slash from the URL does. Thereafter refreshing the page works without incident indefinitely but using the link again produces errors again no matter how long one waits.
I've had same problem with the main forum page off and on for a year or more, again tweaking the URL by removing .php gets it working without exception.
It's typical to need multiple tabs open to be able to use the forum and keep up to date.
This seems far more likely to be your browser being exceedingly badly behaved. That or your ISP is running a really, really broken caching proxy.
Actually I'd say this is just an artifact of the way the forum servers are set up. The cached error page is associated with the URL you requested with the slash on the end. Removing the slash is seen as a different request, so the cache is ignored and the request hits the server. This particular proxy/server/app combo doesn't care whether the slash is there or not and so serves up the same page.
Same thing happens when you remove the .php. The server still serves the page with or without the .php, but the cache in the browser sees it as a different request.
i.e. These all go to the same page:
https://www.eevblog.com/forum/index
https://www.eevblog.com/forum/index.php
https://www.eevblog.com/forum/
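A minimal sketch of that explanation, assuming a cache that keys on the exact URL string and a server that answers every spelling of the address the same way. `ExactUrlCache` and the toy `fetch` function are hypothetical and only model the behaviour described; they are not how any particular browser or proxy is implemented.

```python
class ExactUrlCache:
    """Toy cache keyed on the exact URL string, with no normalisation."""

    def __init__(self):
        self._store = {}

    def get(self, url, fetch):
        # Only an identical string counts as a cache hit.
        if url not in self._store:
            self._store[url] = fetch(url)
        return self._store[url]


def demo():
    server_up = {"value": False}

    def fetch(url):
        # Hypothetical server: same answer for every spelling of the URL.
        return "forum index" if server_up["value"] else "502 Bad Gateway"

    cache = ExactUrlCache()
    with_slash = "https://www.eevblog.com/forum/"
    with_php = "https://www.eevblog.com/forum/index.php"

    first = cache.get(with_slash, fetch)   # outage: the error page is cached
    server_up["value"] = True              # the server recovers
    stale = cache.get(with_slash, fetch)   # same key: stale error page again
    fresh = cache.get(with_php, fetch)     # new key: request reaches the server
    return first, stale, fresh
```

Calling `demo()` shows the link with the trailing slash still returning the cached error after the server has recovered, while the other spelling fetches the live page.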
-
#42 Reply
Posted by
tooki
on 26 Mar, 2017 10:54
-
But that's weird. It shouldn't stay cached for anywhere near that long! (And on my computers/devices, it doesn't.)
Shouldn't, wouldn't, couldn't ...
I have heard that too many times to count. In my business, such theories didn't hold a candle to what actually happens. I know of at least one IT "expert" who got fired because he didn't understand that difference.
Oy gevalt, uppity much?
I said it shouldn't, not that it wasn't. The entire implication of my comment was that it's strange that you're experiencing that behavior. Ergo, something to look into.
As for someone getting fired, that's meaningless, in that people are often fired as scapegoats for something they really weren't responsible for.
If this blog wants to be PC-specific, then it needs to tell us what we need to do to get access. Otherwise, it needs to be robust enough to work with many platforms, including at least Windows and Chrome.
What on earth is this rant about?
-
#43 Reply
Posted by
RoGeorge
on 26 Mar, 2017 11:11
-
How does a server distinguish between a page request coming from the link "Show unread posts since last visit." (the one on the top of each page, near the user's avatar) versus the same address, "https://www.eevblog.com/forum/unread/", coming from a refresh (F5) action?
I am asking this because:
- before the database maintenance, I had 502 Err here and there, maybe once a week, and the error went away by itself at the next access of the same link
- after the database maintenance, the two links near the avatar ALWAYS returned 502 Err for me. I tested it many times; I tried them maybe 20-30 times over a couple of hours. Always 502 Err.
- then, after one F5 for each link, the 502 Err was gone for good. I haven't seen a 502 Err since then.
Of course, this might be just a coincidence, but that's very hard to believe.
The most probable explanation to me is that the server received a different type of request for the F5, compared with the request coming from the link "https://www.eevblog.com/forum/unread/". That different request, apart from re-sending the page, also permanently fixed the 502 Err.
Seems to me that F5 is sending something other than just the URL from the address bar, or at least is doing something different on the browser side.
So, my curiosity is: What exactly does a browser send/do when I press F5, compared to a simple click on a link, please?
-
#44 Reply
Posted by
tooki
on 26 Mar, 2017 21:57
-
I don't know the exact naming, but an HTTP request does include the "referrer", so a web server really can distinguish a clicked link from a refresh or a manually-entered URL.
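For what it's worth, the header is spelled `Referer` in HTTP, and a reload typically differs from a link click in its cache-related request headers. The sketch below lists the headers browsers commonly add in each case; the exact set varies by browser and version, so treat the values as assumptions. The `headers_for` and `build_request` helpers and the Referer value are made up for illustration.

```python
import urllib.request

URL = "https://www.eevblog.com/forum/unread/"


def headers_for(kind):
    """Return request headers a browser commonly adds for each action.

    These are typical values only; real browsers differ in the details.
    """
    if kind == "link":
        # A clicked link usually names the page it was clicked on.
        return {"Referer": "https://www.eevblog.com/forum/"}
    if kind == "reload":
        # Normal reload (F5): ask caches to revalidate before reusing.
        return {"Cache-Control": "max-age=0"}
    if kind == "hard-reload":
        # Forced reload (Ctrl+F5): bypass cached copies entirely.
        return {"Cache-Control": "no-cache", "Pragma": "no-cache"}
    raise ValueError(f"unknown kind: {kind}")


def build_request(kind):
    # Builds the request object only; nothing is sent over the network.
    return urllib.request.Request(URL, headers=headers_for(kind))
```

A plain link click carries none of the cache-busting headers, so the browser is free to answer from its own cache without contacting the server at all, which would fit the reports above of an error page that only a refresh clears.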
-
-
502s seen on both the 502 threads. LOL
-
#46 Reply
Posted by
tkuhmone
on 27 Mar, 2017 12:08
-
Fyi,
I saw a 502 error today at 12:00 GMT. Waited a couple of minutes and then tried to log in to the forum again -> works OK...
-
#47 Reply
Posted by
EEVblog
on 27 Mar, 2017 12:40
-
I've had several slowdowns and 524 errors tonight.
The server will be getting a RAM upgrade soon which gnif assures me will help.
-
#48 Reply
Posted by
xrunner
on 28 Mar, 2017 01:50
-
Don't know if any upgrades have been done, but no errors at all today!
-
#49 Reply
Posted by
tautech
on 28 Mar, 2017 02:35
-
Don't know if any upgrades have been done, but no errors at all today!
Just checked my Unread Replies link that's returned 502 errors for days and up until a few hours ago, instead of using it I leave a tab open and just refresh it......don't need to do that now.
Hope I don't have to post in this thread again.
-
#50 Reply
Posted by
tautech
on 28 Mar, 2017 03:42
-
-
#51 Reply
Posted by
gnif
on 28 Mar, 2017 06:49
-
Please be patient, we are working on a solution to the issue. Previously there was a database table lock contention issue which has been resolved, but in doing so the server's workload has been increased. Dave is having additional RAM installed today which will allow us to cache the entire database working set in it, which should improve performance considerably.
We also have some server back end changes coming soon after some testing and debugging is done, a select group of users are currently using the new configuration and when we are confident it is ready it will be switched on for the masses.
-
#52 Reply
Posted by
bktemp
on 28 Mar, 2017 07:36
-
It is probably due to the server upgrade, I just had a session timed out message and Connection Problems message. This is the first time I have seen those messages.
The following error or errors occurred while posting this message:
Your session timed out while posting. Please try to re-submit your message.
Connection Problems
Sorry, SMF was unable to connect to the database. This may be caused by the server being busy. Please try again later.
-
#53 Reply
Posted by
gnif
on 28 Mar, 2017 07:43
-
It is probably due to the server upgrade, I just had a session timed out message and Connection Problems message. This is the first time I have seen those messages.
The following error or errors occurred while posting this message:
Your session timed out while posting. Please try to re-submit your message.
Connection Problems
Sorry, SMF was unable to connect to the database. This may be caused by the server being busy. Please try again later.
Please also note that after the server comes back online there will be a period of tuning which will cause various errors and timeouts, I will update this thread when this is completed, there is no need to report these and I apologize in advance for the frustration it will cause.
-
#54 Reply
Posted by
tautech
on 28 Mar, 2017 08:25
-
All good here so far gnif.
-
#55 Reply
Posted by
edy
on 06 Jul, 2017 05:10
-
Strange, when I use Chromium on Ubuntu 16.04 I am getting the 502 - Bad Gateway message for the forums.
However, in another browser on the same machine (Mozilla Firefox) I am able to load the forums and currently posting this message. I've tried refreshes, clearing caches, etc.... all to no avail. Chromium doesn't want to load the forum, but Firefox is ok. I wonder if maybe it is a browser compatibility issue?
The other thing I've noticed is that Firefox complained that EEVBLOG wasn't secure. I'm not sure if that has always been the case, or if we lost the https or something changed with a certificate?
[EDIT: I didn't realize this thread was from March... I just searched because I was experiencing this problem the last couple of days. Anyone else notice this?]
-
#56 Reply
Posted by
igendel
on 06 Jul, 2017 05:15
-
Anyone else notice this?
Me too, on Firefox - the links for "Show unread posts since last visit" and "Show new replies to your posts" lead me to a 502 page for a few days now. The actual URL has a "/" at the end, if I delete it manually I get the expected pages.
-
-
It appears that the forum was taken down within the last few hours(?) for maintenance or upgrade, etc. I noticed that I repeatedly saw the same "down for maintenance" notice whenever I tried back after a few minutes. Until I remembered that there is some significant culpability from the browser here. It doesn't actually re-try the URL, it just shows the same notice over and over again. I finally remembered and forced the browser to refresh the page, whereupon, EEVblog Forum was back in all its glory.
So the learning from that is to use the explicit browser refresh page function before concluding that the page is really not available.
-
-
I've been getting the 502 - Bad Gateway error the last few days as well. It seems to only happen when I click on the "Projects, Designs & Technical Stuff" link while reading a message within that subforum (that is, to go back up to the subforum level from the thread level). It does not happen when I click on that subforum heading from the main forum list, or when I refresh the subforum while reading it. To clarify, clicking on this:
https://www.eevblog.com/forum/projects/
seems to work fine, whereas if I click on this while currently reading a message:
https://www.eevblog.com/forum/index.php?board=6.0
I get the 502 error. Clicking on that link here (i.e. from this message) seems to work, however, and that is all kinds of puzzling to me. But, I'm struggling to get a wordpress based site up right now so clearly I'm no authority on website internals...
-
#59 Reply
Posted by
TerraHertz
on 06 Jul, 2017 14:31
-
I've been getting the 502 bad gateway screen in bursts since the 4th. Always on attempting to fetch the forum front page. When it comes up it's persistent - every retry gets the same thing. I just go away and do something else, when I try again in 20 minutes it's OK. I'm using Firefox.
Incidentally the "bad gateway" message still includes a message from last March. See screen cap attached.
-
-
I had the 502 thing as well, and fixed it the usual way:
File > Tools > WTF > Patience > Try Again Tomorrow > STAY CALM > click OK
Works every time
-
#61 Reply
Posted by
xrunner
on 07 Jul, 2017 00:20
-
Gah - I get this!
Error 503
The Error 502 file could not be found - contact your system administrator
-
#62 Reply
Posted by
hendorog
on 07 Jul, 2017 00:24
-
I think that page has some caching pragmas in it so it sticks around like shit to a blanket - which would make sense if the server was really down as you want that page to be visible.
The problem is, when there is a short outage it is hard to get rid of that page.
I do this to force a reload - and therefore determine if the server is down or if it's just the error page hanging around.
(Google Chrome on Windows)
Hit F12
Right click the reload button on the Chrome toolbar.
Select Empty Cache and Reload
Hit F12 again
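One way to test the "caching pragmas" guess is to look at the response headers that come with the error page (visible in the same F12 tools, on the Network tab). Under the standard HTTP caching rules, a 502 is not a status a cache may reuse on its own initiative, so a long-lived cached error page would suggest explicit freshness headers. Whether the forum's error page actually sent such headers is not known from this thread. The helper below is a hypothetical, simplified check that only looks at `Cache-Control` and ignores `Expires` and header-name case.

```python
def explicitly_cacheable(headers):
    """Return True if Cache-Control lets a browser reuse the response.

    headers is a plain dict of response headers (hypothetical input).
    Simplified: only Cache-Control is inspected; Expires is ignored.
    """
    cache_control = headers.get("Cache-Control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    names = {d.split("=", 1)[0].strip() for d in directives}

    # no-store forbids storing; no-cache forbids reuse without revalidation.
    if "no-store" in names or "no-cache" in names:
        return False

    for directive in directives:
        name, _, value = directive.partition("=")
        if name.strip() == "max-age":
            value = value.strip().strip('"')
            return value.isdigit() and int(value) > 0
    return False
```

If an error response came back with something like `Cache-Control: max-age=86400`, this check returns True, which would be consistent with the page sticking around until a forced reload.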
-
#63 Reply
Posted by
tautech
on 07 Jul, 2017 03:44
-
I think that page has some caching pragma's in it so it sticks around like shit to a blanket - which would make sense if the server was really down as you want that page to be visible.
The problem is, when there is a short outage it is hard to get rid of that page.
I do this to force a reload - and therefore determine if the server is down or if its just the error page hanging around.
(Google Chrome on Windows)
Hit F12
Right click the reload button on the Chrome toolbar.
Select Empty Cache and Reload
Hit F12 again
You nailed it Rog.
F5 works fine each time I hit a page that returned the 502 when the forum was down.
-
#64 Reply
Posted by
Falcon69
on 07 Jul, 2017 04:59
-
Interesting. For the last couple of days I've been having trouble accessing the general chat. 502 bad gateway. This on both my phone and desktop. Desktop using Firefox...but I could access it using the Microsoft Edge browser. But I opened Firefox on the desktop and pressed F5. Now I can access it on both desktop and phone. How? I never did F5 to clear the cache on the phone. That's weird. Is it something stored in the modem causing the error then.....wait...
Can't be. Because I could not access general chat on the LTE4 network either...not just wifi. I'm confused.
-
#65 Reply
Posted by
jpanhalt
on 07 Jul, 2017 06:25
-
Interesting. For the last couple days been having trouble accessing the general chat. 502 bad gateway. This both my phone and desktop. Desktop using firefox...but could access it using microsoft edge browser. But i opened Firefox on desktop and pressed F5. Now i can access it on both desktop and phone now. How? I never did F5 to clear cache on phone? Thats weird. Is it something stored in the modem causing the error then.....wait...
Cant be. Because i could not access general chat on LTE4 network either...not just wifi. Im confused.
What you are seeing is the difference between correlation (or coincidence) and cause and effect.
Changes at the server seem to have fixed the problem.
John
-
-
It was obvious the issues were not due to members' computers, browsers, cookies, skinny milk, or locations
When there's a server problem, or an update, failed drive, or some d!ckhead DOS sp@mmer
bringing the system to a halt etc
members have to back off on the requests, log out and give the EEV admin a chance to sort it out
A couple of days without EEVblog isn't that tragic,
ok it is for many, but a week or two offline is a lot worse if everyone is banging on the 502 gate all at once
Is it all ok for everyone now?
all good here in Australia
-
#67 Reply
Posted by
sokoloff
on 08 Jul, 2017 00:04
-
I was getting very consistent 502s on the "unreadreplies" link for days.
I tested with an incognito window (no shared cookies), logged in, and the unreadreplies link worked.
Back to the normal window, that link 502s. It was totally repeatable.
I then went into my cookies in the normal browser window and deleted the forum.eevblog cookies one by one, testing after each deletion. Whatever I deleted first didn't do anything. The second cookie was something like _unam and after deleting that one, the unread replies link works fine 100% of the time.
I thought the suggestion farther up thread to "delete cookies and try again" was just voodoo, but in this case, it was voodoo that seems to have fixed the problem for me.
-
-
sokoloff, I did the same drill as you and the 502 was consistent and or intermittent, some posts showed up, some 502
If members keep banging on the door, they will get an answer eventually
...not sure if that helps the guys trying to get the server biz sorted out
I'd rather wait it out
works every time,
including this time, and no messing with browser settings required, except for log out and page closures
-
-
These are common on sites running nginx or Cloudflare; there are configuration options that will fix them if that is the case. I have the same problem on a forum I run.
-
-
Lots of 502s these days.
-
-
A couple of 502s appeared yesterday, but gone today
-
#72 Reply
Posted by
Halcyon
on 11 Jul, 2017 09:36
-
Yep, I'm getting them more often than not from various different connections/locations, even after clearing cookies/browser cache. Will keep an eye on it over the next little while.
-
-
1605 Guests, 251 Users (3 Hidden)
and plenty of 502s.
-
#74 Reply
Posted by
EEVblog
on 12 Jul, 2017 12:30
-
Still happening for me a few times today. gnif hasn't gotten back with why yet.
-
-
check error logs for
mod_fcgid: errors
or sql errors "'max_allowed_packet"
These are easily tunable settings that are notorious
-
#76 Reply
Posted by
sibeen
on 12 Jul, 2017 12:49
-
Still happening for me a few times today. gnif hasn't gotten back with why yet.
That's terrible. A decent flogging can often sort out such mute resistance.
-
#77 Reply
Posted by
rstofer
on 12 Jul, 2017 15:55
-
502 used to be a Vehicle Code section related to misdemeanor drunk driving. Back when it was known as a 'deuce'.
You have to wonder about the people picking the error code numbers.
-
#78 Reply
Posted by
tooki
on 12 Jul, 2017 19:18
-
The inventor of the WWW, Tim Berners-Lee, is British and was working at CERN (Switzerland) at the time, so his choice of error codes is unlikely to have been influenced by US legal terminology.
-
#79 Reply
Posted by
TerraHertz
on 12 Jul, 2017 20:12
-
The forum was inaccessible via firefox for me last night. The whole evening, 502 everywhere.
Interestingly when I used a Tor browser it could get the front page reliably, and sometimes threads, though slowly. The slowness is the Tor system, so maybe that's a hint - speed dependent? - when a browser makes all the page element requests rapidly, the eevblog server fails and returns 502.
What intrigues me, is that the 502 error message is a specific thing. The eevblog maintainers know where that is, and have edited it in the past. Ref: the text "Thank you for your patience, Dave & gnif 24th March 2017: We are performing some ..."
Not that I know anything about large web server diagnosis, but if this was some electronics system I'd be trying the following:
* Make a change in what I thought was the '502 error message' text source. Both to check it's actually the right source, and to update that March message, which looks bad. Like no one could be bothered updating stuff.
* Search the server code for anything that can throw that 502 message. Stick some error logging messages in each one, or point each one at a unique 502 error text, to identify WHICH one is the cause.
Hopefully that would result in understanding the cause. Half way to a solution. Oh, and mentioning the identified cause in this thread, and what can be done about it, would be nice too.
-
#80 Reply
Posted by
tautech
on 12 Jul, 2017 20:20
-
I've had it every morning this week on a few pages and F5 gets past it every time. Chrome browser.
For the rest of the day there's no problems and it seems like there's a problem when I'm offline after ~9pm Sydney time.
Puzzling.
-
#81 Reply
Posted by
M4trix
on 12 Jul, 2017 20:24
-
Well, dunno but after clearing the FF browser cache and cookies I don't have this error anymore. Two days without 502 error.
-
-
Any chance of a DATE / TIME error in one of the servers, or something connected ?
I had fun with that at a clients office a while back,
a clapped BIOS battery in one of the servers did its thing reverting the system to 2012 AD,
after a power failure and reboot got the party started
There was so much dust in those machines, it resembled a dumped vacuum cleaner's contents,
I pulled the bundles out by hand while the system was running..
free fairy floss anyone?
sorry no pink left, just grey
-
-
Some behaviors that might provide clues to someone who understands how this stuff works.
1. Bookmark for the forum in Chrome consistently gave error. Displayed page identical to Terrahertz post.
2. Entering URL for EEVBlog main page worked, and from there was able to navigate to the forums page without trouble.
3. Bookmarked that page, and this new bookmark works consistently, even though the old bookmark consistently gets 502 error.
4. Hitting the HOME button on the page consistently results in 502 error with same display page.
The new bookmark has resolved the issue for me, but I have no clue why. I don't use the home button often so not a problem for me.
-
-
my guess is that the bookmark (happens to me, too) bypasses a real web fetch and goes to cache, first. when you enter a url on the url line, it has no idea you went there before (unlike a bookmark) and so it has to do a real network web fetch.
I'm just guessing, though.
dynamic content and incorrect 'expires' tags might be a problem. web devs should always be testing their code using the 'bookmark' method as well as urls entered directly.
of course, middle man 'content delivery' systems really throw a monkeywrench into this. that could be a big part of it; THEY are also a cache.
-
-
I know what is going on.
Dave simply could not resist performing a "teardown" of its main server....
He is going to post a video soon.
-
-
I just got a series of mixed error messages, the first was stating that the site could not be reached and shortly after the 502 message appeared again, after a wait and a number of refreshes the page eventually restored but it took quite a while, I suspect that the 502 is just a generic indicator that the site is temporarily busy or offline, sooner or later it always returns.
I have no clue but that in itself is nothing new.
-
-
'502 City, have a nice day..' this morning
Working like a champ in the evening
No changes performed on the browser,
because if Ebay and other big sites are working ok, why stuff about with browser settings?
KISS
-
#88 Reply
Posted by
SeanB
on 16 Jul, 2017 09:49
-
Almost sure this is something interacting with the browser and Cloudflare. I rarely get the 502, but do get stalled connections, which are fine with a non-cached reload.
-
#89 Reply
Posted by
alm
on 16 Jul, 2017 10:01
-
I think it is just timezone related combined with randomness. So for some people it will be very prevalent because it is frequent during the times they visit the forum, and others will rarely see it. If the effect is only occurring 10% of the time, then you would be very unlikely to get the error twice in a row. At other times the rate might be close to 100%, which would have the users report that the error is persistent.
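To put a number on "very unlikely twice in a row", here is a minimal sketch. It assumes each page request fails independently with the same probability, which is only an approximation (a cached 502 page, for instance, would make repeats far more likely). The function name is made up for illustration.

```python
def prob_all_fail(error_rate: float, attempts: int) -> float:
    """Probability that `attempts` independent requests all return an error.

    Assumes every request fails with the same probability `error_rate`
    and that failures are independent of each other.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return error_rate ** attempts


if __name__ == "__main__":
    # 10% error rate: two failures in a row is about a 1-in-100 event.
    print(prob_all_fail(0.10, 2))
    # Near 100% error rate: repeated failures are almost certain.
    print(prob_all_fail(0.95, 2))
```

Under these assumptions a 10% error rate gives roughly a 1% chance of two consecutive 502s, while a rate near 100% makes the error look persistent.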
-
#90 Reply
Posted by
pknoe3lh
on 16 Jul, 2017 10:22
-
I agree :-)
I think its load related!
On the weekend I see the error more often.
Good luck fixing it
-
#91 Reply
Posted by
Kleinstein
on 16 Jul, 2017 13:01
-
I got the error very constantly when calling "https://www.eevblog.com/forum/" over the last few days. However other pages, e.g. calling the recently changed pages work ok. Not sure why, there could be something cached. It happens not only from the bookmarks, but also when going there from the main-page.
-
#92 Reply
Posted by
tautech
on 16 Jul, 2017 13:06
-
I got the error very constantly when calling "https://www.eevblog.com/forum/" over the last few days. However other pages, e.g. calling the recently changed pages work ok. Not sure why, there could be something cached. It happens not only from the bookmarks, but also when going there from the main-page.
I had exactly the same, yep it's locally cached.
Try F5 each time you bump into the 502............haven't had any now for a couple of days.
-
-
I viewed the forum briefly this morning at around 06:00 local time and everything appeared fine but at around 10:30 it all turned to crap with the constant 502 error window, another similar thread suggested users should press the ctrl & F5 keys to reload the page but being on a smart TV at the time without a keyboard attached made this operation impossible.
Anyway, is there any chance that these intermittent outages are being caused on purpose or maliciously generated by some fruit loop ?, it wouldn't surprise me one bit you know.
-
#94 Reply
Posted by
xrunner
on 19 Jul, 2017 02:07
-
I see there are two threads on this error, but as I said, tonight it's killing me, I can't even PM a member I need to talk to.
-
-
There are way more than two threads on the subject, three that I know of just in this chat section and probably stacks more elsewhere. I have noticed this becoming more frequent of late, with multiple threads all with the same or similar content. People really need to use the forum's search feature before creating new threads, or look back at recently created threads, for this to be minimised. It is becoming difficult to keep track of which thread to follow.
-
-
I think its load related!
It doesn't seem to match with the peaks in Guests+Users.
-
#97 Reply
Posted by
EEVblog
on 19 Jul, 2017 13:15
-
Houston, we have a problem:
-
-
Houston, we have a problem:
I can't see it!
BTW the 502s are here now.
-
#99 Reply
Posted by
EEVblog
on 19 Jul, 2017 13:32
-
I just did a graceful reboot.
Seems to have fixed it, now only using 9.6GB of the 32GB memory
-
-
hey, at least the message comes from cloudflare now
-
-
I suspect that the popcorn emoticon is sucking all the juice.
-
#102 Reply
Posted by
EEVblog
on 19 Jul, 2017 13:38
-
gnif informs me this is normal. We are using all the RAM for a database buffer for increased performance. The 9.6GB will slowly grow back to 32GB again.
gnif will investigate tomorrow.
-
#103 Reply
Posted by
LIV2
on 19 Jul, 2017 13:44
-
Besides DB buffers linux will use the memory for cache/buffers for other stuff too.
Explanation:
http://www.linuxatemyram.com/
Dave, do you have any stats graphing configured on your server? Something like munin or influx+telegraf & grafana would give you historical stats for the memory usage etc. and make it a lot easier to pinpoint the problem. Maybe difficult to get going depending on your Linux sysadmin skill level, but well worth the effort.
-
#104 Reply
Posted by
alm
on 19 Jul, 2017 13:51
-
Linux considers free memory useless, so will try to fill it with cache or buffers. You should only worry if memory is almost full with little cache or buffers.
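As a rough illustration of the point about cache and buffers, the sketch below parses text in the format of /proc/meminfo and estimates how much memory could be handed back to applications. The sample numbers are invented, not taken from the forum server, and the fallback sum of MemFree + Buffers + Cached is only a crude approximation of what newer kernels report as MemAvailable.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_estimate_kb(fields: dict) -> int:
    """Estimate memory usable by applications without swapping.

    Prefers the kernel's own MemAvailable figure when present; otherwise
    falls back to the rough sum MemFree + Buffers + Cached.
    """
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


if __name__ == "__main__":
    # Hypothetical values for a 32 GB machine that looks "full" at first glance.
    sample = (
        "MemTotal:       32000000 kB\n"
        "MemFree:          500000 kB\n"
        "Buffers:         1000000 kB\n"
        "Cached:         14000000 kB\n"
    )
    info = parse_meminfo(sample)
    print(reclaimable_estimate_kb(info), "kB could be reclaimed")
```

On a real Linux box the same functions can be fed the contents of /proc/meminfo; a low MemFree figure alone says little until buffers and cache are counted.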
-
-
I have not checked all the threads; but I'm curious; WHAT HAS CHANGED since these errors started showing up?
did the content mgmt system get upgraded? anything upgraded? anything changed?
if nothing on the server side changed, it would have to be cloudflare.
and if its cloudflare, maybe its time to dump them.
but I suspect something at the code side has changed. do you guys have version controls so you can go back and find what happened when things broke (timeframe)?
-
#106 Reply
Posted by
Naguissa
on 20 Jul, 2017 06:02
-
Houston, we have a problem:
Noooooo!
Add 15Gb of buffers' 3rd col to free memory! You had half memory available!
What I don't know is why it was using 1.2Gb of swap with so much free ram.
Probably you have a (cron?) process that uses a lot of memory running from time to time (backup process?).
-
#107 Reply
Posted by
alm
on 20 Jul, 2017 06:43
-
What I don't know is why it was using 1.2Gb of swap with so much free ram.
Probably some data that has not been accessed for a long time. Might as well swap it out so you can use more memory as buffer if nobody is going to use that data anyhow. I find a few GB of swap usage quite normal for systems that have been running for a couple of days, even those with barely any memory pressure.
-
-
Site seems slow, the 502s will be along in a minute.
-
#109 Reply
Posted by
PA0PBZ
on 20 Jul, 2017 19:56
-
Site seems slow, the 502s will be along in a minute.
Yep, just encountered the first one for today
-
#110 Reply
Posted by
floobydust
on 20 Jul, 2017 20:01
-
I got quite a few 502 errors today.
I have 32GB in my home PC, server needs more
-
#111 Reply
Posted by
pknoe3lh
on 20 Jul, 2017 20:06
-
Me too
-
#112 Reply
Posted by
M4trix
on 20 Jul, 2017 20:10
-
Maybe Aurora Australis is active these days !
-
#113 Reply
Posted by
PA0PBZ
on 20 Jul, 2017 20:14
-
It looks like it's back alive now. It is always the same pattern, the site gets slower and slower until it takes 15-20 seconds to load when you click a topic. Soon after that it will give you the famous 502 page which you can overcome by refreshing the page (Ctrl-F5) but it will still be slow. Then for some unknown reason it is back to normal like now.
-
-
Yep, just encountered the first one for today
I haven't seen one yet, but I recognize the slowdown just before they appear.
-
-
Yep, it has recovered to full speed now, I managed to not see any 502s.
-
-
Got the same error yesterday or the day before. It was doing it despite rebooting the computer.
I then went to my laptop (via WiFi but on the same router as my main computer) and the link was working just fine.
I then went back the main computer and I got the gateway error again.
Using CCleaner I cleaned up all the cookies and caches on the machine and ran a registry scan. All is OK now.
Looks like a cookie or cache from the Chrome browser, acquired when there was a real bad gateway issue with the forum, was doing this even though the forum was OK now.
I'll see if that is repeatable next time there is an issue.
-
#117 Reply
Posted by
tautech
on 22 Jul, 2017 04:55
-
The forum is seriously sick, 502's one after another.
Pot luck if you can get access to any board or thread and if you want to make a post, good luck with that too.
Copy all info typed and pray when you go to post that it gets accepted.
The Preview when it works at least puts your efforts on the server and then you can play the Post>Back>Post>Back>Post game.
When you send a command (press link) if it doesn't get bounced back immediately with a 502, the server seems to take an age to respond (20s) before the browser indicates it's loading.
My poor F5 has never had such a hiding.
Calling gnif, come in gnif.
Edit to add.
When accessing the last post in a thread you briefly get to see it, and when the page then finishes loading you're looking at the top of the page.
Something's seriously sick.
-
#118 Reply
Posted by
KE5FX
on 22 Jul, 2017 06:15
-
-
-
What a nightmare, probably not related at all but I even had trouble trying to get on to the #eevblog IRC channel for some news or other peoples connection status, if the main forum page does load then you should look at the last post times to verify that the page is live and not a historical cached version.
-
#120 Reply
Posted by
ebastler
on 22 Jul, 2017 06:29
-
Got the same error yesterday or the day before. It was doing it despite rebooting the computer.
I then went to my laptop (via WiFi but on the same router as my main computer) and the link was working just fine.
I then went back the main computer and I got the gateway error again.
Using CCleaner I cleaned up all the cookies and caches on the machine and ran a registry scan. All is OK now.
Looks like a cookie or cache from the Chrome browser, acquired when there was a real bad gateway issue with the forum, was doing this even though the forum was OK now.
I'll see if that is repeatable next time there is an issue.
Mate, this error has nothing to do with your computer. It's the server which has problems. No need for you to reboot or fiddle around otherwise.
-
#121 Reply
Posted by
RoGeorge
on 22 Jul, 2017 06:40
-
Starting a few hours ago, the 502 error came back for me. Since then it has been VERY consistent: always 502 bad gateway. I was using Win10/Mozilla.
Trying to debug a little, when tested with Tor under a Debian 8 VMware machine, EEVblog was consistently working just fine.
Under Mozilla/Win10 I usually access EEVblog by clicking a bookmark saved a long time ago. My Mozilla/Win10 bookmark was pointing to "http://www.eevblog.com/forum/". If I edit the link in the address bar to use https, "https://www.eevblog.com/forum/", then it's working OK.
To sum up:
VMware/Linux/Tor/http - always working
VMware/Linux/Tor/https - always working
Win10/Mozilla/http - 502 bad gateway
Win10/Mozilla/https - always working
In conclusion, using https instead of http helped me to get rid of the 502 error on Win10/Mozilla.
-
#122 Reply
Posted by
1Ghz
on 22 Jul, 2017 06:43
-
Mate, this error has nothing to do with your computer. It's the server which has problems. No need for you to reboot or fiddle around otherwise.
You are right.
HTTP error codes 5xx are server side errors. Not client side.
-
-
Welcome Back to 502 City a few hours ago
All good again, like nothing happened
Are there any ad thingies interfering with the works perhaps?
Anyway guys, don't break Dave's balls about it,
he's supplying the forum for FREE after all
and it will get sorted out when it does
Meanwhile, back at the WWW superhighway...
-
#124 Reply
Posted by
not1xor1
on 22 Jul, 2017 08:56
-
here:
linux (kubuntu 14.04):
firefox 54.0 = bad gateway Error
chromium 59.0.3071.109 = no problem
-
#125 Reply
Posted by
alm
on 22 Jul, 2017 09:37
-
Probably caching/session related, rather than browser/OS.
-
#126 Reply
Posted by
not1xor1
on 22 Jul, 2017 09:51
-
Probably caching/session related, rather than browser/OS.
yes... reloading the page via F5 solved the problem in firefox
-
-
From: 1GHz
""
Quote from: ebastler on Yesterday at 04:29:42 PM
Mate, this error has nothing to do with your computer. It's the server which has problems. No need for you to reboot or fiddle around otherwise.
You are right.
HTTP error codes 5xx are server side errors. Not client side.
""
You are both right of course. The original 502 gateway error is server side.
However, my main computer was always showing the error while my laptop had no issues, with both connected to the forum at the same time from the same cable modem via my router, so obviously something was different between the two computers. Cleaning cookies and cache on the machine cleared it. I assume my Chrome browser had the initial 502 page in its cache...
Has not come back yet. Next time I'll see if that is reproducible.
-
-
heh, this is a server side issue and a problem with resources, not something clients are doing. There is nothing you can do to workaround it.
Most likely a bad configuration relating to sql timeouts and buffer sizes, or max_connections being too small, or a memory leak in sql or forum code most likely. I am guessing config , since this issue arises with growth and traffic.
-
-
heh, this is a server side issue and a problem with resources, not something clients are doing.
It starts off as a temporary server problem, but then becomes a bigger problem - for most users, the way the 502 page is cached.
I saw very few 502s yesterday, but tried to capture some of their headers, to me 304 seemed right, but 14400 seemed wrong, but that could very easily be because I had no idea what I was doing!
traffic.
I usually have the main index page open in another tab, so have been looking at the number of guests and users when I see some 502s, very roughly, so far the 502s seem to appear when the number of guests and users is low-ish rather than high-ish.
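One possible reading of the 14400 mentioned above, assuming it was a Cache-Control max-age value in seconds (the post does not say which header it came from), is a four-hour cache lifetime. A minimal sketch of that arithmetic, with a made-up helper name:

```python
import re


def max_age_seconds(cache_control: str):
    """Return the max-age value from a Cache-Control header, or None."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical header value; the real header was not quoted in the thread.
    seconds = max_age_seconds("public, max-age=14400")
    print(seconds / 3600, "hours")
```

If that assumption holds, a browser that cached a 502 page could keep showing it for up to four hours without asking the server again.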
-
#130 Reply
Posted by
alm
on 22 Jul, 2017 17:54
-
What is the cause and what is the effect? Could it be that many of the guests (some of which are bots) stop trying if they see errors popping up? Like a reasonable person would do, instead of some of you addicts.
-
-
I shut down ALL page/s when the 502 appears and try the site again later
Pumping the F5 button and various reload/clear/refresh force tricks
isn't doing the EEVblog server or boss Dave any favors,
especially whilst trying to sort itself out, AND keep up with the extra
'addicts' demand
-
#132 Reply
Posted by
MarkS
on 23 Jul, 2017 01:19
-
extra 'addicts' demand
Off to take a hit of "Projects, Designs and Technical Stuff"...
-
#133 Reply
Posted by
tautech
on 23 Jul, 2017 01:24
-
Pumping the F5 button and various reload/clear/refresh force tricks isn't doing the EEVblog server or boss Dave any favors,
especially whilst trying to sort itself out, AND keep up with the extra 'addicts' demand
Nor is having a forum that was basically unusable for ~3hrs yesterday.
Online #'s dropped right away as most couldn't be bothered with the BS.
Maybe the new RAM that was installed a few weeks back is crook or there's a virus in the server, as up until a week or two back the forum's been better than ever.
-
-
Nor is having a forum that was basically unusable for ~3hrs yesterday.
Would your ~3hrs happen to be very close to 4hrs?
-
#135 Reply
Posted by
tautech
on 23 Jul, 2017 10:12
-
Nor is having a forum that was basically unusable for ~3hrs yesterday.
Would your ~3hrs happen to be very close to 4hrs?
It was shite for most of yesterday afternoon but worst from ~3pm until just after 6 NZT.
The post list on the main page just showed a few posts in that time, some were mine after bashing the shite out of F5.
-
-
On the few browsers I've tried, ctrl-reload or ctrl-refresh always gets past the caching (I don't know about ctrl-F5), often giving just another 502! So wait a bit before trying that thread again.
I think if all 1200 guests and 250 users were doing that it would slow down the recovery, so don't tell anyone.
-
#137 Reply
Posted by
pknoe3lh
on 24 Jul, 2017 09:57
-
got some 502 before .....
Now its time for more information:
http://rsgiveaway.knoebel.at/loadtime.php
I wrote a cron job which checks the site every minute.
It also shows users and guests and should also detect 502 ;-)
The link shows a graph with the captured data.
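A check like the one described could look roughly like the sketch below. This is not the poster's script (the link suggests that one is PHP); it is a hedged Python illustration with made-up function names, using only the standard library. Timeouts and connection failures are deliberately left unhandled to keep it short.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 60.0) -> int:
    """Return the HTTP status code for `url`.

    urllib raises HTTPError for 4xx/5xx responses, so the code is
    read from the exception in that case.
    """
    request = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def check_once(url: str, fetch=fetch_status, clock=time.monotonic) -> dict:
    """Time one request and flag whether it came back as a 502."""
    start = clock()
    status = fetch(url)
    return {
        "status": status,
        "seconds": clock() - start,
        "bad_gateway": status == 502,
    }


if __name__ == "__main__":
    print(check_once("https://www.eevblog.com/forum/"))
```

Run from cron with a `* * * * *` schedule, this would produce one sample per minute; appending each result to a file gives the raw data for a load-time graph.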
-
-
Killa performance in Kangaroo Central today
boss Dave has pumped server with
Virtual Adrenaline software?
-
#139 Reply
Posted by
Gary350z
on 24 Jul, 2017 11:07
-
No 502's on Chrome or IE for about 2 days.
Previous 2 days the forum was unusable on any browser.
-
#140 Reply
Posted by
pknoe3lh
on 24 Jul, 2017 11:17
-
Even though clearing my browser cache sometimes seems to resolve the issue it definitely doesn't always. So, for that reason alone I am prepared to believe those who comment and seem to know, this is ultimately a server problem. I hope we get some clarity about the issue soon so this speculation can be put to rest. At least for now.
Maybe if you get an 502 check with my graph if it is correlated ;-)
http://rsgiveaway.knoebel.at/loadtime.php
-
#141 Reply
Posted by
ebastler
on 24 Jul, 2017 14:35
-
Maybe if you get an 502 check with my graph if it is correlated ;-)
http://rsgiveaway.knoebel.at/loadtime.php
Your graph is lacking labels or units on the time axis, which makes it difficult to correlate anything. Could you add labels?
-
#142 Reply
Posted by
PA0PBZ
on 24 Jul, 2017 14:58
-
Your graph is lacking labels or units on the time axis, which makes it difficult to correlate anything. Could you add labels?
Point your mouse at the loadtime line.
-
#143 Reply
Posted by
nugglix
on 24 Jul, 2017 15:00
-
Your graph is lacking labels or units on the time axis, which makes it difficult to correlate anything. Could you add labels?
Point your mouse at the loadtime line.
Doesn't help much, I'd say.
-
#144 Reply
Posted by
pknoe3lh
on 24 Jul, 2017 15:17
-
Your graph is lacking labels or units on the time axis, which makes it difficult to correlate anything. Could you add labels?
UPDATE:
http://rsgiveaway.knoebel.at/loadtime.php
I'm using Google Charts .. they should handle all the front end ... but yes, I tried 10 different things, no luck ...
The only work around i found was to use the Tooltip!
After some time (1 Day) the grid lines should be showing up.. just not working under one day
-
#145 Reply
Posted by
ebastler
on 24 Jul, 2017 21:21
-
UPDATE:
http://rsgiveaway.knoebel.at/loadtime.php
[...]
The only work around i found was to use the Tooltip!
Thanks, the tool tip does help! The preliminary conclusion from your data seems to be that inordinate load times (and hence probably 502 errors as well?) are not correlated with an unusually high number of users?
-
#146 Reply
Posted by
eugenenine
on 24 Jul, 2017 22:22
-
somebody spank that gateway already
-
#147 Reply
Posted by
DG41WV
on 25 Jul, 2017 00:51
-
F5 worked for me.
windows 8.1 Firefox 54.0.1
-
-
or a memory leak
they dont use liquid memory anymore.
so, thats not it.
-
#149 Reply
Posted by
pknoe3lh
on 25 Jul, 2017 09:47
-
-
#150 Reply
Posted by
RGB255_0_0
on 25 Jul, 2017 09:52
-
Yes.
-
-
Saw a few 502s while trying to get to this thread. LOL
There always seems to be a drop in the number of Guests at the time of the 502s. There's not really enough resolution or data on pknoe3lh's graph to see which happens first, but it looks like the drop in number of Guests happens first. That could just be because the graph itself hasn't seen a 502 - yet.
-
#152 Reply
Posted by
pknoe3lh
on 25 Jul, 2017 12:43
-
-
#153 Reply
Posted by
ebastler
on 25 Jul, 2017 14:37
-
or a memory leak
they dont use liquid memory anymore.
so, thats not it.
Ugh, yes -- those mercury leaks must have been messy, back in the day...
-
-
yeah, between memory spilling all over the place, and lp0 being on fire, it was a lot to keep up with, back in the day.
-
#155 Reply
Posted by
mtdoc
on 25 Jul, 2017 15:58
-
Whatever the cause, the 502 errors just keep coming back. It's making the forum borderline unusable for me.
-
#156 Reply
Posted by
PA0PBZ
on 25 Jul, 2017 18:11
-
-
#157 Reply
Posted by
tautech
on 25 Jul, 2017 18:37
-
The forum is seriously sick, 502's one after another.
Pot luck if you can get access to any board or thread and if you want to make a post, good luck with that too.
Copy all info typed and pray when you go to post that it gets accepted.
The Preview when it works at least puts your efforts on the server and then you can play the Post>Back>Post>Back>Post game.
When you send a command (press link) if it doesn't get bounced back immediately with a 502, the server seems to take an age to respond (20s) before the browser indicates it's loading.
My poor F5 has never had such a hiding.
Calling gnif, come in gnif.
Edit to add.
When accessing the last post in a thread you briefly get to see it, and when the page then finishes loading you're looking at the top of the page.
Something's seriously sick.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Groundhog day.
-
#158 Reply
Posted by
pknoe3lh
on 25 Jul, 2017 18:56
-
Update:
http://rsgiveaway.knoebel.at
New URL ... I overwrote something else :-(
Added new line on the bottom
I like the smaller stars (maybe a bit too small?) but it looks like the script stopped collecting data now?
Changed to an additional curve
And yes I was working on this changes
-
#159 Reply
Posted by
ebastler
on 25 Jul, 2017 19:05
-
I don't get this. It's not like Dave is running this server from his bedroom closet, right? There is supposedly a professional service provider behind this -- known as user "gnif" in this forum, but he is doing this for a living (https://hostfission.com/). Why is it not possible to get this error condition sorted out? Why is it recurring again and again, after months, weeks, days?!
-
#160 Reply
Posted by
PA0PBZ
on 25 Jul, 2017 19:10
-
Changed to an additional curve
And yes I was working on this changes
I guessed that much yes, thanks. Would be interesting to run the script at some other servers in other parts of the world, sometimes it looks like not all are equally seeing the problem, which would point more in the direction of Cloudflare. And I really think the 502 page could do with a pragma no-cache and pragma content=no-cache...
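The idea of marking the error page as non-cacheable can be sketched as follows. How the forum's 502 page is actually served is not known from this thread, so this only shows the general shape: error responses get `Cache-Control: no-store` plus the legacy `Pragma: no-cache`, while other responses are left alone. The function name is hypothetical.

```python
def response_headers(status: int) -> dict:
    """Build response headers, forbidding caching of server-error pages."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if 500 <= status <= 599:
        # Tell browsers and intermediaries not to keep a copy of the error page.
        headers["Cache-Control"] = "no-store"
        # Legacy HTTP/1.0 equivalent, kept for very old caches.
        headers["Pragma"] = "no-cache"
    return headers


if __name__ == "__main__":
    print(response_headers(502))
    print(response_headers(200))
```

With headers like these a short outage should stop being visible as soon as the server recovers, since no client would have a stored copy of the 502 page to replay.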
-
#161 Reply
Posted by
tautech
on 25 Jul, 2017 19:16
-
I don't get this. It's not like Dave is running this server from his bedroom closet, right? There is supposedly a professional service provider behind this -- known as user "gnif" in this forum, but he is doing this for a living (https://hostfission.com/). Why is is not possible to get this error condition sorted out? Why it is recurring again and again, after months, weeks, days?!
AFAIK Dave's server is in the US and gnif just remotely manages it and does the tweaks.
The server provider threw a whole lot more RAM in it recently and it was 100% for some months, but it's back to its old tricks it seems.
But yeah, the silence is deafening.
-
-
with a vengeance
-
-
It's always 24th March 2017
-
#164 Reply
Posted by
MK14
on 25 Jul, 2017 19:56
-
I've experienced a huge number of 502 errors, in the last few hours, timing approximate. Making the forum essentially unusable.
EDIT: Just getting this post made, has itself caused a significant number of error 502's as well.
I.e. lots of attempts.
When it does work, there seems to be a long delay as well, of something like 20 seconds or so.
-
#165 Reply
Posted by
ebastler
on 25 Jul, 2017 19:59
-
But yeah, the silence is deafening.
Agree; I am disappointed that "Dave & gnif" don't comment on this at all. Guys, this is embarrassing for a technology forum, in case you have not noticed!!
They could at least update that stupid error message talking about the March database changes, for Pete's sake! It's almost as if they were a scope manufacturer and could not be bothered to fix that "pluses" label...
-
#166 Reply
Posted by
edpalmer42
on 25 Jul, 2017 20:09
-
It's always 24th March 2017
Yup, definitely Groundhog Day.
FWIW, I haven't seen more than an occasional 502 error in the past. But today, it's been crazy. I either get an instant 502 error or I have to wait up to 45 sec. for the page, any page, to load. If I get a 502 error, I don't have to hit F5, I just reload the page. After a few attempts, instead of an instant 502 error I get the long wait for the page load.
Do we know for sure that this is a bug? Could it be a hack or an attack?
-
#167 Reply
Posted by
Gyro
on 25 Jul, 2017 20:12
-
It's always 24th March 2017
At least it's a useful reminder of my wedding anniversary.
-
#168 Reply
Posted by
tautech
on 25 Jul, 2017 20:14
-
But yeah, the silence is deafening.
Agree; I am disappointed that "Dave & gnif" don't comment on this at all. Guys, this is embarrassing for a technology forum, in case you have not noticed!!
They could at least update that stupid error message talking about the March database changes, for Pete's sake! It's almost as if they were a scope manufacturer and could not be bothered to fix that "pluses" label...
Dave did comment ~a week ago, see reply ~100.
Either SMF is not playing nice with the server OS or there's something wrong with the OS or the setup.
Then again Dave was going to shell out for more RAM but the server people added the additional RAM for free........maybe it's crook RAM.
-
#169 Reply
Posted by
grumpydoc
on 25 Jul, 2017 20:17
-
I've experienced a huge number of 502 errors in the last few hours (timing approximate), making the forum essentially unusable.
EDIT: Just getting this post made has itself caused a significant number of 502 errors as well.
I.e. lots of attempts.
When it does work, there seems to be a long delay as well, of something like 20 seconds or so.
There is an irony to the fact that you can't complain about all the 502 errors because of all the 502 errors.
Forum essentially unusable at present, as you observe.
-
#170 Reply
Posted by
pknoe3lh
on 25 Jul, 2017 20:24
-
-
-
It's been getting worse for a couple of weeks, and it's basically unusable at this point. I don't know what it is...I'm on a lot of forums, this one has by far the least amount of traffic, yet has by far the most amount of problems. It's continuous, I can't even remember the last time I went a full week without getting a server error on this forum. I'm not just talking about the last couple of weeks, I'm talking about ever since I signed up several years ago.
-
-
Today is pretty bad. Combination of gateway errors and very very slow response if it works.....
-
#173 Reply
Posted by
M4trix
on 25 Jul, 2017 20:57
-
Too many Aussies are currently online. Probably the server can't handle it.
-
-
http://rsgiveaway.knoebel.at
I don't know where the graph's server is, but the timing of the 502s that I saw matched with the stars/dots exactly.
From the posts, I think everyone is seeing the same thing from every location.
-
#175 Reply
Posted by
pknoe3lh
on 25 Jul, 2017 20:59
-
http://rsgiveaway.knoebel.at
I don't know where the graph's server is, but the 502s that I saw matched with the stars/dots exactly.
From the posts, I think everyone is seeing the same thing from every location.
Germany ;-)
-
#176 Reply
Posted by
tautech
on 25 Jul, 2017 21:04
-
Too many Aussies are currently online. Probably the server can't handle it.
Nah, they're just waking up. It's around 7am there.
-
#177 Reply
Posted by
RGB255_0_0
on 25 Jul, 2017 21:05
-
Should set the server to reboot every 3 days as a work around until this gets sorted. It did seem to work in the short term.
-
-
Should set the server to reboot every 3 days
3 times a day would be better.
I was here when dave rebooted, it was very quick, I only saw one 521.
-
#179 Reply
Posted by
Naguissa
on 25 Jul, 2017 21:32
-
-
-
If this site uses cloudflare you might need to make sure your forum has X-Forwarded-For inspection enabled.
-
-
In the US here. Sometimes the 502s drive me crazy from home (DSL), yet it seems to happen less often at work. (yes, I check in on breaks and lunch) Might just be the time of day. Sometimes I'll do a flush on cached Chrome data and then no 502s. Or maybe it's just luck. There doesn't seem to be a pattern to it. Frustrating.
-
#182 Reply
Posted by
gnif
on 25 Jul, 2017 22:36
-
Hi Everyone,
I am sorry about the lack of attention I have given this forum as of late, I have had little free time to assist here. That said I am digging into this now and will hopefully have a solution soon.
If this site uses cloudflare you might need to make sure your forum has X-Forwarded-For inspection enabled.
Thanks for the info, this is already catered for though. 502 is a server side error, a failure of Nginx to communicate with the PHP fpm process.
-
#183 Reply
Posted by
Mr.B
on 25 Jul, 2017 22:36
-
-
#184 Reply
Posted by
tautech
on 25 Jul, 2017 22:47
-
Hi Everyone,
I am sorry about the lack of attention I have given this forum as of late, I have had little free time to assist here. That said I am digging into this now and will hopefully have a solution soon.
Be sure to check this site put together by pknoe3lh:
http://rsgiveaway.knoebel.at/
You might get some clues from correlation of error reports on the server to it.
-
-
Thanks for checking in gnif
it has been pretty bad
-
#186 Reply
Posted by
gnif
on 25 Jul, 2017 23:08
-
Dave, you can try to install:
memcache and php-memcache or php5-memcache
And, for php5, php5-apc or php5-apcu
https://www.simplemachines.org/community/index.php?topic=467415.0
Sent from my Jolla using Tapatalk
Thanks but we are already leveraging opcode caching and memcached.
Hi Everyone,
I am sorry about the lack of attention I have given this forum as of late, I have had little free time to assist here. That said I am digging into this now and will hopefully have a solution soon.
Be sure to check this site put together by pknoe3lh:
http://rsgiveaway.knoebel.at/
You might get some clues from correlation of error reports on the server to it.
I appreciate the information but I am already collecting this data amongst much more, this server is fully monitored but as I stated, due to the fact that I offer this to Dave as a favor I can not justify the time on it when I have paying clients that demand attention.
Thanks for checking in gnif
it has been pretty bad
You're most welcome
-
#187 Reply
Posted by
gnif
on 25 Jul, 2017 23:12
-
Just a heads up, I will be restarting services on the server to tune/test and debug, this will cause disruptions to the site. Reports of outages during this time only serve to distract, I will update here when things are stable and no further changes are expected.
-
#188 Reply
Posted by
gnif
on 26 Jul, 2017 00:35
-
I believe I have located the issue, it is a bug with how PHP handles file locking of the sessions. I attached a debugger to PHP during another outage just now which confirmed it was hung on a file lock request.
(gdb) bt
#0 0x00007f7b27384c47 in flock () from /lib64/libc.so.6
#1 0x0000000000667795 in ps_files_open (data=0x7f7b264bb0c0, key=0x7f7b264ad018 "hb2g026hbbs97ukagljo5ainok") at *REDACTED*/ext/session/mod_files.c:214
#2 0x0000000000667a88 in ps_read_files (mod_data=<value optimized out>, key=<value optimized out>, val=0x7fff595009a8, maxlifetime=<value optimized out>)
at *REDACTED*/ext/session/mod_files.c:482
#3 0x0000000000664c07 in php_session_initialize () at *REDACTED*/ext/session/session.c:426
#4 0x000000000066518d in php_session_start () at *REDACTED*/ext/session/session.c:1534
To work around this I am evaluating switching session storage to memcached, it seems though that SMF doesn't like this much, I am investigating options right now to try to resolve this.
Update: SMF is trying to store too much data in sessions to use memcached, I have instead switched it over to session storage in the database. We turned this off years ago due to the performance hit due to the inability to use InnoDB tables, and a lack of server RAM. This is no longer the case, and as such we can return to using the native SMF database session storage.
-
-
This is probably just a side effect of your process gnif and I don't mean to distract you but the numbers of online users dropped off dramatically during the last hour or so. Oh here we go, the third picture is the most recent and shows the users returning, somebody must have left the gate open.
-
#190 Reply
Posted by
gnif
on 26 Jul, 2017 01:19
-
This is probably just a side effect of your process gnif and I don't mean to distract you but the numbers of online users dropped off dramatically during the last hour or so. Oh here we go, the third picture is the most recent and shows the users returning, somebody must have left the gate open.
The switch of the session storage will have logged everyone out and messed with the stats of online users.
-
#191 Reply
Posted by
hendorog
on 26 Jul, 2017 01:21
-
Quick lock the buggers out before they sneak back in, it's going nicely again now
-
-
Today is pretty bad. Combination of gateway errors and very very slow response if it works.....
OK just came back after dinner and it did show gateway error again on my Chrome browser, just like this afternoon. Cleared up the cookies and cache and all is OK now (till next time).
Although the original issue is server side my browser will keep giving me the original cached error page, even though the real problem has now been cleared it seems.
Lots of people might "see" the error a much longer time than it was active if they don't cleanup the caches. The behaviour might be different with different browsers of course.
-
#193 Reply
Posted by
pknoe3lh
on 26 Jul, 2017 08:12
-
-
#194 Reply
Posted by
gnif
on 26 Jul, 2017 10:38
-
Today is pretty bad. Combination of gateway errors and very very slow response if it works.....
OK just came back after dinner and it did show gateway error again on my Chrome browser, just like this afternoon. Cleared up the cookies and cache and all is OK now (till next time).
Although the original issue is server side my browser will keep giving me the original cached error page, even though the real problem has now been cleared it seems.
Lots of people might "see" the error a much longer time than it was active if they don't cleanup the caches. The behaviour might be different with different browsers of course.
This could be CloudFlare and your browser, it will cache the 502s to try to alleviate some load from the target server. The server logs show that there has been no 502's since I corrected the session issue, but with this kind of error only time will tell if it is finally fixed. Thank you everyone for being so patient, I take it personally when things I support do not function correctly as it is a reflection of my technical ability.
-
#195 Reply
Posted by
RGB255_0_0
on 26 Jul, 2017 14:19
-
Today is pretty bad. Combination of gateway errors and very very slow response if it works.....
OK just came back after dinner and it did show gateway error again on my Chrome browser, just like this afternoon. Cleared up the cookies and cache and all is OK now (till next time).
Although the original issue is server side my browser will keep giving me the original cached error page, even though the real problem has now been cleared it seems.
Lots of people might "see" the error a much longer time than it was active if they don't cleanup the caches. The behaviour might be different with different browsers of course.
This could be CloudFlare and your browser, it will cache the 502s to try to alleviate some load from the target server. The server logs show that there has been no 502's since I corrected the session issue, but with this kind of error only time will tell if it is finally fixed. Thank you everyone for being so patient, I take it personally when things I support do not function correctly as it is a reflection of my technical ability.
It shouldn't question your ability but it does question your ability to provide time
I think I speak for everyone when I say we appreciate your effort.
-
#196 Reply
Posted by
gnif
on 26 Jul, 2017 15:16
-
I believe I have located the issue, it is a bug with how PHP handles file locking of the sessions. I attached a debugger to PHP during another outage just now which confirmed it was hung on a file lock request.
To work around this I am evaluating switching session storage to memcached, it seems though that SMF doesn't like this much, I am investigating options right now to try to resolve this.
Update: SMF is trying to store too much data in sessions to use memcached, I have instead switched it over to session storage in the database. We turned this off years ago due to the performance hit due to the inability to use InnoDB tables, and a lack of server RAM. This is no longer the case, and as such we can return to using the native SMF database session storage.
Does SMF follow the same session locking strategy whether memcached is used or a table? One being faster than the other only. If that is the case (I don't claim to know one way or the other) then won't the session lock still happen under the same circumstances albeit perhaps with a smaller timing window in the faster case?
At the moment just based on reading what you've said I don't see that waiting on an unavailable lock is a bug necessarily. Waiting may not be the best response in some cases, and it might be better for a session to release all locks and try later if some locks are currently held.
Perhaps if gnif hasn't the time to explain the nature of the bug someone else here who does this stuff can explain it.
PHP can handle sessions using several different methods.
1) Files (default). It creates a /tmp/sess_XXXXX file for each session which contains the PHP serialized data for the session.
2) Memcache, in this instance memcache handles the atomic operations in RAM rather than on the FS, which is extremely fast, but... the record size is constrained too much for SMF to work with it.
3) Custom, you can register your own handlers and store it however you want, in the case of SMF it is using a session table in the database.
With files, PHP takes the lock with the flock system call and releases it (flock with LOCK_UN) at the end of the request. This is fine normally, but on a very busy website, if a PHP process takes an age to complete and gets terminated by the CGI handler (FPM in this instance), the unlock is never called and the file remains locked (PHP processes are reused; the process doesn't terminate, so the kernel doesn't clean these up). Then on the next request, PHP hangs without a timeout waiting on its call to flock the session. This should really be handled better, such as registering the locks with the CGI handler, or something similar. There is only a finite number of PHP handlers running; once they are all hung waiting for locks that will never be granted, there are none left for Nginx to pass the request to, and thus the 502.
With memcache, there is a limit on the object size stored, and SMF is storing way too much in the session to fit. There is also the issue of persistence: memcache is not guaranteed to keep the data and may evict it to make room for other stuff, so there is a chance that your session data gets dropped.
With the custom method, in this instance DB storage, it is up to MySQL to handle the locking, which detects if the session is terminated and unlocks any locks held when the client disconnects.
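The stale-lock behaviour described for the files handler can be reproduced in a few lines. This is only a sketch in Python, not PHP's actual session code: the file name `sess_example` and the helper `acquire_with_timeout` are made up for illustration, and it assumes a Unix-like system where `flock` is available. It shows a second opener being shut out while an earlier descriptor still holds the lock, and how a non-blocking attempt with a deadline avoids hanging forever.

```python
import fcntl
import os
import tempfile
import time


def acquire_with_timeout(fd, timeout_s, poll_s=0.05):
    """Try to take an exclusive flock on fd; give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_s)


def demo():
    # Hypothetical session file, standing in for /tmp/sess_XXXXX.
    path = os.path.join(tempfile.mkdtemp(), "sess_example")

    # "Request 1": opens the session file, locks it, and never unlocks.
    stale_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    fcntl.flock(stale_fd, fcntl.LOCK_EX)

    # "Request 2": a separate open of the same file cannot get the lock.
    second_fd = os.open(path, os.O_RDWR)
    blocked = not acquire_with_timeout(second_fd, timeout_s=0.2)

    # Only when the stale descriptor is closed does the lock go away.
    os.close(stale_fd)
    recovered = acquire_with_timeout(second_fd, timeout_s=0.2)
    os.close(second_fd)
    return blocked, recovered


if __name__ == "__main__":
    print(demo())
```

With a plain blocking `flock` call in place of `acquire_with_timeout`, "request 2" would simply wait until the stale descriptor was closed, which is the hang described above.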
I hope this clears things up a little.
-
#197 Reply
Posted by
gnif
on 27 Jul, 2017 01:34
-
It sounds like this is an inherent design problem in PHP and short of a change it is a vulnerability the system admin can't get rid of. Not without some means to clear hung sessions holding critical locks.
It must be a common problem in busy servers using the same architecture then. If so I am surprised someone else hasn't already developed a custom solution that can be installed. Or have they?
This is the first instance I personally have seen of this issue on a production system, usually however when a site grows as large as this one you ditch things like cPanel and run full dedicated, which gives much greater control on how the server operates. Unfortunately Dave needs the ability to manage this server when I am not around so this is not really much of an option for him.
What happens if a forum user logs out? Does that trigger a release of the lock held for that session? Is that why you disconnected people yesterday?
It really depends on the application, most apps will call session_destroy which does exactly that, removes the entire session record/file/cache entry, others will blank the auth data in the session. Even if you are not logged in though you have a session, every visitor is assigned one so the website can track you even if you are a guest. This is common practice.
As for the 'disconnection' of people, no, this was not intentional, it is a side effect of changing storage methods. We went from files on disk to database records, there is no means to import the sessions from the files on disk into the database.
-
#198 Reply
Posted by
PA0PBZ
on 27 Jul, 2017 07:37
-
And... it's back
-
#199 Reply
Posted by
nugglix
on 27 Jul, 2017 07:37
-
-
#200 Reply
Posted by
grumpydoc
on 27 Jul, 2017 07:38
-
And... it's back
With a vengeance!
-
-
Good, with no forum to read lately I was starting to get some overdue stuff done around here anyway. I got three outages in a row over about five minutes.
What a bugger, I thought we had it nailed.
-
#202 Reply
Posted by
H.O
on 27 Jul, 2017 08:10
-
I'm sorry if this is completely unrelated, I haven't been following this thread but on this laptop of mine - for the last two days or so - I have not been able to (and still can't) reach the forum using IE (Edge), however using Chrome works.
-
#203 Reply
Posted by
gnif
on 27 Jul, 2017 08:51
-
I'm sorry if this is completely unrelated, I haven't been following this thread but on this laptop of mine - for the last two days or so - I have not been able to (and still can't) reach the forum using IE (Edge), however using Chrome works.
If you want help you must let us know what "still can't" means. Do you get an error? What error? Etc.
As for the new outages... working on it, again please don't report for now, I will let you all know when I am done.
-
#204 Reply
Posted by
H.O
on 27 Jul, 2017 09:19
-
I'm sorry...
As of right now at this moment I can not reach the forum using IE, I get the 502-Bad gateway error message/page.
However, using Chrome I'm able to reach the forum and write this post.
I open IE (Edge) and browse to eevblog.com/forum and get the 502 error page. open Chrome and browse to eevblog.com/forum and it works, I try again with IE still get the 502. And that's how it's been for me the last two days or so.
-
-
Guys, PLEASE !!!
Cut the EEVblog admin some slack here,
running servers isn't like troubleshooting bad capacitors or fried diodes whilst googling repair tips, eating maccas and listening to Abba in the background
Try and understand this alien lingo, and put yourself in the admin's shoes..
http://technet.microsoft.com/en-us/library/bb794799.aspx
-
#206 Reply
Posted by
H.O
on 27 Jul, 2017 10:50
-
I, for one, certainly didn't mean to criticize anyone; I was merely trying to help. If I was, or if anyone (admin in particular) felt I was criticizing, I apologize.
-
-
I have noticed that when I got the 502 error, the next time I log on, I have to refresh the page (F5) otherwise the error stays in the cache. And this is the same for every subforum I am in. Each one needs a F5, if it previously had the error.
Today I had no problems at all.
Thanks for fixing it.
-
#208 Reply
Posted by
RoGeorge
on 27 Jul, 2017 14:21
-
I can access the main page, "https://www.eevblog.com/forum/", or any topic I click on, but I am constantly getting 502 - Bad Gateway errors when I'm logged in and access the two links near my avatar at the top of the page:
- Show unread posts since last visit.
- Show new replies to your posts.
For me, the forum was unusable in the last couple of days.
-
#209 Reply
Posted by
RoGeorge
on 27 Jul, 2017 14:28
-
And now, after a few F5 (refresh page) over the 502 err pages, all is working OK.
I don't understand why 502 Err keeps coming back after a few days/weeks, and why each time there is another way to solve it. Last time it was to switch from http to https, now it's just an F5. I don't get it.
-
#210 Reply
Posted by
pknoe3lh
on 27 Jul, 2017 14:39
-
And now, after a few F5 (refresh page) over the 502 err pages, all is working OK.
I don't understand why 502 Err keeps coming back after a few days/weeks, and why each time there is another way to solve it. Last time it was to switch from http to https, now it's just an F5. I don't get it.
The browser is just caching it! That's all!
The idea behind it is to reduce the load in case of an error (when everyone is just trying to reload).
So if you reload the site again, the server will not see your request => less load!
Some browsers are overdoing it, so the cache stays too long.
When you enter https instead of http, it's a new site for the browser and it will not load it from the cache.
So it depends on your browser. For some, it's enough to hit F5 a lot.
For others, you need to press SHIFT and F5 at the same time.
So good luck fixing your problem.
-
#211 Reply
Posted by
gnif
on 27 Jul, 2017 22:44
-
Update: I have made a few changes and enabled some additional metrics collection, I would not consider the issue resolved at this point. Before proceeding any further I need to wait a day or two to collect more data to see if a pattern emerges.
-
#212 Reply
Posted by
joeqsmith
on 28 Jul, 2017 00:54
-
Strange, and maybe not a clue, but Win7 Explorer will give me the 502 with http. With https and nothing else changed, I don't have the problem. Switch back, 502 returns. Back to https, goes away. A Win10 box with Explorer at the same time, without the s, gets no 502. Very strange and makes no sense, but it seems repeatable.
-
#213 Reply
Posted by
tautech
on 28 Jul, 2017 03:11
-
Another session ~20 mins ago and this fixed it
-
#214 Reply
Posted by
EEVblog
on 28 Jul, 2017 03:37
-
I got it again today too, this is not good.
-
#215 Reply
Posted by
xrunner
on 28 Jul, 2017 03:39
-
Same here.
-
#216 Reply
Posted by
pknoe3lh
on 28 Jul, 2017 07:25
-
-
-
briefly here also, not like before though
-
#218 Reply
Posted by
gnif
on 29 Jul, 2017 02:47
-
Further changes went in today that should help with things, I have also noted that there is a nasty plugin on the wordpress website issuing some pretty poor queries. Dave is now migrating to a new plugin that should fix this.
-
#219 Reply
Posted by
EEVblog
on 29 Jul, 2017 07:51
-
Further changes went in today that should help with things, I have also noted that there is a nasty plugin on the wordpress website issuing some pretty poor queries. Dave is now migrating to a new plugin that should fix this.
Yep, this was Podpress. We have moved over to a new podcasting system (which even the Podpress author recommends) and an issue I have had with slow logging into wp-admin has now gone.
Hopefully this was the problem.
-
#220 Reply
Posted by
gnif
on 30 Jul, 2017 00:24
-
Almost 24 hours since the last detected 502s... we might have this all sorted now. If anyone has any more 502s that a refresh won't fix, please report them here.
-
#221 Reply
Posted by
hermit
on 30 Jul, 2017 00:56
-
Show new replies to your posts.
That link gives me a 502 still.
-
#222 Reply
Posted by
gnif
on 30 Jul, 2017 01:04
-
Show new replies to your posts.
That link gives me a 502 still.
Force refresh then, it's cached. 502s are entire website wide, not just one page.
-
#223 Reply
Posted by
hermit
on 30 Jul, 2017 01:16
-
Hmm... Update must have trashed my settings. I do some web work here and there and set my browser not to cache. Refreshing wasn't helping but dumping the cache, which wasn't supposed to be on, did work.
-
#224 Reply
Posted by
gnif
on 30 Jul, 2017 03:31
-
You want to reveal what you think the cause of the problem was? Or not.
Two things.
1) Microsoft (aka, Bing) hammering the search on the forum. Seems their crawler uses the search feature of the forum to look for keywords, which was causing quite high server load when the crawl started each time.
2) The podcast plugin for the forum
-
-
good grief
I do hope you found a way to make MS bugger off
-
#226 Reply
Posted by
gnif
on 30 Jul, 2017 05:19
-
good grief
I do hope you found a way to make MS bugger off
No, we need the indexing, blocking crawlers is generally not an option.
You want to reveal what you think the cause of the problem was? Or not.
Two things.
1) Microsoft (aka, Bing) hammering the search on the forum. Seems their crawler uses the search feature of the forum to look for keywords, which was causing quite high server load when the crawl started each time.
2) The podcast plugin for the forum
So it was essentially transient high CPU behind the problem?
Pretty much, assuming it is fixed now. The podcast plugin that Dave was using tracks every view/hit/listen and where they came from, but instead of keeping a count of the totals it was doing a 'SELECT COUNT(DISTINCT col)' query over 6.2 million records. Each time Dave logged into the wordpress admin interface it would re-count them, causing a huge spike in database load.
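To make the difference concrete, here is a rough sketch of the two approaches using Python's built-in sqlite3 module. The table and column names (`hits`, `seen`, `totals`, `episode`, `visitor`) are invented for illustration; the real plugin's schema is not shown in this thread, and the server runs MySQL rather than SQLite. The point is only that a running total makes the read cost independent of how many hit records have piled up.

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical stats tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hits (episode TEXT, visitor TEXT)")
    db.execute(
        "CREATE TABLE seen (episode TEXT, visitor TEXT, "
        "PRIMARY KEY (episode, visitor))"
    )
    db.execute(
        "CREATE TABLE totals (episode TEXT PRIMARY KEY, "
        "unique_visitors INTEGER NOT NULL)"
    )
    return db


def record_hit(db, episode, visitor):
    """Log the hit and bump the running total only for a new visitor."""
    db.execute("INSERT INTO hits VALUES (?, ?)", (episode, visitor))
    already = db.execute(
        "SELECT 1 FROM seen WHERE episode = ? AND visitor = ?",
        (episode, visitor),
    ).fetchone()
    if already is None:
        db.execute("INSERT INTO seen VALUES (?, ?)", (episode, visitor))
        db.execute("INSERT OR IGNORE INTO totals VALUES (?, 0)", (episode,))
        db.execute(
            "UPDATE totals SET unique_visitors = unique_visitors + 1 "
            "WHERE episode = ?",
            (episode,),
        )


def slow_count(db, episode):
    """Re-count on every call: scans all hit rows for the episode."""
    return db.execute(
        "SELECT COUNT(DISTINCT visitor) FROM hits WHERE episode = ?",
        (episode,),
    ).fetchone()[0]


def fast_count(db, episode):
    """Read the maintained total: a single primary-key lookup."""
    row = db.execute(
        "SELECT unique_visitors FROM totals WHERE episode = ?", (episode,)
    ).fetchone()
    return row[0] if row else 0
```

Both functions return the same number; the trade-off is a little extra work on every write in exchange for a cheap read on the admin page.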
SMF also does some strange stuff to search the forums, it creates a temporary table (Hash) that it uses to store the result set into for further sorting. Since this table never actually exists, I had never seen the effect it was having. To resolve this I installed the Sphinx full text search engine onto the server and configured SMF to use that instead, now the search can be hammered as much as Bing wants and it won't cause spikes in DB load.
Here is a small sample of what Bing is feeding the search box, such a dirty method of crawling.
[Sun Jul 30 03:16:42.090 2017] 0.073 sec 0.073 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by tech swiss
[Sun Jul 30 03:16:45.231 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by
[Sun Jul 30 03:16:51.472 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or
[Sun Jul 30 03:16:54.275 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk
[Sun Jul 30 03:16:57.432 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather
[Sun Jul 30 03:17:08.289 2017] 0.024 sec 0.024 sec [ext2/1/ext 466 (0,1000) @id_topic] [smf_index] evolution
[Sun Jul 30 03:17:40.449 2017] 0.021 sec 0.021 sec [ext2/1/ext 218 (0,1000) @id_topic] [smf_index] 121
[Sun Jul 30 03:17:51.521 2017] 0.041 sec 0.041 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count bags with
[Sun Jul 30 03:17:56.119 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count
[Sun Jul 30 03:17:59.322 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80
[Sun Jul 30 03:18:05.293 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift
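The pattern in that sample can be picked out mechanically: each query is an earlier one with trailing words removed. A small sketch in Python (the function names are made up for illustration, and the query strings are copied from the log lines above) that groups consecutive queries into such truncation chains:

```python
QUERIES = [
    "valet tray leather desk or dresser organizer by tech swiss",
    "valet tray leather desk or dresser organizer by",
    "valet tray leather desk or",
    "valet tray leather desk",
    "valet tray leather",
    "evolution",
    "121",
    "stash tea flavor variety pack gift set 80 count bags with",
    "stash tea flavor variety pack gift set 80 count",
    "stash tea flavor variety pack gift set 80",
    "stash tea flavor variety pack gift",
]


def is_word_prefix(shorter, longer):
    """True if `shorter` is `longer` with one or more trailing words dropped."""
    a, b = shorter.split(), longer.split()
    return len(a) < len(b) and b[: len(a)] == a


def truncation_chains(queries):
    """Group consecutive queries where each is a word-prefix of the previous."""
    chains, current = [], []
    for query in queries:
        if current and is_word_prefix(query, current[-1]):
            current.append(query)
        else:
            if len(current) > 1:
                chains.append(current)
            current = [query]
    if len(current) > 1:
        chains.append(current)
    return chains


if __name__ == "__main__":
    for chain in truncation_chains(QUERIES):
        print(len(chain), "queries starting from:", chain[0])
```

On this sample it finds two chains, one of five queries and one of four; the single-word searches in between stand alone.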
-
#227 Reply
Posted by
hermit
on 30 Jul, 2017 05:32
-
Crawl? That looks more like a dictionary attack.
-
-
you have got to be kidding
-
#229 Reply
Posted by
gnif
on 30 Jul, 2017 05:45
-
Crawl? That looks more like a dictionary attack.
Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?
-
#230 Reply
Posted by
hermit
on 30 Jul, 2017 06:34
-
Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?
M$ has been capable of manufacturing their own crap for quite a while now. No external help needed. But, let's put this in their database.
LINUX!
-
#231 Reply
Posted by
Brumby
on 30 Jul, 2017 07:51
-
Crawl? That looks more like a dictionary attack.
Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?
My guess - any such revenge will be short lived. I would expect any DB pollution would be picked up and steps would be taken...
-
#232 Reply
Posted by
gnif
on 30 Jul, 2017 08:12
-
Pretty much, assuming it is fixed now. The podcast plugin that Dave was using tracks every view/hit/listen and where they came from, but instead of keeping a count of the totals it was doing a 'SELECT COUNT(DISTINCT col)' query over 6.2 million records. Each time Dave logged into the wordpress admin interface it would re-count them, causing a huge spike in database load.
SMF also does some strange stuff to search the forums, it creates a temporary table (Hash) that it uses to store the result set into for further sorting. Since this table never actually exists, I had never seen the effect it was having. To resolve this I installed the Sphinx full text search engine onto the server and configured SMF to use that instead, now the search can be hammered as much as Bing wants and it won't cause spikes in DB load.
Would it be informative if you could correlate the bing searches and the times Dave logged in to the Wordpress admin interface? Maybe you already tried. If they match the times of the 502's, as well as if the install of the new podcast plugin coincided with the onset of the 502's you might feel fairly confident of the root cause. That last bit shouldn't be hard and if there is a coincidence you'll know it was all Dave's fault.
It was a contributing factor, it was not the cause. This plugin has been in place since before I started helping Dave out, it has been a gradual slowdown as the tables grew.
But back to a more serious note those Bing searches surely would not have been a recent phenomenon. So whilst the Sphinx install may have been a nice thing to do it may have been unnecessary. Nice to have, but unnecessary.
I beg to differ, every time I managed over the last few days to be available during the 502 issue there was a backlog of temp table inserts for the search. Performing a search myself to evaluate the impact confirmed that the load of performing a single search was unacceptable. The sheer number of posts it is searching is huge and warranted the move to a full text search engine. SMF is good at building a keyword table, but since this is a technical forum there are tons of extra keywords in that table that a regular forum would not have, such as part numbers, model numbers, even every combination of N.Nk or N.Nkohm... the keyword table is enormous.
Note that implementing Sphinx was a last resort, I have been trying to avoid this up till now, but it has become obvious that it is required if this forum is to continue with its "no deletion policy".
-
#233 Reply
Posted by
ebastler
on 30 Jul, 2017 08:30
-
Here is a small sample of what Bing is feeding the search box, such a dirty method of crawling.
[Sun Jul 30 03:16:42.090 2017] 0.073 sec 0.073 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by tech swiss
[Sun Jul 30 03:16:45.231 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by
[Sun Jul 30 03:16:51.472 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or
[Sun Jul 30 03:16:54.275 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk
[Sun Jul 30 03:16:57.432 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather
[Sun Jul 30 03:17:08.289 2017] 0.024 sec 0.024 sec [ext2/1/ext 466 (0,1000) @id_topic] [smf_index] evolution
[Sun Jul 30 03:17:40.449 2017] 0.021 sec 0.021 sec [ext2/1/ext 218 (0,1000) @id_topic] [smf_index] 121
[Sun Jul 30 03:17:51.521 2017] 0.041 sec 0.041 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count bags with
[Sun Jul 30 03:17:56.119 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count
[Sun Jul 30 03:17:59.322 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80
[Sun Jul 30 03:18:05.293 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift
Can anyone fathom why they search in this way? If one wants to index all posts and threads on the forum, systematically crawling the content hierarchy still seems like the way to go. What do they expect to get from this forum search by trial-and-error?
In the example shown, it looks like they are searching for descriptions of consumer products. Is that the case for all such searches -- i.e. is this search somehow targeting ad-worthy content? I can't figure out what the benefit would be...
-
#234 Reply
Posted by
RGB255_0_0
on 30 Jul, 2017 10:18
-
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.
Eventually the no delete mantra may have to change.
-
#235 Reply
Posted by
SeanB
on 30 Jul, 2017 11:00
-
Might have to split the forum into chunks if it grows too big, or archive posts older than XXX or which have had nothing posted for XXX months in a separate database.
-
#236 Reply
Posted by
EEVblog
on 30 Jul, 2017 11:03
-
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.
Eventually the no delete mantra may have to change.
Doesn't look like they use SMF like this forum, so would be entirely different code and database.
-
#237 Reply
Posted by
gnif
on 30 Jul, 2017 12:35
-
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.
Eventually the no delete mantra may have to change.
Hah! I challenge that ... I am managing the RealGM forums, where the same policy applies; it's phpBB with over 35 million posts. Set up the hosting right, tune it right, and it's amazing what you can accomplish when you know what you are doing. They were on the verge of deleting posts when they approached me.
-
-
I'm surprised SMF can't insert attachments in between paragraphs rather than all at the bottom. It would save BW use by not having to expand the attachments into full size pics within the text.
-
#239 Reply
Posted by
station240
on 30 Jul, 2017 15:39
-
Couldn't you just throttle Bing, by bandwidth or queries per second?
That way it can still index, but without putting too much load on the server.
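A minimal sketch of what "queries per second" throttling could look like, assuming a token-bucket limiter applied only to requests whose User-Agent identifies the crawler. The class, the function name and the numbers are illustrative assumptions, not anything the forum actually runs.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def is_throttled_crawler(user_agent):
    # Hypothetical check: match on a substring of the User-Agent header.
    return "bingbot" in user_agent.lower()
```

A request from the crawler would only be served when `allow()` returns True; everything else passes through untouched, so indexing continues at a bounded rate.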
-
#240 Reply
Posted by
pknoe3lh
on 30 Jul, 2017 15:45
-
You can use one server just for the database ;-)
How is Bing getting through the captcha question?
If you are not logged in you need to answer a question.
-
#241 Reply
Posted by
tronde
on 30 Jul, 2017 16:37
-
Almost 24 hours since the last detected 502's ... we might have this all sorted now. Anyone have any more 502's that a refresh won't fix, please report them here.
Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
-
#242 Reply
Posted by
hermit
on 30 Jul, 2017 16:44
-
Almost 24 hours since the last detected 502's ... we might have this all sorted now. Anyone have any more 502's that a refresh won't fix, please report them here.
Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
I just had a one off. It was a different page. Said SMF couldn't connect to the database. Reloaded the page and it disappeared. No waiting. Number of database connections problem maybe? I use the 'new posts' link and middle click to open new tabs as I go down the list.
-
#243 Reply
Posted by
hans
on 30 Jul, 2017 18:12
-
On most other forums I see them splitting up mega threads into multiple parts. There usually was a 1000 post limit per thread, or at least they tried to work with that.
But those threads (e.g. the launch and discussion of a smartphone series or other computer gear) have very well maintained thread starts linking in models, different review videos, etc., and then also linking to the other parts somewhere in the thread starts.
But I imagine that if the post table itself becomes too large (rather than the problem being the forum script querying which posts belong, in which order, to a specific large thread), then that also doesn't work.
-
-
-
-
-
#246 Reply
Posted by
gnif
on 30 Jul, 2017 21:08
-
Almost 24 hours since the last detected 502's ... we might have this all sorted now. Anyone have any more 502's that a refresh won't fix, please report them here.
Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
I just had a one off. It was a different page. Said SMF couldn't connect to the database. Reloaded the page and it disappeared. No waiting. Number of database connections problem maybe? I use the 'new posts' link and middle click to open new tabs as I go down the list.
Ok, that is great news! I can account for these ones now that the noise of the other 502s is gone: they are caused by the backup process. cPanel creates a tar and a MySQL dump daily and then compresses them. Because cPanel can't know whether all the databases use only InnoDB tables, it locks tables to perform the backups. It also lacks the ability to perform incremental database backups, so each backup loads things down pretty heavily each day.
I will talk to Dave about using an alternative solution that my company can provide that won't cause this high load each day and will also save on a ton of bandwidth.
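To illustrate the table-locking point, here is a rough sketch of how a backup script might choose between a lock-free and a locking dump once it knows the storage engines. It only builds a `mysqldump` command line using that tool's `--single-transaction` and `--lock-tables` options; the function name, database name and table names are placeholders, and this is not how cPanel or the proposed alternative actually works.

```python
def dump_command(database, table_engines):
    """Build a mysqldump argv for `database`.

    table_engines maps table name -> storage engine name
    (for example as read from information_schema.TABLES).
    """
    all_innodb = bool(table_engines) and all(
        engine.lower() == "innodb" for engine in table_engines.values()
    )
    if all_innodb:
        # Consistent snapshot inside one transaction; no table locks needed.
        consistency = "--single-transaction"
    else:
        # Non-transactional tables (e.g. MyISAM) need locks for a consistent dump.
        consistency = "--lock-tables"
    return ["mysqldump", consistency, database]
```

A tool that cannot inspect the engines has to assume the worst case and take the locking branch, which is the behaviour described above.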
-
#247 Reply
Posted by
eugenenine
on 30 Jul, 2017 22:04
-
Clicking on the "Show unread posts since last visit." link causes a 502.
-
#248 Reply
Posted by
gnif
on 30 Jul, 2017 22:06
-
Clicking on the "Show unread posts since last visit." link causes a 502.
That's the 2nd report of that, please clear your cache or force refresh. 502's don't affect a single page, they are site wide.
-
-
502's don't affect a single page, they are site wide.
Really, there's often only one thread or part of the site, such as the projects index page that we can't get to.
-
#250 Reply
Posted by
gnif
on 30 Jul, 2017 22:47
-
502's don't affect a single page, they are site wide.
Really, there's often only one thread or part of the site, such as the projects index page that we can't get to.
Yes. A 502 means there is an error inside the server: the PHP process is not available for Nginx to communicate with. For those that don't understand the process, this is how an incoming HTTP request is served.
1) Your client establishes a connection with the server (Nginx)
2) Your client asks the server for a resource
3) The server determines if the resource is a static file (e.g. an image), or dynamic (a PHP script in this instance).
4a) For static, the webserver just grabs the file on disk and sends it out.
4b) For dynamic, the webserver connects internally to the PHP service on the server and asks it to run the script
502 errors are when the webserver can't connect to a non-busy PHP service... the cause can be any number of things: a bad script, PHP crashed, MySQL didn't return in a timely manner and all the PHP processes are hung up waiting, etc...
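The steps above can be sketched as a toy dispatcher. This is a simplified model for illustration only: the extension list, `PhpPool` and `handle` are made-up names, and a real Nginx/PHP setup involves sockets, queues and timeouts that are left out here.

```python
import os

# Assumed set of extensions treated as static files.
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".css", ".js"}


class PhpPool:
    """Toy stand-in for a fixed-size pool of PHP worker processes."""

    def __init__(self, workers):
        self.free = workers

    def acquire(self):
        if self.free == 0:
            return False  # every worker is busy or hung
        self.free -= 1
        return True

    def release(self):
        self.free += 1


def handle(path, pool):
    """Return (status, body) for one request."""
    ext = os.path.splitext(path)[1].lower()
    if ext in STATIC_EXTENSIONS:
        # Step 4a: static files never touch PHP.
        return 200, "static file " + path
    # Step 4b: dynamic pages need a free PHP worker.
    if not pool.acquire():
        return 502, "Bad Gateway"
    try:
        return 200, "output of " + path
    finally:
        pool.release()
```

In this model, once all workers are stuck (say, waiting on MySQL), every dynamic page returns 502 regardless of which page it is, while static files keep working.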
It doesn't matter what page; the server doesn't care at this point. It can happen on any page at any time. Unfortunately CF does strange things with caching the 502 error (this can likely be resolved, but I have not investigated this aspect of the issue yet).
Because tons of people use the 'show new replies' feature, when we get 502 errors it has a high probability of being cached at Cloudflare as broken. CF starts dishing out the static 502 even after the issue has been resolved for that particular page, which is why a force refresh will resolve it, as this instructs Cloudflare to go and fetch the content again anyway.
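A toy model of the caching effect described here, assuming an edge cache that stores whatever the origin returned, errors included. `EdgeCache` and its `force_refresh` flag are illustrative only and do not represent Cloudflare's actual caching rules.

```python
class EdgeCache:
    """Toy edge cache that stores every origin response, errors included."""

    def __init__(self, origin):
        self.origin = origin  # callable: url -> (status, body)
        self.store = {}

    def get(self, url, force_refresh=False):
        if not force_refresh and url in self.store:
            # Serve the stored copy, even if it is a stale error page.
            return self.store[url]
        response = self.origin(url)
        self.store[url] = response
        return response
```

If the origin answered 502 once for a popular URL, this cache keeps serving that 502 after the origin has recovered, until some request bypasses the stored copy.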
-
#251 Reply
Posted by
hermit
on 30 Jul, 2017 23:13
-
502's don't affect a single page, they are site wide.
Really, there's often only one thread or part of the site, such as the projects index page that we can't get to.
A full 24 hours after "Show unread posts since last visit." started acting correctly, "Show new replies to your posts." was still giving me a 502, even after multiple page reloads. I had to dump my cache, which I thought was turned off.
-
#252 Reply
Posted by
EEVblog
on 30 Jul, 2017 23:18
-
How is Bing getting through the captcha question?
If you are not logged in you need to answer a question.
You can read and search this forum without an account. The search bots are not logging in.
I think I can disable the search feature to account holders only, but I don't like doing that.
-
#253 Reply
Posted by
glarsson
on 30 Jul, 2017 23:27
-
The search bots are not logging in.
How does it get past the captcha you are asked to answer when not logged in? I thought the captcha was intended to stop bots.
-
-
You can read and search this forum without an account. The search bots are not logging in.
I think I can disable the search feature to account holders only, but I don't like doing that.
Generally speaking and just for the record, both members and guests are still able to use Google or similar search engines to find particular topics or threads. I tend to use this rather than the forum's search feature, which in my experience can be a bit hit and miss. I don't know why this is; example below.
-
#255 Reply
Posted by
gnif
on 30 Jul, 2017 23:49
-
I am updating the 502 error (still states that we are doing DB updates, lol)... I will also create a thread that it specifically links to for reporting errors.
-
#256 Reply
Posted by
tronde
on 30 Jul, 2017 23:54
-
I think I can disable the search feature to account holders only, but I don't like doing that.
What about a time limit between searches for guests? I have seen some forums using only a few seconds as a limit. After reading here, I guess that is related to bots.
-
-
I am updating the 502 error (still states that we are doing DB updates, lol)... I will also create a thread that it specifically links to for reporting errors.
This has been discussed before and I don't know why it was not implemented earlier. As you can see, we end up with multiple threads discussing the same subject, or new threads addressing individual users' problems, which are sometimes just a profile checkbox tick.
-
#258 Reply
Posted by
gnif
on 30 Jul, 2017 23:57
-
I think I can disable the search feature to account holders only, but I don't like doing that.
What about a time limit between searches for guests? I have seen some forums using only a few seconds as a limit. After reading here, I guess that is related to bots.
Not sure why this is even being discussed... the problem is resolved, the server is able to handle a massive amount of search load now without any negative impact.
-
#259 Reply
Posted by
gnif
on 30 Jul, 2017 23:58
-
I am updating the 502 error (still states that we are doing DB updates, lol)... I will also create a thread that it specifically links to for reporting errors.
This has been discussed before and I don't know why it was not implemented earlier. As you can see, we end up with multiple threads discussing the same subject, or new threads addressing individual users' problems, which are sometimes just a profile checkbox tick.
The thread will not be for forum usage/profile issues, etc.. it is specifically for server outages. If you want a thread for that kind of support please ask Dave as I do not cover individual user issues, I do not have the time.
-
#260 Reply
Posted by
pknoe3lh
on 31 Jul, 2017 07:47
-
How is Bing getting through the captcha question?
If you are not logged in you need to answer a question.
You can read and search this forum without an account. The search bots are not logging in.
I think I can disable the search feature to account holders only, but I don't like doing that.
Yes I can do that ... But not a robot ;-)
So how can bing use the search box?
Can they solve captchas?
-
#261 Reply
Posted by
gnif
on 31 Jul, 2017 07:52
-
How is Bing getting through the captcha question?
If you are not logged in you need to answer a question.
You can read and search this forum without an account. The search bots are not logging in.
I think I can disable the search feature to account holders only, but I don't like doing that.
Yes I can do that ... But not a robot ;-)
So how can bing use the search box?
Can they solve captchas?
Top right hand corner quick search doesn't require a captcha
Edit: For search engines... it is intentional so that they can index the site.
-
#262 Reply
Posted by
eugenenine
on 01 Aug, 2017 22:59
-
So how can bing use the search box?
Why would you want to use a website that can't find anything (Bing)? I have to use Google to search the MSKB at work now because Bing can't find articles that exist.
-
#263 Reply
Posted by
pknoe3lh
on 02 Aug, 2017 14:57
-