Author Topic: Return of 502- Bad gateway Error  (Read 58653 times)


Offline WastelandTek

  • Frequent Contributor
  • **
  • Posts: 609
  • Country: 00
Re: Return of 502- Bad gateway Error
« Reply #225 on: July 30, 2017, 03:57:29 am »
good grief

I do hope you found a way to make MS bugger off
I'm new here, but I tend to be pretty gregarious, so if I'm out of my lane please call me out.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #226 on: July 30, 2017, 05:19:18 am »
good grief

I do hope you found a way to make MS bugger off

No, we need the indexing, blocking crawlers is generally not an option.

You want to reveal what you think the cause of the problem was? Or not. ;)

Two things.

1) Microsoft (aka, Bing) hammering the search on the forum. Seems their crawler uses the search feature of the forum to look for keywords, which was causing quite high server load when the crawl started each time.

2) The podcast plugin for the forum
So it was essentially transient high CPU behind the problem?

Pretty much, assuming it is fixed now. The podcast plugin that Dave was using tracks every view/hit/listen and where they came from, but instead of keeping a count of the totals it was doing a 'SELECT COUNT(DISTINCT col)' query over 6.2 million records. Each time Dave logged into the WordPress admin interface it would re-count them, causing a huge spike in database load.

SMF also does some strange stuff to search the forums: it creates a temporary table (Hash) that it uses to store the result set for further sorting. Since this table never actually exists, I had never seen the effect it was having. To resolve this I installed the Sphinx full text search engine onto the server and configured SMF to use that instead. Now the search can be hammered as much as Bing wants and it won't cause spikes in DB load.
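
For anyone wondering why a dedicated full text engine helps: the general idea is an inverted index built ahead of time, so a query becomes a few set lookups instead of a temporary table filled on every search. The toy below only illustrates that idea. It is not how Sphinx is implemented and not SMF's API, and the sample posts and function names are made up.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and split on anything that is not a letter, digit, or dot,
    # so values like "4.7k" survive as a single keyword.
    return [t for t in re.split(r"[^a-z0-9.]+", text.lower()) if t]

def build_index(posts):
    # Map each keyword to the set of post ids that contain it.
    index = defaultdict(set)
    for post_id, body in posts.items():
        for token in tokenize(body):
            index[token].add(post_id)
    return index

def search(index, query):
    # A post matches only if it contains every keyword in the query.
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Made-up sample data for illustration.
posts = {
    1: "Replaced the 4.7k pull-up resistor",
    2: "Which resistor for the LED?",
    3: "Sphinx search is fast",
}
index = build_index(posts)
```

The index is built once (and updated as posts arrive), so a query for a phrase that matches nothing returns an empty set almost immediately, which fits the near-zero timings for the junk queries in the log.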

Here is a small sample of what Bing is feeding the search box, such a dirty method of crawling.

Code:
[Sun Jul 30 03:16:42.090 2017] 0.073 sec 0.073 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by tech swiss
[Sun Jul 30 03:16:45.231 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by
[Sun Jul 30 03:16:51.472 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or
[Sun Jul 30 03:16:54.275 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk
[Sun Jul 30 03:16:57.432 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather
[Sun Jul 30 03:17:08.289 2017] 0.024 sec 0.024 sec [ext2/1/ext 466 (0,1000) @id_topic] [smf_index] evolution
[Sun Jul 30 03:17:40.449 2017] 0.021 sec 0.021 sec [ext2/1/ext 218 (0,1000) @id_topic] [smf_index] 121
[Sun Jul 30 03:17:51.521 2017] 0.041 sec 0.041 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count bags with
[Sun Jul 30 03:17:56.119 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count
[Sun Jul 30 03:17:59.322 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80
[Sun Jul 30 03:18:05.293 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift
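
The pattern in that log (the same phrase re-submitted with the last word dropped each time) can be picked out mechanically. A rough sketch, assuming only that each log line starts with a bracketed timestamp and that the query text follows the last bracketed index name; the regular expression and function names are my own, not part of Sphinx:

```python
import re

# Timestamp in the first brackets, index name in the last brackets,
# and the search text after it. The fields in between are skipped.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*\[(?P<index>[^\]\s]+)\] (?P<query>.*)$")

def extract_queries(lines):
    # Pull the search text out of each log line that matches the layout.
    queries = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            queries.append(m.group("query"))
    return queries

def truncation_chains(queries):
    # Group consecutive queries where each one equals the previous
    # query with its final word removed.
    chains = []
    current = []
    for q in queries:
        if current and current[-1].split()[:-1] == q.split():
            current.append(q)
        else:
            if len(current) > 1:
                chains.append(current)
            current = [q]
    if len(current) > 1:
        chains.append(current)
    return chains

# Four lines copied from the log sample above.
sample = [
    "[Sun Jul 30 03:16:51.472 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or",
    "[Sun Jul 30 03:16:54.275 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk",
    "[Sun Jul 30 03:16:57.432 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather",
    "[Sun Jul 30 03:17:08.289 2017] 0.024 sec 0.024 sec [ext2/1/ext 466 (0,1000) @id_topic] [smf_index] evolution",
]
chains = truncation_chains(extract_queries(sample))
```

Run over a full day's log, the number and length of such chains would give a rough measure of how much of the search traffic follows this pattern.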
« Last Edit: July 30, 2017, 05:23:01 am by gnif »
 

Offline hermit

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
Re: Return of 502- Bad gateway Error
« Reply #227 on: July 30, 2017, 05:32:18 am »
Crawl?  That looks more like a dictionary attack.
 

Offline WastelandTek

  • Frequent Contributor
  • **
  • Posts: 609
  • Country: 00
Re: Return of 502- Bad gateway Error
« Reply #228 on: July 30, 2017, 05:39:20 am »
you have got to be kidding

 :o
I'm new here, but I tend to be pretty gregarious, so if I'm out of my lane please call me out.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #229 on: July 30, 2017, 05:45:49 am »
Crawl?  That looks more like a dictionary attack.

Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?
 

Offline hermit

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
Re: Return of 502- Bad gateway Error
« Reply #230 on: July 30, 2017, 06:34:59 am »
Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?
M$ has been capable of manufacturing their own crap for quite a while now.  No external help needed.  But, let's put this in their database.  LINUX!
 

Offline Brumby

  • Supporter
  • ****
  • Posts: 12327
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #231 on: July 30, 2017, 07:51:46 am »
Crawl?  That looks more like a dictionary attack.

Makes me wonder what would happen if I made it fudge the result set, could we pollute their DB with crap?

My guess - any such revenge will be short lived.  I would expect any DB pollution would be picked up and steps would be taken...
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #232 on: July 30, 2017, 08:12:16 am »
Pretty much, assuming it is fixed now. The podcast plugin that Dave was using tracks every view/hit/listen and where they came from, but instead of keeping a count of the totals it was doing a 'SELECT COUNT(DISTINCT col)' query over 6.2 million records. Each time Dave logged into the WordPress admin interface it would re-count them, causing a huge spike in database load.

SMF also does some strange stuff to search the forums: it creates a temporary table (Hash) that it uses to store the result set for further sorting. Since this table never actually exists, I had never seen the effect it was having. To resolve this I installed the Sphinx full text search engine onto the server and configured SMF to use that instead. Now the search can be hammered as much as Bing wants and it won't cause spikes in DB load.

Would it be informative if you could correlate the Bing searches and the times Dave logged in to the WordPress admin interface? Maybe you already tried. If they match the times of the 502s, and if the install of the new podcast plugin coincided with the onset of the 502s, you might feel fairly confident of the root cause. That last bit shouldn't be hard, and if there is a coincidence you'll know it was all Dave's fault.  :)

It was a contributing factor, not the cause. This plugin has been in place since before I started helping Dave out; it has been a gradual slowdown as the tables grew.

But back to a more serious note those Bing searches surely would not have been a recent phenomenon. So whilst the Sphinx install may have been a nice thing to do it may have been unnecessary. Nice to have, but unnecessary.

I beg to differ. Every time I managed to be available during the 502 issue over the last few days, there was a backlog of temp table inserts for the search. Performing a search myself to evaluate the impact confirmed that the load of a single search was unacceptable. The sheer number of posts it is searching is huge and warranted the move to a full text search engine. SMF is good at building a keyword table, but since this is a technical forum there are tons of extra keywords in that table that a regular forum would not have, such as part numbers, model numbers, even every combination of N.Nk or N.Nkohm... the keyword table is enormous.

Note that implementing Sphinx was a last resort. I have been trying to avoid this up till now, but it has become obvious that it is required if this forum is to continue with its "no deletion policy".
 

Online ebastler

  • Super Contributor
  • ***
  • Posts: 6647
  • Country: de
Re: Return of 502- Bad gateway Error
« Reply #233 on: July 30, 2017, 08:30:13 am »
Here is a small sample of what Bing is feeding the search box, such a dirty method of crawling.

Code:
[Sun Jul 30 03:16:42.090 2017] 0.073 sec 0.073 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by tech swiss
[Sun Jul 30 03:16:45.231 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or dresser organizer by
[Sun Jul 30 03:16:51.472 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk or
[Sun Jul 30 03:16:54.275 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather desk
[Sun Jul 30 03:16:57.432 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] valet tray leather
[Sun Jul 30 03:17:08.289 2017] 0.024 sec 0.024 sec [ext2/1/ext 466 (0,1000) @id_topic] [smf_index] evolution
[Sun Jul 30 03:17:40.449 2017] 0.021 sec 0.021 sec [ext2/1/ext 218 (0,1000) @id_topic] [smf_index] 121
[Sun Jul 30 03:17:51.521 2017] 0.041 sec 0.041 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count bags with
[Sun Jul 30 03:17:56.119 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80 count
[Sun Jul 30 03:17:59.322 2017] 0.001 sec 0.001 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift set 80
[Sun Jul 30 03:18:05.293 2017] 0.000 sec 0.000 sec [ext2/3/ext 0 (0,1000) @id_topic] [smf_index] stash tea flavor variety pack gift

Can anyone fathom why they search in this way? If one wants to index all posts and threads on the forum, systematically crawling the content hierarchy still seems like the way to go. What do they expect to get from this forum search by trial-and-error?

In the example shown, it looks like they are searching for descriptions of consumer products. Is that the case for all such searches -- i.e. is this search somehow targeting ad-worthy content? I can't figure out what the benefit would be...
 

Offline RGB255_0_0

  • Frequent Contributor
  • **
  • Posts: 772
  • Country: gb
Re: Return of 502- Bad gateway Error
« Reply #234 on: July 30, 2017, 10:18:17 am »
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.

Eventually the no delete mantra may have to change.
Your toaster just set fire to an African child over TCP.
 

Offline SeanB

  • Super Contributor
  • ***
  • Posts: 16302
  • Country: za
Re: Return of 502- Bad gateway Error
« Reply #235 on: July 30, 2017, 11:00:39 am »
Might have to split the forum into chunks if it grows too big, or archive posts older than XXX or which have had nothing posted for XXX months in a separate database.
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 37868
  • Country: au
    • EEVblog
Re: Return of 502- Bad gateway Error
« Reply #236 on: July 30, 2017, 11:03:22 am »
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.

Eventually the no delete mantra may have to change.

Doesn't look like they use SMF like this forum, so would be entirely different code and database.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #237 on: July 30, 2017, 12:35:50 pm »
It's amazing that the forum here doesn't have issues with respect to the length of some forum threads. Overclock.net had to delete several threads due to the impact they had on overall performance.

Eventually the no delete mantra may have to change.

Hah! I challenge that :D... I am managing the RealGM forums where the same policy applies; it's phpBB with over 35 million posts. Set up the hosting right, tune it right, and it's amazing what you can accomplish when you know what you are doing. They were on the verge of deleting posts when they approached me.
 
The following users thanked this post: hermit

Offline StillTrying

  • Super Contributor
  • ***
  • Posts: 2850
  • Country: se
  • Country: Broken Britain
Re: Return of 502- Bad gateway Error
« Reply #238 on: July 30, 2017, 03:34:39 pm »
I'm surprised SMF can't insert attachments in between paragraphs rather than all at the bottom. It would save BW use by not having to expand the attachments into full size pics within the text.
.  That took much longer than I thought it would.
 

Offline station240

  • Supporter
  • ****
  • Posts: 967
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #239 on: July 30, 2017, 03:39:12 pm »
Couldn't you just throttle Bing, by Bandwidth or queries per second ?
That way it can still index, but without putting too much load on the server.
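
For illustration, queries-per-second throttling is commonly done with a token bucket. The sketch below is generic and not tied to this forum's setup; in practice this would more likely live in the web server or a firewall rule than in application code. The class name and parameters are hypothetical, and deciding which requests count as Bing (for example by user agent) is left out.

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens that can accumulate
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Refill according to the time elapsed since the last call,
        # never exceeding the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        # Spend one token per request if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One bucket per crawler: at most 2 searches in a burst, 1 per second sustained.
bing_bucket = TokenBucket(rate=1.0, capacity=2)
```

A request for which `allow()` returns `False` would typically be answered with HTTP 429 or 503 so that a well-behaved crawler backs off and retries later, which keeps the indexing without the load spikes.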
 

Offline pknoe3lh

  • Regular Contributor
  • *
  • Posts: 115
  • Country: at
  • Trust me I'm an engineer
    • My Homepage
Re: Return of 502- Bad gateway Error
« Reply #240 on: July 30, 2017, 03:45:22 pm »
You can use one server just for the database ;-)

How is Bing getting through the captcha question?
If you are not logged in you need to answer a question.

Offline tronde

  • Frequent Contributor
  • **
  • Posts: 307
  • Country: no
Re: Return of 502- Bad gateway Error
« Reply #241 on: July 30, 2017, 04:37:27 pm »
Almost 24 hours since the last detected 502's :D... we might have this all sorted now :D. Anyone have any more 502's that a refresh won't fix, please report them here.

Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
 

Offline hermit

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
Re: Return of 502- Bad gateway Error
« Reply #242 on: July 30, 2017, 04:44:21 pm »
Almost 24 hours since the last detected 502's :D... we might have this all sorted now :D. Anyone have any more 502's that a refresh won't fix, please report them here.

Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
I just had a one off.  It was a different page.  Said SMF couldn't connect to the database.  Reloaded the page and it disappeared.  No waiting.  Number of database connections problem maybe?  I use the 'new posts' link and middle click to open new tabs as I go down the list.
 

Offline hans

  • Super Contributor
  • ***
  • Posts: 1653
  • Country: nl
Re: Return of 502- Bad gateway Error
« Reply #243 on: July 30, 2017, 06:12:33 pm »
On most other forums I see them splitting up mega threads into multiparts. There usually was a 1000 post limit per thread, or at least they tried to work with that.

But those threads (e.g. the launch and discussion of a SmartPhone series or other computer gear) have very well maintained thread starts linking in models, different review videos, etc. and then also linking in the multi parts in the thread starts somewhere.

But I imagine that if the post table itself becomes too large (instead of the forum script querying which posts belong, in which order, to a specific large thread), then that also doesn't work.
 

Offline Richard Crowley

  • Super Contributor
  • ***
  • Posts: 4317
  • Country: us
  • KJ7YLK
Re: Return of 502- Bad gateway Error
« Reply #244 on: July 30, 2017, 08:10:44 pm »
There is a miscellaneous, catch-all thread on GearSlutz with 5646 posts and 557465 views so far from March, 2008:  "A thread for asking the things you should know by now but don't"

https://www.gearslutz.com/board/so-much-gear-so-little-time/184297-thread-asking-things-you-should-know-now-but-dont.html
 

Offline WastelandTek

  • Frequent Contributor
  • **
  • Posts: 609
  • Country: 00
Re: Return of 502- Bad gateway Error
« Reply #245 on: July 30, 2017, 08:19:54 pm »
There is a miscellaneous, catch-all thread on GearSlutz with 5646 posts and 557465 views so far from March, 2008:  "A thread for asking the things you should know by now but don't"

https://www.gearslutz.com/board/so-much-gear-so-little-time/184297-thread-asking-things-you-should-know-now-but-dont.html

yeah, the [H]ardForum Perpetual Freebies Thread is running at 7400 posts and grows daily,

but the most outrageous one I know of is the bitcointalk.org "wall observer" thread currently at over 347,000 posts.
« Last Edit: July 30, 2017, 08:23:06 pm by WastelandTek »
I'm new here, but I tend to be pretty gregarious, so if I'm out of my lane please call me out.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #246 on: July 30, 2017, 09:08:24 pm »
Almost 24 hours since the last detected 502's :D... we might have this all sorted now :D. Anyone have any more 502's that a refresh won't fix, please report them here.

Got one about 30 mins ago. Not fixed by refresh. Had to wait a couple of minutes.
I just had a one off.  It was a different page.  Said SMF couldn't connect to the database.  Reloaded the page and it disappeared.  No waiting.  Number of database connections problem maybe?  I use the 'new posts' link and middle click to open new tabs as I go down the list.

Ok, that is great news! I can account for these ones now that the noise of the other 502s is gone: they are caused by the backup process. cPanel creates a tar and MySQL dump daily, then compresses it. Because cPanel can't know whether all the databases use only InnoDB tables, it locks tables to perform their backups. It also lacks the ability to perform incremental database backups, so each backup loads things down pretty heavily each day.

I will talk to Dave about using an alternative solution that my company can provide that won't cause this high load each day and will also save on a ton of bandwidth.
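
As background on the locking point: when every table in a database really is InnoDB, mysqldump's `--single-transaction` option dumps from a consistent snapshot instead of locking tables. A minimal sketch of choosing between the two modes follows. It is not what cPanel does internally, nor necessarily the alternative solution mentioned above; the helper names and the `forum_db` database name are hypothetical, and credentials are assumed to come from the usual MySQL option files.

```python
import subprocess

def build_dump_command(database, innodb_only):
    # Assemble the mysqldump argument list for one database.
    cmd = ["mysqldump"]
    if innodb_only:
        # Consistent snapshot without table locks (InnoDB only);
        # --quick streams rows instead of buffering whole tables.
        cmd += ["--single-transaction", "--quick"]
    else:
        # Non-transactional tables (e.g. MyISAM) need locks for a
        # consistent dump, which is what stalls the site during backups.
        cmd.append("--lock-tables")
    cmd.append(database)
    return cmd

def dump_to_file(database, innodb_only, path):
    # Run the dump and write the SQL to `path`; raises on failure.
    with open(path, "wb") as out:
        subprocess.run(build_dump_command(database, innodb_only), stdout=out, check=True)

# Hypothetical database name, shown only to illustrate the resulting command.
example = build_dump_command("forum_db", innodb_only=True)
```

This only removes the lock contention; a full dump is still a full dump, so the daily I/O and bandwidth cost of non-incremental backups remains.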
 

Offline eugenenine

  • Frequent Contributor
  • **
  • Posts: 865
  • Country: us
Re: Return of 502- Bad gateway Error
« Reply #247 on: July 30, 2017, 10:04:25 pm »
Clicking on the "Show unread posts since last visit." link causes a 502.
 

Offline gnif

  • Administrator
  • *****
  • Posts: 1690
  • Country: au
Re: Return of 502- Bad gateway Error
« Reply #248 on: July 30, 2017, 10:06:17 pm »
Clicking on the "Show unread posts since last visit." link causes a 502.

That's the 2nd report of that, please clear your cache or force refresh. 502's don't affect a single page, they are site wide.
 

Offline StillTrying

  • Super Contributor
  • ***
  • Posts: 2850
  • Country: se
  • Country: Broken Britain
Re: Return of 502- Bad gateway Error
« Reply #249 on: July 30, 2017, 10:37:22 pm »
502's don't affect a single page, they are site wide.

Really? There's often only one thread or part of the site, such as the projects index page, that we can't get to.

.  That took much longer than I thought it would.
 

