Author Topic: Planned Downtime @ 2018-04-21 00:00:00 GMT.  (Read 19469 times)

Offline Monkeh

  • Super Contributor
  • ***
  • Posts: 7990
  • Country: gb
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #50 on: April 23, 2018, 01:52:31 am »
gnif, no DNS records for web1. and web2.eevblog.com - makes my mail server a tad unhappy.
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #51 on: April 23, 2018, 02:11:23 am »
gnif, no DNS records for web1. and web2.eevblog.com - makes my mail server a tad unhappy.

Your mail server should not be seeing web1.eevblog.com & web2.eevblog.com, they are relaying through cpanel1.eevblog.com which is what you should be seeing. I will double check the logs to confirm however.
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #52 on: April 23, 2018, 02:14:29 am »
gnif, no DNS records for web1. and web2.eevblog.com - makes my mail server a tad unhappy.

Your mail server should not be seeing web1.eevblog.com & web2.eevblog.com, they are relaying through cpanel1.eevblog.com which is what you should be seeing. I will double check the logs to confirm however.

Checked, and I see: you are rejecting based on the existence of the sending domain. Rather than add records for these hosts (I'd rather limit their IP exposure), I will adjust the mail server to use plain 'eevblog.com' as the sending domain.
 

Offline Monkeh

  • Super Contributor
  • ***
  • Posts: 7990
  • Country: gb
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #53 on: April 23, 2018, 02:18:01 am »
gnif, no DNS records for web1. and web2.eevblog.com - makes my mail server a tad unhappy.

Your mail server should not be seeing web1.eevblog.com & web2.eevblog.com, they are relaying through cpanel1.eevblog.com which is what you should be seeing. I will double check the logs to confirm however.

Checked, and I see: you are rejecting based on the existence of the sending domain. Rather than add records for these hosts (I'd rather limit their IP exposure), I will adjust the mail server to use plain 'eevblog.com' as the sending domain.

Correct. Every little helps. RDNS also appears to need setting up, having looked at full logs.
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #54 on: April 23, 2018, 02:25:55 am »
gnif, no DNS records for web1. and web2.eevblog.com - makes my mail server a tad unhappy.

Your mail server should not be seeing web1.eevblog.com & web2.eevblog.com, they are relaying through cpanel1.eevblog.com which is what you should be seeing. I will double check the logs to confirm however.

Checked, and I see: you are rejecting based on the existence of the sending domain. Rather than add records for these hosts (I'd rather limit their IP exposure), I will adjust the mail server to use plain 'eevblog.com' as the sending domain.

Correct. Every little helps. RDNS also appears to need setting up, having looked at full logs.

This has been requested already, I need to chase the DC to find out why this has not happened yet.
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #55 on: April 23, 2018, 07:08:53 am »
rDNS records are now in place, please allow up to 24 hours for caches to clear :D
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #56 on: April 23, 2018, 01:10:38 pm »
You might get problems as the forward and reverse DNS don't agree for your named MX host:


nuit$ dig +short eevblog.com mx
0 mail.eevblog.com.
nuit$ dig +short mail.eevblog.com
192.200.109.226
nuit$ dig +short -x 192.200.109.226
cpanel1.eevblog.com.
nuit$


Some people will reject mail on that basis, some won't.
Anybody got a syringe I can use to squeeze the magic smoke back into this?
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #57 on: April 23, 2018, 01:23:43 pm »
You might get problems as the forward and reverse DNS don't agree for your named MX host:


nuit$ dig +short eevblog.com mx
0 mail.eevblog.com.
nuit$ dig +short mail.eevblog.com
192.200.109.226
nuit$ dig +short -x 192.200.109.226
cpanel1.eevblog.com.
nuit$


Some people will reject mail on that basis, some won't.

That's not how rDNS filtering works. It checks that the reverse DNS name of the sending IP resolves forward to that same IP; it doesn't care about MX records, otherwise services like Gmail and Office 365 would be plagued with the same problem.

Code:
# host 192.200.109.226
226.109.200.192.in-addr.arpa domain name pointer cpanel1.eevblog.com.
# host cpanel1.eevblog.com.
cpanel1.eevblog.com has address 192.200.109.226

Here is the same check for Gmail:

Code:
# nslookup
> set type=mx
> gmail.com
Server: redacted
Address: redacted#53

Non-authoritative answer:
gmail.com mail exchanger = 10 alt1.gmail-smtp-in.l.google.com.
gmail.com mail exchanger = 30 alt3.gmail-smtp-in.l.google.com.
gmail.com mail exchanger = 40 alt4.gmail-smtp-in.l.google.com.
gmail.com mail exchanger = 5 gmail-smtp-in.l.google.com.
gmail.com mail exchanger = 20 alt2.gmail-smtp-in.l.google.com.

Code:
> set type=a
> alt1.gmail-smtp-in.l.google.com
Server: redacted
Address: redacted#53

Non-authoritative answer:
Name: alt1.gmail-smtp-in.l.google.com
Address: 64.233.179.27

Code:
# host 64.233.179.27
27.179.233.64.in-addr.arpa domain name pointer om-in-f27.1e100.net.
# host om-in-f27.1e100.net.
om-in-f27.1e100.net has address 64.233.179.27
om-in-f27.1e100.net has address 66.102.12.27
om-in-f27.1e100.net has address 216.239.32.27

All that matters is that the SMTP server's rDNS name resolves forward to the same IP address.
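
For anyone who wants to see that check spelled out, here is a minimal Python sketch of a forward-confirmed reverse DNS test. It is not any particular mail server's implementation. The function name `fcrdns_ok` and the dict-backed lookup tables are made up for illustration; the table contents are simply the answers from the host output in this post, so the example runs without touching the network.

```python
import socket
from typing import Callable, List


def reverse_lookup(ip: str) -> str:
    """PTR lookup through the system resolver (needs network access)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """IPv4 address lookup through the system resolver (needs network access)."""
    return socket.gethostbyname_ex(hostname)[2]


def fcrdns_ok(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Return True if the PTR name for `ip` resolves forward to `ip` again.

    MX records are never consulted.
    """
    try:
        name = reverse(ip)
        addresses = forward(name)
    except (OSError, LookupError):
        # OSError covers socket.herror/gaierror from the real resolver;
        # LookupError covers the dict-backed stand-ins below.
        return False
    return ip in addresses


# Illustrative data only: answers copied from the host output in this thread.
PTR = {
    "192.200.109.226": "cpanel1.eevblog.com",
    "64.233.179.27": "om-in-f27.1e100.net",
}
A = {
    "cpanel1.eevblog.com": ["192.200.109.226"],
    "om-in-f27.1e100.net": ["64.233.179.27", "66.102.12.27", "216.239.32.27"],
}

print(fcrdns_ok("192.200.109.226", PTR.__getitem__, A.__getitem__))
print(fcrdns_ok("64.233.179.27", PTR.__getitem__, A.__getitem__))
```

Both calls print True, and the name in the MX record (mail.eevblog.com) never enters into it. Calling `fcrdns_ok(ip)` with no extra arguments would use the live resolver instead of the tables.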
« Last Edit: April 23, 2018, 01:27:06 pm by gnif »
 

Offline geekGee

  • Supporter
  • ****
  • Posts: 49
  • Country: bm
  • IT Veteran, EE Newbie
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #58 on: April 23, 2018, 01:34:11 pm »
It could impact an SPF record check but you've already mitigated that.

eevblog.com.    299     IN      TXT     "v=spf1 +a +mx ~all"
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #59 on: April 23, 2018, 02:34:49 pm »
It could impact an SPF record check but you've already mitigated that.

eevblog.com.    299     IN      TXT     "v=spf1 +a +mx ~all"

That doesn't apply here either, since the web servers relay via the MX server. Technically "a" should not be in the SPF record: the website is proxied via CloudFlare, so the "a" is giving CloudFlare permission to send email from the domain.
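
To make the point about "a" concrete, here is a rough Python sketch that splits the record into its mechanisms and shows what it would look like with "a" dropped. `parse_spf` and `drop_mechanism` are hypothetical helper names, the parsing is a simplified reading of the SPF syntax (modifiers such as redirect= are skipped and macros are ignored), and the trimmed record is only an illustration, not a change that has been made.

```python
from typing import List, Tuple

# Meaning of the optional qualifier in front of each SPF mechanism.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record: str) -> List[Tuple[str, str]]:
    """Split an SPF record into (mechanism, result) pairs."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if "=" in term:
            # Modifiers (redirect=, exp=) are outside this sketch.
            continue
        qualifier = "+"  # a bare mechanism defaults to "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        parsed.append((term, QUALIFIERS[qualifier]))
    return parsed


def drop_mechanism(record: str, name: str) -> str:
    """Return the record with every occurrence of mechanism `name` removed."""
    kept = []
    for term in record.split():
        # Strip the qualifier and any ":domain" or "/cidr" suffix to get the bare name.
        bare = term.lstrip("+-~?").split(":")[0].split("/")[0].lower()
        if bare != name.lower():
            kept.append(term)
    return " ".join(kept)


record = "v=spf1 +a +mx ~all"
print(parse_spf(record))
print(drop_mechanism(record, "a"))
```

The first line printed is [('a', 'pass'), ('mx', 'pass'), ('all', 'softfail')]: hosts matching the domain's A record and its MX hosts pass, everything else soft-fails. The second is v=spf1 +mx ~all, which leaves only the MX host authorised.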
« Last Edit: April 23, 2018, 02:38:16 pm by gnif »
 

Offline geekGee

  • Supporter
  • ****
  • Posts: 49
  • Country: bm
  • IT Veteran, EE Newbie
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #60 on: April 23, 2018, 03:05:11 pm »
It could impact an SPF record check but you've already mitigated that.

eevblog.com.    299     IN      TXT     "v=spf1 +a +mx ~all"

That doesn't apply here either, the web servers relay via the mx server. Technically "a" should not be in the SPF record since the website is proxied via CloudFlare, and the "a" is giving CloudFlare permission to send email from the domain.

Ah... now I see.  Entries cpanel1 and mail resolve to the same IP address.
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #61 on: April 24, 2018, 12:40:10 am »
Outage just now was caused by SMF attempting to "Optimize" the tables, I will need to adjust SMF to prevent this behavior.

Edit: SMF patched, this should not reoccur.
« Last Edit: April 24, 2018, 12:51:32 am by gnif »
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 37661
  • Country: au
    • EEVblog
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #62 on: April 24, 2018, 01:29:37 am »
Outage just now was caused by SMF attempting to "Optimize" the tables, I will need to adjust SMF to prevent this behavior.
Edit: SMF patched, this should not reoccur.

Thanks.
I occasionally "optimise" the tables manually in the admin section option just to keep things tidy.
Can/should I still do this?
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #63 on: April 24, 2018, 04:42:22 am »
Outage just now was caused by SMF attempting to "Optimize" the tables, I will need to adjust SMF to prevent this behavior.
Edit: SMF patched, this should not reoccur.

Thanks.
I occasionally "optimise" the tables manually in the admin section option just to keep things tidy.
Can/should I still do this?

Previously this was fine, but part of the move meant changing to a different storage engine and data format, and with these the optimize query literally rewrites every table in the database. The new format and the highly optimized cluster configuration, together with having enough RAM to hold the entire data set in memory, make this an unnecessary step.
 

Offline Rerouter

  • Super Contributor
  • ***
  • Posts: 4694
  • Country: au
  • Question Everything... Except This Statement
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #64 on: April 24, 2018, 10:15:32 am »
As server juggling is going on, I felt I should point out that there have been a few times tonight where the eevblog server failed to return a page after I posted a reply. Refreshing fixes it.

Edit: The exact response is "Server gave an empty response", and it happened twice while I was trying to post this message.
« Last Edit: April 24, 2018, 10:17:57 am by Rerouter »
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #65 on: April 25, 2018, 12:56:11 am »
Update: The Wiki is back ;)
 

Offline BU508A

  • Super Contributor
  • ***
  • Posts: 4522
  • Country: de
  • Per aspera ad astra
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #66 on: April 27, 2018, 08:55:58 am »
Ok, I will have another dig though, it does seem very odd that it is intermittent as everything is in sync.

Certainly not overloaded, perhaps there is a plugin still trying to use the old database configuration.

gnif, whatever you did, it seems that since yesterday the database error did not show up again on my iPad.
Thank you very much for resolving this.  :)  :-+

Andreas
“Chaos is found in greatest abundance wherever order is being sought. It always defeats order, because it is better organized.”            - Terry Pratchett -
 

Offline hwj-d

  • Frequent Contributor
  • **
  • Posts: 676
  • Country: de
  • save the children - chase the cabal
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #67 on: April 27, 2018, 11:41:36 pm »
For me too, this annoying database error is gone now.
Thanks too.  :-+

But, what was it? I would be interested... 
« Last Edit: April 27, 2018, 11:45:45 pm by hwj-d »
 

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #68 on: April 28, 2018, 05:23:32 am »
But, what was it? I would be interested...

It was extremely simple: CloudFlare was caching the error pages from the server migration and would not flush them. After discussion with Dave, CF has been disabled and will only be re-enabled if we need it to help mitigate an attack.
 
The following users thanked this post: hwj-d

Offline gnif Topic starter

  • Administrator
  • *****
  • Posts: 1672
  • Country: au
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #69 on: April 28, 2018, 06:38:17 am »
But, what was it? I would be interested...

It was extremely simple: CloudFlare was caching the error pages from the server migration and would not flush them. After discussion with Dave, CF has been disabled and will only be re-enabled if we need it to help mitigate an attack.

When you say would not flush them do you mean they refused a request or the command you can issue for your site didn't work? What was the reason CF was enabled in the first place? Does the reason still exist with the new server setup?

CloudFlare would flush their cache (via the portal), but not globally; some of their nodes were still (even today) serving cached copies of the database error pages that occurred during the upgrade. CF was enabled to try to reduce load on the single server that Dave had early on.

I just posted on Patreon (no need to be a patron to read my posts) a write up on this entire move, the server configuration basics, etc. if anyone is interested.
https://www.patreon.com/posts/18456501
 
The following users thanked this post: hwj-d

Offline frozenfrogz

  • Frequent Contributor
  • **
  • Posts: 936
  • Country: de
  • Having fun with Arduino and Raspberry Pi
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #70 on: April 28, 2018, 10:52:41 am »
Thanks for the write up!
I do not understand half of the technical intricacies regarding web servers and network management, but it was a nice read nonetheless.
He’s like a trained ape. Without the training.
 
The following users thanked this post: gnif

Offline hwj-d

  • Frequent Contributor
  • **
  • Posts: 676
  • Country: de
  • save the children - chase the cabal
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #71 on: April 29, 2018, 12:15:45 am »
Quote
I just posted on Patreon (no need to be a patron to read my posts) a write up on this entire move, the server configuration basics, etc. if anyone is interested.
https://www.patreon.com/posts/18456501
Wow, what a story. Thanks for writing and sharing it all.   :-+
And now, I think, I want to learn what Puppet is and what it does ...

Thanks again.
 

Offline jc101

  • Frequent Contributor
  • **
  • Posts: 613
  • Country: gb
Re: Planned Downtime @ 2018-04-21 00:00:00 GMT.
« Reply #72 on: April 30, 2018, 05:37:56 pm »
Since the update when I go to look in...

https://www.eevblog.com/forum/dodgy-technology/

I just get a blank page.  However if from the forum I click the icon next to the forum, ( https://www.eevblog.com/forum/dodgy-technology/?action=unread;children ) to get just the new posts, I do get to see the list.

I get the same behaviour on Safari, Chrome, and Firefox. Using http or https.

Is this a hangover from the server change?
 

