Last night my RSS feed was broken by the rate limiter. I have it set to a 15-minute poll interval, which does not sound like a lot, yet I get a "Until you rate limit your crawler, you can get lost!" response. First of all, that seems pretty rude. But then, what is the maximum rate?
I understand that this was probably introduced to combat AI crawlers or something like that, but for an RSS feed, this limit seems unreasonable.
Well, I switched it to hourly and it did not help. So it looks like the ban may be by IP?
Also, this probably belongs in "News/Suggestions/Help".
Yeah, I think it is an IP ban. I switched to a 4-hour update interval, but still no response. I tried downloading a lot from my home PC, and there does not seem to be a limit there. The requests from my reader come from an Amazon IP range (self-hosted RSS reader), so I guess it is an IP ban.
This is not great. RSS is the way I keep track of the posts.
Just in case anyone experiences the same issue (unlikely, I know), passing the feed through FeedBurner solves it, since now it is a Google IP that makes the request.
We were seeing 20k hits per IP every hour, coming from AmazonBot. As Amazon refuses to rate limit its crawlers, we had to block them.
See:
https://developer.amazon.com/amazonbot
It seems we need to be more specific about the IPs we block here, since people may also be using a feed reader that runs on Amazon. I will look into this, as this was not the intention.
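For anyone curious what "more specific" could look like, here is a rough Python sketch, not the site's actual configuration. The idea is to block only the narrow networks the crawler uses instead of a whole hosting range. The addresses, the `BROAD_BLOCK` / `NARROW_BLOCK` lists, and the `is_blocked` helper are all made up for illustration (the ranges are RFC 5737 documentation addresses, not real Amazon ones).

```python
import ipaddress

# Hypothetical values for illustration only (RFC 5737 documentation range),
# not real Amazon or AmazonBot addresses.
BROAD_BLOCK = [ipaddress.ip_network("198.51.100.0/24")]   # whole provider range
NARROW_BLOCK = [ipaddress.ip_network("198.51.100.0/28")]  # crawler subnet only


def is_blocked(client_ip, blocked_networks):
    """Return True if client_ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked_networks)


crawler_ip = "198.51.100.5"        # pretend crawler address
feed_reader_ip = "198.51.100.200"  # pretend self-hosted RSS reader, same provider

# The broad rule catches both clients; the narrow rule only catches the crawler.
print(is_blocked(feed_reader_ip, BROAD_BLOCK))
print(is_blocked(feed_reader_ip, NARROW_BLOCK))
print(is_blocked(crawler_ip, NARROW_BLOCK))
```

With the broad list, the self-hosted reader is caught along with the crawler; with the narrow list, only the crawler address matches. Whether the crawler's addresses can actually be separated this cleanly is an open question here.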