For quite some time I have wondered why Apple’s News Bot behaves so aggressively on WDRL’s site. It sends about 20,000 to 40,000 requests to my server each day. This week I finally found the reason thanks to this article. AppleNewsBot is incompatible with Let’s Encrypt certificates and gets more aggressive when it detects that it can’t fetch the data.
Apparently, this issue was reported to Apple months ago, but so far the “botnet” (I think it’s valid to call it that in this case) still behaves the same. So to reduce my server load, I wanted a solution that doesn’t require putting CloudFront or a similar service in front of the site.
On an Apache server it’s as easy as this:
# Block Apple News Bot
RewriteCond %{HTTP_USER_AGENT} AppleNewsBot [NC]
# Note: in a .htaccess file, ^$ matches only the empty path (the directory root)
RewriteRule ^$ /status-429.php [L]
And this is the content of the PHP file:
<?php
header('HTTP/1.1 429 Too Many Requests', false, 429);
echo "Too Many Requests\n\nDear AppleNewsBot, please fix your Let's Encrypt support bug: https://www.slightfuture.com/webdev/excessive-applenewsbot-requests";
Now you might wonder why I used a PHP file here instead of returning a 429 HTTP status code directly from Apache.
The reason is quite simple: my shared hoster runs the stable releases of Red Hat Linux, which ship only a 2.2 version of Apache. This old version (it still gets bug fixes but no new features) simply does not know about 429 yet, since that status code was introduced as late as 2012.
So if you’re on nginx or Apache 2.4, you can return the 429 code directly from there. If you’re stuck on Apache 2.2, this is the solution you were looking for.
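For the Apache 2.4 case, the direct variant might look like the sketch below. It assumes mod_rewrite is enabled and that the rules live in the site's .htaccess file. Note that the pattern `^` used here matches every request path, which is an assumption about what you want to block and differs from the `^$` rule above.

```apache
# Block Apple News Bot (sketch for Apache 2.4, assumes mod_rewrite is available)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} AppleNewsBot [NC]
# "-" means no substitution; R=429 makes Apache answer with status 429
RewriteRule ^ - [R=429,L]
```

Because the status code is outside the 3xx redirect range, Apache drops the substitution and stops rewriting, so no helper PHP file is needed in this setup.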