Added a setting to .htaccess to prohibit bot traffic from accessing the site.
Bot traffic
In a previous article, I wrote about receiving a large number of hits from bot traffic in a single day.
I didn’t take action right away, but the same site kept sending hits afterwards, so I added a setting to my .htaccess to block bot traffic.
I held off at first because I had already read that these bots keep changing their referrer address.
Blocking one specific referrer at a time seemed likely to turn into a cat-and-mouse game, so I left it alone for a while.
However, now that I have a rough idea of the pattern the referrers follow, I decided to exclude them with a regular expression in “.htaccess”.
How to check the referrer (referrer source)
I checked it with Google Analytics.
- Select Behavior > Site Content > All Pages from the menu.
- Narrow the date range to the period in question.
- Add “Source/Medium” as the Secondary dimension.
This showed that “bot-traffic.icu” was the referrer.
Main referrers
The referrers that I am aware of are as follows.
They all lead to the same site (I have chosen not to post a screenshot).
- bot-traffic.icu
- bottraffic999.xyz
- bottraffic143.xyz
Judging from these, they will probably keep changing the numeric part (999, 143) of the referrer.
They also seem to change the top-level domain (the icu and xyz parts).
In some cases a “-” (hyphen) is inserted between “bot” and “traffic”, and in other cases it is not.
Checking the referrers and adding each new one by hand would be tedious, so I decided to specify the forbidden referrers in “.htaccess” with a regular expression.
How to specify exclusion
The following description was added to the “.htaccess” file.
RewriteEngine on
RewriteCond %{HTTP_REFERER} bot(|-)traffic(|[0-9]{3})\.... [NC]
RewriteRule .* - [F,L]
The RewriteEngine on line was already there, so I only added the RewriteCond and RewriteRule lines.
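As a quick sanity check before deploying, the pattern can be tried against the referrers listed earlier. The following is only a sketch in Python: it assumes that Python's re module treats this particular pattern the same way as Apache's regex engine, it uses the pattern with the dot before the top-level domain escaped, and the "http://" prefixes and the google.com sample are my own illustrative assumptions, not values taken from the logs.

```python
import re

# Same pattern as in the RewriteCond; re.IGNORECASE stands in for [NC].
PATTERN = re.compile(r"bot(|-)traffic(|[0-9]{3})\....", re.IGNORECASE)


def is_blocked(referrer: str) -> bool:
    """Return True if the pattern matches anywhere in the referrer string."""
    return PATTERN.search(referrer) is not None


# Sample referrers: the three domains from the article plus one ordinary
# referrer (an assumption for illustration) that should not be blocked.
samples = [
    "http://bot-traffic.icu/",
    "http://bottraffic999.xyz/",
    "http://bottraffic143.xyz/",
    "https://www.google.com/",
]

for referrer in samples:
    print(referrer, is_blocked(referrer))
```

Like RewriteCond, re.search is unanchored, so the pattern may match anywhere inside the referrer string.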
Parameter | Meaning |
RewriteCond | Specify the match condition |
%{HTTP_REFERER} | Indicates a referrer |
(|-) | A regular expression matching either nothing or a “-” (hyphen). “|” (pipe) represents an or condition. “()” (parentheses) represents a group. I chose this form because both patterns, “bot-traffic” and “bottraffic”, occur. |
(|[0-9]{3}) | A regular expression matching either nothing or a three-digit number. [0-9]: a digit. {3}: repeated exactly three times. |
\. | “\” (backslash) is the escape character: the character that follows it is treated literally rather than as a regular expression metacharacter. An unescaped “.” (dot) means any single character in a regular expression. Here I want it to match only a literal dot (the one before the top-level domain), so I escape it. |
... | Three dots: a regular expression matching any three characters. I used this form because the top-level domain is icu in some cases and xyz in others. If the number of patterns does not increase in the future, (icu|xyz) may be acceptable. |
[NC] | Not case-sensitive |
RewriteRule | Specifying URL conversion rules |
.* | I want the rule to apply to every requested URL when the referrer matched the RewriteCond above, so I specify a pattern that matches any string (0 to n characters). You will also see this written as ^(.*)$, which basically means the same thing. |
- | Specify that no rewrite (URL conversion) should be done. Since this is bot traffic, there is no need to rewrite the URL, so “-” (hyphen) is specified. |
[F] | Return 403 Forbidden, which denies access to the bot traffic. |
[L] | Stop processing the rules that follow. I specify it so that rules I may add below this one in the future are not applied to bot traffic. [F] already implies [L], so specifying L is not strictly necessary, but I wrote it to make the intent explicit. |
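To see why the backslash matters and how the (icu|xyz) alternative differs, the variants can be compared side by side. This is a sketch under the same assumption as before (Python's re standing in for Apache's regex engine); the names UNESCAPED, ESCAPED, STRICT and compare are mine, and “bottrafficstats” and “bottraffic777.top” are hypothetical strings invented for illustration, not referrers I actually observed.

```python
import re

# Three variants of the referrer pattern, all case-insensitive like [NC].
UNESCAPED = re.compile(r"bot(|-)traffic(|[0-9]{3})....", re.IGNORECASE)       # dot not escaped
ESCAPED = re.compile(r"bot(|-)traffic(|[0-9]{3})\....", re.IGNORECASE)        # literal dot + any 3 chars
STRICT = re.compile(r"bot(|-)traffic(|[0-9]{3})\.(icu|xyz)", re.IGNORECASE)   # only icu or xyz


def compare(referrer: str) -> dict:
    """Report which of the three variants match the given referrer."""
    return {
        "unescaped": UNESCAPED.search(referrer) is not None,
        "escaped": ESCAPED.search(referrer) is not None,
        "strict": STRICT.search(referrer) is not None,
    }


# Hypothetical inputs (assumptions for illustration only).
for referrer in ["bot-traffic.icu", "bottraffic777.top", "bottrafficstats"]:
    print(referrer, compare(referrer))
```

Without the backslash, any four characters after “traffic” are accepted, so the rule is broader than intended. The (icu|xyz) form is the narrowest and would miss a new top-level domain, which is the trade-off noted in the table.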
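Restarting Apache
After modifying and saving the .htaccess, I restarted Apache with the following command. (Apache rereads .htaccess files on each request, so the change normally takes effect even without a restart.)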
sudo service apache2 restart
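One way to confirm that the rule works is to send a request with a matching Referer header and look at the status code. The following is a minimal sketch using only Python's standard library; the function names build_request and status_for are mine, and “https://example.com/” is a placeholder that you would replace with your own site.

```python
import urllib.error
import urllib.request


def build_request(url: str, referrer: str) -> urllib.request.Request:
    """Create a GET request that carries the given Referer header."""
    return urllib.request.Request(url, headers={"Referer": referrer})


def status_for(url: str, referrer: str) -> int:
    """Send the request and return the HTTP status code (e.g. 200 or 403)."""
    try:
        with urllib.request.urlopen(build_request(url, referrer), timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code


# Example usage (placeholder URL; uncomment and replace with your own site):
# print(status_for("https://example.com/", "http://bot-traffic.icu/"))
# print(status_for("https://example.com/", "https://www.google.com/"))
```

If the setting is active, the first call should report 403 and the second should report whatever the page normally returns.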
That concludes this article.
I hope this article will be useful to someone somewhere.