Blog

Okay, at times, UserAgents can hit annoyingly apache, so what a better way to stop them from transferring the whole site for a pointless reason

So yeah, what I’ve done is .htaccess and block them, in my case CURL and Pingdom… why?…

Okay, CURL can be used to do automated actions on your website through POST and GET… and that’s for us a security hole, and even more, a pain in the bum… so what we’ve done is block it, so they receive a 403 Forbidden page.

And you can get it to do many things… instead of load a fast webpage… as redirect and many other things.

Apart from this, we’ve been tinkering around with parsing apache logs to find recurrent automated scripts that load us up and block them for some time, depending on what they are doing with the website; so far, so good. And use FIREWALL rules to do so to minimize the impact to the minimum.

Anyways how its done.

Create a .htaccess file in the main folder of your site [In our case Apache].

#
# Apache/PHP/Drupal settings:
#
 
# Protect files and directories from prying eyes.
<filesmatch>
  Order allow,deny
</filesmatch>
 
# Don't show directory listings for URLs which map to a directory.
Options -Indexes
 
# Follow symbolic links in this directory.
Options +FollowSymLinks
 
# Make Drupal handle any 404 errors.
ErrorDocument 404 /index.php
 
# Set the default handler.
DirectoryIndex index.php index.html index.htm
 
# Override PHP settings that cannot be changed at runtime. See
# sites/default/default.settings.php and drupal_environment_initialize() in
# includes/bootstrap.inc for settings that can be changed at runtime.
 
# PHP 5, Apache 1 and 2.
<ifmodule mod_php5.c="">
  php_flag magic_quotes_gpc                 off
  php_flag magic_quotes_sybase              off
  php_flag register_globals                 off
  php_flag session.auto_start               off
  php_value mbstring.http_input             pass
  php_value mbstring.http_output            pass
  php_flag mbstring.encoding_translation    off
</ifmodule>
 
# Requires mod_expires to be enabled.
<ifmodule mod_expires.c="">
  # Enable expirations.
  ExpiresActive On
 
  # Cache all files for 2 weeks after access (A).
  ExpiresDefault A1209600
 
  <filesmatch>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </filesmatch></ifmodule>
 
# Various rewrite rules.
<ifmodule mod_rewrite.c="">
  RewriteEngine on
 
  # Set "protossl" to "s" if we were accessed via https://.  This is used later
  # if you enable "www." stripping or enforcement, in order to ensure that
  # you don't bounce between http and https.
  RewriteRule ^ - [E=protossl]
  RewriteCond %{HTTPS} on
  RewriteRule ^ - [E=protossl:s]
 
  # Make sure Authorization HTTP header is available to PHP
  # even when running as CGI or FastCGI.
  RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
 
  # Block access to "hidden" directories whose names begin with a period. This
  # includes directories used by version control systems such as Subversion or
  # Git to store control files. Files whose names begin with a period, as well
  # as the control files used by CVS, are protected by the FilesMatch directive
  # above.
  #
  # NOTE: This only works when mod_rewrite is loaded. Without mod_rewrite, it is
  # not possible to block access to entire directories from .htaccess, because
  # <directorymatch> is not allowed here.
  #
  # If you do not have mod_rewrite installed, you should remove these
  # directories from your webroot or otherwise protect them from being
  # downloaded.
  RewriteRule "(^|/)\." - [F]
 
  #Block CURL as getting too much load from curl
  BrowserMatchNoCase curl dndurl
  BrowserMatchNoCase Pingdom dndurl
  Order Deny,Allow
  Deny from env=dndurl
 
  # If your site can be accessed both with and without the 'www.' prefix, you
  # can use one of the following settings to redirect users to your preferred
  # URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
  #
  # To redirect all users to access the site WITH the 'www.' prefix,
  # (https://example.com/... will be redirected to https://www.example.com/...)
  # uncomment the following:
   RewriteCond %{HTTP_HOST} .
   RewriteCond %{HTTP_HOST} !^www\. [NC]
   RewriteRule ^ http%{ENV:protossl}://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
  #
  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (https://www.example.com/... will be redirected to https://example.com/...)
  # uncomment the following:
  # RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
  # RewriteRule ^ http%{ENV:protossl}://%1%{REQUEST_URI} [L,R=301]
 
  # Modify the RewriteBase if you are using Drupal in a subdirectory or in a
  # VirtualDocumentRoot and the rewrite rules are not working properly.
  # For example if your site is at https://example.com/drupal uncomment and
  # modify the following line:
  # RewriteBase /drupal
  #
  # If your site is running in a VirtualDocumentRoot at https://example.com/,
  # uncomment the following line:
  # RewriteBase /
  ### BOOST START ###
 
  # Allow for alt paths to be set via htaccess rules; allows for cached variants (future mobile support)
  RewriteRule .* - [E=boostpath:normal]
 
  # Caching for anonymous users
  # Skip boost IF not get request OR uri has wrong dir OR cookie is set OR request came from this server OR https request
  RewriteCond %{REQUEST_METHOD} !^(GET|HEAD)$ [OR]
  RewriteCond %{REQUEST_URI} (^/(admin|cache|misc|modules|sites|system|openid|themes|node/add|comment/reply))|(/(edit|user|user/(login|password|register))$) [OR]
  RewriteCond %{HTTPS} on [OR]
  RewriteCond %{HTTP_COOKIE} DRUPAL_UID [OR]
  RewriteCond %{ENV:REDIRECT_STATUS} 200
  RewriteRule .* - [S=7]
 
  # GZIP
  RewriteCond %{HTTP:Accept-encoding} !gzip
  RewriteRule .* - [S=3]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html,E=no-gzip:1]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.xml -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.xml [L,T=text/xml,E=no-gzip:1]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.json -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.json [L,T=text/javascript,E=no-gzip:1]
 
  # NORMAL
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.xml -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.xml [L,T=text/xml]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.json -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.json [L,T=text/javascript]
 
  ### BOOST END ###
 
  # Pass all requests not referring directly to files in the filesystem to
  # index.php. Clean URLs are handled in drupal_environment_initialize().
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^ index.php [L]
 
  # Rules to correctly serve gzip compressed CSS and JS files.
  # Requires both mod_rewrite and mod_headers to be enabled.
  <ifmodule mod_headers.c="">
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]
 
    # Serve gzip compressed JS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.js $1\.js\.gz [QSA]
 
    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]
 
    <filesmatch>
      # Serve correct encoding type.
      Header set Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </filesmatch></ifmodule></directorymatch></ifmodule>
ddemuro
administrator

Sr. Software Engineer with over 10 years of experience. Hobbist photographer and mechanic. Tinkering soul in an endeavor to better understand this world. Love traveling, drinking coffee, and investments.

You may also like

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: