.HTACCESS Guide

.htaccess files are plain text documents that let you manage how your web server responds to requests. Although originally designed for file access control at the directory level, they also have a number of other uses.

This guide isn’t intended to be comprehensive documentation for .htaccess. It is meant to serve as a basic introduction and outline for newer users. Although .htaccess can be used for a number of things, that doesn’t mean it always should be.

What is .htaccess?

The .htaccess, or hypertext access file, was originally meant to let users control file access. Using it, you can password protect specific directories on your web hosting server. It is supported by some (but not all) web servers, most notably Apache.

In combination with .htpasswd files you can exercise a high degree of directory access control for multiple users. At the same time, it can also be used to handle redirects, ban specific IP addresses or IP ranges, or even work with custom error pages.

Locating Your .htaccess File

Make sure you enable your file manager or FTP client to display hidden files

Just because it is useful doesn’t mean that every web hosting plan comes with a .htaccess file. If you can’t locate yours, don’t panic; it might simply be hidden. Most of the time, the file is located in the root folder of your website.

When using your web hosting file manager, this will generally be www or public_html. If you’re running a few websites from the same account, you might have one in the main directory of each website.

Most files that start with a ‘.’ are hidden files. If you can’t see .htaccess in these locations, try enabling the ‘show hidden files’ option in the settings of your file manager or File Transfer Protocol (FTP) client.

Using .htaccess

For the purposes of this guide, we will be looking at .htaccess coding in the context of the Apache web server, since it is commonly used. Nginx does not make use of this file.

As mentioned, .htaccess is quite versatile and can be used to achieve a number of things. The first thing you need to do, though, is secure the file. Apache’s default configuration usually blocks requests for it already, but if yours does not, anyone will be able to view your .htaccess file.

Open the file and add the following code:

	# Refuse every web request for the .htaccess file itself
	<FilesMatch "^\.htaccess">
		Order allow,deny
		Deny from all
	</FilesMatch>

If you do this, anyone trying to view it will simply be shown an error message. Now that you’ve protected the file, let’s take a look at what else it can be used for.
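
On Apache 2.4 and later, the Order and Deny directives are kept only for backward compatibility. A minimal sketch of the same protection in the newer syntax, assuming your server runs Apache 2.4 with mod_authz_core available (it normally is), and widening the pattern so that .htpasswd is covered as well:

	# Assumes Apache 2.4+; blocks web access to any file whose name starts with .ht
	<FilesMatch "^\.ht">
		Require all denied
	</FilesMatch>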

1. Directory Access Control

To prevent unauthorized entry, .htaccess can work with another file called .htpasswd. The latter is where user names and their hashed passwords are stored; which areas each user may enter is decided by the .htaccess file in each directory. Unlike .htaccess, you only need one .htpasswd file.

To create the file and add a user:

	htpasswd -c /directory/.htpasswd jamesdean

Once you hit the enter key, you will be asked to provide the password for the username you just defined. The password is stored as a hash, not in the form you enter it. The -c flag creates the file, so leave it out when adding further users to an existing file.

By default all directories are open access. To restrict access to specific directories, you will need to place one .htaccess file in each directory you want to secure. The code in the file will specify various allowances or restrictions. For example:

	AuthUserFile /directory/.htpasswd
	AuthName "Restricted Directory"
	AuthType Basic
	Require user jamesdean

The code above allows access to the directory only for user jamesdean, whatever the request method. Avoid wrapping the Require line in <Limit GET POST>: that would apply the login requirement to GET and POST requests only and leave every other method unprotected.
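
If several people need access, listing each name becomes tedious. As a sketch, assuming every user stored in the same /directory/.htpasswd file should be let in, the require line can accept any valid login instead of a single name:

	AuthUserFile /directory/.htpasswd
	AuthName "Restricted Directory"
	AuthType Basic
	# Any user in the password file who supplies the correct password is admitted
	Require valid-user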

2. Redirection

This is one of the most common uses of the .htaccess file since it makes redirection very simple. You can choose to redirect anything from a single URL to an entire folder or even another domain:

Redirect URLs: 

	RedirectMatch 301 ^/old-page/?$ /new-page/

Redirect folders: 

	RewriteRule ^/?old_folder/(.*)$ /new_folder/$1 [R,L]

Redirect domains: 

	RewriteRule ^(.*)$ http://new_domain.com/$1 [L,R=301]

When using the RewriteRule lines, you need to ensure that the module that handles rewrites (mod_rewrite) is enabled. On most hosting plans it is. However, it is good practice to check for the module and switch the rewrite engine on together with the instructions. For a more complete example:

	<IfModule mod_rewrite.c>
		RewriteEngine On
		# RewriteRule belongs to mod_rewrite; RedirectMatch (mod_alias) does not need this wrapper
		RewriteRule ^old-page/?$ /new-page/ [R=301,L]
	</IfModule>
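
The domain redirect shown earlier matches every request, which can cause a redirect loop if the old and new domains are served from the same folder. A possible way around this, assuming the old site answers on old_domain.com (a placeholder name, like new_domain.com), is to add a condition on the requested host:

	<IfModule mod_rewrite.c>
		RewriteEngine On
		# Only redirect requests that arrived on the old domain (placeholder name)
		RewriteCond %{HTTP_HOST} ^(www\.)?old_domain\.com$ [NC]
		RewriteRule ^(.*)$ http://new_domain.com/$1 [R=301,L]
	</IfModule>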

3. Custom Error Handling

Making use of custom error pages can improve the visitor experience and may help your Search Engine Optimization (SEO). Instead of visitors bumping into a generic wall, you can use .htaccess to serve them custom pages depending on the error encountered.

You will need to create one custom page for each custom error you want to handle, then redirect those types of error – one per line in .htaccess.

	ErrorDocument 400 /bad_request.html
	ErrorDocument 401 /auth_required.html
	ErrorDocument 403 /forbidden.html
	ErrorDocument 404 /file_not_found.html
	ErrorDocument 500 /internal_error.html
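
If you don’t want to create a separate page for a rarely seen error, ErrorDocument also accepts a short message in quotes. A small sketch, where the wording of the message is only an example:

	# Sends this text as the response body instead of a custom page
	ErrorDocument 403 "Sorry, you are not allowed to view this page."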

4. Hotlink Prevention

When another site creates hotlinks to your images, they’re not only making use of your images, but your bandwidth as well. Even if you’re on a web hosting plan with unmetered bandwidth, it will occupy your server resources.

To prevent this from happening:

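	RewriteEngine on
	# Allow requests that send no referrer at all
	RewriteCond %{HTTP_REFERER} !^$
	# Allow requests coming from your own pages (http or https, with or without www)
	RewriteCond %{HTTP_REFERER} !^https?://(www\.)?your_domain\.com/ [NC]
	# Refuse everything else that asks for an image
	RewriteRule \.(gif|jpg)$ - [F]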

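If you want to shame them for trying to abuse your resources, replace the last line with one that redirects to an image telling people that the site owner is stealing resources from other sites. Host that image on a different domain so that the rule does not apply to it as well: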

	RewriteRule \.(gif|jpg)$ http://www.example.com/angryman.gif [R,L]
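
If the replacement image lives on the same site as the images you are protecting, the redirect rule would catch it too and loop. A sketch of one way to avoid that, assuming the image is stored at /angryman.gif in your own web root and that your_domain.com stands in for your real domain:

	RewriteEngine On
	RewriteCond %{HTTP_REFERER} !^$
	RewriteCond %{HTTP_REFERER} !^https?://(www\.)?your_domain\.com/ [NC]
	# Leave the replacement image itself alone so the redirect cannot loop
	RewriteCond %{REQUEST_URI} !/angryman\.gif$
	RewriteRule \.(gif|jpg)$ /angryman.gif [R,L]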

5. Block Bad Bots

The problem with bots is that not all are bad. For example, Google crawlers are also bots, but they serve an important purpose. Bad bots, however, often do unsavory things such as scrape data – while taking up your web hosting resources to do so.

Using the .htaccess file is one way of denying access to specific bots. You can do this either by IP address or by user agent, which is a sort of identification tag that a client sends with each request. IP blocking can be done with individual IPs or with an entire range:

	Deny from 123.123.123.123

OR

	Deny from 124.124.124.0/24
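
Deny from belongs to the older Apache 2.2 access control syntax. On Apache 2.4 the same blocks can be sketched with Require directives, assuming mod_authz_core and mod_authz_host are loaded (they normally are) and using the placeholder addresses from above, with the range written in CIDR form:

	<RequireAll>
		# Let everyone in...
		Require all granted
		# ...except these addresses (placeholders)
		Require not ip 123.123.123.123
		Require not ip 124.124.124.0/24
	</RequireAll>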

If you intend to block specific bots based on user agent:

	RewriteEngine On
	RewriteCond %{HTTP_USER_AGENT} WebReaper [OR]
	RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
	RewriteCond %{HTTP_USER_AGENT} ^AnyBot
	RewriteRule .* - [F,L]
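
When the list of bots grows, the conditions can be folded into a single pattern. The following is a sketch rather than a drop-in replacement: it matches the names anywhere in the user agent string and ignores case, which is slightly broader than the anchored rules above.

	RewriteEngine On
	# One condition with alternatives; [NC] makes the match case-insensitive
	RewriteCond %{HTTP_USER_AGENT} (WebReaper|VoidEYE|AnyBot) [NC]
	RewriteRule .* - [F,L]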

6. Enable Server Side Includes

Server Side Includes (SSI) allow you to call CGI scripts or even other HTML documents from within HTML content. This can be useful in a number of ways, for example keeping file sizes more manageable, or helping you produce more easily maintainable sites.

You will need to define each type of file you want to enable SSI for:

	# SSI directives are only processed where the Includes option is switched on
	Options +Includes
	AddHandler server-parsed .html
	AddHandler server-parsed .shtml

If you find that you’re not able to run CGI files outside the cgi-bin directory, then it will be necessary to enable that:

	AddHandler cgi-script .cgi
	Options +ExecCGI

Note: This may or may not work depending on policies your web host has in place for its servers. If you get an error from doing this, you will need to contact your support team to see if they can enable it for you.

Conclusion: Use .htaccess Sparingly

Given how powerful this file is, it can be difficult to resist simply adding a few extra lines of code to get things done. However, remember that the .htaccess file is not the main configuration file.

On every request, the web server looks for .htaccess files in the requested directory and in each directory above it, then reads and applies any it finds to override the main configuration settings. This takes time and resources, which places additional strain on web servers. Where possible, avoid excessive use of this file.
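
If you have access to the main server configuration, the same directives can usually be placed there and .htaccess lookups switched off. A sketch, assuming a site stored under /var/www/example (a hypothetical path) and a configuration file you are allowed to edit:

	<Directory "/var/www/example">
		# Stop Apache from looking for .htaccess files in this tree
		AllowOverride None
		# Directives that would otherwise sit in .htaccess go here instead
		ErrorDocument 404 /file_not_found.html
	</Directory>

Changes made this way only take effect after the server is reloaded.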

Frequently Asked Questions

Should I use .htaccess?

In general, the .htaccess file can offer a lot of convenience. However, this comes at a potentially high cost in server resources. Where possible, rely on the main server configuration rather than the .htaccess file.

Can I have multiple .htaccess files?

.htaccess files can technically be placed in each directory that you want configured. If you run multiple websites, each home directory can have its own file – along with one in every subdirectory beneath it.

How do I know if .htaccess is working?

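The simplest way to check that your .htaccess file is working is to visit the URL of the directory that you’ve placed it in. If the file contains a syntax error you will likely encounter a 500 Internal Server Error. If your rules seem to have no effect at all, the server may be configured to ignore .htaccess files.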

What is the rewrite rule in .htaccess?

mod_rewrite is an Apache module that allows you to rewrite URL requests. A rewrite rule takes an incoming request that matches a pattern and directs it towards the URL you have specified to take its place instead.

Author Profile

Timothy Shim is a writer, editor, and tech geek. Starting his career in the field of Information Technology, he rapidly found his way into print and has since worked with international, regional and domestic media titles including ComputerWorld, PC.com, Business Today, and The Asian Banker. His expertise lies in the field of technology from both consumer and enterprise points of view.