The default Apache access log includes many useful details about each request to your website. Here is an example of what a log entry looks like:
79.28.43.25 - - [25/Jan/2009:13:18:02 +0000] "GET /blog/2007/01/internet-explorer-7-in-italiano/ HTTP/1.1" 200 14487 "http://www.google.it/search?hl=it&q=aggiornamento+internet+explorer+&btnG=Cerca+con+Google&meta=&aq=f&oq=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
You can easily identify the client IP address, the request timestamp, the landing page and the referral, which in this example is a Google search results page.
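To give an idea of what pulling those fields out of the default log involves, here is a minimal Python sketch. It assumes the log uses exactly the format of the entry above; the pattern and the parse_combined helper are my own illustration, not part of Apache, and the pattern does not handle escaped quotes inside the referer or user agent.

```python
import re

# Pattern for the log format shown above (assumption: the server writes
# this default format unmodified).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# The log entry quoted above, split over several string literals for readability.
line = (
    '79.28.43.25 - - [25/Jan/2009:13:18:02 +0000] '
    '"GET /blog/2007/01/internet-explorer-7-in-italiano/ HTTP/1.1" 200 14487 '
    '"http://www.google.it/search?hl=it&q=aggiornamento+internet+explorer+'
    '&btnG=Cerca+con+Google&meta=&aq=f&oq=" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"'
)

entry = parse_combined(line)
print(entry["referer"])  # the referral
print(entry["path"])     # the landing page
```

Even this small example needs a fairly dense regular expression, which is why a dedicated log with only the two fields is easier to work with.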
Creating a custom referer log file
As a marketer or SEO, you will find the referral and the landing page really useful. Extracting them from the default Apache log file can be a little tricky and requires some parsing knowledge. For this reason, you may find it more convenient to write a custom log file that includes only those two details.
Let me show you how. You don't need to know much about Apache server management, but you must have access to your virtual host configuration because the CustomLog and LogFormat directives can't be specified in the .htaccess file, only at server config or virtual host level.
Write the following lines either in your Apache configuration file or in your virtual host definition, depending on whether you want to create a referer log for all configured websites or just for a single virtual host.
In order to monitor incoming links, you need to define a custom log format using the LogFormat directive and give it a useful name, for example referer.
LogFormat "%{Referer}i %U" referer
Then ask Apache to generate a new log that uses the custom format.
CustomLog /path/to/folder/referer.log referer
You can specify as many CustomLog directives as you want; logs that are already configured will not be affected. In this case Apache will write two log entries for each request: one in the default format, and one containing only the referral string and the landing page.
Here's an example of a typical virtual host configuration.
<VirtualHost *:80>
ServerName example.com
ServerAlias www.example.com
DocumentRoot /var/www/example.com/public
# many other directives ...
LogFormat "%{Referer}i %U" referer
CustomLog /var/www/example.com/logs/referer.log referer
</VirtualHost>
For each request to example.com, Apache will write an entry in the referer.log file containing the referer string followed by the landing page.
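A log in this two-field format is straightforward to process. The following Python sketch splits each entry and counts requests per referring host. The function names and the sample lines are illustrative (the sample values are borrowed from the examples in this post), and the code assumes the referer itself contains no literal space. A request without a Referer header is logged by Apache as "-".

```python
from collections import Counter
from urllib.parse import urlsplit


def split_entry(line):
    """Split one '<referer> <path>' log line into (referer, path)."""
    # Split on the first space only: the referer is assumed to contain none.
    referer, _, path = line.rstrip("\n").partition(" ")
    return referer, path


def referer_hosts(lines):
    """Count log entries per referring host, skipping empty referers."""
    counts = Counter()
    for line in lines:
        referer, _path = split_entry(line)
        if referer == "-":  # no Referer header was sent
            continue
        counts[urlsplit(referer).hostname or referer] += 1
    return counts


# Illustrative entries in the "%{Referer}i %U" format.
sample = [
    "http://www.google.it/search?hl=it&q=aggiornamento+internet+explorer+ "
    "/blog/2007/01/internet-explorer-7-in-italiano/",
    "http://www.google.com/search?q=keyword /page.html",
    "- /page.html",
]

for host, hits in referer_hosts(sample).items():
    print(host, hits)
```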
In Apache 2.x the LogFormat name referer appears to be reserved for the format "%{Referer}i -> %U", likely because the default configuration file already defines it. You should use a different name to prevent conflicts.
Combining LogFormat and CustomLog in a single line
If you don't need a reusable LogFormat and don't care to give the format a name, you can create a custom log in one step.
CustomLog /var/www/example.com/logs/referer.log "%{Referer}i %U"
The line above is equivalent to the following two.
LogFormat "%{Referer}i %U" myformat
CustomLog /var/www/example.com/logs/referer.log myformat
Writing a CSV log file
You can customize the referer log by placing as many "%" directives as you wish in your log format. For example, the following format writes a CSV log file.
LogFormat "\"%{Referer}i\",\"%U\"" referer
Here's an example of the resulting log.
"http://www.google.com/search?q=keyword","/page.html"
"http://www.google.com/search?q=keyword","/new-page.html"
Log entries can be easily parsed, or opened with CSV-compatible software such as OpenOffice or Excel.
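As a rough illustration of the parsing side, this Python sketch reads such a log with the standard csv module. The read_referer_csv helper and the in-memory sample are assumptions made for the example; in practice you would pass an open file handle for your own referer.log. It also assumes the logged values contain no escaped quotes, which the default csv dialect would not decode.

```python
import csv
import io


def read_referer_csv(handle):
    """Yield (referer, landing_page) tuples from a CSV referer log."""
    for row in csv.reader(handle):
        if len(row) == 2:  # skip blank or malformed lines
            yield row[0], row[1]


# In-memory stand-in for the log file, in the shape the CSV format produces.
sample_log = io.StringIO(
    '"http://www.google.com/search?q=keyword","/page.html"\n'
    '"http://www.google.com/search?q=keyword","/new-page.html"\n'
)

rows = list(read_referer_csv(sample_log))
for referer, page in rows:
    print(page, "<-", referer)
```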
Logging external referers
Logging all referrals is expensive and not very useful for marketing analysis. It is probably a good idea to restrict the log to external referers only. Environment variables are what we need to do this.
SetEnvIfNoCase Referer (www\.)?example\.com INTERNAL_REFERRAL
LogFormat "\"%{Referer}i\",\"%U\"" referer
CustomLog /var/www/example.com/logs/referer.log referer env=!INTERNAL_REFERRAL
First, we set an environment variable called INTERNAL_REFERRAL if the request comes with a referer string matching the current website's domain. Then we define the LogFormat as usual and enable the CustomLog only if the environment variable is not set, that is, only if the request comes from an external referral.
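If you want to check which referers this rule would filter out before touching the server, the decision can be approximated in Python. This is only a sketch: should_log and is_internal are hypothetical helpers, and Python's re module stands in for Apache's regex engine, which should behave the same for a pattern this simple.

```python
import re

# Same pattern as the SetEnvIfNoCase line; re.IGNORECASE mirrors "NoCase".
INTERNAL = re.compile(r"(www\.)?example\.com", re.IGNORECASE)


def is_internal(referer):
    """True when the referer would set INTERNAL_REFERRAL."""
    # The pattern is searched anywhere in the value, not anchored to its start.
    # A missing header is treated here as an empty string (assumption).
    return INTERNAL.search(referer or "") is not None


def should_log(referer):
    """True when the request would be written to referer.log."""
    return not is_internal(referer)


for referer in (
    "http://www.example.com/about",
    "http://www.google.com/search?q=keyword",
    "http://www.google.com/search?q=example.com",
    None,
):
    print(referer, should_log(referer))
```

Note the third case: because the pattern is not anchored, an external referer that merely mentions the domain is also treated as internal. A stricter pattern such as ^https?://(www\.)?example\.com is one possible refinement, if that matters for your analysis.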