For anyone who runs a website, Google Analytics is a vital tool that helps you measure, learn and grow. Its data helps business owners and marketers make key decisions and improve their return on investment based on how traffic behaves on their website. Yet the vast majority of small to medium sized businesses are plagued by spam in their Google Analytics data. You may not even be aware of the problem, but you should be: spam skews your statistics, and inaccurate analytics could be preventing you from making sound business decisions.
Different types of spam
There are two main types of referral spam: crawler spam and ghost spam. While ghost spam referrals never actually visit your site, crawler spam does and uses real data to target your website. Both are problematic but can be solved with relative ease; however, they need to be tackled with different approaches. The first step is to identify whether you have a spam problem and which type of spam is hijacking your statistics.
To find out whether you are a victim of crawler spam, log in to your analytics and click on ‘Referrals’ under ‘Acquisition’. Sort the data in the table by ‘Bounce Rate’, as crawler spam tends to generate a 100% bounce rate.
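If you prefer to check an export of the Referrals report rather than sorting it by eye, the same idea can be sketched in a few lines of Python. This is only an illustration: the rows, the field names and the function `suspected_crawler_spam` are hypothetical, not a Google Analytics format or API.

```python
# Hypothetical rows, as they might look after exporting the Referrals report.
referrals = [
    {"source": "partner-blog.example", "sessions": 42, "bounce_rate": 38.5},
    {"source": "free-traffic.example", "sessions": 17, "bounce_rate": 100.0},
    {"source": "news-site.example", "sessions": 8, "bounce_rate": 62.0},
    {"source": "seo-offers.example", "sessions": 25, "bounce_rate": 100.0},
]


def suspected_crawler_spam(rows, threshold=100.0):
    """Return sources at or above the bounce-rate threshold, busiest first."""
    flagged = [row for row in rows if row["bounce_rate"] >= threshold]
    # Highest bounce rate first, then most sessions first.
    flagged.sort(key=lambda row: (-row["bounce_rate"], -row["sessions"]))
    return [row["source"] for row in flagged]


suspects = suspected_crawler_spam(referrals)
print(suspects)
```

A 100% bounce rate is only a hint, so treat the result as a shortlist to investigate rather than a final verdict.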
On the whole, any referrals that you do not recognise will likely be spam. If you are unsure whether a referral is genuine, it may be tempting to click on it to find out. However, unless you have impenetrable anti-virus software, we strongly advise against this approach, as many spam sites may try to install malware on your computer when you visit them. Often a quick Google search for the questionable referral will give you the answer.
The other common type of spam is referred to as ghost spam, so called because it never actually visits your site and instead uses a clever bit of software to fool your analytics. Ghost spam is slightly trickier to sort out because the hostnames change frequently. Nevertheless, we have got a handy solution to this irritating problem; more on that in the ghost spam section below, but first let’s begin with crawler spam.
One commonly suggested way of removing these pesky spam referrals is to add several lines of code into your .htaccess file. However, one character out of place and your whole site could come crashing down. Sound scary? We think so too, which is why we prefer the more foolproof method of adding a handy filter into your Google Analytics.
The first step is to create a new view in your Google Analytics. If you have not set one up before, the likelihood is that you currently have just the one view, usually called ‘All Web Site Data’. It is very important to leave this unfiltered view as it is: if anything goes wrong with your filters, you will always have the original data to fall back on. When logged into your analytics account, select ‘Admin’ and you will see the current view in the column on the right hand side. Click on the dropdown menu and select ‘Create New View’.
Enter a name for the view (in this example we have used ‘Spam Free’) and click ‘Create View’.
Now you are ready to add some handy filters. Click on ‘Filters’ in the left hand side column, then ‘Add Filter’.
Given that spam referral tends to be recurring, the easiest option is to create an exclusion filter where you input the troublesome domains. Check the ‘Create new Filter’ box and enter a filter name. Select ‘Custom’, check the ‘Exclude’ option and pick ‘Campaign Source’ from the ‘Filter Field’ drop down menu:
Next you’ll need to enter the spam referral domains into the ‘Filter Pattern’ field; these are the ones you identified earlier when looking at the referrals in your analytics. Some of the most common spam referrers to look out for are as follows:
When entering the domains into the field, there is no need to add ‘www.’, ‘http://’ or any other subdomains as it will automatically match these, but do separate each domain with a | (vertical line) character. We also recommend using a backslash before the dot, as a full stop is considered a special character in the world of regular expressions used for analytics.
The backslash simply turns the full stop back into an everyday character. In-depth knowledge of regular expressions is not required here but if you would like to learn more, have a read of Google’s handy article. Our filter pattern looks like this:
You can always add to your list going forwards if any other pesky spam referrers pop up on your analytics. Before saving, click on ‘Verify this filter’ to ensure it is all working correctly then, once you are happy, click ‘Save’.
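If your list of domains grows, typing the backslashes and vertical lines by hand gets fiddly. The short Python sketch below builds a pattern in the format described above and checks it against a few sources. The domains and the helpers `build_filter_pattern` and `is_excluded` are made up for illustration, and Python's `re.search` is only an approximation of how Google Analytics applies the pattern, so always use ‘Verify this filter’ as well.

```python
import re

# Hypothetical spam domains; swap in the ones from your own Referrals report.
spam_domains = ["spam-referrer.example", "free-buttons.example"]


def build_filter_pattern(domains):
    """Put a backslash before each dot and join the domains with |."""
    return "|".join(domain.replace(".", r"\.") for domain in domains)


exclude_pattern = build_filter_pattern(spam_domains)
print(exclude_pattern)


def is_excluded(source, pattern=exclude_pattern):
    """True if the pattern matches anywhere in the source, subdomains included."""
    return re.search(pattern, source) is not None


print(is_excluded("www.spam-referrer.example"))
print(is_excluded("partner-blog.example"))
```

Because the pattern is matched anywhere in the source, ‘www.’ and other subdomains are caught without being listed separately.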
Ghost spam is a little more sneaky. Since it never actually visits your site, you cannot even block it via the .htaccess file. In addition, the domains change as quickly as they appear so it would take endless hours to manually exclude every single one. Nobody’s got time for that. Thankfully we have got an easy solution for you; don’t say we don’t treat you!
A genuine, real visit to your website will generate two server names: the source of the link (their website) and the hostname (your server). This is where ghost spam falls down; as it never actually visits your site, the hostname will not be your server. Therefore, the trick to eliminating ghost spam is to identify all variants of your hostname and set up a filter to only include these (this will make more sense as we go along, promise).
The easiest way to check whether you are being plagued by ghost spam is to click on ‘Audience’ then ‘Technology’ and then ‘Network’. Next, click on ‘Hostname’ and try not to have a nervous breakdown at the likely high number of spammy hostnames. The majority of ghost spam will be blindingly obvious as the hostnames will be gobbledegook.
The only valid hostnames will be your domain, any valid variations of your domain, plus any sites that you have added a tracking code to. For example, ecommerce services, translation services and video services are all places where you may have inserted a tracking code. Make a note of all valid hostnames, which should look similar to the below list:
Some websites will only have the one valid hostname (yourmaindomain.com) while others could have several. Once you are happy with your noted hostnames, go back into the view you created (see section above if you missed this) and add a new filter.
As before, enter a filter name and select ‘Custom’. However, this time select ‘Include’ and choose ‘Hostname’ from the ‘Filter Field’ drop down menu. Next you will need to enter your hostname(s) in the same format as above, with a backslash before the dot and a | to separate domains.
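To see why an include filter on the hostname removes ghost spam, here is a small Python sketch of the logic. Only ‘yourmaindomain.com’ comes from this article; the other hostnames, the sample hits and the variable names are hypothetical, and `re.search` merely stands in for the matching Google Analytics performs.

```python
import re

# Valid hostnames you noted down; the second entry is a made-up third-party service.
valid_hostnames = ["yourmaindomain.com", "translate-service.example"]

# Same format as before: backslash before each dot, | between hostnames.
include_pattern = "|".join(host.replace(".", r"\.") for host in valid_hostnames)
print(include_pattern)

# Hypothetical hits: two genuine visits and two ghost spam entries.
hits = [
    {"hostname": "yourmaindomain.com", "source": "google"},
    {"hostname": "www.yourmaindomain.com", "source": "partner-blog.example"},
    {"hostname": "xkq7-traffic.example", "source": "ghost-spam.example"},
    {"hostname": "(not set)", "source": "ghost-spam.example"},
]

# An include filter keeps only the hits whose hostname matches the pattern.
kept = [hit for hit in hits if re.search(include_pattern, hit["hostname"])]
kept_hostnames = [hit["hostname"] for hit in kept]
print(kept_hostnames)
```

Note that the filter never needs to know the spammers' domains, which is why it keeps working even as they change.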
Verify the filter and, once you are happy, click ‘Save’. Now you can sit back and feel smug that you have beaten the spammers and can enjoy unskewed, accurate and highly valuable data.
It is not just the marketing department that should be interested in reducing this spam data. Sales teams, management and business owners should all take a vested interest in clean data so that they can make educated decisions. Hopefully this article will help you take a significant step in making your analytics work for you.
For further information or help with your analytics and marketing, get in touch with the Yellowball team today.