For some time now, Google Analytics users have been noticing spam referral traffic in their reports. While many GA users use filters to keep spam and fake hits out of their reports, others do not even know that hits can be fake. Fake hits are generated through ghost referrer spam, which has been a big problem for quite some time.
What is the matter exactly?
When users visit your site by clicking on a link to it, their web browser sends an HTTP request to your web server. This request contains a request line and request headers, as shown in the following image:
Among the headers is a field named referrer (spelled Referer in HTTP), which contains the URL of the page the user visited just before your site. GA’s tracking code gathers the referrer data and sends it to your web property. As a result, you can see this data in your reports.
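To make the request line and headers concrete, here is a minimal sketch that splits a raw HTTP request into its parts and reads the referrer field. The sample request, the search.example host, and the function name parseHttpRequest are illustrative assumptions, not taken from a real capture.

```javascript
// Minimal sketch: split a raw HTTP request into request line and headers.
// The request text below is a made-up example, not a real capture.
function parseHttpRequest(raw) {
  const lines = raw.split("\r\n");
  const requestLine = lines[0];
  const headers = {};
  for (const line of lines.slice(1)) {
    if (line === "") break; // a blank line ends the header section
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip malformed header lines
    const name = line.slice(0, idx).trim().toLowerCase();
    headers[name] = line.slice(idx + 1).trim();
  }
  return { requestLine, headers };
}

const sampleRequest = [
  "GET /home HTTP/1.1",
  "Host: mydemo.com",
  "Referer: https://search.example/results?q=demo",
  "",
  "",
].join("\r\n");

const parsed = parseHttpRequest(sampleRequest);
console.log(parsed.requestLine);     // GET /home HTTP/1.1
console.log(parsed.headers.referer); // https://search.example/results?q=demo
```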
However, this data can also be sent directly to Analytics using the Measurement Protocol, without any request being made from a browser. This is the ghost.
To Analytics, these hits look like any other. Filters are of no use against a request sent via the Measurement Protocol.
Using this protocol, spammers feign hits with complete, authentic-looking information. The only thing they need is the tracking ID of your site.
The payload is the data that one sends to the GA server using the Measurement Protocol. Some of its parameters are:
Note that all of the above-mentioned parameters can be faked, except the version and the tracking ID.
Furthermore, this data can be formatted to send a pageview hit to Analytics through the protocol, as follows:
tid // Tracking ID / Property ID.
cid=555 // Anonymous Client ID.
t=pageview // Pageview hit type.
dh=mydemo.com // Document hostname.
dp=/home // Page.
dt=homepage // Title.
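The formatting step can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the short parameter names (v, tid, cid, t, dh, dp, dt) are the ones the Measurement Protocol uses for the fields listed above, while the tracking ID UA-XXXXX-Y is a placeholder and buildPayload is a hypothetical helper name.

```javascript
// Minimal sketch: format Measurement Protocol parameters as a payload string.
// The tracking ID "UA-XXXXX-Y" is a placeholder, not a real property.
function buildPayload(params) {
  // URLSearchParams percent-encodes values and joins the pairs with "&".
  return new URLSearchParams(params).toString();
}

const pageviewPayload = buildPayload({
  v: "1",             // Version.
  tid: "UA-XXXXX-Y",  // Tracking ID / Property ID (placeholder).
  cid: "555",         // Anonymous Client ID.
  t: "pageview",      // Pageview hit type.
  dh: "mydemo.com",   // Document hostname.
  dp: "/home",        // Page.
  dt: "homepage",     // Title.
});

console.log(pageviewPayload);
// v=1&tid=UA-XXXXX-Y&cid=555&t=pageview&dh=mydemo.com&dp=%2Fhome&dt=homepage
```

Nothing in this string proves that a browser ever loaded the page, which is why Analytics cannot tell such a hit from a genuine one.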
Spammers supply altered values for the aforementioned parameters, like so:
tid // Tracking ID / Property ID.
cid // Randomly generated Client ID.
t // Pageview hit type.
dh // Document hostname.
Here, all the bold parameters are feigned, and the hit is still valid.
One can send the following feigned event data to Analytics by making an HTTP POST request to www.google-analytics.com:
v=1 // Version.
tid=UA-12344-1 // Tracking ID / Property ID.
cid // Randomly generated Client ID.
t=event // Event hit type.
ec // Event Category. Required.
ea // Event Action. Required.
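As a rough illustration, the sketch below assembles such an event hit as a POST request description without sending anything, which can be handy when checking your own filters. The /collect path, the client ID, and the category and action values are assumptions made for the example; buildEventRequest is a hypothetical helper name.

```javascript
// Minimal sketch: describe (but do not send) an event hit as an HTTP POST.
// The /collect path and all sample values are assumptions for illustration.
const ENDPOINT = "https://www.google-analytics.com/collect";

function buildEventRequest({ tid, cid, category, action }) {
  // Event category (ec) and event action (ea) are required for event hits.
  if (!category || !action) {
    throw new Error("event hits need both a category and an action");
  }
  const body = new URLSearchParams({
    v: "1",        // Version.
    tid,           // Tracking ID / Property ID.
    cid,           // Client ID, chosen by the sender.
    t: "event",    // Event hit type.
    ec: category,  // Event Category. Required.
    ea: action,    // Event Action. Required.
  }).toString();
  return { method: "POST", url: ENDPOINT, body };
}

const eventRequest = buildEventRequest({
  tid: "UA-12344-1",
  cid: "12345.67890",  // hypothetical client ID
  category: "video",   // hypothetical category
  action: "play",      // hypothetical action
});

console.log(eventRequest.body);
// v=1&tid=UA-12344-1&cid=12345.67890&t=event&ec=video&ea=play
```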
So, all spammers need is your tracking ID, which they can obtain:
- Through spam bots
- By arbitrarily generating property IDs and targeting any website
By changing parameter values, spammers can easily slip past filters. You need to either update your filters frequently or encode your request URIs. Encoding works as follows:
The page reports an encoded URI in place of the original one
When a visitor opens the home page of your website, the request URI is sdfsdfjdrwrwe90424/. It consists of the security key (sdfsdfjdrwrwe90424) followed by the path name of the current URL (/).
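The encoding itself is just string concatenation. A minimal sketch, reusing the example key from the text; the /about/ path and the function name encodeRequestUri are made up for illustration, and in a browser the path would come from the current URL:

```javascript
// Minimal sketch: prepend a security key to the path to form the reported URI.
// EXAMPLE_KEY reuses the example key from the text; choose your own in practice.
const EXAMPLE_KEY = "sdfsdfjdrwrwe90424";

function encodeRequestUri(key, pathname) {
  return key + pathname;
}

console.log(encodeRequestUri(EXAMPLE_KEY, "/"));       // sdfsdfjdrwrwe90424/
console.log(encodeRequestUri(EXAMPLE_KEY, "/about/")); // sdfsdfjdrwrwe90424/about/
```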
The encoded URI is sent to GA
In Google Analytics, the URI of the home page is normally reported as /.
If you encode it, it is reported as sdfsdfjdrwrwe90424/ instead.
Only URIs with the proper security key are counted in reports
A filter dismisses any URI without the security key from your reports. Since spammers usually know little about the websites they target, they hit the home page most often. For this page, a fake hit reports the URI as a bare /, without the key.
Sometimes spammers send hits for web pages that do not even exist. Since these requests also lack the proper key, they are left out as well. In addition, you can change the key at any time.
Decode in GA
Decoding is essential for a better understanding of the data, since encoded URIs such as sdfsdfjdrwrwe90424/ are hard to read.
During this procedure, you remove the security key from the URI.
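The filtering and decoding steps described above can be emulated locally on a list of reported URIs. This is a sketch of the logic only, not of how Analytics implements its filters; the sample URIs and function names are hypothetical.

```javascript
// Minimal sketch of the two filter steps, applied to a list of reported URIs.
const SECURITY_KEY = "sdfsdfjdrwrwe90424";

// Step 1: keep only URIs that start with the security key.
function hasSecurityKey(uri, key) {
  return uri.startsWith(key);
}

// Step 2: strip the key so that reports show the original path.
function decodeRequestUri(uri, key) {
  return uri.slice(key.length);
}

const reportedUris = [
  "sdfsdfjdrwrwe90424/",        // genuine home page hit
  "/",                          // ghost hit aimed at the home page
  "sdfsdfjdrwrwe90424/about/",  // genuine hit on a hypothetical page
  "/no-such-page",              // ghost hit for a page that does not exist
];

const cleanUris = reportedUris
  .filter((uri) => hasSecurityKey(uri, SECURITY_KEY))
  .map((uri) => decodeRequestUri(uri, SECURITY_KEY));

console.log(cleanUris); // [ '/', '/about/' ]
```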
How can you encode all URIs?
It takes several steps:
Enhance the tracking code
ga('create', 'UA-30449797-1', 'auto');
ga('send', 'pageview', '<enter your security key>' + location.pathname);
You can choose any alphanumeric string as your key, but it should be long. Here, location.pathname returns the path name of the current URL. Prepending the key to this path name modifies the request URI.
Create a view filter that includes only those pages whose URI has the security key
Create another filter that removes the key from the URI
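Assuming the view filters accept regular-expression patterns, the two patterns can be derived from the key and checked locally before you enter them. The property names in the returned object are hypothetical labels for this sketch, not Analytics setting names.

```javascript
// Minimal sketch: derive regex patterns for the two view filters from a key.
// The property names in the returned object are hypothetical labels.
function escapeRegExp(text) {
  // Escape characters that have a special meaning in regular expressions.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildFilterPatterns(key) {
  const anchored = "^" + escapeRegExp(key);
  return {
    includePattern: anchored,  // include filter on the request URI
    searchPattern: anchored,   // search-and-replace filter: search string
    replaceWith: "",           // search-and-replace filter: replacement
  };
}

const patterns = buildFilterPatterns("sdfsdfjdrwrwe90424");
console.log(patterns.includePattern); // ^sdfsdfjdrwrwe90424

// Emulate both filters locally to check the patterns.
const include = new RegExp(patterns.includePattern);
const search = new RegExp(patterns.searchPattern);
console.log(include.test("sdfsdfjdrwrwe90424/")); // true
console.log(include.test("/"));                   // false
console.log("sdfsdfjdrwrwe90424/home".replace(search, patterns.replaceWith)); // /home
```

Anchoring the pattern with ^ matters: without it, a URI that merely contains the key somewhere in the middle would also pass the include filter.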