Thursday, 5 May 2016

Measuring footfall with Google Analytics and a Raspberry Pi (Part 1)

If you've read my blog before, you'll hopefully know I'm a geek, so I was incredibly excited when Google introduced the Measurement Protocol alongside Universal Analytics.

I've called this blog post part one primarily because I'm hoping to write a few follow-ups explaining the results we saw and the ways we are evolving this project, so please do check back, or feel free to e-mail me with any questions.

The challenge:
Can you measure in-store footfall with Google Analytics?
This was the question one of our clients posed to the analytics team at 4Ps Marketing.

Our solution:
There are numerous footfall technologies available; however, these can be quite costly, so we wanted to build our own solution.

We hosted a workshop to discuss a simple solution and decided to try using a PIR sensor. PIR sensors change voltage when they detect movement, so placing one near the entrance of the store should, in theory, make it fire each time movement is detected.

The Measurement Protocol allows hits to be sent to Google Analytics from any internet enabled device. As most PIR sensors aren't internet enabled we needed something to connect to it and send hits back to Google Analytics. We decided on using a Raspberry Pi and connecting the PIR sensor to its GPIO (General Purpose Input / Output) header and then writing code to send a hit when movement was detected.
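As a sketch of what such a hit looks like, here is how one could be built. This is shown in Python 3 for brevity (the project itself used Python 2, as you'll see below); the endpoint and parameter names are from Google's public Measurement Protocol documentation, and the property ID is a placeholder:

```python
from urllib.parse import urlencode

COLLECT_URL = "http://www.google-analytics.com/collect"

def build_hit(tid, cid, page, title):
    """Build the URL for a single Measurement Protocol pageview hit."""
    params = {
        "v": "1",          # protocol version
        "tid": tid,        # tracking/property ID, e.g. UA-XXXXXXX-X
        "cid": cid,        # client ID
        "t": "pageview",   # hit type
        "dp": page,        # document path
        "dt": title,       # document title
    }
    return COLLECT_URL + "?" + urlencode(params)

url = build_hit("UA-XXXXXXX-X", "12345", "/offline/pir/1", "PIR Motion")
# The hit would then be sent with urllib.request.urlopen(url).
```

Anything that can make an HTTP request can therefore become a data source for Google Analytics.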

From Amazon we ordered:

1 x Raspberry Pi
1 x PIR Sensor
1 x USB Wireless Dongle
1 x Wireless Keyboard / Mouse (you don't need this but it helps)
1 x Jumper Leads
1 x HDMI Lead
1 x Power supply

Total Cost: £70

The PIR sensor has three connections: two supply voltage and were wired to the appropriate power pins on the GPIO header, while the third carries the signal, which changes when motion is detected.

After connecting the PIR sensor we then needed to write a script to "query" the appropriate GPIO pin and send a hit if movement was detected. For this we used Python, a very powerful language which ships as part of the Linux distribution supplied with the Raspberry Pi.

The Raspberry Pi with a PIR Sensor Attached

The first thing we needed to do was include the GPIO library, the time library (to give us access to the sleep command) and urllib2, which allows us to make HTTP requests.

import RPi.GPIO as GPIO
import time
import urllib2

We then define the UA number we want to send hits to, as well as a counter to record the number of times the PIR has been activated.

strUA = "UA-XXXXXXX-X"  # replace with your own property ID
counter = 0

We then need to set up the GPIO; our PIR was plugged in to pin 26.

PIR_IN = 26
GPIO.setmode(GPIO.BOARD)     # use physical pin numbering
GPIO.setup(PIR_IN, GPIO.IN)  # pin 26 is an input

After this we have a loop which looks for input on pin 26; when there is input we increment our counter and use the urllib2.urlopen function to send the hit to Google Analytics. In this example the client id is fixed (in essence meaning one session) and we are using pageviews to count customers; however, we could easily have Python create a random Client Id if needed.
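For example, a random Client Id could be generated with Python's standard uuid module, giving a fresh identifier per device or visitor:

```python
import uuid

# One random, well-formed Client Id (a version 4 UUID)
cid = str(uuid.uuid4())
```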

The dp parameter (page path) is /offline/pir/ followed by the value of the counter, and the dt (document title) parameter is set to PIR Motion (%20 is a space).

We then sleep for a short period to allow the PIR sensor to clear.

while True:
    if GPIO.input(PIR_IN):
        counter = counter + 1
        print "Someone has been seen " + str(counter) + " hit(s)"
        # Send the hit to the Measurement Protocol collect endpoint
        urllib2.urlopen("http://www.google-analytics.com/collect?v=1&t=pageview&tid=" + strUA + "&cid=12345-12345-12345-12345&dt=PIR%20Motion&dp=/offline/pir/" + str(counter))
    # Sleep briefly to allow the PIR sensor to clear
    time.sleep(1)

In order for this to work, the PIR sensor needed to have a clear view of the door and the Raspberry Pi needed power. We were able to use an in store television with a USB port to power the Raspberry Pi. As the television was located on the mezzanine above the main entrance it gave the PIR sensor a good view of the door.

View of PIR

Raspberry Pi on TV

Data began flowing into Google Analytics almost instantly. As a big advocate of GA I may be biased here, but I personally feel this shows what a good tool GA is, given it does all of this for free!

We created a very simple dashboard to show the number of in-store visits (pageviews), with widgets breaking pageviews down by day and by hour of the day. Unfortunately the Dashboard feature in GA is a bit limited (you can only have 9 bars), but with tools like Data Studio 360, Analytics Canvas or Tableau it would be easy to mash up a nicer dashboard comparing visit performance by hour and day. I think it is important to state this was version one, so we wanted to see if the concept worked before elaborating.

Example GA Dashboard

Learnings and Next Steps:
The most important learning we have taken so far is to use an independent power supply. This may sound silly, but as we were drawing power from a television, the television needed to be on, and staff quite often forgot to turn it on, or turned it off mid-day, meaning no data. We're in the process of giving the Raspberry Pi its own supply (possibly thermonuclear ;) ).

Google Analytics also allows us to define Custom Dimensions and Metrics, so we are going to introduce a Store Visit metric and pass the date and time from the Raspberry Pi to Google Analytics. As the store has two floors, we are also planning to introduce a second Raspberry Pi on the second floor so we can pass the floor as a dimension and see what percentage of store visitors go upstairs.
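To sketch how those extra fields could ride along on each hit: the Measurement Protocol carries custom dimensions and metrics as cd&lt;index&gt; and cm&lt;index&gt; parameters. The indices, values and property ID below are all hypothetical (Python 3 for brevity):

```python
from urllib.parse import urlencode
import datetime

# Hypothetical mapping: cd1 = Floor, cd2 = Date/Time, cm1 = Store Visit
params = {
    "v": "1",
    "tid": "UA-XXXXXXX-X",                       # placeholder property ID
    "cid": "12345",
    "t": "pageview",
    "dp": "/offline/pir/1",
    "cd1": "Ground Floor",                       # custom dimension 1: Floor
    "cd2": datetime.datetime.now().isoformat(),  # custom dimension 2: Date/Time
    "cm1": "1",                                  # custom metric 1: Store Visit
}
query = urlencode(params)
```

Each dimension and metric would first need to be created (and indexed) under Admin in the Google Analytics property.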

We may also use a calculated metric to halve the number of visits, as it is possible for the PIR to detect people leaving as well as entering the store.

The next iteration is one which is really exciting, though it could also be quite difficult: switching from the PIR to a camera to detect individuals, possibly using Google's Cloud Vision API. This would stop double counting and potentially (depending upon how simple or complex you wanted to make things) allow a hash of a person's face to be used as a User or Client Id, giving an understanding of how many times the same person visits a store.

I would really love to hear from you if you can help or have any ideas so feel free to reach out either on Twitter, or by e-mail!

Wednesday, 17 February 2016

Tag Manager and JSON-LD - Partners in Crime

Each week 4Ps hosts a Waffle Wednesday. Originally started by Head of SEO Hannah Miller and then rejuvenated by Head of SEO Ruth Attwood, it offers a chance for people in the agency to have a digital geek-out. Semantic markup is something that always seems to crop up, in particular JSON-LD, along with the Semantic Web / Knowledge Graph.

Having been a developer in previous roles, and being an Analytics geek, I've often wondered why I don't see more people using markup data in analytics, as quite often it contains valuable attributes about the type of content a user is consuming. In fact, in the case of JSON-LD one could argue it's so valuable that Google uses it to help fuel the Knowledge Graph.

What do I mean? Well, let's take the BBC News site. If you select an article and view the page source you will see some Semantic Markup (search the source for "json"):
<script type="application/ld+json">
{
    "@context": "",
    "@type": "Article",
    "url": "",
    "publisher": {
        "@type": "Organization",
        "name": "BBC News",
        "logo": ""
    },
    "headline": "EU referendum: Final day of UK deal talks ahead of summit",
    "mainEntityOfPage": "",
    "articleBody": "British officials enter the final phase of negotiations aimed at changing the UK's relationship with the EU, ahead of a European summit.",
    "image": {
        "@list": [ ]
    },
    "datePublished": "2016-02-17T07:37:51+00:00"
}
</script>

This markup helps crawlers and other autonomous systems understand the content on the page. In this instance we can see it's an Article, the publisher is BBC News, and we have the headline, a list of images and the date published, amongst other attributes.

So how is this helpful for Analytics? Well, some of these attributes are content attributes: for example the number of images on a page, the date the article was published, and the fact that it is an article. With Universal Analytics I could create Custom Dimensions for these attributes and then use them to look at engagement based on the number of images, or look at the age of content consumed by comparing ga:date to the published date.
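For instance, the age of a piece of content can be derived by comparing the datePublished attribute with the date of the hit. A small sketch (Python 3; the hit date below is arbitrary, chosen purely for illustration):

```python
from datetime import datetime, timezone

date_published = "2016-02-17T07:37:51+00:00"  # datePublished from the JSON-LD
published = datetime.fromisoformat(date_published)

# Age of the article on a hypothetical hit date
hit_day = datetime(2016, 2, 20, tzinfo=timezone.utc)
age_days = (hit_day - published).days
```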

This all sounds great, but how do you go about getting this data into Google Analytics? The answer is quite simple: by using a library called jQuery (which most sites use) and Google Tag Manager.

jQuery is a really powerful library: it allows you to manipulate and extract information from the DOM (Document Object Model), including code contained within script tags. Using jQuery I can extract the contents of the JSON-LD block, parse it into a new object called myLD and use the JSON-LD attributes, as illustrated below, which will output the date published to the JavaScript console.


var myLD = JSON.parse(jQuery('script[type*=json]').html());
console.log(myLD.datePublished);



Again, all very well and good, but how is this helpful? Google Tag Manager comes with the ability to create variables using Custom JavaScript functions, which means I can place this code in a JavaScript function and use the value in any of my tagging.

So let's put the pieces together for the Date Published.

1. I log in to Google Analytics and create a hit-scoped Custom Dimension called Date Published; this gives the data somewhere to "sit".

2. I log in to Google Tag Manager and create a Variable of type Custom JavaScript which returns the datePublished JSON-LD attribute.

3. I create or amend my Google Analytics Pageview tag to use this Variable and pass it to Google Analytics in the Custom Dimension.

4. Publish! When I publish the above the published date is passed through to my Custom Dimension in Google Analytics.

What else can I do? That depends entirely upon what is in the JSON-LD. In the case above you could record the number of images on the page, or the article type.

I'm going to extend this further in the coming weeks to see if I can also do real time lookups against the Google Knowledge Graph API so watch this space! Any questions / problems just shout!

Tuesday, 4 November 2014

Eats, Shoots and Leaves - A Panda too far!

I am an avid Googler in the sense that I love the Google stack, and clients and peers often think I am a Google sales rep selling either Google Analytics (which is amazing) or Tag Manager (which is probably equally amazing). However, I have become increasingly disheartened by their search product.

For me, Google's ethos was all about finding what you wanted. I remember when it first came onto the scene, while I was studying Computer Science at the University of Hertfordshire, being amazed at how quickly and easily I could find what I wanted. Since then there have been numerous revisions to its algorithm, and the inclusion and exclusion of features, ratings and snippets; however, my biggest bugbear is the way content is demoted or excluded.

I understand that since Google came on the scene, SEO consultants, agencies and even "hackers" have been trying to manipulate the algorithm in order to leverage rankings, but the majority of those not involved in digital see Google as a black box. They don't care whether site one bought content on a high-authority site, nor that it has a bad backlink profile. They do care that when they search for something they could find last month, they can no longer find it because the goalposts have moved.

It also saddens me that the morals and ethics Google promotes for organic search seem somewhat ignored on the Google Display Network, and even in AdWords, where sites which help spread malware can rank for terms such as "Download Firefox" and show a great big "Download" button as creative on sites which don't relate to what the user is trying to download.

What I want is to be able to choose the decisions Google makes for me. Let me change how much weighting you give to my social connections. Let me decide if I want you to hide sites that have purchased links, or tell me if content has recently been removed from a search. Let me decide if I want you to add maps or take up real estate with Knowledge Graph entries.

For this reason I've decided I'm fed up with continually refining search queries to try and find the site which used to rank, and I am switching to Bing. Whilst Bing is nowhere near as complex as Google, at least it's simple, in my view cleaner and prettier, and it seems to get me what I want more quickly.

I'm trying to encourage others to switch and to join me in blogging about what they miss, or how they have found their transition - so join me and #BingItOn now!

Thursday, 23 January 2014

What does Google know about me?

Whenever we use the internet we leave behind a digital footprint. Even if we are using incognito mode or clearing our browsing history, our ISP will know the sites that have been accessed and be able to tie that back to a person.

I'm interested though in the data which perhaps is a bit more obvious and that's what is collected and shared by companies such as Google, Facebook and Twitter.

If you're a Google Analytics user, you will probably be aware of the change which allows you to include demographic information, collected from the Google Display Network, about your visitors. This lets us see breakdowns such as sex, age and interests. You can see what Google thinks you are by visiting its Ad Preferences page; as you can see below, Google thinks I am a 25-34 year old male, which is bang on:

This made me think: if this is characterised by the websites I visit or my Gmail profile, what else has Google collected about me, and can I see it? The preferences above are a good start, but they don't tell me, or allow me to work out, how Google has put me into this group.

Yesterday, I decided to write to Google with a Subject Access Request. Under the Data Protection Act you are allowed to ask companies to disclose information they hold about you. Usually there is a £10 fee associated with this, and they have 40 days to respond. The letter I sent is below:

You can of course send a similar letter to Google requesting it disclose this information to you, and I'd love to know if you've done this or intend to do so. Equally, I am going to share the response to my letter on my blog, so if it interests you, check back soon (well, within 40 days!).

Thursday, 22 August 2013

Measuring Twitter with Universal Analytics


Like most people working within the Web Analytics space, the announcement of Universal Analytics excited me; so much so, in fact, that I still have the Raspberry Pi, RFID readers and tokens ready to start doing some real-world tracking after seeing the inspirational Loves Data video what seems like an age ago.

For those of you who don't know about Universal Analytics: it is really the next evolution of Google Analytics, providing a means of making your analytics more user-centric rather than visit-centric. This of course means being able to look at journeys that not only cross devices but potentially bridge offline and online.

Universal Analytics is built on something called the Measurement Protocol, which was developed to allow third parties to send data to Google / Universal Analytics. Obviously the Universal Analytics JavaScript library wraps this up nicely for you; however, I would still advise people to read and become familiar with it.

Loves Data used the Measurement Protocol to send data to Universal Analytics using Arduino boards and RFID tags. This should excite any retailer with a loyalty card and, I'm going to say, a smart EPOS, as technically it means you can send offline sale data back into Analytics. Of course, you don't need a loyalty card, Raspberry Pi or Arduino board to send data back. In fact, there are many other events that happen off your website that you may wish to record in Analytics, and the Measurement Protocol makes this possible.

In September, a number of my colleagues are doing a Digital Hike (4PsHike) where, over 5 days, they will be walking the length of Hadrian's Wall, stopping off for some Google Hangouts discussing various developments within the Digital Marketing space. Last week one of my colleagues, Charlotte, approached me to ask about Twitter monitoring, in particular looking at the Scottish independence debate, primarily to try and see the number of tweets, active users and hashtags. We use a tool called Brandwatch, but I wondered whether I could quickly and easily do this using Universal Analytics, and the answer, of course, is yes.

There are some things I need to improve, like making hashtags lowercase so they are de-duplicated, and investigating the limits (500 hits per session), but as you'll see below it is possible and relatively straightforward. I'd also like to look at using uid= rather than cid= for identifying a visitor, as we could then emulate Visits in Google Analytics a little more closely.
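A sketch of both refinements together, lowercasing hashtags and switching to uid=, might look like this (Python 3 for brevity; the parameter names are from the Measurement Protocol, and the property ID is a placeholder):

```python
from urllib.parse import urlencode

def hashtag_hit(user_id, hashtag):
    """Build the payload for one hashtag pageview hit."""
    params = {
        "v": "1",
        "tid": "UA-XXXXXXX-X",                 # placeholder property ID
        "uid": str(user_id),                   # known User Id instead of cid
        "t": "pageview",
        "dp": "/hashtag/" + hashtag.lower(),   # lower-case so tags de-dupe
    }
    return urlencode(params)

payload = hashtag_hit(42, "IndyRef")
```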


Step 1 - Create a new account
Firstly, we need to create a new account. This is very easy, although in the new look and feel of Analytics remember it is under Admin, in the Account drop-down. I made a new Universal Analytics account for this particular experiment; you then need to note the UA number.

Step 2 - Install PHP / MySQL
I downloaded a WAMP stack called XAMPP, as I wanted to use PHP for my Twitter monitoring library. XAMPP includes Apache, PHP and MySQL. You can use any tools of your choosing, provided you are able to edit the code and add the necessary Measurement Protocol requests. The library I used was the Twitter framework from 140dev.

Step 3 - Create Twitter Application
In order to use the PHP monitoring library you need to have a Twitter Application. You can create this by signing in to the Twitter developers site and clicking My Applications:

Create your application and after you've done this you will need to note the Consumer Key, Consumer Secret, Access Token, Access Token Secret.

Step 4 - Start Monitoring
So, now we've got our Twitter application we can begin monitoring. In the 140dev package you need to modify a few files, firstly db_config.php:
$db_host = 'MySQL Host Here';
$db_user = 'Put your MySQL username here';
$db_password = 'Put the MySQL password here';
$db_name = 'Put the database name here';

Then you need to edit the 140dev_config.php file:

// Server path for scripts within the framework to reference each other

// External URL for Javascript code in browsers to call the framework with Ajax
define('AJAX_URL', '');

// OAuth settings for connecting to the Twitter streaming API
// Fill in the values for a valid Twitter app
define('TWITTER_CONSUMER_KEY','Your Consumer Key');
define('TWITTER_CONSUMER_SECRET','Your Consumer Secret');


After that, edit the get_tweets.php file to monitor what you need:
$stream->setTrack(array('Term to Track','Term to Track'));

This script can now be run from the command line with 'php get_tweets.php'; it populates a cache file of tweets. A second script, parse_tweets.php, extracts the data into MySQL, and this is where we add our Measurement Protocol requests. Edit the file, adding the bold rows where appropriate:

To track WHO is tweeting:
// Add the new tweet
// The streaming API sometimes sends duplicates,
// so test the tweet_id before inserting
if (! $oDB->in_table('tweets','tweet_id=' . $tweet_id )) {

// The entities JSON object is saved with the tweet
// so it can be parsed later when the tweet text needs to be linkified
$field_values = 'tweet_id = ' . $tweet_id . ', ' .
'tweet_text = "' . $tweet_text . '", ' .
'created_at = "' . $created_at . '", ' .
'geo_lat = ' . $geo_lat . ', ' .
'geo_long = ' . $geo_long . ', ' .
'user_id = ' . $user_id . ', ' .
'screen_name = "' . $screen_name . '", ' .
'name = "' . $name . '", ' .
'entities ="' . base64_encode(serialize($entities)) . '", ' .
'profile_image_url = "' . $profile_image_url . '"';


$strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-XXXXXXX-X&cid=" . $user_id . "&t=pageview&dp=/users/" . $screen_name; // UA-XXXXXXX-X is a placeholder property ID
$strData = file_get_contents($strUA);
$strData = file_get_contents($strUA);

To track MENTIONS:
// The mentions, tags, and URLs from the entities object are also
// parsed into separate tables so they can be data mined later
foreach ($entities->user_mentions as $user_mention) {
$where = 'tweet_id=' . $tweet_id . ' ' .
'AND source_user_id=' . $user_id . ' ' .
'AND target_user_id=' . $user_mention->id;

if(! $oDB->in_table('tweet_mentions',$where)) {

$field_values = 'tweet_id=' . $tweet_id . ', ' .
'source_user_id=' . $user_id . ', ' .
'target_user_id=' . $user_mention->id;

$strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-XXXXXXX-X&cid=" . $user_id . "&t=pageview&dp=/mentions/" . $user_mention->screen_name; // placeholder property ID
$strData = file_get_contents($strUA);
$strData = file_get_contents($strUA);

To track HASHTAGS:
foreach ($entities->hashtags as $hashtag)

$where = 'tweet_id=' . $tweet_id . ' ' .
'AND tag="' . $hashtag->text . '"';

if(! $oDB->in_table('tweet_tags',$where)) {
$field_values = 'tweet_id=' . $tweet_id . ', ' .
'tag="' . $hashtag->text . '"';

$strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-XXXXXXX-X&cid=" . $user_id . "&t=pageview&dp=/hashtag/" . $hashtag->text; // placeholder property ID
$strData = file_get_contents($strUA);
$strData = file_get_contents($strUA);


To track URL mentions:
foreach ($entities->urls as $url) {
if (empty($url->expanded_url)) {
$url = $url->url;
} else {
$url = $url->expanded_url;

$where = 'tweet_id=' . $tweet_id . ' ' .
'AND url="' . $url . '"';

if(! $oDB->in_table('tweet_urls',$where)) {
$field_values = 'tweet_id=' . $tweet_id . ', ' .
'url="' . $url . '"';


$strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-XXXXXXX-X&cid=" . $user_id . "&t=pageview&dp=/urls/" . $url; // placeholder property ID
$strData = file_get_contents($strUA);
$strData = file_get_contents($strUA);

After doing this, again leave the parse_tweets.php file running at a command prompt by entering 'php parse_tweets.php'.


The reporting interface of Google Analytics is actually very effective at monitoring Twitter, as you are able to look in Real Time, use Dashboards, or build custom reports.

The Real Time Analytics is fantastic at showing how active the things you are monitoring on Twitter are. Just look at the Real Time overview, as this screenshot shows:

You can use Dashboards to report on key areas of interest and apply whatever filtering you need; the dashboard below shows the key hashtags, users, users mentioned and URLs shared:

Custom Reporting also allows us to produce charts such as what times of the day users were active:

If you are interested in using the Measurement Protocol, Google Analytics or Universal Analytics or have any comments or feedback then I'd love to hear from you!

Wednesday, 8 May 2013

Dark Search - The quest for the missing referer

For those of you who haven't read my blog before, I am Chief Technology Officer at Strategic Digital Marketers, 4Ps Marketing. My particular areas of interest are New Technology, Development and Analytics.

About two months ago, I was investigating the rise in direct traffic for a number of our clients. Whilst these were all well known brands, the rise in traffic and apparent drop in search traffic didn't feel right so I was curious to understand more.

Using the Google Analytics Query Explorer I broke the direct traffic down by web browser (my initial feeling was that bookmarks or Chrome's auto-suggest were causing this) and found that direct traffic in Safari had shot up. At the time I wasn't 100% sure why, as I couldn't find any release notes suggesting this type of change.

Graph showing visitors from direct over time by browser

Since then, I've read a number of blog posts, including Dark Google Vexes Publishers, which explains that this is due to the HTTP Referer header being dropped by iOS 6, Android 4 and above, and some versions of Firefox.

For those of you who don't know, analytics packages like Google Analytics use the referer to know which website a visitor came from; if it is not present, they classify the visit as direct, i.e. a referral of (none).
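That classification logic can be illustrated with a deliberately simplified sketch (real analytics packages use far richer rules for sources and mediums):

```python
def classify_source(referer):
    """Rough illustration of referer-based classification."""
    if not referer:
        # No referer header means the visit is reported as direct
        return "(direct) / (none)"
    if "google." in referer:
        return "google / organic"
    return referer + " / referral"
```

So a browser that silently drops the header makes a search visit look identical to someone typing the address in directly.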

What wasn't clear from the articles I read was the impact. Obviously, through my own investigations I had seen a clear change, however was this purely affecting Organic Search?

So, I decided to create a test page for myself. It's very simple: using PHP it reads the HTTP_USER_AGENT and HTTP_REFERER headers and shows them to the user; if the referer is blank it asks where you came from, and you can then submit that to the database. The name is Dr0pMyRef3r3r, because I want it to rank organically and so needed something obscure enough to be findable.
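For illustration, a Python (WSGI) equivalent of that test page might look like this; the actual site uses PHP, and the header names follow the CGI convention:

```python
def app(environ, start_response):
    """Echo the User-Agent and Referer headers back to the visitor."""
    body = "User-Agent: %s\nReferer: %s\n" % (
        environ.get("HTTP_USER_AGENT", "(not set)"),
        # Note: the header name is historically misspelled with one "r"
        environ.get("HTTP_REFERER", "(not set)"),
    )
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]
```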

I've paid (albeit with a small budget) for some PPC (bidding on [dark search] and [dark google search]) and will use social media too, to try and understand which browsers are dropping data and where.

I'd love you to take part and encourage your friends.

The link to the actual referer site is:

And as ever comments and suggestions are welcomed!!

Wednesday, 27 February 2013

Blogger in a subdirectory of my domain

As CTO I tend to field a mixture of technical SEO questions along with general tech questions, and this last one was something I had often wondered about myself.

I know that Blogger allows you to host a blog on a subdomain of your own domain using CNAMEs, but what if you wanted it in a subdirectory of an existing domain? Is this possible?

Much of the research I read suggested this wasn't possible; however, after a bit of "playing" in IIS I was able to do it, and it wasn't quite as hard as I had thought, so I thought I'd share the solution. It basically uses PHP to act as a proxy to Blogger.

Firstly, the solution below worked for me; it may need adaptation for you. I used Microsoft IIS (Internet Information Services) 7.5, but I am sure the same can be achieved with Apache.

Step 1 : Ensure IIS has necessary Modules
The nice thing about newer versions of IIS is the ease with which additional plugins can be added. As my solution was going to use PHP (PHP: Hypertext Preprocessor), I installed the PHP module for IIS.

As well as PHP, I needed to ensure that IIS could rewrite URLs in the same way as the mod_rewrite module in Apache, so I installed the IIS URL Rewrite module.

Step 2 : Create a Blog Folder
Obviously, you need to create a folder where the blog is going to sit. We are going to put some PHP files in this folder too. I chose to use the folder name blog.

Step 3 : Create the redirect.php file

The redirect.php file contains some global parameters such as the Blogger URL and the URL of where I want to host the blog.

We also have a PHP function which uses cURL to return the contents of a URL

// URL of the blogger blog (no http:// or trailing /)
$_BLOGGER_URL = "myblog.blogspot.com";   // replace with your Blogger address
// Where the blogger blog should go (no http:// or trailing /)
$_REDIRECT_URL = "www.example.com/blog"; // replace with your own domain
// *
// Function : get_data
// Arguments: Url - Request URL
// Returns: HTML from website
// Description
// Take a URL, connect using cURL and then return the data as a String
// *
function get_data($url) {
  $ch = curl_init();
  $timeout = 5;
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
  $data = curl_exec($ch);
  curl_close($ch);
  return $data;
}

Step 4 : Create the index.php file

We need to create an index.php file in the blog directory. This is the file which will do the work. It takes the page the user is requesting and requests it from Blogger by stripping the name from the original HTTP request.

When Blogger responds, we use a Regular Expression to replace any hyperlinks with the Blogger domain to be requests on our domain before sending the output to the user.

// If we are a re-written URL this Server Variable will be set
// Otherwise we default to the homepage
// (the exact variable depends on your rewrite setup; the IIS URL Rewrite
// module exposes the original request as HTTP_X_ORIGINAL_URL)
if (isset($_SERVER["HTTP_X_ORIGINAL_URL"])) {
 $strURL = $_SERVER["HTTP_X_ORIGINAL_URL"];
 $strURL = str_replace("/blog/","http://" . $_BLOGGER_URL . "/",$strURL);
} else {
 $strURL = "http://" . $_BLOGGER_URL . "/";
}
// Get the HTML and replace the links, we need to make sure we do this
// or traffic goes back to the Blogger site
$html = get_data($strURL);
$pattern = '/<a([^>]+)href=\'http:\/\/' . $_BLOGGER_URL . '\//i';
$replace = '<a$1href=\'http://' . $_REDIRECT_URL . '/';
$html = preg_replace($pattern,$replace, $html);
// There is a weirdness that this isn't picked up, but this fixes one link!
$html = str_replace("<a href=\"http://" . $_BLOGGER_URL . "/\">Show all posts</a>","<a href=\"http://" . $_REDIRECT_URL . "/\">Show all posts</a>",$html);
// Write out the HTML
echo $html;
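For illustration, the link-rewriting idea at the heart of the proxy can be expressed as a short Python sketch; the domain names below are hypothetical placeholders:

```python
import re

BLOGGER_URL = "myblog.blogspot.com"    # hypothetical Blogger address
REDIRECT_URL = "www.example.com/blog"  # hypothetical destination

def rewrite_links(html):
    """Rewrite single-quoted Blogger hrefs to point at our own domain."""
    pattern = r"<a([^>]+)href='http://" + re.escape(BLOGGER_URL) + "/"
    replace = r"<a\1href='http://" + REDIRECT_URL + "/"
    return re.sub(pattern, replace, html, flags=re.IGNORECASE)

sample = "<a class='post' href='http://myblog.blogspot.com/2013/02/hello.html'>Hello</a>"
rewritten = rewrite_links(sample)
```

Without this step, any link a visitor clicks would send them straight back to the blogspot.com domain.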


Step 5 : Create a Search folder

I noticed that Blogger handled search requests slightly differently, so for ease of use I created a separate search folder with its own index.php file, listed below; it essentially does the same as the file above.

$strURL = "http://" . $_BLOGGER_URL . "/search?" . $_SERVER["QUERY_STRING"];
// Get the HTML and replace the links, we need to make sure we do this
// or traffic goes back to the Blogger site
$html = get_data($strURL);
$pattern = '/<a([^>]+)href=\'http:\/\/' . $_BLOGGER_URL . '\//i';
$replace = '<a$1href=\'http://' . $_REDIRECT_URL . '/';
$html = preg_replace($pattern,$replace, $html);

echo $html;


Step 6 : Setup rewrites

Now we've created the proxy code all that remains is for us to setup our rewrites. The IIS rewrite module is a very simple and easy to use tool.

We want to create the same rewrite for the blog and the blog/search folders as follows:

For the /blog directory:
For the /blog/search directory:

Step 7 : You're Done

Hopefully the above is useful and works for you - if you do get stuck or need some help or want to add to this solution please do let me know!