Feb 27 14

The Full Customer Journey – Managing User Identities with Google Universal, Mixpanel and KISSmetrics

by shay.sharon

In my last post I spoke about the two most basic requirements your website and analytics tool must support in order to track the full customer journey:

  1. User Identification – Your analytics tool must support user identification as a prerequisite for linking user activity across devices
  2. Signing In – Your users must sign in or identify themselves on each of their devices (PC, iPad, etc.)

However, although these prerequisites are necessary, I was surprised to find that they are not sufficient. Even if your website and analytics tool do support these 2 requirements, they may not always manage and interpret user activity correctly.

To explain, I will now elaborate on 5 scenarios (presented in my previous post) and how Google Universal, Mixpanel and KISSmetrics handle each of them. You can take this information into account when next choosing an analytics tool, and please let me know if you have any questions or input.

Scenario No.1 – What happens to session activity prior to registration?

A user browses your site for 10 minutes, searching for products and reading reviews, prior to registering. The user then registers.

Will the analytics tool track this activity prior to registration and link it to the registered user?

With Mixpanel or KISSmetrics, no session activity is lost when you use their alias method: it automatically links the user ID you pass to the internal ID that Mixpanel or KISSmetrics has already allocated.

What happens to session activity prior to registration in KISSmetrics and Mixpanel

When a user first accesses a site from a specific device, Mixpanel and KISSmetrics generate a new client ID (distinct_id in Mixpanel and anonymous id in KISSmetrics) for that user (no. 1 in the above illustration) and begin using this ID to store all activity from now on. Once you call the alias method, these tools tie the current client ID to your user ID that you passed (no. 2 in the above illustration) – and all past activity will now be related to that user ID.

Google Universal: With Google Universal it is not that simple, as there is no alias method, and the user ID attribute that is much talked about doesn’t actually exist at this point. The good news, however, is that there is a workaround. The secret is to use the client ID that Google Universal generates for the user, and tie it to the user information on your end. When the user registers, instead of replacing the client ID with your own user ID, keep the current client ID and store it in your database. From now on, this client ID will be used every time the user logs onto the site.
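The workaround can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a Google feature: `userStore`, `rememberClientId` and `trackerOptionsFor` are hypothetical names, and the in-memory Map stands in for your own database. The only Analytics.js behavior assumed is the `clientId` field of `ga('create', ...)` and `tracker.get('clientId')`.

```javascript
// Hypothetical stand-in for your own user database (user ID -> GA client ID).
const userStore = new Map();

// On registration: keep the client ID that Google Universal already generated.
function rememberClientId(userId, currentClientId) {
  userStore.set(userId, currentClientId);
}

// On sign-in: build the options object for ga('create', ...).
// Returns an empty object for unknown users, so Analytics.js uses its own cookie.
function trackerOptionsFor(userId) {
  const clientId = userStore.get(userId);
  return clientId ? { clientId: clientId } : {};
}

// In the browser (assumes the standard analytics.js snippet is loaded):
//   ga(function (tracker) { rememberClientId('1000', tracker.get('clientId')); });
//   ga('create', 'UA-XXXX-Y', trackerOptionsFor('1000'));
```

In a real site, `rememberClientId` would run on your server after the registration form is submitted; it is kept in one file here only so that the flow is easy to follow.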

What happens to session activity prior to registration in Google Universal

Scenario No. 2 – What happens to session activity prior to signing in?

Mixpanel: Every time a user signs in, use the identify method to tell Mixpanel to switch to the relevant client ID. (This will automatically set the correct client ID based on the user ID that you passed to the identify method).

Google Universal: There is no identify method in Google Universal, but by manually changing the client ID (explained here), all future activity is related to that client ID, thereby achieving the exact same result.

So both Google Universal and Mixpanel support identification, one way or another.
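For Mixpanel, the two calls discussed so far (alias on registration, identify on sign-in) might be wired up like this. It is only a sketch: `onRegistered` and `onSignedIn` are hypothetical helper names, and the `mixpanel` object is passed in as a parameter so that the helpers do not depend on the global created by Mixpanel's snippet.

```javascript
// Registration: tie the current anonymous distinct_id to your own user ID.
function onRegistered(mixpanel, userId) {
  mixpanel.alias(userId);
}

// Sign-in on any device: switch tracking to the known user from now on.
function onSignedIn(mixpanel, userId) {
  mixpanel.identify(userId);
}

// In the browser you would pass the global object, for example:
//   onRegistered(window.mixpanel, '1000');
```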

But, what about the pages already viewed by the user (browsing and searching) prior to signing in? And what about the marketing channel used?

Let’s expand on this scenario. During her first visit, Anne registers on eBay from her desktop (visit I). A few days later, she uses her iPad to look for a digital camera on eBay, but doesn’t find anything (visit II). Later that same day, she uses her iPad to google for a specific camera. She finds what she is looking for, logs on, and makes a bid (visit III).

What your web analytics tool should show is: 1 unique visitor with 3 visits.

But what Mixpanel or Google Analytics actually show is: 2 unique visitors with 4 visits.
Let’s see why:

What happens to session activity prior to signing in

First Visit

  1. Visit I, pre-registration from desktop. Anne views one page, but as the analytics tool does not recognize the visitor (first visit to the site from her desktop), it generates a new client ID (100). Anne then views one more page before registering (all tracked under the same client ID, 100).
  2. Anne now registers. The site executes the alias method and ties our newly created user ID (1000) with client ID (100) and understands that the two pageviews prior to registration are related to this user.

Second Visit

  1. Visit II, from iPad. The analytics tool does not recognize Anne as this is her first visit to the site via her iPad. Therefore, the analytics tool generates a new client ID (200), and this entire visit (2 pageviews) is related to this client ID (200). (Note that Anne has not logged in during this visit).

Third Visit

  1. Visit III, from iPad. The analytics tool still recognizes the user as Client ID (200) because she has again accessed the site via her iPad. The next two pageviews are, therefore, automatically related to this user: Client ID (200).
  2. Once Anne logs in, however, the site executes the identify method and uses Anne’s user ID (1000). The analytics tool now changes her client ID from 200 to 100, and all future activity will be related to client ID (100).

Presenting 2 unique visitors (client ID #100 and #200) with 4 visits (third visit was split between client id #200 and client ID #100) instead of 1 unique visitor with 3 visits is unfortunately far from correct.

In Mixpanel, this behavior is by design as described here:
This [the identify method] will remap his phone activity to the original ID he used when signing up for your service, which is the most desirable outcome. This does mean that regrettably the events he fired before logging in will not be associated with him.

The results presented by Google Universal are far from correct and we currently only have the client ID to work with. As I have said before, their user ID attribute only exists in theory and they are far from being considered a user-centric platform.

The KISSmetrics solution: We need to be able to merge the two identities. You can read more about the KISSmetrics built-in support for merging identities with their alias and identify methods, but in a nutshell, by using just the identify method, KISSmetrics links the current anonymous ID with the identified user every time the identify method is called (as long as the current user is anonymous and has not already been identified). So eventually, the same user ID will be linked to all activities prior to signing in.
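To make the difference concrete, here is a toy model of Anne's three visits. It is my own simplification, not how any of these tools is implemented: the `switch` strategy changes the client ID at sign-in (the behavior described above for Mixpanel and the Google Universal workaround), while the `merge` strategy links the anonymous ID to the user (the behavior described for KISSmetrics). All names in the code are hypothetical.

```javascript
// Anne's journey from the walkthrough above, as a flat event list.
const journey = [
  { visit: 'I',   device: 'desktop', type: 'pageview' },
  { visit: 'I',   device: 'desktop', type: 'pageview' },
  { visit: 'I',   device: 'desktop', type: 'register', userId: '1000' },
  { visit: 'II',  device: 'ipad',    type: 'pageview' },
  { visit: 'II',  device: 'ipad',    type: 'pageview' },
  { visit: 'III', device: 'ipad',    type: 'pageview' },
  { visit: 'III', device: 'ipad',    type: 'pageview' },
  { visit: 'III', device: 'ipad',    type: 'login', userId: '1000' },
  { visit: 'III', device: 'ipad',    type: 'bid' },
];
const deviceClientIds = { desktop: '100', ipad: '200' };

function summarize(events, strategy) {
  const current = { ...deviceClientIds }; // client ID in use on each device
  const firstClientOfUser = new Map();    // user ID -> client ID at registration
  const mergedInto = new Map();           // client ID -> user ID ('merge' only)
  const hits = [];

  for (const e of events) {
    if (e.type === 'register') {
      firstClientOfUser.set(e.userId, current[e.device]);
      mergedInto.set(current[e.device], e.userId);
    } else if (e.type === 'login') {
      if (strategy === 'merge') {
        mergedInto.set(current[e.device], e.userId); // link the anonymous ID too
      } else {
        current[e.device] = firstClientOfUser.get(e.userId); // 200 -> 100
      }
    } else {
      hits.push({ visit: e.visit, clientId: current[e.device] });
    }
  }

  // With 'merge', every linked client ID resolves to the same user ID.
  const resolve = (id) => (strategy === 'merge' && mergedInto.get(id)) || id;
  const visitors = new Set(hits.map((h) => resolve(h.clientId)));
  const visits = new Set(hits.map((h) => h.visit + '|' + resolve(h.clientId)));
  return { visitors: visitors.size, visits: visits.size };
}

console.log(summarize(journey, 'switch')); // { visitors: 2, visits: 4 }
console.log(summarize(journey, 'merge'));  // { visitors: 1, visits: 3 }
```

The only difference between the two runs is what happens at sign-in, which is exactly where the third visit gets split in two.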
read more…

Feb 24 14

The Full Customer Journey – Part I
Managing User Identities with Google Universal, Mixpanel and KISSmetrics

by shay.sharon

A few months ago I wrote about the Google Universal User-Centric Approach, where I mentioned that in order for a web analytics tool to be considered a user-centric analytics tool, it must be able to identify and link a user across devices and platforms.

It was while writing that post – in which I also talked about the Mixpanel alias and identify methods, and how a similar solution can be achieved with Google Universal – that I first noticed that the Mixpanel alias and identify methods do not work exactly as I would have expected. An odd realization at this point, as I had, after all, been using Mixpanel for quite a while.

A few weeks ago I saw an answer on Quora written by Peter Reinhardt from segment.io, regarding a unique KISSmetrics feature that enables the merging of identities across public and authenticated sites. Up until then, I hadn’t been aware of this feature, although I had also been using KISSmetrics for quite a while.

It was then that I realized that handling and managing identities is more complex than I first believed. The goal of this post (part I) and the next one (part II) is therefore to provide an in-depth review of handling and managing identities using Google Universal, Mixpanel and KISSmetrics, including issues and solutions. I think you will be surprised by the results ;)
read more…

Dec 4 13

A Different Approach to Cross-Domain Tracking

by shay.sharon

Most analytic tools use cookies to identify visitors. However, data stored in cookies is only visible in the domain in which the cookies were set/defined – which is a serious problem if your site cuts across multiple domains. If, for example, your visitor is presently on domain2.com, then your analytic tool will not be able to read data stored on domain1.com, nor will it be able to link the two visits – so your single visitor will be counted as two separate visitors.

There are two traditional methods to overcome the multi-domain issue:

  • User Identification – Most analytic tools (including the new version of Google Analytics) allow you to link activity to a specific user using your own user identification. If you choose to go with this method, then once your user logs in to any of your domains, you can tell your analytic tool to credit all activity (on that domain) to the identified user from now on. This method bypasses the above mentioned cookie issue, as your analytic tool does not have to obtain the visitor ID from the cookies.

    Disadvantage: Your visitors need to identify themselves, in order to be tracked correctly.

  • Cross-Domain Linking – By enabling the cross domain linking, you can tell your analytic tool to pass the relevant data between your domains. In a nutshell, if your analytic tool identifies a visitor as visitor “123” and stores this information in cookies, then when that visitor clicks on a link to a different domain – the visitor ID (“123”) will be passed as a parameter to the other domain and the analytic tool will read that information from the URL parameters – rather than from the cookies.

    Disadvantage: The downside of this method is that it only covers visitors who switch between domains via links on your sites.
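As a rough illustration of cross-domain linking, the sketch below passes the visitor ID in the URL and reads it back on the other domain. The parameter name `vid` and both function names are hypothetical; real tools use their own parameter formats.

```javascript
// On the source domain: append the visitor ID to an outbound link.
function decorateLink(href, visitorId) {
  const url = new URL(href);
  url.searchParams.set('vid', visitorId);
  return url.toString();
}

// On the destination domain: prefer the ID from the URL, then fall back to
// whatever the local cookie holds (or null for a brand-new visitor).
function resolveVisitorId(pageUrl, cookieVisitorId) {
  return new URL(pageUrl).searchParams.get('vid') || cookieVisitorId || null;
}

console.log(decorateLink('https://domain2.com/pricing', '123'));
// https://domain2.com/pricing?vid=123
```

A visitor who types the second domain directly, or arrives from a search engine, never passes through `decorateLink`, which is the disadvantage noted above.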

Unfortunately, neither of the methods described above is perfect, as each solves only part of the problem – mainly because your visitors can always switch between domains using methods that are not in your control. For example, let’s say you have a multilingual site with an English version on domain.com and a French version on domain.fr. Your French visitors, however, especially from organic and social sources, could end up on your .com site and be counted as new visitors – even though they have in fact already visited your French domain before.

But fear not! A different approach altogether could provide a great solution:

The Centralized Domain Method

Using this method, all you have to do is have one domain that is responsible for (1) Assigning new IDs to new visitors, and (2) Identifying returning visitors. The idea is that a new visitor to a domain (a visitor without a user ID in their cookies) will be transferred to a centralized domain in order to be identified. Once identified, they will be redirected back to the domain originally visited together with their new allocated ID.

Let’s look at the following scenario of a visitor who visits a site with 2 domains (a.com and b.com).

  • A new visitor (User X) visits a.com
  • A week later, the same visitor visits b.com
  • A week after that, the same visitor visits a.com again
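The first two steps of this scenario can be modeled with a few lines of code. This is only a sketch of the idea, assuming the centralized domain keeps its own cookie per browser and hands the ID back in a hypothetical `vid` URL parameter; `createCentralDomain` and the `browser` key are illustrative names.

```javascript
// State held by the hypothetical centralized domain: its own cookie per
// browser, plus a counter for allocating new visitor IDs.
function createCentralDomain() {
  const centralCookies = new Map(); // browser -> visitor ID
  let nextId = 1;
  return {
    // Called when a.com or b.com redirects an unidentified visitor here.
    // Returns the URL to send the visitor back to, with the ID attached.
    identify(browser, returnUrl) {
      if (!centralCookies.has(browser)) {
        centralCookies.set(browser, 'v' + nextId++);
      }
      const url = new URL(returnUrl);
      url.searchParams.set('vid', centralCookies.get(browser));
      return url.toString();
    },
  };
}

const central = createCentralDomain();
const firstOnA = central.identify('browser-x', 'https://a.com/');
const laterOnB = central.identify('browser-x', 'https://b.com/');
console.log(firstOnA); // https://a.com/?vid=v1
console.log(laterOnB); // https://b.com/?vid=v1
```

On the third step (the return visit to a.com), the visitor already carries the ID in the a.com cookie, so no redirect to the centralized domain is needed.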

read more…

Aug 21 13

Single vs. Multiple Properties in Google Analytics

by shay.sharon

Should one single property be used for all sections of a website, or are a number of separate properties the better choice?

This post aims at explaining why I believe that separated properties are sometimes better than a single property. I will, of course, try to provide you with all the necessary information for making an informed decision, and if you think of anything I may have missed, please leave a comment or contact me. Thanks!

Before we dive into the details, let’s make sure we are all referring to the same terminology…

  1. Property – According to Google’s documentation, when using your Google Analytics account you can create a property for each domain, site, source or environment for which you would like to collect data. Each property has its own tracking code with its own unique ID for identifying data from that property.
  2. Views – Each property can contain one view or more. “A view is a defined perspective of the data from a property, and provides access to the reports for that property” (Google’s documentation). In other words, you can include a specific subset of data in a view by applying filters. For example, you can filter out all traffic that is not to a specific domain (should your property track more than one domain) or you can choose to only include organic traffic (if you want to provide a view for your SEO contractor).
  3. Google Analytics Account – Each Google Analytics account can include one property or more. Make sure not to confuse this with the term “Google Account”: One Google Account can contain up to 25 Google Analytics accounts (check out this post if you reach this limit but wish to have more than 25 accounts).

Please note that in some places Google uses the term profile instead of view, but as far as I can tell, both terms represent the same entity and I will use the term view throughout this post.

So now let’s begin with the main advantages of using multiple properties.

Sampled Data

The first reason why using multiple properties is my preferred choice is that in most cases it enables you to use complete data, not just sampled data.

Sampled data often occurs with relatively large sites when using a single property, and is of course not always optimal. If your query includes more than 500,000 visits, Google Analytics will only return sampled data. For example, if your site generates 1,000,000 visits per month, Google Analytics will use sampled data as the default on your reports.
You can overcome this by changing the date range to less than a couple of weeks, but this might not always be a long enough period for your analysis.
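As a back-of-the-envelope check, the longest date range that stays under the threshold can be estimated like this. It is a sketch that assumes visits are spread evenly across a 30-day month; `maxUnsampledDays` is a hypothetical helper, and the 500,000 figure is the one quoted above.

```javascript
// Longest date range (in whole days) whose visits do not exceed the threshold.
function maxUnsampledDays(visitsPerMonth, threshold = 500000, daysInMonth = 30) {
  // Multiply before dividing to avoid floating-point rounding surprises.
  return Math.floor((threshold * daysInMonth) / visitsPerMonth);
}

console.log(maxUnsampledDays(1000000)); // 15
```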

Although sampled data often suffices, it is difficult to work with if you want to dive deeply into specific cases, if you need precise numbers, or if you wish to tackle a certain issue with your tracking. This is especially true when receiving a sample rate of less than 70-80%. For more details check out this article.

You can solve the overall data sampling issue by creating subsets of your data using views – but that only helps with standard reports. Since session sampling is done on the property level, creating multiple views will not avoid data sampling when applying advanced segments or creating customized reports. You can find more information about data sampling here.

Google Analytics’ Limitations

The second reason why using multiple properties, rather than a single property of the free Google Analytics, is my preferred choice is the set of limitations that apply when working with a large website:

  • Data Collection – Google Analytics has a limit of 10 million hits per month – or at least that’s what Google states. From my past experience, Google does process more than this amount, but I’m not sure if it processes all of the data, or just more than this stated limit.
  • Data Freshness – If you have more than 200,000 visits a day, Google processes the data once a day – which could lead to a delay of two days in your data freshness. I have also noticed that with large numbers of visits, custom variable values in standard reports are not updated for up to 3 days, but I am not 100% sure that this is necessarily due to the large amount of data sent (the data is, however, available in the API, even if the GA reports have not yet been updated).

Clear Separation between Web Properties

Most websites today have more than one web property, such as the main site, a blog, support center, landing pages, etc.

When installing Google Analytics on a website, the first and most trivial thing to do is try to aggregate all of them under the same property. Why? Because it supposedly sounds right that a user who visits the main site and then moves on to the support site or blog will be only counted once by Google Analytics. The common misconception is that if you implement a different property with a different tracking code for each of the company’s websites (e.g., one for the main site, one for the blog, and one for the support site), then you will not be able to tell how many users in total visited your brand’s properties (as the user who visited the main site and then the blog will be counted twice instead of just once).

I believe that aggregating all your sites together is, in most cases, the incorrect thing to do, but before expanding on why, I would like to give an example: Let’s say I have a Software-as-a-Service application, which includes two sections: (1) the main website (for potential clients), and (2) the authenticated web site where my clients can manage the services. It definitely makes more sense to treat the two sections separately.

Let’s take the KISSmetrics site as an example: Many people I know (myself included) read the KISSmetrics blog almost on a daily basis, but most of these people have never visited their main site and will never become their customers. That might be the reason why KISSmetrics divides their site into two separated properties in Google Analytics: one for the blog and one for the main site. Otherwise, think what their overall bounce rate would be, or their conversion rate from visits to registration, if their blog was defined as part of their main site property.
read more…

May 16 13

The User-Centric Approach – Google Universal Analytics vs. Mixpanel

by shay.sharon

About 6 months ago, Google released Universal Analytics, which – in a nutshell – offers 4 new key features:

  • Measurement Protocol – an API that can be used for multi-platform tracking
  • Improved Feature Configuration Management – through the GA admin screens, rather than the JavaScript agent
  • Custom Dimensions and Metrics – very similar to the GA Custom Variables feature, but with some admin control and improved support for the built-in GA features
  • Dimension Widening – which allows you to load dimensions via data upload, rather than having to do so one at a time via JavaScript (and exposing data you would prefer not to be exposed)

Since its release, I have often been asked about Universal Analytics. Now, after having had the chance to try it out for a while, my answer is as follows: From a market point of view, this release offers nothing new for the end customer. Google is now aligned with the rest of the industry, with their new features being old news for other existing tools.

What is unique about these new Google features, however, is that they are free, and definitely suffice for the needs of about 80% of website owners worldwide. Take for example the GA Real Time Reports that were released more than a year ago, which may be inferior to Mixpanel or ChartBeat, but are definitely good enough for most companies, and are free!

And what about Google Universal being more User-Centric?

At first, after reading the great reviews, I was sure that Google now recognizes the importance of placing the focus on the customer, rather than on the visit. I even believed, from what I had read, that this was the most significant change in this release. However, after working with Google Universal Analytics for the past 6 months, I can honestly say that GA has not changed its approach. (In Google’s defense, I don’t think they ever claimed to have made such a focus change and I don’t remember having seen an official document talking about such a change in approach.)

The funny thing is that looking back, despite there being such a buzz about this focus change (including how easy it is to link users across devices and use your own ID), no one ever explained the actual procedures. Even trying to figure out how their user identification actually works is almost impossible, without digging into the Measurement and Analytics Modules’ documentation.
read more…

May 8 13

Linking events to a specific user in Google Universal Analytics

by shay.sharon

In this post I will show you how to send events for a specific user from both the JavaScript agent (Analytics.js) and the Measurement Protocol. I could not find any existing information about how to do so. After some research, I did, however, find a great presentation by @techpad that talks about improving e-commerce tracking using Universal Analytics. In one of the last slides, the presentation talks about this issue exactly (slide #31).

The Measurement Protocol enables you to send events from a non web-based system, including a native mobile application or a batch process. In order to be able to link an event to the right user, you have to present the user identification as part of the call. Following is an example:


curl --data "v=1&tid=UA-XXX-X&cid=12345&t=pageview&dp=%2Fhome" http://www.google-analytics.com/collect

In this call I tell Google Analytics that user ID no. 12345 viewed a page called “home”.

curl is used here to execute a POST call to Google. Even if you are not familiar with curl, or are not a technical person, the call is pretty trivial. You have to pass:

  • The property ID you want to use for tracking this event (tid)
  • The client ID (cid)
  • The information about the pageview (t and dp)

I recommend you check out the Measurement Protocol Developer Guide for detailed information about the parameters that you can pass, and examples of how to send pageviews, events, e-commerce tracking and more.
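If you prefer to build the same hit from JavaScript rather than curl, the body of the call can be assembled as sketched below. `buildPageviewPayload` is a hypothetical helper name; the parameters are the ones used in the curl example above.

```javascript
// Build the form-encoded body for a Measurement Protocol pageview hit.
function buildPageviewPayload(trackingId, clientId, pagePath) {
  return new URLSearchParams({
    v: '1',          // protocol version
    tid: trackingId, // property ID
    cid: clientId,   // client ID
    t: 'pageview',   // hit type
    dp: pagePath,    // page path
  }).toString();
}

console.log(buildPageviewPayload('UA-XXX-X', '12345', '/home'));
// v=1&tid=UA-XXX-X&cid=12345&t=pageview&dp=%2Fhome
```

The resulting string is what curl sends with `--data`; any HTTP client that can POST it to the same `/collect` URL should work equally well.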

Google’s documentation states that the client ID (cid) should be an anonymous ID:

v=1             // Version.
&tid=UA-XXXX-Y  // Tracking ID / Web property / Property ID.
&cid=12345        // Anonymous Client ID.
&t=             // Hit Type.

(I’m guessing they mean that the ID should be anonymous to Google – not to you, obviously…) Using an email address as the client’s ID will enable Google to identify this user, whereas passing a client ID that you allocated/invented, and which does not hold any personal information, can only be linked to the actual user by yourself.

So far so good. Now we know how to send specific user events using the Measurement Protocol, but what about sending events using Analytics.js? Luckily, there is a way to send pageviews and other events on behalf of users:

Tell Analytics.js to use your own client ID (clientId) when creating the tracker:

ga('create', 'UA-XXXX-Y', {
  'clientId': '12345'  // your own client ID, passed as a key/value pair
});

However, when using this method there are a few things you should keep in mind:
read more…

Dec 19 12

My Google Tag Manager Wishlist

by shay.sharon

The Google Tag Manager (GTM) is indeed a great tool to have. Although several tag management services did exist before (such as tagman and ubertags), none of them were as simple and as easy as GTM – at least not for me.

I know the GTM has only been with us for a few months, and I know that the guys at Google are constantly working to improve it by adding all sorts of additional tag templates (such as comscore and media6degrees). But following the implementations that I have performed over the past few months, there are several specific features that I would be very happy to see in the near future. This is my GTM Feature Wishlist:

  1. A Tag Template Marketplace (something like the marketplace at SiteApps)
    I don’t know how complicated it is to create a tag template but I am sure companies like Mixpanel and ClickTale would love to be able to develop tags and have them appear on the list of available tags.
  2. UI Improvements that will enable:
    • Organizing tags, rules and macros into folders.
    • Providing quick access for creating a new tag, rule and macro from the left navigation and not just from the overview screen.
    • Switching between containers from the account drop-down menu
  3. Tag Dependency Management
    For defining the order in which the tag manager loads the tags. At present, the tag manager loads tags asynchronously, in order to optimize the page load time. However, there are some cases in which a specific tag is required to be loaded directly following another tag. [I am currently using events to do so, whereby the main tag fires an event (using the dataLayer object) which is used to fire the dependent tag.]
  4. “Synchronous” Tag Support
    Although tags that affect the page layout, such as in AB and MVT services, do not necessarily have to be loaded synchronously, you want to make sure they run BEFORE the page is shown to the user, as they might affect the user experience. Hence the term “synchronous”. For example, you may want to redirect to a different page in the case of a split test, or change the layout of the page in the case of an AB test or MVT. Together with tag dependencies, you can provide a good solution for split test and make sure other tags are not fired should the split test tag need to redirect the user to a different page.
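The event-based workaround mentioned under Tag Dependency Management might look roughly like this. It is a sketch only: `mainTagLoaded` is a hypothetical event name, and a plain array stands in for GTM's `dataLayer` so that the function can be run on its own.

```javascript
// Called at the end of the main tag's code: announce that it has finished.
function signalMainTagLoaded(dataLayer) {
  dataLayer.push({ event: 'mainTagLoaded' });
}

// In the browser you would pass the real data layer:
//   signalMainTagLoaded(window.dataLayer);
// and give the dependent tag a firing rule of "event equals mainTagLoaded".
```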

read more…

Dec 6 12

Using Google Tag Manager to Enable Visitors to Opt-Out of Being Tracked

by shay.sharon

Most websites use web analytics and marketing tools to track and optimize their visitors’ behavior. Most of those tools (Google Analytics, for example) use cookies for identifying these visitors. However, in some countries, website owners are required to receive specific authorization from each and every visitor for using cookies or storing personal information – or both. This type of authorization from visitors is usually called “Opting-In”. Moreover, site owners in some countries are allowed to use cookies but must first offer visitors the option to “Opt-Out”, which tells the site not to use cookies or store personal information for visitors who “opt-out”.

Regardless of legal requirements, presenting an opt-out option for your visitors can be considered a transparent, fair service, and will probably strengthen your visitors’ engagement and loyalty.

The idea is to have a default on your site, such as tracking all visitors unless otherwise specified, plus the option for your visitors to specifically “opt-out”, by performing a certain action such as clicking on a link or ticking a checkbox.

One possible way to implement the “opt-out” option is by using JavaScript to disable tracking (either by not sending events or by disabling the service using its built-in support, if available). However, when using JavaScript, even if you don’t fire an event or send a pageview, this doesn’t always mean that the 3rd party vendors won’t actually perform actions such as setting cookies.

The best way to implement the opt-out option is to not place tracking tags on the site. Google Tag Manager (GTM) is designed exactly for these kinds of tasks and will let you define when and where to include the tracking tags – and with almost no coding.

In this post I will show you how to implement opt-in and opt-out links on your site in order to allow your visitors to decide whether or not they agree to be tracked. I will use Google Analytics as an example but this method works with any service that requires placing a tag on your site for means of tracking.

You can see this in action here

In the above example, the site default is set to track all visitors, but it also offers users the option of opting out (until they clear the cookies).
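One way to picture the decision is a small check on an opt-out cookie whose result is handed to GTM. This is a sketch under my own assumptions, not necessarily the implementation used in the example: `tracking_optout` is a hypothetical cookie name and `trackingAllowed` a hypothetical data layer variable.

```javascript
// True if the visitor has the opt-out cookie set.
function hasOptedOut(cookieString) {
  return cookieString
    .split(';')
    .map((part) => part.trim())
    .includes('tracking_optout=1');
}

// Value to push to the data layer so that a GTM rule can block tracking tags.
function trackingDataLayerEntry(cookieString) {
  return { trackingAllowed: hasOptedOut(cookieString) ? 'no' : 'yes' };
}

// In the browser:
//   dataLayer.push(trackingDataLayerEntry(document.cookie));
```

Because the cookie is the only record of the choice, clearing cookies returns the visitor to the site default, as noted above.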

read more…

Nov 30 12

Google Tag Manager – Persistent Data Layer (per-session and per-visitor)

by shay.sharon

The Google Tag Manager (GTM) is awesome! It enables me to manage all my 3rd party tags in one place, and it seems like the Google team has made every effort to come up with a robust solution that could actually be an alternative to the way we work today.

Although there are additional features that I would have liked to see in the GTM, I realize that this is just the first version of this platform, and I am sure Google will continue to improve it.

One such missing feature relates to the data layer object, so I have created my own extension in the meantime, and thought it would be a good idea to share it.

The Data Layer Object: In order to send information to the GTM, or to configure when a tag should be fired, the GTM is designed to easily reference information placed in an object called “dataLayer”.

For example, you may want a specific tag to only be fired when a user logs onto your site. In order to do so, you would:

  • Define a new variable such as “visitorType”
  • Set it to “loggedin” when the user logs onto your site
  • Configure the tag to only be fired if “visitorType” equals “loggedin”

So what’s missing?

The problem is that the data layer object only persists as long as the visitor remains on the current page. Variables that are relevant across pages (such as visitor type) must be declared in the data layer on each and every website page.
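To illustrate the gap, here is one possible way to carry a variable such as “visitorType” from page to page. It is a generic sketch and not the extension described in this post: `setPersistent`, `restorePersistent` and the storage key are hypothetical, and `storage` is any object with `getItem`/`setItem` (for example `sessionStorage` for per-session values or `localStorage` for per-visitor values).

```javascript
const STORAGE_KEY = 'persistentDataLayer'; // hypothetical key name

// Save a variable for later pages and push it to the data layer right away.
function setPersistent(dataLayer, storage, name, value) {
  const saved = JSON.parse(storage.getItem(STORAGE_KEY) || '{}');
  saved[name] = value;
  storage.setItem(STORAGE_KEY, JSON.stringify(saved));
  dataLayer.push({ [name]: value });
}

// Call at the top of every page to re-declare the saved variables.
function restorePersistent(dataLayer, storage) {
  const saved = JSON.parse(storage.getItem(STORAGE_KEY) || '{}');
  if (Object.keys(saved).length > 0) {
    dataLayer.push(saved);
  }
}

// In the browser, on the login page:
//   setPersistent(window.dataLayer, window.sessionStorage, 'visitorType', 'loggedin');
// and on every other page:
//   restorePersistent(window.dataLayer, window.sessionStorage);
```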
read more…

Aug 15 12

Analyzing Visitor Paths Using Google Analytics

by shay.sharon
Are you able to answer the following questions precisely:

  • What is the exact trail each and every visitor goes through on your site prior to completing a goal?
  • How many visits, pageviews or minutes are required until the average visitor converts?

A few months ago I published a post about Goal Analysis and Distance to Goal Completion (Goal Analysis (Part II) – Measuring Distances to Goal Completion), where I discussed the length of time, number of visits, pageviews, and traffic sources that an average visitor goes through before converting. I ended the post by writing that I would later publish a separate post about migrating the marketing suite goal analysis report to work with Google Analytics (via GA API) – so here goes.

Some of the important information that we require can be obtained using tools such as KISSmetrics, Performable or Mixpanel. My proposed Goal Analysis Report – on top of Google Analytics – will provide you with everything you need, and it’s completely free!

I will later publish a separate post with step-by-step instructions on how to implement the report, but the aim of this post is to explain why the report is important and what it can achieve. If you don’t want to wait, just send me an email and I’ll help you set up the report (or visit this page). I already sent emails to those who contacted me over the past few months regarding this report, so if you didn’t receive this email but wish to, please let me know.

Why do we need a new and improved Goal Analysis Report?

The report I’m suggesting can help us understand three important questions that cannot be fully addressed using the standard Google Analytic reports:

Now let’s take a look at how the report can provide this important information:


How many unique visitors completed a certain goal and what is the more precise conversion rate?


When you look at the GA Goal Completion Report you can view the number of goal completions during a specific date range. This metric, however, tells you during how many visits your visitors complete the goal, and not how many unique visitors completed the goal. The same goes for the Conversion Rate metric.

It is easy to interpret these metrics incorrectly if the difference between the two is not clear. You can look back at an earlier post (Constant Decrease in Conversion Rate Over Time) for more information. It’s a bit technical but it is important to understand how the standard goal completion metrics might give you the wrong impression about your conversion rate.

My Goal Analysis Report presents the number of unique visitors who completed a certain goal within the specified date range, which, together with the total number of unique visitors, will give you a more precise conversion rate.


The above screenshot from the report shows a total of 1,475 download completions (i.e., visits during which visitors performed the “Download” goal). The Goal Analysis Report reveals that these 1,475 visits were performed by 267 unique visitors, and the conversion rate was 2.91% (i.e., 267 unique visitors who completed the goal out of 9,159 unique visitors who visited the site).
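The visitor-based conversion rate is simple arithmetic, sketched here with the figures from the screenshot (`conversionRate` is a hypothetical helper name):

```javascript
// Conversion rate as a percentage: converters divided by the total population.
function conversionRate(converted, total) {
  return (converted / total) * 100;
}

// 267 unique visitors completed the goal out of 9,159 unique visitors.
console.log(conversionRate(267, 9159).toFixed(1) + '%'); // 2.9%
```

The standard GA metric uses the same formula but with visits in both the numerator and the denominator, which is why the two rates differ so much.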

To emphasize why this metric is so important, let’s take a look at the equivalent GA goal report:

                      Goal Analysis Report    Google Analytics
Download completions  267 completions         1,475 completions
Conversion rate       2.91%                   12.21%

read more…