The Full Customer Journey – Managing User Identities with Google Universal, Mixpanel and KISSmetrics

In my last post I spoke about the two most basic requirements your website and analytics tool must support in order to track the full customer journey:

  1. User Identification – Your analytics tool must support user identification as a prerequisite for linking user activity across devices
  2. Signing In – Your users must sign in or identify themselves on each of their devices (PC, iPad, etc.)

However, although these prerequisites are necessary, I was surprised to find that they are not sufficient. Even if your website and analytics tool do support these 2 requirements, they may not always manage and interpret user activity correctly.

 

To explain, I will now elaborate on 5 scenarios (presented in my previous post) and how Google Universal, Mixpanel and KISSmetrics handle each of them. You can take this information into account when next choosing an analytics tool, and please let me know if you have any questions or input.

 

Scenario No.1 – What happens to session activity prior to registration?

A user browses your site for 10 minutes, searching for products and reading reviews, prior to registering. The user then registers.

 

Will the analytics tool track this activity prior to registration and link it to the registered user?

 

Mixpanel and KISSmetrics: No session activity is lost when you use the alias method, as it automatically links the user ID that you pass to it with the internal ID that Mixpanel or KISSmetrics has already allocated to the visitor.

 

Figure: What happens to session activity prior to registration in KISSmetrics and Mixpanel

 

When a user first accesses a site from a specific device, Mixpanel and KISSmetrics generate a new client ID (distinct_id in Mixpanel and anonymous ID in KISSmetrics) for that user (no. 1 in the above illustration) and store all activity under this ID from then on. Once you call the alias method, these tools tie the current client ID to the user ID that you passed (no. 2 in the above illustration), and all past activity is now related to that user ID.
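To make the mechanics concrete, here is a minimal sketch in Python. It is a toy model, not the Mixpanel or KISSmetrics SDK: the ToyTracker class and all of its method names are made up for illustration, and the only assumption is the behavior described above (activity is stored under a client ID, and alias ties that client ID to your user ID).

```python
import itertools


class ToyTracker:
    """Toy model of alias-style identity linking (not a real SDK)."""

    def __init__(self):
        self._next_id = itertools.count(100, 100)  # 100, 200, 300, ...
        self.events = []   # (client_id, event_name) in the order tracked
        self.aliases = {}  # client_id -> your own user ID

    def new_device(self):
        """First visit from a device: allocate a fresh client ID."""
        return next(self._next_id)

    def track(self, client_id, event):
        """Store an event under the client ID that fired it."""
        self.events.append((client_id, event))

    def alias(self, user_id, client_id):
        """Tie your own user ID to the client ID allocated earlier."""
        self.aliases[client_id] = user_id

    def events_for(self, user_id):
        """All events stored under any client ID aliased to user_id."""
        return [e for cid, e in self.events if self.aliases.get(cid) == user_id]


tracker = ToyTracker()
cid = tracker.new_device()             # no. 1: anonymous visitor gets an ID
tracker.track(cid, "search products")
tracker.track(cid, "read review")
tracker.alias("user1000", cid)         # no. 2: the user registers
tracker.track(cid, "view account")
print(tracker.events_for("user1000"))
```

Because the lookup goes through the alias table, the two events fired before registration show up for user1000 alongside the one fired afterwards.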

 

Google Universal: With Google Universal it is not that simple, as there is no alias method, and the much-talked-about user ID attribute doesn’t actually exist at this point. The good news, however, is that there is a workaround. The secret is to use the client ID that Google Universal generates for the user, and tie it to the user information on your end. When the user registers, instead of replacing the client ID with your own user ID, keep the current client ID and store it in your database. From then on, use this client ID every time the user logs in to the site.

 

Figure: What happens to session activity prior to registration in Google Universal
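The Google Universal workaround boils down to a small lookup on your side. The sketch below is a simplified illustration under stated assumptions: the users dictionary stands in for your own user table, the function names are hypothetical, and the client ID strings are placeholders rather than real Google-generated values.

```python
users = {}  # stand-in for your own user table: user_id -> stored client ID


def on_register(user_id, current_client_id):
    """Keep the client ID the visitor already has and persist it."""
    users[user_id] = current_client_id
    return current_client_id


def on_sign_in(user_id, device_client_id):
    """Return the client ID that future activity should be reported under."""
    # Fall back to the device's own ID if nothing was stored for this user.
    return users.get(user_id, device_client_id)


desktop_cid = "cid-100"  # placeholder for the ID generated on the desktop
ipad_cid = "cid-200"     # placeholder for the ID generated on the iPad

on_register("user1000", desktop_cid)
active_cid = on_sign_in("user1000", ipad_cid)
print(active_cid)  # cid-100
```

The value returned by on_sign_in is the one you would then hand to the tracker. In analytics.js this is done through the clientId field when the tracker is created, but check the current documentation before relying on that.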

 

Scenario No. 2 – What happens to session activity prior to signing in?

Mixpanel: Every time a user signs in, use the identify method to tell Mixpanel to switch to the relevant client ID. This automatically sets the correct client ID based on the user ID that you passed to the identify method.

 

Google Universal: There is no identify method in Google Universal, but by manually changing the client ID (explained here), all future activity is related to that client ID, thereby achieving the exact same result.

 

So both Google Universal and Mixpanel support identification, one way or another.

 

But what about the pages already viewed by the user (browsing and searching) prior to signing in? And what about the marketing channel used?

 

Let’s expand on this scenario. During her first visit, Anne registers on eBay from her desktop (visit I). A few days later, she uses her iPad to look for a digital camera on eBay, but doesn’t find anything (visit II). Later that same day, she uses her iPad to google for a specific camera. She finds what she is looking for, logs on, and makes a bid (visit III).

 

What your web analytics tool should show is: 1 unique visitor with 3 visits.

 

But what Mixpanel or Google Analytics actually show is: 2 unique visitors with 4 visits.
Let’s see why:

 

Figure: What happens to session activity prior to signing in

 

First Visit

  1. Visit I, pre-registration from desktop. Anne views one page, but as the analytics tool does not recognize the visitor (first visit to the site from her desktop), it generates a new client ID (100). Anne then views one more page before registering (all tracked under the same client ID (100)).
  2. Anne now registers. The site executes the alias method, which ties our newly created user ID (1000) to client ID (100), so the tool understands that the two pageviews prior to registration are related to this user.

Second Visit

  1. Visit II, from iPad. The analytics tool does not recognize Anne as this is her first visit to the site via her iPad. Therefore, the analytics tool generates a new client ID (200), and this entire visit (2 pageviews) is related to this client ID (200). (Note that Anne has not logged in during this visit).

Third Visit

  1. Visit III, from iPad. The analytics tool still recognizes the user as client ID (200) because she has again accessed the site via her iPad. The next two pageviews are, therefore, automatically related to this user: client ID (200).
  2. Once Anne logs in, however, the site executes the identify method and uses Anne’s user ID (1000). The analytics tool now changes her client ID from 200 to 100, and all future activity will be related to client ID (100).

Presenting 2 unique visitors (client ID #100 and client ID #200) with 4 visits (the third visit was split between client ID #200 and client ID #100) instead of 1 unique visitor with 3 visits is unfortunately far from correct.
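The arithmetic behind these numbers can be checked with a few lines of Python. This is only an illustrative count, assuming that the tool treats every client ID as a unique visitor and counts a visit once per client ID that was active in it; the single pageview after signing in is added for illustration.

```python
# (visit label, client ID the pageview was stored under)
pageviews = [
    ("I", 100), ("I", 100),      # desktop, before registering
    ("II", 200), ("II", 200),    # iPad, never signed in
    ("III", 200), ("III", 200),  # iPad, before signing in
    ("III", 100),                # iPad, after identify switched 200 -> 100
]

# What the tool reports: one visitor per client ID, one visit per
# (visit, client ID) combination.
reported_visitors = {cid for _, cid in pageviews}
reported_visits = {(visit, cid) for visit, cid in pageviews}
print(len(reported_visitors), len(reported_visits))  # 2 4

# What it should report: every pageview belongs to Anne (user ID 1000).
actual_visits = {visit for visit, _ in pageviews}
print(1, len(actual_visits))  # 1 3
```

The extra visit comes entirely from visit III, which is stored under client ID 200 before the sign-in and under client ID 100 after it.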

 

In Mixpanel, this behavior is by design as described here:
“This [the identify method] will remap his phone activity to the original ID he used when signing up for your service, which is the most desirable outcome. This does mean that regrettably the events he fired before logging in will not be associated with him.”

 

The results presented by Google Universal are far from correct, and we currently only have the client ID to work with. As I have said before, its user ID attribute only exists in theory, and Google Universal is far from being a user-centric platform.

 

The KISSmetrics solution: We need to be able to merge the two identities. You can read more about the KISSmetrics built-in support for merging identities with their alias and identify methods, but in a nutshell, by using just the identify method, KISSmetrics links the current anonymous ID with the identified user every time the identify method is called (as long as the current user is anonymous and has not already been identified). So eventually, the same user ID will be linked to all activities prior to signing in.

Here is the live activity console of the above flow:

 

Figure: KISSmetrics' live activity console

 

As you can see, user1000 had two different anonymous IDs (one starts with “qz…” and the second starts with “rLJ…”). Some of the page view events were related to the first anonymous ID, others to the second anonymous ID, and one page view was related to user1000 (after the second alias). We know that all those page view events were performed by the same user. Will KISSmetrics figure this out?

 

Well, searching for people who have viewed at least one page achieves the following results:

 

And by looking at the activity performed by user1000 you can see that KISSmetrics successfully tied all page view events to user1000, including page views that were fired before signing in. Good job!

 

Figure: Activity performed by user1000
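One way to picture this merging is as a union-find structure over identities. The sketch below is a toy model of the idea, not KISSmetrics code: the IdentityGraph class is hypothetical, "qz" and "rLJ" are shortened stand-ins for the two anonymous IDs, and the rule that identify only links a visitor who is still anonymous is left out for brevity.

```python
class IdentityGraph:
    """Toy union-find over identities (a model of merging, not a real SDK)."""

    def __init__(self):
        self.parent = {}

    def find(self, identity):
        """Return the representative identity for this ID."""
        self.parent.setdefault(identity, identity)
        while self.parent[identity] != identity:
            # Path halving keeps later lookups short.
            self.parent[identity] = self.parent[self.parent[identity]]
            identity = self.parent[identity]
        return identity

    def identify(self, anonymous_id, user_id):
        """Link the current anonymous ID to the signed-in user."""
        self.parent[self.find(anonymous_id)] = self.find(user_id)


graph = IdentityGraph()
events = [("qz", "page view"), ("qz", "page view"),
          ("rLJ", "page view"), ("rLJ", "page view")]

graph.identify("qz", "user1000")   # registration on the desktop
graph.identify("rLJ", "user1000")  # later sign-in on the iPad
events.append(("user1000", "page view"))

people = {graph.find(identity) for identity, _ in events}
print(people)  # {'user1000'}
```

All five page views resolve to a single person, which matches what the report shows for user1000.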

 

KISSmetrics definitely has a huge advantage over Mixpanel and Google Universal in this respect. I am not sure how relevant it is for most sites, but for those who need this level of accuracy, the choice is simple: go with KISSmetrics!

 

Now that we have dealt with merging identities, this leads me to our next issue.

 

Scenario No. 3 – When is past activity credited, and to whom?

How do we know if the user that has just signed in from a specific PC is the same user that signed in from that same PC earlier today?

 

First example: Greg is looking for tickets online for a show tonight and uses the hotel’s PC. He googles the name of the artist, clicks on a text ad, and accesses the ticket sale site. But sadly, the show is sold out. Greg closes the browser and leaves the computer. An hour later, Jack uses the exact same PC and goes directly to the ticket sale site. He registers or signs in, and purchases tickets to a different show.

 

KISSmetrics will, as you have probably already guessed, merge identities and present this data as one single user. Greg’s visit will be related to Jack’s visit, and moreover – Jack’s purchase will be credited to the text ad that Greg accessed… This is, of course, incorrect tracking data.

 

Mixpanel and Google Universal will, however, report the correct results in this case.

 

Second example: Sarah uses her PC to look for tickets for a show tonight. She googles the name of the artist and clicks on the text ad. Sadly, this show is sold out. She closes the browser and leaves the computer.
An hour later, Sarah goes back to her PC and accesses the ticket sale site directly. This time, she also registers or signs in, and purchases tickets for a different show.

 

Just to confuse matters – this time KISSmetrics will be correct, and Mixpanel and Google Universal will track the information incorrectly.

 

It is important to note that there is no technical way to get both examples right at the same time.

 

The best way to prevent KISSmetrics from merging identities is to use the clearIdentity method, which creates a new anonymous ID. In the first example, using the clearIdentity method when Greg or Jack leaves the site tells KISSmetrics to treat the two visits as if they were performed by two separate visitors.

 

However, although the clearIdentity method will fix the first example, it will obviously ruin the second one.
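The trade-off can be shown with a small simulation. This is a simplified sketch, assuming the purchase is credited to the first channel recorded under the current anonymous ID; the function names are hypothetical and do not come from any of the three tools.

```python
def credited_channel(sessions, clear_between_sessions):
    """Channel credited for a purchase made in the last session.

    sessions: ordered (person, channel) pairs on one browser.
    With merging, the browser keeps one anonymous ID, so the first
    channel seen is credited. Clearing the identity between sessions
    starts a new history, so only the last session's channel counts.
    """
    channels = [channel for _, channel in sessions]
    return channels[-1] if clear_between_sessions else channels[0]


def correct_channel(sessions):
    """Ground truth: first channel used by the person who purchased."""
    buyer = sessions[-1][0]
    return next(channel for person, channel in sessions if person == buyer)


hotel_pc = [("Greg", "text ad"), ("Jack", "direct")]
home_pc = [("Sarah", "text ad"), ("Sarah", "direct")]

results = {}
for name, sessions in [("hotel PC", hotel_pc), ("home PC", home_pc)]:
    for clear in (False, True):
        is_right = credited_channel(sessions, clear) == correct_channel(sessions)
        results[(name, clear)] = is_right
        print(name, "clear" if clear else "merge", "correct" if is_right else "wrong")
```

Merging is right only on the home PC and clearing is right only on the hotel PC, so no single setting handles both examples.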

 

Therefore, what I suggest you do is use the clearIdentity method selectively, such as:

  • Access from mobile devices – You may choose to avoid using the clearIdentity method when your site is accessed by mobile devices, as tablets and mobile phones are often used by one person.
  • Online services – The clearIdentity method could be used for online services (such as Gmail or Dropbox), where users must log in to use the service.
  • Consider asking the users if they are who you think they are. This may sound like a joke, but is actually the most accurate way to know who you are dealing with.

 

Here too KISSmetrics has a huge advantage, as it is the only tool out of the 3 presented here that gives you technical freedom to decide whether or not to credit past activity to currently signed-in users.

 

Scenario No. 4 – Should the activity of 2 registered users using the same PC be linked?

Is the user that has just registered on my site the same user who registered two days ago from the same PC? Chances are these are 2 different users. However, Mixpanel and Google Universal, by default, automatically link these two users and consider them as one…

 

Example: Greg is looking for tickets online for a show tonight and uses the hotel’s PC (same example as before). He googles the name of the artist, clicks on a text ad, and accesses the ticket sale site. This time the show is not sold out and he buys tickets. An hour later, Jack uses the exact same PC and registers.

 

KISSmetrics – By now we already know that KISSmetrics provides a way to “forget” the user, so by calling the clearIdentity method when each user logs off, the two users will be tracked correctly. Once again – the KISSmetrics team has done a great job and really thought this feature through.

 

Mixpanel: I highly recommend not calling the Mixpanel alias method more than once for the same anonymous ID, as it can hold only one pair of user ID and anonymous ID. If you do call it again, all past activity will be related to the user ID that you have just passed to the Mixpanel alias method. In the above example, all of Greg’s activity will be related to Jack, and the next time Greg signs in, Mixpanel will not recognize him and will treat him as a new, separate user 🙁

 

If you use the Mixpanel alias method when users register on your site – be aware that your data will be compromised every time a user registers via a public computer that has already been used in the past to access your site (unless all cookies have been erased). This could have a huge negative impact on your data. There is, however, a way to get around this: implement your own version of the clearIdentity method, i.e., clear the relevant cookies and force Mixpanel to generate a new client ID.
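The idea behind a home-made clearIdentity is simple enough to sketch. The Python below is a toy model under stated assumptions: a dictionary stands in for the browser's cookie jar, and the cookie name analytics_id is a placeholder, since the real cookie names depend on the tool and on your setup.

```python
import uuid


def get_client_id(cookies, cookie_name="analytics_id"):
    """Return the client ID in the cookie jar, creating one if missing."""
    if cookie_name not in cookies:
        cookies[cookie_name] = uuid.uuid4().hex
    return cookies[cookie_name]


def clear_identity(cookies, cookie_name="analytics_id"):
    """Forget the current visitor: drop the cookie so a new ID is issued."""
    cookies.pop(cookie_name, None)


browser = {}                       # the shared hotel PC
greg_id = get_client_id(browser)   # Greg browses and buys tickets
clear_identity(browser)            # Greg logs off
jack_id = get_client_id(browser)   # Jack gets a fresh client ID
print(greg_id != jack_id)
```

With the cookie gone, the next page load starts from a new client ID, so Jack's registration is no longer tied to Greg's activity.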

 

Google Analytics will also incorrectly track activity on your site in such a case. Your database will include two different users with the same client ID, and they will unfortunately be treated as one and the same user. However – this too can be fixed! As with Mixpanel, just implement your own version of the clearIdentity method, clear the relevant cookies and force GA to generate a new client ID.

 

Scenario No. 5 – What happens to PREVIOUS session activity prior to registration?

In many cases, the most critical user session from a marketing point-of-view is prior to registration. If a person clicks on an ad from their iPad but does not register or identify themselves on the landing page or during this first session, chances are you won’t be able to link this iPad session and the correct marketing channel (the ad on their iPad) to a future registration from their desktop.

 

Example: While looking for a digital camera on my iPad, I click on a text ad via Google and arrive at the B&H site. I find a camera that suits me but decide to look for a better price. I email myself the link and the following day use my desktop to directly access the B&H site (via the link I emailed myself), locate the camera and buy it.

 

All three tools will treat this activity as 2 visits by 2 separate users, and there is no technical solution for this. If you cannot convince the user to identify himself during his first session, chances are you will not be able to credit the marketing channel that brought him to your site when he later converts.

 

The best advice I can give you here is to find ways to push users to identify themselves before leaving the site (during their first visit).

 

Or… use KISSmetrics!

 

If you find a way to convince users to identify themselves, even if it is not during the first visit to the site – KISSmetrics will merge the visits, relate them both to the same user, and will even retroactively credit all events to that original marketing channel.

 

This amazing feature is only offered by KISSmetrics! Here is my test:

  1. I set the referrer to cnn.com and went to the site on my iPad
  2. I then performed a “checkout” event without identifying myself
  3. I entered the same site from my desktop (directly), registered (as user1002), and performed the “checkout” event again.
     

    At this point, KISSmetrics reported 2 users who performed the “checkout” event at least once: 

    Figure: Visitors who checked out - KISSmetrics

  4. Two days later I used my iPad again, but this time logged in as the same user who registered from my desktop, i.e., user1002.
     

    A couple of hours later, the same report only showed 1 user: 

    Figure: Visitors who checked out after signing in - KISSmetrics

    Not only did KISSmetrics merge these two users into one, it also credited the initial referrer I used (cnn.com). This is truly impressive!

Here are some additional tips to help you achieve correct tracking:

  • Promotional codes: You can use promotional codes to convince people to close the loop when registering. With the above B&H example, arriving at the site from my iPad and receiving a promotional code for 5% off my next purchase might convince me to register, and then these 2 visits will be tied together.
  • Crediting marketing channels: When a visitor uses a marketing channel to access your site, you can save this information along with the date of the visit. Do the same when they convert. That way, if the user signs in again in the future and thereby closes the loop, you will be able to see if the marketing channel originally used was before or after the date/time of user conversion and credit the correct traffic source.
  • Ask the user: “Is this your first visit to this site?”, “How did you hear about us?”, “Do you regularly use more than one device?” You will be surprised how much information you can obtain by asking the right questions and offering an incentive so that the user replies.
  • Segmenting your marketing channels: When analyzing your marketing channels, make sure to segment the traffic according to the device, and look for drastic changes in performance (negative or positive). For example, a certain marketing channel may perform poorly on mobile devices but increase direct traffic from tablets or desktops.
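The “Crediting marketing channels” tip above can be sketched as a simple date comparison. This is only an illustration: the function name is hypothetical, the dates and channel labels are made-up sample values, and crediting the earliest touch before the conversion is one possible rule among several.

```python
from datetime import datetime


def channel_to_credit(touches, converted_at):
    """Pick the earliest channel touch that happened before the conversion.

    touches: list of (timestamp, channel) gathered from all of the
    user's devices once they have signed in and closed the loop.
    """
    before = [touch for touch in touches if touch[0] <= converted_at]
    return min(before)[1] if before else None


touches = [
    (datetime(2013, 5, 1, 9, 0), "google text ad (iPad)"),
    (datetime(2013, 5, 2, 18, 30), "direct (desktop)"),
    (datetime(2013, 5, 4, 8, 0), "email campaign (iPad)"),
]
converted_at = datetime(2013, 5, 2, 18, 45)

print(channel_to_credit(touches, converted_at))  # google text ad (iPad)
```

The email campaign is ignored because it happened after the conversion, and the text ad wins over the direct visit because it came first.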

Summary

By now you have probably reached the conclusion that you cannot obtain all the information you need, and that in many cases, you will only receive a partial (and even false) picture of what is really going on with your site.

 

To maximize the information obtained and keep it as accurate as possible, my suggestion is:

  1. Understand each problem
  2. See what is (and isn’t) possible with each of the tools
  3. Make the necessary amendments to your site analytics tool

 

As far as I have found, KISSmetrics is by far the most advanced analytics tool in this field. KISSmetrics has concentrated on data processing – and at the end of the day, I personally prefer my data to be accurate, even if it takes a couple of hours to process. If, however, these 5 scenarios are not really relevant for you, then implementing your own version of the clearIdentity method in Mixpanel and Google Analytics could also provide a good enough solution.

 

If these 5 issues are applicable to your site, I recommend the following:

  • Use KISSmetrics and use the clearIdentity method when the session ends.
  • Every time a user accesses your site: (1) ask them if they are who you think they are (you can keep the last user in a cookie) and (2) push them to make a choice – are they who you think they are, or are they someone else? This method is used on many sites, including Amazon and Google.
  • Segment your traffic by marketing channels and devices, and add promotional codes for the specific channels that you wish to evaluate and to tie visits to the user when needed.
  • Ask the user whenever you’re not sure, and find creative ways to get the answer you are looking for. Maybe concentrate on the largest “unknown”, which in most cases will be new visitors from direct traffic that visited your site and registered for the first time.