Why not analyze experiments in standard Google reports

Now that Google's free A/B testing tool, Google Optimize, is available to everyone, there will hopefully be even more testing and learning. Much has already been written about the new tool, but most reviews and articles focus on how it compares to other A/B testing tools. One of its most frequently mentioned advantages is its seamless connection to Google Analytics. In this article, I put that claim to the test by looking at how A/B test results are best analyzed.

There are two standard reports in which completed tests can be analyzed: the Google Optimize report and the Google Analytics Experiments report. An advantage of the Google Optimize report is that it uses Bayesian probability to indicate a winner. The Analytics Experiments report, on the other hand, lets you select a segment so you can dive deeper into the results. Both reports, however, are very brief and contain only the bare essentials: total sessions and the number of goals achieved per variant. They don't show, for example, the impact of a test on the average time on a page or its exit rate. But the two main problems are the difference between experiment sessions in Google Optimize and Analytics, and the way experiment sessions are used.

The difference between Google Optimize and Analytics

Before diving deeper into the difference between Google Optimize and Analytics, it is worth looking at how each tool defines an ‘experiment session’. The definitions look identical on paper, but in practice the numbers differ: they are on average 1% to 2% higher in Google Analytics. For tests that have been running for less than a week, the difference is much larger, with up to 6%(!) more visitors and transactions* in Google Analytics. This can have big implications: with the wrong analysis method, you could, for example, wrongly reject a winning variant.

*For the above figures, I compared the results of 27 diverse tests in Google Optimize and Google Analytics.

Google itself gives several reasons for the difference between the two tools:

  • In Analytics, actual conversion rates are reported, while Optimize reports use modeled conversion rates.
  • Reporting delays between the two products differ (Analytics is faster than Optimize).
  • Sessions collected just before the end of an experiment are not forwarded to Optimize, but are displayed in Analytics.

The latter reason also explains the larger difference for tests that have only been running for a few days. To me, this is reason enough to analyze your tests in Google Analytics rather than in Google Optimize. You do then miss the Bayesian probability of a winner that is built into Optimize, but fortunately there are plenty of Bayesian calculators available (e.g. AB test guide).
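For anyone who wants to check the numbers themselves, below is a minimal sketch of the kind of calculation such a calculator performs: the probability that the variant's conversion rate beats the control's, assuming uniform Beta(1, 1) priors. All the input numbers are illustrative, not taken from the tests above.

```python
import numpy as np

# Minimal Bayesian A/B sketch: P(variant beats control), assuming
# uniform Beta(1, 1) priors. The input numbers are illustrative.
rng = np.random.default_rng(42)

users_a, conversions_a = 10_000, 400  # control
users_b, conversions_b = 10_000, 450  # variant

# Posterior of each conversion rate: Beta(1 + conversions, 1 + non-conversions).
p_a = rng.beta(1 + conversions_a, 1 + users_a - conversions_a, size=200_000)
p_b = rng.beta(1 + conversions_b, 1 + users_b - conversions_b, size=200_000)

print(f"P(variant > control) = {(p_b > p_a).mean():.1%}")
```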

Using experiment sessions

Apart from the mismatch between the Google Optimize and Analytics data, there is a second issue: I almost always analyze users instead of sessions (or experiment sessions). A session-based analysis focuses on the short term, while many products have a buying process that spans multiple sessions; you miss that if you analyze by session. A session-based analysis also brings statistical problems, because the conversion rate will be a lot lower; this blog post by Hubert Wassner elaborates on that. Moreover, the Bayesian statistics involved are based on two values, 0 and 1: bought or not bought. If you measure sessions, a visitor can buy something multiple times, whereas you only want to know whether a visitor converts, regardless of how many times something is purchased.
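A toy example with made-up data makes the difference concrete: the same set of visits produces a much lower conversion rate per session than per user.

```python
# Toy data: one visitor (u1) needs three sessions before buying.
sessions = [
    {"user": "u1", "transaction": False},
    {"user": "u1", "transaction": False},
    {"user": "u1", "transaction": True},
    {"user": "u2", "transaction": False},
    {"user": "u3", "transaction": True},
]

# Per session: 2 converting sessions out of 5.
session_cr = sum(s["transaction"] for s in sessions) / len(sessions)

# Per user: 2 converting users out of 3.
users = {s["user"] for s in sessions}
converted = {s["user"] for s in sessions if s["transaction"]}
user_cr = len(converted) / len(users)

print(f"Session-based conversion rate: {session_cr:.0%}")  # 40%
print(f"User-based conversion rate:    {user_cr:.0%}")     # 67%
```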

The right way to analyze Google Optimize tests

Based on the above findings, here is a short summary of how best to analyze your Google Optimize experiments:

  1. First, I recommend analyzing all your tests in Google Analytics, because it reports more data and reports it faster.
  2. In addition, it is best to work with a custom report that uses the experiment name and variant as its dimension and users as its metric, alongside other interesting metrics such as transactions, time on page or exit rate. A custom report also lets you segment easily by adding a secondary dimension and segments. Below is an example of the custom Google Analytics report that I use when analyzing Google Optimize tests (where 0 is the control and 1 is the variant); a sketch of how to pull the same data programmatically follows after this list.
  3. Analyze users who did (or did not do) something. If you use a frequentist or Bayesian calculation to determine a winner, you have to compare users who did or did not perform an action. That means you can't use transactions (or any other metric) as the conversion goal; instead, you have to count users with a transaction (or another goal). To get at this data, you need to create a set of segments in which the visitor performs the action.
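As promised above, here is a sketch of how the same report could be pulled programmatically with the Analytics Reporting API v4. The view ID and key file are placeholders, and I am assuming the ga:experimentName and ga:experimentCombination dimensions that Analytics exposes for Optimize experiments; verify them in the Dimensions & Metrics Explorer for your own property.

```python
from googleapiclient.discovery import build
from google.oauth2 import service_account

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE = 'service-account-key.json'  # placeholder: your own key file
VIEW_ID = '123456789'                  # placeholder: your own view ID

credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE, scopes=SCOPES)
analytics = build('analyticsreporting', 'v4', credentials=credentials)

# One row per experiment variant, with users and transactions as metrics.
response = analytics.reports().batchGet(body={
    'reportRequests': [{
        'viewId': VIEW_ID,
        'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
        'dimensions': [{'name': 'ga:experimentName'},
                       {'name': 'ga:experimentCombination'}],
        'metrics': [{'expression': 'ga:users'},
                    {'expression': 'ga:transactions'}],
    }]
}).execute()

for row in response['reports'][0]['data'].get('rows', []):
    name, variant = row['dimensions']
    users, transactions = row['metrics'][0]['values']
    print(f"{name} / variant {variant}: {users} users, {transactions} transactions")
```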

The purpose of the sample test is to increase the number of transactions from the test page. The segment then looks like this:
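In the interface this is simply a segment that includes users with at least one transaction. As a rough sketch, the same condition expressed as a dynamic segment for the Reporting API v4 request above:

```python
# Sketch: "users with a transaction" as a Reporting API v4 dynamic
# segment. Add it to the reportRequest under 'segments', and note that
# {'name': 'ga:segment'} must then also be added to the dimensions.
users_with_transaction = {
    'dynamicSegment': {
        'name': 'Users with a transaction',
        'userSegment': {
            'segmentFilters': [{
                'simpleSegment': {
                    'orFiltersForSegment': [{
                        'segmentFilterClauses': [{
                            'metricFilter': {
                                'metricName': 'ga:transactions',
                                'operator': 'GREATER_THAN',
                                'comparisonValue': '0',
                            }
                        }]
                    }]
                }
            }]
        }
    }
}
```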

If you enter the users without a transaction and the users with a transaction per variant into a Bayesian calculator, variant B of the test page turns out to have an 80.5% probability of a higher conversion rate than the original. Depending on how much risk you are willing to take, you can then choose to implement the variant. We also see that the test probably had little impact on visitor behavior, because the average time on the page and the bounce rate remained about the same.

 

This article was published on May 22 at Webanalists.com