June 1, 2017
Now that the free A/B testing tool Google Optimize is available to everyone, hopefully there will be even more testing and learning. Much has already been written about the new tool, but most reviews and articles focus mainly on how it compares to other A/B testing tools. One of the frequently mentioned advantages of Google Optimize is its seamless integration with Google Analytics. To put that claim to the test, this article looks at how to analyze A/B tests in Google Optimize and Google Analytics.

There are two standard reports in which completed tests can be analyzed: the Google Optimize report and the Google Analytics Experiments report. An advantage of the Google Optimize report is that it shows the Bayesian probability that a variant is the winner. The Analytics Experiments report, on the other hand, lets you apply a segment so you can dive deeper into the results. Both reports, however, are very brief and contain only the bare essentials: total sessions and the number of goals achieved per variant. They do not show, for example, the impact of a test on the average time on a page or its exit rate. But the two main problems are the difference in experiment sessions between Google Optimize and Analytics, and the way experiment sessions are used.
Before I dive deeper into the difference between Google Optimize and Analytics, I am interested in the definition of an ‘experiment session’ as formulated by Google Optimize and by Google Analytics. These seem the same on paper, but in practice there is a difference: the numbers are on average 1% to 2% higher in Google Analytics. For tests that have been running for less than a week, this difference appears to be much larger; you can even see up to 6%(!) more visitors and transactions* in Google Analytics. This can have big implications: you could, for example, wrongly reject a winning variant purely because of the analysis method you used.
*For the above data, I looked at the results of 27 diverse tests in Google Optimize and Google Analytics.
Google itself gives several reasons for the difference between the two tools:
The latter reason also explains the larger difference in tests that have only been running for a few days. All this seems to me enough evidence to analyze your tests not in Google Optimize but in Google Analytics. Unfortunately, you then miss the Bayesian probability of a winner that is built into the tool, but fortunately there are plenty of Bayesian calculators available (e.g. AB test guide).
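For those who prefer to do this calculation themselves rather than in an online tool, a minimal sketch in Python of such a ‘probability to beat the original’ calculation could look as follows. The function name and the uniform Beta(1, 1) prior are my own assumptions, not a reproduction of any specific calculator:

```python
import numpy as np

def prob_b_beats_a(users_a, conversions_a, users_b, conversions_b, draws=200_000):
    """Probability that variant B has a higher conversion rate than the original (A)."""
    rng = np.random.default_rng(42)
    # Posterior of each conversion rate with a uniform Beta(1, 1) prior:
    # Beta(converters + 1, non-converters + 1)
    post_a = rng.beta(conversions_a + 1, users_a - conversions_a + 1, draws)
    post_b = rng.beta(conversions_b + 1, users_b - conversions_b + 1, draws)
    # Share of simulated draws in which B outperforms A
    return (post_b > post_a).mean()
```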
Besides the mismatch between the Google Optimize and Analytics data, there is a second point: I almost always analyze users instead of sessions/experiment sessions. A session-based analysis focuses on the short term, but many products have a buying process that spans multiple sessions, which you miss if you analyze by session. A ‘session-based analysis’ also brings statistical problems, because the conversion rate will be a lot lower; this blog post by Hubert Wassner elaborates on that. In addition, the Bayesian calculation is based on two outcomes: 0 or 1, bought or not bought. If you measure sessions, a visitor can buy something multiple times, but you only want to know whether a visitor converts at all, regardless of how many times something is purchased.
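To make the user-based approach concrete, here is a small sketch (with invented column names and data) of how session-level rows could be collapsed into one binary converted-or-not value per user before the counts go into a Bayesian calculator:

```python
import pandas as pd

# Hypothetical session-level export: one row per session, with the variant shown
# and the number of transactions in that session.
sessions = pd.DataFrame({
    "user_id":      ["u1", "u1", "u2", "u3", "u3", "u4"],
    "variant":      ["A",  "A",  "A",  "B",  "B",  "B"],
    "transactions": [0,    2,    0,    1,    1,    0],
})

# Collapse to one row per user: did this user convert at all (0/1)?
users = (
    sessions.groupby(["user_id", "variant"], as_index=False)["transactions"]
            .sum()
            .assign(converted=lambda d: (d["transactions"] > 0).astype(int))
)

# Per variant: total users and users with a transaction, ready for a Bayesian calculator.
summary = users.groupby("variant")["converted"].agg(users="count", converters="sum")
print(summary)
```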
Based on the above findings, I have briefly summarized the best way to use Google Optimize to analyze your experiments:

The goal of the example test is to increase the number of transactions from the test page. The segment then looks like this:

If you enter the number of users without a transaction and the number of users with a transaction per variant into a Bayesian calculator, variant B of the test page has an 80.5% probability of a higher conversion rate than the original. Depending on how much risk you are willing to take, you can then choose to implement the variant. In addition, we see that the test probably had little impact on visitor behavior, because the average time on the page and the bounce rate remained about the same.
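To illustrate how such a percentage follows from the raw counts, here is a small worked example in the same style as the sketch earlier in this article. The numbers are invented for illustration and are not the actual test data:

```python
import numpy as np

rng = np.random.default_rng(1)
draws = 200_000

# Invented counts: 5,000 users per variant, 150 converters for the original (A)
# and 165 converters for variant B.
post_a = rng.beta(150 + 1, 5_000 - 150 + 1, draws)
post_b = rng.beta(165 + 1, 5_000 - 165 + 1, draws)

print(f"P(B beats A) = {(post_b > post_a).mean():.1%}")  # roughly 80% with these counts
```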

This article was published on May 22 at Webanalists.com