The A/B effect: bad news for lovers of evidence-based decisions

Bad news for lovers of evidence-based decision-making: new research from Michelle Meyer and colleagues (2019) shows that conducting experiments to inform decisions evokes resistance, even when people would not mind if one of the options were simply implemented at random.

If it is still unknown which approach will work best, conducting a randomized experiment (popularly called A/B testing) is considered the gold standard of research. This is because an experiment can show which approach will be most successful. Yet according to Meyer's research, many people find conducting an experiment more inappropriate than arbitrarily choosing an approach with no evidence that it will be effective.

Online experiment among students leads to outcry

Meyer opens her paper with anecdotal evidence that conducting experiments can cause a stir, even when the experiment serves a noble purpose. The U.S. online education provider Pearson Education sought to motivate its students to complete study assignments more often by adding encouraging messages to its products. Based on the psychological literature, two approaches were chosen that could be effective: encouraging a ‘growth mindset’ (e.g. “No one is born a great programmer. Success takes hours and hours of practice.”) and ‘anchoring of effort’ (e.g. “Some students tried this question 26 times! Don't worry if it takes you a few tries to get it right.”).

In an online field experiment among thousands of students, both approaches were tested against a control group (the existing version of the online course without encouraging messages). When Pearson made the results of this study public, it prompted a torrent of negative reactions. Take, for example, an article in The Washington Post in which Pearson is accused of crossing ethical boundaries by using its students, without their consent, as ‘guinea pigs’ for its own research goals. One cannot escape the impression that this resistance would not have arisen if Pearson had arbitrarily chosen to implement one of the two approaches. Is it then ethically ‘better’ to make decisions based on intuition than on experimental evidence?

The A/B effect: field experimentation feels more inappropriate than random choice

In a series of 16 experiments among 5873 participants, people appear to find it inappropriate when a field experiment is conducted to show which of two approaches is most effective, while they do find it appropriate when one of these approaches is arbitrarily chosen and implemented. This result emerges in a variety of domains (from health care to the design of self-driving cars) and is independent of education level or knowledge about the domain under study. Meyer calls this the A/B effect.

To illustrate, if it is unknown which of two drugs works best, most participants would prefer that a physician administer either drug A or drug B to everyone, rather than give half the patients one drug and the other half the other in order to figure out which one is most effective.

Deciding on intuition is very risky

Apparently, conducting field experiments is intuitively less acceptable to people than randomly choosing one of several options. As far as I am concerned, this is worrisome. It makes decision-makers more likely to make a decision without gathering evidence about what the effect will be, either because they themselves don't feel good about field experiments, or because they fear that others (such as customers, voters, social media and other stakeholders) will react negatively to it.

And all this while it is known that making decisions based on intuition is very risky. As a behavioral scientist at Online Dialogue, I conduct online experiments for many different companies, and I see that only roughly a quarter of the ideas deliver the intended result. About half of the ideas have no effect at all (and are therefore not worth the investment required for implementation). A quarter of the ideas, when implemented, actually make the situation worse.
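To make that risk a little more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the rough share of harmful ideas mentioned above (a quarter) applies independently to every idea you implement without testing. The function name and the independence assumption are hypothetical and not taken from Meyer's research.

```python
def chance_of_at_least_one_harmful(n_ideas: int, p_harmful: float = 0.25) -> float:
    """Probability that at least one of n untested ideas makes things worse.

    Assumption (illustrative only): each idea is harmful with the same
    probability p_harmful, independently of the other ideas.
    """
    if n_ideas < 0:
        raise ValueError("n_ideas must be zero or positive")
    if not 0.0 <= p_harmful <= 1.0:
        raise ValueError("p_harmful must be between 0 and 1")
    # Complement of "none of the n ideas is harmful".
    return 1.0 - (1.0 - p_harmful) ** n_ideas


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        chance = chance_of_at_least_one_harmful(n)
        print(f"{n:>2} untested ideas: {chance:.0%} chance of at least one harmful change")
```

Under these assumptions, implementing five ideas on intuition alone already gives roughly a three-in-four chance that at least one of them does harm, and without an experiment you would not know which one.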

Let's go back to the example of Pearson Education. In the end, both encouraging messages in the experiment caused students to complete fewer study assignments. Thus, both approaches had a negative effect! If Pearson had chosen not to conduct an experiment but simply to implement one of the two approaches, no one would have objected. But the result would then, despite all good intentions, have been bad for the students.

It's up to us to make the A/B effect disappear

The bad news is that you are likely to face more resistance if you base your decisions on the results of a field experiment than if you base them on your intuition about what will work better. Don't let that discourage you, because we also know that experiments have tremendous potential to inform decision-making in both the public and private sectors. And practice shows that people are only too happy to change their opinions about experiments when they discover how much they deliver.

It is getting easier and easier to conduct experiments, and we are getting better at it. Yet the ability to make evidence-based decisions on a large scale is still relatively new. Seen this way, it is not surprising that people still need some time to get used to it. It is up to us, as lovers of evidence-based decisions, to prove the value of experimentation, to enthuse those around us and to make the A/B effect go away.

Do you have your own example of an experiment that generated resistance when you weren't expecting it? Or do you have an explanation as to why people prefer implementing one of two options over conducting an experiment? If so, I'd love to hear about it! Let me know via joost@onlinedialogue.com

Source:
Meyer, M. N., Heck, P. R., Holtzman, G. S., Anderson, S. M., Cai, W., Watts, D. J., & Chabris, C. F. (2019). Objecting to experiments that compare two unobjectionable policies or treatments. Proceedings of the National Academy of Sciences, 116(22), 10723-10728.