The Simpson’s Paradox And Mystery Shopping

Recently I came across something very interesting: a phenomenon known as Simpson’s Paradox. It will not be anything new to those familiar with statistics, but it is important for anyone trying to understand data.

The phenomenon occurs when the same dataset supports opposite conclusions depending on whether it is viewed in groups or as a whole:

“Simpson’s Paradox occurs when trends that appear when a dataset is separated into groups reverse when the data are aggregated.” – Source

This phenomenon definitely applies to Market Research. We are constantly looking at data and splitting it by controlled variables for analysis. Net Promoter Scores are a good example of this. Before that, let’s discuss Simpson’s Paradox a little more.

Below is a great example of how Simpson’s Paradox can happen:

The example above (taken from an external source) shows two friends solving problems over the course of a weekend. “Your Friend” has the higher accuracy on each individual day. However, the challenge was about the total share of problems solved accurately over the whole weekend, not on individual days. Taken in totality, “You” won the challenge.
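To make the arithmetic concrete, here is a minimal Python sketch of the two-friends challenge. The counts below are illustrative assumptions chosen to reproduce the effect; they are not the figures from the original example, and the names `results`, `daily_accuracy` and `weekend_accuracy` are made up for this sketch.

```python
from fractions import Fraction

# Hypothetical (solved, attempted) counts per day.
# These are assumed numbers for illustration, not the original figures.
results = {
    "You": {"Saturday": (7, 8), "Sunday": (1, 2)},
    "Your Friend": {"Saturday": (2, 2), "Sunday": (5, 8)},
}


def daily_accuracy(person):
    """Accuracy on each day: problems solved divided by problems attempted."""
    return {
        day: Fraction(solved, attempted)
        for day, (solved, attempted) in results[person].items()
    }


def weekend_accuracy(person):
    """Accuracy over the whole weekend: total solved divided by total attempted."""
    solved = sum(s for s, _ in results[person].values())
    attempted = sum(a for _, a in results[person].values())
    return Fraction(solved, attempted)


for person in results:
    per_day = ", ".join(
        f"{day}: {float(acc):.1%}" for day, acc in daily_accuracy(person).items()
    )
    print(f"{person} | {per_day} | weekend: {float(weekend_accuracy(person)):.1%}")
```

With these assumed counts, “Your Friend” is ahead on Saturday and on Sunday, yet “You” are ahead over the weekend, because most of your attempts fall on the day where both of you scored well.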

There is more to say on the topic. However, I would suggest visiting the two sites below should you want to know more:


In the Context of Market Research

At AQ we specialize in a form of market research known as “Mystery Shopping.” We collect data from “shoppers” who are sent in to do a visit based on a pre-defined scenario with objectives. This means that, most of the time, the objective of the project is already defined before we set the project up.

For example, our clients may want to track the service level in stores or compliance levels. There are multiple sections to each survey. The data may also include demographic and geographic information, comments, etc.

With that much data, we have to filter it to make things easier. This leaves room for Simpson’s Paradox to manifest. For example, if we calculate the scores for Section A, then separate the data into Scenario A and Scenario B, and further compare Male and Female scores, there is a real possibility of the paradox taking effect.
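One way to catch this early is to compare the overall ranking with the ranking inside every subgroup. The Python sketch below does that for a made-up set of Section A summaries. The visit counts, mean scores and function names (`overall_mean`, `leader`, `find_reversals`) are all assumptions for illustration and do not come from any real project.

```python
# Hypothetical Section A summaries:
# (scenario, shopper gender) -> (number of visits, mean score in %).
# All values are invented for illustration.
section_a = {
    ("Scenario A", "Male"): (8, 90.0),
    ("Scenario A", "Female"): (2, 95.0),
    ("Scenario B", "Male"): (2, 60.0),
    ("Scenario B", "Female"): (8, 65.0),
}


def overall_mean(data, gender):
    """Mean score for one gender across all scenarios, weighted by visit count."""
    cells = [(n, mean) for (_, g), (n, mean) in data.items() if g == gender]
    total_visits = sum(n for n, _ in cells)
    return sum(n * mean for n, mean in cells) / total_visits


def leader(male_score, female_score):
    """Name of the group with the higher score, or 'tie'."""
    if male_score == female_score:
        return "tie"
    return "Male" if male_score > female_score else "Female"


def find_reversals(data):
    """Scenarios whose leading group differs from the overall leading group."""
    overall = leader(overall_mean(data, "Male"), overall_mean(data, "Female"))
    scenarios = sorted({scenario for scenario, _ in data})
    return [
        s for s in scenarios
        if leader(data[(s, "Male")][1], data[(s, "Female")][1]) != overall
    ]


print("Overall Male:", overall_mean(section_a, "Male"))
print("Overall Female:", overall_mean(section_a, "Female"))
print("Scenarios that contradict the overall result:", find_reversals(section_a))
```

In this invented data, Female scores are higher in both scenarios, but the overall Male score is higher because most Male visits fall in the higher-scoring scenario. A non-empty list from `find_reversals` is a prompt to check the visit counts before reporting either number on its own.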

Now, the result is the result and the data does not lie. What we essentially need to do is interpret the data and be able to explain what happened.

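In the example above we can see that the sample size was different for each friend on each day. The weekend total weights each day by the number of problems attempted, so a day with many attempts counts for more than a day with few. This is why the overall averages can rank the friends differently from the daily averages.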