Here are 3 things to know.
- Concept tests are ideally structured in a monadic experimental design. In this format, each survey respondent is exposed to a single variation of the product stimulus and asked a series of questions about their reaction to that stimulus, including critical metrics like stated purchase intent. Ideally, each alternative “cell” is identical except for the main attribute being evaluated. This approach lets the research focus on the attribute that varies across designs while holding all of the other attributes constant: for example, everything on a delicious, nutritious cereal box is held constant except a single attribute.
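As a rough illustration of the mechanics, the sketch below simulates a two-cell monadic test: each respondent is assigned to exactly one cell, gives a purchase-intent rating, and the cells are compared on a top-two-box score. The cell names, ratings, and sample size are hypothetical placeholders, not survey data.

```python
import random

# Hypothetical monadic cells: identical concepts except for the one attribute under test.
CELLS = ["Cell A: original box", "Cell B: alternate claim"]

def assign_cell(respondent_id: int) -> str:
    """Monadic assignment: each respondent sees exactly one cell (alternating for balance)."""
    return CELLS[respondent_id % len(CELLS)]

def top_two_box(ratings: list) -> float:
    """Share of respondents rating purchase intent 4 or 5 on a 5-point scale."""
    return sum(r >= 4 for r in ratings) / len(ratings)

# Simulated responses; in a real study these come from the fielded survey.
random.seed(1)
responses = {cell: [] for cell in CELLS}
for rid in range(400):                                # 400 placeholder respondents
    cell = assign_cell(rid)
    responses[cell].append(random.randint(1, 5))      # placeholder purchase-intent rating

for cell, ratings in responses.items():
    print(f"{cell}: n={len(ratings)}, top-two-box intent={top_two_box(ratings):.1%}")
```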
What about testing price points? A monadic design like this is arguably the most reliable price testing methodology. There are other methods, but each has significant flaws. Van Westendorp Price Elasticity Modeling tends to yield lower-than-expected optimum prices, since respondents very quickly learn to “lowball” their estimates; in other words, it is too open to negotiation even in situations where there is generally no negotiation (like cereal). Van Westendorp is good for ballpark estimates early in the process. Conjoint analysis is also often used for pricing research, but the price estimates it produces are inextricably tied to significant variations in product configurations. There is no ambiguity when testing prices with a monadic experimental design: each cell sees the identical concept at a different price.
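To make that last point concrete, here is a minimal sketch of how monadic price cells might be read: the same concept is shown at three price points, one per cell, and stated intent is compared directly across cells. The prices, ratings, and the simple revenue index are all hypothetical assumptions for illustration, not real results.

```python
# Hypothetical monadic price cells: identical concept, one price per cell.
price_cells = {
    2.99: [5, 4, 4, 3, 5, 4, 2, 5, 4, 4],   # placeholder purchase-intent ratings, 5-point scale
    3.49: [4, 3, 4, 2, 5, 3, 4, 3, 4, 2],
    3.99: [3, 2, 4, 2, 3, 2, 3, 1, 4, 2],
}

def top_two_box(ratings):
    """Share of respondents rating purchase intent 4 or 5."""
    return sum(r >= 4 for r in ratings) / len(ratings)

for price, ratings in sorted(price_cells.items()):
    intent = top_two_box(ratings)
    # A simple revenue index (stated intent times price) for comparing cells.
    print(f"${price:.2f}: top-two-box intent {intent:.0%}, revenue index {intent * price:.2f}")
```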
Are there any downsides to this approach? Depending on your point of view, yes. Because each respondent in the survey evaluates a single iteration, the method often requires large samples, and larger samples generally mean more cost for the research. Some marketers want huge samples so that minute differences in performance can be tested and found “statistically significant.” But in our experience, this is not always a great outcome. Do you really want to go to market with Concept A because it has a statistically significant 0.5% advantage in purchase intent over Concept B? Or would you be more comfortable if the difference was a statistically significant 10%? Or what if the sample was so inadequate in size that a robust 18% difference was not statistically significant? The answer usually lies somewhere in the middle.
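The trade-off between sample size and meaningful differences can be illustrated with a standard two-proportion z-test (not a method prescribed above, just a common check). In the sketch below the counts are invented: a huge sample makes a 0.5-point gap “statistically significant,” while a small sample leaves an 18-point gap short of significance at the 5% level.

```python
from math import sqrt, erf

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two purchase-intent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # two-sided normal tail
    return p1 - p2, p_value

# Hypothetical scenario 1: huge sample, tiny gap (40.5% vs 40.0% purchase intent).
diff, p = two_proportion_z_test(40500, 100000, 40000, 100000)
print(f"Huge sample:  difference={diff:+.1%}, p={p:.3f}")    # significant despite a trivial gap

# Hypothetical scenario 2: small sample, large gap (58% vs 40% purchase intent).
diff, p = two_proportion_z_test(29, 50, 20, 50)
print(f"Small sample: difference={diff:+.1%}, p={p:.3f}")    # not significant despite an 18-point gap
```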