Here’s our approach to product testing.
Until (relatively) recently, electric utilities had essentially a single mission: the generation, transmission, and monetization of electrons. It was a simple business model. The customer flips a switch, or claps their hands, or yells across the room to Alexa to turn on the lights, and fulfillment happens, literally at the speed of light. For many years, that was enough.

But what happened before is not what happens today. We now live in a marketplace where cable service providers offer home security, phone companies offer video entertainment, drivers with spare time provide rides from the airport, and Amazon provides, well, everything. Utility companies have learned that within a fairly narrow space, they too can compete for a share of consumers' wallets with products that most consumers would traditionally have sought elsewhere. That "narrow space," at least today, tends to center on either 1) products and services directly related to the home, such as home warranties, appliance repair and maintenance, home management and efficiency systems, and yes, home security; or 2) optional, voluntary programs like community solar.

Configuring and ultimately pricing these types of products is a well-established, tried-and-true exercise in product development research. Got ballpark pricing questions? Use Van Westendorp price sensitivity research. Want to see what consumers will trade off for lower prices? Use conjoint analysis. Want to offer complex pricing models with many optional add-ons and choices? Use adaptive conjoint. Want to pick a winner among product finalists? Use a monadic experimental design.

The limitation of all these methods is that they generally yield relative consumer adoption estimates, not absolute ones. Relative metrics reliably show which product will outperform the others, but they don't help much in populating the top line of the product's pro forma.
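To make the Van Westendorp idea concrete, here is a minimal sketch in Python. It uses simulated answers to just two of the survey's four price questions ("too cheap" and "too expensive"; all data and numbers are invented for illustration) and finds the price where the two cumulative curves cross, often called the optimal price point.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical Van Westendorp responses: each respondent names a price
# that feels "too cheap" (quality concerns) and one that is "too expensive".
n = 500
too_cheap = rng.normal(8, 2, n)
too_expensive = rng.normal(20, 4, n)

prices = np.linspace(0, 35, 351)
# At each candidate price: share calling it too cheap (price at or below
# their threshold) and share calling it too expensive.
pct_too_cheap = np.array([(too_cheap >= p).mean() for p in prices])
pct_too_expensive = np.array([(too_expensive <= p).mean() for p in prices])

# The "optimal price point" is where the two curves cross.
opp = prices[np.abs(pct_too_cheap - pct_too_expensive).argmin()]
print(f"Optimal price point: ${opp:.2f}")
```

A full analysis would also use the "cheap/bargain" and "expensive" questions to bracket an acceptable price range; the mechanics are the same.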
And obviously, it helps to know how many people will sign up for a community solar initiative, and at what price, before a utility can reasonably estimate its investments in hardware, marketing, insurance, rolling trucks, regulatory compliance, and, importantly, the impact on its brand.

In consumer packaged goods research, the transformation from relative adoption estimates to hard forecasts of adoption and dollars has relied on mountains of historical data that model specific products in specific categories (e.g., a new flavor of potato chips in the salty snacks category). For new concepts like community solar initiatives, however, that history largely does not exist. Fortunately, many utilities do have historical adoption rates for analogous, if antiquated, products.

In the final, monadic stage of product development, a key question measures purchase intent (PI). Generally speaking, the concept with the highest PI wins. Producing a forecast from these outputs, while far from an elementary exercise, is founded on best-practice modeling and marketing science. Adoption of earlier products can be treated as the dependent variable in a logistic model, with household demographics, marketing outreach, seasonal factors, and other characteristics as predictors of that outcome. A similar process can then be applied to the new product, using "definitely would purchase" as the binomial target and the current values (or, in the case of marketing effort, estimated values) along with the previous coefficients as predictors. The final data science step involves scaling the model's results to the adoption levels of previous products.

In a nutshell, the modeling for these utility applications is similar to the large-scale efforts applied in CPG and other sectors, except that the process tends to be unique to each utility, based on its history, customers, and many other factors.
This fact doesn’t allow for an easy, out-of-the-box solution, but it does allow for a scientific, empirical, and defensible approach to forecasting new product sales.
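As a sketch of the mechanics described above: the snippet below fits a logistic model on simulated historical adoption data for a legacy product, calibrates a scaling factor against that product's realized adoption, and then scores a hypothetical current customer base for the new product. All variables, coefficients, and data are invented for illustration; a real build would use the utility's own history.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical history: household features for a legacy product (say, an
# appliance-repair plan), with observed adoption as the outcome.
n = 5000
age = rng.normal(55, 12, n)
income = rng.normal(70, 20, n)            # $000s
outreach = rng.integers(0, 2, n)          # received marketing (0/1)
X_hist = np.column_stack([age, income, outreach])

# Simulated adoption driven by income and outreach (invented for the demo)
true_logit = -6 + 0.02 * income + 1.2 * outreach
y_hist = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(int)

model = LogisticRegression().fit(X_hist, y_hist)

# Calibration factor: realized vs. predicted adoption on the legacy product
p_hist = model.predict_proba(X_hist)[:, 1]
scaling = y_hist.mean() / p_hist.mean()

# Score today's base for the new product, with planned outreach to everyone,
# and scale the result to the legacy product's adoption experience.
X_new = np.column_stack([
    rng.normal(50, 12, n),
    rng.normal(75, 20, n),
    np.ones(n),
])
forecast = model.predict_proba(X_new)[:, 1] * scaling
print(f"Forecast adoption rate for the new product: {forecast.mean():.1%}")
```

The forecast here is an absolute adoption rate, which is exactly the number the pro forma's top line needs.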
Life insurance companies that use direct mail marketing have relied on traditional response models to more efficiently target consumers for whom their products are relevant, attractive, appropriate, and affordable. The actuaries at these companies have applied sophisticated data science for years to establish risk, premiums, marketing allowances, and more. Along the way, they learned a fundamental truth: consumers differ demographically, behaviorally, and socio-economically. Women generally live longer than men. Smokers represent an extremely high risk. Some consumers enjoy high-risk activities more than others. In terms of health outcomes, heredity plays a role, body mass index plays a role, and overall lifestyle plays a role. Of course, these factors can be unique to each consumer.

Paradoxically, the same principles that insurers use to accurately underwrite or deny life insurance policies also apply to the marketing of those policies. Why? Both analytic processes attempt to predict an outcome; in the case of direct mail marketing, the outcome is a positive response to a marketing campaign and, ultimately, a sale. Consumers' responses to marketing are based on different factors: who they are, where they are, how they behave, their attitude toward the world around them, their family situation, and what comprises their overall lifestyle. This is why "one size fits all" or "out of the box" predictive analysis does not work well when it comes to predicting response behavior.

There are methods that use multiple segment-based models to determine which models, or combination of models, yield the highest response for a given audience. This methodology is alternatively referred to as ensemble modeling, model stacking, or nested modeling.
Each of these specific approaches has subtle differences that are meaningful to nerdy data scientists, but their goal is the same: build and combine models that yield the greatest lift across diverse, heterogeneous populations by determining market or geographic segments and refining the modeling parameters within each of them. Today, we're going to focus on a nested modeling methodology.

Nested models are founded on the premise that different consumer characteristics, whether geographic, demographic, lifestyle, attitudinal, or behavioral, result in different reasons why people respond or do not respond. The problem with cramming all of those characteristics into a single global model is that the more ingredients that get added to the model soup, the more diluted the recipe becomes. It's like the old joke about the fallacy of marketing to the average consumer: you're standing in a bucket of ice water with your head stuck in a hot oven. On average, you feel just fine!

Nested models use the same group of predictors, but reserve the most differentiating variables to distribute the target audience into more homogeneous population cohorts. Each cohort is modeled independently and then reassembled into a single target audience with optimum response propensity. As noted, variables with significantly different values across groups make good candidates for nesting within the overall ensemble. In the example below, consumers' state of residence is a meaningful separating factor because of very different population attributes; here, we focus on the 64+ population. Nesting is not necessarily limited to a single variable split; indeed, the modeler can further refine and tune the final model through multiple levels of nesting.
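The nesting idea can be sketched in a few lines: split the prospect file on the most differentiating variable (here, state of residence), fit a separate logistic response model within each cohort, and score each prospect with their own cohort's model. All data and response drivers below are simulated for illustration; a real build would nest on variables identified from the insurer's own file.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical prospect file in which the drivers of response differ by
# state -- the kind of pattern that makes state a good nesting variable.
n = 6000
state = rng.choice(["FL", "OH"], n)
age = rng.normal(60, 10, n)
income = rng.normal(65, 15, n)            # $000s
X = np.column_stack([age, income])

# Simulated responses: age drives response in FL, income in OH (invented)
logit = np.where(state == "FL", -8 + 0.10 * age, -5 + 0.05 * income)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Nest: fit one response model per state cohort
nested = {s: LogisticRegression().fit(X[state == s], y[state == s])
          for s in np.unique(state)}

# Reassemble: score every prospect with their own cohort's model
scores = np.empty(n)
for s, m in nested.items():
    mask = state == s
    scores[mask] = m.predict_proba(X[mask])[:, 1]
print(f"Mean response propensity across the file: {scores.mean():.3f}")
```

Deeper nesting (e.g., state, then age band) follows the same pattern, with one fitted model per leaf cohort.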
This has become a far more feasible approach than in the recent past, thanks to fast (usually cloud-based) computing resources and statistical software that facilitates builds like this. In our example, we find that we can leverage many of these demographic variables to optimize response for our life insurance direct marketing campaign.

How does this approach work in reality? The effectiveness of response models is measured by lift, a statistic that shows how much more efficiently a mailing targeted by the model performs relative to actual baseline response rates. In the chart below, "global model" is the application of a single, traditional response model; "individual state models" refers to models constructed as the diagram above demonstrates; and "best model for each case" selects the higher of the two resulting propensity scores for each prospective consumer. The bottom line: with a smart, thoughtful nested modeling approach, an insurance company can realize enormous savings in its direct marketing costs. Reach out to us to see how our approach can benefit your business.
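Lift itself is straightforward to compute once each prospect has a propensity score and an observed response. A minimal sketch, using simulated scores and responses, of lift in the top decile of the file:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical campaign results: a propensity score per prospect and
# whether they actually responded (simulated so that higher scores
# correspond to higher response probability).
n = 10000
score = rng.random(n)
responded = rng.random(n) < 0.02 * (1 + 4 * score)

overall_rate = responded.mean()

# Top-decile lift: response rate among the top-scored 10% of the file,
# divided by the file-wide response rate. Lift of 1.0 = no better than
# mailing at random; higher is better.
top = score >= np.quantile(score, 0.9)
lift = responded[top].mean() / overall_rate
print(f"Top-decile lift: {lift:.2f}x")
```

The savings claim follows directly: a lift of, say, 1.6x in the mailed deciles means the same number of responses from substantially fewer pieces mailed.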
Goodness: good·ness [good-nis], the state or quality of being good; excellence of quality (dictionary.com). A good predictive model is only as good as its "goodness." And, fortunately, there is a well-established process for measuring the goodness of a logistic model in a way that a non-statistician (read: the senior manager end-users of your model) can understand. There is, of course, a precise statistical meaning behind words we propeller heads throw around, such as "goodness of fit." In the case of logistic models, we are looking for a favorable log likelihood result relative to that of the null, or empty, model. That description isn't likely to mean much to an end-user audience (it barely means anything to many propeller heads).
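The log-likelihood comparison can be made concrete in a few lines. This sketch (on simulated data) computes the fitted model's log likelihood against the null, intercept-only model, then summarizes the comparison as McFadden's pseudo R-squared, a single 0-to-1 number that is far easier to explain to end users than a raw log likelihood.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(3)

# Hypothetical binary-response data with one informative predictor
n = 4000
x = rng.normal(size=(n, 1))
y = (rng.random(n) < 1 / (1 + np.exp(-(-1 + 1.5 * x[:, 0])))).astype(int)

model = LogisticRegression().fit(x, y)
p = model.predict_proba(x)[:, 1]

# Log likelihood of the fitted model vs. the null model, which simply
# predicts the overall base rate for everyone.
ll_model = -log_loss(y, p, normalize=False)
ll_null = -log_loss(y, np.full(n, y.mean()), normalize=False)

# McFadden's pseudo R-squared: 1 - LL(model) / LL(null)
pseudo_r2 = 1 - ll_model / ll_null
print(f"LL(model) = {ll_model:.1f}, LL(null) = {ll_null:.1f}, "
      f"McFadden pseudo-R2 = {pseudo_r2:.3f}")
```

A model that explains nothing lands near 0; values well above 0 indicate the predictors are genuinely pulling their weight.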
If you have been a silent advocate for all the benefits that market research (MR) can bring to your company, then we’re here to help you find your voice. Our Ultimate Guide to Proving Your Market Research ROI brings research to the forefront as a key player in achieving business objectives, optimizing strategies, and—most importantly—improving return on investment (ROI).
What it is: An Exploratory Data Analysis (EDA) is an exhaustive look at existing data from current and historical surveys conducted by a company.
Want to know more about grocery trends and digitally active grocery shoppers? Check out our Grocery Voice Panel.
We often use SPSS or R tables to conduct hypothesis testing, also known as testing for statistical significance, between means. These outputs generally provide t-test results and, in SPSS at least, produce a little alphabetical footnote when a difference is statistically significant. It's quick and easy, but the problem is that it's wrong. Even using the Bonferroni adjustment (which compensates for the fact that more than two groups are being compared), this approach commits Type II errors (declaring something not significant when it is) as well as the more common Type I errors (declaring something significant when it's not). So what do we do?
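For reference, here is what that standard approach looks like in Python: all pairwise t-tests among four simulated groups, with a Bonferroni-adjusted alpha. The group names, means, and sample sizes are invented for illustration; this is the method the crosstab footnotes implement, shown here so the mechanics are explicit.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical survey ratings for four respondent groups (e.g., regions);
# A and B share a true mean, C and D sit progressively higher.
groups = {g: rng.normal(loc=mu, scale=1.0, size=200)
          for g, mu in zip("ABCD", [5.0, 5.0, 5.2, 5.6])}

# All pairwise t-tests with a Bonferroni-adjusted alpha: divide the
# nominal 0.05 by the number of comparisons being made.
pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)

results = {}
for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    results[(a, b)] = p
    flag = "significant" if p < alpha else "n.s."
    print(f"{a} vs {b}: t = {t:5.2f}, p = {p:.4f} ({flag})")
```

Note how the correction shrinks alpha as comparisons multiply, which is precisely why real but modest differences can slip through as "not significant."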
There are many ways to predict which of several concepts will “win” in a retail environment. In another blog, I describe some of the methods we employ to conduct a best-practice experimental design for testing concepts, packaging options, advertising copy, or anything else where several discrete choices exist.
In the first two installments of this three-part series, we made the case that market research return on investment (MR ROI) can be boosted by asking five key questions at the outset of a project and using action standards and benchmarks.