Predictive Modeling at the Speed of Business
Traditionally, data scientists build a single one-size-fits-all model to solve a business problem, or perhaps several models, one at a time. They have had neither the tools nor the computing power to let AI software automatically and intelligently build and evaluate models on any segment within the modeling database.
For example, a skilled data scientist who is seeing a new modeling dataset for the first time may build 100 models on 2,000 candidate predictors, with a 3-million-record input file. With traditional tools like SAS or R, that work will take her more than a month.
QeSTM, datadecisions Group’s (DDG) proprietary multi-level modeling engine, builds hundreds of response models, using hundreds or thousands of candidate predictors, on millions of cases. QeSTM automates segment modeling “inside” a global model, which allows us to build up to 100 segment-level models in a single pass. These sub-models are based on segment selection criteria such as state, county, income bucket, or home ownership, among many others. We call these “nested” models. We recommend whichever of the two approaches, global or nested, yields the better lift.
What’s the benefit of this approach with QeSTM?
- QeSTM applies the best predictors to each case, whether from the global or nested model.
- QeSTM identifies the best candidate predictors from the multitude available, transforming them along the way if doing so improves the model’s fit.
- It weeds out variables that tend to result in overfitting.
- QeSTM builds hundreds of models and sorts through them for the best lift.
- QeSTM applies the best model solution, whether it be the global model or nested model, to each case to yield the best overall lift.
- QeSTM does this in a few hours or days, not months.
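The choice between the global and nested model can be pictured with a short sketch. This is not QeSTM's internal logic, which is proprietary and not described here. It assumes holdout scores from both models already exist for every segment, and it assumes top-decile lift as the comparison metric. The function names and the input layout are hypothetical.

```python
def top_decile_lift(scores, outcomes):
    """Response rate in the top-scoring 10% divided by the overall response rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, len(ranked) // 10)
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    return top_rate / overall_rate if overall_rate > 0 else 0.0


def choose_model_per_segment(holdout):
    """Pick 'nested' or 'global' for each segment, whichever has the higher lift.

    holdout maps a segment name to a dict with three equal-length lists:
    'global_scores', 'nested_scores' and 'outcomes' (1 = responded, 0 = did not).
    This layout is an assumption made for the sketch.
    """
    choice = {}
    for segment, data in holdout.items():
        global_lift = top_decile_lift(data["global_scores"], data["outcomes"])
        nested_lift = top_decile_lift(data["nested_scores"], data["outcomes"])
        # Ties go to the global model, the simpler of the two options.
        choice[segment] = "nested" if nested_lift > global_lift else "global"
    return choice
```

Each case in a segment would then be scored with the model chosen for that segment.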
The QeSTM Process
QeSTM builds numerous models in memory to identify those data elements from the analytical dataset that are most predictive and stable across hundreds, if not thousands, of model scenarios. Markov Chain Monte Carlo (MCMC) produces an aggregated result of all its iterations, which is used in the final step to select the winning model.
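To give a feel for how an MCMC search can aggregate over many model scenarios, here is a generic Metropolis sketch over predictor subsets. It is an illustration only, not the QeSTM algorithm. The score function, the single-predictor flip proposal, and every name in the block are assumptions; in practice the score would be some measure of model fit.

```python
import math
import random


def inclusion_frequencies(n_predictors, score, n_iterations=5000, seed=0):
    """Metropolis search over predictor subsets.

    Returns, for each predictor, the share of iterations in which it was
    part of the current subset. `score` takes a list of booleans (one per
    predictor) and returns a number where higher means a better model.
    """
    rng = random.Random(seed)
    included = [False] * n_predictors
    current = score(included)
    counts = [0] * n_predictors
    for _ in range(n_iterations):
        # Propose a new subset by flipping one randomly chosen predictor.
        proposal = list(included)
        flip = rng.randrange(n_predictors)
        proposal[flip] = not proposal[flip]
        proposed = score(proposal)
        diff = proposed - current
        # Always accept improvements; accept worse subsets with prob exp(diff).
        if diff >= 0 or rng.random() < math.exp(diff):
            included, current = proposal, proposed
        for j, is_in in enumerate(included):
            if is_in:
                counts[j] += 1
    return [c / n_iterations for c in counts]


# Toy score (hypothetical): predictors 0 and 1 help, predictors 2 and 3 hurt.
weights = [3.0, 3.0, -3.0, -3.0]


def toy_score(included):
    return sum(w for w, is_in in zip(weights, included) if is_in)


frequencies = inclusion_frequencies(len(weights), toy_score)
```

Predictors that stay in the subset across most iterations are the ones this kind of search would treat as stable.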
The strongest MCMC-identified predictor set is used first for the global model, and then the entire MCMC process is repeated for each nested segment, for up to 100 segments. Once the predictor sets for the global model and for each segment are chosen, an automated model-build function uses them to build 60 different models on 60 unique random file splits per segment. The final “winning” model for each segment is the one that most closely resembles the MCMC aggregate result for that segment.
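The split-and-select step might look roughly like the sketch below. How QeSTM measures "resemblance" is not stated, so Euclidean distance between coefficient vectors is an assumption, as are the 50/50 build fraction and all function names.

```python
import math
import random


def random_splits(n_records, n_splits=60, build_fraction=0.5, seed=0):
    """Return n_splits (build_indices, validation_indices) pairs.

    Each split is an independent shuffle. Uniqueness is not enforced here;
    for files of realistic size a repeated split is vanishingly unlikely.
    """
    rng = random.Random(seed)
    cut = int(n_records * build_fraction)
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_records))
        rng.shuffle(indices)
        splits.append((indices[:cut], indices[cut:]))
    return splits


def closest_to_aggregate(candidates, aggregate):
    """Index of the candidate coefficient vector nearest to the aggregate."""
    def distance(coefficients):
        return math.sqrt(sum((c - a) ** 2 for c, a in zip(coefficients, aggregate)))
    return min(range(len(candidates)), key=lambda i: distance(candidates[i]))
```

One model would be fitted on the build half of each split, and `closest_to_aggregate` would then pick the winner among the resulting coefficient vectors.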
Picking a Winner
We employ three key measurements in combination to determine the ‘winning’ model (global vs. segment-value) to use for predictive purposes in the final output solution.
- Consistency
We define consistency by the smoothness of the model curve. The smoother the curve is on the validation holdout sample, the more likely it is that the model truly ranks ‘Y’ from high scores to low scores, without significant percentile ‘kinks’, when applied to a new, future sample. This is separate from the consistency of each predictor’s performance between the model-build dataset and the validation dataset; that variable-level consistency is built into the basic QeSTM processing engine. Consider a gains table on an out-of-model sample with ten deciles, in which decile 1 outperforms decile 2, which outperforms decile 3, and so on down to decile 10. The chance of that exact ordering occurring at random is very small. Consequently, consistency is an excellent indicator that the model has found a real pattern that predicts response rate from high to low.
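To put a number on "very small": if the scores carried no signal, and assuming every ordering of the ten decile response rates were equally likely with no ties, a perfectly descending gains table would appear with probability 1/10!, or roughly 1 in 3.6 million. The sketch below computes that figure and counts kinks in a gains table. The function names are illustrative, not part of QeSTM.

```python
import math


def chance_of_perfect_ordering(n_bins=10):
    """Probability that n_bins rates fall in strictly descending order by chance."""
    return 1 / math.factorial(n_bins)


def decile_response_rates(scores, outcomes, n_bins=10):
    """Response rate per bin, from the highest-scoring bin to the lowest.

    Records left over when the file does not divide evenly are dropped.
    """
    if len(scores) < n_bins:
        raise ValueError("need at least one record per bin")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // n_bins
    return [
        sum(y for _, y in ranked[i * size:(i + 1) * size]) / size
        for i in range(n_bins)
    ]


def count_kinks(rates):
    """Number of places where a lower-ranked bin outperforms the bin above it."""
    return sum(1 for upper, lower in zip(rates, rates[1:]) if lower > upper)
```

A gains table with zero kinks on the holdout sample is the perfectly smooth case described above.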
- Model Efficiency
This metric uses receiver operating characteristic (ROC) curve analysis to compare the rate at which the model picks up sales with the rate expected at random.
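One common summary of an ROC curve is the area under it (AUC), where 0.5 corresponds to random ranking and 1.0 to perfect ranking. Whether QeSTM uses AUC specifically is not stated, so treat the following as a minimal sketch of the general idea. It uses the pairwise definition of AUC, which is easy to read but slow on large files.

```python
def roc_auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen responder is scored higher
    than a randomly chosen non-responder, counting ties as half. Runs in
    O(positives * negatives) time, so it suits small samples only.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```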
The third factor considers the sample size of the validation set being scored. The predictive equations are more error-prone when the ‘Y’ sample size is too low.
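The effect of sample size can be illustrated with the standard error of an observed response rate, sqrt(p(1 - p) / n). This is the textbook formula for a proportion, offered as a sketch of why small samples are noisier; it is not a description of how QeSTM weighs this factor, and the function name is hypothetical.

```python
import math


def response_rate_standard_error(responders, n_scored):
    """Standard error of an observed response rate (binomial proportion)."""
    if n_scored <= 0:
        raise ValueError("n_scored must be positive")
    p = responders / n_scored
    return math.sqrt(p * (1 - p) / n_scored)


# Same 5% response rate, measured on a small and on a large validation sample.
se_small = response_rate_standard_error(50, 1_000)
se_large = response_rate_standard_error(5_000, 100_000)
```

With 100 times as many records at the same response rate, the standard error shrinks by a factor of 10.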
The Bottom Line
QeSTM produces higher sales, better response rates, and lower cost. Fast.