
Do You Even Data

A data-driven marketing blog

Want to learn how you can translate incredible data list information into killer marketing campaigns? Want to better understand how data research and models can enhance the data you already have?

Man Versus Machine, Part 2: Respondent Fraud Identification


In my previous blog post in this series, I talked about the impact machine coding has on the quality of your data. Having a human being actually look at each open-ended response in your survey may be time-consuming, but the results are hard to argue with: a code scheme that accurately represents the intent of the respondents' comments and a true reflection of the study results.

Today I want to focus on a different data quality issue that can be uncovered by physical, human inspection. In my nineteen years of coding open-ended responses here at DDG, I've seen a lot of interesting answers. People tend to get creative when they're given a nice blank box for their opinions instead of being forced to select an option from a list or a number on a scale. Once I received enough similar responses to a question about life goals that I had to add "a date with Elizabeth Taylor" to my code scheme! That was an interesting conversation to have with the project manager. I love reading all the things people have to say. But sometimes I find something in those open ends that sets alarm bells off in my head.

DDG started conducting online research soon after its general adoption in the industry, and by the early 2000s we were conducting surveys online as a matter of course. After years of working with mail and telephone surveys, we began to notice some differences between the kinds of responses we got on the web and those we got when the respondent was taking the time to fill out a mail survey, or being coached through an open-ended response on the phone. Sometimes the responses were just a little shorter. Sometimes, if the topic was sensitive, they were actually a little longer, as if the respondent felt more comfortable sharing private information when they weren't speaking with a live human. And sometimes we saw what we call "junk data," where the respondent simply keys a random assortment of characters into the open-end field and moves on. Frequently this behavior correlates with other quality issues, and it's one of the things we look for while we're coding.

The coding team helps identify potential data issues by looking for patterns that would be indicative of fraudulent behavior. We have technology in place to validate the identities of respondents and to prevent people from sending automated programs (called "bots") through the survey, though data issues caused by bots were a major problem before the advent of CAPTCHA technology. When I review your open-ended data as part of the coding process, I am also looking for junk responses, responses that don't make sense, repeated response patterns - anything that might indicate that the respondent is not giving your survey the attention it deserves. No matter how many traps we lay to preserve the integrity of your data, there's no substitute for a sentinel standing guard. That's me.
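To give a flavor of what "looking for junk" can mean in practice, here is a minimal, hypothetical sketch of the kinds of heuristics an automated first pass might use before a human coder reviews the data. The function names and thresholds are illustrative assumptions, not DDG's actual process - human judgment remains the final check.

```python
import re
from collections import Counter

def looks_like_junk(response: str) -> bool:
    """Heuristic check for 'junk' open-ended responses (illustrative only)."""
    text = response.strip()
    if len(text) < 3:
        return True  # too short to carry meaning
    letters = re.sub(r"[^a-zA-Z]", "", text).lower()
    if not letters:
        return True  # only digits/symbols, e.g. "!!!###"
    # keyboard mashing like "asdfjkl" tends to have very few vowels
    vowel_ratio = sum(c in "aeiou" for c in letters) / len(letters)
    if vowel_ratio < 0.2:
        return True
    # a single character repeated many times, e.g. "aaaaaaa"
    if len(set(letters)) == 1 and len(letters) > 3:
        return True
    return False

def flag_repeated_responses(responses, threshold=3):
    """Flag verbatim answers that recur suspiciously often across respondents."""
    counts = Counter(r.strip().lower() for r in responses)
    return {r for r, n in counts.items() if n >= threshold}
```

Rules like these only surface candidates for review; a legitimate short answer can trip a heuristic, which is exactly why a human coder still reads the flagged responses.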

Anne Saulter
Anne joined the Data Decisions Group team in the call center in 1996. Currently she runs our Coding Department, but she has worked in almost every department in the company during her tenure of 20+ years, including call center interviewing, new hire training, project coordination, and quality control. Anne graduated from Niagara University with a B.A. in Chemistry and a minor in Business.