
Do You Even Data

A data-driven marketing blog

Want to learn how you can translate incredible data list information into killer marketing campaigns? Want to better understand how data research and models can enhance the data you already have?

Man Vs. Machine - Open-ended Coding in the Modern Age

You've written the perfect list, encompassing every possible response to your survey question.  Or so you think. No matter how comprehensive your list is, the range of human experience exceeds it, and that means that sometimes, someone is going to click that little radio button at the bottom, and then reach for their keyboard to type a response in the little box labeled "Other, please specify".

So how do you make sense of those "other" responses, the ones that don't fit into the neat little categories you had already defined?  The answer to that, for decades, has resided in your friendly neighborhood coding department. Nowadays people think of something very specific when they hear the word coding (website and other programming, for example), but in the market research industry, coding has always meant something a little different. 

Survey response coding starts with a careful review of the actual text typed (or written) by the respondent. The coder determines the core meaning or sentiment being expressed, then decides whether the response really belongs in one of the pre-existing categories on the list. If not, the response is tracked and tallied along with similar responses until enough of them fall into the new category to merit creating a code to represent them. An example: five respondents, when asked about their race or ethnic background, tell you that they are in fact of mixed racial heritage. You may want to consider creating a new code to reflect this. When you are done, you have a list for that question that includes not just the response categories you began with, but also those that were volunteered by the largest number of people. This sorting and fine-tuning makes the data much easier to analyze, especially on questions with a lot of other-specify responses.
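For readers who like to see the bookkeeping spelled out, here is a minimal Python sketch of the tally-and-promote step described above. It assumes a human coder has already read each "other" response and assigned it a label; the function name `update_code_list`, the placeholder category names, and the threshold of five are all hypothetical choices for illustration, not a fixed industry rule.

```python
from collections import Counter

# Hypothetical threshold: how many similar "other" responses justify a new code.
NEW_CODE_THRESHOLD = 5


def update_code_list(code_list, coder_labels, threshold=NEW_CODE_THRESHOLD):
    """Return the code list plus any volunteered category that was
    mentioned at least `threshold` times.

    `coder_labels` holds the label a human coder assigned to each
    "Other, please specify" response after reading the text.
    """
    tallies = Counter(coder_labels)
    updated = list(code_list)
    for label, count in tallies.most_common():
        if label in updated:
            continue  # the response belonged in an existing category
        if count >= threshold:
            updated.append(label)  # enough mentions to merit a new code
    return updated


# Placeholder categories standing in for the list you began with.
original_codes = ["Category A", "Category B", "Category C"]

coder_labels = (
    ["Mixed racial heritage"] * 5   # the example from the text
    + ["Category B"] * 2            # really belonged on the original list
    + ["Some rarer answer"] * 2     # too few mentions to earn its own code
)

final_codes = update_code_list(original_codes, coder_labels)
print(final_codes)
```

The judgment call stays with the coder: the sketch only counts labels that a person has already assigned, and the right threshold will depend on the size of the study.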


It's obvious that neatly categorized and ordered data is easier to analyze. So the question becomes: who, or what, should be doing the ordering? In recent years we've seen a significant uptick in tools designed to let market researchers automate coding. These learning programs search for keywords and use them to find commonality between responses and distill them down, much as a human coder would. These tools are typically quick and inexpensive, making them an appealing alternative to the somewhat labor-intensive process of manual coding. And, in some cases, they may indeed be accurate enough that the cost and time savings justify their use. But at DDG, we've found that in most cases there's no replacement for the human touch. See our comparison of manual versus machine coding below:

  1. Detection of sentiment.  It's nearly impossible for an automated tool to "hear" sarcasm. So, if a respondent types "I just LOVE Company X" immediately after blasting them for poor customer service, you run the risk of that statement being interpreted as positive.  Winner - Manual Coding
  2. People don't speak in keywords. The world is a wide and varied place, and the way people express themselves is "all over the map". Things like educational background, gender, ethnicity, language skills and more can mean that two respondents who are saying the same thing choose very different words to say it. A skilled coder teases out this underlying meaning and makes sure it gets captured. Winner - Manual Coding
  3. A human can raise the alarm when needed. Our coders are often the first to spot a systemic problem with a respondent's data. Radio buttons and check boxes can only tell us so much, but a respondent who types nonsense into every "other, specify" box may need to be flagged for further investigation of potentially fraudulent response behavior. A coding program will simply see that there are no keywords for it to find. Winner - Manual Coding
  4. Expense. Human coding costs more than automated coding, once the up-front costs of the software are covered. Winner - Machine Coding
  5. Speed. Automated coding can be done in a fraction of the time it takes to code responses by hand. Winner - Machine Coding
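To make the first three points concrete, here is a deliberately simple Python sketch of keyword-based coding. It is a toy, not a description of any particular product: the `KEYWORD_CODES` lists and the `keyword_code` function are hypothetical, and real tools are more sophisticated. Even so, the failure modes look much the same.

```python
import re

# Hypothetical keyword lists; a real tool would use far larger ones.
KEYWORD_CODES = {
    "positive": {"love", "great", "excellent"},
    "negative": {"poor", "terrible", "rude"},
}


def keyword_code(response):
    """Return every code whose keywords appear in the response."""
    words = set(re.findall(r"[a-z']+", response.lower()))
    return sorted(code for code, keys in KEYWORD_CODES.items() if words & keys)


# Point 1: sarcasm is read at face value.
sarcastic = keyword_code("I just LOVE Company X")

# Point 2: same meaning, different words, no match.
unmatched = keyword_code("The service was awful")

# Point 3: nonsense produces no keywords, and nothing raises the alarm.
nonsense = keyword_code("asdf jkl qwer")

print(sarcastic, unmatched, nonsense)
```

In this sketch the sarcastic comment is coded as positive, while the genuine complaint and the keyboard mashing both come back empty, so the program cannot tell an unhappy customer from a possibly fraudulent respondent.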

Expense and speed are always factors in market research, as they are in any industry, but for DDG, the benefits of human intervention in the coding process outweigh the advantages of automated programs. 



Anne Saulter
Anne joined the Data Decisions Group team in the call center in 1996. Currently she runs our Coding Department, but she has worked in almost every department in the company during her tenure of 20+ years, including call center interviewing, new hire training, project coordination, and quality control. Anne graduated from Niagara University with a B.A. in Chemistry and a minor in Business.