How to Ensure Data Quality In Surveys

  • In-survey quality-control questions are an effective way to improve data quality 
  • A variety of questions can be used, but they are not equally effective and often do not measure the same thing 
  • Dynata recommends using at least two of five recommended question types and rejecting only participants who fail at least two of them 

There are two types of data quality issues we seek to identify via in-survey quality-control questions. First, a very small number of people begin a survey with the intention of not providing honest answers. A larger group intends to be honest but becomes fatigued or bored and does not expend enough attention and effort when completing the survey. This group is known as “satisficers”.  

Dynata employs a suite of tools to identify and remove fraudulent participants and to educate satisficers on the importance of paying attention and answering carefully and thoughtfully. However, it is impossible to prevent every fraudulent person from entering a survey, or to predict who will satisfice on a particular day. Therefore, in-survey quality-control questions are an important quality measure.  

Effective quality control questions should be reasonably disguised from the participant. They should not attract attention nor be so obvious that they start to annoy people.  

A researcher’s goal should be to remove all the culprits from the data set while removing as few of the falsely accused as possible. To determine the optimal number and type of quality controls, Dynata tested 15 different quality-control measures: 12 quality-control questions, a speeding check, a straightlining check and an open-end assessment. These were tested among 2,100 online participants in the U.S. The survey containing the quality controls covered a mix of topics including entertainment, social issues, lifestyle and general behavior questions.  

The survey used 12 offline benchmarks as measurements of quality to compare the data against. The median survey time was 12.5 minutes; short enough not to encourage fatigue. 

None of the quality-control measures removed all of the culprits, and almost all of them also removed some of the falsely accused. 

Dynata found the following quality-check types most effective in removing poor-quality participants while minimizing false positives: 

1. Low-incidence item check, e.g. asking whether the participant did something unlikely such as “travel to a remote location” in the past week 

2. Open-end quality check 

3. Conflicting answers given in a short grid 

4. Speeder check 

5. Grid check: a “Check 6” instruction within a short grid 

Flagging every participant who failed at least one of these questions resulted in 195 falsely accused participants while capturing all 65 culprits. But flagging only those who failed two of the five questions identified just 35 falsely accused participants while still capturing 60 of the 65 culprits. 
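The flagging rule described above can be sketched in a few lines of code. This is a hypothetical illustration, not Dynata's implementation; the function name, the boolean-per-check representation, and the default threshold of two are all assumptions made for the example.

```python
def flag_participant(check_failures, threshold=2):
    """Flag a participant for removal only if they failed at least
    `threshold` quality-control checks (True = failed that check)."""
    return sum(check_failures) >= threshold

# A single lapse on one of five checks is not enough to flag someone:
print(flag_participant([True, False, False, False, False]))  # False
# Failing two or more checks triggers removal:
print(flag_participant([True, True, False, False, False]))   # True
```

Raising the threshold from one failure to two is what shrinks the falsely accused group from 195 to 35 while still catching 60 of the 65 culprits.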

In summary 

  • Quality-control questions that exclude too many good participants hurt feasibility and create a poor participant experience. Misdirects and true traps (e.g. asking about awareness of fake brands) fall into this category.  
  • Quality-control measures that throw out too many participants do so largely at random and do not improve data quality.  
  • Any survey participant can become disengaged in the moment and fail a single quality-control question. Removing these participants does not improve data quality.  

Quality-control questions are useful tools to prevent fraudulent and inattentive participants from spoiling data, but they need to be used correctly. Anyone’s attention can wander during a long survey, so it is important that at least two quality-control questions are used and that participants are flagged as failing only if they fail at least two of them. If many participants are satisficing by the end of the survey, it is strongly suggested that the questionnaire’s design and length be evaluated.

For more information on the insights industry’s quality challenge – and how to overcome it – download our webinar.