The Double-Edged Sword of Generative AI:
Mitigating the Risks of Respondent Fraud

Tammy Rosner
Director, Data Quality

The explosion of mainstream access to generative AI (GenAI) tools has prompted much discussion of how AI can assist in market research, for example by summarizing themes from large volumes of open-ended responses or identifying high-level trends in survey data (e.g., Kieser, 2023). As these technologies mature, GenAI tools such as ChatGPT, Llama, and Bard are likely to lead to faster and more efficient insights from market research data.

Of course, like any new technology, GenAI can be a double-edged sword. Along with the innovations that generative AI tools are driving comes the pressing concern that these tools can also be used against us. What do the advances in GenAI mean for how we catch fraudsters in this new era of democratized access? There is understandable concern that GenAI may allow less sophisticated fraudsters to produce reasonable-looking responses at scale, making fraud harder to detect.

Fraudsters, for example, can give an AI tool an open-ended survey question and ask it to generate multiple responses of a given word length from the perspective of individuals with particular demographics (e.g., see the Unicorny podcast episode “AI. Everything, Everywhere, All at Once”). The AI will then produce a set of unique and reasonable responses that can be used to answer those open-ended questions. While GenAI answers are still fairly rudimentary at this time, as the technologies improve it may become difficult to distinguish these responses from honest answers given by real people (e.g., Crothers, Japkowicz, & Viktor, 2023). As a result, reviewing open-ends as one of just a few quality checks has already become less reliable for fraud detection. This presents a challenge for market research, as open-ends have historically been one of the industry’s primary and most useful means of catching fraud.

While this concern over GenAI fraud detection is warranted, even before GenAI became easily accessible, cleaning data to ensure high quality was more complicated than simply reading through open-end responses. Poor open-ends, while they can be indicative of fraud, are often due to the behavior of real, imperfect, but honest panel members. There are people who hate open-end questions but are happy to provide closed-end responses. There are badly written surveys that cause respondents to disengage, resulting in bad open-ends. There are people who have the time to thoughtfully answer one or two open-end questions, but not five of them. Thus, it is very possible for real, engaged survey-takers to provide poor open-ends, and discarding their data wouldn’t be ideal for data quality (e.g., McCarthy, 2022). Humans are messy and are never going to be 100% engaged in the survey every time. This is why we have long recommended using multiple types of quality control questions to help identify low-quality data, and discarding respondents only if they fail a series of checks (e.g., Phillips, 2013). In short: data quality should never depend on open-ends alone, and this was true even before GenAI became so easily accessible.
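
The "fail a series of checks" rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the check names (poor_open_end, attention_check, straightlining) and the two-failure threshold are assumptions made for the example, not values taken from any cited source or from Dynata's systems.

```python
from dataclasses import dataclass, field


@dataclass
class Respondent:
    respondent_id: str
    # Hypothetical check names mapped to True when the respondent FAILED that check.
    failed_checks: dict[str, bool] = field(default_factory=dict)


def should_discard(respondent: Respondent, min_failures: int = 2) -> bool:
    """Discard only when at least `min_failures` separate checks were failed.

    The default of 2 is an assumed threshold for illustration.
    """
    return sum(respondent.failed_checks.values()) >= min_failures


respondents = [
    # A poor open-end alone is not enough to discard this respondent.
    Respondent("r1", {"poor_open_end": True, "attention_check": False, "straightlining": False}),
    # Several failed checks together justify removal.
    Respondent("r2", {"poor_open_end": True, "attention_check": True, "straightlining": True}),
]

to_discard = [r.respondent_id for r in respondents if should_discard(r)]
print(to_discard)
```

Here respondent r1, whose only issue is a poor open-end, is kept, while r2 is removed because multiple independent checks failed.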

In addition to multiple types of checks, there are many tools beyond open-end review that can and should be leveraged to ensure that survey takers are not fraudulent or so poorly engaged as to not be viable. One example is Dynata’s QualityScore™, which is employed at the survey level to detect and eliminate fraud and disengagement from the data set. QualityScore™ is a 175+ point machine learning (ML) model that looks at many different types of survey-taking behavior, such as survey responses, passive behaviors, and device information, to identify low-quality respondents. QualityScore™ can also help detect the potential use of GenAI through the various ways it evaluates open-end responses. For example, QualityScore™ evaluates passive behaviors that may be associated with the use of GenAI tools, such as mouse movement and copying and pasting answers. Given their motivation to move through the survey quickly or automatically, fraudsters are also unlikely to use GenAI in isolation, so other suspicious behaviors such as survey-taking speed, response patterns, and page translation become even more important to look out for in fraud detection. In anticipation of the mainstream release of ChatGPT, the QualityScore™ algorithm was revised in late 2022 to put even more emphasis on these sorts of passive behaviors. Once fraudulent and completely disengaged respondents are identified, they are removed from the data in real time, making the final data cleaning step easier and more efficient.
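
To make one of these passive signals concrete, survey-taking speed can be checked by comparing each respondent's completion time with the median for the survey. The sketch below is a generic illustration and is not how QualityScore™ works; the function name flag_speeders, the sample durations, and the cutoff of one third of the median are all assumptions chosen for the example.

```python
from statistics import median


def flag_speeders(durations_sec: dict[str, float], fraction: float = 0.33) -> set[str]:
    """Return IDs of respondents who finished in less than `fraction` of the median time.

    `fraction` is an assumed cutoff for illustration; a real study would tune it.
    """
    cutoff = fraction * median(durations_sec.values())
    return {rid for rid, secs in durations_sec.items() if secs < cutoff}


# Hypothetical completion times in seconds, keyed by respondent ID.
durations = {"r1": 600.0, "r2": 540.0, "r3": 120.0, "r4": 660.0}

speeders = flag_speeders(durations)
print(sorted(speeders))
```

In practice a speed flag like this would be only one input among many, combined with the other behaviors mentioned above rather than used to remove a respondent on its own.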

Of course, QualityScore™ is not the only tool we have to help with fraud detection. For example, if someone is flagged as suspicious in Dynata’s panel systems, they will be prompted to verify their identity with a valid government-issued photo ID and will be unable to take more surveys until they do so. In addition, it’s critical to take a holistic, full-history approach, using every data point we have on a respondent. We evaluate respondents not only on their behavior within the current survey, but also historically (have they been a good panelist in the past, or have they been flagged in other projects?), holistically (how are other survey-takers behaving, and what is typical within the context of this survey?), and systematically (how do they behave in our survey router, and are there anomalies in traffic patterns?).

As with many new technologies, GenAI will likely be a tool that fraudsters try to employ in pursuit of conducting survey fraud at scale. To be sure, open-end checks should never be the only tool in your toolkit when evaluating the quality of your data. The rigorous, focused use of multiple tools (including AI!) and checks within and across different stages of a respondent’s experience on a panel is key to ensuring that your data are of the highest quality. This is the approach we take at Dynata, and it has paid dividends: our study rejection rates after automatic quality cleaning are the lowest in the industry, just 3-4% on average, compared to many that remove 15% or more. Taken together, it’s clear that although GenAI certainly poses risks to market research, those risks can be mitigated with new methods, including AI/ML, and continuous improvement in our techniques. In this context, the opportunities that come with GenAI are a much more fruitful avenue to explore!