A new paper from researchers at the Swiss university EPFL suggests that between 33% and 46% of distributed crowd workers on Amazon's Mechanical Turk service appear to have "cheated" when performing a particular task assigned to them, using tools like ChatGPT to do some of the work. If the practice is widespread, it could turn out to be quite a serious problem.
Amazon's Mechanical Turk has long been a haven for developers who need work done by actual humans. In a nutshell, it's an application programming interface (API) that farms tasks out to people, who perform them and return the results. These are usually the kinds of tasks you wish computers were better at. According to Amazon, an example would be: "Drawing bounding boxes to create high-quality datasets for computer vision models, where the task might be too ambiguous for a purely mechanical solution and too large even for a great team of human experts."
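To make the "API for humans" idea concrete, here is a minimal sketch of how a requester might post such a task using Python's boto3 library. The title, reward, and question wording are illustrative placeholders, and the sandbox endpoint is used so no real workers are paid.

```python
# Minimal sketch: posting a task ("HIT") to Mechanical Turk via boto3.
# All task details below are illustrative placeholders.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint: lets you test without paying real workers.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# A simple free-text question in MTurk's QuestionForm XML format.
question_xml = """<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionIdentifier>summary</QuestionIdentifier>
    <QuestionContent><Text>Summarize the following abstract in about 100 words.</Text></QuestionContent>
    <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
  </Question>
</QuestionForm>"""

hit = mturk.create_hit(
    Title="Summarize a medical abstract",
    Description="Read a short abstract and write a ~100-word summary.",
    Reward="0.50",                     # USD per completed assignment
    MaxAssignments=1,                  # how many workers may do this task
    LifetimeInSeconds=86400,           # HIT stays available for one day
    AssignmentDurationInSeconds=1800,  # each worker gets 30 minutes
    Question=question_xml,
)
print("Created HIT:", hit["HIT"]["HITId"])
```

Once workers submit, the requester polls the same API for results, which is exactly why the service behaves like any other remote endpoint from the developer's point of view.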
Data scientists understandably treat datasets differently depending on their origin, that is, whether they were generated by people or by a large language model (LLM). But the problem with Mechanical Turk here is worse than it looks: AI is now cheap enough that product managers who choose Mechanical Turk over a machine-generated solution are paying a premium precisely because they are counting on humans doing the job better than robots. Poisoning that data well could have serious repercussions.
"Distinguishing LLMs from human-generated text is difficult for both machine learning models and humans," the researchers said. They therefore developed a methodology for determining whether textual content was created by a human or by a machine.
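The paper does not reduce to a single script, but as a rough illustration of what one half of such a methodology can look like, here is a hedged sketch of a human-vs-LLM text detector built with scikit-learn. This is not the EPFL authors' actual model, and the labeled training lists (`human_texts`, `llm_texts`) are hypothetical placeholders.

```python
# Hypothetical sketch of a human-vs-LLM text classifier; NOT the EPFL
# authors' actual model. Assumes you already hold labeled example
# summaries in `human_texts` and `llm_texts`.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

human_texts = ["A summary known to be written by a person ..."]  # placeholder
llm_texts = ["A summary known to be generated by an LLM ..."]    # placeholder

texts = human_texts + llm_texts
labels = [0] * len(human_texts) + [1] * len(llm_texts)  # 1 = machine-written

# Character n-grams are a common cheap signal for stylistic regularities
# in generated text; this is one plausible feature choice, not the paper's.
detector = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
detector.fit(texts, labels)

# Score an unseen submission: closer to 1 means "looks machine-written".
print(detector.predict_proba(["Some new worker-submitted summary ..."])[0][1])
```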
The test involved asking crowd workers to condense research abstracts from the New England Journal of Medicine into 100-word summaries. Notably, this is precisely the kind of task that generative AI technologies like ChatGPT are good at.
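Which is why offloading it takes only a few lines. The following is a hypothetical sketch of how a worker might have routed the task through the OpenAI API; the model name and prompt wording are assumptions, not details taken from the study.

```python
# Hypothetical sketch: using the OpenAI API to produce a ~100-word summary,
# the way a crowd worker might have offloaded the task. Model name and
# prompt wording are assumptions, not taken from the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

abstract = "..."  # the medical abstract text would go here

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": f"Summarize the following medical abstract "
                       f"in about 100 words:\n\n{abstract}",
        },
    ],
)
print(response.choices[0].message.content)
```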