A study on the influence of the number of MTurkers on the quality of the aggregate output
Recent years have seen an increased interest in crowdsourcing as a way of obtaining information from a large group of workers at a reduced cost. In general, there are arguments for and against using multiple workers to perform a task. On the positive side, multiple workers bring different perspectives to the process, which may result in a more accurate aggregate output since the biases of individual judgments might offset each other. On the other hand, a larger population of workers is more likely to have a higher concentration of poor workers, which might bring down the quality of the aggregate output. In this paper, we empirically investigate how the number of workers on the crowdsourcing platform Amazon Mechanical Turk influences the quality of the aggregate output in a content-analysis task. We find that both the expected error in the aggregate output and the risk of a poor combination of workers decrease as the number of workers increases. Moreover, our results show that restricting the population to at most the overall top 40% of workers is likely to produce more accurate aggregate outputs, whereas removing up to the overall worst 40% of workers can actually make the aggregate output less accurate. This result holds because top-performing workers are consistent across multiple tasks, whereas the worst-performing workers tend to be inconsistent. Our results thus contribute to a better understanding of, and provide valuable insights into, how to design more effective crowdsourcing processes.
Rotterdam School of Management (RSM), Erasmus University
Carvalho, A., Dimitrov, S., & Larson, K. (2015). A study on the influence of the number of MTurkers on the quality of the aggregate output. doi:10.1007/978-3-319-17130-2_19