Population counts and longitude and latitude coordinates were estimated for the 50 largest cities in the United States by computational linguistic techniques and by human participants. The mathematical technique Latent Semantic Analysis applied to newspaper texts produced similarity ratings between the 50 cities that allowed for a multidimensional scaling (MDS) of these cities. MDS coordinates correlated with the actual longitude and latitude of these cities, showing that cities that are located together share similar semantic contexts. This finding was replicated using a first-order co-occurrence algorithm. The computational estimates of geographical location as well as population were akin to human estimates. These findings show that language encodes geographical information that language users in turn may use in their understanding of language and the world.
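
The scaling-and-correlation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the similarity matrix below is a synthetic stand-in for LSA similarities (it is derived from approximate city coordinates, so the recovery is good by construction), classical (Torgerson) MDS is assumed because the abstract does not say which MDS variant was used, and the helper names `classical_mds` and `r_squared` are hypothetical.

```python
import numpy as np

def classical_mds(dissim, n_dims=2):
    """Classical (Torgerson) MDS: embed points from a dissimilarity matrix."""
    n = dissim.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    gram = -0.5 * centering @ (dissim ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Scale the leading eigenvectors by the square roots of their eigenvalues.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def r_squared(coords, target):
    """R^2 of a least-squares fit of `target` on the MDS axes plus intercept.

    A regression is used instead of a per-axis correlation because an MDS
    solution is only defined up to rotation and reflection.
    """
    design = np.column_stack([coords, np.ones(len(coords))])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    return 1.0 - resid.var() / target.var()

# Approximate (longitude, latitude) of six cities; illustration only.
cities = {
    "New York": (-74.0, 40.7),
    "Los Angeles": (-118.2, 34.1),
    "Chicago": (-87.6, 41.9),
    "Houston": (-95.4, 29.8),
    "Phoenix": (-112.1, 33.4),
    "Seattle": (-122.3, 47.6),
}
lonlat = np.array(list(cities.values()))

# ASSUMPTION: synthetic similarities that decay with map distance, standing
# in for the LSA similarity ratings computed from newspaper text.
dist = np.linalg.norm(lonlat[:, None, :] - lonlat[None, :, :], axis=-1)
similarity = 1.0 - dist / dist.max()
dissim = 1.0 - similarity

coords = classical_mds(dissim)
r2_lon = r_squared(coords, lonlat[:, 0])
r2_lat = r_squared(coords, lonlat[:, 1])
print(f"R^2 longitude: {r2_lon:.3f}, R^2 latitude: {r2_lat:.3f}")
```

With similarities estimated from real text, the fit would be noisier; the sketch only shows how a similarity matrix can be turned into a two-dimensional map and compared against known coordinates.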

doi.org/10.1111/j.1551-6709.2008.01003.x, hdl.handle.net/1765/74700
Cognitive Science: a multidisciplinary journal
Department of Psychology

Louwerse, M. M., & Zwaan, R. A. (2009). Language encodes geographical information. Cognitive Science: a multidisciplinary journal, 33(1), 51–73. doi:10.1111/j.1551-6709.2008.01003.x