This study aims to assess the reliability and validity of exemplar similarity derived from category fluency tasks. A homogeneous sample of 21 healthy participants completed a category fluency task twice with an interval of one week. They also rated pairs composed of the most frequently generated exemplars in terms of similarity. Similarities were derived from the fluency data by determining the average distance between generated exemplars and correcting it for repetitions and response sequence length. We calculated the correlation between the similarities derived from the two sessions of the fluency task and between the derived similarities and the directly rated similarities. Spatial representations of the similarities were constructed using multidimensional scaling to visualize the differences between the two sessions of the fluency task and the pairwise rating task. We find that the derived similarities are not stable over time and show little correspondence with directly rated similarities. The differences between similarities derived from category fluency tasks in healthy participants indicate that similar differences between healthy controls and patients with mental disorders do not necessarily point to a semantic impairment of the latter, but may instead reflect the unreliability of the data.
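
As a rough illustration of how similarities might be derived from fluency data and compared across sessions, the following Python sketch computes a normalized positional distance between exemplars and correlates the results of two sessions. It is not the procedure used in the study: the abstract does not specify the exact distance measure or corrections, so the sketch assumes that repetitions are handled by keeping only the first occurrence of an exemplar and that sequence length is corrected for by dividing by the number of unique responses minus one. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pair_distances(sequence):
    """Normalized positional distance for each pair of exemplars in one response sequence."""
    # Assumption: a repeated exemplar counts only at its first occurrence.
    first_pos = {}
    for word in sequence:
        if word not in first_pos:
            first_pos[word] = len(first_pos)
    n = len(first_pos)
    if n < 2:
        return {}
    # Assumption: dividing by (n - 1) corrects for sequence length,
    # so distances range from 1 / (n - 1) to 1.
    return {
        frozenset((a, b)): abs(first_pos[a] - first_pos[b]) / (n - 1)
        for a, b in combinations(first_pos, 2)
    }


def mean_distances(sequences):
    """Average each pair's distance over all participants who produced both exemplars."""
    collected = {}
    for seq in sequences:
        for pair, dist in pair_distances(seq).items():
            collected.setdefault(pair, []).append(dist)
    return {pair: mean(dists) for pair, dists in collected.items()}


def pearson(xs, ys):
    """Pearson correlation of two equally long lists; NaN if either has zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def session_correlation(session1, session2):
    """Correlate the mean distances of the exemplar pairs present in both sessions."""
    d1, d2 = mean_distances(session1), mean_distances(session2)
    shared = [pair for pair in d1 if pair in d2]
    return pearson([d1[p] for p in shared], [d2[p] for p in shared])
```

Calling `session_correlation(session1, session2)`, where each argument is a list of per-participant response lists, would give a test-retest coefficient under these assumptions. A smaller distance is read here as greater similarity.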

