Tags are often used to describe user-generated content on the Web. However, the available Web applications do not handle new tag information incrementally, which negatively influences their scalability. Since the cosine similarity between tags represented as co-occurrence vectors is an important aspect of these applications, we propose two approaches for the incremental computation of cosine similarities. The first approach recalculates the cosine similarity only for new tag pairs and for existing tag pairs whose co-occurrences have changed. The second approach computes the cosine similarity between two tags by reusing, if available, the previous cosine similarity between these tags. Both approaches produce the same cosine values that a complete recalculation of the cosine similarities would yield. Our experiments show that the proposed approaches are between 1.2 and 23 times faster than a complete recalculation, depending on the number of co-occurrence changes and new tags.
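
To illustrate the idea behind the first approach, the following is a minimal Python sketch, not the authors' implementation. It assumes each tag is represented by a sparse vector of co-occurrence counts with other tags, and that adding a resource only changes the vectors of the tags attached to it. The class `TagSpace` and its methods `add_resource` and `full_recalculation` are hypothetical names introduced here for illustration; the rule of recomputing every pair that involves a touched tag is likewise an assumption of this sketch.

```python
import math
from collections import defaultdict
from itertools import combinations


class TagSpace:
    """Hypothetical container: co-occurrence vectors plus cached cosines."""

    def __init__(self):
        # cooc[a][b] = number of resources on which tags a and b appear together
        self.cooc = defaultdict(lambda: defaultdict(int))
        # cos[(a, b)] with a < b = cached cosine similarity of the two tags
        self.cos = {}

    def _cosine(self, a, b):
        va, vb = self.cooc[a], self.cooc[b]
        dot = sum(w * vb.get(t, 0) for t, w in va.items())
        if dot == 0:
            return 0.0
        norm_a = math.sqrt(sum(w * w for w in va.values()))
        norm_b = math.sqrt(sum(w * w for w in vb.values()))
        return dot / (norm_a * norm_b)

    def add_resource(self, tags):
        """Register one tagged resource and refresh only the affected pairs."""
        touched = set(tags)
        for t in touched:
            self.cooc[t]  # make sure new tags are known, even without co-occurrences
        for a, b in combinations(sorted(touched), 2):
            self.cooc[a][b] += 1
            self.cooc[b][a] += 1
        # Only vectors of touched tags changed, so only pairs that involve
        # at least one touched tag can have a different cosine.
        for a in touched:
            for b in self.cooc:
                if a != b:
                    key = (a, b) if a < b else (b, a)
                    self.cos[key] = self._cosine(*key)

    def full_recalculation(self):
        """Baseline: recompute the cosine for every tag pair from scratch."""
        return {
            (a, b): self._cosine(a, b)
            for a, b in combinations(sorted(self.cooc), 2)
        }


space = TagSpace()
for resource_tags in (["python", "code"], ["python", "web", "code"], ["photo", "travel"]):
    space.add_resource(resource_tags)
```

In this sketch the cached values in `space.cos` agree with `space.full_recalculation()` after every update, which mirrors the claim that the incremental approach yields the same cosine values as a complete recalculation. The savings come from skipping pairs in which neither tag was touched by the new resource.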

Additional Metadata
Persistent URL dx.doi.org/10.1007/978-3-642-32597-7_14, hdl.handle.net/1765/53305
Vermaas, R, Vandic, D, & Frasincar, F. (2012). Incremental cosine computations for search and exploration of tag spaces. doi:10.1007/978-3-642-32597-7_14