A dataset of pairs of an image and tags for cataloging image-based archives

Research output: Contribution to journal › Article › peer-review


The dataset described in this paper contains pairs of images collected from the Web and their keyword tags, each linked to the appropriate Wikipedia entity page, along with programs to reproduce the experiments. It is intended for evaluating a disambiguation task: given an image and its tags, select the appropriate Wikipedia page for each tag. We collected images tagged with animal names, together with their other tags, from the photo-sharing site Flickr. Animal names were chosen for their ambiguity, since they may refer not only to animals but also to other kinds of objects, e.g., nicknames of sports teams. Annotators judged which Wikipedia page each tag corresponds to and linked the tag to it. The dataset includes 420 images and 2,464 tags. It is useful for developing systems that link an image's keyword to a knowledge-base entry, as well as image classification systems whose targets include fine-grained classes, e.g., proper nouns of objects.
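The disambiguation task described above can be sketched in code as follows. This is only an illustrative sketch: the record layout (`TaggedImage`), the field names, the example tags and page titles, and the baseline `naive_predict` are all assumptions made for illustration and do not reflect the dataset's actual file format or the authors' programs. It shows one plausible way to score a system by per-tag accuracy against annotator-assigned Wikipedia pages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class TaggedImage:
    """Hypothetical record: one image and its tags with gold Wikipedia pages."""
    image_id: str
    tag_links: Dict[str, str]  # tag -> Wikipedia page title chosen by annotators


def accuracy(records: Iterable[TaggedImage],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of tags for which `predict` returns the gold page title."""
    total = 0
    correct = 0
    for rec in records:
        for tag, gold in rec.tag_links.items():
            total += 1
            if predict(rec.image_id, tag) == gold:
                correct += 1
    return correct / total if total else 0.0


def naive_predict(image_id: str, tag: str) -> str:
    """Assumed baseline: ignore the image and use the capitalized tag as the title."""
    return tag.capitalize()


# Invented example data (not taken from the dataset): the same animal-name tag
# refers to the animal in one image and to a sports team in another.
records = [
    TaggedImage("img-001", {"tiger": "Tiger", "zoo": "Zoo"}),
    TaggedImage("img-002", {"tiger": "Detroit Tigers", "baseball": "Baseball"}),
]

score = accuracy(records, naive_predict)
print(f"per-tag accuracy: {score:.2f}")
```

The image-blind baseline fails on the ambiguous tag in the second record, which is the kind of case where image content would be needed to pick the right page.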

Original language: English
Article number: 108722
Journal: Data in Brief
Publication status: Published - Dec 2022

All Science Journal Classification (ASJC) codes

  • General

