Published 2010 | Version v1
Journal article Open

Who's Who in Your Digital Collection: Developing a Tool for Name Disambiguation and Identity Resolution

  • 1. University of Illinois at Urbana-Champaign
  • 2. Penn State University
  • 3. University of Maryland

Description

In the past twenty years, the problem space of automatically recognizing, extracting, classifying, and disambiguating named entities (e.g., the names of people, places, and organizations) in digitized text has received considerable attention in research produced by the library, computer science, and computational linguistics communities. However, linking the output of these advances with the library community continues to be a challenge. This paper describes work being done by the University of Illinois, the Online Computer Library Center (OCLC), and the University of Maryland to develop, evaluate, and link Named Entity Recognition (NER) and Entity Resolution with tools used for search and access. Name identification and extraction tools, particularly when integrated with resolution against an authority file (e.g., WorldCat Identities or Wikipedia), can make subject access to a document collection more reliable, improving document discoverability for end users.

Files

58-278-1-PB.pdf

Files (621.4 kB)

Name Size
md5:1462780b8bdc4069f4c080bea9979c99
619.5 kB
md5:81a7c3520a5e3d420435a96e35a8e47d
1.9 kB

Additional details

Identifiers

Other
oai:knowledge.uchicago.edu:135

UChicago Information

Department(s)
2010 Journal of the Chicago Colloquium on Digital Humanities and Computer Science Vol. 1, No. 2