nlp4arc 2017

Event Information
3 February 2017, 9:00am – 5:00pm
Student Union rooms 3206A and 3206B, University of North Carolina, Chapel Hill, North Carolina

Suggested hashtag: #nlp4arc

About the Symposium
The symposium consisted of short talks and unconference-style breakout sessions on applying natural language processing (NLP) to support the use, access, and analysis of digital primary source materials.

A rapidly growing body of materials with significant cultural value is “born digital.” Information professionals must be prepared to extract digital materials from their original environments and media in ways that reflect the rich metadata and ensure the integrity of the materials. They must also support new forms of access that allow users to make sense of materials and understand their context.

Many types of contextual information can be vital to making sense of digital objects and using them meaningfully. These include objects, agents, occurrences, purposes, times, places, forms of expression, concepts/abstractions, and relationships.

Libraries, archives, and museums (LAMs) can use many existing open-source tools to identify, extract, and expose such contextual entities from the wide diversity of born-digital materials that they already hold and continue to receive. NLP tools and methods can both (1) facilitate curatorial decision making and description, and (2) generate access points to be presented to end users.

Schedule
9:00-9:15 Welcome and Introduction – Cal Lee
9:15-10:45 Challenges and Opportunities in Applying NLP to Digital Collections

10:45-11:00 Break
11:00-12:30 From Projects to Programs

12:30-1:30 Lunch
1:30-2:00 Kam Woods, University of North Carolina at Chapel Hill – BitCurator NLP Development and Plans
2:00-2:30 Generation of Breakout Topics
2:30-2:45 Break
2:45-3:30 Breakout Sessions
3:30-4:00 Reporting Back from Breakout Sessions
4:00-5:00 Wrap Up and Next Steps