Registration for nlp4arc 2018 is now open

We are pleased to announce that registration for the BitCurator NLP Forum 2018 (nlp4arc) is now open. The event will focus on the application of natural language processing (NLP) to support use, access, and analysis of digital primary source materials. Click here to register.

A rapidly growing body of materials with significant cultural value is “born digital.” Information professionals must be prepared to extract digital materials from their original environments and media in ways that reflect the rich metadata and ensure the integrity of the materials. They must also support new forms of access: allowing users to make sense of materials and understand their context.

Many types of contextual information can be vital to making sense of digital objects and using them meaningfully. These can include objects, agents, occurrences, purposes, times, places, forms of expression, concepts/abstractions and relationships.

There are many existing open-source tools that libraries, archives and museums (LAMs) can use to identify, extract and expose such contextual entities from the wide diversity of born-digital materials that LAMs already hold and continue to receive. NLP tools and methods can help to both (1) facilitate curatorial decision making and description, and (2) generate access points to be presented to end users.

nlp4arc 2018
February 2, 2018, 9:00am – 5:00pm
Dey Hall, Toy Lounge
University of North Carolina
Chapel Hill, North Carolina
Suggested hashtag: #nlp4arc


9:00-9:15 Welcome and introduction – Cal Lee
9:15-10:30 Foundations and Strategies

  • Michael Piotrowski, University of Lausanne – Historical Texts, NLP, and Formal Models
  • Daniel Pitti, University of Virginia – Name Entities, Named Entities, Facts in Contexts
  • Carl Wilson, Open Preservation Foundation – Not Just Building Tools: Strategies for Sustaining Software and Associated Communities
  • Mark Matienzo, Stanford University – Practical and Ethical Considerations of NLP Applied to Humanitarian Digital Libraries
10:30-10:45 Break
10:45-12:00 Implementation and Projects

  • Mary Elings, University of California, Berkeley – Using NLP to Support Dynamic Arrangement, Description, and Discovery of Born Digital Collections
  • Jeremy Gibson and Nitin Arora, North Carolina Department of Natural and Cultural Resources – “Honey, I Tagged the Email! Now What?”: NLP and the TOMES Project
  • Ryan Shaw, University of North Carolina at Chapel Hill – Gathering Specimens to Augment Authority Files
  • Stéfan Sinclair, McGill University – Spyral Notebooks: Some Reasons Why the World Needs Yet Another Jupyter
12:00-12:30 Panel on NLP Lessons Learned

  • Jaime Arguello, University of North Carolina at Chapel Hill
  • Stephanie Haas, University of North Carolina at Chapel Hill
12:30-1:30 Lunch
1:30-2:15 Enabling Technologies

  • Laney McGlohon, ArchivesSpace – Finding the Data: The Use of a Data Dictionary in Retrieving Descriptive Metadata from ArchivesSpace
  • Kam Woods and Cal Lee, University of North Carolina at Chapel Hill – BitCurator NLP Development and Plans
2:15-2:45 Generation of Breakout Topics
2:45-3:00 Break
3:00-3:45 Breakout Sessions
3:45-4:15 Reporting Back from Breakout Sessions
4:15-5:00 Wrap Up and Next Steps

General Registration – $30
Student & BitCurator Consortium Members Registration – $15
Register here.

Please see the list of nearby hotels below.

The Carolina Inn
211 Pittsboro Street
Chapel Hill, NC 27516
Tel 800.962.8519
(This is the closest option. It is on the UNC Campus, just a couple of blocks from the Student Union.)

Hampton Inn & Suites Chapel Hill Carrboro/Downtown
370 East Main Street, Unit 100
Carrboro, North Carolina 27510
Tel 919.969.6988
(Walkable distance)

Holiday Inn Express Chapel Hill
6119 Farrington Road
Chapel Hill, NC 27517
Tel 919.489.7555

Aloft Chapel Hill
1001 South Hamilton Road
Chapel Hill, NC 27517
Tel 866.716.8143
(Shuttle buses available)