BitCurator Visits SAA 2013

Last week, members of the BitCurator team visited New Orleans for the 2013 Society of American Archivists (SAA) Joint Annual Meeting. On Tuesday, August 13, we presented a poster at the 7th Annual SAA Research Forum on how the BitCurator environment can support archivists’ preservation goals in institutions.

In our poster, we described four preservation scenarios during the creation and ingest of a disk image into an archival repository. We then showed how the output generated by BitCurator tools during each scenario can be captured and stored as PREMIS-encoded preservation events.

Event 1: Image Capture
Definition: A forensic disk image is created from the original source media.
Tool: Guymager
Metadata: Acquisition time; duration of capture; manufacturing device & serial number; user who performed acquisition; cryptographic hash values

Event 2: File System Analysis
Definition: A set of file-objects corresponding to all of the files and directories identified on a disk image is analyzed and reported.
Tool: fiwalk
Metadata: Time of analysis; duration of analysis; user who performed file system analysis; file system partitions; file system volumes

Event 3: Feature Analysis
Definition: Describes forensic analysis of the raw bitstream, producing reports on specific features of interest (such as personally identifying or other sensitive information).
Tool: bulk_extractor
Metadata: Time of analysis; execution environment; number of reports produced

Event 4: Redaction
Definition: Used to overwrite specific patterns within the disk image according to a user-supplied rule-set.
Tool: iRedact.py
Metadata: Time of redaction; environment details; user performing redaction; name of new redacted image
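
To make the idea of a PREMIS-encoded preservation event more concrete, here is a minimal sketch in Python of how the metadata for Event 1 might be wrapped in a PREMIS event element. It is an illustration only, not the encoding BitCurator itself produces. The namespace assumes PREMIS version 2.x, the helper names (build_premis_event, _child) are hypothetical, and the identifier, timestamp, and username are placeholder values rather than real tool output. The result has not been validated against the PREMIS schema.

```python
import xml.etree.ElementTree as ET

# Assumption: the PREMIS 2.x namespace; change it if your repository uses another version.
PREMIS_NS = "info:lc/xmlrepository/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def _child(parent, tag, text=None):
    """Append a namespaced PREMIS child element and return it (hypothetical helper)."""
    element = ET.SubElement(parent, f"{{{PREMIS_NS}}}{tag}")
    if text is not None:
        element.text = text
    return element


def build_premis_event(event_id, event_type, date_time, detail, outcome, agent):
    """Build one PREMIS event element from values reported by a tool."""
    event = ET.Element(f"{{{PREMIS_NS}}}event")
    identifier = _child(event, "eventIdentifier")
    _child(identifier, "eventIdentifierType", "UUID")
    _child(identifier, "eventIdentifierValue", event_id)
    _child(event, "eventType", event_type)
    _child(event, "eventDateTime", date_time)
    _child(event, "eventDetail", detail)
    outcome_info = _child(event, "eventOutcomeInformation")
    _child(outcome_info, "eventOutcome", outcome)
    linking_agent = _child(event, "linkingAgentIdentifier")
    _child(linking_agent, "linkingAgentIdentifierType", "username")
    _child(linking_agent, "linkingAgentIdentifierValue", agent)
    return event


# Placeholder values standing in for Guymager output (Event 1: Image Capture).
capture_event = build_premis_event(
    event_id="00000000-0000-0000-0000-000000000001",
    event_type="capture",
    date_time="2013-08-13T10:00:00",
    detail="Forensic disk image created with Guymager",
    outcome="success",
    agent="archivist01",
)
premis_xml = ET.tostring(capture_event, encoding="unicode")
print(premis_xml)
```

The same function could be called once per scenario, with the event type and detail changed to describe file system analysis, feature analysis, or redaction.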

OPF Hackathon

Do you have born-digital collections that you’re grappling with? Old media with file systems that you can’t read? Concerns about disseminating your collections because they might contain sensitive information? Have you created disk images but don’t yet have the ability to process the data on those images?

Or are you a developer who would like to tackle some real-world challenges that will directly benefit collecting institutions in caring for born-digital materials? Are you adept at developing and applying open-source software, and would you like to learn through hands-on experience how to apply various open-source digital forensics tools to collections?

If you answered “yes” to any of the above, then we have an event for you!

The Open Planets Foundation and the University of North Carolina at Chapel Hill’s School of Information and Library Science proudly present:

The OPF Hackathon at UNC Chapel Hill:
Tackling Real-World Collection Challenges with Digital Forensics Tools and Methods

Monday, June 3rd-Wednesday, June 5th
Registration cost: $150 (REDUCED!)
*includes 3 lunches and 2 dinners

Come join world-renowned digital preservation experts, collection managers, and coders as we collectively hack through digital preservation problems using a variety of digital forensics methods and approaches.

Our expert facilitators will be available to provide hands-on guidance. We will also be presenting awards for best collection challenge and best technical solution!

This is the first OPF Hackathon to take place in the United States, and we are thrilled to host you as we hack out solutions to common problems in born-digital collections.

Event information: http://bit.ly/Z09fls
Registration page: http://bit.ly/XK4zel

DFXML Tag Library

We’re excited to announce the newest release of the DFXML Tag Library, which contains tags for all fiwalk-generated output. The next release will contain tags for bulk_extractor technical metadata and some additional file system metadata. Special thanks to Marty Gengenbach, now the Electronic Records Archivist at the Kansas Historical Society, who began compiling the Tag Library as a master’s student at UNC SILS.

An Early Look at the BitCurator Environment

The following blog post was written by Porter Olsen, BitCurator’s Research Assistant at the Maryland Institute for Technology in the Humanities (MITH).

Roughly one year ago, members of the BitCurator Professional Experts Panel (PEP) met at MITH to help further refine the scope and priorities of the BitCurator project and to ensure that our efforts would have “real world” usefulness for archivists and librarians who are responsible for born-digital materials. The PEP meeting, along with a similar meeting in January of the Development Advisory Group, produced two significant results: first, revisions to a product requirements document that outlined the work to be done on the BitCurator project, including an architecture overview and feature descriptions; and second, a collection of detailed workflows that have helped us to identify where BitCurator can best fit into and enhance curatorial practices. The upcoming one-year anniversary of these initial meetings makes this a good time to take a look at how the BitCurator project has progressed and where we’re headed in the near future.

The BitCurator development team is making available a test release of the BitCurator Environment, which can now be downloaded from the BitCurator wiki. The BitCurator Environment is a fully functioning Linux system built on Ubuntu 12.04 that has been customized to meet the needs of archivists and librarians, and it can be run either as a stand-alone operating system or as a virtual machine. Once installed, the BitCurator Environment includes a number of digital forensics tools that can be integrated into digital curation workflows. A sampling of those tools includes:

  • Guymager: a tool for creating disk images in one of three commonly used disk image formats (dd, E01, and AFF).
  • custom Nautilus scripts: a collection of enhancements to Ubuntu’s default file browser that allow users to quickly generate checksums, identify file types, safely mount drives, and more.
  • bulk_extractor: a tool that locates personally identifiable information (PII) and then generates reports on that information in both human- and machine-readable formats.
  • Ghex: an open source hex editor that allows users to view a file in hexadecimal format.

Additional tools will be made available in later releases of the BitCurator Environment.
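
For readers curious about what generating a checksum for a disk image involves, here is a minimal sketch in Python. It is not the code behind the Nautilus scripts, whose implementation is not described here. The function name hash_file and the choice of MD5 and SHA-1 are assumptions made for illustration, and the small temporary file stands in for a real disk image.

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithms=("md5", "sha1"), chunk_size=1024 * 1024):
    """Return hex digests for a file, reading it in chunks to bound memory use."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as stream:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Stand-in for a disk image: a small temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as sample:
    sample.write(b"abc")

digests = hash_file(sample.name)
os.remove(sample.name)

for name, value in digests.items():
    print(f"{name}: {value}")
```

Reading in chunks matters here because disk images are often far larger than available memory. Comparing the digests recorded at acquisition with digests computed later is one way to confirm that an image has not changed.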

The BitCurator team has also been developing various forms of documentation to complement the product development. On the BitCurator wiki you can find documentation that introduces virtual machines, instructs users on how to install the BitCurator environment, and gives detailed configuration instructions on sharing devices and files between host and virtual machines. We are also currently working on developing documentation that outlines use-case scenarios for digital archivists using the tools mentioned above.

It is not enough, of course, to simply build tools and a wiki page and hope users will come find our software. The BitCurator team has also been actively promoting the BitCurator Environment through lectures, panel discussions, conference talks, posters, and publications. Recent examples include presentations from Kam Woods (BitCurator technical lead), Cal Lee and Matthew Kirschenbaum (BitCurator Co-PIs) on BitCurator and digital forensics at this year’s Society of American Archivists conference in San Diego; and presentations by Cal at Archiving 2012 in Copenhagen, Denmark, Memory of the World in the Digital Age in Vancouver, Canada, the International Congress on Archives in Brisbane, Australia, and to the staff of the National Library of Australia in Canberra. In addition, team members Alex Chassanoff and Porter Olsen will present a poster on integrating digital forensics into born-digital workflows at the upcoming ASIS&T conference. We have also recently published an article in D-Lib Magazine titled “BitCurator: Tools and Techniques for Digital Forensics in Collecting Institutions.”

We have also been incorporating BitCurator elements into professional education offerings.  Cal has developed a one-day continuing education course called “Digital Forensics for Archivists” as part of the Digital Archives Specialist (DAS) curriculum of the Society of American Archivists (SAA).  Matt Kirschenbaum and Naomi Nelson (BitCurator PEP member) have been offering a course called “Born-Digital Materials: Theory & Practice” as part of the Rare Book School (RBS) at the University of Virginia.  Both the SAA and RBS courses serve as excellent mechanisms for raising awareness about BitCurator’s offerings and eliciting needs and perceptions from working professionals.

These are just a few examples of the work done by the BitCurator team to get the word out about BitCurator and our work on bringing digital forensics tools and techniques to the digital curation community. For a full list of BitCurator-related publications and presentations, please visit our project website at www.bitcurator.net.

As we look forward into the next few months, the BitCurator team has a number of goals and benchmarks that we will be working towards, chief among them being the release of the BitCurator beta later this fall. We are also organizing the second annual meeting of our Development Advisory Group for January 2013, where we will elicit feedback from DAG members on our releases to date. The day before the DAG meeting will be CurateGear 2013 on January 9 in Chapel Hill, where members of the DAG and many other experts will give presentations and run demos of software to support digital curation. Finally, we are currently in the process of applying for funding for phase two of the BitCurator project to support additional product development and further efforts to engage with working professionals who could benefit from implementation of the BitCurator tools. We invite those who are interested, especially those in collecting institutions working with born-digital materials, to follow our progress at www.bitcurator.net, or follow us on Twitter. For those who would like to jump right in and start working with the BitCurator Environment, you can do so at wiki.bitcurator.net and join the BitCurator Users List.

If you have questions about the BitCurator project or the role of digital forensics methods in born-digital curation, please feel free to ask them in the comments section below.

ASIST 2012 Poster

Here’s a sneak preview of the BitCurator poster we’ll be presenting at ASIST 2012 in Baltimore this coming weekend and next week. The poster reports on results obtained in the first year of the project, which include: (1) detailed workflows documenting the handling of born-digital content in several collecting institutions; and (2) specifications on how BitCurator can support the implementation of digital forensics tools and methods in curatorial workflows.

Poster: Integrating Digital Forensics into Born-Digital Workflows: The BitCurator Project