BitCurator Access Software Developer Needed!

Would you like to help build software to automatically redact private and sensitive information from large data objects such as disk images?
The BitCurator Access team is looking for a software developer to join us during the summer.

The Software Developer will write, test and document software to redact – at both the block level and the file level – patterns identified within born-digital materials. S/he will report to Kam Woods, the Technical Lead of the BitCurator Access project.

– At least one year of Python development experience
– Working knowledge of file system structures and modern file system metadata
– Working knowledge of XML schema design and Python libraries/wrappers to manipulate XML structures (e.g. Expat, ElementTree)
– Must be comfortable using revision control with git and GitHub
– Working knowledge of Linux development environments

– Prior experience with open source digital forensics libraries and tools (e.g. The Sleuth Kit, libewf, Digital Forensics XML)
– Familiarity with Python scientific and data analysis libraries (SciPy and Pandas in particular)

Up to $25 per hour based on experience. Work to be performed over the summer.

About BitCurator Access:
BitCurator Access is an Andrew W. Mellon grant-funded project housed in the School of Library and Information Science at UNC. Our team is working on bringing tools and techniques from the world of digital forensics to libraries, archives, and museums, enabling long-term access to complex legacy digital materials.

If you have any questions about this position, please contact us at

The University of North Carolina at Chapel Hill is an equal opportunity, affirmative action employer and welcomes all to apply regardless of race, color, gender, national origin, age, religion, genetic information, sexual orientation, gender identity or gender expression.
We also encourage protected veterans and individuals with disabilities to apply.

To apply, please go to:

BitCurator Consortium Internship

The BitCurator Consortium (BCC) and The Educopia Institute are looking to hire an intern. To learn more about this opportunity, please read below!

Responsible to: Educopia Executive Director

The BitCurator Consortium intern will be responsible for the management of programmatic activities for the BitCurator Consortium (BCC). The intern will be a self-starter with a passionate belief in the importance of digital preservation activities in archives, libraries, and museums.

This position will bear responsibility for a variety of administrative and research-oriented tasks, including: proofreading documents, managing scheduling, taking minutes at meetings, assisting with event hosting, assisting with outreach activities, drafting and updating documentation, and researching best practices for model policies.

The intern will be expected to work 10-15 hours per week, and will be paid $12 per hour.

Responsibilities include:
1. Maintaining content on the BCC website and affiliated resources (e.g., training videos and presentations, bibliography)
2. Making arrangements and sending reminders for meetings with members
3. Recording and disseminating detailed meeting minutes
4. Preparing BCC outreach communications and instruments for conferences and other hosted events
5. Assisting with the creation and population of a library of model documents (policies, workflows, contracts, etc.)
6. Assisting with the update and revision of training materials and “how to” resources
7. Assisting with the creation of training and professional development activities/proposals

– Exceptional written and oral communication skills
– Demonstrated interest and involvement in the digital library and digital preservation communities
– Ability to work under pressure, to adjust to change, to handle multiple tasks, and to coordinate the work of extended groups of member representatives

– Experience in managing collaborative projects
– Proven ability to produce/execute reports and project plans

Instructions to applicants
Please submit a resume, cover letter, and three professional references to Katherine Skinner (Executive Director, Educopia): Application deadline is May 29, 2015. Candidates will be considered until the position is filled.

The Educopia Institute is an Equal Opportunity/Affirmative Action Institution committed to diversity in its employment and educational programs, thereby creating a welcoming environment for everyone.

Save the Date! Announcing the 2nd Annual BitCurator Social Mixer at SAA

We are excited to announce that the 2nd Annual BitCurator Social Mixer will be held at the 2015 Society of American Archivists (SAA) Annual Meeting. The mixer will take place at Porcelli’s Bistro on Wednesday, August 19th at 7:30 pm. Hors d’oeuvres will be provided. Please join us for an opportunity to learn how other colleagues are using BitCurator and to talk through strategies and implementation options.

We’d like to get a headcount of possible interested parties, so please RSVP below.

Registration now open for OPF Digital Forensics Workshop in Vienna!

Registration is now open for our next event, “From the Toolbox: BitCurator Digital Forensics workshop,” which takes place on May 29, 2015 at the AIT Austrian Institute of Technology in Vienna.


This one-day OPF workshop offers the opportunity to learn how digital forensics and the use of disk images can support your digital preservation workflows. Supported by expert facilitators Cal Lee and Kam Woods from the University of North Carolina, participants will get hands-on experience using the BitCurator tools including the latest developments with BitCurator Access. The BitCurator Environment is a suite of open source digital forensics and data analysis tools to help collecting institutions (libraries, archives, and museums) process and provide access to born-digital materials.

Who should attend?

This workshop is intended for archivists, manuscript curators, librarians or others who are responsible for acquiring, transferring and/or providing access to collections of born-digital materials, particularly those that are received on removable media. We will assume that participants are familiar with basic digital curation issues and practices.

Though it is not mandatory, participants will ideally know how to create disk images; generate and verify cryptographic hashes of files; and examine the contents of a file in a hex editor. It will also be helpful to understand the role and purpose of filesystems, file headers, and file signatures. Knowledge of Linux command line operations will also be beneficial, but is not a necessary prerequisite to participation. We’ll be on hand to help with tasks, and many of the tools have graphical user interfaces.

Why attend?

Participants will learn about and get experience using BitCurator environment tools that can assist with various aspects of digital curation, including pre-imaging data triage; forensic disk imaging; file system analysis and reporting; identification of private and individually identifying information; and export of technical and other metadata. They will also learn about tools that are currently available but undergoing significant further development for providing access to data from disk images and redacting sensitive content. Participants should leave with a practical understanding of how to apply these tools in their own institutions and with contacts in peer institutions who are undertaking similar work.

More information, including registration and agenda.

OPF members are invited to attend free of charge. The price for non-members is 75 Euros.

Guest Post: Walker Sampson on Disk Imaging Workflow

Below is a guest post by Walker Sampson, the Digital Archivist at the University of Colorado Boulder. Walker describes the disk imaging workflow he presented at the first-ever BitCurator Users Forum, held January 9th, 2015.

It was a real pleasure discussing workflows with fellow practitioners at the BitCurator Users Forum this year. Many thanks to Matt Farrell at Duke and Kari Smith at MIT for presenting with me, and to Farrell again for putting together and directing our panel.

I have pictured above a synopsis of my disk imaging workflow, which relies strongly on student help. Students take floppy disks from initial photography to imaging, mount testing, documentation of the results, and rehousing.

The disk is photographed to record the labeling information, which can be quite extensive (one collection of media art contains many disks with a printout of the disk’s file listing, along with the official label of the artists’ studio). A photograph also provides a visual reference for the future should we try to relocate the original media.

Students are trained to connect the KryoFlux floppy disk controller and a vintage disk drive to the host machine, which runs a copy of BitCurator. They are also trained on the KryoFlux GUI, and use this software for most of the disk imaging. While we have the FC5025 controller and software, KryoFlux is the device of choice given its greater versatility and ability to record flux timings for each track of the floppy disk. Within the KryoFlux GUI, students set the encoding formats to MFM (modified frequency modulation, a common encoding for many IBM PC formatted disks) and the preservation stream by default, as this produces an accurate image for most disks. If it does not, students are trained to run other disk encodings against the preservation stream files to attempt a positive disk image that can be mounted. In most of these cases, the disk turns out to be in a less common Apple or Commodore format.

The resulting log file, preservation tracks, image file, and disk photograph are saved in a folder. The student then runs a mount test through BitCurator’s mounting script. The results of the student’s disk imaging run are recorded in a row in a collection spreadsheet, which denotes the disk image name, date, their name, the disk drive used (we have different drives in use), the disk imaging device used, any bad sectors found, whether the disk mounts, and whether there is a photograph accompanying the disk. The student then rehouses the disk in a Hollinger box. Disk collections which contain office documents and correspondence are candidates for floppy disk deaccessioning; collections for which the floppies are integral to the content and process of the donor, such as the media art collection mentioned above, are not.
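For readers who keep such a collection spreadsheet as a CSV file, the row described above could be recorded with a short script. This is only a sketch under assumptions: the column names, the `ImagingRecord` class, and the `append_record` helper are hypothetical and not part of BitCurator or the workflow as presented.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class ImagingRecord:
    """One spreadsheet row per imaged disk (column names are hypothetical)."""
    image_name: str
    date: str
    student: str
    drive: str
    imaging_device: str
    bad_sectors: int
    mounts: bool
    has_photograph: bool


def append_record(log_path, record):
    """Append one record to the CSV log, writing a header if the file is new."""
    path = Path(log_path)
    is_new = not path.exists() or path.stat().st_size == 0
    columns = [field.name for field in fields(ImagingRecord)]
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(record))
```

A call such as `append_record("collection_log.csv", ImagingRecord("disk_001.img", "2015-01-09", "A. Student", "drive-A", "KryoFlux", 0, True, True))` would add one row; the file name and values here are placeholders.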

I check the students’ work at the end of the week. If disks have bad sectors or do not mount, I investigate those disk images and attempt new reads if necessary. Besides the local backup, I run BagIt on the cumulative work of the student every week and upload that bag to our servers, which run their own backup routine. When the collection is complete, I do the same for the entire collection and remove the in-progress bags.
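To illustrate what the bagging step produces, here is a simplified sketch using only the Python standard library. It is not the BagIt tool used in this workflow. It writes just the bag declaration, a data/ payload directory, and a SHA-256 payload manifest, so a maintained BagIt implementation is the better choice for real work. The function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disk images are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_simple_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write a minimal declaration and manifest."""
    bag = Path(bag_dir)
    payload = bag / "data"
    shutil.copytree(source_dir, payload)  # fails if the payload directory already exists
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    lines = []
    for item in sorted(payload.rglob("*")):
        if item.is_file():
            lines.append(f"{sha256_of(item)}  {item.relative_to(bag).as_posix()}\n")
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag


def verify_simple_bag(bag_dir):
    """Return payload paths that are missing or whose checksum no longer matches."""
    bag = Path(bag_dir)
    problems = []
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, relative_path = line.split(maxsplit=1)
        target = bag / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative_path)
    return problems
```

Re-running `verify_simple_bag` after upload, or after restoring from backup, gives a quick check that the payload files still match the checksums recorded when the bag was made.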

This workflow emphasizes the capture of disk images foremost. I provide analysis and description of the disk content intermittently, and batch runs of bulk_extractor and fiwalk on the disk images will be performed at a later time. Import into formal digital archive software will occur down the road as well – a work in progress for us presently. I hope this workflow can help others who may not have a repository in place, but nevertheless need to rescue content from legacy media in a manner that will allow more refined processing in the future.
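The later batch runs of bulk_extractor and fiwalk mentioned above might be scripted along the following lines. This is a sketch under assumptions: the directory layout, the `*.img` pattern, and the helper names are hypothetical, and the script only plans the commands unless `dry_run` is set to `False`. The invocations use `bulk_extractor -o` to name an output directory and `fiwalk -X` to name an XML output file.

```python
import subprocess
from pathlib import Path


def build_commands(image_path, report_root):
    """Return the two command lines for one disk image (the layout is an assumption)."""
    image = Path(image_path)
    out_dir = Path(report_root) / image.stem
    return [
        # bulk_extractor writes its feature files into the directory given by -o
        ["bulk_extractor", "-o", str(out_dir / "bulk_extractor"), str(image)],
        # fiwalk -X writes file system metadata to the named XML file
        ["fiwalk", "-X", str(out_dir / "fiwalk.xml"), str(image)],
    ]


def run_batch(image_dir, report_root, pattern="*.img", dry_run=True):
    """Plan, and optionally run, both tools over every matching image."""
    planned = []
    for image in sorted(Path(image_dir).glob(pattern)):
        commands = build_commands(image, report_root)
        planned.extend(commands)
        if not dry_run:
            # Create only the per-image parent folder; bulk_extractor makes its own output directory
            (Path(report_root) / image.stem).mkdir(parents=True, exist_ok=True)
            for command in commands:
                subprocess.run(command, check=True)
    return planned
```

Calling `run_batch("images", "reports")` returns the planned command lists for review; passing `dry_run=False` runs them and requires both tools to be installed, as they are in the BitCurator environment.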