Alternatives for Imaging a Mac Laptop

Last week, I wrote about how to forensically image the internal hard drive on a Mac laptop without needing to physically remove the drive. If your workspace doesn’t have the necessary tools to follow that tutorial (a firewire cable, a firewire port on the Mac you’re imaging, and a firewire port on a PC partitioned with BitCurator), we offer an alternative in this post.

The Mac laptop we wanted to forensically image.

Other Options for Imaging Mac Laptops

We recognize that you might not have the correct devices on hand to follow the instructions in the previous post. In that case, you may want to open the laptop to temporarily remove the hard drive for forensic imaging within the BitCurator environment, which means you’ll need a cable that connects a hard drive to your imaging computer (probably a SATA cable). You can also opt to make a forensics image outside BitCurator and then import the image into BitCurator for exploration.

Note that the issue complicating this imaging process is specific to Mac laptops; Linux and Windows laptops wouldn’t require target disk mode and the trouble it causes. Target disk mode works with other Macs (perhaps obviously) and Linux machines; I wasn’t able to get a Windows machine to recognize the Mac laptop in target disk mode. I’ve read that commercial software called MacDrive (currently about $50 for use on one PC) will let you connect a Mac in target disk mode to a PC, but this would not make the Mac drive also available in the Windows computer’s BitCurator VM; unfortunately, VirtualBox is unable to take firewire input. It’s possible you could get around this issue by using other virtualization software, but VirtualBox is the best free/open-source option.

That leaves us with using either a Mac or Linux machine to create our backup of the Mac laptop; in our example, I used a Mac to create the backups. We’ll walk you through how to first lessen the risk of tampering with a laptop’s insides by securing a forensic image outside of BitCurator.

Why Backup?

Opening up the laptop, removing the drive, and later trying to put everything back risks the laptop refusing to start or otherwise being damaged: maybe you break something, or can’t get things to fit back together. If you don’t have another way to gather a forensics disk image packaged with metadata about the imaging, though, opening the laptop up can be an acceptable risk. All computers fail eventually, and we’d rather have a good forensics disk image of the laptop now than more years with the laptop working but no forensics image preserved. We thus recommend you forensically image the laptop’s hard drive before opening it, or choose to create a forensics image with one of the non-BitCurator options discussed below and import the image into BitCurator. Opening up the computer is only necessary if none of these forensics imaging programs is right for you, your Mac laptop doesn’t have a firewire port, or you prefer to do all your forensic work inside the BitCurator environment. For either method, you’ll need a firewire cable and another Mac (with a firewire port) on which to image the laptop.

Write Blocking

First, we need to protect the laptop from having the connected machine write back to it during the imaging process. This wasn’t a major concern in our example, as Larsen’s laptop has already been explored by researchers at MITH, but it’s good practice nonetheless, especially if you use a command-line imaging method, where a simple mis-typing could accidentally erase your device. Our WiebeTech Forensic ComboDock works well for most write-blocking purposes, but it doesn’t have the firewire input and output needed to work with a Mac in target disk mode. The Tableau T9 Firewire Forensic Bridge is a hardware option that does accept both firewire input and output, but we didn’t have one on hand. We thus used software write-blocking instead, installing Aaron Burghardt’s Disk Arbitrator to protect the laptop.

Connecting the target old Mac laptop with the host new Mac laptop to image the old on the new.
Imaging time! Leaving the computers alone during imaging.

A Forensic Disk Image

Begin by putting the Mac laptop you want to image into target disk mode:

  1. The laptop to be imaged (e.g. our Larsen laptop) should be turned off.
  2. Hold down the t key and turn the laptop to be imaged on.
  3. Continue to hold down the t key until the target disk mode image appears on the screen (see photo below).
Old white Mac laptop opened to show screen in Target Disk Mode

You can now connect your firewire cable to both the laptop to be imaged and the Mac (or Linux computer) doing the imaging.

To create a forensics disk image, there are a variety of free and commercial programs that provide graphical interfaces for Mac and Linux, including MacOSXForensics Imager (Mac) and Guymager (Linux; note that Guymager is the imaging software BitCurator incorporates). Commercial options such as FTK Imager also exist. Almost any program that creates the image in an EnCase (E01) or AFF forensic disk image format works, as these formats take a raw disk image and wrap metadata about the imaging around it. We haven’t formally evaluated the effectiveness of any programs outside the BitCurator suite, though, so you’ll want to check out potential Mac forensic imaging software yourself and explore the images it creates within the BitCurator environment to make certain your device was captured correctly.

Alternatively, you can choose one of the following command line methods—but it’s of utmost importance that you use a write-blocker with these, as mis-typing could erase your device:

  1. If you’re very knowledgeable about using the command line, you may already know how to use dd or dcfldd.
  2. The ForensicsWiki has a detailed tutorial on “Acquiring a Mac OS System with Target Disk Mode” that uses dd and other commands to create a .dmg image, plus instructions on converting the .dmg to an EnCase format.
  3. “Macintosh Forensics: A Guide for the Forensically Sound Examination of a Macintosh Computer” by Ryan Kubasiak offers alternative instructions for using dd (use the hyperlinked table of contents to jump to the “Imaging a Target Macintosh” section starting on page 25).
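
For readers curious about what a dd-style acquisition actually does, here is a minimal Python sketch of the idea: read the source in fixed-size blocks, write each block to an image file, and compute hashes along the way so the image can be verified later. This is an illustration only, not a tested forensic tool and not a replacement for dd, dcfldd, or Guymager. The function name `image_device` and the block size are my own assumptions, and the code does not make write blocking unnecessary.

```python
import hashlib


def image_device(source_path, image_path, block_size=1024 * 1024):
    """Copy source_path to image_path block by block and return hex digests.

    The source is opened read-only ("rb"), but that alone is not write
    blocking: the operating system may still mount and modify the disk.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    total = 0
    # Mode "xb" refuses to overwrite an existing file, which guards against
    # a mistyped destination path.
    with open(source_path, "rb") as src, open(image_path, "xb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of the source
                break
            dst.write(block)
            md5.update(block)
            sha1.update(block)
            total += len(block)
    return {"bytes": total, "md5": md5.hexdigest(), "sha1": sha1.hexdigest()}
```

To use it, you would pass the path of the disk as it appears on the imaging computer and the path of the image file to create. The device path differs on every machine, so confirm it yourself before running anything against it. The result is a raw image; unlike the E01 and AFF formats, it carries no metadata about the imaging, so record the returned hashes alongside it.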

Opening the Laptop to Remove the Hard Drive

After following these steps to make a forensic image of your laptop, you can either opt to import the forensic image into BitCurator and explore the image there, or choose to temporarily remove the hard drive in order to image it directly through BitCurator. If you choose the latter path, you’ll need to search for instructions like these that show how to open your particular model of Mac. If possible, use a guide with many photos to show you how to carefully open, remove, and replace the Mac laptop’s hard drive. I’ve found that sites dedicated to DIY fixing and making, such as iFixit and Instructables, offer good community-moderated tutorials on opening up computers.

In a future post, I’ll discuss what I found while exploring the Larsen laptop disk image using BitCurator. Send us your suggestions for other difficult-to-image use cases, and we’ll cover them in future posts!

Amanda Visconti is a MITH graduate research assistant on the BitCurator project, where she creates user-friendly technical documentation, develops and designs for the web, and researches software usability. As a Literature Ph.D. candidate, she blogs about her digital humanities work regularly at LiteratureGeek.com.

BitCurator Visits SAA 2013

Last week, members of the BitCurator team visited New Orleans for the 2013 Society of American Archivists (SAA) Joint Annual Meeting. On Tuesday, August 13, we presented a poster at the 7th Annual SAA Research Forum on how the BitCurator environment can support archivists’ preservation goals in institutions.

In our poster, we described four preservation scenarios during the creation and ingest of a disk image into an archival repository. We then showed how the output generated by BitCurator tools during each scenario can be captured and stored as PREMIS-encoded preservation events.

Event 1: Image Capture
Definition: A forensic disk image is created from the original source media.
Tool: Guymager
Metadata: Acquisition time; duration of capture; manufacturing device & serial number; user who performed acquisition; cryptographic hash values

Event 2: File System Analysis
Definition: A set of file-objects corresponding to all of the files and directories identified on a disk image is analyzed and reported.
Tool: fiwalk
Metadata: Time of analysis; duration of analysis; user who performed file system analysis; file system partitions; file system volumes

Event 3: Feature Analysis
Definition: Describes forensic analysis of the raw bitstream, producing reports on specific features of interest (such as personally identifying or other sensitive information).
Tool: bulk extractor
Metadata: Time of analysis; execution environment; number of reports produced

Event 4: Redaction
Definition: Used to overwrite specific patterns within the disk image according to a user-supplied rule-set.
Tool: iRedact.py
Metadata: Time of redaction; environment details; user performing redaction; name of new redacted image
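
To give a rough sense of what a PREMIS-encoded event might look like, here is a minimal Python sketch that builds one `event` element for the image capture scenario. It is not the BitCurator implementation. The element names and namespace follow my reading of PREMIS 2.x (an assumed version), the function name `premis_event` is hypothetical, and every value in the example is made up; validate any real output against the PREMIS schema.

```python
import uuid
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlschema/premis-v2"  # PREMIS 2.x namespace (assumed version)


def premis_event(event_id, event_type, date_time, detail, agent):
    """Build a minimal PREMIS-style <event> element (illustrative only)."""
    ET.register_namespace("premis", PREMIS_NS)

    def sub(parent, tag, text=None):
        # Create a namespaced child element, optionally with text content.
        child = ET.SubElement(parent, f"{{{PREMIS_NS}}}{tag}")
        if text is not None:
            child.text = text
        return child

    event = ET.Element(f"{{{PREMIS_NS}}}event")
    identifier = sub(event, "eventIdentifier")
    sub(identifier, "eventIdentifierType", "UUID")
    sub(identifier, "eventIdentifierValue", event_id)
    sub(event, "eventType", event_type)
    sub(event, "eventDateTime", date_time)
    sub(event, "eventDetail", detail)
    agent_el = sub(event, "linkingAgentIdentifier")
    sub(agent_el, "linkingAgentIdentifierType", "username")
    sub(agent_el, "linkingAgentIdentifierValue", agent)
    return event


if __name__ == "__main__":
    # All values below are hypothetical sample data.
    capture = premis_event(
        str(uuid.uuid4()),
        "capture",
        "2013-08-13T10:00:00",
        "Forensic disk image created with Guymager",
        "archivist1",
    )
    print(ET.tostring(capture, encoding="unicode"))
```

The other three events would follow the same pattern with a different event type and detail; the hash values, durations, and environment details listed above would need additional elements that this sketch leaves out.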

Building a Digital Curation Workstation with BitCurator (update)

Last year I wrote a post for the MITH blog describing how to build a digital curation workstation using readily available hardware (at least for the present) and the BitCurator suite of digital forensics tools. Matt Kirschenbaum and I revisited that topic a few weeks ago for a poster we presented at the Digital Humanities 2013 conference. We got a number of requests asking for the digital version of the poster, so I’ve revised it slightly and uploaded the poster in blog form below.

The most significant update from my original post in 2012 is the case study involving the transcription work done by Neil Fraistat on Percy Shelley’s Prometheus Unbound manuscripts, completed in 1989 and saved on 5.25″ floppy disks. Otherwise much of the information remains the same, though there is significantly more detail regarding the BitCurator suite of tools. As always, feel free to leave questions or comments in the Comments section below.

Introduction

This post builds on the recent report from the Online Computer Library Center (OCLC) titled “You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media” (Erway, 2012). The OCLC report identifies eleven specific steps archivists can follow to safely and effectively process born-digital content. This post considers the hardware needs of archivists and scholars as they look to implement the OCLC’s recommendations by offering a model born-digital curation workstation using readily available PC hardware and a suite of free and open source tools being developed and extended by the BitCurator project. We also demonstrate why such a workstation would be a valuable asset for a working digital humanities center through a case study involving the Shelley-Godwin Archive.

About BitCurator

BitCurator is a research and open source software development project designed to bring digital forensics tools and techniques to collecting institutions.

The BitCurator Environment is a Linux-based suite of digital forensics tools that facilitate the creation of forensic disk images, file analysis, metadata extraction, and the identification of personally identifiable information (PII). BitCurator can be installed as either a stand-alone Linux operating system or as a virtual machine within a Windows or OS X host.

“Standard” Hardware

While there exist custom hardware solutions for preserving digital content, these options can be cost prohibitive, especially for a smaller DH center. For example, the Forensic Recovery of Evidence Device, or FRED, from Digital Intelligence can cost several thousand dollars. By contrast, the workstation we demonstrate here is designed to cost just over a thousand dollars, including the media access devices described in the next section.

The workstation will begin with a standard Windows or OS X based PC. While the exact technical specifications can vary, we offer the following recommendations as the minimum specifications for a digital curation workstation. More powerful hardware will, of course, result in less processing time, but a computer system with the specifications below should be available for roughly $500 from any number of retailers.

A multi-core CPU such as the Intel i5 series or AMD Fusion line of CPUs. The digital forensics tools in the BitCurator environment are multi-threaded, so they will take advantage of multiple CPU cores and thus speed up processing times. Older multi-core CPUs such as the Intel Core2 series can also function well with BitCurator. If running BitCurator as a virtual machine, make sure your CPU has support for hardware virtualization: VT-x on Intel CPUs, and AMD-V on AMD.

Four to Eight Gigabytes of RAM. We recommend at least 4 gigabytes of RAM. BitCurator can run with less, but performance will suffer. 8 gigabytes of RAM is recommended if you plan to run BitCurator as a virtual machine within a host operating system.

Hard drive storage based on need. You will need at least 20 gigabytes of storage space to install BitCurator; beyond that, you will want to plan for storage needs based on the media you will be accessioning.

Memory card reader and Blu-ray player. Choose a memory card reader with support for as many types of memory cards as possible. The Blu-ray player should be backwards compatible with burned DVDs and CD-ROMs.

Digital Curation Hardware

The hardware described below and shown in the image of our digital curation workstation allows the user to access a wide array of digital media. This is by no means an exhaustive list of media access devices, but it should be enough to handle most forms of legacy media and offer a starting point for those who may need more specialized hardware. (Pricing details are included to show the overall cost of a digital curation workstation.)

USB 3.5” Floppy Disk Drive

These drives are still available new from online retailers for around $20. Look for a drive that can read both 1.44 MB (HD) and 800 KB (DD) 3.5” diskettes. Most drives support HD diskettes in both PC and Mac format, but only support PC-formatted DD diskettes.

External USB 250MB Zip Drive

These units are available both new and used. We recommend the 250MB model as it is backwards compatible with the 100MB Zip disks. New units retail for around $200 and used units for around $50.

USB 250 MB Zip Drive and 3.5″ Floppy Disk Drive

Device Side Data’s FC5025

The FC5025 is a controller card for 5.25” floppy disk drives that can be used as an internal or external—as seen here—interface. Device Side Data charges $55.25 per controller.

FC5025 5.25" Floppy Disk Drive Controller

5.25” Floppy Disk Drive

These units are no longer available new, but can still be purchased off of eBay for about $50. Drives can be mounted in a PC case or used as external devices as shown in the digital curation workstation example below.

Because 5.25″ floppy drives are no longer manufactured, we recommend that you purchase a number of backup drives based on your need. We also recommend a disk drive cleaning kit.

Backup your capacity to backup

WiebeTech UltraDock Hardware Write Protector

This unit serves as both an interface with IDE and Serial ATA hard disk drives and as a write protector. Because it is common for the OS to overwrite metadata on a hard drive, write protection ensures that no interaction by the archivist or researcher affects the integrity of the original media. WiebeTech charges $250 for the UltraDock Hardware Write Protector.

WiebeTech Forensic ComboDock Write Blocker
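
As a quick check on the overall budget, the prices quoted in this post can be added up. The short Python sketch below does that arithmetic for one of each device plus the roughly $500 PC described earlier. The grouping into a dictionary and the function name `workstation_cost` are my own; spare 5.25″ drives and a cleaning kit are not included.

```python
# Prices quoted in this post (US dollars, approximate).
BASE_PC = 500.00
DEVICES = {
    "USB 3.5-inch floppy disk drive": 20.00,
    "FC5025 controller": 55.25,
    "5.25-inch floppy disk drive (used)": 50.00,
    "WiebeTech hardware write protector": 250.00,
}
ZIP_DRIVE = {"new": 200.00, "used": 50.00}


def workstation_cost(zip_condition):
    """Total for the PC plus one of each media access device."""
    return BASE_PC + sum(DEVICES.values()) + ZIP_DRIVE[zip_condition]


for condition in ("used", "new"):
    print(f"With a {condition} Zip drive: ${workstation_cost(condition):,.2f}")
```

With a used Zip drive the total comes to $925.25, and with a new one $1,075.25, which is consistent with the figure of roughly a thousand dollars given in this post.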

How Things Come Together

Digital Curation Workstation

BitCurator Digital Forensics Tools

The BitCurator Environment includes third-party and custom tools for disk imaging, data triage, PII discovery, filesystem analytics and reporting, and metadata export. It incorporates scripted actions that can be run in the GUI against live filesystems for file analysis prior to (or in lieu of) imaging. Additionally, it includes unique software for producing human-readable reports from forensic tool output.

BitCurator currently includes the following tools:

Guymager: A GUI-based disk imaging program for capturing disk images in raw, E01, and AFF formats.

Bulk Extractor and Bulk Extractor Viewer: A stream-based forensics tool for extracting features of interest from disk images (including but not limited to private and individually identifying information) and associated GUI front-end.

Fiwalk: A tool for generating Digital Forensics XML output describing filesystem hierarchies contained on disk images.

The Sleuth Kit: Basis Technology’s open source digital forensics framework.

BitCurator Reporting Tools: Custom tools developed by the BitCurator team for generating metadata reports useful for archivists and digital-curation practitioners.

Sdhash: A “fuzzy hashing” file similarity finding tool.

Digital Forensics plug-ins for Ubuntu’s GUI file browser (Nautilus) that can identify file information, generate checksums, read files in hexadecimal format, and perform other file analysis tasks.
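
To illustrate how XML output of the kind Fiwalk produces can feed later reporting, here is a minimal Python sketch that lists the files described in a small sample. The sample is a hand-written, heavily simplified fragment in the style of Digital Forensics XML, not real Fiwalk output, which uses a namespace and carries many more elements. The function name `list_files` and the sample file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; real Fiwalk output is much richer.
SAMPLE = """
<dfxml>
  <volume>
    <fileobject>
      <filename>PROMETH/ACT1.WP</filename>
      <filesize>2048</filesize>
    </fileobject>
    <fileobject>
      <filename>PROMETH/EMPTY.TXT</filename>
      <filesize>0</filesize>
      <hashdigest type="md5">d41d8cd98f00b204e9800998ecf8427e</hashdigest>
    </fileobject>
  </volume>
</dfxml>
"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def list_files(xml_text):
    """Return one dict per <fileobject> with its name, size, and MD5 if present."""
    records = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "fileobject":
            continue
        record = {}
        for child in element:
            name = local_name(child.tag)
            if name == "filename":
                record["filename"] = child.text
            elif name == "filesize":
                record["filesize"] = int(child.text)
            elif name == "hashdigest" and child.get("type", "").lower() == "md5":
                record["md5"] = child.text
        records.append(record)
    return records


for record in list_files(SAMPLE):
    print(record)
```

Matching on the local tag name rather than a fixed namespace keeps the sketch tolerant of namespaced input, at the cost of being less strict than a schema-aware parser.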

Case Study: Recovering the Shelley-Godwin Archive

The half-life of bits: The evolution of Neil Fraistat’s transcription of Shelley’s Prometheus Unbound manuscripts.

In 1989, Neil Fraistat worked with the Bodleian Library in Oxford to produce a transcript of Percy Shelley’s Prometheus Unbound manuscripts. These handwritten manuscripts included notations, line-throughs, revisions, line counts and other edits that made them challenging to read. Fraistat transcribed each page using WordPerfect 4.2 and then created a photo-ready print of each transcribed page. The subsequent publication included a facsimile of the original page next to Fraistat’s transcribed version, as seen in the image above. Completed on an IBM PC of the era, Fraistat’s work was saved on 5.25” floppy disks.

In 2012, MITH began working on the Shelley-Godwin Archive, using Fraistat’s publication of Shelley’s manuscripts as a model for the digital implementation using Shared Canvas. However, the original transcription work, which had also been vetted by the Bodleian, now existed on essentially unreadable media and in a format no longer supported by contemporary word processors. By using the FC5025 seen above and a used 5.25” floppy disk drive, we were able to recover Fraistat’s original transcriptions and preserve his work by creating disk images of the 5.25” floppy disks. We then used Open Office to convert the files to a format readable by a present-day word processor. The now-readable files were then passed to the editors for TEI encoding, saving hours of repeat labor.

This case study not only shows the utility of a digital curation workstation, but also asks us to rethink the history of digital humanities. We argue that Fraistat’s early work of thinking through representing analog, handwritten text in a digital medium represents an early form of DH scholarship. At stake in the archival work demonstrated here is not only the preservation of born-digital objects generally, but the ability to examine the history of our own disciplinary practices.

Conclusions

It is our argument that for a relatively small investment—roughly a thousand dollars—a digital humanities center can avail itself of the tools required to preserve, archive, and investigate a wide variety of born-digital materials. The hardware interfaces described in this post work in conjunction with the BitCurator Environment’s suite of digital forensics tools to achieve those ends and facilitate the recommendations of the OCLC report cited in the introduction.

The Shelley-Godwin Archive case study demonstrates the need for these tools, both hardware and software, in the DH community itself. If Fraistat’s original transcription of the Shelley manuscripts is an early form of DH, as we contend, then we see that what’s at stake here is not simply the concern of archivists and libraries, but the history and legacy of our own practices. The disciplinary sensibilities that inform the digital humanities make the DH center a logical location for these discovery and preservation tools. Further, the capacities afforded by the tools described in this poster will further demonstrate the value of a DH center to host institutions.

Acknowledgements

The BitCurator project is funded through the Andrew W. Mellon Foundation. The principal investigator is Christopher Lee and the co-principal investigator is Matthew Kirschenbaum.

Erway, Ricky. “You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media.” OCLC 2012.

Reside, Doug. “Digital Archeology: Recovering Your Digital History.” NYPL 2012.

CurateGear: Enabling the Curation of Digital Collections – January 6, 2012 in Chapel Hill

WHAT: The DigCCurr 2012 Public Symposium presents CurateGear, a highly
interactive day-long event focused on digital curation tools and
methods. See demonstrations, hear about the latest developments, and
discuss application in professional contexts.

WHEN: Friday, January 6, 2012, 8:00 AM – 5:00 PM

WHERE: Friday Center, UNC, Chapel Hill, North Carolina

HOW TO REGISTER: Go to http://tinyurl.com/3m8ajrm

COST: $100 ($125 for late registration beginning December 1st);
STUDENT COST: $50.

Registration includes continental breakfast, morning and afternoon
breaks, lunch, and free parking.

8:00-8:15 Welcome and Introductions (Helen Tibbo)
8:15-8:30 Overview of the Day’s Topics (Cal Lee)
8:30-9:00 Curation Needs and Behaviors
  • Carolyn Hank – The Blog Archiving Landscape: Services and
    Approaches for Personal Blog Preservation
  • Matt Kirschenbaum – More than Words: Literary Authorship and
    Word Processing
  • Doug Reside – On Becoming a Digital Curator for the
    Performing Arts
9:00-9:20 Break
9:20-10:00 Repository Management Environments
  • Jon Crabtree – SAFE Archive
  • Mark Evans and Mike Thuman – Safety Deposit Box
  • Chien-Yi Hou and Richard Marciano – Policy-Driven Data
    Management
  • Peter Van Garderen – Integration of BitCurator knowledge and tools
    into Archivematica
10:00-11:00 Demo and Discussion Session – Repository Management Environments
Presenters from the previous session will provide demonstrations
and discuss with CurateGear participants.
11:00-11:20 Break
11:20-12:00 Metadata and Documentation
  • Barbara Guttman and Doug White – The NSRL and Its Potential
    Role in Digital Curation
  • Mark Matienzo – Accessioning-based Metadata Extraction and
    Iterative Processing: Notes From the Field
  • David Pearson – NLA Digital Preservation Knowledgebases for
    Formats, Software and associated levels of support
  • Seamus Ross – DRAMBORA and the Data Audit Framework
12:00-1:00 Lunch
1:00-2:00 Demo and Discussion Session – Metadata and Documentation
Presenters from the previous session will provide demonstrations
and discuss with CurateGear participants.
2:00-3:00 Data Transformation, Processing and Access
  • Greg Jansen – Curator’s Workbench
  • Trevor Owens – Viewshare.org: A free open platform for creating
    interfaces to cultural heritage collections
  • Seth Shaw – Accessioning Evolution @ Duke
  • Bill Underwood – Tools for File Format Identification, Validation
    and Characterization
  • Kam Woods – BitCurator
3:00-3:20 Break
3:20-4:20 Demo and Discussion Session – Data Transformation, Processing
and Access
Presenters from the previous session will provide demonstrations
and discuss with CurateGear participants.
4:20-5:00 Observations and Implications (Nancy McGovern, Seamus Ross, Helen Tibbo, Bram van der Werf)