LOCKSS, CLOCKSS, and Portico
Potential Digital Archive Solutions for Rutgers

The LOCKSS (Lots of Copies Keep Stuff Safe) initiative was developed at Stanford University. Portico emanated from the JSTOR project as its archiving program. Both efforts are now funded by member libraries in addition to significant grant support from the Mellon Foundation and other agencies.

The basic principle of LOCKSS is that content is continuously compared among servers at hundreds of member libraries and differences are corrected. The system regularly crawls target publication sites to verify content and add new material. Content is preserved in the format supplied by publishers. As formats change, LOCKSS will convert content through "transparent [to the user] format migration" at the point of access.
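The compare-and-correct principle can be illustrated with a simplified sketch. It is not the LOCKSS polling protocol itself, which is considerably more elaborate; the function name audit_and_repair, the peer labels, and the simple-majority rule are all assumptions made for illustration.

```python
import hashlib
from collections import Counter
from typing import Dict


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of one stored copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: Dict[str, bytes]) -> Dict[str, bytes]:
    """Compare copies held by several peers and repair the outliers.

    Hypothetical rule: if more than half of the peers hold an identical
    copy, that copy replaces any copy that differs from it. Without a
    clear majority, nothing is changed.
    """
    if not copies:
        return {}

    digests = {peer: digest(data) for peer, data in copies.items()}
    winning_digest, votes = Counter(digests.values()).most_common(1)[0]

    if votes <= len(copies) / 2:
        # No clear majority: leave every copy untouched.
        return dict(copies)

    # Take the content from any peer that holds the majority version.
    good_copy = next(
        data for peer, data in copies.items() if digests[peer] == winning_digest
    )
    return {peer: good_copy for peer in copies}
```

In this sketch, three libraries holding two identical copies and one damaged copy would end up with three identical copies after the audit.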

CLOCKSS, or Controlled LOCKSS, is an offshoot of the LOCKSS program. Content is archived at publisher sites and a group of selected libraries (Indiana University, NYPL, and others). The LOCKSS software verifies and updates content within this small network of participants.

The basic principle of Portico is to create an archive from publisher source files that have been converted to a standard format. The archive is migrated forward en masse as formats change.

All of these archives release their content when certain trigger events occur. At LOCKSS libraries, requests for materials are sent to the publisher site, and if the content is not retrieved for any reason, the LOCKSS copy is provided. The transaction is not apparent to the user with one exception: dynamic content, e.g., advertisements or graphics that change with each screen display, remains static. Publishers must agree to participate in the LOCKSS program. About fifty publishers currently participate in LOCKSS, and most have agreed to allow their subscribers to use the service as a backup system.
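The publisher-first, local-copy-second behavior described above can be sketched as follows. This is a minimal illustration, not LOCKSS code: fetch_with_fallback, fetch_from_publisher, and the dictionary standing in for the local LOCKSS store are hypothetical names.

```python
from typing import Callable, Dict, Optional


def fetch_with_fallback(
    url: str,
    fetch_from_publisher: Callable[[str], bytes],
    local_store: Dict[str, bytes],
) -> Optional[bytes]:
    """Try the publisher site first; serve the preserved copy on any failure.

    fetch_from_publisher is a caller-supplied function (an assumption here)
    that returns the content or raises an exception when it cannot.
    Returns None if neither the publisher nor the local store has the item.
    """
    try:
        return fetch_from_publisher(url)
    except Exception:
        # "Not retrieved for any reason": fall back to the local copy.
        return local_store.get(url)
```

Because the caller receives content either way, the switch to the preserved copy is not visible to the reader, which matches the behavior described for LOCKSS libraries.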

Portico is a dark archive, allowing access only when content is no longer available because of a trigger event, such as a publisher ceasing operations or its delivery platform failing. The trigger event must result in a sustained loss of access. At present, Portico has about thirty participating publishers, and there is a growing list of publishers who are specifying Portico as their archive.

Costs: Portico fees for Rutgers would be between $14,000 and $15,000 annually. LOCKSS software is open-source and freely available; however, participants are expected to join the LOCKSS Alliance. The annual fee for LOCKSS ranges from $1,080 to $10,800. Consortial discounts may be available for both Portico and LOCKSS.
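For reference, the combined annual fee for joining both services can be worked out from the ranges quoted above. The short calculation below uses only those figures; it assumes no consortial discount and excludes staff time and equipment, for which this document gives no dollar amounts.

```python
# Annual fee ranges in dollars, as quoted in the Costs paragraph.
PORTICO_FEE = (14_000, 15_000)
LOCKSS_FEE = (1_080, 10_800)


def combined_range(a, b):
    """Add two (low, high) ranges end to end."""
    return (a[0] + b[0], a[1] + b[1])


low, high = combined_range(PORTICO_FEE, LOCKSS_FEE)
print(f"Both services: ${low:,} to ${high:,} per year")
```

Under these assumptions, participating in both services would cost between $15,080 and $25,800 per year.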

Comparison: The list of participating publishers for each service provides the best point of comparison (as opposed to title-by-title). The LOCKSS publisher list is deeper than Portico's at this point. On the other hand, Portico's list includes the publishers of RUL's most expensive digital resources, such as Elsevier and IEEE.

LOCKSS requires more staff resources than Portico, and participating libraries must provide their own equipment (typically a standard desktop PC) to run the LOCKSS software. The service requires set-up and ongoing administration, and new content must be targeted. After LOCKSS is implemented, staff time is relatively minimal at most sites.

LOCKSS and Portico are different products, and both have advantages and disadvantages. LOCKSS provides access to stored content whenever publisher sites are unavailable, even for brief periods of downtime. LOCKSS is a real-time backup solution more than it is an archive. Portico is a true archive, preserving digital content in a standard format for the long term. CLOCKSS preserves content in the publisher's original format (not a standard archival format). Access to CLOCKSS content is similar to the Portico model, however. Trigger events must result in a sustained loss of access, and content is released only after participating publishers and libraries review the situation.

Recommendation: In an ideal world, Rutgers would participate in both LOCKSS and Portico: LOCKSS to provide real-time access when content is unavailable, and Portico to provide long-term archival preservation of expensive digital resources. The decision should be based on what we hope to achieve. If our goal is to provide real-time access when service is interrupted for any reason, LOCKSS is the solution. Portico should be our choice if the goal is long-term preservation. Neither service, however, provides complete coverage of our digital resources. At best, perhaps fifty percent of our purchased content is currently available through one or the other.

msp 1/30/2007

Posted: September 13, 2007
