2014-06-05 Meeting Notes AWG

Date

05 June 2014

Attendees

Agenda items

  • Discuss transfer and ingest experience in Archivematica:
    • microservices
    • ease of use; concerns about training others to use the tool
    • customizability
  • Share thoughts on integrating Archivematica into existing workflows

Discussion Items

Item | Notes
DIP destination
  • Ladd experimented with ICA-AtoM and is currently working on DSpace - the instance and the Archivematica pipes to the instance are in progress
  • Melanie tested the DIP upload to CONTENTdm feature using the Project Client package export method rather than piping directly to Tarlton's live site. However, the Archivematica configuration is similar for both CONTENTdm methods and requires a defined CONTENTdm collection before the relevant options appear in the microservice processing. As a safety precaution, Melanie set up a private collection on Tarlton's live site for testing. Archivematica's DIP package export destination in the testbed currently allows only superadmin access.
AIP destination
  • There is some new documentation available for uploading AIPs to Islandora
  • Ladd is planning on speaking with a Solutions Architect from Discovery Garden this Friday in order to discuss the relationship/integration between Archivematica and Islandora
  • Ladd suggests that for digital objects that require special preservation needs (beyond multiple local copies), those AIPs could potentially be candidates for DPN
Custom microservices
  • Benn and Jessica have particular concerns about ingesting disk images
  • The discussion turned to custom microservices, including the integration of existing Python scripts for mounting images and extracting file listings that are already a part of the Briscoe workflow
  • One of Benn's failed transfers was an E01 (EnCase disk image file format), and the verbose error message indicated that (1) it was not bagged and (2) our current instance of Archivematica does not incorporate digital forensics tools, such as ewfinfo, that would recognize this format
  • Integration of BitCurator tools, including those that read and extract wrapper metadata from E01 files, is a feature on the Archivematica roadmap (according to Porter during his visit to Austin on May 22nd)
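
As an illustration of the kind of small custom check discussed above, here is a minimal Python sketch that flags E01 files before a transfer is attempted. It is not part of Archivematica or the Briscoe scripts. The function names and the idea of scanning a transfer directory are hypothetical, and the 8-byte signature is an assumption based on the published EWF format documentation.

```python
from pathlib import Path

# Assumption: EWF (E01) segment files begin with this 8-byte signature,
# per the libewf format documentation.
EWF_SIGNATURE = b"EVF\x09\x0d\x0a\xff\x00"


def looks_like_e01(path):
    """Return True if the file starts with the EWF (E01) signature."""
    with open(path, "rb") as handle:
        return handle.read(len(EWF_SIGNATURE)) == EWF_SIGNATURE


def find_disk_images(transfer_dir):
    """List files under transfer_dir that look like EWF disk images."""
    return sorted(
        p for p in Path(transfer_dir).rglob("*")
        if p.is_file() and looks_like_e01(p)
    )
```

A check like this only identifies the wrapper format; extracting the wrapper metadata would still need a forensics tool such as ewfinfo.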
Bagit versioning
  • Benn's other informative failure resulted from an attempt to transfer a bag created using Bagit 4.1 instead of Bagit 4.4; Benn rebagged the objects using 4.4 and the transfer was successful
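
A pre-transfer check along these lines could catch a version mismatch before Archivematica does. The sketch below is a hedged example only: it reads the `BagIt-Version` line from a bag's `bagit.txt`, which records the specification version (for example 0.97) and is not necessarily the same thing as the 4.1 / 4.4 tool releases mentioned above. The function names and the set of accepted versions are hypothetical.

```python
import os


def read_bag_declaration(bag_dir):
    """Parse bagit.txt into a dict of its 'Label: value' lines."""
    declaration = {}
    with open(os.path.join(bag_dir, "bagit.txt"), encoding="utf-8") as handle:
        for line in handle:
            if ":" in line:
                label, value = line.split(":", 1)
                declaration[label.strip()] = value.strip()
    return declaration


def check_bag_version(bag_dir, accepted_versions):
    """Return (declared_version, is_accepted) for the bag in bag_dir."""
    version = read_bag_declaration(bag_dir).get("BagIt-Version")
    return version, version in accepted_versions
```

For example, `check_bag_version("/path/to/bag", {"0.97"})` returns the declared version and whether it is in the accepted set; which versions a given Archivematica instance accepts would need to be confirmed by testing.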
Failures, rollbacks and deletion
  • Reasons for failures are explicit in error messages (and automatic email notifications)
  • Melanie pointed out Archivematica's lack of a rollback/start over feature, and that this is a yet-to-be funded feature listed on the Archivematica roadmap. Until then, failed transfers must be deleted and started over.
  • Melanie's point also prompted discussion about what deletion means in Archivematica:
    • deletion in Archivematica merely means deletion from the user dashboard, not from the actual storage system
    • the Archivematica storage system is administered separately by a superadmin
    • the system keeps track of items that are deleted
How do we incorporate this? How would incorporation change our workflow? What issues do we foresee?
  • Benn's current workflow is in a different order but the change would not be too disruptive and he sees advantages to Archivematica's order
  • Benn also points out that FIDO (as a metadata extractor) might be too narrow, that FITS might be better and perhaps he can add a new rule to this effect
  • Archivematica does have size limitations, so large video files and image files would have to be tested
    • One idea for addressing huge files was to go ahead and run all of the services outside of Archivematica (do the work ahead of time)
    • Create a new type of transfer that indicates to the system upon selection that it can skip running many of the microservices (ONLY DO X, Y, Z)
  • Melanie is wondering about the feasibility of sending AIPs to OCLC's dark archive - would some additional step be required between Archivematica and OCLC to make them dark archive-ready?
  • Melanie also observed that introducing Archivematica would flip some workflows for the special collections staff
Tracking and searching
  • Ladd asks whether Archivematica could be useful in tracking events in the life of a digital object (e.g. someone calls and says, "I know we digitized these items five years ago but we can't find them"). Could you use Archivematica to go back and determine whether those items exist and where they are?
  • Jennifer points out that Archivematica's tracking (logging of PREMIS events) may be useful immediately post-Archivematica, but any number of things can happen to DIPs and AIPs once they have left the system, so Archivematica tracking would not be useful in tracking the whole life history of the digital object - Archivematica logs are a snapshot of the object earlier in time
  • Ladd wonders how this decoupling of DIPs and AIPs from one another post-Archivematica fits into the broader conversation. What system are people currently using to track something analogous? Does this have any effect on the evaluation of the tool?
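
To make the "snapshot" point concrete, the following Python sketch lists the PREMIS events recorded in a METS document in date order. It is an assumption-laden illustration rather than a description of Archivematica's own tooling: the namespace shown is the PREMIS 2.x namespace, the function name is hypothetical, and the layout of an actual Archivematica METS file should be checked before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumption: events are recorded in the PREMIS 2.x namespace.
PREMIS_NS = "info:lc/xmlschema/premis-v2"
PREMIS = {"premis": PREMIS_NS}


def list_premis_events(xml_text):
    """Return (eventDateTime, eventType) pairs sorted by date string."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("{%s}event" % PREMIS_NS):
        events.append((
            event.findtext("premis:eventDateTime", default="", namespaces=PREMIS),
            event.findtext("premis:eventType", default="", namespaces=PREMIS),
        ))
    return sorted(events)
```

The resulting list covers only what happened inside the pipeline; anything done to an AIP or DIP after it leaves Archivematica would not appear here.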
Rights metadata
  • Every microservice (or whatever gets logged as a PREMIS event) offers an opportunity to record rights metadata. The group agreed this is a useful feature: staff can record rights metadata at whatever point in the workflow makes sense.
Mounting a network share
  • FTP is better.

Action Items

  • Benn is going to create some simple customized microservices, including custom rules/tools and try breaking up a disk image file and ingesting it in chunks
  • Benn will come up with instructions for customizing a bag (e.g. including a file for skipping over certain microservices?)
  • Benn will review master indexes/logs of what has happened thus far
  • Vandy and Jessica will continue the investigation into ICA-AtoM - try to actually create a pipe for the DIPs to AtoM (atom login is ladd@austin.utexas.edu; pswd de34rf)

  • Melanie will review the DIP package contents and test manual upload to CONTENTdm via Project Client.