Descriptive metadata for aggregate objects
Traditional archival description operates above the item level, because, well, you try describing 16 linear feet of documents one page at a time; I'll wait. What this boils down to is that each level has a description which cascades down, with refinements, to the level below. So every sub-series in a series inherits the series description, but expands or refines it; if there are further divisions below the sub-series level, they inherit that refined description, and may in turn expand it. Some collections, for instance, are described down to the folder. The end result, though, is uniform: one object description applies to many items. The description at the folder level (object) applies to all the pieces of paper (items) contained within it. The DPLA's whitepaper and guidelines for creating descriptive metadata at the aggregate level may prove useful for:
- Series
- Sub-series
- Collection
- Item group
- Item sub-group
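The cascading description above (each level inherits from the level above it, then expands or refines) can be sketched in a few lines of Python. This is only an illustration: the `Aggregate` class, its field names, and the sample collection are all made up for the example, and none of it comes from the DPLA guidelines or from DACS.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Aggregate:
    """One level of archival description (collection, series, folder, ...).

    Hypothetical structure for illustration only.
    """
    title: str
    # Only what this level adds or refines, not the full description.
    description: dict = field(default_factory=dict)
    parent: Optional["Aggregate"] = None

    def effective_description(self) -> dict:
        """Merge descriptions from the top level down; lower levels win."""
        inherited = self.parent.effective_description() if self.parent else {}
        return {**inherited, **self.description}


# Made-up sample data: a collection, one series, one folder.
collection = Aggregate("Example papers", {"creator": "Example, A.", "language": "English"})
series = Aggregate("Correspondence", {"scope": "Letters, 1900-1950"}, parent=collection)
folder = Aggregate("Folder 3", {"scope": "Letters, 1923"}, parent=series)

print(folder.effective_description())
```

The folder never restates the creator or language; it picks those up from the collection, and only refines the scope it inherited from the series.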
DPLA model for aggregates: in our case, the aggregate-level items are intellectual/conceptual rather than physical (i.e., there are no digital files which correspond to the aggregate level, and yes, I am saying that digital files have a physical reality, because they do), and so things like "file name" do not apply. This cuts out file-level descriptive data. However, using the DPLA's model, we can describe the content rather than the form of these aggregates, using descriptive standards already familiar in the archive world (DACS, e.g.). The model looks a little like this:
In this fashion, we can describe the intellectual properties and content of the materials at a higher level- that of the series, subseries, collection, register, or whatever the most specific aggregate is (with some it may come down to a grouping of 5 pages; with others, like photographs, it may be the subseries or series level)- rather than any kind of item-level content description.
This does not supersede or preclude item-level form description (individual file names and properties, some of which are wrapped up in the technical metadata), but it does mean that no one has to look at every one of these items and describe its content. That can be carried at the aggregate level, using the enhanced finding aid, existing descriptive archival standards (leaning on DACS here because that is what it is for), and what we know or can find out about the materials with minimal intervention.
Appendix C: Use Cases. Worth looking at for dark archive/digital library purposes:
- 4
- 5
- 6
- 9
The DPLA, being user-focused, has different assumptions about how and why aggregates would be described, notes about UI implications etc, but that aside, the model is still useful for the purposes of our dark archive. It would translate back to the digital library side as well, but I'm looking at this from the archival perspective, which is: can we use this to describe context at various depths, and to surface aggregate archival entities (series, item groups, etc) which otherwise have no reality in our collection? Short answer: yes.
We have pages, and we are using the file system and the naming convention to organize and describe those pages in relationship to the Finding Aid in order to create archival order. Since we have no intention (currently, though that may change if, I don't know, we get brain parasites) of making aggregated objects (bundling individual TIFF files together in one big TIFF or PDF, e.g.), we need a way to provide contextual information despite not having a digital object to attach it to; we also need a way to provide subject/content description other than reading all the files and providing a summary, which sounds like the archives equivalent of an actual Herculean labor. This will do that, without requiring that we create aggregate objects. As a happy dividend, we can use this to describe the couple of massive PDFs that I understand the digitizers went and created, which aren't archival objects (as we've previously established), but do exist and could be served to patrons (in some theoretical scenario where bandwidth is no object).
While it would be possible to attach to each descriptive record the on-disk location of the directory that corresponds to it in the Finding Aid, this is not recommended, for reasons of portability among others. There are merits to that approach; however, in the interests of unshackling the intellectual order from the disk-level order (and not having to do link gardening), it is better to simply make sure that the hasPart and isPartOf relationships, and the attached file listings, are as robust as possible. That way, regardless of seismic shifts in the server/architecture, or migration, or flattening of the hierarchy, the relationships remain.
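As a rough sketch of what "robust" could mean here, the snippet below keeps records that point at each other only through hasPart and isPartOf, carries file listings as bare file names, and checks that the two relationships agree in both directions. The identifiers, titles, file names, and helper functions (`broken_links`, `files_under`) are all invented for the example; this is not the DPLA's schema, just one way the idea might look.

```python
# Hypothetical records keyed by identifier. No record stores a disk path,
# only relationships and a listing of file names.
records = {
    "coll-1": {"title": "Example papers", "isPartOf": None,
               "hasPart": ["ser-1"], "files": []},
    "ser-1": {"title": "Correspondence", "isPartOf": "coll-1",
              "hasPart": ["fld-3"], "files": []},
    "fld-3": {"title": "Folder 3", "isPartOf": "ser-1",
              "hasPart": [], "files": ["fld-3_0001.tif", "fld-3_0002.tif"]},
}


def broken_links(records):
    """Return (parent, child) pairs where hasPart and isPartOf disagree."""
    problems = []
    for rec_id, rec in records.items():
        # Every child named in hasPart must point back with isPartOf.
        for child_id in rec["hasPart"]:
            child = records.get(child_id)
            if child is None or child["isPartOf"] != rec_id:
                problems.append((rec_id, child_id))
        # Every parent named in isPartOf must list this record in hasPart.
        parent_id = rec["isPartOf"]
        if parent_id is not None:
            parent = records.get(parent_id)
            if parent is None or rec_id not in parent["hasPart"]:
                problems.append((parent_id, rec_id))
    return problems


def files_under(records, rec_id):
    """Collect file names listed at or below a record by walking hasPart."""
    rec = records[rec_id]
    found = list(rec["files"])
    for child_id in rec["hasPart"]:
        found.extend(files_under(records, child_id))
    return found


print(broken_links(records))
print(files_under(records, "coll-1"))
```

Because nothing in the records depends on where the files live, the same check still passes after a migration or a flattening of the directory hierarchy, so long as the file names themselves survive.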
We can use Appendix D: Metadata Properties as a guide. Stuff like "browse" and "more like this" isn't exactly inside the scope of the dark archive, so I emphasize "guide" here.
Corresponding standards: General International Standard Archival Description (ISAD(G)) guidelines.