...

Archival files are not generally available through the DAMS; instead, the Production copy is ingested into the DAMS as the OBJ datastream. The Archival file and the Production copy can be identical in some cases. If there is a specific use case for storing Archival files in the DAMS, they can be added as an additional datastream (ARCHIVAL_FILE).
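The datastream assignment described above can be sketched as follows. This is only an illustration: the function name datastreams_for and its parameters are hypothetical and are not part of the DAMS; only the datastream IDs OBJ and ARCHIVAL_FILE come from the text.

```python
from typing import Dict, Optional


def datastreams_for(production_path: str,
                    archival_path: Optional[str] = None,
                    store_archival: bool = False) -> Dict[str, str]:
    """Map datastream IDs to file paths for one asset (hypothetical helper)."""
    # The Production copy always goes into the OBJ datastream.
    streams = {"OBJ": production_path}
    # The Archival file is added only when a specific use case calls for it.
    if store_archival and archival_path is not None:
        streams["ARCHIVAL_FILE"] = archival_path
    return streams


default_streams = datastreams_for("map_production.tif", "map_archival.tif")
with_archival = datastreams_for("map_production.tif", "map_archival.tif",
                                store_archival=True)
```

In the default case only OBJ is populated; the Archival file stays outside the DAMS unless it is explicitly requested.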

...

Digital copy of content after initial processing that typically does not result in significant loss of quality compared to the Archival file. In some cases, the creation of a Production file will be a necessary step to achieve a valid digital representation of content, e.g., by stitching image segments of an oversized physical original. Other examples of processes resulting in a Production file include down-sampling, color correction, cropping, or mild compression (e.g., of video material).

Edits are optional, though. Depending on the requirements of the individual repository, Production files can be identical to the Archival files, in which case no separate copy is necessary.

...

Recommended file formats for images are uncompressed TIFF, or TIFF or JPEG 2000 with lossless compression.

Usually, the Production file is used for ingesting into the DAMS.

Derivative file

...

Any file that results from further editing, transformation, or content analysis and extraction. Examples are lower-resolution, compressed representations of content (JPEG images, MP3 audio files), OCR-extracted text (e.g., hOCR, ALTO files), XML documents describing the structure and content (e.g., METS, TEI), and PDF documents.

Derivative files are vaulted to UTL's tape archive if the derivative process cannot be reproduced automatically or deterministically. Lower-resolution derivative image files can be easily recreated from a production copy and do not need to be vaulted to tape. Uncorrected OCR results can be recreated from images, potentially at better quality as time passes. Manually corrected OCR text is the result of intellectual labor and should be preserved. Similarly, METS and TEI representations of documents typically involve manual intervention and should be kept. PDF documents can be regenerated if they are created through an automatic process from images, METS XML, and OCR results.
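The vaulting rule above can be expressed as a small decision sketch. The Derivative class, the automatically_reproducible flag, and should_vault are hypothetical names introduced only for illustration; how reproducibility is actually recorded for a given asset is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    """A derivative file and whether it can be regenerated (hypothetical model)."""
    name: str
    # True if the file can be recreated automatically or deterministically
    # from the Production copy or from other preserved files.
    automatically_reproducible: bool


def should_vault(derivative: Derivative) -> bool:
    """Vault only derivatives that cannot be reproduced automatically."""
    return not derivative.automatically_reproducible


examples = [
    Derivative("lower-resolution JPEG image", True),
    Derivative("uncorrected OCR text", True),
    Derivative("manually corrected OCR text", False),
    Derivative("METS/TEI with manual intervention", False),
    Derivative("PDF generated automatically from images, METS XML and OCR", True),
]

to_vault = [d.name for d in examples if should_vault(d)]
```

With the example list, only the manually corrected OCR text and the manually edited METS/TEI documents are selected for the tape archive.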

Some of the Derivative files are generated automatically upon ingest into the DAMS, e.g., lower-resolution JPEG images, thumbnail images and MP3 audio files. Other derivatives can be created outside of the DAMS and added to the digital representation of an asset as an additional datastream.

...