This page describes similarities and key differences between DAMS1 and DAMS2.

[ 1 Reminder: DAMS Policy ] [ 2 DAMS2 general concepts ] [ 3 User accounts ] [ 4 Content models ] [ 5 Metadata ] [ 6 Staging and transferring content and metadata for ingest ] [ 7 Ingest process ] [ 8 Updating assets ]

Reminder: DAMS Policy

https://cloud.wikis.utexas.edu/wiki/x/WsRPAg

The DAMS is part of a larger digital stewardship ecosystem: The DAMS can be used to streamline access to curated content, predominantly for public access through the Collections Portal, the HRDI and Primeros Libros Portals, or Spotlight.
- The DAMS is not for temporary storage of digital items.
- The DAMS is only a secondary storage location for asset files that UTL curates. All content that is added to the DAMS must be preserved in UTL’s digital archival infrastructure (exception: Primeros Libros content provided by partner institutions).
- Reformatted content that is not digitized by UTL’s Digitization unit must conform to the UTL Digitization specifications: https://cloud.wikis.utexas.edu/wiki/x/wQGRAg
- For digitized/reformatted content, the ‘main’ asset Media entity typically contains the Production Master. Avoid storing Archival Master files in the DAMS, unless you anticipate frequent access requests. Archival Master files can be restored from the digital preservation archive upon request (contact @Karla Roig Blay).
- Born-digital content must adhere to UTL standards for acquisition and stewardship of born-digital collections. Consult with the Digital Processing Archivist (@Jeremy Thompson) or other Digital Stewardship staff BEFORE you agree to acquire born-digital content.
- Archival processing of born-digital files, in particular redaction of content and file format normalization, must be completed before the content is ingested into the DAMS. Unredacted/unnormalized files MUST NOT be stored in the DAMS.
The DAMS is the last step before publishing and not a parking lot: Ideally, content added to the DAMS will have sufficient metadata to allow for management and timely publishing of content.
- All content must be inventoried outside of the DAMS prior to ingest.
- Archival content: Content that by nature of its formal characteristics and organization is best described in an archival finding aid should have a sufficiently detailed finding aid, if necessary accompanied by an inventory that provides item/object-specific metadata.
- Library materials: Content that by nature of its formal characteristics is best described in a catalog record must have a catalog record. All materials that can be cataloged in the OCLC database must have an OCLC number.

DAMS2 general concepts

Every asset is a Node: The new version of the DAMS is based on Drupal, a web content management system. Content in Drupal is referred to as a “Node”. Every asset in the DAMS, including collections, serials, books/issues, pages, is represented by a Node, which has a unique ID. The Node stores the metadata which describes an asset.
- Assets in DAMS2 technically have two IDs: a sequential node ID, which is unique only within the current DAMS2 instance, and a UUID, which is globally unique. Users will typically interact with assets through the DAMS2 GUI using the node ID.
- DAMS1 PIDs were retained during the migration, in order for Collections Portal URLs to remain stable. New content that is published to the Collections Portal will also receive a UUID-based URL.
- Paged Content is modeled similarly to DAMS1 by creating a ‘sparse’ Node for each page and associating the page Nodes with a book/issue-level asset Node.
Every file is represented by a Media entity: Media entities in Drupal make it possible to associate files with (technical) metadata about a file. When files are added to the DAMS, they are associated with a Node that represents an asset.
- Media entities are functionally similar to datastreams in DAMS1.
- Similar to the different types of datastreams in DAMS1, DAMS2 distinguishes different types of Media entity (representing for instance the main asset file, derivatives/service files, thumbnails, transcripts, etc.).
Assets are bundles of files: Similar to DAMS1, which bundled different types of datastreams under a Fedora object/PID, asset Nodes in DAMS2 are typically associated with different Media entities: the ‘main’ asset file, derivatives, OCR results, transcripts or captions (if applicable). Which types of Media are available depends on the Content Model of the asset.
- Upon ingest, the DAMS software automatically creates certain types of derivatives.
- Media entities other than the ‘main’ file entity will be automatically named.

User accounts

User accounts will be created upon request by a DAMS manager.

You will find your credentials in Stache.

If you change your password in the DAMS interface, the Stache user credentials will not be updated automatically. Please contact a DAMS manager if you need to reset your password.

Content models

Audio

Allowed file extensions: mp3, wav, aac, m4a (=mp4 container that contains only audio datastreams)

A transcript is REQUIRED for publishing audio assets with linguistic content. Transcripts must be provided as plain text. A formatted PDF document can be provided additionally as a transcript.

Ingesting an audio file will trigger creation of an MP3 derivative.

It is possible to add a custom thumbnail image for audio assets, for instance showing a photo of the media carrier. Custom thumbnails must be added in a separate step after the initial ingest. The thumbnail files must be named using the following pattern: [node ID]-additional.jpg (the node ID is not available until the asset has been ingested).
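
The naming pattern for custom thumbnails can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and treating node IDs as positive integers and the file name as case-sensitive are assumptions, not documented DAMS behavior.

```python
import re

# Assumption: node IDs are positive integers and the suffix is lowercase.
THUMBNAIL_PATTERN = re.compile(r"\d+-additional\.jpg")


def thumbnail_name(node_id: int) -> str:
    """Build the custom thumbnail file name for an already-ingested asset."""
    if node_id <= 0:
        raise ValueError("node ID must be a positive integer")
    return f"{node_id}-additional.jpg"


def is_valid_thumbnail_name(name: str) -> bool:
    """Check a hand-prepared file name against [node ID]-additional.jpg."""
    return THUMBNAIL_PATTERN.fullmatch(name) is not None
```

For an asset with the (made-up) node ID 1234, `thumbnail_name(1234)` gives `1234-additional.jpg`.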

Audio content published to the Collections Portal or the HRDI Portal will be streamed via the Wowza media server, to prevent simple download/duplication of the content. This is not a strong copy-protection/digital rights management mechanism, however.

Collection

Used to organize content into hierarchical groups.

Can be nested into subcollections. Avoid overly granular subcollection structures and deep nesting.

Image

Single-item/single-page image content. For multi-page items or recto/verso scans use “Paged Content” (entire object) in combination with “Page” (individual page images).

Allowed file extensions: tiff, tif, jp2, jpf (“File” Media type). The extensions png, gif, jpg, jpeg (“Image” Media type) are technically allowed too, but should be ingested only after consultation with Digital Stewardship staff.

Ingesting an image will trigger creation of a JPEG 2000 service file and a JPEG thumbnail image.

Newspaper

Parent content type for Publication Issue/serial content; represents the entire series.

Functionally similar to a subcollection

Page

Individual page in a Paged Content item or a Publication Issue.

Metadata entered at this level is not usable for publishing and might get dropped during future migrations.

See Image content model for allowed file extensions.

Paged Content

Asset that aggregates page images, e.g. books/bound volumes, archival folders, or collections of elements that together form one intellectual entity and will not be described in more page-specific detail.

Stores metadata about an intellectual entity.

Published to the Collections Portal as a ‘flip-through’ resource.

Publication Issue

Individual issue/element of a periodical or series (typically unbound).

Functionally similar to Paged Content (book)

Video

Allowed file extensions: mp4, mov

Captions are REQUIRED for publishing video assets with linguistic content; an optional transcript can be provided as well. Captions must be provided in WebVTT format. If a transcript is provided, it should be a plain text file; a formatted PDF document can be provided in addition to that.
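
A quick way to catch caption files in the wrong format (for instance SRT) before handing them over is to check the file signature. The WebVTT format requires a file to begin with the string `WEBVTT`, optionally preceded by a byte order mark. The sketch below checks only that signature; it is not a full validator, and the function name is hypothetical.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def looks_like_webvtt(data: bytes) -> bool:
    """Return True if the data starts with a WebVTT file signature."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    if not data.startswith(b"WEBVTT"):
        return False
    # The signature must be followed by whitespace, a line break, or end of file.
    following = data[6:7]
    return following in (b"", b" ", b"\t", b"\n", b"\r")
```

To check a file on disk, pass its first bytes, e.g. `looks_like_webvtt(open(path, "rb").read(16))`.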

Ingesting a video file will trigger creation of an MP4 derivative.

Video content published to the Collections Portal or the HRDI Portal will be streamed via the Wowza media server, to prevent simple download/duplication of the content. This is not a strong copy-protection/digital rights management mechanism, however.
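
The allowed file extensions listed for the Audio, Image, Page, and Video content models can be checked before a batch is staged. The following is a minimal sketch, not part of the DAMS software: the function name is hypothetical, comparing extensions case-insensitively is an assumption, and the image formats that require prior consultation (png, gif, jpg, jpeg) are deliberately left out.

```python
from pathlib import Path

# Extensions as listed on this page, keyed by content model.
ALLOWED_EXTENSIONS = {
    "Audio": {"mp3", "wav", "aac", "m4a"},
    "Image": {"tiff", "tif", "jp2", "jpf"},
    "Page": {"tiff", "tif", "jp2", "jpf"},  # Page uses the Image list
    "Video": {"mp4", "mov"},
}


def extension_allowed(filename: str, content_model: str) -> bool:
    """Return True if the file extension is listed for the content model."""
    # Assumption: extensions are compared without regard to case.
    ext = Path(filename).suffix.lstrip(".").lower()
    return ext in ALLOWED_EXTENSIONS.get(content_model, set())
```

A file that fails this check is not necessarily unusable, but it should be discussed with Digital Stewardship staff before it is included in a batch.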

Not in use

The following content models are currently not available:

Binary
Compound Object
Digital Document

Metadata

Metadata is prepared in spreadsheet form, using a common template. Transformation to XML is no longer necessary. All metadata must be pre-processed by DAMS managers before it can be ingested.

DAMS2 contains managed vocabularies ('taxonomies') for some pieces of metadata, to improve the uniformity of metadata across assets. Some of these are open ended, meaning they can be extended as needed during ingest.

When creating metadata, look up terms in the existing taxonomies and reuse terms as appropriate, making sure that the spelling matches exactly. New taxonomy terms must be created before the Node metadata that references them is ingested. For the time being this will be managed as part of the pre-processing of metadata spreadsheets.
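
Because terms must match the existing taxonomy exactly, it can help to compare the terms in a spreadsheet column against the published term list before submitting. The sketch below assumes both are available as plain lists of strings; the function names and the example terms are hypothetical.

```python
def find_unknown_terms(terms, taxonomy):
    """Return the terms that do not match an existing taxonomy term exactly."""
    known = set(taxonomy)
    return [term for term in terms if term not in known]


def near_matches(term, taxonomy):
    """Suggest existing terms that differ only in case or surrounding spaces."""
    key = term.strip().casefold()
    return [t for t in taxonomy if t.strip().casefold() == key and t != term]


# Made-up example values, not actual DAMS2 taxonomy terms.
taxonomy = ["Photographs", "Maps"]
unknown = find_unknown_terms(["photographs ", "Maps"], taxonomy)
suggestions = {term: near_matches(term, taxonomy) for term in unknown}
```

A term with a near match is most likely a spelling variant that should be corrected; a term without one may be a genuinely new term to report to DAMS staff.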

Avoid typographic quotation marks

The DAMS is generally capable of handling Unicode characters. However, you should avoid typographic quotation marks (curly/smart quotes), as they will result in an error during ingest. Some spreadsheet and word processor applications automatically convert regular quotation marks to typographic ones (“”). Consider turning this replacement off when you create DAMS metadata.
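
Typographic quotation marks are easy to miss by eye. The following sketch shows one way to locate and replace them in a piece of text. The function names are hypothetical, and including the single curly quotes (‘ ’) alongside the double ones is an assumption; this page only shows the double marks as an example.

```python
# Mapping from typographic quotation marks to their plain equivalents.
TYPOGRAPHIC_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark (assumption: also affected)
    "\u2019": "'",  # right single quotation mark (assumption: also affected)
}


def find_typographic_quotes(text):
    """Return (position, character) pairs for every typographic quote."""
    return [(i, ch) for i, ch in enumerate(text) if ch in TYPOGRAPHIC_QUOTES]


def straighten_quotes(text):
    """Replace typographic quotes with plain ASCII quotes."""
    return text.translate(str.maketrans(TYPOGRAPHIC_QUOTES))
```

Reporting the positions first, rather than replacing silently, lets you review each case before changing the metadata.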

Metadata field overview and crosswalk from DAMS1 metadata template CSV: DAMS2 metadata field crosswalk.xlsx

Metadata spreadsheet template: DAMS2 metadata template v.0.2.xlsx
(changed in v.0.2: renamed column AdditionalFile_0 to PDF_0, see crosswalk)

Taxonomy term lists (for easier browsing/searching): DAMS2 taxonomies

DAMS2 metadata error checking script: https://github.austin.utexas.edu/mmh4428/dams2_validate

Staging and transferring content and metadata for ingest

If the content was created in-house by Digitization Services within the last 6 months, we can check if the media files are still available in our production environment. If they are, you do not need to transfer a copy of the media files back to us.

If you want to ingest content that was created outside of Digitization Services, please reach out to us to discuss preservation options.

Group content in one folder per batch. Give the folder a unique name that identifies project/group/collection and batch.
Verify that all assets and spreadsheets are present within the folder. Ensure that the metadata validation script (linked above) does not return any errors.
Use one of the following transfer options:
- Box
  - Do not share Box folder(s) with DAMS staff unless you are planning to transfer ownership of the Box folder.
  - Instead, please reach out to the DAMS team to request access to the _DSTransferIn folder and create a copy of your batch folder there.
  - Creating a copy of a Box folder ‘within’ Box is usually accomplished fastest via the Box web UI. Box Drive on your computer might have to download and re-upload files.
  - The copy of the batch folder you share with us is deleted after ingest. For post-custodial content, please reach out to the DAMS team to discuss preservation options.
- Network share: consult with DAMS staff first
Notify DAMS staff via email or Teams. Identify if the assets are being ingested for the first time or are being updated. If the assets are being updated, specify what is being updated.
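
The completeness check in the steps above (all assets referenced in the spreadsheet are present in the batch folder) can be sketched as follows. This is not the linked validation script. It assumes the metadata spreadsheet has been exported to CSV and that file names are listed in a column called `File`; both the column name and the function name are hypothetical.

```python
import csv
from pathlib import Path


def missing_files(batch_dir, spreadsheet_name, file_column="File"):
    """List files named in the spreadsheet that are absent from the folder."""
    batch = Path(batch_dir)
    missing = []
    with open(batch / spreadsheet_name, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            # Skip rows without a file name, e.g. collection-level rows.
            name = (row.get(file_column) or "").strip()
            if name and not (batch / name).is_file():
                missing.append(name)
    return missing
```

An empty result means every file named in the spreadsheet was found in the batch folder; it says nothing about whether the metadata itself is valid.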

Ingest process

The ingest process will be managed by DAMS staff for the time being. You need to provide a metadata spreadsheet and the files that will be ingested.

Updating assets

If you want to update asset metadata, you need to specify the ID of the node you want to update. Contact the DAMS administrators for support with obtaining node IDs.
For updates to taxonomy terms as part of a metadata update, the updated term must already exist in the DAMS. If you want to introduce new terms as part of a metadata update, please create a separate list of those.
Updates to media files (e.g. replacing images or AV content, or adding files after a node was created) must happen separately from updates to metadata. Please reach out to DAMS staff to inquire about how to provide spreadsheets for different kinds of updates.
