...

See the documentation page on content models for details on each model and the file types supported.

Datastreams

User-provided

OBJ

Status: required

Primary media file (e.g. image, audio or video). The media file stored in the OBJ datastream is conventionally a production or publication file. The OBJ datastream is the source of any derivative made available for publication via the Collections portal.

Info

The OBJ datastream for an asset should contain a publication/production file. Curators can decide which processing steps should be applied to an archival file to create a publication copy (for instance cropping, stitching or certain color corrections of images). Curators can also decide to use a publication copy that is virtually identical to the archival file. In any case, curators should consider that the OBJ datastream is currently the only source for media content to be published to the Collections portal.

Note

Paged content/complex assets, publication type assets or collections do not have an OBJ datastream.

MODS

Status: required

Descriptive metadata about the asset, organized according to the Metadata Object Description Standard (MODS).

ARCHIVAL_FILE

Status: optional

Note

Suggested datastream label for archival files ingested along with production/publication files in a tiered ingest.

OCR_CUSTOM, FULL_TEXT_CUSTOM

Status: optional

User-generated full-text, e.g. as the result of optical character recognition (OCR).

PDF

Note

The PDF datastream is system-generated for page assets that are part of a book or serial issue. If you provide a PDF datastream for a page-level asset, it will be overwritten when the creation of a PDF for the book or serial issue is triggered through the DAMS GUI.

Derivative of the content represented by the asset in PDF format.

Typically provided by Digitization Services for assets which comprise multiple pages (paged content).

TRANSCRIPT

Status: required for audio content

Textual representation of linguistic content in audio and video assets. Required for audio assets to be publishable. Optional for video assets.

Transcripts MUST be in plain text.

CAPTIONS

Status: required for video content

Timed textual representation of linguistic content in audio and video assets. Required for video assets to be publishable.

Captions MUST be provided in WebVTT format.
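The TRANSCRIPT and CAPTIONS requirements above could be verified before ingest with a small script. The following is a minimal sketch, not part of the DAMS: the function names, the media type keys "audio" and "video", and the idea of passing in a set of datastream IDs are all assumptions made for illustration. The WebVTT check only looks at the file signature and does not validate cue timings.

```python
# Hypothetical pre-ingest check. The rules come from the TRANSCRIPT and
# CAPTIONS entries; all names below are illustrative assumptions.

# Datastream that must be present for an asset to be publishable.
REQUIRED_FOR_PUBLICATION = {
    "audio": "TRANSCRIPT",
    "video": "CAPTIONS",
}


def missing_for_publication(media_type, datastream_ids):
    """Return the required datastream ID that is absent, or None."""
    required = REQUIRED_FOR_PUBLICATION.get(media_type)
    if required is not None and required not in datastream_ids:
        return required
    return None


def looks_like_webvtt(text):
    """Check the file signature only: an optional byte order mark, then
    'WEBVTT' followed by a space, tab, line break, or end of file."""
    if text.startswith("\ufeff"):
        text = text[1:]
    if not text.startswith("WEBVTT"):
        return False
    rest = text[len("WEBVTT"):]
    return rest == "" or rest[0] in " \t\r\n"
```

For example, `missing_for_publication("audio", {"OBJ", "MODS"})` reports that TRANSCRIPT is missing, and a SubRip (.srt) file fails `looks_like_webvtt` because it does not start with the WEBVTT signature.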

PROXY_MP4

Audio content can be delivered as streaming media, which makes it somewhat harder for users to download the complete audio file than a plain MP3 download. To deliver audio as streaming media, create an MP4 derivative outside the DAMS and ingest it into a datastream labeled PROXY_MP4. Please submit a DAMS support ticket for details on this step.

System-generated

Depending on the ingest method (manual or batch) and the type of content ingested into the DAMS (see Content models), the DAMS automatically creates some datastreams.

Warning

The system-generated datastreams serve pre-defined functions in the DAMS and are managed by the system. If you want to add a custom datastream, do not use one of these IDs; otherwise the system might overwrite your custom data.
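A check along the following lines could catch ID collisions before a custom datastream is ingested. This is a sketch only: the set contains the system-generated IDs listed in this section, while the helper name and the case-insensitive comparison are assumptions chosen to err on the side of caution, not documented DAMS behavior.

```python
# System-generated datastream IDs as listed in this section.
SYSTEM_DATASTREAM_IDS = frozenset({
    "RELS-EXT", "TECHMD", "POLICY", "DC", "TN", "JPG", "JP2",
    "PROXY_MP3", "MP4", "OCR", "FULL_TEXT", "HOCR", "PDF", "UTLDAMS_PDF",
})


def is_safe_custom_id(datastream_id):
    """Return True if the ID does not collide with a system-generated one.

    The comparison ignores case and surrounding whitespace; whether the
    DAMS treats IDs case-sensitively is not stated, so this is an assumption.
    """
    return datastream_id.strip().upper() not in SYSTEM_DATASTREAM_IDS
```

With this helper, `is_safe_custom_id("OCR_CUSTOM")` is true, while `is_safe_custom_id("TN")` is false.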

RELS-EXT

DAMS-specific metadata about the sub-collection an asset belongs to and about access permissions.

TECHMD

Technical metadata about the content of an asset's OBJ datastream, e.g. information about file format, compression algorithms, creation and modification dates.

POLICY

Metadata about the role-based access permissions to an asset inside the DAMS.

DC

Descriptive metadata about the asset in Dublin Core format, automatically derived from the MODS metadata provided by the user during ingest.

TN

Derivative image file: Thumbnail image.

JPG

Derivative image file: low-resolution dissemination copy.

JP2

Derivative image file: JPEG 2000 copy with lossless compression, for use in the Collections portal.

PROXY_MP3

Note

The PROXY_MP3 datastream is automatically generated upon ingest from an OBJ datastream that contains WAV (waveform audio) data. If you publish an asset with a PROXY_MP3 datastream to the Collections portal, the file is made available to the embedded player as a progressive download, so users can download the complete MP3 file with relative ease. To deliver audio as streaming media instead, see the PROXY_MP4 datastream under User-provided.

Derivative audio file with MPEG Audio Layer III encoding. Automatically generated upon ingest by the DAMS using lame, for use in the Collections portal.

lame is invoked with the following parameters: -V5 --vbr-new
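To reproduce a comparable derivative outside the DAMS, the invocation could be wrapped as shown below. This is a sketch under assumptions: the function names and file paths are hypothetical, lame must be installed and on the PATH, and the option is written as `--vbr-new` with two hyphens, which is how lame spells it. How the DAMS itself calls lame beyond the parameters stated above is not documented here.

```python
import subprocess


def lame_command(wav_path, mp3_path):
    """Build the argument list: VBR quality 5 with the newer VBR routine."""
    return ["lame", "-V5", "--vbr-new", wav_path, mp3_path]


def encode_mp3(wav_path, mp3_path):
    """Run lame; raises CalledProcessError if the encoder exits non-zero."""
    subprocess.run(lame_command(wav_path, mp3_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.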

MP4

Derivative video file: MPEG-4 media container file, generated upon ingest by ffmpeg with H.264 video encoding and AAC audio encoding. Used in the Collections portal.

ffmpeg is invoked with the following parameters: -vcodec libx264 -preset medium -acodec aac -strict -2 -ab 128k -ac 2 -async 1 -movflags faststart
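The same parameters can be assembled into a command for testing a video file locally before ingest. This is a minimal sketch: the function name and paths are hypothetical, and placing the input after `-i` and the output path last follows standard ffmpeg usage rather than anything stated about the DAMS.

```python
import subprocess

# Encoding options exactly as listed above.
FFMPEG_OPTIONS = [
    "-vcodec", "libx264", "-preset", "medium",
    "-acodec", "aac", "-strict", "-2",
    "-ab", "128k", "-ac", "2",
    "-async", "1", "-movflags", "faststart",
]


def ffmpeg_command(src_path, mp4_path):
    """Build the argument list: input first, options, then the output file."""
    return ["ffmpeg", "-i", src_path, *FFMPEG_OPTIONS, mp4_path]


def encode_mp4(src_path, mp4_path):
    """Run ffmpeg; raises CalledProcessError if the encoder exits non-zero."""
    subprocess.run(ffmpeg_command(src_path, mp4_path), check=True)
```

The `-movflags faststart` option moves the index to the start of the file so that playback can begin before the whole file has been downloaded.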

OCR, FULL_TEXT, HOCR

Full text generated by the DAMS during ingest or when the OCR process is triggered in the DAMS GUI.

PDF

Note

The PDF datastream is system-generated for Page assets that are part of a Book or Serial Issue. If you provide a PDF datastream for a Page-level asset, it will be overwritten when the creation of a PDF for the Book or Serial Issue is triggered through the DAMS GUI.

System-generated for Page assets when PDF creation is triggered through the DAMS GUI.

UTLDAMS_PDF

Derivative PDF container for paged content, generated through the DAMS GUI or upon ingest.

Note

The system-generated PDF container for paged content is in almost all cases of lower quality than the PDF files provided by Digitization Services, which are typically ingested as the PDF datastream for book/issue-level assets.

Paged content/complex assets

...