Tiered Ingest allows you to group all of the files corresponding to a simple asset's datastreams (including archival files, publication files, and other derivatives created outside of Islandora, with the exception of RELS-EXT) into a single sub-directory.
This tiered batch ingest method is NOT suitable for paged content (complex/compound assets with children). See Batch ingest complex assets (paged content) for instructions on how to ingest assets comprised of multiple pages.
The tiered ingest allows you to store additional files with a digital asset, and you can use this method to ingest externally created derivative datastreams (e.g. for streaming audio). See Content models for a breakdown of the expected datastreams per content model, and for information on which datastreams can be published to, for example, the Collections Portal.
General information for batch ingest
The batch ingest process runs continuously, checking for newly queued batch jobs approximately every 5 minutes. You can add batch ingest jobs to the queue at any time. Batch jobs are subject to the following batch job size and file size limitations:
Organise files in a batch job folder, using subfolders if appropriate. Refer to the instructions and options listed below for preparing batch jobs.
Step 1: Stage files for batch ingest job
All of the files you are ingesting as part of one asset must be staged in one directory per asset, as a sub-directory of the batch job folder path you identify in the queue form. Each sub-directory corresponds to one asset and must contain at least a file listing the "key datastreams" (datastreams.txt). This file lists each datastream ID and its corresponding filename, for instance the MODS datastream (MODS.xml), the OBJ datastream (e.g. filename.tif for a large image), or other datastreams with derivatives.
In order for the script to know which datastreams to ingest, a "manifest" (datastreams.txt) must be included with the queued batch.
OBJ==primaryfile.ext
MODS==metadata.xml # optional; if no MODS file is included, minimal metadata is automatically generated during ingest
PDF==custom.pdf # optional
ARCHIVAL_FILE==originalversionof_primaryfile.ext # optional, use for archival file (e.g. uncropped scan)
COMPONENT1==componentfile1.ext # optional, can for instance be used in cases where a primary image is stitched from multiple component images; increment for additional files in same directory
# DO NOT use for complex objects that can be modeled as paged content or Islandora component assets!
MEDIAPHOTOGRAPH1==anymediaphotographfile.ext # optional, can be used for images documenting physical media, cases, covers, etc.; increment for additional files in same directory
DERIVATIVE1==anyarbitraryderivativefile.ext # optional, use for derivative files with direct descendant relationship from file designated OBJ; increment for additional in same directory
# CAUTION: do not duplicate derivative files that are automatically generated by the DAMS
OBJ==primaryfile.ext [designation of primary file is at digital stewardship staff discretion, in consultation with requesting content holder]
DERIVATIVE1==anyarbitraryderivativefile.ext [use for derivative file with direct descendant relationship from file designated OBJ; increment for additional in same directory]
COMPONENT1==anyarbitrarycomponentfile.ext [use for cases such as a file comprising one piece of a stitched OBJ or one page image in a pdf OBJ; increment for additional in same directory]
MEDIAPHOTOGRAPH1==anymediaphotographfile.ext [use for images documenting physical media, cases, covers, etc.; increment for additional in same directory]
MODS==metadata.xml [use for optional included metadata file; if not included, minimal MODS metadata will be added]
Notes:
- [text] should not be included in the datastreams.txt file; brackets are used above for explanatory purposes only.
- Additions beyond the standard datastream IDs shown above are allowed. Consult with DAMS Management Team for recommendations.
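As an illustration of the manifest format described above, the following is a minimal Python sketch of a parser for datastreams.txt content. The `parse_manifest` helper is hypothetical (not part of the DAMS tooling); the KEY==filename syntax and `#` comment convention are taken from the examples on this page.

```python
def parse_manifest(text):
    """Parse datastreams.txt content into {DATASTREAM_ID: filename}.

    Lines have the form KEY==filename; '#' starts a comment and
    blank lines are ignored (as in the manifest examples above).
    """
    datastreams = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "==" not in line:
            continue
        key, _, filename = line.partition("==")
        datastreams[key.strip()] = filename.strip()
    return datastreams


manifest = """\
OBJ==primaryfile.tif
MODS==metadata.xml  # optional
DERIVATIVE1==anyarbitraryderivativefile.ext
"""
print(parse_manifest(manifest))
# → {'OBJ': 'primaryfile.tif', 'MODS': 'metadata.xml', 'DERIVATIVE1': 'anyarbitraryderivativefile.ext'}
```

A sketch like this can be handy for spot-checking a manifest before queuing a batch.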
Example ingest:
User1 in Architecture has a collection and needs to ingest their media with extra datastreams. They use FTP to upload their files to the server in a directory called batch1, then fill out the queue form as follows:
>>> Architectural Collections
Sample folder structure
eid1234_example-batch-submission/ (batch job folder)
├── asset1/
│   ├── datastreams.txt
│   ├── modsfile.xml
│   ├── primaryfile.tif
│   ├── anyarbitraryderivativefile.ext
│   ├── anyarbitrarycomponentfile.ext
│   └── anymediaphotographfile.ext
├── asset2_audio_example/
│   ├── datastreams.txt
│   ├── modsfile.xml
│   ├── audiofile.wav
│   ├── derivative_audiofile_for_streaming.mp4 (e.g. for creating PROXY_MP4 datastream, which is required for streaming audio)
│   └── audio_transcript.txt
└── asset3_video_example/
    ├── datastreams.txt
    ├── modsfile.xml
    ├── videofile.mp4
    ├── video_captions.vtt
    ├── video_transcript.txt
    └── page02_custom_ocr.txt
Notes:
- Folders for asset1, asset2, asset3 as shown above are nested under the batch directory. Each subfolder represents an individual asset with its datastreams.
- The batch job folder can contain just one asset folder, but the asset's files still need to be nested in their own sub-directory.
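Before uploading, it can be worth checking that every asset sub-directory follows the structure above. The following is a minimal Python sketch of such a pre-upload check; the `check_batch_folder` helper is hypothetical (not part of the DAMS tooling) and simply verifies that each asset folder contains a datastreams.txt and that every file it lists is present.

```python
import os


def check_batch_folder(batch_dir):
    """Pre-upload sanity check for a tiered batch job folder.

    Each immediate sub-directory is treated as one asset: it must
    contain a datastreams.txt, and every filename listed in that
    manifest must exist in the same sub-directory.
    Returns a list of problem descriptions (empty means OK).
    """
    problems = []
    for asset in sorted(os.listdir(batch_dir)):
        asset_dir = os.path.join(batch_dir, asset)
        if not os.path.isdir(asset_dir):
            continue
        manifest_path = os.path.join(asset_dir, "datastreams.txt")
        if not os.path.isfile(manifest_path):
            problems.append(f"{asset}: missing datastreams.txt")
            continue
        with open(manifest_path) as fh:
            for line in fh:
                line = line.split("#", 1)[0].strip()  # drop comments
                if not line or "==" not in line:
                    continue
                _, _, filename = line.partition("==")
                filename = filename.strip()
                if not os.path.isfile(os.path.join(asset_dir, filename)):
                    problems.append(f"{asset}: listed file not found: {filename}")
    return problems
```

Running this against the batch job folder before FTP upload can catch missing manifests or misspelled filenames while they are still easy to fix locally.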