UT Libraries bag-info specification

The UTL bag-info specification defines which fields are required and/or allowed in bag-info.txt files accompanying collections materials submitted for preservation in the LTO tape archive. Some of these fields must be provided by the bag creator, while others are generated automatically during bagging. See the Bagging manual for instructions on providing values for these fields and bagging using various software tools.

There is no required order for listing these fields. Python bagging methods will always rearrange the bag-info fields alphabetically and collapse any multi-line values into a single line.

Fields provided prior to bagging

The following fields must be provided by the bag creator at the outset of the bagging process.

Source-Organization

The organization responsible for writing the bag to tape. Unless the content is being written to tape on behalf of an organization outside of UTL, and UTL does not have administrative control over the digital collection, the value should always be as follows:

University of Texas Libraries

Organization-Address

The full physical address of the Source-Organization. For UTL, use the following value:

University of Texas Libraries

The University of Texas at Austin

Post Office Box P

Austin, TX 78713-8916

Multi-line values

Python bagging methods will collapse all bag-info values into a single line. When bagging using Python, separate the lines of the Organization-Address value with commas so that the resulting single-line value remains legible.
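As a rough illustration of the comma-separated form, the sketch below joins the address lines into one value before bagging. The helper name join_address and the way the address is split into lines are assumptions made for this example; they are not part of any UTL script.

```python
# Illustrative sketch: build a single-line Organization-Address value.
# The list below is one possible way to split the UTL address into lines.
address_lines = [
    "University of Texas Libraries",
    "The University of Texas at Austin",
    "Post Office Box P",
    "Austin, TX 78713-8916",
]


def join_address(lines):
    """Return one comma-separated line, skipping blank entries."""
    return ", ".join(line.strip() for line in lines if line.strip())


organization_address = join_address(address_lines)
print(organization_address)
```

The resulting string can be supplied as the Organization-Address value, so the collapsed line already has readable separators.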

Contact-Name

The name of the person responsible for the bag. This can be the collection manager or the person creating the bag.

Contact-Phone

International format telephone number (beginning with "+1-" for US numbers) of the person named in Contact-Name.

Contact-Email

Fully qualified email address of the person named in Contact-Name.

External-Description

A description of the intellectual contents of the bag. Include creator information, collection information, and project history. For digitized archival collections, include the name of the corresponding physical collection and any series, folder, or item information.

External-Identifier

Any unique identifiers assigned by organizations other than the creator or owner of the materials that allow the contents of the bag to be identified. A UUID is required for all bags, and is used for tracking bags in the Digital Stewardship SIPs records. OCLC numbers or other external identifiers can also be provided, and should be accompanied by a clear label.

Generating UUIDs

Some internally-developed bagging scripts (see Bagging) will generate a UUID External-Identifier during the bag creation process. When using these scripts, there is no need to generate or provide a UUID manually. For other bagging methods, generate a UUID using an online UUID generator or the ToolBucket plugin for Notepad++.
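Where Python is available, the standard library uuid module is another way to produce a UUID. The sketch below assumes a random (version 4) UUID is acceptable; this page does not state which UUID version the UTL scripts generate, and the function name is hypothetical.

```python
import uuid


def new_external_identifier():
    """Return a random (version 4) UUID as a hyphenated string."""
    return str(uuid.uuid4())


identifier = new_external_identifier()
print(identifier)
```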

Internal-Sender-Description

A description of the equipment and software used to generate the files in the bag. This information may be needed to render the files later. Describe the file tiers included in the bag, such as "archival masters" and "derivatives". Include file format information for each file tier, as well as relevant technical specifications such as image resolution, bit depth, or compression methods.

Internal-Sender-Identifier

Identifiers assigned by the creator or owner of the materials that are used to identify the contents of the bag in their original environment. Common values for collections held by UTL are accession numbers, collection identifiers, or catalog call numbers.

Rights-Statement

A note describing any conditions or restrictions associated with the Content Information pertaining to both preservation and access.

Bag-Group-Identifier

Some convention or identifier that associates a related group of bags. Contact Digital Stewardship to check if a Bag-Group-Identifier has already been assigned for a given collection.

Bag-Count

The position of this bag, in order, among all of the bags in the current batch that share the same Bag-Group-Identifier. Use the format "# of #", for example "2 of 5".
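When a batch contains many bags, the "# of #" values can be generated rather than typed by hand. This is a minimal sketch; format_bag_count is a hypothetical helper, not a function from bagit-python or the UTL scripts.

```python
def format_bag_count(position, total):
    """Return a Bag-Count value such as "2 of 5"."""
    if not 1 <= position <= total:
        raise ValueError("position must be between 1 and total")
    return f"{position} of {total}"


# Bag-Count values for a batch of three related bags.
bag_counts = [format_bag_count(i, 3) for i in range(1, 4)]
print(bag_counts)
```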

Fields generated during bagging

These fields are generated automatically by the bagging process. Note that in the case of Python bagging methods, the UUID External-Identifiers are generated during bagging, but the field is not listed here, since bag creators can also provide other External-Identifiers prior to bag creation.

Bag-Size

The size of the bag in bytes, KB, MB, GB, or TB. Calculated automatically during bagging. bagit-python does not generate this value itself, but the Python bagging scripts developed at UTL calculate bag size and populate the field.
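The sketch below shows one way a bag size could be calculated and expressed in these units. It is not the UTL script itself: the function names, the use of 1024-based units, and the one-decimal rounding are all assumptions made for illustration.

```python
import os

UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def directory_size(path):
    """Sum the sizes, in bytes, of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def format_size(num_bytes):
    """Format a byte count using 1024-based units (an assumption)."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at TB for very large values.
        if size < 1024 or unit == UNITS[-1]:
            if unit == "bytes":
                return f"{int(size)} bytes"
            return f"{size:.1f} {unit}"
        size /= 1024


print(format_size(1536))
```

A value for the field could then be produced with format_size(directory_size(bag_dir)), where bag_dir is the path to the bag.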

Bag-Software-Agent

For bags created using Python, the installed version of bagit.py, plus the URL of the bagit-python GitHub page.

Bagging-Date

The date the bag was created, formatted as "YYYY-MM-DD".
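For reference, Python's standard library produces this format directly. The snippet is only an illustration of the "YYYY-MM-DD" pattern, not the code used by any particular bagging tool.

```python
from datetime import date

# isoformat() on a date returns the "YYYY-MM-DD" form used by Bagging-Date.
bagging_date = date.today().isoformat()
print(bagging_date)
```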

Payload-Oxum

The total size in bytes of the bag's payload (the files in the data directory), and the number of payload files. The two numbers are separated by a period, for example "15.2" for two files totaling 15 bytes.
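In the BagIt specification, Payload-Oxum is computed over the payload, meaning the files under the bag's data directory, and takes the form "octet count.file count". The sketch below recomputes the value for a small temporary bag-like directory; payload_oxum is a hypothetical helper written for this example, and the file names are invented.

```python
import os
import tempfile


def payload_oxum(bag_dir):
    """Return "<total bytes>.<file count>" for files under bag_dir/data."""
    data_dir = os.path.join(bag_dir, "data")
    total_bytes = 0
    file_count = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
            file_count += 1
    return f"{total_bytes}.{file_count}"


# Build a throwaway directory laid out like a bag to demonstrate the result.
with tempfile.TemporaryDirectory() as bag_dir:
    os.makedirs(os.path.join(bag_dir, "data", "images"))
    with open(os.path.join(bag_dir, "data", "a.txt"), "wb") as fh:
        fh.write(b"hello")  # 5 bytes
    with open(os.path.join(bag_dir, "data", "images", "b.bin"), "wb") as fh:
        fh.write(b"\x00" * 10)  # 10 bytes
    with open(os.path.join(bag_dir, "bag-info.txt"), "wb") as fh:
        fh.write(b"not payload")  # tag file outside data/, not counted
    example_oxum = payload_oxum(bag_dir)

print(example_oxum)  # 15.2
```

Comparing a recomputed value against the one recorded in bag-info.txt is a quick way to spot an incomplete transfer, but it does not replace checksum validation.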

Sample templates and bag-info files

blac_projects bag-info template: blac_projects.txt

aaa_projects bag-info template: aaa_projects.txt

theses_dissertations bag-info template: theses_dissertations.txt

2021_0264 (pcn_buenaventura_documentos) bag-info example: bag-info.txt