What are data models, and why are they needed?

Wikipedia offers the following definition of a Data Model:

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world. For instance, a data model may specify that a data element representing a car is composed of a number of other elements which in turn represent the color, size, and owner of the car.

In addition, the Wikipedia entry goes on to say the following:

A data model explicitly determines the structure of data.
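The car example in the definition above can be sketched as a small data structure. This is only an illustration: the class names, field names, and values below are assumptions made for the sketch and are not part of either project data model.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    # Hypothetical element describing the owner of a car.
    name: str


@dataclass
class Car:
    # The model fixes which elements a car record carries
    # and how they relate to one another.
    color: str
    size: str
    owner: Owner


# Illustrative values only.
car = Car(color="red", size="compact", owner=Owner(name="A. Driver"))
print(car.owner.name)
```

The point of the sketch is that the structure is declared once, and every car record must then follow it.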

Taking the analogy of computer programming languages, data models can be compared with data structures.

This section includes metadata descriptions for the dark archive and the public-facing digital library. Both data models are based on broadly adopted, well-documented, community-supported standards, which have been extended where necessary. We expect that no single data model is sufficient for

Dark archive

The dark archive will be available only to project personnel and to the staff authorized by participating organizations, such as Central State Hospital and the State Library of Virginia.

The unit of archiving is a single scan, and the metadata for the archive is designed for this level of granularity. The rationale for this decision is partly simplicity and partly operational facility.

Simplicity: The relationships between scans are difficult to ascertain without studying the scans themselves. Scans are numbered sequentially, but within that sequence some scans are more closely related to each other than to their neighbors.

Operational facility: Scan file sizes are between 20 MB and 250 MB. Bundling scans into larger groupings for archiving would require downloading several gigabytes at a time before any information could be accessed from the archive. Supporting granular access will enable future patrons or software to retrieve data quickly and then bundle it as needed for particular goals.
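The download cost of bundling can be estimated with a short calculation. The sketch below uses only the 20 MB and 250 MB bounds stated above; the bundle sizes of 1, 25, and 100 scans are hypothetical and chosen purely for illustration, since the project does not define bundles.

```python
MIN_SCAN_MB = 20    # smallest scan size stated above
MAX_SCAN_MB = 250   # largest scan size stated above


def bundle_size_range_gb(scans_per_bundle: int) -> tuple[float, float]:
    """Return the (minimum, maximum) bundle size in GB, using 1 GB = 1000 MB."""
    return (
        scans_per_bundle * MIN_SCAN_MB / 1000,
        scans_per_bundle * MAX_SCAN_MB / 1000,
    )


# Hypothetical bundle sizes; the project does not define any bundle size.
for count in (1, 25, 100):
    low, high = bundle_size_range_gb(count)
    print(f"{count:>3} scans per bundle: {low:.2f} GB to {high:.2f} GB")
```

Under these assumptions, a bundle of 100 scans would fall between 2 GB and 25 GB, while a single scan never exceeds 0.25 GB.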

The metadata will use a hybrid, standards-based data model, with classes that cluster the metadata developed for different purposes: descriptive, administrative, technical, rights, and preservation.
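One way to picture this clustering is a per-scan record with one group of fields per metadata class. This is a minimal sketch, not the project schema: the five class names come from the text above, while `ScanMetadata`, `scan_id`, and every key and value inside the groups are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScanMetadata:
    """Metadata for one archived scan, clustered by purpose."""
    scan_id: str  # hypothetical identifier field
    descriptive: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)
    rights: dict = field(default_factory=dict)
    preservation: dict = field(default_factory=dict)


# All keys and values below are illustrative, not taken from the project.
record = ScanMetadata(
    scan_id="scan-000001",
    technical={"file_size_mb": 120, "format": "image/tiff"},
    rights={"access": "restricted"},
)
print(record.technical["format"])
```

Keeping each class in its own group would let fields from different source standards sit side by side without mixing their purposes.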

Digital library

The data model for the CSH digital library will be based on standards such as the Portland Common Data Model, or PCDM. Documentation for PCDM can be found at the following URIs:

Description | URI
Duraspace PCDM wiki | https://github.com/duraspace/pcdm/wiki
Portland Common Data Model (PCDM): Creating and Sharing Complex Digital Objects | http://www.slideshare.net/kestlund/portland-common-data-model-pcdm-creating-and-sharing-complex-digital-objects
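For orientation, the core PCDM building blocks (pcdm:Collection, pcdm:Object, and pcdm:File, linked by pcdm:hasMember and pcdm:hasFile) can be sketched as plain data structures. The Python class names below are hypothetical stand-ins for the PCDM classes, and the choice to model one scan as one object with one file is an assumption made for the example, not a decision documented here.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class PcdmFile:
    """Stands in for pcdm:File, a sequence of bytes such as one scan image."""
    name: str


@dataclass
class PcdmObject:
    """Stands in for pcdm:Object, which may have files and member objects."""
    title: str
    has_file: List[PcdmFile] = field(default_factory=list)        # pcdm:hasFile
    has_member: List["PcdmObject"] = field(default_factory=list)  # pcdm:hasMember


@dataclass
class PcdmCollection:
    """Stands in for pcdm:Collection, a group of objects or other collections."""
    title: str
    has_member: List[Union[PcdmObject, "PcdmCollection"]] = field(default_factory=list)


# Hypothetical content: one object holding a single scan file.
scan_object = PcdmObject(title="Example scan", has_file=[PcdmFile("scan-000001.tif")])
collection = PcdmCollection(title="Example collection", has_member=[scan_object])
print(collection.has_member[0].has_file[0].name)
```

In an actual PCDM deployment these relationships are expressed as RDF statements rather than Python objects; the sketch only shows their shape.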