Administrative
Requirements
- The script should be executable from the command line and should accept its inputs as command-line arguments.
Input(s)
The script should be able to take the following input:
- CSV file containing paired columns of labels and labelnames from the archive organization (for example, the column order in a file could be: series, seriesname, subseries, subseriesname, etc.)
- the odd-numbered columns of the CSV must contain a label (for example: 1)
- the even-numbered columns of the CSV must contain a labelname (for example: Organizational correspondence)
- prefix the name of each label column with "arrange:<label>" and the name of each labelname column with "arrange:<labelname>" (case-sensitive, exclude the quotes)
- the text string following "arrange:<label>" will be used as a property name to query the database, and the text string following "arrange:<labelname>" will be used as a property name to add the labelname to the admin metadata profile
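The header rules above can be sketched in Python as follows. This is only an illustration, not the script's actual code: it assumes that a header cell looks like `arrange:series` or `arrange:seriesname` (the literal prefix `arrange:` followed by the property name), and the function name `parse_arrange_header` is hypothetical.

```python
ARRANGE_PREFIX = "arrange:"  # case-sensitive, as required above


def parse_arrange_header(header):
    """Return [(label_property, labelname_property), ...] from a CSV header row.

    Hypothetical helper: assumes every header cell is the prefix "arrange:"
    followed by the property name.
    """
    if not header:
        raise ValueError("CSV file has no header row")
    if len(header) % 2 != 0:
        raise ValueError("label and labelname columns must come in pairs")
    names = []
    for column in header:
        if not column.startswith(ARRANGE_PREFIX):
            raise ValueError(f"column {column!r} is missing the 'arrange:' prefix")
        names.append(column[len(ARRANGE_PREFIX):])
    # Odd-numbered columns (1st, 3rd, ...) are labels; even-numbered ones are labelnames.
    return list(zip(names[0::2], names[1::2]))


pairs = parse_arrange_header(
    ["arrange:series", "arrange:seriesname", "arrange:subseries", "arrange:subseriesname"]
)
```

With this header, `pairs` holds one tuple per label/labelname pair, in column order.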
Interface
The script will be a command-line utility that will serve the requirements as outlined above.
Invocations of the script would look like this:
python3 administrative.py [OPTION]... -f CSV
OPTION represents the following set of switches, which can be applied to modify the default behaviour of the script:
| Switch | Argument | Description |
|---|---|---|
| -f | Path to CSV file | Path to CSV file for batch processing. |
| -q | (none) | Quiet mode. Disables all informational prints. All exception and error related prints will still be output. |
| -h | (none) | The script displays a help document on the screen and exits. |
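One possible way to wire up these switches is with the standard `argparse` module, which provides `-h` automatically. This is a minimal sketch; the function name `build_parser` and the destination names `csv_path` and `quiet` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the -f, -q and -h switches (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="administrative.py",
        description="Add labelnames to the admin metadata profile from a CSV file.",
    )
    # -f is mandatory in the invocation shown above, so it is marked required.
    parser.add_argument(
        "-f", dest="csv_path", metavar="CSV", required=True,
        help="Path to CSV file for batch processing.",
    )
    parser.add_argument(
        "-q", dest="quiet", action="store_true",
        help="Quiet mode. Disables all informational prints.",
    )
    return parser


args = build_parser().parse_args(["-q", "-f", "labels.csv"])
```

On invalid arguments, `argparse` prints a usage message and exits, which covers the "inform user about errors in the arguments" step.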
Behavior and Implementation
The script (administrative.py) performs the following high-level operations:
- Parse command line arguments
- set variables in accordance with the arguments
- inform user about errors in the arguments, print help, and exit
- Read CSV file
- validate header structure
- parse 'arrange' information from header
- error if the label and labelname columns are not in pairs (that is, the CSV does not have an even number of columns), and print the message in the error CSV
- Read metadata property names from labels.json
- store all labels within a Python object
- Read controlled vocabulary from vocab.json
- store the vocabulary as a Python object
- Create a connection to the database
- For each row in the CSV:
- extract the 'arrange' info (for the admin metadata profile)
- query the database for the matching values of all labels in this row
- if no documents in the database match this query, print message in error csv
- for each returned document record,
- error if the label and labelname columns are not in pairs (that is, the CSV does not have an even number of columns), and print the message in the error CSV
- if document already has labelnames for every label in the query, print message in error csv.
- update the document with labelnames for all labels on the current row of the CSV file
- Add a metadata enrichment event to the PREMIS profile for this document.
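The per-row steps above can be illustrated without a real database. The sketch below assumes each document is a flat Python dict and that labelnames are stored as top-level properties; the actual storage layout, the database client, and the function names `build_query` and `missing_labelnames` are all assumptions. The sample values ("Board minutes", the document contents) are made up for the example.

```python
def build_query(pairs, row):
    """Map each label property name to the label value on this CSV row."""
    return {label: row[2 * i] for i, (label, _) in enumerate(pairs)}


def missing_labelnames(document, pairs, row):
    """Return the labelname properties this row would add to the document.

    An empty result means the document already has a labelname for every
    label in the query, which the steps above treat as an error-CSV entry.
    """
    return {
        labelname: row[2 * i + 1]
        for i, (_, labelname) in enumerate(pairs)
        if labelname not in document
    }


# (label property, labelname property) pairs taken from the CSV header.
pairs = [("series", "seriesname"), ("subseries", "subseriesname")]
row = ["1", "Organizational correspondence", "2", "Board minutes"]
# Stand-in for a document returned by the database query.
document = {"series": "1", "subseries": "2", "seriesname": "Organizational correspondence"}

query = build_query(pairs, row)
update = missing_labelnames(document, pairs, row)
```

Here `query` selects documents by label values, and `update` contains only the labelname the document does not yet have.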
Output(s)
The script should map each labelname to its corresponding label as specified in the CSV, and it should also report helpful information whenever errors are encountered. Errors to be reported include errors in command usage as well as any errors encountered while mapping a CSV entry to a document. Error CSV name: "admin_profile_errors_<timestamp>.csv"
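A minimal sketch of producing the error CSV is shown below. The timestamp format (`YYYYMMDD_HHMMSS`), the two-column layout, and the function names `error_csv_name` and `write_errors` are assumptions; the specification only fixes the "admin_profile_errors_<timestamp>.csv" naming pattern.

```python
import csv
from datetime import datetime


def error_csv_name(now=None):
    """Build the error CSV file name; the timestamp format is an assumption."""
    now = now or datetime.now()
    return f"admin_profile_errors_{now.strftime('%Y%m%d_%H%M%S')}.csv"


def write_errors(path, errors):
    """Write (csv_row_number, message) tuples to the error CSV."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["row", "message"])  # assumed column layout
        writer.writerows(errors)


name = error_csv_name(datetime(2024, 1, 2, 3, 4, 5))
```

Collecting errors in a list and writing them once at the end keeps the file handling in one place; writing each error as it occurs would work equally well.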
Test cases / Validation
- No header in the csv file should print an error.
- One entry of data in the csv file with at least one label-labelname pair. [[does this mean one data row in the csv? how many columns]]
- Multiple entries of data in the csv file with at least one label-labelname pair. [[does this mean multiple data rows in the csv? how many columns?]]
- Document not available in the database should print an error.
- Admin Profile already present for the document in the database with missing labelnames.
- Admin Profile already present for labels with labelnames. [[what should the script do here?]]