Administrative
Requirements
The script should be executable from the command-line, and should accept inputs from the command line.
Input(s)
The script should be able to take the following input:
A CSV file containing paired columns of labels and labelnames in the archive organization (for example, the column order in a file could be: series, seriesname, subseries, subseriesname, etc.)
the odd-numbered columns of the CSV must contain a label (for example: 1)
the even-numbered columns of the CSV must contain a labelname (for example: Organizational correspondence)
prefix the name of each label column with "arrange:<label>" and the name of each labelname column with "arrange:<labelname>" (case-sensitive, exclude the quotes)
the text string following "arrange:<label>" will be used as a property name to query the database, and the text string following "arrange:<labelname>" will be used as a property name to add the labelname to the admin metadata profile
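The header convention above can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes header cells take the form `arrange:series`, `arrange:seriesname`, and so on, and the function name `parse_arrange_header` is hypothetical.

```python
ARRANGE_PREFIX = "arrange:"  # case-sensitive, as required above


def parse_arrange_header(header):
    """Return a list of (label_property, labelname_property) pairs.

    Assumes every header cell looks like "arrange:<name>".
    Raises ValueError if the columns are not paired or a cell lacks the prefix.
    """
    if not header or len(header) % 2 != 0:
        raise ValueError("header must contain an even, non-zero number of columns")
    names = []
    for cell in header:
        cell = cell.strip()
        if not cell.startswith(ARRANGE_PREFIX) or len(cell) == len(ARRANGE_PREFIX):
            raise ValueError(f"column {cell!r} is missing the 'arrange:' prefix or a name")
        names.append(cell[len(ARRANGE_PREFIX):])
    # Columns 1, 3, 5, ... hold labels; columns 2, 4, 6, ... hold labelnames.
    return list(zip(names[0::2], names[1::2]))


pairs = parse_arrange_header(
    ["arrange:series", "arrange:seriesname", "arrange:subseries", "arrange:subseriesname"]
)
print(pairs)  # [('series', 'seriesname'), ('subseries', 'subseriesname')]
```

Each pair gives the property name used to query the database and the property name under which the labelname is stored.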
Interface
The script will be a command-line utility that meets the requirements outlined above.
Invocations of the script would look like this:
python3 administrative.py [OPTION]... -f CSV
OPTION represents the following set of switches, which can be applied to modify the default behaviour of the script:
| Switch | Argument | Description |
|---|---|---|
| -f | Path to CSV file | Path to CSV file for batch processing. |
| -q | (none) | Quiet mode. Disables all informational prints. All exception and error related prints will still be output. |
| -h | (none) | The script displays a help document on the screen and exits. |
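One possible way to wire up these switches is with the standard-library `argparse` module. The sketch below is an assumption about how the parser could look; the attribute names `csv_path` and `quiet` and the helper `build_parser` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the -f, -q and -h switches described in the table."""
    parser = argparse.ArgumentParser(
        prog="administrative.py",
        description="Add labelnames to the admin metadata profile from a CSV file.",
    )
    # argparse adds -h/--help automatically; it prints the help text and exits.
    parser.add_argument("-f", dest="csv_path", metavar="CSV", required=True,
                        help="Path to CSV file for batch processing.")
    parser.add_argument("-q", dest="quiet", action="store_true",
                        help="Quiet mode: disable informational prints; errors are still output.")
    return parser


# Parse an explicit argument list here; the real script would call parse_args()
# with no arguments so that sys.argv is used.
args = build_parser().parse_args(["-q", "-f", "labels.csv"])
if not args.quiet:
    print(f"Processing {args.csv_path}")
```

With `required=True` on `-f`, argparse itself reports a usage error and exits when the CSV path is missing.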
Behavior and Implementation
The script (administrative.py) performs the following high-level operations:
Parse command line arguments
set variables in accordance with the arguments
inform user about errors in the arguments, print help, and exit
Read CSV file
validate header structure
parse 'arrange' information from header
report an error if the label and labelname columns are not in pairs (the CSV must have an even number of columns) and write the message to the error CSV.
Read metadata property names from labels.json
store all labels within a Python object
Read controlled vocabulary from vocab.json
store the vocabulary as a Python object
Create a connection to the database
For each row in the CSV:
extract the 'arrange' info (for the admin metadata profile)
query the database for the matching values of all labels in this row
if no documents in the database match this query, write a message to the error CSV
for each returned document record:
report an error if the label and labelname columns are not in pairs (the CSV must have an even number of columns) and write the message to the error CSV.
if the document already has labelnames for every label in the query, write a message to the error CSV.
update the document with labelnames for all labels on the current row of the CSV file
Add a metadata enrichment event to the PREMIS profile for this document.
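The per-row steps above can be sketched without committing to a particular database. In this illustration a document record is a flat Python dict, which is an assumption; the helper names `split_row` and `missing_labelnames` and the sample values are hypothetical, and the database query and PREMIS event are left out.

```python
def split_row(pairs, row):
    """Split one CSV data row into a query dict (labels) and an update dict (labelnames)."""
    if len(row) != 2 * len(pairs):
        raise ValueError("row does not match the number of header columns")
    query, updates = {}, {}
    for i, (label_prop, labelname_prop) in enumerate(pairs):
        query[label_prop] = row[2 * i].strip()
        updates[labelname_prop] = row[2 * i + 1].strip()
    return query, updates


def missing_labelnames(document, updates):
    """Return the labelnames the document lacks (absent or empty values count as missing)."""
    return {prop: value for prop, value in updates.items() if not document.get(prop)}


pairs = [("series", "seriesname"), ("subseries", "subseriesname")]
query, updates = split_row(pairs, ["1", "Organizational correspondence", "2", "Board minutes"])

# Hypothetical document record that already has one of the two labelnames.
document = {"series": "1", "subseries": "2", "seriesname": "Organizational correspondence"}
todo = missing_labelnames(document, updates)
if not todo:
    # In the real script this message would be written to the error CSV.
    print("document already has labelnames for every label")
else:
    document.update(todo)
print(query)
print(document)
```

Here `query` holds the values used to find matching documents, and `todo` holds only the labelnames that still need to be added.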
Output(s)
The script must map each labelname to its corresponding label as specified in the CSV, and must also print helpful information when errors are encountered. Errors to be reported include errors in command usage as well as any errors encountered while mapping a CSV entry to a document. Error CSV name: "admin_profile_errors_<timestamp>.csv"
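A minimal sketch of the error file handling follows. The timestamp format and the two-column layout (row number, message) are assumptions, since neither is specified here; the function names are hypothetical.

```python
import csv
from datetime import datetime


def error_csv_name(now=None):
    """Build the error file name; the timestamp format is an assumption."""
    now = now or datetime.now()
    return f"admin_profile_errors_{now.strftime('%Y%m%dT%H%M%S')}.csv"


def write_errors(path, errors):
    """Write (csv_row_number, message) tuples to the error CSV."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["row", "message"])  # assumed column layout
        writer.writerows(errors)


name = error_csv_name(datetime(2024, 1, 31, 9, 5, 0))
print(name)  # admin_profile_errors_20240131T090500.csv
```

A call such as `write_errors(name, [(2, "no matching document")])` would then create the file with one header row and one error row.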
Test cases / Validation
A CSV file with no header should print an error.
One entry of data in the csv file with at least one label-labelname pair. [[does this mean one data row in the csv? how many columns]]
Multiple entries of data in the csv file with at least one label-labelname pair. [[does this mean multiple data rows in the csv? how many columns?]]
A document that is not available in the database should print an error.
Admin Profile already present for the document in the database with missing labelnames.
Admin Profile already present for labels with labelnames. [[what should the script do here?]]