Alexis's interview with TARO's WebTex subcommittee:
Please describe the tools and technologies that make up your current descriptive environment. Why were these particular tools chosen? What other options did the Princeton team evaluate?
What was your process of evaluation?
Please describe the adoption of, or migration to, your current archival collection management system. Did you encounter any challenges in the adoption/migration process? Can you discuss your data clean-up workflows?
How the site works/technologies that contribute to collection management and public access:
they are currently evaluating the current site to see if it still serves their needs
they are already looking at the next iteration of the finding aid site (follow up: what are they currently considering)
the site you see was custom built by their systems department; they evaluated existing publishing platforms and DLXS but decided against them
they do not have an integrated collections management and front end system; the site currently just publishes the collection description
the backend of the site involves several technologies and steps because they do a lot of extra enhancement to their EAD to provide the functionality people see on the site (sorting by date and title, or linking digitized material to the finding aid site)
Mudd currently uses AT and they manage their accessions data there; they do not use it as a database of record for anything else that they do (no locations, no finding aids); they really only use it for accessions
they do share this stack of technologies with other libraries on campus, but those may have a different process or workflow (e.g., the Manuscripts Division)
Mudd process:
Everything was already in EAD
when a collection comes in, they try and get as much descriptive info up front with date to make the process faster
accession record created in AT (an authoring tool for collection-level information - but they may change the EAD later; AT does not store the master EAD)
Export EAD from AT; if a collection came with a description (such as a box list or spreadsheet), they have an XSLT that transforms the list data into an EAD inventory
Once the finding aid is created, they run it through a “normalizer script” that does several things to clean up the data:
normalizes dates
strips label attributes and <p> elements
adds unique component ids (if you go to the URL for a given level of description within a finding aid, you’ll see the various components have unique ids)
this serves to create an id that never changes even if physical items are moved in a box - the unique id is also the basis for relating digital objects to specific components - this component id becomes p
They validate the record - they have a looser and a stricter schema
Once errors are cleaned up, they commit the master EAD to the SVN version control system, and SVN is used as the database for the finding aid site - additions and updates are loaded nightly; this is the master record
They also have a test site to upload EAD and preview it before they publish the EAD
For the test site, they upload to an eXist database - so it’s in there somewhere
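The normalizer steps described above (normalizing dates, stripping label attributes, adding stable component ids) could look roughly like the sketch below. This is an illustration in Python, not Princeton's actual script: the function names, the id format, the two date patterns handled, and the assumption of un-namespaced EAD 2002 elements are all hypothetical. The <p> stripping step is left out because the notes do not say which <p> elements are removed.

```python
import re
import uuid
import xml.etree.ElementTree as ET

# Assumed date patterns: only plain years and year ranges are handled here.
YEAR = re.compile(r"^\s*(\d{4})\s*$")
YEAR_RANGE = re.compile(r"^\s*(\d{4})\s*-\s*(\d{4})\s*$")
# EAD 2002 component tags: <c> and <c01> through <c12>.
COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])?$")


def normalize_date(text):
    """Return a machine-readable value for simple date text, else None."""
    text = text or ""
    match = YEAR_RANGE.match(text)
    if match:
        return f"{match.group(1)}/{match.group(2)}"
    match = YEAR.match(text)
    if match:
        return match.group(1)
    return None


def normalize_ead(root):
    """Apply the clean-up steps in place and return the root element."""
    for element in root.iter():
        # Strip label attributes wherever they appear.
        element.attrib.pop("label", None)
        # Add a "normal" attribute to dates that can be parsed.
        if element.tag == "unitdate" and "normal" not in element.attrib:
            normal = normalize_date(element.text)
            if normal:
                element.set("normal", normal)
        # Only components without an id get one, so existing ids
        # never change (hypothetical id format).
        if COMPONENT.match(element.tag) and "id" not in element.attrib:
            element.set("id", "c_" + uuid.uuid4().hex[:12])
    return root


if __name__ == "__main__":
    sample = ET.fromstring(
        '<dsc><c01><did><unittitle label="Title">Minutes</unittitle>'
        "<unitdate>1901-1910</unitdate></did></c01></dsc>"
    )
    print(ET.tostring(normalize_ead(sample), encoding="unicode"))
```

Assigning ids only where none exist reflects the point in the notes that the id should stay fixed even if items move between boxes, since digital objects are linked by that id.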
Systems Team - John Ellis, Sean Stroup (decision and versions of technologies, implementation) - they would be able to tell us exactly how these different tools are talking to one another
they do have a Princeton archival collection working group to discuss workflow standardization but
Mudd uses fewer ACMS functions because they get much more info when a collection comes in the door; they have donations
Manuscripts Division uses AT/ACMS differently because they purchase collections and create more from scratch
collection description is done independently among repositories on campus
Looking at the second round improvement:
ADWG subcommittees (because this group is big and each can carve out their tasks)
UX
Digital Objects/Born digital - they are a Hydra partner now, so this subcommittee will be considering that in the context of collection description
Authority control/SNAC/EAC-CPF
Data modeling - moving away from EAD to their own data model; looking at the article on the EU CENDARI project, LOD
UI
ArchivesSpace task force
Discussion items
Item
Who
Notes
Discussion of Princeton's Finding aid system
Amy, Beth, Esther, Paloma
Liked the integration of digital content and ability to sort results
How are comments intended to be used? Moderated? No parameters present. More work for staff?
Discussion of Briscoe Center public service moving to the Benson
Stephanie Malmros, Carla, Amy, Esther, Beth
Through next December
Items paged every half hour
Appointment based
Smaller space: fewer reference materials
Beth: other projects on hold? Stephanie: exhibits will be conducted at LBJ, most functions can continue
Beth: biggest improvement? Stephanie: Reading room up to date, dedicated classroom space, flexible spaces, exhibit space
Front desk staff have been getting some complaints: researchers don't care about exhibit space and feel it is unnecessary
More finding aids are now available online for researchers
Skype call with Alexis Antracoli, Archivist, Mudd Library, Princeton University
Alexis, Paloma
Introductory info:
Works at Mudd Manuscript Library, which is part of Rare Books and Special Collections
Took job in July 2015
Oversees technical services, digitization and electronic records
Paloma question: Background of project? Alexis answer:
Mudd converted to EAD around 2006
New site was launched in 2012
Old site had less functionality, displayed EAD
New site: better functionality, allows access in a variety of ways, can look at finding aids whole or use search to find stuff at each level of description.
Can easily attach digital items to finding aids
Can do on-demand digitization and then attach items to finding aid
Can sort on date, title, box #'s, etc.
Simple and advanced search
Topics page
Use EAC-CPF records (data is pulled from finding aids automatically using a script written at Firestone)
Involved in SNAC and want to do more with that, but no specific plans yet.
Paloma question: Is SNAC data pushed or pulled? Alexis answer:
Not sure. Are biog/hist notes even worthwhile?
Ex. Woodrow Wilson. He was president of Princeton, but most biog/hist notes about him won't focus on this.
Paloma question: How was finding aid project coordinated? Alexis answer:
Two main repositories, Firestone and Mudd, with separate technical services units. Two teams collaborated with systems. Total of 4-5 people plus systems worked on project.
Created a proposal and then met 2/month to develop the site
A lot of data cleanup was involved
EAD goes through normalization and cleanup
Processing:
Staff use Archivists' Toolkit for processing: it creates a skeleton description at collection and series level
Try to get inventories up front from donors
Students enter data into Excel and use a stylesheet to transform it into EAD
Don't enter a lot of data directly into Archivists' Toolkit, which cuts down on processing time
Once created, finding aid is normalized (esp. dates), then validated against in-house schemas
Two schemas: looser one for older descriptions, stricter one for new descriptions
Then finding aid is uploaded and committed to site
Each component of the finding aid has its own URL; the normalizer adds this as a data attribute to each component
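The spreadsheet-to-EAD step in the processing workflow above can be illustrated with a short sketch. The notes say Mudd does this with a stylesheet (XSLT); the Python version below is a stand-in for illustration only, not their method. The column names (box, folder, title, date), the function name, and the choice of un-numbered <c> components at file level are all assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_dsc(csv_text):
    """Build an EAD <dsc> inventory from CSV text, one component per row.

    Assumed columns: box, folder, title, date. Only box and title are
    required; empty folder or date cells are skipped.
    """
    dsc = ET.Element("dsc")
    for row in csv.DictReader(io.StringIO(csv_text)):
        component = ET.SubElement(dsc, "c", {"level": "file"})
        did = ET.SubElement(component, "did")
        box = ET.SubElement(did, "container", {"type": "box"})
        box.text = row["box"].strip()
        folder_text = (row.get("folder") or "").strip()
        if folder_text:
            folder = ET.SubElement(did, "container", {"type": "folder"})
            folder.text = folder_text
        ET.SubElement(did, "unittitle").text = row["title"].strip()
        date_text = (row.get("date") or "").strip()
        if date_text:
            ET.SubElement(did, "unitdate").text = date_text
    return dsc


if __name__ == "__main__":
    inventory = (
        "box,folder,title,date\n"
        "1,1,Correspondence,1901-1910\n"
        "1,2,Minutes,\n"
    )
    print(ET.tostring(rows_to_dsc(inventory), encoding="unicode"))
```

A spreadsheet saved as CSV from Excel could be fed to a function like this, with the resulting <dsc> merged into the skeleton EAD exported from Archivists' Toolkit.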
Paloma question: Best practices of authorities and taxonomies? Alexis answer:
Authorities applied in Archivists' Toolkit (subjects and names module)
Currently working on cleaning up names using OpenRefine
Data from Archivists' Toolkit will be moving into ArchivesSpace and they want to clean up before migration.
Processing archivists can apply subjects freely, use LCSH, AAT, genre/form - which is esp. useful for born-digital items
University has files on alumni, fac/staff etc. which are very siloed and not searchable through the finding aid system
trying to get indexes into EAD so they're findable on the finding aid site
Some data may not fit EAD and would work better as EAC-CPF, e.g. birth/death dates, associated departments etc.
University archives thinking about adopting EAC-CPF for this, but project is a few years out
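For the name cleanup mentioned above, the kind of clustering that OpenRefine offers can be approximated in a few lines. The sketch below is modeled on the general idea of key-collision ("fingerprint") clustering: it is not OpenRefine's code and not Princeton's workflow, and the function names and sample names are made up for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Reduce a name to a key that ignores case, accents, punctuation,
    and word order."""
    text = unicodedata.normalize("NFKD", name.strip().lower())
    # Drop combining marks so accented letters match unaccented ones.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster_names(names):
    """Group variant spellings that share a fingerprint.

    Only groups containing more than one distinct form are returned,
    since those are the candidates for review and merging.
    """
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    headings = [
        "Wilson, Woodrow",
        "Woodrow Wilson",
        "wilson, woodrow.",
        "Stevenson, Adlai",
    ]
    for group in cluster_names(headings):
        print(group)
```

Clusters found this way are only candidates: a person still has to confirm that the variants refer to the same entity before merging name records ahead of a migration.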
Paloma question: How does commenting function? Policies? Alexis answer:
Idea was to get people to provide information about collections
Wound up getting a lot of reference questions, which are passed on to public service. Also a lot of spam.
Not used as originally conceived.
No guidance is provided on how to use system.
Maybe a newer iteration of site can add more guidance on how to use comments.
Once in a while someone points out a mistake, but it's rare.
Paloma question: Future improvements to site? Alexis answer:
Not sure what would replace current system.
Possible improvements:
Hard to tell if a finding aid includes digital content
Easy to overlook pdf version on page
Like California Digital Library system, which makes it easy to search for digital content
Left tree view is a little clunky and needs better usability
Looking at some UX improvements, incl. user testing and Google Analytics; need to figure out priorities
There is also a committee on born digital, authorities and data modeling: considering moving away from EAD to a different data model altogether.
There is also an ArchivesSpace subcommittee
Paloma question: Are you members of ArchivesSpace? Alexis answer:
Yes, haven't migrated yet, will start with accessions and born digital content.
Post-Skype discussion
Paloma, Jennifer, Beth
Surprised to hear they are considering moving away from EAD
Benefit of ArchivesSpace is ability to export EAD/EAD3
Some interest in continuing discussion of data cleanup projects/methods/tools - next meeting?
Current project to do data cleanup on TARO - 40+ repositories, various tools and processes involved.
Beth: Architecture library also working on data cleanup: figuring out what metadata they want, then will begin actual cleanup work.
Looking at access and restriction areas: can we all use similar or standard terminology?
Privacy and confidentiality data is siloed
Jennifer: Possible to compile US-level public safety/gun laws document, list any laws that affect public service?
Steve: HRC is looking at museum collection management systems (PastPerfect etc.) but ArchivesSpace would make sense for archives
Beth: Architecture is interested in ArchivesSpace
Jennifer: Fine arts finding aids are in PDF format, not in EAD. Would like to get them into TARO