This document provides a structure for normalizing data. It is by no means an exhaustive treatise on the topic, nor is it authoritative. It is simply this: if you want to get your data into a standard order, here's what NPL did, how we did it, and what I wish we had done. While it may not be the 'best' way, it is at the very least a way to get everything standard so that tweaking procedures becomes easier.

I am basing the categories on the Darwin Core terms found at rs.tdwg.org/dwc/terms/index.htm, in the order they are listed on that webpage. I'm not covering all of them, only the ones I would normally use.

One of the strongest tools in the data-nitpicking tool kit is OpenRefine. This application takes a spreadsheet and massages it into a database-ready format.

See the 'Refine Help' page for instructions on how to apply the snippets of code.