This project addresses a long-needed effort to bring together in one database the world-wide museum holdings on the fishes of Texas. Modern research and conservation efforts require high-quality data, and ideally lots of it. There is thus a critical need to bring occurrence data together, normalize and georeference them (so they can be mapped), correct errors, and provide them to the world. Before this project (pre-2006), museum data for Texas' fish were available only from many disparate and often hard-to-find sources, located in several countries and managed in various incompatible databases. Some of these museums lacked a digital record of their collections, having paper ledgers only. Many were small museums that did not offer their data online (although this is now changing quickly). Some had no catalog at all, except what was recorded on jar labels.

Finding, reformatting and merging data from these sources into one database was a critical task, and we have made the merged data even more useful through considerable editing and clean-up. Museums that did provide data varied considerably in how they managed them. Many rarely updated their databases as taxonomy changed or examined specimens as new information was learned. Spelling mistakes and other typographical errors are common among all data fields in most museums. These problems make useful queries difficult to impossible. Without addressing these issues, as we have now done, these data would be only moderately useful.

We have attempted to georeference all records from the state, although some could not be georeferenced because their locality descriptions were vague or internally inconsistent. We synonymized the taxonomy and, in many cases, examined specimens to verify identifications. Based on this very basic work, we have extended the known range of some species beyond what was once thought. In addition, we have formatted and edited dates and collector names. These additions make the database more useful, since data can now be queried on many fields, including geography and taxonomy.

This project is timely, since more and more large complementary data sets are becoming available online, as are new tools for complex data analyses. To date, much of what is thought to be known about Texas fish distributions is based on anecdote or on publications with identifications that cannot be verified. We believe it is difficult to overstate the utility of this high-quality database. Before this project, researchers would not have been able to find the records that we now provide in an easily searchable database. Even if researchers did find records, all of the error checking and verification steps that we have done would still be needed. Those steps benefited greatly from the sheer number of records that we have, since some of our data editing steps relied on the content of other records in the database. We now provide this database to researchers and the public so they can peruse it and use it for their own research interests.

Thus the highest quality data about where and when fish occur in Texas were largely inaccessible, and often not very useful even when they could be accessed. Anyone who did access them (or whatever portion they could find) for a specific research problem, perhaps for a specific species, had to clean them up themselves, a process that has been repeated many times over the years with varying levels of completeness. Those efforts have been sporadic and not usually done in ways that correct data at the source so future users can benefit. At the beginning of this project, to our knowledge, no one had tried to bring the data together into a single normalized database where all of the data could be queried together the way the Fishes of Texas project has now done.

These are some of the things we've done to fill this need:

  • find data
  • data entry (when needed)
  • re-formatting (normalizing)
  • compile data
  • georeference locations (apply spatial coordinates)
  • synonymizing taxa, collector names
  • detect errors (usually via visualization on a map)
  • verify/correct determinations
  • verify data against ledgers, labels, and fieldnotes
  • research manuscripts and other documents that can improve data quality
  • photograph specimens
  • photograph field notes and jar labels
  • preserve original data
  • publish data (including useful summaries)
  • publish research products (models, conservation areas)
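
As a rough illustration of three of the steps listed above (normalizing, synonymizing taxa, and detecting location errors while preserving original data), here is a minimal Python sketch. It is not the project's actual code or schema: the `Record` fields, the tiny `TAXON_SYNONYMS` table, and the approximate `TEXAS_BBOX` bounding box are all assumptions made for the example, and a simple bounding-box test only stands in for the map-based visual checks described above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical synonym table: cleaned lowercase name -> currently accepted name.
# A real table would be far larger and curated by taxonomists.
TAXON_SYNONYMS = {
    "notropis lutrensis": "Cyprinella lutrensis",   # older genus placement
    "cyprinela lutrensis": "Cyprinella lutrensis",  # misspelling
}

# Approximate bounding box for Texas (an assumption for this sketch):
# (min_lat, max_lat, min_lon, max_lon) in decimal degrees.
TEXAS_BBOX = (25.5, 36.6, -106.7, -93.4)


@dataclass
class Record:
    raw_name: str                  # name exactly as supplied; never overwritten
    lat: Optional[float] = None    # None when the record is not georeferenced
    lon: Optional[float] = None
    accepted_name: str = ""        # filled in by clean()
    suspect_location: bool = False


def normalize_name(raw: str) -> str:
    """Collapse runs of whitespace and lowercase the name."""
    return " ".join(raw.split()).lower()


def synonymize(raw: str) -> str:
    """Map a raw name to its accepted name, or return a tidied copy."""
    key = normalize_name(raw)
    if key in TAXON_SYNONYMS:
        return TAXON_SYNONYMS[key]
    return key.capitalize()


def outside_texas(lat: float, lon: float) -> bool:
    """True when the coordinates fall outside the rough Texas box."""
    min_lat, max_lat, min_lon, max_lon = TEXAS_BBOX
    return not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon)


def clean(record: Record) -> Record:
    """Add an accepted name and a location flag; raw_name is left untouched."""
    record.accepted_name = synonymize(record.raw_name)
    if record.lat is not None and record.lon is not None:
        record.suspect_location = outside_texas(record.lat, record.lon)
    return record
```

For example, `clean(Record("  NOTROPIS   lutrensis ", 30.27, -97.74))` keeps the raw name as supplied and sets the accepted name from the synonym table, while a record whose longitude lost its minus sign (97.74 instead of -97.74) would be flagged for review rather than silently changed.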