General toolkit for working with VertNet data. We call the tools built from it "data migrators." Once customized to an original data source, a migrator converts the original data into Darwin Core ready for upload to an Integrated Publishing Toolkit (IPT) resource.

VertNet/toolkit

VertNet Darwin Core Data Migrator Toolkit

Scripts and databases to migrate source data to Darwin Core ready for publication via IPT.

The file README - MigratorPrepSteps.txt describes the steps that must be modified to create a migrator customized for a given data set.

The migrator uses Microsoft Access and requires that Unix shell commands be available in the environment from which the migrator's DOS .BAT scripts are invoked.

  • BlankLineIssues.awk - Script to report unexpected blank lines in a CSV file.
  • NewLineIssues.awk - Script to report records having a new line in the field content in a CSV file.
  • EncodingIssues.awk - Script to report records having UTF8 encoding issues.
  • PurgeNonprintingCharacters.sh - Script to substitute '.' for non-printing characters in data content.
  • PurgeNuls.sh - Script to remove the NUL characters from a file encoded as UTF-16 to render it as UTF-8.
  • PurgeVerticalTabs.sh - Script to remove all vertical tab characters from a file.
  • RemoveLastLine.sh - Script to remove the final line in a file.
  • utf8er.awk - Script to prepend the byte order mark (0xEF 0xBB 0xBF) to a CSV file known to be UTF-8 encoded.
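
To make the checks listed above concrete, here is a minimal Python sketch of the kind of work several of these scripts describe. It is not the repository's awk or shell code, which is not reproduced here. All function names are hypothetical, and the behavior is an assumption based only on the one-line descriptions above.

```python
import csv
import io

# UTF-8 byte order mark, as named in the utf8er.awk description.
UTF8_BOM = b"\xef\xbb\xbf"


def blank_line_numbers(text):
    """Return 1-based numbers of lines that are empty or whitespace-only."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # ignore the empty piece after a trailing newline
    return [n for n, line in enumerate(lines, start=1) if not line.strip()]


def records_with_embedded_newlines(text):
    """Return 1-based CSV record numbers with a line break inside a field."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [
        n
        for n, row in enumerate(reader, start=1)
        if any("\n" in field or "\r" in field for field in row)
    ]


def has_invalid_utf8(data):
    """Return True if the bytes cannot be decoded as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def purge_vertical_tabs(data):
    """Remove every vertical tab byte (0x0B)."""
    return data.replace(b"\x0b", b"")


def prepend_bom(data):
    """Prepend the UTF-8 BOM. Skipping data that already starts with one
    is a choice made for this sketch, not something the README states."""
    return data if data.startswith(UTF8_BOM) else UTF8_BOM + data
```

The text-based helpers take an already decoded string and the byte-based helpers take raw file contents, so the encoding check can run before any decoding is attempted.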
