EOL Dynamic Hierarchy Data Sets
The Encyclopedia of Life (EOL, eol.org) aggregates biodiversity information from more than 400 sources and provides access to the data through taxon pages, visual query interfaces, and application programming interfaces. Scientific names are essential elements of the data integration infrastructure, but their shortcomings as key identifiers are well documented (Patterson et al., 2016). Complex automated workflows and continuous manual curation are required to address idiosyncrasies of source taxonomies, variation in data quality, and conflicting taxonomic opinions.
To achieve a harmonized taxonomic view of EOL content, names from data sources are mapped to a dynamic reference hierarchy (see current version here) using an algorithm that leverages canonical name strings, hierarchical information (ancestry, descendants), taxonomic ranks, synonym data, and author strings. Names that cannot be associated with a reference taxon are still accessible, but their unmapped status excludes them and any associated content from certain core EOL functions.
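The following Python sketch illustrates the general idea of matching a source name against reference taxa with the signals listed above. It is not EOL's actual algorithm: the `Taxon` fields, the additive scoring, the rule that ties stay unmapped, and the identifiers `ref-1`, `ref-2`, and `src-*` are all assumptions made for illustration. The example uses the homonym Morus, which is both a plant genus (mulberries) and a bird genus (gannets).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taxon:
    """Hypothetical record layout; field names are illustrative only."""
    taxon_id: str
    canonical: str
    rank: str = ""
    author: str = ""
    ancestors: frozenset = frozenset()
    synonyms: frozenset = frozenset()


def score(source: Taxon, ref: Taxon) -> int:
    """Return 0 if the names do not match, else a simple additive score."""
    ref_names = {ref.canonical.lower()} | {s.lower() for s in ref.synonyms}
    if source.canonical.lower() not in ref_names:
        return 0  # canonical name or synonym match is required
    total = 1
    if source.rank and source.rank == ref.rank:
        total += 1  # same taxonomic rank
    if source.author and source.author == ref.author:
        total += 1  # same author string
    total += len(source.ancestors & ref.ancestors)  # shared ancestry
    return total


def map_name(source: Taxon, reference: list) -> Optional[str]:
    """Return the best reference taxon id, or None if unmapped or ambiguous."""
    scored = sorted(
        ((score(source, ref), ref.taxon_id) for ref in reference),
        reverse=True,
    )
    scored = [item for item in scored if item[0] > 0]
    if not scored:
        return None  # no candidate shares the name
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie between candidates: leave the name unmapped
    return scored[0][1]


# Toy reference hierarchy containing two homonymous genera.
REFERENCE = [
    Taxon("ref-1", "Morus", "genus", "L.", frozenset({"Plantae", "Moraceae"})),
    Taxon("ref-2", "Morus", "genus", "Vieillot", frozenset({"Animalia", "Sulidae"})),
]

bird = Taxon("src-1", "Morus", "genus", ancestors=frozenset({"Animalia", "Aves"}))
bare = Taxon("src-2", "Morus", "genus")

print(map_name(bird, REFERENCE))
print(map_name(bare, REFERENCE))
```

In this toy setup, the source record that carries ancestry information is resolved to the bird genus, while the record with only a name and rank matches both homonyms equally and stays unmapped. This mirrors, in a very simplified way, why hierarchical and author information matter when canonical name strings alone are ambiguous.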
For more information about the EOL taxonomy, see EOL Dynamic Hierarchy.