
Event:WikiCon Australia 2024/Submissions/Using OpenRefine & IRMNG to improve Australian Biodiversity: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Line 20:
# [https://w.wiki/ByXx For species with APNI ids (and no authority)]
# [https://w.wiki/ByXm For genera with AFD ids (and no authority)]
## [https://w.wiki/Byax for AFD arachnid genera] (limiting a query)
# [https://w.wiki/ByXp For species with AFD ids (and no authority)]



Revision as of 00:55, 15 November 2024


Using OpenRefine & IRMNG to improve Australian Biodiversity

Abstract/description

We

  1. demonstrate how to download a Darwin Core CSV file from IRMNG, which may represent the taxa named by a particular taxonomist. The list will not be complete, as IRMNG has many gaps compared with the Australian Faunal Directory and World Register of Marine Species taxon databases.
  2. import this file into OpenRefine and create a project.
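
Before importing, it can help to inspect the downloaded file outside OpenRefine. The following is a minimal sketch, assuming the export uses the standard Darwin Core column names `scientificName`, `scientificNameAuthorship` and `taxonRank`; the actual IRMNG export may carry more or differently named columns, and the sample rows are invented placeholders, not real taxa.

```python
import csv
import io

# Invented placeholder rows; a real IRMNG download will look different.
SAMPLE = """scientificName,scientificNameAuthorship,taxonRank
Examplea,"Author, 1900",Genus
Examplea typica,"(Author, 1900)",Species
"""

WANTED = ("scientificName", "scientificNameAuthorship", "taxonRank")


def read_taxa(handle):
    """Read a Darwin Core CSV and keep only the columns used in this sketch."""
    return [
        {key: (row.get(key) or "").strip() for key in WANTED}
        for row in csv.DictReader(handle)
    ]


rows = read_taxa(io.StringIO(SAMPLE))
# Keep only the genus-rank rows, ignoring case differences in the rank label.
genera = [r["scientificName"] for r in rows if r["taxonRank"].lower() == "genus"]
print(genera)
```

For a real download, replace `io.StringIO(SAMPLE)` with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the file was saved.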

In OpenRefine, we learn to

  1. reconcile columns... with taxon names (accept only perfect matches, NOT synonyms)
  2. create new columns
    1. by splitting a column
    2. by copying a column
    3. by using GREL functions such as substring, replace, indexOf ...
  3. subset for further processing (and using flags and stars)
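
The column operations above can be tried out in plain Python before doing them in OpenRefine. This is only a sketch: the comments name the GREL functions that roughly correspond to each step, the function names are hypothetical, and the example values are invented placeholders.

```python
def split_name_and_authority(value, words_in_name=2):
    """Split 'Genus species Author, 1900' into (name, authority).

    Roughly what splitting a column on spaces does in OpenRefine
    (GREL: value.split(" ")). Use words_in_name=1 for a genus.
    """
    parts = value.split(" ", words_in_name)
    name = " ".join(parts[:words_in_name])
    authority = parts[words_in_name] if len(parts) > words_in_name else ""
    return name, authority


def strip_brackets(authority):
    """Drop the parentheses around an authority (GREL: value.replace(...))."""
    return authority.replace("(", "").replace(")", "")


def year_of(authority):
    """Take the text after the last comma (GREL: substring with lastIndexOf)."""
    comma = authority.rfind(",")
    return authority[comma + 1:].strip() if comma != -1 else ""


name, authority = split_name_and_authority("Examplea typica (Author, 1900)")
print(name, "|", strip_brackets(authority), "|", year_of(strip_brackets(authority)))
```

Working on a copy of the column, as in step 2.2, keeps the original value available if a split goes wrong.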

An alternative approach

Using the following queries for APNI and AFD taxa:

  1. For genera with APNI ids (and no authority)
  2. For species with APNI ids (and no authority)
  3. For genera with AFD ids (and no authority)
    1. for AFD arachnid genera (limiting a query)
  4. For species with AFD ids (and no authority)

Modify these queries

  1. to pick a family
  2. or an author

and download the query result as a CSV file
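
The linked queries are not reproduced here, so the following is only a sketch of how such a query might be assembled and restricted, not the queries themselves. It assumes that "no authority" means the taxon name statement (P225) has no taxon author qualifier (P405), that P105 is taxon rank and P171 is parent taxon, and that P6039 and Q34740 are the Australian Faunal Directory ID property and the item for genus; check all of these on Wikidata before use. `build_query` and the family placeholder are hypothetical.

```python
def build_query(id_property, rank_item, family_item=None, limit=None):
    """Assemble a SPARQL query for taxa with an external ID and no author.

    All property and item IDs are passed in or assumed; verify them on
    Wikidata, since this sketch does not check them.
    """
    lines = [
        "SELECT ?taxon ?name ?externalId WHERE {",
        f"  ?taxon wdt:{id_property} ?externalId ;",
        f"         wdt:P105 wd:{rank_item} ;",
        "         p:P225 ?nameStatement .",
        "  ?nameStatement ps:P225 ?name .",
        # Assumed meaning of "no authority": no taxon author qualifier.
        "  FILTER NOT EXISTS { ?nameStatement pq:P405 ?author }",
    ]
    if family_item:
        # Restrict to descendants of one family via parent taxon.
        lines.append(f"  ?taxon wdt:P171+ wd:{family_item} .")
    lines.append("}")
    if limit:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Genera with an AFD id, capped at 100 rows (IDs are assumptions).
print(build_query("P6039", "Q34740", limit=100))
```

The resulting text can be pasted into the Wikidata Query Service, which offers a CSV download of the results.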

The tasks thereafter closely match those discussed above and include

  1. forming links to the APNI and AFD pages for the taxon
  2. grabbing the authority and the publication from these links

to create lists of authors, years of taxon publication, and publication names and pages. Again, we create a schema to upload the reconciled authors and publications to Wikidata.
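
The link-forming and authority-splitting steps might look like this in Python. The two URL patterns are assumptions based on how APNI and AFD identifiers are commonly resolved and should be checked against the live sites; the function names and the example authority strings are invented. Fetching the pages themselves is left out of this sketch.

```python
import re
from urllib.parse import quote

# Assumed URL patterns; verify against the APNI and AFD websites.
APNI_URL = "https://id.biodiversity.org.au/name/apni/{id}"
AFD_URL = "https://biodiversity.org.au/afd/taxa/{id}"

# Matches 'Author, 1900' and '(Author & Other, 1900)'.
AUTHORITY = re.compile(r"^\(?(?P<authors>.+?),\s*(?P<year>\d{4})\)?$")


def taxon_link(template, external_id):
    """Fill an identifier into a URL template, escaping unsafe characters."""
    return template.format(id=quote(str(external_id), safe=""))


def parse_authority(text):
    """Return (authors, year) from an authority string, or None if no match."""
    match = AUTHORITY.match(text.strip())
    if match is None:
        return None
    return match.group("authors").strip(), match.group("year")


print(taxon_link(AFD_URL, "example-id"))
print(parse_authority("(Author & Other, 1900)"))
```

In OpenRefine the same link can be built by adding a column based on the id column and prepending the URL prefix to `value`.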

What I am hoping to achieve

At the end of the session, participants will have learned

  1. how to create a project in OpenRefine
  2. why & how to facet
  3. how to split a column (and how to undo an action)
  4. how to reconcile a column against Wikidata
  5. how to create a schema for uploading data to Wikidata

Relationship to Wiki skills or to the theme

Learning how to use OpenRefine to import statements and items into Wikidata

Username/s

Session type & duration

4 x two-hour online sessions