Wikidata:Requests for permissions/Bot/BiodiversityBot 3
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved--Ymblanter (talk) 19:12, 14 October 2023 (UTC)[reply]
BiodiversityBot
Operator: Andrawaag
Task/s: TreatmentBank (Q54857867) is a CC0 resource that contains data on taxonomic treatments, treatment citations, figures, tables, material citations and bibliographic references. It contains valuable links mapping scholarly articles, taxa and taxonomic treatments. This bot task involves linking research papers on taxa to taxa and related taxonomic treatments.
Code: on GitHub
Function details: The bot fetches a record from TreatmentBank and aligns the scholarly article, the taxon record and the related taxonomic treatments found in that record with Wikidata. The item on the article is linked to the related taxa by main subject (P921). The taxon is linked to the taxonomic treatments through taxonomic treatment (P10594). I intend to operate the bot manually at first, which means that I will select the records to be aligned with Wikidata, starting with small sets. In a second stage, I will add daily updates from TreatmentBank. --Andrawaag (talk) 20:29, 15 January 2023 (UTC)[reply]
- What's the rationale for having both an exact match statement and the Plazi ID statement if they go to the same place? BrokenSegue (talk) 21:25, 16 January 2023 (UTC)[reply]
- The Plazi ID statement and the exact match statement are stored slightly differently.
- Plazi IDs are only stored using the string version of the ID (without the PREFIX), while the exact match uses the full URI format.
- wd:Q116211926 wdt:P1992 "55DC13A5-1D62-5AA0-8ABB-C1094124B7F1" ;
- and
- wd:Q116211926 wdt:P2888 <http://treatment.plazi.org/id/55DC13A51D625AA08ABBC1094124B7F1> ;
- Having the full URI format at skos:exactMatch can help in federated querying. We could argue that with BIND we can compose the full URI, but this is an expensive query operator. Having the full URI to link out in federated queries leads to more performant querying.
- Having said that, I have also added the "formatter URI for RDF resource" property on Plazi ID, which should resolve to the same URI and would indeed make the exact match property redundant. However, for this to work, the ".rdf" extension has to be added, since Plazi does not seem to have content negotiation in place. External RDF resources can use the URI without the .rdf suffix, which could make some federated queries fail.
- With this in mind, I decided to add the skos:exactMatch property as P2888 to the items, so that all (federated) query use cases can be catered for. Andrawaag (talk) 22:29, 16 January 2023 (UTC)[reply]
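The relationship between the two statements can be sketched in Python. This is only an illustration based on the single item quoted in this thread (Q116211926): the helper name `plazi_exact_match_uri` is hypothetical, and the rule that hyphens are simply dropped from the ID is an assumption inferred from that one example, not taken from the bot's code.

```python
# Prefix taken from the exact match URI quoted for Q116211926.
PLAZI_PREFIX = "http://treatment.plazi.org/id/"


def plazi_exact_match_uri(plazi_id: str, rdf_suffix: bool = False) -> str:
    """Compose the full treatment URI from a bare Plazi ID string.

    Assumption: the URI uses the ID with its hyphens removed, as in the
    one example from the discussion.
    """
    compact = plazi_id.replace("-", "")
    uri = PLAZI_PREFIX + compact
    # The ".rdf" variant is the form needed when no content negotiation
    # is available on the server side.
    return uri + ".rdf" if rdf_suffix else uri


uri = plazi_exact_match_uri("55DC13A5-1D62-5AA0-8ABB-C1094124B7F1")
print(uri)
```

Storing the composed URI as P2888 means a federated query can join on it directly instead of rebuilding it with BIND at query time, which is the trade-off described in the reply.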
- Here your bot deleted existing references and qualifiers of P225. I didn't check other edits. --Succu (talk) 17:54, 17 January 2023 (UTC)[reply]
- @Andrawaag: Same here. --Succu (talk) 22:08, 18 January 2023 (UTC)[reply]
- Oppose: And here! Please check your 89 „contributions”. Thx. --Succu (talk) 22:13, 18 January 2023 (UTC)[reply]
- How do you plan to connect treatments as Hydnellum nemorosum A. M. Ainsw. & E. Larss. 2021, sp. nov. (Q111983493) to the original publication Four new species of Hydnellum (Thelephorales, Basidiomycota) with a note on Sarcodon illudens (Q111983492)? --Succu (talk) 22:17, 19 January 2023 (UTC)[reply]
- I don't understand your question. I am not planning to invent any new link between treatments and the original publication. The only thing this bot does is link taxon names and treatments if there is an explicit link in plazi. Wikidata should not be a primary source. Andrawaag (talk) 21:20, 7 February 2023 (UTC)[reply]
- The bot links three items (one of the test runs):
- 1. Oxalis lourteigiana (Q116211006) <- taxon name
- 2. Oxalis lourteigiana Nuernberg-Silva & Fiaschi 2021 (Q116211004) <- treatment
- 3. Taxonomic revision and morphological delimitation of Oxalis sect. Ripariae (Oxalidaceae) (Q110696619) <- publication.
- taxon name and treatment are linked using taxonomic treatment (P10594) and
- taxon name and publication are linked using stated in (P248) Andrawaag (talk) 21:45, 7 February 2023 (UTC)[reply]
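The three-item test case can be written out as plain data to make the link directions explicit. This is a minimal sketch, not the bot's implementation: the `Statement` type and the `link_statements` helper are hypothetical names, while the QIDs and property IDs are the ones quoted in the list.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """A simplified subject / property / value triple (illustrative only)."""
    subject: str
    prop: str
    value: str


def link_statements(taxon: str, treatment: str, publication: str) -> list[Statement]:
    """Return the two links described in the list, both starting at the taxon name item."""
    return [
        Statement(taxon, "P10594", treatment),  # taxonomic treatment
        Statement(taxon, "P248", publication),  # stated in
    ]


links = link_statements(
    taxon="Q116211006",        # Oxalis lourteigiana
    treatment="Q116211004",    # Oxalis lourteigiana Nuernberg-Silva & Fiaschi 2021
    publication="Q110696619",  # Taxonomic revision ... of Oxalis sect. Ripariae
)
for statement in links:
    print(statement.subject, statement.prop, statement.value)
```

In this sketch no statement connects the treatment item to the publication item directly; both links start at the taxon name.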
- That's the way User:Christian Ferrer is using taxonomic treatment (P10594), created by UWashPrincipalCataloger after a controversial discussion. --Succu (talk) 21:40, 20 January 2023 (UTC)[reply]
- Succu I don't think I have ever used taxonomic treatment (P10594), not even once. Christian Ferrer (talk) 07:21, 21 January 2023 (UTC)[reply]
- Another example Q116211011. --Succu (talk) 22:40, 20 January 2023 (UTC)[reply]
- @Andrawaag: About Oxalis (sect. Ripariae) Lourteig 2000 Plazi says Q93379839. --Succu (talk) 22:52, 20 January 2023 (UTC)[reply]
- @Andrawaag: Psiloderces (Q5515527) has three values for Plazi ID (P1992) (referring to different articles). One treatment is linked to Fourteen new species of the spider genus Psiloderces Simon, 1892 from Southeast Asia (Araneae, Psilodercidae) (Q87009972) via taxon name (P225). How will your bot handle the relationship according to taxonomic treatment (P10594)? --20:58, 23 January 2023 (UTC)
- In the case of Psiloderces (Q5515527) and its three Plazi IDs (P1992): linking three Plazi IDs to the same taxon name is semantically inaccurate. The bot links the taxon name to the distinct Wikidata items that are instance of (P31) taxonomic treatment (Q32945461), using taxonomic treatment (P10594). Andrawaag (talk) 21:35, 7 February 2023 (UTC)[reply]
- The property was created exactly to do this: linking taxon name (P225) to Plazi IDs. --Succu (talk) 21:50, 8 February 2023 (UTC)[reply]
- Please note this discussion too. --Succu (talk) 22:04, 8 February 2023 (UTC)[reply]
Support Idea looks good. As we do not have clear standards on whether or not to have both the exact match statement and the ID, I am okay with either. However, I also agree that the bot should not delete the previous qualifiers; they are quite relevant. TiagoLubiana (talk) 19:50, 6 February 2023 (UTC)[reply]
- The bot has indeed overwritten a handful of references, but those edits have been reverted. Nice to have test edits for this. I have updated the bot and it will no longer overwrite. Later this week I will run an additional test run and report the results here, so we can move forward with this bot request. Andrawaag (talk) 21:17, 7 February 2023 (UTC)[reply]
- Any answers to my questions? --Succu (talk) 21:22, 7 February 2023 (UTC)[reply]
- Yes, see inline. Andrawaag (talk) 21:46, 7 February 2023 (UTC)[reply]
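The actual fix lives in the bot's repository and is not reproduced here. As a minimal sketch of the "append, do not overwrite" behaviour described in this thread, references can be merged so that whatever is already on a statement is always kept. The function name `merge_references`, the dictionary layout and the placeholder values are all assumptions made for illustration.

```python
def merge_references(existing: list[dict], incoming: list[dict]) -> list[dict]:
    """Return the existing references plus any incoming ones not already present.

    Existing entries are never dropped or reordered; the input lists are
    left unmodified.
    """
    merged = list(existing)
    for reference in incoming:
        if reference not in merged:
            merged.append(reference)
    return merged


# Placeholder values, not real Wikidata content.
existing_refs = [{"P248": "Q-existing-source"}]
incoming_refs = [{"P248": "Q-existing-source"}, {"P1992": "plazi-id-placeholder"}]

merged_refs = merge_references(existing_refs, incoming_refs)
print(merged_refs)
```

The same idea would apply to qualifiers: compare before writing, and only add what is missing.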
- Comment In the example with Oxalis lourteigiana (Q116211006), as far as I can see the taxonomic treatment and the publication are not linked to each other, though knowing where a taxonomic treatment comes from is one of the most important things in taxonomy. E.g. see Capillaster sentosus (Q1985550) or Phanogenia distincta (Q2187529), where, when there are several treatments, you can easily find where each treatment comes from. That way you can find every treatment from a publication, e.g. see the query https://w.wiki/6FRP showing all the treatments available in H.L. Clark (1938). Another example is a publication (Mah (2021)) with Plazi treatments available [1]. The Plazi web site also groups treatments by publication; otherwise that would not make sense IMO. Christian Ferrer (talk) 05:57, 8 February 2023 (UTC)[reply]
- Note also that the large majority of taxonomic treatments do not exist in Plazi, so if in Wikidata we decide to put the treatments in separate items such as Oxalis lourteigiana Nuernberg-Silva & Fiaschi 2021 (Q116211004), why not, but then it should be done in a way that lets us model all other potential treatments even if they don't exist yet in Plazi. E.g. the current label "Oxalis lourteigiana Nuernberg-Silva & Fiaschi 2021" is in fact the exact citation of the "name of the species + author citation"; what will the labels be for the next potential treatments of this species, and how will those treatments be modelled in separate items if they don't exist in Plazi? I'm really interested because I add a lot of treatments to Wikidata, and I like to be able to retrieve this kind of data easily from outside, e.g. in Wikispecies. Christian Ferrer (talk) 06:37, 8 February 2023 (UTC)[reply]
Question: In Novakiella trituberculosa (Q1310467) we have taxon redescription (Q42696902) claiming to be a subclass of taxonomic treatment (Q32945461): Guess this should be merged with emendation (Q1335348). --Succu (talk) 21:00, 8 February 2023 (UTC)[reply]
- No, here emendation (Q1335348) means change of taxon name, not of the description. This concept is treated in the ICZN code [2]. Christian Ferrer (talk) 05:34, 9 February 2023 (UTC)[reply]
- There is even an article named "Emendation" in the ICZN code [3]; furthermore, I quote: "Any demonstrably intentional change in the original spelling of a name other than a mandatory change is an "emendation" ". However, the German Wikipedia article seems to talk about a different concept (the change of definition/scope of a taxon). It's true that sometimes we can see in scientific articles things such as "Genus xxxx (emended)", but it is obviously not the same thing. Christian Ferrer (talk) 12:32, 9 February 2023 (UTC)[reply]
- OK, thx, but the description of taxon redescription (Q42696902) says "modification of an existing description". --Succu (talk) 17:03, 9 February 2023 (UTC)[reply]
- Nothing shocks me. To make a redescription you must have "an existing description", otherwise it is not a redescription but a "first description", and if the redescription is identical to the "existing description" then it is not really a redescription, hence the use of the word "modification". Its placement as a subclass of taxonomic treatment also seems obvious to me. Christian Ferrer (talk) 17:45, 9 February 2023 (UTC)[reply]
- In Description, redescription and revision of sixteen putatively closely related species of Echinoderes (Kinorhyncha: Cyclorhagida), with the proposition of a new species group – the Echinoderes dujardinii group (Q106699033) the term "Emended diagnosis" is used (referenced e.g. by Echinoderes gerardi (Q2620277)). emendation (Q1335348) as well as diagnosis (Q3025852) are a mixture of different topics, but both indicate different taxonomic treatment (Q32945461). An "Emendation" in the sense of the ICZN code is a treatment of its own. Not sure what to do. --Succu (talk) 19:13, 9 February 2023 (UTC)[reply]
- See this interesting article: [4]. Not sure we are still in the scope of this Requests for permissions for the BiodiversityBot 3. Christian Ferrer (talk) 19:56, 9 February 2023 (UTC)[reply]
- Info I just remembered the following query: there are currently nearly 5000 Plazi treatments (probably mainly added by me) stored in the reference sections:
List of items with taxonomic treatment that includes Plazi ID within the reference section of taxon name (query) Christian Ferrer (talk) 09:09, 5 March 2023 (UTC)[reply]
The issue mentioned above about overwriting references has been fixed. To kickstart this discussion towards approval I ran some additional test runs. Are there any outstanding issues I need to address to get this task approved? --Andrawaag (talk) 13:47, 6 October 2023 (UTC)[reply]