Wikidata:Property proposal/reason for normal rank

reason for normal rank

Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: qualifier to indicate why a particular statement should have normal rank (and neither preferred nor deprecated rank). Avoid using it when there is no statement with another rank and no qualifier that might induce people to set deprecated rank. Sample use: a statement with an "end time" (P582) qualifier would generally have normal rank, not deprecated rank, and should be neither deleted nor overwritten
Data type: Item
Domain: statements and claims with normal rank (see Help:Ranking#rank), not preferred or deprecated
Allowed values: statement no longer current (Q103841141) or other
Example 1: United States (Q30) member of (P463) UNESCO (Q7809)
qualified with → statement no longer current (Q103841141)
Example 2: Brazil (Q155) member of (P463) Latin Union (Q123209)
qualified with → statement no longer current (Q103841141)
Example 3: Ronan Huon (Q654520) date of birth (P569) 1922
qualified with → new item for "less precise value"
Example 4: Dalia Mya Schmidt-Foß (Q66490714) Instagram username (P2003) californiadalia
qualified with → statement no longer current (Q103841141)
Example 5: Worle Library and Children's Centre (Q55163610) coordinate location (P625) 51°21'39.474"N, 2°55'35.922"W
qualified with → statement no longer current (Q103841141)
See also: Help:Ranking

Motivation

It seems to me that preferred rank is generally more easily understood than normal rank for statements that are no longer current. While we have reason for preferred rank (P7452), we lack a corresponding qualifier for normal rank. Obviously, it shouldn't be added to every statement with normal rank. See the description for when to use it. Please help improve it. --- Jura 10:49, 5 December 2020 (UTC)

Discussion

  •  Support Fits nicely with our semantics to give reasons for the other ranks. ChristianKl 11:25, 5 December 2020 (UTC)
  •  Oppose I see no need. United States (Q30) has more than 50 statements with member of (P463). Some of the currently valid statements have preferred rank, and some have normal rank. That is not good, as it makes the statements with normal rank harder to find. One of the most frequently asked questions about SPARQL queries is why a query doesn't find some statements. The answer usually is that the query finds statements of "best rank", and the statement in question has normal rank while other statements with preferred rank are the best rank.
I think it would be better to just give all statements here normal rank. It is too hard to maintain dozens of statements with preferred rank for the same property. When new statements are added, many users will not change the rank from the default normal rank.
Do you have any other kind of examples where it would be relevant to give a reason for normal rank? --Dipsacus fullonum (talk) 11:45, 5 December 2020 (UTC)
Q30 just seems messy. Q155#P463 has it mostly right. That ranks can change isn't really a reason not to use them. We actually do have ranks because one should change the rank and not the statement itself. Maintaining them on countries is really trivial if not done manually. Statements with end dates are frequently misunderstood and either deleted, overwritten or deprecated. --- Jura 11:55, 5 December 2020 (UTC)
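The "best rank" behaviour raised in the objection above can be sketched in a few lines of Python. This is only an illustration, not Wikibase's implementation: the `Statement` class, the `best_statements` function and the sample memberships are hypothetical, and the sketch assumes that best-rank selection returns the preferred statements when any exist, otherwise the normal-rank ones, and never the deprecated ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified model of a statement: a value plus a rank."""
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"


def best_statements(statements):
    """Return preferred statements if any exist, otherwise the normal-rank ones.

    Deprecated statements are never returned.
    """
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.rank == "normal"]


# Hypothetical data: one preferred, one normal, one deprecated membership.
memberships = [
    Statement("organisation A", "preferred"),
    Statement("organisation B", "normal"),
    Statement("organisation C", "deprecated"),
]

# Only the preferred statement is returned; the valid normal-rank one is hidden.
print([s.value for s in best_statements(memberships)])
```

Under these assumptions, a single preferred statement is enough to hide every normal-rank statement for the same property from a best-rank lookup, which is the surprise the objection describes.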
  •  Comment Maybe it would be better if this property was simply "reason for rank," to give it more flexibility. I'd also allow editors to explain why a statement is marked as preferred or obsolete. NMaia (talk) 09:24, 8 December 2020 (UTC)
    • Supposedly, we could do that and replace the other. We'd have to make sure the rank can be deduced from the item used as value, otherwise we lose a way to check if the rank matches the reason. --- Jura 14:20, 8 December 2020 (UTC)
    I posit that properties with a conditional meaning (i.e. their meaning is instance specific) are a bad data smell which encourages lower data veracity. In this case the meaning would change depending on the current rank, removing a method of data constraint. SilentSpike (talk) 18:49, 24 December 2021 (UTC)
  • @Moebeus, Ayack, SilentSpike, SixTwoEight, YULdigitalpreservation: @Tagishsimon, Nomen ad hoc, Tinker Bell, Ghouston: as you participated in the preferred rank property proposal, what's your view on this? --- Jura 11:58, 9 December 2020 (UTC)
    Sorry, I can't see the need for that.  Neutral therefore. Nomen ad hoc (talk) 13:22, 9 December 2020 (UTC).
  •  Weak support While I'm not fully sold on the need, I certainly don't see any harm. Moebeus (talk) 12:12, 9 December 2020 (UTC)
  •  Neutral Hmm, I'm unsure. I do like that it allows intention to be captured in the data, which makes it less prone to misunderstanding. However, what I don't like is that the application of this qualifier would be selective (i.e. there's always a reason a statement has a specific rank, but we wouldn't want to be adding this to every statement ever - I think the reasons will basically all boil down to "the data is valid"). In general, awareness of how ranks are intended to be used should be improved site-wide (i.e. editors need to stop deleting accurate data just because it has a temporal component to it). --SilentSpike (talk) 21:34, 9 December 2020 (UTC)
  •  Comment The motivation section sounds like this property would be more "reason this is not preferred rank" - if this is created, would that be a better label? I assume we have no need for a "reason this is not deprecated rank" property? ArthurPSmith (talk) 19:39, 10 December 2020 (UTC)
  •  Oppose This does not seem necessary. reason for deprecated rank (P2241) is not just about deprecated rank; as can be seen in its original proposal, it is a reason why a statement is not the top rank. If there are preferred statements, you can add reason for deprecated rank (P2241) to any statement that is not preferred to provide a reason why those statements are not preferred, regardless of whether they are of normal or deprecated rank. I personally think reason for preferred rank (P7452) ought to remove "rank" from the label so it can just provide a reason why a statement is preferred over others, regardless of the actual rank (if there are no preferred ranked statements the normal ones are clearly preferred over deprecated rank ones, etc.). In fact perhaps it would be best if we stepped away from using the same terminology as the ranks are named, e.g., maybe "reason for statement deprecation" and "reason for statement preference" would be better labels. --Uzume (talk) 06:21, 11 December 2020 (UTC)
    • I think in the meantime reason for deprecated rank (P2241) is only for deprecated rank (see its current description). I'm not convinced that the suggested relative use of the term (and preferred) would help users. --- Jura 07:31, 11 December 2020 (UTC)
    • @Uzume: If people start using reason for deprecated rank (P2241) for statements that aren't deprecated that likely leads to increased confusion about ranks and what it means to be deprecated and should be avoided. ChristianKl 10:58, 11 December 2020 (UTC)
      • @ChristianKl: Perhaps (depending on one's point of view) but methinks it is more confusing to introduce another property to label deprecated statements at a normal rank as well as preferred statements at a normal rank. This proposed property could be confusingly interpreted either way. If you look at any dictionary "normal" is always a sticky abstract definition. A reason for normalcy seems even more confusing. Methinks what you are looking for is something more along the lines of "reason for historic statement" to keep people from removing/deleting statements that do not represent the current situation. I personally see this as exactly what reason for deprecated rank (P2241) was originally intended for. It is unfortunate it got boxed and conflated by the Wikibase statement ranking names. You say it might cause confusion and I won't argue that. There is plenty of room for confusion due to the naming of the Wikibase ranks. However, you will find it much harder to craft machine queries (Wikibase is all about machine readable data) for "outdated" statements since there is no single qualifier property to label them with. Methinks if anything you are creating a system where the data can be more confusingly misinterpreted. From my point of view the Wikibase statement ranks have done much damage. Instead of considering "best" statements by Wikibase rank, we should be considering "best" by qualifiers. --Uzume (talk) 16:18, 11 December 2020 (UTC)
        • @Uzume: Statements at a normal level are not deprecated. Just because Berlin has a population of 3,644,826 in 2019 doesn't mean that the statement that it had a population of 3,644,826 in 2018 is deprecated. It's still a true statement that Berlin had a population of 3,644,826 in 2018.
An outdated statement has either point in time (P585) or end time (P582) as qualifiers and you can easily query for both if you care about timing.
If you want to filter more directly you can filter for statements that have start time (P580) but no end time (P582). ChristianKl 16:37, 11 December 2020 (UTC)
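The qualifier-based filtering suggested above can be sketched as follows. This is a minimal Python illustration over an in-memory list rather than an actual query against Wikidata: the dictionary layout, the function names `is_current` and `has_time_bound`, and the sample values are all hypothetical. Only the property IDs (P580, P582, P585) come from the discussion.

```python
START_TIME = "P580"
END_TIME = "P582"
POINT_IN_TIME = "P585"


def is_current(statement):
    """True if the statement has a start time (P580) but no end time (P582)."""
    qualifiers = statement["qualifiers"]
    return START_TIME in qualifiers and END_TIME not in qualifiers


def has_time_bound(statement):
    """True if the statement carries an end time (P582) or a point in time (P585)."""
    qualifiers = statement["qualifiers"]
    return END_TIME in qualifiers or POINT_IN_TIME in qualifiers


# Hypothetical statements; every one of them could keep normal rank.
statements = [
    {"value": "organisation A", "qualifiers": {"P580": "1946"}},
    {"value": "organisation B", "qualifiers": {"P580": "1954", "P582": "1986"}},
    {"value": "population figure", "qualifiers": {"P585": "2018"}},
]

current = [s["value"] for s in statements if is_current(s)]
time_bound = [s["value"] for s in statements if has_time_bound(s)]
print(current)
print(time_bound)
```

In this sketch the distinction between current and time-bound statements is derived from the qualifiers alone, without consulting the rank.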
@ChristianKl: I totally agree. My point is that the Wikibase statement rankings have no defined and solid meaning and frankly they shouldn't as that is what qualifiers are for. I really do not think we should have qualifiers that are trying to lock in their meaning. Just add the meaning via qualifiers to begin with. This is why I am opposed to this property proposal as it targets a property for a specific Wikibase statement ranking attempting to give the rankings some more solid meaning when the meaning should be all in the qualifiers and thus we do not need to consider even trying to assign or define meaning to rankings (they are inflexible). We would be far better off having Wikibase ranks removed (and during the interim replaced with a rank qualifier until we can manually replace them with more appropriate qualifiers).
"If people start using reason for deprecated rank (P2241) for statements that aren't deprecated that likely leads to increased confusion about ranks and what it means to be deprecated and should be avoided."
There is your problem right there. What does "deprecation" even mean? Dictionaries will tell you it means disapproval for something, i.e., it is the opposite of "preference" or "best". But what do any of those really mean? Well they are context dependent. If you want to know the current population of Berlin, the population from 2018 is deprecated. If you want to know what it was in 2018, then the 2018 statements are preferred, right? So I take objection with deprecated and preferred to begin with but this is only made worse by trying to shoehorn meaning into Wikibase statement rankings. Do you want meaning controlled by the Wikibase software or by Wikidata community and the statements they curate? As soon as you attempt to assign meaning to such a field you give it power. Let's not give power to a very small and inflexible Wikibase statement ranking system and instead embrace qualifiers that we can design and assign meaning to. --Uzume (talk) 17:37, 11 December 2020 (UTC)
If I want to know the current population of Berlin then I don't disapprove if someone tells me "the population from 2018 is X". It's not the answer to the question I asked but it's not a wrong answer. Deprecation is for statements that are actually wrong.
The ability to get single values to queries via wdt is quite useful and we essentially would get rid of that if we would get rid of ranks. ChristianKl 18:22, 11 December 2020 (UTC)
@ChristianKl: If something is actually wrong it would be better to correct it or remove/delete it than mark it with some sort of deprecation. As for your other assertion about getting single values, as far as I know there is no interface that guarantees such a thing. There are things like mw.wikibase.getBestStatements that return a set of statements filtered by "best" rank. It would be easy enough to have different qualifier based filters instead. I believe something similar can be done in SPARQL queries. There is nothing wrong with having multiple statements with the same claim/property and the same rank, even if that rank is preferred or the highest rank. Different statements might be preferred for different reasons. In fact we have reason for preferred rank (P7452) for such but there again, I question its value as preference is context dependent and thus the meaning should be all in the qualifiers and how you choose to use them. Ascribing meaning to Wikibase ranks is asking for trouble since it will invariably mean one thing in one context/claim, etc. and a different thing elsewhere over and over again effectively conflating and convoluting things. --Uzume (talk) 00:54, 12 December 2020 (UTC)
@Uzume: Throwing away the research showing that claims made by serious sources are wrong is very wasteful. Let's say a date of birth (P569) claim is wrong in VIAF and we have a bot that regularly imports data from VIAF. If we just remove the data it will be re-added the next time the bot runs. If we deprecate the value it doesn't get re-added as normal rank and the people over at VIAF have a chance to notice that a human editor over at Wikidata found that the value is false, so that VIAF can fix their data.
wdt never returns more than one value. It just gives the truthy value. Separately, the truthy values seem to be valuable enough, and enough people download the dump of the truthy values, that WMDE regularly publishes that dump separately from the full Wikidata data.
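The re-import argument above can be illustrated with a short Python sketch. It is not the logic of any actual bot: the function `values_to_add`, the dictionary layout and the sample values are hypothetical, and the sketch assumes an importer that skips any value already present on the item at any rank, including deprecated.

```python
def values_to_add(existing_statements, source_values):
    """Return the source values not yet present on the item at any rank."""
    known = {s["value"] for s in existing_statements}
    return [v for v in source_values if v not in known]


# Hypothetical case: the external source still reports a value that an
# editor has found to be wrong.
source_values = ["1950"]

# Wrong value kept on the item with deprecated rank: nothing is re-added.
kept_as_deprecated = [
    {"value": "1950", "rank": "deprecated"},
    {"value": "1951", "rank": "normal"},
]
print(values_to_add(kept_as_deprecated, source_values))

# Wrong value deleted instead: the importer adds it again on its next run.
deleted = [{"value": "1951", "rank": "normal"}]
print(values_to_add(deleted, source_values))
```

Under that assumption, keeping the wrong value with deprecated rank is what stops it from coming back, whereas deleting it leaves the importer with no record that the value was already reviewed.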
@ChristianKl: Yes, and that is another type of "deprecation". Just how many should be supported and will every user be schooled on them all? I agree what you are saying about tagging values from external sources that we know to be questionably incorrect is a good thing. But I think we can more flexibly do that with qualifiers rather than depending on Wikibase rankings and trying to ascribe any special meanings to such. I cannot say I have considerable experience with SPARQL query semantics but even there I question the value of assigning meaning to Wikibase statement ranks. --Uzume (talk) 01:39, 12 December 2020 (UTC)
It's not another type of deprecation. It's the standard deprecation that one does on Wikidata when one finds a wrong claim with a serious source. ChristianKl 02:02, 12 December 2020 (UTC)
@ChristianKl: This would seem to say otherwise: Special:Search/haswbstatement:P31=Q27949697. Despite your claim, there seem to be a large number (141 by my count) of reasons for deprecation. There is also a slightly shorter list here list of Wikidata reasons for deprecation (Q52105174). Quite a number of those have little to nothing to do with inaccuracies from a source. --Uzume (talk) 06:46, 12 December 2020 (UTC)
The point of the rank is that a user doesn't have to know about all those possible qualifier values to use them. They would need to know more if we would get rid of ranks. ChristianKl 12:43, 12 December 2020 (UTC)
  •  Comment Interesting discussion as we seem to come to different conclusions about the same question. --- Jura 08:05, 11 December 2020 (UTC)
  •  Comment Just for the scope of this discussion: I added links for "normal", "preferred", "deprecated" and "rank" to the proposal. Maybe we can discuss what "deprecation" means elsewhere --- Jura 07:49, 12 December 2020 (UTC)
  •  Comment looks like there is no consensus for this. So, I suppose one must add a "reason for preferred rank" to every preferred statement instead. --- Jura 16:40, 8 January 2021 (UTC)
  • Is there a reason we need separate properties for each rank? Couldn't we rename reason for deprecated rank (P2241) to "reason for current rank" (and then get rid of reason for preferred rank (P7452))? - Nikki (talk) 14:28, 29 January 2021 (UTC)
  •  Oppose So I've reviewed this proposal again (some significant time has passed) and the two opposition votes seem invalid since they misunderstand the use of the rank system. However, instead of creation I'm adding my own opposing vote, for the same point I gave above in the past: reason for normal rank seems to always boil down to "the data is valid" given that it is the default state. It seems to me this property would have a somewhat arbitrary use case since it presumably should not apply to every normal rank statement in Wikidata. Can someone give a specific use case or have I missed that? --SilentSpike (talk) 18:59, 24 December 2021 (UTC)
    Have a look at the countless statements at Q155#P463. I think it nicely matches the description provided with the proposal.
    I think the same oppose argument could have been made for the qualifier in use for preferred ranks and deprecated rank. They are not always used either. --- Jura 10:50, 28 December 2021 (UTC)
    My objection is not about whether they are always used, but whether it is appropriate to always use them. In the case you link, I think there's an argument for all of those to have "reason of preferred rank"="currently valid value" as it makes it obvious why they are preferred over the others. Setting preferred rank is not the default, thus providing an explanation is logical. Normal rank needs no explanation because it's always "the data is valid, but not preferred". SilentSpike (talk) 14:38, 28 December 2021 (UTC)
    I see your point of view. In general, it may work if there is just a single preferred rank.
    "Normal rank needs no explanation", I wish that too, but I don't think most people understand it that well. The ranks we had at Q30#P463 illustrate the contrary. --- Jura 14:53, 28 December 2021 (UTC)
  •  Strong oppose "Normal rank needs no explanation" I like that comment from Jura enough to end this discussion. -- ~Namita (talk) 20:19, 12 May 2022 (UTC)
Asking what needs an explanation is misguided. We have cases where we have a discussion. It's useful to be able to store the information of the discussion and the explanation so that it's visible in future. ChristianKl 15:27, 19 July 2022 (UTC)
If a statement is no longer valid then you should qualify it with end time (P582)... Lectrician1 (talk) 06:14, 24 December 2022 (UTC)