qualifiers for stating which part of a media file a statement applies to. These qualifiers (i.e. "start time in media" and "end time in media") should only be used in media files (i.e. audio and video files) on Wikimedia Commons.
I mainly plan to use them in conference presentations (e.g. WikidataCon 2017) since those are the media files that I mostly watch, but I'm sure people will be able to use them in the structured data of media files of any kind.
In Wikimedia Commons, there are long videos (00:57:31, 00:52:50, 00:48:37, 00:36:36) and long audio files (01:15:57, 00:45:31, 00:27:51, 00:18:51). The statements in SDC sometimes only apply to some parts of the media files. For example, an indigenous language can be spoken in a short part of a documentary, the main subject during part of a lecture can be a specific topic, or a person can be the speaker at multiple parts of a conference presentation. For this reason, we need qualifiers for stating which segment of the media file a statement applies to. I'll list some more examples.
In File:WIKITONGUES- Pau speaking French, the languages used are French, Lithuanian, Italian, English, Spanish, and Catalan. A user who finds this video might find it difficult to know the parts in which each language is spoken. These qualifiers can solve this issue when used as follows.
Another example: Consider File:35c3 WikipakaWG - AI in Wikipedia (eng).webm. Its duration is 37 min 13 s. Because of the title, we can say the main subjects are artificial intelligence and Wikipedia, but during the presentation other topics are discussed. These qualifiers can help specify those topics and the segments in which they are discussed.
Comment Someone in the Telegram group of Wikimedia Commons mentioned that these qualifiers could also be used in audio files. I've updated the proposal. The names of the qualifiers are now "start time in media" and "end time in media" instead of "start time in video" and "end time in video" so that they can also be used in audio files. --- Rdrg109 (talk) 15:16, 21 December 2021 (UTC)
@Jura1: The problem with the quantity data type is that it only allows a single unit. This means that if someone wants to specify a timestamp, they have to express the whole value in the smallest time unit it contains. For example, if someone wants to store 01:00:00, they would use 1 hour. If someone wants to store 01:32:00, they would use 92 minutes. If someone wants to store 01:12:37, they would use 4357 seconds. If someone wants to store 01:13:18.322, they would use 4398322 milliseconds. Having to convert the timestamp to different units makes it more difficult for the user to contribute, which might cause the property not to be used at all.
Because the data type of time index (P4895) is "quantity", I think that the proposed properties are more suitable.
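The single-unit conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the examples: the function name `to_single_unit` is hypothetical and is not part of Wikibase or any Wikidata tool, and it assumes the input always has the full HH:MM:SS form.

```python
from fractions import Fraction

def to_single_unit(timestamp: str) -> tuple[int, str]:
    """Convert 'HH:MM:SS[.mmm]' to (amount, unit) using the largest unit
    that still represents the value as a whole number."""
    hh, mm, ss = timestamp.split(":")
    # Fraction keeps fractional seconds exact (no float rounding).
    total_ms = (int(hh) * 3600 + int(mm) * 60) * 1000 + int(Fraction(ss) * 1000)
    for unit, size_ms in (("hour", 3_600_000), ("minute", 60_000), ("second", 1_000)):
        if total_ms % size_ms == 0:
            return total_ms // size_ms, unit
    return total_ms, "millisecond"

print(to_single_unit("01:32:00"))      # (92, 'minute')
print(to_single_unit("01:13:18.322"))  # (4398322, 'millisecond')
```

The sketch reproduces the four values given in the comment above, which shows how much mental arithmetic a contributor would otherwise have to do by hand.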
You can enter 18.322 seconds as 18.322 second. No need to input 18322 millisecond.
It's clear that it requires some thought about unit selection and value entry (but even date entry requires date precision to be determined). If you need help with that, please ask on project chat.
It may be easier to add "01:32", but a user couldn't really know what it means. Is it 1 hour and 32 minutes, or 1 minute and 32 seconds?
The quantity datatype does provide an automatic conversion to seconds on the query service, so it is fairly easy to figure out the values, even without checking units. --- Jura 14:40, 2 January 2022 (UTC)
@Jura1: We can avoid the confusion of taking "01" in "01:32" as hour or minute by using the same logic that FFmpeg uses. You can find information about that in this link. Here's the notation: [-][<HH>:]<MM>:<SS>[.<m>...] (omitting the leading hyphen, -). As you can see, the hours are only considered when there are three parts in the time. --- Rdrg109 (talk) 15:48, 12 January 2022 (UTC)
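As a rough illustration of how the notation above removes the ambiguity, here is a minimal Python sketch of a parser for [<HH>:]<MM>:<SS>[.<m>...] without the leading hyphen. The function name `parse_timestamp` and the regular expression are assumptions made for this example; they are not FFmpeg's own implementation, and the sketch does not check that minutes and seconds are below 60.

```python
import re

# [<HH>:]<MM>:<SS>[.<m>...] with the optional leading hyphen left out.
_TIME_RE = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{1,2})(\.\d+)?$")

def parse_timestamp(text: str) -> float:
    """Return the number of seconds represented by the timestamp."""
    match = _TIME_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid timestamp: {text!r}")
    hours, minutes, seconds, fraction = match.groups()
    # Hours are only present when the string has three colon-separated parts.
    total = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total + (float(fraction) if fraction else 0.0)

print(parse_timestamp("01:32"))     # 92.0 (1 minute and 32 seconds)
print(parse_timestamp("01:32:00"))  # 5520.0 (1 hour and 32 minutes)
```

With this rule, "01:32" can only mean 1 minute and 32 seconds, because a two-part value never contains hours.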
To me that is a clear argument for these properties. "Blade Runner depicts a unicorn at 72 minutes" is true as long as there is a unicorn visible at that time, whether it only just appeared or not. If you agree with that, then we're missing a way to enter the start time. If you disagree with it, then the property is ambiguous and we're missing a way to distinguish start time uses from point in time uses. - Nikki (talk) 09:52, 4 January 2022 (UTC)
Comment A more appropriate data type for these properties would be Duration, but because it apparently hasn't been fully integrated into Wikibase, I think that if these properties were to be created, they could be handled with String until the Duration data type is fully implemented. --- Rdrg109 (talk) 04:58, 2 January 2022 (UTC)
I wrote that ticket, but it doesn't really seem to be acted upon.
One of the suggestions in the ticket is to display quantity differently, so entering it as quantity already would allow values to be converted easily later.
@Jura1: In regards to the ticket: Hopefully, it'll be implemented some day. If not, we can use String for the time being.
In regards to using duration (P2047): I think the same problem happens: more friction for the user. The user would need to compute the difference between point 1 and point 2 in the video and then use that value for duration (P2047). With these properties, they would just need to use the time that is shown in their media player. Most players, if not all, show the time in the format [<HH>:]<MM>:<SS>[.<m>...], which is the one that would be used in these properties. --- Rdrg109 (talk) 16:15, 12 January 2022 (UTC)
I think the time format shown in one's player depends on the player's localization. I strongly disagree with the string datatype using a format used in English. While we wait for the duration datatype, quantity is much better as it includes the unit, so people don't have to guess to interpret the string, and it would be usable for all, independent of language, script system and culture. It will also be easy to make tools to convert to and from whatever format different people prefer. Wikidata is primarily intended to be accessed by computers, and datatypes should be intended for easy computer handling. --Dipsacus fullonum (talk) 18:41, 12 January 2022 (UTC)
Comment Why not speak of "start validity in media" and "end validity in media"? It would then also be possible to use it for a text. Ex.: a text depicts something from page 2 to page 5. --2le2im-bdc (talk) 20:14, 20 February 2022 (UTC)
We use page(s) (P304) for that. I don't think it would be a good idea to use this for both timestamps and page numbers, because they are not the same thing. Page numbers are also not strictly numeric or sequential (you can have both page v and page 5 in a book). - Nikki (talk) 17:24, 18 November 2022 (UTC)
Conditional support I support adding these but only if the datatype is quantity. I realise the interface is not ideal for that, but Wikidata is about storing structured data and the interface can theoretically be improved. If someone wants to enter it as plain text and try to enforce a consistent format for it, they can already do that in wikitext. It should be possible to make a script to both parse and display HH:MM:SS timestamps in Wikidata, but unfortunately I have no idea how to do that in Commons - most of the hooks that we use for scripts aren't currently used there. - Nikki (talk) 17:39, 18 November 2022 (UTC)
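The display half of such a script could look roughly like the following sketch, which renders a quantity stored in whole seconds as HH:MM:SS. The function name `format_seconds` is hypothetical, and an actual Wikidata or Commons user script would be written in JavaScript against the site's own hooks; Python is used here only to show the logic.

```python
def format_seconds(total_seconds: int) -> str:
    """Render a non-negative number of whole seconds as HH:MM:SS."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

print(format_seconds(4357))  # 01:12:37
```

Storing the value as a quantity and formatting it only at display time keeps the stored data independent of any one notation.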
@Rdrg109: the examples are not in the proposed datatype. Please either edit the examples to the proposed datatype or edit the status of the property to "on hold" to wait for a suitable datatype. ChristianKl ❪✉❫ 18:03, 4 December 2022 (UTC)