Wikidata:Property proposal/Lexemes
See also
- Wikidata:Property proposal/Pending – properties which have been approved but which are on hold waiting for the appropriate datatype to be made available
- Wikidata:Properties for deletion – proposals for the deletion of properties
- Wikidata:External identifiers – statements to add when creating properties for external IDs
- Wikidata:Lexicographical data – information and discussion about lexicographic data on Wikidata
This page is for the proposal of new properties.
Before proposing a property
- Search if the property already exists.
- Search if the property has already been proposed.
- Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
- Select the right datatype for the property.
- Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
- Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.
Creating the property
- Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
- Creation can be done 1 week after the creation of the proposal, by a property creator or an administrator.
- See property creation policy.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2025/01.
Wikibase lexeme
Description | indicates the relationship between equivalent Javanese lexemes across registers (social variants), mainly the ngoko (Q12500634) register (plain Javanese), the krama (Q12492493) register (high/polite Javanese), and the madya (Q13091955) register (middle Javanese) |
---|---|
Data type | Lexeme |
Domain | lexeme senses, in particular forms with spelling alternatives |
Example 1 | kowé/kowe/ꦏꦺꦴꦮꦺ (L2328) ("ngoko" register) and sampéyan/sampeyan/ꦱꦩ꧀ꦥꦺꦪꦤ꧀ (L1322036) ("krama" register) both mean "you" but differ in social register: the former is considered casual, the latter more formal and polite. For reference, see the online Javanese dictionary at https://www.sastra.org/leksikon (tick the "kata utuh" checkbox when searching to exclude partial matches). For more information on ngoko/krama, see the introduction to this Javanese-English dictionary: https://www.sastra.org/bahasa-dan-budaya/kamus-dan-leksikon/1703-javanese-english-dictionary-horne-1974-1968, especially sections 4.1 (Organization of the Entries) and 5 (Social Styles). See also: en.wp, https://jv.wiktionary.org/wiki/Wikisastra:Tabel_krama-ngoko (jv.wikt) |
Example 2 | (update 18 August) gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama) |
Example 3 | (update 18 August) endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil) |
Motivation
I'm planning to add more Javanese lexemes, but many words exist in several registers, and using synonym (P5973) is not correct: although such words have the same meaning, they have different usage, and there are also many synonyms within the same register (for example, "you" has 4 or more synonyms in "ngoko", and 3 or more different words in "krama"). A dedicated property would make it possible to search and query the relationship between different registers. As you can see from the links provided above, the relationship between these registers is not one-to-one. While the "ngoko" form is considered the default, not all "ngoko" words have a "krama" equivalent (only about 1000 without affixation, many more with affixation), still fewer have "madya" or other register equivalents ("krama inggil", etc.), and some "krama" words are equivalent to several "ngoko" words, because they are not true synonyms but rather substitution words for different social contexts. Therefore this property should support multiple relationships. For example:
- "you"
- ngoko: kowe, (synonym: ko'ên, kohên, kowên)
- madya: samang, andika, (synonym: dika)
- krama: sampeyan, (synonym: bênampeyan, bênangpeyan)
- krama inggil: panjênêngan, (synonym: nandalêm, paduka)
- "to say, to tell"
- ngoko: kandha, (synonym: clathu, ngomong, kêcap, wara, gotèk, cluluk, wuwus, etc.)
- krama: criyos, sanjang, (synonym: sajang, wicantên, etc.)
- krama andhap: matur
- krama inggil: andika, ngêndika, (synonym: unandika)
- kawi: angling
- (things related to hand / "tangan")
- ngoko: tangan, krama inggil: asta, simple noun, but the verbs get complicated:
- krama inggil: ngasta (ng- + asta) serve as substitutions for ngoko: 1 nyambut gawe (to work), 2 nggawa (to bring, take, carry), 3 nandang (to do), 4 nyekel (to hold, grasp, to handle), 5 mulang (to teach)
Bennylin (talk) 18:23, 9 August 2024 (UTC)
Update 18 August
Just to make it clearer, on behalf of Javanese speakers, we would like to request 5 new properties:
- ngoko variations (see ngoko (Q12500634))
- madya variations (see madya (Q13091955))
- krama variations (see krama (Q12492493))
- krama inggil variations (see krama inggil word (Q16893583))
- krama andhap variations (see krama andhap word (Q66724909))
The first and foremost reasoning is that most Javanese dictionaries (monolingual, bilingual jv-id, jv-en, jv-nl) separate Javanese lexemes into mainly these 5 registers and link to their counterparts seamlessly. Secondly, the current available property (synonym (P5973)) doesn't fit our need for specific-linking from one lexeme to another - besides, synonymy in Javanese is called dasanama (lit. ten names), instead of register (Jv: unggah-ungguh) - and in the future I believe using these 5 new properties would make it much easier to "transform" words, phrases, sentences from one register to another (e.g. via WikiFunctions or other tools).
I've given in the form above two new examples:
- mountain: gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
- L680638-S1, instead of having property "synonym: L45622-S1", should instead have property "krama variations: L45622-S1"
- Likewise L45622-S1, instead of having property "synonym: L680638-S1", should instead have property "ngoko variations: L680638-S1"
- both lexemes could have the following synonyms: ancala, indra, endra, ancala, ardi; ardya, arga, asalingga, awukir, aldaka, hyang parwata, imandri, himawan, himawat, nala, cala, dri, tambana, wanawasa, wukir, wukira, parsa of parswa, parasu, parswa = paraswa, praswa, parwaka, par(of pwar)wata, prawata, parja, pradesa, pra(of prê)bata, par(of pêr)bata, par(of pêr)bwata, par(of pêr)byata, padaka, jambangan, mahahimawan, mahendra, mèru, malaya, gana, gunungan, giri, gori, girindra, girinata, gorata, giriwara, gêgêr, basulingga, byata, ngasrama. These all mean "mountain" in Javanese.
- head: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)
- endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S1) and S3 (ngoko), should have "krama variations: L999025-S1" only, while
- endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2) (ngoko), should have "krama variations: L999025-S1", and "krama inggil variations: L413863-S1", while
- sirah/ꦱꦶꦫꦃ (L999025-S1) (krama), should have "ngoko variations: L413183-S1, S2, S3", and "krama inggil variations: L413863-S1", and
- sirah/ꦱꦶꦫꦃ (L999025-S2) (ngoko and krama), has no other variations
- mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1) (krama inggil) should have "ngoko variations: L413183-S2", and "krama variations: L999025-S1"
- mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S2) and S3 (ngoko and krama), has no other variations
- all three lexemes could have the following synonyms: utamăngga, hulu, cêngêl, rajawèni, katumangga, katumăngga, kapala, kumba, têndhas, swa, sidhira, pasuhunan, murda, mukyana. All of them mean "head".
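The "head" example above can be written down as plain data to show what the requested properties would store. This is only a sketch: the keys are the proposed property labels (they are not existing Wikidata properties), the sense IDs are the ones quoted in the list, and `variants_of` is a hypothetical helper introduced here for illustration.

```python
# Sense-to-sense links from the "head" example, keyed by the proposed
# (not yet existing) property labels. Sense IDs are copied from the proposal.
REGISTER_LINKS = {
    "L413183-S1": {"krama variations": ["L999025-S1"]},
    "L413183-S2": {
        "krama variations": ["L999025-S1"],
        "krama inggil variations": ["L413863-S1"],
    },
    "L413183-S3": {"krama variations": ["L999025-S1"]},
    "L999025-S1": {
        "ngoko variations": ["L413183-S1", "L413183-S2", "L413183-S3"],
        "krama inggil variations": ["L413863-S1"],
    },
    "L413863-S1": {
        "ngoko variations": ["L413183-S2"],
        "krama variations": ["L999025-S1"],
    },
}


def variants_of(sense_id, register_property):
    """Return the senses linked from sense_id via one register property.

    Senses without any register link (e.g. L999025-S2) yield an empty list.
    """
    return REGISTER_LINKS.get(sense_id, {}).get(register_property, [])


print(variants_of("L413183-S2", "krama variations"))
print(variants_of("L999025-S1", "krama inggil variations"))
print(variants_of("L999025-S2", "ngoko variations"))
```

Under this model, a lookup is a direct read of one sense's statements and does not need to filter the full synonym list by register.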
Discussion
- Support Thersetya2021 (talk) 14:39, 13 August 2024 (UTC)
- Support Empat Tilda (talk) 01:20, 14 August 2024 (UTC)
- Support Alfiyah Rizzy Afdiquni (talk) 04:41, 14 August 2024 (UTC)
- Comment What's wrong with using something like language style (P6191) or variety of lexeme, form or sense (P7481) for this purpose? (Korean suffixes currently mark the register in which they are used with the former of these properties.) Mahir256 (talk) 16:53, 14 August 2024 (UTC)
- I don't think you get what I mean, so I am going to give another example later. Meanwhile could you give the link for said Korean suffixes, and preferably lexemes? Bennylin (talk) 12:03, 18 August 2024 (UTC)
- @Bennylin: There are a number of registers used in Korean, such as hasoseo-che (Q115744995), hapsyo-che (Q115744896), haeyo-che (Q115744904), and hae-che (Q115744915), where each is named for the verb meaning 'to do' in that language with the appropriate suffix used for indicative sentences in that language. The interrogative suffixes 나이까 (L749506), ᆸ니까 (L749614), and ᆯ까 (L1346003), to give examples of specific lexemes, have the same meaning(s) but differ only in the register used. More generally, though, it is not clear from this proposal why register differences between vocabulary items (especially register differences within a single language) should be treated differently from other stylistic differences between words in other languages with the same meaning (and indeed, the property 'language style', usable with a lot of language styles broadly construed, has at least five aliases containing the word 'register' in it) when an application (such as Ninai/Udiron and its deployment as Elemwala) can filter for senses in a language with particular language styles without requiring specialized links for them. Mahir256 (talk) 21:48, 20 August 2024 (UTC)
- Give a simple query each for these questions:
- What is the krama (Q12492493) for endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2)?
- What is the krama inggil word (Q16893583) for sirah/ꦱꦶꦫꦃ (L999025-S1)?
- What is the ngoko (Q12500634) and krama (Q12492493) for mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1)?
- Bennylin (talk) 10:35, 22 August 2024 (UTC)
- They're incorrect
- The krama variant for endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2) is only one: sirah/ꦱꦶꦫꦃ (L999025-S1). The rest of them, while they have the krama register, are not the krama _for_ endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2).
- The ngoko variant for mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1) is only one: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2). The rest of them, while they have the ngoko register, are not the ngoko _for_ mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1).
- So, you see, many synonyms of endhas/sirah/mastaka (head) have the register ngoko, krama, or both, but none of them is paired as _the_ register variant to the triplet endhas/sirah/mastaka. Therefore we need dedicated properties to store these values. Most relations are one-to-one; a few are one-to-two or two-to-one, but never one-to-many. Bennylin (talk) 11:04, 23 August 2024 (UTC)
- @Mahir256, would you like to give your opinion? Regards, ZI Jony (Talk) 18:31, 16 September 2024 (UTC)
- @Mahir256, would you like to give your opinion based on the response? Regards, ZI Jony (Talk) 08:54, 10 October 2024 (UTC)
- I am going to Oppose only because this property (these properties?) is specific to one language and has not as clearly been demonstrated to be useful (and distinct from language style (P6191)) for other languages with similar phenomena. While I don't claim to intuitively understand the system described better than the speakers that use it, from the links provided I'm not as convinced that the alignments are as rigid as claimed by the last response from the property's proposer (except perhaps for the Wiktionary table, although this was created/edited only by this property's proposer and has no external references provided on it); indeed, section 5.6 and 5.7 of the Horne dictionary's front matter suggests that there is fluidity in these correspondences. (There is also no need to add 'synonym' relationships between every pair of possible synonyms if the senses can be linked via item for this sense (P5137) to the same item.) If indeed there is a stronger correspondence between e.g. 'endhas' and 'sirah' compared to with e.g. 'utamangga' and 'pasuhunan' (despite the same Wikidata item being applicable to both), and if this correspondence may be substantiated with a more precise source, then this could perhaps be indicated using the synonym property only between the senses involved in that stronger correspondence qualified with 'language style' pointing to the register in question: 'endhas' 'synonym' 'sirah' ('language style' 'kromo'). Mahir256 (talk) 02:29, 11 October 2024 (UTC)
Presisov večjezični slovar ID
Description | entry for a lexeme in the online edition of Presisov večjezični slovar |
---|---|
Represents | Presisov večjezični slovar (Q130758466) |
Data type | External identifier |
Example 1 | kam (L346529) 11628302 |
Example 2 | qumësht (L1355532) 11628302 |
Example 3 | ka (L1358646) 11789269 |
Example 4 | мајка/majka (L226791) 11606270 |
Example 5 | klobása (L1213785) 8164671 |
Number of IDs in source | 372008 |
Formatter URL | https://www.termania.net/slovarji/presisov-vecjezicni-slovar/$1/_ |
Motivation
Presisov večjezični slovar (Q130758466) is a comprehensive dictionary of Albanian which can be used to add references to existing Albanian lexemes and new ones. It includes glosses in Albanian, Slovenian, English, German, French, and Serbo-Croatian, making it a potentially useful cross-reference for contributions in other languages as well. Entries include lists of forms in addition to senses. -عُثمان (talk) 16:29, 2 November 2024 (UTC)
I have updated this proposal with some examples reflecting that this dictionary has multilingual headwords. --عُثمان (talk) 13:26, 3 November 2024 (UTC)
Discussion
- Comment This should properly be considered a multilingual dictionary (as the term "večjezični" indicates); the same entry for 'ka' also appears as 7989528 (with the focus on German 'Ochse') and 8214488 (with the focus on Slovenian 'vol'). Mahir256 (talk) 21:23, 2 November 2024 (UTC)
- I have updated the proposal with some examples reflecting this عُثمان (talk) 13:26, 3 November 2024 (UTC)
Sanzhi Dargwa dictionary ID
Description | entry for a lexeme in Diana Forker’s Sanzhi Dargwa dictionary |
---|---|
Represents | Sanzhi Dargwa dictionary (Q125749659) |
Data type | External identifier |
Example 1 | вахт/واخت/vaxt (L1357656) LX003968 |
Example 2 | инжир/اینژیر/inƶir (L1322127) LX001829 |
Example 3 | муцІур/موڗور/muⱬur (L301647) LX002412 |
Formatter URL | https://dictionaria.clld.org/units/sanzhi-$1 |
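As a rough illustration of how the formatter URL in the table is meant to be used, the sketch below substitutes an example ID for the `$1` placeholder. The helper name `expand_formatter_url` is hypothetical, and plain string replacement is an assumption made for brevity; a real consumer may additionally percent-encode the identifier.

```python
def expand_formatter_url(formatter_url, identifier):
    """Substitute an external identifier for the $1 placeholder."""
    return formatter_url.replace("$1", identifier)


# Formatter URL and example ID copied from the proposal table.
FORMATTER_URL = "https://dictionaria.clld.org/units/sanzhi-$1"

print(expand_formatter_url(FORMATTER_URL, "LX003968"))
```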
Motivation
This proposal is for a property linking to entries in Sanzhi Dargwa dictionary (Q125749659) from Dargwa lexemes. The dictionary is available as a downloadable data set from Dictionaria, and contains usage examples and some grammatical details. -عُثمان (talk) 13:39, 3 November 2024 (UTC)
Discussion
FVDP Vietnamese dictionary ID
Description | entry for a lexeme in the Free Vietnamese Dictionary Project’s monolingual Vietnamese dictionary |
---|---|
Represents | FVDP Vietnamese dictionary (Q130812916) |
Data type | External identifier |
Example 1 | xanh (L705061) 573415 |
Example 2 | gặp (L1011653) 104184 |
Example 3 | chuột (L1360864) 61855 |
Formatter URL | https://www.informatik.uni-leipzig.de/~duc/TD/td/index.php?bpos=$1&db=vv |
Motivation
This property is proposed for use as a reference to link to Vietnamese lexemes. -عُثمان (talk) 22:26, 3 November 2024 (UTC)
Discussion
- Support Mahir256 (talk) 22:44, 3 November 2024 (UTC)
Oppose I'm hesitant about claiming this ID means anything more than a very specific way to form a particular URL. I would support an external reference property about each of the FVDP dictionaries, but I think this property as proposed should be limited to qualifiers.
The Free Vietnamese Dictionary Project (FVDP) consists of a DICT (Q977872) Web server and desktop client, software to generate compatible dictionaries, and a collection of precompiled dictionaries compiled by a long-gone group of volunteers. [1] The software is licensed as open source, but I have no idea where to find the source code anymore. The provided dictionaries are available in two formats: StarDict Info (Q105858121) (which can be used with any compatible client and server) and a custom format specific to this client and server. [2]
This proposal relies on the FVDP Web server's bpos URL query parameter, which indicates the byte offset of the entry within the dictionary's index file (in which each entry is listed alphabetically, separated by 8 bytes). Specifically, it assumes the "DE1" server, one of two DICT servers that the author Hồ Ngọc Đức (Q102291268) runs out of the University of Leipzig. If you plug the same byte offset into a different server, it will likely return a different entry. For example, xanh (L705061) is 573415 on "DE1" but 573297 on "DE3" (which is currently malfunctioning). Another popular instance ("US2") no longer exposes bpos at all.
As I understand it, the purpose of this property is to durably link to an entry in the dictionary from a lexeme that inherently pertains to a specific word. The differences in byte offsets between servers illustrate that this is not an inherent property of a dictionary entry. The offsets have changed over time for a variety of reasons, such as adding more "00" front-matter entries and deleting duplicate entries. Moreover, the byte offset doesn't seem to be useful for offline distributions of this content. I think a primary external reference should follow wikt:Template:R:FVDP and its translations, which set the word parameter to the word itself. This would be a good way to indicate that the dictionary spells hóa/hoá differently in hóa đơn versus hoá nhi for no particular reason.
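To make the instability argument concrete, here is a toy model of a byte-offset index. It is not the actual FVDP index format: the layout (UTF-8 headword followed by a fixed 8-byte separator) is an assumption loosely based on the description in this comment, and `byte_offsets` and the front-matter entry name are hypothetical.

```python
def byte_offsets(headwords, separator_bytes=8):
    """Map each headword to the byte offset where its index entry starts.

    Toy layout (an assumption): each entry is the UTF-8 headword followed
    by a fixed-size separator.
    """
    offsets = {}
    position = 0
    for word in headwords:
        offsets[word] = position
        position += len(word.encode("utf-8")) + separator_bytes
    return offsets


# Two hypothetical servers that differ by one front-matter entry.
server_a = byte_offsets(["gặp", "xanh"])
server_b = byte_offsets(["00-front-matter", "gặp", "xanh"])

# The same headword ends up at a different offset on each server.
print(server_a["xanh"], server_b["xanh"])
```

Any entry added or removed earlier in the file shifts every later offset, which is why an offset-based ID identifies a position on one server rather than the entry itself.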
Kamus Dewan Edisi Tiga ID
Description | entry for a Malay lexeme in Kamus Dewan Edisi Tiga in Dewan Bahasa dan Pustaka’s Gerbang Kata |
---|---|
Represents | Kamus Dewan Edisi Tiga (Q131448531) |
Data type | External identifier |
Example 1 | kamus/قاموس (L184226) 176953 |
Example 2 | pelajar/ڤلاجر (L121340) 159216 |
Example 3 | edisi/ايديسي (L618706) 167417 |
Number of IDs in source | 44,179 |
Formatter URL | http://ekamus.dbp.gov.my/Makna.aspx?kid=$1 |
Motivation
This property represents a series of entries in Kamus Dewan Edisi Tiga hosted by Dewan Bahasa dan Pustaka's Gerbang Kata. This is in line with the suggestion to propose separate properties per dictionary, Kamus Dewan Edisi Keempat having recently been approved for its own property. -عُثمان (talk) 14:31, 15 December 2024 (UTC)
Discussion
- Support Mahir256 (talk) 14:56, 15 December 2024 (UTC)
Comprehensive Historical Dictionary of Ladino entry ID
Description | identifier for an entry in Avner Perez's Judaeo-Spanish dictionary hosted on folkmasa.org |
---|---|
Data type | External identifier |
Domain | Judaeo-Spanish lexemes |
Allowed values | [1-9][0-9]+ |
Example 1 | adjustar/אגֿוּסטאר (L1348606) → 4130 |
Example 2 | agora/אגורה (L1348605) → 7860 |
Example 3 | agua/אגואה (L8333) → 8710 |
Source | http://folkmasa.org/milon/pmilonh.htm |
Planned use | add to existing Judaeo-Spanish lexemes |
Number of IDs in source | 38,177 |
Expected completeness | eventually complete (Q21873974) |
Formatter URL | https://folkmasa.org/milon/milon41test.php?mishtane=$1 |
See also | Ma'agarim ID (P11280), Jiddisch-Nederlands Woordenboek ID (P11562), Diccionario de la lengua española entry ID (P12529) |
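The "Allowed values" pattern in the table can be checked against the example IDs with a short script. This is a sketch that assumes the pattern must match the whole identifier (hence `re.fullmatch`); the helper name `is_allowed` and the two negative test values are illustrative and not taken from the source.

```python
import re

# Pattern copied from the "Allowed values" row of the proposal.
ALLOWED_VALUES = re.compile(r"[1-9][0-9]+")


def is_allowed(identifier):
    """True if the whole identifier matches the allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(identifier) is not None


# The first three values are the proposal's examples; "7" and "0413" are
# made-up inputs that show what the pattern rejects.
for candidate in ["4130", "7860", "8710", "7", "0413"]:
    print(candidate, is_allowed(candidate))
```

Note that, as written, the pattern requires at least two digits, so a single-digit ID would be rejected; whether the dictionary has such IDs is not stated in the proposal.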
Motivation
This property will provide a link to a reliable source of information on Judaeo-Spanish words and phrases. Mahir256 (talk) 17:15, 20 December 2024 (UTC)
Discussion
- Support -عُثمان (talk) 18:04, 20 December 2024 (UTC)
- Support - AdamSeattle (talk) 05:30, 4 January 2025 (UTC)
- @Mahir256, عُثمان, AdamSeattle: Done: Comprehensive Historical Dictionary of Ladino entry ID (P13220). --Lewis Hulbert (talk) 09:52, 8 January 2025 (UTC)
Spanish-German Dictionary ID
Description | entry for a lexeme in the online Spanish-German dictionary hosted on Termania |
---|---|
Represents | Spanish-German dictionary (Q131078154) |
Data type | External identifier |
Example 1 | renuevo (L1324225) 6741728 |
Example 2 | florete (L1401239) 6731457 |
Example 3 | cuchilla (L1401236) 6728643 |
Number of IDs in source | 21,354 |
Formatter URL | https://www.termania.net/slovarji/spanish-german-dictionary/$1/_?ld=106 |
Motivation
This property may be used to add references to the many existing Spanish lexemes, and new ones. -عُثمان (talk) 15:08, 23 December 2024 (UTC)
- This may be of interest to @Hameryko عُثمان (talk) 15:09, 23 December 2024 (UTC)
Discussion
A Dictionary of Geology and Earth Sciences entry ID
Description | identifier for an entry in the online fifth edition of A Dictionary of Geology and Earth Sciences |
---|---|
Represents | A Dictionary of Geology and Earth Sciences (Q118189234) |
Data type | External identifier |
Domain | lexeme, item |
Allowed values | [1-9][0-9]* |
Example 1 | abiogenesis (L1327335) --> 12 |
Example 2 | acceleration (L227174) --> 44 |
Example 3 | amplitude (L29579) --> 316 |
Example 4 | datum (L6496) --> 2147 |
Example 5 | oxide (L24775) --> 6000 |
Example 6 | unilocular (L1407424) --> 8888 |
Example 7 | water (L3302) --> 9114 |
Example 8 | X-ray photography (L1407426) --> 9280 |
Example 9 | water (Q283) --> 9114 |
Example 10 | azurite (Q108212) --> 693 |
Example 11 | Bacillariophyceae (Q2878349) --> 695 |
Example 12 | Eburonian (Q19844987) --> 2582 |
Example 13 | kerogen (Q938398) --> 4546 |
Example 14 | quartz porphyry (Q582768) --> 6912 |
Source | https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033 |
Planned use | adding to newly created lexemes or lexemes being edited or to new items or items being edited |
Number of IDs in source | over 10,000 (Cf. https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033) |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033-e-$1 |
Applicable "stated in"-value | A Dictionary of Geology and Earth Sciences (Q118189234) |
Distinct-values constraint | no |
Motivation
A Dictionary of Geology and Earth Sciences (Q118189234), from Oxford University Press, provides definitions of over 10,000 terms in geology; an identifier for its entries would be a useful property for both lexemes and items. AdamSeattle (talk) 07:21, 5 January 2025 (UTC)
Discussion
A Dictionary of Sociology entry ID
Motivation
A Dictionary of Sociology (4th edition) (Q131678537), from Oxford University Press, provides definitions of over 2,000 terms in sociology and would be a useful property for both lexemes and items. AdamSeattle (talk) 23:18, 5 January 2025 (UTC)
Discussion
A Dictionary of Cultural Anthropology entry ID
Motivation
A Dictionary of Cultural Anthropology (1st edition) (Q131681107), from Oxford University Press, provides definitions of over 400 terms in cultural anthropology and would be a useful property for both lexemes and items. AdamSeattle (talk) 04:22, 6 January 2025 (UTC)
Discussion
A Dictionary of Geography entry ID
Motivation
A Dictionary of Geography (6th edition) (Q131689700), from Oxford University Press, provides definitions of over 3,000 terms in geography and would be a useful property for both lexemes and items. AdamSeattle (talk) 20:16, 6 January 2025 (UTC)
Discussion
DGLAi ID
Description | entry for a lexeme in the DGLAi Standard Moroccan Tamazight - French / Arabic dictionary by Institut Royal de la Culture Amazighe (IRCAM) of Morocco |
---|---|
Data type | External identifier |
Example 1 | ⵙⵍⴳⵎ (L226035) 139453 |
Example 2 | ⵙⵎⵎⵓⵙ (L226036) 138902 |
Example 3 | ⵜⴰⴳⵔⵙⵜ (L1407657) 140376 |
Formatter URL | https://tal.ircam.ma/dglai/search/indexs?session=$1 |
Motivation
DGLAi is a dictionary created by Institut Royal de la Culture Amazighe (IRCAM) of Morocco, which explains Standard Moroccan Tamazight lexemes in French and Arabic. This property is proposed for future use when Tamazight lexicons are being created on Wikidata. (@Lhoussine AIT TAYFST: may have his say on this property proposal)
Discussion
Wikibase form
Wikibase sense
prototypical syntactic role of argument
Description | qualifier for has semantic argument (P9971) indicating the most basic/fundamental syntactic position of that argument for that verb sense (that is, when the argument structure is not subject to any alternations) |
---|---|
Data type | Item |
Domain | senses on verb lexemes |
Allowed values | any item indicating a syntactic position (subclasses of linguistic unit (Q11953984), but this may be too general?) |
Example 1 | (gather (L1163-S1) has semantic argument (P9971) gatherer (Q128357561)) → subject (Q164573) |
Example 2 | (liquefy (L332143-S1) has semantic argument (P9971) entity being liquefied (Q127789399)) → direct object (Q2990574) |
Example 3 | (accept (L5421-S2) has semantic argument (P9971) source of accepted entity (Q126380671)) → source location complement (Q3685157) |
Planned use | replace object of statement has role (P3831) qualifying has semantic argument (P9971) to use this property as a qualifier instead |
See also | predicate for (P9970), has semantic argument (P9971) |
Motivation
This property is intended as a substitute for object of statement has role (P3831) on the semantic arguments of verb lexemes, as it helps to clarify that the particular syntactic position taken by a semantic argument is not the only such position that can be taken.
Verb predicates are planned to be modeled so that they can be reasoned about in terms of the roles played by their arguments, rather than by any particular syntactic position they may take in a sentence—so that instead of talking about 'liquefaction' involving a 'subject' and a 'direct object', or as involving an 'agent' and a 'patient', it is instead thought of as involving a 'liquifier' and an 'entity being liquefied'. This has the advantage that someone wanting to model instances of liquefaction—whether in Wikidata items or in the future in Abstract Wikipedia content—need not have to worry about the linguistic question of whether the 'entity being liquefied' is a theme or patient or whatever.
It is certainly possible to tie these roles to the syntactic positions they take with respect to the specific verb 'liquefy' in English by using object of statement has role (P3831), but this may imply to the viewer that e.g. when using the verb 'liquefy' in English the 'liquifier' is always the subject and the 'entity being liquefied' always the direct object. Indeed, for many verbs across languages there are documented variations in their argument structure, and resources like ValPal and the Unified Verb Index can be browsed which highlight such alternations in lots of different verbs.
As an example in English, one of these is the Instrumental Subject alternation, that turns a phrase like "John liquefied the tomatoes with a blender"—where 'John' is the liquifier (an agent (Q392648)), 'tomatoes' is the entity being liquefied (a patient (Q170212)), and 'blender' is a tool used in the liquefaction (an instrument (Q6535309))—into "A blender liquefied the tomatoes"—where the three nouns still retain their semantic argument roles (the blender did not somehow gain the kind of conscious awareness expected of an agent) but appear in the sentence in different places or are removed completely (the sentence now begins by mentioning the blender instead of John, and the question of who/what is controlling the blender is now unstated).
These variations, however, imply that there is some 'basic' or 'prototypical' syntactic arrangement of arguments that is being modified by that alternation, and this proposal serves to indicate that 'basic' or 'prototypical' syntactic arrangement. The proposal here is independent of the particular means of recording applicable alternations of a lexeme sense's argument structure, which is to be determined later.
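The distinction drawn above, between fixed semantic roles and the syntactic positions they occupy, can be sketched as data. This follows the proposer's framing only: the mapping for 'liquifier' and 'entity being liquefied' comes from the text, while the prototypical position of the instrument ("with-phrase") and the helper `realize` are assumptions added for illustration.

```python
# Prototypical syntactic position per semantic argument of 'liquefy'.
PROTOTYPICAL = {
    "liquifier": "subject",
    "entity being liquefied": "direct object",
    "instrument": "with-phrase",  # assumption: not stated in the proposal
}


def realize(prototypical, alternation=None):
    """Return the syntactic position of each role for one sentence pattern.

    Roles never change; only their positions do. None means the role is
    left unexpressed in the sentence.
    """
    positions = dict(prototypical)  # copy so the prototype stays untouched
    if alternation == "instrumental subject":
        positions["instrument"] = "subject"
        positions["liquifier"] = None
    return positions


# "John liquefied the tomatoes with a blender"
print(realize(PROTOTYPICAL))
# "A blender liquefied the tomatoes"
print(realize(PROTOTYPICAL, "instrumental subject"))
```

The proposed qualifier would record only the first mapping; how alternations such as the second are stored is left open by the proposal.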
Discussion
- Comment I'm afraid I find this proposal confusing; can you explain it a bit better? How is a liquefied entity the direct object of "liquefy", for example? Wouldn't the direct object be the entity before it was liquefied? ArthurPSmith (talk) 21:30, 4 November 2024 (UTC)
- @ArthurPSmith: Sorry about the confusion; I added a more in-depth explanation above. (Also the label for the role taken by the direct object of 'liquefy' was not precise enough and has been adjusted.) Mahir256 (talk) 04:10, 12 November 2024 (UTC)
- For a property to work well it has to be understood by reading the property description. If it's necessary to read the motivation part to understand what a property does, there is a good chance that problems will arise in the practice of using the property. ChristianKl ❪✉❫ 14:06, 13 November 2024 (UTC)
- Comment I would like to give this more thought, but I think this proposal could use some more clarity. I will note that conscious awareness or animacy are not prerequisites for agency. Take this Hindustani sentence (source) for example:
- مشین نے ہمیں ٹکٹ دے دیا
- मशीन ने हमें टिकट दे दिया
“Ticket got given to us by machine,” roughly translated, where the machine is marked as an oblique agent with the ergative postposition, the personal pronoun is in the dative case as a recipient of the ticket, and the ticket is the subject of the sentence. Their roles are the giver, recipient, and given entity respectively. The ergative construction emphasizes the fact that we/us do not have agency; our receipt of the ticket is dependent on whether or not the machine produces it. Similarly to what the liquefy sentences above are intended to exemplify, we can alter the sentence to exclude the agent:
- ہمیں ٹکٹ دے گیا
- हमें टिकट दे गिया
However, the syntactic roles are exactly the same and the alternation is external to the lexeme for the main verb. (Hindustani generally is a language that I think would not have reason to use this property at all if implemented as proposed.)
Considering liquefy again, we can construct a sentence like:
- An igneous intrusion liquefied its surroundings with molten rock.
Wherein an inanimate agent is used with an instrument - by comparison, it is possible to see how blender as well can be considered as filling the agent slot (in some cases). ValPal gives an interesting example for the instrumental subject alternation in “fill” which is less ambiguous:
- Water filled the cup.
Unlike a sentence such as, “raindrops hit the ground,” the water is still an instrument in the above sentence. The alternation effectively emphasizes that the water can be used to complete the action. We can make a case for blender as instrument rather than agent by comparing sentences which emphasize the fact that the blender can be used to complete the action of liquefying:
- The blender liquefies food.
- John liquefies food with his blender.
In the first sentence, it is implied that the blender is able to liquefy food (to completion), and in the second the action is given a more habitual connotation as ascribed to its agent subject.
All that is to say, it is not clear at the moment how to demonstrate which syntactic roles can be considered prototypical, whether an alternation requires those roles to change for the arguments or is expressed through other means, and how to model the relationship of alternations to other features of the predicate sense such as telicity. -عُثمان (talk) 04:00, 14 November 2024 (UTC)