
Wikibase lexeme

Javanese (language) register

Under discussion

Description	indicates the relationship between equivalent Javanese lexemes across registers (social variants), mainly the ngoko (Q12500634) register (plain Javanese), the krama (Q12492493) register (high/polite Javanese), and the madya (Q13091955) register (middle Javanese)
Data type	Lexeme
Domain	lexeme senses, in particular forms with spelling alternatives
Example 1	kowé/kowe/ꦏꦺꦴꦮꦺ (L2328) ("ngoko" register) and sampéyan/sampeyan/ꦱꦩ꧀ꦥꦺꦪꦤ꧀ (L1322036) ("krama" register) both mean "you" but differ in social register: the former is considered casual, the latter more formal and polite. For reference, see the online Javanese dictionary at https://www.sastra.org/leksikon (tick the "kata utuh" checkbox when searching to exclude partial matches). For more information on ngoko/krama, see the introduction to this Javanese-English dictionary: https://www.sastra.org/bahasa-dan-budaya/kamus-dan-leksikon/1703-javanese-english-dictionary-horne-1974-1968, especially sections 4.1 (Organization of the Entries) and 5 (SOCIAL STYLES). See also: en.wp, https://jv.wiktionary.org/wiki/Wikisastra:Tabel_krama-ngoko (jv.wikt)
Example 2	(update 18 August) gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
Example 3	(update 18 August) endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)

Motivation

I'm planning to add more Javanese lexemes, but many words exist in several registers, and using synonym (P5973) is not correct: although such words have the same meaning, they have different usage, and there are also many synonyms within a single register (for example, "you" has 4 or more synonyms in "ngoko" and 3 or more in "krama"). A dedicated property would make it possible to search and query the relationships between registers. As you can see from the links provided above, these relationships are not one-to-one. While the "ngoko" form is considered the default, not every "ngoko" word has a "krama" equivalent (only about 1000 without affixation, many more with affixation), and even fewer have "madya" or other-register ("krama inggil", etc.) equivalents. Some "krama" words are equivalent to several "ngoko" words, because they are not true synonyms but rather substitution words for different social contexts. Therefore this property should support multiple relationships. For example:

"you"

ngoko: kowe, (synonym: ko'ên, kohên, kowên)
madya: samang, andika, (synonym: dika)
krama: sampeyan, (synonym: bênampeyan, bênangpeyan)
krama inggil: panjênêngan, (synonym: nandalêm, paduka)

"to say, to tell"

ngoko: kandha, (synonym: clathu, ngomong, kêcap, wara, gotèk, cluluk, wuwus, etc.)
krama: criyos, sanjang, (synonym: sajang, wicantên, etc.)
krama andhap: matur
krama inggil: andika, ngêndika, (synonym: unandika)
kawi: angling

(things related to hand / "tangan")

ngoko: tangan, krama inggil: asta, simple noun, but the verbs get complicated:
krama inggil: ngasta (ng- + asta) serve as substitutions for ngoko: 1 nyambut gawe (to work), 2 nggawa (to bring, take, carry), 3 nandang (to do), 4 nyekel (to hold, grasp, to handle), 5 mulang (to teach)

Bennylin (talk) 18:23, 9 August 2024 (UTC)[reply]

Update 18 August

Just to make it clearer, on behalf of Javanese speakers, we would like to request 5 new properties:

ngoko variations (see ngoko (Q12500634))
madya variations (see madya (Q13091955))
krama variations (see krama (Q12492493))
krama inggil variations (see krama inggil word (Q16893583))
krama andhap variations (see krama andhap word (Q66724909))

The first and foremost reasoning is that most Javanese dictionaries (monolingual, bilingual jv-id, jv-en, jv-nl) separate Javanese lexemes into mainly these 5 registers and link to their counterparts seamlessly. Secondly, the current available property (synonym (P5973)) doesn't fit our need for specific-linking from one lexeme to another - besides, synonymy in Javanese is called dasanama (lit. ten names), instead of register (Jv: unggah-ungguh) - and in the future I believe using these 5 new properties would make it much easier to "transform" words, phrases, sentences from one register to another (e.g. via WikiFunctions or other tools).

I've given in the form above two new examples:

mountain: gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
- L680638-S1, instead of having property "synonym: L45622-S1", should instead have property "krama variations: L45622-S1"
- Likewise L45622-S1, instead of having property "synonym: L680638-S1", should instead have property "ngoko variations: L680638-S1"
- both lexemes could have the following synonyms: ancala, indra, endra, ancala, ardi; ardya, arga, asalingga, awukir, aldaka, hyang parwata, imandri, himawan, himawat, nala, cala, dri, tambana, wanawasa, wukir, wukira, parsa of parswa, parasu, parswa = paraswa, praswa, parwaka, par(of pwar)wata, prawata, parja, pradesa, pra(of prê)bata, par(of pêr)bata, par(of pêr)bwata, par(of pêr)byata, padaka, jambangan, mahahimawan, mahendra, mèru, malaya, gana, gunungan, giri, gori, girindra, girinata, gorata, giriwara, gêgêr, basulingga, byata, ngasrama. These all mean "mountain" in Javanese.
head: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)
- endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S1) and S3 (ngoko), should have "krama variations: L999025-S1" only, while
- endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2) (ngoko), should have "krama variations: L999025-S1", and "krama inggil variations: L413863-S1", while
- sirah/ꦱꦶꦫꦃ (L999025-S1) (krama), should have "ngoko variations: L413183-S1, S2, S3", and "krama inggil variations: L413863-S1", and
- sirah/ꦱꦶꦫꦃ (L999025-S2) (ngoko and krama), has no other variations
- mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1) (krama inggil) should have "ngoko variations: L413183-S2", and "krama variations: L999025-S1"
- mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S2) and S3 (ngoko and krama), has no other variations
- all three lexemes could have the following synonyms: utamăngga, hulu, cêngêl, rajawèni, katumangga, katumăngga, kapala, kumba, têndhas, swa, sidhira, pasuhunan, murda, mukyana. All of them mean "head".
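
The sense-level links listed for "head" can be sketched as a small lookup table. This is only an illustrative model in Python, not Wikidata's actual data model: the names `REGISTER_LINKS` and `variants` are hypothetical, and the keys simply reuse the proposed property labels. The sense IDs and links are copied from the list above.

```python
# Hypothetical in-memory model of the proposed statements for "head".
# Sense IDs and links come from the list above; the structure itself
# is an assumption made for illustration only.
REGISTER_LINKS = {
    "L413183-S1": {"krama variations": ["L999025-S1"]},
    "L413183-S2": {
        "krama variations": ["L999025-S1"],
        "krama inggil variations": ["L413863-S1"],
    },
    "L413183-S3": {"krama variations": ["L999025-S1"]},
    "L999025-S1": {
        "ngoko variations": ["L413183-S1", "L413183-S2", "L413183-S3"],
        "krama inggil variations": ["L413863-S1"],
    },
    "L999025-S2": {},  # ngoko and krama: no other variations
    "L413863-S1": {
        "ngoko variations": ["L413183-S2"],
        "krama variations": ["L999025-S1"],
    },
    "L413863-S2": {},  # ngoko and krama: no other variations
    "L413863-S3": {},
}


def variants(sense_id, prop):
    """Return the senses linked from sense_id via one proposed property."""
    return REGISTER_LINKS.get(sense_id, {}).get(prop, [])
```

With this shape, a lookup such as `variants("L413183-S2", "krama variations")` returns only the paired sense rather than every sense that happens to carry the krama register.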

Discussion

Support Thersetya2021 (talk) 14:39, 13 August 2024 (UTC)[reply]
Support Empat Tilda (talk) 01:20, 14 August 2024 (UTC)[reply]
Support Alfiyah Rizzy Afdiquni (talk) 04:41, 14 August 2024 (UTC)[reply]
Comment What's wrong with using something like language style (P6191) or variety of lexeme, form or sense (P7481) for this purpose? (Korean suffixes currently mark the register in which they are used with the former of these properties.) Mahir256 (talk) 16:53, 14 August 2024 (UTC)[reply]
I don't think you get what I mean, so I am going to give another example later. Meanwhile could you give the link for said Korean suffixes, and preferably lexemes? Bennylin (talk) 12:03, 18 August 2024 (UTC)[reply]
@Bennylin: There are a number of registers used in Korean, such as hasoseo-che (Q115744995), hapsyo-che (Q115744896), haeyo-che (Q115744904), and hae-che (Q115744915), where each is named for the verb meaning 'to do' in that language with the appropriate suffix used for indicative sentences in that language. The interrogative suffixes 나이까 (L749506), ᆸ니까 (L749614), and ᆯ까 (L1346003), to give examples of specific lexemes, have the same meaning(s) but differ only in the register used. More generally, though, it is not clear from this proposal why register differences between vocabulary items (especially register differences within a single language) should be treated differently from other stylistic differences between words in other languages with the same meaning (and indeed, the property 'language style', usable with a lot of language styles broadly construed, has at least five aliases containing the word 'register' in it) when an application (such as Ninai/Udiron and its deployment as Elemwala) can filter for senses in a language with particular language styles without requiring specialized links for them. Mahir256 (talk) 21:48, 20 August 2024 (UTC)[reply]
Give a simple query each for these questions:
What is the krama (Q12492493) for endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2)?

What is the krama inggil word (Q16893583) for sirah/ꦱꦶꦫꦃ (L999025-S1)?

What is the ngoko (Q12500634) and krama (Q12492493) for mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1)?

Bennylin (talk) 10:35, 22 August 2024 (UTC)[reply]
@Bennylin: Here are queries for endhas in the krama register, sirah in the krama inggil register, and mastaka belonging to both the ngoko and krama registers. Mahir256 (talk) 15:24, 22 August 2024 (UTC)[reply]

They're incorrect

The krama variant for endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2) is only one: sirah/ꦱꦶꦫꦃ (L999025-S1). The rest of them, while they have the krama register, are not the krama _for_ endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2).
The ngoko variant for mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1) is only one: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183-S2). The rest of them, while they have the ngoko register, are not the ngoko _for_ mastaka/ꦩꦱ꧀ꦠꦏ (L413863-S1).

So, you see, many synonyms of endhas/sirah/mastaka (head) have the register ngoko, krama, or both, but none of them is paired as _the_ register variant within the triplet endhas/sirah/mastaka. Therefore we need dedicated properties to store these values. Most relations are one-to-one, a few are one-to-two or two-to-one, but never one-to-many. Bennylin (talk) 11:04, 23 August 2024 (UTC)[reply]

@Mahir256, would you like to give your opinion? Regards, ZI Jony ^(Talk) 18:31, 16 September 2024 (UTC)[reply]

@Mahir256, would you like to give your opinion based on the response? Regards, ZI Jony ^(Talk) 08:54, 10 October 2024 (UTC)[reply]

I am going to Oppose only because this property (these properties?) is specific to one language and has not as clearly been demonstrated to be useful (and distinct from language style (P6191)) for other languages with similar phenomena. While I don't claim to intuitively understand the system described better than the speakers that use it, from the links provided I'm not as convinced that the alignments are as rigid as claimed by the last response from the property's proposer (except perhaps for the Wiktionary table, although this was created/edited only by this property's proposer and has no external references provided on it); indeed, sections 5.6 and 5.7 of the Horne dictionary's front matter suggest that there is fluidity in these correspondences. (There is also no need to add 'synonym' relationships between every pair of possible synonyms if the senses can be linked via item for this sense (P5137) to the same item.) If indeed there is a stronger correspondence between e.g. 'endhas' and 'sirah' compared with e.g. 'utamangga' and 'pasuhunan' (despite the same Wikidata item being applicable to both), and if this correspondence may be substantiated with a more precise source, then this could perhaps be indicated using the synonym property only between the senses involved in that stronger correspondence qualified with 'language style' pointing to the register in question: 'endhas' 'synonym' 'sirah' ('language style' 'kromo'). Mahir256 (talk) 02:29, 11 October 2024 (UTC)[reply]

‎Presisov večjezični slovar ID

Under discussion

Description	entry for a lexeme in the online edition of Presisov večjezični slovar
Represents	Presisov večjezični slovar (Q130758466)
Data type	External identifier
Example 1	kam (L346529) 11628302
Example 2	qumësht (L1355532) 11628302
Example 3	ka (L1358646) 11789269
Example 4	мајка/majka (L226791) 11606270
Example 5	klobása (L1213785) 8164671
Number of IDs in source	372008
Formatter URL	https://www.termania.net/slovarji/presisov-vecjezicni-slovar/$1/_
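
As a rough illustration of how a formatter URL is used, the sketch below substitutes an identifier for the `$1` placeholder. The function name `entry_url` is hypothetical, and the template is the Termania formatter URL for this proposal with its host written as www.termania.net.

```python
# Termania formatter URL for this proposal; "$1" marks where the ID goes.
FORMATTER_URL = "https://www.termania.net/slovarji/presisov-vecjezicni-slovar/$1/_"


def entry_url(entry_id, formatter=FORMATTER_URL):
    """Substitute an external identifier for "$1" in a formatter URL."""
    return formatter.replace("$1", str(entry_id))
```

For instance, `entry_url(11789269)` builds the link for Example 3, ka (L1358646).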

Motivation

Presisov večjezični slovar (Q130758466) is a comprehensive dictionary of Albanian which can be used to add references to existing Albanian lexemes and new ones. It includes glosses in Albanian, Slovenian, English, German, French, and Serbo-Croatian, making it a potentially useful cross-reference for contributions in other languages as well. Entries include lists of forms in addition to senses. -عُثمان (talk) 16:29, 2 November 2024 (UTC)[reply]

I have updated this proposal with some examples reflecting that this dictionary has multilingual headwords. --عُثمان (talk) 13:26, 3 November 2024 (UTC)[reply]

Discussion

Comment This should properly be considered a multilingual dictionary (as the term "večjezični" indicates); the same entry for 'ka' also appears as 7989528 (with the focus on German 'Ochse') and 8214488 (with the focus on Slovenian 'vol'). Mahir256 (talk) 21:23, 2 November 2024 (UTC)[reply]
I have updated the proposal with some examples reflecting this عُثمان (talk) 13:26, 3 November 2024 (UTC)[reply]

‎Sanzhi Dargwa dictionary ID

Under discussion

Description	entry for a lexeme in Diana Forker’s Sanzhi Dargwa dictionary
Represents	Sanzhi Dargwa dictionary (Q125749659)
Data type	External identifier
Example 1	вахт/واخت/vaxt (L1357656) LX003968
Example 2	инжир/اینژیر/inƶir (L1322127) LX001829
Example 3	муцІур/موڗور/muⱬur (L301647) LX002412
Formatter URL	https://dictionaria.clld.org/units/sanzhi-$1

Motivation

This proposal is for a property linking to entries in Sanzhi Dargwa dictionary (Q125749659) from Dargwa lexemes. The dictionary is available as a downloadable data set from Dictionaria, and contains usage examples and some grammatical details. -عُثمان (talk) 13:39, 3 November 2024 (UTC)[reply]

Discussion

‎FVDP Vietnamese dictionary ID

Under discussion

Description	entry for a lexeme in the Free Vietnamese Dictionary Project’s monolingual Vietnamese dictionary
Represents	FVDP Vietnamese dictionary (Q130812916)
Data type	External identifier
Example 1	xanh (L705061) 573415
Example 2	gặp (L1011653) 104184
Example 3	chuột (L1360864) 61855
Formatter URL	https://www.informatik.uni-leipzig.de/~duc/TD/td/index.php?bpos=$1&db=vv

Motivation

This property is proposed for use as a reference to link to Vietnamese lexemes. -عُثمان (talk) 22:26, 3 November 2024 (UTC)[reply]

Discussion

Support Mahir256 (talk) 22:44, 3 November 2024 (UTC)[reply]
Oppose I'm hesitant about claiming this ID means anything more than a very specific way to form a particular URL. I would support an external reference property about each of the FVDP dictionaries, but I think this property as proposed should be limited to qualifiers.
The Free Vietnamese Dictionary Project (FVDP) consists of a DICT (Q977872) Web server and desktop client, software to generate compatible dictionaries, and a collection of precompiled dictionaries compiled by a long-gone group of volunteers. [1] The software is licensed as open source, but I have no idea where to find the source code anymore. The provided dictionaries are available in two formats: StarDict Info (Q105858121) (which can be used with any compatible client and server) and a custom format specific to this client and server. [2]
This proposal relies on the FVDP Web server's bpos URL query parameter, which indicates the byte offset of the entry within the dictionary's index file (in which each entry is listed alphabetically, separated by 8 bytes). Specifically, it assumes the "DE1" server, one of two DICT servers that the author Hồ Ngọc Đức (Q102291268) runs out of the University of Leipzig. If you plug the same byte offset into a different server, it will likely return a different entry. For example, xanh (L705061) is 573415 on "DE1" but 573297 on "DE3" (which is currently malfunctioning). Another popular instance ("US2") no longer exposes bpos at all.
As I understand it, the purpose of this property is to durably link to an entry in the dictionary from a lexeme that inherently pertains to a specific word. The differences in byte offsets between servers illustrate that this is not an inherent property of a dictionary entry. The offsets have changed over time for a variety of reasons, such as adding more "00" front-matter entries and deleting duplicate entries. Moreover, the byte offset doesn't seem to be useful for offline distributions of this content. I think a primary external reference should follow wikt:Template:R:FVDP and its translations, which set the word parameter to the word itself. This would be a good way to indicate that the dictionary spells hóa/hoá differently in hóa đơn versus hoá nhi for no particular reason.
– Minh Nguyễn ^💬 18:19, 10 November 2024 (UTC)[reply]
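
The fragility of byte offsets described in this comment can be illustrated with a toy index. The sketch below is not the FVDP file format: the entry lists, the `SEPARATOR` filler, and the function `byte_offsets` are all hypothetical. Only the idea that bpos is a byte offset into an alphabetical index whose entries are separated by 8 bytes is taken from the comment.

```python
# Assumed 8-byte gap between entries; its contents are hypothetical.
SEPARATOR = b"\x00" * 8


def byte_offsets(headwords):
    """Map each headword to the byte offset where it starts in a toy index."""
    offsets = {}
    position = 0
    for word in headwords:
        offsets[word] = position
        position += len(word.encode("utf-8")) + len(SEPARATOR)
    return offsets


# Two hypothetical servers holding the same three entries.
server_a = byte_offsets(["chuột", "gặp", "xanh"])
# The second server has one extra front-matter entry, which shifts
# the offset of every entry that follows it.
server_b = byte_offsets(["00 front matter", "chuột", "gặp", "xanh"])
```

In this toy model the same headword gets a different offset on each server, whereas a link keyed on the word itself would be unaffected.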

Kamus Dewan Edisi Tiga ID

Under discussion

Description	entry for a Malay lexeme in Kamus Dewan Edisi Tiga in Dewan Bahasa dan Pustaka’s Gerbang Kata
Represents	Kamus Dewan Edisi Tiga (Q131448531)
Data type	External identifier
Example 1	kamus/قاموس (L184226) 176953
Example 2	pelajar/ڤلاجر (L121340) 159216
Example 3	edisi/ايديسي (L618706) 167417
Number of IDs in source	44,179
Formatter URL	http://ekamus.dbp.gov.my/Makna.aspx?kid=$1

Motivation

This property represents a series of entries in Kamus Dewan Edisi Tiga hosted by Dewan Bahasa dan Pustaka's Gerbang Kata. This is in line with the suggestion to propose separate properties per dictionary, Kamus Dewan Edisi Keempat having recently been approved for its own property. -عُثمان (talk) 14:31, 15 December 2024 (UTC)[reply]

Discussion

Support Mahir256 (talk) 14:56, 15 December 2024 (UTC)[reply]

‎Comprehensive Historical Dictionary of Ladino entry ID

Done: Comprehensive Historical Dictionary of Ladino entry ID (P13220) (Talk and documentation)

Description	identifier for an entry in Avner Perez's Judaeo-Spanish dictionary hosted on folkmasa.org
Data type	External identifier
Domain	Judaeo-Spanish lexemes
Allowed values	[1-9][0-9]+
Example 1	adjustar/אגֿוּסטאר (L1348606) → 4130
Example 2	agora/אגורה (L1348605) → 7860
Example 3	agua/אגואה‎ (L8333) → 8710
Source	http://folkmasa.org/milon/pmilonh.htm
Planned use	add to existing Judaeo-Spanish lexemes
Number of IDs in source	38,177
Expected completeness	eventually complete (Q21873974)
Formatter URL	https://folkmasa.org/milon/milon41test.php?mishtane=$1
See also	Ma'agarim ID (P11280), Jiddisch-Nederlands Woordenboek ID (P11562), Diccionario de la lengua española entry ID (P12529)
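
The "Allowed values" pattern can be checked with a few lines of Python. This is a minimal sketch that assumes the pattern must match the whole identifier; the function name `is_valid_id` is hypothetical. Note that `[1-9][0-9]+` requires at least two digits, unlike the `[1-9][0-9]*` form, which would also accept a single digit.

```python
import re

# Pattern copied from the "Allowed values" row of this proposal.
ALLOWED = re.compile(r"[1-9][0-9]+")


def is_valid_id(value):
    """Return True if the whole string matches the allowed-values pattern."""
    return ALLOWED.fullmatch(value) is not None
```

The three example IDs (4130, 7860, 8710) all pass this check, while a value with a leading zero or a single digit does not.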

Motivation

This property will provide a link to a reliable source of information on Judaeo-Spanish words and phrases. Mahir256 (talk) 17:15, 20 December 2024 (UTC)[reply]

Discussion

Support -عُثمان (talk) 18:04, 20 December 2024 (UTC)[reply]

Support - AdamSeattle (talk) 05:30, 4 January 2025 (UTC)[reply]

@Mahir256, عُثمان, AdamSeattle:

Done: Comprehensive Historical Dictionary of Ladino entry ID (P13220). --Lewis Hulbert (talk) 09:52, 8 January 2025 (UTC)[reply]

‎Spanish-German Dictionary ID

Under discussion

Description	entry for a lexeme in the online Spanish-German dictionary hosted on Termania
Represents	Spanish-German dictionary (Q131078154)
Data type	External identifier
Example 1	renuevo (L1324225) 6741728
Example 2	florete (L1401239) 6731457
Example 3	cuchilla (L1401236) 6728643
Number of IDs in source	21,354
Formatter URL	https://www.termania.net/slovarji/spanish-german-dictionary/$1/_?ld=106

Motivation

This property may be used to add references to the many existing Spanish lexemes, and new ones. -عُثمان (talk) 15:08, 23 December 2024 (UTC)[reply]

This may be of interest to @Hameryko عُثمان (talk) 15:09, 23 December 2024 (UTC)[reply]

Discussion

A Dictionary of Geology and Earth Sciences entry ID

Under discussion

Description	identifier for an entry in the online fifth edition of A Dictionary of Geology and Earth Sciences
Represents	A Dictionary of Geology and Earth Sciences (Q118189234)
Data type	External identifier
Domain	lexeme, item
Allowed values	[1-9][0-9]*
Example 1	abiogenesis (L1327335) --> 12
Example 2	acceleration (L227174) --> 44
Example 3	amplitude (L29579) --> 316
Example 4	datum (L6496) --> 2147
Example 5	oxide (L24775) --> 6000
Example 6	unilocular (L1407424) --> 8888
Example 7	water (L3302) --> 9114
Example 8	X-ray photography (L1407426) --> 9280
Example 9	water (Q283) --> 9114
Example 10	azurite (Q108212) --> 693
Example 11	Bacillariophyceae (Q2878349) --> 695
Example 12	Eburonian (Q19844987) --> 2582
Example 13	kerogen (Q938398) --> 4546
Example 14	quartz porphyry (Q582768) --> 6912
Source	https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033
Planned use	adding to newly created lexemes or lexemes being edited or to new items or items being edited
Number of IDs in source	over 10,000 (Cf. https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033)
Expected completeness	always incomplete (Q21873886)
Formatter URL	https://www.oxfordreference.com/display/10.1093/acref/9780198839033.001.0001/acref-9780198839033-e-$1
Applicable "stated in"-value	A Dictionary of Geology and Earth Sciences (Q118189234)
Distinct-values constraint	no

Motivation

A Dictionary of Geology and Earth Sciences (Q118189234), from Oxford University Press, provides definitions of over 10,000 terms in geology; an identifier property for it would be useful for both lexemes and items. AdamSeattle (talk) 07:21, 5 January 2025 (UTC)[reply]

Discussion

A Dictionary of Sociology entry ID

Under discussion

Description	identifier for an entry in the online fourth edition of A Dictionary of Sociology
Represents	A Dictionary of Sociology (4th edition) (Q131678537)
Data type	External identifier
Domain	lexeme, item
Allowed values	[1-9][0-9]*
Example 1	absolutism (L315952) --> 6
Example 2	capitalism (L228269) --> 207
Example 3	collective bargaining (L1407474) --> 314
Example 4	egocentrism (L1407475) --> 693
Example 5	identity (L7507) --> 1061
Example 6	nurture (L38214) --> 1589
Example 7	patriarchy (L40566) --> 1690
Example 8	zeitgeist (L24280) --> 2520
Example 9	Florian Znaniecki (Q599197) --> 2523
Example 10	zero-sum game (Q156612) --> 2521
Example 11	victimless crime (Q2026760) --> 2463
Example 12	national character (Q15836344) --> 1514
Example 13	iatrogenesis (Q284220) --> 1056
Example 14	extended family (Q721790) --> 786
Example 15	collective bargaining (Q452421) --> 314
Source	https://www.oxfordreference.com/display/10.1093/acref/9780199683581.001.0001/acref-9780199683581
Planned use	adding to newly created lexemes or lexemes being edited or to new items or items being edited
Number of IDs in source	over 2,000 (Cf. https://www.oxfordreference.com/display/10.1093/acref/9780199683581.001.0001/acref-9780199683581)
Expected completeness	always incomplete (Q21873886)
Formatter URL	https://www.oxfordreference.com/display/10.1093/acref/9780199683581.001.0001/acref-9780199683581-e-$1
Applicable "stated in"-value	A Dictionary of Sociology (4th edition) (Q131678537)
Distinct-values constraint	no

Motivation

A Dictionary of Sociology (4th edition) (Q131678537), from Oxford University Press, provides definitions of over 2,000 terms in sociology; an identifier property for it would be useful for both lexemes and items. AdamSeattle (talk) 23:18, 5 January 2025 (UTC)[reply]

Discussion

A Dictionary of Cultural Anthropology entry ID

Under discussion

Description	identifier for an entry in the online edition of A Dictionary of Cultural Anthropology
Represents	A Dictionary of Cultural Anthropology (1st edition) (Q131681107)
Data type	External identifier
Domain	lexeme, item
Allowed values	[1-9][0-9]*
Example 1	acculturation (L253664) --> 1
Example 2	asexuality (L313885) --> 24
Example 3	racialization (L1407492) --> 300
Example 4	hegemony (L295351) --> 167
Example 5	ideology (L294937) --> 178
Example 6	nutritional anthropology (L1407493) --> 259
Example 7	gender (L12545) --> 151
Example 8	racism (L296365) --> 302
Example 9	worldview (L52443) --> 404
Example 10	nutritional anthropology (Q5700499) --> 259
Example 11	Human Relations Area Files (Q5487142) --> 174
Example 12	Clifford Geertz (Q310956) --> 150
Example 13	racism (Q8461) --> 302
Example 14	cultural relativism (Q741550) --> 73
Example 15	National Association for the Practice of Anthropology (Q131681138) --> 249
Example 16	salvage ethnography (Q7406565) --> 320
Source	https://www.oxfordreference.com/display/10.1093/acref/9780191836688.001.0001/acref-9780191836688
Planned use	adding to newly created lexemes or lexemes being edited or to new items or items being edited
Number of IDs in source	over 400 (Cf. https://www.oxfordreference.com/display/10.1093/acref/9780191836688.001.0001/acref-9780191836688)
Expected completeness	always incomplete (Q21873886)
Formatter URL	https://www.oxfordreference.com/display/10.1093/acref/9780191836688.001.0001/acref-9780191836688-e-$1
Applicable "stated in"-value	A Dictionary of Cultural Anthropology (1st edition) (Q131681107)
Distinct-values constraint	no

Motivation

A Dictionary of Cultural Anthropology (1st edition) (Q131681107), from Oxford University Press, provides definitions of over 400 terms in cultural anthropology; an identifier property for it would be useful for both lexemes and items. AdamSeattle (talk) 04:22, 6 January 2025 (UTC)[reply]

Discussion

A Dictionary of Geography entry ID

Under discussion

Description	identifier for an entry in the online 6th edition of A Dictionary of Geography
Represents	A Dictionary of Geography (6th edition) (Q131689700)
Data type	External identifier
Domain	lexeme, item
Allowed values	[1-9][0-9]*
Example 1	abiotic (L29417) --> 3
Example 2	butte (L21778) --> 404
Example 3	chaos theory (L1407617) --> 496
Example 4	groyne (L321517) --> 1456
Example 5	hysteresis (L322213) --> 3505
Example 6	nuptiality (L1407618) --> 2202
Example 7	underdevelopment (L329838) --> 3200
Example 8	xerophyte (L1407619) --> 3327
Example 9	zoogeomorphology (L1407620) --> 4191
Example 10	Marxist geography (L1404698) --> 1980
Example 11	groyne (Q153084) --> 1456
Example 12	xerophyte (Q212337) --> 3327
Example 13	zero-sum game (Q156612) --> 3335
Example 14	Aleutian Current (Q1243936) --> 100
Example 15	Bowen ratio (Q895404) --> 385
Example 16	Carboniferous (Q133738) --> 436
Example 17	Laurasia (Q132033) --> 1826
Example 18	Organization of American States (Q123759) --> 2252
Example 19	World Bank (Q7164) --> 3324
Source	https://www.oxfordreference.com/display/10.1093/acref/9780192896391.001.0001/acref-9780192896391
Planned use	adding to newly created lexemes or lexemes being edited or to new items or items being edited
Number of IDs in source	over 3,000 (Cf. https://www.oxfordreference.com/display/10.1093/acref/9780192896391.001.0001/acref-9780192896391)
Expected completeness	always incomplete (Q21873886)
Formatter URL	https://www.oxfordreference.com/display/10.1093/acref/9780192896391.001.0001/acref-9780192896391-e-$1
Applicable "stated in"-value	A Dictionary of Geography (6th edition) (Q131689700)
Distinct-values constraint	no

Motivation

A Dictionary of Geography (6th edition) (Q131689700), from Oxford University Press, provides definitions of over 3,000 terms in geography; an identifier property for it would be useful for both lexemes and items. AdamSeattle (talk) 20:16, 6 January 2025 (UTC)[reply]

Discussion

DGLAi ID

Under discussion

Description	entry for a lexeme in the DGLAi Standard Moroccan Tamazight - French / Arabic dictionary by Institut Royal de la Culture Amazighe (IRCAM) of Morocco
Data type	External identifier
Example 1	ⵙⵍⴳⵎ (L226035) 139453
Example 2	ⵙⵎⵎⵓⵙ (L226036) 138902
Example 3	ⵜⴰⴳⵔⵙⵜ (L1407657) 140376
Formatter URL	https://tal.ircam.ma/dglai/search/indexs?session=$1

Motivation

DGLAi is a dictionary created by the Institut Royal de la Culture Amazighe (IRCAM) of Morocco, which explains Standard Moroccan Tamazight lexemes in French and Arabic. This property is proposed for future use when Tamazight lexicons are being created on Wikidata. (@Lhoussine AIT TAYFST: may have his say on this property proposal)

Discussion

Wikibase form

Wikibase sense

‎prototypical syntactic role of argument

Under discussion

Description	qualifier for has semantic argument (P9971) indicating the most basic/fundamental syntactic position of that argument for that verb sense (that is, when the argument structure is not subject to any alternations)
Data type	Item
Domain	senses on verb lexemes
Allowed values	any item indicating a syntactic position (subclasses of linguistic unit (Q11953984), but this may be too general?)
Example 1	(gather (L1163-S1) has semantic argument (P9971) gatherer (Q128357561)) → subject (Q164573)
Example 2	(liquefy (L332143-S1) has semantic argument (P9971) entity being liquefied (Q127789399)) → direct object (Q2990574)
Example 3	(accept (L5421-S2) has semantic argument (P9971) source of accepted entity (Q126380671)) → source location complement (Q3685157)
Planned use	replace object of statement has role (P3831) qualifying has semantic argument (P9971) to use this property as a qualifier instead
See also	predicate for (P9970), has semantic argument (P9971)

Motivation

This property is intended as a substitute for object of statement has role (P3831) on the semantic arguments of verb lexemes, as it helps to clarify that the particular syntactic position taken by a semantic argument is not the only such position that can be taken.

Verb predicates are planned to be modeled so that they can be reasoned about in terms of the roles played by their arguments, rather than by any particular syntactic position they may take in a sentence—so that instead of talking about 'liquefaction' involving a 'subject' and a 'direct object', or as involving an 'agent' and a 'patient', it is instead thought of as involving a 'liquifier' and an 'entity being liquefied'. This has the advantage that someone wanting to model instances of liquefaction—whether in Wikidata items or in the future in Abstract Wikipedia content—need not have to worry about the linguistic question of whether the 'entity being liquefied' is a theme or patient or whatever.

It is certainly possible to tie these roles to the syntactic positions they take with respect to the specific verb 'liquefy' in English by using object of statement has role (P3831), but this may imply to the viewer that e.g. when using the verb 'liquefy' in English the 'liquifier' is always the subject and the 'entity being liquefied' always the direct object. Indeed, for many verbs across languages there are documented variations in their argument structure, and resources like ValPal and the Unified Verb Index can be browsed which highlight such alternations in lots of different verbs.

As an example in English, one of these is the Instrumental Subject alternation, that turns a phrase like "John liquefied the tomatoes with a blender"—where 'John' is the liquifier (an agent (Q392648)), 'tomatoes' is the entity being liquefied (a patient (Q170212)), and 'blender' is a tool used in the liquefaction (an instrument (Q6535309))—into "A blender liquefied the tomatoes"—where the three nouns still retain their semantic argument roles (the blender did not somehow gain the kind of conscious awareness expected of an agent) but appear in the sentence in different places or are removed completely (the sentence now begins by mentioning the blender instead of John, and the question of who/what is controlling the blender is now unstated).

These variations, however, imply that there is some 'basic' or 'prototypical' syntactic arrangement of arguments that is being modified by that alternation, and this proposal serves to indicate that 'basic' or 'prototypical' syntactic arrangement. The proposal here is independent of the particular means of recording applicable alternations of a lexeme sense's argument structure, which is to be determined later.
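
The relationship between a prototypical arrangement and an alternation can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a proposed implementation: the names `PROTOTYPICAL` and `instrumental_subject` are hypothetical, and the position label "with-phrase" for the instrument is invented for illustration. The role labels and the subject and direct object positions follow the 'liquefy' example above.

```python
# Prototypical positions for 'liquefy', following the example sentence
# "John liquefied the tomatoes with a blender".
# The label "with-phrase" is a hypothetical placeholder.
PROTOTYPICAL = {
    "liquifier": "subject",
    "entity being liquefied": "direct object",
    "tool used in the liquefaction": "with-phrase",
}


def instrumental_subject(arrangement):
    """Toy Instrumental Subject alternation: the tool becomes the subject
    and whichever role was the subject is left unexpressed (None)."""
    altered = dict(arrangement)  # copy, so the prototypical mapping is kept
    for role, position in arrangement.items():
        if position == "subject":
            altered[role] = None  # e.g. 'John' is no longer stated
    altered["tool used in the liquefaction"] = "subject"
    return altered
```

The semantic roles (the dictionary keys) stay the same after the alternation; only the positions they map to change, which is the distinction the proposed qualifier is meant to record for the unaltered case.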

Discussion

Comment I'm afraid I find this proposal confusing; can you explain a bit better? How is a liquefied entity the direct object of "liquefy", for example? Wouldn't the direct object be the entity before it was liquefied? ArthurPSmith (talk) 21:30, 4 November 2024 (UTC)
- @ArthurPSmith: Sorry about the confusion; I added a more in-depth explanation above. (Also the label for the role taken by the direct object of 'liquefy' was not precise enough and has been adjusted.) Mahir256 (talk) 04:10, 12 November 2024 (UTC)
  For a property to work well it has to be understood by reading the property description. If it's necessary to read the motivation part to understand what a property does, there is a good chance that problems will arise in the practice of using the property. ChristianKl ❪✉❫ 14:06, 13 November 2024 (UTC)
Comment I would like to give this more thought, but I think this proposal could use some more clarity. I will note that conscious awareness or animacy are not prerequisites for agency. Take this Hindustani sentence (source) for example:

مشین نے ہمیں ٹکٹ دے دیا

मशीन ने हमें टिकट दे दिया

“Ticket got given to us by machine,” roughly translated, where the machine is marked as an oblique agent with the ergative postposition, the personal pronoun is in the dative case as a recipient of the ticket, and the ticket is the subject of the sentence. Their roles are the giver, recipient, and given entity respectively. The ergative construction emphasizes the fact that we/us do not have agency; our receipt of the ticket is dependent on whether or not the machine produces it. Similarly to what the liquefy sentences above are intended to exemplify, we can alter the sentence to exclude the agent:

ہمیں ٹکٹ دے گیا

हमें टिकट दे गिया

However, the syntactic roles are exactly the same and the alternation is external to the lexeme for the main verb. (Hindustani generally is a language that I think would not have reason to use this property at all if implemented as proposed.)

Considering liquefy again, we can construct a sentence like:

An igneous intrusion liquefied its surroundings with molten rock.

Here an inanimate agent is used with an instrument; by comparison, it is possible to see how the blender can also be considered as filling the agent slot (in some cases). ValPal gives an interesting example for the instrumental subject alternation in “fill” which is less ambiguous:

Water filled the cup.

Unlike a sentence such as, “raindrops hit the ground,” the water is still an instrument in the above sentence. The alternation effectively emphasizes that the water can be used to complete the action. We can make a case for blender as instrument rather than agent by comparing sentences which emphasize the fact that the blender can be used to complete the action of liquefying:

The blender liquefies food.

John liquefies food with his blender.

In the first sentence, it is implied that the blender is able to liquefy food (to completion), and in the second the action is given a more habitual connotation as ascribed to its agent subject.

All that is to say, it is not clear at the moment how to demonstrate which syntactic roles can be considered prototypical, whether an alternation requires those roles to change for the arguments or is expressed through other means, and how to model the relationship of alternations to other features of the predicate sense such as telicity. -عُثمان (talk) 04:00, 14 November 2024 (UTC)

Other
