
Revision as of 23:55, 14 April 2013

October 2012

interwikilinks

Hi madam, in connection with the interwiki links at अंशुकम्: giving interlinks is the easiest way to make that word in that language. That is why I gave links there. Interwiki links are not coming automatically. So what is the problem with giving interwiki links? --Dvellakat (talk) 09:44, 1 October 2012 (UTC)

The primary purpose of interwiki-links is to let readers know which other Wiktionaries have the same entry, so they can read it in whatever language they're most comfortable with. A secondary purpose is to let editors know which other Wiktionaries have the same entry, so the various Wiktionaries can take information from each other, to the benefit of all. Both of these purposes are subverted when you add interwiki-links to entries that don't exist. Re: "giving interlinks is the easiest way to make that word in that language": No, it's not. If you're capable of creating an interwiki-link, then you don't need to. —Ruakh 12:21, 1 October 2012 (UTC)

Marrone

You don't agree with the dictionary's etymology? http://en.wiktionary.org/w/index.php?title=marrone&diff=18308052&oldid=18286043 Djkcel (talk) 20:03, 1 October 2012 (UTC)

I don't agree with stealing it. See w:Copyright. —Ruakh 20:08, 1 October 2012 (UTC)

OK, I would like to provide information from external sources that's relevant. How can I properly add it: paraphrase, use quotes? I did provide the source. Djkcel (talk) 20:27, 1 October 2012 (UTC)

Providing the source is not enough: it addresses the plagiarism, but not the copyright violation. As for how to avoid copyright violation — I don't know. Copyright does not cover facts/opinions/ideas, only the expression of them, so I think it should be O.K. to take information from a source, as long as you express it in your own words? —Ruakh 20:42, 1 October 2012 (UTC)

this is a dictionary

You do realize that that's a particularly ridiculous example of SOP, right? And also that it's not going to get me to change my opinion on SOP, right? Purplebackpack89 (Notes Taken) (Locker) 21:22, 2 October 2012 (UTC)

Wait a minute Purplebackpack89, since you don't believe that SOP should be a reason for deletion, why would it matter to you how ridiculous an example it is? Unless I'm missing something, you're vocally in favor of entries like this is a dictionary. Mglovesfun (talk) 21:31, 2 October 2012 (UTC)

As Mglovesfun says — if you accept that [[this is a dictionary]], were it to exist, should be deleted, but you object to the very idea of using "SOP" as an argument for deletion, then you need to offer some other argument in its place. —Ruakh 21:52, 2 October 2012 (UTC)

Templates outside a language section

Hi, you seem to be the most knowledgeable/experienced with this, so could you have a look? Wiktionary talk:Word of the day#Move 'was WOTD' template into English section? —CodeCat 23:24, 2 October 2012 (UTC)

maggot

yo, I disagree with your rollback 94.197.65.192 17:48, 10 October 2012 (UTC)

. . . —Ruakh 17:50, 10 October 2012 (UTC)

one and a half

In my opinion the content was useful, because numerous languages have a separate term for the number 1.5 and because Wiktionary is a dictionary aimed at translating into foreign languages as well. I see no reason to consider it unusable and worth only deletion. 37.30.166.74 18:59, 11 October 2012 (UTC)

It's true that we do include a few terms only for their translations; but in this case I would have removed the translations anyway (a single editor had added translations into twelve different languages, ranging from Tamil to Yiddish to Latin, which is good reason to think that the translations aren't accurate). If you want the entry to be recreated, feel free to list it at Wiktionary:Requests for deletion. —Ruakh 19:22, 11 October 2012 (UTC)

I have verified 90% of the translations and I can add more with confidence. I also think the entry was important; I was advocating its creation as a translation target at some stage, and I am going to restore it. For deletion, feel free to add {{rfd}}. --Anatoli (обсудить/вклад) 22:34, 11 October 2012 (UTC)

sannur#Icelandic

See Talk:sannur#Neuter about its inflection.

Advices

Thanks. I'll make use of them. --Fncd (talk) 23:54, 14 October 2012 (UTC)

Druidry

I'm confused about your deletion of Druidry and the others as "copyrighted". I did not paraphrase anything. I worded them in my own way. Pass a Method (talk) 10:56, 17 October 2012 (UTC)

They were clearly copies of Neodruidism?oldid=18483522 by 84.13.29.124 (talk). —Ruakh 12:03, 17 October 2012 (UTC)

That IP is actually me. My computer crashes sometimes, logging me out without my noticing. You can tell it's me by looking at the other two contributions (also by me), both in areas I have recently worked in with my account. Should I re-create them? Pass a Method (talk) 14:38, 17 October 2012 (UTC)

Oh, I'm sorry. This would have been obvious if I'd noticed that [[Neodruidism]] had been created less than a minute before your edits to it; somehow I'd thought that [[Neodruidism]] was an older entry than that, and that you were doing a re-organization. I'll undelete the entries I deleted. —Ruakh 15:46, 17 October 2012 (UTC)

I will go ahead and re-create them. If you don't believe the above IP is me, then please prompt me to rewrite/reword instead. Thanks. Pass a Method (talk) 15:43, 17 October 2012 (UTC)

Karaimism

Would have been polite to discuss it with me first, no? Anyway, let's start the discussion now. Would you be kind enough to retrieve the reference from the deleted version, since you have the capability to see the deleted versions and I don't yet? Many thanks. WordsWorth (talk) 20:53, 17 October 2012 (UTC)

The reference was http://www.unesco.kz/qypchaq/Docs/KaraiMol/KaraiMol.pdf. —Ruakh 21:03, 17 October 2012 (UTC)

OK great, thanks. Please open Central Asia's UNESCO website URL provided above and have a look at prayer 122 on page 61 in the document. You will see the same prayer in two columns one on the left in Karayce tili, and the other on the right in Russian language. If you are unfamiliar with the Russian language you will be able to copy the whole Russian section

122. Отче наш, сущий на небесах

[Матфей 6:9–13; Лука 11:2–4]

Отче наш, сущий на небесах, да славится единое имя Твое, и да укрепится царствие Твое и воля Твоя на небесах вверху и на земле внизу.

Ежедневный хлеб дай нам и прости все грехи наши.

Не дай нам совратиться с прямых путей Твоих, но спаси нас от искусителя.

Амен.

and paste it into translate.google.com, translating from Russian to English. If you are able to read and understand Russian then so much the better, because the Google translation is pretty poor after the first few lines, although it does get the point across and allows one to get to the heart of what the Turkic Karaims (not Karaite Jews) believe, illustrating neatly how different that religion is from the Jewish version of the religion. No Karaite Jews in the world can agree that that prayer belongs in their religion. So basically it is very clear that there are two different religions with two different names (naturally sometimes confused by even very well educated people who don't check their assumptions, but that makes due diligence in our duty of care towards ensuring correct dictionary entries all the more important, right?). I look forward to your feedback. WordsWorth (talk) 23:54, 17 October 2012 (UTC)

I am not making any claims about whether there exists a Turkic group whose endonym resembles 'Karaim' but whose religion differs from that of Karaite Jews; you say there is, and sure, whatever, I'll accept that statement for the moment. The only claims I'm making are (1) that the claimed English term Karaimism is somewhere between "extremely rare" and "nonexistent", and (2) its few stray uses are all clearly either (a) referring to the religion of Karaite Jews, not that of this Turkic group, or (b) not distinguishing between these two religions. Since your reference neither uses nor mentions the claimed English term Karaimism, it has no bearing on my claims, and no bearing on whether we should have an entry for the claimed English term Karaimism. —Ruakh 00:46, 18 October 2012 (UTC)

Many thanks for your response. It is very curious that there should exist two very similar sounding words for the religion/culture of one group rather than referring to the practices of two different but very similarly named groups. I do take on board your comment about the sources though, and I would like to follow this line of investigation please. If you could find a little time to find a reliable source where Karaimism without doubt refers to Karaite Judaism and not to the somewhat "Eastern Christian" Turkic ethnicity (who are so endangered that they are now similarly "somewhere between extremely rare and non-existent", as you aptly put it), that would be great. WordsWorth (talk) 08:36, 18 October 2012 (UTC)

"Reliable source" is not really a relevant concept; the question is whether and how the term is used, not what "reliable sources" have to say about it.

google books:"Karaimism" turns up five borderline-usable cites:

c. 1970 in East Europe, volumes 19–20,^[1] page 25:
In any case, it must be emphasized that Mosaism or Karaimism was not the prevailing religion in the state of Khazaria.
2001, Ursula Owen, Genetics and Genomes,^[2] Index on Censorship (publisher), ISBN 090428686X, page 214:
The most widely accepted theory on Karaimism claims that it started as a reform movement among Mesopotamian Jews in the eighth century during the reign of Caliph Abu-Jafar-Abdullah al-Mansur.
2001, in Anna-Mária Bíró and Petra Kovács (editors), Diversity in Action: Local Public Management of Multi-Ethnic Communities in Central and Eastern Europe,^[3] LGI Books, ISBN 9637316701, page 333:
Karaims are one of ancient indigenous peoples of the Crimea. The Karaim language belongs to the Turkish language group. Their religion is Karaimism, based on the Old Testament.
2005, International Institute of Crimean Karaites, Karaites of Turkey:^[4]
page 7: When the Karaimism (the mainstream that does not recognize the authority of the post-Biblical tradition incorporated in the Talmud and in the later rabbinical works) became the state faith of Khazarian Kaganate in 740 A.D., the community of […]

Elijah (Eliyagu) Bashiyachi ben rabbi Menahem from Adrianople (died in 1490 in Constantinople, where he lived); one of the most outstanding scientists, writers and teachers of the XV century, last principal authority in Karaimism.
2006, Victor Tiriyaki, Complex of Karaite Kenassas in Eupatoria and Other Kenassas Around the World,^[5] Vadim Mireyev (publisher), page 3:
There are also some Jews and Russians who profess Karaimism. The principles of Karaimism are derived from the books of the Torah (the Old Testament). The aim of Karaimism is to serve G-d, love everybody, and preserve the laws of the Torah. The Karaites are distinct from Rabbinical Jews, Muslims, and Christians and therefore do not use concepts from Talmud, Koran, […]

As you can see, all of them take "Karaimism" to be synonymous with "Karaite Judaism". (We could maybe justify an entry along the lines of "Karaite Judaism; used only in reference to the Crimean Karaims"; but that's obviously not the result you want.)

—Ruakh 12:30, 18 October 2012 (UTC)

This is a lovely start Ruakh, and thank you for sparing the time to discuss this. But when you wrote:

As you can see, all of them take "Karaimism" to be synonymous with "Karaite Judaism".

I must confess that I don't see that at all. I had asked for "reliable" sources (by which I meant not self-published websites where the author can change his/her definition whimsically from day to day), which you did provide, but also ones where the term "without doubt refers to Karaite Judaism"; so, from bitter experience, might I suggest that we both take extra caution concerning what we assume. Imagine, if you will, a situation where an Israeli Karaite Jew has read on a wiki site that Karaimism and Karaism are the same and that the Karaims of Eastern Europe are co-religionists with the Karaite Jews of Israel. Imagine, if you will, that this Israeli visits a "Kinesa" (from the Aramaic word for Church but related to the Hebrew word Knesset) in Lithuania or Crimea where the Karaims still pray, and joins in prayers. Now this Jew is suddenly in the uncomfortable situation of finding himself praying the Our Father taught by Jesus in the Gospels of Matthew and Luke. He is basically surrounded by uncircumcised Unitarian Christians led by a Torah-observant priest, and finding it very difficult to escape. For missionaries interested in converting a Jew a situation like this is ideal of course, but for the Jews in question it is not necessarily something they would have walked into willingly if they had been reliably better informed. Some people in this situation might not be favourably disposed to the Wikimedia Foundation at all following that. You probably have some post-graduate qualification under your belt like me, and you will know therefore that any high-school student can quote a source without understanding it. We however know better: it is vitally important to be able to critically engage with a source to be able to make proper use of it. All I am suggesting is that we critically engage with what you have provided here. If we do so honestly we can see there is no evidence that these references to Karaimism concern Karaite Judaism.
Mosaism (Torah observance like that of the Molokans and Subbotniks) yes, but not any recognisable form of Judaism. How do we know? Because all the references you have provided concern the Turkic/"Khazar" group of Eastern Europe, whose prayer book, which includes the "Our Father", you linked to at the beginning of this conversation.

c. 1970 in East Europe, volumes 19–20,^[6] page 25:
In any case, it must be emphasized that Mosaism or Karaimism was not the prevailing religion in the state of Khazaria.

Mosaism like that of the Molokans and Subbotniks is a word specifically used to describe Torah observance which is not any form of Judaism.

2001, Ursula Owen, Genetics and Genomes,^[7] Index on Censorship (publisher), ISBN 090428686X, page 214:
The most widely accepted theory on Karaimism claims that it started as a reform movement among Mesopotamian Jews in the eighth century during the reign of Caliph Abu-Jafar-Abdullah al-Mansur.

The most widely accepted theory yes, but not the most accurate theory. While Karaite Jews believe their religion began with Anan ben David among the Mesopotamian Jews, the followers of Karaimism believe he was converted to this religion by someone called Abu Hanifa (see Leon Nemoy's "Karaite Anthology").

2001, in Anna-Mária Bíró and Petra Kovács (editors), Diversity in Action: Local Public Management of Multi-Ethnic Communities in Central and Eastern Europe,^[8] LGI Books, ISBN 9637316701, page 333:
Karaims are one of ancient indigenous peoples of the Crimea. The Karaim language belongs to the Turkish language group. Their religion is Karaimism, based on the Old Testament.

Yet again no reference to Karaite Judaism.

2005, International Institute of Crimean Karaites, Karaites of Turkey:^[9]
page 7: When the Karaimism (the mainstream that does not recognize the authority of the post-Biblical tradition incorporated in the Talmud and in the later rabbinical works) became the state faith of Khazarian Kaganate in 740 A.D., the community of […]

Again in Khazaria, but no mention of Karaite Jews.

Elijah (Eliyagu) Bashiyachi ben rabbi Menahem from Adrianople (died in 1490 in Constantinople, where he lived); one of the most outstanding scientists, writers and teachers of the XV century, last principal authority in Karaimism.

Indeed he was inspired by Karaimism and borrowed from it because it had innovative solutions to the problems encountered by Karaite Jews, but his approaches are not recognised as legitimate by modern Karaite Jews.

2006, Victor Tiriyaki, Complex of Karaite Kenassas in Eupatoria and Other Kenassas Around the World,^[10] Vadim Mireyev (publisher), page 3:
There are also some Jews and Russians who profess Karaimism. The principles of Karaimism are derived from the books of the Torah (the Old Testament). The aim of Karaimism is to serve G-d, love everybody, and preserve the laws of the Torah. The Karaites are distinct from Rabbinical Jews, Muslims, and Christians and therefore do not use concepts from Talmud, Koran, […]

Indeed some Jews (for example Gershom Tzipris and his followers), or perhaps one should better say ethnic Jews because their Halakhic status is questionable, have indeed abandoned normative forms of Judaism (including Karaite Judaism) and adopted Karaimism with its Christian prayers instead. Victor is correct in saying this, but it is not many by any means, literally just some. I hope you can see how important it can be to make sure that there are no grey areas in addressing such words and their usages. Many famous academics who are non-experts in this field have shamefully misused these words simply on the basis of their assumptions (although in their defence everyone has a blind spot). Wiktionary can help people navigate this complex terrain if we do things right. Looking forward to your enlightening response. WordsWorth (talk) 18:11, 18 October 2012 (UTC)

My "enlightening response" is simply that we are a dictionary, and we document terms as they are used, not as we think they should be used. If the people who use the term "Karaimism" fail to distinguish between two things that we think should be distinguished, then that's all there is to it. We don't get a say. —Ruakh 18:32, 18 October 2012 (UTC)

Great, then let's do that! Also, when you really have found an educated example where Karaimism "without doubt" refers to Karaite Judaism and not to the religion of the Turkic "Khazar" Karaims, let me know. I am only suggesting due diligence in our duty of care against making invalid assumptions. Until that time, the references you found provide good evidence that Karaimism refers to the religion of the Turkic "Khazar" Karaims of Eastern Europe, and that is a good enough definition for the page, is it not? Define the word as it is used, not as you assume it is. Best wishes. WordsWorth (talk) 07:17, 19 October 2012 (UTC)

British and Irish Isles

Hi,

I re-created this entry with citations. I presumed this is OK given your comment on the talk page. However, I notice it has just been deleted (pretty much straight away). The sources I used were:

[quotation not preserved: template rendering error]

[quotation not preserved: template rendering error]

I've notified the deleting admin.

Regards, --Rannpháirtí anaithnid (talk) 21:31, 17 October 2012 (UTC)

Actually, neither of those sources is relevant, because both are mentioning the term rather than using it; see w:Use–mention distinction. We are a secondary source, not a tertiary source; we base our entries directly on primary sources that actually use a term. (This is in contrast to Wikipedia, which is a tertiary source, basing its articles on secondary sources that discuss a concept.) —Ruakh 21:51, 17 October 2012 (UTC)

Ah. I had assumed otherwise. Are these better:

[quotation not preserved: template rendering error]
[quotation not preserved: template rendering error]

--Rannpháirtí anaithnid (talk) 22:21, 17 October 2012 (UTC)

Yes, much. (Note that we actually require three uses, but I'm sure you'll have no difficulty finding a third.) —Ruakh 22:36, 17 October 2012 (UTC)

Cool. Here's a third ;-)

[quotation not preserved: template rendering error]

Should I hang on for User:SemperBlotto or is it safe to re-create the entry? --Rannpháirtí anaithnid (talk) 22:58, 17 October 2012 (UTC)

I think you can go ahead and re-create it; just be sure to include the quotations in there from the get-go. And be prepared for the possibility that it will be listed for deletion, since some of the objections were more along the lines of "this doesn't belong in a dictionary" than along the lines of "this doesn't exist". —Ruakh 00:48, 18 October 2012 (UTC)

@Rannpháirtí anaithnid: I went ahead and restored the entry, since that's easier than you having to retype the whole thing. - -sche (discuss) 01:43, 18 October 2012 (UTC)

RE: A bunch of copyvio

Just wanted to let you know that I was SnoopY (old account from high school, lost the logins) and was the anon who edited, reported, and followed up with you. Brownie Charles (talk) 17:22, 19 October 2012 (UTC)

Joseon814

User_talk:Joseon814 has not provided the sources used for adding Goguryeo. In addition to the user page, I have asked that person twice by e-mail. Once I got a response that made no sense, and I have since sent two more e-mails. In the one I just sent, I said I would recommend deleting all of Joseon814's contributions if I don't hear back within three days. --BB12 (talk) 21:19, 22 October 2012 (UTC)

He (or she — assuming "he" henceforth for convenience of typing) e-mailed me five times to tell me his source, plus once to tell me that he didn't know how to reply to talk-page messages. I tried to explain how he could do so, but he just kept re-e-mailing me variations on the same message over and over, until I finally modified his block settings to stop him from sending e-mail. (Technically speaking, he should already have my e-mail address, since I had replied to two of his messages, but he seems to have ignored my replies, so it wasn't a problem.) Anyway, his source is some edition of Christopher I. Beckwith's Koguryo, the Language of Japan's Continental Relatives: An Introduction to the Historical-comparative Study of the Japanese-Koguryoic Languages with a Preliminary Description of Archaic Northeastern Middle Chinese. According to Buyeo languages#Japanese–Koguryoic hypothesis, "Beckwith reconstructs about 140 Goguryeo words, mostly from ancient place names" (emphasis mine), so apparently these are scholarly hypotheses rather than actual attested terms. (Actually, it's probably even worse than that: I can't imagine that Beckwith's book presents his Goguryeo reconstructions in Hanzi rather than Latin script, so unless I miss my guess, Joseon814 is extrapolating wildly.) Conclusion: yes, I think these should all be deleted. —Ruakh 21:54, 22 October 2012 (UTC)

Thank you for that. Now I understand the cryptic "Koguryeo Language of Japan's Continental Relatives" s/he mentioned. Reconstructions can be put in an appendix, right? Could we suggest that to avoid bumming Joseon814 out? --BB12 (talk) 21:59, 22 October 2012 (UTC)

Re: "the cryptic 'Koguryeo Language of Japan's Continental Relatives' s/he mentioned": Yeah, it took me a while to figure it out: three of his e-mails gave the wrong title, and another one gave only part of the title. Re: "Reconstructions can be put in an appendix, right?": I don't know. That's what we do for accepted reconstructions in languages like PIE; I'm not sure that we do it for individual scholars' one-off reconstructions. (You might want to discuss this with CodeCat; she's done a lot with reconstructed languages here, and might have better input.) Even if we allow that in this case, we'd need to do a good job citing the source work (to avoid plagiarism), and I really don't trust Joseon814 to do that. :-/ —Ruakh 22:28, 22 October 2012 (UTC)

As Joseon814 is not responding and does not seem likely to, I would like to propose that all Goguryeo data be stricken from Wiktionary, about 120 entries. Should I advertise this proposal on the BP? See Category:Goguryeo_language. — This unsigned comment was added by BenjaminBarrett12 (talk • contribs) at 20:51, 2 November 2012 (UTC).

Probably at BP, and at RFD post a link to the BP discussion? —Ruakh 21:16, 2 November 2012 (UTC)

Reply (rite of passage, bot)

Thank you for the explanation! I indeed saw nothing out of place in the bot's edit; should've asked at the Grease Pit first, sorry for the trouble. Cheers, --CopperKettle (talk) 14:37, 23 October 2012 (UTC)

No worries. :-) —Ruakh 14:38, 23 October 2012 (UTC)

Spanish feminine nouns

Hi. I'm not sure which entry you were referring to, and I apologize if I added gender to an adjective; I usually don't. But there are nouns with two gender forms in Spanish and only one in English, particularly those referring to people's occupations, such as geographer, and in those cases both genders should be provided in the translation, don't you agree? Cheers, --Edgefield (talk) 17:50, 23 October 2012 (UTC)

Re: adjectives: I was referring to [[transitive]]. And there's no need to apologize; people do it a lot.

Re: nouns: I, like you, prefer to give both, but some people prefer to give only the male/masculine form (at least, in cases where the female/feminine form is regularly derived from it). I don't think we have a standard one way or the other.

—Ruakh 18:43, 23 October 2012 (UTC)

Fair enough, thanks! --Edgefield (talk) 21:59, 23 October 2012 (UTC)

ku-noun

Hi Ruakh

I added (or tried to add; see here) the parameters for plural forms in Kurdish, but you can't see the green link for easy creation of plural forms. Could you fix the template so that it will work? Thanks in advance and best regards. George Animal. 17:50, 24 October 2012 (UTC)

I'm sorry, but that template is a mess; it looks like a lot of different things have been copied-and-pasted into it, and some of them have been commented out, but others haven't. As a result, it expects the plural to be parameter #2, except that it actually/also expects the plural to be pl= . . . Overall, the lack of a green link is really the least of its problems. You should fix everything else first, and then it won't be so hard to get the green link working. —Ruakh 18:16, 24 October 2012 (UTC)

Also — that template is used on almost a thousand pages. You should perform your experiments in a sandbox, and only copy them to {{ku-noun}} once you have them working. —Ruakh 18:24, 24 October 2012 (UTC)

Dump question

Just curious, how long does it take you to get through a database dump with your Perl script (assuming you aren't doing any heavy processing of the results at the same time)? DTLHS (talk) 02:56, 25 October 2012 (UTC)

Typically between 3½ and 4 minutes. —Ruakh 03:20, 25 October 2012 (UTC)

To elaborate a bit . . . that's if I write bzip2 -d < enwiktionary-20121021-pages-articles.xml.bz2 | perl …, which is usually what I do. If I write bzip2 -d < enwiktionary-20121021-pages-articles.xml.bz2 > enwiktionary-20121021-pages-articles.xml and perl … < enwiktionary-20121021-pages-articles.xml as separate commands, then the latter takes under thirty seconds. Clearly that is the approach I should be taking, since I do have enough hard-drive space for it.
How about you?
—Ruakh 13:01, 31 October 2012 (UTC)

Several hours. This is in Python with the xml.sax parser on the unzipped dump. Obviously not optimal... After reading your comments below I'll probably abandon any official XML parser and just process the file line by line. DTLHS (talk) 22:02, 1 November 2012 (UTC)
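
For illustration only, the line-by-line approach DTLHS mentions might look roughly like the sketch below. It is not anyone's actual script. It assumes the usual pages-articles layout, in which each `<title>…</title>` sits on one line and each `<text …>` element opens on its own line; the names `open_dump`, `iter_pages` and `SAMPLE` are made up for this example.

```python
import bz2
import html
import io
import re

# Assumption: every <title>...</title> fits on a single line of the dump.
TITLE_RE = re.compile(r"<title>(.*?)</title>")


def open_dump(path):
    """Open a dump as text, decompressing on the fly if the name ends in .bz2."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, encoding="utf-8")


def iter_pages(lines):
    """Yield (title, wikitext) pairs by scanning lines, without an XML parser."""
    title = None
    buf = None  # list of text fragments while inside <text>...</text>
    for line in lines:
        if buf is None:
            match = TITLE_RE.search(line)
            if match:
                title = html.unescape(match.group(1))
            start = line.find("<text")
            if start == -1:
                continue
            gt = line.find(">", start)
            if gt == -1:
                continue  # malformed opening tag; skip it in this sketch
            if line[gt - 1] == "/":
                yield title, ""  # self-closing <text ... /> means an empty page
                continue
            rest = line[gt + 1:]
            end = rest.find("</text>")
            if end != -1:
                yield title, html.unescape(rest[:end])
            else:
                buf = [rest]
        else:
            # Wikitext is XML-escaped, so a literal "</text>" only ends the element.
            end = line.find("</text>")
            if end != -1:
                buf.append(line[:end])
                yield title, html.unescape("".join(buf))
                buf = None
            else:
                buf.append(line)


# Tiny hand-made sample in the assumed layout, for trying the function out.
SAMPLE = (
    "<page>\n"
    "    <title>foo &amp; bar</title>\n"
    "    <revision>\n"
    '      <text xml:space="preserve">line one\n'
    "line two</text>\n"
    "    </revision>\n"
    "</page>\n"
    "<page>\n"
    "    <title>empty</title>\n"
    '    <text xml:space="preserve" />\n'
    "</page>\n"
)
```

Something like `for title, text in iter_pages(open_dump("enwiktionary-20121021-pages-articles.xml")): ...` would then walk the uncompressed file; passing the .bz2 name works too, at the cost of decompressing on every run.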

對象

Hi, please give me a reason why you reverted my Vietnamese meaning of this word. http://en.wiktionary.org/wiki/%E5%B0%8D%E8%B1%A1 Vutrankien (talk)

I didn't: you never added the meaning. You just created a sort of stub section that claimed that this is a Vietnamese Han character. (Which isn't even true: it's a sequence of two Han characters. If it exists in Vietnamese, then it must be a noun or a verb or an adjective or something.) —Ruakh 12:41, 25 October 2012 (UTC)

So I understand that I should add information about whether it is a noun or a verb, right? Because in Vietnamese that two-character Han compound is a widely used term meaning 'object', and as I understand it the Han writing system was used in Vietnamese until around 1930.

I've seen you revert some of my other additions too: 計算, 題材, 記憶. Those are widely used terms in Vietnamese as well (though that writing system is no longer used).

— This unsigned comment was added by Vutrankien (talk • contribs) at 16:05, 30 October 2012 (UTC).

Take a look at 大洋#Vietnamese. I'm not sure if it's 100% perfect — I don't speak Vietnamese, so I really can't judge — but it looks like an excellent starting-point. You can use it as a model for your own entries. —Ruakh 17:04, 30 October 2012 (UTC)

Yes, the entry 大洋#Vietnamese is correct. @Ruakh: Do you know how many Arabic translations have been stuffed by removing alt? I think I saw you posting about it somewhere. --Anatoli (обсудить/вклад) 13:28, 31 October 2012 (UTC)

Are you referring to Wiktionary:Grease pit/2012/October#xte, and my comment that the {{t}}-ification script tries to change {{Arab|[[أجل|أجَل]]}} (’ájal) to {{t|ar|أجَل|tr=’ájal|sc=Arab}}? If so — I have no idea how many, if any, have been broken in this way. But we can probably find such cases after-the-fact, since if I'm not mistaken, the rules for recognizing a valid page-name (like أجل; as opposed to valid page-text, like أجَل) are straightforward (right?), so we could analyze the XML dump to look for cases of Arabic translations where the page-name is invalid in this way. —Ruakh 14:51, 31 October 2012 (UTC)

Yes, that's the one. Among others, I have fixed the Arabic translation of flag from {{t-|ar|راية|رايَة|f|tr=rāya}} to {{t-|ar|راية|f|alt=رايَة|tr=rāya}}, which must be in the same boat. I'd appreciate it if you could make a list from a dump (if it's not too hard). I must be missing some tools/skills to work with the XML dump. --Anatoli (обсудить/вклад) 21:38, 31 October 2012 (UTC)

Nope, [[flag]] is unrelated; but you can see a list of entries with that sort of problem at Wiktionary:Todo/Weird translations templates. —Ruakh 21:59, 31 October 2012 (UTC)

Wow, that's a lot of problems. Thanks. --Anatoli (обсудить/вклад) 22:12, 31 October 2012 (UTC)
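
As a rough illustration of the dump check suggested earlier in this thread (flagging Arabic translations whose link target carries vowel marks), a sketch could look like the following. It is not the script behind Wiktionary:Todo/Weird translations templates. The set of characters treated as invalid in a page-name (short-vowel marks U+064B to U+0652, superscript alef U+0670, tatweel U+0640) is an assumption, as is the restriction to {{t}}, {{t+}} and {{t-}}; the function name is invented.

```python
import re

# Assumption: Arabic page-names never contain these marks
# (harakat U+064B-U+0652, superscript alef U+0670, tatweel U+0640).
ARABIC_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")

# First positional argument after the language code in {{t|ar|...}},
# {{t+|ar|...}} or {{t-|ar|...}}.
T_TEMPLATE = re.compile(r"\{\{t[+-]?\|ar\|([^|{}]+)")


def suspicious_arabic_targets(wikitext):
    """Return link targets of Arabic translations that contain vowel marks."""
    return [
        m.group(1)
        for m in T_TEMPLATE.finditer(wikitext)
        if ARABIC_MARKS.search(m.group(1))
    ]


# The two words from the discussion, written with escapes so the marks are explicit.
AJAL_WITH_FATHA = "\u0623\u062C\u064E\u0644"  # page-text form, includes U+064E
RAYA_PLAIN = "\u0631\u0627\u064A\u0629"       # valid page-name, no marks

EXAMPLE = (
    "{{t|ar|" + AJAL_WITH_FATHA + "|tr=\u2019\u00E1jal|sc=Arab}} "
    "{{t-|ar|" + RAYA_PLAIN + "|f|alt=\u0631\u0627\u064A\u064E\u0629|tr=r\u0101ya}}"
)
```

Run over each page's wikitext from a dump, this flags only the first template in `EXAMPLE`; the second passes because its vowelled form sits in alt= and not in the link target.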

Template:l

I just changed it back to what it was before CodeCat removed Xyzy. Specifically, this revision. -- Liliana • 17:04, 27 October 2012 (UTC)

Xyzy now handles the case that sc= is specified but blank. This change was needed, because of templates that called {{term}} using sc={{{sc|}}}. It is possible that there are no places where {{l}} has a specified-but-blank sc=, in which case your change probably didn't break anything; but I assume you didn't check?

I'm really willing to help with changes, if I have an opportunity to: i.e., if they're discussed beforehand. But you don't like to discuss changes, and you don't like to receive help, so my only recourse is the "rollback" button.

—Ruakh 17:54, 27 October 2012 (UTC)Reply

The thing is, it was like this before, so strictly speaking, your edit would have needed consensus, because it's a change from the previous state. -- Liliana • 17:57, 27 October 2012 (UTC)

I did discuss those changes at the time. (It's funny — you're so unused to discussion and consensus-gathering that you can't even imagine that anyone else actually engages in them.) —Ruakh 18:27, 27 October 2012 (UTC)Reply

Otherwise, I'll just go ahead and create {{l2}}. I've done this before to put pressure on people. (Remember {{poscatboiler2}}? That was my doing.) -- Liliana • 17:58, 27 October 2012 (UTC)Reply

I recommend you remove that comment (and this reply to it). Freely admitting that your reason for taking an action is that you know people will object to it? Is a pretty quick path to permablocking. —Ruakh 18:27, 27 October 2012 (UTC)Reply

I'm not acting with malicious intent; I just want to fix the performance issue. It is a fact that {{l2}} is much faster than {{l}}, just as {{t-simple}} is much faster than {{t}}. It should probably be kept to high-profile pages, but other than that, I don't see what's so bad about the template. -- Liliana • 18:42, 27 October 2012 (UTC)

Relatedly, so I don't overreact to the WT:REE and Wiktionary:Requested entries (Scientific names) performance improvements, why is {{l}} a good idea when used with {{{1}}}=en compared to a plainlink? On pages with loading-time problems, it seems undesirable in routine use. Of course other languages benefit from it or from {{term}} depending on the circumstances and sometimes it is easier to type than "#English" in a plainlink. DCDuring TALK 18:12, 27 October 2012 (UTC)Reply

I don't think {{l|en|...}} is much (any?) improvement over [[...#English|...]]. —Ruakh 18:27, 27 October 2012 (UTC)Reply

It does apply some helpful classes, though. Dunno if anyone uses them but... -- Liliana • 18:28, 27 October 2012 (UTC)Reply

The problem is that it's not billed as doing so, and the result doesn't really work. It's currently used in places where those classes are already added, by the containing context; so if someone, say, wanted English-tagged text to appear somewhat bigger than the surrounding text, then sometimes the result would be that it appeared much bigger. —Ruakh 18:33, 27 October 2012 (UTC)Reply

Dumps

Hi, I'm trying to get a bit of an idea how to use XML dumps (yes, finally!). But I wonder which of the dumps listed at [11] I should use? I imagine I want one that is as small as possible so that it loads faster. It really takes a very long time to go through the dump, so I'm wondering if I'm approaching it the wrong way. I am trying to generate a list of all transclusions of {{recons}}, but going through every page in the dump looking for that template seems a bit too slow. I could probably tell the bot to fetch the list of transclusions from Wiktionary itself, but then I would like to parse only the entries in the dump that are in that list of transclusions. How can I do that without having to parse the whole dump anyway? —CodeCat 14:27, 31 October 2012 (UTC)

[after e/c; note: this is a reply to your original comment]

Re: "I wonder which of the dumps listed at [12] I should use?": With the exception of enwiktionary-YYYYMMDD-pages-meta-history.xml.7z and enwiktionary-YYYYMMDD-pages-meta-history.xml.bz2 (which are the same except their compression-format: 7-Zip vs. bzip2), each file has different information; so, you should read the descriptions to figure out which one(s) you need. The one that I use most often — and the one I am invariably referring to when I say "the XML dump" without qualification — is enwiktionary-YYYYMMDD-pages-articles.xml.bz2 (just look for the one whose description is <big> and <b>). It includes the current wikitext of almost the entire project.

Re: "I imagine I want one that is as small as possible so that it loads faster": You probably don't want to "load" the entire dump into memory; rather, you normally want to examine one page at a time, writing out whatever information interests you. (In particular: although the file is technically an XML document, you do not want to pass it to some sort of XML parser that parses the whole file into a ginormous DOM object. Personally I don't bother with an XML parser at all — the file is in such a regular/consistent/straightforward format that you don't need full XML support — but you can use an XML parser, if you want, provided that either (1) it's a streaming parser, giving you a token at a time rather than parsing the whole document at once, or (2) you break the document into <page>…</page> sections, and hand one at a time to the XML parser as though it were a complete document.)
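As a rough illustration of option (2), here is a minimal Python sketch that yields one <page>…</page> section at a time without ever holding the whole dump in memory. It assumes that <page> and </page> each sit on their own line, which is an assumption about the dump's layout rather than something the schema guarantees. The function names iter_pages and iter_dump_pages are hypothetical.

```python
import bz2


def iter_pages(lines):
    """Yield each <page>...</page> section as one string.

    Assumes the opening and closing page tags each appear on a line of
    their own; anything outside a page section is skipped.
    """
    buf = None
    for line in lines:
        if "<page>" in line:
            buf = []
        if buf is not None:
            buf.append(line)
            if "</page>" in line:
                yield "".join(buf)
                buf = None


def iter_dump_pages(path):
    """Stream pages from a dump file, decompressing .bz2 on the fly."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt", encoding="utf-8") as dump:
        yield from iter_pages(dump)
```

Each yielded string can be searched directly, or handed to an XML parser as though it were a complete document.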

Re: "How long would it normally take to parse the whole file?": If I unzip it first, then — less than 30 seconds. If I unzip it on-the-fly, piping output from bzip2 -d into perl, then — between 3½ and 4 minutes.

Let me know if I can give you any help getting started.

—Ruakh 14:34, 31 October 2012 (UTC)Reply

What is the approximate size of these dumps, zipped and unzipped? --Wiki Tiki 89 14:55, 31 October 2012 (UTC)Reply

The download page gives the zipped sizes. enwiktionary-20121021-pages-articles.xml.bz2 is 344.7 MB. For unzipped sizes, I think you can assume they're about ten times bigger (for the .bz2 and .gz files). —Ruakh 15:01, 31 October 2012 (UTC)Reply

Re: "I am trying to generate a list of all transclusions of {{recons}}": In Perl, I would probably write something like

perl -nwe 'BEGIN { $/ = "</page>\n" } next unless m/{{\s*recons\s*\|/; die unless m{<title>([^<]+)</title>}; my $title = $1; die unless m{<text xml:space="preserve">([^<]+)</text>} or m{<text xml:space="preserve" />()}; $_ = $1; print "$title\t$1\n" while m/({{\s*recons\s*\|.*)/g' < enwiktionary-20121021-pages-articles.xml > uses-of-recons.txt

so I had a small working set, and then do whatever further analysis I wanted. —Ruakh 15:01, 31 October 2012 (UTC)

I'm using the XmlReader.py script that is part of the PyWikipediabot package that I use for all bot work. It uses a streaming parser, so it can parse entries in the dump as they are requested by the script - Python has a feature called "yield" for this, which allows a function to generate new elements of a list as they are iterated over. It returns them to me as whole pages with metadata already parsed, which is convenient. However, for it to iterate through every page in the dump takes several minutes. I believe that it uncompresses the file on the fly, so I could try uncompressing it myself first to see what happens. —CodeCat 15:16, 31 October 2012 (UTC)

I tried it and compared the times. Uncompressed, iterating through the dump without any processing whatsoever takes 273 seconds, and when the dump is compressed it takes 480 seconds. So there is a significant difference, but still nowhere near the 30 seconds that your script achieved, probably because it does a lot of extra parsing to include the metadata. Unfortunately I can't read Perl code, so can you explain what steps your code above does so that I can recreate it? (Note that I am not just looking for pages that transclude {{recons}}, I want to extract the parameters of each invocation too, so that I can build a list of which reconstructed terms are being linked to and from which pages) —CodeCat 15:39, 31 October 2012 (UTC)

Re: "I am not just looking for pages that transclude {{recons}}, I want to extract the parameters of each invocation too": Yeah, the above Perl script prints out everything from {{recons| to the end of the line, so you can do whatever subsequent analysis you want.

Re: "Unfortunately I can't read Perl code, so can you explain what steps your code above does so that I can recreate it?": Oh, gosh, I can try . . .

The options -nwe:

-n wraps the entire program in a loop that reads one line at a time, saves that line in the default variable ($_), runs the specified program, and loops.
-w enables warning-messages.
-e '...' says that '...' itself is the text of the program. (N.B. this is in Bash, so I can wrap an argument in '. If I were using the regular Windows command-prompt, I'd have to use ", and then avoid " inside the program. Fortunately that's not hard in Perl. Dunno about Python.)

The program itself:

BEGIN { $/ = "</page>\n" } runs before the program begins, and says that the line-terminator ($/) is the string </page>\n. (So, instead of giving me one line at a time, Perl will now give me one page at a time.)
next unless m/{{\s*recons\s*\|/; says to skip to the next loop-iteration unless $_ (the page-XML) contains {{ followed by optional whitespace followed by recons followed by optional whitespace followed by |. (This is just for performance reasons; the program would still behave the same way even without this line.)
die unless m{<title>([^<]+)</title>}; says to error-out unless $_ (the page-XML) contains <title>...</title>. More importantly, it saves the ... (the title, except with & written as &amp;, etc.) in the variable $1.
my $title = $1; copies the title (XML-ified) to the new variable $title for later use.
die unless m{<text xml:space="preserve">([^<]+)</text>} or m{<text xml:space="preserve" />()}; says to error-out unless $_ (the page-XML) contains either <text xml:space="preserve">...</text> (non-empty page text) or <text xml:space="preserve" /> (empty page). More importantly, it saves the ... in the former case, or the empty string in the latter case, in the variable $1.
$_ = $1; copies the page-text (XML-ified) to the default variable, $_. (Previously we'd had the full page-XML in $_, but we don't need that anymore.)
print "$title\t$1\n" while m/({{\s*recons\s*\|.*)/g finds each line containing {{ followed by optional whitespace followed by recons followed by optional whitespace followed by |. For each line, it prints the page-title (XML-ified), a tab, and everything from the {{ to the end of the line (XML-ified). (Actually, this description is somewhat oversimplified — the "optional whitespace" I mentioned can include line-breaks — but that's the idea.)

< enwiktionary-20121021-pages-articles.xml > uses-of-recons.txt are shell notations for taking standard-input from enwiktionary-20121021-pages-articles.xml and sending standard-output to uses-of-recons.txt.

Naturally, the above is written in a way that plays to Perl's strengths. If I were writing it in Python, I'd handle a lot of things differently, because Python doesn't have strengths.
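For comparison, a fairly literal Python rendering of the Perl steps described above might look like the sketch below. It is not tuned for speed and is not how the Perl author would necessarily write it. The names split_records, uses_of_recons, and main are hypothetical, and like the Perl it leaves titles and text XML-escaped.

```python
import re
import sys

PRECHECK = re.compile(r"\{\{\s*recons\s*\|")
TITLE = re.compile(r"<title>([^<]+)</title>")
TEXT = re.compile(r'<text xml:space="preserve">([^<]+)</text>')
EMPTY_TEXT = '<text xml:space="preserve" />'
# "." does not match a newline, so each hit runs to the end of its line.
RECONS_LINE = re.compile(r"\{\{\s*recons\s*\|.*")


def split_records(lines):
    """Group lines into records ending in '</page>' plus newline (Perl's $/)."""
    buf = []
    for line in lines:
        buf.append(line)
        if line.endswith("</page>\n"):
            yield "".join(buf)
            buf = []


def uses_of_recons(page_xml):
    """Return (title, rest-of-line) pairs for each {{recons|... in a page."""
    if not PRECHECK.search(page_xml):
        return []  # cheap skip, as in the "next unless" step
    title_match = TITLE.search(page_xml)
    if title_match is None:
        raise ValueError("no <title> found")
    text_match = TEXT.search(page_xml)
    if text_match is None:
        if EMPTY_TEXT not in page_xml:
            raise ValueError("no <text> found")
        return []  # empty page text
    title = title_match.group(1)
    return [(title, m.group(0)) for m in RECONS_LINE.finditer(text_match.group(1))]


def main():
    """Read a dump on standard input, write title<TAB>match lines to output."""
    for record in split_records(sys.stdin):
        for title, hit in uses_of_recons(record):
            print(title + "\t" + hit)
```

Calling main() with the uncompressed dump redirected to standard input would write the same kind of tab-separated working set as the one-liner.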

By the way, note that this code misses some cases. If an entry contains {{recons}} with no arguments, this will miss it. If {{recons}} appears on a talk-page, this will miss it. If {{recons}} is called indirectly, e.g. via {{proto}}, this will miss it. If the call to {{recons}} contains line-breaks, this will catch it, but in a less-than-perfect way. This is par for the course in analyzing the XML dump: it's impossible to match all cases perfectly, and it's not worth the effort to try too hard to come close. We just have to make the best tradeoff we can — and encourage editing practices that promote the tractability of wikitext.
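Two of the cases listed above (calls with no arguments, and calls containing line-breaks) can be picked up by matching whole calls instead of lines, at the cost of a different tradeoff. The Python sketch below does that and also splits each call into positional and named parameters. It assumes that a call contains no nested templates and no piped links, which will not hold for every entry. The name parse_recons_calls is hypothetical.

```python
import re

# A whole call: {{recons}} or {{recons|...}} with no braces inside.
# [^{}] also matches newlines, so multi-line calls are covered.
CALL = re.compile(r"\{\{\s*recons\s*((?:\|[^{}]*)?)\}\}")


def parse_recons_calls(wikitext):
    """Return a (positional, named) pair for each {{recons}} call found."""
    calls = []
    for match in CALL.finditer(wikitext):
        raw = match.group(1)
        parts = raw[1:].split("|") if raw else []
        positional, named = [], {}
        for part in parts:
            name, sep, value = part.partition("=")
            if sep:
                named[name.strip()] = value.strip()
            else:
                positional.append(part.strip())
        calls.append((positional, named))
    return calls
```

Calls made indirectly through another template, or sitting on talk-pages that the dump omits, are still missed.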

—Ruakh 17:47, 31 October 2012 (UTC)Reply

excessive and defective spellings

Hi. I hope you're well. Your comment makes it sound as though you view excessive spelling (at least as it's used in the displayed text of {{he-excessive spelling of}}) as derogatory or implying a negative judgment, much as excessive is usually used (put excessive pressure, etc.). (Do you view defective spelling that way too?) I'd been understanding excessive spelling as not derogatory but a term of art, meaning simply "spelling with matres lectionis not found in some other spelling" with no judgment implied (and likewise but in reverse for defective spelling). Thus, in my view, anywhere {{alternative spelling of}} can be used, so can the other, if the difference in spelling is (and is only) the presence/absence of matres lectionis. We should sort this out, because, if we're going to be doing things your way, then I'll need to know for my future edits and I suspect there are already a good many entries that need fixing.—msh210℠ (talk) 19:48, 31 October 2012 (UTC)Reply

I don't view either term as derogatory, but I think of them as both being relative to some sort of contextually-determined implicit default spelling, such that "defective spelling of _____" implies that "_____" is the default. For example, when people talk about excessive spellings (or rather "plene spellings" — judging from the Google-hits, "excessive spellings" seems to be a me-ism that I accidentally introduced into our Hebrew template system) in the Dead Sea Scrolls, I take that to mean, relative to the [most common] spelling in the Masoretic Text. For our purposes, I take the implicit default to be the spellings that are currently common in text without nikúd. Perhaps I am wrong to take it that way. (Actually, I guess this is the same issue that most editors seem to have with "alternative spelling" — they take it to imply "less common" or "less good" — but I've managed to internalize the idea that {{alternative spelling of}} just means "we've put the main definition elsewhere", whereas apparently I haven't managed to internalize the analogous ideas for "excessive" and "defective" spellings.) —Ruakh 20:04, 31 October 2012 (UTC)Reply

Re "excessive" vs. "plene": that's the beauty of templates: we can just fix the template. Re "I take the implicit default to be the spellings that are currently common in text without nikúd": ah, so this is our old difference of opinion over what we should consider primary. Even if you do take that to be primary, though, I don't think {{he-excessive spelling of}} is any more indicative of "the other one's better" than {{alternative spelling of}} is (which seems to be what you're on the verge of agreeing to in your final, parenthetical remark). Nor less.—msh210℠ (talk) 22:38, 31 October 2012 (UTC)

Plene? We have that as a vocative (of an adjective which actually makes sense with this meaning), which sounds like an odd choice. Perhaps we're missing a sense of plene. Or did you perhaps mean one of the nominative singular forms?—msh210℠ (talk) 22:42, 31 October 2012 (UTC)

Re: main point: O.K., yeah, I'm pretty sure that I can bring myself to accept that use of these templates. Re: "plene": I don't know Latin; I took that term from the English Wikipedia, because it sounded familiar, and fared well on Google and on Google Books. I'm pretty sure we are missing some sense at [[plene#Latin]], because [[plenus#Latin]] lists plēnē as a related term (whereas the form that you mention is plēne, and not liable to be listed in the related-terms section); but I can't say whether said missing sense is the relevant one. —Ruakh 23:44, 31 October 2012 (UTC)

Re Latin: plēnē means "fully", but this is probably just an anglicisation of plēnus/plēna/plēnum itself, following the standard pattern. Re Hebrew: I don't have much of an opinion, but I tend to agree with Ruakh about the "implicit default". —Μετάknowledge (discuss/deeds) 03:45, 1 November 2012 (UTC)

FWIW, Dictionary.com has "plene" as an English word meaning "full, complete", but marks it obsolete. - -sche (discuss) 04:04, 1 November 2012 (UTC)Reply

November 2012

דובים

Do you know why w:he:דוביים uses this strange spelling with two yuds? Or is it supposed to be "דֻּבִּיִּים", meaning "bear-like things"? --Wiki Tiki 89 09:16, 4 November 2012 (UTC)

he.WP names these articles by a sort of Hebrew scientific name for the taxon, in this case דוביים (dubiyím). (It really threw me for a loop, too, the first time I encountered it.) Notes:

I don't know where exactly these Hebrew scientific names come from, but he.WP isn't making them up; my almost-forty-year-old, hard-copy, twenty-volume translation of Félix Rodríguez de la Fuente's schoolchild encyclopedia of wildlife also mentions them. (Well, actually, it only seems to use them down to the level of the family, whereas he.WP uses them for even-lower-ranked taxa; but it could just be that the Hebrew terminology has made progress since then. Or it could have been an editorial decision, since a lot of he.WP's genus names are things like Template:Hebr and Template:Hebr that are identical with common names, and that may have been judged too confusing for schoolchildren.)
But not everyone uses them. Even-Shoshan, when he feels the need to give a scientific name, always gives the regular/international/Latin-like/English-script one. (Though in part that could be because the Hebrew scientific name would generally be closely based on the Hebrew common name that he's defining, so, not very useful.)
And "scientific" might not be the right word, anyway, because although they seem to be coextensive with corresponding scientific names — e.g., Template:Hebr meaning exactly the order Psittaciformes, no more, no less — he.WP articles nonetheless give the international name as the Template:Hebr ("scientific name").
Although he.WP's naming-conventions document has [[w:he:ויקיפדיה:מתן שם לערך#ערכים על אורגניזמים|a section called Template:Hebr ("articles on organisms")]], it does not mention this practice, and what's more, it seems unaware of the existence of any Hebrew scientific names. I really don't know what to make of that.

—Ruakh 14:58, 4 November 2012 (UTC)Reply

Interesting! You may want to take a look at this discussion... perhaps the Hebrew names should be added as translations of the taxonomic names? - -sche (discuss) 22:09, 4 November 2012 (UTC)Reply

clarification

This was made back in the day when I was mostly just blindly copying stuff. I hope that you don’t think that I am stupid, as your summary could imply. --Æ&Œ (talk) 02:16, 10 November 2012 (UTC)Reply

I'm sorry if my edit-summary offended you. I wrote it tongue-in-cheek. I wasn't sure exactly what led to that error, but I opted for the funniest explanation rather than the plausiblest. :-P —Ruakh 02:34, 10 November 2012 (UTC)Reply

Yeah, I have a nasty tendency for the over‐dramatic; a lot of jokes go over my head. Please excuse me whilst I go listen to emo music and bawl in my pillow for several hours crying about how you don’t love me and you think that I am stupid. --Æ&Œ (talk) 02:45, 10 November 2012 (UTC)Reply

Template_talk:term#mention-tr-gloss-paren

Any comments on this, or not? -- Liliana • 17:13, 12 November 2012 (UTC)Reply

Debug Javascript?

I'm trying to fix a script but it's not quite doing what it's supposed to. I would like to show a debug message of some sort, to see what the value of a variable is at a certain point in the script. How can I do this? Is there a way to show an alert message that only I can see, not anyone else? —CodeCat 19:17, 15 November 2012 (UTC)

Yeah, I do that all the time. For example, in MediaWiki:Gadget-PatrollingEnhancements.js you'll find this:

if(mediaWiki.config.get('wgUserName') == 'Ruakh')
alert(shortMsg + '\n' + debugInfo);

—Ruakh 19:33, 15 November 2012 (UTC)Reply

Ok, thank you! —CodeCat 19:37, 15 November 2012 (UTC)

Template:PL:pedia

I think Mglovesfun's edit was meant to fix sauna#Serbo-Croatian. Your edit seems to have broken it again. —CodeCat 14:17, 16 November 2012 (UTC)

I'm not opposed to some change to {{PL:pedia}} — I think it's essential that we handle cases like sauna#Serbo-Croatian, and it seems that some change is needed for that — but that specific change was wrong, for the reason I gave in my edit summary. —Ruakh 15:01, 16 November 2012 (UTC)Reply

I know, I'm just letting you know why that change was made, so that you can think of a better fix. —CodeCat 15:18, 16 November 2012 (UTC)

Ah. Thank you. :-) —Ruakh 16:07, 16 November 2012 (UTC)Reply

Thread:User talk:Yair rand/WT:ACCEL

Your expertise has been called on... any help is appreciated. Thanks —Μετάknowledge (discuss/deeds) 04:19, 18 November 2012 (UTC)

That stuff is just so complicated. I've been meaning to fix the Hebrew for a while now, and it's . . . daunting. I may take a look, but don't hold your breath. :-/ —Ruakh 20:07, 20 November 2012 (UTC)Reply

Problem mostly solved, but I'm still relying on Yair to add the spans to the rest (yes, I am still inept). Maybe I'll give it a try, it's not that widely transcluded... —Μετάknowledge (discuss/deeds) 20:57, 20 November 2012 (UTC)

Hebrew direct object suffixes

Do you know where I can find a list of the old Hebrew direct object suffixes? I only know the first person ־נִי (-ni) and ־נוּ (-nu). --Wiki Tiki 89 14:40, 20 November 2012 (UTC)

I think they're essentially the same as the other pronominal suffixes. Unfortunately, it's basically impossible to search for examples, but here are a few that come to mind:

2ms — Template:Hebr — Isaiah 44:22
3ms — Template:Hebr — Exodus 20:11
3fs — Template:Hebr — Deuteronomy 24:4
3mp — Template:Hebr — Deuteronomy 6:7

but really you can hardly read a chapter of Tanakh without finding some examples. (I came across plenty more while looking up the above, though unfortunately they were all additional examples of 2ms/3ms/3mp.)

By the way, I imagine the reason we get a nún in forms like asáni, bar'khúni, etc., is that it immediately follows a vowel. When the pronominal suffixes attach to nouns or prepositions, they tend to merge with or destroy any preceding vowel, and some of them lose their own consonant; but when they attach to verbs, they tend to leave the final vowel intact, and therefore keep their own consonant. For a better-attested example . . . when it attaches to a noun or preposition, the third-person masculine singular pronominal suffix is usually -ó, as in sif'ró, or -(á)v, as in mitsvotáv or pív, and when it attaches to a verb it's usually -hu; so you might think it's two different endings. But then you get forms like קָצֵהוּ (katséhu), demonstrating that it's not as clear-cut as all that.

(Really, I think the best way to view them is as inflectional endings. In English, are -s and -es and -Ø and -en the same suffix?)

—Ruakh 20:05, 20 November 2012 (UTC)Reply

Re: -ni: I eventually thought of כמוני \ כָּמֹנִי (kamóni) and מִמֶּנִּי (miméni). —Ruakh 23:27, 23 November 2012 (UTC)

Actually, to answer you a bit more directly — see pages 1596–9 of the four-volume Even-Shoshan. He gives full conjugation tables, including pronominal suffixes, for one regular verb from each transitive binyán (specifically: פָּקַד (pakád), לִמֵּד (liméd), and הִפְקִיד (hifkíd)), and partial conjugation tables, with a few sample pronoun-including forms, for seven irregular verbs from each transitive binyán. (I seem to recall your mentioning that you had access to Even-Shoshan at a library or bookstore or something? If not, or if you prefer, I can try to scan those four pages and e-mail them to you, albeit not until next week. Next Monday or so, send me an e-mail if you want me to do that, and I can reply with an attempt at scans.) —Ruakh 02:48, 21 November 2012 (UTC)

I would really like to see that. At the bookstore I was in, they only had a single-volume Even-Shoshan. But I'm sure somewhere in Tel Aviv there is a library that has it. Would you happen to know any such library? --Wiki Tiki 89 08:22, 21 November 2012 (UTC)Reply

No idea, sorry. (I mean, I live in the U.S.; I don't really know anything about libraries in Tel Aviv. I would assume that one of Tel Aviv University's libraries would have it, but I don't know which one, and I suspect that those libraries are less open to non-students than their American counterparts would be.) According to WorldCat, the closest library to Tel Aviv that has the four-volume edition is the National Library of Israel, in Jerusalem; but I'm sure it's just that WorldCat is incomplete. —Ruakh 00:51, 22 November 2012 (UTC)Reply

It turns out the single-volume version also has these pages (and I actually ended up buying one). Thanks! --Wiki Tiki 89 20:32, 6 December 2012 (UTC)Reply

Template:Hebr —Ruakh 20:45, 6 December 2012 (UTC)Reply

Recent reversions

Hi, you recently reinstated a punctuation error in ablate, and removed information from .de, and in each case gave no reason. I've just reverted. If you still think there's a reason for the changes you made, please tell me what it is. — Smjg (talk) 00:08, 21 November 2012 (UTC)Reply

Your edit to [[.de]] was simply wrong. I'm not sure what more explanation I can give; it was wrong, I reverted it. The edit to [[ablate]], I admit, I was too quick to revert; the edit-summary of "tpyo" made it seem like a fatuous edit — and really, "punctuation error"?! — but the edit itself was O.K. —Ruakh 01:28, 21 November 2012 (UTC)Reply

If you can't explain why an edit was wrong, it implies that you're just gratuitously reverting edits for the sake of it. Either know why you're making a given edit or don't make the edit. And so you know for the future, "tpyo" is a deliberate tpyo for "typo", and is a fairly common summary used to indicate that the editor is rectifying one. — Smjg (talk) 08:02, 21 November 2012 (UTC)Reply

Re: first two sentences: I guess you now understand why your edit was wrong, so I'll ignore this.

Re: third sentence: I've gathered as much, but your edit wasn't rectifying a "typo". It was adding a period where one was optional.

—Ruakh 13:47, 21 November 2012 (UTC)Reply

No it wasn't. I removed an extra fullstop where there was already one as part of the template. — Smjg (talk) 19:29, 21 November 2012 (UTC)Reply

Gah, sorry, mea culpa. You're right; and that makes more sense. (So, yes, you're right, "punctuation error".) —Ruakh 19:59, 21 November 2012 (UTC)Reply

@Smjg: your edit removed the ". +" formatting from [[.de]], but was otherwise alright, AFAICT. .de does derive from the German nation's autonym. It seems to be Wiktionary's current practice not to indicate such things, but I don't see why (it's etymological information, after all). - -sche (discuss) 03:49, 21 November 2012 (UTC)Reply

OK, so I was confused by the ". +". It looks like a typo until you realise that it's splitting the headword into two portions, the "." and the alphabetic portion, especially given that the "." isn't linked to anything. Maybe there's a better way to notate it.... — Smjg (talk) 08:02, 21 November 2012 (UTC)Reply

I need to delete some redirects again

I remember you did this for me before. I am trying to delete all of Special:PrefixIndex/Proto-Germanic * and I tried to adapt your script, but I don't really understand what it does and it won't work. It's at User:CodeCat/common.js and I listed some of the entries at User:CodeCat/sandbox. Can you have a look? —CodeCat 22:25, 22 November 2012 (UTC)

One problem is that jQuery('div.allpagesredirect > a').each(/*...*/) finds all <a> elements whose direct parents are <div class="allpagesredirect"> elements, but you haven't included any <div class="allpagesredirect"> elements on User:CodeCat/sandbox, so it doesn't find anything, so it doesn't do anything. (On Special:PrefixIndex, such a search would find all links that point to redirects, but that's a specific feature of that page, not a general feature of MediaWiki.)

Another problem is that if(! /[/]User:CodeCat[/]sandbox[/][^?]+$/.test(this.href)) return; skips any links whose URLs don't contain the string /User:CodeCat/sandbox/, which isn't what you want.

—Ruakh 00:23, 23 November 2012 (UTC)Reply

Wiktionary:Votes/2012-10/Enabling Tabbed Languages#Decision

You've been mentioned as someone able to help clarify matters.—msh210℠ on a public computer 03:19, 26 November 2012 (UTC)Reply

Thanks.—msh210℠ (talk) 05:01, 26 November 2012 (UTC)Reply

Actually, I think you should close the vote. —Μετάknowledge (discuss/deeds) 05:03, 26 November 2012 (UTC)

Re: m:Bureacrat

By the way, I was and am totally uninterested in those petty battles; I was merely pointing out for your benefit that, procedurally speaking, you had still not addressed this problem (which is still there). --Nemo 21:40, 26 November 2012 (UTC)Reply

December 2012

Vote

How do I get the "Quality of sources" vote closed? It is now well past its close date. Spinning Spark 20:57, 2 December 2012 (UTC)Reply

I've given it a shot . . . we'll see if people agree with how I closed it. :-P —Ruakh 02:36, 3 December 2012 (UTC)Reply

Talk:דוסי

Are you able to help, please?—msh210℠ (talk) 07:05, 4 December 2012 (UTC)Reply

Nope, sorry. I know I've heard it before, but I can't clearly remember how it was pronounced. And it almost goes without saying that it's not in Even-Shoshan. —Ruakh 13:21, 4 December 2012 (UTC)Reply

Thanks anyway.—msh210℠ (talk) 01:23, 5 December 2012 (UTC)Reply

I know this sounds stupid, but want me to ask an Israeli? I mean, I know that WikiTiki and I come into contact with native Hebrew speakers on a daily basis, and I doubt we're the only ones around here, but if it can help... —Μετάknowledge (discuss/deeds) 03:55, 5 December 2012 (UTC)

That doesn't sound stupid at all. Over the years, I've reposed a fair number of Wiktionarian questions to my parents, to my older sisters, to friends who live in Israel, to the good folks at he.wikt, and so on. (Though unfortunately, I've sometimes made the mistake of asking multiple people. Turns out, people never agree on anything!)
In this case, however, it now occurs to me that it may not be necessary, because while most of the YouTube hits for Template:Hebr depict Haredim themselves (rather than people talking about them), this one has a news-show-host-person repeatedly pronounce the title of the book Template:Hebr (roughly “classification of Dosim”) in his wonderful precise newscaster-like voice, where it is clearly /ˈdo.sim/, with no attempt to mimic an Ashkenazi kamáts. (I mean, there's some mimicry inherent in representing it as /o/ rather than /a/, but he stays in the normal Modern Israeli Hebrew vowel repertoire.) Though he uses the singular /dos/ rather than /ˈdo.si/; and his interviewee, who mostly seems to avoid using the word in speech, does seem to use /dos/ at about 3:51. But if you want to solicit opinions from other Israelis, to see if there's some variation or whatnot, be my guest!
—Ruakh 04:26, 5 December 2012 (UTC)Reply

That's pretty good; I vote dósi now. By the Ashkenazi kamáts, you mean Yiddish [ɔ]? But I'll ask iff I remember. —Μετάknowledge^{discuss/deeds} 04:58, 5 December 2012 (UTC)Reply

Ashkenazi kamatz is unrounded I think: more like ʌ. (But I can't speak to what Ruakh meant.)—msh210℠ on a public computer 05:02, 5 December 2012 (UTC)Reply

Thanks for searching videos; and great idea.—msh210℠ on a public computer 05:02, 5 December 2012 (UTC)Reply

Everything in Yiddish converges on ʌ~‌ə in unstressed positions. I mean when it's stressed, like in this case. Or am I just ignorant of something established in Ashkenazi phonology? —Μετάknowledge^{discuss/deeds} 05:05, 5 December 2012 (UTC)Reply

There's just a very wide variation in pronunciation because Yiddish was really a macrolanguage, so it's hard to generalize. The standard Yiddish /ɔ/ is [ʊ] in Poylish. And English borrowings seem to usually have it as /ʊ/ or /ʌ/. Russian borrowings seem to always have /u/. Basically it's unclear to me whether [ʌ] is English influence or was a genuine Yiddish pronunciation. --Wiki Tiki 89 08:07, 5 December 2012 (UTC)Reply

Oh, okay, I guess I mean ɔ, then. (I know almost no Yiddish-specific phonology (or phonetics) beyond what I've gleaned from hearing it spoken.) (But I meant stressed, Metaknowledge.)—msh210℠ (talk) 14:52, 5 December 2012 (UTC)Reply

I don't know Hebrew, but can we put multiple translit variants if there are disagreements about the readings depending on the speaker (perhaps up to 4, max)? In my opinion, if the word doesn't appear in dictionaries with niqud or romanisation and is pronounced in different ways, then Wiktionary can list all. The situation with Arabic is even worse, and the fact that both Hebrew and Arabic don't normally write vocalisation creates variants. We can still make the users' life easier by suggesting the pronunciation. My two cents only; my concern is that we have many Hebrew translations with no transliteration, even though the contributors may have a rough idea how to pronounce those words. --Anatoli (обсудить/вклад) 01:09, 6 December 2012 (UTC)

Many, many of our Hebrew words are pronounced in more than one way depending on lect, but we have a standard way of transliterating. (Well, almost standard. Some things aren't pinned down yet.) The problem is that that standard transliteration depends on the pronunciation/vowelization of the term, which is clear for most older terms and even for most modern terms, but wasn't clear for this term. Variant pronunciations are included in the Pronunciation section if at all: we include only one transliteration. I think if there really were two ways to pronounce it, they would be considered different (invisible) vowelizations and thus we'd have separate definition lines for them.—msh210℠ (talk) 02:14, 6 December 2012 (UTC)

I see. Perhaps if there are multiple pronunciations, then vowelization would be cumbersome to use on the header or having to split. Arabic entries seldom use vowelization. I actually meant missing transliterations altogether (usually on translations, haven't seen on entries) but there are many in Category:Hebrew terms lacking transliteration. --Anatoli ^{(обсудить}/^вклад) 05:46, 6 December 2012 (UTC)Reply

No, no, I wasn't clear. The vowelization is the same: how those vowels (and consonants) are pronounced differs. Terms lacking transliteration is (sometimes neglect, of course, but otherwise) typically because our editors don't know what the vowelization is, not because there are two different vowelizations for the same word. (Sometimes there are, but that's relatively rare.)—msh210℠ (talk) 16:02, 6 December 2012 (UTC)Reply

Talkback

Talkback. --Dan Polansky (talk) 22:01, 5 December 2012 (UTC)Reply

TB2. --Dan Polansky (talk) 22:09, 5 December 2012 (UTC)Reply

Another user having trouble seeing quotations today

See Talk:cromulent. DCDuring TALK 14:20, 6 December 2012 (UTC)Reply

Hebrew stress marks

Theoretically, if font support for this weren't virtually non-existent, would you have anything against adding stress marks (U+05BD: HEBREW POINT METEG) to our Hebrew entries? --Wiki Tiki 89 22:16, 6 December 2012 (UTC)Reply

Only for mil'él, I assume? —Ruakh 23:02, 6 December 2012 (UTC)Reply

Well, anything other than milrá'. I don't know if you would consider דוֹקְטוֹרִים (dóktorim) to be mil'él exactly. --Wiki Tiki 89 00:30, 7 December 2012 (UTC)

Good point. Anyway — no, I'd have nothing against that. In fact, I'd consider it a helpful practice. (Incidentally, w:Modern Hebrew verb conjugation uses U+05AB HEBREW ACCENT OLE instead of a méteg. But I don't know why; despite what w:Ole (cantillation) says, I've only ever seen méteg used that way, except in that one Wikipedia article.) —Ruakh 01:05, 10 December 2012 (UTC)Reply

I assume you also mean except for full outright cantillation marks. Anyway the only fonts I've found that do a decent job are the ones in the "Guttman" series. They display the meteg in the right place even if the codepoints are backwards (which they will be because of the whole character equivalence thing that wikimedia does). The problem with them is that they are all either too fancy or too awkward-looking. Also, in most of them the meteg still slightly overlaps the segol. The best one I found is "Guttman Mantova", but no one is gonna have that installed. --Wiki Tiki 89 15:18, 10 December 2012 (UTC)Reply
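The "codepoints are backwards" point above can be illustrated with a minimal Python sketch. It assumes the reordering in question is ordinary Unicode canonical normalization (NFC), which sorts combining marks by their canonical combining class; whether that is exactly what the wiki software applies is an assumption here, not something established in this thread. The sample string (a dalet carrying a meteg and a segol) is made up for illustration.

```python
import unicodedata

DALET = "\u05d3"   # HEBREW LETTER DALET
SEGOL = "\u05b6"   # HEBREW POINT SEGOL
METEG = "\u05bd"   # HEBREW POINT METEG
OLE = "\u05ab"     # HEBREW ACCENT OLE


def describe(mark: str) -> tuple[str, int]:
    """Return the Unicode name and canonical combining class of a mark."""
    return unicodedata.name(mark), unicodedata.combining(mark)


# Illustrative input: the stress mark typed before the vowel point.
typed = DALET + METEG + SEGOL

# Normalization reorders marks on the same base letter by combining class,
# so the stored order need not match the typed order.
stored = unicodedata.normalize("NFC", typed)

for mark in (SEGOL, METEG, OLE):
    print(describe(mark))
print("typed: ", [f"U+{ord(c):04X}" for c in typed])
print("stored:", [f"U+{ord(c):04X}" for c in stored])
```

If that assumption holds, a font has to position the meteg correctly whichever order the points arrive in, which matches the behaviour described for the "Guttman" fonts.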

Re: first sentence: Sorry, I'm not sure what you mean. :-/ —Ruakh 16:54, 10 December 2012 (UTC)Reply

In text with cantillation marks, the cantillation marks are located on the stressed syllable. So every cantillation mark is also a stress mark. I guess the meteg (and supposedly the ole) was borrowed from its use as a cantillation mark to be used as a stress mark in texts without cantillation. --Wiki Tiki 89 17:08, 10 December 2012 (UTC)Reply

Re: "every cantillation mark is also a stress mark": O.K., I see what you mean. I more think of cantillated text as not needing stress marks because the ta'amím indicate stress anyway, rather than as having stress marks in the form of ta'amím, but I guess the two amount to roughly the same thing. (But if you look through a bit of real cantillated text with an eye toward this specifically, I think you'll soon see why I look at it the way I do; for example, Genesis 1:2 has Template:Hebr, where the mark appears on the stressed syllable but is also "copied", so to speak, to the last syllable. This is reasonable for a tá'am, but would be bizarre for a stress mark.) —Ruakh 18:42, 10 December 2012 (UTC)Reply

Oh, I see what you mean. It's the same question as whether the iPhone is a photo camera. I've always wondered though what cantillation marks mean at the end of a word (i.e. not on a syllable). Another good example is Numbers 1:2 Template:Hebr, where there is no other cantillation mark, so which syllable does it apply to? the last one? --Wiki Tiki 89 18:57, 10 December 2012 (UTC)Reply

Not every cantillation mark goes over/under the stressed syllable: some appear at the start or end of the word. As noted for Template:Hebr, the mark is sometimes duplicated over the stressed syllable — but sometimes it's not. (And many texts I've seen duplicate it for certain marks and not for others.) Just fyi.—msh210℠ (talk) 19:09, 10 December 2012 (UTC)Reply

What do they indicate when not on a syllable? --Wiki Tiki 89 19:35, 10 December 2012 (UTC)Reply

They indicate the cantillation (the way the word is chanted). The cantillation mark in your example (which is the same one as in my example) always appears at the end of the word, because it serves to link the word to what follows (tóhu to vavóhu, ét-rósh to kól-adát b'néi-yisra'él); it's slightly like the Unicode undertie symbol ‿, or like a much weaker version of the makáf. —Ruakh 19:42, 10 December 2012 (UTC)Reply

I have seen ole used as a stress marker. Also U+0592 _segol_. But I've seen both far less often than _meteg_.—msh210℠ (talk) 19:09, 10 December 2012 (UTC)Reply

סופגנייה

Can you check my translation of the citation at סופגנייה? Thanks in advance! --Wiki Tiki 89 17:58, 9 December 2012 (UTC)Reply

To me miznón doesn't mean "buffet"; when I hear miznón I picture something more like a concession stand or a miniature restaurant or something. The (an?) official English translation of that book uses "canteen", which I think is much closer to the mark. Also, I'd translate mét ál- as "crazy about", rather than as "to love […] to death". (But the official translation just has "love".) And I'm not sure I like the scare-quotes around "donuts". Is that an attempt to translate the effect of ka'éle? Because I think scare-quotes indicate irony in a way that the ka'éle does not; as I see it, the discourse purpose of ka'éle is to indicate that the speaker is not being very precise in his wording (in this case probably "American"), and invites his listener to make the appropriate leap. So in this case I think I'd translate ka'éle as something like "those" or "you know".
More generally . . . I usually like to use existing translations where available. If a book is a translation from English, I'll use the English source-text as a sort of back-translation. For the Bible, I usually use the King James Version, unless I really dislike it for some reason, in which case I'll use Artscroll. In this case, you might want to use the official English translation as a starting-point at least (and list it as a reference). It's not a close translation, so you may well want to make changes to it, but at least it's a "safe" version to start with.
By the way, I was going to object to "k'éle" (as opposed to "ka'éle"), but Google does find some instances of Template:Hebr (albeit seemingly fewer than of Template:Hebr). Do you know what the story is? Is there some difference? Is ka'éle a result of leveling from kazé?
Also by the way, unless I'm really missing something, you have the author's name wrong. You might want to double-check the other metadata just to be sure. And the whole sentence is part of a quoted utterance, so you might want to wrap it in “” with {{...}}.
—Ruakh 02:04, 10 December 2012 (UTC)Reply

Google doesn't feel like giving me access to the English version of the book (maybe you have to be in the US?). Does "snack bar" work for "מזנון"? I wanted to include the allusion to death in "מת על" (it's a book about a murder after all), but if you think that's wrong then "crazy about" will do. The quotes around "donuts" were an attempt to translate the foreignness of "דוֹנַטס" (does "כאלה" mean that those were the same donuts referred to by the first occurrence of "סופגנייה"?). --Wiki Tiki 89 07:30, 10 December 2012 (UTC)Reply

Re: Access to English version: It didn't give me access to the Hebrew version from your link, but to get it to work, I just had to change co.il to com. So, if the same principle works in reverse, you can try <http://books.google.co.il/books?id=QlSO7togTcsC&pg=PA176&dq=%22Hanukkah+doughnut%22>. · Re: "Snack bar": Sure, that works. · Re: Death: Sometimes a cigar is just a cigar. Mét ál is a very common expression meaning "crazy about". (But I don't think "love ___ to death" is wrong. If you get a kick out of the cleverness, then go for it.) · Re: Foreignness of "דוֹנַטס": Oh, I see. Yeah, that's tricky. Maybe something like " […] those American dónats […] "? Or just, not worry about it. Dónats isn't that foreign. · Re: "does 'כאלה' mean that those were the same donuts referred to by the first occurrence of 'סופגנייה'?": No, not at all. To clarify what I wrote above: when I described its discourse purpose, what I meant is that it's a discourse particle, and that is its purpose. —Ruakh 13:38, 10 December 2012 (UTC)Reply

Google automatically converts ".com" to the domain of whatever country you're in anyway (which is why I always accidentally post ".co.il" links). There are ways around it, but I can't access the text of the English version from either domain. As for cigars, it's usually the context that determines whether they can be more than just a cigar. In this case, not having read the book, all I know about the context surrounding this quote is that the book is about a murder. Since I can't say for sure in this case whether it is an allusion to anything, I'd be happy with either one (and "crazy about" does sound more natural than "love to death"). As for donuts, is there anything wrong with putting it in quotes? In my mind, it has the effect I desired, but it's hard to judge something that you wrote yourself. If in other people's minds the quotes don't make them think donuts are foreign, then they are useless and I'll remove them. Re: "כאלה", OK, just wanted to make sure. --Wiki Tiki 89 14:40, 10 December 2012 (UTC)

Re: English translation: In that case, here's that paragraph in the official translation:

“So,” Zadik said, addressing Michael, “I see you’ve become a permanent fixture in the News Department. You think that Israel Television is only the news? Come, let’s get out of here, nobody here has time for you now, they’re running full steam ahead. I’ll take you down to the canteen, that’s where everything important takes place anyway. Maybe they’ll even have a leftover Hanukkah doughnut for us. I love doughnuts. Not the American kind, but the Hanukkah kind, like my grandmother used to make.”

Re: cigars: Batya Gur is a serious writer; that's not to say that there's no humor in her books, but they won't randomly overuse death-related expressions.

Re: quotes: To me they come off like scare quotes: "American 'donuts'", as if to imply that they're not even real donuts, just what passes for donuts in America. But maybe that's just me.

—Ruakh 16:53, 10 December 2012 (UTC)Reply

Thanks for the translation. Too bad book translations don't strive to be literal the way Bible translations do, but it still helps me see the point better.

Re: cigars: I never said anything about humor. It's more like subtle irony. If I understood it right, there is another reference to death in the same paragraph in the Hebrew: "עכשיו זה שעות מתות בשבילך".

Re quotes: Well doesn't that just serve to reinforce the point the speaker is trying to make? I'll get rid of them if you really want though since American is already enough.

--Wiki Tiki 89 17:31, 10 December 2012 (UTC)Reply

Re: cigars: Too subtle for me, but O.K. :-)

Re: quotes: But the speaker specifically reserves dónats for the American kind, using sufganiyót for the Hanukkah kind. So the quotes, if interpreted as scare-quotes, would imply essentially the opposite terminological viewpoint from his own.

—Ruakh 18:45, 10 December 2012 (UTC)Reply

I can see how they can be unnecessary, but I don't see how they can imply the opposite. But anyway, what do you think of the translation as it is now? --Wiki Tiki 89 19:00, 10 December 2012 (UTC)Reply

Re: how they can imply the opposite: Can you imagine an American saying, "I don't even see how you can call British cops 'bobbies'. Real bobbies would have guns!" It just doesn't make sense; it's like accepting the British term as equivalent to the American, and then therefore rejecting it as inapplicable to its British referent. · But anyway, yes, I think the translation is good. —Ruakh 19:25, 10 December 2012 (UTC)Reply

Quotes

I'm a very new/naive contributor to the Wiktionary. My contributions are mainly quotes and subedits. As a contributor of quotes, I stopped putting them in when they seemed to disappear from the display though they appeared in the source code. Very mysterious.

Today I decided to check and discovered that there is now a little button to bring up the quotes on demand. This seems like a very good idea to me, but ...

When I'm putting in a quote the preview doesn't show the quote and I have to save the edit before I can check the quote (by 'quote' I mean what the tag calls 'quotation'). My browser is Firefox, by the way.

When I put a quote in alongside examples (as I just did for 'imagination'), it needs to precede the examples; otherwise it appears awkwardly placed below them.

If you use 'quote' instead of 'quotation' in the tag then you could enlarge it; at the moment it's uncomfortably small. Also the tag (button?) would be much more useful/informative if it were followed by the range of years of the quotes (or the single year).

The new system of quotes is particularly useful as it encourages quotes to be attached to the relative meanings. I would suggest that the separate Quotations section be abandoned and existing items in the sections be moved to the appropriate meanings.

Along the same lines the Citations sections could be abandoned/transferred.

Enough of quotes ...

I have taken to subediting when I notice that a meaning doesn't start with a capital letter and end with a full stop. I have done this because of my subediting experience at a newspaper more than 60 years ago. Are there Wikipedia standards for such matters? If so, where do I find them?

Will I be emailed a response to this post? If not, where would I look for it? (An email to [e-mail address redacted] would be greatly appreciated even if it only answered this question.)

— This unsigned comment was added by Hlmswn (talk • contribs) at 10:12, 11 December 2012 (UTC).Reply

I'm not sure that I can adequately address all of this, but:

Re: the button to show quotations: That's actually always been there. If you weren't seeing it, it was probably due to a bug. (Are you using Firefox 3.0? If so, then that's why: until last week, we were using a function that wasn't added to Firefox until 3.5.)
Re: remaining bugs (e.g., not seeing quotations on preview): Are you willing to help us track down and debug these issues? By the way — whenever a page has collapsed quotations, the sidebar at left should have a "Visibility" section, with a link that says "Show quotations" (if you have quotations set to be hidden) or "Hide quotations" (if you have them set to be shown). Clicking that link will actually modify your preference for the future (though it only works within the same browser, and only for 30 days). This might help, indirectly, with your preview problem.
Re: quotation alongside examples: I know what you mean, but even so, please put the quotation(s) below the example sentence(s).
Re: size and appearance of the 'quotations' button: I don't think I have an opinion on this. But I'm fairly confident that the small size was due to a specific esthetic or usability decision, and not tied to the length of the word "quotations". (The goal being to make it clearly distinct from the definition.) But you might want to start a discussion at Wiktionary:Beer parlour.
Re: getting rid of the separate ====Quotations==== section: Yes, absolutely. We've been working on that for a long time now. :-)
Re: getting rid of Citations: pages: That's not gonna happen, unfortunately. A lot of editors really like those pages, and annoyingly, you'll sometimes even see people remove all the quotations from the entry!
Re: capital letter and full stop: We don't have a standard on this. If you see an inconsistently-formatted entry, you can fix it, but if you see an entry that's consistently formatted in the way that you don't like, you just have to grin and bear it. :-/
Re: Wikipedia standards: This is Wiktionary, not Wikipedia. :-) We have Wiktionary:Entry layout explained.

—Ruakh 16:25, 12 December 2012 (UTC)Reply

Taking over from Tbot

Now that it seems like the whole {{t}} family drama is more resolved than not, I was wondering if you would consider taking over another of Tbot's functions that has lain dormant all these years, namely the creation of FL entries from translation sections. One of the problems was that Tbot created FL sections in tons of languages that nobody has ever gone through and which we don't have active editors in, like Swahili. Instead, I think this time around Rukhabot should only create entries if editors working in that language explicitly ask for it. For example, Anatoli has suggested that he's willing to check Russian entries, and I would certainly check Latin entries. Languages that wouldn't be requested (most likely) include languages like Oriya (i.e. those having no active editors) and Yiddish (i.e. those having a generally low quality of translations, which will need to be fixed first). Are you technically capable of this (I'm not making a judgment here, I honestly don't know what it will take), and are you willing to try this? Many thanks —Μετάknowledge^{discuss/deeds} 03:02, 12 December 2012 (UTC)Reply

Yes, please (if you can and would)! I'm sure a few people will be requesting the same for their favourite language(s). If you remember Tbot was checking for existence of the entry in the FL wiki, the gloss, part of speech, transliteration, gender were coming from the English Wiki, pronunciation (IPA and audio), images were borrowed. Doesn't have to be the same but that was the idea. A remaining unchanged Tbot entry example: абажур#Ukrainian. --Anatoli ^{(обсудить}/^вклад) 03:11, 12 December 2012 (UTC)Reply

One problem is that many translations are currently singly-linked SOP, which would lead to many unnecessary SOP entries. --Wiki Tiki 89 07:47, 12 December 2012 (UTC)Reply

Except that it wouldn't happen. (Did you read what Anatoli wrote above?) If it's truly SOP, the FL wikt won't have it. I suppose we could override the FL wikt restriction for a language like Tok Pisin, where I'd be willing to go through and delete the SOP ones (converting the {{t}}s to {{t-SOP}}s back at the original entry along the way). Otherwise, that wouldn't be a problem. —Μετάknowledge^{discuss/deeds} 07:52, 12 December 2012 (UTC)Reply

Oh. I guess I was just confused by the "FL" abbreviation. --Wiki Tiki 89 08:06, 12 December 2012 (UTC)Reply

As far as I know, the historical Tbot still created entries even when the FL wikt had no entry, though it noted whether the FL wikt had an entry or not. That doesn't mean Ruakh's bot shouldn't behave differently, but not all FL wikts are as complete as our trans tables(!). - -sche (discuss) 08:40, 12 December 2012 (UTC)Reply

I don't think it created an entry unless the FL wikt (not just any FL wikt but the one in the language in question) had the entry.—msh210℠ (talk) 16:00, 12 December 2012 (UTC)Reply

Shouldn't we also add that the FL wikt in the language in question should have an entry in that language? --Wiki Tiki 89 18:59, 12 December 2012 (UTC)Reply

That is easier said than done, for two reasons. Firstly, it would require the bot to actually examine the FL-wikt entry, which is not something bots usually do; and secondly, it would require the bot to understand the FL-wikt entry layout, even to the point of knowing how the FL-wikt identifies entries in its own language (e.g. ==English== for en.wikt, == {{langue|fr}} == for fr.wikt, ==.*== for he.wikt). It's definitely possible, and actually it wouldn't shock me if the original Tbot did it (since ISTR that it took images from the FL-wikt entry), but I think it's significantly more difficult than a bot that doesn't do those things. —Ruakh 22:40, 12 December 2012 (UTC)Reply
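To make the second difficulty concrete, here is a minimal Python sketch of the per-wiki knowledge such a check would need. The three patterns are taken directly from the examples in the comment above; the table name `OWN_LANGUAGE_HEADING` and the function `has_own_language_section` are hypothetical, and real FL-wikt layouts are likely messier than these regexes allow for.

```python
import re

# Hypothetical table: how each wiki marks a section for its *own* language.
# Patterns follow the three examples given in the comment above.
OWN_LANGUAGE_HEADING = {
    "en": re.compile(r"^==\s*English\s*==\s*$", re.MULTILINE),
    "fr": re.compile(r"^==\s*\{\{langue\|fr\}\}\s*==\s*$", re.MULTILINE),
    # he.wikt: any level-2 heading counts (but not level 3 or deeper).
    "he": re.compile(r"^==[^=\n](?:.*[^=\n])?==\s*$", re.MULTILINE),
}


def has_own_language_section(wiki: str, wikitext: str) -> bool:
    """Guess whether an FL-wikt page has a section in that wiki's language."""
    pattern = OWN_LANGUAGE_HEADING.get(wiki)
    if pattern is None:
        # Each wiki has to be studied and added by hand before it can be used.
        raise KeyError(f"no heading pattern known for {wiki!r}")
    return pattern.search(wikitext) is not None


print(has_own_language_section("fr", "== {{langue|fr}} ==\n=== {{S|nom|fr}} ===\n"))
print(has_own_language_section("en", "==French==\n===Noun===\n"))
```

The `KeyError` branch reflects the point that every supported wiki needs its own entry in the table, which fits a bot that is enabled one language at a time.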

Well if the bot is enabled language by language based on request, it wouldn't be too hard to go and quickly figure out the structure on that language's wiki one language at a time. And it would save quite a bit of erroneous entries from being created (such as when a translation is in the wrong language, especially a very closely related language). --Wiki Tiki 89 23:11, 12 December 2012 (UTC)Reply

I think it would suffice if the bot were to check for existence of AN entry in the appropriate FL wiki and leave the rest to the editors, monitoring the processes. Deleting bad entries (wrong script, wrong language or SoP) doesn't take long, they could also be moved (renamed) if a non-lemma or SoP entry were created. Moreover, even if there was no check at all but basic structured entries were dumped in some kind of appendix (not in the main space), even without any checks, directly from translations, then good ones could gradually be moved into the main space after checking and proper fixes, additions. To clarify - translations with a basic check in the FL wiki - into the main space, no check - appendix or separate container. --Anatoli (обсудить/вклад) 23:16, 12 December 2012 (UTC)

But it would also take work away from improving the entries that are correctly created. The Tbot entries are often very ambiguous and unclear even when they are correct. --Wiki Tiki 89 23:31, 12 December 2012 (UTC)Reply

Everyone works on what they like and wish to do. Entries need improvement and this will always happen but we are not talking about improvements here but mass creation of new entries in foreign languages. If you followed the discussion, it's only for languages where people are available for checking generated entries after an explicit request, removing ambiguities and making sure they are correct. No one will demand from you to stop working on improving English or Hebrew entries and check bot-generated entries in Russian, Hebrew or any other language, if you don't want to do it. --Anatoli ^{(обсудить}/^вклад) 23:46, 12 December 2012 (UTC)Reply

You misunderstood me. I am only referring to entries that would be created by this bot (and the editors who volunteer to work on them). It would be a better use of time improving the entries that are correctly created by the bot, than deleting entries that are incorrectly created that could have not been created in the first place if the bot does this additional check. The entries that the bot creates correctly will still take a lot of improvement. --Wiki Tiki 89 00:09, 13 December 2012 (UTC)Reply

Yes, correctly created entries would be a priority but I wouldn't linger to make all entries perfect from the start (OK, I won't pursue creating of entries without checking other wikis.) The above entry абажур#Ukrainian is about good enough, even without the audio file. Transliteration and definition (i.e. (an unambiguous) translation into English) are the most important things in an FL entry. Declension/conjugation, etymology, example sentences, related terms, synonyms, etc. all can be added later (in this order, IMHO). They are bells and whistles, not the essential parts. (Ruakh, sorry for hijacking your discussion page, should we move to BP?). --Anatoli ^{(обсудить}/^вклад) 01:17, 13 December 2012 (UTC)Reply

I wasn't referring to "Declension/conjugation, etymology, example sentences, related terms, synonyms, etc.", only to the definitions themselves. The ones I've seen from Tbot are usually very ambiguous and unclear (since they are only the English headword and the translation table heading). --Wiki Tiki 89 01:29, 13 December 2012 (UTC)Reply

Is that really common? I don't know many examples of bad entries generated from good translations, even after a check with the other wiki. Could you give me a couple of examples? --Anatoli (обсудить/вклад) 01:55, 13 December 2012 (UTC)

@-sche: Not quite. What happens is, the very earliest Tbot entries didn't check for FL-wikt entries, but Robert soon changed it to add that check. So you've probably seen some entries that mention their FL-wikt counterparts (because they were created after the change) and some that don't (because they were created before the change, and their counterparts likely don't even exist), but at no time did Tbot have an intermediate state like you describe. —Ruakh 16:54, 12 December 2012 (UTC)Reply

I'll join the clamor, but I disagree with "I think this time around Rukhabot should only create entries if editors working in that language explicitly ask for it": I think the big notice indicating the entry may be wrong (coupled with, of course, the other safeguards, like the one I mention in my comment above of even timestamp) suffices.—msh210℠ (talk) 16:00, 12 December 2012 (UTC)

This is not so hard, but (for a few reasons) it's not something I'm likely to do in the near future. If someone else is interested in doing it, I'd be happy to help. —Ruakh 16:54, 12 December 2012 (UTC)Reply

All right, no worries. Thanks. --Anatoli ^{(обсудить}/^вклад) 22:13, 12 December 2012 (UTC)Reply

I wonder if maybe a bot isn't the best approach? Conrad.Irwin (talk • contribs)'s language-indices include not only terms that already have an en.wikt entry, but also terms that appear in en.wikt translation tables; and it includes links to the en.wikt entries containing those translation tables; so it's easy for a human to browse for translations-tables that have redlinks in a given language. So if we take that starting-point as a "given", I think all we'd need is some JavaScript that, when it sees a redlinked translation to a language you're interested in, converts that redlink into a "greenlink" that prefills the linked creation-page with information from the current page. This would both prevent any creation of bad entries (because you'd see them before creating them, and could simply fix any bad translations), and allow you to make any improvements you want even before you click "Save page". —Ruakh 01:00, 13 December 2012 (UTC)Reply
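The "prefill" half of this idea is language-agnostic, so here is a minimal sketch of it in Python rather than JavaScript. Everything in it is an assumption for illustration: the `Translation` fields, the `prefill_entry` name, and the bare-bones entry layout (language heading, part-of-speech heading, bold headword, one definition line). It is not an agreed entry format.

```python
from dataclasses import dataclass


@dataclass
class Translation:
    """What can be read off a translation table (hypothetical field names)."""
    language_name: str      # e.g. "Ukrainian"
    term: str               # the redlinked translation
    part_of_speech: str     # heading of the English section, e.g. "Noun"
    english_headword: str   # the page the translation table sits on
    gloss: str              # the translation table's heading


def prefill_entry(t: Translation) -> str:
    """Build draft wikitext for the editor to review before saving."""
    return "\n".join([
        f"=={t.language_name}==",
        "",
        f"==={t.part_of_speech}===",
        f"'''{t.term}'''",
        "",
        f"# [[{t.english_headword}]] ({t.gloss})",
        "",
    ])


draft = prefill_entry(
    Translation("Ukrainian", "абажур", "Noun", "lampshade", "cover over a lamp")
)
print(draft)
```

The draft is only a starting point: the editor still sees it in the edit box and can fix a bad translation or sharpen the definition before saving.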

Now, there's an idea! I would support that. --Wiki Tiki 89 01:04, 13 December 2012 (UTC)Reply

Well... except that most of the languages I'm interested in don't have indices, and those that do have them are updated rather infrequently. But otherwise I support. (Naturally, that also requires that somebody write the JS.) —Μετάknowledge^{discuss/deeds} 01:10, 13 December 2012 (UTC)Reply

@Metaknowledge. See my post below. Languages you're interested in can be monitored if the indices are generated from English translations. You'd need to speak to User:Matthias_Buchmeier about your request. He kindly added a few languages on my request. --Anatoli (обсудить/вклад) 01:27, 13 December 2012 (UTC)

Conrad.Irwin's indices are heavily out of date and it doesn't seem he is going to refresh them soon. I like User:Matthias_Buchmeier's English-FL indices, which can work as offline dictionaries if downloaded. They are generated from translations only (not from entries) and one can also see red and blue links. Also, he made some but only few languages available in reverse order - FL-English. All these dictionaries are refreshed regularly. --Anatoli ^{(обсудить}/^вклад) 01:25, 13 December 2012 (UTC)Reply

Indices from translation tables are indeed easy to create for any language on request. The situation is very different for FL entries, which are a real pain, because each language uses its own formatting/templating and many languages are quite messy when it comes to headline- and form-of-templates. That means that those indices will either contain some degree of inflected forms or be missing some lemma entries. Matthias Buchmeier (talk) 10:30, 17 December 2012 (UTC)Reply

@Μετάknowledge: Re: first sentence: Hopefully Anatoli's response sufficiently addresses that . . . if not, then, I'm also quite willing to generate lists for this purpose for any language you might be interested in. That should be a pretty straightforward XML-dump–analysis task. Re: parenthetical sentence: I don't think that should be a problem. I can undertake that, if at least one person is really serious about running it, and as long as y'all are willing to potentially wait a few months. (I am going to be crazy busy for the next two months or so.) But if someone else wants to beat me to it, that would be even better. :-) —Ruakh 01:31, 13 December 2012 (UTC)Reply
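As a rough idea of what such a list-generating task might look like, here is a minimal Python sketch. It assumes translations appear in wikitext as `{{t|code|term|...}}`, `{{t+|...}}` or `{{t-|...}}`, and it takes an in-memory mapping of page titles to wikitext in place of a real XML dump; the function name `missing_translations` and the sample pages are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: translation templates look like {{t|ru|слово|...}},
# {{t+|...}} or {{t-|...}}. Other templates (e.g. {{t-SOP}}) are not matched.
TRANSLATION = re.compile(r"\{\{t[+-]?\|([^|{}]+)\|([^|{}]+)")


def missing_translations(pages: dict[str, str], lang: str) -> dict[str, list[str]]:
    """Map each redlinked translation in `lang` to the pages that list it."""
    existing = set(pages)
    missing = defaultdict(set)
    for title, text in pages.items():
        for code, term in TRANSLATION.findall(text):
            term = term.strip()
            if code.strip() == lang and term not in existing:
                missing[term].add(title)
    return {term: sorted(titles) for term, titles in missing.items()}


# Tiny stand-in for a dump: title -> wikitext.
sample = {
    "lampshade": "* Russian: {{t+|ru|абажур|m}}\n* Ukrainian: {{t|uk|абажур|m}}",
    "water": "* Russian: {{t|ru|вода|f}}",
    "вода": "==Russian==\n===Noun===\n",
}
print(missing_translations(sample, "ru"))
```

Note that this only checks whether a page with that title exists at all, not whether it has a section in the right language, so it would miss terms whose page exists only for some other language.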

I'm serious about Russian and Stephen agreed to help. Hopefully Vahagn, Wanjuscha, CopperCattle, Wikitiki89, One_half_3544 will join the efforts for Russian. I'm sure there will be serious commitments for other languages as well. Generating entries without commitments from editors won't be harmful either, see msh210's message above. It would help if entries had a better warning than Tbot's, something like WARNING, this entry was automatically generated from "BLAH", it needs attention of ..., blah-blah. --Anatoli ^{(обсудить}/^вклад) 01:55, 13 December 2012 (UTC)Reply

Re: "Generating entries without commitments from editors won't be harmful either": But remember, what I'm proposing is that I wouldn't generate entries; I'd merely write JavaScript that would (hopefully) make it easier for other people to generate entries. So I'd want a commitment that people would actually do so. (But not really a "commitment"; that's too strong a term: what if people try out the JavaScript and find that it doesn't really make things easier for them?) —Ruakh 02:32, 13 December 2012 (UTC)Reply

I see, but perhaps this JavaScript won't be extremely helpful, since everyone can save a template for creating entries from about (language name) pages, similar to what Tooironic wrote about creating a basic Mandarin entry? What Metaknowledge and I were suggesting is an automatic mass-creation of entries, similar to Category:Tbot_entries. Don't you like this idea? --Anatoli ^{(обсудить}/^вклад) 02:42, 13 December 2012 (UTC)Reply

I think you missed a few comments. If you search this page for I wonder if maybe a bot isn't the best approach?, you'll see where I changed the subject a bit. :-P It's not that I dislike the idea of a bot, but I personally don't like to write bots that will make bad edits. I prefer to write "conservative" bot-tasks where, say, >99% of edits are clearly neutral-to-positive; and it's clear that with Tbot-entries, that's not an attainable goal. And Tbot-entries automatically require editor intervention anyway, so it seems like that intervention might as well be "front-loaded", happening before the entry is even saved. Also — I'm not likely to run such a bot any time soon. I don't have a server or anything, so every bot I run, I run it on my laptop. That severely curbs my desire to run open-ended bot-tasks. With the JavaScript approach, I might be able to write the JavaScript within the next two weeks, and will almost certainly be able to do so within the next three months; a bot, I probably wouldn't run for another year or two. Plus, with the JavaScript it would be on-wiki, and other editors could (at least theoretically) make improvements to it, whereas a bot is pretty much locked down. In the case of generating entries in a language I don't speak, I'd really prefer that. —Ruakh 03:01, 13 December 2012 (UTC)Reply

Yes, I missed some comments. This idea is not bad either, even though it's quite different from the original request. Any simplification in the creation of entries is welcome. I've got some technical questions. What tools do you need to download, what language is used to write bots, how do you view the database? I always approached Wiktionary from the linguistic point of view, never tried to program in it, maybe I could start. --Anatoli ^{(обсудить}/^вклад) 03:11, 13 December 2012 (UTC)Reply

Re: "what language is used to write bots?": Almost any language. Bots interact with the site over HTTP (see http://en.wiktionary.org/w/api.php), so you can use any language that you feel comfortable with, as long as it has an HTTP library. Personally, I use Perl; but most other Wiktionarian bot-runners seem to use PyWikipediaBot, which is a framework for writing bots in Python. Re: "how do you view the database?": You can't view it directly, but the Wikimedia Foundation regularly generates "dumps" of various aspects of the database — some as XML-files, some as SQL scripts — which you can download from http://dumps.wikimedia.org/backup-index.html. —Ruakh 03:22, 13 December 2012 (UTC)Reply
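To make the "any language with an HTTP library" point concrete, here is a minimal Python sketch that only builds an API request URL and sends nothing. The endpoint constant and the helper name build_api_url are illustrative assumptions; action=query, meta=userinfo, uiprop=options and format=json are ordinary MediaWiki API parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint for English Wiktionary's API (illustrative constant).
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def build_api_url(params, endpoint=API_ENDPOINT):
    """Return a GET URL for the API; hypothetical helper, not part of any bot framework."""
    query = dict(params)
    # Ask for machine-readable JSON unless the caller chose another format.
    query.setdefault("format", "json")
    # Sort the parameters so the resulting URL is deterministic.
    return endpoint + "?" + urlencode(sorted(query.items()))


url = build_api_url({"action": "query", "meta": "userinfo", "uiprop": "options"})
print(url)
```

A real bot would then fetch this URL with its HTTP library of choice and parse the JSON response; frameworks such as PyWikipediaBot wrap these steps.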

Template:Hebr. --Anatoli ^{(обсудить}/^вклад) 03:38, 13 December 2012 (UTC)Reply

Binyan of ניגש

How should I describe the binyan of ניגש \ נִגַּשׁ (nigásh) in the headword line? --Wiki Tiki 89 08:27, 15 December 2012 (UTC)Reply

I think I'd write {{head|he|verb|head=ניגש|ניגש \ נִגַּשׁ|tr=nigásh}}, manually categorize it as both pa'ál and nif'ál, and leave the details for usage notes; but if you prefer to try to use {{he-verb}}, I'd say nif'ál (plus usage notes). —Ruakh 17:33, 15 December 2012 (UTC)Reply

Is the current usage note true — that Template:Hebr "uses the pa'al construction for the infinitive, imperative, and future tenses, but the nif'al construction for the past and present"? Or is it, rather, the case that Template:Hebr uses the nif'al for past and present and has no other (extant) forms, and Template:Hebr uses the pa'al for the infinitive, imperative, and future, and has no other (extant) forms? The latter seems more reasonable to my (linguistically untrained) mind.—msh210℠ (talk) 05:49, 16 December 2012 (UTC)Reply

I would say those are two equally correct ways of analyzing it, but your way would require two separate definitions and two separate conjugation tables. --Wiki Tiki 89 07:49, 16 December 2012 (UTC)Reply

As Wikitiki89 says, they both seem like valid analyses. My (also linguistically untrained) mind is not sure how to set about distinguishing them experimentally. But if the past tense of go is went and the finite forms of be are am, is, are, was, and were, then I, at least, am fine with saying that the infinitive of nigásh is lagéshet. (FWIW, my copy of Tarmon and Uval treats it as a single irregular verb, albeit of binyan pa'ál.) —Ruakh 11:13, 16 December 2012 (UTC)Reply

(A reply to both of you.) Seems reasonable. Thank you.—msh210℠ (talk) 15:21, 17 December 2012 (UTC)Reply

WT:RFD#side wall

Something you'd commented on.—msh210℠ (talk) 15:21, 17 December 2012 (UTC)Reply

Thanks. —Ruakh 19:01, 20 December 2012 (UTC)Reply

Is this possible?

I rashly advocated something at BP that I don't know is possible.

Would it be possible using CSS or CSS + JS to suppress the display of a definition line that contained a context tag such as {{obsolete}}?

Would it be possible to allow users (without registration) to switch the display using options such as those used for the optional display of expanded translations, quotations, and items enclosed in {{rel-top}} and its relatives?

I suppose that if these are possible, it should be relatively easy to allow registered users to specify what they want displayed or suppressed by default. DCDuring TALK 02:20, 22 December 2012 (UTC)Reply

Anything is possible, if we're willing to accept the tradeoffs. In this case, tradeoffs include:

The definitions displayed when you view the page will not match the definitions displayed when you edit the page.
{{obsolete}} will become overloaded, having both its current meaning and also the meaning "hidden by default", and this will naturally affect editors' decisions about when to use it and when not to use it. (You can expect frequent subconscious thought processes, and occasional conscious ones, along the lines of "This sense is important, it should be shown, I'm sure someone somewhere would understand it, let's just mark it {{archaic}}" and "This sense is really pretty rare nowadays, there's no reason to show it, let's mark it {{obsolete}}.")
Orphaned translations-tables and so on, that correspond to a hidden definition. (This is not likely to be a huge problem — if a sense is genuinely obsolete, then it shouldn't have translations — but when it does happen, it'll sure confuse people!)

—Ruakh 17:00, 22 December 2012 (UTC)Reply

Thanks. Sorry for the tardy response to your thoughts.

I agree that there are issues. I advanced the idea because of the overhasty, incomplete discussion of reordering senses. I question the wisdom of straw polls because they lead to a hardening of vaguely felt preferences into positions.

The non-matching problem could be partly addressed if a control for the display of obsolete senses appeared on the entry display in addition to what was on the left-hand side (where Yair (?) has the display controls for rel and trans tables).

A "bias" in the use of the tag follows from any function that it has. We could attempt to sharpen our definitions of obsolete and archaic by putting time frames and absolute or relative frequency parameters on it, presumably specific to each language, but certainly specific to English.

There are possible solutions to the translation-table problem, including inserting "obsolete" into the trans-table headers for the senses marked obsolete. I have long noted the enthusiasm of translators in providing translations for rare, archaic, and obsolete terms whose principal attribute seems to be etymological relationship to a term in the FLs. We do need more trans-sees for obsolete, archaic and rare senses.

We'll see what happens with the BP discussion. DCDuring TALK 16:41, 28 December 2012 (UTC)Reply

A couple of deletion requests

There are a couple of redirects in WT:SD (שתים, אוץ) that I've refrained from deleting or untagging because I'm not sure how we handle variations like this. I certainly wouldn't advocate making lemmas out of them, but they seem to be within the realm of what people might run into in texts or other references. Do we do alt-spelling entries for variants that only differ in whether the yods and waws are written defectively or not? Chuck Entz (talk) 02:17, 24 December 2012 (UTC)Reply

Well as redirects, they are useless and it won't hurt to delete them and that's why I tagged them. Ideally I guess they should be form-of entries. --Wiki Tiki 89 02:24, 24 December 2012 (UTC)Reply

Re: שתים: That's an alternative spelling; in a case like this, you can use {{he-Defective spelling of|שתיים|.=.}}, or just the normal alternative-spelling template, as you prefer. (I've just done the former.)
Re: אוץ: That is not an alternative spelling, but rather, an artifact of a lexicographical convention that we do not use here. However, it's also the masculine singular imperative form, as well as the "gerund" or "bare infinitive" or "infinitive construct" form (which doesn't really exist as its own form, but we pretend it does); so, we can use {{he-Infinitive of|bare|אץ|wv=אָץ|tr=áts|.=.}} and {{he-Imperative of|אץ|wv=אָץ|tr=áts|g=m|n=s|.=.}} (as I've just done).
—Ruakh 03:07, 24 December 2012 (UTC)Reply

פרעה

I've just neatened this, but would appreciate your taking a look at three things: 1, it was listed as a ===Noun=== and so categorized, but I'm not sure it shouldn't be a proper noun. 2, I've added a usage note that might need improving. And, 3, I've set cons=-, but I'm not sure that's correct. Thanks much.—msh210℠ (talk) 05:21, 24 December 2012 (UTC)Reply

I agree that, in Biblical usage, it's a proper noun with no construct form. (This is particularly clear, IMHO, in Exodus chapter 1, which covers the transition from the Pharaoh of Joseph to the Pharaoh of the Exodus: the chapter switches back and forth between mélekh and par'ó, using the former when it needs a common noun or a construct form and the latter when it does not.) The usage note seems fine, except that it seems like another way of saying "this is a proper noun", so is redundant to just changing the POS. :-P

Modern Hebrew usage is a bit trickier. Using the Hebrew Wikipedia as a guide . . . some articles, such as w:he:פרעה, consistently use the term in the same sort of way as the Bible. (And w:he:פרעה is very mention-heavy, preferring to talk about the term par'ó rather than about Pharaohs, which makes sense if you view it as a proper noun; imagine having an article w:Carl about people named Carl.) But there are other articles that frequently or even consistently use par'ó as a common noun; hence phrases like these:

Template:Hebr — […] hayá par'ó-mitsráyim hashishí l'shoshélet ha-21. — […] was the sixth pharaoh of Egypt of the 21st dynasty. [from w:he:סיאמון]

Template:Hebr — tákhat hapar'ó harishón — under the first pharaoh [from w:he:מצרים העתיקה]

Template:Hebr — shiltonám shél hapar'oním histayém rishmít b'31 Lifn.Has. — The rule of the pharaohs officially ended in 31 B.C.E. [ibid.]

(The latter article is interesting in that it uses it consistently as a common noun, or at least in ways construable as common-noun uses, except in one place: the phrase ארמון פרעה (armón-par'ó). I wonder if this is related to the fact that — according to w:he:פרעה — par'ó actually originally referred to the house of Pharaoh, rather than the Pharaoh himself.) The Hebrew Wikipedia also uses par'ó sometimes before a name, as in פרעה מרנפתח.

—Ruakh 17:22, 24 December 2012 (UTC)Reply

Thanks a lot; and Wikitiki has edited the entry (and [[Pharaoh]]) in accordance with the above (and thanks to him, too).—msh210℠ (talk) 23:13, 24 December 2012 (UTC)Reply

Does the common noun have a plural? --Wiki Tiki 89 23:18, 24 December 2012 (UTC)Reply

Yes: פרעונים (par'oním). (I'm not sure if that's the only plural, but it's the only plural I'm familiar with, and it's the only plural I noticed while looking through the aforementioned Hebrew Wikipedia articles, so I feel reasonably safe in saying that it's at least the primary plural.) —Ruakh 23:24, 24 December 2012 (UTC)Reply

Tbot-like JavaScript

Hi,

Thank you for your efforts. Sorry, I was away with family and friends today. My browser is Firefox 17.0.1 on Windows 7 and Windows NT. I have added the import line onto User:Atitarev/monobook.js and hard refreshed the page. I don't see any difference though. What should I do next? --Anatoli ^{(обсудить}/^вклад) 10:51, 26 December 2012 (UTC)Reply

Just look for translations-tables with properly-formatted Russian redlinks, such as (as of this writing) the "something particularly good or pleasing" table at [[beauty]] or the "word used to indicate disagreement or dissent in reply to a negative statement" table at [[yes]]. —Ruakh 13:16, 26 December 2012 (UTC)Reply

I did check for red-linked translations but I didn't see any change. Red-links open as blank create pages as before. --Anatoli ^{(обсудить}/^вклад) 20:12, 26 December 2012 (UTC)Reply

Hmm. I guess we'll have to debug . . . could you change your monobook.js from using User:Ruakh/Tbot.js to using User:Ruakh/Test.js, and visit [[User:Ruakh/Test]], and tell me what you see in the box labeled "Output from the Tbot script"? —Ruakh 00:06, 27 December 2012 (UTC)Reply

Done and after the hard refresh I only see "Output from the Tbot script:" in a box with nothing after it. --Anatoli ^{(обсудить}/^вклад) 00:12, 27 December 2012 (UTC)Reply

I have just tested with MS IE 7.0 (after logging on and a hard refresh) with the same result. Any further tests I will do after a hard refresh. --Anatoli ^{(обсудить}/^вклад) 01:00, 27 December 2012 (UTC)Reply

Strange; that makes it seem like the JavaScript isn't even being run. Can you e-mail me the text you see at http://en.wiktionary.org/w/api.php?format=jsonfm&action=query&meta=userinfo&uiprop=options? —Ruakh 01:10, 27 December 2012 (UTC)Reply

It looks much better now, thanks! Are you open for other languages in the near future, e.g. cmn and ja? I don't think the template can be made perfect but a few differences to Russian. --Anatoli ^{(обсудить}/^вклад) 05:24, 27 December 2012 (UTC)Reply

Yes, I'm definitely open to adding special support for other languages, improving the special support for Russian, etc. —Ruakh 14:11, 27 December 2012 (UTC)Reply

January 2013

ISBN

What does the ISBN code mean in this edit? Pass a Method (talk) 20:48, 7 January 2013 (UTC)Reply

See w:ISBN. —Μετάknowledge^{discuss/deeds} 21:11, 7 January 2013 (UTC)Reply

yákhas ót l'rá'ash

Are you sure about the transliteration? Laráash (Template:Hebr) sounds better to my ears than l'ráash, but I may be overly influenced by my greater familiarity with Biblical Hebrew than with modern. (Cf. Gesenius, 102h.)—msh210℠ (talk) 06:57, 9 January 2013 (UTC)Reply

I was confident enough to put it in the entry, but no, I'm not sure enough to call myself "sure about" it. —Ruakh 06:03, 10 January 2013 (UTC)Reply

Good enough for me. I didn't know whether you had put that transliteration in advisedly.—msh210℠ (talk) 06:47, 10 January 2013 (UTC)Reply

Template:list:Hebrew script letters/he

Hi, Ruakh. I'm trying to convert templates to the new format used in User:CodeCat/list helper which is less resource-intensive. But with this template I am having some problems, because my browser is acting strange with the Hebrew characters. When I press enter it changes the order of the characters, and I don't know the Hebrew alphabet so I am afraid to mess up and put the letters in the wrong order by accident. You probably have more experience dealing with such things so could you try? Template:list:Latin script letters/en has an example you can work from, but Hebrew doesn't use letter casing so there would probably be only one letter per line. —CodeCat 02:20, 11 January 2013 (UTC)Reply

I gave it a whirl. I'm not really sure how it should look; I kept the letters in the left-to-right order they already had (which is kind of backward-looking, since Hebrew is read right-to-left, but not a huge deal), except that since the Latin uppercase letters were unseparated from their lowercase counterparts, I did the same thing for Hebrew medial and final forms, so for those I had to put them in right-to-left order, because Template:Hebr looks like an April Fool's prank. Feel free to reverse the overall order, or make any other tweaks, or anything. —Ruakh 02:40, 11 January 2013 (UTC)Reply

Thank you! I will trust your judgement when it comes to Hebrew because I know nothing about it so I can't judge how it should look or what looks good or bad. If you think the order should be reversed that is ok, but keep in mind that English users will expect the order of list elements to be ordered left to right, even if the individual items are to be read right to left. Template:list:days of the week/yi was made that way too. —CodeCat 02:46, 11 January 2013 (UTC)Reply

Personally, I think it looks weird and would prefer right-to-left. days of the week/yi is acceptable to me, but that's because English-speaking users might be going to those pages for their semantic value, in which case left-to-right order is most logical. This is going to be chiefly used for the letters' value in writing, not semantic meanings, and users who are looking at the pages are more likely to know that Hebrew is right-to-left. It would then reflect the universal presentation of the script letters, rather than our idiosyncratic mixture of directions, as if someone was taking xkcd seriously. —Μετάknowledge^{discuss/deeds} 03:06, 11 January 2013 (UTC)Reply

Note that the Arabic alphabet lists (Template:list:Arabic script letters/ar, Template:list:Arabic script letters/fa, etc.) are currently right-to-left. --Wiki Tiki 89 19:23, 11 January 2013 (UTC)Reply

FWIW, I would support putting the Hebrew letters of the alphabet in, erm, alphabetical order, right-to-left. I find it rather surreal that they're listed left-to-right. - -sche (discuss) 04:41, 13 January 2013 (UTC)Reply

Done —Μετάknowledge^{discuss/deeds} 05:35, 13 January 2013 (UTC)Reply

I've reverted and re-done it a different way, I hope you don't mind. (Putting the letters in reverse order, while forcing that order to be presented LTR, seems rather hackish to me. Logically, it makes more sense to put the letters in the correct order, presenting it RTL. Please revert if there was a reason for the other approach.) —Ruakh 06:23, 13 January 2013 (UTC)Reply

I think HTML already treats Hebrew text as RTL by default, so the RTL markers probably aren't necessary. —CodeCat 13:05, 13 January 2013 (UTC)Reply

You are correct, but I included them because I think that maybe our templates should include LTR markers around Hebrew-script text, so I'm operating under the premise that maybe we'll do that someday. If so, then in the rare cases that we really want the context containing Hebrew-script text to be RTL, we would have to explicitly include RTL markers; and in the meantime, they're harmless; so it seemed like a sort of future-proofing. —Ruakh 16:03, 13 January 2013 (UTC)Reply

Hackish? Yeah, kinda. I honestly just did it because it was easier that way (somewhat embarrassingly, while singing the call-and-response אלפבית song...) —Μετάknowledge^{discuss/deeds} 16:50, 13 January 2013 (UTC)Reply

接管

I don't speak IPA, but something tells me 接管 is not pronounced /ʨk/. ---> Tooironic (talk) 04:23, 13 January 2013 (UTC)Reply

I agree. —Ruakh 04:28, 13 January 2013 (UTC)Reply

Done —Μετάknowledge^{discuss/deeds} 05:26, 13 January 2013 (UTC)Reply

Well, you fixed that one entry — and thank you :-) — but that doesn't really solve the overall problem . . . —Ruakh 06:17, 13 January 2013 (UTC)Reply

Tooironic only complained about one entry. As you know better than I, you can certainly scan a database dump for Mandarin IPA significantly shorter than the pīnyīn values, which should bring up all examples of this bug, and I'm sure that between me and the Mandarin regulars we can fix them all by hand. —Μετάknowledge^{discuss/deeds} 16:54, 13 January 2013 (UTC)Reply
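As a rough illustration of the dump scan suggested here, the following Python sketch flags an IPA string that is much shorter than its pinyin. The function name looks_truncated and the ratio threshold are assumptions made for illustration only; the discussion does not say what "significantly shorter" means, and extracting the two strings from a dump is left out.

```python
import unicodedata


def looks_truncated(pinyin, ipa, ratio=0.5):
    """Return True if the IPA is much shorter than the pinyin.

    `ratio` is an arbitrary illustrative threshold, not a value from the discussion.
    """
    # Normalise so that a letter plus tone diacritic counts as one character.
    py = unicodedata.normalize("NFC", pinyin).replace(" ", "")
    # Drop the slashes or brackets that delimit the transcription.
    tr = unicodedata.normalize("NFC", ipa).strip("/[]").replace(" ", "")
    return len(tr) < ratio * len(py)


print(looks_truncated("jiēguǎn", "/ʨk/"))
```

A length comparison like this can only produce candidates for human review; it says nothing about whether a full-length transcription is actually correct.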

Wait, really? I guess it never occurred to me that y'all would be willing to do that. The list of all 183 problematic entries is at User:Metaknowledge/py-to-ipa-problems. Thank you! :-D —Ruakh 17:07, 13 January 2013 (UTC)Reply

Сумпор! Wait, wrong language. Uh, um, 了不起！(or something like that, not sure if I got it right...) While we're at it, can you explain why this bug even happened? —Μετάknowledge^{discuss/deeds} 17:11, 13 January 2013 (UTC)Reply

I really don't know. Either the template was broken to begin with and no one even noticed, or something broke in the template during the process of making it substitutable. (The latter case has two subcases: the breakage could have been substitution-specific — like, say, maybe an #if: was made safesubstitutable, but its condition contained a nonsafesubstituted template, in which case the order of evaluation would have been such that the #if: would misbehave — or the breakage could have been general, like, some part of the template accidentally got deleted during that process. In either of those subcases, it's most likely, but not certainly, my fault.) In either case, the problem wasn't noticed until well after it had been substituted everywhere. I tried to go back later and figure out what had happened, but I couldn't: the template was just too messy and indecipherable (and it didn't seem like a high priority, since figuring out the problem would not really help in reversing the problem). —Ruakh 17:18, 13 January 2013 (UTC)Reply

Yes, I just realised what a massive problem this bot has caused now. Who knows how many IPA transcriptions it has stuffed up. 厭煩, 冷靜, 堅固, 搭配, 割傷, 荒誕, 學術, 學問, 記住, 評價, the list goes on and on. Is anyone going to address this? Or is a mass "reversal" on this bot's changes necessary? ---> Tooironic (talk) 03:17, 14 January 2013 (UTC)Reply

Re: "Who knows how many IPA transcriptions it has stuffed up": I don't think anyone knows for sure, but it's presumably either zero (if they were already messed up) or 183 (if they weren't). Re: "Is anyone going to address this?": Well, mostly I'd been quietly ignoring the issue out of a sense of frustration over the whole thing. (The substitution problems were just the last straw in the whole mess of dealing with this template.) But Metaknowledge (talk • contribs) has now offered his and your assistance in fixing all of them. :-P If that doesn't work out . . . I'm not capable of fixing these broken pronunciations, but I can certainly go through and remove them, if people want. —Ruakh 03:33, 14 January 2013 (UTC)Reply

If you are not capable of fixing the mess afterwards, then don't mess with it in the first place. Substituting the unchanged template with related string templates missing will of course generate erroneous pronunciations. To fix all this, restore the related string templates (even if temporarily), replace all {{IPA|...|lang=cmn}} with {{subst:py-to-ipa|... (parameters from {{cmn-...|pin=***}}) }}. 129.78.32.22 03:43, 14 January 2013 (UTC)Reply

The related string templates weren't missing at the time. But, uh, nice try. Better luck next time? —Ruakh 03:49, 14 January 2013 (UTC)Reply

I don't know how many related templates you misdeleted whilst doing these substitutions and I don't care. Your mess anyway. 129.78.32.22 03:56, 14 January 2013 (UTC)Reply

I didn't delete any of them. So: zero. Oh, but that's right, you just said that you don't care. So you're just trolling. Which is convenient for me, because I'm sick of replying to you, and now I don't have to: trolling is grounds for blocking, so the next time you comment here, I can just revert & block. Problem solved. :-) —Ruakh 04:02, 14 January 2013 (UTC)Reply

@Tooironic: I really intend to do it, just not today. Wanna help? —Μετάknowledge^{discuss/deeds} 04:04, 14 January 2013 (UTC)Reply

Sorry to say this but before this is done en masse, the IPA on 接管 is incorrect, cf. the zh.wikt page. Tone sandhi has not been taken into account because whoever was generating this pronunciation was relying on User:Wjcd/py-ipa, a tool that only generates IPA for monosyllabic pinyin. 129.78.32.22 04:47, 14 January 2013 (UTC)Reply

Er, I can't understand the numerical notation they use... the only tone sandhi rules I know are bordering 3rds make the first one(s) 2nd(s), 3rds followed by other tones generally don't come back up, and 一 and 不 are exceptions. What am I missing? —Μετάknowledge^{discuss/deeds} 04:58, 14 January 2013 (UTC)Reply

The actual picture if it is to be represented by IPA is more complex than those two rules. There are basically six rules, and these rules make the third tone non-existent in compounds. The four tones in Beijing Mandarin are value-wise 55, 35, 214, 51 (Superscript numbers 1-5 are equivalent to IPA tone letters ˩˨˧˦˥). When they combine,
1) 55/35/51 + 214 = 211 + 214;
2) 214 + 214 = 35 + 214;
3) 214 + ø = 21(4);
4) non-ø + 211/214 + non-ø = non-ø + 1 + non-ø;
5) 51 + 51 = 53 + 51;
6) tone sandhi of 一 and 不.
Apply these rules repeatedly, until a stable tonal profile is obtained, where no sandhi rule from above can be applied any more. This gives the final IPA pronunciation. 129.78.32.22 05:13, 14 January 2013 (UTC)Reply
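The "apply repeatedly until stable" procedure can be sketched in Python. This is a deliberately partial illustration: it encodes only rules 2 and 5 from the list above, and the names PAIR_RULES and apply_sandhi are hypothetical. The remaining rules (half third tone, neutral tone, 一 and 不) would need extra context and are not modelled.

```python
# Pairwise rules keyed by (first tone, second tone) -> new value of the first tone.
# Only rules 2 and 5 from the list above are encoded; the rest are omitted.
PAIR_RULES = {
    (214, 214): 35,  # rule 2: 214 + 214 = 35 + 214
    (51, 51): 53,    # rule 5: 51 + 51 = 53 + 51
}


def apply_sandhi(tones):
    """Rewrite adjacent tone pairs until no encoded rule applies any more."""
    result = list(tones)
    changed = True
    while changed:
        changed = False
        for i in range(len(result) - 1):
            new_first = PAIR_RULES.get((result[i], result[i + 1]))
            if new_first is not None:
                result[i] = new_first
                changed = True
    return result


print(apply_sandhi([214, 214]))
print(apply_sandhi([51, 51]))
```

The loop terminates because every rewrite replaces a 214 or 51 with a value (35 or 53) that never appears as the first member of an encoded pair.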

Wow, thank you! I feel enlightened (and somewhat miseducated). I'll memorize this method straightaway (taking notes for the time being). Anything else that I ought to know but probably don't from using py-to-ipa and reading online guides written by non-linguists? —Μετάknowledge^{discuss/deeds} 05:54, 14 January 2013 (UTC)Reply

Re above, I would help but I know nothing about IPA. Nor am I willing to learn. So many other Mandarin-related tasks to be done. Good luck! ---> Tooironic (talk) 03:50, 16 January 2013 (UTC)Reply
Oh, one small thing. I noticed that you've tagged these pronunciations as "Beijing". Are they really? I assume they're just Standard Mandarin.... ---> Tooironic (talk) 03:52, 16 January 2013 (UTC)Reply
Did I do that? I intended Putonghua. —Μετάknowledge^{discuss/deeds} 03:54, 16 January 2013 (UTC)Reply

Actually, that's how it was created. I don't know enough to challenge that. —Μετάknowledge^{discuss/deeds} 03:56, 16 January 2013 (UTC)Reply

Tbot support

Hi,

There are some questions/requests in User talk:Ruakh/Tbot.js#Testing_and_making_it_work_with_other_languages you may have missed. --Anatoli ^{(обсудить}/^вклад) 03:29, 14 January 2013 (UTC)Reply

Thanks. I saw the comments, but I didn't know quite how to reply . . . I'll try. —Ruakh 03:51, 14 January 2013 (UTC)Reply

Aramaic prefixes and suffixes

I noticed an issue with our Aramaic prefix and suffix entries. It seems that whoever added them put the hyphen on the wrong side (I presume to circumvent bidirectionality issues), in addition to the fact that for the Hebrew script it should be a makaf. I corrected several of the Hebrew script ones and even noticed that you yourself fixed -ל to ל־ a while back. The problem is I don't know how many more of them there are and I haven't even gotten to fixing the Syriac script ones. I was wondering if you could generate a list of entries whose titles match "-.*|.*-" and whose body contains an Aramaic L2. If there are too many of them, maybe you could even use your bot to fix them?

I would do it myself but I can't get Python to stop throwing Unicode errors as it reads the dump (which I have found to be caused by a broken XML reader library). Also, would you happen to know if the Syriac script has a version of the hyphen analogous to the Hebrew makaf?

Thanks. --Wiki Tiki 89 03:26, 22 January 2013 (UTC)Reply

As of the January 10th dump, we had four: -ב,‎ -ד,‎ -ו,‎ and -דיל. So presumably we now have none, though -ו is broken now. (I didn't filter by script or anything, so apparently no Syriac-script entries had this problem.) As for whether Syriac has something makaf-like — I really have no idea. —Ruakh 05:46, 22 January 2013 (UTC)Reply
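The kind of dump scan requested above might look like the following Python sketch. It assumes the (title, wikitext) pairs have already been extracted from the XML dump, that language sections are marked by a line of the form ==Aramaic==, and that the helper name misplaced_hyphen_entries is free to invent; none of this is the script that was actually run.

```python
import re

# The title pattern quoted in the request: a leading or trailing hyphen-minus.
TITLE_RE = re.compile(r"-.*|.*-")
# Assumed form of an L2 (language) header line in the wikitext.
ARAMAIC_L2 = re.compile(r"^==\s*Aramaic\s*==\s*$", re.MULTILINE)


def misplaced_hyphen_entries(pages, l2_pattern=ARAMAIC_L2):
    """Return titles that match the hyphen pattern and contain the given L2 header."""
    return [
        title
        for title, text in pages
        if TITLE_RE.fullmatch(title) and l2_pattern.search(text)
    ]


# Tiny made-up sample; Hebrew letters are written as escapes to avoid bidi confusion.
sample = [
    ("-\u05d1", "==Aramaic==\n===Preposition===\n"),  # hyphen-minus + bet
    ("\u05dc\u05be", "==Hebrew==\n===Preposition===\n"),  # lamed + makaf
    ("-ing", "==English==\n===Suffix===\n"),
    ("\u05be\u05d5", "==Aramaic==\n===Suffix===\n"),  # makaf + vav, not a hyphen
]
print(misplaced_hyphen_entries(sample))
```

Passing a different l2_pattern, for example one matching any header that contains "Syriac", would cover headers other than "Aramaic".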

Our resident Syriac experts are 334a (talk • contribs) (for the Classical language) and Rafy (talk • contribs) (for the modern language). —Μετάknowledge^{discuss/deeds} 05:48, 22 January 2013 (UTC)Reply

Some or all of the entries in question were added by 334a, so I don't know how much he knows about Unicode. Then the problem might be a little more complicated because I encountered Syriac script links with this problem, I guess they must have been redlinks (I only looked at them within the wiki code so I wouldn't have noticed). And I assume it would be harder to find the problem in links than in entries.

Also -ו isn't really broken, I redirected it to ־ו (the Hebrew 3rd person singular possessive suffix). When I realized the latter didn't exist, I was too lazy to create it. --Wiki Tiki 89 06:01, 22 January 2013 (UTC)Reply

I'm not sure what "broken" means to you, but to me, a redirect to a redlink is a broken redirect: "broken"! (I mean, I often create a redirect a few minutes before creating the entry it redirects to, but Template:Hebr.) —Ruakh 06:25, 22 January 2013 (UTC)Reply

Fine, I created ־ו. --Wiki Tiki 89 06:46, 22 January 2013 (UTC)Reply

Thank you. :-) —Ruakh 06:49, 22 January 2013 (UTC)Reply

It turns out the Syriac-script entries with this problem use the L2 "Classical Syriac" rather than "Aramaic". Can you do another dump analysis, maybe for any L2 header containing the string "Syriac"? --Wiki Tiki 89 18:48, 22 January 2013 (UTC)Reply

As of last night's database dump, there were only two: -ܘ and -ܒ. —Ruakh 03:34, 23 January 2013 (UTC)Reply

I guess it just shows how poor our coverage of Aramaic/Syriac is. --Wiki Tiki 89 05:15, 23 January 2013 (UTC)Reply

How to use Category:Hebrew personal pronouns?

I noticed that you reverted (all?) my edits where I added Hebrew personal pronouns to Category:Hebrew personal pronouns. If you look at that category, it is now clearly missing many of the Hebrew personal pronouns. Do you disagree that those are Hebrew personal pronouns, or do you find that category useless, or something else? --Thv (talk) 06:46, 25 January 2013 (UTC)Reply

I think the category is fine; the problem isn't that you added entries to it, but that you removed entries from Category:Hebrew pronouns. The headword line should still be {{head|he|pronoun|…}}; Category:Hebrew personal pronouns should be added explicitly at the end of the language section. (Sorry; I had intended to either fix these myself afterward, or leave you a message about it, but then it slipped my mind. Thanks for asking about it.) —Ruakh 15:10, 25 January 2013 (UTC)Reply

Page links dump and invalid page IDs

There are around 5000 links in the pagelinks dump that have nonexistent page IDs (the ID is not present in the main dump). Do you know what the deal is with these? DTLHS (talk) 03:24, 26 January 2013 (UTC)Reply

If a page-ID is in the range [1, 3846626], but is not present in the latest pages-articles.xml, then I assume that usually (always?) means it was assigned to a now-deleted page. If such a page-ID occurs in pagelinks.pl_from, then I imagine it's simply that the link-record failed to be deleted when the page was. (I don't know anything about the history. Honestly, I wouldn't have been shocked if MediaWiki had simply never deleted such records, but if you only found around 5000 such links, then that's apparently not the case.) Do you notice any obvious pattern in these IDs, like, do they mostly clump in narrow ranges, or anything like that? —Ruakh 03:52, 26 January 2013 (UTC)Reply
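
A minimal Python sketch of the check discussed above, assuming the page IDs from pages-articles.xml and the pl_from values from the pagelinks dump have already been extracted into plain collections of integers. The parsing is not shown, and the function names and the max_gap parameter are illustrative only, not part of any MediaWiki tool. It lists the orphaned IDs and groups them into runs, to see whether they clump in narrow ranges:

```python
from typing import Iterable, List, Set, Tuple


def orphaned_ids(pl_from: Iterable[int], page_ids: Set[int]) -> List[int]:
    """Return the sorted, de-duplicated link-source IDs that match no existing page."""
    return sorted(set(pl_from) - page_ids)


def clump(ids: List[int], max_gap: int = 100) -> List[Tuple[int, int, int]]:
    """Group sorted IDs into (first, last, count) runs whose neighbours differ by <= max_gap."""
    runs: List[Tuple[int, int, int]] = []
    for page_id in ids:
        if runs and page_id - runs[-1][1] <= max_gap:
            first, _, count = runs[-1]
            runs[-1] = (first, page_id, count + 1)
        else:
            runs.append((page_id, page_id, 1))
    return runs


# Toy data standing in for the two dumps (hypothetical values).
existing = {1, 2, 3, 500, 501}
link_sources = [1, 2, 40, 41, 45, 500, 9000]

missing = orphaned_ids(link_sources, existing)
print(missing)         # IDs that occur in pl_from but not in the page dump
print(clump(missing))  # runs of nearby IDs
```

A few long runs would suggest batches of deleted pages; many isolated IDs would suggest something less systematic.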

On certain Latvian grammatical words

Since you asked me a question about the Latvian red links in grammatical templates... The phrases vīriešu dzimte, sieviešu dzimte correspond to masculine and feminine respectively. Now dzimte is one of those 19th-century "neologisms", derived from dzimt "to be born", and means simply "(grammatical) gender". So in principle they should be linked as two words -- "vīriešu" and "dzimte", "masculine" and "gender", i.e., SoP, right? Or should I assume that "vīriešu dzimte" is a phrase simply because it is the "official" name of the masculine gender, it is official grammatical terminology? --Pereru (talk) 15:11, 26 January 2013 (UTC)Reply

Wiktionarians have argued on this point since time immemorial, and the results of those arguments have been very inconsistent. I think the safest route is probably to link to each word separately, rather than to link to a two-word phrase that may or may not be idiomatic. —Ruakh 16:55, 26 January 2013 (UTC)Reply

February 2013

Linking and tabbed languages

Hello Ruakh --

I saw your recent link fix at ニゴロブナ, thanks for that. Your edit comment brought back to my mind an idea I've been toying with for a bit, that of creating a template for listing JA terms, similar to {{l|ja}} but specifying the lang for the two transliterations (kana and romaji).

For instance, the JA editors I'm aware of (myself, Haplology, I think Anatoli and James Jiao) have used wikicode formatting in lists like that seen at 御#Derived_terms:

* {{l|ja|御子|tr=[[みこ]], ''[[miko]]''}}: a shrine maiden

Your mention that this might screw up tabbed languages makes me wonder if this format is sufficient, but I don't use tabbed languages and don't really know. My idea was to leverage {{l|ja}} into something that might look like {{l-ja|御子|みこ|miko}} and be equivalent to {{l|ja|御子}} ({{l|ja|みこ}}, ''{{l|ja|sc=Latn|miko}}''), with all links properly pointing to the correct language.

Your thoughts? -- Eiríkr Útlendi │ Tala við mig 06:24, 6 February 2013 (UTC)Reply

PS -- no worries about the other day; we all have days like that. o_O
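
The expansion Eiríkr describes can be sketched as a small string-building function. This is a Python illustration only, not the actual code of any existing template; the function name is hypothetical, and the output simply follows the equivalence given in the post above:

```python
def ja_link(kanji: str, kana: str, romaji: str) -> str:
    """Build the wikitext that the proposed three-argument template would stand for."""
    return (
        "{{l|ja|" + kanji + "}} "
        "({{l|ja|" + kana + "}}, "
        "''{{l|ja|sc=Latn|" + romaji + "}}'')"
    )


print(ja_link("御子", "みこ", "miko"))
```

Each of the three forms gets its own {{l}} call, so every link carries an explicit language (and, for the romaji, an explicit script).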

I created {{ja-l}} a few months back for this purpose, doing exactly what you describe; but it doesn't seem to have had any uptake. Since then, quite a few more such language-specific {{l}} templates have been created, as subpages of {{l}}, so I guess we should create {{l/ja}}. (It can just redirect to {{ja-l}}. Or we can do it the other way, moving {{ja-l}} to {{l/ja}}.) —Ruakh 15:27, 6 February 2013 (UTC)Reply

That template looks more complicated than just a simple linking template, though. Do you think it is a viable candidate for {{l/ja}}? —CodeCat 16:56, 6 February 2013 (UTC)Reply

Yes. At least, I think that any {{l/ja}} would need to include all of this complexity. No? —Ruakh 02:38, 7 February 2013 (UTC)Reply

I'd prefer it if the basic linking templates were kept as simple as possible. That doesn't mean a template that combines them can't be created, but I would imagine there are situations where the extra code isn't necessary. It's always possible to make a bigger template out of smaller ones, but the reverse is not true, so the common denominator should probably be kept low. —CodeCat 02:55, 7 February 2013 (UTC)Reply

But I think that this is as simple as it gets for Japanese. Or at least, it's the simplest thing that could be called "l". (I mean, I'm not dogmatic about it. If you have an idea for how you think {{l/ja}} should look, I would certainly keep an open mind. But that's how it seems to me right now.) —Ruakh 05:49, 7 February 2013 (UTC)Reply

I was thinking that it would only link to one word, so it would resemble {{l/sh/Cyrl}} but with another script. We often link to Serbo-Croatian words in pairs (both scripts, since both get an entry) but the linking templates don't support this directly since it is often desired not to link to a pair of words. I figured Japanese could work the same way, with the three (kanji, hiragana, romaji) or four (katakana too) representations of the word linked individually. {{ja-l}} already makes multiple links, so it can act as a convenient replacement for multiple instances of {{l/ja}} together. —CodeCat 14:21, 7 February 2013 (UTC)Reply

I see what you're saying. I guess the thing is, that may be what {{l}} should be, but it's not what it is: it may have been intended, in part, as a single-link template, but what it is is an approximation to {{onym}}. (An inferior one, granted, but a very widely used one.) So I think {{l/ja}} needs to be what {{ja-onym}} would be, if {{ja-onym}} existed. —Ruakh 03:11, 8 February 2013 (UTC)Reply

Hmm, and thinking it through further, this becomes a more complicated problem space -- some JA terms spelled in kanji have multiple readings, such as 魚釣り, which could be read as either uotsuri or sakanatsuri. This is common enough that the template should ideally be able to handle probably up to three pairs of kana/romaji readings. The logic used at {{compound}} might be a useful reference. -- Eiríkr Útlendi │ Tala við mig 17:17, 6 February 2013 (UTC)Reply

I think that with those, the best approach is to use the template multiple times. I mean, they're separate words in that case, no more related than English live /lɪv/ and live /laɪv/. —Ruakh 02:38, 7 February 2013 (UTC)Reply

FWIW, I've never used {{ja-l}} simply because I didn't know it existed. (^^); -- Eiríkr Útlendi │ Tala við mig 17:19, 6 February 2013 (UTC)Reply

I'm sorry... can we try to sort it out?

I'm sorry for my outburst in the Grease Pit. Sometimes I get a bit overly emotionally attached to certain things because I have a strong idea of what is right or wrong. I was also a bit frustrated at the prospect of having yet another public discussion be derailed by the issue, when it's clearly just between us. I'd like to understand what the problem is in a calmer setting.

First let me explain what I think is your stance, so that we can get any misconceptions out of the way? As far as I'm aware, you prefer certain code templates to have prefixes so that there is a technical barrier for their usage, and only templates that have been explicitly coded around that barrier will accept such codes. I also think that you want those prefixes to be used so that Wiktionary users realise that they are not "normal" codes, and will hopefully act accordingly when using them. I remember you saying something like "different things should look different". Is that correct?

Now, with Lua, we don't actually have a need for the prefixes themselves, because there are other ways to implement "prefixes". For example, we could put them in separate modules, so that anyone who is using those codes will be explicitly aware that they are to be imported and used from a distinct location. So even if a technical barrier is desirable, there may be better and more "Lua-like" ways to do it than with prefixes embedded in the code string. My objection to these technical barriers in general is that while I do agree that different things should look different, regular codes don't actually work that differently from reconstructed codes. In fact, as far as I'm aware, they only differ when linking is concerned, since reconstructed languages are placed in a different location and use a different naming scheme. And implicitly that means that all entries need a sort key, but sort keys may be desirable for non-Appendix languages too (which Lua will be a tremendous help for, by the way!). All other uses are the same: expanding their names (like in the multitude of category boilerplate templates), categorising their entries (Category:Proto-Germanic language is no different from Category:English language), and so on.

So my objection is that while it may be good to make the difference explicit, it also makes it more difficult to handle all cases when there are no differences at all. A template like {{poscatboiler}} should not need to have special support for reconstructed codes, because it treats them exactly the same as regular codes. Similarly for {{head}}, which should just work fine for appendix languages, except for the links. When every template needs special support, it implicitly disables that template for reconstructed languages until someone takes the time to fix it. Which can be frustrating at times. —CodeCat 14:48, 15 February 2013 (UTC)Reply

I accept your apology. I'm sorry, too.

Re: "I was also a bit frustrated at the prospect of having yet another public discussion be derailed by the issue, when it's clearly just between us": I don't follow. It seems to me that the only way that it can be "just between us" is if absolutely no one else cares one way or the other; and if that were the case, then public discussion would be both unnecessary and impossible.

Re: "As far as I'm aware, you prefer certain code templates to have prefixes so that there is a technical barrier for their usage, and only templates that have been explicitly coded around that barrier will accept such codes": This is not true. I actually really hate the templates that try to "code around" this — {{langprefix}} and so on.

Re: "I also think that you want those prefixes to be used so that Wiktionary users realise that they are not 'normal' codes, and will hopefully act accordingly when using them. I remember you saying something like 'different things should look different'": Yes.

Re: first half of your third paragraph, from "Now, with Lua, we" to "prefixes embedded in the code string": This section presupposes that I want a technical barrier. Now that I've clarified that I don't want that, I think this section is obsolete?

Re: second half of your third paragraph, or more specifically, re: "regular codes don't actually work that differently from reconstructed codes": I think they do. I think that the difference between "languages we include" and "languages we don't include" (such as reconstructed languages) is an absolutely fundamental distinction. (By comparison: I'm sure you wouldn't want {{term|parabola|lang=la}}, {{term|parabola|lang=fr}}, {{term|parabola|lang=es}} to generate “parabola, parole, palabra” on the grounds that the only difference between the three words is the location of the entry for them.)

To put this another way — if you really think that reconstructed languages should be treated just like regular languages, then you should propose that they be put in mainspace. Or if you really think that they belong in appendices, then we shouldn't be treating them like regular languages in all these other respects. But I think that this half-measure — putting them in appendices, but then pretending that these appendices are just regular entries — is really the worst of both worlds.

Re: your last paragraph: I mostly agree, except that I reach the opposite conclusion: this is exactly why these prefixes need to appear directly in the wikitext, rather than forcing templates like {{poscatboiler}} to add special logic to hack around the missing prefixes.

—Ruakh 21:30, 16 February 2013 (UTC)Reply

Ok, I think I understand, but I don't quite agree. What exactly is there to gain from making editors so aware of the distinction? I mean, is there ever a problem when they're not aware, given that it's pretty clear through other means that we treat certain languages differently? Do we really need to make it more obvious that Proto-Germanic belongs in an appendix? And why do you think that adding "proto:" to a language code will somehow enable editors to make that connection? To illustrate this: several editors have, in the past, added {{head}} to reconstructed entries. That is the kind of situation where I think that {{head}} should just work. People expect it to work, and don't understand why it doesn't. I highly doubt that putting a prefix on the code will make even the slightest difference; after all, the editors in question were already very much aware that the language was different, because it was in the appendix namespace and had a different name. Yet that fact apparently did not prevent them from concluding that {{head}} should work there as it does in mainspace. So given that reality, adding "proto:" to the code seems like nothing more than pointless bureaucracy, which really does not help in the slightest with what you intend it to do. I agree with you that {{langprefix}} was a bad idea, but it was created because there was a need for it. That need would not have existed if we had decided back then to drop the prefixes from the templates. So, perhaps ironically, {{langprefix}} would not have existed today had you conceded then. —CodeCat 04:59, 17 February 2013 (UTC)Reply

I'm not suggesting that we add proto: to the language code: it's already there. (See e.g. {{proto:gem-pro}}.) I'm suggesting that we not remove it. (And I'm not sure what you mean by "pointless bureaucracy", anyway, since assigning codes is inherently an exercise in bureaucracy, and there's no difference bureaucracy-wise between assigning codes like proto:gem vs. gem-pro vs. Proto-Germanic.) And you're right about {{langprefix}}: if I hadn't insisted on making a distinction, or if you hadn't insisted on making no distinction, or if Daniel hadn't insisted on making all templates as complicated as possible, then we wouldn't have ended up with the worst-of-all-worlds situation that we have now. —Ruakh 15:40, 17 February 2013 (UTC)Reply

Well, considering that we'd still want to mark our text HTML-wise as Proto-Germanic, we'd necessarily have to use something like "proto:gem-pro". Being HTML-correct is really the main reason why we don't use the prefixes as part of the "proper" code, and the consequence is that the "proto:" that is part of the language template's name becomes nothing more than a technical barrier, which in turn necessitated {{langprefix}}. I think we both seem to agree that it's not a good thing. So which options do we have?

1. Keep as it is. That seems like the most complicated option to me, because prefixed template names don't easily translate to prefixed names in Lua. Unless we somehow decide to store the codes with the prefixes, and add them each time we want to look them up (essentially, Lua-fy {{langprefix}} along with the rest)? That seems rather cumbersome, and like I mentioned, we could just decide to use different tables for reconstructed languages to the same effect, if this is what we want. Nevertheless, if we decide that the method for accessing codes is going to be different for different types, then some analogue to {{langprefix}} is going to be inevitable.
2. Remove the prefixes entirely (use just "gem-pro" everywhere). This will probably be the easiest to implement, because we already use prefixless codes in all of our entries. This option allows us to get rid of any nastiness that prefixes involve, and has the benefit that the code that is typed in articles is the same that will appear in the HTML lang= attribute.
3. Add the prefixes as part of the canonical code (use just "proto:gem-pro" everywhere). This will require a bot to update all the uses. This makes the prefix somewhat redundant because the code already ends with -pro, but that will not apply to, for example, Klingon (conl:tlh in this scheme). It does have the advantage (to you at least) that it's explicit that the code is "different". However, a disadvantage is that any code that creates HTML (which in practice will be most if not all) will have to remove the prefix before putting it in the lang= attribute. Essentially then, we end up with a kind of "reverse langprefix", except one that will need to be used far more often than langprefix itself currently is.
4. Some other option?

—CodeCat 17:30, 17 February 2013 (UTC)Reply
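
The "reverse langprefix" step from the third option can be sketched as follows. This is a hypothetical Python illustration, not existing template or module code; the function name is invented, and the list of prefixes is an assumption based only on the two examples in this thread (proto: and conl:):

```python
KNOWN_PREFIXES = ("proto:", "conl:")  # assumed from the examples in this thread


def html_lang(code: str) -> str:
    """Drop a wiki-internal prefix so that only the bare code is left for lang=."""
    for prefix in KNOWN_PREFIXES:
        if code.startswith(prefix):
            return code[len(prefix):]
    return code  # ordinary codes pass through unchanged


print(html_lang("proto:gem-pro"))
print(html_lang("conl:tlh"))
print(html_lang("en"))
```

Every piece of code that emits HTML would need a call like this, which is the cost that option 3 describes. Whether the stripped code is itself acceptable as a lang= value is a separate question.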

lang="gem-pro" is not valid, so we actually don't (or at least, shouldn't) want that in our HTML. —Ruakh 01:39, 18 February 2013 (UTC)Reply

But then neither should any of our other exceptional codes. Does HTML actually mandate the use of ISO-639 as part of its standard? And if so, what does it say should be done with content that is not in an ISO-recognised language? —CodeCat 02:25, 18 February 2013 (UTC)Reply

Re: first sentence: Yes, I absolutely agree. (This also applies, in a slightly different way, to a handful of exceptional codes used by WMF in general, rather than en.wikt specifically.)

Re: second sentence: Not exactly ISO 639, but lang="gem-pro" isn't valid. What the spec requires is that the value of lang="…" "be a valid BCP 47 language tag, or the empty string", where the empty string means "the primary language is unknown".^[13] A BCP 47 language tag is not the same as an ISO 639 language code, though they're related, and have a lot of overlap. (For example: es "Spanish" is both; es-ES "Spanish Spanish" is a valid BCP 47 language tag but not an ISO 639 language code (the es subtag is an ISO 639 language code, the ES subtag is an ISO 3166 country code); and spa "Spanish" is a valid ISO 639 language code but not a valid BCP 47 language tag (because BCP 47 requires that two-letter codes be preferred to their three-letter synonyms).) For a friendly-but-thorough guide to BCP 47, see http://www.w3.org/International/articles/language-tags/ and http://www.w3.org/International/questions/qa-choosing-language-tags.

Re: third sentence: It really depends. In the case of gem-pro, we can keep the gem subtag, but the pro subtag is invalid (it means Old Provençal, and can only be used as the first subtag of a language tag). We might write lang="gem" (perfectly valid, but vague: it means "some Germanic language"), or we might write something like lang="gem-x-proto" (also valid: x means "private use", and everything after it merely has to be syntactically valid, since its semantics are defined by private agreement). Obviously neither of these is ideal, but I think we're unlikely to find any better approach. (You can read the three pages I linked to, and form your own opinions.) —Ruakh 03:18, 18 February 2013 (UTC)Reply

I remember reading something about x- but I wasn't sure what it would be needed for. As I understand it right now, a tag is split into parts with hyphens as separators. When a part is "x" it means "everything following is nonstandard". So the x-pro part means "do not try to interpret 'pro' as you normally would". If I'm not mistaken, "x-gem-pro" would also be valid? But browsers would not be able to understand from such a code that it is Germanic, whereas with "gem-x-pro" they would parse "gem". Is that correct? —CodeCat 03:41, 18 February 2013 (UTC)Reply

Yes, that's correct. (Well, except the word "parse". Technically even the stuff after x is still parsed, so it still has to be syntactically valid. Something like lang="gem-x-proto_germanic" would not be valid. But I assume you're just using the word "parse" colloquially, and I shouldn't read too much into it?) —Ruakh 04:01, 18 February 2013 (UTC)Reply
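
The syntactic point can be illustrated with a rough well-formedness check. The Python sketch below is a simplification of the BCP 47 (RFC 5646) grammar: it ignores grandfathered tags and checks shape only, not whether each subtag is actually registered. The function name is made up for this example:

```python
import re

# Simplified BCP 47 shape: language, then optional script, region, variants,
# extensions, and a private-use section introduced by the singleton "x".
_TAG = re.compile(
    r"""
    (?:
        (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})   # language (plus extlangs)
        (?:-[a-z]{4})?                                 # script
        (?:-(?:[a-z]{2}|[0-9]{3}))?                    # region
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*       # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*           # extensions
        (?:-x(?:-[a-z0-9]{1,8})+)?                     # private use
      |
        x(?:-[a-z0-9]{1,8})+                           # private-use-only tag
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_well_formed(tag: str) -> bool:
    """Return True if the tag matches the simplified grammar above."""
    return _TAG.fullmatch(tag) is not None


for tag in ("es-ES", "gem-x-proto", "x-gem-pro", "gem-x-proto_germanic"):
    print(tag, is_well_formed(tag))
```

The underscore makes gem-x-proto_germanic fail even though it sits after x. Because this checks syntax only, gem-pro also passes: pro has the shape of an extended-language subtag, even though, as explained above, it is not valid in that position.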

Yes sorry, I meant it more as that it doesn't try to interpret the meaning of what it reads. In any case, if that is how it is, then I think we should change the codes we currently use so that they match the standard. It seems a bit hypocritical that we worry about other parts of the lang= attributes but ignore this. I think the easiest way would be to insert -x- in all the exceptional codes, but we could also change -pro to -proto if that is allowed. We could, as an exception, decide to leave out that part in template names, so that the old name is still used for naming templates (out of convenience), like {{gem-verb}} or {{ine-noun}}. Going back to your desire to make proto-languages appear distinctive... do you think that it's distinctive enough if the code ends in "-pro(to)"? —CodeCat 04:14, 18 February 2013 (UTC)Reply

Yeah, {{context|…|lang=proto:gem-x-proto}} might be overkill. Personally I'd prefer {{context|…|lang=proto:gem}}, but I think I can accept {{context|…|lang=gem-x-proto}} as a compromise. (It's not ideal from my standpoint, because the -x- is really there because Ethnologue doesn't include the language, rather than because we don't — we might end up having -x- in some languages we allow, and lacking it in some languages we don't — but I think I can accept it.) —Ruakh 04:33, 18 February 2013 (UTC)Reply

So you would prefer -proto over -pro then? What would be the next step for this change? We'd probably want to discuss it more widely first... —CodeCat 04:39, 18 February 2013 (UTC)Reply

Yes, I'd prefer -proto over -pro. (My impression is that the only reason -pro was introduced is that someone thought it was more standard to use a three-letter subtag. Which is sort-of true — real extension subtags are three letters — but pro isn't and can't be a real extension subtag, so it's nonstandard either way.) —Ruakh 04:42, 18 February 2013 (UTC)Reply

olibanum

Would appreciate some {{attention}} here if you have a sec…. Ƿidsiþ 07:36, 16 February 2013 (UTC)Reply

Done. —Ruakh 21:31, 16 February 2013 (UTC)Reply

Cheers. Ƿidsiþ 21:51, 16 February 2013 (UTC)

Another one: tohu-bohu! Ƿidsiþ 15:29, 19 February 2013 (UTC)Reply

Whitelisting pages

Happy Purim. I'm almost positive you know how to whitelist pages; so do I in general, but this case is complicated enough that I'm afraid of getting it wrong and wonder if I can bother you, please, to handle it (if, of course, you agree). The relevant discussion was 'closed' here and here's a handy link to the JS.—msh210℠ (talk) 05:37, 24 February 2013 (UTC)Reply

Happy Purim! And yeah, I'm the one who wrote MediaWiki:Gadget-PatrollingEnhancements.js, so am generally your best bet for changes to it. (Obviously we all try to write JS that anyone else can understand and edit, but that's easier said than done.) I've now made the change that you indicate; see MediaWiki:Gadget-PatrollingEnhancements.js?diff=19626792&oldid=19232663. —Ruakh 06:22, 24 February 2013 (UTC)Reply
- Thanks!—msh210℠ (talk) 07:33, 24 February 2013 (UTC)Reply

Um, can somebody please create an entry for פורים? TIA —Μετάknowledge^{discuss/deeds} 06:52, 24 February 2013 (UTC)Reply

Done.—msh210℠ (talk) 07:33, 24 February 2013 (UTC)Reply

Thanks! —Μετάknowledge^{discuss/deeds} 15:44, 24 February 2013 (UTC)Reply

Name of Module:he-utilities

I had already created Module:sl-common for a similar purpose. It's probably better to use the same names, so which should we use? —CodeCat 01:37, 25 February 2013 (UTC)Reply

Let me preface this by saying that I think we're in a bit of a discovery-and-experimentation phase, so for things that don't affect anything and are easily changed, it might not be worth worrying too much about consistency just yet. I mean, it's nice for different languages' modules to have similar names, so they're easier to discover and remember, but it doesn't really matter if they don't. Note that template-names are not always consistent between languages, and it's never really caused a problem; the difference in behaviors of different languages' templates has always dwarfed the difference in names. And while it's a bit ugly to do so, we can always create what might be called "shims" or "pseudo-redirects", e.g. creating Module:he-common as return require('Module:he-utilities'). (Disclaimer: not tested.) But I don't actually object to consistency, of course, and since you've raised the point, I'll reply.

Re: "common" vs. "utilities": I have no preference. To me the names imply slightly different things, but both seem applicable here. If you prefer "common", feel free to move Module:he-utilities accordingly.

—Ruakh 03:34, 25 February 2013 (UTC)Reply

Red (well, black) links in Latvian inflection tables

You had asked me about a month ago to add the Latvian words for certain grammatical notions that are mentioned in Latvian inflection tables. I have just finished doing that, and I've also crossed out the respective items at User:DTLHS/WantedPages. I just wanted to let you know. --Pereru (talk) 21:00, 25 February 2013 (UTC)Reply

Thanks! —Ruakh 07:21, 26 February 2013 (UTC)Reply

Lua: Calling a function through a string that has its name

In Module:nl-verb, the export.conjugate function has what is basically a switch statement that "forwards" the call to the correct function. But that could be written more neatly if I could just tell it to call "conjugate_" .. conj_type as though it were a function. In other words, to call a function through a string with its name (which you can construct dynamically). Do you know if it's possible to do this? —CodeCat 21:46, 27 February 2013 (UTC)Reply

I don't believe there's any way to do exactly what you describe — AFAIK locally-scoped identifiers are only available statically — but it's easy to do approximately what you describe; I've edited Module:nl-verb to show what I mean. Actually, you were already most of the way there. —Ruakh 05:07, 28 February 2013 (UTC)Reply

It should be possible in theory, because you can already put functions into tables, so maybe there is a table that stores all globals or something like that? I remember in PHP you could do this quite easily. —CodeCat 14:51, 28 February 2013 (UTC)Reply

There is a table that stores all globals — it's _env — but due to the nature of lexical scope, local variables are a different beast entirely. (But why does it matter?) —Ruakh 15:36, 28 February 2013 (UTC)Reply

Well, before you changed it, those functions were global, so they could have been called through that table without making one myself. On the other hand, for "security" reasons it might be better to explicitly restrict the list of callable functions (so that someone doesn't try to invoke, say, the function "make_table"). —CodeCat 15:47, 28 February 2013 (UTC)Reply
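
The "explicitly restricted list" approach can be sketched as a dispatch table. The example is in Python rather than Lua, purely as an illustration of the technique; it is not the code of Module:nl-verb, and the function names, conjugation types, and placeholder outputs are all hypothetical:

```python
from typing import Callable, Dict


def conjugate_weak(stem: str) -> str:
    return "weak:" + stem  # placeholder for real conjugation logic


def conjugate_strong(stem: str) -> str:
    return "strong:" + stem  # placeholder for real conjugation logic


def make_table(forms: str) -> str:
    return "<table>" + forms + "</table>"  # helper that must not be reachable by name


# Only the functions listed here can be selected through a string.
CONJUGATORS: Dict[str, Callable[[str], str]] = {
    "weak": conjugate_weak,
    "strong": conjugate_strong,
}


def conjugate(conj_type: str, stem: str) -> str:
    """Look the handler up by name and fail loudly for anything not whitelisted."""
    handler = CONJUGATORS.get(conj_type)
    if handler is None:
        raise ValueError("unknown conjugation type: " + conj_type)
    return make_table(handler(stem))


print(conjugate("weak", "werk"))
```

Because the lookup goes through CONJUGATORS rather than through a table of all globals, a caller cannot reach make_table by passing its name as the conjugation type.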

Re: "those functions were global": Oh my gosh, you're right. I had assumed that function declared local identifiers, like in JavaScript, but it doesn't. That explains why you were able to call functions before their declarations.
I think it goes without saying that we should avoid global variables, including global functions.
—Ruakh 15:57, 28 February 2013 (UTC)Reply

I don't think that makes sense. Being able to order functions the way we want, without having to worry about declaring them in advance, is a very good thing. Functions are global in most other languages, too. Then again, I wonder what significance "global" has in this case. If a function is global in one module, can it be called globally from another module that imports it? —CodeCat 16:11, 28 February 2013 (UTC)Reply

That is a fascinating question. I had simply taken for granted that "global" means "global" — why else, for example, would we be writing local p = {} in all our modules — but you are quite right to ask it, because as far as I can tell by testing, global variables are actually not shared between modules. Not only does a module not see the globals of a module that it imports, but what's more, even the MediaWiki-provided globals are not really shared. For example, every module has mw, but setting mw.foo in one module does not affect other modules that import it. (Incidentally, the same is true of the debug console: it doesn't see globals created within the body of the module you're debugging.) So, yeah, disregard my previous statement: we can use "globals" all we want.
By the way, I mentioned _env above, but I was misreading the documentation: it's actually _G.
—Ruakh 03:56, 1 March 2013 (UTC)Reply

March 2013

Something funny

Lua's syntax requires that functions have names that are identifiers, but since functions can also be put in tables, you can really name them anything. I discovered that that includes:

local export = {}

export[""] = function(frame)
    -- ...
end

return export

{{#invoke:something|}}

I thought that was kind of funny, since I don't know many other languages that let you do this. —CodeCat 02:34, 2 March 2013 (UTC)Reply

Maybe I'm missing what you're getting at, but it seems to me that the same is true in most scripting languages, including Perl ($foo{''} = sub { ... }), JavaScript (window[''] = function () { ... }), and Python (foo[''] = lambda : ...). You'll find this in any language that offers associative arrays and first-class functions. —Ruakh 07:04, 2 March 2013 (UTC)Reply
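
For comparison, a runnable Python version of the same trick. The names export and frame are borrowed from the Lua snippet above only for readability; nothing here is Scribunto code:

```python
from typing import Callable, Dict

export: Dict[str, Callable[[str], str]] = {}

# The key is an ordinary string, so it does not have to be a valid identifier.
export[""] = lambda frame: "called with " + frame

# Look the function up by its (empty) name and call it.
print(export[""]("frame"))
```

The function itself is anonymous; only the dictionary key gives it a "name", which is why the empty string works.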

How do you think I should do this?

I would like to work on converting most uses of {{head}} in Dutch to templates specific to Dutch (which are Lua-fied and faster). But many of them would essentially be the same, because for most of them only a headword is actually needed, nothing else. It definitely makes sense for there to be a single Lua function that does all of them (with a parameter to specify which PoS to categorise in). But I'm not sure what to do on the template side of things. The current approach taken by most languages is to have a single template for each PoS, so with Lua that would mean having many templates all invoke the same Lua function, with a parameter to specify the PoS. An alternative approach is to create {{nl-head}} and give it a parameter that is then passed onto Lua. Basically, it would move the PoS-parameter from being in the templates to being in the entries themselves. However, there is the danger that some editors will see {{nl-head|preposition}} and think, hey, I can probably write {{nl-head|noun}} too! And that's something we definitely don't want. —CodeCat 15:40, 5 March 2013 (UTC)Reply

Re: first sentence: We really need to Lua-ify {{head}}, too. For obvious reasons, I think it'll be a while before we figure out languages well enough that we can really Lua-ify {{head}} properly, but I don't think there's much benefit to moving away from {{head}} in the meanwhile (at least for that reason). Besides, unless there are ==Dutch== entries with large numbers of POS sections, the Dutch-specific templates aren't "faster" in any meaningful sense. (Is a page that takes 18.1ms "faster" than a different page that also takes 18.1ms?)

Re: {{nl-head|noun}}: If that's really something that we never want, then I don't think there's really a problem, since {{nl-head}} can simply add a cleanup category, or simply call the function for {{nl-noun}}, which will add a cleanup category due to missing arguments.

—Ruakh 16:41, 5 March 2013 (UTC)
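A rough sketch of the cleanup-category idea, in plain Lua. This is not the real Module:nl-head: the function name `head`, the `generic_pos` whitelist, and both category names are assumptions chosen only to make the idea concrete.

```lua
-- Hypothetical sketch; not the real Module:nl-head.
-- PoS values a generic template may request. Anything else (e.g. "noun",
-- which has its own template) is flagged with a cleanup category.
local generic_pos = { preposition = true, interjection = true, adverb = true }

local function head(headword, pos)
    local out = "'''" .. headword .. "'''"
    if generic_pos[pos] then
        return out .. '[[Category:Dutch ' .. pos .. 's]]'
    end
    -- Assumed category name, for illustration only.
    return out .. '[[Category:Dutch entries needing cleanup]]'
end
```

With this shape, {{nl-head|noun}} still renders a headword, but the entry turns up in a maintenance category where it can be found and fixed.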

You haven't really answered my question though. I am wondering whether having separate templates like {{nl-prep}}, {{nl-interj}} the way we currently do is preferable to having a single {{nl-head}} with a parameter. I do prefer it for consistency reasons, but I thought that some people might not like it because it leads to creating a separate template for every PoS (even if the Lua code behind each is shared). —CodeCat 18:06, 5 March 2013 (UTC)

I have no preference. This question seems better-suited to Wiktionary talk:About Dutch. —Ruakh 06:13, 6 March 2013 (UTC)

Memory Stick and memory stick

Hi there. I wasn't aware that removal of (supposed) definitions like that required verification in that way (although it makes sense, I'm simply not hugely familiar with Wiktionary rules; I have spent a lot more time editing Wikipedia and am more used to the burden of proof being the other way round). Thanks for pointing me to rfv-sense. Alphathon (talk) 06:55, 7 March 2013 (UTC)

The burden of proof is still the same way 'round: the sense will be removed if no one presents evidence for it. (You don't have to present evidence for its nonexistence, or anything like that.) It's just that we prefer to leave the content in-place, with the warning tag, while the initial discussion is going on. —Ruakh 07:04, 7 March 2013 (UTC)

Split comma-separated genders

I am not sure if that is a good idea. Besides making the module slower, it would also end up enabling that behaviour for all templates, even though most of them would probably not need it. I would prefer it if the calling module would perform the split, rather than the gender-and-number module. —CodeCat 16:20, 8 March 2013 (UTC)

I figured that most templates would need it, and that it was best to handle it consistently in the ideal way, rather than having some templates split on comma, some split on space, some that support only a single gender/number specification, and so on. (Incidentally, if you want to change it to accept only commas or only spaces, I'd be down with that. I wasn't sure which was better, but supporting both is probably actually not good, because then people will be inclined to use a comma followed by a space, and the module doesn't currently handle empty specifications intelligently. Alternatively, of course, we could change the module to handle empty specifications intelligently.) The major group of templates that don't need it are the ones that completely generate their gender/number information internally (rather than taking it from template parameters), and of course, such templates can simply ignore this feature. I don't buy the "making the module slower" argument, because the module always calls split at least once anyway, so even if split were this incredibly expensive function that dwarfed all other aspects of the module, this would still only be a less-than-factor-of-two slowdown. —Ruakh 03:38, 9 March 2013 (UTC)
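One way of handling empty specifications intelligently could look roughly like the sketch below. It is plain Lua, not the module's actual code: the function name `split_specs` is hypothetical, and it assumes commas are the only separator, with surrounding whitespace tolerated.

```lua
-- Split a gender/number parameter such as "m, f-p" on commas,
-- trimming whitespace and skipping empty specifications.
local function split_specs(text)
    local specs = {}
    for piece in text:gmatch('[^,]+') do
        local spec = piece:match('^%s*(.-)%s*$')  -- trim both ends
        if spec ~= '' then
            specs[#specs + 1] = spec
        end
    end
    return specs
end
```

Under these assumptions "m,f" and "m, f" give the same result, and a stray trailing comma is simply ignored.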

The slowdown for a single call will not be terribly significant, but this function may be called hundreds of times on a page because of the genders in translation tables. So every small amount can easily multiply, and it's probably a good idea to remove anything that isn't strictly necessary. I don't know which templates would actually need it, though. Can you give an example of a case where multiple genders can't be passed in as a list? —CodeCat 10:13, 9 March 2013 (UTC)

Templates don't support lists. (BTW, the Lua term is actually "sequence", but I'll use your terminology for this comment.) So any Luified template that accepts multiple genders from the user will have to offer some non-list mechanism for doing so. One approach is to have a series of separate parameters (say, 1 and g2 and g3) and then assemble them into a list. Another is to have a single parameter and split it into a list. I think the latter is clearly superior from the user's standpoint.
Re: "it's probably a good idea to remove anything that isn't strictly necessary": Fortunately, it's obvious that you don't actually believe that, because if you did, then the module would only contain export.format_single and export.COMMA (the latter being "'', ''"), and calling modules would assemble their genders into a string rather than into a list. Instead, you made an effort at encapsulation, at exposing only a single function, export.format, for software-engineering reasons. I think that's fine. But it is clearly inferior from a performance standpoint, and the only reason to do it is if we care about humans.
—Ruakh 19:14, 9 March 2013 (UTC)
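The two mechanisms described above can be sketched side by side in plain Lua. Both function names are hypothetical, and `args` merely stands in for a table of template parameters. Either way, the formatting code ends up with the same list.

```lua
-- Approach 1: separate template parameters (1, g2, g3, ...) gathered into a list.
local function genders_from_params(args)
    local list = {}
    if args[1] then list[#list + 1] = args[1] end
    local i = 2
    while args['g' .. i] do
        list[#list + 1] = args['g' .. i]
        i = i + 1
    end
    return list
end

-- Approach 2: one parameter holding every gender, split on commas or spaces.
local function genders_from_string(text)
    local list = {}
    for spec in text:gmatch('[^,%s]+') do
        list[#list + 1] = spec
    end
    return list
end
```

The difference is entirely in what the editor has to type: |m|g2=f for the first, |m,f for the second.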

Ok, I understand that part. But the main module doesn't have to support the list-splitting itself. For example, take {{es-noun}}, which accepts "mf" as a gender. There is nothing wrong with that, but it is incompatible with both the new module and the old templates. Consequently, the template has to convert the gender information into a new format, through a conditional which then forwards it on to the templates {{m|f}}. What I am proposing is to allow each individual template to specify in its own terms how multiple genders are to be indicated, and to expose a single interface on the Lua side, using a table of strings. An example would be Module:nl-head, which has a g2= parameter. If a template decides to combine multiple genders into one parameter, then it also carries the responsibility of splitting/interpreting its parameter before passing it on to Module:gender and number. So basically, I am arguing that splitting on commas/spaces should not happen in Module:gender and number, but in the modules that call it. Of course, if you think that we should rather get rid of multiple parameters for genders (therefore, remove the g2= parameter) and use a single string that contains all information encoded within it in some format, then that's different. But I'm not sure what benefits there would be in such an approach. The advantage of forcing each module to take "responsibility" for the split itself is that it is able to analyse the genders itself and perhaps add categories. For example, both {{nl-noun}} (through Module:nl-head) and {{sl-noun}} check to see whether the gender is correct; such a check would be more difficult if the whole multi-gender parameter is passed verbatim to Module:gender and number, and would probably mean that the calling module has to split the string anyway to get the information from it, which somewhat defeats the purpose of deciding to let Module:gender and number perform the split. —CodeCat 19:35, 9 March 2013 (UTC)
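As a sketch of what "the calling module takes responsibility" might mean in practice, the plain-Lua fragment below splits the parameter, validates each gender, and returns a table of strings ready to hand to a formatter. The function name, the set of valid codes, and the category name are assumptions for illustration, not what Module:nl-head actually does.

```lua
-- Hypothetical calling-module logic: split first, then validate, then hand
-- a plain table of strings to the formatting module.
local valid_nl_genders = { m = true, f = true, n = true, c = true, p = true }

local function check_genders(param)
    local genders, categories = {}, {}
    for g in param:gmatch('[^,%s]+') do
        genders[#genders + 1] = g
        if not valid_nl_genders[g] and #categories == 0 then
            -- Assumed category name, for illustration only.
            categories[1] = 'Dutch entries with invalid gender'
        end
    end
    return genders, categories
end
```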

Re: {{es-noun|mf}}: If we want to keep these sorts of ad hoc notations, then fine, but it would be better to write {{es-noun|m,f}}. This way it's easy for all templates to do it the same way — a way that's (hopefully¹) easy for users to remember.
Re: g2=: Absolutely I think this is a bad user interface, incredibly inconsistent between otherwise-identical templates. We were restricted to these sorts of hacks when we were tacking multiple-gender support onto a system that already supported a single gender, but we aren't anymore! (Also, BTW, if you're going to be all microoptimization-obsessed beyond anything that could conceivably be measured, then I believe you should prefer splitting in Lua over templates that take multiple parameters.)
Re: {{nl-noun}} and {{sl-noun}}: If they perform the split anyway, then there's nothing to discuss; they can pass the resulting table into export.format, exactly as you'd already planned. (Note that your argument applies just as well to your existing list approach: both {{nl-noun}} and {{sl-noun}} have to loop over the gender specifications to validate them, which defeats the purpose of letting Module:gender and number handle the looping!)
1. Speaking of users, we should ask them about this. I mean, I already know what Stephen will say, and if you start the discussion then I already know what DCDuring will say, but we should ask normal editors, too.
—Ruakh 20:12, 9 March 2013 (UTC)

In that case, I think supporting a single gender parameter for all languages is a good idea. But I love to be nitpicky so I am not sure if I like separating them with commas. How about "m/f" instead of "m, f"? Keep in mind that while it may be nice to have the gender code entered the same way as it's displayed, there's no guarantee that the current module will always display "m, f". Maybe we will decide someday that we prefer "m or f" instead. —CodeCat 20:21, 9 March 2013 (UTC)

Yeah, I'm not married to commas. One thing that I don't like about my single-parameter approach is that it essentially creates a mini-language with two infix operators, so it needs to be obvious at a glance which operator has higher precedence. I'm not sure that commas meet that test: is it obvious that m,f-p means "{m},{f-p}" and not "{m,f}-{p}"? I'm not sure. A better option might be semicolons: m;f-p, maybe? Or maybe it's really not possible without spaces: m f-p or m, f-p or m; f-p or whatnot. One advantage of commas, of course, is that as long as we do display them with commas, the commas will be the easiest operator to remember. —Ruakh 21:21, 9 March 2013 (UTC)
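To make the precedence question concrete, here is a small plain-Lua sketch of the "{m},{f-p}" reading, in which the comma binds more loosely than the hyphen. The function name `parse_specs` is hypothetical and this is not the module's code.

```lua
-- Parse "m,f-p" so that ',' binds more loosely than '-':
-- the result is a list of specifications, each a list of codes.
local function parse_specs(text)
    local specs = {}
    for spec in text:gmatch('[^,]+') do
        local codes = {}
        for code in spec:gmatch('[^%-]+') do
            codes[#codes + 1] = code
        end
        specs[#specs + 1] = codes
    end
    return specs
end
```

Swapping the two patterns would give the other reading, which is exactly why the choice needs to be obvious to editors.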

I don't think it has to be obvious, as long as it's consistent. —CodeCat 22:52, 9 March 2013 (UTC)

(I'm a bit late to the party, I guess, and haven't even looked at the code y'all're discussing, so am commenting based only on what I've gleaned from the discussion here (and what minimal intelligence I can lay claim to).) How about m;f,p to code "masculine; feminine plural"? I think that's easy for non-coders to remember, as it sort-of matches normal English usage. Even better, how about m,fp or m;fp (but only if concatenating without delimiters is possible, which I don't know)? —msh210℠ (talk) 04:26, 10 March 2013 (UTC)

Re: concatenated without delimiters: That would be awkward, because not all the templates in Category:Gender and number templates have single-letter names. I don't see any truly ambiguous potential sequences, but it would still be icky. (Also, that category doesn't contain all possible codes. {{pf.}} and {{impf}} belong to essentially the same class, and CodeCat is now hoping to introduce an, in, and pr.) But comma-and-semicolon seems fine to me. —Ruakh 07:02, 10 March 2013 (UTC)

Perfect and imperfect can be separated from the others, because they are used for a different part of speech. At least, I'm not aware of any verb that has gender. Verb forms may, but we indicate that in the form-of definition rather than on the headword line, and no verb lemma has a single gender as far as I know. —CodeCat 13:30, 10 March 2013 (UTC)

It's true that we're unlikely to combine {{pf.}} ("perfective") or {{impf}} ("imperfective") with {{m}} or {{p}}, but we probably want the same module to handle them, for a few reasons:

{{t|xx|foobar|m}} and {{t|xx|foobar|impf}} should both work, and {{t}} shouldn't need to examine its argument to try to decipher what module handles it.
we want user-input to be handled analogously; if m;f means "masculine or feminine", and pf and impf mean "perfective" and "imperfective" (respectively), then pf;impf should mean "perfective or imperfective".
we want presentation to be analogous.

Incidentally, your pr code ("personal") suggests a point of possible overlap between noun classes and verb classes, though that doesn't really matter in and of itself.

—Ruakh 23:24, 10 March 2013 (UTC)
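A sketch of what "the same module handles them" could look like, in plain Lua: one lookup table covers gender, number and aspect codes, so m;f and pf;impf go through the same path. The table contents, the `describe` function and the " or " wording are assumptions for illustration.

```lua
-- Hypothetical code table shared by gender, number and aspect codes.
local names = {
    m = 'masculine', f = 'feminine', n = 'neuter', p = 'plural',
    pf = 'perfective', impf = 'imperfective',
}

-- "m;f" -> "masculine or feminine"; "pf;impf" is handled the same way.
local function describe(text)
    local parts = {}
    for code in text:gmatch('[^;]+') do
        parts[#parts + 1] = names[code] or ('unknown code "' .. code .. '"')
    end
    return table.concat(parts, ' or ')
end
```

A caller such as {{t}} would then never need to inspect its argument to work out which kind of code it was given.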

Final-removing Lua function

Do we have a Lua function that takes a single Hebrew-script word and returns it unchanged if it contains no finals, but otherwise converts the finals to their medial forms? I could use one for Yiddish, and I imagine it could be quite helpful for Hebrew as well. Please note that the requirements will be slightly different, though, because in Yiddish the medial form of Template:Hebr is Template:Hebr. Thanks! —Μετάknowledge^{discuss/deeds} 03:47, 15 March 2013 (UTC)

I've now created Module:yi-utilities with such a function. You might also look through Module:he-utilities and see if there's anything there that you want to appropriate. (It does have a function to convert an individual letter from medial-or-final to medial, but not one that accepts an entire word. I guess there's no reason it couldn't.) —Ruakh 17:21, 15 March 2013 (UTC)
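For readers who want to see the general shape of such a function, here is a minimal plain-Lua sketch. It is not the code in Module:yi-utilities or Module:he-utilities: the names are hypothetical, it covers only the five standard Hebrew final letters, it does not model the Yiddish-specific difference mentioned above, and it assumes the input is UTF-8.

```lua
-- Standard Hebrew final letters and their medial counterparts.
-- (The Yiddish-specific difference mentioned above is not modelled here.)
local medial_of = {
    ['ך'] = 'כ', ['ם'] = 'מ', ['ן'] = 'נ', ['ף'] = 'פ', ['ץ'] = 'צ',
}

-- One UTF-8 encoded character: a lead byte plus any continuation bytes.
local utf8_char = '[\1-\127\194-\244][\128-\191]*'

local function finals_to_medials(word)
    -- gsub leaves a character unchanged when the table has no entry for it.
    return (word:gsub(utf8_char, medial_of))
end
```

Inside Scribunto one would more likely reach for mw.ustring.gsub. The byte-level pattern is used here only so that the sketch runs in stock Lua.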

Excellent! Thanks! —Μετάknowledge^{discuss/deeds} 19:38, 15 March 2013 (UTC)

Some advice?

I'm working on Module:ca-head, and I have a question about the make_plural function. The way it should work is like this: each replacement is tried in sequence, and as soon as a replacement is made, the result is returned. It works already, but it seems like a rather bad way to do it because each possibility has to be matched twice: first to see if it's in the string, and then again to do the actual replacement. Would you know of a more elegant way to do this, one which automatically "aborts" all remaining possibilities once a successful replacement is made? —CodeCat 21:30, 20 March 2013 (UTC)

I think "elegant" is subjective, but probably the tersest approach is to write a helper function that handles an arbitrary number of non-cascading substitutions — maybe something like this:

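-- Tries each (pattern, replacement) pair against the end of base, in order,
-- and returns the result of the first pair that matches; nil if none match.
local function ending_swapper(base, ...)
    local swaps = { ... }
    for i = 1, #swaps, 2 do
        local ret, n = mw.ustring.gsub(base, swaps[i] .. '$', swaps[i+1])
        if n > 0 then
            return ret
        end
    end
    return nil
end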

and then use it something like this:

function make_plural(base, gender)
    -- Final -a becomes -es, with spelling adjustments to the preceding consonant.
    local ret = ending_swapper(base, "ça","ces", "ca","ques", "qua","qües", "ja","ges", "ga","gues", "gua","gües", "a","es")
    if ret then return ret end
    -- Final accented vowel: drop the accent and add -ns.
    ret = ending_swapper(base, "à","ans", "[èé]","ens", "([gq])uí","%1uins", "([aeiou])í","%1ïns", "í","ins", "[òó]","ons", "ú","uns")
    if ret then return ret end
    if gender:find("^mf?$") then
        -- Masculine (or masculine-and-feminine): final sibilant takes -os.
        ret = ending_swapper(base, "às","asos", "[èé]s","esos", "([gq])uís","%1uisos", "([aeiou])ís","%1ïsos", "ís","isos", "[òó]s","osos", "ús","usos", "[çsxz]","%0os")
        if ret then return ret end
        -- Two plurals are returned for -sc, -st, -xt.
        if base:find("sc$") or base:find("st$") or base:find("xt$") then return base .. "s", base .. "os" end
    end
    if gender == "f" then
        -- Feminines in -s are returned unchanged.
        if base:find("s$") then return base end
        if base:find("sc$") or base:find("st$") or base:find("xt$") then return base .. "s", base .. "es" end
    end
    -- Default: add -s.
    return base .. "s"
end

. . . which isn't an all-or-nothing deal. For example, you could take the concept of having a helper function that calls gsub and that returns nil when there's no match, but instead of taking many arguments at once, you could chain the calls like

return h(base, "ça", "ces") or h(base, "ca", "ques") or ... or (gender:find("^mf?$") and (h(base, "às" ,"asos") or h(base, "[èé]", "esos") or ...)) or ...

. Or whatever.

—Ruakh 06:52, 21 March 2013 (UTC)
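The chained-or variant can be tried out in stock Lua with a cut-down rule set. The sketch below is an illustration only: `simple_plural` is a hypothetical name, it covers just three ASCII endings, and it uses plain string.gsub where the real module would need mw.ustring.gsub for the accented character classes.

```lua
-- Replace a word-final pattern; return nil when nothing matched so that
-- calls can be chained with "or" and the first success wins.
local function h(base, ending, replacement)
    local ret, n = base:gsub(ending .. '$', replacement)
    if n > 0 then return ret end
    return nil
end

-- Simplified plural rule: ASCII endings only, so plain string.gsub suffices.
local function simple_plural(base)
    return h(base, 'ca', 'ques')
        or h(base, 'ga', 'gues')
        or h(base, 'a', 'es')
        or base .. 's'
end
```

Because `or` short-circuits, no later pattern is even tried once one call returns a string, which is the "abort the remaining possibilities" behaviour asked for.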

It does look like an ok solution, not the clearest one though. Terseness is nice but it shouldn't be detrimental to code clarity. I do like your or-solution though... that kind of fits my idea of "elegant" because it uses the language's own idioms. I'll see what I can do. Thank you. —CodeCat 13:42, 21 March 2013 (UTC)

A request for your input

Can you have a look at Module talk:ru-translit#How can this be used from another Lua module?? —CodeCat 12:57, 11 April 2013 (UTC)

technical question

Hi. Could you tell me which parameter is added to a URL, for example http://en.wiktionary.org/w/index.php?title=Title&action=edit, to display the names of MediaWiki messages instead of the normal text, such as (PAGETITLE) instead of the page's title? I can't remember it, and I can't find it here (http://www.mediawiki.org/wiki/Manual:Parameters_to_index.php). Maro 23:55, 14 April 2013 (UTC)


User talk:Ruakh: difference between revisions
