Wikipedia:ᎾᎥ ᏄᎾᏓᎸ ᎾᎿᎢ

Ideas to aggressively delete or mostly blank poor-quality entries

I am currently streaming a video conference about the Cherokee language and language-revitalization efforts from Western Carolina University's Cherokee Language program[1], where Dr. Hartwell Francis, the former director, is speaking. To paraphrase him, he stated that the Cherokee-language community has been aware of this project for a decade, but that many of the current Wikipedia articles are of remarkably poor quality because of the influence of bot-generated entries. This makes even looking at them incredibly demotivating because the challenge of fixing them is so daunting. Thus, he suggests cleaning up the project and perhaps revitalizing it through aggressive deletion of low-quality entries.

A possible alternative to mass deletion might be mass partial blanking to encourage more human-created edits. A poor-quality article could be reduced to some boilerplate text like "XXX is in need of an article" or "XXX had the previous article replaced with this text because it was poor-quality, but please feel free to begin a new article now", but in Cherokee. I am not a Cherokee speaker, but I could help make these edits (or teach someone else to) if a boilerplate text and a list of the low-quality, demotivating entries were provided. I plan to notify Dr. Hartwell Francis about this thread, and let him know that I believe a bold approach like this is warranted (and feasible) in order to deal with what appears to be a stagnant project. Any ideas? Biosthmors (talk) 16:00, 26 ᎠᏄᏱ 2018 (UTC)

Ping to User:Ooswesthoesbes, who I see has helped start short articles here (for example ᏣᎳᎩ ᎫᎾᏕᎶᏆᏍᏗ ᏚᏓᏥᏍᎬᎢ). Biosthmors (talk) 16:16, 26 ᎠᏄᏱ 2018 (UTC)

If partial blanking or mass deletion of those pages results in more activity from native (or at least fluent) speakers, I have no objections :) Although I believe pages like ᏅᏃᎯ are also important to look at, as they seem to be bot-generated articles that are not fully translated, and probably incorrectly translated as well. --Ooswesthoesbes (talk) 16:35, 26 ᎠᏄᏱ 2018 (UTC)
Thanks for the reply, Ooswesthoesbes. For ᏅᏃᎯ, yes, I'd guess, just based on hearing people talk about Cherokee Wikipedia, that a lot is probably incorrectly translated. The concept of turning articles into much shorter versions of themselves ("stubs") should also be considered. (For reference, on English Wikipedia the concept is referred to as "stubifying".) If a couple of sentences of useful content can be preserved or generated, then massively chopping a gargantuan poor-quality article down to a factual sentence or short paragraph is also a significant improvement. Biosthmors (talk) 17:20, 26 ᎠᏄᏱ 2018 (UTC)

Cherokee Wikipedia is a 2007 guinea pig victim of an experiment in machine translation run by Jeffrey Merkey. The shit was just dumped onto the pages en masse. Any page longer than one sentence was produced by a computer program, and no human ever looked at it. Seb az86556 (talk) 21:41, 26 ᎠᏄᏱ 2018 (UTC)

Thanks for the info. I found this showing more info about their contributions elsewhere. Biosthmors (talk) 21:53, 27 ᎠᏄᏱ 2018 (UTC)
It's a very bad situation indeed. Automatic translation is especially bad when used for languages that are totally unrelated and function with a different grammar, such as from English to Cherokee. I propose to use a bot to add a template on top of all pages, so we can work toward a way to fix them. --Ooswesthoesbes (talk) 13:41, 28 ᎠᏄᏱ 2018 (UTC)
My understanding is that there are several hundred articles on this wiki? I've been clicking the random page generator button (top left "ᎤᏍᏆᏂᎪᏗ ᎤᏆᏓᏛ", for anyone who doesn't know) and blanking down to "Subject ...", as I just did here. This is blunt, but I just want to get started. I did see someone edit a page on this wiki to add a category, marking an article as machine translated. Maybe most of the horrendous articles are already categorized? I'll post that category when/if I find it. Biosthmors (talk) 18:11, 29 ᎠᏄᏱ 2018 (UTC)
Category:Machine-translated articles. So after one cleans one up, remove the category? I just did this with ᏓᎬᎾ ᏥᏳ ᏗᏔᎳᏗᏍᏗ ᏚᎦᏘᎸᏒᎢ. Biosthmors (talk) 18:31, 29 ᎠᏄᏱ 2018 (UTC)
908 articles on chr Wikipedia as of February, for what it's worth. Biosthmors (talk) 22:53, 29 ᎠᏄᏱ 2018 (UTC)
If we just had a list of articles on here sorted by size, that would really help, since the large ones appear to be the garbage ones. Biosthmors (talk) 23:26, 29 ᎠᏄᏱ 2018 (UTC)
Special:LongPages is the link for this. Biosthmors (talk) 14:32, 3 ᎧᏩᏂ 2018 (UTC)
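
For anyone who would rather work from an offline list than from Special:LongPages, the "sorted by size" idea can be sketched in a few lines of Python. This is only an illustration: the `PageInfo` record, the `largest_pages` helper, and the sample titles and sizes are all hypothetical, and how the (title, size) pairs are obtained from the wiki is left open.

```python
from typing import NamedTuple


class PageInfo(NamedTuple):
    title: str       # page title (the sample data below is made up)
    size_bytes: int  # length of the page's wikitext in bytes


def largest_pages(pages, limit=10):
    """Return up to `limit` pages, largest first; ties are broken by title."""
    ranked = sorted(pages, key=lambda p: (-p.size_bytes, p.title))
    return ranked[:limit]


# Hypothetical sample data, not real figures from this wiki.
sample = [
    PageInfo("Example A", 120),
    PageInfo("Example B", 48_000),
    PageInfo("Example C", 3_500),
]

for page in largest_pages(sample, limit=2):
    print(f"{page.size_bytes:>8}  {page.title}")
```

Working from the top of such a list targets the very large pages first, which matches the observation above that the large articles appear to be the machine-generated ones.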

Thanks Biosthmors for starting this conversation and Ooswesthoesbes and Seb az86556 for providing context and clarification on the machine translation debacle. I see that editor was banned from Wikipedia a decade ago.

I also gave some remarks at the symposium at Western Carolina (I'm Derek). I think there would be interest in organizing an edit-a-thon out at Western Carolina, and hopefully we could get some of the fluent speakers to attend along with advanced learners. It would be great if, in preparation, we could clean all the garbage articles out. I like the idea of blanking them with boilerplate. It would be nice if there was a bot to do this, but if there are just a few hundred articles, we could probably get it done manually? I'm not a speaker, but could coordinate the translation of the boilerplate or template by a fluent speaker.

What else should we be considering if we wanted to reboot this project with an edit-a-thon? --R12ntech (talk) 18:24, 9 ᎧᏩᏂ 2018 (UTC)

I'm pretty sure we can get a user to run a bot to mark all pages automatically. That shouldn't be too much work. --Ooswesthoesbes (talk) 08:18, 10 ᎧᏩᏂ 2018 (UTC)

Shall I make a request so someone will run a bot on this wiki to mark all pages with a category like [[Category:To be checked]]? Or would you prefer the category to have another name? --Ooswesthoesbes (talk) 09:23, 13 ᎧᏩᏂ 2018 (UTC)

That would be great, thanks. --R12ntech (talk) 15:40, 13 ᎧᏩᏂ 2018 (UTC)
A user has already indicated she is willing to help us out :) --Ooswesthoesbes (talk) 16:28, 18 ᎧᏩᏂ 2018 (UTC)
Hi. Is it correct that simply all pages in the main namespace should be added to the category "To be checked"? Should I run my bot with or without a bot flag? --MF-Warburg (talk) 12:12, 19 ᎧᏩᏂ 2018 (UTC)
Yes, that's correct. As there are virtually no edits on this wiki, I think a bot flag is not necessary. --Ooswesthoesbes (talk) 08:34, 20 ᎧᏩᏂ 2018 (UTC)
Done. By the way, I also unprotected a lot of pages in the Wikipedia namespace, see Special:Log/protect, which might be worth looking at / updating / deleting / ... --MF-Warburg (talk) 12:59, 20 ᎧᏩᏂ 2018 (UTC)
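
For reference, the tagging step discussed above (add every main-namespace page to "To be checked") could look roughly like the following Python sketch. It is not the code of the bot that was actually run; the function names and the `(namespace_number, title) -> wikitext` mapping are assumptions made for illustration, and fetching and saving pages is deliberately left out.

```python
import re

CATEGORY_NAME = "To be checked"  # category name proposed in this thread


def has_category(wikitext: str, category: str) -> bool:
    """True if the text already contains [[Category:<category>]], with or without a sort key.

    Assumption: the check is case-insensitive and only knows the English
    "Category" prefix, so it is looser than MediaWiki's own rules.
    """
    pattern = r"\[\[\s*Category\s*:\s*" + re.escape(category) + r"\s*(\|[^\]]*)?\]\]"
    return re.search(pattern, wikitext, flags=re.IGNORECASE) is not None


def add_category(wikitext: str, category: str = CATEGORY_NAME) -> str:
    """Append the category tag at the end of the page unless it is already there."""
    if has_category(wikitext, category):
        return wikitext
    return wikitext.rstrip("\n") + f"\n\n[[Category:{category}]]\n"


def tag_main_namespace(pages: dict, category: str = CATEGORY_NAME) -> dict:
    """`pages` maps (namespace_number, title) -> wikitext; only namespace 0 is changed."""
    return {
        key: add_category(text, category) if key[0] == 0 else text
        for key, text in pages.items()
    }
```

Because `add_category` leaves a page untouched when the tag is already present, running the sketch twice over the same pages would not add the category a second time.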

Random thoughts

I searched for ᎠᎹᏍᎧᎦᎯ at cherokeedictionary.net but there were no hits. I noticed that one source defined a waterfall as ᎠᎹ ᎦᏙᎣᏍᎬᎢ, but I believe that was derived from a self-published source, for what it's worth. I wonder if it is an incorrect title. Biosthmors (talk) 10:14, 2 ᎧᏩᏂ 2018 (UTC)

That website didn't have a definition for badger, so it's not as comprehensive as I was hoping, for what it's worth. Biosthmors (talk) 16:56, 2 ᎧᏩᏂ 2018 (UTC)

Interface translation

This Wikipedia was created before the Incubator system, so there has apparently never been a major push to complete the interface translation. If anyone wants to work on this, the Cherokee portal on translatewiki is here: [2], and the direct link to the "most important messages" for the MediaWiki interface is here: [3]

Here's a brief article on the topic: Translating the software that powers Wikipedia --R12ntech (talk) 19:32, 11 ᎧᏩᏂ 2018 (UTC)

Admin

Considering we probably have a lot of clean-up work to do, I want to request admin rights to delete unsalvageable pages, so we don't end up with a far too long backlog. --Ooswesthoesbes (talk) 14:59, 18 ᎧᏩᏂ 2018 (UTC)