Help:Creating a bot: Difference between revisions

Logging in: de-emphasize re-inventing the wheel
 
(12 intermediate revisions by 10 users not shown)
Line 48:
 
===Logging in===
Approved bots need to be logged in to make edits. Although a bot can make read requests without logging in, bots that have completed testing should log in for all activities. Bots logged in from an account with the bot flag can obtain more results per query from the MediaWiki API (api.php). Most bot frameworks should handle login and cookies automatically, but if you are not using an existing framework, you will need to follow these steps.
 
Using a bot framework is recommended, because frameworks handle login and cookies for you. Common frameworks include [[mw:Manual:Pywikibot|pywikibot]] for Python and [[mw:Manual:Mwn|mwn]] for Node.js. The manual steps below apply only if you are implementing your own framework.
For security, login data must be passed using the [[HTTP POST]] method. Because the parameters of [[HTTP GET]] requests are easily visible in the URL, logins via GET are disabled.
 
To log a bot in using the [[mw:API|MediaWiki API]], two requests are needed:
Line 135:
Major functionality changes of approved bots must be [[Wikipedia:Bots/Requests_for_approval|approved]].
 
==General guidelines for running a bot==
The official bot policy covers the main points to consider when developing your bot. In addition to that policy, there are a number of more general advisory points to keep in mind.
 
===Bot best practices===
* Set a custom [[User agent|User-Agent]] header for your bot, per the [[meta:User-Agent policy|Wikimedia User-Agent policy]]. If you don't, your bot may encounter errors and may end up blocked by the technical staff at the server level.
* Use the [[mw:Manual:Maxlag parameter|maxlag parameter]] with a maximum lag of 5 seconds. This will enable the bot to run quickly when server load is low, and throttle the bot when server load is high.
** If writing a bot in a framework that does not support maxlag, limit the total requests (read and write requests together) to no more than 10 per minute.
* Use the [[mw:API|API]] whenever possible, and set the query limits to the largest values that the server permits, to minimize the total number of requests that must be made.
* Edit (write) requests are more expensive in server time than read requests. Be edit-light and design your code to keep edits to a minimum.
** Try to consolidate edits. One single large edit is better than 10 smaller ones.
* Enable [[HTTP persistent connection]]s and [[HTTP compression|compression]] in your HTTP client library, if possible.
* Do not make multi-threaded requests. Wait for one server request to complete before beginning another.
* Back off upon receiving errors from the server. Errors such as timeouts are often an indication of heavy server load. Use [[exponential backoff|a sequence of increasingly longer delays between repeated requests]].
* Make use of [[mw:API:Assert|assertion]] to ensure your bot is logged in.
* Test your code thoroughly before making large automated runs. Individually examine all edits on trial runs to verify they are perfect.
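Several of the points above (a descriptive User-Agent, compression, the maxlag parameter, assertion, and backing off after errors) can be gathered into a few small helpers. The following is a minimal sketch rather than a reference implementation: the bot name, contact details, and all function names are hypothetical, and no network request is made.

```python
import random

# Hypothetical identification string; replace with your bot's name and contact details.
USER_AGENT = (
    "ExampleBot/0.1 "
    "(https://en.wikipedia.org/wiki/User:ExampleBot; operator@example.org)"
)


def build_headers():
    """HTTP headers that identify the bot and ask for compressed responses."""
    return {"User-Agent": USER_AGENT, "Accept-Encoding": "gzip"}


def with_politeness(params, maxlag=5, logged_in_as_bot=True):
    """Return a copy of the API parameters with maxlag and an assertion added."""
    polite = dict(params)
    polite["maxlag"] = str(maxlag)
    # assert=bot fails the request if the session has lost its bot login;
    # assert=user only checks that the session is logged in at all.
    polite["assert"] = "bot" if logged_in_as_bot else "user"
    polite.setdefault("format", "json")
    return polite


def backoff_delays(retries=5, base=1.0, cap=60.0, jitter=False):
    """Exponentially growing waits (in seconds) to use between retries."""
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * 2 ** attempt)
        if jitter:
            # Optional randomisation so several bots do not retry in lockstep.
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

A caller would pass <code>build_headers()</code> and <code>with_politeness(...)</code> to its HTTP client and sleep for each value from <code>backoff_delays()</code> in turn after a timeout or a maxlag error, one request at a time.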
 
===Common bot features you should consider implementing===
====Manual assistance====
If your bot is doing anything that requires judgment or evaluation of context (e.g., correcting spelling), then you should consider making your bot manually assisted, which means that a human verifies all edits before they are saved. This significantly reduces the bot's speed, but it also significantly reduces errors.
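One way to sketch manual assistance is to show the operator a diff of each proposed change and save only on explicit approval. The function name and prompt text below are hypothetical; the <code>ask</code> argument defaults to Python's built-in <code>input</code> and can be swapped out for testing.

```python
import difflib


def confirm_edit(title, old_text, new_text, ask=input):
    """Show a unified diff and return True only if the operator approves it."""
    if old_text == new_text:
        return False  # nothing to save
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{title} (current)",
        tofile=f"{title} (proposed)",
        lineterm="",
    )
    print("\n".join(diff))
    answer = ask("Save this edit? [y/N] ")
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" means an accidental key press skips the edit instead of saving it.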
 
====Disabling the bot====
It should be easy to quickly disable your bot. If your bot goes bad, it is your responsibility to clean up after it! You could have the bot refuse to run if a message has been left on its talk page, on the assumption that the message may be a complaint against its activities; this can be checked using the API <code>meta=userinfo</code> query (<span class="plainlinks">[https://en.wikipedia.org/w/api.php?action=query&meta=userinfo&uiprop=hasmsg example]</span>). Or you could have a page that will turn the bot off when changed; this can be checked by loading the page contents before each edit.
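Both checks can be sketched as pure functions over data the bot has already fetched. This sketch assumes that the JSON response to the <code>meta=userinfo</code> query carries a <code>messages</code> key inside <code>query.userinfo</code> when there are unread talk-page messages (an empty string in the older JSON format, a boolean with <code>formatversion=2</code>); verify this against live output before relying on it. The function names and the "enabled" convention for the switch page are hypothetical.

```python
def has_new_messages(api_response):
    """True if the userinfo response reports unread talk-page messages."""
    userinfo = api_response.get("query", {}).get("userinfo", {})
    flag = userinfo.get("messages")
    # Older JSON format: key present (as "") only when there are messages.
    # formatversion=2: key is an explicit boolean.
    return flag is not None and flag is not False


def may_run(api_response, switch_page_text, enabled_text="enabled"):
    """Allow a run only with no new messages and an untouched switch page."""
    if has_new_messages(api_response):
        return False
    # Any change to the switch page (e.g. someone writing "stop") halts the bot.
    return switch_page_text.strip().lower() == enabled_text
```

Calling <code>may_run</code> before every edit, not just at start-up, lets any editor stop a long run partway through.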
 
====Signature====
Just like a human, if your bot makes edits to a talk page on Wikipedia, it should sign its post with four tildes <nowiki>(~~~~)</nowiki>. Signatures belong '''only''' on talk namespaces with the exception of project pages used for discussion (e.g., [[WP:AFD|articles for deletion]]).
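The signing rule can be sketched as a small helper. It relies on the MediaWiki convention that talk namespaces have odd numeric ids; the function names and the list of discussion-page title prefixes are hypothetical and would need to reflect the pages your bot actually edits.

```python
SIGNATURE = "~~~~"


def is_talk_namespace(namespace_id):
    """Talk namespaces in MediaWiki have odd, positive numeric ids."""
    return namespace_id > 0 and namespace_id % 2 == 1


def sign_if_needed(text, namespace_id, title, discussion_prefixes=()):
    """Append a signature on talk pages and listed discussion pages only."""
    on_discussion_page = is_talk_namespace(namespace_id) or title.startswith(
        tuple(discussion_prefixes)
    )
    if not on_discussion_page:
        return text  # never sign articles or other content pages
    if text.rstrip().endswith(SIGNATURE):
        return text  # already signed
    return text.rstrip() + " " + SIGNATURE
```

For example, a comment posted to a page in namespace 4 (project) is signed only if its title starts with one of the supplied prefixes, such as <code>"Wikipedia:Articles for deletion/"</code>.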
 
====Bot Flag====
A bot's edits are visible at [[Special:RecentChanges]] unless they are marked as bot edits. Once the bot has been approved and granted the bot flag, you can add the <code>bot</code> parameter (for example "bot=True") to the API edit call to hide its edits from [[Special:RecentChanges]]; see [[mw:API:Edit#Parameters]]. In Python, with either mwclient or wikitools, adding '''{{Green|1=bot=True}}''' to the edit/save command marks the edit as a bot edit, e.g. {{code|1=PageObject.edit(text=pagetext, bot=True, summary=pagesummary)}}.
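For a bot that calls the API directly, the flag is just one more parameter of the edit request. The sketch below only builds the parameter dictionary for an <code>action=edit</code> POST; the function name is hypothetical, the CSRF token is assumed to have been fetched already, and the value "1" is an assumption that reflects the API treating a boolean parameter as set whenever it is present.

```python
def build_edit_params(title, text, summary, csrf_token, bot_approved=False):
    """Parameters for an action=edit POST, optionally marked as a bot edit."""
    params = {
        "action": "edit",
        "title": title,
        "text": text,
        "summary": summary,
        "format": "json",
    }
    if bot_approved:
        # Only takes effect if the logged-in account holds the bot right.
        params["bot"] = "1"
    # Added last so the token follows the text parameter in the request body.
    params["token"] = csrf_token
    return params
```

Leaving <code>bot_approved</code> false during a trial keeps the edits visible in recent changes so that reviewers can check them.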
 
====Monitoring the bot status====
If the bot is fully automated and performs regular edits, you should periodically check that it runs as specified and that its behaviour has not been altered by software changes. Consider adding it to [[Wikipedia:Bot activity monitor]] to be notified if the bot stops working.
 
==Open-source bots==