Wikidata:Requests for permissions/Bot/MajavahBot
MajavahBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Taavi (talk • contribs • logs)
Task/s: Import version and metadata information for Python libraries from PyPI.
Function details: For items with PyPI project (P5568) set, imports the following data from PyPI:
- software version identifier (P348) (from PyPI releases). The latest release is marked as preferred, and the preferred rank is removed from older versions if it was added by this bot.
- issue tracker URL (P1401), user manual URL (P2078), source code repository URL (P1324) (from the metadata of the latest release)
Additionally the PyPI project (P5568) value will be updated to the normalized name if it's not already in that form.
Taavi (talk) 19:54, 11 July 2023 (UTC)
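For illustration, the name normalization mentioned above could look like the sketch below. It assumes that "normalized name" means the PEP 503 rule (lowercase, with runs of "-", "_" and "." collapsed into a single "-"); the request does not say which rule the bot actually applies, and the function name is hypothetical.

```python
import re


def normalize_pypi_name(name: str) -> str:
    """Normalize a project name following the PEP 503 rule.

    Assumption: this is the form the P5568 value would be updated to.
    """
    # Collapse any run of "-", "_" or "." into one "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for raw in ("Flask_SQLAlchemy", "zope.interface", "requests"):
        print(raw, "->", normalize_pypi_name(raw))
```

A P5568 value would only need an edit when `normalize_pypi_name(value) != value`.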
- how many statements do you think this will add? don't some packages have...lots of versions? BrokenSegue (talk) 20:05, 11 July 2023 (UTC)
- Good point. There are about 200k releases it could import (for about 2k packages total, so about 90 per package on average). Taking an approach similar to github-wiki-bot and only importing the most recent releases could bring it down to 75k for the last 100 (33 per package on average) or 50k for the last 50 (22 per package on average). Taavi (talk) 20:50, 11 July 2023 (UTC)
- i don't suppose major releases only is an option? BrokenSegue (talk) 20:54, 11 July 2023 (UTC)
- I don't think there's a consistent enough definition for that. For example, Home Assistant (Q28957018) now does year.month.patch type releases, so the first digit changing isn't really meaningful. However, I can filter out all packages generated from https://github.com/vemel/mypy_boto3_builder, as those are all very similar and not intended for direct human use anyway. That cuts the total number of versions to a third (~70k) even before applying any other per-package limits. Taavi (talk) 21:15, 11 July 2023 (UTC)
- See also Wikidata:Requests for permissions/Bot/RPI2026F1Bot 5 for discussion of a previous similar task (which seems inactive). Github-wiki-bot also imports version data from GitHub (see e.g. the history of modelscope (Q120550399)); however, take care that version numbers may differ between GitHub and PyPI.--GZWDer (talk) 11:38, 12 July 2023 (UTC)
- Oh yes, the RPI2026F1Bot task looks somewhat similar. I'm aware of Github-wiki-bot, but there are quite a few PyPI projects that are not hosted on GitHub, and I think my code should be able to handle items with data from both and ensure the two bots don't start edit warring for example. Taavi (talk) 17:23, 12 July 2023 (UTC)
- @Taavi: Please make some test edits. --Wüstenspringmaus talk 11:05, 29 August 2024 (UTC)
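For illustration, the per-package cap discussed above (importing only the last 50 or 100 releases) could be sketched as follows. This is not the bot's actual code. It assumes input shaped like the `releases` mapping of the PyPI JSON API (version string to a list of file entries carrying `upload_time_iso_8601`), and it orders releases by first upload time instead of parsing version numbers, which sidesteps schemes such as year.month.patch. The function name and the default limit are hypothetical.

```python
from datetime import datetime


def latest_releases(releases: dict, limit: int = 50) -> list:
    """Return up to `limit` version strings, most recently uploaded first."""
    dated = []
    for version, files in releases.items():
        if not files:
            # A release with no uploaded files carries no date; skip it.
            continue
        first_upload = min(f["upload_time_iso_8601"] for f in files)
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        uploaded = datetime.fromisoformat(first_upload.replace("Z", "+00:00"))
        dated.append((uploaded, version))
    dated.sort(reverse=True)
    return [version for _, version in dated[:limit]]


if __name__ == "__main__":
    sample = {
        "1.0": [{"upload_time_iso_8601": "2021-01-01T00:00:00Z"}],
        "1.1": [{"upload_time_iso_8601": "2021-06-01T00:00:00Z"}],
        "2.0": [{"upload_time_iso_8601": "2022-01-01T00:00:00Z"}],
        "3.0": [],
    }
    print(latest_releases(sample, limit=2))
```

Under this ordering the first element would be the candidate for the preferred rank. Note that the most recently uploaded release is not always the highest version (for example when an older branch receives a backport release), so a real implementation might need a different tie-break.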