Page MenuHomePhabricator

Wikimedia\NormalizedException\NormalizedException: {parameter1}
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
labels.normalized_message
[{reqId}] {exception_url}   Wikimedia\NormalizedException\NormalizedException: {parameter1}
error.stack_trace
from /srv/mediawiki/php-1.41.0-wmf.2/extensions/GrowthExperiments/includes/Util.php(185)
#0 /srv/mediawiki/php-1.41.0-wmf.2/extensions/GrowthExperiments/includes/HomepageHooks.php(452): GrowthExperiments\Util::logStatus(StatusValue)
#1 /srv/mediawiki/php-1.41.0-wmf.2/includes/HookContainer/HookContainer.php(160): GrowthExperiments\HomepageHooks->onBeforePageDisplay(OutputPage, MediaWiki\Skins\Vector\SkinVector22)
#2 /srv/mediawiki/php-1.41.0-wmf.2/includes/HookContainer/HookRunner.php(942): MediaWiki\HookContainer\HookContainer->run(string, array, array)
#3 /srv/mediawiki/php-1.41.0-wmf.2/includes/OutputPage.php(2894): MediaWiki\HookContainer\HookRunner->onBeforePageDisplay(OutputPage, MediaWiki\Skins\Vector\SkinVector22)
#4 /srv/mediawiki/php-1.41.0-wmf.2/includes/MediaWiki.php(944): OutputPage->output(boolean)
#5 /srv/mediawiki/php-1.41.0-wmf.2/includes/MediaWiki.php(579): MediaWiki->main()
#6 /srv/mediawiki/php-1.41.0-wmf.2/index.php(50): MediaWiki->run()
#7 /srv/mediawiki/php-1.41.0-wmf.2/index.php(46): wfIndexMain()
#8 /srv/mediawiki/w/index.php(3): require(string)
#9 {main}
Impact
Notes

There are a few like this they come with the parameters:

gesuggestededit=1
getasktype=image-recommendation
section=all
title=Grotesco (or some othertitile)
veaction=edit

Details

Request URL
https://pt.wikipedia.org/w/index.php?geclickid=*&genewcomertasktoken=*&gesuggestededit=*&getasktype=*&section=*&title=*&veaction=*

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Fallout from rEGRE367d79d557df: PSR-3 normalize errors.

Timelevelchannelhostwikiparameter1
Apr 5, 2023 @ 12:27:59.355ERRORexceptionmw1496rowikiInvalid source type for Radka_Toneff:
Apr 5, 2023 @ 12:13:08.712ERRORexceptionmw2365rowikiInvalid source type for Pace:
Apr 4, 2023 @ 19:25:01.630ERRORexceptionmw2310zhwikiNo recommendation found for page: 抽象藝術

Somehow normalized_message ends up being [{reqId}] {exception_url} Wikimedia\NormalizedException\NormalizedException: {parameter1}, parameter1 ends up containing the actual message, and the actual parameter ends up getting lost. Possibly comes from using RawMessage, or rawmessage.

Some other code paths seem to work, e.g. here normalized_message is Link suggestion not found for "{parameter1}" and parameter1 is the page name.

Change 906018 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@master] Status::getPsr3MessageAndContext: Special-case rawmessage

https://gerrit.wikimedia.org/r/906018

Change 906095 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] Use RawMessage instead of rawmessage when it can be parametrized

https://gerrit.wikimedia.org/r/906095

Change 906018 merged by jenkins-bot:

[mediawiki/core@master] Status::getPsr3MessageAndContext: Special-case rawmessage

https://gerrit.wikimedia.org/r/906018

I think in this case, source is actually an empty string.

In includes/NewcomerTasks/AddImage/ImageRecommendationData.php we have:

private const KIND_TO_SOURCE = [
		'istype-lead-image' => ImageRecommendationImage::SOURCE_WIKIPEDIA,
		'istype-wikidata-image' => ImageRecommendationImage::SOURCE_WIKIDATA,
		'istype-commons-category' => ImageRecommendationImage::SOURCE_COMMONS
	];

And when constructing an ImageRecommendationData object we do:

private function getSourceFromKind( array $kind ): string {
		foreach ( $kind as $key ) {
			if ( array_key_exists( $key, self::KIND_TO_SOURCE ) ) {
				return self::KIND_TO_SOURCE[ $key ];
			}
		}
		return '';
	}

so when we return with StatusValue::newFatal:

return StatusValue::newFatal( 'rawmessage',
				'Invalid source type for ' . $titleTextSafe . ': ' . strip_tags( $source ) );

source can be an empty string.

In this case, it looks like the addition of "istype-section-topics-p18" to the kind field is causing the problem.

kharlan@mwmaint2002:~$ curl http://localhost:6030/public/image_suggestions/suggestions/plwiki/2011993
...
    {
      "wiki": "plwiki",
      "page_id": 2011993,
      "id": "231ff42c-cf33-11ed-91f4-f4e9d4dbab80",
      "image": "NeomorphusRadiolosusSmit.jpg",
      "confidence": 50,
      "found_on": null,
      "kind": [
        "istype-section-topics-p18"
      ],
      "origin_wiki": "commonswiki",
      "page_rev": 65815773,
      "section_heading": null
    },

We'll want the logic for selecting the highest confidence image recommendation to also ensure it picks something with a valid kind value for page level image suggestions? Or maybe we don't need to log every validation failure, but should only log if all image recommendation data objects failed validation?

There are about 16 errors reported so far in wmf.5 (logstash link).

We'll want the logic for selecting the highest confidence image recommendation to also ensure it picks something with a valid kind value for page level image suggestions? Or maybe we don't need to log every validation failure, but should only log if all image recommendation data objects failed validation?

Thanks for tracking that down. I think it's a good idea to log unexpected validation failures. istype-section-topics-p18 is an expected validation failure (although it wasn't when this code was written). We can handle that in T329278: Section-level images: Create tooling to mock API IMO (since creating the new API handler will consist of deciding which types to ignore).

kostajh changed the task status from Open to In Progress.Apr 28 2023, 10:31 AM
kostajh triaged this task as Medium priority.

Change 906095 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Use RawMessage instead of rawmessage when it can be parametrized

https://gerrit.wikimedia.org/r/906095

I don't fully understand the intent on the "silently ignore" approach from I9ae7a13a55ee75b4072f008ddc418c1e960c04b8. We're still getting a bunch of Invalid source type for <page_title>: unknown errors because reasonably "unknown" fails the check in includes/NewcomerTasks/AddImage/ImageRecommendationDataValidator.php:

if ( !in_array( $source, ImageRecommendationImage::KNOWN_SOURCES, true ) ) {...}

Is this a way to monitor unkown sources by regularly checking event count?

Mainly it was a way to avoid logging production errors; since then, the logging code was since improved in rEGRE56acd66dc8cd: Section images: Do not treat unexpected kinds as production errors to not do that anyway. Then the (hopefully temporary) hack in rEGREa51539a9db3f: Section images: Accept all API response 'kind' fields was added, which pretends that unknown kinds (as in, not present in KNOWN_SOURCES) are istype-wikidata-image or istype-section-topics (depending on whether there is a section specified in the API response) so now 'istype-depicts' => 'unknown' has the effect of not showing that type of recommendation to the user at all (ie. the only way to get filtered out by ImageRecommendationDataValidator is when KNOWN_SOURCES maps the kind to some invalid source value).

The implementation is a mess and should be improved (we should have an explicit disallow list; and I don't think there's much use of ImageRecommendationDataValidator being a standalone class, when validation logic is specific to the API being used), but the behior is correct, I think:

  • We want to show unknown kinds to the user right now, because the kind names are changed from time to time, and we prefer lying to the user about the source over giving "Suggestions are no longer available" for every single task. Maybe we could have a dedicated "unkown" source type which gets presented to the user as such; but I think it would be unecessarily confusing. So we just say it's a Wikidata recommendation (new/unknown types will probably be related to Wikidata in some way).
  • We want to log unknown types so we can notice them and adjust the source messages (and so turn them into known types).
  • We want to completely ignore depicts because it's lower quality than the other article-level image recommendations.

In the short term, we should probably just change 'istype-depicts' => 'unknown' to 'istype-depicts' => 'ignored/depicts' or such to make it less confusing. I am not sure if we want to log it or not (maybe somewhat useful for statistics).