
[epic] SDoC: Determine baseline for metrics
Closed, Resolved, Public

Description

As the SDC General project ramps up, we'll need to establish a baseline for metrics on Commons so that we can measure future successes against it. This ticket collects the needs and wants for that effort.

This is scheduled to take place in Q2, as per goals: https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q2#Segment_2:_Search_integration_and_exposure

Let's do this by:

  • Investigating whether sufficient EventLogging is already set up for Commons search, and what it covers
    • If not, investigate which metrics are appropriate to track
    • Is EventLogging set up only on Commons, or on other projects as well?
      • Multiple projects at once?
      • Search satisfaction schema
  • How do users generally use Commons?
    • Are there behaviors that we can easily identify to help make search better?
    • Is there a way to tell what the current zero results rate is for searches on Commons?
  • How many search "hits" are based on a match in the file name vs. description vs. category
  • Better analysis of how many files may be currently "unfindable" because of lack of categorization, unhelpful file name, no description (or poor description)
  • Analysis of how many contributions are made by individuals vs. mass-tools/institutions

Things to think about / keep in mind:

  • Zero results rate (ZRR)
  • More relevant results
    • What is ‘relevant’
  • Clickthroughs from cross-project searches
  • API usage
  • User engagement
    • ‘People were able to learn more’
  • User satisfaction
  • Effort users spend finding something
  • Time users spend finding something
  • Tracking downloads of media
  • Steps to unique queries
  • How many embeds on other projects
  • How many times a specific file/image has shown up in searches
  • How many files/images never show up in searches
  • How many searches are by exclusions
    • ‘Pictures of cats but not calicos’
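
As a rough illustration of two of the metrics listed above, here is a minimal sketch of how a zero results rate (ZRR) and a clickthrough rate could be computed from a list of search events. The `SearchEvent` record and its fields (`query`, `hits_returned`, `clicked`) are hypothetical and are not taken from any existing EventLogging schema; the real field names and definitions would depend on whatever logging is found to be in place.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SearchEvent:
    """Hypothetical record of one search; not an actual EventLogging schema."""
    query: str
    hits_returned: int  # number of results the search returned
    clicked: bool       # whether the user clicked any result


def zero_results_rate(events: Sequence[SearchEvent]) -> float:
    """Share of searches that returned no results at all."""
    if not events:
        return 0.0
    zero = sum(1 for e in events if e.hits_returned == 0)
    return zero / len(events)


def clickthrough_rate(events: Sequence[SearchEvent]) -> float:
    """Share of searches with at least one result where a result was clicked."""
    with_results = [e for e in events if e.hits_returned > 0]
    if not with_results:
        return 0.0
    return sum(1 for e in with_results if e.clicked) / len(with_results)


# Made-up sample data, for illustration only.
sample = [
    SearchEvent("calico cat", 0, False),
    SearchEvent("cat", 5, True),
    SearchEvent("eiffel tower night", 3, False),
    SearchEvent("sunflower", 12, True),
]

print(f"ZRR: {zero_results_rate(sample):.2f}")
print(f"Clickthrough rate: {clickthrough_rate(sample):.2f}")
```

Whether clickthrough should be measured over all searches or only over searches that returned results is a design choice; this sketch assumes the latter so that ZRR and clickthrough do not measure the same thing twice.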

Event Timeline

Restricted Application added a subscriber: Aklapper.

Since this will involve some exploratory analyst work on Commons, can that include some general Commons metrics? @Capt_Swing can set up the general metric work, if the analysts can run the numbers.

Yup, we've got an upcoming meeting with @Capt_Swing to discuss just that!

Some specific desired metrics from multimedia (some are simple/basic, others...maybe not so much):

  • how many search "hits" are based on a match in the file name vs. description vs. category
  • better analysis of how many files may be currently "unfindable" because of lack of categorization, unhelpful file name, no description (or poor description)
  • analysis of how many contributions are made by individuals vs. mass-tools/institutions
debt renamed this task from Determine baseline for metrics on Commons to [epic] Determine baseline for metrics on Commons. (Oct 3 2017, 11:25 PM)
debt renamed this task from [epic] Determine baseline for metrics on Commons to [epic] SDoC: Determine baseline for metrics. (Oct 3 2017, 11:33 PM)
debt raised the priority of this task from Medium to High.
debt updated the task description.

Please keep in mind that metrics for Commons already exist at https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm ; let's make sure those are looked at while this work is taking place.

Nice! Thank you for documenting.

@Ramsey-WMF Is there any feedback about the baseline metrics from the team? Could we resolve this ticket and other child tickets?

Hi @chelsyx . The main feedback I'd give is: AWESOME! Thanks to both you and @mpopov for fantastic work. Commons is a little wild, but you bravely navigated the jungle :)

We've shared the findings with a number of team members over the past few weeks and I haven't heard anything further, so I think we can close this baseline round.