Migrate to PHP 7 in WMF production
Closed, Resolved (Public)

Assigned To
None
Authored By
tstarling
Sep 20 2017, 10:19 PM
Description

Following the strategy announcement by the HHVM team and subsequent discussion on wikitech-l, including benchmarks indicating HHVM no longer has a performance advantage, it is proposed that we migrate the WMF MediaWiki appservers from HHVM to PHP 7.

The current plan involves first migrating the appservers from jessie to stretch (T174431).

High-level checklist

Phase 1: PHP 7 and extensions/integrations installed:

  • Plain PHP 7.0 available on the command-line on all mw servers (e.g. for manual use via mwscript).
  • Deciding how to install PHP 7 and how to do CGI from Apache.
  • Plain PHP 7.0 available on mwdebug servers via php-fpm.

Phase 2: Verification of PHP 7 interaction with MediaWiki.

Features to be installed, tested, and known to work correctly; any behaviour/config differences between PHP7 and HHVM to be equalised or justified and documented.

  • Memcached client works fine with MW.
  • mysqli driver works fine with MW.
  • APCu works fine with MW.
  • Session handler works fine with MW.
  • Post-send functions work fine with MW. – T209981
  • Error and fatal error reporting works fine with MW. – T187147
  • Audit and fix as needed differences in INI settings. – T211488
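
One way to approach the INI audit in the last item is to diff the effective settings of the two runtimes. The following is a minimal sketch, assuming each runtime's settings have already been dumped into a flat key/value mapping; the function name and the sample keys and values are hypothetical and are not taken from the production configuration.

```python
def diff_ini_settings(hhvm, php7):
    """Return {key: (hhvm_value, php7_value)} for every setting that differs.

    A key that is missing on one side is reported with None for that side.
    """
    differences = {}
    for key in sorted(set(hhvm) | set(php7)):
        left, right = hhvm.get(key), php7.get(key)
        if left != right:
            differences[key] = (left, right)
    return differences


# Hypothetical sample values, for illustration only.
hhvm_settings = {"memory_limit": "660M", "display_errors": "0"}
php7_settings = {
    "memory_limit": "128M",
    "display_errors": "0",
    "opcache.enable": "1",
}

for name, (old, new) in diff_ini_settings(hhvm_settings, php7_settings).items():
    print(f"{name}: HHVM={old!r} PHP7={new!r}")
```

Each reported difference would then need to be either equalised or justified and documented, as the phase description requires.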

Phase 3: Public opt-in via WikimediaDebug.

  • Add nginx handling to expose PHP7 for opted-in requests.
  • Provide opt-in via X-Wikimedia-Debug for mwdebug servers.
  • Debug profiling works fine with MW (e.g. xhprof, or tideways) – T206152
  • Sampling profiling works fine with MW. – T176916
  • Any misc issues we find from opt-in tests and anything blocking PHP 7.2 support.

Phase 4: Public opt-in via beta feature.

  • Expose PHP7 vs. HHVM generation to JS in a header mw.config key.
  • Look at whether we should distinguish NavTiming metrics based on whether they're served by php7 or HHVM. (Not sure whether this is valuable.)
  • PHP 7.2 available on all mw servers via php-fpm (web, api, jobrunners, videoscalers)
  • Provide opt-in via MediaWiki Beta Feature for any web/api appserver.
  • Upgrade from PHP 7.0 to PHP 7.2.

Phase 5: PHP 7.2 on by default:

  • Ramp up A/B test of the beta feature for web/api app servers.
  • Switch default for all users on web appservers and api appservers.
  • Switch mwscript CLI on maintenance hosts and deployment hosts. – T195392
  • Switch job runners and video scalers.
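
The gradual ramp-up in the first item can be illustrated with deterministic bucketing: each user is hashed into a stable bucket, and the rollout percentage decides which buckets are served by PHP 7. This is only a sketch under assumptions; the function name, the use of SHA-256, and the bucket granularity are illustrative choices, not the mechanism actually used in production.

```python
import hashlib


def in_php7_bucket(user_id: str, rollout_percent: float) -> bool:
    """Decide whether a user falls inside the PHP 7 rollout.

    The same user always lands in the same bucket, so raising the
    percentage only ever adds users; nobody flips back and forth.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto 10,000 buckets (0.01% steps).
    bucket = int.from_bytes(digest[:8], "big") % 10000
    return bucket < rollout_percent * 100


# Hypothetical user identifiers, for illustration only.
users = [f"user-{n}" for n in range(1000)]
served_by_php7 = sum(in_php7_bucket(u, 10) for u in users)
print(f"{served_by_php7} of {len(users)} sample users are in the 10% rollout")
```

Because the bucket depends only on the identifier, a user who is in the rollout at 10% stays in it at 50% and 100%, which keeps A/B comparisons stable while the percentage grows.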

Summary

Taking into account the lack of funding for appserver work, as well as the end-of-year fundraising and Christmas freezes, the (tentative!) timeline I proposed is:

  • Upgrade the appserver fleet (w/ HHVM) to Debian stretch, including the ICU migration, in Q3 FY17-18 (circa February/March 2018)
  • Begin PHP7 planning and initial implementation work in Q4 FY17-18, e.g. including a few test servers
  • Fund the work in FY18-19 and complete it early in the year (Q1 or Q2 at the latest)

That's from an SRE perspective -- I think other tasks related to this migration (such as CI, beta, code/extension changes etc.) can proceed semi-independently. Given our velocity, I think we'll end up being the ones blocking others, rather than us waiting on these.

During the SRE offsite/onsite we came up with the following plan:

  • We need to remove PHP 5 usage by May 2018 (branch date for the next MediaWiki LTS)
    • Only open issue is dumps, which is showing good progress (needs wikidiff build and review of php7 puppet patch)
  • We're planning the ICU migration (detailed in T177498) to happen this quarter:
    • HHVM package for icu57 needs to be upgraded to 3.18.8
    • Initial testing will happen on mwdebug
    • Also deployment-prep will be upgraded to the new packages (allows preparation of patches to use more recent Unicode)
    • For actual migration Community Liaisons need to be involved (user-visible effects)
    • First all app servers need to be upgraded, then run the conversion script (will take several days)
  • Once we have upgraded to HHVM/ICU57 on jessie, we can upgrade to HHVM 3.18/stretch (we should be able to complete that by Q4)
  • Once we're on stretch we can upgrade to HHVM 3.24
  • We came up with the following migration plan towards PHP 7:
    • Running php7-fpm on different port
    • Depending on cookie route to HHVM or php7-fpm for initial testing
    • Add PHP7 as a beta feature for more widespread exposure
    • Gradually ramp up PHP 7 usage towards all users
    • If all goes fine we should be complete before the fundraising silent period, otherwise after that
    • Profiling is currently a known blocker – T176916
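
The cookie-based routing in the migration plan above can be sketched as a small decision function: requests carrying an opt-in cookie go to the php7-fpm port, everything else stays on HHVM. The cookie name, its value, and both backend addresses are hypothetical placeholders chosen for this example; the real routing would live in the web server configuration rather than in Python.

```python
from http.cookies import SimpleCookie

# Hypothetical backend addresses: HHVM on one port, php7-fpm on another.
HHVM_BACKEND = ("127.0.0.1", 9000)
PHP7_BACKEND = ("127.0.0.1", 9001)

# Hypothetical opt-in cookie name and value.
OPT_IN_COOKIE = "PHP_ENGINE"
OPT_IN_VALUE = "php7"


def choose_backend(cookie_header):
    """Pick a backend from the raw Cookie request header (may be None)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(OPT_IN_COOKIE)
    if morsel is not None and morsel.value == OPT_IN_VALUE:
        return PHP7_BACKEND
    # Default: requests without the opt-in cookie keep going to HHVM.
    return HHVM_BACKEND


print(choose_backend("session=abc; PHP_ENGINE=php7"))
print(choose_backend("session=abc"))
```

Defaulting to HHVM means a missing or malformed cookie never sends an unsuspecting user to the engine under test.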

Event Timeline

Change 534520 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] [DNM] layout: Drop HHVM jobs from -wmf branches

https://gerrit.wikimedia.org/r/534520

Change 534608 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/WikimediaEvents@master] Stop tagging edits at PHP7

https://gerrit.wikimedia.org/r/534608

Change 534608 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Stop tagging edits as PHP7

https://gerrit.wikimedia.org/r/534608

Change 534847 had a related patch set uploaded (by Jforrester; owner: Reedy):
[mediawiki/extensions/WikimediaEvents@wmf/1.34.0-wmf.21] Stop tagging edits as PHP7

https://gerrit.wikimedia.org/r/534847

Change 534847 abandoned by Jforrester:
Stop tagging edits as PHP7

Reason:
Well, this never got deployed. Whoops.

https://gerrit.wikimedia.org/r/534847

It seems all traffic is on php7. Can we drop things now? 😈😈😈

Imagine we should wait a few weeks in case of show stoppers. Definitely after 1.34 is branched...

(To be clear, after REL1_34 is cut but way before 1.34.0 is released; the plan is to back-port HHVM removal to 1.34.)

Congratulations to everyone involved in this migration, this is excellent work!

I seem to recall "How we made editing Wikipedia twice as fast" having a big impact on the public perception of HHVM. Is a new post in the works? Someone from the broader community is likely interested in the what/why/how of the switch back to Zend.

@Ricordisamoa The switch to Zend PHP 7.2 is not motivated by immediate speed gains.

In 2015, we migrated to HHVM because of major speed gains compared to Zend PHP 5. I would describe the current switch not as a migration "to" Zend, but rather a migration "away" from HHVM. The original migration was complicated because it required adding support for a new engine (HHVM). This was considered worth doing given the speed gains it would bring.

But, 3 years later, the HHVM team announced they would stop offering the Zend-compatible mode of HHVM that Wikipedia and MediaWiki depend on (more information in the task description). MediaWiki has always supported Zend, so in terms of engineering support, migrating to PHP 7.2 was relatively easy. Our benchmarks have confirmed that over these past few years, Zend has learned a lot from the industry (including from HHVM) and PHP 7.2 is now as fast as or faster than HHVM for us. Over time, I also expect future Zend releases to bring even more speed improvements. And, now that we only have to support one engine again, we'll be able to make better use of our limited resources, including on the performance front :)

Removed a few unrelated tasks from the tree that need to happen after this, but aren't part of the same goal (have to draw the line somewhere; conceptually almost everything from the past 10 years was a blocker to this in some way, as this will be a blocker for everything in the next 10 years).

Given T234384#5540226 by @Joe:

FTR, we did remove hhvm from production meaning we're not serving traffic with it, and we won't go back anymore.

I boldly close this ticket.