Some testing needs resources that are not easily available on WMCS. Having test servers in the production network limits what can be tested, since those servers are subject to production constraints.
For a real-world use case: we need to test alternatives to Blazegraph as a backend for the Wikidata Query Service. We have test servers in the production network (wdqs1009 and wdqs1010). Those servers are fine for testing beta versions of the current Blazegraph-based wdqs, but are not suited to testing arbitrary alternatives (the current puppetization conflicts with them, installing random unpackaged software is forbidden, ...).
This sounds very much like "labs on real hardware". There have been conversations about this already, but I don't think there is anything concrete yet.
I think there is a valid use case here that is unlikely to go away, so we need to address it.
More formally, the following points should be addressed:
- high level of resources (CPU, disk space, IO, ...)
- ability to experiment (full root for users, ability to break things, no requirement to be safe)
- easy to iterate (no direct dependency on other teams to use the machine)
- ...
This needs refinement and discussion, so let's get this discussion started (again).