Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider changing fragment serialization to be compatible with current implementations #944

Closed
smaug---- opened this issue Mar 25, 2016 · 27 comments
Assignees
Labels
compat Standard is not web compatible or proprietary feature needs standardizing removal/deprecation Removing or deprecating a feature

Comments

@smaug----
Copy link

See https://bugzilla.mozilla.org/show_bug.cgi?id=1259723

I don't have strong opinion on this. Gecko and Blink at least seem to have different behavior than what the spec defines, and they have had that for a long time. So changing it might be risky.
But then, maybe Edge is following the spec here without compatibility issues?

@annevk
Copy link
Member

annevk commented Mar 26, 2016

Test by @zcorpan: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3504. I don't have Edge. Anyone?

The specification here seems much better given that it doesn't require modification of the DOM to get something that roundtrips. But if nobody wants to fix this...

@domenic
Copy link
Member

domenic commented Mar 30, 2016

Edge logs "FAIL"

@annevk
Copy link
Member

annevk commented Mar 31, 2016

Okay, so what do we think? Do we make roundtripping worse to be compatible with implementations or hope one of the implementers is brave enough to fix this? I guess I'm happy to align with implementations, even though it's crappy. The specification did try for a bit.

@annevk annevk added removal/deprecation Removing or deprecating a feature compat Standard is not web compatible or proprietary feature needs standardizing labels Mar 31, 2016
@smaug----
Copy link
Author

Have web devs asked for the change (in implementations)? I couldn't find any related bug in bmo
(except https://bugzilla.mozilla.org/show_bug.cgi?id=1259723).

@zcorpan
Copy link
Member

zcorpan commented Mar 31, 2016

I found this: https://phabricator.wikimedia.org/T44667

A serializer must emit an additional newline if the pre content (without the optional newline) in the DOM starts with a newline. Sadly, the innerHTML method in modern browsers does not seem to implement this, so we have to work around it.

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Mar 31, 2016
The HTML spec requires this for proper round-tripping, but only
Presto has implemented it so far.

Ref. whatwg/html#944
@zcorpan
Copy link
Member

zcorpan commented Mar 31, 2016

Tests for the spec's current behavior at web-platform-tests/wpt#2734

@domenic
Copy link
Member

domenic commented Mar 31, 2016

so we have to work around it.

This kind of implies to me that if browsers did try to fix this, they'd start breaking code.

@zcorpan
Copy link
Member

zcorpan commented Mar 31, 2016

That's a risk, but it could also fix many more sites that are currently "broken" because they don't work around it.

It seems like the Wikipedia people wanted the browsers to fix this; the Chromium bug blocks https://bugs.chromium.org/p/chromium/issues/detail?id=463348

Also this behavior is shipped in Opera 12 and I couldn't find any bug reports about anything being broken. I don't know where in Wikipedia the difference would manifest itself. I tried previewing <pre>\n\nx</pre> in my Wikipedia "sandbox" in Opera 12 but that didn't have too many linebreaks...

cc @Krinkle

@Krinkle
Copy link
Member

Krinkle commented Mar 31, 2016

@zcorpan The sandbox test you did most likely did not trigger the relevant code path. The wikitext markup is currently parsed server-side and transformed into HTML and then cached/served to readers as-is. There is no interaction with a DOM there.

This serialisation bug affected Wikipedia in its WYSIWYG editor (VisualEditor; demo).

The Node.js server (Parsoid) that converts wikitext into an annotated HTML5 DOM for visual editing currently works around this bug to ensure newlines are preserved. It works around it by adding a data attribute storing the fact that one new line must be prepended to the contents upon reserialisation (example; downstream bug report).

Since this logic is universally applied; Yes, I would expect a change in this regard to cause a regression.

We might be able to move the workaround client-side instead of server-side. Then we'd have to do a "feature test" to see if the current browser exhibits the buggy behaviour or not. Though it'd be a bit painful given it'd be synchronous DOM interaction just for an old silly bug. But then again, a "feature flag" seems rather overkill for something like this.

There's also something swirling in my head that suggests maybe this only affects parsing of entire HTML documents (e.g. during navigation or via DOMParser), and not via Element#innerHTML. Not sure if that's true or possible in current browsers.

@catrope
Copy link

catrope commented Apr 1, 2016

Let me clarify a few things.

First of all, the spec didn't always say what it says currently. Until about 2009, it required the behavior that browsers implement now. Then someone realized this caused problems, so they changed the spec, but the change was clearly wrong and inconsistent so no browser implemented it. In 2013, the spec was changed to what it is now. I don't have links handy for this change (because it was made right before I tried to report the issue), but IIRC it was in late January 2013.

Secondly, T44667 is a red herring, because it's only about an edge case involving zero newlines vs one newline, which is not directly related to this bug. The important code working around this bug is already client-side and already uses feature detection, so it won't regress if browsers are fixed (we already need this for Opera 12). See wikimedia/VisualEditor/blob/master/src/ve.utils.js#L1021

As for innerHTML, that was definitely affected last I checked, because you can run foo.innerHTML = foo.innerHTML; multiple times and that'll eat a newline each time you run it.

@catrope
Copy link

catrope commented Apr 1, 2016

@zcorpan
Copy link
Member

zcorpan commented Apr 1, 2016

Thank you!

In baba6fa the spec's serialization was changed from always emitting the newline to only doing so if the data starts with a newline. I suppose this was around the time it was implemented in Presto.

So if I understand correctly, Wikipedia works around this, but won't regress if browsers change to match the spec because it already bug-checks to deal with Opera 12? (I tested @Krinkle's links in Opera 12 and they appeared to work the same between Opera 12 and Chromium-based Opera for <pre>\n\nx</pre>.)

@zcorpan
Copy link
Member

zcorpan commented Apr 5, 2016

Given the information so far I suggest we leave the spec as is. Please reopen if new issues come up.

@zcorpan zcorpan closed this as completed Apr 5, 2016
@annevk
Copy link
Member

annevk commented Apr 5, 2016

@zcorpan is there a single implementer willing to change though?

@jdm
Copy link
Member

jdm commented Apr 27, 2016

https://bugzilla.mozilla.org/show_bug.cgi?id=838954 is on the list of things we're intending to work on soon.

jdm pushed a commit to web-platform-tests/wpt that referenced this issue Apr 28, 2016
* Test initial LF in pre/textarea/listing with innerHTML

The HTML spec requires this for proper round-tripping, but only
Presto has implemented it so far.

Ref. whatwg/html#944

* fixup! Test initial LF in pre/textarea/listing with innerHTML

Address @jdm's comment.
ivanzr pushed a commit to ivanzr/web-platform-tests that referenced this issue Jun 29, 2016
…tests#2734)

* Test initial LF in pre/textarea/listing with innerHTML

The HTML spec requires this for proper round-tripping, but only
Presto has implemented it so far.

Ref. whatwg/html#944

* fixup! Test initial LF in pre/textarea/listing with innerHTML

Address @jdm's comment.
@smaug----
Copy link
Author

@zcorpan Do you know whether blink would be willing to make the change?
There is now a patch for Gecko to enable the spec behavior in Nightly builds.

@zcorpan
Copy link
Member

zcorpan commented Aug 9, 2016

@smaug----
Copy link
Author

http://dev.ckeditor.com/ticket/14814#ticket
So, this might not be web compatible enough.

@zcorpan zcorpan reopened this Sep 19, 2016
@zcorpan
Copy link
Member

zcorpan commented Sep 20, 2016

Thanks. Without coordinating shipping this change in multiple browsers at the same time, I suppose this is unlikely to be successful. Do people still want to try to push this through or should we revert the spec and advice Web developers to insert the LF themselves if they want innerHTML to roundtrip?

cc @tkent-google @cdumez @travisleithead

@cdumez
Copy link

cdumez commented Sep 20, 2016

Seems like a risky change, especially given @smaug---- 's feedback.

@catrope
Copy link

catrope commented Sep 20, 2016

It sounds like ckeditor was not expecting spec-compliant behavior from browsers, because almost all of them behave in the same non-compliant way. This probably means ckeditor breaks in Opera 12, but few people use that any more.

Wikipedia's visual editor works in both cases, because it detects whether the browser is compliant or eats newlines, and adjusts its behavior accordingly. We did this for future-proofing but also for compatibility with Opera 12.

@smaug----
Copy link
Author

FWIW, we're backing out the change in Nightly, and feels like we can't do the change ever unless every other browser do it also around the same time.

@zcorpan zcorpan self-assigned this Sep 23, 2016
zcorpan added a commit that referenced this issue Sep 23, 2016
This was implemented in Presto in ~2012, and recently implemented
in Gecko, but it broke CKEditor (http://dev.ckeditor.com/ticket/14814#ticket)
so it is being backed out again in Gecko.

Fixes #944.
@zcorpan
Copy link
Member

zcorpan commented Sep 23, 2016

PR to change the spec: #1815
PR to fix test: web-platform-tests/wpt#3814
I also resolved/commented the issues for Chromium, WebKit and Edge.

domenic pushed a commit that referenced this issue Sep 24, 2016
This was implemented in Presto in ~2012, and recently implemented
in Gecko, but it broke CKEditor (http://dev.ckeditor.com/ticket/14814#ticket)
so it is being backed out again in Gecko.

Fixes #944.
@cscott
Copy link
Contributor

cscott commented Oct 18, 2016

It's quite sad that HTML doesn't round trip now. :(

Speaking from the Wikimedia side, we'd be much happier if we could always parse(serialize(doc)) and get back the same document. Previously only Visual Editor was affected by this, since on server-side the domino serialization implementation was spec-compliant. Now we won't be able to rely on "standard" serialization behavior on either server or client side. :(

cscott added a commit to fgnass/domino that referenced this issue Oct 18, 2016
This means that conforming HTML doesn't cleanly roundtrip when
serialized and reparsed, which is quite unfortunately.

Regardless: whatwg/html#944
@zcorpan
Copy link
Member

zcorpan commented Oct 18, 2016

It is indeed sad.

Note that what you do on the server side is up to you, you don't have to follow the spec. You could for example have non-compliant parser and serializer that roundtrips a lot better than the spec does.

alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
This was implemented in Presto in ~2012, and recently implemented
in Gecko, but it broke CKEditor (http://dev.ckeditor.com/ticket/14814#ticket)
so it is being backed out again in Gecko.

Fixes whatwg#944.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing removal/deprecation Removing or deprecating a feature
Development

No branches or pull requests

9 participants