W3C First Public Working Draft
Copyright © 2021 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
This specification specifies an Hypothetical Render Model (HRM) that constrains the complexity of an IMSC Document Instance.
The model is not intended as a specification of the processing requirements for implementations. For instance, while the model defines a glyph buffer for the purpose of limiting the number of glyphs displayed at any given point in time, it neither requires the implementation of such a buffer, nor models the sub-pixel character positioning and anti-aliased glyph rendering that can be used to produce text output.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Timed Text Working Group as a First Public Working Draft using the Recommendation track.
Publication as a First Public Working Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 2 November 2021 W3C Process Document.
This specification specifies an Hypothetical Render Model (HRM) that constrains the complexity of a IMSC Document Instance.
This specification uses the same conventions as [imsc].
Document Instance. See Section 2.2 at [ttml2].
IMSC Document Instance. A Document Instance that conforms to any profile defined in any edition of [imsc].
Intermediate Synchronic Document. See Section 9.3.2 at [ttml2].
Root Container Region. See Section 2.2 at [ttml2].
Related Video Object. A Related Media Object that consists of a sequence of image frames, each a rectangular array of pixels.
Related Media Object. See Section 2.2 at [ttml2].
presented region. See [imsc].
presented image. See [imsc].
empty ISD. an Intermediate Synchronic Document with no presented region.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key word SHALL in this document is to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Unless noted otherwise, this specification applies to an IMSC Document Instance.
A sequence of consecutive intermediate synchronic documents conforms to the Hypothetical Render Model if is processed without error as defined in § 6. General.
This section is non-normative.
The model illustrated in Figure 1 operates on successive intermediate synchronic documents obtained from an input IMSC Document Instance, and uses a simple double buffering model: while an intermediate synchronic document En is being painted into Presentation Buffer Pn (the "front buffer" of the model), the previous intermediate synchronic document En-1 is available for display in Presentation Buffer Pn-1 (the "back buffer" of the model).
The model specifies an (hypothetical) time required for completely painting an intermediate synchronic document as a proxy for complexity. Painting includes drawing region backgrounds, rendering and copying glyphs, and decoding and copying images. Complexity is then limited by requiring that painting of intermediate synchronic document En completes before the end of intermediate synchronic document En-1.
Whenever applicable, constraints are specified relative to Root Container Region dimensions, allowing subtitle sequences to be authored independently of Related Video Object resolution.
To enable scenarios where the same glyphs are used in multiple successive intermediate synchronic documents, e.g. to convey a CEA-608/708-style roll-up (see [CEA-608] and [CEA-708]), the Glyph Buffers Gn and Gn-1 store rendered glyphs across intermediate synchronic documents, allowing glyphs to be copied into the Presentation Buffer instead of rendered, a more costly operation.
Similarly, Decoded Image Buffers Dn and Dn-1 store decoded images across intermediate synchronic documents, allowing images to be copied into the Presentation Buffer instead of decoded.
The Presentation Compositor SHALL render in Presentation Buffer Pn each successive intermediate synchronic document En using the following steps in order:
The Presentation Compositor SHALL start rendering En:
The Presentation Compositor never renders an ISD more than IPD ahead of its presentation time and treats sequences of empty ISDs as a single ISD.
The duration DUR(En) for painting an intermediate synchronic document En in the Presentation Buffer Pn SHALL be:
DUR(En) = S(En) / BDraw + DURT(En) + DURI(En)
where
The contents of the Presentation Buffer Pn SHALL be transferred instantaneously to Presentation Buffer Pn-1 at the presentation time of intermediate synchronic document En, making the latter available for display.
It is possible for the contents of Presentation Buffer Pn-1 to never be displayed. This can happen if Presentation Buffer Pn is copied twice to Presentation Buffer Pn-1 between two consecutive video frame boundaries of the Related Video Object.
It SHALL be an error for the Presentation Compositor to fail to complete painting pixels for En before the presentation time of En.
Unless specified otherwise, the following table SHALL specify values for IPD and BDraw.
Parameter | Initial value |
---|---|
Initial Painting Delay (IPD) | 1 s |
Normalized background drawing performance factor (BDraw) | 12 s-1 |
BDraw effectively sets a limit on fillings regions - for example, assuming that the Root Container Region is ultimately rendered at 1920×1080 resolution, a BDraw of 12 s-1 would correspond to a fill rate of 1920×1080×12/s=23.7×220pixels s-1.
IPD effectively sets a limit on the complexity of any given intermediate synchronic document.
The total normalized drawing area S(En) for intermediate synchronic document En SHALL be
S(En) = CLEAR(En) + PAINT(En )
where CLEAR(En) = 0 if n=0 or En-1 is an empty ISD, and CLEAR(En) = 1 otherwise.
To ensure consistency of the Presentation Buffer, a new intermediate synchronic document requires clearing of the Root Container Region.
PAINT(En) SHALL be the normalized area to be painted for all regions that are used in intermediate synchronic document En according to:
PAINT(En) = ∑Ri∈Rp NSIZE(Ri) ∙ NBG(Ri)
where R_p SHALL be the set of presented regions in the intermediate synchronic document En.
NSIZE(Ri) SHALL be given by:
NSIZE(Ri) = (width of Ri ∙ height of Ri ) ÷ (Root Container Region height ∙ Root Container Region width)
NBG(Ri) SHALL be the total number of elements within the tree rooted at region Ri that satisfy the following criteria:
region
, body
, div
, p
or
span
; andtts:backgroundColor
is not 0
.NBG(Ri) counts the number of tts:backgroundColor
attributes specified span
elements.
In a common scenario illustrated below, this results in the complexity of painting (relatively small) span backgrounds to be equal to painting the background of (relatively much larger) region that essentially fills the root container.
This can be addressed by excluding span
from the NBG(Ri) computation, and instead including tts:backgroundColor
in the list of glyph properties at https://www.w3.org/TR/ttml-imsc1.1/#paint-text.
An element and its parent that satisfy the criteria above and share identical computed values of
tts:backgroundColor
are counted as two distinct elements for the purpose of computing NBG(Ri).
The set
element is not included in the computation of NBG(Ri). While it can affect the
computed values of tts:backgroundColor
, it is removed during Intermediate Synchronic Document
construction.
The Presentation Compositor SHALL paint into the Presentation Buffer Pn all visible pixels of presented images of intermediate synchronic document En.
For each presented image, the Presentation Compositor SHALL either:
Two images SHALL be identical if and only if they reference the same encoded image source.
The duration DURI(En) for painting images of an intermediate synchronic document En in the Presentation Buffer SHALL be as follows:
DURI(En) = ∑Ii ∈ Ic NRGA(Ii) / ICpy + ∑Ij ∈ Id NSIZ(Ij) / IDec
where
NRGA(Ii) is the Normalized Image Area of presented image Ii and SHALL be equal to:
NRGA(Ii)= (width of Ii ∙ height of Ii ) ÷ ( Root Container Region height ∙ Root Container Region width )
NSIZ(Ii) SHALL be the number of pixels of presented image Ii.
The contents of the Decoded Image Buffer Dn SHALL be transferred instantaneously to Decoded Image Buffer Dn-1 at the presentation time of intermediate synchronic document En.
The total size occupied by images stored in Decoded Image Buffers Dn or Dn-1 SHALL be the sum of their Normalized Image Area.
The size of Decoded Image Buffers Dn or Dn-1 SHALL be the Normalized Decoded Image Buffer Size (NDIBS).
Unless specified otherwise, the following table SHALL specify ICpy, IDec, and NDBIS.
Parameter | Initial value |
---|---|
Normalized image copy performance factor (ICpy) | 6 |
Image Decoding rate (IDec) | 1 × 220 pixels s-1 |
Normalized Decoded Image Buffer Size (NDIBS) | 0.9885 |
In the context of this section, a glyph is a tuple consisting of (i) one character and (ii) the computed values of the following style properties:
tts:color
tts:fontFamily
tts:fontSize
tts:fontStyle
tts:fontWeight
tts:textDecoration
tts:textOutline
tts:textShadow
In the case where a property is prohibited in a profile of IMSC, the computed value of the property specified in [ttml2] can be used.
While one-to-one mapping between characters and typographical glyphs is generally the rule in some scripts, e.g. latin script, it is the exception in others. For instance, in arabic script, a character can yield multiple glyphs depending on its position in a word. The Hypothetical Render Model always assumes a one-to-one mapping, but reduces the performance of the glyph buffer for scripts where one-to-one mapping is not the general rule (see GCpy below).
For each glyph associated with a character in a presented region of intermediate synchronic document En, the Presentation Compositor SHALL:
The duration DURT(En) for rendering the text of an intermediate synchronic document En in the Presentation Buffer is as follows:
DURT(En) = ∑gi ∈ Γr NRGA(gi) / Ren(gi) + ∑gj ∈ Γc NRGA(gj) / GCpy
where
The Normalized Rendered Glyph Area NRGA(gi) of a glyph gi SHALL be equal to:
NRGA(gi) = (fontSize of gi as percentage of Root Container Region height)2
NRGA(Gi) does not take into account decorations (e.g. underline), effects (e.g. outline) or actual typographical glyph aspect ratio. An implementation can determine an actual buffer size needs based on worst-case glyph size complexity.
The contents of the Glyph Buffer Gn SHALL be copied instantaneously to Glyph Buffer Gn-1 at the presentation time of intermediate synchronic document En.
It SHALL be an error for the sum of NRGA(gi) over all glyphs Glyph Buffer Gn to be larger than the Normalized Glyph Buffer Size (NGBS).
Unless specified otherwise, the following table specifies values of GCpy, Ren and NGBS.
Normalized glyph copy performance factor (GCpy) | |
---|---|
Script property (see Standard Annex #24 at [UNICODE]) for the character of gi | GCpy |
latin, greek, cyrillic, hebrew or base | 12 |
any other value | 3 |
Text rendering performance factor Ren(Gi) | |
Block property (see [UNICODE]) for the character of gi | Ren(Gi) |
CJK Unified Ideograph | 0.6 |
any other value | 1.2 |
Normalized Glyph Buffer Size (NGBS) | |
1 |
The choice of font by the presentation processor can increase rendering complexity. For instance, a cursive font can generally result in a given character yielding different typographical glyphs depending on context, even if latin script is used.
This section is non-normative.
This section is non-normative.
This section is non-normative.
Referenced in:
Referenced in: