XForms State Handling at a Glance

Rationale

Updated 2012-02-07.

Some aspects of XForms processing are time-consuming, including:
  • creating the initial data structures associated with the XForms markup
  • statically analyzing aspects of the XForms page, including XPath expressions
  • responding to Ajax requests that act upon an XForms page
For these reasons, the XForms engine in Orbeon Forms relies on state handling mechanisms and caches.

Different states of a live XForms page

Each live (i.e. in a state where a user can interact with it) XForms page is made of two types of information:
  • Static information, AKA the page's static state, which never changes over the lifecycle of the page. This includes:
    • the models and initial instances of the page
    • the static descriptions of controls on the page
    • the descriptions of event handlers and actions
    • etc.
  • Dynamic information, AKA the page's dynamic state, which typically changes over the lifecycle of the page. This includes:
    • current instance documents
    • current state of the tree of controls, including
      • values of the XForms controls
      • unrolled repeat iterations
      • states of switch/cases and dialogs
      • etc.
The dynamic state is first created when the page is loaded, and then is updated as the user interacts with the page.

Both static state and dynamic state have two representations:
  • a representation as Java objects
  • a serialized representation

Static state

The static state is implemented in class XFormsStaticState and related classes.

The static state is computed during a phase called static analysis.

Dynamic state

The dynamic state is implemented in class XFormsContainingDocument and related classes.

Cacheable information

For each XForms page, two main types of information can be cached:
  • static state:
    • if there are two requests for a page with the same static information, then the static state can be cached and shared between the two pages
    • this saves a lot of the XForms initialization work
    • NOTE: this occurs before any of the XForms processing per se starts, i.e. before any instance is loaded and before any event handler runs on the page
  • dynamic state:
    • between Ajax requests on a given page, the dynamic state information is kept in a cache so that it can be retrieved quickly when the next Ajax request comes in
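
The sharing of static state between requests can be pictured with the following minimal sketch. It is not the Orbeon Forms implementation: the class and method names are hypothetical, and the choice of SHA-256 as the digest function is an assumption made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the Orbeon Forms code base.
final class StaticStateCacheSketch {

    // Stand-in for the immutable result of static analysis.
    static final class StaticState {
        final String digest;
        StaticState(String digest) { this.digest = digest; }
    }

    private final Map<String, StaticState> byDigest = new ConcurrentHashMap<>();
    private final AtomicInteger analysisCount = new AtomicInteger();

    // Assumption: SHA-256 over the XForms markup; the real digest function may differ.
    static String digestOf(String xformsMarkup) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(xformsMarkup.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs the (simulated) static analysis only when the digest is not cached yet.
    StaticState staticStateFor(String xformsMarkup) {
        return byDigest.computeIfAbsent(digestOf(xformsMarkup), d -> {
            analysisCount.incrementAndGet();
            return new StaticState(d);
        });
    }

    int analysisCount() { return analysisCount.get(); }

    public static void main(String[] args) {
        StaticStateCacheSketch cache = new StaticStateCacheSketch();
        StaticState first = cache.staticStateFor("<xforms:model id=\"m\"/>");
        StaticState second = cache.staticStateFor("<xforms:model id=\"m\"/>");
        System.out.println(first == second);        // both requests share one object
        System.out.println(cache.analysisCount());  // number of analyses actually run
    }
}
```

Two requests with identical XForms markup get the same StaticState object, so the costly analysis runs once.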

Hierarchy of caches

Here is how various caches are used:
  • when possible, the XPL (XML pipelines) cache stores the document's digest and the initial incoming HTML, which are used for the initial generation of the HTML page
    • see: XFormsToXHTML
  • static information of XForms documents is cached in memory by digest
    • implementation: XFormsStaticStateCache (uses MemoryCacheImpl)
    • property: oxf.xforms.cache.static-state.size
  • running XForms documents are cached in memory by UUID
    • implementation: XFormsDocumentCache (uses MemoryCacheImpl)
    • property: oxf.xforms.cache.documents.size
  • when running documents are evicted from XFormsDocumentCache, they are serialized to a store
    • implementation: EhcacheStateStore extends XFormsStateStore
    • configuration: oxf:/config/ehcache.xml
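
The last two levels of this hierarchy can be sketched as a bounded in-memory cache that serializes documents to a store when they are evicted. This is only an illustration: DocumentCacheSketch and LiveDocument are hypothetical names, a plain map stands in for the Ehcache-backed store, and moving an entry out of the store when it is restored is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these classes are not part of Orbeon Forms.
final class DocumentCacheSketch {

    // Stand-in for a running XForms document, identified by UUID.
    static final class LiveDocument {
        final String uuid;
        final String fieldValue;
        LiveDocument(String uuid, String fieldValue) {
            this.uuid = uuid;
            this.fieldValue = fieldValue;
        }
        String serialize() { return fieldValue; }
        static LiveDocument deserialize(String uuid, String serialized) {
            return new LiveDocument(uuid, serialized);
        }
    }

    // Stand-in for the state store holding serialized documents.
    private final Map<String, String> store = new HashMap<>();

    // Least-recently-used cache of live documents, keyed by UUID.
    private final LinkedHashMap<String, LiveDocument> live;

    DocumentCacheSketch(final int maxLiveDocuments) {
        this.live = new LinkedHashMap<String, LiveDocument>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, LiveDocument> eldest) {
                if (size() > maxLiveDocuments) {
                    // Evicted documents are serialized to the store, not lost.
                    store.put(eldest.getKey(), eldest.getValue().serialize());
                    return true;
                }
                return false;
            }
        };
    }

    void put(LiveDocument document) { live.put(document.uuid, document); }

    // Looks in memory first, then falls back to the store and restores the document.
    LiveDocument find(String uuid) {
        LiveDocument cached = live.get(uuid);
        if (cached != null) return cached;
        String serialized = store.remove(uuid); // assumption: the entry moves back to memory
        if (serialized == null) return null;
        LiveDocument restored = LiveDocument.deserialize(uuid, serialized);
        live.put(uuid, restored);               // may in turn evict another document
        return restored;
    }

    boolean isLive(String uuid) { return live.containsKey(uuid); }
    boolean isStored(String uuid) { return store.containsKey(uuid); }

    public static void main(String[] args) {
        DocumentCacheSketch cache = new DocumentCacheSketch(2);
        cache.put(new LiveDocument("a", "1"));
        cache.put(new LiveDocument("b", "2"));
        cache.put(new LiveDocument("c", "3"));  // "a" is evicted and serialized
        System.out.println(cache.isStored("a"));
        System.out.println(cache.find("a").fieldValue);
    }
}
```

In this sketch the size limit plays the role of a property such as oxf.xforms.cache.documents.size.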

XForms state manager

The state manager (XFormsStateManager) handles the lifecycle of XFormsContainingDocument, including:
  • managing access to XFormsDocumentCache
  • managing access to XFormsStateStore
The lifecycle of documents is outlined in an interface, XFormsStateLifecycle. See XForms Document Lifecycle.

State serialization

Static state

The serialized static state is a String containing an XML representation of the state.

Dynamic state

The dynamic state used to be a serialized String version of an XML representation of the state.

As of 2012-02, the dynamic state is first converted to a snapshot (DynamicState.scala), which itself can be serialized as a Seq[Byte] when needed.

What gets serialized:
  • uuid
  • sequence number
  • other details
    • deployment type, request context path, request path, container type, container namespace
    • versioned path matchers
    • pending uploads information
  • subset of controls state, including
    • switch selected cases
    • repeat indexes
    • last Ajax response if present
  • all instances, except instances that are inline, readonly and non-replaced (all conditions must apply)
    • NOTE: cacheable instances only serialize their metadata, not their XML document, so they are cheaper to serialize
  • in noscript mode, the HTML template
The full control state/tree is not serialized because it can be fully reconstructed from the information above.
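
As an illustration of the snapshot idea, here is a minimal sketch that captures a small subset of the listed items and turns it into bytes and back. The actual snapshot lives in DynamicState.scala and is not reproduced here: the Snapshot class, its fields, and the use of standard Java serialization are all assumptions of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: field names and serialization format are illustrative only.
final class DynamicStateSnapshotSketch {

    // Only what cannot be rebuilt is captured; control values are left out on purpose.
    static final class Snapshot implements Serializable {
        private static final long serialVersionUID = 1L;
        final String uuid;
        final long sequence;
        final HashMap<String, String> selectedCases;   // switch id -> selected case id
        final HashMap<String, Integer> repeatIndexes;  // repeat id -> current index
        final ArrayList<String> instanceDocuments;     // serialized instance XML

        Snapshot(String uuid, long sequence, Map<String, String> selectedCases,
                 Map<String, Integer> repeatIndexes, List<String> instanceDocuments) {
            this.uuid = uuid;
            this.sequence = sequence;
            this.selectedCases = new HashMap<>(selectedCases);
            this.repeatIndexes = new HashMap<>(repeatIndexes);
            this.instanceDocuments = new ArrayList<>(instanceDocuments);
        }
    }

    // Serializes the snapshot only when needed, e.g. when it leaves the in-memory cache.
    static byte[] toBytes(Snapshot snapshot) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(snapshot);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Snapshot fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Snapshot) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> cases = new HashMap<>();
        cases.put("switch-1", "case-b");
        Map<String, Integer> indexes = new HashMap<>();
        indexes.put("repeat-1", 3);
        List<String> instances = new ArrayList<>();
        instances.add("<data/>");

        Snapshot restored = fromBytes(toBytes(new Snapshot("uuid-1", 7L, cases, indexes, instances)));
        System.out.println(restored.uuid + " " + restored.sequence);
    }
}
```

After restoring such a snapshot, the control tree would be rebuilt from the instances, selected cases and repeat indexes rather than read from the serialized form.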

XForms state store

The state store manages serialized static state information and dynamic state information.

This is done in two phases:
  • in-memory storage
  • persistent storage
NOTE: As of 2010-08, evidence suggests that serialized versions of static and dynamic state might not take much less space than their object tree representations. This means that we could:
  • improve the size of the serialized versions
  • AND/OR skip the in-memory storage completely and directly go to disk
NOTE: As of 2011-01, the store is configurable through ehcache.xml, so the trade-off between memory and disk is a bit more configurable.

Handling of the HTML template

[ADDED 2012-02-07]

When a page is initially processed, the incoming page markup (usually XHTML+XForms) is temporarily stored while initial XForms processing takes place. When XForms processing is done, the template is "replayed" and placeholders are filled out with values coming from the XForms control tree. For example, an <xforms:input> element is replaced with corresponding HTML markup, including the initial value of the field.
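
The "replay" step can be sketched as follows. This is a simplified model, not the Orbeon Forms implementation: the template is represented as a hypothetical list of segments, each either literal markup or a placeholder naming a control, and the generated HTML for an input is deliberately minimal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the Segment model and the generated markup are illustrative only.
final class TemplateReplaySketch {

    // A template segment is either literal markup or a placeholder for a control.
    static final class Segment {
        final String literal;    // non-null for literal markup
        final String controlId;  // non-null for a placeholder
        private Segment(String literal, String controlId) {
            this.literal = literal;
            this.controlId = controlId;
        }
        static Segment literal(String markup) { return new Segment(markup, null); }
        static Segment control(String id) { return new Segment(null, id); }
    }

    // Replays the template, filling each placeholder with the control's current value.
    static String replay(List<Segment> template, Map<String, String> controlValues) {
        StringBuilder html = new StringBuilder();
        for (Segment segment : template) {
            if (segment.literal != null) {
                html.append(segment.literal);
            } else {
                String value = controlValues.getOrDefault(segment.controlId, "");
                html.append("<input id=\"").append(segment.controlId)
                    .append("\" value=\"").append(escape(value)).append("\"/>");
            }
        }
        return html.toString();
    }

    // Escapes characters that are not allowed as-is in an attribute value.
    static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        List<Segment> template = new ArrayList<>();
        template.add(Segment.literal("<p>Name: "));
        template.add(Segment.control("name"));   // stands for an <xforms:input>
        template.add(Segment.literal("</p>"));

        Map<String, String> values = new HashMap<>();
        values.put("name", "Ada");

        System.out.println(replay(template, values));
    }
}
```

The literal segments are what the template amounts to; the control values come from the control tree at the time the page is generated.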

Often, the template can be discarded after the page has been initially sent to the client. That is, it is not stored into the static or dynamic state of the page.

For performance reasons however, it is kept around in the XPL cache, which means that if the input of the XForms page is fully cacheable, the XForms processor does not need to re-read the input completely. This means that it can not only find the static state via the document's digest, but also the template.

The XPL cache might miss in these cases:
  1. the data got expired from the cache
  2. the input is not fully cacheable
  3. the input changes between requests
Now, even if the input changes between requests, the XForms part of it might still have the same digest. In this case, the input is re-read entirely and re-digested. The static state might be found in the static state cache, but the template is re-created. This can happen when non-XForms aspects of the page, such as its header or footer, change between requests, for example if the page is produced with XSLT or JSP, and/or via separate deployment.
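
The difference between the two lookups can be illustrated with a small sketch. It assumes, for simplicity only, that the page is already split into header, XForms part and footer, that the static state is keyed by a SHA-256 digest of the XForms part, and that the template is keyed by a digest of the whole input; the real XPL cache works with cache keys and validities rather than this scheme, and all names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a simplified model of the two cache lookups.
final class TemplateCacheMissSketch {

    static final class Page {
        final String header, xforms, footer;
        Page(String header, String xforms, String footer) {
            this.header = header;
            this.xforms = xforms;
            this.footer = footer;
        }
        String fullInput() { return header + xforms + footer; }
    }

    static final class Outcome {
        final boolean staticStateReused;
        final boolean templateReused;
        Outcome(boolean staticStateReused, boolean templateReused) {
            this.staticStateReused = staticStateReused;
            this.templateReused = templateReused;
        }
    }

    private final Set<String> staticStateDigests = new HashSet<>();
    private final Set<String> templateKeys = new HashSet<>();

    static String sha256(String text) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                                       .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Set.add returns false when the key was already present, i.e. on a cache hit.
    Outcome request(Page page) {
        boolean staticHit = !staticStateDigests.add(sha256(page.xforms));
        boolean templateHit = !templateKeys.add(sha256(page.fullInput()));
        return new Outcome(staticHit, templateHit);
    }

    public static void main(String[] args) {
        TemplateCacheMissSketch caches = new TemplateCacheMissSketch();
        caches.request(new Page("<h1>A</h1>", "<xforms:model/>", "<p>footer 1</p>"));
        Outcome second = caches.request(new Page("<h1>A</h1>", "<xforms:model/>", "<p>footer 2</p>"));
        System.out.println(second.staticStateReused + " " + second.templateReused);
    }
}
```

When only the footer changes, the static state is found again by digest while the template has to be re-created.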

Now there are two situations where the template must be kept around:
  1. noscript mode, where a full HTML page is produced for every response
  2. full updates, which also need to keep some HTML around (ideally, only those parts of the page that are needed)
So the template must be kept either in the static state or the dynamic state.

Historically, the template was kept in the static state only. At some point, when support for the static state digest and for pages whose HTML changes between requests was introduced, the template was moved to the dynamic state, except for full updates, which were still referenced from the dynamic state (not necessarily correctly).

As of 2012-02, the thinking is the following:
  • noscript
    • the template is kept in the static state or the dynamic state, via configuration
  • full updates
    • the template is kept only in the static state, the reasoning being that full updates only deal with content beneath XForms controls (like <xforms:group>), and if that content varies then different XForms documents will be produced (while content "around" XForms controls is allowed to vary)
  • no noscript and no full updates
    • the template is not kept within either state
  • NOTE: at this time, noscript and full updates are exclusive, as you get a different XForms document if you enable noscript or not
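
The rules above can be summarized as a small decision function. This is a sketch of the stated policy only: the enum, the method and the way the noscript configuration is passed in are hypothetical and do not correspond to actual Orbeon Forms properties.

```java
// Hypothetical sketch: names are illustrative and not taken from Orbeon Forms.
final class TemplateLocationSketch {

    enum Location { STATIC_STATE, DYNAMIC_STATE, NOT_KEPT }

    // noscriptSetting is the configured location used in noscript mode
    // (STATIC_STATE or DYNAMIC_STATE); it is ignored otherwise.
    static Location templateLocation(boolean noscript, boolean fullUpdates, Location noscriptSetting) {
        if (noscript && fullUpdates) {
            // The two modes are exclusive: they yield different XForms documents.
            throw new IllegalArgumentException("noscript and full updates are exclusive");
        }
        if (noscript) {
            if (noscriptSetting == Location.NOT_KEPT) {
                throw new IllegalArgumentException("noscript mode must keep the template");
            }
            return noscriptSetting;
        }
        if (fullUpdates) {
            // Content beneath XForms controls is part of the XForms document itself.
            return Location.STATIC_STATE;
        }
        return Location.NOT_KEPT;
    }

    public static void main(String[] args) {
        System.out.println(templateLocation(true, false, Location.DYNAMIC_STATE));
        System.out.println(templateLocation(false, true, Location.DYNAMIC_STATE));
        System.out.println(templateLocation(false, false, Location.DYNAMIC_STATE));
    }
}
```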
