FAQ - 4. XML Pipelines (XPL)

What is an XML pipeline and why do I care?

XML pipelining is an approach to processing XML where the inputs and outputs of multiple processing steps (e.g., XSLT transformations) are connected together using a pipeline metaphor. Orbeon has implemented an XML pipeline engine in Java that executes a declarative XML pipelining language called XPL.

"Programming" pipelines using declarative XML instead of writing procedural code results in a significant increase in productivity for tasks that require high volume or complex XML processing. XML documents enter a pipeline, are efficiently processed by one or more processors as specified by XPL instructions, and are then output for further processing, display, or storage. XPL features advanced capabilities such as document aggregation, conditionals ("if" conditions), loops, schema validation, caching, and sub-pipelines.

XML pipelines are built up from smaller components called XML processors. An XML processor is a software component that consumes and produces XML documents. New XML processors are usually written in Java, but developers rarely need to write their own because the engine provides a comprehensive library. Example processors include an XSLT processor, database processors that interface with both SQL and native XML databases, and a serializer processor that writes XML documents to disk. XPL orchestrates these to create business logic, similar to the way Java code "orchestrates" Java object method calls.

What is XPL?

At the core of Orbeon Forms lies a powerful XML processing engine that natively speaks the XML Pipeline Language (XPL). XPL is a declarative language for processing XML using a pipeline metaphor. XML documents enter a pipeline, are efficiently processed by one or more processors as specified by XPL instructions, and are then output for further processing, display, or storage. XPL features advanced capabilities such as document aggregation, conditionals ("if" conditions), loops, schema validation, and sub-pipelines.
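
As a rough illustration of two of these capabilities, the sketch below aggregates two documents under a new root element and then branches on the content of the result. It assumes the aggregate() function in href attributes and the p:choose, p:when, and p:otherwise elements as described in the XPL reference documentation; the file names, ids, and XPath test are hypothetical.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <!-- Aggregation: wrap two hypothetical documents in a new <orders> root element -->
    <p:processor name="oxf:identity">
        <p:input name="data" href="aggregate('orders', order-1.xml, order-2.xml)"/>
        <p:output name="data" id="orders"/>
    </p:processor>

    <!-- Conditional: pick a stylesheet depending on the aggregated document -->
    <p:choose href="#orders">
        <p:when test="count(/orders/*) > 1">
            <p:processor name="oxf:xslt">
                <p:input name="config" href="summary.xsl"/>
                <p:input name="data" href="#orders"/>
                <p:output name="data" id="result"/>
            </p:processor>
        </p:when>
        <p:otherwise>
            <p:processor name="oxf:xslt">
                <p:input name="config" href="detail.xsl"/>
                <p:input name="data" href="#orders"/>
                <p:output name="data" id="result"/>
            </p:processor>
        </p:otherwise>
    </p:choose>

    <!-- The "result" output would then be connected to a later processor, such as a serializer -->
</p:config>
```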

The Orbeon XPL pipeline engine used in Orbeon Forms is designed for low memory consumption and supports transparent caching.

What is an XML processor?

The term XML processor is commonly used to refer to XML parsers. In the context of Orbeon Forms, the term XML processor is used to refer to any software component consuming and/or producing XML documents. An XML processor can also simply be called an XML component.

What does an XML pipeline look like?

This example illustrates a simple 2-stage XPL pipeline that performs an XSLT transformation on an XML document located on disk, and then writes the result back to another file on disk. For more details about XPL, please see the XPL and Pipelines reference documentation.


Figure: 2-stage XPL pipeline diagram

Listing: 2-stage pipeline XPL code
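
A two-stage pipeline of the kind described above might look like the following sketch. The file names and the id are hypothetical, and the configuration elements of the file serializer are an assumption: depending on the Orbeon Forms version, the serializer may expect a different configuration or an intermediate conversion step, so check the processor documentation before relying on it.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <!-- Stage 1: transform input.xml (hypothetical file) with stylesheet.xsl -->
    <p:processor name="oxf:xslt">
        <p:input name="config" href="stylesheet.xsl"/>
        <p:input name="data" href="input.xml"/>
        <p:output name="data" id="transformed"/>
    </p:processor>

    <!-- Stage 2: write the result to disk.
         The configuration elements below are an assumption; check the
         serializer documentation for the format your version expects. -->
    <p:processor name="oxf:file-serializer">
        <p:input name="config">
            <config>
                <directory>/tmp</directory>
                <file>output.xml</file>
            </config>
        </p:input>
        <p:input name="data" href="#transformed"/>
    </p:processor>
</p:config>
```

The two stages are connected only through the id "transformed": the XSLT processor declares it on its output, and the serializer refers to it with href="#transformed".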

Is there a specification for XPL?

As of February 2005, a draft specification has been completed and submitted to the W3C. It serves as a basis for discussion of an XPL 1.0 specification.

As of December 2005, XML Processing Model Working Group meetings have started at W3C. This working group, of which Orbeon is a member, is in charge of working on a standard XML processing language. While the deliverable of the working group will certainly not be exactly XPL, we do hope that in its first version it will cover a significant number of use cases currently covered by the XPL implementation found in Orbeon Forms.

As of August 2008, the XML Processing Model Working Group has released several working drafts of the XProc XML Pipeline Language. A Last Call should be coming very soon.

Is anybody free to implement XPL?

Yes.

How much of the XPointer specification does XPL support?

Orbeon Forms supports a subset of XPointer. You can use the XPointer document#xpointer(/xpath/expression) syntax to extract a nodeset from a document.
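
For instance, an XPointer expression can be appended to the href of a processor input so that only part of a document is passed along. This is a minimal sketch; the file names, the id, and the XPath expression are hypothetical.

```xml
<p:processor name="oxf:xslt" xmlns:p="http://www.orbeon.com/oxf/pipeline">
    <p:input name="config" href="stylesheet.xsl"/>
    <!-- Only the first <order> element of the hypothetical orders.xml is passed as data -->
    <p:input name="data" href="orders.xml#xpointer(/orders/order[1])"/>
    <p:output name="data" id="first-order"/>
</p:processor>
```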

How can I pass parameters to an XSLT stylesheet?

This is possible by importing the stylesheet into another stylesheet that sets the parameter, as follows:

<p:processor name="oxf:xslt" xmlns:p="http://www.orbeon.com/oxf/pipeline">
    <p:input name="data" href="..."/>
    <p:input name="config">
        <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <!-- This is the stylesheet to pass parameters to -->
            <xsl:import href="tour.xsl"/>
            <!-- Here we assign a value to the "start" parameter -->
            <xsl:param name="start" select="'a1'"/>
        </xsl:stylesheet>
    </p:input>
    <p:output name="data" id="..."/>
</p:processor>

How does caching work in Orbeon Forms?

Caching mechanisms should have no impact on the behavior of a system, except for a gain in performance. This is also the principle followed by the Orbeon Forms cache: when you develop an application with Orbeon Forms, you should be able to ignore that caching takes place. This may be all you want to know about caching in Orbeon Forms!

To get a better understanding of the underlying mechanism, consider an example in which an XSLT transformation produces the configuration of an XUpdate processor.

Let's assume the following:

  • The XSLT processor's config input is an XSLT stylesheet on disk.
  • The XSLT processor's data input is an XML file on disk.
  • The XSLT stylesheet does not contain imports, includes, the document() function, or calls to Java code.

Under those assumptions, the XSLT transformation does not have side effects, which means that if neither the XSLT stylesheet (config input) nor the input XML document (data input) changes, the output of the transformation will be the same.

If you were to keep the result of the transformation, knowing that both inputs have not changed since the last time you generated that output, you wouldn't have to actually run the transformation again: you could just reuse the result you already have. This is the basis of caching in Orbeon Forms.

In this example, the result of the XSLT transformation is used to create an XUpdate processor configuration, which is an XUpdate program that will be interpreted or compiled by the XUpdate processor. Instead of keeping the result of the XSLT transformation as an XML document, the XUpdate processor can cache the compiled XUpdate program. Compared to a solution where the XML document is cached, this technique saves memory by not keeping the XML document in cache, and saves processing power by sparing the XUpdate processor from recompiling its program.

Therefore, if neither the XSLT transformer's config nor data input has changed, the XUpdate processor can keep the same program in the Orbeon Forms object cache, saving an XSLT stylesheet compilation, an XSLT transformation, and an XUpdate program compilation.
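
The pipeline discussed in this example can be sketched as follows. The file names and ids are hypothetical, and the processor name oxf:xupdate with its config and data inputs is an assumption based on the naming used by the other built-in processors.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <!-- Both inputs are read from disk; if neither file changes, the output can be reused -->
    <p:processor name="oxf:xslt">
        <p:input name="config" href="generate-xupdate.xsl"/>
        <p:input name="data" href="model.xml"/>
        <p:output name="data" id="xupdate-program"/>
    </p:processor>

    <!-- The XSLT output is the XUpdate program; its compiled form can stay in the object cache.
         The processor name and the data file are assumptions made for this sketch. -->
    <p:processor name="oxf:xupdate">
        <p:input name="config" href="#xupdate-program"/>
        <p:input name="data" href="document.xml"/>
        <p:output name="data" id="updated"/>
    </p:processor>
</p:config>
```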

The same mechanism applies to the XSLT transformer configuration: the compiled XSLT stylesheet can be kept in the Orbeon Forms object cache. And if the output of the XUpdate transformation is sent to an HTML serializer, the HTML output could be cached as well.

In general, Orbeon Forms does not unnecessarily cache the XML documents passed between processors. Instead, it caches the result of time-consuming operations, as illustrated above.

Some cases are more complex than the example shown here. For example, in the case of XSLT 1.0, Orbeon Forms handles caching and dependencies related to imports and includes, as well as to the XPath document() function when the URL passed to the function is static. Some processors, like the SQL processors, never allow their output to be cached.

Do debug attributes make my application slower?

The Orbeon Forms cache avoids executing the parts of a pipeline that do not need to be re-evaluated. However, when a debug attribute is set, the point where the attribute is set needs to be evaluated so that meaningful data can be displayed. This can degrade performance. Debug attributes should be removed in production.
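
As a minimal sketch, a debug attribute is placed on an input or output of a processor invocation, and its value is used as a label for the logged document. The file name, ids, and label below are hypothetical.

```xml
<p:processor name="oxf:xslt" xmlns:p="http://www.orbeon.com/oxf/pipeline">
    <p:input name="config" href="stylesheet.xsl"/>
    <!-- debug forces this input to be read so its content can be logged; remove in production -->
    <p:input name="data" href="#my-input" debug="xslt-data-input"/>
    <p:output name="data" id="my-output"/>
</p:processor>
```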

Can I disable debug attributes globally?

Not at this time.

When invoking a processor in XPL, what is the difference between a name and an id?

Processors can be compared to functions in traditional programming languages. Processors (just like functions) have inputs (arguments) and outputs (return values). Each input and output has a name. The name is part of the processor's interface. For instance, using an informal function-like notation, the XSLT processor interface is:

(data) = xslt(data, config)

since the XSLT processor has 2 inputs named data and config, and one output named data.

The interface to a processor is the contract that defines what inputs and outputs do. If you are using an existing processor, for example the XSLT transformer, you have to use the names declared by that processor. To know what names you must use, you have to consult the documentation for each processor. We have tried to be consistent and to use "config" and "data" as often as possible. You can for example call the XSLT transformer like this:

<p:processor name="oxf:xslt">
    <p:input name="config" href="stylesheet.xsl"/>
    <p:input name="data" href="input.xml"/>
    <p:output name="data" id="my-output"/>
</p:processor>

A pipeline can also be viewed as a processor. If that pipeline decides to export inputs and outputs, it must do so using the <p:param name="..."/> syntax. This defines its interface. You can compare this to writing your own method in Java, as opposed to using an existing method. The difference is that in Java, you address method parameters by position when you call a method. In XPL, you always address them by name. Therefore it is important to use the right name when you call a processor (or a pipeline).
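
A pipeline that exports one input and one output could be sketched as below. The type attribute on p:param and the ref attribute on p:output are assumptions drawn from the XPL reference documentation rather than from this FAQ; the stylesheet name is hypothetical.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <!-- The pipeline's interface: one input and one output, both named "data" -->
    <p:param type="input" name="data"/>
    <p:param type="output" name="data"/>

    <p:processor name="oxf:xslt">
        <p:input name="config" href="stylesheet.xsl"/>
        <!-- Read the pipeline's "data" input -->
        <p:input name="data" href="#data"/>
        <!-- Send the result to the pipeline's "data" output -->
        <p:output name="data" ref="data"/>
    </p:processor>
</p:config>
```

A caller then connects to this pipeline using the names declared by p:param, exactly as it would with a built-in processor.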

This also applies to the cases where the Web Application Controller (Page Flow Controller, or PFC) calls your own pipelines: the PFC has to know the names of the inputs and outputs to connect to. Therefore, you have to use the documented names (data and instance).

If you implement a pipeline or write a new processor in Java, and you don't have any external naming constraints (such as those defined by the PFC, or those of somebody who expects to call your pipeline using names defined in advance), you are free to use any name.

In XPL, an id can be assigned to an output when invoking a processor. This id can then be used later on in the same pipeline to refer to that specific output, for instance to connect it to the input of another processor. Ids are similar to variable names in most programming languages.
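
The sketch below shows the two notions side by side: name selects an input or output declared by the processor, while id labels a particular output so that a later processor can refer to it with href="#...". The stylesheet and file names and the ids are hypothetical.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <p:processor name="oxf:xslt">
        <p:input name="config" href="first.xsl"/>
        <p:input name="data" href="input.xml"/>
        <!-- name="data" is imposed by the XSLT processor; id="first-result" is chosen freely -->
        <p:output name="data" id="first-result"/>
    </p:processor>

    <p:processor name="oxf:xslt">
        <p:input name="config" href="second.xsl"/>
        <!-- The id acts like a variable name: it connects the first output to this input -->
        <p:input name="data" href="#first-result"/>
        <p:output name="data" id="second-result"/>
    </p:processor>
</p:config>
```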

I am confused by all those oxf: prefixes. What do they mean?

By default, Orbeon Forms uses the string "oxf:" in two ways:

  • As an XML namespace prefix in XPL. The XML namespace prefix oxf is typically mapped to the XML namespace URI http://www.orbeon.com/oxf/processors. This namespace is used to resolve XML qualified names (also referred to as "QNames") that identify the built-in XML processors in XPL, such as oxf:xslt. These are constructs specific to XML.

  • As a URL scheme. The oxf URL scheme is used to address Orbeon Forms resources through the Orbeon Forms Resource Manager. It appears in URLs such as oxf:/config/properties.xml. Such URLs are supported throughout the Orbeon Forms XPL engine and XForms engine.
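
Both uses can appear side by side in a single processor invocation, as in this sketch. The resource paths and the id are hypothetical.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- "oxf:" in the name attribute is a namespace prefix identifying the processor -->
    <p:processor name="oxf:xslt">
        <!-- "oxf:" in the href attribute is a URL scheme resolved by the Resource Manager -->
        <p:input name="config" href="oxf:/config/stylesheet.xsl"/>
        <p:input name="data" href="oxf:/data/input.xml"/>
        <p:output name="data" id="result"/>
    </p:processor>
</p:config>
```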

Having both uses in the same file may be confusing. But while you have to use the oxf scheme to access Orbeon Forms resources, you don't have to use the oxf prefix in XPL: you can use any prefix you like, for example:

<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline" xmlns:processor="http://www.orbeon.com/oxf/processors">
    <p:processor name="processor:xslt">
        ...
    </p:processor>
    ...
</p:config>
