Processors - Other Serializers

Scope

Serializers are processors with no XML output. A serializer, just like any processor, can access files, connect to databases, and take actions depending on its inputs. See also the HTTP serializer.

URL serializer

The URL serializer mirrors the functionality of the URL generator. Instead of reading from a URL, it writes its data input as XML to a URL.

NOTE: The oxf:, file: and http: protocols allow writing.

When using the oxf: protocol, the Filesystem and WebApp resource managers support write operations.

The URL serializer takes a config input with a single url element containing the URL to write to. The data input is serialized according to the rules of the XML serializer.

<p:processor name="oxf:url-serializer">
    <p:input name="config">
        <config>
            <url>oxf:/path/current.xml</url>
        </config>
    </p:input>
    <p:input name="data" href="#xml-data"/>
</p:processor>

File serializer

The File serializer decodes binary or text data encapsulated in XML documents and writes it to a file on disk. It can write to a file you choose or to a temporary file. When writing to a temporary file, read the data output to get the URL of the temporary file that was produced. That output consists of a url element containing the URL of the temporary file, such as: <url>file:/tmp/gaga.tmp</url>.

The configuration consists of the following elements:

| Applies when | Element name | Type | Purpose | Default value |
|---|---|---|---|---|
| Data is written to a file you specify | file | Absolute or relative path to a file | Specifies the file to write to. | none |
| Data is written to a file you specify | directory | Optional absolute path to a directory | Specifies the path relative to which the file element is resolved. | none |
| Data is written to a file you specify | url | oxf: or file: URL of the file | The file location can be specified either by a URL or by an OS-dependent file/directory path (since July 20th, 2012). | none |
| Data is written to a file you specify | append | boolean | If the file already exists: appends content to the file if true, or replaces the file if false. | false |
| Data is written to a file you specify | make-directories | boolean | If the file is located in a directory that doesn't exist: creates the necessary directories if true, or raises an error if false. | false |
| Data is written to a temporary file | scope | Can be request, session, or application | If set to request, the temporary file is removed at the end of the HTTP request. If set to session, the temporary file is removed when the session of the current user expires. If set to application, the temporary file is removed when the servlet is stopped (typically when the application server is stopped). | none |
| Data is written to a temporary file | proxy-result | boolean | Whether the resulting URL must be proxied. If false, the URL is a temporary server-side URL. If true, the URL is an absolute path which can be used from the web browser to retrieve the temporary file. The path does not contain the servlet context. This can only be true if the <scope> element is session. | false |
| | content-type | content type, without any attributes | Indicates the content type to use. | application/octet-stream for binary mode, text/plain for text mode |
| | force-content-type | boolean | Indicates whether the content type provided has precedence. This requires a content-type element. | false |
| | ignore-document-content-type | boolean | Indicates whether the content type provided by the input document should be ignored. | false |
| | encoding | valid encoding name | Indicates the text encoding to use. | utf-8 |
| | force-encoding | boolean | Indicates whether the encoding provided has precedence. This requires an encoding element. | false |
| | ignore-document-encoding | boolean | Indicates whether the encoding provided by the input document should be ignored. | false |
| | cache-control/use-local-cache | boolean | Whether the resulting stream must be locally cached. For documents or binaries that are large or known to change at every request, it is recommended to set this to false. | true |

Here is how you serialize a document produced by a pipeline to a file on disk:

<!-- Convert a document to serialized XML -->
<p:processor name="oxf:xml-converter">
    <p:input name="config">
        <config>
            <encoding>utf-8</encoding>
        </config>
    </p:input>
    <p:input name="data" href="#my-document"/>
    <p:output name="data" id="converted"/>
</p:processor>
<!-- Write the document to a file -->
<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <directory>build/doc/reference</directory>
            <file>single-file-doc.html</file>
            <make-directories>true</make-directories>
            <append>false</append>
        </config>
    </p:input>
    <p:input name="data" href="#converted"/>
</p:processor>

Note the use of the XML converter processor, which serializes the XML document produced by the pipeline to a textual representation of XML.
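The append element from the configuration table fits the same pattern when content should be added to an existing file rather than replace it. The following is a minimal sketch, not a definitive setup: it reuses the converted output from the example above, and the directory and file names are hypothetical.

<!-- Sketch: append to an existing file instead of replacing it -->
<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <!-- Hypothetical directory and file name -->
            <directory>build/logs</directory>
            <file>pipeline-output.txt</file>
            <make-directories>true</make-directories>
            <!-- true: keep existing content and write after it -->
            <append>true</append>
        </config>
    </p:input>
    <p:input name="data" href="#converted"/>
</p:processor>

Appending serialized XML several times yields a file holding several documents back to back, which is no longer a single well-formed XML document, so append is mostly suited to text or log-like output.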

Alternatively (since July 20th, 2012), you can write the document specifying the location by a file: URL:

<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <url>file:build/doc/reference/single-file-doc.html</url>
            <make-directories>true</make-directories>
            <append>false</append>
        </config>
    </p:input>
    <p:input name="data" href="#converted"/>
</p:processor>

Or using an oxf: URL (assuming for instance that the root directory for oxf: is build/doc):

<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <url>oxf:/reference/single-file-doc.html</url>
            <make-directories>true</make-directories>
            <append>false</append>
        </config>
    </p:input>
    <p:input name="data" href="#converted"/>
</p:processor>

Here is how you can copy a file specified with a URL from one location to another, using the URL generator and the File serializer:

<!-- Read original file -->
<p:processor name="oxf:url-generator">
    <p:input name="config">
        <config>
            <url>file:/my-image.jpg</url>
        </config>
    </p:input>
    <p:output name="data" id="image-data"/>
</p:processor>
<!-- Write to another file -->
<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <file>/my-copied-image.jpg</file>
        </config>
    </p:input>
    <p:input name="data" href="#image-data"/>
</p:processor>

In the following example, data is written to a temporary file which will be deleted when the session of the current user expires. The URL of the temporary file is returned by the File serializer through its data output.

<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <scope>session</scope>
        </config>
    </p:input>
    <p:input name="data" href="#data-to-write"/>
    <p:output name="data" id="url-written"/>
</p:processor>
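
When the web browser itself needs to retrieve the temporary file, the proxy-result element from the configuration table can be added. The following is a sketch under the constraints stated in that table (proxy-result can only be true with the session scope); the output id proxied-url is a hypothetical name.

<p:processor name="oxf:file-serializer">
    <p:input name="config">
        <config>
            <!-- proxy-result requires the session scope -->
            <scope>session</scope>
            <proxy-result>true</proxy-result>
        </config>
    </p:input>
    <p:input name="data" href="#data-to-write"/>
    <!-- Hypothetical id; receives the url element -->
    <p:output name="data" id="proxied-url"/>
</p:processor>

In this case the url element read from the data output should hold an absolute path usable from the browser, without the servlet context, rather than a server-side file: URL.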

Scope serializer

The Scope serializer can store documents into the application, session and request scopes. It works together with the Scope generator.

The Scope serializer has a config input in the following format:

<config>
    <key>cart</key>
    <scope>application|session|request</scope>
    <session-scope>application|portlet</session-scope>
</config>

key

The <key> element contains a string used to identify the document. The same key must be used to store and retrieve a document.

scope

The <scope> element specifies in what scope the document is to be stored. The available scopes are:

  • application - The application scope starts when the Web application is deployed. It ends when the Web application is undeployed. The application scope provides efficient storage for data that does not need to be persisted and that is common to all users. It is typically used to cache information (e.g. configuration data for the application read from a database).
  • session - The session scope is attached to a given user of the Web application. It is typically used to store information that does not need to be persisted and is specific to a given user, for example to cache the user's profile.
  • request - The request scope starts when an HTTP request is sent to the server. It ends when the corresponding HTTP response is sent back to the client. The request scope can be used to integrate an Orbeon Forms application with legacy J2EE servlets.

session-scope

The <session-scope> element specifies in what session scope the document is to be stored. This element is only allowed when the <scope> element is set to session. The available session scopes are:

  • application - access the entire application session. This is always a valid value.
  • portlet - access the local portlet session. This is only valid if the processor is run within a portlet.

If the element is missing, a default value is used: application when the processor runs within a servlet, and portlet when the processor runs within a portlet.

In addition to the config input, the Scope serializer has a data input receiving the document to store.
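
Putting the config and data inputs together, storing a document and reading it back could look like the following sketch. The processor names oxf:scope-serializer and oxf:scope-generator, the assumption that the Scope generator accepts the same config and exposes a data output, and the ids cart and stored-cart are not stated in this section and should be checked against your installation.

<!-- Store the document under the key "cart" in the session scope -->
<p:processor name="oxf:scope-serializer">
    <p:input name="config">
        <config>
            <key>cart</key>
            <scope>session</scope>
        </config>
    </p:input>
    <!-- Hypothetical id of the document to store -->
    <p:input name="data" href="#cart"/>
</p:processor>

<!-- Later, typically in another pipeline: read it back with the same key and scope -->
<p:processor name="oxf:scope-generator">
    <p:input name="config">
        <config>
            <key>cart</key>
            <scope>session</scope>
        </config>
    </p:input>
    <p:output name="data" id="stored-cart"/>
</p:processor>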

Since September 11, 2013: It is possible to delete a document by running the scope serializer with the "null document" as its data input for the scope and key of the document to delete. The "null document" is <null xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>.
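
As an illustration, deleting the document stored under the key cart in the session scope might be written as follows. This is a sketch that assumes the processor name oxf:scope-serializer and passes the null document inline as the data input.

<p:processor name="oxf:scope-serializer">
    <p:input name="config">
        <config>
            <!-- Same key and scope as the document to delete -->
            <key>cart</key>
            <scope>session</scope>
        </config>
    </p:input>
    <!-- The "null document" requests deletion instead of storage -->
    <p:input name="data">
        <null xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
    </p:input>
</p:processor>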

NOTE: The Session serializer, previously used, is now deprecated. Use the Scope serializer with session scope instead.

Null serializer

The Null serializer acts as a black hole. The data input is read and ignored. This processor is useful when a pipeline or a branch of a p:choose element doesn't have to return any document.

<p:processor name="oxf:null-serializer">
    <p:input name="data" href="#document"/>
</p:processor>
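
The p:choose case mentioned above can be sketched as follows: one branch writes the document to disk, while the other only consumes it. The ids options and converted, the XPath test, and the file path are hypothetical.

<p:choose href="#options">
    <!-- Hypothetical test on a hypothetical options document -->
    <p:when test="/options/save = 'true'">
        <p:processor name="oxf:file-serializer">
            <p:input name="config">
                <config>
                    <file>/tmp/saved-document.xml</file>
                </config>
            </p:input>
            <p:input name="data" href="#converted"/>
        </p:processor>
    </p:when>
    <p:otherwise>
        <!-- Nothing to produce in this branch: read the input and discard it -->
        <p:processor name="oxf:null-serializer">
            <p:input name="data" href="#converted"/>
        </p:processor>
    </p:otherwise>
</p:choose>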

Flushing the output stream

All serializers (XML, HTML, text, and FOP) will flush their output stream when they encounter the following processing instruction: <?oxf-serializer flush?> This instruction allows the browser to display a Web page incrementally. Incremental display is typically useful when sending large tables, or when the first part of a Web page can be displayed right away while the rest must wait for a time-consuming action to complete.
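
For illustration, the processing instruction is placed in the document sent to the serializer, at the point where the output produced so far should reach the browser. The page content below is a made-up example.

<html>
    <body>
        <h1>Report</h1>
        <p>The results are being computed.</p>
        <!-- Everything above this point is sent to the browser right away -->
        <?oxf-serializer flush?>
        <table>
            <!-- Rows that take a long time to produce -->
        </table>
    </body>
</html>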

Legacy HTTP serializers

NOTE: Use of these serializers should be replaced by converters connected to the HTTP serializer.
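
As a rough sketch of that replacement, an HTML document could first go through a converter and then through the HTTP serializer. The processor name oxf:html-converter, the cache-control element on the HTTP serializer config, and the ids html and converted are assumptions to verify against the HTTP serializer and converter documentation.

<!-- Convert the document to serialized HTML -->
<p:processor name="oxf:html-converter">
    <p:input name="config">
        <config>
            <encoding>utf-8</encoding>
        </config>
    </p:input>
    <p:input name="data" href="#html"/>
    <p:output name="data" id="converted"/>
</p:processor>
<!-- Send the converted document in the HTTP response -->
<p:processor name="oxf:http-serializer">
    <p:input name="config">
        <config>
            <cache-control>
                <use-local-cache>false</use-local-cache>
            </cache-control>
        </config>
    </p:input>
    <p:input name="data" href="#converted"/>
</p:processor>

The legacy serializers described in the rest of this section bundle both steps in a single processor.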

These serializers share a common functionality: writing their data input to an HTTP response. Typically, this means sending data back to a client web browser. This can be done in a Servlet environment or a Portlet environment. All share the same configuration, but differ in how they convert their input data. The following describes the common configuration, then the specifics for each serializer.

NOTE: When using the command-line mode, instead of sending the output through HTTP, the HTTP serializers send their output to the standard output. In such a case, the parameters that do not affect the content of the data, such as content-type, status-code, etc. are ignored.

All serializers send the cache control HTTP headers, including Last-Modified, Expires and Cache-Control. The content type and content length headers are also supported.

Configuration

The configuration consists of the following optional elements.

| Element | Purpose | Default |
|---|---|---|
| content-type | Content type sent to the client | Specific to each serializer |
| encoding | The default text encoding | utf-8 |
| status-code | HTTP status code sent to the client | SC_OK, or 200 |
| error-code | HTTP error code sent to the client | none |
| empty-content | Forces the serializer to return an empty content, without reading its data input | false |
| version | HTML or XML version number | 4.01 for HTML (ignored for XML, which always outputs 1.0) |
| public-doctype | The public doctype | "-//W3C//DTD HTML 4.01 Transitional//EN" for HTML, none otherwise |
| system-doctype | The system doctype | "http://www.w3.org/TR/html4/loose.dtd" for HTML, none otherwise |
| omit-xml-declaration | Specifies whether an XML declaration must be omitted | false for XML and HTML (i.e. a declaration is output by default), ignored otherwise |
| standalone | If true, specifies standalone="yes" in the document declaration. If false, specifies standalone="no" in the document declaration. | not specified for XML, ignored otherwise |
| indent | Specifies if the output is indented | true |
| indent-amount | Specifies the number of indentation spaces | 1 |
| cache-control/use-local-cache | Whether the resulting stream must be locally cached. For documents or binaries that are large or known to change at every request, it is recommended to set this to false. | true |
| header | Adds a custom HTTP header to the response. The nested elements name and value contain the name and value of the header, respectively. You can add multiple headers. | none |

<config>
    <content-type>text/html</content-type>
    <status-code>200</status-code>
    <empty-content>false</empty-content>
    <error-code>0</error-code>
    <version>4.01</version>
    <public-doctype>-//W3C//DTD HTML 4.01//EN</public-doctype>
    <system-doctype>http://www.w3.org/TR/html4/strict.dtd</system-doctype>
    <omit-xml-declaration>false</omit-xml-declaration>
    <standalone>true</standalone>
    <encoding>utf-8</encoding>
    <indent-amount>4</indent-amount>
    <cache-control>
        <use-local-cache>true</use-local-cache>
    </cache-control>
    <header>
        <name>Content-Disposition</name>
        <value>attachment; filename=image.jpeg;</value>
    </header>
</config>

XML serializer

This serializer writes XML text. The output is indented with no spaces and encoded using the UTF-8 character set. The default content type is application/xml.

<p:processor name="oxf:xml-serializer">
    <p:input name="config">
        <config>
            <content-type>text/vnd.wap.wml</content-type>
        </config>
    </p:input>
    <p:input name="data" href="#wml"/>
</p:processor>

HTML serializer

The HTML serializer's output follows the semantics of the XSLT html output method. By default, the doctype is set to HTML 4.01 Transitional, and the content is indented with no spaces and encoded using the UTF-8 character set. The default content type is text/html. The following is a simple HTML serializer example:

<p:processor name="oxf:html-serializer">
    <p:input name="config">
        <config/>
    </p:input>
    <p:input name="data" href="#html"/>
</p:processor>

NOTE: The XML 1.0 Specification prohibits a DOCTYPE definition with a Public ID and no System ID.

Text serializer

The Text serializer's output follows the semantics of the XSLT text output method. The output is encoded using the UTF-8 character set. This serializer is typically useful for pipelines generating Comma Separated Value (CSV) files. The default content type is text/plain.

<p:processor name="oxf:text-serializer">
    <p:input name="config">
        <config/>
    </p:input>
    <p:input name="data" href="#text"/>
</p:processor>
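
For the CSV use case, the common configuration elements described earlier can be combined so that the browser offers the result as a download. The following is only a sketch: the text/csv content type, the file name export.csv, and the id csv are illustrative choices.

<p:processor name="oxf:text-serializer">
    <p:input name="config">
        <config>
            <!-- Override the default text/plain content type -->
            <content-type>text/csv</content-type>
            <encoding>utf-8</encoding>
            <!-- Hypothetical file name suggested to the browser -->
            <header>
                <name>Content-Disposition</name>
                <value>attachment; filename=export.csv</value>
            </header>
        </config>
    </p:input>
    <p:input name="data" href="#csv"/>
</p:processor>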