URL rewriting

This page is obsolete and available for historical purposes only.

Why do you need URL rewriting?

As a web application developer, you often have to write URLs in your web pages. For example, an HTML <a> element contains an href attribute specifying the destination of a link. The URLs you write usually reach the web browser unmodified. In many application platforms, this causes a number of problems:

Problems in a servlet environment

Absolute URLs (starting with a scheme such as http: or https:) are usually reserved to refer to external sites or applications. But when referring to the current application, relative URLs, in the form of relative paths or absolute paths, are commonly used instead. In these cases, you must make sure that the URL interpreted by the Web browser and the application server refers to the correct page or resource:

  • Using relative paths: such paths are interpreted by the web browser as relative to a URL base, usually the URL of the page being requested, unless specified differently. For example:
    • The browser requests /orbeon/example1/page1
      • Relative path page2 → absolute path /orbeon/example1/page2
      • Relative path ../page3 → absolute path /orbeon/page3
      • Relative path ../example2/page4 → absolute path /orbeon/example2/page4
    • The problem: if a page is moved in the hierarchy of pages, all the relative paths within that page have to be changed.
  • Using absolute paths: such paths start with a "/". The issue with this solution is that you have to write the exact absolute path, including a servlet context path such as /orbeon.
    • The problem: hard-coding the servlet context path in every URL makes it impossible to change the servlet context without changing all the URLs in the application. To avoid this, you might fall back on relative paths, which bring back the problem mentioned above.
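
The relative path resolution listed above can be reproduced with a short sketch. It uses Python's standard urllib.parse.urljoin as a stand-in for the browser's resolution algorithm; the host example.org and the helper name resolve_path are assumptions made for illustration only.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder host (an assumption for illustration); only the path matters here.
BASE = "http://example.org/orbeon/example1/page1"

def resolve_path(relative_path: str, base: str = BASE) -> str:
    """Resolve a relative path against the base URL and return the absolute path."""
    return urlsplit(urljoin(base, relative_path)).path

for relative in ("page2", "../page3", "../example2/page4"):
    print(relative, "->", resolve_path(relative))
```

Moving the page to another level of the hierarchy changes the base, and therefore the result of every relative path on that page.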

Problems in a portlet environment

The issue is even more important with Java Portlets (JSR-168/286), as URLs must be generated by calling a specific Java API. With page template languages such as JSP, this is done using tag libraries. All the pages in an application must be modified when moved from a deployment as a servlet to a deployment as a portlet.

Solution

With Orbeon Forms, the solution to these issues is called URL rewriting. What this means is that Orbeon Forms transparently processes the URLs you write in order to make your life easier. This is possible thanks to the Orbeon Forms Page Flow Controller epilogue, which contains a URL rewriting mechanism for HTML and XHTML documents.



Using URL rewriting

Enabling URL rewriting

That's the easy part: URL rewriting is enabled automatically in Orbeon Forms. Just write a plain HTML, XHTML or XForms document, and URLs are rewritten appropriately!

Writing URLs

With Orbeon Forms, it is recommended you write URLs as:

  • Absolute URLs, when referring to external resources:

    <a href="http://www.google.com/">Go to Google</a>
  • Relative paths, for resources that are likely to remain fixed relative to the page containing the URL:

    <img src="../images/back.png"/>
    <a href="edit-page">Go to the Edit Page</a>
  • Absolute paths (without the servlet context path) in all other cases:

    <img src="/images/back.png"/>
    <a href="/apps/my-app/edit-page">Go to the Edit Page</a>

Controlling URL rewriting

In certain cases, you may want to disable URL rewriting. This is done through an extension attribute:

<a href="/my/path" f:url-norewrite="true">Follow Me</a>

The f:url-norewrite attribute is supported on:

  • <xhtml:a>
  • <xforms:submission> in "optimized" mode in a servlet environment, to allow accessing other servlet contexts
  • <xforms:submission> with GET method and replace="all"
  • <xforms:load>
  • <xforms:output> with xxforms:download appearance or image/* mediatype

With portlets, you may want to control the type of URL instead of using the default URL type. You control this with an extension attribute (currently only supported on <a href>):

<a href="/my/path" f:url-type="action">Follow Me</a>

Without f:url-type="action", the URL would be handled by Orbeon Forms as a "render" URL (see URL types below).

The f:url-type attribute is supported on:

  • <xhtml:a>: supports "resource | action | render"
  • <xforms:load>: supports "resource"
  • <xforms:submission>: supports "resource" for non-optimized submissions

For both attributes, you have to make sure that the namespace xmlns:f="http://orbeon.org/oxf/xml/formatting" is declared. You can declare it once and for all on the root element of your document.

NOTE: In the future, support for these attributes should be expanded.


Types of URLs

Orbeon Forms knows of the following URL types:
  • Resource URLs
    • client-facing resources including images, CSS, and JavaScript resources
  • Render URLs
    • client-facing render URLs
    • this term is more commonly used in a portlet environment
  • Action URLs
    • client-facing action URLs
    • this term is more commonly used in a portlet environment
  • Service URLs
    • server-facing URLs used in general to reach services
[TODO: elaborate]


URL rewriting in HTML/XHTML markup

This section describes the default URL rewriting implementation in Orbeon Forms. It is implemented in the processors oxf:xhtml-rewrite and oxf:html-rewrite.

NOTE: In plain HTML/XHTML markup, there is no production of service URLs because all URLs are understood to be client-facing.

What gets rewritten?

The following element/attribute combinations are handled, whether in HTML or XHTML markup:

  • Action URLs
    • area
      • @href
    • form
      • @action
      • @method
        • NOTE: In a portlet environment, if no @method attribute is supplied, an HTTP POST is forced, because the Portlet specification recommends submitting forms with POST. If a method is supplied, the method is left unchanged.
  • Render URLs
    • a
      • @href: render URL
        • if f:url-type is specified, a resource or action URL can be produced
        • f:portlet-mode and f:window-state are also supported
  • Resource URLs
    • input[@type='image']
      • @src
    • link
      • @href
    • img
      • @src
    • frame
      • @src
    • iframe
      • @src
    • script
      • @src
    • td
      • @background
    • body
      • @background
    • div
      • @href (non-standard HTML, but useful e.g. for Dojo content)
    • object
      • @codebase
      • @classid
      • @data
      • @usemap
      • @archive
    • applet
      • @codebase
      • @archive
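
The list above can be read as a lookup table from element/attribute pairs to a default URL type. The following is a minimal sketch of that reading, not the actual implementation of oxf:xhtml-rewrite; the names DEFAULT_URL_TYPES and default_url_type are hypothetical.

```python
from typing import Optional

# Default URL type per (element, attribute) pair, transcribed from the list above.
DEFAULT_URL_TYPES = {
    ("area", "href"): "action",
    ("form", "action"): "action",
    ("a", "href"): "render",
    ("input", "src"): "resource",  # only for input[@type='image']
    ("link", "href"): "resource",
    ("img", "src"): "resource",
    ("frame", "src"): "resource",
    ("iframe", "src"): "resource",
    ("script", "src"): "resource",
    ("td", "background"): "resource",
    ("body", "background"): "resource",
    ("div", "href"): "resource",
    ("object", "codebase"): "resource",
    ("object", "classid"): "resource",
    ("object", "data"): "resource",
    ("object", "usemap"): "resource",
    ("object", "archive"): "resource",
    ("applet", "codebase"): "resource",
    ("applet", "archive"): "resource",
}

def default_url_type(element: str, attribute: str,
                     input_type: Optional[str] = None) -> Optional[str]:
    """Return the default URL type, or None if the attribute is not rewritten."""
    if element == "input" and input_type != "image":
        return None
    return DEFAULT_URL_TYPES.get((element, attribute))
```

The sketch only covers defaults: an f:url-type attribute on <a> can select a different type, and the form/@method handling in portlets is not modeled.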

The f:url-norewrite attribute can be used to disable URL rewriting for a sub-tree of the document:
  • f:url-norewrite="true": disables rewriting
  • f:url-norewrite="false": enables rewriting (default)
NOTES:
  • Absolute URLs (starting with a scheme such as http: or https:) are always left unmodified.
  • The special case of URLs starting with a query string (e.g. ?name=value) is handled. This syntax is supported by most web browsers and is convenient, so the rewriting algorithm supports it as well.
  • URLs are parsed by the rewriting algorithm, so you should make sure that URLs are well-formed.
  • When using XHTML markup, you have to make sure your elements are in the regular XHTML (http://www.w3.org/1999/xhtml) namespace, or they won't be rewritten.
  • Rewriting also occurs on some XForms elements. In that case, rewriting is handled by the XForms engine. For details, refer to the following section.

Rewriting in a servlet environment

All the attributes are rewritten as follows:
  • If the URL is absolute (starting with a scheme such as http: or https:), it is left unchanged.
  • If the URL is a relative path (not starting with a "/"), it is left unchanged.
  • If the URL is an absolute path, the URL is rewritten:
    • usually, this means that the servlet context path is prepended
    • in some cases, like separate deployment, a different context is prepended depending on the type of resources
    • with versioned resources, a version number might be added for resource URLs
[TODO: examples]
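
As a minimal sketch of the three rules above, assuming the simplest case where the servlet context path is prepended (separate deployment and versioned resources are deliberately left out). The function name rewrite_servlet_url is hypothetical, and /orbeon is the context path used in the examples on this page.

```python
from urllib.parse import urlsplit

def rewrite_servlet_url(url: str, context_path: str = "/orbeon") -> str:
    """Sketch of the default servlet rule: only absolute paths are rewritten."""
    if urlsplit(url).scheme:
        # Absolute URL (http:, https:, ...): left unchanged.
        return url
    if not url.startswith("/"):
        # Relative path, including URLs starting with a query string: left unchanged.
        return url
    # Absolute path: prepend the servlet context path.
    return context_path + url

print(rewrite_servlet_url("/apps/my-app/edit-page"))
```

With this rule, the same markup keeps working when the application is deployed under a different context path, since only the context_path value changes.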

Rewriting in a portlet environment

With portlets, you write your URLs as you would in a regular servlet-based application, and the rewriting processors take care of calling the Portlet API to encode all relevant URLs. Portlets make a distinction between several URL types:

  • Render URLs
    • rewritten using the Portlet API RenderResponse.createRenderURL()
    • the result is a render URL targeting the current portlet
  • Action URLs
    • rewritten using the Portlet API RenderResponse.createActionURL()
    • the result is an action URL targeting the current portlet
  • Resource URLs
    • rewritten using the Portlet API RenderResponse.createResourceURL()
    • the result is a resource URL targeting the current portlet

Orbeon Forms rewrites URLs to these different types based on the HTML or XHTML attribute names.

NOTE: As of 2009, Orbeon Forms assumes a Portlet 2 implementation supporting JSR-286.

Some special handling is performed within <script> elements: in the script text, occurrences of the string wsrp_rewrite_ are replaced with the Portlet namespace as obtained by the Portlet API method RenderResponse.encodeNamespace(null). This allows you to write scripts with access to namespaced identifiers produced by your portlet container.
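
The effect of this replacement can be illustrated with a plain string substitution. This is only a sketch: the namespace value p_1234_ is made up, since the real value is whatever the portlet container returns, and the function name namespace_script is hypothetical.

```python
def namespace_script(script_text: str, portlet_namespace: str) -> str:
    """Replace every occurrence of the wsrp_rewrite_ marker with the portlet namespace."""
    return script_text.replace("wsrp_rewrite_", portlet_namespace)

script = "function wsrp_rewrite_init() { return document.getElementById('wsrp_rewrite_form'); }"
# "p_1234_" is an invented namespace used for illustration only.
print(namespace_script(script, "p_1234_"))
```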

Since portlets do not have a concept of path, URL paths are encoded as a special portlet parameter named orbeon.path. Relative paths are resolved against the current path stored in orbeon.path. The following illustrates action URL and render URL rewriting:

Initial path → resulting portlet parameters:
  • /example1/page1?name1=value1&name2=value2
    • orbeon.path=/example1/page1
    • name1=value1
    • name2=value2
  • ?name1=value1&name2=value2
    • name1=value1
    • name2=value2

Assuming the current value of orbeon.path is /example1/page1:
  • ../example2/page2?name1=value1&name2=value2
    • orbeon.path=/example2/page2
    • name1=value1
    • name2=value2
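
The mapping from a path to portlet parameters can be sketched as follows. This is an illustration under stated assumptions, not the actual Orbeon Forms code: the function name to_portlet_parameters is hypothetical, and the sketch follows the examples above in emitting no orbeon.path parameter when the URL consists of a query string only.

```python
from typing import List, Tuple
from urllib.parse import parse_qsl, urljoin, urlsplit

def to_portlet_parameters(url: str, current_path: str) -> List[Tuple[str, str]]:
    """Split a path-based URL into an orbeon.path parameter plus query parameters."""
    parts = urlsplit(url)
    parameters = []
    if parts.path:
        # Relative paths are resolved against the current value of orbeon.path.
        parameters.append(("orbeon.path", urljoin(current_path, parts.path)))
    parameters.extend(parse_qsl(parts.query))
    return parameters

print(to_portlet_parameters("../example2/page2?name1=value1&name2=value2", "/example1/page1"))
```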



URL rewriting in XForms markup

Similarities and differences

The goal behind URL rewriting in XForms is similar to that of URL rewriting in HTML/XHTML: making your life easier!

Here are the similarities and differences between XForms documents and pure HTML/XHTML documents:
  • Author-specified URLs present on XHTML elements
    • go through the XForms process without modification and are rewritten by the regular XHTML rewriting process described above
    • except that Attribute Value Templates (AVTs) on XHTML elements are processed by the XForms engine
  • Author-specified URLs present on XForms elements
    • are processed by the XForms engine (see details below)
  • In addition:
    • The XForms processor outputs links to some internal resources not specified by the form author:
      • JavaScript
      • CSS
      • some images such as
        • /ops/images/xforms/spacer.gif
        • /ops/images/xforms/help.png
        • /ops/images/xforms/calendar.png
    • Certain controls can output resources or HTML markup containing URLs
      • xforms:output[@appearance = 'xxforms:download' or @mediatype = ('text/html', 'image/*')]
      • xforms:label/hint/help/alert may contain dynamically-produced HTML

Service URLs

  • xforms:instance
    • the src or resource attribute links to an external instance definition

      <xforms:instance src="/my-app/initial-data.xml"/>
    • This feature allows for improved modularity by separating an XForms instance definition from an XHTML page. It also allows for producing XForms instances dynamically.
    • NOTE: if the instance to load is a static file on the server, it is advised for performance reasons to specify an absolute URL instead, e.g.:

      <xforms:instance src="oxf:/apps/my-app/initial-data.xml"/>
  • xforms:model
    • the schema attribute links to an external schema definition

      <xforms:model schema="/my-app/schema.xsd">
          ...
      </xforms:model>
    • NOTE: if the schema to load is a static file on the server, it is advised for performance reasons to specify an absolute URL instead, e.g.:

      <xforms:model schema="oxf:/apps/my-app/schema.xsd">
          ...
      </xforms:model>
  • xforms:submission (except with the GET method and replace="all" combination)
    • the action or resource attribute specifies the URL of the submission

      <xforms:submission id="my-submission" method="post" resource="/my-app/service/save" replace="instance" .../>
    • NOTE: if the submission uses the GET method and loads a static file on the server, it is advised for performance reasons to specify an absolute URL instead, e.g.:

      <xforms:submission id="my-submission" method="get" resource="oxf:/apps/my-app/service/read" replace="instance" .../>

Instances, submissions and schemas are loaded as service URLs:

  • They typically address services running on the application server running Orbeon Forms or another server.
  • The base URL against which service URLs are resolved can be configured separately using the oxf.url-rewriting.service.base-uri property.
    • If the base URL is an absolute path instead of an absolute URL, the scheme, server name and port are obtained from the current request.

Render URLs

  • xforms:load
    • the resource attribute specifies a URL that must be loaded on the client after the action completes

      <xforms:load resource="/fr/orbeon/builder/summary"/> 
  • xforms:submission with GET method and replace="all"
    • the action or resource attribute specifies the URL of the submission

      <xforms:submission id="my-submission" method="get" resource="/my-app/page2" replace="all" .../>
  • xhtml:a
    • the href attribute specifies a URL that must be loaded on the client when the user activates the anchor
      • this is handled by XForms when:
        • the attribute is specified as an AVT, e.g. <a href="{/my/link}">
        • the element is dynamically produced by xforms:output with HTML mediatype
        • the element is a descendant of xforms:label, xforms:help, xforms:hint, or xforms:alert

Resource URLs

  • xforms:output with image mediatype
    • the single-node binding points to the URL of an image to load on the client

      <xforms:output bind="my-image" mediatype="image/*"/>

  • xhtml:img
    • the src attribute specifies a URL that must be loaded on the client
      • this is handled by XForms when:
        • the attribute is specified as an AVT, e.g. <img src="{/my/image}">
        • the element is dynamically produced by xforms:output with HTML mediatype
        • the element is a descendant of xforms:label, xforms:help, xforms:hint, or xforms:alert
  • xforms:message, xforms:label, xforms:help, xforms:hint, and xforms:alert
    • these elements may use an src attribute to refer to external content.
    • NOTE: XForms 1.1 moves responsibility for the src attribute on these elements to the host language. It is advised not to use this attribute.

As with XHTML, absolute URLs are not rewritten.

NOTE: As of October 2009, as described above, only the HTML <a href> and <img src> attributes are processed specially by the XForms engine. This is a current limitation: ideally, all the HTML elements described in the previous section should be processed.

xml:base resolution

All non-absolute URLs processed by the XForms engine are resolved relative to a base URI. The base URI is, by default, the path of the request URL that caused the processing of the XForms page, with special handling of the servlet context if necessary. It is also possible to override this behavior by adding an explicit xml:base attribute on an XForms element or any ancestor element. Processing goes as follows:
  • All URLs first go through xml:base resolution to produce an absolute URL or an absolute path.
  • Then the URLs are rewritten according to rules depending on the type of URL.
NOTE: As of October 2009, no xml:base attribute processing is performed in plain HTML/XHTML documents that do not contain any XForms, with the exception of the oxf:xinclude processor which performs such resolution for purposes of resolving inclusion URLs, and adds xml:base attributes to its output as per the XInclude specification.
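
The first of the two steps above, xml:base resolution, can be sketched as a chain of URL joins. This assumes standard XML Base behavior, where each xml:base attribute is resolved against the base in effect on its parent. The function name resolve_with_xml_base and the sample paths are hypothetical, and the special handling of the servlet context is not modeled.

```python
from typing import Sequence
from urllib.parse import urljoin

def resolve_with_xml_base(url: str, request_path: str,
                          xml_bases: Sequence[str] = ()) -> str:
    """Resolve url against the request path, refined by xml:base attributes.

    xml_bases lists the xml:base values from the outermost ancestor
    down to the element carrying the URL.
    """
    base = request_path
    for xml_base in xml_bases:
        base = urljoin(base, xml_base)
    return urljoin(base, url)

# Default: the base is the path of the request URL.
print(resolve_with_xml_base("instance.xml", "/app/page1"))
# With an explicit xml:base on an ancestor element.
print(resolve_with_xml_base("instance.xml", "/app/page1", ["/services/"]))
```

The result is an absolute URL or an absolute path, which then goes through the type-specific rewriting of the second step.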

Examples of service URL resolution

Assume the following:
  • URL requesting the current page: http://a.org/orbeon/app/page1
  • Orbeon Forms deployment context: /orbeon

The entries below are grouped by the value of the oxf.url-rewriting.service.base-uri property and read: initial URL → resolved URL.

Property blank or missing:
  • http://b.com/instance → http://b.com/instance
    • Absolute URL is left untouched.
  • /instance → http://a.org/orbeon/instance
    • Absolute path resolves against the servlet context.
  • page2/instance → http://a.org/orbeon/app/page2/instance
    • Relative path resolves against the request URL.

Property set to http://example.org/my/context:
  • http://b.com/instance → http://b.com/instance
    • Absolute URL is left untouched.
  • /instance → http://example.org/my/context/instance
    • Absolute path resolves against the service base URL.
    • Works as if the path /my/context was a servlet context.
  • page2/instance → http://example.org/my/context/app/page2/instance
    • The relative path resolves against the request path then against the service base URL.
    • Works as if the path /my/context was a servlet context.

Property set to /my/context:
  • http://b.com/instance → http://b.com/instance
    • Absolute URL is left untouched.
  • /instance → http://a.org/my/context/instance
    • Absolute path resolves against the service base URL.
    • The scheme/host/port come from the request.
    • Works as if the path /my/context was a servlet context.
  • page2/instance → http://a.org/my/context/app/page2/instance
    • The relative path resolves against the request path then against the service base URL.
    • The scheme/host/port come from the request.
    • Works as if the path /my/context was a servlet context.
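
The servlet examples above are consistent with the following sketch. It is inferred from those examples and is not the actual Orbeon Forms code: the function name resolve_service_url and its parameters are hypothetical, and service_base stands for the value of the oxf.url-rewriting.service.base-uri property.

```python
from urllib.parse import urljoin, urlsplit

def resolve_service_url(url: str, request_url: str, context_path: str,
                        service_base: str = "") -> str:
    """Resolve a service URL the way the examples above suggest."""
    if urlsplit(url).scheme:
        # Absolute URL: left untouched.
        return url
    request = urlsplit(request_url)
    # Request path relative to the servlet context, e.g. /app/page1.
    path = request.path
    if path.startswith(context_path):
        path = path[len(context_path):]
    # Relative paths resolve against the request path; absolute paths stay as-is.
    resolved_path = urljoin(path, url)
    # A blank property means the servlet context acts as the base.
    base = service_base or context_path
    if not urlsplit(base).scheme:
        # The base is an absolute path: scheme, host and port come from the request.
        base = f"{request.scheme}://{request.netloc}{base}"
    return base.rstrip("/") + resolved_path

print(resolve_service_url("page2/instance", "http://a.org/orbeon/app/page1",
                          "/orbeon", "/my/context"))
```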

[TODO: portlet examples]

Examples of render URL resolution

The entries below read: initial URL → resolved URL.
  • http://www.google.com/ → http://www.google.com/
    • Absolute URL is left untouched.
  • /app/page2 → http://a.org/orbeon/app/page2
    • Absolute path resolves against the servlet context.
  • ../app2/page1 → http://a.org/orbeon/app2/page1
    • Relative path resolves against the request URL.


[TODO: portlet examples]

Examples of resource URL resolution

[TODO: servlet examples]
[TODO: portlet examples]



URL rewriting in separate deployment

What's different

In separate deployment, the XForms engine has to perform extra rewriting work:
  • The URLs you write in your application must point to your own web application.
  • URLs produced by Orbeon Forms for its internal use, such as CSS, JavaScript and image resources, must point to the Orbeon Forms web application.
In general, this is transparent to the application author.

Versioned resources

This is an Orbeon Forms PE feature.

NOTE: Since Orbeon Forms 3.9, versioned resources are enabled by default in Orbeon Forms PE.

When enabling versioned resources (see also the versioned resources documentation), resource URLs can be further rewritten by Orbeon Forms to include a version number.

In separate deployment, versioned resources are enabled only for Orbeon Forms resources. Your application resources are not versioned.


Known limitations

Portlets

  • The input document should not contain:

    • Elements and attributes containing the string wsrp_rewrite
    • Namespace URIs containing the string wsrp_rewrite
    • Processing instructions containing the string wsrp_rewrite
  • It is not possible to specify:

    • A destination portlet mode
    • A destination window state
    • A secure URL


Implementation notes

These notes are intended for Orbeon Forms contributors:
  • ServletExternalContext returns the original values returned by the servlet API, except in case of inclusion, where the request attributes specifying the location of the included servlet are used.
  • oxf:xforms-to-xhtml produces output which contains non-rewritten URLs.
  • Rewriting of the XHTML resulting from XForms processing is performed by oxf:xhtml-rewrite.
  • For Ajax responses, rewriting is performed separately for URLs (images, load, GET submission) and HTML output.
  • The servlet rewriting algorithm is implemented in URLRewriterUtils.
    • This uses the fact that a request was forwarded, or that an Orbeon filter was present, to alter the result.
  • When the Orbeon filter is in use, it is expected that "everything" goes through the filter:
    • requests for Orbeon CSS, JavaScript and image resources
    • requests for /xforms-server and /xforms-server-submit
      • NOTE: later, this constraint could be relaxed (assuming the session is still forwarded)