Processors - Zip and Unzip

Zip processor

The Zip processor compresses a set of files you specify by URI. Typically you will use the Zip processor to compress a number of temporary files that you have previously created.
  • The data input lists the files to compress inside a <files> element:
    • For each file it contains a <file> element.
    • On <file>, the name attribute specifies the file name to be used in the zip archive.
    • The content of <file> is the URI to the file, e.g. file:///tmp/somefile.txt. The file: and oxf: protocols are currently supported.
  • The <files> element accepts the following optional attributes which may be used by the consumer:
    • file-name
    • status-code
  • The processor generates the zip file on the data output as a binary document.

<p:processor name="oxf:zip">
    <p:input name="data">
        <files file-name="">
            <file name="file1.txt">file:///tmp/somefile.txt</file>
            <file name="dir/file2.txt">file:///tmp/someotherfile.txt</file>
        </files>
    </p:input>
    <p:output name="data" id="zip"/>
</p:processor>
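When the list of files is not known in advance, the data input could be built dynamically rather than written inline. The following is a minimal sketch, not a definitive recipe: it reuses the inline transform="oxf:xslt" pattern that this page applies to the URL generator, and it assumes a hypothetical #instance document that holds one <file> element per temporary file, with a name attribute and the URI as text content. The xsl prefix is assumed to be declared on the enclosing pipeline.

```xml
<p:processor name="oxf:zip">
    <!-- Hypothetical #instance: <instance><file name="...">file:///...</file>...</instance> -->
    <p:input name="data" transform="oxf:xslt" href="#instance">
        <files xsl:version="2.0">
            <!-- Emit one <file> entry per file listed in the instance -->
            <xsl:for-each select="/instance/file">
                <!-- name becomes the entry name in the archive; the text content is the source URI -->
                <file name="{@name}"><xsl:value-of select="."/></file>
            </xsl:for-each>
        </files>
    </p:input>
    <p:output name="data" id="zip"/>
</p:processor>
```

As with a static list, each generated URI must use one of the supported protocols (file: or oxf:).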

Unzip processor

The Unzip processor uncompresses a zip file. Each file in the zip file is uncompressed to a temporary file, which is deleted at the end of the request.
  • The data input is the zip file as a binary document.
  • The data output is an XML document which lists the files that have been uncompressed.
For instance, assuming the zip file is uploaded to a service and its location is stored in the instance as a URI, you first read the file with the URL generator and then uncompress it:

<p:processor name="oxf:url-generator">
    <p:input name="config" transform="oxf:xslt" href="#instance">
        <config xsl:version="2.0">
            <xsl:value-of select="/instance/file"/>
        </config>
    </p:input>
    <p:output name="data" id="zip"/>
</p:processor>

<p:processor name="oxf:unzip">
    <p:input name="data" href="#zip"/>
    <p:output name="data" id="zip-file-list"/>
</p:processor>

The format of the unzip data output is similar to the format of the zip data input, but you get two additional pieces of information about each file: its size (size attribute) and its date and time (dateTime attribute). For instance, the output can look as follows:

<files>
    <file name="file1.txt" size="123" dateTime="2007-09-11T18:23:04">file:///tmp/somefile.txt</file>
    <file name="dir/file2.txt" size="456" dateTime="2007-09-11T18:23:04">file:///tmp/someotherfile.txt</file>
</files>
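Because the file list is plain XML, a later pipeline step can query it with XSLT. The sketch below is only an illustration: it assumes the root element of the list is <files>, as in the Zip processor input, and the <summary>, <count>, <total-size> and <names> elements and the summary output id are hypothetical names chosen for this example. The xsl prefix is assumed to be declared on the enclosing pipeline.

```xml
<p:processor name="oxf:xslt">
    <!-- The file list produced by the Unzip processor -->
    <p:input name="data" href="#zip-file-list"/>
    <p:input name="config">
        <summary xsl:version="2.0">
            <!-- Number of entries extracted from the archive -->
            <count><xsl:value-of select="count(/files/file)"/></count>
            <!-- Sum of the size attributes of all entries -->
            <total-size><xsl:value-of select="sum(/files/file/@size)"/></total-size>
            <!-- Entry names as stored in the archive, comma-separated -->
            <names><xsl:value-of select="/files/file/@name" separator=", "/></names>
        </summary>
    </p:input>
    <p:output name="data" id="summary"/>
</p:processor>
```

Keep in mind that the URIs in the list point to temporary files that are deleted at the end of the request, so any content you need must be read or copied before then.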