Create an XML manifest document describing the contents of an archive file.
<p:declare-step type="p:archive-manifest">
  <p:input port="source" primary="true" content-types="any" sequence="false"/>
  <p:output port="result" primary="true" content-types="application/xml" sequence="false"/>
  <p:option name="format" as="xs:QName?" required="false" select="()"/>
  <p:option name="override-content-types" as="array(array(xs:string))?" required="false" select="()"/>
  <p:option name="parameters" as="map(xs:QName, item()*)?" required="false" select="()"/>
  <p:option name="relative-to" as="xs:anyURI?" required="false" select="()"/>
</p:declare-step>
The p:archive-manifest step creates an XML manifest document describing the contents of the archive file appearing on its source port (for instance a ZIP file).
Ports:
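Port | Type | Primary? | Content types | Seq? | Description |
---|---|---|---|---|---|
source | input | true | any | false | The archive file to create the manifest for. |
result | output | true | application/xml | false | The created XML manifest document. See the p:archive step for a description of its format. |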
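Options:
Name | Type | Required? | Default |
---|---|---|---|
format | xs:QName? | false | () |
override-content-types | array(array(xs:string))? | false | () |
parameters | map(xs:QName, item()*)? | false | () |
relative-to | xs:anyURI? | false | () |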
The p:archive-manifest step takes an archive file (for instance a ZIP file) on its source port and returns on its result port an XML document describing the contents of the archive: the archive manifest. The archive manifest format is described in the p:archive step.
Archive manifests can be used in several ways. Some examples:
- To inspect which files are present in an archive, for instance to check whether what you’ve got is complete.
- As an input manifest for p:archive. This step takes, on its manifest port, a manifest like the one produced by p:archive-manifest and uses this to create a new archive or update an existing one. You could for instance first get a manifest using p:archive-manifest, change it to reflect the changes you need and then feed it to p:archive to produce a new archive.
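The inspection use case can be sketched as a small pipeline. This is a minimal sketch, not a prescribed approach: the entry name reference.xml, the ex namespace and the error code ex:missing-entry are all hypothetical. It raises an error when the expected entry is absent and passes the manifest through unchanged otherwise.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:ex="http://example.org/ns/archive-check"
                version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Describe the archive on the source port as a c:archive document. -->
  <p:archive-manifest/>

  <!-- Hypothetical completeness check: the manifest must list reference.xml.
       When the test is false, p:if passes the manifest through unchanged. -->
  <p:if test="empty(/c:archive/c:entry[@name = 'reference.xml'])">
    <p:error code="ex:missing-entry"/>
  </p:if>

</p:declare-step>
```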
Archives come in many formats. The only format the p:archive-manifest step is required to handle is ZIP. However, depending on the XProc processor used, other formats may also be processed.
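When the format is known in advance, it can be stated explicitly through the format option. The following is a minimal sketch, assuming that zip is the value your processor expects for ZIP archives; values for any other formats depend on the processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <!-- State the archive format explicitly (assumed value: zip). -->
  <p:archive-manifest format="zip"/>

</p:declare-step>
```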
One of the things the p:archive-manifest step does is determine the content-type (MIME type) of the archive entries. This is usually done based on the filename/extension. The result is recorded in the manifest’s c:entry/@content-type attribute.
Sometimes it is useful to override this mechanism and assign specific content-types to some of the entries. For instance, the files Microsoft Office produces (.docx, .xlsx, etc.) are archives with a lot of XML documents inside. Some of these documents have the extension .rels and would therefore not be recognized as XML documents. The override-content-types option makes it possible to adjust this behavior.
The value of the override-content-types option must be an array of arrays. The inner arrays must have exactly two members:
- The first member must be an XPath regular expression.
- The second member must be a valid MIME content-type.
With this option set, an archive entry’s content-type is determined as follows:
1. The inner arrays of the override-content-types option value are processed in order of appearance (so order is significant).
2. The XPath regular expression (in the first member of the inner array) is matched against the full path of an entry in the archive (as in matches($path-in-archive, $regular-expression)).
3. If a match is found, the content-type (the second member of the inner array) is used as the entry’s content-type.
4. If none of the inner arrays produces a match, the normal mechanism for determining the content-type is used.
For example: setting the override-content-types option to [ ['\.rels$', 'application/xml'], ['^special/', 'application/octet-stream'] ] means that all files ending with .rels will get the content-type application/xml. All files in the archive’s special directory (including sub-directories) will get the content-type application/octet-stream. See also the Overriding content types example.
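Because the inner arrays are processed in order, a specific pattern should come before a more general one. The following is a minimal sketch, assuming (as the processing steps suggest) that the first matching expression determines the content-type. The drafts/ directory and the text/plain assignment are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:archive-manifest>
    <!-- Specific rule first (hypothetical drafts/ directory), general rule second. -->
    <p:with-option name="override-content-types"
                   select="[ ['^drafts/.*\.rels$', 'text/plain'],
                             ['\.rels$', 'application/xml'] ]"/>
  </p:archive-manifest>

</p:declare-step>
```

Under that assumption, a .rels entry below drafts/ would be recorded as text/plain and every other .rels entry as application/xml. Swapping the two inner arrays would leave the drafts/ rule without effect, since the general pattern would match first.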
Assume we have a simple ZIP archive with two entries:
- An XML file in the root called reference.xml
- An image in an images/ sub-directory called logo.png
The following pipeline creates an archive manifest for this ZIP file:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:archive-manifest/>
</p:declare-step>
Resulting archive manifest:
<c:archive xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:entry name="images/logo.png" content-type="image/png" href="file:/…/…/test.zip/images/logo.png" method="deflated" size="86656" compressed-size="85694" time="2024-07-04T11:12:22.4+02:00"/>
  <c:entry name="reference.xml" content-type="application/xml" href="file:/…/…/test.zip/reference.xml" method="deflated" size="78" compressed-size="77" time="2024-07-09T19:58:50.75+02:00"/>
</c:archive>
As you can see, the XProc processor I’m using to process this example (MorganaXProc-III) adds a few extra attributes to the <c:entry> elements: size, compressed-size and time.
Also note the contents of the c:entry/@href attributes: they are a combination of the full path/filename of the archive and the path of the entry within the archive (as in the c:entry/@name attribute). The c:entry/@href attribute plays an important role when creating archives using p:archive.
This example uses the same ZIP archive as in Basic usage. The following pipeline explicitly sets the content type for .png files to application/octet-stream:
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:archive-manifest>
    <p:with-option name="override-content-types" select="[ ['\.png$', 'application/octet-stream'] ]"/>
  </p:archive-manifest>
</p:declare-step>
Resulting archive manifest:
<c:archive xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:entry name="images/logo.png" content-type="application/octet-stream" href="file:/…/…/test.zip/images/logo.png" method="deflated" size="86656" compressed-size="85694" time="2024-07-04T11:12:22.4+02:00"/>
  <c:entry name="reference.xml" content-type="application/xml" href="file:/…/…/test.zip/reference.xml" method="deflated" size="78" compressed-size="77" time="2024-07-09T19:58:50.75+02:00"/>
</c:archive>
This example uses the same ZIP archive as in Basic usage. It sets the relative-to option to file:///test/. This is reflected in the c:entry/@href attributes:
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:archive-manifest relative-to="file:///test/"/>
</p:declare-step>
Resulting archive manifest:
<c:archive xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:entry name="images/logo.png" content-type="image/png" href="file:///test/images/logo.png" method="deflated" size="86656" compressed-size="85694" time="2024-07-04T11:12:22.4+02:00"/>
  <c:entry name="reference.xml" content-type="application/xml" href="file:///test/reference.xml" method="deflated" size="78" compressed-size="77" time="2024-07-09T19:58:50.75+02:00"/>
</c:archive>
The only document-property for the document appearing on the result port is content-type, with value application/xml. Note that it has no base-uri document-property and that no document-properties from the document on the source port survive.
A relative value for the relative-to option is resolved against the base URI of the element in the pipeline it is specified on. In most cases this will be the path of the pipeline document.
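A minimal sketch of a relative value, assuming a hypothetical directory name unpacked/. If the pipeline document were stored at file:///work/pipeline.xpl (also hypothetical), the value would be expected to resolve to file:///work/unpacked/, and the c:entry/@href attributes would then start with that URI.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Relative URI: resolved against the base URI of this p:archive-manifest element. -->
  <p:archive-manifest relative-to="unpacked/"/>

</p:declare-step>
```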
The only format this step is required to handle is ZIP. The ZIP format definition can be found here.
Error code | Description |
---|---|
It is a dynamic error if the map | |
It is a dynamic error if the format of the archive does not match the specified format, cannot be understood, determined and/or processed. | |
It is a dynamic error if the | |
It is a dynamic error if the specified value for the | |
It is a dynamic error if the specified value is not a valid XPath regular expression. | |
It is a dynamic error if the base URI is not both absolute and valid according to RFC 3986. | |
It is a dynamic error if a supplied content-type is not a valid media type of the form “ |
This description of the p:archive-manifest step is for XProc version: 3.1. This is a required step (an XProc 3.1 processor must support this).
The formal specification for the p:archive-manifest step can be found here.
The p:archive-manifest step is also present in version: 3.0.