Changes the media type of a document.
<p:declare-step type="p:cast-content-type">
  <input port="source" primary="true" content-types="any" sequence="false"/>
  <output port="result" primary="true" content-types="any" sequence="false"/>
  <option name="content-type" as="xs:string" required="true"/>
  <option name="parameters" as="map(xs:QName,item()*)?" required="false" select="()"/>
</p:declare-step>
The p:cast-content-type step takes the document appearing on its source port and changes its media type according to the value of the content-type option, transforming the document if necessary.
Ports:
Port | Type | Primary? | Content types | Seq? | Description |
---|---|---|---|---|---|
source | input | true | any | false | The document to change the media type of. |
result | output | true | any | false | The resulting document. |
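Options: content-type (xs:string, required) and parameters (map(xs:QName,item()*)?, optional, default ()). See the step declaration above for the exact definitions.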
A document flowing through an XProc pipeline has a media type, which tells the XProc processor what kind of document it
is dealing with. The media type of a document is recorded in its content-type
document-property. Example values are
text/xml
for XML documents, application/json
for JSON documents, etc. For more information about media types see for
example Wikipedia.
The p:cast-content-type
step has a required content-type
option and tries to cast (change) the media type of the document appearing on
its source
port according to the value of this option. Sometimes this is a (very) simple operation: for instance, changing one XML
media type to another just changes the value of the content-type
document-property. However, you can also request more
complex changes, like converting an XML document into JSON or vice versa.
Of course, not every media type can be cast into every other media type. The following sections describe what you can (and cannot) do. If
you request an impossible cast, error XC0071
is raised.
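A minimal sketch of guarding against an impossible cast. It assumes the processor rejects the requested cast with XC0071; because many casts to unrecognized media types are implementation-defined, a particular processor may accept this one instead. The target media type and the `<cast-failed/>` fallback element are made up for illustration. The `err` prefix is bound to the XProc error namespace.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <!-- Attempt the cast; image/png is only an illustrative target -->
    <p:cast-content-type content-type="image/png"/>
    <!-- This branch runs only when the cast raises XC0071 -->
    <p:catch code="err:XC0071">
      <p:identity>
        <p:with-input>
          <!-- Hypothetical fallback document -->
          <cast-failed/>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Catching only this error code lets any other error stop the pipeline as usual.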
A brief explanation of media types and how XProc treats them can be found in the XProc media type usage section below.
When the input document is an XML document (has an XML media type), the following casts are supported:
Casting to another XML media type simply changes the content-type
document-property.
Casting to an HTML media type changes the content-type
document-property and removes any
serialization
document-property.
Casting to a JSON media type converts the XML into JSON:
The XPath and XQuery Functions and Operators 3.1
standard defines an XML format for the
representation of JSON data. The XPath function xml-to-json()
converts this format into a JSON conformant string (and for further processing,
parse-json()
turns this string
into a map/array).
If an input document of p:cast-content-type is conformant to this XML format for the representation of JSON data, it’s converted into its JSON equivalent (like calling parse-json(xml-to-json())). See the Converting the XML representation of JSON example.
If the input document has a <c:param-set> root element with <c:param name="…" value="…"/> child elements (the c prefix here is bound to the http://www.w3.org/ns/xproc-step namespace), the step turns it into a JSON map with the values of the name attributes as keys and the values of the value attributes as values. See the Converting param-sets example.
Param-sets are an XProc 1.0 construct, used for passing parameters (there were no maps in those days). Unless you’re converting XProc 1.0 steps into 3.x, it’s unlikely you’ll need this feature.
In all other cases it’s up to the XProc processor what happens. It could turn your XML into some kind of JSON, but it could just as well raise an error.
A serialization
document-property is removed when converting to JSON.
Casting to a text media type converts the XML into text. The incoming XML comes out as text, as a string, complete with tags, attributes, etc.
The result of this conversion is the same as calling the XPath serialize($doc, $param)
function, where $doc
is the document to convert and $param
is its serialization
document-property. See the Converting XML to text example.
A serialization
document-property is removed.
Casting to any other media type where the input document is a <c:data>
document (see c:data documents) results in a
document with the specified media type and a representation that is the content of the <c:data>
element after decoding it. The
value of the c:data/@content-type
attribute and the value of the content-type
option of p:cast-content-type
must be the
same!
A serialization
document-property is removed.
Casting to any other media type where the input is not a valid <c:data>
document is implementation-defined and therefore
dependent on the XProc processor used.
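As a sketch of the <c:data> case, the following pipeline casts a <c:data> document back into the binary document it wraps. The media type x/x and the base64 payload (the encoding of the text "Hi there!") are illustrative values. The content-type option matches the content-type attribute on <c:data>, as required.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="3.0">
  <p:input port="source">
    <p:inline>
      <!-- base64 for the text "Hi there!" -->
      <c:data content-type="x/x" encoding="base64">SGkgdGhlcmUh</c:data>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- Must name the same media type as c:data/@content-type -->
  <p:cast-content-type content-type="x/x"/>
</p:declare-step>
```

The result should be a document with media type x/x whose representation is the decoded content of the <c:data> element.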
When the input document is an HTML document (has an HTML media type), the following casts are supported:
Casting to another HTML media type simply changes the content-type
document-property.
Casting to an XML media type changes the content-type
document-property and removes a
serialization
document-property.
Casting to a JSON media type is implementation-defined and therefore dependent on the XProc processor used.
Casting to a text media type works the same as casting an XML media type to text. See casting XML to text above.
Casting to any other media type is implementation-defined and therefore dependent on the XProc processor used.
When the input document is a JSON document (has a JSON media type), the following casts are supported:
Casting to another JSON media type simply changes the content-type
document-property.
Casting to an HTML media type is implementation-defined and therefore dependent on the XProc processor used.
Casting to an XML media type converts the JSON into XML according to the rules specified in the XPath XML format for the representation of JSON data. See the Converting JSON into XML example.
A serialization
document-property is removed.
Casting to a text media type converts the JSON into text. The incoming JSON (which in XProc consists of maps/arrays) comes out as text, as a string.
The result of this conversion is the same as calling the XPath serialize($doc, $param)
function, where $doc
is the document to convert and $param
is its serialization
document-property.
A serialization
document-property is removed.
Casting to any other media type is implementation-defined and therefore dependent on the XProc processor used.
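A sketch of casting JSON to text while steering the output through the serialization document-property. It assumes a JSON document arrives on the source port. The serialization parameters shown (method and indent) are standard serialize() parameters chosen for illustration, and the exact text produced may vary per processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Set the serialization document-property: a map within a map -->
  <p:set-properties
      properties="map{'serialization': map{'method': 'json', 'indent': true()}}"/>

  <!-- The JSON map/array comes out as a string of JSON text -->
  <p:cast-content-type content-type="text/plain"/>
</p:declare-step>
```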
When the input document is a text document (has a text media type), the following casts are supported:
Casting to another text media type simply changes the content-type
document-property.
Casting to an XML media type parses the text value of the document by calling the XPath parse-xml()
function. This
assumes of course that the text is a well-formed XML document. If not, error XD0049
is raised.
Casting to an HTML media type parses the document into an HTML document. How this is done is implementation-defined and therefore
dependent on the XProc processor used. If unsuccessful, error XD0060
is raised.
Casting to a JSON media type parses the document by calling the XPath parse-json($doc, $param) function, where $doc is the text document to convert and $param is the value of the parameters option of p:cast-content-type.
A serialization
document-property is removed.
Casting to any other media type is implementation-defined and therefore dependent on the XProc processor used.
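A small sketch of the text-to-XML cast described above. The inline text document is illustrative. Because a text document cannot contain markup, the angle brackets are escaped in the pipeline source.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <!-- A text document whose value happens to be well-formed XML -->
    <p:inline content-type="text/plain">&lt;greeting&gt;Hi there!&lt;/greeting&gt;</p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- Parses the text like parse-xml(); raises XD0049 if it is not well-formed -->
  <p:cast-content-type content-type="application/xml"/>
</p:declare-step>
```

The result should be an XML document with a <greeting> root element.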
When the input document has any other media type (meaning XProc treats it as a binary document), the following casts are supported:
Casting from an unrecognized media type to an XML media type produces a <c:data> document (see c:data documents). The c:data/@content-type attribute is the document’s content type. The content of the c:data element is the base64 encoded representation of the document. See the Converting a binary media type into XML example.
A serialization
document-property is removed.
Casting from an unrecognized media type to an HTML, JSON, text or other unrecognized media type is implementation-defined and therefore dependent on the XProc processor used.
<c:data> documents
The p:cast-content-type step uses <c:data> documents to convert XML from and into binary media types (the c prefix here is bound to the http://www.w3.org/ns/xproc-step namespace):
Attribute | # | Type | Description |
---|---|---|---|
content-type | 1 | | The MIME type of the content. |
charset | ? | | The character set of the content. |
encoding | ? | | The encoding of the content. The most used encoding is base64. |
XProc media type usage
A document media type (in XProc passed around in the content-type
document-property) tells XProc
(and your code if it needs to know this) what kind of document we’re dealing with: the document type. XProc
recognizes and handles five document types: XML, HTML, JSON, text and binary.
The relation between document type and media type is as follows:
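Document type | Media types | Examples |
---|---|---|
XML | application/xml, text/xml and */*+xml (except application/xhtml+xml) | text/xml, application/xml, image/svg+xml |
HTML | text/html, application/xhtml+xml | text/html |
JSON | application/json | application/json |
Text | text/* (not matching one of the XML or HTML media types) | text/plain, text/csv |
Binary | Anything else | image/png, application/zip |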
Converting the XML representation of JSON
If an input document of p:cast-content-type
is conformant to the XPath XML format for the representation of JSON data and the content-type
option is a JSON media type, p:cast-content-type
converts
this into its JSON equivalent.
The following source document is a shortened version of the example in the XPath standard:
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key="desc">Distances </string>
  <boolean key="uptodate">true</boolean>
  <null key="author"/>
  <map key="cities">
    <array key="Brussels">
      <map>
        <string key="to">London</string>
        <number key="distance">322</number>
      </map>
    </array>
  </map>
</map>
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:cast-content-type content-type="application/json"/>
</p:declare-step>
The resulting JSON map:
{"desc":"Distances ","uptodate":true,"author":null,"cities":{"Brussels":[{"to":"London","distance":322}]}}
Converting param-sets
Param-sets are constructs used in the XProc 1.0 days for passing sets of parameters, for instance to XSLT stylesheets. The current
version uses maps for this. To enable converting param-sets into maps, p:cast-content-type
contains support for this. In XProc, a map is JSON data, so the
content-type
option must be a JSON media type.
The source param-set document:
<c:param-set xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:param name="param1" value="y"/>
  <c:param name="param2" value="1234"/>
</c:param-set>
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:cast-content-type content-type="application/json"/>
</p:declare-step>
The resulting JSON map:
{"param1":"y","param2":"1234"}
JSON maps are passed around as XPath maps, so it’s easy to store such a map in a variable and use it later. Just add the following
variable declaration directly after the p:cast-content-type
invocation:
<p:variable name="param-set-map" as="map(*)" select="."/>
Unless you’re converting XProc 1.0 code into a newer version, it’s unlikely you’ll need this param-set conversion feature.
Converting XML to text
Let’s convert this simple XML document into text:
<input-document timestamp="2024-08-23T09:12:45">
  <text color="red">Hi there!</text>
</input-document>
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:cast-content-type content-type="text/plain"/>
</p:declare-step>
The resulting text (it looks like it is another XML document, but it is just text):
<?xml version="1.0" encoding="UTF-8"?>
<input-document timestamp="2024-08-23T09:12:45">
  <text color="red">Hi there!</text>
</input-document>
Now assume we need this text representation without the XML header (the <?xml … ?> part at the top). The p:cast-content-type step uses the document’s serialization document-property to guide the conversion. This document-property is a map containing the required serialization properties. For this example: map{'omit-xml-declaration': true()}.
Document-properties can be specified using the p:set-properties step. The value of the properties option of p:set-properties is itself a map, with the document-property names as keys. Therefore, its value becomes a map within a map: map{'serialization': map{'omit-xml-declaration': true()}}.
The following code (using the same input document as above) does the trick:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:set-properties properties="map{'serialization': map{'omit-xml-declaration': true()}}"/>
  <p:cast-content-type content-type="text/plain"/>
</p:declare-step>
Result document:
<input-document timestamp="2024-08-23T09:12:45">
  <text color="red">Hi there!</text>
</input-document>
Converting JSON into XML
Converting JSON into XML means p:cast-content-type
produces XML according to the XPath XML format for the representation of JSON
data specification. Here we do the inverse of what is done in the Converting the XML representation of JSON example.
Source document:
{"desc":"Distances ","uptodate":true,"author":null,"cities":{"Brussels":[{"to":"London","distance":322}]}}
Pipeline document:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:cast-content-type content-type="text/xml"/>
</p:declare-step>
Result document:
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key="desc">Distances </string>
  <boolean key="uptodate">true</boolean>
  <null key="author"/>
  <map key="cities">
    <array key="Brussels">
      <map>
        <string key="to">London</string>
        <number key="distance">322</number>
      </map>
    </array>
  </map>
</map>
Converting a binary media type into XML
This example transforms a piece of text that has been given the (bogus) media type of x/x
into XML. Because XProc does not
recognize this media type, it treats the document as binary. The result of the p:cast-content-type
step is the document’s base64
encoded contents, wrapped in a <c:data>
element.
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline content-type="x/x">Hi there!</p:inline>
  </p:input>
  <p:output port="result"/>
  <p:cast-content-type content-type="text/xml"/>
</p:declare-step>
Result document:
<c:data xmlns:c="http://www.w3.org/ns/xproc-step" content-type="x/x" encoding="base64">SGkgdGhlcmUh</c:data>
If the value of the content-type
option and the media type of a document are the same, the document will appear unchanged
on the result
port.
p:cast-content-type
preserves all document-properties of the document(s) appearing on its source
port.
The exceptions are the content-type document-property, which is updated to the new media type, and the serialization document-property, which is removed by the casts that say so above.
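One way to check the effect on the document-properties is to read the content-type property after the cast. The sketch below assumes an XML document on the source port. The variable name is made up, and how (or whether) a processor displays the message is implementation-dependent.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:cast-content-type content-type="application/xml"/>

  <!-- Read the updated document-property from the cast result -->
  <p:variable name="new-content-type"
              select="p:document-property(., 'content-type')"/>

  <!-- Pass the document on unchanged and report the property as a message -->
  <p:identity message="content-type is now {$new-content-type}"/>
</p:declare-step>
```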
Error code | Description |
---|---|
It is a dynamic error if the | |
It is a dynamic error if the | |
It is a dynamic error if the | |
It is a dynamic error if the | |
It is a dynamic error if the map | |
It is a dynamic error if the text value is not a well-formed XML document | |
It is a dynamic error if the text document does not conform to the JSON grammar, unless the parameter liberal is true and the processor chooses to accept the deviation. | |
It is a dynamic error if the parameter duplicates is reject and the text document contains a JSON object with duplicate keys. | |
It is a dynamic error if the parameter map contains an entry whose key is defined in the specification of
| |
It is a dynamic error if the text document can not be converted into the XPath data model | |
It is a dynamic error if a supplied content-type is not a valid media type of the form “ |
This description of the p:cast-content-type
step is for XProc version: 3.1. This is a required step (an XProc 3.1 processor must support this).
The formal specification for the p:cast-content-type
step can be found here.
The p:cast-content-type
step is also present in version:
3.0.