p:invisible-xml (3.0) 

Performs invisible XML processing.

Summary

<p:declare-step type="p:invisible-xml">
  <input port="source" primary="true" content-types="any -xml -html" sequence="false"/>
  <output port="result" primary="true" content-types="any" sequence="true"/>
  <input port="grammar" primary="false" content-types="text xml" sequence="true"/>
  <option name="fail-on-error" as="xs:boolean" required="false" select="true()"/>
  <option name="parameters" as="map(xs:QName, item()*)?" required="false" select="()"/>
</p:declare-step>

The p:invisible-xml step parses the document on the source port using invisible XML. The grammar for this must be provided on the grammar port. The result will be emitted on the result port.

Ports:

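Port | Type | Primary? | Content types | Seq? | Description
source | input | true | any -xml -html | false | The source document to parse using the invisible XML grammar provided on the grammar port. If the grammar port is empty, this must contain a valid invisible XML grammar. See the description of the grammar port.
result | output | true | any | true | The result of parsing the document on the source port.
grammar | input | false | text xml | true | One of the following: an invisible XML grammar in its text representation; an invisible XML grammar in its XML representation; or empty, in which case the document on the source port is parsed as an invisible XML grammar.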

Options:

Name | Type | Req? | Default | Description
fail-on-error | xs:boolean | false | true | Determines what happens if the document cannot be parsed. If true, the step fails with error XC0205.
parameters | map(xs:QName, item()*)? | false | () | Parameters used to control the parsing. The XProc specification does not define any parameters for this option. A specific XProc processor (or the parser it uses) might define its own.
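As a rough illustration of the default fail-on-error behaviour, the sketch below wraps the step in p:try and catches error XC0205 (see Errors raised). The file name grammar.txt and the parse-failed placeholder element are assumptions made for this example only.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <!-- fail-on-error defaults to true; it is spelled out here for clarity -->
    <p:invisible-xml fail-on-error="true">
      <!-- grammar.txt is a hypothetical file holding the text grammar -->
      <p:with-input port="grammar" href="grammar.txt"/>
    </p:invisible-xml>

    <!-- Runs only when the source cannot be parsed by the grammar -->
    <p:catch code="err:XC0205">
      <p:identity>
        <p:with-input>
          <!-- Hypothetical placeholder result, not defined by any specification -->
          <parse-failed/>
        </p:with-input>
      </p:identity>
    </p:catch>
  </p:try>

</p:declare-step>
```

Catching only err:XC0205 lets other errors, such as an invalid grammar (XC0212), still stop the pipeline.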

Description

Invisible XML (or ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content.

The p:invisible-xml step takes a document, usually text, and uses an invisible XML grammar to parse it into an XML document. The grammar must be provided on the grammar port. The result will appear on the result port.

An invisible XML grammar has both a text and an XML representation, and either can be supplied on the grammar port. To convert a text grammar to its XML representation, leave the grammar port empty and provide the text-based grammar on the source port. See Parsing the invisible XML grammar for an example.
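A minimal sketch of supplying the grammar in its XML representation instead of text. The file name grammar.xml is an assumption; it stands for a document holding the XML form of the grammar, such as the result shown under Parsing the invisible XML grammar.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:invisible-xml>
    <!-- grammar.xml (hypothetical) contains the <ixml> form of the grammar -->
    <p:with-input port="grammar" href="grammar.xml"/>
  </p:invisible-xml>

</p:declare-step>
```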

In most cases, p:invisible-xml relies on an external parser. You’ll probably have to do some processor-dependent configuration before this step will work. Please consult the documentation of your XProc processor about this.

Examples

Basic usage

We’re going to use a very basic invisible XML grammar that parses a written date into XML. The grammar looks like this:

date: s?, day, s, month, (s, year)? .
-s: -" "+ .
day: digit, digit? .
-digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
month: "January"; "February"; "March"; "April";
       "May"; "June"; "July"; "August";
       "September"; "October"; "November"; "December".
year: (digit, digit)?, digit, digit .

The input document is:

31 December 2021

The following pipeline parses this with the p:invisible-xml step, reading the grammar from grammar.txt.

Pipeline document:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:invisible-xml>
    <p:with-input port="grammar" href="grammar.txt"/>
  </p:invisible-xml>

</p:declare-step>

Result document:

<date>
   <day>31</day>
   <month>December</month>
   <year>2021</year>
</date>
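For quick experiments the source text can also be written directly into the pipeline. The variation below is a sketch under that assumption: it replaces the source input port with a p:inline document of content type text/plain and still reads the grammar from grammar.txt.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:output port="result"/>

  <p:invisible-xml>
    <!-- The text to parse, supplied inline as a text document -->
    <p:with-input port="source">
      <p:inline content-type="text/plain">31 December 2021</p:inline>
    </p:with-input>
    <p:with-input port="grammar" href="grammar.txt"/>
  </p:invisible-xml>

</p:declare-step>
```

The inline text has no whitespace around it on purpose: the grammar allows optional leading spaces but no trailing ones.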

Parsing the invisible XML grammar

We can parse the text representation of an invisible XML grammar into its XML representation by leaving the grammar port empty and providing the text grammar on the source port. Using the same grammar as in Basic usage, the result is:

Pipeline document:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:invisible-xml>
    <p:with-input port="grammar">
      <p:empty/>
    </p:with-input>
  </p:invisible-xml>

</p:declare-step>

Result document:

<ixml>
   <rule name="date">
      <alt>
         <option>
            <nonterminal name="s"/>
         </option>
         <nonterminal name="day"/>
         <nonterminal name="s"/>
         <nonterminal name="month"/>
         <option>
            <alts>
               <alt>
                  <nonterminal name="s"/>
                  <nonterminal name="year"/>
               </alt>
            </alts>
         </option>
      </alt>
   </rule>
   <rule mark="-" name="s">
      <alt>
         <repeat1>
            <literal tmark="-" string=" "/>
         </repeat1>
      </alt>
   </rule>
   <rule name="day">
      <alt>
         <nonterminal name="digit"/>
         <option>
            <nonterminal name="digit"/>
         </option>
      </alt>
   </rule>
   <rule mark="-" name="digit">
      <alt>
         <literal string="0"/>
      </alt>
      <alt>
         <literal string="1"/>
      </alt>
      <alt>
         <literal string="2"/>
      </alt>
      <alt>
         <literal string="3"/>
      </alt>
      <alt>
         <literal string="4"/>
      </alt>
      <alt>
         <literal string="5"/>
      </alt>
      <alt>
         <literal string="6"/>
      </alt>
      <alt>
         <literal string="7"/>
      </alt>
      <alt>
         <literal string="8"/>
      </alt>
      <alt>
         <literal string="9"/>
      </alt>
   </rule>
   <rule name="month">
      <alt>
         <literal string="January"/>
      </alt>
      <alt>
         <literal string="February"/>
      </alt>
      <alt>
         <literal string="March"/>
      </alt>
      <alt>
         <literal string="April"/>
      </alt>
      <alt>
         <literal string="May"/>
      </alt>
      <alt>
         <literal string="June"/>
      </alt>
      <alt>
         <literal string="July"/>
      </alt>
      <alt>
         <literal string="August"/>
      </alt>
      <alt>
         <literal string="September"/>
      </alt>
      <alt>
         <literal string="October"/>
      </alt>
      <alt>
         <literal string="November"/>
      </alt>
      <alt>
         <literal string="December"/>
      </alt>
   </rule>
   <rule name="year">
      <alt>
         <option>
            <alts>
               <alt>
                  <nonterminal name="digit"/>
                  <nonterminal name="digit"/>
               </alt>
            </alts>
         </option>
         <nonterminal name="digit"/>
         <nonterminal name="digit"/>
      </alt>
   </rule>
</ixml>
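The two examples can be combined. The sketch below first converts the text grammar to its XML representation and then uses that XML grammar to parse the pipeline's source document. The step names main and compile-grammar and the file name grammar.txt are assumptions made for this illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" name="main">

  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Step 1: parse the text grammar itself; the grammar port is left empty -->
  <p:invisible-xml name="compile-grammar">
    <p:with-input port="source" href="grammar.txt"/>
    <p:with-input port="grammar">
      <p:empty/>
    </p:with-input>
  </p:invisible-xml>

  <!-- Step 2: parse the pipeline's source using the XML grammar from step 1 -->
  <p:invisible-xml>
    <!-- Bound explicitly; otherwise source would read the result of step 1 -->
    <p:with-input port="source" pipe="source@main"/>
    <p:with-input port="grammar" pipe="result@compile-grammar"/>
  </p:invisible-xml>

</p:declare-step>
```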

Additional details

  • The document appearing on the result port has only a content-type document property. It has no other document properties, so it also has no base-uri.

  • The resulting document will in the vast majority of cases be XML. However, the step is allowed to return other document types. Whether, how and when this happens is implementation-defined and therefore dependent on the XProc processor used.
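Because the result carries no base-uri, later steps that depend on one may need it set explicitly. One possible approach, sketched here with the standard p:set-properties step, is shown below; the URI https://example.org/date.xml is purely illustrative, and the string key in the map is assumed to be converted to a QName by the processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <p:invisible-xml>
    <p:with-input port="grammar" href="grammar.txt"/>
  </p:invisible-xml>

  <!-- Add a base-uri document property to the parse result -->
  <p:set-properties>
    <p:with-option name="properties"
                   select="map{'base-uri': 'https://example.org/date.xml'}"/>
  </p:set-properties>

</p:declare-step>
```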

Errors raised

Error code | Description
XC0205 | It is a dynamic error if the source document cannot be parsed by the provided grammar.
XC0211 | It is a dynamic error if more than one document appears on the grammar port.
XC0212 | It is a dynamic error if the grammar provided is not a valid Invisible XML grammar.

Reference information

This description of the p:invisible-xml step is for XProc version: 3.0. This is a required step (an XProc 3.0 processor must support this).

The formal specification for the p:invisible-xml step can be found here.

The p:invisible-xml step is part of categories:

The p:invisible-xml step is also present in version: 3.1.