Pipeline Step Summary

Pipeline steps

The following pipeline steps are supported in Tag.

Some steps require an external XProc engine (like the embedded Morgana engine), while other steps only work in Tag (i.e., tag:* extension steps).

Container
- p:if
- p:choose
- p:for-each
- p:group
- p:viewport
Standard
File
Validation
Tag extension steps

Private Tag extension steps - not available in general release yet (contact us to find out more).

- Optical character recognition
- Entity detection within formatted or plain text
- Text analysis to detect sentiment (the mood of a document), key phrases, language and syntax (e.g., find nouns and their related verbs)
- Translation between a wide range of languages
- Topic modeling – scan a collection of documents and establish common themes or subjects
- Text-to-speech
- Speech-to-text which can be used to transcribe call center calls or other voice recordings into text

p:if

This step provides a guard for the child steps it contains. If the test attribute expression returns true, all child steps will be run. If it returns false, nothing happens.

Attributes

test - requires an expression that returns true or false
collection - if true, the (XPath) default collection will contain all documents passed to this step, and the context item will be undefined (default is false)

More information on p:if can be found in the XProc standard.

p:choose

This step makes a choice between multiple outcomes. Each outcome is defined by a p:when child step, which stores a test attribute expression. A p:choose may also store a p:otherwise child step.

When this step runs, each p:when is tested in sequence. The first one with a test that returns true wins and is the only outcome to run. If no p:when test returns true, the p:otherwise step will run if it exists.
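
The first-match-wins rule can be modelled outside XProc. The Python sketch below is only an illustration of the behaviour described above, not how Tag or Morgana implement it; the choose function, the whens list of (test, branch) pairs and the sample documents are hypothetical names.

```python
from typing import Callable, Optional, Sequence, Tuple

Doc = str
Branch = Callable[[Doc], Doc]
When = Tuple[Callable[[Doc], bool], Branch]


def choose(doc: Doc, whens: Sequence[When],
           otherwise: Optional[Branch] = None) -> Optional[Doc]:
    """Run the first branch whose test is true, else the fallback if present."""
    for test, branch in whens:
        if test(doc):
            return branch(doc)  # first match wins; later tests are not evaluated
    return otherwise(doc) if otherwise is not None else None


# Hypothetical branches: the more specific test is listed first.
whens = [
    (lambda d: d.startswith("<invoice"), lambda d: "invoice branch"),
    (lambda d: d.startswith("<"), lambda d: "generic XML branch"),
]

print(choose("<invoice/>", whens, lambda d: "fallback"))  # invoice branch
print(choose("plain text", whens, lambda d: "fallback"))  # fallback
```

Because "<invoice/>" would also satisfy the second test, the order of the p:when steps matters.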

Note that p:when steps are very similar to p:if steps. In particular, they have a collection attribute that works the same way.

More information on p:choose can be found in the XProc standard.

p:for-each

This step stores a list of child steps that may be run zero or more times. It provides a looping mechanism for all documents passed to it.

When this step runs, it runs all child steps once per input document, passing in one document at a time. The output from this step contains the results from all runs arranged into one sequence (using output ports defined by its last child step).

When the child steps are run, a current input port is automatically created to store the single document passed to the child steps for that run. That document is also passed in to the first child step as the default readable port.
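
The looping behaviour can be pictured with a small Python model. This is a sketch under stated assumptions, not the engine's implementation: for_each and upper_and_len are hypothetical names, and documents are represented as plain strings.

```python
from typing import Callable, Iterable, List


def for_each(docs: Iterable[str],
             subpipeline: Callable[[str], List[str]]) -> List[str]:
    """Run the subpipeline once per document and concatenate all results."""
    results: List[str] = []
    for current in docs:  # 'current' plays the role of the current input port
        results.extend(subpipeline(current))  # a run may yield 0..n documents
    return results


def upper_and_len(doc: str) -> List[str]:
    """Hypothetical child steps that produce two documents per run."""
    return [doc.upper(), str(len(doc))]


print(for_each(["a", "bc"], upper_and_len))  # ['A', '1', 'BC', '2']
print(for_each([], upper_and_len))           # []
```

An empty input sequence means the child steps run zero times and the result is an empty sequence.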

An alternative to passing documents to this step is to use a p:with-input instruction which can load external documents, pipe them from other steps, or define inline documents.

More information on p:for-each can be found in the XProc standard.

p:group

This step is a convenience wrapper for its child steps. It runs as a subpipeline in the same way that a pipeline does.

More information on p:group can be found in the XProc standard.

p:viewport

This step works on a single XML or HTML input document, and can process multiple chunks (subtrees) of it in sequence.

It uses a match attribute pattern to select a list of nodes. Each node is wrapped in a document (if necessary) and passed to the child steps one at a time. This temporary document is also made available using the named current input port.

The output from this step is a single document: a copy of the input document in which each matched node is replaced by the result of running the child steps for that node. In this way multiple "views" of the input document can be processed and merged back into it.

Attributes

match - requires an XSLT selection pattern

This step requires an external XProc engine like Morgana.

More information on p:viewport can be found in the XProc standard.

p:add-attribute

This step adds a single attribute to a set of matching elements. The match option selects zero or more elements to modify.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

attribute-name - requires a name for the attribute
attribute-value - requires a value for the attribute
match - an XSLT selection pattern (default is '/*')

Note that p:set-attributes can be used to set multiple attributes at once.

More information on p:add-attribute can be found in the XProc standard.

p:add-xml-base

This step changes the xml:base attribute used by expressions to resolve relative URLs. This lets you point at a folder of images and then refer to them in pipeline steps using local file names.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

all - if true, set xml:base on every element; if false, only set it where needed (on the document element and wherever the base URI changes)
relative - if true use a URI relative to the inherited base URI, otherwise use a full URI

This step requires an external XProc engine like Morgana.

More information on p:add-xml-base can be found in the XProc standard.

p:archive

Creates a ZIP file archive of all documents passed to it, or of all files specified by the manifest input port. The report output port stores a manifest report of the created ZIP file. Updates of existing ZIP files are possible using the archive input port.

Inputs

source - accepts: any
manifest - accepts: xml
archive - accepts: any

Outputs

result - produces: any
report - produces: application/xml

Options

format - output file format (default is ZIP)
parameters - optional parameters (ref: Morgana)
relative-to - can be used when creating a manifest

This step requires an external XProc engine like Morgana.

More information on p:archive can be found in the XProc standard.

p:archive-manifest

Creates an archive manifest for the ZIP file passed to it.

Inputs

source - accepts: any

Outputs

result - produces: application/xml

Options

format - input file format (default is ZIP)
override-content-types - used to partially override the content-type mechanism
parameters - optional parameters (ref: Morgana)
relative-to - can be used when creating the manifest

This step requires an external XProc engine like Morgana.

More information on p:archive-manifest can be found in the XProc standard.

p:cast-content-type

Creates a new document by changing the media type of the document passed to it. This converts files between XML, HTML, JSON and text according to specific rules.

Inputs

source - accepts: any

Outputs

result - produces: any

Options

content-type - requires the content type to create
parameters - optional parameters (not used in Tag)

The accepted content types include: application/xml, text/html, application/json and text/plain.

More information on p:cast-content-type can be found in the XProc standard.

p:compare

Compares two single documents for equality. A c:result document is produced as output containing "true" or "false". If differences are found, a summary of them may appear on the differences port.

Inputs

source - accepts: any
alternate - accepts: any

Outputs

result - produces: application/xml
differences - produces: any

Options

fail-if-not-equal - if true, report error instead of creating result document
method - comparison method (default is "deep-equal")
parameters - optional parameters (ref: Morgana)

More information on p:compare can be found in the XProc standard.

p:compress

Compresses a single document using the GZIP format.

Inputs

source - accepts: any

Outputs

result - produces: any

Options

format - output file format (default is GZIP)
parameters - optional parameters (ref: Morgana)
serialization - optional serialization property (ref: Morgana)

This step requires an external XProc engine like Morgana.

More information on p:compress can be found in the XProc standard.

p:count

Counts the number of documents passed to it. Stores a single c:result element containing the count on the result output port.

Inputs

source - accepts: any

Outputs

result - produces: application/xml

Options

limit - if > 0, step will count at most that many documents (e.g., can check if > 1 documents exist without processing all of them)
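
The effect of the limit option can be sketched in Python. This is an illustrative model only; count_documents is a hypothetical name, and a lazy iterator stands in for the incoming document sequence.

```python
from itertools import islice
from typing import Iterable


def count_documents(docs: Iterable[object], limit: int = 0) -> int:
    """Count documents, reading at most 'limit' of them when limit > 0."""
    if limit > 0:
        docs = islice(docs, limit)  # stop pulling documents once the limit is hit
    return sum(1 for _ in docs)


print(count_documents(iter(["a", "b", "c"])))           # 3
print(count_documents(iter(["a", "b", "c"]), limit=2))  # 2
```

With limit=2 the result only tells you that at least two documents exist, which is enough for a "more than one?" check without reading the rest of the sequence.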

More information on p:count can be found in the XProc standard.

p:delete

Deletes items specified by the match option from the document passed to it and stores the resulting document on the result port.

Inputs

source - accepts: xml html

Outputs

result - produces: text xml html

Options

match - requires an XSLT selection pattern

More information on p:delete can be found in the XProc standard.

p:error

Generates a dynamic error using the input passed to the step. The error can be caught using a p:try step - the result output port is an authoring convenience and is never actually updated.

Inputs

source - accepts: text xml

Outputs

result - produces: any

Options

code - requires a unique code to identify the error

This step requires an external XProc engine like Morgana.

More information on p:error can be found in the XProc standard.

p:filter

Selects portions of the document passed to it based on an expression, and stores them on the result output port.

Inputs

source - accepts: xml html

Outputs

result - produces: text xml html

Options

select - requires an expression to select content

This step requires an external XProc engine like Morgana.

More information on p:filter can be found in the XProc standard.

p:hash

Generates a hash, or digital "fingerprint", for the string supplied in the value option, and injects it (using hexadecimal characters) into the source document. The result is stored on the result output port.

The match option is used to select nodes (the default pattern selects the children of the document element). Each selected node is replaced by the hash. If the document node is replaced, the result is a text file that only contains the hash.
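
To show what a hexadecimal hash string looks like for each algorithm family, here is a minimal Python sketch. It is an assumption-laden illustration: hash_value is a hypothetical helper, and the mapping of the algorithm and version options onto CRC-32, MD5 and the SHA variants is assumed rather than taken from Tag or Morgana.

```python
import hashlib
import zlib

# Assumed mapping from the version option to SHA variants.
SHA_VARIANTS = {"1": "sha1", "256": "sha256", "384": "sha384", "512": "sha512"}


def hash_value(value: str, algorithm: str, version: str = "") -> str:
    """Return a lowercase hexadecimal hash of the given string."""
    data = value.encode("utf-8")
    if algorithm == "crc":
        return format(zlib.crc32(data), "08x")  # CRC-32 as 8 hex digits
    if algorithm == "md":
        return hashlib.md5(data).hexdigest()    # MD5 assumed for "md"
    if algorithm == "sha":
        return hashlib.new(SHA_VARIANTS[version or "1"], data).hexdigest()
    raise ValueError(f"unsupported algorithm: {algorithm}")


for algorithm, version in [("crc", ""), ("md", ""), ("sha", "256")]:
    print(algorithm, hash_value("hello", algorithm, version))
```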

Inputs

source - accepts: xml html

Outputs

result - produces: text xml html

Options

algorithm - requires one of: "crc", "md", or "sha"
value - requires a string to use when creating the hash
match - an XSLT selection pattern (default is '/*/node()')
parameters - optional parameters (ref: Morgana)
version - optional version of the algorithm used

This step requires an external XProc engine like Morgana.

More information on p:hash can be found in the XProc standard.

p:http-request

Used to call web APIs using http or https internet URLs. If the method (e.g., POST) supports a body, the request body is constructed using the document(s) passed to the source input port. The response from the call is stored on the result output port.

Details about the outcome of the request will appear as a map on the report output port. The map will contain entries like status-code, base-uri and headers.

Inputs

source - accepts: any

Outputs

result - produces: any
report - produces: application/json

Options

href - requires a URL to call
assert - if this expression returns false report an error
auth - optional map of authorization information (e.g., username, password, MD5 checksum)
headers - map of HTTP header values
method - HTTP request method which can be "GET", "POST", "HEAD", "PUT", "DELETE", "CONNECT", "OPTIONS" or "TRACE" (default is "GET")
parameters - optional parameters (ref: XProc spec, Morgana)
serialization - used to control serialization of request body during a POST request

More information on p:http-request can be found in the XProc standard.

p:identity

Stores an exact copy of what is passed to it on the result output port. These steps can provide a handy way to load documents for the next step using p:inline, p:pipe or p:document child instructions.

Inputs

source - accepts: any

Outputs

result - produces: any

More information on p:identity can be found in the XProc standard.

p:insert

Inserts the insertion input port's document into the source input port's document using the match selection pattern.

For every matched node, the insertion is made according to the position option.

Inputs

source - accepts: xml html
insertion - accepts: xml html

Outputs

result - produces: xml html

Options

match - an XSLT selection pattern (default is '/*')
position - one of "first-child", "last-child", "before" or "after" (default is "after")

More information on p:insert can be found in the XProc standard.

p:json-join

Joins the sequence of documents passed to it into a single JSON document (an array) and stores it on the result output port. If any input documents are not JSON, they are automatically converted to JSON content if possible.

Inputs

source - accepts: any

Outputs

result - produces: application/json

Options

flatten-to-depth - controls how content appearing on the source input port is flattened

This step requires an external XProc engine like Morgana.

More information on p:json-join can be found in the XProc standard.

p:json-merge

Merges the sequence of documents passed to it into a single JSON document (a map/object) and stores it on the result output port. If any input documents are not JSON, they are automatically converted to JSON content if possible.

Inputs

source - accepts: any

Outputs

result - produces: application/json

Options

duplicates - one of "reject", "use-first", "use-last", "use-any" or "combine" (default is "use-first")
key - expression used when merging sequences, arrays and maps to create unique keys (default is 'concat("_",$p:index)')

This step requires an external XProc engine like Morgana.

More information on p:json-merge can be found in the XProc standard.

p:label-elements

Generates a label for each matched element and stores that label in the specified attribute.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

attribute - name of the attribute to insert (default is 'xml:id')
label - expression to create the label (default is 'concat("_",$p:index)')
match - an XSLT selection pattern (default is '*')
replace - if true allow replacement of existing attribute (default is true)

This step requires an external XProc engine like Morgana.

More information on p:label-elements can be found in the XProc standard.

p:load

Has no inputs, but stores the document specified by the href option on the result output port. The loaded document content type can be XML, JSON, HTML, text or "other" binary data.

Outputs

result - produces: any

Options

href - requires URI to load the document from
content-type - can override the automatically detected content type
document-properties - optional XProc document properties to apply
parameters - these vary by loaded content type

More information on p:load can be found in the XProc standard.

p:make-absolute-uris

Makes an element or attribute's value in the document passed to it an absolute URI in the result document. For every node selected by the match option, its string value is resolved against the specified base URI and the resulting URI is used as the matched node's entire contents in the result document.
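
The resolution rule is the usual relative-URI resolution. The Python sketch below illustrates it for attribute values only; make_absolute is a hypothetical helper, the attribute name stands in for a real match pattern, and the example.com base URI is made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def make_absolute(xml_text: str, attribute: str, base_uri: str) -> str:
    """Resolve every occurrence of the named attribute against base_uri."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if attribute in element.attrib:
            # The resolved URI becomes the attribute's entire value.
            element.set(attribute, urljoin(base_uri, element.get(attribute)))
    return ET.tostring(root, encoding="unicode")


source = '<doc><img src="a.png"/><img src="/shared/b.png"/></doc>'
print(make_absolute(source, "src", "http://example.com/pics/"))
```

Here "a.png" resolves to "http://example.com/pics/a.png", while "/shared/b.png" resolves against the host root.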

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

match - requires an XSLT selection pattern
base-uri - used to resolve attribute URIs

This step requires an external XProc engine like Morgana.

More information on p:make-absolute-uris can be found in the XProc standard.

p:namespace-delete

Deletes all namespaces identified by the specified prefixes from the document passed to it. The namespace declarations are removed, and any nodes that use those namespaces will have no namespace in the document stored on the result output port.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

prefixes - requires a list of namespace prefixes to delete

This step requires an external XProc engine like Morgana.

More information on p:namespace-delete can be found in the XProc standard.

p:namespace-rename

Renames any namespace declaration, or use of a namespace, in the document passed in to a new namespace URI. The command may affect elements, attributes or both according to the apply-to option.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

apply-to - one of "all", "elements" or "attributes" (default is "all")
from - the namespace URI to be renamed which may be empty
to - the new namespace URI which may be empty

This step requires an external XProc engine like Morgana.

More information on p:namespace-rename can be found in the XProc standard.

p:pack

Merges two document sequences into one. The step takes each pair of documents (one from source and one from alternate input ports), wraps them with a new element specified by the wrapper option, and writes that element to the result output port as a document.

If either input sequence is longer than the other, each of its remaining documents is wrapped by itself.
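
The pairing rule can be sketched in Python. This is a simplified model: pack is a hypothetical function and documents are treated as strings of markup rather than real XML nodes.

```python
from itertools import zip_longest
from typing import List, Sequence


def pack(source: Sequence[str], alternate: Sequence[str],
         wrapper: str) -> List[str]:
    """Wrap documents pairwise; leftovers from the longer input stand alone."""
    packed = []
    for first, second in zip_longest(source, alternate):
        body = "".join(doc for doc in (first, second) if doc is not None)
        packed.append(f"<{wrapper}>{body}</{wrapper}>")
    return packed


print(pack(["<a/>", "<b/>"], ["<x/>"], "pair"))
# ['<pair><a/><x/></pair>', '<pair><b/></pair>']
```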

Inputs

source - accepts: text xml html
alternate - accepts: text xml html

Outputs

result - produces: application/xml

Options

wrapper - requires a name to wrap result documents in

This step requires an external XProc engine like Morgana.

More information on p:pack can be found in the XProc standard.

p:rename

This step renames elements, attributes, or processing-instruction targets. Each node selected by the match option is renamed to the name specified by the new-name option.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

new-name - requires a new name for the selected nodes
match - an XSLT selection pattern (default is '/*')

This step requires an external XProc engine like Morgana.

More information on p:rename can be found in the XProc standard.

p:replace

Replaces nodes selected by the match option with the top-level node(s) of the replacement input port's document.

Inputs

source - accepts: xml html
replacement - accepts: text xml html

Outputs

result - produces: text xml html

Options

match - requires an XSLT selection pattern

More information on p:replace can be found in the XProc standard.

p:set-attributes

This step sets attributes on all elements selected by the match option. If an attribute of the same name already exists, it will be updated with a new value.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

attributes - requires a map of attribute names and values to set
match - an XSLT selection pattern (default is '/*')

This step requires an external XProc engine like Morgana.

More information on p:set-attributes can be found in the XProc standard.

p:set-properties

This step sets XProc document properties on the document passed to it and stores it on the result output port. The content of the document is not modified.

Inputs

source - accepts: any

Outputs

result - produces: any

Options

properties - requires a map of property names and values to set
merge - if true merge with existing properties, otherwise replace the entire set (default is true)

This step requires an external XProc engine like Morgana.

More information on p:set-properties can be found in the XProc standard.

p:sink

This step accepts a sequence of documents and discards them. It has no output.

It can be used to stop the default flow of documents from one step to the next, in particular when used with p:choose steps. For example, one p:when step allows content to flow to the next step, while another p:when or p:otherwise step does not.

Inputs

source - accepts: any

More information on p:sink can be found in the XProc standard.

p:split-sequence

This step accepts a sequence of documents and divides it into two sequences. The test option expression is applied to each document on the source input port. If the result is true the document is copied to the matched output port, otherwise it is copied to the not-matched output port.

If the initial-only option is true, then when the first document that does not satisfy the test expression is encountered, it and all the documents that follow it are written to the not-matched output port.
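
The difference that initial-only makes is easiest to see side by side. The following Python sketch models the rule described above; split_sequence and the numeric sample "documents" are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def split_sequence(docs: Iterable[int], test: Callable[[int], bool],
                   initial_only: bool = False) -> Tuple[List[int], List[int]]:
    """Return (matched, not_matched) following the test / initial-only rules."""
    matched: List[int] = []
    not_matched: List[int] = []
    failed = False
    for doc in docs:
        if initial_only and failed:
            not_matched.append(doc)  # testing stopped after the first failure
        elif test(doc):
            matched.append(doc)
        else:
            not_matched.append(doc)
            failed = True
    return matched, not_matched


docs = [1, 2, 9, 3]
print(split_sequence(docs, lambda d: d < 5))                     # ([1, 2, 3], [9])
print(split_sequence(docs, lambda d: d < 5, initial_only=True))  # ([1, 2], [9, 3])
```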

Inputs

source - accepts: any

Outputs

matched - produces: any
not-matched - produces: any

Options

test - requires an expression to test each input document with
initial-only - if true stop testing after the first fail (default is false)

This step requires an external XProc engine like Morgana.

More information on p:split-sequence can be found in the XProc standard.

p:store

Saves the input document to a URI. The URI specified by the href option must reference a local file using the "file:" URI scheme (e.g., "file:///c:/path/to/myfile.txt" on Windows, and "file:///path/to/myfile.txt" on Mac).

The result output port stores a copy of the document passed in (just like p:identity). The result-uri output port stores the location of the stored document in a c:result document.

Inputs

source - accepts: any

Outputs

result - produces: any
result-uri - produces: application/xml

Options

href - requires a URL pointing to a file location on the host computer or a local area network
serialization - map of settings that can be used to modify how the result document is saved

More information on p:store can be found in the XProc standard.

p:string-replace

This step replaces all nodes selected by the match option with the replacement value. The replacement value is the string result of evaluating the expression in the replace option, using the matched node as the XPath context node.

If the document node is matched, the entire document is replaced by the string value of the replace expression. What appears on the result output port is a text document containing the replacement value.

Inputs

source - accepts: xml html

Outputs

result - produces: text xml html

Options

match - requires an XSLT selection pattern
replace - requires an expression to create the replacement value

More information on p:string-replace can be found in the XProc standard.

p:text-count

This step counts the number of lines in a text document and stores a single c:result XML document containing that number on the result output port.

Inputs

source - accepts: text

Outputs

result - produces: application/xml

This step requires an external XProc engine like Morgana.

More information on p:text-count can be found in the XProc standard.

p:text-head

Copies lines from the beginning of a text document to a text document on the result output port.

If the count option is positive, copy the first count lines. If it is zero, copy all lines. If it is negative, copy all lines except the first count lines.
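
The three cases for count map directly onto list slicing. This Python sketch is an illustration only; text_head is a hypothetical function that models the rule, not the step's implementation.

```python
def text_head(text: str, count: int) -> str:
    """Copy lines from the start of the text according to the count rules."""
    lines = text.splitlines(keepends=True)
    if count > 0:
        return "".join(lines[:count])   # keep the first 'count' lines
    if count < 0:
        return "".join(lines[-count:])  # skip the first abs(count) lines
    return text                         # zero: copy everything


sample = "a\nb\nc\n"
print(repr(text_head(sample, 2)))   # 'a\nb\n'
print(repr(text_head(sample, -1)))  # 'b\nc\n'
print(repr(text_head(sample, 0)))   # 'a\nb\nc\n'
```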

Inputs

source - accepts: text

Outputs

result - produces: text

Options

count - requires the number of lines to copy or skip

This step requires an external XProc engine like Morgana.

More information on p:text-head can be found in the XProc standard.

p:text-join

This step joins two or more text documents together, into one text document on the result output port. The separator option can be used to insert a string between documents.

If the prefix option is provided, it will start the result document even if no input documents are provided. The suffix option does the same thing to the end of the result.

The override-content-type option can be used to change the content type of the result document to a valid text-based type (e.g., CSV, JSON).
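
How prefix, separator and suffix combine can be shown with a one-line Python model. The text_join function is a hypothetical stand-in for the step, and the content-type override is not modelled.

```python
from typing import Sequence


def text_join(docs: Sequence[str], separator: str = "",
              prefix: str = "", suffix: str = "") -> str:
    """Join text documents; prefix and suffix appear even with no input."""
    return prefix + separator.join(docs) + suffix


print(text_join(["a", "b", "c"], separator=",", prefix="[", suffix="]"))  # [a,b,c]
print(text_join([], separator=",", prefix="[", suffix="]"))               # []
```

The separator only appears between documents, so a single input document yields just the prefix, that document and the suffix.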

Inputs

source - accepts: text

Outputs

result - produces: text

Options

override-content-type - can change the result document's content type
prefix - a string to start the result document
separator - a string to separate all documents
suffix - a string to end the result document

This step requires an external XProc engine like Morgana.

More information on p:text-join can be found in the XProc standard.

p:text-replace

This step replaces all occurrences of substrings in a text document with a given replacement string.

The pattern option selects the substrings to replace. It must be a valid XPath regular expression. The flags option may be used to refine how the pattern is interpreted.
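
A rough Python analogue of this step is shown below. It is a sketch under assumptions: text_replace is a hypothetical function, Python's re syntax is close to but not the same as XPath regular expressions, and the mapping of flag letters onto Python flags is assumed for illustration.

```python
import re

# Assumed mapping from single-letter flags to Python's re flags.
FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}


def text_replace(text: str, pattern: str, replacement: str,
                 flags: str = "") -> str:
    """Replace every match of pattern in text with replacement."""
    re_flags = 0
    for letter in flags:
        re_flags |= FLAGS[letter]
    return re.sub(pattern, replacement, text, flags=re_flags)


print(text_replace("Cat cat", "cat", "dog"))       # Cat dog
print(text_replace("Cat cat", "cat", "dog", "i"))  # dog dog
print(text_replace("a1b22", r"\d+", "#"))          # a#b#
```

One difference to keep in mind: XPath replacement strings refer to captured groups as $1, whereas Python uses \1.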

Inputs

source - accepts: text

Outputs

result - produces: text

Options

pattern - requires a regular expression used to select substrings
replacement - requires a string to replace matches with
flags - can be used to interpret the pattern

This step requires an external XProc engine like Morgana.

More information on p:text-replace can be found in the XProc standard.

p:text-sort

This step sorts lines in a text document, and stores the result on the result output port.

The sort-key option is an expression that is applied to each line - the result is used to sort the lines.

The case-order option defines whether upper-case letters are to be collated before or after lower-case letters.
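
The role of the sort key and the order option can be modelled in Python. This sketch leaves out collation, language and case-order handling; text_sort is a hypothetical function and the key is a Python callable rather than an XPath expression.

```python
from typing import Any, Callable


def text_sort(text: str, sort_key: Callable[[str], Any] = lambda line: line,
              order: str = "ascending") -> str:
    """Sort the lines of a text by the value the key returns for each line."""
    lines = text.splitlines()
    # sorted() is stable: lines with equal keys keep their relative order.
    ordered = sorted(lines, key=sort_key, reverse=(order == "descending"))
    return "\n".join(ordered)


sample = "pear\napple\nfig"
print(text_sort(sample).split("\n"))                # ['apple', 'fig', 'pear']
print(text_sort(sample, sort_key=len).split("\n"))  # ['fig', 'pear', 'apple']
```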

Inputs

source - accepts: text

Outputs

result - produces: text

Options

case-order - one of "upper-first" or "lower-first" (default is language-dependent)
collation - how strings are compared with each other (ref: Morgana)
lang - language whose collating conventions are to be used (default is from Windows or Mac computer)
order - one of "ascending" or "descending" (default is "ascending")
sort-key - expression used to sort the lines (default is '.')
stable - if false, the order of lines that have the same sort key may change (default is true)

This step requires an external XProc engine like Morgana.

More information on p:text-sort can be found in the XProc standard.

p:text-tail

Copies lines from the end of a text document to a text document on the result output port.

If the count option is positive, copy the last count lines. If it is zero, copy all lines. If it is negative, copy all lines except the last count lines.
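
As a quick illustration of the count rules, the Python sketch below models them with list slicing. text_tail is a hypothetical function, not the step's implementation.

```python
def text_tail(text: str, count: int) -> str:
    """Copy lines from the end of the text according to the count rules."""
    lines = text.splitlines(keepends=True)
    if count > 0:
        return "".join(lines[-count:])  # keep the last 'count' lines
    if count < 0:
        return "".join(lines[:count])   # drop the last abs(count) lines
    return text                         # zero: copy everything


sample = "a\nb\nc\n"
print(repr(text_tail(sample, 2)))   # 'b\nc\n'
print(repr(text_tail(sample, -1)))  # 'a\nb\n'
```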

Inputs

source - accepts: text

Outputs

result - produces: text

Options

count - requires the number of lines to copy or skip

This step requires an external XProc engine like Morgana.

More information on p:text-tail can be found in the XProc standard.

p:unarchive

This step accepts a single archive (e.g., ZIP) file, and copies zero or more of its entries to the result output port.

The include-filter and exclude-filter options can specify filters to include or exclude entries. If neither exists all entries are copied to the result. If include-filter(s) exist, entries that match them are copied to the result. If exclude-filter(s) exist, entries that match them are not copied to the result. If both exist, the include filter(s) are processed first.
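
The filter rules can be sketched as follows. This Python model is an assumption-based illustration: select_entries and the sample entry names are hypothetical, and Python's re.search stands in for XPath regular-expression matching.

```python
import re
from typing import List, Sequence


def select_entries(names: Sequence[str],
                   include_filter: Sequence[str] = (),
                   exclude_filter: Sequence[str] = ()) -> List[str]:
    """Return the archive entry names that survive both filters."""
    selected = []
    for name in names:
        # Include filters run first: with any present, one of them must match.
        if include_filter and not any(re.search(p, name) for p in include_filter):
            continue
        # Then any matching exclude filter drops the entry.
        if any(re.search(p, name) for p in exclude_filter):
            continue
        selected.append(name)
    return selected


entries = ["doc/a.xml", "doc/b.txt", "img/c.png"]
print(select_entries(entries))                                       # all three
print(select_entries(entries, include_filter=[r"\.xml$", r"\.txt$"],
                     exclude_filter=[r"^doc/b"]))                    # ['doc/a.xml']
```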

Inputs

source - accepts: any

Outputs

result - produces: any

Options

exclude-filter - sequence of strings to exclude that each must be a valid XPath regular expression
format - can specify the archive format (e.g., ZIP), if it can't be automatically detected
include-filter - sequence of strings to include that each must be a valid XPath regular expression
override-content-types - used to partially override the content-type detection mechanism
parameters - map of settings to control unarchiving (ref: Morgana)
relative-to - used to create the base URI of unarchived documents

This step requires an external XProc engine like Morgana.

More information on p:unarchive can be found in the XProc standard.

p:uncompress

Accepts a compressed document (e.g., GZIP) and stores an uncompressed version on the result output port.

Inputs

source - accepts: any

Outputs

result - produces: any

Options

content-type - can override the detected content type (default is application/octet-stream)
format - can specify the archive format (e.g., GZIP), if it can't be automatically detected
parameters - map of settings to control uncompression (ref: Morgana)

This step requires an external XProc engine like Morgana.

More information on p:uncompress can be found in the XProc standard.

p:unwrap

This step replaces matched elements with their children. The match option contains a pattern that must refer to document or element nodes. Every selected node is replaced by its children, effectively "unwrapping" the children from their parent.

This step can cause some unusual effects. For example, if a document node is selected which consists of a root element containing only text, the result is a document node with a single text node. In this case, the result document's content type will become text/plain.

Inputs

source - accepts: xml html

Outputs

result - produces: application/xml text/plain

Options

match - an XSLT selection pattern (default is '/*')

This step requires an external XProc engine like Morgana.

More information on p:unwrap can be found in the XProc standard.

p:uuid

This step generates a UUID (universally unique identifier) and injects it into the source document. The version option may be used to select a specific version of the UUID algorithm. An example of a version 4 (default) UUID is "b98a4da4-9c90-48a7-9508-b25546d0a0f1".

The match option contains a pattern that selects nodes to replace with the UUID. If more than one node matches, the same UUID is used to replace each one. If the document node is selected, the result is a text document that only contains the UUID.

Inputs

source - accepts: xml html

Outputs

result - produces: text xml html

Options

match - an XSLT selection pattern (default is '/*')
version - UUID algorithm version (default is version 4 UUID)

This step requires an external XProc engine like Morgana.

More information on p:uuid can be found in the XProc standard.

p:wrap

This step wraps matching nodes in the source document with a new parent element. The match option selects nodes to wrap. The wrapper option provides the name of a new parent element.

Each matched node is replaced by an element named by the wrapper option, which contains a copy of the replaced node and all its descendants.

The group-adjacent option can be used to gather adjacent nodes to be wrapped. It contains an expression that is evaluated for each matching node, which must return sibling nodes (i.e., no nodes between) that will reside together in the new parent.

Inputs

source - accepts: xml html

Outputs

result - produces: application/xml

Options

match - requires an XSLT selection pattern
wrapper - requires a name for the new wrapper element
group-adjacent - expression used to gather sibling nodes that will also reside in the new parent

This step requires an external XProc engine like Morgana.

More information on p:wrap can be found in the XProc standard.

p:wrap-sequence

This step accepts a sequence of documents and produces a single document, or optionally a new sequence of documents. The wrapper option provides the name of all document elements copied to the result output port.

Usually, all source documents end up in one result document. If the group-adjacent option is used, multiple documents could be created to group sequentially adjacent documents. It contains an expression that is evaluated for each source document, and will group sibling documents that have the same expression result.
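
Grouping by adjacency behaves like a run-length grouping: a new wrapper starts whenever the expression result changes. The Python sketch below models this with itertools.groupby; wrap_sequence is a hypothetical function, documents are plain strings, and the grouping key is a Python callable rather than an XPath expression.

```python
from itertools import groupby
from typing import Callable, List, Optional, Sequence


def wrap_sequence(docs: Sequence[str], wrapper: str,
                  group_adjacent: Optional[Callable[[str], object]] = None
                  ) -> List[str]:
    """Wrap all documents together, or one wrapper per run of equal keys."""
    if group_adjacent is None:
        groups = [list(docs)]
    else:
        groups = [list(run) for _, run in groupby(docs, key=group_adjacent)]
    return [f"<{wrapper}>{''.join(group)}</{wrapper}>" for group in groups]


docs = ["<a/>", "<a/>", "<b/>", "<a/>"]
print(wrap_sequence(docs, "w"))
# ['<w><a/><a/><b/><a/></w>']
print(wrap_sequence(docs, "w", group_adjacent=lambda d: d))
# ['<w><a/><a/></w>', '<w><b/></w>', '<w><a/></w>']
```

Note that the last "<a/>" gets its own wrapper: only documents that are next to each other are grouped, not all documents with the same key.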

Inputs

source - accepts: text xml html

Outputs

result - produces: application/xml

Options

wrapper - requires a name for the new wrapper element
group-adjacent - expression used to gather sibling documents that will also reside in the new parent

More information on p:wrap-sequence can be found in the XProc standard.

p:www-form-urldecode

This step decodes an x-www-form-urlencoded string into a JSON representation. A JSON map will appear on the result output port.

It does not have an input port - the value option must contain a string of parameter values encoded using the x-www-form-urlencoded algorithm. Each name/value pair is copied to the JSON map as a key/value entry.
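
For comparison, the same decoding can be done with Python's standard library. This is a sketch only: www_form_urldecode is a hypothetical wrapper, and representing repeated names as a list is an assumption about how duplicates would appear in the map.

```python
from typing import Dict, List, Union
from urllib.parse import parse_qs


def www_form_urldecode(value: str) -> Dict[str, Union[str, List[str]]]:
    """Decode name/value pairs into a map; repeated names become lists."""
    decoded = parse_qs(value, keep_blank_values=True)
    return {name: (values[0] if len(values) == 1 else values)
            for name, values in decoded.items()}


print(www_form_urldecode("name=Ann+Lee&tag=a&tag=b"))
# {'name': 'Ann Lee', 'tag': ['a', 'b']}
```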

Outputs

result - produces: application/json

Options

value - requires an x-www-form-urlencoded value

This step requires an external XProc engine like Morgana.

More information on p:www-form-urldecode can be found in the XProc standard.

p:www-form-urlencode

This step encodes a set of parameter values as an x-www-form-urlencoded string. It does not have an input port - the parameters option contains a map of key/value pairs that are encoded.
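
The encoding itself is the standard HTML form encoding, which Python's standard library also provides. The sketch below is for illustration; www_form_urlencode is a hypothetical wrapper and the sample parameters are made up.

```python
from typing import Mapping, Sequence, Union
from urllib.parse import urlencode


def www_form_urlencode(parameters: Mapping[str, Union[str, Sequence[str]]]) -> str:
    """Encode a map of key/value pairs; list values repeat the key."""
    return urlencode(parameters, doseq=True)


print(www_form_urlencode({"name": "Ann Lee", "tag": ["a", "b"]}))
# name=Ann+Lee&tag=a&tag=b
```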

Outputs

result - produces: text/plain

Options

parameters - requires a map of key/value pairs to be encoded

This step requires an external XProc engine like Morgana.

More information on p:www-form-urlencode can be found in the XProc standard.
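
The following sketch passes the map with p:with-option so that it is clearly evaluated as an XPath expression. The keys and values are invented for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:www-form-urlencode>
    <!-- Hypothetical parameters; any map of key/value pairs could be used -->
    <p:with-option name="parameters"
                   select="map{'q': 'xproc steps', 'page': 2}"/>
  </p:www-form-urlencode>
</p:declare-step>
```

The text result would be something like q=xproc+steps&page=2. Maps are unordered, so the order of the pairs should not be relied on.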

p:xinclude

This step invokes an XInclude processor using the source document, which is assumed to contain XInclude instructions. The result is copied to the result output port.

Inputs

source - accepts: xml html

Outputs

result - produces: xml html

Options

fixup-xml-base - if true, base URI fixup will be performed (default is false)
fixup-xml-lang - if true, language fixup will be performed (default is false)

This step requires an external XProc engine like Morgana.

More information on p:xinclude can be found in the XProc standard.
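
A small sketch with the source document supplied inline. The book element and the chapter1.xml file are hypothetical; the relative href would be resolved against the base URI of the inline document, which is normally the pipeline itself.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <p:xinclude fixup-xml-base="true">
    <p:with-input port="source">
      <p:inline>
        <!-- Hypothetical document containing one XInclude instruction -->
        <book xmlns:xi="http://www.w3.org/2001/XInclude">
          <xi:include href="chapter1.xml"/>
        </book>
      </p:inline>
    </p:with-input>
  </p:xinclude>
</p:declare-step>
```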

p:xquery

This step invokes an XQuery processor on a sequence of source documents, using XQuery instructions provided on the query input port. The resulting documents are copied to the result output port.

Inputs

source - accepts: any
query - accepts: text xml

Outputs

result - produces: any

Options

parameters - used to set the query’s external variables
version - may specify the version of XQuery used (default is 3.1)

More information on p:xquery can be found in the XProc standard.
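
The sketch below supplies the query as inline text and sets one external variable through the parameters option. The item element, price attribute and min variable are invented for illustration. The query deliberately avoids curly braces, which would otherwise be read as value templates inside p:inline.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <p:xquery>
    <p:with-input port="query">
      <p:inline content-type="text/plain">
        declare variable $min external;
        //item[number(@price) ge $min]
      </p:inline>
    </p:with-input>
    <!-- Binds the external variable $min declared in the query -->
    <p:with-option name="parameters" select="map{'min': 10}"/>
  </p:xquery>
</p:declare-step>
```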

p:xslt

This step invokes an XSLT processor on a sequence of source documents, using XSLT instructions provided on the stylesheet input port. The resulting documents are copied to the result output port.

The populate-default-collection option is used to control whether all source documents form the default collection for the XSLT transformation.

The secondary output port may contain secondary results from the transformation, which are defined in the XSLT stylesheet using the xsl:result-document instruction.

Inputs

source - accepts: any
stylesheet - accepts: xml

Outputs

result - produces: any
secondary - produces: any

Options

global-context-item - used as global context item
initial-mode - the initial mode for the invocation
output-base-uri - sets the base output URI
parameters - used to define top-level stylesheet parameters
populate-default-collection - if true, all source documents form the default collection for the transformation
static-parameters - used to define static parameters
template-name - initial template to invoke
version - may specify the version of XSLT used (default is 3.0)

More information on p:xslt can be found in the XProc standard.
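
A sketch of a pipeline that exposes both output ports of the step. The stylesheet file name split-chapters.xsl and the title parameter are hypothetical; the stylesheet is assumed to use xsl:result-document to create the secondary results.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" primary="true" sequence="true"/>
  <!-- Secondary results are read from the named step's secondary port -->
  <p:output port="secondary" sequence="true" pipe="secondary@transform"/>

  <p:xslt name="transform">
    <p:with-input port="stylesheet" href="split-chapters.xsl"/>
    <!-- Sets the top-level stylesheet parameter $title (hypothetical) -->
    <p:with-option name="parameters" select="map{'title': 'Draft'}"/>
  </p:xslt>
</p:declare-step>
```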

p:directory-list

This step lists the contents of a directory/folder on your computer or local network. The path option selects the directory using a URI (e.g., "file:///c:/path/to/folder/" on Windows, and "file:///path/to/folder/" on Mac). If a relative URI is provided, it is resolved against the directory containing the pipeline.

The result output port will contain a c:directory document, which contains c:file, c:directory and c:other entries like the following:

<c:file xml:base="file1.txt" name="file1.txt" />

The max-depth option may contain either the string "unbounded" or an integer. An integer value of 0 means that only information about the specified directory is returned. A value of 1 (the default) also returns information about the selected directory's immediate children. Larger values recurse deeper into subfolders.

The include-filter and exclude-filter options can specify filters to include or exclude entries. If neither exists, all entries are added to the result. If include filters exist, only entries that match one of them are added to the result. If exclude filters exist, entries that match one of them are not added to the result. If both exist, the include filters are processed first.

Outputs

result - produces: application/xml

Options

path - requires a URI selecting the directory to scan
detailed - if true, the result will contain more file/folder details (e.g., size, last-modified) - default is false, which means only name and xml:base attributes are included
exclude-filter - sequence of strings to exclude that each must be a valid XPath regular expression
include-filter - sequence of strings to include that each must be a valid XPath regular expression
max-depth - "unbounded", or an integer indicating the maximum subfolder depth to recurse into (default is 1)
override-content-types - used to partially override the content-type mechanism

This step requires an external XProc engine like Morgana.

More information on p:directory-list can be found in the XProc standard.
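
A sketch combining both filters. The directory path and the file naming scheme are placeholders chosen for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Placeholder path; each filter is an XPath regular expression -->
  <p:directory-list path="file:///path/to/docs/"
                    detailed="true"
                    include-filter="\.xml$"
                    exclude-filter="^draft-"/>
</p:declare-step>
```

With the default max-depth of 1, this should list only the immediate children of the docs folder whose names end in .xml, leaving out any whose names start with draft-.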

p:file-copy

This step copies a file or directory to a target location. The href option selects the file or folder to copy, and the target option selects the destination.

If the target is a folder that does not exist, it will be created before copying begins. For file copying, if the target is a file, that file name will be used; otherwise, the original file name will be retained. A similar approach is used for folders.

If the overwrite option is false, no existing file will be replaced.

If the copy is successful, a c:result document will be written to the result output port containing the absolute URI of the target. If an error occurs (e.g., no permission to save files) and the fail-on-error option is false, the step returns a c:error document; otherwise, an error is raised.

Outputs

result - produces: application/xml

Options

href - requires a URI selecting the source file/folder
target - requires a URI selecting the destination to copy into
fail-on-error - if false, return an error document instead of raising an error (default is true)
overwrite - if false, prevent existing files from being replaced (default is true)

This step requires an external XProc engine like Morgana.

More information on p:file-copy can be found in the XProc standard.
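
A sketch of a cautious copy that neither replaces an existing file nor stops the pipeline on failure. Both URIs are placeholders.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Placeholder URIs; the target here is a folder -->
  <p:file-copy href="file:///path/to/report.xml"
               target="file:///path/to/backup/"
               overwrite="false"
               fail-on-error="false"/>
</p:declare-step>
```

Because fail-on-error is false, a later step can inspect the result port to tell a c:result document apart from a c:error document.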

p:file-create-tempfile

This step creates a temporary file. The temporary file is guaranteed not to already exist when the step is called.

If the href option contains the URI of an existing directory, the temp file will be created there. Otherwise, a Windows or Mac system folder will be used.

If the prefix option is provided, the temp file name will start with it. If the suffix option is provided, the temp file name will end with it.

If the temporary file is created successfully, a c:result document containing the absolute URI of this file is written to the result output port.

Outputs

result - produces: application/xml

Options

delete-on-exit - if true, attempt to delete temp file when the pipeline finishes running (default is false)
fail-on-error - if false, return an error document instead of raising an error (default is true)
href - URI to a directory where the temp file should be created
prefix - start of the temp file name
suffix - end of the temp file name

This step requires an external XProc engine like Morgana.

More information on p:file-create-tempfile can be found in the XProc standard.
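
A short sketch using all of the naming options. The folder URI, prefix and suffix are illustrative values only.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Placeholder folder; the generated name starts with "export-" and ends with ".xml" -->
  <p:file-create-tempfile href="file:///path/to/work/"
                          prefix="export-"
                          suffix=".xml"
                          delete-on-exit="true"/>
</p:declare-step>
```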

p:file-delete

This step deletes a file or a directory identified by the href option. If a directory is selected, the recursive option must be true or the directory must be empty.

If successful, a c:result document containing the absolute URI of the file or directory deleted will be written to the result output port.

Outputs

result - produces: application/xml

Options

href - requires a URI to the file or folder being deleted
fail-on-error - if false, return an error document instead of raising an error (default is true)
recursive - if true and a folder is selected, also delete all child files and folders (default is false)

This step requires an external XProc engine like Morgana.

More information on p:file-delete can be found in the XProc standard.
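
A sketch that removes a folder together with its contents. The URI is a placeholder. Without recursive set to true, the step would only succeed if the folder were empty.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Placeholder URI; recursive deletion cannot be undone, so use with care -->
  <p:file-delete href="file:///path/to/work/tmp/"
                 recursive="true"
                 fail-on-error="false"/>
</p:declare-step>
```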

p:file-info

This step returns information about a file, directory or other file system object identified by the href option.

If a file is identified, a c:file document is written to the result output port. It includes at least the following attributes: name, readable, writable, hidden, last-modified, size and content-type.

If a folder is identified, a c:directory document is written to the result output port. It includes the same attributes as above.

If something other than a file or folder is identified, a c:other document is written to the result output port. It includes a name attribute.

Outputs

result - produces: application/xml

Options

href - requires a URI to the item being queried
fail-on-error - if false, return an error document instead of raising an error (default is true)
override-content-types - used to partially override the content-type mechanism

This step requires an external XProc engine like Morgana.

More information on p:file-info can be found in the XProc standard.

p:file-mkdir

This step creates a directory identified by the href option. If any parent directories are missing, they will be created automatically.

If successful, a c:result document is written to the result output port containing the absolute URI of the directory. If the directory already exists, nothing is done but the c:result document is still created.

Outputs

result - produces: application/xml

Options

href - requires a URI to the directory being created
fail-on-error - if false, return an error document instead of raising an error (default is true)

This step requires an external XProc engine like Morgana.

More information on p:file-mkdir can be found in the XProc standard.

p:file-move

This step moves a file or directory identified by the href option to a location identified by the target option.

If the target option specifies an existing directory, the step attempts to move a file or directory into that directory.

If the move is successful, a c:result document is written to the result output port containing the absolute URI of the target.

Outputs

result - produces: application/xml

Options

href - requires a URI to the file or directory being moved
target - requires a URI to the destination (if it is an existing directory, the item is moved into that directory)
fail-on-error - if false, return an error document instead of raising an error (default is true)

This step requires an external XProc engine like Morgana.

More information on p:file-move can be found in the XProc standard.

p:file-touch

This step updates the modification timestamp of a file identified by the href option. If the specified file does not exist, an empty file will be created at that location.

If the timestamp option is set, the file's timestamp is set to this value. Otherwise, the file's timestamp is set to the current system's date and time.

Outputs

result - produces: application/xml

Options

href - requires a URI to the file being updated
fail-on-error - if false, return an error document instead of raising an error (default is true)
timestamp - timestamp to be used

This step requires an external XProc engine like Morgana.

More information on p:file-touch can be found in the XProc standard.

p:validate-with-relax-ng

This step validates XML or HTML source content using Relax NG (RNG) instructions, provided on the schema input port. This is the same format used in Tag for content generation. In the Scribe app, *.rng files are called data setup files.

Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by DTD compatibility or PSVI annotations.

Inputs

source - accepts: xml html
schema - accepts: text xml

Outputs

result - produces: xml html
report - produces: xml json

Options

assert-valid - if true, raise an error if the input is not valid (default is true)
dtd-attribute-values - if true, apply DTD compatibility conventions (default is false)
dtd-id-idref-warnings - if true, report DTD compatibility errors (default is false)
parameters - optional parameters (ref: Morgana)
report-format - specify report format (default is 'xvrl')

This step requires an external XProc engine like Morgana.

More information on p:validate-with-relax-ng can be found in the XProc standard.
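
The sketch below turns off assert-valid so that problems are reported rather than raised, and exposes the report port as a second pipeline output. The schema file name schema.rng is a placeholder.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result" primary="true"/>
  <!-- The validation report is read from the named step's report port -->
  <p:output port="report" sequence="true" pipe="report@validate"/>

  <p:validate-with-relax-ng name="validate" assert-valid="false">
    <p:with-input port="schema" href="schema.rng"/>
  </p:validate-with-relax-ng>
</p:declare-step>
```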

p:validate-with-schematron

This step validates XML or HTML source content using Schematron instructions, provided on the schema input port.

Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by PSVI annotations.

Inputs

source - accepts: xml html
schema - accepts: xml

Outputs

result - produces: xml html
report - produces: xml json

Options

assert-valid - if true, raise an error if the input is not valid (default is true)
parameters - map containing Schematron external variables
phase - starting Schematron validation phase
report-format - specify report format (default is 'svrl')

This step requires an external XProc engine like Morgana.

More information on p:validate-with-schematron can be found in the XProc standard.
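
A sketch with a one-rule schema supplied inline. The chapter and title element names are invented for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:validate-with-schematron assert-valid="false">
    <p:with-input port="schema">
      <p:inline>
        <sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
          <sch:pattern>
            <!-- Hypothetical rule: every chapter must contain a title -->
            <sch:rule context="chapter">
              <sch:assert test="title">Every chapter needs a title.</sch:assert>
            </sch:rule>
          </sch:pattern>
        </sch:schema>
      </p:inline>
    </p:with-input>
  </p:validate-with-schematron>
</p:declare-step>
```

With assert-valid set to false, a chapter without a title should show up as a failed assertion on the report port instead of stopping the pipeline.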

p:validate-with-xml-schema

This step validates XML or HTML source content using XML Schema instructions, provided on the schema input port.

Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by PSVI annotations.

Inputs

source - accepts: xml html
schema - accepts: xml

Outputs

result - produces: xml html
report - produces: xml json

Options

assert-valid - if true, raise an error if the input is not valid (default is true)
mode - one of "strict" or "lax" (default is "strict")
parameters - map containing external parameters
report-format - specify report format (default is 'xvrl')
try-namespaces - if true, attempt to dereference namespace URIs to locate schema documents (default is false)
use-location-hints - if true, use schema location hints (default is false)
version - version of XML Schema to be used

This step requires an external XProc engine like Morgana.

More information on p:validate-with-xml-schema can be found in the XProc standard.

tag:connector

This step uses a pre-defined Tag connector to call a web API. The connector must be loaded in the Connect app and is referenced using the ref option. Connectors can be imported and exported in the "Manage preferences" panel (top-right Account menu).

Connectors store all information needed to make a web API call, including the URL, headers and user authentication information. When an API key is required to authenticate web API users, Tag can securely save API keys using preferences and access them via this step.

When content must be uploaded to the web API as part of a call (e.g., for HTTP POST requests), the connector must store the post body to upload. When the tag:connector step is used in a pipeline, the p:insert step can be used to update the post body before the call is made.

The output of this step depends on the web API called. The most common formats are JSON, XML and text. The response received from the web API is copied to the result output port as-is.

Outputs

result - produces: any

Options

ref - requires the name of a connector to run

This extension step only works in the Tag XProc engine.

More information on tag:connector can be found in the nSymbol step documentation.

tag:csv

This step converts a CSV (comma-separated values) document into an XML document. The result is a simple XML structure made up of <r> elements, one per row, each containing one child element per column.

CSV headers are read from the first row unless the read-headers option is false. Headers are used to name <r> child elements - if not available, <v> elements are used.

The namespace option may be used to define a namespace in the result XML.

A future version of Tag may extend this step to handle XML to CSV conversion.

Inputs

source - accepts: text

Outputs

result - produces: xml

Options

namespace - a namespace URI for the result XML
read-headers - if true, treat values in the first row as headers (default is true)

This extension step only works in the Tag XProc engine.

More information on tag:csv can be found in the nSymbol step documentation.
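
To make the mapping concrete, here is a sketch of what the <r> elements might look like for a small two-column CSV file. The sample data is invented, and the enclosing document element is left out because its name is not stated above.

```xml
<!-- Input CSV (read-headers is true, the default):
       name,age
       Ada,36
       Grace,45
     Expected shape of the rows, based on the description above: -->
<r><name>Ada</name><age>36</age></r>
<r><name>Grace</name><age>45</age></r>
```

With read-headers set to false, the first line would be treated as data, so three <r> elements would be expected, each holding two <v> children.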

tag:docx

This step converts an XSL-FO document (the default rich text format in Tag) into a DOCX document (*.docx file) that can be opened in one of several popular word processors.

The output of this step is considered binary from a pipeline perspective. Typically, a p:store step is used to save it to a file.

Only a subset of format settings is converted, roughly corresponding to the available format tools in the Tag rich text editor.

A future version of Tag may extend this step to handle DOCX to XSL-FO conversion.

Inputs

source - accepts: xml

Outputs

result - produces: any

This extension step only works in the Tag XProc engine.

More information on tag:docx can be found in the nSymbol step documentation.

tag:google

This step allows you to call Google APIs if you have a Google business account. Google has a vast selection of APIs available to access Google resources like Drive, Docs, Sheets, Email and much more.

At a minimum, you need to provide the href and scope options for each API call. These are defined by Google documentation. API calls must be enabled in your Google Cloud account (see link to Tag docs below for more info).

When calling a Google API for the first time, a login challenge is made. You must be logged in to your Google account in a web browser. Tag will detect this, and open a web page that allows you to authorize the scope(s) required for that web API call (this is the same OAuth 2.0 permission granting mechanism used in mobile apps).

This permission can be reused many times until it eventually expires, at which point the permission form is displayed again. Importantly, it can be reused by other API calls that require the same scope.

The user option is normally not needed. It may be useful if you are calling multiple APIs with differing scopes. It is used to cache permissions on your computer.

The response from the API call is stored on the result output port. The report output port is used to store a JSON report if one is returned by the Google API.

Inputs

source - accepts: any

Outputs

result - produces: any
report - produces: json

Options

href - requires a Google API URI to call
scope - requires a space-separated list of scope identifiers
method - HTTP request method which can be "GET", "POST", "PUT", "PATCH" or "DELETE" (default is "GET")
parameters - map of parameters expected by API
user - optional user name used during login

This extension step only works in the Tag XProc engine.

More information on tag:google can be found in the nSymbol step documentation.

tag:html

This step converts an XSL-FO document (the default rich text format in Tag) into an HTML document (website page) that can be opened in any web browser.

The save-as-xhtml option allows you to save the result as an XHTML document, which is a form of pure XML. While Tag tries to treat HTML and XHTML in a consistent way, there may be situations (in particular with other software programs) where using XHTML provides an advantage.

The output of this step is HTML or XML, which can both be processed further by other pipeline steps. A p:store step can be used to save it to a file.

Only a subset of format settings is converted, roughly corresponding to the available format tools in the Tag rich text editor.

A future version of Tag may extend this step to handle HTML to XSL-FO conversion.

Inputs

source - accepts: xml

Outputs

result - produces: xml html

Options

save-as-xhtml - if true, save the result as XHTML instead of HTML (default is false)

This extension step only works in the Tag XProc engine.

More information on tag:html can be found in the nSymbol step documentation.

tag:json-as-xml

This step converts a JSON document into an XML document.

There are two ways to perform this conversion, selected by the method option. The xpath method is the conversion method used by the XPath json-to-xml() function. It creates accurate, yet verbose, XML to represent the input JSON.

The other conversion method is jackson, which refers to the popular Jackson open source library. The XML created by Jackson is less verbose and may be more suitable for some purposes. This is the default method for this step. In some cases, this method will not be possible (due to complexity of the input JSON) and the xpath method will need to be used.

Inputs

source - accepts: json

Outputs

result - produces: xml

Options

method - method to convert JSON to XML which must be "jackson" or "xpath" (default is "jackson")

This extension step only works in the Tag XProc engine.

More information on tag:json-as-xml can be found in the nSymbol step documentation.
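
For reference, this is the XML vocabulary that the XPath json-to-xml() function defines, and so what the xpath method should produce. The sample JSON is invented for illustration; output from the jackson method is not shown.

```xml
<!-- Input JSON: {"name": "Ada", "tags": ["x", "y"], "active": true} -->
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key="name">Ada</string>
  <array key="tags">
    <string>x</string>
    <string>y</string>
  </array>
  <boolean key="active">true</boolean>
</map>
```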

tag:prompter

This step pauses execution of a pipeline to prompt the user for input.

The type option dictates what kind of prompter appears:

confirm - displays a message with OK and Cancel buttons (returns "yes" or "no")
info - displays an information message (returns an empty string)
prompt - displays a message and prompts with a text box (returns a non-empty string or null)
yes-no-cancel - displays a message with Yes, No and Cancel buttons (returns "yes", "no" or null)

If null is returned, the pipeline will stop running. All other values are wrapped in a c:result document and written to the result output port.

Outputs

result - produces: xml

Options

message - requires a message for the user
prompt - initial value for the prompt type
title - title for the prompter dialog
type - type of prompter which must be "confirm", "info", "prompt" or "yes-no-cancel" (default is "prompt")

This extension step only works in the Tag XProc engine.

More information on tag:prompter can be found in the nSymbol step documentation.
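
The sketch below shows one way a later part of a pipeline might branch on the answer from a confirm prompter. It assumes the c prefix is bound to the standard XProc step namespace and that the c:result document arrives on the source port; the answer elements are invented for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="3.0">
  <!-- Expected input, e.g.: <c:result>yes</c:result> -->
  <p:input port="source"/>
  <p:output port="result"/>

  <p:choose>
    <p:when test="/c:result = 'yes'">
      <p:identity>
        <p:with-input><p:inline><answer>confirmed</answer></p:inline></p:with-input>
      </p:identity>
    </p:when>
    <p:otherwise>
      <p:identity>
        <p:with-input><p:inline><answer>declined</answer></p:inline></p:with-input>
      </p:identity>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```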

tag:sleep

This step pauses execution of the pipeline for a specific duration of time. It can be used to simulate longer-running steps for demos, or during prototype development.

Options

millis - the number of milliseconds to sleep (default is 500)

This extension step only works in the Tag XProc engine.

More information on tag:sleep can be found in the nSymbol step documentation.

tag:sparql

This step reads remote SPARQL endpoints (semantic databases). A text document containing a SPARQL query is passed in, and used to query a SPARQL endpoint using the server URI and some additional settings.

Note that the query can be generated using logic and/or data by prior steps in the pipeline. This is a very powerful way to access SPARQL endpoints.

The result is saved as XML in a similar way to tag:sql. Each row in the result set creates a repeating element, which has child elements for all returned variables. There is no guarantee that all repeating elements have exactly the same child elements.

Inputs

source - accepts: text

Outputs

result - produces: xml

Options

server - requires the endpoint's URI
password - password if needed
port - port number if needed
user - user name if needed

This extension step only works in the Tag XProc engine.

More information on tag:sparql can be found in the nSymbol step documentation.

tag:sql

This step reads local or remote SQL databases. A text document containing a SQL query is passed in, and used to query a SQL database using the type option ("access", "mysql" or "sql-server"), the server option URI, and some additional settings.

Note that the query can be generated using logic and/or data by prior steps in the pipeline. This is a very powerful way to access SQL databases.

The result is saved as XML where each row in the result set creates a repeating element, which has child elements for all result columns. All repeating elements have the same child elements, although some may be empty.

Inputs

source - accepts: text

Outputs

result - produces: xml

Options

server - requires the server URI
type - requires type of SQL database which must be "access", "mysql" or "sql-server"
database - database name if needed
password - password if needed
port - port number if needed
user - user name if needed

This extension step only works in the Tag XProc engine.

More information on tag:sql can be found in the nSymbol step documentation.

tag:xml-as-json

This step converts an XML document into a JSON document.

There are two ways to perform this conversion, and the choice is determined by the input XML. If the XML references the "http://www.w3.org/2005/xpath-functions" namespace, it is converted to JSON exactly like the XPath xml-to-json() function.

If that namespace is not present, the Jackson open source library is used to perform the conversion. If Jackson is unable to perform the conversion, an error is reported and the pipeline will stop.

The save-as-array option may be used during Jackson conversion. Jackson can't handle multiple sibling elements with the same name, so some data would otherwise be lost. To avoid this, the option stores an expression that "flattens" the XML structure into something that converts to an array (e.g., the expression selects a list of repeating elements from somewhere within the XML hierarchy).

Inputs

source - accepts: xml

Outputs

result - produces: json

Options

save-as-array - an expression to select repeating elements for a Jackson conversion

This extension step only works in the Tag XProc engine.

More information on tag:xml-as-json can be found in the nSymbol step documentation.