XML

See the following sections to parse XML input or format XML output.

Setting XML Stream Properties

Set the following properties to control how XML input is parsed and how XML output is formatted. Select the component, then expand the Format property in the Stream pane.

Property name: Value
DTD Identifier: Specify the system identifier. This replaces the system identifier when the stream reads external XML from outside the Flow Service. For example, the part between quotes is the system identifier:
  <!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Read External Entity: Specify whether to load external entities. See Referencing External Entities.
  true: Load external entities and expand entity references.
  false: Don't load external entities and don't expand entity references.
Validate DTD: Specify whether to validate the XML stream against the DTD. For an external DTD, see Referencing External Entities.
  true: Validate the document against the DTD and throw an error if a validation error occurs. If a system identifier is specified, validation uses that DTD. If the DTD is an external entity, you must set Read External Entity to true.
  false: Don't validate against the DTD.
Output Encoding: Select the stream's encoding. When reading an external XML document, this property is not used if the external XML specifies the encoding.
  utf-8: Unicode UTF-8
  shift_jis: Shift JIS
  euc-jp: EUC-JP
  iso-2022-jp: ISO-2022-JP
  utf-16: Unicode UTF-16
  Windows-31J: Windows-31J
Linefeed: Specify the line-feed characters in the XML output. Note: CR can't be specified.
  Platform: Convert to the platform's standard line-feed character.
  CR+LF: Convert to CR+LF.
  LF: Convert to LF.
Output Format: Specify how to format the output XML.
  Don't format: Don't change white space, line-feed characters, or indentation.
  Trim space: Delete white-space-only nodes (including any line-feed characters).
  Indent: Delete white-space-only nodes (including any line-feed characters) and indent.
Write XML Declaration: Select whether to output the XML declaration.
  true: Output the XML declaration.
  false: Don't output the XML declaration.
Use Empty Tag: Specify how to output empty elements.
  true: Output empty elements with a self-closing tag. For example, <p/>.
  false: Output empty elements with a pair of starting and closing tags. For example, <p></p>.
Namespace: Define a namespace for a prefix (see below).
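
To get a feel for what Write XML Declaration and Use Empty Tag change in the output, the following sketch produces comparable results with Python's standard xml.etree.ElementTree module. It is only an illustration and does not use the Flow Service. ElementTree writes a self-closing tag as <p /> (with a space) and quotes the declaration with single quotes, so the exact characters may differ from what the Flow Service writes.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "p")  # an element with no content

# Comparable to Use Empty Tag = true: empty elements are self-closing.
self_closing = ET.tostring(root, encoding="unicode", short_empty_elements=True)

# Comparable to Use Empty Tag = false: a start tag and an end tag.
paired = ET.tostring(root, encoding="unicode", short_empty_elements=False)

# Comparable to Write XML Declaration = true with Output Encoding = utf-8.
with_decl = ET.tostring(root, encoding="utf-8", xml_declaration=True)

print(self_closing)  # <root><p /></root>
print(paired)        # <root><p></p></root>
print(with_decl.decode("utf-8"))
```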

Configuring Prefixes and Namespaces

If an element or attribute name uses a prefix, you must define a namespace for that prefix; otherwise an error occurs. To add a namespace, click the field for the Namespace property, then specify the prefix and its URI.
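
The reason a prefix needs a namespace can be seen with any namespace-aware XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, not the Flow Service, and the URI http://example.com/ns is a made-up placeholder. A document that uses the prefix "x" without declaring it is rejected, while the same document with the declaration parses.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://example.com/ns"  # hypothetical namespace URI for illustration


def parses(xml_text: str) -> bool:
    """Return True if a namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


undeclared = "<root><x:item/></root>"                      # prefix x has no namespace
declared = f'<root xmlns:x="{NS_URI}"><x:item/></root>'    # prefix x bound to NS_URI

print(parses(undeclared))  # False
print(parses(declared))    # True

# Once bound, the prefix is resolved to its URI.
item_tag = ET.fromstring(declared)[0].tag
print(item_tag)            # {http://example.com/ns}item
```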

Referencing External Entities

You can use a relative URI to reference a DTD or external entity. A relative URI is resolved against the system's standard DTD folder. The default DTD folder is [DATA_DIR]/system/schema.

Example

To use the relative path below, put the file test.dtd in the system DTD folder.

<!DOCTYPE test SYSTEM "test.dtd">
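
The resolution rule can be sketched as a simple path join. The function name resolve_system_id and the data directory /opt/app/data below are hypothetical and stand in for [DATA_DIR]. The sketch only shows where a relative system identifier would be expected to point when the default DTD folder is in use.

```python
from pathlib import PurePosixPath


def resolve_system_id(system_id: str, data_dir: str) -> PurePosixPath:
    """Join a relative system identifier onto <data_dir>/system/schema."""
    dtd_folder = PurePosixPath(data_dir) / "system" / "schema"
    return dtd_folder / system_id


# "/opt/app/data" is an assumed value for [DATA_DIR].
resolved = resolve_system_id("test.dtd", "/opt/app/data")
print(resolved)  # /opt/app/data/system/schema/test.dtd
```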

Defining the Stream Fields

See below to map the stream's fields to values in the XML.

Setting the Field Properties

In the Stream pane, you define the fields to match the structure of the XML.

FieldName: Specify the name of the element or attribute.
Type: Select String, Boolean, Integer, Double, Decimal, or Datetime. (You can't specify the binary type here.)
Repeat: Specify whether the element or attribute can occur multiple times. See Defining the XML Structure below.
Node Type: Select Element or Attribute.
Label: Enter a display name for use in the mapping window.

Note

The following configurations cause an error:
  • Defining more than one top-level element.
  • Setting the Repeat property for an element when an element with the same name is defined again at the same level. This causes a compile-time error. (Without Repeat, you can define same-named elements at the same level.)
  • Defining the same attribute more than once at the same level.
  • Setting the Repeat property for the top-level element.
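
The four rules above can be expressed as a small checker. This is a sketch under assumptions: the Field class, its attribute names, and the error messages are invented for illustration and are not part of the Flow Service, which performs its own checks.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """Hypothetical stand-in for one field definition in the Stream pane."""
    name: str
    node_type: str = "Element"  # "Element" or "Attribute"
    repeat: bool = False
    children: List["Field"] = field(default_factory=list)


def check_level(siblings: List[Field], errors: List[str]) -> None:
    """Check one level of the hierarchy, then recurse into the children."""
    elements = Counter(f.name for f in siblings if f.node_type == "Element")
    attributes = Counter(f.name for f in siblings if f.node_type == "Attribute")
    for name, count in attributes.items():
        if count > 1:
            errors.append(f"attribute '{name}' is defined more than once at the same level")
    for f in siblings:
        if f.node_type == "Element" and f.repeat and elements[f.name] > 1:
            errors.append(f"element '{f.name}' repeats but is defined again at the same level")
        check_level(f.children, errors)


def validate(top_level: List[Field]) -> List[str]:
    """Return a list of rule violations; an empty list means no violation found."""
    errors: List[str] = []
    if len(top_level) > 1:
        errors.append("more than one top-level element")
    for f in top_level:
        if f.repeat:
            errors.append(f"top-level element '{f.name}' must not repeat")
    check_level(top_level, errors)
    return errors


ok = [Field("root", children=[
    Field("record", repeat=True, children=[
        Field("attr1", "Attribute"), Field("element1"), Field("element1"),
    ]),
])]
bad = [Field("root", repeat=True, children=[
    Field("a", "Attribute"), Field("a", "Attribute"),
])]

print(validate(ok))   # []
print(validate(bad))  # two violations
```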

Defining the XML Structure

Right-click a field and then click Ascend Tree or Descend Tree to move a field up or down in the hierarchy.

Defining the Field Names

The field name uniquely identifies the field in the XML. See below for example field definitions, details on validation of field names, and details on processing.

Example Field Definitions

See below for example definitions for elements and attributes.

Defining prefixes: The last field of the example uses the prefix "x". For any such prefix, you must define the namespace URI. To do so, click the Namespace field in the Stream pane.

FieldName Repeat Node type XPath
root None Element /root
  record Exist Element /root/record
    attr1 None Attribute /root/record/@attr1
    element1 None Element /root/record/element1[1]
    element1 None Element /root/record/element1[2]
    element2 None Element /root/record/element2
      x:element3 Exist Element /root/record/element2/x:element3
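
To see which nodes the XPath column points at, the sketch below runs equivalent lookups with Python's standard xml.etree.ElementTree module. The sample document, its values, and the namespace URI http://example.com/ns are made up for illustration. ElementTree supports only a subset of XPath and evaluates paths relative to the root element, so /root/record is written as "record" here.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://example.com/ns"}  # hypothetical URI bound to the prefix "x"

doc = ET.fromstring(
    '<root xmlns:x="http://example.com/ns">'
    '<record attr1="A">'
    "<element1>first</element1>"
    "<element1>second</element1>"
    "<element2>"
    "<x:element3>e3-1</x:element3>"
    "<x:element3>e3-2</x:element3>"
    "</element2>"
    "</record>"
    "</root>"
)

record = doc.find("record")                   # /root/record
attr1 = record.get("attr1")                   # /root/record/@attr1
first = record.find("element1[1]").text       # /root/record/element1[1]
second = record.find("element1[2]").text      # /root/record/element1[2]
# A repeating element matches every occurrence.
thirds = [e.text for e in record.findall("element2/x:element3", NS)]

print(attr1, first, second, thirds)
```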

Mapping Fields to XML

The Flow Service internally uses XPath expressions to identify each field in the XML. Each XPath expression is an absolute path from the document root.

Note

You can't explicitly set XPaths for fields. The Flow Service internally evaluates the XPath expressions.

Validating Field Names

Note the conditions for a field name to be valid:

Extracting Field Values

If a field's node type is an element, the field value maps to the content of the element. If the node type is an attribute, the field value maps to the value of the attribute. If there is no element or attribute that matches the field definition, the field value maps to null.
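
The three cases above can be sketched as follows with Python's standard xml.etree.ElementTree module. The function field_value and the sample record are hypothetical, and Python's None stands in for null. Note that ElementTree also returns None as the text of an empty element, which may not match how the Flow Service treats empty elements.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def field_value(parent: ET.Element, name: str, node_type: str) -> Optional[str]:
    """Return element content or attribute value, or None when nothing matches."""
    if node_type == "Attribute":
        return parent.get(name)          # None if the attribute is absent
    child = parent.find(name)
    return None if child is None else child.text


record = ET.fromstring('<record id="7"><name>Ann</name></record>')

print(field_value(record, "name", "Element"))      # Ann
print(field_value(record, "id", "Attribute"))      # 7
print(field_value(record, "missing", "Element"))   # None
```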

The field value of an element that has subelements will be a string that contains the subelements' text joined together. For example, if the XML is as below, the <p> element's value will be "abcdefghi".

<p>
  abc
    <a href="http://foo.bar/">
    def
    </a>
  ghi
</p>
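
The joined value can be reproduced with Python's standard xml.etree.ElementTree module, as a sketch outside the Flow Service. Joining the raw text pieces keeps the line breaks and indentation from the source, so this sketch strips each piece to arrive at "abcdefghi". Whether the Flow Service trims white space in exactly this way is an assumption based on the value stated above.

```python
import xml.etree.ElementTree as ET

xml_text = """<p>
  abc
    <a href="http://foo.bar/">
    def
    </a>
  ghi
</p>"""

p = ET.fromstring(xml_text)

# itertext() yields the text of <p>, then of <a>, then the text after </a>.
raw = "".join(p.itertext())
trimmed = "".join(part.strip() for part in p.itertext())

print(repr(raw))  # includes the original line breaks and indentation
print(trimmed)    # abcdefghi
```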

Determining the Record Numbers

The repeat element separates the XML document into records. See below to follow the process and determine the record number.

Note

Note that the record number of the stream and the record number of the mapper may not match, because the mapper reconstructs the field definitions using only the fields that are being mapped.
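
As a rough illustration of a repeat element splitting a document into records, the sketch below numbers each occurrence of a repeating <record> element using Python's standard xml.etree.ElementTree module. The sample document is made up, and numbering from 1 in document order is an assumption; the Flow Service's actual numbering rules are not shown here.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<root>"
    "<record><v>a</v></record>"
    "<record><v>b</v></record>"
    "<record><v>c</v></record>"
    "</root>"
)

# Each occurrence of the repeating element becomes one record.
# Starting the count at 1 is an assumption made for this sketch.
records = {
    number: rec.findtext("v")
    for number, rec in enumerate(doc.findall("record"), start=1)
}

print(records)  # {1: 'a', 2: 'b', 3: 'c'}
```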

Outputting an Empty Stream

An empty XML stream consists of the following XML:

<?xml version="1.0" encoding="utf-8"?>
<root/>
 
