Converting between XML and JSON with Apigee: what you need to know

Apigee can do lots of things. One of them is to dynamically convert an XML payload to a JSON "equivalent", or the reverse. Common reasons for doing this:

  • creating a RESTful facade for an existing SOAP service (via a SOAP to REST proxy within Apigee)
  • providing content negotiation in the proxy layer: for example, the caller or client specifies whether it wants XML or JSON, and the proxy layer (Apigee) formats the response accordingly. The backend (upstream) need not be aware.

There may be other reasons as well. Whatever your reason, when converting between XML and JSON you need to be aware that the two formats differ significantly, and that the XMLToJSON and JSONToXML policies have limitations you should understand. I'll explain those, and then present a couple of working proxy solutions that implement SOAP to REST conversions without requiring lots of custom code.

Data Types

JSON:

{ "myNum": 9, "myBool": true, "myStr": "val",
  "myEmptyObject": {}, "myEmptyStr": "", "myNull": null }

JSON values carry their type in the syntax. Numbers, booleans, strings, empty strings, empty objects and nulls can all be clearly distinguished.

XML:

<StringOrNumber>12345</StringOrNumber>
<StringOrBoolean>false</StringOrBoolean>
<EmptyObjectOrEmptyStringOrNull></EmptyObjectOrEmptyStringOrNull>

Unless you are using XML Schema Definitions (XSDs), XML does not clearly delineate between data types. Chances are good that if the value of an element is "false", the type is boolean. However, it could also be a string (<TeethType>false</TeethType>).

Similarly, most ZIP codes look like numbers in XML. However, the standard data type to use for a ZIP code is string, because you can have ZIP codes with leading zeroes or ZIP+4 formats. Without the context of what the field means, an automatic converter cannot convert both <ZIPCode> (string) and <Age> (number) correctly.

Empty strings ( "" ), empty objects ( { } ) and nulls also look identical in XML.

Conversion:

JSONToXML handles data types without trouble, because the type information is simply dropped: every value is written out as element text.

XMLToJSON provides configuration options to choose whether to detect numbers, booleans and nulls. Where the types are ambiguous, you might need to tweak data types after the conversion.
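As a rough sketch of that kind of post-conversion tweak, the JavaScript below assumes the conversion left every value as a string (type detection switched off) and then converts only the fields you name. The helper name coerceFields, the field names and the sample payload are all made up for illustration; none of them come from the XMLToJSON policy.

```javascript
// Hypothetical helper: convert selected string fields to their intended types.
// `types` maps a field name to "number" or "boolean"; unlisted fields are left alone.
function coerceFields(obj, types) {
  Object.keys(obj).forEach(function (key) {
    var value = obj[key];
    if (value !== null && typeof value === "object") {
      coerceFields(value, types); // walk into nested objects and arrays
    } else if (typeof value === "string" && types[key] === "number") {
      var n = Number(value);
      if (value.trim() !== "" && !isNaN(n)) obj[key] = n;
    } else if (typeof value === "string" && types[key] === "boolean") {
      if (value === "true") obj[key] = true;
      if (value === "false") obj[key] = false;
    }
  });
  return obj;
}

// Assumed shape of a converted payload with every value still a string.
var converted = {
  Person: { Age: "42", ZIPCode: "02134", Active: "false", TeethType: "false" }
};
var fixed = coerceFields(converted, { Age: "number", Active: "boolean" });
// Age is now a number and Active a boolean; ZIPCode keeps its leading zero
// and TeethType stays the string "false".
```

Starting from strings and opting fields in keeps values such as ZIP codes intact, which is not possible once a leading zero has been dropped by number detection.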

Arrays

                             JSON                       XML
  
Empty array:                  []                   -nothing in the XML-

Single item array:    [ { "name": "val" } ]          <name>val</name>

Multiple item array:    [ "Arm", "Leg" ]      <item>Arm</item><item>Leg</item>

JSON arrays are specified using square brackets.

XML arrays are represented by repeating an element name multiple times.

XML arrays can also be represented as zero or more elements of the same name wrapped in another element:

<Languages>
  <item>JavaScript</item>
  <item>Java</item>
  <item>Python</item>
</Languages>

Conversion:

JSONToXML can generally convert arrays correctly.

For wrapped XML arrays, the XMLToJSON conversion keeps the wrapper as an object with a single array-valued property, which looks a bit strange:

{"Languages":{"item":["JavaScript","Java","Python"]}}
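One way to tidy this up after the conversion is to replace the wrapper object with the array it contains. The sketch below is only an illustration: unwrapArray is a hypothetical helper, and the handling of an empty wrapper (null or an empty string) is an assumption about what the converted JSON might contain.

```javascript
// Hypothetical helper: turn {"Languages":{"item":[...]}} into {"Languages":[...]}.
function unwrapArray(obj, wrapperKey, itemKey) {
  var wrapper = obj[wrapperKey];
  if (wrapper === null || wrapper === "") {
    obj[wrapperKey] = []; // assumed shape of an empty wrapper element
  } else if (typeof wrapper === "object" && !Array.isArray(wrapper) &&
             Object.prototype.hasOwnProperty.call(wrapper, itemKey)) {
    var items = wrapper[itemKey];
    // A single child element arrives as a bare value, so wrap it in an array.
    obj[wrapperKey] = Array.isArray(items) ? items : [items];
  }
  return obj;
}

var doc = { Languages: { item: ["JavaScript", "Java", "Python"] } };
unwrapArray(doc, "Languages", "item");
// doc is now { Languages: ["JavaScript", "Java", "Python"] }
```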

For non-wrapped arrays, XMLToJSON cannot convert zero-element arrays: there are no elements in the XML, so nothing appears in the JSON.

Single-element arrays are also a problem: they look identical to a single object, so the JSON will contain an object instead of a one-item array.

Make sure you fix converted array and data type idiosyncrasies -- don't force your app developers to deal with inconsistent types.
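A minimal sketch of such a fix for non-wrapped arrays, assuming you know which properties are meant to be arrays. The helper name forceArrays and the sample field names are hypothetical.

```javascript
// Hypothetical helper: make sure each named property is always an array.
function forceArrays(obj, keys) {
  keys.forEach(function (key) {
    var value = obj[key];
    if (value === undefined) {
      obj[key] = [];        // no elements in the XML: empty array
    } else if (!Array.isArray(value)) {
      obj[key] = [value];   // one element in the XML: one-item array
    }
  });
  return obj;
}

var one = forceArrays({ item: "Arm" }, ["item"]);
var none = forceArrays({}, ["item"]);
var many = forceArrays({ item: ["Arm", "Leg"] }, ["item"]);
// one.item is ["Arm"], none.item is [], many.item is unchanged
```

With this in place, app developers can always iterate over the property without first checking its type.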

Root Elements

XML requires a single named root element. All other elements must exist within this element.

JSON does not require a root element: the top level is generally an unnamed object or array.

XMLToJSON will convert as expected, but in the other direction a dummy root element is sometimes created (the conversions shown use the default JSONToXML settings):

[ "abc", "def" ]        ->  <Array><Item>abc</Item><Item>def</Item></Array>

{ "a": "abc" }          ->  <a>abc</a>                 (only one item in object)

{ "a": "abc", "b": 14 } ->  <Root><a>abc</a><b>14</b></Root>    (multiple items)
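If you would rather not depend on the dummy root, one option is to wrap the payload in a single named property before the conversion. This is only a sketch: withRoot and the name Response are made up, and the expectation that the wrapped object converts with <Response> as its root is inferred from the single-item case above, so verify it against your own policy configuration.

```javascript
// Hypothetical helper: wrap a JSON object in one explicitly named property
// so the XML root name is chosen by the proxy rather than defaulted.
function withRoot(payload, rootName) {
  var wrapped = {};
  wrapped[rootName] = payload;
  return wrapped;
}

var body = withRoot({ a: "abc", b: 14 }, "Response");
// JSON.stringify(body) is '{"Response":{"a":"abc","b":14}}'
```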

Namespaces and Attributes

XML has namespaces and attributes; JSON does not. If you need those when converting from XML to JSON (you often do need at least the attributes), they need to be put somewhere else in the JSON payload. XMLToJSON allows you to specify where these will go.

Multi-line

XML is allowed to span multiple lines. JSON is also allowed to be multi-line (see ECMA-404 from Oct 2013), but some legacy JSON parsers do not support multi-line JSON. If you need to support these, consider returning your JSON on a single line. However, you should always accept JSON that spans multiple lines.
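If you do need single-line output, parsing and re-serializing the payload is enough, since JSON.stringify without an indent argument emits no line breaks. The helper name toSingleLine is just for illustration.

```javascript
// Re-serialize JSON text without insignificant whitespace.
function toSingleLine(jsonText) {
  return JSON.stringify(JSON.parse(jsonText));
}

var pretty = '{\n  "name": "val",\n  "list": [\n    1,\n    2\n  ]\n}';
var compact = toSingleLine(pretty);
// compact is '{"name":"val","list":[1,2]}'
```

Line breaks inside string values are not a concern here, because they are written as the escape sequence \n rather than as literal newlines.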

Converting between formats

When a payload to be converted is simple and predictable, you can often use ExtractVariables to get data out of the input payload and AssignMessage to build the converted payload. Where this is practical, it is the most efficient solution. It often works for SOAP requests, even when the corresponding SOAP responses are very complex.

When you have more complex payloads to convert, especially SOAP responses and payloads with arrays, you'll generally need a different solution that includes XMLToJSON and JSONToXML.

Your first inclination might be to use custom JavaScript code to tweak the payload before the JSONToXML conversion or after the XMLToJSON conversion. This becomes untenable as the number of different payloads/SOAP responses grows.

When I had to create about 50 SOAP to REST APIs in a relatively short period of time, I came up with a scheme that used a specially-crafted XSLT file for each response and a single JSON cleanup function that could fix the array and data types coming out of the XMLToJSON policy. This was great for developer productivity (once I fought through the steep learning curve for XSLT). You can see an example of this solution at https://github.com/apigeecs/soap-xsl-example .

While the XSL policies performed well enough, it made sense to explore a JavaScript solution that doesn't require lots of custom code. I've created a JavaScript library, called JSMapr, that allows you to specify simple transformation steps to modify a JavaScript object in place. This can be used after an XMLToJSON policy to do complex conversions required for SOAP to REST proxies. A proxy using JSMapr to call the same SOAP backend as the XSL example can be found at https://github.com/apigeecs/soap-js-mapr .

Comments
Not applicable

@mdunker@apigee.com

Is forming the JSON structure directly in XSLT a good idea? That way we could avoid having an XSLT transformer, JavaScript, and an XML to JSON converter.

For example, with an XSLT like the one below:

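<!-- namespace declarations added; the soapenv URI assumes SOAP 1.1 -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<!-- emit plain text, since the result is JSON rather than XML -->
<xsl:output method="text"/>
<xsl:template match="soapenv:Body">
{"Products":
 { "expiryDT":"<xsl:value-of select="ProductResponse/expirydate"/>",
   "Status":"<xsl:value-of select="ProductResponse/status"/>"
 }
}
</xsl:template>
</xsl:stylesheet>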
mdunker
Staff

@RK4

Building JSON using XSLT is not a very common pattern, but it should work. However, XSLT is not as efficient as some other methods. In your example (which looks like 2 simple extracts from a single structure), I would probably use XPath in an ExtractVariables policy to get the expirydate and status fields, and then an AssignMessage to write those values into a JSON payload. Remember also that JSON containing newlines is technically not valid, so you'd want that JSON to all be on a single line.

Not applicable
@mdunker@apigee.com

Thanks for your quick response.

I gave a small XML example just to ask my question.

Actually, the XML I am working with is big and contains arrays. I don't think I can keep the entire JSON pattern on a single line in XSLT for a big XML document.

So it's better if I do XML to XML in XSLT and then go with the XML to JSON conversion.

I tried your method of handling arrays and strings (~STR~) for proper XML to JSON conversion, and it works great. Thank you!

Not applicable

I am facing an issue with the XML to JSON conversion policy. I get an XML response from the backend, and it contains an element that can appear once or multiple times. After getting the response from the backend, I use the XMLToJSON policy to convert it to JSON.

Input :

<tac>adadadada</tac> <tac>asasasasa</tac> <id>12121212112</id> <version>1.0</version>

Output :

{"tac" : ["adadadada", "asasasasa"], "id" : 12121212112, "version" : "1.0"}

If the tac element appears only once, the input and output are as follows.

Input :

<tac>asasasasa</tac> <id>12121212112</id> <version>1.0</version>

Output :

{"tac" : "asasasasa", "id" : 12121212112, "version" : "1.0"}

As is clear from these two payloads, the tac element could be a single value or an array. So, in our API flow, we have to check whether tac is an instance of Array or not.

Depending on the result, we then proceed with the northbound response.

Does the XML to JSON policy allow an API developer to configure, within the policy itself, that a certain element should be converted to an array even when it appears only once in the XML payload?

If the API developer configured the tac element as an array, the response from the XMLToJSON policy itself could look like this:

{"tac" : ["asasasasa"], "id" : 12121212112, "version" : "1.0"}

In that case, there would be no need to check whether the object is an Array later on.

Also, if this feature is not available, could it be added in the near future?

Thanks,

Varun

Not applicable

Edge's XMLToJSON policy is probably the wrong choice for anything but trivial XML responses. Assuming you have a properly defined XML document with an associated XSD schema, you should use a message validation policy to validate the XML, then apply an XSLT to transform the XML to valid JSON directly with an XSL transformation policy. You can then control the way the XML is transformed to JSON (e.g., building arrays consistently when you need them). Remember, XML with a schema has far more precise data type specification than JSON, which has only six value types (string, number, boolean, null, object, array). The following project may help you get started on your XSLT: https://github.com/bramstein/xsltjson.

Unfortunately, the JSON to XML path is a bit harder - that policy does not have the option to allow an XSD to guide the transformation to XML. Nor does JSON have the equivalent of schemas to ensure that your data is well formed. You will have to trust that your JSON source is correctly formatted by the source, and prepare for the possibility that the resulting XML data will be rejected as malformed by the destination.

Not applicable

Why does the article state: "XML is allowed to span multiple lines. JSON cannot be multi-line.

As developers, we get used to seeing nicely-formatted JSON spanning multiple lines. This is pretty-printed JSON. It is not valid JSON."?

ECMA 404, The JSON Data Interchange Format specifically states: Insignificant whitespace is allowed before or after any token. The whitespace characters are: character tabulation (U+0009), line feed (U+000A), carriage return (U+000D), and space (U+0020). Whitespace is not allowed within any token, except that space is allowed in strings.

mdunker
Staff

You are correct -- ECMA 404 (Oct 2013) allows for JSON to span multiple lines, though earlier JSON specs did not allow this. Legacy parsers may possibly have issues, but I'll change wording.

mdunker
Staff

Agreed -- XML when used with an XSD schema does have rich typing.

DChiesa
Staff

Yes, since some time in 2016, the XMLToJSON policy includes a TreatAsArray element to solve the problem you are describing.

Harish123
Participant I

Is there any way to convert XML to Canonical XML form using Apigee policies, i.e. XSLT? Or do we have to do it using extension policies?

Version history
Last update: 03-07-2015 12:36 AM