How to extract specific values from responses in HTML format

Apigee currently provides no policy dedicated to extracting content from a request or response in HTML format. However, you can achieve this with some out-of-the-box strategies.

If the HTML response is XML-compliant, you can treat it as an XML document and parse it with an XPath expression in the ExtractVariables policy. For example, suppose you get the following HTML response from a ServiceCallout policy. By default, the HTML response will be stored in the flow variable "calloutResponse.content".

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title></title>
    <script></script>
  </head>
  <body>
    <form id="form_id" name="form_name" action="https://example.com" method="POST">
      <input type="hidden" name="hidden_name" value="hidden_value" />
    </form>
  </body>
</html>

Suppose you would like to extract the value of the "hidden_name" input from this HTML response:

<input type="hidden" name="hidden_name" value="hidden_value" />

You could create an ExtractVariables policy like the one below:

<ExtractVariables async="false" continueOnError="false" enabled="true" name="EV-HTML-Extract">
    <DisplayName>EV-HTML-Extract</DisplayName>
    <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
    <Source clearPayload="false">calloutResponse.content</Source>
    <XMLPayload stopPayloadProcessing="false">
        <Namespaces>
            <Namespace prefix="dir">http://www.w3.org/1999/xhtml</Namespace>
        </Namespaces>
        <Variable name="hidden_name" type="string">
            <XPath>/dir:html/dir:body/dir:form/dir:input[@name="hidden_name"]/@value</XPath>
        </Variable>
    </XMLPayload>
</ExtractVariables>

This policy then reads the value and assigns it to the "hidden_name" variable.
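
To check the XPath logic outside Apigee before deploying the policy, a minimal standalone sketch in Python 3 can help. It is not an Apigee policy and not how Apigee evaluates XPath internally; the function name extract_hidden_value and the SAMPLE constant are hypothetical. Python's ElementTree supports only a subset of XPath, so the sketch selects the input element and then reads its "value" attribute instead of using a trailing /@value step.

```python
import xml.etree.ElementTree as ET

# Same namespace prefix mapping as in the ExtractVariables policy.
XHTML_NS = {"dir": "http://www.w3.org/1999/xhtml"}

# Sample taken from the HTML response shown in this article.
SAMPLE = """
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title></title>
    <script></script>
  </head>
  <body>
    <form id="form_id" name="form_name" action="https://example.com" method="POST">
      <input type="hidden" name="hidden_name" value="hidden_value" />
    </form>
  </body>
</html>
""".strip()


def extract_hidden_value(xhtml_text, field_name):
    """Return the value attribute of the named input, or None if absent.

    Raises xml.etree.ElementTree.ParseError when the text is not
    well-formed XML, which signals that the XPath strategy does not apply.
    field_name is assumed not to contain quote characters.
    """
    root = ET.fromstring(xhtml_text)  # root is the <html> element
    path = "dir:body/dir:form/dir:input[@name='{}']".format(field_name)
    node = root.find(path, XHTML_NS)
    return None if node is None else node.get("value")


if __name__ == "__main__":
    print(extract_hidden_value(SAMPLE, "hidden_name"))
```

If the parser raises ParseError on a real response, the response is not XML-compliant and the XPath strategy will not work for it.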

For an HTML response that is not XML-compliant, you can assign the entire response payload to a variable as a string, and then extract the specific part of the HTML text with regex match operations in whatever language you prefer (Java, JavaScript, Python, etc.).
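
The regex approach can be sketched as follows, again in standalone Python 3 and not as an Apigee script policy. The function name extract_input_value and both patterns are assumptions made for illustration. The sketch finds each input tag, collects its attributes regardless of their order or quoting style, and returns the value of the input whose name matches. It assumes attribute values do not contain a ">" character; regex matching is a pragmatic fallback here, not a full HTML parser.

```python
import re

# Matches a whole <input ...> tag (assumes no ">" inside attribute values).
INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)

# Matches name="value", name='value', or name=value inside a tag.
ATTRIBUTE = re.compile(
    r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))"""
)


def extract_input_value(html_text, field_name):
    """Return the value of the first input named field_name, or None."""
    for tag in INPUT_TAG.findall(html_text):
        attrs = {}
        for match in ATTRIBUTE.finditer(tag):
            key = match.group(1).lower()
            # Exactly one of the three value groups is set per match.
            value = next(g for g in match.group(2, 3, 4) if g is not None)
            attrs[key] = value
        if attrs.get("name") == field_name:
            return attrs.get("value")
    return None


if __name__ == "__main__":
    # Not XML-compliant: unclosed <br>, unquoted attribute, no closing </html>.
    html = (
        "<html><body><p>text<br>"
        "<form><input type=hidden name=\"hidden_name\" value='hidden_value'></form>"
        "</body>"
    )
    print(extract_input_value(html, "hidden_name"))
```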

Comments

I find converting the XML to JSON and extracting from the JSON easier than extracting from the XML directly.

Last update: 11-03-2020 07:17 PM