Regular Expression Policy Incorrect XPath Attributes Extraction

Hello, @dino! We need some advice on XPath syntax for the Regular Expression Protection policy. We need to check all attributes in the XML request body. We use the following configuration to check the "Value" attribute, and it works fine:

[Screenshot: Screenshot 2021-07-14 162756.png]

When we try to check all attributes, we use the following pattern, which for some unknown reason does not work:

[Screenshot: ochal_0-1626269557353.png]

To check whether our pattern is correct, we tried an online XPath emulator, and it shows that the pattern is correct. What could be the reason for this? We had a similar issue on another project: Apigee RegExp could not select the correct element with a correct pattern (it worked in different emulators). How can we handle this? Dino, we would be extremely grateful for your thoughts and time!

1 ACCEPTED SOLUTION

Hi Ochal, I'd be glad to try to help out. 

First, can you clarify for me what you are testing? You wrote "we try to check all attributes we use next pattern which for unknown reason does not work", but you did not explain what you mean by "does not work". What are you seeing, and what are you expecting to see?

I conducted some tests here, and in my simple tests, the XPath expression with a wildcard for the attribute name works correctly.

When using the RegularExpressionProtection policy on an XML payload, there are two interesting things you can specify: the regular expression itself, and the XPath specifying what to check. In this case, we are not interested in varying the regex, so I used a simple pattern of "bad" for all cases. The idea is that if any attribute value selected by the XPath I specify has the word "bad" in it, then the policy should detect a threat. Does this make sense?

I have two RegularExpressionProtection policies. The one that looks at wildcard attributes looks like this: 

<RegularExpressionProtection name="REP-XML-Wildcard-Attrs">
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <Source>contrivedMessage</Source>
  <XMLPayload>
    <Namespaces/>
    <XPath>
      <!-- check content of any attribute on any element -->
      <Expression>//*/@*</Expression>
      <Type>string</Type>
      <Pattern>bad</Pattern>
    </XPath>
  </XMLPayload>
</RegularExpressionProtection>

The one that looks for a SPECIFIC attribute looks like this: 

<RegularExpressionProtection name="REP-XML-Specific-Attr">
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <Source>contrivedMessage</Source>
  <XMLPayload>
    <Namespaces/>
    <XPath>
      <!-- check for a specific attribute on any element -->
      <Expression>//*/@value</Expression>
      <Type>string</Type>
      <Pattern>bad</Pattern>
    </XPath>
  </XMLPayload>
</RegularExpressionProtection>

I tested these scenarios:

| Case # | Actual XML | Expression | Observed result | As expected? |
|---|---|---|---|---|
| 1 | <child attr1='acceptable content'>123</child> | //*/@* | no fault | Y |
| 2 | <child attr1='acceptable content'>123</child> | //*/@value | no fault | Y |
| 3 | <child value='acceptable content'>123</child> | //*/@* | no fault | Y |
| 4 | <child value='acceptable content'>123</child> | //*/@value | no fault | Y |
| 5 | <child attr1='bad content'>123</child> | //*/@* | FAULT | Y |
| 6 | <child attr1='bad content'>123</child> | //*/@value | no fault | Y |
| 7 | <child value='bad content'>123</child> | //*/@* | FAULT | Y |
| 8 | <child value='bad content'>123</child> | //*/@value | FAULT | Y |
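
To sanity-check the expected results outside of Apigee, the eight cases can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it does not use Apigee's XPath engine, the function names (`attribute_values`, `check`, `run_matrix`) are hypothetical, and it emulates `//*/@*` and `//*/@value` by walking every element's attributes, because the standard library's ElementTree cannot select attribute nodes with XPath. It also assumes the pattern is applied as a substring search, which is consistent with the FAULT results in cases 5, 7, and 8.

```python
import re
import xml.etree.ElementTree as ET


def attribute_values(xml_text, attr_name=None):
    """Collect attribute values from every element in the document.

    attr_name=None approximates //*/@* (any attribute);
    a concrete name approximates //*/@name. Namespaces are not handled.
    """
    root = ET.fromstring(xml_text)
    values = []
    for element in root.iter():  # the root element and all descendants
        for name, value in element.attrib.items():
            if attr_name is None or name == attr_name:
                values.append(value)
    return values


def check(xml_text, pattern, attr_name=None):
    """Return "FAULT" if any selected attribute value contains the pattern."""
    regex = re.compile(pattern)
    hit = any(regex.search(v) for v in attribute_values(xml_text, attr_name))
    return "FAULT" if hit else "no fault"


# The four payloads from the test cases; each is checked with both expressions.
PAYLOADS = [
    "<child attr1='acceptable content'>123</child>",
    "<child value='acceptable content'>123</child>",
    "<child attr1='bad content'>123</child>",
    "<child value='bad content'>123</child>",
]


def run_matrix():
    """Return results for cases 1..8 in order (wildcard first, then @value)."""
    results = []
    for payload in PAYLOADS:
        for attr_name in (None, "value"):
            results.append(check(payload, "bad", attr_name))
    return results


if __name__ == "__main__":
    for case_number, result in enumerate(run_matrix(), start=1):
        print(case_number, result)
```

If a local check like this agrees with the expected results but the policy in the proxy does not, the difference is more likely in the policy's Pattern or Source than in the XPath expression itself.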

Attached please find the API proxy I used to test this. To invoke it:

curl -i $endpoint/regexprotection-4/t1
curl -i $endpoint/regexprotection-4/t2
 ...
curl -i $endpoint/regexprotection-4/t8

3 REPLIES

Thanks a lot for your response! (I'm working on that task with the author of the post).

In fact, all we needed was to extract all attributes from our XML payload, but for some reason, our policy wasn't working as expected. Your example helped a lot, because we weren't sure how to check that this "//*/@*" expression worked as expected. We probably have a problem with our pattern or something else.
Anyway, thanks for your answer!

Thanks a lot, Dino, for your advice and help!