JavaRegex fails when input has CRLF in it

I have a JavaRegex I'm using in RouteRules on several proxies.

<Condition>(request.content JavaRegex ".*\u0022CountryCode\u0022\s*:\s*\u0022(GB|DE)\u0022.*")</Condition>

In one proxy, it works fine whether or not "CountryCode":"GB" in the request.content is preceded by a CRLF.
In another proxy, a CRLF anywhere in the request.content before the string causes the regex to fail.

If there's another way to look for this string to select the proper route rule, I'm open to it. I'd prefer using dot notation to find the string in the proper location in the request.content; however, JavaRegex would be an acceptable solution if arbitrary whitespace didn't break it.

1 ACCEPTED SOLUTION

In one proxy, it works fine whether or not "CountryCode":"GB" in the request.content is preceded by a CRLF.
In another proxy, a CRLF anywhere in the request.content before the string causes the regex to fail.

You're saying the Java Regex works differently depending on the proxy? The exact same Condition, placed the same way, with the same input content (same encoding), works differently in different proxies? If that's what you're saying, that would be a bug. I'd advise you to raise that with Apigee support to diagnose that and get it fixed. It suggests that there is some difference in how the proxies are deployed. That shouldn't happen. That feels mysterious, and I'd want to sort that out if I were you.

On the other hand, maybe you just want your regex to work and not be bothered with all of that. If that's the case, I would suggest trying this: embed a (?s) at the start of the regex. That flag turns on DOTALL mode, which tells the regex to accept line terminators such as CR and LF as a match for the dot (wildcard); by default the dot matches any character except a line terminator. Something like this:

<Condition>(request.content JavaRegex "(?s).*\u0022CountryCode\u0022\s*:\s*\u0022(GB|DE)\u0022.*")</Condition>

Then, retest and see if that solves your problem with the CR/LF thing.
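
If you want to see the effect outside of Apigee, here is a minimal sketch in plain Java. It assumes the JavaRegex operator behaves like java.util.regex with a whole-string match (I have not confirmed how Apigee invokes it); the class name, helper method, and sample payloads are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Apigee.
class DotAllDemo {
    // Same pattern as in the Condition, written as a Java string literal
    // (backslashes doubled so the regex engine sees the escapes).
    static final String WITHOUT_FLAG =
        ".*\\u0022CountryCode\\u0022\\s*:\\s*\\u0022(GB|DE)\\u0022.*";

    // Same pattern with the embedded DOTALL flag in front.
    static final String WITH_FLAG = "(?s)" + WITHOUT_FLAG;

    // Assumption: the condition must match the entire content,
    // which is what Pattern.matches does.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Made-up payloads: one on a single line, one with CRLF line breaks.
        String oneLine = "{\"Name\":\"x\",\"CountryCode\":\"GB\"}";
        String withCrlf = "{\r\n  \"Name\":\"x\",\r\n  \"CountryCode\" : \"GB\"\r\n}";

        System.out.println(fullMatch(WITHOUT_FLAG, oneLine));   // true
        System.out.println(fullMatch(WITHOUT_FLAG, withCrlf));  // false
        System.out.println(fullMatch(WITH_FLAG, oneLine));      // true
        System.out.println(fullMatch(WITH_FLAG, withCrlf));     // true
    }
}
```

Without the flag, the leading .* stops at the first CR, so a whole-string match cannot get past the first line. The \s* parts of the pattern already accept CR and LF; it is only the dot that needs (?s).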

Adding the (?s) does resolve the issue. I'm not sure exactly which whitespace character it catches that the .* didn't catch before, but since it works no matter what (or how many) whitespace characters I throw in there, I'm moving on.

Excellent! I love it when that happens.