How can I detect duplicate keys in JSON request body?

I know the RFC states "The names within an object SHOULD be unique" but most parsers, including the one used by Edge, just use the last occurrence of the duplicate.

I would like to detect this situation and raise a fault 400 Bad Request.

I'm validating my request using my Open API Spec and tv4, but it does not detect the duplicates either.

Here's what I'm thinking:

  1. Use a JSON parser that detects duplicates
  2. Use a JSON validator that detects duplicates
  3. Compare the original request with JSON.stringify(JSON.parse(request.content)), but the original request may be a "pretty" version, so a simple string comparison won't work.

Anyone else solve this problem?


Detecting duplicate keys requires parsing the JSON object in a streaming fashion. The builtin JSON.parse() doesn't do that.
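To make the problem concrete, here is a minimal sketch of what the builtin parser does with a repeated key. The payload is just an illustrative value, not taken from a real request.

```javascript
// JSON.parse() accepts a repeated key without complaint and keeps only the last value,
// so the parsed object carries no trace of the earlier occurrence.
var payload = '{ "key1": "a", "key1": "b" }';
var parsed = JSON.parse(payload);

console.log(parsed.key1);                 // b
console.log(Object.keys(parsed).length);  // 1
```

Since the duplicate is already gone by the time JSON.parse() returns, any check has to look at the raw text or hook into the parse as it happens.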

There are streaming parsers available for JavaScript. Clarinet is one. It works in the browser, so it may work nicely in a JS callout as well.

Example

// DuplicateKeyDetectingParser.js
// ------------------------------------------------------------------
//
// created: Tue Sep 18 09:32:28 2018
// last saved: <2018-September-18 11:45:39>


'use strict';


var clarinet = require("clarinet");


function Parser() {
  var keystack = [];  // one array of seen keys per currently open object
  var p = this;
  this.duplicateDetected = null;


  var parser = clarinet.parser();
  parser.onvalue = function (v) {
    // got some value.  v is the value. can be string, double, bool, or null.
  };
  parser.onopenobject = function (key) {
    // opened an object. key is the first key.
    var seenKeys = [];
    seenKeys.push(key);
    keystack.push(seenKeys);
  };
  parser.onkey = function (key) {
    // got a subsequent key in the innermost open object.
    var seenKeys = keystack[keystack.length - 1];
    if (seenKeys.indexOf(key) !== -1) {
      if ( ! p.duplicateDetected) {
        p.duplicateDetected = [];
      }
      p.duplicateDetected.push(key);
    }
    seenKeys.push(key);
  };
  parser.oncloseobject = function () {
    // closed an object; discard its list of seen keys.
    keystack.pop();
  };
  parser.onopenarray = function () {
    // opened an array.
  };
  parser.onclosearray = function () {
    // closed an array.
  };
  parser.onend = function () {
    // parser stream is done, and ready to have more stuff written to it.
  };


  this.parser = parser;
}


Parser.prototype.status = function() {
  //console.log('status(): ' + JSON.stringify(this.duplicateDetected));
  return this.duplicateDetected;
};


module.exports = Parser;

And this is a test driver:


var testcases = [
      { name: 'simple-good', hasDuplicate: false,  payload:'{"foo": "bar"}'},
      { name: 'simple-fail', hasDuplicate: true, payload: '{ "key1": "a", "key1": "b" }' },
      { name: 'nested1',     hasDuplicate: false,  payload: '{ "key1": "a", "key2": "b", "nested" : { "key1" : "value1", "key2": true}}'}
];


var Parser = require('./DuplicateKeyDetectingParser.js');


testcases.forEach(function(testcase){
  //console.log(testcase.name);
  console.log( '%s: %s %s', testcase.name, testcase.hasDuplicate?'has Dupe':'no Dupe', testcase.payload);
  var p = new Parser();
  p.parser.write(testcase.payload).close();
  console.log('%s: %s', testcase.name, (Boolean(p.status()) === testcase.hasDuplicate) ? 'PASS':'FAIL');
});

Thanks @Dino-at-Google I didn't know there were streaming parsers.

But, wait, we can't use "require" in a JavaScript callout.

Yes, you would have to browserify it, or take other steps to repackage this logic in a shape that a JS callout can load.
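One possible "other step" is to avoid the dependency altogether. The following is a rough sketch, not the clarinet-based parser above: a hypothetical findDuplicateKeys() function, written in plain ES5 with no require, that first lets JSON.parse() reject malformed input and then scans the raw text, treating any string followed by a colon as a key. The function name and the returned array shape are assumptions made for illustration.

```javascript
// Hypothetical helper: returns an array of duplicated key names (empty if none).
// Throws whatever JSON.parse() throws when the text is not valid JSON.
function findDuplicateKeys(text) {
  JSON.parse(text); // validate first, so the scan below can assume well-formed input

  var duplicates = [];
  var stack = [];   // one entry per open container: a key map for objects, null for arrays
  var i = 0;
  var n = text.length;

  while (i < n) {
    var c = text.charAt(i);
    if (c === '{') {
      stack.push({});
      i++;
    } else if (c === '[') {
      stack.push(null);
      i++;
    } else if (c === '}' || c === ']') {
      stack.pop();
      i++;
    } else if (c === '"') {
      // Find the end of the string literal, stepping over escaped characters.
      var start = i;
      i++;
      while (text.charAt(i) !== '"') {
        i += (text.charAt(i) === '\\') ? 2 : 1;
      }
      i++; // move past the closing quote

      // In valid JSON, a string followed by ':' can only be an object key.
      var j = i;
      while (j < n && /\s/.test(text.charAt(j))) {
        j++;
      }
      if (text.charAt(j) === ':') {
        // Decode escapes so that "a" and "\u0061" count as the same key.
        var key = JSON.parse(text.substring(start, i));
        var seen = stack[stack.length - 1];
        var slot = 'k:' + key; // prefix avoids clashes with names like __proto__
        if (Object.prototype.hasOwnProperty.call(seen, slot)) {
          duplicates.push(key);
        } else {
          seen[slot] = true;
        }
      }
    } else {
      i++; // whitespace, punctuation, or part of a number or literal
    }
  }
  return duplicates;
}
```

In a callout, the input would presumably be request.content, and a non-empty result could be used to raise the 400 fault. This has only been reasoned through against small payloads like the test cases above, so treat it as a starting point rather than a hardened parser.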

For option 3, I found a "minify" JavaScript function here that works, but I'm not sure how accurate it is with respect to JSON.parse().

For a reasonable example it worked:

  1. Minify the original request
  2. Parse and stringify the original request
  3. String compare to see if original matches parsed

Seems a bit brute force and it doesn't give me meaningful results.
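For reference, the three steps above can be sketched roughly as follows. The minifyJson() function here is a hypothetical stand-in that only strips whitespace outside of strings; it is not the "minify" function mentioned above, and roundTripDiffers() is likewise an illustrative name.

```javascript
// Hypothetical stand-in for a JSON minifier: drops whitespace outside string literals.
function minifyJson(text) {
  var out = '';
  var inString = false;
  for (var i = 0; i < text.length; i++) {
    var c = text.charAt(i);
    if (inString) {
      out += c;
      if (c === '\\') {
        out += text.charAt(++i);   // copy the escaped character untouched
      } else if (c === '"') {
        inString = false;
      }
    } else if (c === '"') {
      inString = true;
      out += c;
    } else if (c !== ' ' && c !== '\t' && c !== '\n' && c !== '\r') {
      out += c;
    }
  }
  return out;
}

// True when the minified original does not match the parse/stringify round trip.
function roundTripDiffers(text) {
  return minifyJson(text) !== JSON.stringify(JSON.parse(text));
}

console.log(roundTripDiffers('{ "key1": "a", "key1": "b" }')); // true
console.log(roundTripDiffers('{\n  "foo": "bar"\n}'));          // false
console.log(roundTripDiffers('{"n": 1.0}'));                    // true, but not a duplicate
```

The last call shows the accuracy concern: JSON.stringify() normalizes numbers (1.0 becomes 1), rewrites some string escapes, and moves integer-like keys to the front, so a mismatch does not necessarily mean a duplicate key, and the comparison cannot say which key was repeated.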

There's a much cleaner solution, an Apigeekster Egg!!!

Thanks to @DAcharya who was experimenting with same and was trying various policies.

A simple RegularExpressionProtection policy, left at its default values, detects the duplicates:

<RegularExpressionProtection async="false" continueOnError="false" enabled="true" name="RE-DetectDuplicates">
    <DisplayName>RE-DetectDuplicates</DisplayName>
    <Properties/>
    <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
    <JSONPayload>
        <JSONPath>
            <Pattern ignoreCase="false">pattern</Pattern>
            <Expression>expression</Expression>
        </JSONPath>
    </JSONPayload>
</RegularExpressionProtection>

Given this request:

{
    "fullName": "Joe Shmoe",
    "firstName": "Joe",
    "firstName": "Joe2",
    "lastName": "Shmoe"
}

The result of the fault is:

{
    "fault": {
        "faultstring": "Failed to execute the RegularExpressionProtection StepDefinition RE-DetectDuplicates. Reason: Unexpected duplicate key:firstName at position 72.",
        "detail": {
            "errorcode": "steps.regexprotection.ExecutionFailed"
        }
    }
}

It works on more complex structures too and I haven't found any false positives.

Pretty cool!

Interesting! Built-in!

I tried this policy in Apigee X and unfortunately it's not working. Are there any updates?

To check for duplicate fields in Apigee X, I guess you would need to implement your own check via a streaming JSON parser, in either JS or Java.

Like maybe this: https://github.com/DinoChiesa/Apigee-Java-Json-Check