Generally, sensitive information (PCI, PII) should not be used in URIs (path segments or query params), because URIs can be logged at termination points or written inadvertently to proxy logs without masking.
This question is not about what is "sensitive" information, but I'll offer a definition:
Sensitive data is any value that has meaning by itself, outside the context of the API (e.g. credit card and social security numbers).
Those are obvious, and there will be other values that are deemed sensitive by a specific implementation (e.g. phone number, account number).
Consider this simple use case, where the phoneNumber is classified as "sensitive" and not allowed to be used in the URL:
GET /accounts?phoneNumber=8665551212 // find account by number
GET /lines/8665551212/usages // get usages for number
One alternative that comes to mind is to use a search endpoint:
POST /accounts/search { phoneNumber=8665551212 }
POST /lines/search/usages { phoneNumber=8665551212 }
What other alternatives are there to achieve the same?
Let's expand the use case to differentiate between array responses and single-object responses.
Consider this expanded use case:
GET /accounts?phoneNumber=8665551212 // find account by number
GET /lines/8665551212 // get a unique resource
GET /lines/8665551212/usages // get usages for number
GET /accounts/1234/lines/8665551212 // get a unique resource
This expanded alternative comes to mind: use search for array responses and get for single responses. That way the response expectation of a URI doesn't change.
POST /accounts/search { phoneNumber=8665551212 } // expect array response
POST /lines/get { phoneNumber=8665551212 } // singular response
POST /usages/search { phoneNumber=8665551212 } // expect array response
POST /lines/get { phoneNumber=8665551212, account=1234 } // singular
Does the use of "get" as a path segment pass peer review?
Is there really no other identifier that can be used instead of a phone number? I would try to find another identifier for that resource and keep using GET for the line resource.
For search operations, like "find account by number", I would go with a POST body to hide the data.
BTW - GET /accounts/1234/lines/{id} seems redundant if you also have GET /lines/{id}, assuming line identifiers are not reused across different accounts.
In this case phoneNumber is required because it's a search term. For the line resource identifier in the path, it makes sense to use another identifier that has no intrinsic meaning.
Using POST /accounts/search to hide query params used as search terms is an acceptable and common practice.
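To make the difference concrete, here is a minimal Python sketch (standard library only) of what each style puts on the request line, which is the part an access log typically records. The helper names `get_request` and `search_request` are made up for illustration and are not part of any API discussed here.

```python
import json
from urllib.parse import urlencode

def get_request(path, params):
    """Build (request_line, body) for a GET that carries its criteria in the query string."""
    return f"GET {path}?{urlencode(params)}", None

def search_request(collection, criteria):
    """Build (request_line, body) for a POST to the collection's search endpoint."""
    return f"POST {collection}/search", json.dumps(criteria)

criteria = {"phoneNumber": "8665551212"}

# The phone number is part of the request line, so URL logging captures it.
get_line, get_body = get_request("/accounts", criteria)

# The phone number travels only in the body; the request line stays clean.
post_line, post_body = search_request("/accounts", criteria)
```

This only helps if request bodies are not logged as well, which is an assumption about the logging setup.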
Also agree that GET /accounts/1234/lines/{id} could be redundant, I just wanted to show an example where a URL path included 2 sensitive data segments.
Thanks
Yes, I was suggesting POST with a body for search, and another identifier as the resource identifier in GET.
Gave this some more thought, maybe too much.
Here's what I came up with:
So now the hypothetical use case assumes both accNum and phoneNumber are sensitive:
GET /accounts?phoneNumber=8665551212 // find account by number, array response
GET /accounts/1234/lines // get lines on account, array response
GET /accounts/1234/lines/8665551212 // get line details, singular response
GET /lines/8665551212 // get line details
PUT /lines/8665551212 // update line
DELETE /lines/8665551212 // delete line
becomes
POST /accounts/search { phoneNumber=8665551212 } // array
POST /lines/search { accNum=1234 } // array
POST /lines/get { phoneNumber=8665551212, accNum=1234 } // singular
POST /lines/get { phoneNumber=8665551212 }
POST /lines/put { phoneNumber=8665551212, status=HOLD }
POST /lines/delete { phoneNumber=8665551212 }
For URLs with a parent collection resource identifier, you could use a placeholder character “-” to maintain path relations if desired.
POST /accounts/-/lines/search { accNum=1234 } // array
POST /accounts/-/lines/get { phoneNumber=8665551212, accNum=1234 } // single
Using search, get, put and delete as path suffixes ensures response expectations are met: search returns an array and the verb suffixes return a single resource.
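As a rough illustration of that contract, here is a small Python sketch of a dispatcher for POST /lines/{action}. The in-memory `LINES` list, the second phone number, the `ACTIVE` status value, and the function names are all assumptions made up for this example; a real backend would look different.

```python
# Hypothetical in-memory data standing in for a real backend.
LINES = [
    {"phoneNumber": "8665551212", "accNum": "1234", "status": "ACTIVE"},
    {"phoneNumber": "8665550000", "accNum": "1234", "status": "ACTIVE"},
]

def _matching(criteria):
    """Return every line whose fields equal all of the given criteria."""
    return [line for line in LINES
            if all(line.get(k) == v for k, v in criteria.items())]

def _single(criteria):
    """Return exactly one matching line, or raise if there are zero or several."""
    found = _matching(criteria)
    if len(found) != 1:
        raise LookupError(f"expected exactly one line, found {len(found)}")
    return found[0]

def handle_lines(action, body):
    """Handle POST /lines/{action}; all sensitive values arrive in the body."""
    if action == "search":
        return _matching(body)                      # always a list
    if action == "get":
        return _single(body)                        # always one object
    if action == "put":
        line = _single({"phoneNumber": body["phoneNumber"]})
        line.update(body)                           # apply the changed fields
        return line
    if action == "delete":
        line = _single({"phoneNumber": body["phoneNumber"]})
        LINES.remove(line)
        return line
    raise ValueError(f"unsupported action: {action}")
```

The point is only that the suffix fixes the response shape: `search` may return zero, one, or many items in a list, while the verb suffixes either return one object or fail.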
Too goofy?
Here's another approach, thanks to @Dallen
Continue to use search as described above.
For unitary operations, use a POST to the rightmost collection, put the sensitive data in the payload and set "X-HTTP-Method-Override" to the verb instead of adding a verb path segment.
So
GET /lines/8665551212
becomes:
X-HTTP-Method-Override: GET
POST /lines { "phoneNumber":"8665551212" }
There's some risk in this approach, and it requires diligence to differentiate between a POST that creates a resource and a POST that gets one. As you can see, if the header is omitted, it is a create request.
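The create-versus-get ambiguity is easy to see in code. This is a minimal Python sketch of how a server or proxy might decide which verb to act on. The function name `effective_method` is hypothetical, and honoring the override only on POST is an assumption of this sketch.

```python
def effective_method(method, headers):
    """Return the verb the backend should act on for this request."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    override = normalized.get("x-http-method-override")
    if method == "POST" and override:
        return override.upper()
    # No override header: the request keeps its literal verb, so a
    # POST to /lines is treated as a create.
    return method

with_header = effective_method("POST", {"X-HTTP-Method-Override": "GET"})
without_header = effective_method("POST", {})
```

Here `with_header` is the get and `without_header` is the create, for an otherwise identical request.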
I think I'm deciding that I like this one the best. Why? Because it's relatively orthogonal to the canonical GET.
I can see a proxy pre-flow that looks for the override header, sets the verb, and munges the path so that the rest of the proxy can still match GET /blah/phoneNumber, which would make it a little easier to maintain.
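A rough Python sketch of such a pre-flow step, just to show the shape of the rewrite. The `ID_FIELDS` mapping and the `pre_flow` function are hypothetical names, and the sketch assumes the identifier field for each collection is known in advance.

```python
# Hypothetical mapping: which body field is the identifier for each collection.
ID_FIELDS = {"/lines": "phoneNumber"}

def pre_flow(method, path, headers, body):
    """Rewrite an override request into its canonical form for internal routing.

    Returns (method, path, body). The rewritten path contains the sensitive
    value, so it should only be used inside the proxy and never logged.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    override = normalized.get("x-http-method-override")
    if method != "POST" or not override or path not in ID_FIELDS:
        return method, path, body           # plain request, e.g. a real create
    remaining = dict(body)                   # copy so the caller's body is untouched
    identifier = remaining.pop(ID_FIELDS[path])
    return override.upper(), f"{path}/{identifier}", remaining

rewritten = pre_flow("POST", "/lines",
                     {"X-HTTP-Method-Override": "GET"},
                     {"phoneNumber": "8665551212"})
```

After this step the downstream flows can keep matching the canonical GET /lines/{id} form.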
Thoughts about that pre-flow idea?
That may work, but depending on how and when logging is done, it may not solve the problem of keeping sensitive values out of the logs.
Yet another approach, thanks again to @Dallen
Continue to use search as described above.
For unitary operations and paths with intermediate identifiers, use the same path and verb, move the sensitive path identifier to a header and use a reference to its location as the identifier.
So
GET /lines/8665551212
POST /lines/8665551212/usages - { "data":"120" }
becomes:
X-linesId: 8665551212
GET /lines/urn:id:headers.x-linesid
POST /lines/urn:id:body.phoneNumber/usages - { "data":"120", "phoneNumber":"8665551212" }
This has the benefit of preserving the verbs and API signatures; only the processing logic in the proxy needs to be adjusted.
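For what the proxy-side adjustment might look like, here is a minimal Python sketch that swaps each reference in the path for the value it points to. The `urn:id:headers.<name>` and `urn:id:body.<field>` forms come from the example above; the regular expression and the `resolve_path` name are assumptions of this sketch.

```python
import re

# Matches references such as urn:id:headers.x-linesid or urn:id:body.phoneNumber.
REFERENCE = re.compile(r"urn:id:(headers|body)\.([A-Za-z0-9_-]+)")

def resolve_path(path, headers, body):
    """Replace each reference in the path with the value it points to.

    The resolved path contains the sensitive value, so it should only be
    used internally, after any URL logging has already happened.
    """
    # Header names are case-insensitive, so compare them in lower case.
    normalized = {name.lower(): value for name, value in headers.items()}

    def lookup(match):
        source, name = match.group(1), match.group(2)
        if source == "headers":
            return normalized[name.lower()]
        return str(body[name])

    return REFERENCE.sub(lookup, path)

line_path = resolve_path("/lines/urn:id:headers.x-linesid",
                         {"X-linesId": "8665551212"}, {})
usage_path = resolve_path("/lines/urn:id:body.phoneNumber/usages",
                          {}, {"data": "120", "phoneNumber": "8665551212"})
```

A missing header or body field raises `KeyError` in this sketch; a real proxy would more likely turn that into a 400 response.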