What are alternatives for passing sensitive data in URLs?

Generally, sensitive information (PCI, PII) should not be placed in URIs (path segments or query params), because URIs can be logged at termination points or inadvertently written unmasked to proxy logs.

This question is not about what is "sensitive" information, but I'll offer a definition:

Sensitive data is any value that has meaning by itself, outside the context of the API (e.g. credit card and social security numbers).

Those are obvious, and there will be other values that are deemed sensitive by a specific implementation (e.g. phone number, account number).

Consider this simple use case, where the phoneNumber is classified as "sensitive" and not allowed to be used in the URL:

GET /accounts?phoneNumber=8665551212  // find account by number
GET /lines/8665551212/usages          // get usages for number

One alternative that comes to mind is to use a search endpoint:

POST /accounts/search { phoneNumber=8665551212 }
POST /lines/search/usages { phoneNumber=8665551212 }
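
A minimal sketch of the search-endpoint idea, assuming the body is sent as JSON (the examples above do not fix a body format). The helper name `to_search_request` is hypothetical; it only shows how the query params move out of the URL and into the request body.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def to_search_request(method, url):
    """Turn GET /collection?k=v into POST /collection/search with a JSON body."""
    if method.upper() != "GET":
        raise ValueError("only GET searches are converted")
    parts = urlsplit(url)
    # Collect the query params that must not appear in the URL.
    params = dict(parse_qsl(parts.query))
    if not params:
        raise ValueError("no query parameters to move")
    # The new path carries no sensitive value, only the "search" suffix.
    path = parts.path.rstrip("/") + "/search"
    return "POST", path, json.dumps(params)


method, path, body = to_search_request("GET", "/accounts?phoneNumber=8665551212")
```

After the conversion the phone number appears only in the body, so a log line that records the method and path no longer contains it.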

What other alternatives are there to achieve the same?


Let's expand the use case to differentiate between responses that return multiple items (array) and those that return a single item (object).

Consider this expanded use case:

GET /accounts?phoneNumber=8665551212  // find account by number
GET /lines/8665551212                 // get a unique resource
GET /lines/8665551212/usages          // get usages for number
GET /accounts/1234/lines/8665551212   // get a unique resource

This expanded alternative comes to mind: use search for an array response and get for a single response. That way the response expectation of the URI doesn't change.

POST /accounts/search { phoneNumber=8665551212 } // expect array response
POST /lines/get       { phoneNumber=8665551212 } // singular response
POST /usages/search   { phoneNumber=8665551212 } // expect array response
POST /lines/get       { phoneNumber=8665551212, account=1234 } // singular 

Does the use of "get" as a path segment pass peer review?

Is there really no other identifier that can be used instead of a phone number? I would try to find another identifier for that resource and keep using GET for the line resource.

For search operations, like "find account by number", I would go with a POST body to hide the data.

BTW - GET /accounts/1234/lines/{id} seems redundant if you also have GET /lines/{id}, assuming line identifiers are not reused across different accounts.

In this case phoneNumber is required because it's a search term. For the line resource identifier in the path, it makes sense to use another identifier that has no intrinsic meaning.

The use of POST /accounts/search to hide query params used for search is acceptable and common practice.

I also agree that GET /accounts/1234/lines/{id} could be redundant; I just wanted to show an example where a URL path included 2 sensitive data segments.

Thanks

Yes, I was suggesting POST with a body for search and another identifier for the resource identifier in GET.

Gave this some more thought, maybe too much.

Here's what I came up with:

  1. Avoid using sensitive data as primary identifiers; use alternate identifiers instead. This avoids some of the situations entirely.
  2. For GET operations used as a convenience search on collections (with sensitive data as query params or path params), convert to a POST on the collection with a “search” path suffix and move the params to the request body. The response is always an array, if anything is found.
  3. For POST operations that create a resource on subresource collections (with sensitive data as path params), move the path param to the request body and use the subresource collection in the URL. The response is always a single resource, if any.
  4. For GET, PUT, DELETE operations on a resource (assuming #1 is not acceptable), convert to a POST operation on the collection with a corresponding “{verb}” path suffix (get, put, delete) and move the primary identifier and any parent resource path params to the request body. Again, use the subresource as the collection in the URL. The response is always a single resource, if any.

So now the hypothetical use case, assuming both accNum and phoneNumber are sensitive:

GET /accounts?phoneNumber=8665551212  // find account by number, array response
GET /accounts/1234/lines              // get lines on accounts, array response
GET /accounts/1234/lines/8665551212   // get line details, singular response
GET /lines/8665551212                 // get line details
PUT /lines/8665551212                 // update line
DELETE /lines/8665551212              // delete line

becomes

POST /accounts/search { phoneNumber=8665551212 }              // array
POST /lines/search    { accNum=1234 }                         // array
POST /lines/get       { phoneNumber=8665551212, accNum=1234 } // singular
POST /lines/get       { phoneNumber=8665551212 } 
POST /lines/put       { phoneNumber=8665551212, status=HOLD } 
POST /lines/delete    { phoneNumber=8665551212 } 
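
Rules 2 and 4 can be sketched as a small rewrite function. This is only an illustration under assumptions: the function name `rewrite`, its parameters, and the use of a plain dict for the body are all hypothetical, and rule 3 (creates on subresource collections) is left out.

```python
UNITARY_VERBS = {"GET", "PUT", "DELETE"}


def rewrite(method, collection, id_field=None, resource_id=None, params=None):
    """Return (method, path, body) with sensitive values moved into the body."""
    body = dict(params or {})
    verb = method.upper()
    if resource_id is None:
        # Rule 2: a GET on the collection becomes POST /collection/search.
        if verb != "GET":
            raise ValueError("only GET is treated as a collection search")
        suffix = "search"
    else:
        # Rule 4: GET/PUT/DELETE on a resource becomes POST /collection/{verb}.
        if verb not in UNITARY_VERBS:
            raise ValueError(f"unsupported verb: {verb}")
        if id_field is None:
            raise ValueError("id_field is required when resource_id is given")
        suffix = verb.lower()
        # The identifier leaves the path and travels in the body instead.
        body[id_field] = resource_id
    return "POST", f"/{collection}/{suffix}", body


search = rewrite("GET", "accounts", params={"phoneNumber": "8665551212"})
update = rewrite("PUT", "lines", "phoneNumber", "8665551212", {"status": "HOLD"})
```

Here `search` corresponds to POST /accounts/search and `update` to POST /lines/put in the list above.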

For URLs with a parent collection resource identifier, you could use a placeholder character “-” to maintain path relations if desired.

POST /accounts/-/lines/search { accNum=1234 }                         // array
POST /accounts/-/lines/get    { phoneNumber=8665551212, accNum=1234 } // single

Using search, get, put and delete as path suffixes ensures response expectations are met: search returns an array and verb suffixes return a single resource.

Too goofy?

Here's another approach, thanks to @Dallen

Continue to use search as described above.

For unitary operations, use a POST to the rightmost collection, put the sensitive data in the payload and set "X-HTTP-Method-Override" to the verb instead of adding a verb path segment.

So

GET /lines/8665551212

becomes:

X-HTTP-Method-Override:GET
POST /lines { "phoneNumber":"8665551212" }

This approach carries some risk and requires diligence to differentiate between a POST that creates the resource and a POST that gets a resource. As you can see, if the header is omitted, it is a create request.

I think I'm deciding that I like this one the best. Why? Because it's relatively orthogonal to the canonical GET.

I can see a proxy pre-flow that looks for the override header, sets the verb, and munges the path so that the proxy itself can still match on GET /blah/phoneNumber, which would make it a little easier to maintain.
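
A rough sketch of what such a pre-flow step might do, written as plain Python rather than any particular proxy's policy language. The function name `preflow`, the `id_field` parameter, and the dict-based request shape are assumptions for illustration only.

```python
ALLOWED_OVERRIDES = {"GET", "PUT", "DELETE"}


def preflow(method, path, headers, body, id_field="phoneNumber"):
    """Resolve X-HTTP-Method-Override and rebuild the canonical request."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    override = lowered.get("x-http-method-override")
    if method.upper() != "POST" or override is None:
        # No override: a plain POST stays a create request.
        return method.upper(), path, body
    verb = override.strip().upper()
    if verb not in ALLOWED_OVERRIDES:
        raise ValueError(f"override not allowed: {verb}")
    remaining = dict(body)
    # Move the identifier from the payload back into the internal path.
    identifier = remaining.pop(id_field)
    return verb, f"{path.rstrip('/')}/{identifier}", remaining


overridden = preflow("POST", "/lines",
                     {"X-HTTP-Method-Override": "GET"},
                     {"phoneNumber": "8665551212"})
plain = preflow("POST", "/lines", {}, {"phoneNumber": "8665551212"})
```

Note that the rebuilt internal path contains the phone number again, so this only helps if nothing logs the request after the pre-flow has run.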

Thoughts about that pre-flow idea?

That may work, but depending on how and when logging is done, it may not solve the problem of preventing stuff from showing up in logs.

Yet another approach, thanks again to @Dallen

Continue to use search as described above.

For unitary operations and paths with intermediate identifiers, use the same path and verb, move the sensitive path identifier to a header and use a reference to its location as the identifier.

So

GET /lines/8665551212

POST /lines/8665551212/usages - { "data":"120" }

becomes:

X-linesId:8665551212
GET /lines/urn:id:headers.x-linesid


POST /lines/urn:id:body.phoneNumber/usages - { "data":"120", "phoneNumber":"8665551212" }

This has the benefit of preserving the verbs and API signatures; you merely adjust the processing logic in the proxy.
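
The proxy-side processing could look roughly like the sketch below. It assumes the reference syntax is exactly `urn:id:headers.<name>` or `urn:id:body.<field>` as in the examples above; the function names `resolve_segment` and `resolve_path` are hypothetical.

```python
PREFIX = "urn:id:"


def resolve_segment(segment, headers, body):
    """Replace a urn:id:<source>.<name> reference with the value it points to."""
    if not segment.startswith(PREFIX):
        return segment
    source, _, name = segment[len(PREFIX):].partition(".")
    if source == "headers":
        # Header names are case-insensitive.
        lowered = {k.lower(): v for k, v in headers.items()}
        return lowered[name.lower()]
    if source == "body":
        return str(body[name])
    raise ValueError(f"unknown reference source: {source}")


def resolve_path(path, headers, body):
    """Resolve every reference segment in a path; other segments pass through."""
    return "/".join(resolve_segment(s, headers, body) for s in path.split("/"))


from_header = resolve_path("/lines/urn:id:headers.x-linesid",
                           {"X-linesId": "8665551212"}, {})
from_body = resolve_path("/lines/urn:id:body.phoneNumber/usages", {},
                         {"data": "120", "phoneNumber": "8665551212"})
```

The path that reaches the proxy holds only the reference, while the resolved path exists solely inside the proxy and should be kept out of any logging.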