Exception Handling, Retrying, and Maintaining State in Apigee

Hi,

I'm pretty sure this is a case of (unintentionally) abusing Apigee's native architecture and intent, hacking it to behave like an ESB or some sort of message bus simply because that's the technology people have been used to for the past two decades. But it could also be a case where my book knowledge isn't appropriate for a practical situation involving a real enterprise's IT systems. My purpose is to check my understanding with the larger Apigee community, to make sure I'm not being brazen or presumptuous in my thinking.

Scenario: Suppose some data goes from a client to a target through Apigee, and the data does not reach the target server. There are two possible reasons:

1. The backend/target system is down for maintenance - a scheduled outage, e.g. once a month, or 5-10 hours a day.

2. Something was wrong with the data, i.e. a bad request, so the server rejected it.

3. (Assume there is no other reason why a request triggered by the client would not be processed by the backend/target system.)

Questions/Discussion Points

1) What do you think should happen in this scenario?

-My answer is that state should always be maintained in the source system that triggered the request; otherwise it's not REST (unless I've misunderstood the stateless constraint in REST). If it was reason #2, that is all the more reason for the source to retrigger the request with a corrected payload.

2) Does it violate REST architectural constraints (and is it OK to violate REST while staying within pragmatic REST guidelines) to ask the middleware or the server to queue/store failed transactions/requests and retry?

3) If the target system is down for maintenance, people seem reluctant to make the source queue and retry when it comes back up.

a) Should the source/client be asked to queue and retry?

b) Should the middleware (Apigee or some other message bus like BizTalk) be asked to queue and retry?

c) If the answer to b) is Yes, how do you do that in Apigee? I could leverage Apigee Caching but do you have any better solutions?

Best,

Pranav

1 ACCEPTED SOLUTION

Queueing is independent of REST. One might say "orthogonal".

  • REST describes a model in which HTTP methods and URLs can be applied to a domain resource model for create, update, retrieval and query.
  • By queueing, we understand a system that stores requests (and perhaps responses) for asynchronous consumption. System1 can send a message to System2 even though System2 is never available at the same time System1 is; likewise for the response.

Given that understanding, in theory an architect could choose one, the other, both, or neither. What would it look like to have a RESTful service that queued requests?

REST is used in synchronous systems. Clients are simple: they send requests to services, and services reply with one of a limited set of statuses. 503 Service Unavailable is one of those - it is designed to tell the client "I can't help you now; maybe try again later."

One can imagine a REST endpoint which always returns HTTP status 202, indicating "I've accepted that request, but I don't have a reply for you now." This REST endpoint might enqueue the request for later handling by some other system. With this design, at some later point it would be the responsibility of the client to call again to the REST endpoint to inquire the status of the original request. This makes the client more complicated, as it now must maintain a list of requests, and the status of each one, and then manage the periodic re-query of each of those requests.
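A minimal sketch of that 202 pattern from the client's point of view, in nodejs. Here `api` stands in for the REST endpoint: `submit()` returns a 202 plus a request id, and `checkStatus(id)` reports the request's progress. All names are hypothetical, chosen only for illustration.

```javascript
// Client for a 202-accepting endpoint: submit, then re-query until done.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function submitAndAwait(api, payload, { pollMs = 500, maxPolls = 20 } = {}) {
  const accepted = await api.submit(payload);
  if (accepted.status !== 202) {
    throw new Error(`unexpected status ${accepted.status}`);
  }
  // The client now carries the burden: remember the request id and
  // periodically re-query it until the server reports completion.
  for (let i = 0; i < maxPolls; i++) {
    await sleep(pollMs);
    const status = await api.checkStatus(accepted.id);
    if (status.state === 'done') return status;
  }
  throw new Error(`request ${accepted.id} still pending after ${maxPolls} polls`);
}
```

Note how the complexity lands entirely on the client: it must track each outstanding id and its polling schedule.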

A different approach would be a REST service that always returns 200 or 201 (or 204) indicating "that request has been received and handled" for success, and 503 for "I cannot reach the backend just now." That requires the client to keep track of which requests got 503, and of those, which need to be re-sent.
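The second design can be sketched as a small client-side retry wrapper honoring the "503 means try again later" contract. `sendRequest` is assumed to be any async function resolving to an object with a `status` field (in a real client, a fetch of the service URL); the wrapper and its parameters are illustrative, not a specific library's API.

```javascript
// Retry a request while the service answers 503, with exponential backoff.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function sendWithRetry(sendRequest, { maxAttempts = 4, baseDelayMs = 200 } = {}) {
  let response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    response = await sendRequest();
    // Success, or a client error like 400: return it to the caller as-is.
    if (response.status !== 503) return response;
    // Back off before the next attempt: 200ms, 400ms, 800ms, ...
    if (attempt < maxAttempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
  return response; // still 503 after maxAttempts; surface it
}
```

The bookkeeping (which requests got 503, which still need re-sending) lives in the client, exactly as the paragraph above describes.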

That gives some context for my suggestion that the use of REST and queueing are independent.

Should the middleware (...) be asked to queue and retry?

This is an architectural question you need to decide. Probably the inputs to that decision include the requirements and constraints of your application, as well as which tools, frameworks and systems are available to you, and the operational and performance behavior of those systems.

  • As examples of tools and frameworks, one could imagine a nodejs request "middleware" which enqueues and retries requests. (I just googled and found requestretry, which does exactly this.) In this case the middleware would be resident on the client, part of the client app. But it could be done this way.
  • Another option is to have an intervening system, a separate process or cluster, which is responsible for persisting requests and retrying. We normally see such features in systems called an ESB or a queue. How you handle responses and timeouts is an interesting question in this case.
  • There are other options that sort of shade between "in process" and "external system". For example, consider the sidecar proxy pattern used within Istio. In this pattern, the client app (service requester) never connects directly with the service receiver. Instead the client connects via the sidecar proxy, and that proxy is responsible for knowing which service instances are available, and even for retrying requests to those service instances. The queueing is done in memory only, and a fail-then-retry can be completely transparent to the client requester. This is handy when the receiving service is "nearby", meaning close to the client app in the network. In the sidecar model, the sidecar proxy is a separate process, but it is dedicated to the client itself, so it's almost like a linked-in library.
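To illustrate the first, in-process option: a minimal queue-and-retry layer might look like the sketch below. Failed sends are held in an in-memory queue and re-flushed later. This is an illustrative toy, not the requestretry package's API; a real implementation would persist the queue, bound its growth, and schedule flushes.

```javascript
// In-process queue-and-retry: hold 503'd (or errored) requests and retry later.
class RetryQueue {
  constructor(send) {
    this.send = send;   // async fn: payload -> { status }
    this.pending = [];  // payloads awaiting retry
  }
  async enqueue(payload) {
    // Treat transport errors like an unavailable backend.
    const response = await this.send(payload).catch(() => ({ status: 503 }));
    if (response.status === 503) this.pending.push(payload);
    return response;
  }
  // Retry everything queued; whatever still fails goes back in the queue.
  async flush() {
    const toRetry = this.pending;
    this.pending = [];
    for (const payload of toRetry) await this.enqueue(payload);
  }
}
```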

Apigee is one of those tools or systems. In general, Apigee acts as a reverse proxy that you can easily configure to do various things, like serve from cache, route, verify credentials, rate limit, and transform requests. It has other capabilities outside of reverse proxying, such as issuing OAuth tokens directly. And it's extensible.

Apigee is not a queue. Trying to turn Apigee into a queue will not end well; that's not what it is good at. Even so, you can configure the HTTP TargetEndpoint in Apigee to retry failed requests when you have multiple target servers. This works like the sidecar model; the retry behavior is simple and in-memory.
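As a sketch, such a multi-target configuration follows the shape of Apigee's load-balancing setup for a TargetEndpoint. The server names and path here are placeholders; they would have to match TargetServers you define in your environment, so check the Apigee load-balancing docs for your release before copying this.

```xml
<TargetEndpoint name="default">
  <HTTPTargetConnection>
    <LoadBalancer>
      <Algorithm>RoundRobin</Algorithm>
      <!-- "backend-1"/"backend-2" are hypothetical TargetServer names
           defined separately in the environment configuration -->
      <Server name="backend-1"/>
      <Server name="backend-2"/>
      <!-- Take a server out of rotation after 5 consecutive failures -->
      <MaxFailures>5</MaxFailures>
    </LoadBalancer>
    <Path>/orders</Path>
  </HTTPTargetConnection>
</TargetEndpoint>
```

When one target fails, the proxy can route to the other; this gives failover, not durable queueing.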

You may decide that you wish to embed explicit queue-and-retry semantics in your system. If you take that decision, you can use Apigee as the reverse proxy, but you should not use Apigee as the queue. If you want queueing, consider a tool or system designed for that purpose, for example Google Cloud Pub/Sub.
