Transfer 500MB of data when the max payload size is 10MB

Can you transfer >500MB of data via the Apigee gateway when the max payload is 10MB? What's the best way to do this?


Per the limits page: https://docs.apigee.com/api-platform/reference/limits

A common API pattern is to fetch large amounts of data, such as images, documents, or plain text/JSON. For data sizes greater than 10 MB, Apigee recommends a signed URLs pattern. Other Google products like GCS (Google Cloud Storage) provide reference implementations using this pattern.

So signed URLs for chunks of <10MB each are the best way to do this.
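
To make the pattern concrete, here is a minimal sketch of generating a V4 signed URL with the google-cloud-storage Python client, assuming a service account with signing permission on the bucket; the bucket and object names are placeholders. The proxy returns the short-lived URL, and the client then fetches the large object directly from GCS instead of pulling it through the gateway.

```python
# Minimal sketch of the signed URLs pattern (names are placeholders).
# Requires the google-cloud-storage library and credentials with
# signing permission on the bucket.
from datetime import timedelta

from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-large-files").blob("exports/report-500mb.json")

url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=15),  # keep the URL short-lived
    method="GET",                      # use "PUT" for direct uploads instead
)
print(url)  # the proxy returns this URL instead of the 500MB payload
```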

500MB is huge, and no gateway will support it. Streaming will not work in this case either. The remaining option is to paginate at your target, as sketched below.
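
If you go the pagination route, a client loop along these lines keeps every response under the gateway limit; the endpoint, parameter names, and response shape here are assumptions for illustration, not anything Apigee-specific.

```python
# Hypothetical paginated client: each page stays well under 10MB.
import requests

BASE_URL = "https://api.example.com/v1/records"  # placeholder proxy endpoint

def fetch_all(page_size=1000):
    """Yield every record, requesting one modest page at a time."""
    page = 1
    while page:
        resp = requests.get(BASE_URL, params={"page": page, "size": page_size})
        resp.raise_for_status()
        body = resp.json()
        yield from body["items"]
        page = body.get("nextPage")  # assumed falsy when no pages remain

records = list(fetch_all())
```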

I'm not sure that manually chunking 500MiB of data into 10MiB pieces is a good or practical idea, so I will discuss a streaming option instead.

From a streaming point of view, there are two aspects to your question:

Is it technically possible to transfer requests with large payloads through Apigee?

Yes. I have successfully streamed requests with 10GiB payloads on OPDK, by configuring proxies, Routers, Message Processors, nodes, and timeouts, as well as tuning the NGINX configuration to enable pass-through streaming along all components.

With an OPDK instance installed in one GCP project, a client workstation sending requests from another GCP project, and payloads streamed into GCS buckets of yet another GCP project, curl on the client workstation takes about 8 minutes to stream a 10GiB payload, 3.5 minutes for 5GiB, and 1 minute for 1GiB, on average.
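
As a client-side illustration of what those curl runs do, a Python equivalent might look like the sketch below; the proxy URL and file name are placeholders. Passing a file object to requests makes it stream the body from disk rather than buffering it in memory.

```python
# Sketch of a streaming upload through a proxy (URL and file name are
# placeholders). requests reads the file in chunks, so even a 10GiB
# payload never has to fit in client memory.
import requests

PROXY_URL = "https://myorg-prod.example.com/v1/uploads"  # placeholder

with open("payload-10gib.bin", "rb") as f:
    resp = requests.post(PROXY_URL, data=f, timeout=600)
resp.raise_for_status()
```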

For comparison, gsutil straight to a GCS bucket on the same hardware and network infrastructure takes 4 minutes for 10GiB, 1m40s for 5GiB, and 30 seconds for 1GiB files (average of three runs on that day).

The straight-to-GCS transfer is faster, as you would expect, but only by about a factor of two.

It actually takes less Message Processor overhead to handle requests in streaming mode, as theoretically expected.

Is it a good idea?

It depends.

Apigee SaaS is a slightly different environment from Apigee OPDK with regard to limits, allowances, and request timeouts, so that might lower your practical cut-off point for the maximum payload size; 500MiB should still fit in the timeout window. As of today, a maximum size for streamed requests is not enforced, but the ~60-second request processing timeout will make sure that you're not abusing the platform.

There is not much you can do with regard to payload processing, as policies that operate on the payload are ignored during streaming.

The fact that it can be done does not mean it is the best idea for all use cases. As Christian said in another answer, the majority of customers prefer to off-load big file transfers to an out-of-band process and use Edge proxies and signed URLs to control that process. In this case you do not need to chunk your data, and the transfer speed will be optimal, with niceties like resumable uploads, parallel uploads, and reliability; see the sketch below.
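
As a rough sketch of that out-of-band pattern with resumable uploads, the google-cloud-storage client can upload in fixed chunks so that a dropped connection resumes instead of restarting; the bucket and file names below are placeholders.

```python
# Sketch of an out-of-band resumable upload straight to GCS
# (names are placeholders). A non-default chunk_size forces a
# resumable session in fixed chunks (multiples of 256KiB).
from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-large-files").blob("uploads/big-file.bin")

blob.chunk_size = 10 * 1024 * 1024  # 10MiB chunks
blob.upload_from_filename("big-file.bin")  # resumes on transient failures
```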

To summarise, streaming is officially part of the product's feature set and is used by customers, yet you need to be careful to match the implementation of your use case and your requirements to what Edge has to offer.