Considerable time taken to send and receive a file as part of the request and response

Hi All,

We have an API proxy on our on-premises Apigee.

As part of the POST request we send a 2.5 MB PDF to a cloud provider, and as part of the GET call we retrieve the same document. We see that Apigee takes around 250 ms on average (target.sent.end.timestamp - target.sent.start.timestamp) to send the complete document, and around 280 ms on average (target.received.end.timestamp - target.received.start.timestamp) to receive the file as part of the GET response.

Could you please let us know if there is any specific configuration we can fine-tune on the MP? We understand that the time it takes to transfer all the packets also depends on network traffic at that point in time, but we want to make sure we have made all the required configuration changes on our end to make this optimal.

Thanks,

Bipin

ACCEPTED SOLUTION

You said "On prem Apigee". That's important, because how you have networked things between the MP and the backend system will affect your observed performance.

Let's simplify things and just imagine two endpoints on a network. Suppose the network connecting these endpoints is rated at 100Mb/s (often written 100Mbps, with a lowercase b), which means it can transfer data at 100 megabits per second.

There are 8 bits to the byte, so that translates to 100/8 = 12.5 megaBYTES per second. Sometimes written as 12.5 MB/s (note: uppercase B).

That is the maximum carrying capacity of your network. Network adapters, switches, and routers are commonly rated at this capacity, and if you use virtual networks in AWS or GCP, you will often get networks rated at 100Mbps.

Any protocol has some overhead. With HTTP/1 there is normally 15-25% overhead from framing, retries, headers, and so on. That reduces your expected maximum throughput on a 100Mb/s link from 12.5MB/s to about 10MB/s.

How does this compare with what you observe? Your observed transfer is around 2.5MB in 250ms. 250ms is 1/4 second, so normalizing to a per-second basis, you are observing 2.5MB / (1/4s) = 2.5 * 4 = 10MB per second, which is exactly what you would expect for a 100Mbps network.
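
To make the arithmetic concrete, here it is as a small Python sketch. The 20% overhead figure is an assumption picked from the 15-25% range above, not a measurement:

    # Back-of-the-envelope check: does 2.5MB in 250ms match a 100Mbps link?

    LINK_MBPS = 100     # rated link speed, in megabits per second
    OVERHEAD = 0.20     # assumed ~20% HTTP/1 + TCP overhead (15-25% is typical)

    raw_mb_per_s = LINK_MBPS / 8                        # 12.5 MB/s raw capacity
    effective_mb_per_s = raw_mb_per_s * (1 - OVERHEAD)  # ~10 MB/s usable

    payload_mb = 2.5    # size of the PDF
    observed_s = 0.250  # target.sent.end - target.sent.start, from the question
    observed_mb_per_s = payload_mb / observed_s         # 10.0 MB/s observed

    print(f"raw capacity:       {raw_mb_per_s:.1f} MB/s")
    print(f"effective capacity: {effective_mb_per_s:.1f} MB/s")
    print(f"observed rate:      {observed_mb_per_s:.1f} MB/s")
    # observed rate ~= effective capacity -> the link, not Apigee, is the limit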

If you use a 1Gbps link, you should expect to see 10x the performance, or 1/10 the time required, compared to what we computed for 100Mbps. That means 2.5MB in 25ms. Being able to achieve that performance will depend on the latency and behavior observed at the backend system - in other words, to fully exploit a 1Gbps network, the backend software system has to be capable of handling (ingesting) 2.5MB in 25ms. Otherwise the sender (Apigee) will not be able to send it that quickly. Apigee will have to wait for the backend to be ready.
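
The same arithmetic, inverted, gives the expected transfer time at any link speed. A quick sketch, again assuming ~20% overhead and a backend that can ingest at line rate:

    def expected_transfer_ms(payload_mb: float, link_mbps: float,
                             overhead: float = 0.20) -> float:
        """Expected wall-clock time to move payload_mb over a link rated at
        link_mbps, assuming the receiver keeps up and ~20% protocol overhead
        (both assumptions, not measurements)."""
        effective_mb_per_s = (link_mbps / 8) * (1 - overhead)
        return payload_mb / effective_mb_per_s * 1000

    print(expected_transfer_ms(2.5, 100))    # ~250 ms on a 100Mbps link
    print(expected_transfer_ms(2.5, 1000))   # ~25 ms on a 1Gbps link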

BTW this also applies to the client. Suppose:

  • The client is uploading a 2.5MB payload (via POST) to Apigee.
  • The Apigee Router and MP are connected via a 1Gbps port on your switch.
  • The Apigee MP is connected to the upstream via a different 1Gbps port on that switch.
  • The client is connecting into the Apigee Router via a 100Mbps link.

In this case, the maximum throughput the system can be expected to deliver is about 2.5MB in 250ms, which is the max you can get over the 100Mbps link. It doesn't matter how fast the rest of the train can go. If the client can upload only at 100Mb/s, then the rest of the train (router, MP, and upstream) will go at that speed.
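
To put the "slowest hop wins" point in code, a rough sketch using the link speeds from the example above:

    # End-to-end throughput is capped by the slowest hop in the chain.
    # Hop speeds below are the ones from the bullet list above (in Mbps).
    hops = {
        "client -> Router":  100,   # the client's 100Mbps access link
        "Router -> MP":     1000,
        "MP -> upstream":   1000,
    }

    bottleneck_mbps = min(hops.values())
    effective_mb_per_s = (bottleneck_mbps / 8) * 0.8  # same ~20% overhead assumption
    transfer_ms = 2.5 / effective_mb_per_s * 1000

    print(f"bottleneck: {bottleneck_mbps} Mbps -> ~{effective_mb_per_s:.0f} MB/s")
    print(f"2.5MB upload: ~{transfer_ms:.0f} ms, regardless of the faster hops")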

Does this help?

If you are using a 100Mbps network, there is nothing you can tune in Apigee, or on the MP specifically, to speed this up. If I am guessing right, your network is the bottleneck, and the Apigee MP will spend much of its time waiting on I/O, on both the sending and receiving sides. The only way to improve things is to migrate to a 1Gbps (or faster) network.

If you are already on a 1Gbps network, then I would want to investigate the performance of that network. You might try wrk2 to measure throughput over the link between the Apigee MP and the upstream system. That takes the MP and Router out of the picture and measures, effectively, the maximum practical throughput you can push over that link. If it is much lower than you expect (much lower than 12MB/s for a 100Mbps link, or 120MB/s for a 1Gbps link), then you have some other network issue getting in the way. Fix that first before trying to tune anything in Apigee.
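
wrk2 will give you proper latency distributions under sustained load. If you just want a single quick number first, a stdlib-only Python probe like the one below can approximate it, run from the MP host directly against the backend. UPSTREAM_URL is a placeholder; point it at whatever endpoint on your backend accepts the upload:

    # One-shot throughput probe: time a direct upload to the upstream,
    # bypassing the Router and MP entirely.
    import time
    import urllib.request

    UPSTREAM_URL = "https://backend.example.com/documents"  # hypothetical
    payload = b"\x00" * 2_500_000                           # ~2.5MB dummy body

    req = urllib.request.Request(
        UPSTREAM_URL,
        data=payload,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )

    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    elapsed = time.monotonic() - start

    # A single shot includes TCP + TLS handshake time; run it a few times
    # and look at the later runs for a steadier number.
    mb = len(payload) / 1e6
    print(f"{mb:.1f} MB in {elapsed * 1000:.0f} ms = {mb / elapsed:.1f} MB/s")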

