how to stream an api response body to client

ccovney
Participant V

Hi everyone,

I am trying to stream an API response body to a client via Express.js in my node.js proxy on Apigee Edge Private Cloud v4.15.04-ws.

High Level:

Client request --> Apigee Edge calls out to remote backend service --> backend service responds with chunked data --> Apigee Edge receives chunked data and streams data to client --> Client

My issue is that I cannot figure out how to stream the data to the client without buffering the data in Apigee. Apigee successfully receives the data from our remote backend service, but I can't find a way to stream this data to the client without saving it to the Apigee file structure first. We cannot be saving every single client request to a buffer.

If anyone has any insight into how this might be achieved, that would be very very much obliged!

Best, Chris

Solved
2 ACCEPTED SOLUTIONS

sgilson
Participant V

Here is the doc page on streaming for more info and links to some examples:

http://apigee.com/docs/api-services/content/enabling-streaming

Stephen


ccovney
Participant V

Message from Apigee Support:

Hi Chris, I was reviewing streaming bugs related to node.js and there is one bug that is fixed in Private Cloud release 16.01 which is due to be out soon, JIRA # APIRT-1827 which has to do with Low Concurrency Streaming issue. The fix is planned to be back ported to 15.04 WS release, but there is a lot of work involved, thus engineering plans to have the backport ready sometime in April timeframe. Let me know if this helps answer your questions. Thanks, Janice


18 REPLIES

Not applicable

Hi @Chris Covney,

This can be achieved by setting the property "response.streaming.enabled" to true in the HTTPTargetConnection of the TargetEndpoint. Configuration snippet:

<TargetEndpoint>
  ...
  <HTTPTargetConnection>
    <Properties>
      <Property name="response.streaming.enabled">true</Property>
    </Properties>
    <URL>http://{URL}</URL>
  </HTTPTargetConnection>
</TargetEndpoint>

Cheers, Rajesh

sgilson
Participant V

Here is the doc page on streaming for more info and links to some examples:

http://apigee.com/docs/api-services/content/enabling-streaming

Stephen

ccovney
Participant V

Hi @rdoda and @sgilson

Thanks for your responses; however, I'm looking for info about accomplishing streaming with Express.js within my node.js proxy. The instructions in the linked docs and the snippet from Rajesh above both pertain to the XML configuration of a proxy, which I have already enabled.

Essentially, all I am looking for is a function which is identical to Express' res.send(), but which streams the data instead of buffering it. As of now, my code looks like this:

if (response.statusCode.toString().indexOf("2") === 0) {
    res.send(body);
}

where "body" is the contents of the file, which needs to be streamed to the client. It appears that in the above case, res.send buffers the data, and we simply cannot scale this for all the gigabyte files our clients download every day.

Any insight would be very much obliged thanks!

Best, Chris

ccovney
Participant V
@arghya das

Thank you for providing the sample node.js streaming proxy. Can you tell me more about it? Looking at the app code, it does not appear that any responses are being streamed. For example, the code body looks like this (where res.send is being used to send the response):

var argo = require('argo');
var express = require('express');
var app = express();
var proxy = argo()
    .target('http://ci.apigeng.com')
    .build();


app.get('/hello', function(req, res) {
    res.send('Hello from Express');
});


app.all('*', proxy.run);
app.listen(3000);

And the target endpoint config looks like this (where streaming is enabled):

    <ScriptTarget>
        <ResourceURL>node://server-usergrid.js</ResourceURL>
        <Properties>
            <Property name="response.streaming.enabled">true</Property>
            <Property name="request.streaming.enabled">true</Property>
        </Properties>
    </ScriptTarget>

All I am really interested in is whether the res.send function is streaming the data. All my research (and practice) suggests otherwise, hence my interest in an alternative function. Thanks again for all your help so far!

Best, Chris

Yes, you are right. The streaming is being implemented using the HTTP properties for the script target, rather than in the node script itself. The node script is simply making an HTTP callout. You can implement it in node too, and that should work as well if you have the correct patch versions installed. However, there are some issues you might hit if the size of the response is on the order of a few GB, or if you make too many concurrent requests. We are working on that, but do not have a patch available for addressing the issue at this point.

@arghya das

I hope all is well. Was a patch ever created to fix the node.js streaming of 1+ GB files? Does the latest version of Apigee Private Cloud support node.js streaming of 1+ GB files? Thanks in advance for the help!

Best

Chris

Not applicable

Hi @Chris Covney. It's good to hear from you man!

For XML streaming, I believe you can try xml-stream. I've run it only on my laptop though. Here's an example on how to do it:

var http = require('http');
var XmlStream = require('xml-stream'); // the npm "xml-stream" module


// Request an RSS for a Twitter stream
var request = http.get({
  host: 'api.twitter.com',
  path: '/1/statuses/user_timeline/dimituri.rss'
}).on('response', function(response) {
  // Pass the response as UTF-8 to XmlStream
  response.setEncoding('utf8');
  var xml = new XmlStream(response);


  // When each item node is completely parsed, buffer its contents
  xml.on('updateElement: item', function(item) {
    // Change <title> child to a new value, composed of its previous value
    // and the value of <pubDate> child.
    item.title = item.title.match(/^[^:]+/)[0] + ' on ' +
      item.pubDate.replace(/ \+[0-9]{4}/, '');
  });


  // When <item>'s <description> descendant text is completely parsed,
  // buffer it and pass the containing node
  xml.on('text: item > description', function(element) {
    // Modify the <description> text to make it more readable,
    // highlight Twitter-specific and other links
    var url = /\b[a-zA-Z][a-zA-Z0-9\+\.\-]+:[^\s]+/g;
    var hashtag = /\b#[\w]+/g;
    var username = /\b@([\w]+)/g;
    element.$text = element.$text
      .replace(/^[^:]+:\s+/, '') //strip username prefix from tweet
      .replace(url, '<a href="$0">$0</a>')
      .replace(hashtag, '<a href="https://twitter.com/search/$0">$0</a>')
      .replace(username, '<a href="https://twitter.com/$1">$0</a>');
  });


  // When each chunk of unselected or unbuffered data is returned,
  // pass it to stdout
  xml.on('data', function(data) {
    process.stdout.write(data);
  });
});

I love when I see Node.js questions! Keep asking! Cheers!

@Diego Zuluaga it's awesome to hear from you too!

The XML streaming we will certainly need to use for our extremely large XML responses. However, for the moment, we are investigating how we will be able to stream files of all types from our backend to the client via node.js in Apigee Edge Private Cloud.

The files are no larger than 2GB, but that is still pretty big. We currently use a non-node.js approach and it works just fine, but we are migrating all of our functionality to node and this is a pivotal step.

Any insight for streaming files in node.js on Edge Private Cloud would be very much appreciated!

Also, let me know if you need any more information from me about the backend API, the files, or anything like this. Thanks!

Best, Chris

Got it! In that case, I tried piping the request from the client and response from the server leveraging streams:

var express = require('express');
var request = require('request'); // the npm "request" module
var app = express();
// note: req.body below assumes JSON body-parsing middleware is mounted

// app.use will allow both GET and POST requests
app.use('/api', function(req, res) {
  // so, if I call /api/users the url will be YOUR_API_BASE_URL/users
  // note that the current route part is not included in url "/api" in this case
  var url = 'YOUR_API_BASE_URL'+ req.url;
  var r = null;

  // when it's post request set up a request.post to have our post body
  if(req.method === 'POST') {
     r = request.post({uri: url, json: req.body});
  } else {
     r = request(url);
  }

  // do the piping
  req.pipe(r).pipe(res);
});

Here's a nice article explaining the concept in detail http://node.today/expressjs-proxy/

Please let us know if it works!

@Diego Zuluaga, you are awesome thank you! I will give this a shot and let you know tomorrow!

Best, Chris

Thanks for this tip. I'm going to test this approach myself using the Twitter Streaming API. Is this example also in https://github.com/apigee/api-platform-samples ?

Not applicable

Hi @Chris Covney ,

#1 Please check the exact release number that you are using, as 15.04.03 is the release which supports streaming in node.js. (Running sh /opt/apigee4/bin/get-version.sh on the MP should confirm that.)

#2 res.send in Express, as mentioned by @arghya das, or a node.js pipe, as mentioned by @Diego Zuluaga, should work if you are on 15.04.03.

#3 Can you just use an HTTP target connection to stream instead of node.js? Is there any dependency on node.js? There appear to be some limitations with respect to concurrency even on 15.04.03 if you use node.js for streaming, which I observed while doing some node.js + request streaming.

@Maruti Chand

This is very good to know, thank you Maruti. I am on 4.15.03.01, so I will have to apply a patch upgrade. Should be easy peasy, but you never know. I'll do this tomorrow and let you know how it all works out.

We are not strictly speaking "dependent" on node, but our agenda is to move away from XML configuration and more towards code-based solutions (such as node).

VERY IMPORTANT THOUGH: can you please expound on the aforementioned concurrency issues? If they are pronounced, it might be a deal-breaker for the node approach. Any info you can provide would be very much appreciated!

Thanks again everyone for all your awesome help!

Best, Chris

ccovney
Participant V

@Diego Zuluaga

and @arghya das and @Maruti Chand

I was able to get node streaming working with @Diego Zuluaga's solution; however, certain documents (depending on size and/or type) get corrupted.

For instance, a 100 MB .PDF file becomes marginally corrupted (some pages have distortions). In another instance, a 5 MB .MOV file is corrupted such that it simply cannot be opened by QuickTime Player.

I was able to successfully stream small PDFs, .xls, .docx, .txt, and files of this nature.

Perhaps these anomalies are due to our Private Cloud version being 4.15.04.00?

As a general recommendation, does Apigee suggest that we use the http properties in an XML configured proxy instead of node?

Thanks again everyone!

Best, Chris

EDIT/UPDATE: Streaming works 100% of the time when executed locally (even when using Trireme); however, the corruptions still occur when the same code is executed in Apigee.

Is Apigee using an older version of Trireme, and is this what is causing the streaming issues? Can anyone elaborate on what exactly plagues the node streaming functionality in Apigee? What is fixed in the .03 patch which remedies some of the issues, and what issues are outstanding? Thanks guys!

ccovney
Participant V
@Maruti Chand @arghya das @Diego Zuluaga

I hope you all are doing well! I was wondering if there has been any more information or progress on supporting streaming of 1+ GB files in node.js. How about the latest version of Apigee which is currently available? Are there any older versions that also support it? Perhaps it's simply still not supported at all? Thanks for your help!

Best,

Chris

ccovney
Participant V

Message from Apigee Support:

Hi Chris, I was reviewing streaming bugs related to node.js and there is one bug that is fixed in Private Cloud release 16.01 which is due to be out soon, JIRA # APIRT-1827 which has to do with Low Concurrency Streaming issue. The fix is planned to be back ported to 15.04 WS release, but there is a lot of work involved, thus engineering plans to have the backport ready sometime in April timeframe. Let me know if this helps answer your questions. Thanks, Janice

Hi Chris, we are running into the same issue. Did you ever get it resolved?

Hi Mike,

We have not yet updated our version of the Apigee Private Cloud (aka OPDK), so I cannot vouch for the aforementioned bug fix. Our temporary solution entails:

1. removing the node.js component,

2. enabling streaming on the proxy and target endpoints within the API Proxy configuration,

3. performing transforms on headers in the API proxy request/response flow
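For reference, step 2 corresponds to setting the streaming properties on both endpoints in the proxy configuration. A sketch of the relevant fragments (the base path and backend URL are placeholders):

```xml
<ProxyEndpoint name="default">
  <HTTPProxyConnection>
    <BasePath>/mypath</BasePath>
    <Properties>
      <Property name="request.streaming.enabled">true</Property>
      <Property name="response.streaming.enabled">true</Property>
    </Properties>
  </HTTPProxyConnection>
</ProxyEndpoint>

<TargetEndpoint name="default">
  <HTTPTargetConnection>
    <URL>http://mybackend.example.com</URL>
    <Properties>
      <Property name="request.streaming.enabled">true</Property>
      <Property name="response.streaming.enabled">true</Property>
    </Properties>
  </HTTPTargetConnection>
</TargetEndpoint>
```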

We are in the process of installing the latest OPDK and I will post our node.js streaming findings here.

Best,

Chris