API BaaS, DynamoDB, and Cassandra - which should I use?


DynamoDB is a fast, scalable NoSQL database service offered by Amazon Web Services. DynamoDB provides table-oriented persistence with variable columnar attributes on each row. DynamoDB is offered as PaaS, Platform as a Service, provisioned with "throughput capacity". This means you pay for the read and write throughput you anticipate requiring at peak. AWS provides the necessary compute, network, and storage resources to deliver the target capacity.
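
To make "throughput capacity" concrete, here is a rough sizing sketch. It assumes the published DynamoDB rules for provisioned tables (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, one write capacity unit covers one write per second of an item up to 1 KB). The function names and the workload numbers are made up for illustration.

```python
import math

# Sizing rules for provisioned DynamoDB tables, as documented by AWS:
#   1 read capacity unit  = 1 strongly consistent read/second, item up to 4 KB
#                           (an eventually consistent read costs half as much)
#   1 write capacity unit = 1 write/second, item up to 1 KB
READ_UNIT_BYTES = 4 * 1024
WRITE_UNIT_BYTES = 1024


def read_capacity_units(reads_per_second, item_bytes, strongly_consistent=True):
    """Estimate the read capacity units needed at peak (hypothetical helper)."""
    units_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = reads_per_second * units_per_read
    if not strongly_consistent:
        total = total / 2
    return math.ceil(total)


def write_capacity_units(writes_per_second, item_bytes):
    """Estimate the write capacity units needed at peak (hypothetical helper)."""
    return writes_per_second * math.ceil(item_bytes / WRITE_UNIT_BYTES)
```

For example, a peak of 500 strongly consistent reads per second of 6 KB items works out to 1,000 read units (each read spans two 4 KB blocks), and 100 writes per second of 2.5 KB items works out to 300 write units.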

Originally designed at Facebook, Apache Cassandra provides functionality similar to DynamoDB. In fact, Cassandra derives from Amazon's Dynamo design (the forerunner of DynamoDB) and Google's BigTable data model. Cassandra can be run on cloud infrastructure or as an on-premises solution. Cassandra excels at providing high fault tolerance and availability, and it was built with native multi-data-center replication in mind. Where DynamoDB has only recently added cross-region replication, Cassandra was designed from the ground up as a highly distributed system.

Apigee API BaaS is a component of the API Services Edge platform. API BaaS provides app developers with an out-of-the-box cloud datastore. API BaaS is Cassandra under the hood, with Apigee-provided features including:

  • User management
  • Push notifications
  • Prebuilt schema for common mobile/web use cases
  • An out of the box RESTful API
  • Integration with social networking services
  • Support for geolocation queries
  • Integration with Apigee Edge Analytics to bring visibility to app usage

When considering which platform to use, we urge customers to think about the following:

  • Economics. DynamoDB and Cassandra are powerful, lower-level technologies. They are a net addition to your IT infrastructure and, as such, an extra cost. Unless you have specific use cases that need features of raw Cassandra or DynamoDB, we recommend looking at API BaaS as your go-to cloud datastore. Edge includes API BaaS (above the SMB tier). Note that API BaaS calls do count toward call counts on transaction-based subscription plans.
  • Complexity. Managing capacity for Cassandra or provisioned throughput for DynamoDB is an ongoing task. API BaaS capacity management is Apigee's responsibility. Customers merely need to select an appropriate subscription plan. Apigee handles the rest.
  • Developer Focus. DynamoDB requires developers to think about issues such as hash key, range key, optimal item size, etc. Cassandra developers need to be mindful of the cost implications of replication strategies and the nuances of efficient schema design. API BaaS provides complete access via a RESTful API out of the box; developers simply model entities.
  • Operational Support. Both DynamoDB and API BaaS are vendor managed - meaning no one on your team will carry a pager for the data tier. Instead your operational support team will focus on engagement applications. Cassandra teams need to account for the time and effort to manage patches and upgrades.
  • Predictability. DynamoDB excels at providing predictable performance for a given provisioned capacity. Once you optimize your capacity you are in good shape - until performance demands change. Similarly, there is an art to tuning Cassandra. We know this all too well. With API BaaS, Apigee handles these tasks for you providing consistent, predictable performance. Finance will appreciate that a single subscription agreement or software license provides this performance at a predictable cost.
  • Agility. Your eventual target may well be Cassandra or DynamoDB. When time is of the essence, API BaaS provides a reliable, high performance, and highly available data store with little or no setup.
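
As a rough illustration of the "Developer Focus" point above, the sketch below puts the two styles side by side. The DynamoDB part is the set of keyword arguments one would pass to boto3's `create_table` call (nothing is sent here). The API BaaS part builds, but does not send, a plain HTTP POST. The table, attribute, org, and app names are hypothetical, and the `/{org}/{app}/{collection}` URL layout is an assumption based on how API BaaS (Apache Usergrid) endpoints are commonly shown, so check it against your own instance.

```python
import json
import urllib.request

# DynamoDB: key design and throughput are decided before the first write.
# These would be passed as dynamodb_client.create_table(**transactions_table).
transactions_table = {
    "TableName": "transactions",  # hypothetical table name
    "AttributeDefinitions": [
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "user_id", "KeyType": "HASH"},      # hash key
        {"AttributeName": "created_at", "KeyType": "RANGE"},  # range key
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 100, "WriteCapacityUnits": 50},
}


def baas_create_request(base_url, org, app, collection, entity):
    """Build (without sending) a POST that creates one entity in a collection.

    The URL layout is an assumption; adjust it to match your API BaaS instance.
    """
    url = f"{base_url}/{org}/{app}/{collection}"
    body = json.dumps(entity).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With API BaaS the entity is just a JSON document, for example `baas_create_request("https://baas.example.com", "my-org", "my-app", "transactions", {"user_id": "u1", "amount": 12.5})`. There is no key schema or capacity setting for the developer to choose.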

In the end, your use cases are the ultimate guide to what you should be using. Would love to hear your thoughts, questions, and experience in this discussion thread.

Comments
alan
New Member

I would really apply the 80/20 rule. For 80% of NoSQL use cases, the performance capabilities of API BaaS are "good enough". However, for the 20% of NoSQL use cases that have special indexing or scalability requirements, having the ability to tweak indexes and optimize throughput is an advantage that DynamoDB and Cassandra provide over API BaaS. The problem is, very few developers actually know how to "tweak" those kinds of systems (especially C*) - so choose usage of these systems carefully.

Not applicable

The biggest issue when using BaaS is doing bulk data export or custom queries that return aggregated data or large amounts of data. There is no way to access the data other than using API calls... which is cumbersome when you want to extract value from the data stored.

Example: If your app is storing transaction logs in BaaS, later when you have a lot of records and you want to find out, let's say, "how many transactions per user in a given time period", the APIs provided by BaaS will make it almost impossible to do that kind of query over *your* data.

The underlying Cassandra is not exposed to Apigee Cloud customers, so you have no choice but to use the APIs, which do not scale to extract large amounts of data for later analysis or reporting.
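
To illustrate why this is cumbersome, here is a minimal sketch of what the "transactions per user" count looks like when it has to be done client-side. It assumes the API returns pages shaped like `{"entities": [...], "cursor": "..."}` and that each entity carries a `created` timestamp in milliseconds and a `user_id` field. The `fetch_page` callable, which would wrap the actual HTTP call, and the field names are hypothetical.

```python
from collections import Counter


def transactions_per_user(fetch_page, start_ms, end_ms):
    """Count entities per user in [start_ms, end_ms) by walking every page.

    fetch_page(cursor) is a caller-supplied function (hypothetical) that
    returns one page of results as a dict; cursor is None for the first page.
    """
    counts = Counter()
    cursor = None
    while True:
        page = fetch_page(cursor)
        for entity in page.get("entities", []):
            if start_ms <= entity["created"] < end_ms:
                counts[entity["user_id"]] += 1
        cursor = page.get("cursor")
        if not cursor:  # no cursor means this was the last page
            break
    return counts
```

Every matching record has to travel over the API one page at a time just to produce a handful of numbers, which is the scaling problem described above.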

alan
New Member

I agree that not having a bulk export or import API is a pain. It's a feature that is in progress. With regard to scalability, however, it is possible to export the collections in parallel, which could help with the speed of export.
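
A minimal sketch of the parallel approach, using a thread pool from the Python standard library. The `export_one` callable is hypothetical: it stands in for whatever code pages through a single collection via the API and returns its entities.

```python
from concurrent.futures import ThreadPoolExecutor


def export_collections(collection_names, export_one, max_workers=4):
    """Export several collections concurrently.

    export_one(name) is a caller-supplied function (hypothetical) that
    downloads one collection and returns its entities.
    Returns a dict mapping each collection name to its exported result.
    """
    names = list(collection_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input names
        results = list(pool.map(export_one, names))
    return dict(zip(names, results))
```

Threads suit this case because the work is dominated by waiting on HTTP responses. Each individual collection is still read page by page, so this speeds up the overall export without changing the cost per collection.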

Version history
Last update: 03-08-2016 10:54 PM