DynamoDB is a fast, scalable NoSQL database service offered by Amazon Web Services. DynamoDB provides table-oriented persistence with a variable set of attributes on each row. DynamoDB is offered as PaaS (Platform as a Service) and is provisioned with "throughput capacity". This means you pay for the read and write throughput you anticipate requiring at peak. AWS provides the necessary compute, network, and storage resources to deliver the target capacity.
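To make "throughput capacity" concrete, here is a rough sizing sketch. It assumes the provisioned-capacity rules as I understand them: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one write capacity unit covers one write per second of an item up to 1 KB. The function names are made up for illustration, so check the current AWS documentation before relying on the numbers.

```python
import math

# Assumed sizing rules (verify against the current AWS documentation):
READ_UNIT_BYTES = 4 * 1024   # one read unit covers an item up to 4 KB
WRITE_UNIT_BYTES = 1024      # one write unit covers an item up to 1 KB


def read_capacity_units(reads_per_second, item_size_bytes, strongly_consistent=True):
    """Estimate read capacity units needed at peak (hypothetical helper)."""
    units_per_read = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    total = reads_per_second * units_per_read
    if not strongly_consistent:
        # Eventually consistent reads are assumed to cost half as much.
        total = total / 2
    return math.ceil(total)


def write_capacity_units(writes_per_second, item_size_bytes):
    """Estimate write capacity units needed at peak (hypothetical helper)."""
    return writes_per_second * math.ceil(item_size_bytes / WRITE_UNIT_BYTES)


if __name__ == "__main__":
    # 100 reads/s and 50 writes/s at peak, with items of roughly 6000 and 1500 bytes.
    print(read_capacity_units(100, 6000))
    print(read_capacity_units(100, 6000, strongly_consistent=False))
    print(write_capacity_units(50, 1500))
```

The point is that item size matters as much as request rate: a 6000-byte item needs two read units per read, so the same traffic costs twice what it would with items under 4 KB.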
Originally designed at Facebook, Apache Cassandra provides functionality similar to DynamoDB. In fact, Cassandra derives from Amazon's Dynamo (the design that preceded DynamoDB) and Google's Bigtable data model. Cassandra is offered on cloud infrastructure or as an on-premises solution. Cassandra excels at providing high fault tolerance and availability. Cassandra was built with native multi-data-center replication in mind. Where DynamoDB has only recently added cross-region replication, Cassandra was designed from the ground up as a highly distributed system.
Apigee API BaaS is a component of the API Services Edge platform. API BaaS provides app developers with an out-of-the-box cloud datastore. API BaaS is Cassandra under the hood, with Apigee-provided features including:
When considering which platform to use we urge customers to think about the following:
In the end, your use cases are the ultimate guide to what you should be using. Would love to hear your thoughts, questions, and experience in this discussion thread.
I would really apply the 80/20 rule. For 80% of NoSQL use cases, the performance capabilities of API BaaS are "good enough". However, for the 20% of NoSQL use cases that have special indexing or scalability requirements, the ability to tweak indexes and optimize throughput is an advantage that DynamoDB and Cassandra provide over API BaaS. The problem is, very few developers actually know how to "tweak" those kinds of systems (especially C*), so choose these systems carefully.
The biggest issue when using BaaS is doing bulk data export or custom queries that return aggregated data or large amounts of data. There is no way to access the data other than API calls, which is cumbersome when you want to extract value from the data stored.
Example: If your app is storing transaction logs in BaaS, later when you have a lot of records and you want to find out, let's say, "how many transactions per user in a given time period", the APIs provided by BaaS will make it almost impossible to run that kind of query over *your* data.
The underlying Cassandra is not exposed to Apigee Cloud customers, so you have no choice but to use the APIs, which do not scale well for extracting large amounts of data for later analysis or reporting.
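To show what that workaround looks like in practice, here is a minimal sketch of counting transactions per user on the client side by paging through every record. The `fetch_page` callable, the cursor convention, and the `user` / `timestamp` field names are all assumptions made for illustration, not the actual BaaS API. An in-memory stand-in is used so the sketch runs on its own.

```python
from collections import Counter


def count_transactions_per_user(fetch_page, start_ts, end_ts):
    """Count records per user with start_ts <= timestamp < end_ts.

    fetch_page is a hypothetical callable: given a cursor (None for the
    first page) it returns (list_of_records, next_cursor_or_None).
    """
    counts = Counter()
    cursor = None
    while True:
        records, cursor = fetch_page(cursor)
        for record in records:
            # Filtering and grouping happen on the client, one page at a time.
            if start_ts <= record["timestamp"] < end_ts:
                counts[record["user"]] += 1
        if cursor is None:
            break
    return counts


def make_fake_fetch_page(records, page_size=2):
    """In-memory stand-in for a paged API, used only for this example."""
    def fetch_page(cursor):
        start = cursor or 0
        page = records[start:start + page_size]
        next_start = start + page_size
        next_cursor = next_start if next_start < len(records) else None
        return page, next_cursor
    return fetch_page


if __name__ == "__main__":
    logs = [
        {"user": "alice", "timestamp": 10},
        {"user": "bob", "timestamp": 20},
        {"user": "alice", "timestamp": 30},
        {"user": "alice", "timestamp": 99},
    ]
    print(count_transactions_per_user(make_fake_fetch_page(logs), 0, 50))
```

Every record has to cross the network before it can be counted, so the cost grows with the size of the whole collection rather than with the size of the answer.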
I agree that not having a bulk export or import API is a pain. It's a feature that is in progress. With regard to scalability, however, it is possible to export the collections in parallel, which could help with the speed of export.
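A rough sketch of the parallel export idea, using one worker thread per collection up to a limit. The `export_one` callable is hypothetical: it stands for whatever code pages through a single collection over the API and returns its records. Nothing here is an actual BaaS client call.

```python
from concurrent.futures import ThreadPoolExecutor


def export_collections(collection_names, export_one, max_workers=4):
    """Export several collections concurrently.

    export_one is a hypothetical callable that takes a collection name and
    returns that collection's records (for example, by paging through the API).
    Returns a dict mapping each collection name to its exported records.
    """
    names = list(collection_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(export_one, names))
    return dict(zip(names, results))


if __name__ == "__main__":
    # In-memory stand-in for the remote datastore, used only for this example.
    fake_store = {"users": ["u1", "u2"], "transactions": ["t1", "t2", "t3"]}
    exported = export_collections(fake_store.keys(), lambda name: fake_store[name])
    print({name: len(records) for name, records in exported.items()})
```

Threads fit here because the work is dominated by waiting on network calls. This only parallelizes across collections, so a single very large collection would still be exported page by page.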