Designing KVM's to best perform under EXTREME volume

07-17-2015

I am writing this article after some great conversations with my peers about an actual enterprise implementation.

Use case and problem statement -

Our client was load testing their very complex Apigee implementation, sending traffic at thousands of TPS. The test itself went fine; however, our operations team observed load on Cassandra at many times its usual level, which was concerning. It all traced back to the way we had designed the KVMs for storing config data.

The KVMs were designed in a very standard way: 5-7 maps created as logical buckets to hold information such as backend targets, backend credentials, log configs, etc. Since this data is stored in individual maps, we have to execute 5-7 separate KVM policies for each incoming request to fetch all of it. Each KVM policy execution made some number x of calls to Cassandra, and that was multiplied by the number of maps we have (5-7 times x) on every request.

Please note that this would have worked just fine under normal traffic; we are talking about EXTREME volume (thousands of TPS) here.
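
To put rough numbers on the multiplication described above, here is a small back-of-the-envelope sketch. The figures (2,000 TPS, 6 maps, x = 1 call per KVM policy) are assumptions picked only for illustration, not measurements from the actual load test, and the function name is hypothetical.

```python
def cassandra_calls_per_second(tps, kvm_policies_per_request, calls_per_policy):
    """Estimate Cassandra calls per second when every request runs every
    KVM policy and nothing is served from cache."""
    return tps * kvm_policies_per_request * calls_per_policy


# Assumed, illustrative numbers only: 2,000 TPS and x = 1 call per KVM policy.
separate_maps = cassandra_calls_per_second(2000, 6, 1)  # 6 maps, 6 policies
single_map = cassandra_calls_per_second(2000, 1, 1)     # 1 map, 1 policy

print(separate_maps)  # 12000
print(single_map)     # 2000
```

Under these assumed numbers the per-request fan-out, not the raw TPS, is what multiplies the load on Cassandra.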

Solution - We are considering multiple ways to tackle this problem.

1) Create a single KVM map (or 2-3 at the most) and store all the config data either as separate keys or as a single JSON blob. We can extract the variables from that one big retrieve programmatically. This will at least take care of the very high number of Cassandra calls.

2) Cache the KVM retrieved entries - Since every request needs this config data, it makes sense to cache it for, say, a day (it is unrealistic for things like your target servers and credentials to change within a day). That way the data will be served from the L1 cache almost every time, skipping the execution of the KVM retrieves entirely.

3) config.json instead of KVM for config data - This is in itself a good topic for discussion, where different experts may hold different opinions depending on the use case. We migrated from config.json to KVMs for storing config data because we wanted something configurable at runtime rather than config kept in code. Sometimes it is also important, for maintenance purposes, to have a single config store instead of multiple places. Using config.json to inject environment-specific variables at deploy time is also often not in alignment with some clients' policies.
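
Options 1 and 2 can be sketched together in a few lines. This is only a minimal illustration of the idea, written in plain Python rather than as Apigee policies: `FakeKvmStore`, `CachedConfig`, the `proxy-config` key and the example JSON fields are all hypothetical names, and the in-memory store merely stands in for the Cassandra-backed KVM so that reads can be counted.

```python
import json
import time


class FakeKvmStore:
    """Hypothetical stand-in for the KVM backing store; counts its reads."""

    def __init__(self, entries):
        self._entries = entries
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._entries[key]


class CachedConfig:
    """Fetch one JSON blob and reuse the parsed result until the TTL expires."""

    def __init__(self, store, key, ttl_seconds, clock=time.monotonic):
        self._store = store
        self._key = key
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        # Go to the backing store only on first use or after expiry.
        if self._expires_at is None or now >= self._expires_at:
            self._value = json.loads(self._store.get(self._key))
            self._expires_at = now + self._ttl
        return self._value


# Option 1: all config in a single entry, stored as one JSON blob.
blob = json.dumps({
    "target": {"url": "https://backend.example.com"},
    "credentials": {"username": "svc-user"},
    "logging": {"level": "INFO"},
})
store = FakeKvmStore({"proxy-config": blob})

# Option 2: cache the retrieved entry for a day.
config = CachedConfig(store, "proxy-config", ttl_seconds=24 * 60 * 60)

for _ in range(10_000):  # simulate 10,000 incoming requests
    cfg = config.get()
    target_url = cfg["target"]["url"]  # extract variables programmatically

print(store.reads)  # 1
```

In this sketch the 10,000 simulated requests cause a single read of the backing store; the trade-off is that a config change can take up to the TTL to become visible.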

@David Allen @Sudheer Gopalam

optimism · ‎08-24-2021

Nice article!🚀
