Using the Heimdall Proxy to Split Reads and Writes with ACID Compliance

Background

Horizontally scaling your SQL database involves separating the primary writer from read-only instances. This allows the primary to perform dedicated write operations while read queries are offloaded to separate replicas. However, this architecture may result in inconsistent data due to synchronization delays. The Heimdall Proxy solves this problem for developers, database administrators, and architects by ensuring ACID compliance. Customers will save months of development and maintenance at the data access layer without application changes.

SQL Offload

In this blog, we will go over how to 1) Route queries to read and write instances (aka Read/Write splitting) for strong consistency and 2) Automate query caching.

 

The Heimdall proxy is a transparent data access layer that intelligently routes queries to the most optimal data source, resulting in SQL offload and improved response times. The proxy determines whether to retrieve the SQL results from cache or the backend database. If a read query is not deemed cachable, the proxy knows which database instance should be accessed. This architecture is particularly useful for users of Postgres, MySQL, and SQL Server.

 

1.png

Figure 1: Heimdall Proxy Architecture

 

Challenges with Read/Write Splitting

When horizontally scaling out Cloud SQL, developers often encounter data synchronization issues. There’s a time lag when data was written and when the read replicas are updated. The Heimdall proxy routes queries to the appropriate database instance by accounting for replication lag and transactional state. 

 

There are a variety of approaches to Read/Write splitting, but they are not replication lag aware. That is, how do you know when the WRITE has been updated across all database instances? Could you be reading stale data? Other solutions may track the replication lag but are not intelligent enough to determine WHEN it is safe for a particular query to be routed to the read-slave. 

 

Note: The Heimdall Proxy does not support data replication; that is the job of the database. Our proxy detects when data has completed replication and routes queries to the appropriate database instance. 

 

The proxy parses each query to determine what tables are associated with it. The last write time to the table is tracked; as well as the time it took for the update to appear on the read instance.  With this information, the proxy determines which database instance is “safe” to read from–either the read instance or only the write/primary instance. 

Configuration and Setup

Download the Heimdall proxy instance from the Google Cloud Marketplace. The installation will include both the proxy and Central Console. For more information, visit our technical documentation

 

On the Heimdall Central Console, our configuration wizard takes you step-by-step to successfully connect the Heimdall proxy to your GCP services (i.e. application, database, cache), and configure features (e.g. Read/Write split, Caching, Load balancing). Once installed, on the Heimdall Central Console, select “Wizard”, and then “Manual Configuration” shown in Figure 2 below.

 

2.png
 

Read/Write Splitting Configuration 

After completing the Wizard, the configuration will be pre-populated. The “Data Sources” tab should appear like below, with one read/write instance configured along with at least one read-only instance. Ensure that a read/write primary is configured with at least one read-only instance: 

 

3.png
 

In Figure 3, the replication lag was set to 1000ms. Here’s how it would work: 

  • If a table was last written two seconds (2000ms) ago (i.e. outside of the replication lag time), the proxy would consider it “safe” to use the read replica.
  • If the last write was 500ms ago (i.e. within the replication lag time), the proxy would route the query to the primary instance to complete the read operation. This would guarantee fresh, updated responses.

The value of the detected replication lag is found on the Status screen (if any):

4.png

For further Read/Write split control:  

  • A fixed lag window can be statically configured. This allows the replication lag window to spike in a short term without impacting the freshness of the data being returned.
  • For granular query splitting, use our regular expression, rules engine. For instance, you can bypass the replication lag policy for particular queries, and always use the read instance.
  • In cases where the replication lag is not a concern, you can create a Read/Write split rule for particular queries that should be unconditionally read from the write or read instance.  For instance, For instance, you may want to prevent a reporting job from putting load on the primary instance for particular users.  

Finally, in the Rules tab, configure a Reader Eligible rule to designate what queries can be served from the read replica shown in Figure 6.

5.png

Figure 5. “Rules” tab 

 

Real-time analytic charts on the Analytics tab shows read and write queries now splitting:

 

6.png
Figure 6. “Dashboard” tab 

 

Specific query source is found in the Log tab:

 

7.png
 
Real-time Analytics

On the Analytics tab of the central console, the dashboard below shows that Read/Write splitting is now occurring, rerouting read traffic from the primary to the read replicas. The majority of traffic is now routed to the read replicas. It works!

 

8.png

 

Additional validation that Read/Write splitting is working can be seen in the log tab in Figure 7, where the actual source of each query can be found. In this case, the reads are directed to the “Magento-Demo-source-Reader,” while the update is performed on “Magento-Demo-source-Master.” 

 

9.png
Figure 9. “Log” tab

Automated Query Caching

Another way to improve database scale is with SQL caching. In this scenario, a grid cache like Redis is used as a look-aside, SQL results cache. The Heimdall Proxy intelligently determines which queries to cache and automatically invalidates when the proxy detects an update to the database.

 

10.png
 

The best part of our caching solution is its transparency, not requiring any code changes while being database vendor-neutral. Caching and invalidation are automated and fully user-configurable. For more information on how to configure Heimdall for auto-caching, check our automated caching blog

Customer Benefits

To take full advantage of the performance and scalability of your database, application owners must properly interface with their databases. This often requires code changes. The Heimdall database proxy supports optimal SQL offload to maximum Cloud SQL scale. Download a free trial on the Google Cloud Marketplace.

Resources 

 

Author: Roland Lee, Heimdall Data

Heimdall Data, a Google Cloud Partner, offers a Cloud SQL proxy that intelligently manages your SQL connections without application changes. Save months/years in development and maintenance.

Contributors
Version history
Last update:
‎09-03-2021 11:11 AM
Updated by: