AMU User Guide: KVMs migration from Apigee Edge/OnPrem to Apigee X/Hybrid

Building a comprehensive migration solution for a specific customer and a specific combination of Apigee versions is a non-trivial problem. It touches several areas of complexity, such as proxy functionality, environment mapping, and objects that are not easily extracted from a source instance. The current version of the amu utility targets one specific gap: existing Apigee toolkits cannot extract the encrypted values stored in KVM objects. The utility is meant to be used in combination with other export/import utilities to create a complete migration solution.

The toolkits currently used for migration rely on the product’s existing Management API. Unfortunately, some objects are not covered by this approach. The most prominent omissions for Edge/On-Premises and pre-1.8 Apigee Hybrid versions are:

  • KVMs, 
  • Key stores/Private Keys, 
  • Access tokens.

AMU (Apigee Migration Utility) accesses Cassandra keyspaces and tables directly. This makes it possible to extract objects that are inaccessible through the existing Management APIs.

The design of the AMU takes into account a variety of usage scenarios. A typical one would be migration from OPDK (also known as On-Prem) to Apigee Hybrid. 

Currently supported sources are OPDK/Cassandra 2.0 or Hybrid/Cassandra 3.x.

Generated targets are maven config plugin or apigeecli 'file systems'. They are intended for import by the config maven plugin, sackmesser, or apigeecli. As a result, any version of Apigee can be the target: Edge, On-Prem, X, or Hybrid.

The utility is also useful for pre-1.8 versions of Apigee Hybrid. Starting with version 1.8, Apigee exposes a KVM management API that supports functionally complete operations on the current state of KVMs and their entries. In earlier versions, there was no practical way to extract KVM entries. AMU fills this gap.

The intended audience for this utility is customers and partners who are performing data migration between different versions of Apigee.

AMU Workflow

For maximum flexibility, AMU introduces two distinct operations, emigrate and export. The output of the emigrate operation is an input for the export operation.

The emigrate operation extracts the contents of the relevant Cassandra tables, as appropriate for the Apigee/Cassandra version, and persists them in a local file system.

The export operation processes that file system and generates output in a format suitable for a target CLI tool.

This separation makes it possible to troubleshoot each stage of the workflow separately, and to isolate and pass around the input for a specific edge case.

When emigration needs to be performed from a fully managed Apigee Edge deployment, customers and partners do not have direct access to the underlying data storage (Cassandra). In that case, they need to contact their executive representatives, who will coordinate with engineering and support, to discuss the available options.

 

yuriyl_0-1663796876028.png


As the diagram shows, amu uses the cassandra-cli utility to interact with OPDK, which still bundles Cassandra 2.x. Over the Apache Thrift interface, it reads the keyspace and column family that hold the KVM data and encryption material.

For Cassandra 3.x, the utility used to extract data is cqlsh, and the result is captured as the output of a CQL3 SELECT command.

You can then use the amu kvms export command to transform the output data into a file/folder structure appropriate for importing via either the maven config plugin or apigeecli.

AMU KVM Decryption Primer

In every Apigee product, KVMs can be defined or scoped at 4 levels: organization, environment, proxy, and revision. 

See for details: https://cloud.google.com/apigee/docs/api-platform/reference/policies/key-value-map-operations-policy...

Depending on the scope type, the structured scope value is encoded slightly differently.

For Edge or On-Prem versions, a KVM has to be declared encrypted for the data to be encrypted. In X/Hybrid, every KVM is encrypted.

The encryption algorithm used is AES-128 in ECB mode.

For Apigee X/Hybrid encryption documentation details, see

https://cloud.google.com/apigee/docs/api-platform/cache/key-value-maps#aboutencrypted

https://cloud.google.com/apigee/docs/hybrid/latest/key-encryption

OPDK: https://docs.apigee.com/api-platform/cache/key-value-maps#aboutencrypted

Although AMU contains auxiliary operations to extract KEKs or DEKs from an On-Prem or a Hybrid installation, it is useful to walk through all the steps manually to demystify the decryption process.

On-Premises: KVM Values Decryption

The On-Premises version of Apigee uses a KEK/DEK/MEK approach to keep KVM entries encrypted at rest. For data to be encrypted, you have to define a KVM with the encryption flag, __apigee__encrypted, set to true.

The Master Encryption Key, or MEK, protects access to a vault. The vault is implemented as a Java KeyStore (JKS) file. The JKS file and the passphrase used to access it are configured in the /opt/apigee/edge-management-server/conf/credentials.properties file. That is where they can and should be changed and/or rotated from the default values.

 

yuriyl_1-1663796876105.png

The entry in the vault file is encrypted using the same MEK.

The entry called datastore-alias is the Key Encryption Key, or KEK. It is used to encrypt all the DEKs, which in turn encrypt the data, that is, the KVM values.

The keytool utility, which is used to create and populate JKS files, cannot display the contents of private keys or secrets. You can use the KeyStore Explorer utility to do that.

Here is an example of the look and feel of the KeyStore Explorer interface.

yuriyl_2-1663796876189.png
  1. Look up the vault location and the MEK passphrase. Use KeyStore Explorer to open the vault file.
    When prompted, enter the passphrase to open the vault file, then use the same passphrase to open the datastore-alias entry, which holds our KEK. Make a note of the KEK.

The DEK, or Data Encryption Key, is stored in the keyvaluemap keyspace, in the keyvaluemaps_r21 column family, in a row keyed by organization name for every scoped KVM. A structured name encodes the scope type, and an __apigee__kvm__.keystore pseudo-value holds a JSON structure that contains the DEK for this encrypted KVM. The DEK is stored in base64-encoded format.

 

yuriyl_3-1663796876187.png

Every KVM defined for that scope then has a JSON structure as its value, holding a collection of KVM entries with plain-text names and base64-encoded encrypted values.

There are many ways to execute decryption operations. We are going to use the CyberChef utility, https://gchq.github.io/CyberChef/, to illustrate the decryption process.

NOTE: Never use any online-hosted services when you are working with your customer/company encrypted data. CyberChef can be downloaded and hosted locally. That is the right way to do it from the security perspective.

For this example, decryption takes two CyberChef recipes. The first one decrypts the DEK using the KEK.

  1. In CyberChef, drag and drop the first operation of a new recipe, one that decodes the base64 representation of the DEK into binary form.
  1. Add an AES decryption operation and enter the KEK from the previous step into the Key field. Make sure the AES mode is set to ECB.
  1. Populate the Input field with the DEK value from the __apigee__kvm__.keystore value.

 

yuriyl_4-1663796876098.png
  1. Bake the recipe. The Output will contain a decrypted value of the DEK.

The final step of the decryption process is to use the DEK to decrypt the KVM value.

  1. Create a different CyberChef recipe. The first operation is to decode the data value from base64 format.
  1. The second operation is an AES/ECB decryption using the DEK value as the Key. Make sure that the key format is specified as BASE64.
  1. Insert an encrypted KVM value into the Input field.
yuriyl_5-1663796876058.png
  1. Bake the recipe. The result is the decrypted data value.
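
The same two-stage chain can be reproduced on the command line with openssl instead of CyberChef. The following is a minimal sketch, not amu's implementation. It uses throwaway keys created on the spot, and it assumes that the KEK is available as a hex string and that the decrypted DEK is base64 text of a 16-byte key (as the BASE64 key format in the step above suggests). All variable names and the to_hex helper are made up for illustration.

```shell
# Sketch only: throwaway keys, not real Apigee key material.

# Helper: binary stdin -> continuous hex string (the format openssl -K expects).
to_hex() { od -An -v -tx1 | tr -d ' \n'; }

# --- Fixture: build an encrypted DEK and an encrypted KVM value ---
kek_hex="00112233445566778899aabbccddeeff"       # assumed: KEK as 32 hex digits
dek_b64=$(printf '0123456789abcdef' | base64)    # assumed: DEK kept as base64 text
dek_hex=$(printf '%s' "$dek_b64" | base64 -d | to_hex)

enc_dek_b64=$(printf '%s' "$dek_b64" | openssl enc -aes-128-ecb -e -K "$kek_hex" -base64 -A)
enc_val_b64=$(printf '%s' 'my-secret-value' | openssl enc -aes-128-ecb -e -K "$dek_hex" -base64 -A)

# --- Stage 1: decrypt the DEK with the KEK ---
plain_dek_b64=$(printf '%s' "$enc_dek_b64" | openssl enc -aes-128-ecb -d -K "$kek_hex" -base64 -A)

# --- Stage 2: decrypt the KVM value with the DEK ---
plain_dek_hex=$(printf '%s' "$plain_dek_b64" | base64 -d | to_hex)
kvm_value=$(printf '%s' "$enc_val_b64" | openssl enc -aes-128-ecb -d -K "$plain_dek_hex" -base64 -A)

echo "$kvm_value"
```

With real data, only the two decryption stages apply: the encrypted DEK and the encrypted value come from Cassandra, and the KEK comes from the vault.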

As already mentioned, amu has an operation that exports the KEK programmatically, given the vault and the storepass. Example invocation:

export KEK=$(amu kek export --src $SRC --storepass $STOREPASS --vault $VAULT)

amu uses the following openssl command to decrypt input data in base64-encoded format using a key in hex format, where $datab64 and $keyhex hold the data and the key:

echo -n "$datab64" | openssl enc -aes-128-ecb -d -K "$keyhex" -base64 -A

Hybrid: KVM Values Decryption

In the case of Apigee Hybrid, decryption is controlled by DEKs that the customer supplies for the organization and for every environment. The customer provides those values in the overrides configuration file and applies them with the apigeectl apply command. If customer values are not specified, default values from the values.yaml file are used. This is a bad security practice; default values should never be used.

This key material is transformed into a Kubernetes secret named $ORG_ENV_SHA-encryption-keys and accessed by apigee-runtime.

The amu kek command uses this convention to extract the DEK for the appropriate environment.

 

yuriyl_6-1663796876133.png

Figure. A fragment of overrides file that defines encryption material.

 

yuriyl_7-1663796876183.png

Figure. A Kubernetes Secret manifest that contains KVM organization and environment DEKs

As we can see from the following illustration, the kvm table contains encoded scopes for the 4 types of KVMs, and the value is AES-128/ECB encrypted data stored in base64 format.

 

yuriyl_8-1663796875862.png

We can use either the apigeectl encode command or the ahr-runtime-ctl org-env-sha command from the AHR toolset to calculate the hash of an organization or an organization/environment combination.

apigeectl encode --org $ORG --env $ENV

The apigeectl variant is a bit awkward to use in automation scripts: the command prints a list of Apigee components, so you still need to extract the hash from its output, and there is no variant of the command that generates the secret name. With ahr-runtime-ctl, the secret name can be built directly:

secret="$(ahr-runtime-ctl org-env-sha $ORG "$ENV")-encryption-keys"

Here is the command that the amu kek operation uses to extract a KVM encryption key from a Kubernetes secret:

echo "$(kubectl -n apigee get secrets $secret --output 'jsonpath={.data.envKvmEncryptionKey}'|base64 -d|base64 -d |xxd -ps)"
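
The two base64 -d calls in that pipeline undo two layers of encoding. The sketch below simulates this without a cluster. It assumes that the key written in the overrides file is already base64 text and that Kubernetes base64-encodes Secret data once more when storing it. The key is a throwaway value, and od with tr is used as a portable stand-in for xxd -ps.

```shell
# Simulate the value kubectl would return for .data.envKvmEncryptionKey.
raw_key='0123456789abcdef'                              # throwaway 16-byte key
configured_key=$(printf '%s' "$raw_key" | base64)       # as written in overrides (assumed)
secret_data=$(printf '%s' "$configured_key" | base64)   # as stored in the Secret's .data

# Undo both layers, then hex-encode the raw bytes for use with openssl -K.
keyhex=$(printf '%s' "$secret_data" | base64 -d | base64 -d | od -An -v -tx1 | tr -d ' \n')

echo "$keyhex"
```

The result is a 32-digit hex string, which is the key format the openssl decryption command shown earlier expects.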

AMU: Output File System Formats

The amu kvms command is designed to compose with utilities like the maven config plugin/sackmesser or apigeecli to create a complete migration solution.

For this purpose, amu export supports two different output formats, selected with the --tgt option. The supported values are maven and apigeecli.

For historical reasons, the formats are subtly different.

The maven format uses org and env/<env-name> folder structure. Within each folder, there is a kvms.json file which contains all the kvms defined in this organization or environment.

The kvms.json file contains a collection of objects, each with a name attribute that holds the KVM name and an entry array of objects that define the KVM entries.
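
To make the layout concrete, here is a small sketch that builds a throwaway example of the maven-style structure in a temporary directory. The KVM names, entry names, and values are made up, and the JSON follows the description above (a name attribute plus an entry array of name/value objects); treat it as an illustration rather than actual amu output.

```shell
# Build a throwaway example of the maven-style layout in a temp directory.
export_dir=$(mktemp -d)
mkdir -p "$export_dir/org" "$export_dir/env/test"

# Org-level KVMs: one array element per KVM, entries under "entry".
cat > "$export_dir/org/kvms.json" <<'EOF'
[
  {
    "name": "org-settings",
    "entry": [
      { "name": "region", "value": "emea" }
    ]
  }
]
EOF

# Env-level KVMs for the environment "test".
cat > "$export_dir/env/test/kvms.json" <<'EOF'
[
  {
    "name": "secrets1",
    "entry": [
      { "name": "backend-user", "value": "svc-account" },
      { "name": "backend-password", "value": "changeme" }
    ]
  }
]
EOF

# Show the resulting tree.
(cd "$export_dir" && find . -type f | sort)
```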

 

yuriyl_9-1663796875952.png

Figure.  An example of maven output.

The apigeecli format uses a collection of files per KVM, with structured file names in which the name of the environment and the name of the KVM are separated by an underscore character.

For an env-level KVM, the file name structure is: env_<env-name>_<kvm-name>_kvmfile_<page-no>.json

For an org-level KVM, the file name structure is: org_<kvm-name>_kvmfile_<page-no>.json

As Apigee X/Hybrid supports pagination for KVM entries, you might have multiple files per KVM, each with a sequential number that corresponds to a page of KVM entries.
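
As a quick illustration of the naming scheme, the hypothetical helper below assembles a file name from its parts. It is not part of amu or apigeecli. The env-level pattern includes the kvmfile marker, matching file names such as env_default-dev_my-kvm_kvmfile_0.json used later in this guide.

```shell
# Build an apigeecli-style KVM file name from its parts.
# Usage: kvm_file_name env <env-name> <kvm-name> <page-no>
#        kvm_file_name org <kvm-name> <page-no>
kvm_file_name() {
  case "$1" in
    env) printf 'env_%s_%s_kvmfile_%s.json\n' "$2" "$3" "$4" ;;
    org) printf 'org_%s_kvmfile_%s.json\n' "$2" "$3" ;;
    *)   echo "unknown scope: $1" >&2; return 1 ;;
  esac
}

kvm_file_name env default-dev my-kvm 0
kvm_file_name org org-settings 0
```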

The content of each file is an object with a keyValueEntries property, which holds an array of objects with an entry name and value, and an optional nextPageToken property.

For an apigeecli import of KVM entries to succeed, a KVM with the name taken from the file name must exist before the import operation is executed.

 

yuriyl_10-1663796876074.png

Figure. An apigeecli example of a KVM export output

AMU: Walkthrough

The full list of amu operations and their options is in the README-amu.md file on GitHub.

In this section, let's use some of those commands to walk through an end-to-end migration of KVMs and their contents from Apigee On-Prem 4.51 to Apigee Hybrid 1.8, using the apigeecli utility.

The amu CLI follows a naming convention that maps environment variables to command-line options. For example, if the --tgt option is not given during amu invocation but the environment variable $TGT is defined, the value of the variable is used. For multi-word options, each dash in the option name maps to an underscore in the environment variable name. For example, --export-dir maps to the $EXPORT_DIR variable.

Before executing, every amu operation verifies that all required options are provided. If any option is missing, an error message is displayed.
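
The convention and the required-options check can be sketched in a few lines of shell. This is an illustration of the idea only, not amu's actual code; the function names opt_to_var, resolve_opt, and require_opts are hypothetical.

```shell
# Map an option name to its environment variable: --export-dir -> EXPORT_DIR.
opt_to_var() {
  printf '%s\n' "${1#--}" | tr 'a-z-' 'A-Z_'
}

# Return the command-line value if given, else the matching environment variable.
resolve_opt() {
  opt_name=$1
  cli_value=$2
  if [ -n "$cli_value" ]; then
    printf '%s\n' "$cli_value"
    return 0
  fi
  var=$(opt_to_var "$opt_name")
  # Indirect lookup of the variable whose name is stored in $var.
  eval "printf '%s\n' \"\${$var:-}\""
}

# Fail with a message naming every required option that resolves to nothing.
require_opts() {
  missing=""
  for name in "$@"; do
    [ -n "$(resolve_opt "$name" "")" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing required options:$missing" >&2
    return 1
  fi
}

EXPORT_DIR=/tmp/amu-export
resolve_opt --export-dir ""             # falls back to $EXPORT_DIR
resolve_opt --export-dir /data/export   # explicit value wins
```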

  1. Create an environment variables configuration file that can be re-used to quickly set a working environment.
vi ~/source.env
export AHR_HOME=~/projects/ahr

export PATH=$AHR_HOME/bin:$PATH


export SRC=opdk

export SRC_VER=4.51

export ORG=apigee-opdk-org

export ENV=test

export TGT=apigeecli


export EMIGRATE_DIR=~/projects/amu/$ORG/$SRC-$SRC_VER

export EXPORT_DIR=~/projects/amu/$ORG/$TGT

export BACKUP_DIR=~/apigee-org-backups/apigee-hybrid-org/$SRC
  1. Log into a VM with an on-prem installation that contains an apigee management server component and installed cassandra-cli utility.
  1. amu is a part of the ahr repository. To install the amu utility, clone the repository:
cd ~

git clone https://github.com/apigee/ahr.git
  1. Define PATH and source the source.env file to populate environment variables.

export AHR_HOME=~/ahr

export PATH=$AHR_HOME/bin:$PATH

source ~/source.env

To export KVMs and their contents:

  1. To extract the KEK and put it into an environment variable, so that we neither display its value nor persist it, execute:
export STOREPASS=$(awk -F= '/^vault.passphrase/{FS="=";print($2)}' /opt/apigee/edge-management-server/conf/credentials.properties)
export VAULT=$(awk -F= '/^vault.filepath/{FS="=";print($2)}' /opt/apigee/edge-management-server/conf/credentials.properties)

export KEK=$(amu kek export --src $SRC --storepass $STOREPASS --vault $VAULT)
  1. 'Emigrate' contents of the KVM keyspace/collection.
amu kvms emigrate

The resultant <org-name>-kvms.out file will be put into an $EMIGRATE_DIR directory.

  1. Export the KVMs and entries
amu kvms export

The folder structure will be created in the directory named $EXPORT_DIR. 
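
One detail of the first step above is worth noting: the awk lookups split each line on every = sign, so a passphrase that itself contains = would be cut short. The sketch below shows a variant that keeps everything after the first = sign. It runs against a throwaway stand-in file; the path and passphrase in it are made up, and get_prop is a hypothetical helper, not part of amu.

```shell
# Throwaway stand-in for credentials.properties; values are made up.
props=$(mktemp)
cat > "$props" <<'EOF'
vault.filepath=/tmp/example-vault.jks
vault.passphrase=s3cr=t
EOF

# Print everything after the first "=" for an exact property name.
get_prop() {
  awk -v key="$1" 'index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }' "$2"
}

get_prop vault.filepath "$props"
get_prop vault.passphrase "$props"
```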

At this point you can tar/gz the $EXPORT_DIR directory and copy the archive to a box that has the kubectl command and the apigeecli utility, with access to your Apigee Hybrid 1.8 instance. Untar the contents into $EXPORT_DIR on the target box.
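
A minimal sketch of that archive round trip follows. It uses temporary directories as stand-ins for the export directory on the source box and the target box, and a dummy file in place of real amu output.

```shell
# Stand-in for the real export directory, populated with one dummy file.
export_dir=$(mktemp -d)
echo '{"keyValueEntries": []}' > "$export_dir/org_demo_kvmfile_0.json"

# On the source box: archive the directory contents (paths relative to the directory).
archive="$(mktemp -d)/amu-export.tar.gz"
tar -czf "$archive" -C "$export_dir" .

# Copy "$archive" to the target box (scp or similar), then unpack it there.
target_export_dir=$(mktemp -d)
tar -xzf "$archive" -C "$target_export_dir"

ls "$target_export_dir"
```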

To generate and cache an access token for apigeecli authentication:

apigeecli token cache $(gcloud auth print-access-token)

To import all KVMs:

apigeecli kvms import -o $ORG -f $EXPORT_DIR

To import a specific KVM:

  1. Create a kvm with a required name
apigeecli kvms create -o $ORG -e default-dev -m dev_my-kvm
  1. Import kvm's entries from the generated file
apigeecli kvms entries import -o $ORG -f $EXPORT_DIR/env_default-dev_my-kvm_kvmfile_0.json

At this point, the next step depends on your situation. In the simplest case, you might want to import all On-Prem KVMs into an environment with the same name as in your OPDK installation. Or you might want to deploy them into a different environment, or into multiple environments. It is relatively easy to script any scenario.

Here's an example of a simple one-line script that processes the files in the current directory that match the wildcard env_*_kvmfile_*.json and generates commands that create the KVMs and import their entries. It relies on the three-argument form of match(), which is a GNU awk extension.

ls -1 | grep 'env_.*_kvmfile' | awk '{ match($0, /env_([^_]*)_([^_]*)_kvmfile_([^_]+)\.json/, arr); print "apigeecli kvms create -o $ORG -e " arr[1] " -m " arr[2]; print "apigeecli kvms entries import -o $ORG -e " arr[1] " -m " arr[2] " -f " arr[0] }'

For the folder that contains files:

env_default-dev_amu-test-kvm-env_kvmfile_0.json

env_default-dev_secrets1_kvmfile_0.json

The one-liner above will generate an output:

apigeecli kvms create -o $ORG -e default-dev -m amu-test-kvm-env
apigeecli kvms entries import -o $ORG -e default-dev -m amu-test-kvm-env -f env_default-dev_amu-test-kvm-env_kvmfile_0.json
apigeecli kvms create -o $ORG -e default-dev -m secrets1
apigeecli kvms entries import -o $ORG -e default-dev -m secrets1 -f env_default-dev_secrets1_kvmfile_0.json
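
Where GNU awk is not available, the same idea can be sketched with plain shell parameter expansion. The hypothetical function below also covers importing into a different environment and emits the create command only once per KVM (for page 0). It assumes that environment names contain no underscores and that page numbering starts at 0. The -e and -m flags follow the usage shown in this guide; whether entries import accepts -e in your apigeecli version is an assumption to verify with its --help output.

```shell
# Print apigeecli commands for every env-level KVM file in a directory.
# Usage: gen_kvm_import_cmds <dir> [<target-env>]
# If <target-env> is omitted, the environment encoded in the file name is used.
gen_kvm_import_cmds() {
  dir=$1
  target_env=${2:-}
  for path in "$dir"/env_*_kvmfile_*.json; do
    [ -e "$path" ] || continue          # no matches: the glob stays literal
    file=${path##*/}
    page=${file##*_kvmfile_}            # <page>.json
    page=${page%.json}
    rest=${file#env_}                   # <env>_<kvm>_kvmfile_<page>.json
    rest=${rest%_kvmfile_*}             # <env>_<kvm>
    src_env=${rest%%_*}                 # assumes no underscore in the env name
    kvm=${rest#*_}                      # the KVM name may contain underscores
    tgt_env=${target_env:-$src_env}
    if [ "$page" = "0" ]; then          # create each KVM once, on its first page
      printf 'apigeecli kvms create -o "$ORG" -e %s -m %s\n' "$tgt_env" "$kvm"
    fi
    printf 'apigeecli kvms entries import -o "$ORG" -e %s -m %s -f %s\n' "$tgt_env" "$kvm" "$path"
  done
}

# Demo with two empty placeholder files.
demo_dir=$(mktemp -d)
: > "$demo_dir/env_default-dev_amu-test-kvm-env_kvmfile_0.json"
: > "$demo_dir/env_default-dev_secrets1_kvmfile_0.json"
gen_kvm_import_cmds "$demo_dir"
```

Calling gen_kvm_import_cmds "$demo_dir" prod prints the same commands with prod as the target environment. Review the generated commands before piping them to a shell.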

 

AMU: Technical Tidbits 

Kubernetes auto-management of the Cassandra client pod

https://github.com/apigee/ahr/blob/main/bin/amu#L63

For working, troubleshooting, or experimenting with Cassandra in Hybrid, we need to create and manage a Cassandra client pod. This pod is version-specific. Security-wise, it is a good idea to create this pod with a limited time to live; 3600 seconds is a reasonable value.

It helps to automate the management of the pod lifecycle, and amu does just that. It creates the pod from a Kubernetes manifest. On each cqlsh command invocation, it checks whether the pod is running. If it is not, it checks whether there is a stale pod, deletes it, and creates a new one if needed. It then waits for the pod to be ready.

It automatically propagates credentials from your Apigee hybrid secret to environment variables, and maps a TLS certificate from the secret to a mounted volume.
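
The lifecycle logic described above can be sketched roughly as follows. This is not the amu source (see the link above for that). The namespace, pod name, and manifest file name are hypothetical, and the manifest is assumed to limit the pod's lifetime, for example with activeDeadlineSeconds: 3600. As a simplification, any pod that exists but is not Running is treated as stale.

```shell
# Hypothetical names; adjust to your cluster.
NS=apigee
POD=cassandra-client
MANIFEST=cassandra-client.yaml   # pod spec, assumed to set a limited lifetime

# Make sure a usable client pod exists before running a cqlsh command.
ensure_client_pod() {
  # An empty phase means the pod does not exist (kubectl get fails).
  phase=$(kubectl -n "$NS" get pod "$POD" -o jsonpath='{.status.phase}' 2>/dev/null)
  if [ "$phase" = "Running" ]; then
    return 0                     # reuse the running pod
  fi
  if [ -n "$phase" ]; then
    # The pod exists but is not running (for example, Failed after its deadline).
    kubectl -n "$NS" delete pod "$POD" --ignore-not-found
  fi
  kubectl -n "$NS" apply -f "$MANIFEST"
  kubectl -n "$NS" wait --for=condition=Ready "pod/$POD" --timeout=120s
}
```

A wrapper would call ensure_client_pod first and then run the actual cqlsh command in the pod via kubectl exec.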

 

yuriyl_11-1663796875916.png

Figure. A fragment of the Cassandra client pod lifecycle management.

amu has a convenience command that starts a Cassandra client pod and logs you into it for an interactive session.

amu cassandra cqlsh --hybrid-version $HYBRID_VERSION

List of organizations and environments

Another useful service command lists organizations and environments for either OPDK or Hybrid instances:

amu organizations list --org $ORG

Sample output includes pairs of an organization name (tid, or tenant id) and an environment name:

qwiklabs-gcp-04-791a44540dfd,test

Summary

amu is a useful utility that closes a gap in the existing Apigee management tooling with respect to exporting KVMs.

If you develop your own migration solution for different versions of Apigee, you can easily integrate amu into your solution's workflow.

It is an open source project currently published in the ahr repository, https://github.com/apigee/ahr

Its README file is located at https://github.com/apigee/ahr/blob/main/README-amu.md

This User Guide should be sufficient for you to start to work with the utility as a user. If you wish to contribute to the utility, this guide provides you with information on the design and functionality of the utility.

Last update:
‎09-21-2022 03:01 PM