joel_gauci · ‎04-25-2022

Introduction

In this article, we discuss how to leverage Google reCAPTCHA Enterprise with Apigee X. The objective is to demonstrate how to use the two products to build a solution to detect and reject fraudulent requests from bots on the APIs exposed by Apigee.

Requirements

To use reCAPTCHA Enterprise with Apigee X, you must meet the following requirements:

The reCAPTCHA Enterprise API must be enabled on your Google Cloud Platform (GCP) project: gcloud services enable recaptchaenterprise.googleapis.com
Use an existing Apigee X organization or install a trial one
Use Apigee DevRel to deploy the reCAPTCHA Enterprise reference

This community article is not intended to present reCAPTCHA Enterprise itself. For information about the product, refer to the Google Cloud documentation.

Why use reCAPTCHA Enterprise with Apigee?

Here are some use cases for combining reCAPTCHA Enterprise with Apigee X or hybrid:

Getting the score on Apigee to build custom reports using the DataCapture policy
Honeypotting: a score that is too low routes the request to a mock target API instead of rejecting it
Getting the reCAPTCHA token on an Apigee hybrid runtime (on-premises or on cloud providers other than GCP) where Cloud Armor is not (or cannot be) used
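
The honeypot use case boils down to a routing decision. Here is a minimal sketch in Python, not the configuration of the DevRel reference: the function name choose_target and the two target URLs are hypothetical, and the default threshold of 0.6 simply mirrors the value used later in this article.

```python
# Hypothetical target URLs, used only for illustration.
REAL_TARGET = "https://backend.example.com"
MOCK_TARGET = "https://mock.example.com"


def choose_target(score: float, min_score: float = 0.6) -> str:
    """Return the target URL for a request with the given reCAPTCHA score.

    A request whose score is below min_score is routed to the mock
    target (the honeypot) instead of being rejected.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("reCAPTCHA scores range from 0.0 to 1.0")
    return REAL_TARGET if score >= min_score else MOCK_TARGET
```

With this kind of routing, a suspected bot still receives a response, so it gets no signal that it has been detected.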

Solution Overview

This section presents the proposed solution based on Apigee (X, hybrid or Edge Public and Private Cloud) and reCAPTCHA Enterprise.

The solution is based on the following components:

Google reCAPTCHA Enterprise: this GCP solution is responsible for generating an encrypted token that contains the validity and the risk score of the client application
Apigee X as the API management platform (other Apigee runtime options can be considered instead of Apigee X): a Shared Flow is in charge of extracting the reCAPTCHA token from a request header, verifying the API key, and calling the reCAPTCHA Enterprise endpoint to allow or deny the client app request based on the validity status and the risk score. This Shared Flow can be called by any API proxy that needs to protect a backend application system
A web browser, which loads the customer web page. This web page is able to invoke the reCAPTCHA Enterprise endpoint in order to obtain a reCAPTCHA token
Application systems or backend APIs that need to be protected

How does it work?

The process starts with the browser loading the customer web page served by the backend/web server, and then loading the reCAPTCHA JavaScript client for reCAPTCHA Enterprise.

The client app (a web page or a mobile application) then retrieves a token from a Google Cloud reCAPTCHA endpoint, using a site key as an input parameter.

The reCAPTCHA Enterprise endpoint returns a score for each request, which is based on the end-user interactions with your site or application. By interpreting these scores, you can take appropriate actions for your site or application, which is exactly the role Apigee plays in this solution.

reCAPTCHA Enterprise has 11 levels for scores with values ranging from 0.0 to 1.0. The score 1.0 indicates that the interaction poses low risk and is very likely legitimate, whereas 0.0 indicates that the interaction poses high risk and is likely to be fraudulent.

In the configuration reference proposed in Apigee DevRel, we consider that a risk score of 0.6 and above corresponds to a legitimate interaction. Therefore, this value represents the minimum score that an application must obtain to be trusted to use an API.

The minimum risk score can be configured on each Apigee API proxy that calls the reCAPTCHA Enterprise Shared Flow. For example, a payment or order API would require a higher score than an API that returns a list of products or a list of branches.
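
To illustrate the idea of a per-API minimum score, here is a small Python sketch. It is not how the reference stores its configuration: the proxy names and the scores of 0.9 and 0.8 are invented for this example, and only the default of 0.6 comes from the DevRel reference.

```python
# Hypothetical per-proxy minimum scores; only 0.6 comes from the reference.
MIN_SCORE_BY_PROXY = {
    "payments-v1": 0.9,
    "orders-v1": 0.8,
    "products-v1": 0.6,
}
DEFAULT_MIN_SCORE = 0.6


def is_trusted(proxy_name: str, score: float) -> bool:
    """Tell whether a reCAPTCHA score is high enough for the given proxy."""
    min_score = MIN_SCORE_BY_PROXY.get(proxy_name, DEFAULT_MIN_SCORE)
    return score >= min_score
```

A score of 0.7 would therefore be accepted by the products proxy but rejected by the payments proxy.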

The minimum risk score is set on the FlowCallout policy that invokes the Shared Flow containing the dedicated processing, as shown in the following picture:

Which APIs to protect with reCAPTCHA Enterprise?

For example, an OAuth 2.0 token endpoint exposed on Apigee is a perfect candidate for such protection, as we want to deliver access tokens only to valid client applications used by humans, not to bots. It is the responsibility of the API developer or security architect to decide which APIs on Apigee require bot detection based on reCAPTCHA Enterprise.

Example of a simple Web Page

In this section, we present a simple web page that includes JavaScript code that is used to get a reCAPTCHA token from a Google reCAPTCHA Enterprise endpoint.

This code is used on the Apigee DevRel reference as an example of an HTML web page.

<!DOCTYPE html>
<html>
  <head>
  <title>Web page - reCAPTCHA enterprise</title>
  <link rel="preconnect" href="https://www.google.com">
  <link rel="preconnect" href="https://www.gstatic.com" crossorigin>
  <script src="https://www.google.com/recaptcha/enterprise.js?render=<sitekey>"></script>
  <script>
      function copyToken() {
        var copyText = document.getElementById("textArea");
        copyText.select();
        navigator.clipboard.writeText(copyText.value);
      }
      function getToken(e) {
        e.preventDefault();
        grecaptcha.enterprise.ready(function() {
          grecaptcha.enterprise.execute('<sitekey>', {action:'home'}).then(function(token) {
             var text = document.createTextNode(token);
             var tag = document.createElement("textarea");
             tag.setAttribute("id", "textArea");
             tag.appendChild(text);
             var element = document.getElementById("token");
             element.appendChild(tag); 
          });
        });
      }
      window.onload = getToken;
  </script>
  </head>
  <body>
    <div id="token">
     <h2>Here is your reCAPTCHA enterprise token:</h2>
    </div>
    <div>
     <button onclick="copyToken()">Copy</button>
    </div>
  </body>
</html>

The <sitekey> parameter must be replaced with a valid site key. A sitekey lets you protect your endpoints by verifying user interactions on your web pages and mobile applications.

When creating a reCAPTCHA Enterprise sitekey, you can specify the domains or subdomains of websites allowed to use this key.

The sitekey is public, as it is embedded in the web page. Exposing it does not let anyone impersonate a client application on Apigee, because this key is not used to identify the client application.

On Apigee, we use app credentials, a JWT, or an OAuth 2.0 access token to identify and authenticate/authorize a client application.

When the client application submits a request on Apigee, it does not provide the sitekey information but the reCAPTCHA token and its credentials (or a security token in the form of a JWT or access token).

Using these credentials or security tokens, Apigee can identify the client application. It is also possible to get custom attributes that have been set at the developer app level in Apigee: the sitekey is one of these custom attributes. Apigee can retrieve it and use it to create an assessment request.

The assessment request consists of posting the following data to a dedicated Google reCAPTCHA Enterprise endpoint:

The reCAPTCHA Enterprise token that was originally retrieved from the reCAPTCHA Enterprise endpoint by the web or mobile application
The sitekey associated with the reCAPTCHA Enterprise token
The identifier of the GCP project in which the sitekey has been created

Here is an example of a JSON payload that is posted on the reCAPTCHA Enterprise assessment endpoint:

{
  "event": {
    "token": "<recaptcha_token>",
    "siteKey": "<sitekey>"
  }
}

The reCAPTCHA Enterprise assessment endpoint has the following form: https://recaptchaenterprise.googleapis.com/v1/projects/<gcp_project_id>/assessments
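
The assessment request described above can be sketched in Python as follows. This is only an illustration of how the URL and the JSON payload fit together, not the Shared Flow configuration: the function name build_assessment_request is hypothetical, the request is built but not sent, and authentication to the Google API is deliberately left out because it depends on your setup.

```python
import json
import urllib.request


def build_assessment_request(
    project_id: str, token: str, site_key: str
) -> urllib.request.Request:
    """Build (without sending) a POST request for the assessment endpoint."""
    url = (
        "https://recaptchaenterprise.googleapis.com/v1/"
        f"projects/{project_id}/assessments"
    )
    # Same payload shape as the JSON example shown in this article.
    payload = {"event": {"token": token, "siteKey": site_key}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Note that the GCP project identifier travels in the URL path, while the token and the sitekey travel in the request body.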

The assessment (response) is interpreted on Apigee. Details on how to interpret an assessment are available in the Google Cloud documentation.

The reCAPTCHA Enterprise assessment endpoint is in charge of setting the validity and calculating the risk score based on the reCAPTCHA token it receives.

Here is an example of an assessment that is retrieved and interpreted on the reCAPTCHA Enterprise Apigee Shared Flow:

{
	"name": "projects/<project_id>/assessments/<assessment_id>",
	"event": {
		"token": "<recaptcha_token>",
		"siteKey": "<sitekey>",
		"userAgent": "",
		"userIpAddress": "",
		"expectedAction": "",
		"hashedAccountId": ""
	},
	"riskAnalysis": {
		"score": 1,
		"reasons": []
	},
	"tokenProperties": {
		"valid": true,
		"invalidReason": "INVALID_REASON_UNSPECIFIED",
		"hostname": "example.com",
		"action": "home",
		"createTime": "2022-04-22T13:10:30.110Z"
	}
}

The tokenProperties.valid and riskAnalysis.score fields are checked in the Apigee Shared Flow. For example, submitting the same reCAPTCHA Enterprise token more than once causes tokenProperties.valid to be set to false, which prevents replay attacks.
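
As a rough sketch of this check, the following Python function inspects the two fields of an assessment. It is an illustration written for this article rather than the logic of the Shared Flow itself: the function name evaluate_assessment and the wording of the returned messages are assumptions.

```python
from typing import Tuple


def evaluate_assessment(assessment: dict, min_score: float = 0.6) -> Tuple[bool, str]:
    """Return (accepted, reason) for a reCAPTCHA Enterprise assessment."""
    token_properties = assessment.get("tokenProperties", {})
    # An invalid token (expired, already used, ...) is rejected first.
    if not token_properties.get("valid", False):
        reason = token_properties.get("invalidReason", "INVALID_REASON_UNSPECIFIED")
        return False, f"invalid token: {reason}"
    # A valid token must also reach the minimum risk score.
    score = assessment.get("riskAnalysis", {}).get("score", 0.0)
    if score < min_score:
        return False, f"score {score} is below the minimum of {min_score}"
    return True, "ok"
```

Checking validity before the score matters: the score of an invalid token should not be trusted.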

Apigee reference for reCAPTCHA Enterprise

To enable the integration between Apigee and reCAPTCHA Enterprise, you can use the Apigee DevRel reference that can be deployed on Apigee.

This reference is composed of the following artifacts:

sf-recaptcha-enterprise-v1: a Shared Flow, which contains the full configuration of the reCAPTCHA Enterprise reference. This Shared Flow is in charge of invoking the reCAPTCHA assessment endpoint to obtain the token validity status and the risk score evaluation. It is responsible for analyzing this data (validity and score) to make the right decision: reject or accept the incoming request
recaptcha-data-proxy-v1: an example of a proxy that is protected against malicious bot activity by calling the reCAPTCHA Enterprise Shared Flow. The target endpoint of this proxy is httpbin.org
recaptcha-deliver-token-v1: an API proxy used to deliver a demo HTML web page that includes a valid reCAPTCHA token. This proxy is not intended to be used in production but only during test phases
The RecaptchaEnterprise API product
A developer (Jane Doe)
Developer apps with different reCAPTCHA score requirements

reCAPTCHA Enterprise site keys for testing

The reCAPTCHA Enterprise site keys of the two developer apps are designed only for testing and are created using the Google Cloud CLI.

As an example, here is the gcloud command used to create a score-based site key that has no domain name enforcement and always returns a risk score of 1.0:

gcloud recaptcha keys create --testing-score=1.0 \
  --web \
  --allow-all-domains \
  --display-name="Always 1" \
  --integration-type=score

Apigee DevRel pipeline script

You can use the pipeline script provided in the reference to deploy these artifacts on Apigee X or hybrid, and start experimenting with Apigee and reCAPTCHA Enterprise.

Important: the reCAPTCHA Enterprise reference in Apigee DevRel also provides a "mock" mode to demonstrate the token validation on a data proxy, in case you cannot enable reCAPTCHA on your GCP project or cannot generate a reCAPTCHA token in your client web/mobile application.

Sequence Diagram

In this section, we describe interactions between the different actors involved in the Apigee and reCAPTCHA "dance".

These actors are:

End-users
Client applications
Google reCAPTCHA Enterprise endpoint
Apigee platform, comprising the data proxy and the reCAPTCHA Shared Flow
Backend APIs

The sf-recaptcha-enterprise-v1 Shared Flow is responsible for extracting the reCAPTCHA token from a dedicated request header (x-recaptcha-token in the example). This Apigee configuration could be modified to extract the token from a JSON payload or form parameters, for example.

Once the token has been extracted, an API key verification is performed to identify the client application. As the reCAPTCHA site key has been set as a custom attribute of the developer app in Apigee, it is possible to get the site key value and use it to POST the request on the reCAPTCHA Enterprise assessment endpoint.

Based on the response of this endpoint, the sf-recaptcha-enterprise-v1 Shared Flow can allow or deny the request: it checks both the token validity (is the token valid or not?) and the risk score (is the score greater than or equal to 0.6?).

If the client app is considered legitimate, the final step of the Shared Flow processing consists of removing the client identifier and the reCAPTCHA Enterprise token from the request so as not to propagate these values to any backend systems.
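
The sequence described in this section can be summarized by the following Python sketch. It only mimics the order of the steps and is not the Shared Flow configuration: the function protect_request and its two callback parameters are hypothetical stand-ins for the API key verification and the call to the assessment endpoint, and header names are assumed to be lowercase.

```python
from typing import Callable, Dict, Optional

TOKEN_HEADER = "x-recaptcha-token"
APIKEY_HEADER = "x-apikey"


def protect_request(
    headers: Dict[str, str],
    site_key_for_api_key: Callable[[str], Optional[str]],
    assess: Callable[[str, str], dict],
    min_score: float = 0.6,
) -> Optional[Dict[str, str]]:
    """Return the headers to forward to the backend, or None to reject."""
    # Step 1: extract the reCAPTCHA token and the client identifier.
    token = headers.get(TOKEN_HEADER)
    api_key = headers.get(APIKEY_HEADER)
    if not token or not api_key:
        return None
    # Step 2: identify the client app and read its site key attribute.
    site_key = site_key_for_api_key(api_key)
    if site_key is None:
        return None
    # Step 3: request an assessment and check validity and score.
    assessment = assess(token, site_key)
    valid = assessment.get("tokenProperties", {}).get("valid", False)
    score = assessment.get("riskAnalysis", {}).get("score", 0.0)
    if not valid or score < min_score:
        return None
    # Step 4: do not propagate the API key or the token to the backend.
    return {
        name: value
        for name, value in headers.items()
        if name not in (TOKEN_HEADER, APIKEY_HEADER)
    }
```

Passing the lookup and the assessment call as parameters keeps the sketch testable without any network access.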

Testing the Solution

Once the Apigee DevRel reference has been configured to work with reCAPTCHA Enterprise and deployed on an environment of your Apigee X or hybrid instance, you must set the following environment variables:

export APIGEE_X_HOSTNAME=<your_apigee_x_hostname>
export APIKEY=<your_clientapp_apikey>

From there, you can use the recaptcha-deliver-token-v1 API proxy to deliver an HTML page that includes a valid reCAPTCHA token.

As discussed previously, the Apigee DevRel reference deploys two developer apps:

app-recaptcha-enterprise-always0
app-recaptcha-enterprise-always1

Each of these developer apps uses a score-based site key that always returns a fixed score (0 or 1).

They are only used for testing purposes and must not be used in production.

As their names indicate, the first developer app will always be considered a bot, as the risk score associated with this application will always be 0. On the other hand, the second developer app will always be considered a legitimate client application, as the risk score associated with this application will always be 1.

At this point, access your Apigee X/hybrid console and select the developer app you want to use. Open the custom attributes section of the app and copy the value of the SITE_KEY attribute, as shown in the following picture:

Paste the value of the SITE_KEY attribute into a dedicated environment variable, as shown here:

export SITEKEY=<paste_site_key>

Open the following URL in a web browser:

https://${APIGEE_X_HOSTNAME}/recaptcha/v1/token?sitekey=${SITEKEY}

The web page delivers a valid reCAPTCHA token, as shown in the following picture:

Click the Copy button to copy the value of the reCAPTCHA Enterprise token, then paste it into an environment variable:

export RECAPTCHA_TOKEN=<paste_recaptcha_token>

Important: a reCAPTCHA Enterprise token has a validity period of 2 minutes and can be used only once to avoid replay attacks.

Once the environment variables have been set, you can enable a Debug session on the recaptcha-data-proxy-v1 API proxy and execute the following request:

curl -H "x-apikey: ${APIKEY}" \
     -H "x-recaptcha-token: ${RECAPTCHA_TOKEN}" \
     https://${APIGEE_X_HOSTNAME}/recaptcha/v1/data/headers

Based on the developer app you have selected for the test ("Always 0" or "Always 1"), you get a 400 or a 200 response code, respectively.
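
If you prefer a script to curl, the same request can be sketched in Python with the standard library. The function names build_data_request and send are hypothetical, and the sketch assumes the same hostname, API key, and token values as the environment variables set earlier.

```python
import urllib.error
import urllib.request


def build_data_request(
    hostname: str, api_key: str, recaptcha_token: str
) -> urllib.request.Request:
    """Build the same GET request as the curl command in this article."""
    return urllib.request.Request(
        f"https://{hostname}/recaptcha/v1/data/headers",
        headers={
            "x-apikey": api_key,
            "x-recaptcha-token": recaptcha_token,
        },
        method="GET",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the status code is still available.
        return error.code
```

Calling send(build_data_request(hostname, api_key, token)) with your own values returns the status code of the data proxy. Remember that each token can be sent only once.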

Example: status code 200

If the reCAPTCHA token is valid and the risk score is 1.0, you get a 200 status code, as shown here:

Example: status code 400

If you execute the same request again, the reCAPTCHA token is now considered invalid because it can be used only once, and you get a 400 status code, as shown here:

Thanks to Omid Tahouri and Daniel Strebel for their feedback on drafts of this article!

Apigee X and reCAPTCHA Enterprise: best friends ever
