Hashcash verification in Apigee Edge (proof-of-work verification)

Got Bots? Want to include proof-of-work checks in your API Proxies? If so, I've got something for you.

Proof-of-work mechanisms are a way a receiver of a message can require the sender of a message to perform some work before constructing a valid message. Hashcash is one mechanism for encoding the proof-of-work. Today I'm going to show you how to include Hashcash checks in your API Proxies.

First, some background. With Hashcash, a server can require that a client send it a string, that hashes to a value that has a specified number of leading zeros. The hash function used in Hashcash is SHA1, but it need not be - you could create a different proof-of-work with a different hash if you like.

The SHA1 function is well defined and there are tools included in the operating system for computing them. For example in my OSX laptop, I can do this to compute a sha1 checksum on a string:

$ echo -n abc | shasum -a 1 
a9993e364706816aba3e25717850c26c9cd0d89d  -

You can look here to find more test vectors for various one-way hash functions.

Because hash functions are one-way, there's no way to predict the value of a hash function based on the input. The only way to know the hash value is to compute it. And this takes time.

Hashcash lets the server say to the client: "you need to send me a string that starts with a specific set of characters, and ends with anything you like. But the SHA1 hash of the string must have N leading zeros." Because the client will have to try lots of strings before finding one, it means the client has to do "work". If the client sends a string that produces a SHA1 with leading zeros, it amounts to proof-of-work. If the client sends a string that does not produce a SHA1 with the requisite number of leading zeros, then the server can reject the request based on a lack of proof-of-work.

There are public-domain libraries that compute Hashcash values. It's pretty simple if you think about it... In psuedo code:

do {
  tstring = compose(fixed_part,timestamp,random_part);
} while (hasLeadingZeros( sha1(tstring), N ));

Doing "the work" on the client side might take 200ms or even 1200ms or more. If you require a hashcash with 32 bits of zeros, it could take weeks of runtime. But on the server, it takes less than a millisecond to verify the work.

This is the secret to bot interdiction. By requiring the client to do work, which is really easy for the server to verify, an API server discourages bots. A bot attack plan relies on a computationally cheap and easy way to communicate with the server. If the protocol requires that the bot takes 5 seconds of compute time before making a request, it makes a bot much less economically feasible.

OK, so proof-of-work discourages bots, and Hashcash is a well-known way to enforce proof-of-work. I've just put together an example that shows how you can embed Hashcash verification into any API Proxy running in Apigee Edge.

The code is all available on github here.

And I've recorded a 12-minute screencast showing how it works, too.

5952-screenshot-20171117-105253.png

I demonstrate it using a Java command-line program as the client generating the Hashcash, but in a production case, it will be the dedicated client running on a mobile device, or on a server, etc. that produces the Hashcash. Check it out.

If you have any questions or feedback, on this, let me know! I'd love to hear it.

Comments
anilsr
Staff

Great article @Dino , +1

API-Evangelist
Silver 5
Silver 5

Good article..tested and it works. Do you have any existing customer who uses this? what is the possible use case which you can share?

dchiesa1
Staff

FYI: I've updated the Java callout to use SHA-256by default.

Everything else works as before.

dchiesa1
Staff

By the way, the same model is possible with other proof-of-work systems. For example Merkle-Tree calculation. A Merkle Tree is a tree structure in which each node represents a crypto hash of a block of data. "Computing the tree" takes a predictable amount of time. Verifying that the computed tree of hashes is correct... takes a very small amount of time.

This asymmetry allows its use in a proof-of-work algorithm. (In fact the Merkle Tree is used as the proof-of-work in bitcoin.). Applying that as a client-enforced rate limiter is pretty straightforward. I have an example for that, too.

Version history
Last update:
‎12-22-2016 06:01 PM
Updated by: