Goodbye, and Good Riddance, To CAPTCHA and Hello To Tokenization


Photo by olieman.eth on Unsplash


Yesterday, I provided some “anti-money laundering” evidence, which here in the UK involves giving a solicitor a utility bill as proof of address. For this, you can provide a PDF of a document from a service provider (such as a utility bill) which is addressed to you. To me, this carries zero trust, as the PDF is not signed by the service provider, and anyone can modify a PDF to say whatever they want. We thus live in a fake digital world of trust. We have scaled our paper-based world to a digital world, and forgot that this scaling just does not work. The number of people who think that PDFs cannot be changed is quite worrying. A properly protected PDF with a digital signature is another matter, but most PDFs are unprotected, and are easily converted into Word, edited, and then recreated as a PDF.

But things are changing. Just this week, Apple and Cloudflare announced the use of Private Access Tokens (PATs), which will allow trusted users to identify themselves without using those horrible CAPTCHA puzzles. This is based on creating tokens which are signed by a trusted entity, and which can attest that a user, a browser or a device can be trusted.

I must admit I really dislike using CAPTCHA, especially as they often pose images of US-centric things such as “cross-walks”, trucks and buses. For example, I wasted a few minutes of my life pondering whether this was a motorbike or actually a bicycle:

I think the right answer is “SKIP”, but I might be wrong. For PATs, the server sends an HTTP authentication challenge to clients and requests that the client returns a token that is signed by a token issuer that is trusted on the system. It’s as simple as that!

Privacy Pass

In April 2022, a new IETF draft defined a Privacy Pass HTTP authentication scheme [here]:

While others, such as Firefox, have tried this type of token-based access, it failed to gain large-scale adoption. But the new announcement from Apple and Cloudflare pushes this to a new level.

Outline

Within Privacy Pass, Alice runs a Web server and stores the public key of Trusted university. When Bob, who is part of Trusted university, tries to access the Web pages on Alice’s site, she sends him a challenge, and he returns a token signed with the private key of Trusted university. Alice then checks the token with the public key of Trusted university. If it checks out, Alice will deliver the Web pages within a session, without using CAPTCHA. It is that simple!

For a Web site, the server will have to adopt one or more token issuers, each identified by its hostname and public key. When the server (Alice) sends a challenge to the client (Bob), it will include the token issuer’s hostname and public key in the challenge. We can probe the public key using:

https://<issuer name>/.well-known/token-issuer-directory
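As a minimal sketch of that lookup, the following builds the directory URL for an issuer hostname. The hostname issuer.example is hypothetical, and the fetch helper assumes the issuer answers with a JSON document, which I have not confirmed for every issuer:

```python
import json
from urllib.request import urlopen


def issuer_directory_url(issuer_name: str) -> str:
    """Build the well-known directory URL for a token issuer hostname."""
    return f"https://{issuer_name}/.well-known/token-issuer-directory"


def fetch_issuer_directory(issuer_name: str, timeout: float = 5.0) -> dict:
    """Fetch the directory (assumption: the response body is JSON)."""
    with urlopen(issuer_directory_url(issuer_name), timeout=timeout) as response:
        return json.load(response)


# "issuer.example" is a placeholder hostname, not a real issuer.
print(issuer_directory_url("issuer.example"))
```

Only the URL builder runs here; fetch_issuer_directory needs network access and a real issuer hostname.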

An example of a request to Cloudflare gives [here]:

This challenge contains the token type, the issuer hostname, and the hostname of the server. With iOS 16 and the new macOS operating system (Ventura), the supported token is type 2 (for publicly verifiable RSA Blind Signatures).
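To give a feel for how an RSA blind signature works, here is a toy sketch using textbook RSA with tiny primes. It is not the scheme as deployed (which uses large keys and a padded encoding); the key values and function names are my own illustration:

```python
import hashlib
import math
import secrets

# Toy RSA key with textbook-sized primes (far too small for real use).
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent
D = 2753    # private exponent: E * D = 1 mod (P - 1) * (Q - 1)


def message_to_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def blind(m: int):
    """Client: hide m from the issuer with a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer: sign the blinded value without learning m."""
    return pow(blinded, D, N)


def unblind(blind_sig: int, r: int) -> int:
    """Client: strip the blinding factor to obtain a signature on m."""
    return (blind_sig * pow(r, -1, N)) % N


def verify(m: int, sig: int) -> bool:
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(sig, E, N) == m


m = message_to_int(b"token input")
blinded, r = blind(m)
signature = unblind(issuer_sign(blinded), r)
print(verify(m, signature))
```

The issuer only ever sees the blinded value, so it cannot later link the signature it produced to the token that the origin verifies.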

With this, we define the origin (Alice, the Web site), a client (Bob), an attester (the element that vouches for the client), and a token issuer. For a mobile phone, we might attest to the IMEI of the device. When Bob sends a request for Web pages, Alice sends back an HTTP 401 response, with a “WWW-Authenticate” field. This contains a challenge for Bob to respond to, along with a “token-key” field holding the public key of the issuer:

WWW-Authenticate: PrivateToken challenge=abc..., token-key=123...
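A client has to pull the two parameters out of this header. The following is a minimal parsing sketch, assuming a single PrivateToken challenge with comma-separated name=value pairs; the function name and the sample values are my own:

```python
def parse_private_token_challenge(header_value: str) -> dict:
    """Split a 'PrivateToken name=value, name=value' header into a dict."""
    scheme, _, rest = header_value.strip().partition(" ")
    if scheme != "PrivateToken":
        raise ValueError(f"unexpected scheme: {scheme!r}")
    params = {}
    for part in rest.split(","):
        # Split on the first '=' only, so base64 padding in the value survives.
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue
        params[name] = value.strip('"')
    return params


# Sample values are placeholders, not a real challenge or key.
example = "PrivateToken challenge=YWJj, token-key=MTIz"
print(parse_private_token_challenge(example))
```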

Redeeming a token

A client will check the challenge for its trustworthiness. If it does not check out, it will ignore the challenge. Bob can generate multiple tokens at a time, and then cache these for future accesses. For example, Bob could produce 100 tokens to identify his mobile device and redeem one for each access to a site. This will reduce the overhead of re-attestation. A cached token can only be used if it matches the token_type, the issuer_name, the redemption_context, and the origin_info of the challenge. The redemption_context field is contained within the challenge from Alice. Each token, though, has a unique client nonce.
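The caching rule can be sketched as a small lookup keyed on those four fields. The class and method names here are hypothetical, and the tokens are just placeholder byte strings:

```python
from collections import defaultdict, deque
from typing import NamedTuple, Optional


class CacheKey(NamedTuple):
    """The four challenge fields a cached token must match."""
    token_type: int
    issuer_name: str
    redemption_context: bytes
    origin_info: str


class TokenCache:
    """Holds batches of tokens and hands out each token at most once."""

    def __init__(self):
        self._tokens = defaultdict(deque)

    def store(self, key: CacheKey, tokens) -> None:
        self._tokens[key].extend(tokens)

    def redeem(self, key: CacheKey) -> Optional[bytes]:
        queue = self._tokens.get(key)
        if not queue:
            return None  # nothing cached for this challenge: re-attest
        return queue.popleft()


key = CacheKey(2, "issuer.example", b"", "origin.example")
cache = TokenCache()
cache.store(key, [b"token-%d" % i for i in range(100)])
print(cache.redeem(key))
```

A challenge with a different issuer or redemption context produces a different key, so the cache returns nothing and the client falls back to a fresh issuance.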

The token that is returned thus has a 16-bit token type value (which matches the challenge), a 32-byte nonce (a random value), a 32-byte challenge digest (the SHA-256 hash of the original challenge), along with the token key ID (as defined by the token type) and the authenticator (as defined by the token type).
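The layout can be sketched with a simple byte packing. The lengths of the key ID (32 bytes) and authenticator (256 bytes) are my assumptions for a type 2 token with a 2048-bit RSA key, and the authenticator here is a placeholder rather than a real signature:

```python
import hashlib
import secrets
import struct

TOKEN_TYPE_BLIND_RSA = 0x0002
KEY_ID_LEN = 32          # assumed length of the token key ID
AUTHENTICATOR_LEN = 256  # assumed length of an RSA-2048 signature


def build_token(challenge: bytes, token_key_id: bytes, authenticator: bytes) -> bytes:
    """Concatenate the token fields in the order described in the text."""
    if len(token_key_id) != KEY_ID_LEN or len(authenticator) != AUTHENTICATOR_LEN:
        raise ValueError("unexpected field length")
    nonce = secrets.token_bytes(32)                        # unique per token
    challenge_digest = hashlib.sha256(challenge).digest()  # binds token to challenge
    return (struct.pack("!H", TOKEN_TYPE_BLIND_RSA) + nonce
            + challenge_digest + token_key_id + authenticator)


def parse_token(token: bytes) -> dict:
    """Split a packed token back into its named fields."""
    (token_type,) = struct.unpack("!H", token[:2])
    return {
        "token_type": token_type,
        "nonce": token[2:34],
        "challenge_digest": token[34:66],
        "token_key_id": token[66:66 + KEY_ID_LEN],
        "authenticator": token[66 + KEY_ID_LEN:],
    }


# Placeholder inputs: zero bytes stand in for a real key ID and signature.
token = build_token(b"example challenge", bytes(KEY_ID_LEN), bytes(AUTHENTICATOR_LEN))
print(len(token), parse_token(token)["token_type"])
```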

Privacy Pass For Anonymization

A demo of the method I will present is [here].

With the Tor network, privacy is seen as the core attribute. But what happens when a user is faced with a CAPTCHA challenge? Well, Tor users use the VOPRF (Verifiable Oblivious Pseudo-Random Function) method to bypass these, and thus preserve a user’s privacy. Within this system, the client connects to an edge server (E) with content. The server then sends a CAPTCHA challenge to the client, and if they solve it, E sends back n blinded values (tokens) which can be used in the future for the same site. When the client reconnects, it will use one of the previously generated values to re-authenticate (and without the user being faced with a challenge).

Cloudflare, in collaboration with Royal Holloway and the University of Waterloo, has built on the method used by Tor and has released Privacy Pass. This is a method of identifying a person as a human while preserving their privacy. It does this with zero-knowledge proof (ZKP) methods, which are used to anonymously identify users across a number of sites without tracking them. This method is defined in a paper [here] and can now be added to Chrome [here] and Firefox:

This method stops companies such as Google from cross-correlating Internet activity across different sites, and aims to provide a credible method of proving that you are a human and not a bot.

Overall, Cloudflare defines the key tensions as being between accessibility, security and convenience, and hopes that the pass system can overcome these tensions. In order to differentiate between a bot and a human, the Cloudflare challenge system uses a cookie (CF_CLEARANCE) set on the domain. This cookie is not tied to the user’s identity, but to a bot challenge they have solved in the past.

  1. Person sends Request.
  2. Server responds with a challenge.
  3. Person sends solution.
  4. Server responds with set-cookie and bypass cookie.
  5. Person sends new request with cookie.
  6. Server responds with content from origin.
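The six steps above can be sketched as a tiny simulation. The class name, the 403 status for a challenge, and the response fields are all my own assumptions for illustration and are not Cloudflare's implementation:

```python
import secrets


class EdgeServer:
    """Toy edge server that issues a clearance cookie after a solved challenge."""

    def __init__(self, expected_solution: str):
        self._expected = expected_solution
        self._clearance = set()  # cookies issued so far, not tied to any identity

    def handle(self, cookie=None, solution=None) -> dict:
        if cookie in self._clearance:
            # Steps 5 and 6: a valid cookie bypasses the challenge.
            return {"status": 200, "body": "content from origin"}
        if solution is not None and solution == self._expected:
            # Steps 3 and 4: a correct solution earns a random clearance cookie.
            new_cookie = secrets.token_hex(16)
            self._clearance.add(new_cookie)
            return {"status": 200, "set_cookie": new_cookie}
        # Steps 1 and 2: no cookie and no valid solution, so send a challenge.
        return {"status": 403, "body": "challenge"}


server = EdgeServer(expected_solution="bicycle")
first = server.handle()
solved = server.handle(solution="bicycle")
final = server.handle(cookie=solved["set_cookie"])
print(first["body"], final["body"])
```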

The zero-knowledge privacy method that has been used is Elliptic Curve Verifiable Oblivious Pseudo-Random Function (EC-VOPRF) [here].

When a client wants to identify itself, it creates 30 random numbers (x1 to x30) and then hashes these onto an elliptic curve (P-256) to give the points (X1 … X30). These points are then blinded with a value (b) by multiplying each point by it. The blinded points are sent to the server with the CAPTCHA challenge solution. The server multiplies them by its private key (k) and sends the results back, after which the client removes the blinding. The resulting pairs [xn, kXn] are then presented whenever a puzzle would otherwise need to be solved. The 30 random numbers will thus generate 30 passes:
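The blinding arithmetic can be sketched without an elliptic curve library. Note the swap: instead of P-256, this toy uses a small multiplicative group modulo a prime, so point multiplication becomes modular exponentiation. It also leaves out the proof that makes the function "verifiable". All names and parameters are my own:

```python
import hashlib
import secrets

# Toy group (NOT P-256): squares modulo a small safe prime, P_MOD = 2 * Q_ORD + 1.
P_MOD = 1019
Q_ORD = 509  # prime order of the subgroup of squares


def hash_to_group(seed: bytes) -> int:
    """Map a random seed x to a group element X of order Q_ORD."""
    h = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (P_MOD - 3) + 2
    return pow(h, 2, P_MOD)


def random_scalar() -> int:
    return secrets.randbelow(Q_ORD - 1) + 1


def client_blind(count: int = 30):
    """Client: create seeds x, hash them to X, and blind each X with b."""
    passes = []
    for _ in range(count):
        x = secrets.token_bytes(16)
        b = random_scalar()
        passes.append((x, b, pow(hash_to_group(x), b, P_MOD)))
    return passes


def server_sign(blinded_values, k: int):
    """Server: apply its private key k to each blinded value."""
    return [pow(value, k, P_MOD) for value in blinded_values]


def client_unblind(passes, signed_values):
    """Client: remove b, leaving the pair (x, X^k) for each pass."""
    return [(x, pow(signed, pow(b, -1, Q_ORD), P_MOD))
            for (x, b, _), signed in zip(passes, signed_values)]


def server_redeem(x: bytes, value: int, k: int) -> bool:
    """Server: recompute X^k from x and compare with the presented value."""
    return pow(hash_to_group(x), k, P_MOD) == value


k = random_scalar()
passes = client_blind()
tokens = client_unblind(passes, server_sign([p[2] for p in passes], k))
print(len(tokens), all(server_redeem(x, v, k) for x, v in tokens))
```

Because the server only sees the blinded values at issuance, it cannot link a redeemed pair back to the session in which the CAPTCHA was solved.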

Within the browser, the user then builds up a challenge solution and gains credits. In the following, I have generated 60 passes by solving two CAPTCHA exercises:

Every time we use a pass it will run the credit down. If the user runs low, they can regenerate the passes.

Conclusions

After 40 years of a terrible Internet which lacks any real trust, we can finally move to a new world where we can trust devices and people. We increasingly use mobile devices to access remote sites, but these are not fit for CAPTCHA challenges. And so, we are finally moving into an Internet where trust is integrated and seamless.