The Rise and Fall of the NHSX App

Photo by Patrick Tomasso on Unsplash

Well, the writing has been on the wall for a while, but today the UK government officially cancelled the centralised approach of the NHSX App, and is now focusing on the Google/Apple integration.

The cracks were already showing at the end of May, when the Financial Times discovered that NHSX had awarded a six-month, £3.8 million contract to a company to investigate a system using the Google/Apple API.

The version of the App trialled in the Isle of Wight was developed by Zühlke (a Swiss company with a base in London) and Pivotal (a US-based company).

The flaws

It was promised to be a world-leading App, but few technical people ever saw it as a solution. In fact, on 29 April 2020, many leading UK cybersecurity academics (of which I was one) wrote a letter to the UK Government [here]:

Along with other academics, I also wrote a paper for the DHI (Digital Health Innovation) Centre in Scotland, which was presented to the Scottish Government:

One of our core recommendations to the Scottish Government was to NOT use the NHSX App. The fundamental design choices were wrong at their core, and it was as if the UK government jumped at the first solution that came its way. There was a careful path to be taken, one which enabled contact tracing but preserved privacy. Trust was fundamental to that approach.

While most of the world focused on a distributed method, the NHSX App stuck its oar in and went against the tide with its centralised approach, along with many more fundamental flaws. These included:

  • Only contacts that you make after installing the App will be traced.
  • The data is stored in the public cloud and not within the NHS infrastructure.
  • The Bluetooth beacons are generated within the App itself, which is likely to suffer from power drain problems, and from the App being put to sleep when it is not in use.
  • A breach of the private key on the server releases all of the contacts.
  • Bob can be traced for a day using his daily public key. This is generated every day and is broadcast every time that Bob comes into contact with someone. By relaying the broadcast to another place, Bob can be tracked with his daily public key. In the Google/Apple method, a new tracing ID is generated every 10 minutes, which limits the tracking opportunities.
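To put the tracking window from the last point into numbers, the short sketch below counts how many distinct identifiers a device emits per day under each rotation period. The helper name and constant are illustrative only; the one-day and 10-minute rotation periods are the ones quoted in the list above.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def identifiers_per_day(rotation_seconds: int) -> int:
    """Number of distinct identifiers broadcast in one day (illustrative helper)."""
    return SECONDS_PER_DAY // rotation_seconds

nhsx_ids = identifiers_per_day(SECONDS_PER_DAY)  # one daily public key
gapple_ids = identifiers_per_day(10 * 60)        # a new tracing ID every 10 minutes

print("NHSX identifiers per day:", nhsx_ids)
print("Google/Apple identifiers per day:", gapple_ids)
print("Longest linkable window in minutes:", SECONDS_PER_DAY // 60, "versus", 10)
```

An observer who relays or logs broadcasts can therefore link Bob's sightings across a full day in the first case, but only across a 10-minute slot in the second.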

20 years and stored in the Amazon Cloud

The NHSX App in the UK has not had a good time and has been criticised for its centralised nature. But it is what happens on the back-end that really matters, especially how the data will be used and how long it will be kept. And so the details are finally being released about the NHSX App, where Public Health England (PHE) says that it will hold the data gathered by the NHS Test and Trace App for 20 years, and for five years for contacts:

And to make things worse, they define that it will be stored in the public cloud, but with protection. The details stored also include names, email addresses and postcodes. And who will have access to the data? Well, quite a variety of health agencies, along with companies such as Amazon:

The data is thus being stored in the public cloud. It was basically another own goal for the App.

The Apple/Google Approach

And so, in a spirit of openness, Google released its cryptography specification for a new privacy-preserving Bluetooth protocol. Google’s core focus is on key scheduling and Bluetooth Advertisements (what would be thought of as Bluetooth beacons). Overall there are three main encryption keys:

These keys are defined next.

Tracing Key. This is a randomly generated key that is stored on the device and kept private. It is generated as:

Key_Tr= Random(32)

which creates a random 32-byte (256-bit) value. Some sample code to generate this is [here]:

import os
import binascii

K_track = os.urandom(32)    # 256-bit random key
print("Tracing key: " + binascii.hexlify(K_track).decode())

Daily Tracing Key. This key is generated every 24 hours when there is contact tracing. The tracing key (K_track) is used to generate the Daily Tracing Key (Key_dtk_i) through a KDF (Key Derivation Function) with a 16-byte output:

Key_dtk_i = HKDF(Key_Tr, NULL, (UTF8(“CT-DTK”) || D_i), 16)

where D_i is the day number for the broadcast. If the user tests positive, this key is released from the device; otherwise, it never leaves the device. Some sample code is [here]:

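import time
import binascii
from pbkdf2 import PBKDF2    # assumed: the third-party pbkdf2 package

# HKDF(Key, Salt, Info, OutputLength)
# DayNumber ← Number of Seconds since Epoch / (60 × 60 × 24)
DayNumber = str(int(time.time() / (60*60*24)))
print("\nDay number: " + DayNumber)
D_i = "CT-DTK" + DayNumber
# K_track is the tracing key generated in the previous step
Key_day_i = PBKDF2(K_track, D_i).read(16)
print("Daily key: " + binascii.hexlify(Key_day_i).decode())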

In this case, we are using PBKDF2 as our key derivation function, rather than the HKDF given in the definition, as it is a slow method of creating the keys, and thus more robust against dictionary and brute force attacks.
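For comparison, the HKDF named in the definition can be sketched using only Python's standard library. This is a minimal sketch of HKDF with HMAC-SHA256 as described in RFC 5869, not the code used in the demo. The function names are illustrative, and encoding the day number as a 4-byte little-endian integer is an assumption made for this example.

```python
import hashlib
import hmac
import os
import struct
import time

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def hkdf_sha256(key: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF with HMAC-SHA256: an extract step followed by an expand step."""
    if not salt:
        salt = bytes(HASH_LEN)  # an empty (NULL) salt is replaced by zero bytes
    prk = hmac.new(salt, key, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def daily_tracing_key(tracing_key: bytes, day_number: int) -> bytes:
    """Key_dtk_i = HKDF(Key_Tr, NULL, "CT-DTK" || D_i, 16)."""
    # Assumption: D_i is encoded as a 32-bit little-endian unsigned integer.
    info = b"CT-DTK" + struct.pack("<I", day_number)
    return hkdf_sha256(tracing_key, b"", info, 16)

tracing_key = os.urandom(32)                    # stands in for Key_Tr
day_number = int(time.time() // (60 * 60 * 24))
dtk = daily_tracing_key(tracing_key, day_number)
print("Day number:", day_number)
print("Daily key (HKDF):", dtk.hex())
```

Since the tracing key is already a random 256-bit value, HKDF only needs to separate one day's key from the next, which is why it can be a fast function.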

Rolling Proximity Identifiers. These are used to identify the devices within the local proximity and are sent out in Bluetooth Advertisements. They are generated from the Daily Tracing Key (Key_day_i) through a Message Authentication Code (a hash value that is keyed with a secret value). A change in the Bluetooth MAC address causes a new identifier to be generated. The RPI is generated as a 16-byte value as:

RPI_i,j ← Truncate(HMAC(Key_day_i, (UTF8(“CT-RPI”) || TINj)), 16)

where TINj is the number of the time interval in which the Bluetooth MAC address changes. We can thus code [here]:

import binascii
import datetime
import hashlib
import time

# TimeNumberInterval ← Seconds Since Start of DayNumber / (60 × 10)
today = datetime.date.today()
seconds_since_midnight = time.time() - time.mktime(today.timetuple())
TINj = str(int(seconds_since_midnight // (60*10)))    # 10-minute intervals
print("\nTINj (Time interval number): " + TINj)
TINj_str = "CT-RPI" + TINj
# Key_day_i is the daily key derived in the previous step
rpi = hashlib.pbkdf2_hmac('sha256', Key_day_i, TINj_str.encode(), 10000)[:16]
print("Rolling ID: " + binascii.hexlify(rpi).decode())
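The formula for the identifier uses a plain HMAC rather than PBKDF2, and that variant can be sketched as follows. This is a hedged illustration only: the function names are made up for this example, SHA-256 is assumed as the hash, and encoding TINj as a single byte (it ranges from 0 to 143) is an assumption.

```python
import hashlib
import hmac
import os
import time

def time_interval_number(now=None) -> int:
    """Index of the current 10-minute interval within the (UTC) day, 0..143."""
    now = time.time() if now is None else now
    return int(now % (60 * 60 * 24)) // (60 * 10)

def rolling_proximity_id(daily_key: bytes, tin: int) -> bytes:
    """RPI = Truncate(HMAC(Key_day, "CT-RPI" || TIN), 16)."""
    # Assumption: TIN is encoded as one unsigned byte.
    message = b"CT-RPI" + bytes([tin])
    return hmac.new(daily_key, message, hashlib.sha256).digest()[:16]

daily_key = os.urandom(16)        # stands in for the Daily Tracing Key
tin = time_interval_number()
rpi = rolling_proximity_id(daily_key, tin)
print("TINj:", tin)
print("Rolling ID (HMAC):", rpi.hex())
```

Anyone holding the daily key can recompute the identifier for any interval, while an observer without it sees values that look unrelated from one interval to the next.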

The Diagnosis Keys are then the Daily Tracing Keys that are uploaded to a server once the owner of the device has tested positive for COVID-19. Here is a very rough demo of the code:

and a demo:

The NHSX App

The spread of COVID-19 is likely to be contained with the usage of contact tracing, where the people that a carrier has been in contact with are traced. In some countries of the world, broadcast methods have been used, where carriers are traced on a map:

In this way, it is possible for those in close proximity to know that they are near a carrier. While this is highly effective in terms of containing the spread, it has significant problems in terms of breaching the right to privacy:

Some of my patients were more afraid of being blamed than dying of the virus
Lee Su-young, Psychiatrist at Myongji Hospital, South Korea

In the UK, a new NHSX App is being trialled this week in the Isle of Wight. At its core are a number of objectives:

  • There is very little personal information gathered, with active user consent for data collection.
  • Not possible to trace a user or device from Bluetooth beacons (other than by being in close proximity to Bob, where we can see the beacons from him).
  • Not possible to spoof data for another user.
  • Not possible to see if the user of a phone which has sent a beacon is a carrier.
  • Robust against malicious users, including against the replay of proximity information.

An overview of the system is shown in Figure 1. Initially, the user (Bob) installs the App, and the Infrastructure Provider (which runs in the public cloud) sends back the Health Authority public key, an InstallationID (which is unique to Bob), and a symmetric key (which will be used for signing Bluetooth broadcasts). These are stored on Bob’s phone, and the details are registered with the Infrastructure Provider, along with half of Bob’s postcode.

Figure 1: Overview

When Bob comes into contact with Carol, the contact is stored in a log on Bob’s phone with the signal strength and a risk score. When the risk score is high enough, the log is sent through Google Firebase to the Infrastructure Provider (stored in the public cloud). The NHS then has direct access to the information stored.

Broadcast Value

The values broadcast by Bob (BroadcastValues) use the public key sent within the registration process. Each day, the device creates a new ephemeral private key on an elliptic curve (P-256):

PrivKeyD (daily) = r

PubKeyD (daily) = rG

and where G is the base point on the P-256 elliptic curve. The secret is then:

Z = ECDH (PubServer, PrivKeyD)

This is Elliptic Curve Diffie-Hellman (ECDH), which establishes a shared secret. Next, the X9.63 KDF with SHA-256 is used to generate a 256-bit value, which is split into two 128-bit parts (Key and IV):

Key, IV = KDF(Z,rG)
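The two steps above, ECDH followed by the X9.63 KDF, can be sketched in pure Python. This is a teaching sketch and not the App's implementation: the curve arithmetic is not constant-time and must not be used in production, the function names are illustrative, and the byte encodings of Z (32-byte x-coordinate) and of rG (64-byte X || Y) are assumptions.

```python
import hashlib
import secrets
import struct

# NIST P-256 domain parameters
P = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff
A = P - 3
N = 0xffffffff00000000ffffffffffffffffbce6faada7179e84f3b9cac2fc632551
G = (0x6b17d1f2e12c4247f8bce6e563a440f277037d812deb33a0f4a13945d898c296,
     0x4fe342e2fe1a7f9b8ee7eb4a7c0f9e162bce33576b315ececbb6406837bf51f5)

def point_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its inverse
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def scalar_mult(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

def x963_kdf(z: bytes, shared_info: bytes, length: int) -> bytes:
    """ANSI X9.63 KDF: SHA-256(Z || counter || SharedInfo), counter from 1."""
    out, counter = b"", 1
    while len(out) < length:
        out += hashlib.sha256(z + struct.pack(">I", counter) + shared_info).digest()
        counter += 1
    return out[:length]

# Server key pair (held by the central server)
priv_server = secrets.randbelow(N - 1) + 1
pub_server = scalar_mult(priv_server, G)

# Bob's daily ephemeral key pair: PrivKeyD = r, PubKeyD = rG
r = secrets.randbelow(N - 1) + 1
pub_key_d = scalar_mult(r, G)

# Both sides derive the same Z from their own private key
z_device = scalar_mult(r, pub_server)[0].to_bytes(32, "big")
z_server = scalar_mult(priv_server, pub_key_d)[0].to_bytes(32, "big")

# Key, IV = KDF(Z, rG)
shared_info = pub_key_d[0].to_bytes(32, "big") + pub_key_d[1].to_bytes(32, "big")
derived = x963_kdf(z_device, shared_info, 32)
key, iv = derived[:16], derived[16:]
print("Secrets match:", z_device == z_server)
print("Key:", key.hex())
print("IV: ", iv.hex())
```

The server never needs to be told Key or IV: given PubKeyD from a broadcast, it recomputes Z with its own private key and runs the same KDF.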

The payload for the message is then:

m = (Start Date) || (End Date) || (InstallationID) || (Country Code)

where InstallationID is a 128-bit unique identifier for the person. The payload is then encrypted with AES (GCM) to give:

Cipher, IntCheck = AES(m, Key, IV)

where IV is the initialisation vector (salt) used, and IntCheck is the integrity check. The broadcast value to the device is then:

BV = (Country Code || PubKeyD || Cipher || IntCheck)

where || denotes concatenation. This gives an 856-bit broadcast value. The broadcast value changes every day, and the daily secret is stored on the server:

Only the server has the private key for its public key (PubServer), and only it (and Bob) will be able to determine Z.

When the BV is received by Alice, it is passed on to the central server. The central server will take the public key (PubKeyD), derive Z, and then generate the same encryption key that Bob used. We thus generate a new BV and a new PubKeyD every day. When there is a connection, Bob sends:

P = (BV || TxPower || TxTime || Auth)

where TxPower is the transmit power of the sender in dBm, TxTime is the transmission timestamp, and Auth is an HMAC computed over the other contents of the payload and keyed with the sending device’s symmetric authentication key. When the payload is received, the server can extract the daily public key (PubKeyD) and use it with the ECDH method to derive the shared secret (Z). Once we have this, we can determine the key used to encrypt the message.

Within the log of the phone, the events are automatically aged out after 28 days.
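The authenticated payload P = (BV || TxPower || TxTime || Auth) described above can be sketched with Python's standard library. The field widths here are assumptions chosen for illustration (a signed byte for TxPower, a 32-bit timestamp for TxTime, and a full 32-byte HMAC-SHA256 for Auth), and the function names are hypothetical. The random bytes simply stand in for a real 856-bit broadcast value.

```python
import hashlib
import hmac
import os
import struct
import time

AUTH_LEN = 32  # assumed: untruncated HMAC-SHA256 tag

def build_payload(bv: bytes, tx_power_dbm: int, tx_time: int, auth_key: bytes) -> bytes:
    """Return BV || TxPower || TxTime || Auth."""
    body = bv + struct.pack(">bI", tx_power_dbm, tx_time)
    auth = hmac.new(auth_key, body, hashlib.sha256).digest()
    return body + auth

def verify_payload(payload: bytes, auth_key: bytes) -> bool:
    """Recompute the HMAC over everything before Auth and compare safely."""
    body, auth = payload[:-AUTH_LEN], payload[-AUTH_LEN:]
    expected = hmac.new(auth_key, body, hashlib.sha256).digest()
    return hmac.compare_digest(auth, expected)

auth_key = os.urandom(32)   # symmetric key issued at registration
bv = os.urandom(107)        # placeholder for an 856-bit broadcast value
payload = build_payload(bv, -8, int(time.time()), auth_key)

print("Payload length (bytes):", len(payload))
print("Valid:", verify_payload(payload, auth_key))

# Changing a single bit of TxPower invalidates the tag
tampered = bytearray(payload)
tampered[len(bv)] ^= 0x01
print("Tampered valid:", verify_payload(bytes(tampered), auth_key))
```

Because the authentication key is shared only between the device and the server, the check shows that the payload was produced by the registered device and was not altered in transit. On its own it does not stop a captured payload from being replayed unchanged.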

The method we have outlined here is the Elliptic Curve Integrated Encryption Scheme (ECIES) with AES. If you want to see how it works:

Conclusions

The academic community is there to provide support. Possibly it would have taken a little bit longer to set up, but ‘a stitch in time saves nine’.