The Problem With GDPR Is …

The problem with GDPR is … and it is the same problem that audit/compliance regimes have … it sets out a scope but cares little about the detail. The concept of pseudonymisation, for example, just doesn’t go far enough in most cases. But the main weakness is that it hasn’t stopped the industry from sleepwalking along as before. We still hear of data breaches where encryption was not used, or where passwords were not properly hashed.

Little has thus changed in the way that companies do things, and the concept of systems properly integrating encryption at every stage still has a long way to go. So what should the EU do next? Well, rather than just laying out the scope, it should go straight to the heart of the problem areas, and at the top of the list must be the usage of hashed passwords and the passing of plaintext passwords over a network. Why can’t the EU simply define that the storage of a password should mean the storage of nothing more than a completely random nonce value?

Unfortunately, if this drive were left to the major companies, such as Microsoft and Google, we would be waiting for a long time. In the forty years of the Internet, little has actually changed in the ways that we protect documents and send email.

So here is my Christmas list for the EU, with companies complying against different levels:

  • LEVEL 0. All companies will publish their encryption methods for protecting PII (and, if possible, as minimum standards for the rest of their data infrastructure).
  • LEVEL 0. All login systems must define an out-of-band password reset system for users.
  • LEVEL 1. On a data breach, the use of encryption will be detailed within the incident report. This will allow the scope of the breach to be fully understood, with no room for vague terms such as “partially encrypted”.
  • LEVEL 1. Wherever personally identifiable information (PII) is stored, a salt value must be applied, and the salt value must be kept electronically separate from the PII value. For highly sensitive information, such as within finance and health care, the salt value should be kept on a physically separate system.
  • LEVEL 1. All passwords will be defined as zero-knowledge proof values, so that users identify themselves to systems without revealing their password. No hashed versions of passwords will be stored on systems.
  • LEVEL 1. Systems must provide multi-factor authentication for user identity and for high-risk transactions.
  • LEVEL 2. Within text-based logs, all PII, such as IP and MAC addresses, will be replaced with tokens which can only be resolved within a trusted system.
  • LEVEL 2. As a minimum, password lock-out systems will automatically lock out IP addresses after three unsuccessful login attempts.
  • LEVEL 2. All systems must log accesses, and these logs must be stored in a secure way.
  • LEVEL 2. All data passed to and from third parties must be encrypted within the data layer.
  • LEVEL 2. Threat monitoring will be in place for risks, and this will be used to update a risk register.

Now auditors and reviewers have a checklist that they can assess against. Also, when a breach occurs, companies will be measured against these levels. The large companies, such as Microsoft, would then be pushed to upgrade their operating systems and software to comply with these methods.

A Future System?

Why do we still store passwords on corporate servers?

Why do we still store hashed values of passwords on corporate servers?

Why do we still store hashed values and the salt on corporate servers?

Our fundamental problem is that Alice doesn’t want her password to be revealed to Eve, but Bob stores a hash of her password along with a salt value. If Eve gets access to this, she basically just runs through a dictionary of common passwords (or brute force), and simply adds on the salt value that Bob has used.
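
To see why a stolen salted hash gives so little protection against a weak password, here is a minimal sketch (assuming a plain SHA-256(salt || password) store; the password and wordlist are purely illustrative):

```python
# Sketch: why a stolen salted hash can still be cracked. Assumes a simple
# SHA-256(salt || password) store; the password and wordlist are illustrative.
import hashlib
import secrets

def salted_hash(password: str, salt: bytes) -> bytes:
    # What Bob stores: SHA-256(salt || password).
    return hashlib.sha256(salt + password.encode()).digest()

# Registration: Alice picks a weak password; Bob stores the salt and the hash.
salt = secrets.token_bytes(16)
stored = salted_hash("qwerty123", salt)

# Breach: Eve now has both the salt and the hash, and simply replays a wordlist.
wordlist = ["password", "123456", "letmein", "dragon", "qwerty123"]
for guess in wordlist:
    if salted_hash(guess, salt) == stored:
        print("Eve recovers the password:", guess)
        break
```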

Increasingly we use ZKPs (zero-knowledge proofs) to prove that Alice still has knowledge of her password. Another method is a PAKE (password-authenticated key exchange), which supports the hiding of a shared password within network communications. With this we can have a relatively weak shared password on either side, and then communicate to determine a strong shared key.
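
As a rough illustration of the PAKE idea, here is a toy SPAKE2-style exchange over integers modulo a prime. It is only a sketch of the blinding trick: the prime, the generator and the constants M and N are assumptions made for illustration, and a real PAKE (SRP, SPAKE2 or OPAQUE) runs over a carefully chosen elliptic-curve group with constant-time code.

```python
# Toy SPAKE2-style PAKE over integers mod p: a weak shared password blinds a
# Diffie-Hellman exchange, and both sides end up with a strong shared key.
# Illustrative only: the prime, generator and constants M and N are assumptions.
import hashlib
import secrets

p = 2**255 - 19   # a large prime, used here only as a toy modulus
g = 5

# Fixed public constants (in SPAKE2 these are fixed group elements).
M = pow(g, int.from_bytes(hashlib.sha256(b"M").digest(), "big"), p)
N = pow(g, int.from_bytes(hashlib.sha256(b"N").digest(), "big"), p)

# The weak password both sides share, mapped to an integer.
pw = int.from_bytes(hashlib.sha256(b"weak-shared-password").digest(), "big")

# Alice: random x, sends X = g^x * M^pw.
x = secrets.randbelow(p - 2) + 1
X = (pow(g, x, p) * pow(M, pw, p)) % p

# Bob: random y, sends Y = g^y * N^pw.
y = secrets.randbelow(p - 2) + 1
Y = (pow(g, y, p) * pow(N, pw, p)) % p

# Each side removes the password blinding from the other side's value,
# leaving g^(xy), which an eavesdropper without the password cannot compute.
K_alice = pow((Y * pow(N, -pw, p)) % p, x, p)
K_bob = pow((X * pow(M, -pw, p)) % p, y, p)

assert K_alice == K_bob
session_key = hashlib.sha256(K_alice.to_bytes(32, "big")).hexdigest()
print("Strong shared session key:", session_key)
```

Without the password, an eavesdropper cannot strip the M^pw and N^pw blinding terms from the exchanged values, and so cannot recover g^(xy); the weak password has been turned into a strong session key.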

One of the most widely implemented PAKEs is SRP (Secure Remote Password) [here], which is integrated into TLS and into a range of Apple products (such as the iCloud Key Vault). But it is not the best PAKE around, and OPAQUE aims to improve on it.

A perfect solution?

The perfect solution for our online world is for us to never store a user’s password, or a hashed version of the password, on our servers, as there is then no chance of an intruder finding out the password. Our data breach report would just say that someone had stolen a whole lot of random numbers, which can never be resolved back to anything that resembles a password.

This seems impossible, but it’s actually easy. All we have to store is a salt value, which is generated when the user registers their password. All that is stored is a completely random number:

Now Alice (the client) knows her password, and Bob (the server) only knows her salt. If Bob gets hacked, there is no way that an intruder will be able to learn Alice’s password, as there is not even a hashed version on the server (Bob). Initially Alice registers with Bob and generates a secret client key, and Bob gives her the server’s public key for a future key exchange.

The protocol we use here is OPAQUE [link]:
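
Below is a minimal sketch of that registration step. It is not the real OPAQUE message flow: the record layout, the toy key derivation and the XOR-and-MAC envelope are simplifications I have assumed for illustration. The point is simply what ends up in Bob’s database: random-looking values, with no password and no hash of the password.

```python
# Simplified OPAQUE-style registration (an illustrative sketch, not the real
# OPAQUE message flow). Bob ends up storing only random-looking values: a
# per-user OPRF key (the "salt"), Alice's public value, and a sealed envelope.
import hashlib
import hmac
import secrets

def kdf(*parts: bytes) -> bytes:
    # Toy key derivation (a real scheme would use HKDF or similar).
    return hashlib.sha256(b"||".join(parts)).digest()

# Alice (client): her password and a long-term secret key.
password = b"correct horse battery staple"
client_priv = secrets.token_bytes(32)
client_pub = kdf(b"pub", client_priv)   # stand-in for a real public key

# Bob (server): a completely random per-user value, the "salt"/OPRF key.
oprf_key = secrets.token_bytes(32)

# Alice derives an export key from her password and the salt. In real OPAQUE
# this happens through an oblivious PRF, so Bob never sees the password.
export_key = kdf(password, oprf_key)

# Alice seals her secret key under the export key (toy XOR-and-MAC envelope).
mask = kdf(b"mask", export_key)
envelope = bytes(a ^ b for a, b in zip(client_priv, mask))
tag = hmac.new(export_key, envelope, hashlib.sha256).digest()

# Bob's stored record: nothing here resembles the password.
record = {"oprf_key": oprf_key, "client_pub": client_pub,
          "envelope": envelope, "tag": tag}
for name, value in record.items():
    print(name, value.hex())
```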

So how does Alice use the salt, without actually determining what it is? For this we use an oblivious protocol named the Diffie-Hellman Oblivious PRF (pseudo-random function). This blindly applies the salt value to Alice’s password, so that she receives the result without ever learning what the salt actually is. The password registration process makes sure that there are public and private keys used to validate the generation of the shared key.
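
Here is a toy version of that blind evaluation (the 2HashDH idea behind the DH-OPRF). The hash-to-group step is faked with an exponentiation and a toy prime field stands in for the elliptic-curve group a real OPRF would use, so treat it purely as a sketch of the blinding and unblinding:

```python
# Toy Diffie-Hellman OPRF (the 2HashDH idea), illustrative only. The
# hash-to-group step is faked with an exponentiation, and a toy prime field
# stands in for the elliptic-curve group a real OPRF would use.
import hashlib
import secrets
from math import gcd

p = 2**255 - 19    # toy prime modulus
q = p - 1          # exponents can be reduced modulo p - 1 (Fermat)
g = 5

def hash_to_group(data: bytes) -> int:
    # Stand-in for a proper hash-to-group function.
    return pow(g, int.from_bytes(hashlib.sha256(data).digest(), "big"), p)

password = b"correct horse battery staple"

# Bob (server): his secret OPRF key k acts as the per-user "salt".
k = secrets.randbelow(q - 1) + 1

# Alice (client): blind the hashed password so Bob learns nothing about it.
h = hash_to_group(password)
r = secrets.randbelow(q - 1) + 1
while gcd(r, q) != 1:                  # r must be invertible for unblinding
    r = secrets.randbelow(q - 1) + 1
blinded = pow(h, r, p)                 # sent to Bob

# Bob: evaluate blindly with his key k.
evaluated = pow(blinded, k, p)         # sent back to Alice

# Alice: unblind to recover h^k, then hash into the final OPRF output,
# which drives the key that opens her registration envelope.
unblinded = pow(evaluated, pow(r, -1, q), p)
oprf_output = hashlib.sha256(password + unblinded.to_bytes(32, "big")).digest()

assert unblinded == pow(h, k, p)       # sanity check against a direct evaluation
print("OPRF output:", oprf_output.hex())
```

Bob only ever sees the blinded value, so he learns nothing about the password; Alice only ever sees the evaluated result, so she learns nothing about Bob’s key (the salt).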

The client then receives a ciphertext which can only be decrypted with the right knowledge of the keys used in the registration. The decrypted value is then checked against the key generated earlier, and the client will then know if the key is correct.
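
Continuing the simplified registration sketch from above, a login-side check of that envelope might look as follows (again an illustrative simplification, reusing the same toy kdf and envelope format rather than the real OPAQUE construction):

```python
# Toy check of the sealed envelope at login (continuing the simplified
# registration sketch above; kdf, envelope and tag are illustrative stand-ins).
import hashlib
import hmac
import secrets

def kdf(*parts: bytes) -> bytes:
    return hashlib.sha256(b"||".join(parts)).digest()

# Registration state (as produced in the earlier sketch).
password = b"correct horse battery staple"
client_priv = secrets.token_bytes(32)
oprf_key = secrets.token_bytes(32)
export_key = kdf(password, oprf_key)
mask = kdf(b"mask", export_key)
envelope = bytes(a ^ b for a, b in zip(client_priv, mask))
tag = hmac.new(export_key, envelope, hashlib.sha256).digest()

# Login: the client re-derives the export key (via the OPRF in real OPAQUE),
# checks the tag, and only then recovers its secret key from the envelope.
candidate_key = kdf(password, oprf_key)
check = hmac.new(candidate_key, envelope, hashlib.sha256).digest()
if hmac.compare_digest(check, tag):
    recovered = bytes(a ^ b for a, b in zip(envelope, kdf(b"mask", candidate_key)))
    assert recovered == client_priv
    print("Envelope opened: client key recovered, login can proceed")
else:
    print("Wrong password: envelope check fails")
```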

Conclusions

Come on EU … go on, make something happen, and don’t just wait with a big fine when something goes wrong. We need to clean up the bad practices of the past, and just saying that encryption should be used is not enough for the auditing process. Citizens need to be told how their data is being protected, and we should have reporting on it whenever an incident happens. Technically focused people, and professional bodies, should be shouting out to those who make the laws and pushing for change, otherwise we have a 20th Century data world existing in the 21st Century.