Apple’s CSAM System … Walking A Fine Balance?

We have a challenge: how do we balance the right to privacy against the right of society to protect itself? It happened with the COVID-19 Bluetooth matching system, and now Apple is aiming to use advanced machine learning and cryptography to protect privacy while still detecting criminal activity. If they fail, they could destroy the strong trust that their users place in them. When we look at Google, we see a company that does not have a strong track record of preserving user privacy, whereas Apple tends to be well trusted on these things.

For this, Apple is integrating CSAM (Child Sexual Abuse Material) detection within their iPhone infrastructure. Overall, Apple claims that they cannot access metadata on images until a given match threshold is reached, and that all true positives are reviewed by a human before being forwarded to the National Center for Missing and Exploited Children (NCMEC).

CSAM detection

CSAM detection uses a database of known CSAM hashes, which are blinded before being sent to a client's iPhone. The hashing and matching process, based on NeuralHash, is implemented on the device rather than in iCloud. NeuralHash analyses an image and extracts its key features. When an image is stored within iCloud, the device creates a cryptographic safety voucher, and a threshold secret sharing method is used so that it is not possible to view the contents of the vouchers until the number of matches against known CSAM content exceeds the threshold (Figure 1). The matching process is known as Private Set Intersection (PSI). The threshold has been selected so that there is an extremely low chance of an account being incorrectly flagged, and Apple claims that this chance is one in a trillion. If a false positive occurs, there is a manual review process. On a confirmed match, the user's account is disabled and a report is sent to the NCMEC.

Figure 1: Overview [Ref]

NeuralHash and perceptual hashing

With NeuralHash we use perceptual hashing, which aims to hash an image based on its key features. This makes sure that images that are perceptually and semantically similar generate similar hash values. Thus an image that has been scaled or cropped will still have a hash similar to that of the original image. Initially, the image is fed into a convolutional neural network, which produces n floating-point values. These floating-point values are then converted into m bits (in order to compress the data for matching).
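
As a rough illustration of the floats-to-bits step (this is not Apple's model or code, and the function and parameter names here are hypothetical), a common approach is random-hyperplane locality-sensitive hashing, where each output bit is the sign of the descriptor's projection onto a random hyperplane:

import numpy as np

# Convert an n-dimensional floating-point descriptor into m bits using
# random-hyperplane LSH: similar descriptors give similar bit strings.
def floats_to_bits(descriptor, m=96, seed=42):
    rng = np.random.default_rng(seed)
    hyperplanes = rng.standard_normal((m, descriptor.shape[0]))
    projections = hyperplanes @ descriptor
    return ''.join('1' if p >= 0 else '0' for p in projections)

# Stand-in for the n floating-point values produced by the network.
descriptor = np.random.default_rng(1).standard_normal(128)
print(floats_to_bits(descriptor))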

The neural network has been trained in a self-supervised manner to match images against perturbed versions of themselves across a range of transformations. It is also trained on pairs of images with large differences, so that only images that are perceptually and semantically similar produce matching hashes. Initially, the CSAM hashes (NeuralHashes) are received from NCMEC. These hashes are then blinded, using a secret value, onto elliptic curve points. Apple is the only entity that holds the secret value needed to unblind the hashes. This blinded hash table is then stored on the user's device. The matching of the hashes is completed on the device, but the result cannot be computed there because the hashes are blinded. Once the image is uploaded, Apple can unblind the hash values and compute the match.

For example, if we have a hash of h and a base point of G, we produce a point of:

Point = hG

We can then blind this point with x to give:

Blinded hash = x(hG)

It is not possible to determine hG unless we know x. As Apple knows x, they can compute its inverse to unblind the hash:

Hash = x^{-1} x(hG) = hG

Here is an example using Secp256k1:

https://asecuritysite.com/encryption/python_secp256k1Inv
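
As a rough, self-contained sketch of the blinding and unblinding maths above (this is not Apple's code, nor the code at the link; it assumes the Python ecdsa package is installed):

from ecdsa import SECP256k1
import secrets

G = SECP256k1.generator              # base point G
n = SECP256k1.order                  # curve order

h = secrets.randbelow(n - 1) + 1     # stand-in for a NeuralHash value
x = secrets.randbelow(n - 1) + 1     # Apple's blinding secret

point = G * h                        # Point = hG
blinded = point * x                  # Blinded hash = x(hG)

x_inv = pow(x, -1, n)                # modular inverse of x (Python 3.8+)
unblinded = blinded * x_inv          # x^{-1} x(hG) = hG

print(unblinded == point)            # True: the original point is recovered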

Apple is then able to unblind the blinded hashes:

Figure 2: Unblinding the hashes [Ref]

There is then a bit of magic with secret shares, where the encryption key for the safety voucher can only be recovered when there are enough matches within the hashes (Figures 3 and 4).

Figure 3: PSI and Threshold Secret Sharing [Ref]
Figure 4: Only a CSAM match allows the image information to be viewed [Ref]

If you are interested in threshold secret sharing, there is content here:

https://asecuritysite.com/encryption/#thres01
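
To make the threshold idea concrete, here is a minimal sketch of Shamir's (t, n) secret sharing over a prime field. This is an illustration only, not the scheme Apple deploys: any t shares recover the secret, while fewer than t reveal nothing about it.

import secrets

PRIME = 2**127 - 1   # a prime large enough for this demo

# Split a secret into n shares, any t of which can recover it.
def make_shares(secret, t, n):
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

# Recover the secret by Lagrange interpolation at x = 0.
def recover(shares):
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, t=3, n=5)
print(recover(shares[:3]) == 123456789)   # True: three shares are enough
print(recover(shares[:2]) == 123456789)   # almost certainly False: two are not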

Conclusions

The last great attempt at this challenging balance of privacy against society's right to protect itself was the COVID-19 Bluetooth detection system. CSAM detection is Apple's next attempt. While some will point fingers at Apple, it is important for technical experts to analyse the methods used, and prove or disprove the claims. There are worries that Apple could start to use their CSAM detection system for other areas of criminal activity detection, and only time will tell how Apple takes these methods forward. It is a very fine balance, and even the best cryptography cannot overcome a lack of trust from users.

If you want to learn more about the methods that can be deployed, try here:

https://asecuritysite.com/encryption/