Punycode, X.509 and OpenSSL: SpookySSL

One of the most severe vulnerabilities ever seen on the Internet, Heartbleed, was caused by a bug in OpenSSL. Recently, OpenSSL was again in the spotlight for two new vulnerabilities (CVE-2022-3786 and CVE-2022-3602). Both involve Punycode decoding that can overflow a buffer.

Like it or not, our Internet was created to be English-language focused. As the Internet and Web were being developed, there was only one core character set: ASCII. Most of the RFCs that built the Internet defined a character set that was based around ASCII [here], and which only supported a limited number of characters. Unicode [here] defines a far larger range of code points (over a million possible values), and can represent almost every character we need. Domain names and email addresses, though, are still largely restricted to ASCII. To overcome this, we can use Punycode to define an extended character set for email addresses and domain names.

Punycode

Our infrastructure for domain names and email addresses is still largely built around the ASCII character set. To overcome this, Punycode encodes Unicode into ASCII characters, using only the Letter-Digit-Hyphen (LDH) subset. The ASCII characters of the input are copied first, followed by a hyphen, and then an encoding of the non-ASCII characters and their positions. So let’s try some German city names [Try]:

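// Node's built-in punycode module is deprecated; the userland
// "punycode" package from npm exposes the same encode() API.
const punycode = require('punycode');

let rtn = punycode.encode('München');
console.log(rtn);
rtn = punycode.encode('Köln');
console.log(rtn);
rtn = punycode.encode('Düsseldorf');
console.log(rtn);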

The results are:

Mnchen-3ya
Kln-sna
Dsseldorf-q9a

If we look at “München”, then we get “Mnchen-3ya”. The hyphen separates the ASCII characters from the part that encodes the additional characters, and this part uses generalized variable-length integers. If we now try “點看” [here] we get:

Message: 點看
Encode: c1yn36f
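
To see where strings such as “3ya” and “c1yn36f” come from, here is a minimal sketch of the encoding procedure described in RFC 3492. It is for illustration only: the function names (encodeSketch, adapt, digitToChar) are our own, and it leaves out the overflow checks that a production encoder would need.

```javascript
// Punycode parameters, as defined in RFC 3492, section 5.
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Map a digit value 0..35 to 'a'..'z' (0..25) or '0'..'9' (26..35).
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - TMIN) * TMAX) / 2) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Illustrative encoder (hypothetical name); no overflow checks.
function encodeSketch(input) {
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));
  // 1. Copy the basic (ASCII) code points, then add the delimiter.
  const basic = codePoints.filter((cp) => cp < 0x80);
  let output = String.fromCodePoint(...basic);
  const b = basic.length;
  if (b > 0) output += '-';

  // 2. Insert the remaining code points in increasing order, each
  //    encoded as a generalized variable-length integer (delta).
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS, h = b;
  while (h < codePoints.length) {
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log(encodeSketch('München'));
console.log(encodeSketch('點看'));
```

For “München”, the single non-ASCII character “ü” becomes one delta value (869), and this is written as the three digits “3ya”.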

Another example is “xn--80ak6aa92e.com”, which maps to https://www.аррӏе.com/ (written with Cyrillic letters, and not the Latin letters of “apple”).
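
We can check this mapping ourselves. The sketch below uses the domainToUnicode() function from Node's built-in url module to decode the name, and then lists the code point of each character in the first label. The variable names are our own.

```javascript
const { domainToUnicode } = require('url');

const ascii = 'xn--80ak6aa92e.com';
const unicode = domainToUnicode(ascii);
console.log(unicode);

// List each character of the first label with its Unicode code point.
const firstLabel = unicode.split('.')[0];
for (const ch of firstLabel) {
  const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  console.log(`${ch} U+${hex}`);
}

// A plain string comparison shows that this is not the Latin name.
console.log(unicode === 'apple.com');
```

The code points should all fall in the Cyrillic block (U+0400 to U+04FF), even though the name looks like the Latin one on screen.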

Punycode and OpenSSL

Punycode has now been pinpointed in an OpenSSL bug, which affects OpenSSL versions 3.0.0 to 3.0.6 and is fixed in 3.0.7. Two vulnerabilities were identified (CVE-2022-3786 and CVE-2022-3602, nicknamed SpookySSL). The issue was initially announced as critical, but both vulnerabilities were finally rated as high. They relate to X.509 digital certificates containing a crafted email address, which causes a buffer overflow. This can crash the application (a denial of service) and, for CVE-2022-3602, could potentially lead to remote code execution. Both are linear stack overflows:

  • CVE-2022-3602 occurs when parsing a TLS certificate after certificate chain signature verification, during name constraint checking. It results in a four-byte overflow on the stack: an intruder can craft a malicious email address in a certificate to overwrite four attacker-controlled bytes.
  • CVE-2022-3786 is also a linear stack overflow that occurs during name constraint checking. Here, a malicious email address in a certificate can overflow an arbitrary number of bytes on the stack, each containing the ‘.’ (0x2E) character. In both cases, either a trusted CA would have to sign the malicious certificate, or the application would have to continue verification despite failing to build a path to a trusted issuer.

Initially, CVE-2022-3602 was graded as CRITICAL, but it was downgraded to HIGH because of the difficulty of getting a malicious email address signed by a trusted CA, and because many platforms now mitigate against stack overflows.
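
As a quick check, the following sketch tests whether a version string falls in the affected range of 3.0.0 to 3.0.6. The isAffected() helper is our own, and it only compares version numbers. It does not check whether a distribution has backported a fix. In Node.js, process.versions.openssl gives the version of the OpenSSL library that Node was built with.

```javascript
// Hypothetical helper: true for OpenSSL 3.0.0 through 3.0.6.
function isAffected(version) {
  const m = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!m) return false; // unknown format: make no claim
  const [major, minor, patch] = m.slice(1).map(Number);
  return major === 3 && minor === 0 && patch <= 6;
}

console.log(isAffected('3.0.6'));   // inside the affected range
console.log(isAffected('3.0.7'));   // the fixed release
console.log(isAffected('1.1.1k'));  // the 1.1.1 series is outside the range

// The OpenSSL version bundled with the running Node.js binary.
console.log(process.versions.openssl, isAffected(process.versions.openssl));
```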

An X.509 certificate has a distinguished name, such as /C=US/ST=WA/L=Redmond/O=Microsoft/CN=www.microsoft.com. This name is checked against the site being visited, and the digital signature on the certificate is then verified. We can also have a Subject Alternative Name (SAN) for other subjects, such as: /C=US/ST=WA/L=Redmond/O=Microsoft/CN=docs.microsoft.com. We can also have email addresses within subject names, such as /C=US/O=Fred Smith/U=Fred Smith/[email protected].
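
The slash-separated name format shown above is easy to split into its fields. The sketch below is a simple illustration with a helper name of our own (parseDistinguishedName). It assumes that no value contains a “/” character, which real certificate tooling cannot assume.

```javascript
// Hypothetical helper: split "/C=US/O=Example/CN=host" into an object.
// Assumes values never contain '/', which is not true in general.
function parseDistinguishedName(dn) {
  const fields = {};
  for (const part of dn.split('/')) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skips the empty part before the first '/'
    fields[part.slice(0, eq)] = part.slice(eq + 1);
  }
  return fields;
}

const subject = parseDistinguishedName(
  '/C=US/ST=WA/L=Redmond/O=Microsoft/CN=www.microsoft.com'
);
console.log(subject.CN);
console.log(subject);
```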

One example is “潤瑣灹⁥瑨汭ਾ‪桇” [Try], which is translated into Punycode as “xn--jbc833i9e993ywlec4eo6b9unmb”. When a vulnerable version decodes a crafted Punycode string, it can write more output than the fixed-size buffer on the stack can hold, and so overflows it.
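
Public write-ups of CVE-2022-3602 describe the root cause as an off-by-one error in the bounds check of the Punycode decoder. The sketch below is only a simulation of that kind of mistake in JavaScript, and it is not OpenSSL's code. The names (decodeInto, MAX_OUT, CANARY) are our own. A typed array stands in for the stack, with one extra slot after the buffer that plays the role of adjacent data.

```javascript
const MAX_OUT = 4;        // capacity of the simulated output buffer
const CANARY = 0xC0FFEE;  // marker for the data that sits after the buffer

// Copy code points into a 4-slot buffer. With strictCheck=false the bounds
// test is off by one, so a fifth value is written past the buffer's end.
function decodeInto(codePoints, strictCheck) {
  const stack = new Uint32Array(MAX_OUT + 1);
  stack[MAX_OUT] = CANARY; // slot 4 is NOT part of the buffer
  let written = 0;
  for (const cp of codePoints) {
    const full = strictCheck ? written >= MAX_OUT : written > MAX_OUT;
    if (full) {
      return { accepted: false, canaryIntact: stack[MAX_OUT] === CANARY };
    }
    stack[written++] = cp;
  }
  return { accepted: true, canaryIntact: stack[MAX_OUT] === CANARY };
}

const fivePoints = Array.from('аррӏе', (ch) => ch.codePointAt(0));
const buggy = decodeInto(fivePoints, false);
const fixed = decodeInto(fivePoints, true);
console.log('off-by-one check:', buggy);
console.log('correct check:   ', fixed);
```

With five code points and a four-slot buffer, the off-by-one check accepts the input and overwrites the extra slot, while the correct check rejects the input and leaves the slot intact.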

Crashing Systems With a Font: The Homograph Attack

Apart from the OpenSSL vulnerabilities, Punycode and Unicode handling have featured in other attacks. One vulnerability crashed many Apple iOS applications (such as WhatsApp, Facebook Messenger and Gmail). It was triggered by a single character from the alphabet of Telugu, a Dravidian language spoken by over 70 million people. The bug was spotted by the Italian blog Mobile World.

The homograph attack has been known since 2001, and was demonstrated with Punycode domains by a Chinese researcher (Xudong Zheng). It is now often used by scammers to trick users into visiting lookalike sites. A recent example used a lookalike of the apple.com domain and even had a valid digital certificate:

This shows that the certificate is valid (as it goes green), but it is not the Apple site. The epic.com site was used as a demonstrator of the vulnerability:

The site appears to be epic.com, but the Common Name (CN) in the certificate is xn--e1awd7f.com:

It works by replacing ASCII letters with Unicode characters that look the same on screen, but which are different characters when they are processed as a Web address. With Punycode, the “xn--” prefix marks that the rest of the label is an ASCII encoding of Unicode characters.
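
A crude way to spot such names is to decode each hostname and flag any that hide non-ASCII characters under an ordinary ASCII top-level domain. The sketch below does this with domainToUnicode() from Node's url module. The flagSuspiciousHost() helper is our own and is a rough heuristic only. It will also flag legitimate internationalized names, so it is suited to prompting a manual check and not to blocking.

```javascript
const { domainToUnicode } = require('url');

// Hypothetical heuristic: flag hosts where a label decodes to non-ASCII
// characters while the top-level domain is plain ASCII letters.
function flagSuspiciousHost(host) {
  const decoded = domainToUnicode(host);
  const labels = decoded.split('.');
  const tld = labels[labels.length - 1];
  const hasNonAscii = labels.some((label) => /[^\x00-\x7F]/.test(label));
  return {
    host,
    decoded,
    suspicious: hasNonAscii && /^[a-z]+$/.test(tld),
  };
}

console.log(flagSuspiciousHost('xn--e1awd7f.com'));
console.log(flagSuspiciousHost('epic.com'));
```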

The Telugu character described above can crash your device and block access to messaging apps in iOS, including WhatsApp, Facebook Messenger, Outlook for iOS and Gmail, along with Safari and Messages in the macOS versions.

Conclusions

Well, it’s not the critical bug we had expected, but it is a major one. If you are running OpenSSL 3.0.0 to 3.0.6, the fix is to upgrade to 3.0.7.