dnstt security

This page lists a history of security problems and fixes in dnstt.

For details of dnstt's security design, see the protocol page.

David Fifield <david@bamsoftware.com>

Last updated: 2021-10-27.

Detection research at FOCI 2021

"Exploring Simple Detection Techniques for DNS-over-HTTPS Tunnels"
Carmen Kwan, Paul Janiszewski, Shela Qiu, Cathy Wang, Cecylia Bocovich
ACM SIGCOMM 2021 Workshop on Free and Open Communications on the Internet (FOCI)
Local cached PDF

Summary from BBS:

The paper explores ways to distinguish the DNS over HTTPS traffic of a DNS tunnel (namely dnstt, previously discussed here) from ordinary browser-generated DNS over HTTPS traffic. Even though DNS over HTTPS (DoH) is encrypted, censors may be able to infer the use of a tunnel by looking at side-channel features like traffic timing and volume. The authors of this paper build data sets of both circumvention and non-circumvention DoH traffic, using Selenium to drive Firefox to Alexa global top sites. The non-circumventor data set captures the DoH produced by Firefox while visiting sites. The circumventor data set captures all the traffic of a Firefox which is configured to use dnstt as a proxy (so it contains not only the browser's DNS queries and responses, but also the tunneled contents of the sites). Analysis of these two data sets turns up three traffic features—average payload length, packet rate (packets/s), and throughput (bytes/s)—and thresholds that distinguish dnstt from browser DoH with nearly 100% precision and 70–80% recall. To give an example of a feature threshold, over a short time window, only 1% of browser DoH has an average packet length of more than 70 bytes; but 56% of dnstt DoH does. The tests require observation of a few hundred or thousand packets before declaring a detection result.

Having observed that dnstt is distinguishable by its use of large packets and high data rates, the authors modify dnstt to diminish these signals, imposing a rate limit of 500 packets/s in both directions, and a downstream data capacity per packet of 100 bytes. (Packets on the wire will actually be bigger than 100 bytes because of DNS encoding, HTTP, and TLS overhead.) The modified dnstt successfully avoids detection attacks based on the average payload length feature, but remains vulnerable in the packet rate and throughput features. The authors test the user experience of browsing through the rate-limited tunnel, selecting 100 sites from the Umbrella 1 million; throughput is decreased by 27 times and page load time is increased by 23 times. While the low speed of the more detection-resistant tunnel may be uncomfortable for browsing, it could still be useful for low-rate applications such as bootstrapping a circumvention system.

Although it is not the main focus of the paper, the authors find that dnstt does not disguise its TLS fingerprint, which is fairly uncommon and distinctive of programs written in Go. They made a fork of dnstt that uses uTLS (previous discussion here) for TLS camouflage.

20210812 log injection vulnerability

The v1.20210812.0 release has a fix for a log injection vulnerability in dnstt-server. The log message "NXDOMAIN: not authoritative for %s" contained a potentially attacker-controlled name. Because DNS labels may contain any byte value, the log message allowed an attacker to write arbitrary bytes to the dnstt-server log, with a variety of possible effects:

A label containing a newline character (\x0a) could break the format of the log, or inject false log lines.
Log output to a terminal could contain terminal escape sequences which could, for example, change the color of text, or have even worse effects with older terminal emulators.
DNS names with a label that contained the dot character (\x2e) would be logged in an ambiguous way, with the intra-label dot appearing as a label separator.

DNS names are now logged using backslash hex escapes for unusual bytes. This vulnerability was called to mind by the USENIX Security 2021 research paper "Injection Attacks Reloaded: Tunnelling Malicious Payloads over DNS" by Jeitner and Shulman.
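A minimal sketch of this kind of escaping follows. It is not the dnstt implementation: the escapeLabel and escapeName helpers and the exact set of bytes passed through unescaped are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLabel returns a printable form of one DNS label. ASCII letters,
// digits, hyphen, and underscore pass through; every other byte (including
// '.', '\\', newline, and ESC) becomes a \xXX escape.
func escapeLabel(label []byte) string {
	var b strings.Builder
	for _, c := range label {
		switch {
		case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z',
			'0' <= c && c <= '9', c == '-', c == '_':
			b.WriteByte(c)
		default:
			fmt.Fprintf(&b, "\\x%02x", c)
		}
	}
	return b.String()
}

// escapeName joins escaped labels with '.', so that a dot in the output is
// always a label separator and never part of a label.
func escapeName(labels [][]byte) string {
	if len(labels) == 0 {
		return "." // the root name
	}
	parts := make([]string, len(labels))
	for i, l := range labels {
		parts[i] = escapeLabel(l)
	}
	return strings.Join(parts, ".")
}

func main() {
	// The first label contains an intra-label dot and a newline.
	name := [][]byte{[]byte("a.b\n"), []byte("example"), []byte("com")}
	fmt.Printf("NXDOMAIN: not authoritative for %s\n", escapeName(name))
	// Prints: NXDOMAIN: not authoritative for a\x2eb\x0a.example.com
}
```

Escaping the backslash itself keeps the output unambiguous: a literal backslash in a label cannot be mistaken for the start of an escape sequence.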

2021 security audit by Cure53

PDF: Review-Report Turbo Tunnel Security & Privacy 03.2021

In February and March 2021, several members of the Cure53 team did a security audit of software developed under the Turbo Tunnel project, including dnstt. The audit found security issues of severity Low to Medium in dnstt and one of its dependencies, flynn/noise. Most of the issues are fixed in dnstt v0.20210424.0 and flynn/noise v1.0.0. See also the GHSA-g9mp-8g3h-3c5c security advisory for flynn/noise.

The audit was more focused on secure coding and protocol design than on resistance to detection and blocking. For example, I informed the auditors ahead of time about a lack of TLS camouflage being a detection vulnerability, and they omitted it from the report as a known weakness.

UCB-02-002: Memory leak in acceptStreams() routine of dnstt server (Low)

Fixed in:

Commit b6b803986c (tag v0.20210424.0).

UCB-02-003: Potential nonce overflow in Noise protocol (Medium)

Fixed in:

https://github.com/flynn/noise/pull/44 (tag v1.0.0).
Commit 326771cb95 (tag v0.20210424.0).

UCB-02-004: Deprecated DH25519 Golang API used by Noise (Low)

Fixed in:

https://github.com/flynn/noise/pull/43 (tag v1.0.0).
Commit 326771cb95 (tag v0.20210424.0).

UCB-02-005: Client ID security considerations & Noise authenticated data (Low)

This issue is not addressed as of tag v0.20210424.0. It is true that a third party that can learn the client ID of some other client can send that client ID to the server and receive a DNS response that is intended for the actual client (but would not be able to decrypt the information inside). The suggested mitigation—including the client ID in the "additional data" of the Noise AEAD ciphertexts—is possible, but would not actually prevent the message hijacking issue. The reason has to do with protocol layering. For space and simplicity reasons, Noise messages in dnstt are not authenticated and decrypted at the level of DNS messages, but at the level of a reliable stream, carried within a sequence of multiple DNS messages. At the point where the server receives a query carrying a client ID, to which it must respond, it does not have enough context to bind the client ID with the cryptographic stream which it in part represents.

Properly addressing this issue would require rearchitecting the protocol to authenticate each DNS message independently; i.e., to put the KCP layer inside the Noise layer, not outside. See my 2020-04-25 mailing list post, Security audit of Noise-based DNS tunnel, protocol layering.

UCB-02-006: DoS due to unconditional nonce increment (Low)

Fixed in:

https://github.com/flynn/noise/pull/44 (tag v1.0.0).
Commit 326771cb95 (tag v0.20210424.0).

UCB-02-007: DoS due to missing socket timeouts (Low)

Fixed in:

Commit 23759e203f (tag v0.20210424.0).
Commit 4de69201d1 (tag v0.20210424.0).

UCB-02-008: Lack of rate limiting in Snowflake and dnstt (Info)

This item is informational and does not represent a specific security vulnerability. Rate limiting may help to mitigate certain kinds of resource-exhaustion attacks. As of the v0.20210424.0 tag, there is no such rate limiting in dnstt.

Here is the information I provided the auditors ahead of time regarding scope and interesting parts of the code:

What exactly are we supposed to look at?

I implemented the Turbo Tunnel concept in obfs4, meek, Snowflake, and a DNS tunnel called dnstt. See:

https://www.bamsoftware.com/papers/turbotunnel/#sec:case-studies

https://www.bamsoftware.com/papers/turbotunnel/#sec:availability

The obfs4 and meek implementations are more in the nature of tech demos, and not deployed for release. Therefore I would advise focusing on Snowflake and dnstt. The modified Snowflake is deployed in Tor Browser alpha since version 9.5a13 (May 2020). I'm not aware of any large deployments of dnstt, but it's documented and available for download, and I have tried to have it in a production-ready state (with the exception of the issue of TLS fingerprinting mentioned below).

Snowflake was an existing project, which my work added to. The main Snowflake project tracker and source code are here:

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/wikis/home

https://gitweb.torproject.org/pluggable-transports/snowflake.git/

A summary ticket for the new Turbo Tunnel code, and a diff of the primary merge are here:

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/33745

https://gitweb.torproject.org/user/dcf/snowflake.git/diff/?h=turbotunnel-merge-2&id=2022496d3b6fc76b7725135758c37d7d49546d3d&id2=e9b218a65cf731d0cf50a8655602ea82e67c128e

The code has been further modified by me and the Tor anti-censorship team since then, but the diff should tell you what files and what parts of the code to pay attention to. There are a lot of components in the Snowflake system, but my work mainly affects end-to-end communication between the client and server: the changes are transparent to the "broker" and "proxy" components.

The dnstt source code and documentation, and a discussion thread, are here:

https://www.bamsoftware.com/software/dnstt/#download

https://github.com/net4people/bbs/issues/30

dnstt consists of a client and server, and small auxiliary packages. It's an entirely new application, so I would think all of it is in scope.

Both of these implementations make essential use of a couple of third-party packages, which implement the underlying session/reliability layer. These packages are relevant for security, but it's not code I wrote, apart from a few minor pull requests.

https://github.com/xtaci/kcp-go

https://github.com/xtaci/smux

How many lines of code are there, what are they written in?

All the code is written in Go.

Snowflake: about 1250 lines + 450 test

dnstt: about 3150 lines + 750 test

What will our test setup look like?

For Snowflake, it depends on how many components you want to test, and in how much detail. Much of the attack surface in Snowflake (e.g. rendezvous, proxy enumeration, censor-operated proxies) has nothing directly to do with the Turbo Tunnel work I did for Xiao, which all takes place inside the encrypted WebRTC tunnel that Snowflake always had. To test only the externally observable network behavior, you can download a recent Tor Browser alpha from https://www.torproject.org/download/alpha/ and let it connect—but the externally observable characteristics won't have much to do with the Turbo Tunnel features, apart from packet sizes and timing. If you need more control (for example a proxy-level view to see what passes through the WebRTC tunnel), you will have to set up your own infrastructure, which includes a broker, a server, and proxies. Unfortunately the procedure for setting up infrastructure is not well documented, but you can find some information here:

https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Legacy-Wiki/SnowflakeBrokerInstallationGuide

https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Legacy-Wiki/SnowflakeBridgeInstallationGuide

server/README.md

For dnstt, there is no public infrastructure, but there are installation instructions here:

https://www.bamsoftware.com/software/dnstt/#setup

You'll need a domain name or subdomain to devote to the tunnel, and an Internet-accessible host to serve as the server. You must first generate an encryption key and then configure the server and client to use it. I have a dnstt-tests repository that documents the procedure I used for some performance tests:

https://www.bamsoftware.com/software/dnstt/performance.html

What's the threat model we should assume for this audit?

I think the threat model is like this one:

https://www.bamsoftware.com/papers/thesis/#sec:scope

The client is in a censor-controlled network but the client's own computer is trustworthy. The censor can observe or block all traffic the client sends and receives, but cannot break encryption. The censor can act as a client or (in Snowflake, for example) operate its own proxies. I believe that traffic analysis (measuring distributions of packet sizes and timings) is a valid attack, even if we do not yet really see it used in the wild, especially as it is one of the features most likely to be affected by a Turbo Tunnel model.

Let me close by pointing out some aspects of the program that are worth particular attention.

As stated at https://www.bamsoftware.com/software/dnstt/#caveats, dnstt currently does not implement any form of TLS camouflage, so it is likely detectable by its Go crypto/tls fingerprint. A research team at the University of Waterloo has started on a prototype integration of uTLS into dnstt for a sensible TLS fingerprint:

https://github.com/pjanisze/dnstt-uTLS/tree/dnstt-utls

I reviewed their branch and suggested some changes; my plan was to give them some more time to make the changes and get a patch accepted. If they did not get around to it, I planned to do the uTLS integration myself.

Both the Snowflake and dnstt implementations try to be forward thinking with regard to traffic analysis attacks, by making provision for padding in the protocol. (See common/encapsulation/encapsulation.go in Snowflake and DNSPacketConn.send in dnstt-client/dns.go.) However, the padding functionality is currently unused.

To achieve reasonable performance, dnstt-client needs to aggressively poll (send possibly empty upstream requests) so that the server has a vehicle for sending data downstream without delay. The polling strategy is implemented in DNSPacketConn.recvLoop in dnstt-client/dns.go—see the block comment that begins "Whenever we receive a response with a non-empty payload, we send twice on c.pollChan..." The intention behind this strategy is to manage a dynamic window of in-flight DNS queries in a somewhat TCP-like fashion, but without a lot of state tracking. The dynamics of this strategy may not be good, however, and the Waterloo team reported higher query rates when idle than I would have expected.

dnstt has a mandatory layer of end-to-end encryption, described here:

https://www.bamsoftware.com/software/dnstt/protocol.html#crypto

The protocol layering with respect to crypto deserves attention. The session/reliability layer is implemented using a combination of KCP and smux, and the crypto lies directly between these layers. An observer can see (and potentially manipulate) the KCP-layer features such as sequence and acknowledgement numbers, but the smux streams carried in the KCP layer are encrypted and authenticated. An alternative model would be to do packet-level rather than stream-level crypto, and encrypt each KCP packet.

There's no separate layer of crypto in Snowflake, as it's assumed that the tunnelled protocol implements its own, as is the case with Tor.

Resource exhaustion attacks are potentially interesting. I haven't dug much into the internals of kcp-go and smux, and because they implement a protocol similar to TCP, it's possible that TCP-like attacks apply to them. For example, initiating a new connection causes the server to allocate memory for the equivalent of a TCB; so it may be possible to do something analogous to a SYN flood.

There's a fuzzing harness for dnstt's DNS message parser in dns/fuzz.go.

dnstt is intended to be used with DoH or DoT, but it also supports a non-encrypted UDP mode. The UDP mode is not covert, because the structure of plaintext DNS messages reveals that a tunnel is in use. I have tried to document that the UDP mode is not meant for production, but the option exists.