Running a high-performance pluggable transports Tor bridge

https://www.bamsoftware.com/talks/foci-2023-pt-bridge/

https://www.bamsoftware.com/papers/pt-bridge-hiperf/

Overview

Direct-access transports

"Design of a blocking-resistant anonymity system":

Today, Tor relays operate on a few thousand distinct IP addresses; an adversary could enumerate and block them all with little trouble. To provide a means of ingress to the network, we need a larger set of entry points, most of which an adversary won't be able to enumerate easily.

Bridge obfs4 193.11.166.194:27020 86AC7B8D430DAC4117E9F42C9EAED18133863AAF cert=0LDeJH4JzMDtkJJrFphJCiPqKx7loozKN7VNfuukMGfHO0Z8OGdzHVkhVAOfo1mUdv9cMg iat-mode=0

Indirect-access transports

meek: user → CDN → bridge
Snowflake: user → {proxy,proxy,proxy,…} → bridge
Bridge snowflake X.X.X.X:XXXX 2B280B23E1107BB62ABFC40DDCC8824814F80A72 url=https://snowflake-broker.torproject.net.global.prod.fastly.net/ front=cdn.sstatic.net ice=stun:stun.l.google.com:19302 utls-imitate=hellorandomizedalpn

No secret information: only need one bridge.

But now you run into the Tor scaling problem.

Tor blocking in Russia created a great demand for Snowflake bridges.

How to reduce tor CPU load on a single bridge?
The main Snowflake bridge is starting to become overloaded, because of a recent substantial increase in users. I think the host has sufficient CPU and memory headroom, and pluggable transport process (that receives WebSocket connections and forwards them to tor) is scaling across multiple cores. But the tor process is constantly using 100% of one CPU core, and I suspect that the tor process has become a bottleneck.

In short: run multiple Tor processes on the bridge, ensure they have the same identity keys, and distribute traffic across them using a load balancer.
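The load-balancing step can be illustrated with a minimal sketch. It is not the bridge's actual configuration: it only shows the idea of handing each incoming connection to the next of several backend processes in rotation. It assumes each backend address stands for one tor process listening on localhost, that those processes already share identity keys, and that round-robin is an acceptable policy. The names RoundRobinBalancer and serve, and all port numbers, are hypothetical. A production bridge would use a dedicated load balancer rather than this script.

```python
import asyncio
import itertools


class RoundRobinBalancer:
    """Hand out backend addresses in rotation, one per new connection."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


async def _pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def serve(listen_port, backends):
    """Accept on localhost and forward each connection to the next backend."""
    balancer = RoundRobinBalancer(backends)

    async def on_connect(client_reader, client_writer):
        # Hypothetical backends: one (host, port) per tor process.
        host, port = balancer.pick()
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()  # chosen backend is down; drop the client
            return
        # Relay both directions; each direction closes its destination at EOF.
        await asyncio.gather(
            _pipe(client_reader, backend_writer),
            _pipe(backend_reader, client_writer),
            return_exceptions=True,
        )

    server = await asyncio.start_server(on_connect, "127.0.0.1", listen_port)
    async with server:
        await server.serve_forever()
```

As a usage example, asyncio.run(serve(10000, [("127.0.0.1", 10001), ("127.0.0.1", 10002)])) would alternate new connections between two backends on the made-up ports 10001 and 10002. Because the backends present the same identity, a client should not be able to tell which one it reached.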

The traditional way of running a Tor pluggable transports bridge.
Our way, with multiple Tor processes, that permits better scaling.

Getting here required hardware investment as well as the load-balanced multiple-Tor architecture, but the load-balanced multiple-Tor architecture is what made it possible to use the hardware to its fullest.

CPU use by process name.
RAM use by process name.

Most bridges do not need this technique.

It may be useful for other indirect-access transports like Conjure, default obfs4 bridges that serve many users, and maybe even exit nodes.

Hardware eventually becomes a limit anyway—the Tor anti-censorship team has had to establish a second Snowflake bridge.

Snowflake bridge installation guide.

Snowflake Daily Operations fundraising.