Threat modeling and circumvention of Internet censorship (David Fifield dissertation talk)

THREAT MODELING AND CIRCUMVENTION OF INTERNET CENSORSHIP

David Fifield

Committee:
J.D. Tygar (chair)
Deirdre Mulligan
Vern Paxson

September 28, 2017

SCOPE – the “border firewall”

[diagram of border firewall]

In scope:
IP address blocks
DNS blocks
protocol blocks
deep packet inspection
bandwidth throttling
packet injection (RST)

Out of scope:
domain takedowns
server-side blocking
forum moderation

PRINCIPLES

Need to resist blocking by content.
obfuscate/encrypt/disguise the messages you send and receive

Need to resist blocking by address.
an unblocked IP address (domain name, URL, ...) to talk with

The censor’s reluctance to block stems from its aversion to collateral damage – accidental overblocking, i.e. false positives.
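
The collateral-damage argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the flow counts and classifier rates are hypothetical numbers chosen for the example, not measurements from the talk, and `blocking_outcome` is a name invented here.

```python
def blocking_outcome(circumvention_flows, ordinary_flows,
                     true_positive_rate, false_positive_rate):
    """Return (circumvention flows blocked, ordinary flows overblocked)."""
    blocked = circumvention_flows * true_positive_rate
    overblocked = ordinary_flows * false_positive_rate
    return blocked, overblocked


# Hypothetical scenario: 1,000 circumvention flows hidden among
# 1,000,000 ordinary flows, with a classifier that catches 90% of the
# former and misfires on 1% of the latter.
blocked, overblocked = blocking_outcome(1_000, 1_000_000, 0.9, 0.01)
print(f"circumvention flows blocked: {blocked:.0f}")
print(f"ordinary flows overblocked:  {overblocked:.0f}")
```

Under these assumed numbers the overblocked ordinary flows outnumber the blocked circumvention flows by roughly ten to one, which is the kind of cost that makes a censor hesitate.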

PART THE FIRST
Modeling Censors

Academic research on circumvention tends to lack grounding in the real world. Without realistic threat models, researchers risk running an imaginary arms race.

[picture of bean tendrils]

2016 meta-analysis of circumvention research and practice

Practitioners value:
finding an unblocked address in the first place x x x x x x x x x x x x x x x x o o o o o o o o
not matching a small set of blocked behaviors (blacklist) x x x x x x x x x x x x x x x x x x x x o

Academics value:
what you do once you have it x x x x x x x x x o o o o o o o o o o o o o o o o o o o
conforming to a small set of allowed behaviors (whitelist) x x x x x o o o o o o o o o o o o o o o o o o o o o o o o o o

x = deployed systems (with users)
o = undeployed systems

Censors do*:
block IP addresses
block DNS names
mine proxy addresses
scan for proxies
find keywords in URLs
inject TCP RSTs
inject DNS responses
inspect at the packet level
complete shutdowns

Censors don’t:
whitelist IP addresses
whitelist DNS names
analyze packet size + timing
analyze n-grams
do expensive stream reassembly

*to a first approximation; there are many exceptions
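
One way to see why the difference between packet-level inspection and stream reassembly matters is a toy keyword filter. This is a minimal sketch, not a model of any real censor: the keyword, the request, and both function names are hypothetical, and segments are assumed to arrive in order.

```python
KEYWORD = b"forbidden"  # hypothetical blacklisted keyword


def per_packet_match(packets, keyword=KEYWORD):
    """Censor that inspects each packet in isolation (cheap, stateless)."""
    return any(keyword in p for p in packets)


def reassembled_match(packets, keyword=KEYWORD):
    """Censor that joins the in-order segments first (needs per-flow state)."""
    return keyword in b"".join(packets)


request = b"GET /forbidden HTTP/1.1\r\nHost: example.com\r\n\r\n"
whole = [request]                      # keyword sits inside one packet
split = [request[:8], request[8:]]     # keyword straddles two segments

print(per_packet_match(whole))    # packet-level filter sees the keyword
print(per_packet_match(split))    # packet-level filter misses it
print(reassembled_match(split))   # reassembling filter still sees it
```

The stateless filter is cheaper to run at scale, which is consistent with the observation that censors mostly inspect at the packet level.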

ACTIVE PROBING

[diagram of active probing]

The censor pretends to be a client, identifying proxies and adding them to a blacklist.

First observed in 2011, in use by the Great Firewall of China – still the only censor known to use it.

The censor’s probes come within seconds, from a wide diversity of source addresses.
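
The probing loop described above can be sketched as a small simulation. Everything here is hypothetical: the `Server` and `Censor` classes, the made-up `PROXY-HELLO` handshake, and the documentation-range addresses are stand-ins for illustration, not the Great Firewall's actual logic.

```python
class Server:
    """Toy server that answers a proxy handshake only if it is a proxy."""

    def __init__(self, is_proxy):
        self.is_proxy = is_proxy

    def handle(self, message):
        if self.is_proxy and message == b"PROXY-HELLO":
            return b"PROXY-OK"
        return b"HTTP/1.1 400 Bad Request"


class Censor:
    """Toy censor: probes suspicious destinations, blacklists confirmed proxies."""

    def __init__(self, network):
        self.network = network      # maps address -> Server
        self.blacklist = set()

    def probe(self, addr):
        # The censor pretends to be a client and speaks the proxy protocol.
        return self.network[addr].handle(b"PROXY-HELLO") == b"PROXY-OK"

    def observe(self, dst_addr, looks_suspicious):
        # Passive observation only triggers a probe; the probe decides.
        if looks_suspicious and dst_addr not in self.blacklist:
            if self.probe(dst_addr):
                self.blacklist.add(dst_addr)


network = {
    "192.0.2.1": Server(is_proxy=True),
    "192.0.2.2": Server(is_proxy=False),
}
censor = Censor(network)
censor.observe("192.0.2.1", looks_suspicious=True)
censor.observe("192.0.2.2", looks_suspicious=True)
print(sorted(censor.blacklist))
```

Because the probe confirms the guess before blocking, the censor can afford a sloppy passive classifier without overblocking the ordinary server.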

BRIDGE BLOCKING DELAYS (in progress)

Sometimes censors surprise us with their sophistication (active probing blocks new bridges within seconds).

Sometimes they surprise us with a seeming lack of competence (failing to block a static set of IP addresses).

We should think of censors as complex systems, with many human and machine components, whose goals are sometimes contradictory.

PART THE SECOND
Building Circumvention Systems

DOMAIN FRONTING

[diagram of domain fronting]

TLS SNI doesn’t match HTTP Host header.
The censor sees only the TLS SNI and DNS request.
Intermediary CDN routes according to the Host header.

Try it at home!

$ wget -q -O- https://ajax.aspnetcdn.com/
... Microsoft HTML ...
$ wget -q -O- https://ajax.aspnetcdn.com/ --header "Host: meek.azureedge.net"
I’m just a happy little web server.

$ wget -q -O- https://a0.awsstatic.com/
... Amazon HTML ...
$ wget -q -O- https://a0.awsstatic.com/ --header "Host: d2zfqthxsdq309.cloudfront.net"
I’m just a happy little web server.

None of this needs to be kept secret from the censor.
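
The same layering can be sketched in Python. This is a hedged illustration using the standard library's `http.client`: the helper names `fronted_request_plan`, `censor_view`, and `fetch_fronted` are invented for this example, and the sketch makes no claim about whether any particular CDN routes fronted requests this way today.

```python
import http.client


def fronted_request_plan(front_domain, hidden_host):
    """Describe which name appears in which layer of a fronted request."""
    return {
        "dns_query": front_domain,   # sent in the clear
        "tls_sni": front_domain,     # sent in the clear
        "http_host": hidden_host,    # inside the encrypted TLS stream
    }


def censor_view(plan):
    """Names visible to an on-path censor: only the DNS query and TLS SNI."""
    return {plan["dns_query"], plan["tls_sni"]}


def fetch_fronted(front_domain, hidden_host, path="/"):
    """Send the request over the network (defined but not called here).

    HTTPSConnection resolves and sets the SNI from front_domain; passing
    an explicit Host header replaces the one it would otherwise add.
    """
    conn = http.client.HTTPSConnection(front_domain, timeout=10)
    try:
        conn.request("GET", path, headers={"Host": hidden_host})
        return conn.getresponse().read()
    finally:
        conn.close()


plan = fronted_request_plan("ajax.aspnetcdn.com", "meek.azureedge.net")
print(censor_view(plan))
```

The hidden host never appears in `censor_view`, so blocking it requires blocking the front domain and accepting the collateral damage.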

Advantages:
easy to implement and deploy
quantifiably hard to block

Disadvantages:
expensive (<$1K/month)
slower because of HTTP overhead

Now integrated into Tor (as the “meek” pluggable transport), Lantern, Psiphon, Signal – widely used.

[diagram of Tor Network Settings] [diagram of Signal advanced settings]

SNOWFLAKE (in development)

[diagram of Snowflake]

Advantages:
avoids address blocking with lightweight ephemeral proxies
less expensive than full domain fronting

Disadvantages:
significant engineering challenges
WebRTC’s resistance to content-based blocking is unknown.
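
A toy model hints at why ephemeral proxies help against address blocking. It rests on an assumption that is not stated in the talk: that the censor blocks an address a fixed delay after the proxy first appears. The function name and the example durations are hypothetical.

```python
def usable_fraction(proxy_lifetime, blocking_delay):
    """Fraction of a proxy's lifetime during which it is still unblocked.

    Assumed model: the address is blocked `blocking_delay` seconds after
    the proxy first appears and stays blocked afterwards.
    """
    if proxy_lifetime <= 0:
        raise ValueError("proxy_lifetime must be positive")
    return min(proxy_lifetime, blocking_delay) / proxy_lifetime


DAY = 86_400  # seconds

# Hypothetical long-lived proxy: up for 30 days, blocked after 1 day.
print(usable_fraction(30 * DAY, DAY))

# Hypothetical ephemeral proxy: up for 10 minutes, same blocking delay.
print(usable_fraction(600, DAY))
```

In this model a proxy that disappears before the censor reacts is usable for its whole life, while a long-lived one spends most of its life blocked. The model ignores active probing, which the talk notes can act within seconds.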

CLOSING THOUGHTS

THE END