Nabi2013a The Anatomy of Web Censorship in Pakistan Investigates web censorship in Pakistan in 2013. Used a publicly known blacklist as test set: http://propakistani.pk/wp-content/uploads/2010/05/blocked.html. (They filtered the list down to 307 entries.) Tested on 5 different networks in Pakistan. Wrote a custom script to do name resolution and HTTP requests. Over half of the sites on the test set were blocked by DNS (spoofed NXDOMAIN); less than 2% were additionally blocked by HTTP (injected 302 redirect). They caught a change in the censorship infrastructure in April 2013. After the change, the injected HTTP redirects became a static block page (status code 200). According to their survey, the most-used circumvention systems are public VPNs at 45% (Hotspot Shield and Spotflux). Verkamp2012a Inferring Mechanics of Web Censorship Around the World This paper studies censorship mechanisms in 11 countries using PlanetLab nodes and through personal contacts. It focuses on “how” censorship is conducted. Censorship techniques vary across countries. Some have overt block pages; some do not. China is the only country to do per–TCP flow state tracking. They divide censorship mechanisms into three components: trigger, location, and execution. Triggers are e.g. hostnames or keywords; locations are e.g. DNS servers or routers; and executions are e.g. dropping packets or serving a block page. Sfakianakis2011a CensMon: A Web Censorship Monitor The goal of this paper is to detect censored content and filtering techniques used by the censor. It describes a system called CensMon that is built on PlanetLab. As opposed to e.g. Herdict, it does not rely on voluntary user submissions. The paper is more about the design of the system than about results. Ran for 14 days in 2011 with nodes in 33 countries. Found 193 blocked domain–country pairs (176 of them in China). Used e.g. Google Hot Trends and Twitter to seed the URL list. Found blocking mechanisms across all nodes: 18.2% DNS manipulation, 33.3% IP blocking, 48.5% HTTP filtering. It seems not to have been deployed on a continuing basis. Park2010a Empirical Study of a National-Scale Distributed Intrusion Detection System: Backbone-Level Filtering of HTML Responses in China A measurement of blocking based on keywords in HTTP responses by the Great Firewall between August 2008 and August 2009. They argue that filtering (RST injection) of HTTP responses is difficult, compared to filtering of HTTP requests. They speculate that it is for this reason that the GFW apparently discontinued filtering of responses some time between August 2008 and January 2009. (Filtering of HTTP *requests* continued.) RST injection is inexact; it worked only 51% of the time in their tests. Predicting sequence numbers once the connection is well underway is harder than doing so after only the first packet--this is the key difference between request filtering and response filtering. Used open web proxies inside China (47 locations); made requests from outside to a proxy inside and then back out to a server outside. The GFW improved its RST injection during the study, removing TTL anomalies that previously existed (Section III-C). There is evidence of diurnal patterns; i.e., perhaps injection boxes get overloaded at times of high traffic. Adjacent IPs can see drastically different levels of censor effectiveness, perhaps because of "traffic engineering" that causes different IPs even in the same subnet to end up in different buckets (for example, different routes or DPI boxes). 
They ran a separate experiment to measure statefulness of detection of GET requests. In 2 of 10 servers they found that the firewall responded to a naked GET (as in Clayton2006a), and in 8 of 10 cases it required an established TCP connection (as in Crandall2007a). Aryan2013a Internet Censorship in Iran: A First Look Did experiments during the two months before the June 2013 presidential election. Had one vantage point at one ISP. They tried accessing the top 500 Alexa sites. They found these blocking methods: Host header and keyword filtering, DNS hijacking, and throttling. Filtering by Host header and keyword is the usual method of blocking. DNS hijacking (to a blackhole IP address) is only active for three domains: facebook.com, youtube.com, and plus.google.com. Protocols are throttled, apparently by a protocol whitelist. HTTP and HTTPS reached 85–89% of theoretical performance, while SSH reached only 15%. Randomized protocols (like obfsproxy) were throttled to near zero 60 seconds into the connection. Additionally, home Internet speeds are limited to 128 kbit/s. The firewall is capable of dropping packets. Khattak2013a Towards Illuminating a Censorship Monitor's Model to Facilitate Evasion Evaluates the Great Firewall from the perspective that it is a NIDS or traffic monitor. Only looked at TCP, not e.g. DNS. Looked for ways to evade detection that are expensive for the censor to repair. They find that the GFW is stateful, but tracks only client-to-server state. The GFW is vulnerable to a variety of TCP- and HTTP-based evasion techniques, such as overlapping fragments, TTL-limited packets, and URL encodings, some easy to fix and some not; some receiver-dependent and some not. Winter2012a How the Great Firewall of China is Blocking Tor Focuses on the GFW's capabilities against Tor. Reports dynamic blocking of bridges since October 2011. Had bridges in Singapore, Sweden, and Russia, as well as open proxies and a VPS in China. Tor directory authorities are blocked by IP. Blocking of bridges is by IP:port. Blocks persist only as long as scanners are able to continuously connect to them. Provoked active probing by simulating Tor connections to a bridge, and collected 3295 scans over 17 days. Over half of scans come from 202.108.181.70. The remainder seemingly came from ISP pools. The GFW may be spoofing IPs because the TTLs on probe packets are different than the measured TTLs some minutes after the probe. Active probing is initiated every 15 minutes and each burst lasts for about 10 minutes. Clayton2006a Ignoring the Great Firewall of China Observe that the GFW is symmetric: blocks outside-to-inside traffic the same as inside-to-outside. Sent keyword-carrying ("falun") requests. Observed injected RSTs (both directions) and forged SYN/ACK (to desynchronize the ACK counter). After initial keyword-caused blocking, all further requests (even those without keywords) to the same server are blocked for a period of time by injected RST. Ignoring RSTs (on both ends) is a mostly effective means of circumvention. They find that GFW-injected packets are easy to identify because of their weird TTLs and other features. Firewall devices at different border points do not share blocking state. The firewall does not do reassembly: splitting a keyword across packets hides it. Blocking is by top 7 bits of port number (blocks of 128 port numbers). Repeated scanning from many IPs found many unexplained failures to block. 
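Clayton2006a's no-reassembly observation suggests a simple check one could script: if the trigger keyword is split across two TCP segments, a filter that inspects packets individually never sees it. A minimal sketch, with a hypothetical host; TCP_NODELAY makes it likely, though not guaranteed, that each send() leaves as its own segment (a careful test would craft raw packets instead):

    import socket

    HOST = "example.com"    # hypothetical server on the far side of the firewall
    REQUEST = "GET /falun HTTP/1.1\r\nHost: example.com\r\n\r\n"

    def send_split(host, request, split_at):
        s = socket.create_connection((host, 80), timeout=10)
        # Disable Nagle so the two send() calls tend to leave as two segments.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        s.sendall(request[:split_at].encode())
        s.sendall(request[split_at:].encode())
        return s.recv(4096)

    # Split in the middle of the trigger keyword: "GET /fa" + "lun HTTP/1.1 ..."
    print(send_split(HOST, REQUEST, REQUEST.index("falun") + 2))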
Anonymous2014a Towards a Comprehensive Picture of the Great Firewall's DNS Censorship Examines DNS injection by the GFW. Examines the structure of the DNS injector (individual nodes operate in clusters of several hundred processes that collectively inject censored responses). Localizes the location of the DNS monitors (happens only at China's borders). Extracts a blacklist of keywords (around 15,000 of them). DNS injection in China was first reported in 2002 (Dong2002a); in 2007 it was a keyword filter with up to 8 fake reply IPs; in this work, they found 174 poison IPs. Probed non-responsive IPs with known censored domains, probed responsive servers with nonexistent domains that contain keywords, and used open resolvers to issue DNS queries indirectly. Found open resolvers by a port scan for port 53, and measured all of them that were within China to see if they experience DNS poisoning; essentially all do. Did binary search to discover keywords; found about 15,000 and that some were anchored to the start or end, etc. Sent TTL-limited requests in order to locate injectors. Observed a consistent false negative rate by injectors of 0.5–2.0%, which explains why previous studies (like Crandall2007a) thought censorship was happening a few hops in from the border; it is probably intermittent failures that make it appear this way, and all censorship in fact happens at the border. Tested the Alexa 1M and the zone files for .com, .net, .org, and .info. Found that 2/3 of censored domains were already expired but never removed from the blacklist. Found TTL and IP ID side channels that allowed estimating the number of injector processes (about 360). Anderson2014a Global Network Interference Detection over the RIPE Atlas Network Used the existing RIPE Atlas network, a globally distributed Internet measurement network consisting of physical probes hosted by volunteers. Atlas allows four types of measurements: ping, traceroute, DNS resolution, and X.509 certificate fetching. Uses consensus across countries as a means of establishing ground truth. Examines two case studies of Atlas measurements: Turkey's ban on social media sites in March 2014 and Russia's blocking of certain LiveJournal blogs in March 2014. In Turkey, found at least six shifts in policy during two weeks of site blocking. In blocking twitter.com, the authorities first poisoned DNS for twitter.com, then blocked the IPs of the Google DNS servers 8.8.8.8 and 8.8.4.4, then blocked Twitter's IPs. They blocked HTTPS access to torproject.org, but did not block access to the Tor network itself. In the Russian blocking, they found 10 unique poison IPs used to poison DNS. They discovered that certain subdomains of livejournal.com resolved to a distinguished IP, apparently in order to make them easier to block. Crandall2007a ConceptDoppler: A Weather Tracker for Internet Censorship Measures keyword filtering by the Great Firewall and demonstrates the automated discovery of new keywords by latent semantic analysis using the Chinese-language Wikipedia as a corpus. Discussed providing an ongoing, periodic "censorship weather report," but it seems that was never realized. They found that they needed to establish a TCP connection before keywords were detected; sending a naked HTTP request without a preceding SYN resulted in no blocking. Did TTL experiments to find the location of routers. The location is not uniform; only 30% of blocking happens at the first hop within China and 28% of end hosts lay on paths that were not filtered. 
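The TTL techniques above (Anonymous2014a's injector localization and Crandall2007a's router-location experiments) boil down to replaying the same trigger with an increasing IP TTL and noting the first hop at which an answer appears. A rough scapy sketch of the DNS variant, with hypothetical addresses and query name (requires root and a vantage point whose path to the target crosses the firewall):

    from scapy.all import IP, UDP, DNS, DNSQR, sr1   # requires root privileges

    TARGET = "192.0.2.1"         # hypothetical unresponsive IP beyond the firewall
    QNAME = "www.facebook.com"   # hypothetical censored name

    def first_injecting_hop(target, qname, max_ttl=30):
        """Return the lowest TTL at which a (necessarily injected) DNS answer appears."""
        for ttl in range(1, max_ttl + 1):
            query = IP(dst=target, ttl=ttl) / UDP(dport=53) / DNS(rd=1, qd=DNSQR(qname=qname))
            resp = sr1(query, timeout=3, verbose=False)
            # The target itself never answers, so a reply that carries DNS answer
            # records must have been injected by a device within ttl hops.
            if resp is not None and resp.haslayer(DNS) and resp[DNS].ancount > 0:
                return ttl
        return None

    print(first_injecting_hop(TARGET, QNAME))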
Anderson2013a Dimming the Internet: Detecting Throttling as a Mechanism of Censorship in Iran Uses the Network Diagnostic Tool, a tool built by M-Lab and integrated into μTorrent, as a means of measuring network throttling in Iran. Through the measurement of RTT, packet loss, throughput, and fraction of network-limited time, they identify past periods of network throttling and identify networks that are less affected by throttling. A major challenge is distinguishing throttling from other natural network conditions. They argue that decreased throughput without an accompanying increase in RTT and packet loss is an indicator of artificial throttling. They find two major periods of throttling, between Nov 2011 and Aug 2012, and between Oct 2012 and Nov 2012. Academic institutions are affected by throttling, but less so than other networks, with higher throughput during throttling and a faster recovery afterward. Knockel2011a Three Researchers, Five Conjectures: An Empirical Analysis of TOM-Skype Censorship and Surveillance Reverse-engineered the client-side censorship keyword lists in TOM-Skype and tracked changes to the lists over time. There are two lists: one for censorship (the message doesn't get sent/displayed, and gets logged to a remote server) and one for surveillance (the message gets sent/displayed, but it is also logged to a remote server). They emphasize the formulation of testable conjectures and present five conjectures as to why the censorship works the way it does. Xu2011a Internet Censorship in China: Where Does the Filtering Occur? Focuses on keyword filtering by the Great Firewall, with the goal of understanding where filters are located at the IP and AS levels. Claims that keyword filtering is more effective than IP and DNS blocking, because of reduced collateral damage and difficulty of circumvention. Most filtering is done at border ASes (ASes with at least one non-China peer). They find that the firewall is 100% stateful; blocking is never triggered by an HTTP request without SYN and SYN/ACK. Much filtering occurs not on the backbone but by regional providers; in AS4134 only 13% occurred at the backbone. Anderson2012splinternet Splinternet Behind the Great Firewall of China Summarizes censorship technologies used by the GFW: IP blocking by null routing, DNS injection, and RST injection. Claims that RST injection is sometimes randomly practiced on HTTPS sites in order simply to degrade connectivity. Gives figures on collateral damage (even outside China) from BGP prefix hijacking and DNS pollution. turkeycensor Report of the OSCE Representative on Freedom of the Media on Turkey and Internet Censorship The report analyzes Law 5651 (the "Internet Act") of 2007, which is the basis for mass blocking of websites in Turkey, with the goal of reforming the law and bringing it into line with standards of freedom of expression. Not very interesting for us. The only interesting fact from this is that it seems Turkey just blocks websites using a blacklist of keywords and IP addresses (p. 28: "The DNS blocking/tampering and IP address blocking methods currently used in Turkey for the execution of blocking orders results in massive over-blocking..."). Between May 2007 and December 2009, ≈3700 websites were blocked under Law 5651. There is an email, telephone, and SMS hotline for the reporting of websites that someone thinks should be blocked, under the administration of the president of Turkey. 
There were ≈82,000 total calls to the hotline up to May 2009, and ≈34,000 of those were considered actionable; of the remainder, ≈31,000 were duplicates and ≈16,000 were not considered illegal. The author finds fault with the law's presumption of guilt. Marquis2013planet: Planet Blue Coat: Mapping Global Censorship and Surveillance Tools Uses Shodan and country-wide port scans to search for Blue Coat devices and then scans them from Europe and the US. An HTTP response that passes through a Blue Coat device contains a specific keyword that identifies the type of Blue Coat device. The paper searched for two types of Blue Coat products: ProxySG and PacketShaper. ProxySG is a web filter that can also MITM SSL. PacketShaper can filter application traffic by content category. Found 61 Blue Coat ProxySG devices and 316 Blue Coat PacketShaper devices located all over the world, including in countries with a history of human rights abuses. An earlier report in 2011 found Blue Coat devices in use in Syria, a country subject to sanctions that should not have been able to buy them: https://citizenlab.org/2011/11/behind-blue-coat/. Mathrani2010a Website Blocking Across Ten Countries: A Snapshot Measured Internet blocking in June 2009 in 10 countries: CA, CN, FR, DE, IR, NZ, SA, TR, AE, and US. Doesn't have much to say about blocking techniques. Wrote a tool that, for each URL in a seed list, did a DNS lookup and then downloaded the HTML; if the IP did not match a known IP for that domain, or if the HTML did not match known HTML for that page, the URL was counted as blocked. (It seems they used some kind of crawler to discover different IPs and HTML--it's unclear--but at any rate the methodology is questionable.) They found these blocking rates across their seed list of pages that included the categories Religion, Political, News, Entertainment, and Social Networks: SA: 30%, IR: 26%, CN: 13%, AE: 12%, FR: 2%, DE: 2%, CA: 0%, NZ: 0%, TR: 0%, US: 0%. Leberknight2010taxonomy A taxonomy of Internet censorship and anti-censorship The paper is a little confused throughout. It feels like several people did independent research and made independent categorizations, then pulled them all together into a weak synthesis that is not coherent. It purports to be a taxonomy, but it is not really one, neither of censorship nor of anti-censorship technologies. They have some bulleted lists and a cursory survey of existing technologies. They start some ideas like "four approaches of anti-censorship" on page 9 but then never follow them up. Nevertheless, there are some good tidbits. They abstractly classify censors according to the dimensions of attack mode (e.g. takedown of the endpoint versus in-line IP filtering); filtering method (e.g. IP filtering, DNS filtering, keyword filtering); and targets (what is censored, a policy decision). They identify four approaches to anti-censorship (but do not elaborate at all): volume-based, speed-based, covert channel–based, and new technology–based. They claim that sustainable circumvention can only succeed by increasing the censor's costs, specifically in keyword filtering and stateful traffic analysis, and that for a circumvention system to be truly successful, it must have an impact on public policy. Part of the discussion is about anonymous publishing systems that are outside our scope (e.g. Tangler, Freenet, Free Haven, Eternity Service). 
They name the censor's 8 criteria in Section II.A:
* Cost
* Scope (range of communications modalities censored)
* Scale (capacity of the censorship system)
* Speed
* Granularity (specificity of blocking rules) [seems conflated with False positives?]
* False negatives
* False positives
Then they give the evader's 7 criteria in Section IV.A:
* Availability (?) "there is no use of an anti-censorship technology if the target users cannot access it."
* User-friendliness
* Verifiability (secure distribution)
* Scope (range of communications modalities covered)
* Security (?) "this is the most obvious dimension of an anti-censor"
* Deniability
* Performance
But, in Figure 8 and Section VI there is a different set of *9* evader criteria:
* Ability to influence social and political
* Anonymity & Trust
* Centralized Vs. P2P
* Universality; Mobile & Web
* Cost
* Access Vs. Publish
* Performance
* Promotion
* Usability
These then somehow morph into the 4 "most critical" criteria, which are not even a subset of the previous list:
* Anonymity & Trust
* Usability
* Technological Superiority
* Infrastructure Support
oni-china-2005 Internet Filtering in China in 2004-2005: A Country Study Did testing of accessibility of URL lists, using both a custom URL tester tool deployed in-state and tests run through proxy servers. Compared results to those of a control client located in an unfiltered country. The in-state and proxy tests tended to have different blocking rates. Found extensive blocking, though overall less than in a 2002 study. Examined blocking rates by URL category. Mostly entire domains were blocked, but there were some instances of IP blocking and page blocking (some sites blocked within a domain but others unblocked). They observed injected RST followed by a zero TCP window size (https://opennet.net/bulletins/005/) that prevented TCP communication with the same host for a period of minutes. They give a specific example of hardware capabilities: the Cisco 12000-series routers can implement up to 750,000 filtering rules. They give an example of a rule, noting that it is trivial to modify for keyword blocking:
    Match protocol http url "*root.exe*"
URL filtering could sometimes be bypassed; Google's cache was accessible if you tweaked the query string. Also tried posting blog comments and sending emails. Found blog censorship, but with little overlap with a list of 987 banned keywords extracted from a QQ client. Email filtering is done by email provider, not at the backbone. Email filtering results differed by character encoding: GB 2312 had more blocking on average than UTF-8. Gives background on legal issues, which include general media regulation, regulation of ISPs and providers of Internet services, and regulation of cybercafes. ISPs and cybercafes are required to keep logs on their customers. The first Internet regulations came in December 1997. Chaabane2014a Censorship in the Wild: Analyzing Internet Filtering in Syria Analysis of 600 GB of leaked logs from seven Blue Coat SG-9000 proxies deployed to censor the network backbone in Syria. The logs cover 9 days in July and August 2011. The proxies were configured in transparent mode. The logs contain an entry for every HTTP request, with data such as host and URL path, and whether the request was allowed or denied. They find evidence of IP blocking, domain blocking, and URL keyword blocking; also of users evading censorship. About 6% of requests were denied; of those, most were due to network errors. 
In total, only about 1% of total requests were censored; however, this underestimates censorship because uncensored web pages will result in many requests. IP blocking blocked many subnets in Israel. Domain blocking covered certain domains, including all of .il. Found blocked URL keywords: "proxy", "hotspotshield", "ultrareach", "israel", "ultrasurf". Overblocking of e.g. Google Toolbar and the Facebook Like button, which use "proxy" in requests. They claim that the blocking of "proxy" blocks SOCKS proxies, but that doesn't make sense to me. Fragile URL blocking: appending query params made a URL unblocked. IM services are disproportionately blocked. Partial blocks of Twitter, Facebook; some targeted, but some a side effect of keyword blocks. Tor only lightly censored (1.38%). Only one of 7 proxies seems to block it, and only sporadically. It blocks only OR traffic, not directory requests. The Blue Coat proxies are not identical; some appear to have special purposes. Users use BitTorrent to download circumvention software. People use Google Cache. HTTPS accounts for only 0.08% of all traffic and 0.82% of censored traffic. There is evidence of throttling (fewer requests) on Friday, consistent with press reports. Ensafi2015a Analyzing the Great Firewall of China Over Space and Time Test connectivity between geographically distributed hosts in China and Tor relays, Tor directory authorities, and web servers. They controlled neither the hosts in China nor the relays: to measure connectivity between them they employed "hybrid idle scan" and "SYN backlog scan." Hybrid idle scan is an idle scan, with the additional feature that it can also detect unidirectional client-to-server blocking, by exploiting the fact that the server retransmits SYN/ACKs a small number of times (incrementing the client's IP ID), and it will send all those retransmissions because the client's RST that would otherwise stop them is blocked. SYN backlog scan exploits a feature of Linux where it prunes backlogged SYNs (retransmits the SYN/ACK fewer times) when the SYN buffer is more than half full. This is an improvement over a previous SYN backlog technique that required filling the backlog completely and observing the transition to SYN cookies. SYN backlog scan can be employed in two ways (Fig. 5) to see whether it is the SYN or the SYN/ACK that gets dropped. Selected geographically uniformly distributed hosts in China, literally slicing by latitude and longitude. Over time, tested all clients against all servers. Dropped >60% of data because of churn or errors. They found frequent failures of the firewall resulting in temporary connectivity, and that such failures happen in bursts of hours. Observe no significant patterns having to do with geographical location, though routing may be important. CERNET fails much more often than other networks. SYN backlog scan confirms earlier results that it is the SYN/ACK, not the SYN, of Tor relays that is blocked. Dainotti2011a Analysis of Country-wide Internet Outages Caused by Censorship It's an analysis of "kill switch" total Internet outage events in Egypt and Libya in January, February, and March 2011. Their main tools are BGP data (from Route Views and RIPE Routing Information Service), a /8 network telescope at UCSD, and (to a lesser extent) active traceroutes. They were able to detect when blocking was in effect by observing a drop in traffic (mainly Conficker scans) to their telescope. 
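A toy illustration of that telescope-based detection (the numbers are made up; the real analysis separates traffic classes such as Conficker scanning and works with actual UCSD telescope data): flag hours whose packet count falls far below a trailing baseline.

    # Toy outage detector over hourly packet counts seen by a darknet/telescope.
    def find_outages(hourly_counts, window=24, drop_factor=0.2):
        """Return indices of hours whose count falls below drop_factor times
        the mean of the preceding `window` hours."""
        outages = []
        for i in range(window, len(hourly_counts)):
            baseline = sum(hourly_counts[i - window:i]) / window
            if hourly_counts[i] < drop_factor * baseline:
                outages.append(i)
        return outages

    counts = [1000] * 48 + [50] * 12 + [1000] * 24   # synthetic: a 12-hour blackout
    print(find_outages(counts))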
The most interesting result is that in Libya, the telescope-measured outage was a superset of the BGP-measured outage, indicating that the blocking was not only by BGP. In Egypt: during a blocking event of 5 days, the telescope data and BGP data agreed as to the start and stop. From BGP, there was a precipitous drop in advertised IPv4 prefixes from 3,000 to 175. (Interestingly, there was no change in IPv6.) Even during the outage, the telescope receives a little bit of traffic (still mostly Conficker). They detected DoS attacks against Egyptian government sites just before and just after the outage. In Libya: there were three outages. The first two were overnight blocks lasting hours; the third lasted four days. The first block was fully BGP-based; the second was a mix of BGP and something else (packet filtering); and the third was only packet filtering. The authors suspect that the censor tested the packet-filtering block during the second outage, withdrew it, and then reinstated it. A non-state-controlled satellite ISP was affected just as much as the single other (state-controlled) ISP; the authors suggest jamming of the satellite signals. There were also two day-long DoS attacks on servers in Libya. Wolfgarten2006a Investigating large-scale Internet content filtering Light analysis of blocking in China, contemporary with Clayton2006a. Not too deep. The author rented a VPS in China (China Telecom) in July–August 2006 and ran manual tests. Found DNS blocking: SERVFAIL or timeout for blocked domains. Confirmed search engine provider participation in censorship. Confirmed RST blocking of TCP flows containing keywords. The author gives a survey of possible circumvention techniques. Claims that using an alternate DNS server works--means no DNS injection at that time? (Dong2002a had injection as news in 2002; Tokachu2006a and Lowe2007a §6.4 also say that DNS censorship happens by injection.) OpenVPN worked in China. Can do: fake DNS replies; keyword filtering; RST injection; IP blocking, presumably. Can't do: DNS injection; block OpenVPN. Clayton2006b Failures in a Hybrid Content Blocking System Analyzes British Telecom's "CleanFeed" system that was instituted in 2004. It is intended to block the Internet Watch Foundation's list of child porn images. It is "hybrid" because it is fast path–slow path: potentially blockable traffic is redirected to a proxy server, which decides whether to block. The first stage is a simple IP-and-port match (DNS names are pre-resolved) and the second stage is a full-fledged web proxy. This design gives good precision (low overblocking). Identifies ways to circumvent/attack: serve different content to the CleanFeed system itself, use a proxy, use source routing, obfuscate URLs, change the site IP address, fake DNS replies to get important IPs blocked. You get a lot of leverage from the two-tier design where the first tier is imprecise. The system unintentionally works as an oracle to discover the list of domains with blocked URLs: send TTL-limited packets to various IP addresses; those destined to censored sites reach the proxy without exhausting the TTL. Cites contemporary blocking systems in China, Saudi Arabia, Norway, and the US. Lowe2007a The Great DNS Wall of China Found the authoritative DNS servers for 50,000 .cn web sites (it's not totally specified how they made the list of web sites). Then filtered for DNS servers located in China, then further limited to only recursive servers, leaving about 1,600. 
Tested about 950 presumed-censored domains, mostly from a 2002 cyber.law.harvard.edu list. Tried queries for presumed-blocked domains, for those same domains as subdomains of randomly generated domains, and repeated queries (600) to a single DNS server. Tried TTL manipulation to find the source of tampering. They found that about 400 of their 951 domains were tampered with; tampered responses returned one of only about 20 possible IP addresses, the great majority sharing exactly 8 addresses. Domains were embedded as subdomains of a randomly generated domain (www.epochtimes.com..com). The 8 poison IP addresses are interesting (Table 2):
    202.106.1.2      CNCGROUP                 Beijing, CN
    202.181.7.85     First Link Internet      North Rocks, AU
    203.161.230.171  POWERBASE-HK             Hong Kong, HK
    209.145.54.50    World Internet Services  San Marcos, CA, U.S.
    211.94.66.147    China United Telecom     Beijing, CN
    216.234.179.13   Tera-byte Dot Com        Edmonton, CA
    4.36.66.178      Level 3 Communications   Broomfield, CO, U.S.
    64.33.88.161     OLM, LLC                 Lisle, IL, U.S.
TTL experiments showed that DNS replies are being injected (and the true DNS replies are not dropped; they just lose the race). This is different from Wolfgarten2006a, which claimed that using an alternate DNS server avoided DNS censorship. Shen2013freeweb Freeweb: P2P-Assisted Collaborative Censorship-Resistant Web Browsing Uses a global DHT (they suggest an implementation called OpenChord), some of whose nodes are censored and some not. The client hashes the URL it wants to retrieve and sends a request including the URL to the node that the hash maps to. The node tries to retrieve the URL. If it cannot, it again hashes the hash that pointed to it (h(h(URL))) and forwards the request to the next node. The process continues until a node is able to download the URL. (These request messages are encrypted hop-by-hop.) Once downloaded, the node encrypts the page contents in a reply onion and sends it back to the client along a client-chosen path. It seems to be an unstated part of the threat model that the censor is not allowed to externally interfere with the DHT itself or its protocol messages, only do things like act as a malicious node. They don't explain how the censored client learns its list of DHT neighbors or their public keys. Another of their assumptions is some wishful thinking on the number of proxies: "We envision Freeweb as consisting of a massive number of joined nodes" without explaining where those nodes come from. "We set N=2^10 and N=2^14 to simulate a medium and a large scale network." They claim "request originators and request forwarders are not distinguishable" but that's true only if URLs are unpredictable: a forwarder's node ID will be equal to h^n(URL) for some URL and some small n; an originator's will not. They claim that the drawbacks of other systems are either a small number of proxies or a centralized server that can be blocked. Every node along the way gets to see the URL. They suggest a multiple-download strategy to mitigate the threat of a malicious node returning a fake web page. There are a lot of irrelevant details like how the servers cache downloaded pages and how there is a text-only mode. They run some experiments on PlanetLab. It seems completely broken: for each URL the censor wants to block, it can just block the IP of the first node in the hash chain that it can observe. 
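A toy sketch of Freeweb's forwarding rule as described above (node IDs are made up; the real system would run on a DHT such as OpenChord): the request key for hop n is the hash applied n times to the URL, so anyone who knows the URL can precompute the whole chain, which is exactly why the blocking attack noted at the end works.

    import hashlib
    from bisect import bisect_left

    def h(data: str) -> int:
        return int(hashlib.sha1(data.encode()).hexdigest(), 16)

    NODE_IDS = sorted(h(f"node-{i}") for i in range(16))   # hypothetical DHT members

    def owner(key: int) -> int:
        """Chord-style successor: the first node ID >= key, wrapping around."""
        i = bisect_left(NODE_IDS, key)
        return NODE_IDS[i % len(NODE_IDS)]

    def route(url: str, max_hops=4):
        key = h(url)
        for hop in range(max_hops):
            print(f"hop {hop}: ask node {owner(key):040x} to fetch {url}")
            key = h(str(key))    # the next hop is determined by hashing the previous key

    route("http://blocked.example/page")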
Danezis2011covert Covert Communications Despite Traffic Data Retention Not as general as its title makes it sound. They identify a global incremental IP ID counter on a third-party host as a kind of shared memory that Alice and Bob can use to communicate. The goal is to communicate through an adversary that logs application-layer payloads but not full packet captures. They give two examples of how to induce a host to send packets (thereby incrementing its IP ID): through echo requests and through TCP window manipulation. They give ways to deal with IP ID noise caused by other hosts. They make the general statement: "A covert communication system has to make use of unintended features of commonly used protocols, in a way that does not arise suspicion, in order to unobservably relay messages between two users." They claim that there are many more possibilities for such communications systems. It seems this was presented at a conference in 2008, but published in a collection in 2011. It must have been made public in 2006, because Luo2007counters already cites it. Luo2007counters Crafting Web Counters into Covert Channels Follows up on an idea suggested in Danezis2011covert, using web counters rather than IP IDs as a shared communication resource. They call their design WebShare. Sender and receiver share a synchronized clock. To send a 1, the sender increments the counter; to send a 0, the sender does not increment. Then the receiver checks the counter to see whether it has been incremented. They present a few ideas for dealing with noise from other web visitors and spreading load across multiple counters for more capacity and less detectability. They run experiments with counters on public web pages. They can't push a single counter much past 0.5 bit/s without starting to get a lot of errors. Zander2008covert Covert Channels in Multiplayer First Person Shooter Online Games Encode messages as tiny variations in player movement in Quake III Arena. (Rook later did something similar, though it does not have to be as specifically tailored to a particular game protocol.) The idea is that tiny perturbations are not visible to players, do not affect game play, and are not detectable through statistical analysis of game packets. The game sends absolute snapshots of e.g. player coordinates at 20 Hz; in between these it sends relative update packets at 100 Hz. The data are encoded in the update packets relative to the most recent snapshot. They avoid using the player's (x, y, z) coordinates and instead only encode data in the pitch and yaw angles, because the angles, unlike the coordinates, are completely player-controlled with a few exceptions (like teleporters). They tested it using the CCHEF framework and 10-minute matches. They achieve a transmission rate of 7–9 bit/s (when accounting for transmission errors). One difficulty is that not all update packets are transmitted to all other players, only to currently visible ones. Also, game packets are encrypted (obfuscated?), though the algorithm is known so they can have their external covert-channel proxy undo and redo the encryption. Based on an eyeballing of CDFs, there's no change in the distribution of packet lengths (despite the game's compression of packet contents) and no change in the distribution of angle deltas. Tan2011ictai A Covert Communication Method Based on User-Generated Content Sites Is based on the idea of storing data on web sites with user-generated content (like Collage does). Doesn't make a lot of sense. They seem to 1) split up a resource with a k-of-n secret sharing scheme, then 2) store the shares on various web sites determined by hashing. 
They make a hash ring of sites by hashing the URL. They generate a "Resource ID" (possibly a misnomer--seems like each share would need its own ID, not just the resource as a whole) by hashing a random seed and the current time epoch. The publisher finds the site (actually it is a two-step hash ring: site+search term) with the closest Hamming distance and stores the resource (again, possibly should be "share") in a file there using some kind of steganography (which they call "information hiding"). This is the part that doesn't make sense: the receiver is then supposed to do the same hashing operation, visit the determined sites, and reconstruct the shares. "Accordingly, the receiver generates a resource identifier as the publisher, and then follows the steps in 3.3 to get the location of the media bunkers to download them." But how does the receiver learn the resource ID? In Table 1, the only non-time input to GenResID is explicitly "a random seed." I guess the seed is transmitted out of band? They criticize Collage because it "requires a secretly shared resource identifier (message ID) between sender and receiver, and map the tasks (shared behaviors between sender and receiver) to each resource identifier ID," but I don't see how their Resource ID is any different. No mention of how the global list of site URLs (and search terms) is maintained or communicated to participants. They claim resistance to "traffic analysis" and "user repudiation," but by those terms they only mean that receivers only visit common web sites that have other purposes than circumvention. Their hierarchical DHT seems to be biased: the key for both steps is the Resource ID, so for a given site, it will never select search terms whose hash values are far from the hash of the site itself. McPherson2016a CovertCast: Using Live Streaming to Evade Internet Censorship The core idea is modulating data (web pages) into a series of still bitmaps, then broadcasting the bitmaps over a *live* video stream, such as those offered on YouTube or twitch.tv. The paper is interesting because of the different communication model it uses. It's not like a traditional proxy where a client requests a page and then the system somehow fetches and returns it. Rather, the operator of the live stream chooses a set of web sites to broadcast--say, bbc.co.uk, cnn.com, foxnews.com--then repeatedly crawls and sends the contents of those sites according to some schedule. The clients watching the stream receive the sequence of images, demodulate them, and store them in an offline cache for viewing. There is no way for a client to tell the stream to retrieve a different set of web sites--if you want something else, you need to tune in to another stream that offers it. The communication model reminds me of television: your choice is limited to what channel to watch. It also reminds me of Toosheh, which is basically the same idea, using satellite TV instead of online live streams. The idea is that censors can't block satellite broadcasts, so you tune in to the Toosheh channel, record the video, and then run a program to decode the video into a bunch of files. https://www.wired.com/2016/04/ingenious-way-iranians-using-satellite-tv-beam-banned-data/ 
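A toy flavor of the modulation step (cell size, layout, and the grayscale palette are invented for illustration; the real CovertCast adds framing metadata and error handling): each byte becomes a large block of identical pixels, so the value survives lossy video compression and can be read back by sampling one pixel per block.

    CELL = 8      # each byte becomes an 8x8 block of identical pixels
    WIDTH = 64    # cells per row

    def encode_frame(data: bytes):
        rows = []
        for start in range(0, len(data), WIDTH):
            chunk = data[start:start + WIDTH].ljust(WIDTH, b"\x00")
            cell_row = [b for b in chunk for _ in range(CELL)]   # widen each byte to CELL pixels
            rows.extend([cell_row] * CELL)                       # and repeat the row CELL times
        return rows   # a grayscale frame, as a list of pixel rows

    def decode_frame(rows):
        out = bytearray()
        for r in range(0, len(rows), CELL):
            row = rows[r]
            for c in range(0, len(row), CELL):
                out.append(row[c])        # sample the top-left pixel of each cell
        return bytes(out).rstrip(b"\x00")

    frame = encode_frame(b"<html>hello censored world</html>")
    print(decode_frame(frame))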
Ellard2015a Rebound: Decoy Routing on Asymmetric Routes Via Error Messages Rebound is an enhancement to decoy routing, from the same group that made CurveBall (Karlin2011). Like CurveBall, Cirripede, and TapDance, Rebound supports asymmetric routes, the decoy router only needing to see traffic in the client→decoy host direction. Its advantage over these other systems is that it introduces no routing inconsistencies in the downstream direction: the decoy router, rather than spoofing packets "from" the decoy host to the client, instead injects its downstream data into the client's upstream messages, which the decoy host then reflects back to the client, unwittingly, in the text of an error message. Their examples and implementation use HTTPS. The effect is to create a virtual tunnel between client and decoy router that piggybacks atop an HTTPS session between client and decoy host. For example, if C wants to send "upstream_data_for_DR" to DR, it will send a TLS ciphertext towards DH, knowing that it will pass through DR:
    C→DR "GET /upstream_data_for_DR"
DR, which knows the TLS session keys of the connection between C and DH, decrypts the ciphertext and adds "upstream_data_for_DR" to its input queue. The data might be a command to fetch a web page, for example. When DR wants to send data back to C, it intercepts the next message from C to DH and changes the message so it contains the downstream data instead of whatever the client sent. The downstream data might be the contents of a previous web page fetch, for example. DR then forwards the modified message on to DH:
    DR→DH "GET /downstream_data_for_C"
DH, which is an ordinary web server and has no match for the path "/downstream_data_for_C", reflects the path back to C in an error message:
    DH→C "404 no such file downstream_data_for_C..."
In reality, the messages on the virtual tunnel between the client and decoy router would have an additional layer of encryption inside the TLS; i.e., "GET /E(upstream_data_for_DR)", so that the decoy host does not see plaintext in the forwarded messages it receives. In order to decrypt and re-encrypt TLS messages, the decoy router needs to know the session keys. The decoy router knows the client random field because it lies on the upstream path. The premaster secret, in CurveBall, is generated deterministically from the client random and the client's key, so the decoy router knows that as well. The client must inform the decoy router of the two pieces it does not know: the server random and the ciphersuite ID. The client sends this information using stencil coding, a way of choosing plaintext so that certain bits in the ciphertext are set so as to carry a message. Stencil coding is reminiscent of the chosen-ciphertext steganography of TapDance--it is less efficient, but also works with TLSv1.0 (TapDance requires TLSv1.2 for its per-record IVs). The client needs to poll with "chaff requests" in order to receive downstream data; the only way the decoy router can send data back to the client is atop a client-generated request. The decoy host's web server logs presumably fill up with error messages (unlike TapDance, which doesn't actually finish requests). Another benefit of reflecting messages through the decoy host is that its network stack fingerprint, for instance its TCP characteristics, will be correct. Zarras2016a Leveraging Internet Services to Evade Censorship The main idea is dynamic rotation between multiple pluggable tunnels. The authors argue that using a single protocol for a tunnel is distinguishable because it will have a distinctive usage pattern; therefore one should divide traffic across multiple tunnels (though that too is a distinctive pattern?). 
They prototype this idea in a system called CAMOUFLAGE. CAMOUFLAGE multiplexes traffic across multiple tunnels, first encrypting it and then splitting it into packets for reassembly on the server side. The individual tunnels may re-encrypt their share of the data, or do other transformations, depending on the specifics of the protocol they use. The scheduling is done by an element called the "dispatcher" and the TCP-like reassembly layer is called the "connector." The prototype CAMOUFLAGE implements four tunnels:
1. SMTP (like SWEET)
2. Skype (like SkypeMorph or FreeWave)
3. the online game Runes of Magic (using the voice chat feature, so more like FreeWave than like Castle/Rook/Zander2008covert)
4. Dropbox (like CloudTransport)
Section 5.2 briefly describes an experiment to measure typical levels of use for these four cover protocols (but there are very few details about the experiment). The authors give recommended weights for balancing the use of each tunnel. The paper doesn't discuss how users find servers, or how they are protected from discovery by censors. It seems to be aimed at a run-your-own-server use case. Bocovich2016a Slitheen: Perfectly imitated decoy routing through traffic replacement Slitheen is a decoy routing system that emphasizes security over deployability, achieving truly indistinguishable traffic and latency patterns, at the cost of requiring symmetric routes and downstream flow blocking. It works by overwriting "leaf resources"--responses such as images that do not trigger additional requests--in an HTTPS conversation with an overt web site. The client carries on a stream of overt browsing (using a headless browser, for example); Slitheen parasitically rides on this "carrier channel." Slitheen sends upstream data by injecting an X-Slitheen header, deleting or compressing other headers to maintain the request's size. The relay station, running at a friendly ISP, MITMs the request, interprets and overwrites any X-Slitheen data, then re-encrypts the request to the overt destination. In the downstream direction, the relay station, by carefully tracking the TCP, HTTP, and TLS states, identifies leaf resources and overwrites their response bodies with downstream data. Because Slitheen's communication is driven by actual browsing of the overt site, and only those resources that don't affect traffic patterns are overwritten, Slitheen is undetectable from a traffic analysis or website fingerprinting point of view. The relay station never delays packets, which requires careful treatment of things like fragmentation and packet reordering--the relay station can't do its own reassembly because that would introduce detectable delays. There is also some added delay caused by the overhead of TLS MITM and state tracking, which the authors measure to be small. An attractive feature is that some censor attacks result only in denial of service, not detection of the covert channel. For instance, if the censor forces an asymmetric route, or a route without a relay station, communication with the overt site will be unaltered. The few attacks mentioned that work are censor collusion with the overt site, and censor-controlled trusted CA certificates--though there is a proposed TLS handshake tweak that could detect the latter, before giving up the game by sending an X-Slitheen header. 
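A toy sketch of the upstream trick (the choice of Accept-Language as the header to shrink, and the padding scheme, are assumptions for illustration; the real client deletes or compresses actual headers): covert bytes go into an X-Slitheen header while another header is re-padded so the rewritten request has exactly the original length, keeping packet sizes unchanged.

    def embed_covert(request: bytes, covert: bytes, cover=b"Accept-Language") -> bytes:
        head, _, body = request.partition(b"\r\n\r\n")
        lines = [l for l in head.split(b"\r\n") if not l.startswith(cover + b": ")]
        lines.insert(1, b"X-Slitheen: " + covert)
        rebuilt = b"\r\n".join(lines) + b"\r\n\r\n" + body
        deficit = len(request) - len(rebuilt)   # bytes freed by dropping the cover header
        if deficit < len(cover) + 4:
            raise ValueError("not enough header room for this much covert data")
        # Put the cover header back, padded so the total length matches exactly.
        lines.insert(2, cover + b": " + b"x" * (deficit - len(cover) - 4))
        rebuilt = b"\r\n".join(lines) + b"\r\n\r\n" + body
        assert len(rebuilt) == len(request)
        return rebuilt

    req = (b"GET / HTTP/1.1\r\nHost: example.com\r\n"
           b"Accept-Language: en-US,en;q=0.9,de;q=0.8,fr;q=0.7\r\n\r\n")
    print(embed_covert(req, b"deadbeef"))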
Dornseif2003a Government mandated blocking of foreign Web content Studies the blocking of web content in the state of North-Rhine-Westphalia in Germany between 2001 and 2003. The government ordered ISPs to block two Nazi web sites in February 2002; some ISPs had begun blocking earlier than that, after a hearing in October 2001. Begins with a summary of blocking techniques: packet filtering (at the IP or TCP layers), DNS tampering (which was then new), and filtering application-layer proxies. Then covers circumvention techniques: mirroring, DNS aliases, changing IP addresses, changing ports, HTML proxies (like translation services), search engine caches and archives, IP-in-IP tunneling (what we would call a VPN today), third-party proxies, dial-up connection to another country, encryption (DNSSEC and HTTPS), and direct access by IP address without DNS. Mentions the possibility of overblocking (due to virtual hosting, etc.). Discusses the difficulty of interpreting the precise meaning of the legal order and of complying with it on a technical level. The paper concludes with a technical survey of DNS tampering by ISPs in North-Rhine-Westphalia, conducted in May 2003. The author enumerated potential ISPs in the state (including e.g. universities) and queried their recursive DNS servers for the presumed-blocked domains as well as some controls. They found varied blocking techniques: NXDOMAIN, CNAME pointing to a different domain, or A records pointing to bogus IP addresses. Some treat the www subdomain differently from other subdomains. There are varied side effects, depending on the type of DNS tampering, which may include breaking mail delivery or redirecting HTTP traffic to localhost. Zittrain2003a Internet Filtering in China Berkman Center study that covers Internet filtering in China. They conducted various tests between August and November 2002 using open proxies. (They mention using dial-up connections, rather than open proxies, from March through May, but don't say anything about those results.) Constructed a test list of 200K URLs from Yahoo categories, Google searches for specific terms, and user-submitted blocking reports. Of those 200K, 19K were blocked. 101/752 (13.4%) porn sites were blocked; commercial filters blocked 70% to 90%, so China probably does not use a commercial filter but has its own blacklist. Blocked sites fell into the categories "Dissident/Democracy", "Health", "Education", "News", "Government", "Taiwan and Tibet", "Entertainment", and "Religion". The block list is maintained and changes over time. They found five methods of blocking:
* web server IP address (the most common method)
* DNS server IP address (might be a side effect of other IP blocking)
* DNS redirection (DNS poisoning)
These started only after September 2002:
* URL keyword filtering, resulting in blockage of 5 to 30 minutes--URL escaping works to circumvent
* HTTP response filtering (resulting in truncated HTML)
There's an alternate version at https://cyber.harvard.edu/filtering/china/. Knockel2017a Measuring Decentralization of Chinese Keyword Censorship via Mobile Games Extracted client-side keyword blacklists from 200+ mobile games downloaded from Hi Market. As in previous studies of chat and social apps (e.g. Knockel2015a), there is little similarity between the lists of different games. Using a Mantel test, they found that the best predictor of keyword similarity is having the same developer. Most keywords (51%) were in the "Social" category; the next closest was "Technology" at 17%; there were also personal names and various homoglyphs/homonyms. Text escaping anomalies hint at the provenance of lists. Though game companies have to submit keyword blacklists (along with e.g. game dialog) for approval, the results suggest that enforcement of keyword censorship is more fragmented than centralized.
Tanash2017a The Decline of Social Media Censorship and the Rise of Self-Censorship after the 2016 Failed Turkish Coup Analyzes tweets from Turkey before and after the coup attempt of 2016-07-15. The "before" set is 5.6M tweets (513K censored) from 24 days during the Turkish general election of July 2015. The "after" set is 8.5M tweets (142K censored) from 75 days following the coup attempt. Collection methodology uses the Twitter public API, similar to Tanash2015a; they say they got a nearly complete set of Turkish tweets. There was a decline in government-censored tweets, which the authors attribute to an increase in self-censorship by users. They find that 18% of Turkish tweets were voluntarily removed, whether by deleting individual tweets, deleting an account, or making an account protected--but they don't provide a basis of comparison to know whether that is an unusually large fraction. They use topic analysis and find a new topic of censorship, the Gülen Movement, with some users perhaps even posting spurious anti-Gülen tweets to avoid suspicion. Section 4.1 says there was a decline in Tor users after the coup attempt, though I don't see it. As in Tanash2015a, they found more censored tweets than are reported in Twitter's transparency report. Jermyn2017a Autosonda: Discovering Rules and Triggers of Censorship Devices Autosonda is a tool for the automated discovery of firewall rule sets and vulnerabilities, treating the firewall as a black box. It automates previously used manual techniques. Autosonda aims to discover the firewall's "model," "mechanism," and "technique." Model is the set of feature values that the firewall uses for detection, for example protocol field values. Mechanism is how the features are extracted or identified, for example using a regular expression. Technique is how the blocking is effected, for example a block page or injected RST. The system relies on running client software inside the censor's sphere of influence, in cooperation with a server running outside. Some of the client tests require root. It has automatic fuzzing of some features, for example replacing "GET /" with "GeT /" or "GET/". They tried it on 76 open wi-fi networks around New York City. They used a fetch of xvideos.com (literally: "the number one most popular Adult category site in Alexa's top 500 sites by category") as a prefilter to decide whether a network was censored or not. Regarding model, they found firewalls that examined DNS requests, some that examined the HTTP Host header, and some that resolved the domain name of the Host header and then matched against an IP address blacklist. Regarding technique, some firewalls sent a false DNS response pointing to a block page, and some returned a block page in band. Regarding mechanism, they tried sending HTTP over UDP, which none of the firewalls blocked. They tried using ports other than 80; most but not all firewalls still blocked. It was possible to evade URL blocks by appending a query string. They tried sending two Host headers at once; roughly half of firewalls looked at the first, the other roughly half looked at the second, and a small number looked at both. 11/44 HTTP filters did not reassemble TCP; 7 did not defragment IP. Changing capitalization in DNS requests did not evade filters, but sometimes changing .com to .org did (when those happened to point to the same site). A majority of firewalls are fooled by altering the whitespace around the Host header, but not by altering capitalization. The authors say that they found at least one firewall bypass for 100% of the filters, but that statement needs to be qualified with the fact that some of the bypasses require cooperation by the server. The authors acknowledge this, saying "with special implementation of a server," and "note that actual retrieval of prohibited content depends on the implementation of the server." For example, requesting HTTP over UDP, or moving the host from the Host header to another header, won't work with standard HTTP servers. 
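An illustrative generator for the kind of request mutations Autosonda automates (the specific variants and header layout here are assumptions, and, as noted above, several only retrieve real content if the server cooperates):

    def request_variants(host, path="/"):
        base = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
        yield base
        yield base.replace("GET ", "GeT ")                                # method capitalization
        yield f"GET {path} HTTP/1.1\r\nHost:   {host}  \r\n\r\n"          # extra whitespace around Host
        yield f"GET {path}?x=1 HTTP/1.1\r\nHost: {host}\r\n\r\n"          # appended query string
        yield (f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"
               f"Host: innocuous.example\r\n\r\n")                        # duplicate Host headers

    for variant in request_variants("blocked.example"):
        print(repr(variant))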
Frolov2017a An ISP-Scale Deployment of TapDance Describes a one-week deployment of TapDance at real ISPs with real users (the first ever such deployment). They installed four TapDance stations, three at a medium-sized regional ISP (2 × 40 Gbps, 1 × 10 Gbps) and one at CU Boulder (1 × 10 Gbps). Installing the stations at the ISPs turned out to be technically easy: the only access the stations needed on the ISP's network was a mirrored traffic port and the ability to inject new packets. The implementation didn't require any specialized hardware. Stations had to fit into 1U space; the most capable of them had an 8-core Xeon, 64 GB of RAM, and a 10 GbE network adapter. Users came from a partnership with Psiphon, which distributed a TapDance client to a fraction of their users, over 50,000 unique users over the week. There were around 5500 sites behind the two ISPs, but >34% had to be excluded because they lacked required network properties, like minimum TCP window sizes and HTTP timeouts. They attempted to limit load on overt destinations by only distributing half of them at a time, having the station impose limits, and enabling a robots.txt opt-out (which no sites took advantage of). Introduces the term "refraction networking" as a replacement for "decoy routing," to avoid confusion with the specific implementation of Karlin et al. 2011, also known as CurveBall. The station implementation is mostly in Rust. It uses C for better performance in Elligator, to link with OpenSSL, and to implement a Linux kernel module. The client is written in Go and linked with Psiphon's usual client. Code is available at https://refraction.network/. Nekrasov2017a Limits to Internet Freedoms: Being Heard in an Increasingly Authoritarian World Interviews with 110 people in Zambia, Mongolia, and Turkey in 2014–2016: activists, press, minorities, watchdogs, and unaffiliated citizens. Identified adversaries to Internet freedom are government, corporations, and communities. Investigates limitations of existing tools. There was fair awareness of proxies and VPNs. Inadequate localization was a problem. Groups that practice anonymity in order to protect themselves also have greater difficulty establishing reputation and trust. Ling2012a Extensive Analysis and Large-Scale Empirical Evaluation of Tor Bridge Discovery Derives models for the discovery of Tor bridges by (1) exhausting the https and smtp bridge distribution channels, and (2) running middle nodes. (These are ways 1 and 2 from tor-techreport-2011-10-002, which was published shortly afterward and cites this work.) Over 14 days, they discovered 2,365 bridges by querying the distributor and 2,369 by running middle nodes (they don't say how much overlap there was between the two sets). For their model, they partially reverse-engineered Tor path selection and give a detailed description of the algorithm. Their models predict actual performance well. 
They needed a large number of IP addresses, to (1) evade Yahoo's restrictions on creating multiple email accounts, and (2) evade BridgeDB's bucketing by source IP address. For this, they used 500 PlanetLab nodes and 500+ Tor exits. Darer2017a FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs FilteredWeb is a system for discovering blocked URLs using search engines. Starting from a seed list of known-blocked URLs, it extracts text phrases ("tags") that commonly appear in the blocked URLs, searches for those terms, and then tests each page in the result of the search for blocking. They test blocking by looking for DNS injection from outside the firewall. Then those new pages have their tags extracted, and the process repeats. It's emphasized that they are looking for blocked URLs, not blocked keywords: the goal is not to find the keywords per se (as in Crandall2007a); the text tags are just a means of efficiently finding new blocked URLs. Through this technique they found 1,355 blocked domains in China, most of which are not in the Alexa top 1,000 or the Citizen Lab test list. They also report a large number of discovered blocked URLs (many URLs → one domain), but that doesn't seem as relevant because a large fraction are of the form facebook.com/...; i.e., a bunch of pages on domains already known to be blocked. Some sample tags (Fig. 6) are: twitter, facebook, youtube, android, ios, tweet, tweets, linkedin, trump, uyghur, iphone, gmail, instagram, openvpn, storify, insign, followers, brasil, bahasa, censorship, intentionality, dmca, pptp, blurred, unblock, polski, tor, ipad, hindi, tibet, abuses, pinterest, anonymously. Jones2015a Ethical Concerns for Censorship Measurement Identifies three methods for censorship measurement: have researchers travel to a country of interest, deploy software to people already living there, or co-opt existing systems. Goes into detail on the third method, based on experience from Spooky Scan (later Pearce2017a) and Encore (Burnett2015a) and interaction with IRBs. Lists challenges: IRBs are not used to dealing with research of this sort, which is not really human subjects research but does involve risks to people; actual risk is difficult to gauge; impossible to predict how censors will react; hard to get consent. Lists further unanswered ethical questions. Wright2011a Fine-Grained Censorship Mapping: Information Sources, Legality and Ethics Considers possibilities for testing censorship at a finer granularity than a country. For various reasons (including organizational and administrative limitations), filtering may not be uniform across a country but may vary by e.g. ISP. Ideas for gaining multiple measurement vantages within a country include Tor and VPNs (but those are less likely to work in the highly filtered locations where we would most want them). Some other ideas are asking users to run measurement tools, indirect channels (remote testing of DNS poisoning), open proxies, (mis)using existing protocols like BitTorrent that allow limited control over a remote computer, and (throwing ethics to the wind) botnets. All these bring up questions of ethics and legality. Misusing a protocol or attempting to access a blocked site may be illegal--but even if it is, it's very unclear whether it will actually result in any harm. The scale of Internet censorship means that the censor cannot possibly apprehend everyone trying to circumvent--but they could, in targeted cases. 
Wright2011a advocates for informed consent by volunteers who run measurement tools. Crandall2015a Forgive Us our SYNs: Technical and Ethical Considerations for Measuring Internet Filtering Considers technicalities and difficulties of two methods of measuring censorship: client-side and side channel–based. Everything is held to the light of ethical research, even though ethical frameworks for network security measurements are not yet well established. Users who participate in client-side measurements should have informed consent, even though the actual risks to them are highly context-specific and potentially unknown. Consent is impossible for side-channel measurements, but still we try to minimize risk. The treatment of client-side methods is informed by the experiences of ONI and ICLab. ONI employed an in-person informed consent meeting with each volunteer, and deemed some locations too risky to run any measurements at all. ICLab is developing a context-sensitive informed consent procedure (where context is based on e.g. Freedom on the Net scores). The risks of side-channel measurements include, for example, DoS and the perception that a third party is directing someone's computer to access some resource, when really it was a researcher doing it. Poses some open questions about safe side-channel measurement: should the IP address of the researcher's computer appear in logs of both client and server (because you can e.g. host a page there explaining the research)? How much risk of DoS is too much? What kind of machine should the client be (e.g. someone's home computer)? Covers ways of mitigating risk, which include not directly using leaf nodes for measurement, but rather doing a traceroute and stepping two hops back; and scanning from entire /24s at a time to avoid implicating any single user. Mahdian2010a Fighting Censorship with Algorithms Looks at bridge distribution from an algorithmic point of view. The model is that there are $k$ adversaries, $n-k$ honest users, $m$ bridges; the goal is that at the end, all honest users should have at least one unblocked bridge. Considers three scenarios: static one-shot, dynamic multiple-round, and trust networks where users can invite other users. In the static one-shot model, the problem reduces to a problem in secure overlay network design (http://www.cs.columbia.edu/~lierranli/publications/AAIM06.pdf) and the algorithm they offer is to give each bridge to each user with independent probability $1/(k+1)$. In the dynamic multiple-round model, the simpler version of the algorithm is to divide the users into $n/k$ groups and give each group a unique bridge. When a bridge is compromised, divide its user group in half and give each half a new unique bridge. A more complicated version of the dynamic algorithm (that gives better bounds on the number of bridges required) reuses uncompromised bridges when probing to find the adversaries. In the trust network model (where adversaries can invite other adversaries), they leave the problem mostly open but show a solution for the case $k=1$. Not really practical: wants knowledge of $n$ and $k$, assumes bridges are always online, set of users doesn't change much. Anderson1996a The Eternity Service Sketches a design for a high-availability file storage system using redundant copies. Asserts that most computer science research on security is related to confidentiality or integrity; comparatively little is on availability.
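Back to Mahdian2010a: a toy version of the simpler dynamic multiple-round algorithm (divide users into n/k groups, one unique bridge per group; when a group's bridge is blocked, split the group in half and give each half a fresh bridge). The bookkeeping details here are mine, and it assumes we learn after each round which bridges were blocked.

    import itertools

    _fresh = itertools.count()

    def new_bridge():
        return "bridge-%d" % next(_fresh)       # stand-in for handing out a real bridge address

    def initial_groups(users, k):
        # n/k groups of about k users each, each group getting one unique bridge.
        return [(users[i:i + k], new_bridge()) for i in range(0, len(users), k)]

    def handle_round(groups, blocked_bridges):
        next_groups = []
        for group, bridge in groups:
            if bridge not in blocked_bridges:
                next_groups.append((group, bridge))     # an unblocked group keeps its bridge
            elif len(group) == 1:
                # In the model, a singleton group whose bridge keeps getting blocked
                # must contain an adversary; stop serving it.
                continue
            else:
                mid = len(group) // 2                   # split the blocked group in half;
                next_groups.append((group[:mid], new_bridge()))   # each half gets a fresh bridge
                next_groups.append((group[mid:], new_bridge()))
        return next_groups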
Web publishing means that knowledge is often stored on just a few servers subject to coercion, making it easier to suppress ideas than in the past. Envisions a paid service where you pay a fee for some time period of guaranteed storage. Data can be replicated across multiple servers in different legal jurisdictions, so as to be robust against coordinated coercion and accidental errors. Considers relative advantages of tamper-resistant hardware versus mathematical fault-tolerance as in Rampart. Suggests a "perjury trap" legal hack where, on logging in, an administrator is prompted to affirm under oath that they are not under duress. Important questions towards making something like the Eternity Service possible are secure timekeeping, anonymity, traffic analysis, and resilience of distributed databases. Heydari2017a Scalable Anti-Censorship Framework Using Moving Target Defense for Web Servers Applies Mobile IPv6 (RFC 3775) for the purpose of anti-censorship. The server acts as if it were a mobile node with multiple changing addresses. The idea is that Mobile IPv6 is like a proxying service: a node can register one or more "care-of addresses" that stand in for its long-term "home address". A special extension header (type 2 routing header) carries the long-term home address when corresponding with the care-of address. They do some analysis based on a model with some fraction of the users being censors in disguise. Users are partitioned into separate access groups, which are shuffled periodically (they seem to envision a shuffling period on the order of minutes). If a censor lands in an access group, it can block that one care-of address during that time interval. I wasn't quite sure what to take from the analysis—it seems that a censor controlling a small fraction of users can block a large fraction of access groups. They suggest captchas or similar as an anti-sybil defense, and a credit system to assign users whose groups tend not to get blocked to the same group. They outline a divide-and-conquer strategy to isolate censors within groups, reminiscent of Mahdian2010a. (It wasn't clear what the identity of a user is for the purpose of credit accounting—its IP address?) A big problem is that Mobile IPv6 packets have Destination Options and Type 2 Routing headers that contain the long-term home address of the server, which the censor could look for and block (ignoring the care-of address that appears as the destination address). They propose to work around this with kernel modifications (they say it only requires changes on the server, but Fig. 4 also shows client-to-server mods) that remove the problematic headers and hide the necessary information under IPsec. The other big problem is that the care-of addresses are selected from the same /64 prefix as the home address. If the censor blocks the /64, it's game over. (Section IV: "64 bit addresses are created at random and combined with the home link prefix to generate the new CoAs.") For this they have vague suggestions of putting high–collateral damage servers within the same prefix. The term "moving target defense" comes from DDoS literature. They have a prototype implementation. Cronin2005a The Eavesdropper's Dilemma Discusses challenges in the interception of network traffic: fundamentally, the eavesdropper trades sensitivity against selectivity. A lack of end-to-end security on a channel is necessary for interception, but not sufficient.
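(Back to Heydari2017a.) A rough sketch of the access-group idea: every epoch, reshuffle users into groups, give each group one care-of address, and penalize members of groups whose address gets blocked. The shuffling and credit details here are illustrative, not the paper's exact scheme.

    import random

    def assign_groups(users, coas, epoch):
        # One care-of address (CoA) per access group, reshuffled every epoch.
        rng = random.Random(epoch)           # deterministic shuffle per epoch
        shuffled = list(users)
        rng.shuffle(shuffled)
        n = len(coas)
        return {coas[i]: shuffled[i::n] for i in range(n)}

    def update_credits(credits, assignment, blocked_coas):
        # Users repeatedly found in blocked groups accumulate suspicion,
        # which a divide-and-conquer strategy can use to isolate censors.
        for coa, members in assignment.items():
            if coa in blocked_coas:
                for user in members:
                    credits[user] = credits.get(user, 0) - 1
        return credits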
Continuing with Cronin2005a: a variety of factors, digital equivalents of analog noise, may prevent the eavesdropper from recovering an intelligible message, or recovering any message at all. Goes into detail on countermeasures evasion and confusion. Evasion: the eavesdropper fails to record part of a stream. Confusion: the eavesdropper records too much, including noise that would be discarded by the endpoint, resulting in multiple possible interpretations. The potential for these countermeasures hinges on a difference between the sensitivity/selectivity of the eavesdropper and that of the receiver: If the eavesdropper's sensitivity > receiver's: use confusion; if the receiver's sensitivity > eavesdropper's: use evasion. Examples of evasion are putting information in overlapping fragments: if the eavesdropper records only at the transport layer and not the network layer, it will lose the overlapped information. Another example of evasion is sending on a physical link at a level (e.g. voltage level) that the receiver detects but not the eavesdropper. Examples of confusion are sending chaff with an invalid destination MAC, TTLs that cause the packet to die before reaching the receiver, invalid checksums. Tests confusion attacks against existing NIDSes. Their "evasion" is the same as the "evasion" of Ptacek1998a. They say that their "confusion" is different from the "insertion" of Ptacek1998a (Section 3.2), but I don't really see the difference. morphing09 Traffic Morphing: An efficient defense against statistical traffic analysis Traffic morphing is an algorithm for frustrating traffic analysis: it makes an input distribution of packet sizes conform to an output distribution, by padding and optionally splitting packets. It has lower overhead than, for example, padding every packet to a constant size. It doesn't do anything for interpacket timing: one input packet corresponds to one output packet (or, in the case of shortening, a burst of short packets) with minimal delay. Morphing works by computing an intermediate matrix that converts one probability distribution (of packet sizes) to another. It's therefore specific to the input protocol: if the actual input traffic doesn't match the expected input distribution, the output distribution will not be as expected. Each column of the matrix is a probability distribution on output sizes, given an input size. In the case where the computed output size is less than the input size, first a short packet with the desired output size is split off, then the rest of the input packet is split up by directly sampling the output distribution (without using the matrix). Section 3.3 also discusses constructing a matrix such that packets are never shortened. Convex optimization can efficiently compute the matrix and minimize byte overhead. Naively, the process only handles 1-grams, but Section 3.4 is about coping with large input spaces, such as the n² space created by bigrams. The flaw, as I see it, is that there is still a correspondence between input packet and output packets. Whenever the input is idle, the output will be idle as well. You could sample directly from the desired output distribution, independent of the input packet sizes, and not need to compute a matrix. I suppose such a design would have lower efficiency. Winter2013b ScrambleSuit: A Polymorphic Network Protocol to Circumvent Censorship ScrambleSuit is an obfuscation protocol that aims to resist active probing, DPI, and traffic analysis.
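Back to morphing09: a minimal sketch of applying a precomputed morphing matrix, where matrix[in_size] plays the role of one column (a distribution over output sizes given the input size). The matrix and marginal here are placeholders (the paper computes the matrix with convex optimization); the splitting rule follows the description above.

    import random

    def sample(dist):
        # dist: {size: probability}; draw one size from it.
        r, acc = random.random(), 0.0
        for size, p in dist.items():
            acc += p
            if r <= acc:
                return size
        return max(dist)                     # guard against floating-point slack

    def morph_packet(in_size, matrix, marginal):
        out_size = sample(matrix[in_size])   # column of the matrix for this input size
        if out_size >= in_size:
            return [out_size]                # pad the packet up to out_size
        sizes, remaining = [out_size], in_size - out_size
        while remaining > 0:                 # split the remainder, sampling the target
            s = sample(marginal)             # output distribution directly
            sizes.append(s)                  # (the last piece is padded up to s if needed)
            remaining -= s
        return sizes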
ScrambleSuit resists active probing with per-user (or per-server) secrets: a client that connects without being able to show knowledge of the secret will not get a response from the server. (The client authentication is an extension of the UniformDH used by obfs3.) It resists DPI by encrypting everything, including the key-exchange handshake, so it looks like a random stream. It resists traffic analysis by shaping packet sizes and delays according to a probability distribution that depends on a server secret. Therefore to the extent that ScrambleSuit has a traffic analysis signature, each server has a different signature, so blocking one does not result in a block of the others. After an initial UniformDH handshake, later authentications are done using session tickets. There are slightly different designs for session tickets to accommodate different underlying protocols. (Tor's bridge distribution design makes it difficult to have per-client secrets.) Discusses the difference between mimicry protocols (effective against whitelisting) and look-like-random protocols (effective only against blacklisting). The authors judge that whitelisting results in unacceptable false positives, and that looking like random increases the false-positive rate sufficiently to make it expensive to block. They considered client-side puzzles (proof of work) as an alternative to out-of-band secrets, but the current rate of churn in Tor bridges would give too large an advantage to the censor in completing puzzles. Evaluation is difficult because there is no specific protocol they are mimicking; therefore they instead measure deviation from vanilla Tor. Pfitzmann2010a A terminology for talking about privacy by data minimization: Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management Defines terms used in the field of privacy by data minimization. For "privacy" they use the definition of Alan Westin: "the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others." "Data minimization" is a way to support privacy: minimize data collection and how long it is stored. Despite dealing with the terms "undetectability" and "unobservability," it's not very relevant to censorship. They tend to define terms relative to the perspective of an attacker. Changes in quantities named by terms are described using "deltas," which are the difference between the attacker's a priori and a posteriori perspective. E.g. "anonymity delta," "unlinkability delta," etc. Anonymity is defined in terms of an anonymity set: a subject is not identifiable within a set (or equivalently(?), not distinguishable from the other members of the set). They present a revised definition which says only that subjects cannot be "sufficiently" identified, relative to some attacker. They have a funny, circular way of defining what an anonymity set is, though: it's the set of all *possible* subjects, where "possible" is some subset of the universe of subjects, chosen by an attacker according to its knowledge of who might be in an anonymity set; e.g., the subjects capable of receiving messages. Unlinkability is also defined in terms of distinguishing, relative to an attacker. The attacker cannot distinguish two related items from two unrelated items. Unlinkability is also relative to an "unlinkability set," for example the set of senders.
Anonymity can then be defined in terms of unlinkability: for example, sender anonymity means that messages are unlinkable within the set of senders. (But isn't there a problem in the case of only one message, when there's nothing to link?) Undetectable means that an attacker cannot distinguish whether an item under question exists or not. (They first explain this by saying that messages should not be distinguishable from "random noise"—it's not clear whether by "random noise" they mean an idle wire, or some actual signal that happens to look random to an attacker. In a footnote, a "slightly more precise" formulation says that a message should be indistinguishable from no message, which seems to be a qualitative difference from the other example.) Unobservability is undetectability plus anonymity. The item has to be undetectable to an outside attacker, and anonymous among the participants (to whom it is necessarily not undetectable). Pseudonymity is using identifiers other than "real names." It only describes a mechanism, and does not necessarily imply anonymity (footnote 64). Despite trying to outline a formalism, it's not very clear. The terminology is obscure, with variants that confuse rather than clarify. For example, they never give a single simple definition of "anonymity" that is independent of the footnotes and hedges which follow it. They contrast "individual anonymity" and "global anonymity" somehow--global anonymity seems to be about how uniform the probability of subjects within the anonymity set is (as in information entropy). However, according to them, a subject only exits the anonymity set when its probability equals zero (footnote 27), which they don't tie back to their earlier clarification about "sufficiently." The document fails as an operational reference. The table in Section 14 is a decent summary. The document is 98 pages long, but only 35 pages are the real content: the remainder is appendices and translations of terms into various languages. Houmansadr2013b The Parrot is Dead: Observing Unobservable Network Communications Investigates "parrot" circumvention systems that work by imitating other protocols, and finds many flaws and implementation inconsistencies that make them detectable even by a local censor, concluding that "unobservability by imitation" is a fatally flawed strategy. It is just too difficult to imitate a cover protocol in every detail, including error behavior and interdependencies with other protocols. A potential solution is to use genuine tunneling rather than imitation. Takes as examples three real systems: SkypeMorph (Moghaddam2012a), StegoTorus (Weinberg2012a), CensorSpoofer (Houmansadr2011a). Analyzes each alongside what it is trying to imitate (Skype, Ekiga, etc.). Divides attacks into "passive," "active," and "proactive" (active but outside of the context of a specific connection). Divides adversaries according to their knowledge: local adversary (LO), state-level oblivious adversary (OB), state-level omniscient adversary (OM). Analyzes the stated threat models of previous research thus: * SkypeMorph: OM * StegoTorus: OB * CensorSpoofer: OM In some cases the attacks they find are simpler than more advanced attacks that the systems actually defend against. (Keep in mind that some of them were later found impractical by Wang2015a.) Some of the detectable features they find are: * Missing Skype TCP channel. * Missing Skype plaintext metadata. * Active-probing for Skype supernodes. * Missing xref tables in StegoTorus PDF.
* StegoTorus HTTP server fingerprint. * Replacing CensorSpoofer SIP tags. Uses the term "unobservability" as its blocking-resistance criterion, but does not define the term, except to say that the meaning is not the same as in Pfitzmann2010a (Section X). Geddes2013a Cover Your ACKs: Pitfalls of Covert Channel Censorship Circumvention Houmansadr2013b showed many failures to completely emulate a cover protocol. This paper shows that even with perfect channel emulation, mismatches between the cover protocol and proxy protocol make the circumvention detectable and blockable. Identifies three classes of such mismatches: architectural mismatch (e.g. peer-to-peer ≠ client–server), channel mismatch (e.g. unreliable UDP ≠ reliable TCP), and content mismatch (e.g. different packet lengths). Examines in detail three systems: SkypeMorph (Moghaddam2012a), FreeWave (Houmansadr2013a), CensorSpoofer (Houmansadr2011a). Tested the systems through a manipulated network (e.g. simulating packet loss) and invented new attacks. Examples of attacks are: inducing high error rates at the beginning, which kills FreeWave; dropping SkypeMorph's semi-predictable ACK packets; replaying SkypeMorph ACKs to roll back the receive buffer; dropping a few consecutive packets for CensorSpoofer; many-to-one traffic patterns of Tor bridges; long connection duration of Tor bridges; tracking the standard deviation of FreeWave packet lengths. One conclusion is that peer-to-peer channels are a bad choice for the cover channel of a client–server system like Tor pluggable transports. Defines "unobservability": a censor is unable to tell whether or not a client is participating in the system, and "unblockability": a censor cannot block access to the system without also blocking access to the entire Internet or a popular service (collateral damage). Wang2015a Seeing through Network-Protocol Obfuscation Analyzes real circumvention protocols (obfs3, obfs4, FTE, meek-amazon, meek-google) using real traffic traces (≈1 TB from a campus network, 30,000 from a lab setting), with an emphasis on practical attacks with low false-positive rates. They find attacks that in all cases have TPR close to 1.0 and FPR close to 0.0. The attacks require little in terms of state or computation. They also evaluate some semantics-based attacks proposed in Houmansadr2013b, and find that they have prohibitively many false positives or would be easy to counter. They divide circumvention protocols into randomizing (e.g. obfs*), mimicking (e.g. FTE), and tunneling (meek). They classify attacks as semantics-based, entropy-based, or machine learning–based. Their classifier for obfs3 and obfs4 uses a minimum entropy threshold and a length check over the first packet only. Their classifier for FTE uses a minimum entropy threshold and a URL length check over the first packet. Their classifier for meek uses a decision tree over the first few packets using entropy, timing, and packet-header features. The ML classifier's TPR suffers a lot when tested in a different environment than it was trained on (Table 8). FPR doesn't suffer as much, but can be up to 12%. meek's different packet sizes unexpectedly lead to a difference in min, avg, max entropy, even though it uses HTTPS like some of the negative samples. Wang2012a CensorSpoofer: Asymmetric Communication using IP Spoofing for Censorship-Resistant Web Browsing CensorSpoofer is a circumvention design that uses a covert, unblockable upstream (similar requirements as flash proxy rendezvous) and an IP-spoofed downstream.
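Back to Wang2015a: a minimal sketch of the kind of first-packet check described above (Shannon entropy plus a length window). The thresholds below are placeholders, not the paper's tuned values.

    import math
    from collections import Counter

    def shannon_entropy(data):
        # Entropy in bits per byte of a payload.
        if not data:
            return 0.0
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

    def looks_obfuscated(first_payload,
                         min_entropy=7.0,            # placeholder threshold (bits/byte)
                         length_range=(100, 600)):   # placeholder first-packet length window
        lo, hi = length_range
        return lo <= len(first_payload) <= hi and shannon_entropy(first_payload) >= min_entropy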
CensorSpoofer's distinguishing characteristics are high resistance to insider attacks (works even if the censor runs the client software) and easier deployment than decoy routing. Clients send their desired URLs upstream through a steganographic email or IM channel. Downstream data comes through a simulated VoIP session whose source address is spoofed to appear to come from a dummy host. The client must carry on a fake VoIP conversation (sending meaningless upstream UDP packets) with the dummy host. The client never even knows the actual address of the proxy. The asymmetric design is feasible because web browsing is more download than upload. Use port scanning to find plausible dummy hosts; a test of 10,000 random IP addresses outside China found 12% of them suitable. Most of those also have compatible traceroutes. You need to protect the IM/email destination address (they don't assume an encrypted channel). For that, they have an invitation system whereby each user gets its own unique address to send its upstream requests to. Built and tested a prototype. Martin2002a Deanonymizing Users of the SafeWeb Anonymizing Service Looks at SafeWeb, an anonymizing HTML-rewriting proxy that was apparently really hot during its heyday, 2001–2002. It attempts to achieve client anonymity by proxying all requests through a SafeWeb HTTPS server. It turns out that easy, obvious JavaScript attacks completely break its anonymity and allow requests to leak out to other origins. The flaws can't be fixed without sacrificing some of SafeWeb's design requirements, such as anonymity or fidelity. SafeWeb's client web form rewrites URLs like "http://example.com/" into something like "https://www.safeweb.com/o/_o(410):_win(1):_i:http://example.com/". The www.safeweb.com server fetches the page and tries to sanitize it by similarly rewriting all contained URLs. It tries to sanitize JavaScript while preserving its functionality, which is obviously impossible from a theoretical point of view. A few of the flaws are: consolidation of all cookies under the safeweb.com origin (any site can steal cookies from any other site); using substring matching to compare origins; ActiveX, Java, and PDF leaks. Total disaster. Got a passing grade from the CIA. Briefly mentions TriangleBoy (SafeWeb's blocking-resistance system based on encryption and IP spoofing), but doesn't analyze it separately. Koepsell2004a How to Achieve Blocking Resistance for Existing Systems Enabling Anonymous Web Surfing Lays out models and considerations for blocking-resistant communication systems. Covers a lot of ground: threat model, classification of blocking techniques, system design requirements. Really anticipated a lot of later developments: centralized databases of volunteer bridges, proxy distribution strategies, centralization of services behind gateways (e.g. CDNs), active probing, cryptographic puzzles and captchas, poisoned IP lists, reverse proxy connections (proxy connects to client), "all or nothing" blocking. Influenced Tor's bridge design (tor-techreport-2006-11-001). Its threat model is partially borrowed from Peekabooty: * Censor blocks some traffic, not all. * There exists some low-bandwidth unblockable inward channel. * There are willing volunteers outside the censor's network. * Censored users have differing computational abilities. * Censor has "huge amounts" of resources, including human. * Censor controls all network links to the outside. * Censor doesn't control outside. * Censor knows the design of the circumvention system.
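(Back to Martin2002a.) A toy illustration of the substring-matching flaw listed above: a check that only asks whether a trusted string appears anywhere in the URL is trivially satisfied by attacker-controlled URLs. The exact check SafeWeb used isn't reproduced in these notes; this just shows the general failure mode.

    def same_origin_substring(url, trusted="www.safeweb.com"):
        return trusted in url                # the flawed check: substring, not origin parsing

    print(same_origin_substring("https://www.safeweb.com/o/page"))           # True (intended)
    print(same_origin_substring("https://www.safeweb.com.evil.example/"))    # True (attacker's host)
    print(same_origin_substring("https://evil.example/?q=www.safeweb.com"))  # True (attacker's URL)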
Koepsell2004a's classification of blocking criteria is in general organized by network layer. They break it down thus (more detail in Appendix A): * Blocking based on circumstances: 1. Addresses (including geographic location) 2. Timing (duration, frequency, time of day) 3. Data Transfer (rate, amount, half or full duplex) 4. Services (protocols, names, addresses [again]) * Blocking based on content: file type, statistical (e.g. entropy), pattern matching. A blocking resistance system, in general, needs two things: 1. A high-bandwidth, low-latency communication infrastructure. 2. Some way to distribute information about the infrastructure (doesn't need to be high-bandwidth or low-latency). You can think of (1) as a network of proxies and (2) as some kind of proxy distribution strategy. There are a few options for the underlying proxy infrastructure: * Many access points (which could be full-fledged proxies or simple forwarders). Ensure the censor doesn't get the full list of access points. This is like today's BridgeDB model. * Reverse connection: The proxy connects back to the censored client. Flash proxy did this. * Single access point: Many different types of services are served indistinguishably from one server. "Imagine that all web pages of the United States are only retrievable (from abroad) by sending encrypted request to one and only one special node," presaging CDNs. This is like domain fronting, CloudTransport, decoy routing. Information can be encoded using steganography in addition to cryptography. Censors do not need high-accuracy detection, only suspicion, because they can confirm a guess by active probing ("acting like a blockee"). Users should always have plausible deniability, for example by embedding a second steganographic message in addition to the intended one. The information distribution problem is harder and still open. They give some ideas. It needs to be unblockable--which is not a contradiction, because the distribution has different requirements than general data transport. Also it needs to be "fuzzy," by which they mean it can't be easy to enumerate all the information. Some ideas are satellite or shortwave broadcast, steganographic email, rate limiting via computational puzzles or captchas, and poisoned IP lists. Suggests that manual email forwarding (chain-letter style) could be used to distribute infrastructure information. They claim that strong anonymity is a requirement for circumvention, in order to protect users. Blocking resistance should be implementable as an "add-on" to an existing standalone anonymity system. They make a prototype implementation as an add-on to the JAP client software for the AN.ON anonymity network. This lets them use their existing network of users as proxies (they contrast with TriangleBoy, which doesn't offer volunteer proxies any benefit for running it). Their implementation follows the "many access points" model, also with connect-back capability. They don't use anything special for content obfuscation. Their admittedly weak information distribution mechanism is SMTP with captcha. Feamster2003a Thwarting Web Censorship with Untrusted Messenger Discovery Has two main ideas: 1. a specific proxy distribution strategy called "keyspace hopping." 2. the insight that "proxy addresses" can actually just be dumb untrusted forwarders ("messengers") to a trusted proxy ("portal"). Each client gets to learn only a small random subset of the total pool of proxies. Each client only gets a few proxies; each proxy only serves a few clients.
(This protects clients from censor-operated proxies and protects proxies from censor-operated clients.) Each client's personal subset of proxies is independent from other clients'. To achieve this, they use "in-band" proxy discovery: the discovery is codified as part of the protocol and proxies themselves participate in it. (This is a change in thinking from Feamster2002a, which emphasized out-of-band discovery.) Proxy assignment is based on a long-term ClientID (e.g. the client's IP address or subnet). On top of that, client puzzles limit the rate of proxy discovery. Details of keyspace hopping (sketched in code below). Each proxy has a permanent unique ProxyID (such as a hash of its IP address). There's a globally known list of the ProxyIDs of existing proxies (this implies that it must not be possible to reverse a ProxyID into an IP address). There's a global constant |S_i| for the number of proxies each client gets to know about. The first time the client enters the system and requests its proxies, it is assigned a random value hkey, shared between the client and all the proxies. hkey and the ClientID are hashed together to determine the client's unchanging base index B_i. The client's subset of proxies are in the |S_i| consecutive indices starting at B_i. (These are all the proxies the client will ever know, until the system is reinitialized.) The client only uses one at a time: during each time epoch t, the client chooses one using a pseudorandom generator seeded with ClientID, hkey, and t. Because hkey is shared between the client and the proxies, each proxy knows which subset of clients it's supposed to be serving, and so it ignores requests from all others. "Ignore" is not defined; I took it to mean blocked by e.g. a firewall rule. The very first proxy discovery is out of band, through a social network or something. The notation in Section 2.2 could be improved. |S_i| should be just a number S: S_i is never used as a set, so you only need to know the parameter controlling its size, and then you don't have to index it by i because it's the same for all clients. On the other hand, hkey should be indexed by i because it's different for all clients. |P| could just be P. In the text, ProxyID is a hash output (e.g. a 160-bit value), but in the "equations," ProxyID_t,i is an index between 0 and |P|−1. The client bootstraps by finding any available proxy and contacting it for the first time. This is when the client gets its hkey value and learns the IP addresses of its subset of proxies. The information is hidden inside a computational puzzle. They're envisioning really hard puzzles, taking like a day to solve. Puzzles are necessary because they assume that the censor can spoof any IP address inside its domain. The analysis of puzzles is a little questionable. "If we design the system so that it is difficult enough to solve each puzzle (e.g., a day per puzzle), then it will take the censor almost 200 years to discover half of the proxies." But each user only has one computer, while the censor has many and can parallelize. If the censor has the equivalent of just 100 PCs, the "200 years" calculation only holds if normal users take 100 days to solve one puzzle. The authors don't say it outright, but it's important for the assignment of hkey to be synchronous. That is, a client shouldn't be able to run the initialization more than once. Otherwise there's a powerful discovery attack. Assume an attacker only knows one proxy IP address.
It connects and does initialization, learning |S_i| additional proxy IP addresses. Now, from a different IP address, it simultaneously connects to all the newly discovered proxies and initializes with all of them. Each proxy assigns an independent hkey and a set of |S_i| more proxies. So controlling two IP addresses gets you not 2|S_i| proxies, but ≈|S_i|². Three IP addresses get you ≈|S_i|³, etc. The problem is solved if proxies instantly propagate their clientIP→hkey mapping to *all* other proxies, preventing simultaneous initialization from the same IP address. The paper understates this, saying only that hkey must be shared with the proxies in the client's assigned set: "The proxy that assigns hkey must also inform the other proxies in the client's proxy set." The design for untrusted messengers is fairly straightforward. You need an additional layer of encryption so the messenger can't see what you're browsing. They sketch proposals: modifications to Infranet, or SSL in SSL. The untrusted messengers only know the hkeys of the clients they're supposed to serve, and don't know the full list of proxy IP addresses. Only the trusted portals hold the complete state. McLachlan2009a On the risks of serving whenever you surf: Vulnerabilities in Tor's blocking resistance design Addresses question 8.3 from tor-techreport-2006-11-001: what effects on anonymity result from acting as a bridge at the same time you are using Tor. Weaknesses in the bridge design lead to a deanonymization attack against users who simultaneously use Tor locally ("surf") and operate a bridge for others to use ("serve"; e.g., how bridge mode in Vidalia used to work). The attack takes advantage of three flaws in the bridge model: 1. bridges are easy to find; 2. a bridge accepts connections while its owner is active; 3. bridge traffic interferes with local SOCKS traffic (circuit clogging). The attack scenario is that a server that can link multiple visits (e.g. a pseudonymous forum) wants to know the IP address of one of its users. The attack proceeds in three phases: discovery (build as large a list of candidate bridges as possible), winnowing (eliminate bridge candidates via long-term intersection), and confirmation (circuit clogging against remaining candidates). Phases 2 and 3 have been used before but not for this purpose. For discovery, they used Tor itself to scrape BridgeDB from many address prefixes. (Way 1 from tor-techreport-2006-11-001.) BridgeDB did not, at that time, put all Tor exits under one pseudo-prefix. They found 247 bridges that way, scraping once per week for 2 weeks. A separate scrape from 47 open proxies got 129 bridges. They mention, but don't do, other possibilities for discovery: Internet-wide scans (way 6) and running a middle relay and connecting back to any host that connects to you (way 5). Suggested mitigations. Decouple surf and serve. Bucket all anonymous services from the BridgeDB point of view. Client proofs of bridge identity (anti–active probing). Listen on random ports. Unfair queueing that partitions local SOCKS connections from bridge connections and handles only one type during a particular timeslice. I'm not clear on whether it was ever actually common for bridges to be running under the surf+serve model. Section 3.1 says "Since the bridge design envisions most Tor clients as bridge operators, the initiator will be in this set with reasonable probability."
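Back to Feamster2003a's keyspace hopping (the sketch promised above): assuming SHA-1-style hashing for the ProxyIDs and indices, and a hash in place of the per-epoch pseudorandom generator. Names follow the summary above; everything else is illustrative.

    import hashlib

    def h(*parts):
        # Hash arbitrary values to a big integer.
        data = b"|".join(str(p).encode() for p in parts)
        return int.from_bytes(hashlib.sha1(data).digest(), "big")

    def proxy_ids(proxy_ips):
        # ProxyID = hash of the proxy's IP address; the sorted list is globally known.
        return sorted(h(ip) for ip in proxy_ips)

    def proxy_for_epoch(client_id, hkey, epoch, ids, S):
        P = len(ids)
        base = h(client_id, hkey) % P        # the client's unchanging base index B_i
        hop = h(client_id, hkey, epoch) % S  # per-epoch choice within S consecutive slots
        return ids[(base + hop) % P]

    # A proxy whose ProxyID equals proxy_for_epoch(client_id, hkey, t, ids, S) serves that
    # client during epoch t; it ignores requests from clients outside its assigned subset.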
Matic2017a Dissecting Tor Bridges: a Security Evaluation of Their Private and Public Infrastructures Examines many aspects of Tor bridges circa April 2016 using public data: CollecTor and Censys/Shodan, along with a small amount of their own port scans. The presentation is a little disjointed but they pack a lot in. They discover a number of public and private (not reporting to the bridge authority) bridges and analyze various statistics. They also dig into default bridges and pluggable transports. Their procedure for discovering bridges was: search Censys/Shodan for Tor's distinctive certificate CNs (these are candidate bridges); check the candidates' IP addresses in CollecTor to filter out relays; connect to each candidate to verify its bridge-ness; compare the bridge's descriptor IP address to the Censys/Shodan IP address (finds forwarders); check the bridges' fingerprints in CollecTor to see if each is public or private. This discovery procedure only works for vanilla ORPorts, not pluggable transports. However it is effective because of two Tor bugs: pluggable transport bridges cannot hide their ORPort (https://bugs.torproject.org/7349), and CollecTor does not (did not) obfuscate ports as it did IP addresses (https://bugs.torproject.org/19317). They discovered and verified 1,239 public bridge IP addresses, 694 private bridge IP addresses, and 645 proxies that were not bridges themselves but forwarded to some other bridge or relay. They found thousands of odd bridges with the nickname "Ki" which changed their fingerprint every hour. I happened to also notice the "Ki" bridges at about the same time: https://lists.torproject.org/pipermail/tor-project/2016-December/000851.html https://lists.torproject.org/pipermail/tor-project/2017-February/000958.html They find it necessary to subdivide bridges into "active" (with the Running flag set) and "inactive"; and also whether they ever served at least one client. Most bridges are short-lived and never serve even one client. The median lifetime of active bridges with ≥1 client is 4 months; median overall is less than 1 day. Of the active bridges with ≥1 client, 84% have only 1 IP address, and the rest have more. The top 3 ORPorts (443, 8443, 9001) constitute 71% of all ORPorts; "Ki" uses an ORPort of 444, which makes up another 11%. Between Censys and Shodan, they got scans for ports 443, 993, 995, 444, 8443, 9001, and 9002. 90% of users use default bridges, which are trivial to block—although the distribution varies by country. The most popular transport is surprisingly still vanilla (77% of all bridges). The most popular PT combination is obfs3+obfs4+scramblesuit, which may be a problem because obfs3's susceptibility to active probing endangers the other transports. When a new PT is deployed, its number of bridges reaches a stable 1K–2K after 4–12 months. For a few countries, they counted the number of bridge users and of default-bridge users. Smits2011a BridgeSPA: Improving Tor Bridges with Single Packet Authorization BridgeSPA is a port knocking system based on SilentKnock, encoding a MAC into the ISN and Timestamp fields of TCP. It replaces SilentKnock's per-client counters with a rounded timestamp included in the MAC (server and clients need loosely synchronized clocks). The secret for each bridge changes periodically (every 1–7 days), so clients will have to continue contacting the bridge authority to refresh it. I initially thought that their main goal was the prevention of bridge discovery by scanning.
But actually, their concern is not bridge blocking, but rather confirmation attacks in the style of McLachlan2009a. They assume that the bridge addresses are known, and the goal is to prevent an adversary from knowing the specific times when the bridge is online. Nevertheless, it also protects against scanning. They have an implementation for Linux using libnetfilter_queue. Aase2012a Whiskey, Weed, and Wukan on the World Wide Web: On Measuring Censors' Resources and Motivations Look at three case studies of censorship through the lens of a censor's motivation, resources, and time. These three elements of censorship are difficult to analyze, but, the authors argue, essential. Case study 1: public wi-fi. They made an app to test URL blocking and ran it in various places in Albuquerque; e.g. the public library, Wendy's, Starbucks. They found blocking on a few networks, in the categories pornography, gambling, alcohol & tobacco, and peer-to-peer. Case study 2: microblogging in China (Weibo). Used China Digital Times's sensitive keywords list. Search censorship is more aggressive than post censorship (you can post some keywords that you cannot search for). The blocking is somewhat stateful: "法ccc轮" ("fa-ccc-lun") is not blocked unless you already searched "法轮" ("falun") and had it blocked. Comments on posts were less censored than the posts themselves. The keyword "乌坎" ("wukan") was censored during only a narrow time range in December 2011, for about one week starting with the onset of protests in Wukan. Case study 3: chat programs in China. They reverse-engineered SinaUC chat and compared its blocklist to that of TOM-Skype (previously analyzed in Knockel2011a). SinaUC's list was not updated for current events as TOM-Skype's was. "We had assumed that censors are fully motivated to block content and the censored are fully motivated to disseminate it." Instead, they found in the case of the chat programs that clearly there was some intention to censor, but no central or unifying guidance as to precisely how. They observe an "intern effect" where it looks like the lists were assembled without much care or expertise. SinaUC's list may be partly motivated by anti-spam rather than pure censorship. Weibo censorship is easier to assign a motivation to, with specific keywords consistent with known censorship policies. Some open wi-fi blocking has a clear motivation (pornography in the library), others not so much (gambling in a fast food restaurant?). Resource limitations were apparent in Weibo censorship. Perhaps the reason that posts are less censored than searches is that censoring posts is more resource-intensive, therefore it cannot be done as well. Failure to censor comments may also be caused by limited human resources. Topics that are likely to be the target of censorship are timely. The censorship of "Wukan" during the time of the protests may have been an effort to get ahead of the rumor/news cycle. Once that was accomplished, the keyword could be unblocked again. Zhu2013a The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions Looks at post deletions on Sina Weibo: how long they last before being deleted and what topics tend to be deleted more. Their data set is 2M public posts from July–September 2012 made by about 3,500 users who tend to post on sensitive topics. They also have a set of 470M posts from the generic public timeline--they only use this latter set to identify popular topics.
They poll for new posts at a high rate, enabling them to know of a post's deletion within two minutes, which is more precise than previous work, which had a resolution of hours at best. They found a side channel that distinguishes censored posts from those that were deleted by their author. They use the information to make guesses about the censor's methods and priorities. 30% of deleted posts were deleted in the first 5–30 minutes; 90% were deleted in their first 24 hours. But some posts survive for days or weeks. This seems to show that once the censors know about a topic to censor, they delete all the old posts with it, and thereafter try to keep abreast of new posts. There are automated filters that prevent you from even making certain posts. Users who have more censor-deleted posts also have a lower average post lifetime, but it is ambiguous whether it is because those users are being more closely watched. Using NLP, they extract topics from posts and compare deletion speed with topics. The fastest-deleted posts are those about "hot topics": topics that are also popular among the larger set of public posts. Deletions occur around the clock, but there are fewer from about 0:00 to 8:00. Posts made in the late night/early morning last longer, perhaps because the censors are working through a backlog that has accumulated overnight. King2012a How Censorship in China Allows Government Criticism but Silences Collective Expression Analysis of censored (deleted) blog posts during the first half of 2011. Initially I thought their thesis was that posts that are likely to cause collective action are more likely to be censored than posts that merely criticize the government. But actually it is one step removed: posts that are made during a real-world event that has the potential to cause collective action are more likely to be censored. That is, the collective action potential is not a property of the post, but of the *event* going on when the post was made. "Our central hypothesis is that the government censors *all* posts in topic areas during volume bursts that discuss events with collective action potential. That is, the censors do not judge whether individual posts have collective action potential... Instead, we hypothesize that the censors make the much easier judgment, about whether the posts are on topics associated with events that have collective action potential..." The description of their methodology was a little unclear to me. Here is what I interpret. They identified 85 topics and scraped only the posts falling into one of those topics (apparently using some kind of keyword→topic map). The sources were 1,382 domestic blog-type sites. They then--separately for each of the 85 topics--identified volume bursts (87 bursts in total) in posts for that topic. They found a real-world cause (such as a news event) to explain each burst, and say that the cause was never ambiguous. They placed each event into one of five categories: 1) collective action potential 2) criticism of the censors 3) pornography 4) government policies 5) other news (everything else). (This categorization doesn't make total sense to me. It seems like an event could fall into more than one category, but only one event--the Fang Binxing shoe-throwing incident--seemed to their coders to belong in more than one. I'm guessing the categories were made up after looking at the events, because I don't see a reason to pick out those four categories plus "other", a priori.
It's also weird that four of the categories don't have a pro or con slant, but "criticism of the censors" is explicitly con.) Volume bursts have a higher rate of censorship than other times. Even outside of volume bursts, pornography and criticism of censors have high rates of deletion. Government policies and other news had low rates of deletion at other times. Events with potential for collective action are on average more censored than other types of events. (They say that they looked for volume bursts in each topic individually, but the time series plots Figs. 5, 6, 7 are not labeled with a topic--maybe they show the sum of all posts during that time?) They do some kind of supervised learning (Appendix B) to classify posts as being either pro- or anti-government, and do not find that rates of censorship differ. (Except, presumably, in the case of anti-censor posts.) Most deletions happen in the first 24 hours, although some happen days later (consistent with Zhu2013a). The authors are impressed by the fast deletions. They say that "different Internet content providers first need to come to a decision about what to censor in each situation," a statement I don't find fully justified, because they don't compare e.g. deletion rates across different providers, only in aggregate. They describe a "military-like precision" and apparent heavy use of manual human effort, in contrast to Aase2012a, which says that there is an apparent lack of central control and a high degree of automation. Tokachu2006a The Not-So-Great Firewall of China Says the Great Firewall uses two techniques: DNS poisoning and RST injection. The firewall is stateless and cannot drop packets. The author's impression is that it's a "lowest bidder" system with many flaws. Reckons that the censorship is vastly expensive, and that if Chinese people knew how much it cost, the "economy would collapse in a matter of hours." Only a few "very, very, very naughty" domains are blocked by DNS poisoning. DNS poisoning only affects port 53. Describes how you can test for DNS injection even from outside of China, like Lowe2007a, Anonymous2014a, and Farnan2016a later did. Shows a packet trace that purports to show fake DNS replies along with "a few random ICMP messages":
0.000000 192.168.1.2 -> 220.194.59.17 DNS Standard query A minghui.org
0.288963 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 203.105.1.21
0.289482 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 203.105.1.21
0.289838 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 203.105.1.21
0.290374 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 203.105.1.21
0.290732 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 203.105.1.21
0.290757 192.168.1.2 -> 220.194.59.17 ICMP destination unreachable (Port unreachable)
0.291311 220.194.59.17 -> 192.168.1.2 DNS Standard query response A 169.132.13.103
0.291337 192.168.1.2 -> 220.194.59.17 ICMP destination unreachable (Port unreachable)
But the ICMP messages are from the client, not the server or the firewall. (Presumably they are caused by some of the DNS responses, like maybe some of the responses went to weird ports.) TCP RST followed by temporary timeout for TCP packets containing forbidden keywords. Keywords are detected when encoded in ASCII or GB2312; keywords encoded in UTF-8 don't get noticed. Mentions the possibility of spoofing a source address in order to cut off someone else's Internet. TTLs of forged packets are too high.
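Back to Tokachu2006a's remote DNS-injection test: a minimal sketch that sends one A query for a sensitive name toward an address inside China that is not actually running a DNS server, then collects whatever comes back; any replies must have been injected on the path (the same trick later used by Lowe2007a and others). The target address below is a placeholder and the query encoder is deliberately bare-bones.

    import random
    import socket
    import struct

    def build_query(qname):
        # Minimal DNS query: header (random ID, RD=1, QDCOUNT=1) + QNAME + QTYPE=A, QCLASS=IN.
        header = struct.pack(">HHHHHH", random.randint(0, 0xFFFF), 0x0100, 1, 0, 0, 0)
        name = b"".join(bytes([len(label)]) + label.encode() for label in qname.split(".")) + b"\x00"
        return header + name + struct.pack(">HH", 1, 1)

    def probe(target_ip, qname, wait=3.0):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(wait)
        s.sendto(build_query(qname), (target_ip, 53))
        answers = []
        try:
            while True:
                data, addr = s.recvfrom(4096)
                answers.append((addr, data))   # anything received here was injected en route
        except socket.timeout:
            pass
        return answers

    # e.g. probe("203.0.113.1", "minghui.org")  # placeholder IP; use a non-DNS host in China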
Also mentions a tool called covertsession (https://packetstormsecurity.com/files/36554/covertsession-0.4.c.html), which uses the urgent pointer to cause the firewall to mis-parse packets and gets past the filter without any kernel-level modifications. Google's in-house censorship of searches processes the text of queries differently than their backend search does. Specifically, you can send a query with the keyword written in fullwidth characters: it won't be caught by the censor filters, and the backend will treat it the same as the ordinary ASCII form. Narain2014a Deniable Liaisons DenaLi is a system for covert communication within a local wireless network. It works by sending deliberately corrupted wi-fi frames. Alice corrupts a fraction of her frames in such a way that they will be dropped by the eavesdropping access point, but Bob, who is within wireless range and with whom Alice shares a secret, can recognize them and decrypt their contents. The idea is like Rivest's chaffing and winnowing. The corrupted frames should belong to a TLS or similar flow, so that they already look random. Each corrupted frame is actually a MACed ciphertext. They try to match the distribution of naturally occurring corrupted frames: errors tend to occur in bursts, and bits later in a frame are more likely to be corrupted. They need the adversary to be some distance away, in order for the number of corrupted frames to be plausible. Even if the adversary can detect the presence of extra corrupted frames, thereby detecting the use of DenaLi, it cannot find out who the intended recipient is. Confidentiality comes from encryption. Blocking the system would require radio jamming. They made a prototype. It only works with certain wireless cards; most do not allow you to send bad checksums. Wang2017a Your State is Not Mine: A Closer Look at Evading Stateful Internet Censorship Looks at the monitor capabilities of the GFW, in the tradition of Khattak2013a. The firewall has improved since 2013, and some of the old evasions no longer work. From 11 vantage points in China, they test various evasions' average success rates in reaching some real web servers (pulled from Alexa, containing blocked keywords). Applying no evasion gets you a 2.8% success rate. But even trying to be evasive succeeds much less often than 100%. This is because of changes in the GFW, and interference by middleboxes. That is the importance of testing against real web servers across a diversity of routes: some techniques get past the GFW (bad checksum, IP fragments) but are foiled elsewhere by middleboxes. The GFW has two different kinds of RST injection, which seem independent. They found a new "re-synchronization state" of the firewall, caused by multiple SYNs or SYN/ACKs with unmatched sequence numbers. The firewall does not trust the first sequence number, but resynchronizes itself when it sees a data-carrying packet (or a SYN/ACK, in the case that the re-synchronization state was entered because of multiple SYNs). The new state provides a new way to desynchronize the firewall: send multiple SYNs, then send a TTL-limited data packet with the wrong sequence number, then continue with the right sequence numbers--the firewall doesn't resynchronize itself again after the data packet. Another way to mess up the state is for the client to send SYN/ACK first; the firewall will think the client is the server and ignore its later SYN packet.
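One ingredient of the Wang2017a desynchronization strategies above is an insertion packet that the firewall processes but the server never sees, for example a data packet with a deliberately short TTL and a wrong sequence number. A rough sketch with scapy (a third-party packet library), assuming you already know the connection's ports and sequence numbers; INTANG obtains those by hooking the real connection, and that plumbing is omitted here.

    from scapy.all import IP, TCP, Raw, send

    def send_insertion_packet(src_port, dst_ip, dst_port, seq, ack,
                              hops_to_gfw=8, junk=b"GET /decoy HTTP/1.1\r\n\r\n"):
        # TTL chosen so the packet expires after passing the firewall but before the server;
        # the wrong sequence number is what desynchronizes the firewall's connection state.
        pkt = (IP(dst=dst_ip, ttl=hops_to_gfw)
               / TCP(sport=src_port, dport=dst_port, flags="PA",
                     seq=seq + 10000, ack=ack)
               / Raw(load=junk))
        send(pkt, verbose=False)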
The Wang2017a authors also mined the source code of Linux 4.4 to identify packets that lead to an ignore state, then tested them all against the GFW to find which ones it did not ignore. Doing so, they found two new kinds of useful injections: RST/ACK with incorrect ACK number, and packets with an unsolicited MD5 option. Both of these are ignored by Linux 4.4, but honored by the firewall, and can, for instance, be used to tear down a connection from the firewall's point of view. Built an INTANG tool that tests and remembers the best evasion strategy for each destination. Converts DNS over UDP to DNS over TCP. Gives 90+% success rates. Active probing of Tor bridges now causes the whole IP to be blocked (Section 7.3), different from Clayton2006a, Winter2012a. Pearce2017a Augur: Internet-Wide Detection of Connectivity Disruptions Augur, a tool for the global and continuous measurement of TCP connectivity. Builds on Ensafi2015a, using a hybrid idle scan that can distinguish inbound and outbound blocking, without needing to control either endpoint. In order to avoid implicating people, they only use "infrastructure" hosts that are two hops back from the end of CAIDA Ark traceroutes, which reduces the pool of potential reflectors from 22M to 53K. To determine blocking, they use sequential hypothesis testing on a random variable representing acceleration (or not) of the IP ID counter. Acceleration works because it has zero mean in the steady state. They tested Augur in 180 (or 179?) countries over 17 days. They checked against public block lists and Tor bridges, finding blocking as expected. There are different levels of blocking within a country. Pearce2017b Global Measurement of DNS Manipulation Describes Iris, a system to test for DNS manipulation globally. It doesn't require volunteers like other measurement platforms; rather it uses open DNS resolvers. There are many reasons why DNS results may differ around the world, so they employ two detection metrics to decide when a set of measurements constitutes manipulation: consistency (answers should be the same or similar from multiple locations with respect to IP address, AS and AS name, HTTP content, TLS certificate, and PTR records pointing to a CDN) and independent verifiability (comparison with other information sources like the names in TLS certificates). They put restrictions on what resolvers they can use for ethical reasons, to reduce the risk that some innocent person will be implicated in the course of someone else's research. A lot of open resolvers are just people's home devices. They try to use only "infrastructure" resolvers by looking at the reverse DNS for strings like "ns". Another part of ethical measurement is managing load so as not to hit any resolver or network too hard in a short time. They test the Citizen Lab list, plus a random subset of the Alexa 10K. They give a list of the most commonly manipulated domains, which includes gambling, porn, and filesharing sites. Nasr2017a The Waterfall of Liberty: Decoy Routing Circumvention that Resists Routing Attacks Presents Waterfall, a downstream-only decoy routing design. Waterfall's decoy routers only need to see traffic in the server→client direction. The intuition is that it's much harder for a censor to control routes in, compared to routes out (Routing Around Decoys attack). It uses features of TCP, HTTP and TLS. Requires pre-registration by each client. Out of band (e.g. by email), the client has to send out a bundle of connection identifiers.
A connection identifier is a TCP ISN, a TLS nonce, TLS keys, and a keypair for communication with the decoy router. A client sends perhaps 1000 such connection identifiers at a time; each identifier is good for one session, and the client has to send more when they run out. The decoy routers look for TCP connections with a previously registered ISN (in the server's SYN/ACK, reflected from the client), then MITM the TLS connection. The router can MITM because the client has previously revealed what secrets it's going to use, including the TLS nonce, which the decoy router doesn't get to see on the wire because the nonce travels upstream and the router is downstream-only. The client sends data upstream using HTTP reflection tricks: 404 pages that reflect the included path (20% of Alexa top 10K), or 3XX hostname canonicalization (Location header reflecting path; 50% of Alexa top 10K). The decoy router sends data downstream by replacing the contents of TLS records by ciphertext encrypted using a key from the connection identifier. The client side uses an overt user simulator, like Slitheen. Waterfall takes the approach of requesting previously cached pages, so it can replace more than just leaf content. They invent downstream analogs of Routing Around Decoys attacks and decide that they are even more expensive than the original upstream ones.
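A small sketch of the upstream reflection trick described above: covert data is encoded into a URL path and sent to a reflecting site, so that the echoed path shows up in the downstream 404 body or 3XX Location header, where the downstream-only decoy router can read it (in Waterfall the router can decrypt the response because it has already MITMed the TLS session using the pre-registered keys). The encoding and the example host are illustrative, not Waterfall's actual wire format.

    import base64
    import http.client

    def reflect_upstream(host, covert_bytes):
        # URL-safe encoding so the covert payload survives inside a path component.
        token = base64.urlsafe_b64encode(covert_bytes).decode().rstrip("=")
        path = "/" + token                   # many sites echo an unknown path in their 404 page
        conn = http.client.HTTPSConnection(host, timeout=10)
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()                   # downstream direction: the decoy router sees the echo here
        conn.close()
        return resp.status, body

    # e.g. reflect_upstream("reflecting-site.example", b"covert request: fetch https://...")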