Flow feature*–based classification in commercial firewalls:
A coming menace?

* E.g., packet lengths, interpacket timings, byte entropy

David Fifield <david@bamsoftware.com>
PETS 2017,
https://www.bamsoftware.com/talks/pets-2017-menace/


“Towards Grounding Censorship Circumvention in Empiricism,” Tschantz et al., 2016:

In particular, we identify three disconnects between practice and research: 1) Real censors attack how users discover and set up channels, whereas research often centers on channel usage, 2) Real censors prefer cheap passive monitoring or more involved active probing, whereas research often looks at complex passive monitoring and traffic manipulations at line speed, and 3) Censors favor attacks that do not risk falsely blocking allowed connections due to packet loss, whereas research considers many less robust attacks.

↑ IMO this is still basically true.
Circumvention designs intended for deployment should first focus on the fundamentals.


Cisco “The Network,” “Seeing threats hidden in encrypted traffic,” June 20, 2017:

The resulting technique, called Encrypted Traffic Analytics (ETA), involves looking for telltale signs in three features of encrypted data.

The first is the initial data packet of the connection. This by itself may contain valuable data about the rest of the content. Then there is the sequence of packet lengths and times, which offers vital clues into traffic contents beyond the beginning of the encrypted flow. Finally, ETA checks the byte distribution across the payloads of the packets within the flow being analyzed. Since this network-based detection process is aided by machine learning, its efficacy improves over time.

This week, Cisco is making Encrypted Traffic Analytics functionality available by pairing up the enhanced NetFlow from the new Catalyst® 9000 switches and Cisco 4000 Series Integrated Services Routers with the advanced security analytics of Cisco Stealthwatch.

“Enhanced telemetry for encrypted threat analytics,” McGrew and Anderson, 2016:

The majority of our analysis is based on three main data elements: the sequence of packet lengths and times, the byte distribution, and TLS-specific features such as the offered ciphersuites or the client’s public key length. The following subsections explain each in turn. We conclude this section with an overview of some other data elements that our open source project can collect.

3.1  Sequence of Packet Lengths and Times. The sequence of packet lengths and times (SPLT) has been a well studied data element. In our implementation, we keep two arrays per flow: an array of sizes (in bytes) of the packets, and an array of times representing the number of milliseconds since the previous packet was observed. In our open source implementation, the SPLT elements are collected for the first 50 packets of a flow ensuring the compactness property.  The first 50 packets were used for all of our experiments and is shown to perform quite well. Composability and adjusting the active flow timeout of the open source project would allow visibility into the SPLT data element for longer lived flows.
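(Not part of the quoted paper.) To make the SPLT description concrete, here is a minimal Python sketch of the two per-flow arrays. It is not code from joy; the class name `FlowSPLT`, the method `observe`, and the choice to record 0 ms for the first packet are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PACKETS = 50  # the quoted text records the first 50 packets of a flow


@dataclass
class FlowSPLT:
    """Hypothetical per-flow record: packet sizes and inter-arrival times."""
    lengths: List[int] = field(default_factory=list)   # packet sizes in bytes
    times_ms: List[int] = field(default_factory=list)  # ms since previous packet
    last_seen: Optional[float] = None                  # previous timestamp, seconds

    def observe(self, timestamp: float, size: int) -> None:
        """Record one packet; packets beyond the first MAX_PACKETS are ignored."""
        if len(self.lengths) >= MAX_PACKETS:
            return
        if self.last_seen is None:
            delta_ms = 0  # assumption: the first packet has no predecessor
        else:
            delta_ms = round((timestamp - self.last_seen) * 1000)
        self.lengths.append(size)
        self.times_ms.append(delta_ms)
        self.last_seen = timestamp
```

Capping the arrays at 50 entries is what keeps the per-flow state small and fixed in size, which is presumably what the excerpt means by compactness.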

3.2  Byte Distribution. Similar to the sequence of packet lengths and times, the byte distribution is another well-studied data element. In this context, the byte distribution represents the probability that a specific byte value appears in the payload of a packet within a flow.  We implemented most of the major data types associated with the byte distribution such as the full byte distribution, entropy, and the mean/standard deviation of the bytes.
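(Not part of the quoted paper.) The byte-distribution features are simple to compute. The sketch below is an illustration under my own assumptions, not joy's implementation: the function names are hypothetical, and the mean and standard deviation are taken over byte values weighted by the empirical distribution.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def byte_distribution(payloads: Iterable[bytes]) -> List[float]:
    """Empirical probability of each byte value 0..255 across a flow's payloads."""
    counts: Counter = Counter()
    for payload in payloads:
        counts.update(payload)  # iterating bytes yields ints 0..255
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 256
    return [counts[b] / total for b in range(256)]


def entropy_bits(dist: List[float]) -> float:
    """Shannon entropy in bits per byte (maximum 8.0 for a uniform distribution)."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def mean_and_std(dist: List[float]) -> Tuple[float, float]:
    """Mean and standard deviation of the byte values under the distribution."""
    mean = sum(b * p for b, p in enumerate(dist))
    variance = sum(p * (b - mean) ** 2 for b, p in enumerate(dist))
    return mean, math.sqrt(variance)
```

A fully encrypted or randomized payload should score close to 8 bits per byte once enough bytes have been seen, whereas protocols with plaintext framing score lower.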

Free software tool: https://github.com/cisco/joy

Related research:

I don't want to suggest any ill intent by these researchers—their purpose is detecting malware. But the same ideas could conceivably be applied to the detection of circumvention.


Report in May 2016 that Cyberoam firewalls (somehow) block obfs4:

https://lists.torproject.org/pipermail/tor-talk/2016-May/040898.html
“I have a DPI box that I use to test pluggable transports with. I also test other circumvention tools against it just to see how good it is. Manufacturer is Cyberoam. About 6 or 8 weeks ago, Cyberoam released a DPI engine update that could filter normal Tor and the following pluggable transports: OBFS3, OBFS4, Scramblesuit”

The reporter says that flow features (packet size, interpacket timing) are part of the Cyberoam classifier.

Allot Communications also claims to be able to block obfs4 (and Tor, and Psiphon, and various VPNs...).


This is your Tor
00000000  16 03 01 00 a6 01 00 00  a2 03 03 23 66 1f f0 20  ........ ...#f..
00000010  7e 74 aa 92 59 79 10 5b  06 89 35 ae d2 64 cb 85  ~t..Yy.[ ..5..d..
00000020  1f b7 d8 5a ce d6 db 22  5e 03 da 00 00 1e c0 2b  ...Z..." ^......+
00000030  c0 2f cc a9 cc a8 c0 2c  c0 30 c0 0a c0 09 c0 13  ./....., .0......
00000040  c0 14 00 33 00 39 00 2f  00 35 00 ff 01 00 00 5b  ...3.9./ .5.....[
00000050  00 00 00 11 00 0f 00 00  0c 77 77 77 2e 64 73 32  ........ .www.ds2
00000060  6b 2e 63 6f 6d 00 0b 00  04 03 00 01 02 00 0a 00  k.com... ........
00000070  0a 00 08 00 1d 00 17 00  19 00 18 00 23 00 00 00  ........ ....#...
00000080  0d 00 20 00 1e 06 01 06  02 06 03 05 01 05 02 05  .. ..... ........
00000090  03 04 01 04 02 04 03 03  01 03 02 03 03 02 01 02  ........ ........
000000A0  02 02 03 00 16 00 00 00  17 00 00                 ........ ...

00000000  16 03 03 00 39 02 00 00  35 03 03 a6 c0 94 b4 4a  ....9... 5......J
00000010  a9 65 d9 d0 39 61 ba c7  07 d6 29 34 8b 3d c8 4e  .e..9a.. ..)4.=.N
00000020  7c c4 04 cb 74 9d 96 b7  46 d5 05 00 c0 30 00 00  |...t... F....0..
00000030  0d ff 01 00 01 00 00 0b  00 04 03 00 01 02 16 03  ........ ........
00000040  03 01 cf 0b 00 01 cb 00  01 c8 00 01 c5 30 82 01  ........ .....0..
00000050  c1 30 82 01 2a a0 03 02  01 02 02 09 00 b7 8e 91  .0..*... ........
00000060  47 3d 56 90 60 30 0d 06  09 2a 86 48 86 f7 0d 01  G=V.`0.. .*.H....
00000070  01 05 05 00 30 20 31 1e  30 1c 06 03 55 04 03 0c  ....0 1. 0...U...
00000080  15 77 77 77 2e 79 73 34  7a 72 67 62 78 70 72 75  .www.ys4 zrgbxpru
00000090  78 34 2e 63 6f 6d 30 1e  17 0d 31 37 30 36 31 39  x4.com0. ..170619
000000A0  30 30 30 30 30 30 5a 17  0d 31 38 30 35 31 31 32  000000Z. .1805112
000000B0  33 35 39 35 39 5a 30 25  31 23 30 21 06 03 55 04  35959Z0% 1#0!..U.
000000C0  03 0c 1a 77 77 77 2e 6a  6d 64 75 75 71 6e 72 73  ...www.j mduuqnrs
000000D0  66 37 67 74 6d 6f 6d 35  33 2e 6e 65 74 30 81 9f  f7gtmom5 3.net0..
This is your Tor on obfs4
00000000  69 21 95 7e 23 87 5f fe  10 1a 28 9a 08 a2 8d bd  i!.~#._. ..(.....
00000010  25 c1 a3 5c d5 39 6c 5b  93 c3 73 54 4c ce dc 6a  %..\.9l[ ..sTL..j
00000020  a4 14 1c ca f8 44 42 f8  84 2e 50 b5 83 ad 53 03  .....DB. ..P...S.
00000030  78 3a 00 9a ff 1a 31 87  a1 e2 c1 31 65 45 0c f7  x:....1. ...1eE..
00000040  7b 53 83 a2 de 40 70 df  a6 54 2e b5 05 e4 07 46  {S...@p. .T.....F
00000050  82 37 ef 9a fe c2 a4 24  c9 95 e0 f4 b2 fb 82 3d  .7.....$ .......=
00000060  ab eb c1 ab 05 4e f8 5e  17 e3 7a 45 54 49 89 4d  .....N.^ ..zETI.M
00000070  e4 ed d7 08 dc f5 a6 25  b6 7f d5 45 fd d2 cb e5  .......% ...E....
00000080  5f 8c 86 e8 40 c4 e6 2b  93 dc 1c be 4a 29 9d e5  _...@..+ ....J)..
00000090  0c 97 14 6f 6b 79 fa 99  df aa 8b 6c 10 a7 05 f0  ...oky.. ...l....
000000A0  ff b7 99 e6 08 db 5a 75  62 c7 d9 3c 9d 0b 9a 0e  ......Zu b..<....

00000000  f0 0e 72 b8 c8 fc 58 4e  76 52 82 23 15 ad 9c ee  ..r...XN vR.#....
00000010  47 f6 2b cd 8f b7 8a c3  e1 0b f2 97 fd 98 3a 61  G.+..... ......:a
00000020  81 cc 54 af e6 b4 a9 d2  71 d3 4a fd 9b 7a d9 43  ..T..... q.J..z.C
00000030  38 7c be 25 15 fb 9d 4f  0c 09 cc 21 e9 5f 1c 96  8|.%...O ...!._..
00000040  b0 7a 11 3c bc 57 be e6  33 1e 36 8e 01 3b 83 85  .z.<.W.. 3.6..;..
00000050  59 23 fe fc 45 10 6a 1c  6f 15 ec 4e 7d 07 d2 0c  Y#..E.j. o..N}...
00000060  e3 7e 38 6f 7c e8 17 5f  45 f2 88 ab 83 0e 4e 59  .~8o|.._ E.....NY
00000070  92 4e a5 3e ee 1a 38 d6  f3 26 a0 e8 67 45 98 92  .N.>..8. .&..gE..
00000080  2b ec 51 77 f4 52 3d 77  78 37 8f fb c0 33 5b b2  +.Qw.R=w x7...3[.
00000090  ee 87 28 0b 42 d7 88 0d  6f e4 25 53 98 1c 91 f9  ..(.B... o.%S....
000000A0  b8 43 21 d0 32 bf 28 ab  18 32 06 4a 29 29 f4 1a  .C!.2.(. .2.J))..
000000B0  a8 7c f7 93 6f 5e 51 a5  8d 6e 0f 48 da ef ee 36  .|..o^Q. .n.H...6
000000C0  33 4d 3f fb 26 53 82 23  9c d9 2d 01 5d b1 f3 b6  3M?.&S.# ..-.]...
000000D0  34 59 66 4c 90 75 b5 13  08 e3 fc 24 37 c0 6b d3  4YfL.u.. ...$7.k.
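
The difference between the two dumps is visible in the first five bytes. As a rough illustration (my own sketch, not how any firewall is known to work), the Python below checks whether a stream starts with a TLS handshake record header. The function name `parse_tls_record_header` is hypothetical; the two byte strings are the first 16 bytes of the client side of each dump above.

```python
from typing import Optional, Tuple

TLS_HANDSHAKE = 0x16  # TLS record content type for handshake messages


def parse_tls_record_header(data: bytes) -> Optional[Tuple[int, Tuple[int, int], int]]:
    """Return (content_type, (major, minor), length) if data starts with a
    plausible TLS handshake record header, else None."""
    if len(data) < 5:
        return None
    content_type, major, minor = data[0], data[1], data[2]
    if content_type != TLS_HANDSHAKE or major != 3 or minor > 4:
        return None
    length = int.from_bytes(data[3:5], "big")
    return content_type, (major, minor), length


# First 16 bytes sent by the client in each dump above.
TOR_TLS = bytes.fromhex("16 03 01 00 a6 01 00 00 a2 03 03 23 66 1f f0 20")
OBFS4 = bytes.fromhex("69 21 95 7e 23 87 5f fe 10 1a 28 9a 08 a2 8d bd")

print(parse_tls_record_header(TOR_TLS))  # a handshake record, 0xa6 = 166 bytes long
print(parse_tls_record_header(OBFS4))    # None: no recognizable framing
```

A content check like this has nothing to match in obfs4, which is why a classifier would have to fall back on flow features such as packet sizes and timing.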

But, this is your Tor:
Timing diagram of vanilla Tor

And this is your Tor on obfs4:
Timing diagram of obfs4 with iat-mode=0

And this is your Tor on obfs4 with timing obfuscation:
Timing diagram of obfs4 with iat-mode=1

And this is your Tor on obfs4 with aggressive timing obfuscation:
Timing diagram of obfs4 with iat-mode=2

Source code for these figures.


Summary:

Research ideas:
