These are the visual aids I used to deliver a talk on IPv6 OS fingerprinting on October 16, 2015 at AISec 2015.
For the full paper see: https://www.bamsoftware.com/papers/ipv6-os.pdf.
(Authors are in alphabetical order.)
David Fifield, UC Berkeley
Alexandru Geana, TU Eindhoven / Fox-IT, Delft
Luis MartinGarcia, ETSIT, Polytechnic University of Madrid
Mathias Morbitzer, Fox-IT, Delft
J. D. Tygar, UC Berkeley
This talk is about the IPv6-based OS fingerprinting engine in Nmap, a widely used network security scanner.
    # nmap -6 -O ipv6.google.com

    Starting Nmap 6.49SVN ( https://nmap.org ) at 2015-09-28 11:36 MDT
    Nmap scan report for ipv6.google.com (2607:f8b0:4009:804::1002)
    Host is up (0.022s latency).
    rDNS record for 2607:f8b0:4009:804::1002: ord08s10-in-x02.1e100.net
    Not shown: 998 filtered ports
    PORT    STATE SERVICE
    80/tcp  open  http
    443/tcp open  https
    Device type: general purpose
    Running: Linux 3.X
    OS CPE: cpe:/o:linux:linux_kernel:3
    OS details: Linux 3.12 - 3.18

    OS detection performed. Please report any incorrect results at https://nmap.org/submit/ .
    Nmap done: 1 IP address (1 host up) scanned in 8.54 seconds
OS fingerprinting is relevant for network inventory, vulnerability scanning, and exploit tailoring.
Design goals, based on extensive experience with IPv4:
If there is no good match, the system displays a raw fingerprint and asks the user to submit it.
We use LIBLINEAR in its L2-regularized logistic regression mode.
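To make the classification step concrete, here is a minimal sketch of how a trained logistic regression model could score a feature vector, assuming a one-vs-rest arrangement with one binary model per OS class. The weights, biases, class names, and the helper names `class_scores` and `best_match` are invented for illustration; they are not taken from Nmap or from a LIBLINEAR model file.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def class_scores(x, models):
    """Return {class_name: score in (0, 1)} for feature vector x.

    models maps a class name to (weights, bias): one binary
    logistic regression model per class (one-vs-rest, assumed).
    """
    scores = {}
    for name, (weights, bias) in models.items():
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        scores[name] = sigmoid(z)
    return scores

def best_match(x, models):
    """Return the highest-scoring class and its score."""
    scores = class_scores(x, models)
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical two-feature models, invented for illustration only.
MODELS = {
    "Linux 3.x": ([2.0, -1.0], 0.0),
    "Windows 7": ([-1.5, 2.5], -0.5),
}

name, score = best_match([1.0, 0.0], MODELS)
print(name, round(score, 3))
```

A low best score is one signal that the match may be poor, but a linear model can also be confidently wrong far from its training data, which is why a separate novelty check is useful.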
Different OSes speak different “dialects” of TCP/IP.
|Linux 3.12|Windows 7|
We looked for “SHOULD”s and “MAY”s in IPv6 standards.
|Some IPv6 standards documents|
|RFC 2460 (IPv6)|
|RFC 2463 (ICMP for IPv6)|
|RFC 2473 (Generic Packet Tunneling)|
|RFC 2675 (Jumbograms)|
|RFC 3122 (Inverse Discovery)|
|RFC 3775 (Mobility)|
|RFC 3971 (Secure Neighbor Discovery)|
|RFC 4620 (Node Information Queries)|
|RFC 4782 (Quick-Start)|
|RFC 4861 (Neighbor Discovery)|
|RFC 5570 (CALIPSO)|
We built a test program with 154 candidate OS probes.
Volunteers tested all 154 probes against a “seed” set of OSes.
We selected 18 probes that offer good efficiency: 13 TCP, 4 ICMPv6, 1 UDP.
TCP_OPT_0 … TCP_OPT_15
TCP_OPTLEN_0 … TCP_OPTLEN_15
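As an illustration of how features named like these could be derived, the sketch below walks the raw option bytes of a TCP header and fills 16 slots with each option's kind and length. The placeholder value `-1` for absent options, the choice to record one-byte options (End of Option List, No-Operation) with length 1, and the function name `tcp_option_features` are all assumptions made for this example, not details confirmed by the talk.

```python
NUM_SLOTS = 16   # TCP_OPT_0 .. TCP_OPT_15
MISSING = -1     # assumed placeholder for an absent option

def tcp_option_features(options):
    """Map raw TCP option bytes to {feature_name: value}."""
    kinds, lengths = [], []
    i = 0
    while i < len(options) and len(kinds) < NUM_SLOTS:
        kind = options[i]
        if kind in (0, 1):
            # End of Option List and No-Operation are single bytes.
            length = 1
        else:
            if i + 1 >= len(options):
                break            # truncated: no length byte present
            length = options[i + 1]
            if length < 2:
                break            # malformed: length must cover kind+length
        kinds.append(kind)
        lengths.append(length)
        if kind == 0:
            break                # nothing meaningful follows End of Option List
        i += length

    # Pad unused slots so every fingerprint has the same dimensions.
    kinds += [MISSING] * (NUM_SLOTS - len(kinds))
    lengths += [MISSING] * (NUM_SLOTS - len(lengths))

    features = {}
    for n in range(NUM_SLOTS):
        features[f"TCP_OPT_{n}"] = kinds[n]
        features[f"TCP_OPTLEN_{n}"] = lengths[n]
    return features

# Illustrative option bytes (not captured data): MSS, SACK permitted,
# timestamps, NOP, window scale.
example = bytes.fromhex("020405a0" "0402" "080a0000000100000000" "01" "030307")
feats = tcp_option_features(example)
print([feats[f"TCP_OPT_{n}"] for n in range(6)])
```

The order of options is itself part of the "dialect": two systems can send the same set of options in a different sequence, and the per-slot encoding preserves that difference.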
A major challenge is identifying when the classifier doesn’t have a good answer (e.g. a never-before-seen type of network printer).
Novelty is the distance of a feature vector from the mean of a class, where each dimension is scaled by the inverse of its variance. (One-sample classes have their variance set to a small constant.)
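One way to read this definition is as a Euclidean distance in which each squared difference is divided by that dimension's variance. The sketch below follows that reading. The floor value `MIN_VARIANCE`, the use of population variance, and applying the floor to zero-variance dimensions of multi-sample classes (to avoid division by zero) are assumptions for the example; the talk says only that one-sample classes use "a small constant".

```python
import math

MIN_VARIANCE = 1e-3   # hypothetical value for the "small constant"

def class_stats(samples):
    """Per-dimension mean and variance for one class's feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    if n == 1:
        # A single sample has no spread; use the small constant.
        variances = [MIN_VARIANCE] * dims
    else:
        variances = [
            # Population variance, floored (assumption) so it is never zero.
            max(sum((s[d] - means[d]) ** 2 for s in samples) / n, MIN_VARIANCE)
            for d in range(dims)
        ]
    return means, variances

def novelty(x, means, variances):
    """Variance-scaled distance of x from the class mean."""
    return math.sqrt(
        sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))
    )

# Two made-up training samples for one class.
means, variances = class_stats([[0.0, 10.0], [2.0, 14.0]])
print(novelty([1.0, 12.0], means, variances))  # at the class mean
print(novelty([3.0, 12.0], means, variances))  # off in a low-variance dimension
```

Because of the scaling, a deviation in a dimension that barely varies within the class counts for more than the same deviation in a dimension that varies widely. A result would then be rejected when its novelty exceeds some threshold, which would have to be chosen empirically.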
We rely on user submissions to grow the database.
But IPv6 adoption is still not as high as we would like :(
In the time it took to get 4,700 IPv4 submissions, we got only 97 IPv6 submissions.
Low ratio of training samples to classes. 16% of training samples are the only member of their class; 10% are in a two-sample class.
20–30% of classes are unknown, embedded OSes (identified only by hardware model number).
Network-corrupted and missing features.
Lack of ground truth. Very few training samples compared to IPv4.
10-fold cross-validation on our training set of 290 samples yields an accuracy of 69%.
If we allow near misses (e.g., one Linux 3.x class confused for another Linux 3.x class), accuracy rises to 80%.
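The difference between the two figures comes down to how a prediction is judged correct. The sketch below computes both a strict and a near-miss accuracy from lists of true and predicted labels. The `family` grouping rule (OS name plus major version) and all labels are hypothetical, chosen only to mirror the Linux 3.x example; the actual near-miss criterion used in the evaluation may differ.

```python
def family(label):
    """Hypothetical near-miss grouping: OS name plus major version.

    "Linux 3.12" and "Linux 3.2" both map to "Linux 3".
    """
    name, _, version = label.rpartition(" ")
    return f"{name} {version.split('.')[0]}"

def accuracies(true_labels, predicted_labels):
    """Return (strict, near_miss) accuracy as fractions in [0, 1]."""
    pairs = list(zip(true_labels, predicted_labels))
    strict = sum(t == p for t, p in pairs) / len(pairs)
    near_miss = sum(family(t) == family(p) for t, p in pairs) / len(pairs)
    return strict, near_miss

# Made-up labels, standing in for the held-out predictions of all folds.
true_labels = ["Linux 3.12", "Linux 3.2", "Windows 7", "FreeBSD 10.1"]
predicted = ["Linux 3.12", "Linux 3.12", "Windows 7", "Linux 3.2"]

strict, near_miss = accuracies(true_labels, predicted)
print(strict, near_miss)
```

In k-fold cross-validation each sample is predicted exactly once, by a model trained without it, so pooling the held-out predictions from all folds and scoring them this way gives one overall figure.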