This is a review of the top ports and -p option extension patchset
(r4727).
Negative numbers are not checked in nmap-services ratios.
--top-ports and --port-ratio always ignore ports not present in
nmap-services.
The -p test suite is a great idea; maybe it should actually be linked
with the C code and be distributed with the source.
wildtest

The new wildtest function is used by the new addportsfromservmask and
addprotocolsfromservmask functions, which are in turn called by the code
that interprets the argument to the -p option. wildtest checks if a
given test string matches a wildcard pattern like "n*p?".
wildtest is supposed to be case-insensitive, but I found a way to make
case significant. It is significant on the first character after a '*'
wildcard.
wildtest("ftp", "ftp") succeeds
wildtest("FTP", "ftp") succeeds
wildtest("F?P", "ftp") succeeds
wildtest("F*P", "ftp") fails
You can test this by running the commands
nmap -p 'f*p' localhost
nmap -p 'f*P' localhost
(You can't give -p 'F*P' from the command line because upper-case
initial characters are reserved for T:, U:, and P: style specifications.)
This looks to be easy to fix.
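One way to rule out this class of bug is to fold case once, before any matching starts, so that no comparison path can skip it. The following is a minimal Python sketch of that idea, not the patch's C code; the name wildtest_ci is hypothetical, and it deliberately keeps simple backtracking because it only illustrates the case handling.

```python
def wildtest_ci(pattern: str, text: str) -> bool:
    """Hypothetical case-insensitive matcher for '*' and '?' wildcards."""
    # Fold case once, up front, so every later comparison is case-blind.
    p, t = pattern.lower(), text.lower()

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(t)
        if p[i] == "*":
            # '*' matches nothing, or swallows one character and stays put.
            return match(i + 1, j) or (j < len(t) and match(i, j + 1))
        if j < len(t) and (p[i] == "?" or p[i] == t[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)
```

With this arrangement the character after a '*' is compared exactly like any other character, so "F*P" and "f*P" both match "ftp".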
wildtest's handling of the '*' character is susceptible to the
exponential growth problem described at
http://swtch.com/~rsc/regexp/regexp1.html. For example, the call
wildtest("a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaab") takes a long time to fail (I didn't
wait for it to finish).
This could conceivably be a problem for things like Nmap web interfaces that take port specifications from untrusted users. In practice, I don't think it's a problem because the longest protocol name in nmap-protocols is 15 characters long (rsvp-e2e-ignore) and the longest service name in nmap-services is 21 characters long (iss-realsecure-sensor). These limit the backtracking so that the exponential growth isn't noticeable.
The issue might become significant if services or protocols with longer names are added in the future (I tested with a 40-character service name and found that it is exploitable). It could also be a problem if wildtest is used elsewhere with untrusted user input.
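For patterns that contain only '*' and '?', the exponential behavior can be avoided with an iterative matcher that remembers just the most recent '*' and retries from there. The Python sketch below shows that technique under the assumption that no other metacharacters are supported; wildcard_match is a hypothetical name, not a function in the patch. Its worst-case cost is proportional to len(pattern) * len(text).

```python
def wildcard_match(pattern: str, text: str) -> bool:
    """Hypothetical non-exponential matcher for '*' and '?' (case-insensitive)."""
    p, t = pattern.lower(), text.lower()
    pi = ti = 0
    star = -1  # index in p of the most recent '*', or -1 if none seen yet
    mark = 0   # index in t where the text after that '*' is assumed to resume
    while ti < len(t):
        if pi < len(p) and p[pi] == "*":
            # Record the star and first try letting it match nothing.
            star, mark = pi, ti
            pi += 1
        elif pi < len(p) and (p[pi] == "?" or p[pi] == t[ti]):
            pi += 1
            ti += 1
        elif star != -1:
            # Mismatch: let the last '*' absorb one more character and retry.
            mark += 1
            ti = mark
            pi = star + 1
        else:
            return False
    # Only trailing '*' characters may remain in the pattern.
    while pi < len(p) and p[pi] == "*":
        pi += 1
    return pi == len(p)
```

Only the latest '*' ever needs to be revisited, because an earlier '*' that absorbed more text could not produce a match that the latest one cannot.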
The nmap-services database has been extended to give the frequency with
which a particular port has been found open. Here's a quick-and-dirty
script to generate a new-style nmap-services file:
gawk -F'#' '{ if (rand() < 0.9) { d = int(10000 * rand()) + 1; n = int(d * rand()); \
r = n "/" d " "; } else { r = "" } if ($1) print $1 " " r "#" $2 }' \
< nmap-services > nmap-services-new
You can use the new file by saying
nmap --servicedb nmap-services-new
But note that --servicedb implies -F, which prohibits using -p. So use
the --datadir option if you want to use -p also.
The code to parse the new-style file checks for a division-by-zero error, but doesn't check for negative numbers. Given the line
tcpmux 1/tcp -1/10
Nmap gives no warning, but with
tcpmux 1/tcp 1/-10
or
tcpmux 1/tcp -1/-10
it gives the confusing error messages (respectively)
nmap-services-new:1 has a ratio -0.1. All ratios must be < 1
nmap-services-new:1 has a ratio 0.1. All ratios must be < 1
because the code checks only for numerator > denominator.
This is probably harmless, because the distributed nmap-services file wouldn't have such defects, but it's also easy to fix.
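The missing checks amount to validating the sign of each part before comparing them. Here is a minimal Python sketch of such validation, assuming the ratio field has the form "numerator/denominator" with integer parts; parse_ratio and its error messages are hypothetical and are not taken from the patch.

```python
def parse_ratio(field: str) -> float:
    """Hypothetical validator for an 'n/d' ratio field from nmap-services."""
    num_s, sep, den_s = field.partition("/")
    if not sep:
        raise ValueError("ratio must have the form numerator/denominator")
    num, den = int(num_s), int(den_s)
    if den <= 0:
        # Covers division by zero and negative denominators in one test.
        raise ValueError("denominator must be positive")
    if num < 0:
        raise ValueError("numerator must not be negative")
    if num > den:
        # Mirrors the existing numerator > denominator check.
        raise ValueError("ratio must not exceed 1")
    return num / den
```

Checking the signs first means that -1/10, 1/-10, and -1/-10 are all rejected with a message that names the actual problem.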
gettoppts

The function gettoppts is now called instead of getpts for TCP and UDP
scans. It ends up calling getpts directly if an old-style services file
is used or if neither --top-ports nor --port-ratio is used. Otherwise,
it still calls getpts and uses the result to filter which ports are
returned.
I found gettoppts hard to understand. The way it works is to get a list
of all services from the services database (sorted_services) and then
see which of those fit the port specification from getpts (via
is_port_member). To me, it's more natural first to generate a list of
ports using getpts, then to filter out all those that don't match the
top ports level: by discarding those with a low ratio when level < 1,
and by sorting by ratio and truncating the list when level >= 1.
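The alternative ordering described above can be sketched as follows. This is a Python illustration under stated assumptions rather than a rewrite of gettoppts: filter_top_ports is a hypothetical name, ports is the list a getpts-like parser would return, ratios maps port numbers to their nmap-services ratios, a port with no entry is assumed to have ratio 0, and a port is assumed to pass a ratio threshold when its ratio is greater than or equal to the level.

```python
def filter_top_ports(ports, ratios, level):
    """Hypothetical filter: parse the port list first, then apply the level."""
    def ratio(port):
        # Assumption: ports absent from the services database have ratio 0.
        return ratios.get(port, 0.0)

    if level < 1:
        # --port-ratio style: keep ports at or above the threshold.
        return [p for p in ports if ratio(p) >= level]
    # --top-ports style: order by descending ratio, keep the first N.
    # sorted() is stable, so ties keep their original relative order.
    ranked = sorted(ports, key=lambda p: -ratio(p))
    return sorted(ranked[: int(level)])
```

With ratios = {11: 0.5}, the call filter_top_ports([10, 11, 12], ratios, 3) returns all three ports, and a level of 0 keeps every port in the list.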
--top-ports and --port-ratio always ignore ports not present in nmap-services

nmap -p 10,11,12 --top-ports 3 localhost
only shows output for port 11 because 10 and 12 aren't in nmap-services.
I was expecting it to scan ports 10, 11, and 12, as with
nmap -p 10,11,12 localhost
This is an artifact of the way --top-ports and --port-ratio are handled
in gettoppts, which only considers ports in sorted_services to be
eligible (sorted_services only contains ports for which an entry exists
in nmap-services).
Likewise,
nmap -p 1-100 --port-ratio 0 localhost
behaves like
nmap -p '[1-100]' localhost
when I would expect it to act like
nmap -p 1-100
It's possible the current behavior is correct, but it was not intuitive
to me. Suppose Nmap were not to disregard absent nmap-services entries.
What should it do in this case?
nmap -p 10,11,12 --top-ports 2 localhost
It should choose port 11 and then one of the ports 10 or 12 arbitrarily. I thought this was weird, until I reflected that that's the same thing it would do if ports 10 and 12 had identical ratios.
-p test suite

Doug wrote a test suite for the port-matching code. This is a really good idea and something I'd like to discuss doing more of during this Summer of Code. The fact that the author wrote so many tests gives me confidence that the code is correct.
The test suite is a Lisp program which (I suppose) parses the output of
a specially modified nmap executable. It strikes me that a program
independent of the nmap binary would be more useful. A C program, or a
Lisp program with C bindings, could call the getpts function and check
its output. Then the test could easily be distributed with the code and
run without the inconvenience of uncommenting a line of source.
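A test that calls the parsing function directly can be a plain table of inputs and expected outputs. The Python sketch below shows only that shape: parse_port_spec is a hypothetical stand-in that handles numeric lists and ranges, not the real getpts, and the cases are illustrative rather than taken from Doug's suite.

```python
def parse_port_spec(spec):
    """Hypothetical stand-in for getpts: numeric ports and ranges only."""
    ports = set()
    for part in spec.split(","):
        lo, sep, hi = part.partition("-")
        if sep:
            ports.update(range(int(lo), int(hi) + 1))
        else:
            ports.add(int(lo))
    return sorted(ports)

# Each case pairs a -p argument with the port list it should produce.
CASES = [
    ("10,11,12", [10, 11, 12]),
    ("1-3,5", [1, 2, 3, 5]),
    ("7,7", [7]),
]

def run_cases(parse, cases):
    """Return (spec, expected, actual) for every case that fails."""
    failures = []
    for spec, expected in cases:
        actual = parse(spec)
        if actual != expected:
            failures.append((spec, expected, actual))
    return failures
```

Because the parser is passed in as an argument, the same table could be run against a binding to the real function without editing the test cases.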
To me, the most compelling part of the patch at the moment is being able
to give ports and protocols by name (even without wildcard matching).
Saying -p ssh,ftp,http just feels right. Once the research is done,
the --top-ports and --port-ratio parts of it could be even more
compelling.