Validation and Port Signatures

In this section we discuss our validation and testing of the TCP-based work mentioned previously. We do this in the context of a novel reporting technique that has proven useful, called a "port signature" report. The port signature report is essentially a view of our TCP SYN tuple: it includes the work metric previously discussed, as well as a small sampled set of destination ports (1..10) for any IP source in question, giving us a limited Layer 4 view of the applications being used by the host. Before we present the port signature and related efforts aimed at validation, consider the following statement from [TRW/Paxson]: "Consequently our argument is nearly circular: we show that there are properties we can plausibly use to distinguish likely scanners from non-scanners in the remainder hosts, and we then incorporate those as part of a (clearly imperfect) ground truth against which we test an algorithm we develop that detects the same distinguishing properties". Ultimately, in a very narrow sense, our work weight system catches *anomalies*, as for example in the limited scope of a system sending SYNs and not getting any packets back (barring resets). This is not how TCP should work. However, if a given system sends 100 SYNs and nothing else is seen, one cannot always explain whether a programmer produced buggy code, the protocol is bad, or the sender had evil intent. Looking at a sample set of ports does help us here, especially when we know from other evidence (when possible) that certain target ports with certain frequency counts do not appear as the natural targets of so many SYN packets in such a short time. Still, there are new phenomena, including a very small percentage of "scanners" (and here we are speaking only of work weights > 80%), that are in some way not explicable, and knowing the destination ports used may not be sufficient.
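The anomaly described above can be sketched as a simple score. This is a hypothetical formulation in the same spirit as our work weight, not the exact formula, and the counter names are illustrative only:

```python
# Hypothetical sketch of a SYN-based anomaly score in the spirit of the
# work weight: the share of a host's traffic that represents unanswered
# connection attempts. Counter names are illustrative, not the actual
# tuple fields used by the tool.

def work_weight(syns_sent, fins_seen, resets_seen, total_pkts):
    """Return a 0..100 score; high values mean many SYNs with no payoff."""
    if total_pkts == 0:
        return 0
    w = 100 * (syns_sent + resets_seen - fins_seen) / total_pkts
    return max(0, min(100, round(w)))

# A host that sent 100 SYNs and whose 100 observed packets include no
# FINs at all scores 100 -- the pure anomaly case discussed above.
print(work_weight(syns_sent=100, fins_seen=0, resets_seen=0, total_pkts=100))
```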
Below we briefly introduce the port signature report, with some examples. In the process we discuss both some explicit validation testing (which is a work in progress) and some observations we have made based on six months of experience with this tool.

port signatures:
ip src:  flags  work:  SA/S:  port signature:
1        (WOM)  100:   0:     [445,100]
2        (WOM)  100:   0:     [24910,100]
3        (WOR)  100:   0:     [5554,65][9898,34]
3.1      (WOR)  100:   0:     [5554,65][9898,34]
4        ()     6:     100:   [1054,2] ...
5        ()     2:     10:    [1124,14]...[6881,36][6882,5]...
6        ()     22:    0:     [1433,99][3536,0]
7        (WOR)  100:   0:     [139,33][1025,22][2745,21][6129,23]

The port signature report given here is simplified from the current version for reasons of space, and consists of a small set of illustrative examples taken from one real PSU report from fall 2004. The portreport is derived from the front-end SYN tuple and represents the subset of hosts produced by the worm metric; thus we can say the hosts in question have, for the most part, produced N more SYNs than FINs. Each entry begins with the IP source in question, with statistics for each individual IP source given per line. In addition to three metrics (flags, work, and SA/S), the primary mechanism here is the port signature on the far right of each line. The port signature includes 1..10 two-tuple port samples, each consisting of a destination port and a packet frequency count for that port. The number of buckets for destination ports is currently set to 10 (we use an ellipsis above for cases where the entire port sample space is filled). For example, the third entry shows that packets were sent to TCP ports 5554 and 9898 by IP source 3; the former received 65% of the packets, and the latter received 34%. The port signature report is sorted from top to bottom by its logical key, the IP source address. Here we replace real IP addresses with logical numbers as substitutes.
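The construction of a port signature from a host's sampled destination ports can be sketched as follows. This is an illustrative reconstruction, not the actual front-end code; the `buckets` parameter mirrors the 10-bucket limit described above:

```python
from collections import Counter

def port_signature(dst_ports, buckets=10):
    """Build a port signature: up to `buckets` (port, percent) samples,
    sorted from low port to high port as in the report above."""
    counts = Counter(dst_ports)
    top = counts.most_common(buckets)              # keep the busiest ports
    total = len(dst_ports)
    sig = sorted((port, 100 * n // total) for port, n in top)
    return "".join(f"[{port},{pct}]" for port, pct in sig)

# A dabber-like source: about two thirds of packets to 5554, a third to 9898.
pkts = [5554] * 65 + [9898] * 34 + [445] * 1
print(port_signature(pkts))   # -> [445,1][5554,65][9898,34]
```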
(IP source 1 will be referred to as example 1, etc.). A sorted IP source space is useful because one can see possible "nearby" groupings of distributed IP attacks, and of course, one can easily view one's own IP source address space for outbound attacks. For example, we have observed agobot-based attacks in which all the IP source addresses in a /24 space appear to be attacking the same remote destination ports. In our report above, there are two attacks that appear similar based on their ports coming from the same network (3 and 3.1). The port signature is also sorted from low port to high port, and this helps us see similar attacks using the same set of ports.

The flags metric shows us whether or not the worm candidate is receiving 2-way data. Flags here include:
1. W - the work weight is >= 90%.
2. 0 - few FINs, if any, are returned.
3. R - large numbers of resets are being returned.
4. M - few non-reset data packets are being returned.

The work metric is shown next. We have done a statistical analysis of the distribution of work metrics during both "normal" times and during large distributed attack periods (as shown in our worm graph, see []). During normal periods (most times), the work weights tend to cluster around low values, say 0..20%, and high values, 80..100%. This corresponds to our empirical hunch that work weights of 80% or higher usually indicate a worm, and more rarely a misbehaving application. During large attacks, the metric clusters predominantly in the high zone and tends to 100%. PSU source IPs are mostly clients, of which a few are infected dormitory hosts, and others tend to be running P2P clients like bittorrent. External hosts divide up into mostly TCP-based worms, the occasional scanner (a la nmap), and a smaller set of puzzling but likely benign phenomena that we call the "noisy web server". The noisy web server (example 4) and P2P apps (example 5) tend to have low work weights.
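Derivation of the flag string can be sketched as below. Apart from the 90% cutoff for W, which is stated above, the thresholds here are our own illustrative guesses, not the tool's actual cutoffs:

```python
def flags(work, fins, resets, data_pkts, syns):
    """Derive the W/0/R/M flag string described above. All thresholds
    except W's 90% cutoff are illustrative guesses."""
    f = ""
    if work >= 90:
        f += "W"                      # high work weight
    if fins < 0.05 * syns:
        f += "0"                      # few FINs, if any, returned
    if resets > 0.5 * syns:
        f += "R"                      # large numbers of resets returned
    if data_pkts < 0.05 * syns:
        f += "M"                      # few non-reset data packets returned
    return "(" + f + ")"

# A worm-like source: work 100, 100 SYNs, and nothing useful coming back.
print(flags(work=100, fins=0, resets=0, data_pkts=0, syns=100))  # -> (W0M)
```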
Worms (examples 1, 2, 3, and 3.1) tend to have high work weights. We could choose to show only IP sources with high work weights because of the high rate of "worminess". Out of thousands of instances of "worms", we have seen fewer than 10 cases that were not worms. These cases are true anomalies in that something is wrong, but they are not necessarily worms. Three example anomalies spotted (and explained) so far include:
1. one case of a popular meeting application that perhaps over-enthusiastically tries to reconnect to its server when the server is taken down,
2. well-known (as opposed to infected) campus email servers attempting to forward email error messages to spammers (which, given fake return IP addresses, will never work), and
3. certain P2P clients (often Gnutella-based) that have a very low success rate for peer connections.

Examples 1-3.1 and 7 show work metrics at 100% for various "real" worms. 3 and 3.1 are examples of the dabber worm [dabber]. Example 7 is an old phenomenon seen many times, and is some form of phatbot/agobot attack. Taken together, these two examples illustrate a very interesting forensic possibility: the display of the ports may allow you to identify the worm. On the other hand, example 2 is a new phenomenon as of late November 2004 which we have not seen before; based on experience and the work metric it is highly dubious, but we have not yet identified it. In summer 2004, we performed a number of Microsoft file share tests and looked at hosts screened from the Internet using various Microsoft file share and SQL services. We dumped their associated SYN tuples during short and long sample periods to see if ports used by Microsoft (with TCP), including 135, 139, 445, and 1433, were likely to ever show up in the worm metric sample. The answer was no, which conformed to the intuition of various local security experts.
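The forensic possibility mentioned above can be sketched as a simple lookup from sampled port sets to known worm signatures. This is hypothetical; the two table entries come from the examples in the text, and a real table would be far larger:

```python
# Hypothetical forensic lookup: match a sampled destination-port set
# against port sets associated with known worms. The entries below are
# taken from the report examples discussed in the text.
KNOWN_SIGNATURES = {
    frozenset({5554, 9898}): "dabber",                       # examples 3, 3.1
    frozenset({139, 1025, 2745, 6129}): "phatbot/agobot",    # example 7
}

def identify(ports):
    """Return a worm name if the sampled ports match a known signature."""
    return KNOWN_SIGNATURES.get(frozenset(ports), "unknown")

print(identify([5554, 9898]))    # -> dabber
print(identify([24910]))         # -> unknown (example 2 in the text)
```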
This per-application testing is a very good idea, and we intend to continue it in the future with other applications (nmap, nessus, and various P2P applications). For now we can state that instances of ports 445, 135, and 139 with high packet rates are worms. At this point we are satisfied with our general understanding of the work metric in the high range. However, there is work yet to be done in understanding why IP sources may appear in the lower range. For example, we have the noisy web server phenomenon mentioned before, shown in example 4. We do not yet understand why PSU contact with certain web servers produces large numbers of SYNs that exceed the FIN count. The work weight tends to be low; thus there is two-way data exchange. The SA/S metric is useful here. It compares the total number of SYN+ACK packets sent against the total number of SYNs sent by an IP source. Thus it gives us a rough idea as to whether a system has client tendencies (also true of worms), is a server (SA/S equals 100%, as with the noisy web server), or is somewhere in between, which is true for hosts running P2P clients (example 5). Example 6 is interesting simply because it too has a low work weight, and yet we know from the previously mentioned application testing that any mention of port 1433 in the work report (with large numbers of packets) is an attack. The work weight is low here because this is an unsuccessful password-guessing attack on SQL servers; thus there is (nefarious) work being done. Here the use of ports is invaluable. It is also important to remember that one does not need a high work weight to have an attack. Example 5 is also interesting, as it is quite common to see P2P applications appear in the portreport output. This is presumably because P2P applications in general will have some set of connection peers, with a subset of those peers possibly unavailable simply because some of the peer IP addresses are old.
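A sketch of the SA/S computation, under one plausible reading of the definition above (we cap the ratio at 100%; the actual counters and edge-case handling in the tool may differ):

```python
def sa_over_s(synacks_sent, syns_sent):
    """SA/S: SYN+ACKs sent vs. SYNs sent, as a rough client/server cue.
    Illustrative only; capped at 100 and with an assumed rule for hosts
    that send no plain SYNs at all."""
    if syns_sent == 0:
        return 100 if synacks_sent else 0
    return min(100, round(100 * synacks_sent / syns_sent))

# Pure client (or worm): sends SYNs but never SYN+ACKs -> 0.
print(sa_over_s(0, 100))     # -> 0
# Noisy web server: answers with SYN+ACKs as often as it SYNs -> 100.
print(sa_over_s(100, 100))   # -> 100
# P2P host (example 5): somewhere in between.
print(sa_over_s(10, 100))    # -> 10
```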
Of course there is no guarantee that a given P2P application is bound to a given port. Still, we can guess that this example is using bittorrent because of ports 6881 and 6882. The SA/S metric is interesting here in that it suggests the host in this example has some server tendencies, although it tends to the client side. In general, we intend to do more research on the lower work weights. For example, we hope to improve our ability to identify P2P applications.
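The kind of hedged port-based P2P guessing described here can be sketched as follows. The hint ports are common defaults only (an assumption on our part), and, as noted above, P2P applications need not be bound to them:

```python
# Hypothetical heuristic: well-known default ports that merely *hint* at a
# P2P application. A match is a guess, never an identification, since P2P
# applications are not guaranteed to use these ports.
P2P_HINT_PORTS = {6881: "bittorrent", 6882: "bittorrent", 6346: "gnutella"}

def p2p_guess(sampled_ports):
    """Return P2P application names hinted at by a port signature sample."""
    hints = {P2P_HINT_PORTS[p] for p in sampled_ports if p in P2P_HINT_PORTS}
    return sorted(hints) or ["no P2P hint"]

# Example 5's sampled ports include 6881 and 6882.
print(p2p_guess([1124, 6881, 6882]))   # -> ['bittorrent']
```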