This is an English translation of the research paper "高速网络环境下入侵检测系统结构研究" ("Architecture of Intrusion Detection for High-Speed Networks"), published in the Journal of Computer Research and Development, Vol. 41, No. 9, September 2004.

Chen2004a.en.html
This HTML file for offline reading
Chen2004a.pdf
Original Chinese PDF
Chen2004a.en.zip
Source code for this HTML version of the English translation

Discussion thread


Architecture of Intrusion Detection for High-Speed Networks

Journal of Computer Research and Development
Vol. 41, No. 9
Sep. 2004

Chen Xunxun, Fang Binxing, and Li Lei
(PACT Laboratory, College of Computer Science, Harbin Institute of Technology, Harbin 150001)
(cxx@mail.nisac.gov.cn)

Keywords
IDS; high-speed networks; signal coupling; aggregation and splitting; hashing; data stream bus; sensor
CLC category number
TP393.08

Date received: 2003-07-15; date revised: 2003-10-17
Fund project: the National High-Tech R&D Program of China (863 Program) (2002AA147020)

Abstract

This paper proposes an architecture for intrusion detection systems in high-speed network environments. It effectively addresses the processing performance challenges of cybersecurity analysis on multi-line, high-bandwidth backbone links by integrating raw signal coupling technology, aggregation and balancing technology, and an efficient data stream engine (packet capture and stream reassembly). The architecture is cleanly layered, highly scalable, and adaptable, and it can accommodate a wide range of complex network environments, from low-speed access networks to high-speed backbone networks (multiple OC-48 links and above), as well as various types of interface formats. A system based on this architecture achieves line-speed performance in an eight-line OC-48 network environment when configured with 16 data stream buses, exceeding the best performance levels reported for similar systems.

1 Introduction

Intrusion detection systems (IDS) are currently an important and practical technical means of addressing cybersecurity issues. As network scale expands, bandwidth increases, technology advances, and the number of users surges, high-speed network environments are becoming increasingly prevalent. We define a high-speed network environment as one with multiple backbone lines operating at 2.5 Gbps or higher. An IDS is typically deployed at the egress of the protected network and on core switches. One of the outstanding problems in current IDS research is the challenge of data-processing speed. The well-known information security research and consulting firm Gartner, Inc. argued that IDS would be gradually phased out by 2005[1]. One of the four key reasons given for this argument was that IDSes of the time could not handle transmission rates above 600 Mbps. The U.S. Department of Energy has identified high-speed intrusion detection as one of the key research priorities for IDS[2].

To date, relatively few results have been published internationally. Sekar et al. proposed a high-performance IDS capable of processing speeds up to 500 Mbps[3], but it operates on offline data. ISS's Gigabit Sentry and related Cisco products can reach a processing speed of 50 Kpps. A more practical high-speed intrusion detection architecture is the stateful intrusion detection system for high-speed networks proposed by Christopher Kruegel et al.[4] This system consists of six parts: a network tap; a traffic scatterer; a set of m traffic slicers S0, …, Sm−1; a switch; a set of n data stream reassemblers R0, …, Rn−1; and a set of p intrusion detection sensors. The tap captures the sequence of data frames on the high-bandwidth link within a specific time window, denoted F = ⟨f0, f1, …, ft⟩, and transmits it to the traffic scatterer. The scatterer uses a classification algorithm to split F into m subsequences Fj: 0 ≤ j < m. Each Fj is a (possibly empty) subset of F, and each data frame fi belongs to exactly one subsequence Fj, so F0 ∪ F1 ∪ … ∪ Fm−1 = F. The classification algorithm splits F evenly in round-robin fashion, so each Fj carries one mth of the total traffic. Each subsequence Fj is sent to a traffic slicer Sj, which forwards associated data frames in Fj (those belonging to the same attack scenario) to the same reassembler. The m traffic slicers and n reassemblers are interconnected via a switch to form an m × n unidirectional cross matrix. The original high-volume traffic is thus divided and redirected into several smaller streams that individual sensors can handle, solving the throughput problem without losing information.
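As a minimal illustration of this round-robin scatter (a sketch under assumed types, not the implementation from [4]): frame i of F is sent to slicer i mod m, so the subsequences partition F evenly and their union is F.

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    #define M 4  /* number of traffic slicers; hypothetical value */

    typedef struct { const uint8_t *data; size_t len; } frame_t;

    /* Stand-in for the transport to slicer j; a real system would forward
     * the frame over the scatterer's output links. */
    static void send_to_slicer(int j, const frame_t *f)
    {
        printf("frame of %zu bytes -> slicer S%d\n", f->len, j);
    }

    /* Round-robin split: frame i of F goes to subsequence F_j, j = i mod M,
     * so each F_j carries one Mth of the traffic. */
    static void scatter(const frame_t *frames, size_t count)
    {
        for (size_t i = 0; i < count; i++)
            send_to_slicer((int)(i % M), &frames[i]);
    }

    int main(void)
    {
        frame_t demo[5] = { { NULL, 60 }, { NULL, 1514 }, { NULL, 40 },
                            { NULL, 576 }, { NULL, 1500 } };
        scatter(demo, 5);
        return 0;
    }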

This architecture effectively addresses the traffic-splitting problem in a high-traffic environment, allowing the backend processing system to handle data far beyond the capacity of a single-node processor through clustering, and it is highly scalable. However, it has two shortcomings. First, it monitors only a single high-traffic line, whereas in most real deployments multiple lines must be monitored, so an extension is needed for multi-line monitoring. Second, communication channels are required between intrusion detection sensors to enable collaborative analysis of time-related events across multiple lines. We address these two points and integrate the traffic scatterer, slicers, and switch into a unified subsystem, creating an intrusion detection architecture suitable for complex multi-line, high-traffic network environments, as shown in Figure 1:

An abstract network diagram. On the left, three lines labeled “Access point 1”, “Access point 2”, …, “Access point m” lead into a large box labeled “Aggregation and splitting”. On the other side of the large box, three lines emerge labeled “Data stream bus 1”, …, “Data stream bus n−1”, “Data stream bus n”. Each of the data stream bus lines is connected to two boxes labeled “Sensor”, with an ellipsis in between to indicate that there are more than two sensors per data stream bus line. All the “Sensor” boxes connect into a rectangular box labeled “Switch”. The switch is connected to two ovals labeled “Response” and “Log”.
Figure 1: Architecture of intrusion detection system in high-speed network environments

The architecture consists of a set of m raw signal couplers, an aggregation and splitting subsystem, a set of n data stream buses, n sets of sensors, a response mechanism (optional), and a logging subsystem. The function and structure of each of these subsystems are described below.

2 Raw signal coupler

"Raw signal coupler" refers to a network's raw data output mechanism, such as the mirror port of a switch or the monitoring port of a hub. The data signals on the line are reproduced in their entirety through these ports and sent to the next stage for initial processing. With the development of network and transmission technologies, the single-line bandwidth of raw signal sources has evolved from 10 Mbps (Ethernet, E1/E3, and other interfaces), to 100 Mbps (FE, OC-3, OC-12, and other interfaces), to 1 Gbps (GE, OC-48, and other interfaces), and now to 10 Gbps (10GE, OC-192, and other interfaces). Currently, owing to uneven development among operators and the differing needs of application environments, these raw data interface types coexist in both domestic enterprise networks and operator networks.

The coupling technology for raw signals is based on monitoring mechanisms, of which there are currently four types: shared hub monitoring, device port mirroring monitoring, optical coupling monitoring for certain types of interfaces, and dedicated device monitoring. If the network environment lacks the aforementioned devices/lines, a transport protocol conversion is required. The specific raw signal coupling technologies corresponding to different environments are as follows:

  1. FE/Ethernet line (electrical port) signal coupling technology

    This technology is suitable for data acquisition on Ethernet and fast Ethernet lines with electrical interfaces. Its advantage is that it acquires bidirectional data simultaneously, reducing the need for aggregation. Its disadvantage is that when the sum of the bidirectional traffic approaches or exceeds the bandwidth of a single line, the monitoring line becomes congested. On shared hubs, increased collisions may cause a sharp decline in available bandwidth. For mirror ports, monitoring capacity can be improved by adding mirror ports, but this requires aggregation technology. The solution is to use high-bandwidth ports as monitoring ports, such as using a GE port to monitor FE ports, to avoid congestion (as shown in Figure 2).

    A central rectangle is labeled “Ethernet/FE Hub/Switch”. On the left and right, it is connected over bidirectional arrows labeled “Ethernet/FE” to ovals labeled “Network A” and “Network B”, respectively. Downward there is a single, out-only arrow labeled “Ethernet/FE” and “Shared port/monitoring port” pointing at the label “Monitoring data”.

    Figure 2: FE/Ethernet line (electrical port) signal coupling technology

  2. Low-speed WAN line signal coupling technology

    The conversion device in Figure 3 can be either a layer-3 device or a link-layer device. If a layer-3 device is used, routing must be configured on it, but it can support multi-line WAN aggregation and conversion, offering broader applicability. If a link-layer device is used, no layer-3 configuration is needed, only simple installation and setup; however, it supports only single-line conversion and not WAN aggregation. This technology is suitable for signal coupling on WAN lines such as DDN, E1, and E3 (as shown in Figure 3).

    A central rectangle is labeled “Ethernet/FE Hub/Switch”. On the left and right, it is connected over bidirectional arrows labeled “Ethernet/FE” to ovals labeled “Conversion device”, and then continuing over bidirectional arrows labeled “WAN line” to ovals labeled “Network A” and “Network B”, respectively. Downward there is a single, out-only arrow labeled “Ethernet/FE” and “Shared port/monitoring port” pointing at the label “Monitoring data”.

    Figure 3: Low-speed WAN line signal coupling technology

  3. Optical fiber link signal coupling technology

    This technology is widely used in cable TV signal transmission and billing systems. It employs optical couplers to duplicate the optical signal into one or more copies. It is a passive device and operates stably. However, the two directions of traffic are output separately, so aggregation is required, and it cannot filter out unneeded data. It is suitable for GE, 10/100BASE-FX, and other optical interfaces (as shown in Figure 4).

    Two ovals stand at the left and right sides of the diagram, labeled “Network A” and “Network B”. Network A is connected to network B with a unidirectional arrow labeled “Optical fiber link”, and likewise network B is connected back to network A with a unidirectional arrow. Each arrow line has a hollow box on it, each of which has a unidirectional arrow pointing down at the label “Monitoring data”.

    Figure 4: Optical fiber link signal coupling technology

  4. Dedicated signal coupling device

    A dedicated data acquisition device can preprocess the monitoring data before outputting it, for example by filtering out invalid packets, attack packets, and so on. Because the device is inserted in series into the line, it may degrade the raw data signal, so it must meet high standards of reliability and fault tolerance. This technology is commonly used in communication instruments (as shown in Figure 5).

    Two ovals stand at the left and right sides of the diagram, labeled “Network A” and “Network B”. Network A is connected to network B with an unlabeled, unidirectional arrow, and likewise network B is connected back to network A with an unlabeled, unidirectional arrow. The arrows pass behind a large rectangular box with the label “Dedicated device”, within which is a smaller box labeled “Preprocessing unit”. Two arrows point downward out of the dedicated device box to the label “Monitoring data”. Parallel to the arrows connecting network A and network B, and pointing down at the monitoring data, there is an additional set of arrows drawn with a heavy dashed line, in front of the central boxes rather than behind them.

    Figure 5: Dedicated signal coupling device

3 Aggregation and splitting subsystem

This subsystem is responsible for performing necessary interface conversions of raw network data streams—converting network device communication interfaces (e.g., POS, ATM, E1, etc.) into host communication interfaces (e.g., FE, GE, etc.), and merging and evenly splitting the data streams before outputting them to the processor cluster. In a low-bandwidth network environment, if the output port of the monitoring data is in the form of a host interface (e.g., FE, GE, etc.), then multiple interface cards can be configured on the sensor units to receive the raw data signals and process them directly. If there are multiple lines between the protected network and the external network, there are multiple monitoring data lines that need to be aggregated, classified by stream, and then output to the data stream bus in a balanced manner.

  1. Conversion device aggregation technology

    This technology is applicable to FE/GE monitoring lines. It assigns a separate VLAN to each monitoring line's receiving port and configures a high-speed aggregation port (GE). Using SPAN, all incoming traffic on the monitoring receiving ports is mirrored to the aggregation port for output. In this way, when a monitoring data stream enters the aggregation switch, a copy is made and sent out the aggregation port. The switch silently discards the original packets themselves, because it finds no output port matching their destination MAC addresses and the VLAN containing each receiving port has no other active ports to broadcast to. This technology is suitable for monitoring environments where the total traffic from the monitoring lines is less than the bandwidth the packet processor can handle (as shown in Figure 6).

    A large rectangle labeled “Aggregation and conversion device” is divided horizontally into three tiers. The lowest tier is labeled “VLAN1”. The middle tier is unlabeled. The top tier is divided into partitions: “VLAN2”, “VLAN3”, “…”, “VLAN n−1”. Solid labeled arrows enter from the top: “Monitoring line 1” into VLAN2, “Monitoring line 2” into VLAN3, …, “Monitoring line n” into VLAN n−1. On entering the top of the large rectangle, the solid arrows each split into two dashed arrows. One of each pair of dashed arrows passes through its top VLAN partition and ends in the middle tier with a big “X”. The other dashed arrow goes to the center of the bottom tier (VLAN1), where it converges with other dashed arrows from other monitoring lines. Where the dashed arrows converge, a solid arrow emerges and points to a label “Aggregation output line GE”.

    Figure 6: Conversion device aggregation technology

  2. Aggregation and balancing technologies based on hash stream classification

    When the total traffic from the monitoring lines exceeds the bandwidth capacity of a single packet processor, the traffic must be split. To ensure balanced distribution, the splitting algorithm must operate at a sufficiently fine granularity. There are currently two options. One is to split traffic by destination IP address. The advantage of this method is that an off-the-shelf routing device can be used directly; the disadvantage is that the output traffic varies greatly, with sharp spikes on the traffic curve and thus a poor mean square deviation. Under extreme conditions, the output traffic may exceed the bandwidth of an output port or the processing capacity of a packet processor, resulting in packet loss or processor congestion. Splitting by IP address makes it difficult to divide traffic evenly across output ports, but aside from bursts, output traffic remains relatively stable most of the time, so this technology is still an economical choice for lightly loaded network environments. The other option is stream-based traffic splitting (originally derived from layer 4–7 switches and layer 4–7 load balancing systems, as shown in Figure 7); i.e., performing a hash operation H(Sip, Dip, Sp, Dp) over several stream-related parameters of a data packet p: the source address Sip, destination address Dip, source port Sp, and destination port Dp. H(p) must satisfy the following condition:

    H(Sip, Dip, Sp, Dp) = H(Dip, Sip, Dp, Sp),

    so that the two directions of the same connection hash to the same value and are therefore assigned to the same output port.

    There are n output ports. For an incoming packet p, the output port number Tn is calculated as follows:

    Tn = H(Sip, Dip, Sp, Dp) mod n + 1.

    The larger the value space of H(p) relative to n, the finer the traffic-splitting granularity, resulting in more balanced splitting and stronger resistance to traffic spikes. This is because, even when large traffic volumes concentrate on a few IP addresses (such as access traffic to large websites or high-speed proxy servers), the destination address of each TCP stream stays fixed while the source ports keep changing (usually incrementing cyclically, for most clients). Many newer general-purpose layer-3 devices support traffic classification features that can perform such stream-based splitting effectively; a minimal sketch of the scheme is given after Figure 9 below. Experimental comparison data for the two splitting technologies are shown in Figure 8 and Figure 9.

    A large rectangle labeled “Aggregation and conversion device” is divided horizontally into three tiers. The lowest tier is divided into partitions: “T1”, “T2”, “…”, “Tn”. The middle tier is undivided and unlabeled. The upper tier is divided into partitions: “R1”, “R2”, “…”, “Rm”. Solid labeled arrows enter from the top: “Monitoring line 1” into R1, “Monitoring line 2” into R2, …, “Monitoring line m” into Rm. From there, each solid arrow splits into n dashed arrows, with each of the arrows going to one of the “T” partitions in the bottom tier, like a fully connected bipartite graph. At the bottom, solid arrows emerge from the “T” partitions: “Splitting output T1”, “Splitting output T2”, …, “Splitting output Tn”.

    Figure 7: Aggregation and balancing technologies based on hash stream classification

    A line chart. The horizontal axis is labeled “Time / 10 s” with a range of 1 to 10. The vertical axis is labeled “pps” with a range of 0 to 4500. There are 8 time series labeled “Node 1” to “Node 8”. Each time series is relatively flat. They all stay within a range of about 2000 to 4000 pps.

    Figure 8: Effect of splitting based on hash stream classification

    A bar chart. The horizontal axis is labeled “Node” and the bars are labeled 1 to 16. The vertical axis is labeled “pps” and its range is 0 to 140000. The bars have variable heights: most are in the range 20000–60000, but node 3 is an outlier at the low end, at only about 5000, and node 14 is an outlier at the high end, at around 130000. The bars are colored different shades of gray, but the color does not seem to correspond to the height.

    Figure 9: Effect of splitting based on IP address
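As a concrete illustration of the hash-based splitting above, here is a minimal sketch. The XOR/addition hash is an assumed choice that satisfies the required symmetry; the paper does not prescribe a particular hash function.

    #include <stdio.h>
    #include <stdint.h>

    /* Symmetric flow hash: XOR and addition are invariant under swapping the
     * (Sip, Sp) pair with the (Dip, Dp) pair, so
     * H(Sip,Dip,Sp,Dp) == H(Dip,Sip,Dp,Sp) and both directions of one
     * connection map to the same output port. */
    static uint32_t flow_hash(uint32_t sip, uint32_t dip,
                              uint16_t sp, uint16_t dp)
    {
        return (sip ^ dip) + ((uint32_t)sp ^ (uint32_t)dp);
    }

    /* Output port number in 1..n, as in the text: T = H mod n + 1. */
    static unsigned output_port(uint32_t sip, uint32_t dip,
                                uint16_t sp, uint16_t dp, unsigned n)
    {
        return flow_hash(sip, dip, sp, dp) % n + 1;
    }

    int main(void)
    {
        /* The two directions of one TCP connection land on the same port. */
        printf("%u\n", output_port(0x0A000001, 0xC0A80101, 1025, 80, 16));
        printf("%u\n", output_port(0xC0A80101, 0x0A000001, 80, 1025, 16));
        return 0;
    }

Any hash that is invariant under exchanging the source and destination pairs works here; applying a mixing function to each field before combining preserves the symmetry while improving dispersion.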

4 Data stream bus

Since sensors must perform operations such as recovery, decoding, matching, and repeated searching on the raw network data, they consume substantial CPU resources. Multiple specialized sensors with different functions are therefore needed to process the same raw data. In a high-speed, large-bandwidth network environment it is impractical to store the raw network data before processing, so a bus system is required to broadcast the raw data to all sensors. For the currently common host interfaces, FE and GE, two forms are widely used: the fast Ethernet bus and optical coupling. The fast Ethernet bus connects the sensors and the output ports of the traffic-splitting subsystem via a hub device, with the sensors' receiving ports set to silent (receive-only) mode, enabling one-to-many data replication at the data link layer. An advantage of this setup is that the signal is regenerated, so it is replicated without energy loss; a drawback is that erroneous packets are discarded. The optical coupling method uses an optical coupler for passive signal duplication at the physical layer. Its advantage is that no abnormal packets are lost; however, the signal energy decreases, limiting the number of sensors that can be connected.

5 Sensor

IDS sensors perform functions such as packet reception, data stream recovery, decoding, and analysis of attack characteristics. Stateful sensors also need to maintain the state of each connection. Sensor software consists of a data stream engine and an analysis engine.

5.1 Data stream engine

The data stream engine has two functions. The first is to receive raw data packets from the packet shaper and place them into the user buffer, which is generally known as packet capture. The second is stream reassembly, which reassembles step by step from data frames to IP fragments, to IP packets, and finally to TCP-layer connections.

The development of packet capture technology has gone through three stages so far. The first stage uses the library functions provided by the operating system, such as the libpcap library on Linux and the Winsock2 library on Windows, which offer good hardware and operating system compatibility. However, a packet arrives at the network card, crosses into kernel memory via DMA, and is then copied several more times before reaching user space; this path carries significant overhead, making high packet capture rates difficult to achieve. (A 2 GHz CPU running Linux can achieve a capture rate of about 50 Kpps, with FreeBSD performing slightly better.) Additionally, some operating systems (such as Linux) generate an interrupt for each received packet, so under heavy traffic the system may be overwhelmed by frequent interrupts and become paralyzed.
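For reference, a minimal first-stage capture loop using the standard libpcap API might look like the following; the interface name "eth0" and the simple packet counter are illustrative choices:

    #include <stdio.h>
    #include <pcap.h>

    /* Called once per captured packet; a real sensor would hand the bytes to
     * the stream-reassembly stage instead of just counting. */
    static void on_packet(u_char *user, const struct pcap_pkthdr *h,
                          const u_char *bytes)
    {
        (void)bytes;
        unsigned long *count = (unsigned long *)user;
        printf("packet %lu: %u bytes\n", ++*count, h->len);
    }

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        unsigned long count = 0;

        /* Open the device in promiscuous mode: 65535-byte snaplen, 1 s
         * read timeout. "eth0" is an assumed interface name. */
        pcap_t *p = pcap_open_live("eth0", 65535, 1, 1000, errbuf);
        if (p == NULL) {
            fprintf(stderr, "pcap_open_live: %s\n", errbuf);
            return 1;
        }
        pcap_loop(p, -1, on_packet, (u_char *)&count);  /* capture forever */
        pcap_close(p);
        return 0;
    }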

The second stage uses buffering techniques: for example, tuning network card parameters to reduce interrupt frequency, and writing a specialized driver that buffers packets after they arrive in kernel memory, submitting them to the user application only once a certain number (e.g., 1 K packets) has accumulated. This mode still maintains relatively good hardware compatibility, at the cost of stronger coupling with the operating system, and delivers a significant performance improvement over the first stage. (With a 2 GHz CPU running Linux, the capture rate can exceed 100 Kpps.)

The third stage uses zero-copy technology. The network card driver is modified so that, after a certain number of packets has been received, the card transfers them via DMA directly into a buffer in user memory space, reducing the number of memory copies on the capture path to zero (hence the name "zero-copy"). This mode offers poor hardware transparency, since it requires detailed knowledge of the network card, and because capture bypasses the system's protocol stack it couples more strongly with the operating system. However, by removing the memory-copy bottleneck, it improves performance substantially. (On a 2 GHz CPU running Linux, this mode can achieve a capture rate of over 600 Kpps; with mixed packet lengths it readily sustains true gigabit line-speed capture, making gigabit-interface firewalls genuinely practical.)
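The Linux PACKET_RX_RING (PACKET_MMAP) facility is a standard mechanism in this spirit: the kernel and the application share one mmap-ed ring, so no per-packet copy into user space is needed (the custom drivers described above go further and DMA directly into user memory). A condensed sketch, with error handling reduced to a minimum:

    #include <stdio.h>
    #include <poll.h>
    #include <sys/socket.h>
    #include <sys/mman.h>
    #include <arpa/inet.h>        /* htons */
    #include <linux/if_packet.h>  /* PACKET_RX_RING, struct tpacket_hdr */
    #include <linux/if_ether.h>   /* ETH_P_ALL */

    int main(void)
    {
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0) { perror("socket (root required)"); return 1; }

        /* 64 blocks of 4 KB, each holding two 2 KB frame slots. */
        struct tpacket_req req = {
            .tp_block_size = 4096, .tp_block_nr = 64,
            .tp_frame_size = 2048, .tp_frame_nr = 128,
        };
        if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)) < 0) {
            perror("PACKET_RX_RING"); return 1;
        }

        /* Map the ring once; the kernel writes packets straight into it. */
        size_t len = (size_t)req.tp_block_size * req.tp_block_nr;
        unsigned char *ring = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
        if (ring == MAP_FAILED) { perror("mmap"); return 1; }

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        for (unsigned i = 0; ; i = (i + 1) % req.tp_frame_nr) {
            struct tpacket_hdr *h =
                (struct tpacket_hdr *)(ring + (size_t)i * req.tp_frame_size);
            while (!(h->tp_status & TP_STATUS_USER))
                poll(&pfd, 1, -1);            /* wait until the slot is filled */
            printf("frame %u: %u bytes\n", i, h->tp_len);
            h->tp_status = TP_STATUS_KERNEL;  /* hand the slot back */
        }
    }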

Stream reassembly technology is based primarily on the relevant standards of the TCP protocol stack and uses finite state automata (FSA) for reassembly. Because many techniques are specifically designed to disrupt IDS stream reassembly, such as the insertion attacks, evasion attacks, and connection-pool DoS attacks discussed by Mark Handley et al.[5], corresponding identification methods must be built into the reassembly process to counter them. Typically, methods such as anomaly detection (e.g., PHAD, proposed by Mahoney et al.[6]) and traffic shaping[4] can be used.
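The following toy sketch conveys the FSA flavor of such reassembly: a per-direction connection state plus the next expected sequence number, delivering only in-order payload. It is illustrative only and omits out-of-order buffering, overlap checks, and the anti-evasion logic just mentioned:

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    typedef enum { CLOSED, SYN_SEEN, ESTABLISHED, FIN_SEEN } tcp_state_t;

    typedef struct {
        tcp_state_t state;
        uint32_t next_seq;   /* next in-order byte expected from this side */
    } half_conn_t;

    /* Feed one TCP segment from one direction of a connection. Returns true
     * when the payload bytes [seq, seq+len) are in order and can be handed
     * to the analysis engine. */
    static bool tcp_fsa_input(half_conn_t *c, bool syn, bool fin,
                              uint32_t seq, uint32_t len)
    {
        switch (c->state) {
        case CLOSED:
            if (syn) {                    /* SYN consumes one sequence number */
                c->state = SYN_SEEN;
                c->next_seq = seq + 1;
            }
            return false;
        case SYN_SEEN:
            c->state = ESTABLISHED;
            /* fall through */
        case ESTABLISHED:
            if (seq != c->next_seq)       /* out of order: buffer or drop */
                return false;
            c->next_seq = seq + len;
            if (fin) { c->state = FIN_SEEN; c->next_seq++; }
            return len > 0;
        default:
            return false;
        }
    }

    int main(void)
    {
        half_conn_t c = { CLOSED, 0 };
        tcp_fsa_input(&c, true, false, 1000, 0);                    /* SYN    */
        printf("%d\n", tcp_fsa_input(&c, false, false, 1001, 10));  /* 1      */
        printf("%d\n", tcp_fsa_input(&c, false, false, 1031, 10));  /* 0: gap */
        return 0;
    }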

5.2 Analysis engine

The function of the analysis engine is to perform tasks such as session recovery, high-layer protocol reassembly, decoding, decompression, and code conversion on the data carried within the transport layer, creating a stateful information stream. The engine then matches the information stream against pre-constructed sensitive-information patterns; for successfully matched streams, it logs the events and notifies the response mechanism. These tasks are computationally heavy, making the analysis engine the most CPU-intensive component, and its performance and efficiency determine the processing capability of the whole intrusion detection system. Functionally, the analysis engine is divided into two modules. One is the protocol analysis module, which restores the application-layer protocols of interest in the data streams, analyzes the structure of the information streams, and generates various types of structured information streams. The other is the pattern matcher, which matches each structure of interest within the structured information streams against a library of known sensitive patterns.

The protocol analysis module reconstructs the specified application-layer protocols within the data streams, such as HTTP, SMTP, POP3, and BBS, saves the application-layer protocol state of each stream, and restores the data streams on which content matching can be performed. For example, for HTTP it must parse each header field (URL, HOST, CONTENT-TYPE, etc.), identify the content type of the HTTP body, and determine its starting boundaries. Similarly, for SMTP it must restore the sender, recipient, subject, message body, and attachments, as well as perform decoding, boundary identification, and nested message body identification.
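As a small illustration of the header-parsing step, here is a simplified header-field extractor (a sketch only: real HTTP parsing must also handle case-insensitive names, line folding, and chunked bodies):

    #include <stdio.h>
    #include <string.h>

    /* Extract the value of one HTTP header field (e.g., "Host") from a
     * NUL-terminated request head. Returns bytes copied, or 0 if absent.
     * Simplified: assumes CRLF line endings and case-sensitive names. */
    static size_t http_header_value(const char *head, const char *name,
                                    char *out, size_t outlen)
    {
        size_t namelen = strlen(name);
        for (const char *line = head; line && *line; ) {
            if (strncmp(line, name, namelen) == 0 && line[namelen] == ':') {
                const char *v = line + namelen + 1;
                while (*v == ' ') v++;                 /* skip leading spaces */
                const char *end = strstr(v, "\r\n");
                size_t n = end ? (size_t)(end - v) : strlen(v);
                if (n >= outlen) n = outlen - 1;
                memcpy(out, v, n);
                out[n] = '\0';
                return n;
            }
            line = strstr(line, "\r\n");               /* advance to next line */
            if (line) line += 2;
        }
        return 0;
    }

    int main(void)
    {
        const char req[] = "GET /index.html HTTP/1.1\r\n"
                           "Host: www.example.com\r\n"
                           "Content-Type: text/html\r\n\r\n";
        char host[128];
        if (http_header_value(req, "Host", host, sizeof host))
            printf("Host = %s\n", host);
        return 0;
    }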

To improve efficiency, finite state automata (FSA) are used for string matching of sensitive keywords[7,8]. Since every input information structure must be matched, a space-for-time strategy is adopted: a multi-pattern finite state automaton (MP-FSA) performs one-pass keyword matching over the input information stream. For patterns that require fuzzy matching, the patterns are expanded accordingly, and a multi-pattern extended finite state automaton (MP-EFSA) is used for matching.
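One standard construction for such a multi-pattern FSA is the Aho–Corasick automaton; the paper does not name its exact algorithm, so the sketch below is an assumed, illustrative choice. All keywords are found in a single pass over the input:

    #include <stdio.h>
    #include <string.h>

    #define MAXNODES 1024   /* total trie nodes; sized for a small pattern set */
    #define ALPHA    256

    static int trie[MAXNODES][ALPHA];  /* transition table; node 0 is the root */
    static int fail[MAXNODES];         /* failure links */
    static int out[MAXNODES];          /* bitmask of pattern ids ending here   */
    static int nnodes = 1;

    static void add_pattern(const char *p, int id)   /* id in 0..31 */
    {
        int s = 0;
        for (; *p; p++) {
            unsigned char c = (unsigned char)*p;
            if (!trie[s][c]) trie[s][c] = nnodes++;
            s = trie[s][c];
        }
        out[s] |= 1 << id;
    }

    /* BFS over the trie: compute failure links and complete the transition
     * table so that scanning needs exactly one lookup per input byte. */
    static void build(void)
    {
        int queue[MAXNODES], head = 0, tail = 0;
        for (int c = 0; c < ALPHA; c++)
            if (trie[0][c]) queue[tail++] = trie[0][c];
        while (head < tail) {
            int s = queue[head++];
            out[s] |= out[fail[s]];      /* inherit matches reachable by fail */
            for (int c = 0; c < ALPHA; c++) {
                int t = trie[s][c];
                if (t) { fail[t] = trie[fail[s]][c]; queue[tail++] = t; }
                else     trie[s][c] = trie[fail[s]][c];
            }
        }
    }

    /* One pass over the stream reports every keyword occurrence. */
    static void scan(const char *text)
    {
        for (int s = 0, i = 0; text[i]; i++) {
            s = trie[s][(unsigned char)text[i]];
            if (out[s]) printf("pattern mask 0x%x ends at offset %d\n", out[s], i);
        }
    }

    int main(void)
    {
        add_pattern("passwd", 0);
        add_pattern("root", 1);
        build();
        scan("GET /cgi-bin/view?file=/etc/passwd&user=root");
        return 0;
    }

The space-for-time tradeoff appears directly here: the completed transition table costs memory proportional to nodes × alphabet size, but makes each input byte a single table lookup.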

For non-keyword-based pattern matching, separate processing is required; that is, the same structure must be processed multiple times. Non-keyword-based pattern matching is therefore required to be lightweight.

6 Response mechanism

The response mechanism, also known as the linkage mechanism, is one of the hot research topics in the IDS field. Its function is to block, in real time, the attack traffic flowing between the protected network and external networks, actively terminating the attack; it serves as the active actuator of the intrusion detection system. Because the response mechanism can easily be exploited by attackers, fail on its own, or be used as a stepping stone for DDoS attacks, the academic community has long recommended leaving it disabled under normal circumstances. However, as attack methods grow more sophisticated and can inflict fatal damage on a network in a very short time, the window for human judgment and response keeps shrinking. Therefore, how to respond to attacks effectively in real time, and how to coordinate with gateway and host devices, have become highly significant research topics.

The development of the response mechanism has gone through two stages: IP packet filtering (static and dynamic IP packet filtering) and connection spoofing (transport-layer and application-layer connection spoofing), which has led to a current state where multiple approaches coexist, each tailored to different applications. It is currently evolving towards a third stage—real-time connection filtering.

Static IP packet filtering is realized by the IDS linking with the network-layer devices (routers, layer-3 switches, etc.) at the boundary between the protected network and external networks; the specified IP addresses are filtered by setting access control lists (ACLs) or static routing entries on these devices[8]. Because of the large number of IP addresses to be filtered, most network-layer devices cannot meet the requirements in ACL size and performance, so static routing is more commonly used in practice. With this method, the intrusion detection system can perform access control only through a dedicated client program that writes static entries. This yields coarse granularity (IP-address level), slower response times, and limited capacity; however, the entries can be written statically into the routing devices' configuration files, making them non-volatile.

Dynamic IP packet filtering[8] means that the intrusion detection system uses dynamic routing protocols (such as BGP, OSPF, etc.) with key routing devices to propagate routes, i.e., to inject the IP addresses to be filtered into the routing devices' routing tables. This method offers fast response times and large capacity, but it writes only into the routing table in the device's memory (RAM), so it is volatile, and its granularity is still coarse. Connection spoofing means that the intrusion detection system forges connection-termination signals (such as RST or FIN) while a sensitive connection is in progress and sends them to the source and destination addresses to break the connection. It is characterized by strong real-time capability and fine granularity (connection level), allowing a specific sensitive connection to be interrupted. Its disadvantages are that it depends heavily on the operational status of the analysis system, requires sending packets into the service network, and is vulnerable to DoS attacks.
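A minimal user-space sketch of such transport-layer connection spoofing on Linux follows (an illustration, not the paper's implementation: it needs root privileges, in practice an RST is sent toward both endpoints, and the sequence number must be taken from the monitored stream so that it falls within the receiver's window):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>
    #include <netinet/ip.h>
    #include <netinet/tcp.h>

    /* 16-bit one's-complement sum, used for the IP header checksum and, with
     * a pseudo-header prepended, for the TCP checksum. */
    static uint16_t csum(const void *buf, size_t len)
    {
        const uint16_t *p = buf;
        uint32_t sum = 0;
        while (len > 1) { sum += *p++; len -= 2; }
        if (len) sum += *(const uint8_t *)p;
        while (sum >> 16) sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)~sum;
    }

    /* Forge one RST that appears to come from (sip, sport) toward
     * (dip, dport). Addresses are in network byte order; seq is the current
     * sequence number observed on the sensitive connection. */
    static int send_rst(uint32_t sip, uint32_t dip,
                        uint16_t sport, uint16_t dport, uint32_t seq)
    {
        int fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);  /* root required */
        if (fd < 0) return -1;

        unsigned char pkt[sizeof(struct iphdr) + sizeof(struct tcphdr)] = {0};
        struct iphdr  *ip  = (struct iphdr *)pkt;
        struct tcphdr *tcp = (struct tcphdr *)(pkt + sizeof(*ip));

        ip->version = 4; ip->ihl = 5; ip->ttl = 64;
        ip->protocol = IPPROTO_TCP;
        ip->saddr = sip; ip->daddr = dip;
        ip->tot_len = htons(sizeof(pkt));
        ip->check = csum(ip, sizeof(*ip));

        tcp->source = htons(sport); tcp->dest = htons(dport);
        tcp->seq = htonl(seq); tcp->doff = 5; tcp->rst = 1;

        /* TCP checksum covers a pseudo-header plus the TCP header. */
        struct { uint32_t s, d; uint8_t z, p; uint16_t len; } ph =
            { sip, dip, 0, IPPROTO_TCP, htons(sizeof(*tcp)) };
        unsigned char tmp[sizeof(ph) + sizeof(*tcp)];
        memcpy(tmp, &ph, sizeof(ph));
        memcpy(tmp + sizeof(ph), tcp, sizeof(*tcp));
        tcp->check = csum(tmp, sizeof(tmp));

        struct sockaddr_in to = { .sin_family = AF_INET };
        to.sin_addr.s_addr = dip;
        ssize_t n = sendto(fd, pkt, sizeof(pkt), 0,
                           (struct sockaddr *)&to, sizeof(to));
        close(fd);
        return n < 0 ? -1 : 0;
    }

After identifying a sensitive connection, the sensor would call send_rst toward each endpoint with the sequence numbers taken from the reassembled stream.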

By linking with connection-level firewall devices, data streams can be filtered on the connection five-tuple (transport protocol type, source address, source port, destination address, destination port). Filtering can be applied to any specified combination of five-tuple fields, offering strong real-time capability and fine granularity.
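A minimal sketch of such five-tuple filtering with wildcard field combinations (a linear check per rule; production devices use much faster classification structures):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    /* One filter rule over the connection five-tuple; wildcard bits let any
     * combination of fields be specified, as described above. */
    typedef struct {
        uint8_t  proto;
        uint32_t sip, dip;
        uint16_t sport, dport;
        unsigned wild;              /* bitwise OR of the W_* flags below */
    } rule_t;

    enum { W_PROTO = 1, W_SIP = 2, W_SPORT = 4, W_DIP = 8, W_DPORT = 16 };

    static bool rule_matches(const rule_t *r, uint8_t proto, uint32_t sip,
                             uint32_t dip, uint16_t sport, uint16_t dport)
    {
        return ((r->wild & W_PROTO) || r->proto == proto)
            && ((r->wild & W_SIP)   || r->sip   == sip)
            && ((r->wild & W_SPORT) || r->sport == sport)
            && ((r->wild & W_DIP)   || r->dip   == dip)
            && ((r->wild & W_DPORT) || r->dport == dport);
    }

    int main(void)
    {
        /* Block TCP (protocol 6) to 10.0.0.1:80 from any source. */
        rule_t r = { 6, 0, 0x0A000001, 0, 80, W_SIP | W_SPORT };
        printf("%d\n", rule_matches(&r, 6, 0xC0A80101, 0x0A000001, 1025, 80));
        return 0;
    }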

7 Logging mechanism

For an IDS, the logging mechanism is an essential component: it is the key channel for communicating with users and alerting them[9]. For IDS in high-speed network environments, the performance of the logging mechanism is critical. Since an IDS is an alert system, it not only identifies known attacks but also generates a certain volume of anomaly reports, and in high-speed environments these reports are generated at a rate that often far exceeds human processing capacity, causing important attack alerts to be buried in a flood of irrelevant ones and potentially escape notice[10,11]. Therefore, in high-speed environments the logging mechanism must be redesigned to perform multi-stage processing of alert information, including data fusion, clustering, and classification[12,13,14], in order to highlight important information while suppressing routine information. International research in this area has made some progress, and products have begun to emerge; however, data fusion methods remain relatively simplistic and their effect is still limited[15], so there is still a considerable way to go.

8 Summary

The architecture proposed herein for intrusion detection systems in high-speed network environments effectively addresses the processing performance challenges of cybersecurity analysis on multi-line, large-bandwidth backbone links. The architecture is cleanly layered, highly scalable, and adaptable, and it can accommodate a wide range of complex network environments, from low-speed access networks to high-speed backbone networks, as well as various interface formats. In backbone network experiments, a configuration with 16 data stream buses achieved line-speed processing of network data across eight OC-48 interfaces.

References

[1]
A Haines. Gartner Information Security Hype Cycle Declares Intrusion Detection a Market Failure. http://www3.gartner.com/5_about/press_releases/pr11june2003c.jsp. 2003
[2]
J Allen, A Christie, W Fithen, et al. State of the practice of intrusion detection technologies. Software Engineering Institute, Carnegie Mellon University, Tech Rep: CMU/SEI-99-TR-028, 2000
[3]
R Sekar, Y Guang, S Verma, et al. A high-performance network intrusion detection system. The 6th ACM Conf on Computer and Communications Security, New York, 1999
[4]
C Kruegel, F Valeur, G Vigna, et al. Stateful intrusion detection for high-speed networks. In: Proc of the IEEE Symposium Security and Privacy. Los Alamitos, CA: IEEE Computer Society Press, 2002
[5]
M Handley, V Paxson, C Kreibich. Network intrusion detection: Evasion, traffic normalization, and end-to-end protocol semantics. The USENIX Security Symposium, Washington, DC, 2001
[6]
M V Mahoney, P K Chan. PHAD: Packet header anomaly detection for identifying hostile network traffic. Department of Computer Sciences, Florida Institute of Technology, Tech Rep: 2001-04, 2001
[7]
R Bettati, W Zhao, D Teodor. Real-time intrusion detection and suppression. The 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, CA, 1999
[8]
李蕾, 乔佩利, 陈训逊. 一种 IP 访问控制方法的设计和实现. 信息技术, 2001, (5): 35–38.
(Li Lei, Qiao Peili, Chen Xunxun. An IP access controlling method—Design and implementation. Information Technology (in Chinese), 2001, (5):35–38)
[9]
Dorothy E Denning. An Intrusion-Detection Model. IEEE Trans on Software Engineering, 1987, SE-13(2): 222–232
[10]
K Das. The development of stealthy attacks to evaluate intrusion detection systems: [Master dissertation]. Cambridge, MA: MIT Department of Electrical Engineering and Computer Science, 2000
[11]
D Song, G Shaffer, M Undy. Nidsbench: A network intrusion detection system test suite. The 2nd Int'l Workshop on Recent Advances in Intrusion Detection (RAID), West Lafayette, Indiana, 1999
[12]
W Lee, S Stolfo. Data mining approaches for intrusion detection. The 7th USENIX Security Symposium (SECURITY-98), San Antonio, Texas, 1998
[13]
W W Cohen. Fast effective rule induction. In: Proc of the 12th Int’l Conf on Machine Learning (ICML-95). San Mateo, CA: Morgan Kaufman, 1995. 115–123
[14]
V Paxson. Bro: A system for detecting network intruders in real-time. The 7th USENIX Security Symposium, San Antonio, TX, 1998
[15]
R Lippmann, J W Haines, D J Fried, et al. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, 2000, 34(4): 579–595

Biography photo of 陈训逊 (Chen Xunxun)

陈训逊 (Chen Xunxun), male, born in 1972, Ph.D. candidate. His main research interests include computer networks and information security. He received the first prize of the National Science and Technology Progress Award in 2002.

Biography photo of 方滨兴 (Fang Binxing)

方滨兴 (Fang Binxing), male, born in 1960, professor and doctoral supervisor. His main research interests include computer networks and information security. He received the first prize of the National Science and Technology Progress Award in 2002 (bxfang@mail.cnnisc.gov.cn).

Biography photo of 李蕾 (Li Lei)

李蕾 (Li Lei), female, born in 1972, Ph.D. candidate. Her main research interests include computer networks and information security (lilei@pact518.hit.edu.cn).