A Review of IP Packet Compression Techniques


Ching Shen Tye and Dr. G. Fairhurst
Electronics Research Group, Department of Engineering, Aberdeen University, Scotland, AB24 3UE.
{c.tye, gorry}@erg.abdn.ac.uk
ISBN: 1-9025-6009-4 © 2003 PGNet

Abstract

This paper investigates several compression techniques that may improve the IP packet delivery process, and identifies their limitations. The recent emergence and popularity of the wireless Internet has triggered a demand for improved transmission efficiency, especially where the link has a high cost per transmitted byte. Packet compression at the link layer may speed up the delivery process and provide more efficient use of the available capacity. However, the success of compression depends on several factors: the protocol headers present, the use of encryption, and the type of data being sent.

1 INTRODUCTION

IP networking is replacing many existing networks, e.g. IP telephony may replace a circuit switched telephone network. However, IP introduces packet overhead, and this raises the question of efficiency (defined as the ratio between the total number of information bytes and the total number of received bytes). Efficiency is important where the “cost” of transmission is high. Examples include fixed rate links, where the speed of transmission is limited, and wireless links, where there is a cost for using the radio bandwidth. One effect of low efficiency is that there is less capacity available for other services. It also increases the transit delay of packets across the link (a larger packet takes longer to serialise).
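To make the definition concrete, the short sketch below computes the efficiency of a small VoIP-style packet; the payload and header sizes are illustrative assumptions, not figures taken from this paper.

```python
# Worked example of the efficiency definition above (illustrative numbers):
# efficiency = information bytes / total received bytes.
payload_bytes = 20                      # e.g. one small voice frame
header_bytes = 40                       # IPv4 (20 B) + UDP (8 B) + RTP (12 B)

efficiency = payload_bytes / (payload_bytes + header_bytes)
print(f"efficiency = {efficiency:.2f}") # 0.33: two thirds of the link carries headers
```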

Furthermore, IP version 6 (IPv6) [1] is being deployed in many next generation networks, and especially in broadband wireless networks. IPv6 increases the size of an IP address from 32 bits in IPv4 [2] to 128 bits, resulting in a doubling of the minimum IP header size to 40 B.

In this paper, we assume that successfully decompressed packets are semantically identical to the original packets. In practice, this means that the decompressed packets must exactly match the original packets. This implies the use of lossless data compression techniques.

2 LINK COMPRESSION

One well known way to improve efficiency is to use data compression [3]. This process attempts to yield a compact digital representation of the information, and sends this in place of the original information.

When compression is used at the link layer, it can improve transmission efficiency. There are two implications. First, there is a computational cost associated with the algorithms for compression (at the sender side of the link) and decompression (at the link receiver). This may require additional processor hardware, and may introduce extra delay. In some cases this cost can be justified in terms of the improved efficiency and reduced bandwidth. A second drawback arises from the way in which compression is performed. To decompress a packet, the decompressor also needs to obtain information about the way in which the compression was performed, which we call the “context” [4]. For correct decompression, this context information needs to be reliably sent by the compressor to the decompressor.

3 COMPRESSION TECHNIQUES

3.1 Bulk Compression

The most common form of compression used in computer software is bulk compression. In this technique, all the information in the packet is treated as a block and compressed using a compression algorithm. The compressor constructs a dictionary of common sequences within the information, and matches each sequence to a shorter compressed representation (key strings). Bulk compression may use a pre-defined dictionary (a static context for all IP flows) or a running dictionary built by the compression algorithm (i.e. optimised for a particular IP flow). The receiver must use an identical dictionary for decompression. This method can achieve high compression ratios; however, it has two major drawbacks:


• The dictionary requires a large memory.
• The dictionaries at the compressor and decompressor must be synchronised. A running dictionary system may lose synchronisation (or “context”) when used over a link subject to packet loss. A loss of synchronisation will cause the receiver to discard all packets until the receive dictionary is re-synchronised.
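As an illustration of the dictionary idea described above, the following minimal LZW-style sketch shows how a running dictionary maps recurring byte sequences to shorter integer codes. It is a toy model, not any particular link-layer implementation.

```python
# A minimal LZW-style sketch of dictionary-based bulk compression.
# Toy model only: real link compressors differ in framing and code widths.

def lzw_compress(data: bytes) -> list:
    """Map recurring byte sequences to integer codes ("key strings")."""
    dictionary = {bytes([i]): i for i in range(256)}  # static initial context
    next_code = 256
    sequence = b""
    codes = []
    for value in data:
        candidate = sequence + bytes([value])
        if candidate in dictionary:
            sequence = candidate                # keep extending the match
        else:
            codes.append(dictionary[sequence])  # emit code for longest match
            dictionary[candidate] = next_code   # running dictionary grows
            next_code += 1
            sequence = bytes([value])
    if sequence:
        codes.append(dictionary[sequence])
    return codes

payload = b"ABABABABABABABAB" * 8
codes = lzw_compress(payload)
print(len(payload), "bytes ->", len(codes), "codes")  # repetitive data compresses well
```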

To overcome these limitations on memory and link quality, packet-by-packet dictionary algorithms [5] were developed. These compress each packet individually, sending the context with the packet. This prevents a loss of synchronisation, and also requires a smaller dictionary (less memory). The trade-off is that this achieves lower compression ratios. Another compression technique is the Guess-Table-Based compression algorithm [5]. At the sender, this uses an algorithm to guess the next byte(s) of data based on previous data. If the guess is correct, the byte is not transmitted on the link. The receiver and transmitter must use the same algorithm (context) for the data to be successfully decompressed at the receiver.
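The sketch below illustrates the guess-table idea: the sender and receiver share the same predictor, and correctly guessed bytes are never transmitted. The "repeat the previous byte" predictor is an illustrative stand-in, not the actual algorithm of [5].

```python
# Sketch of guess-table compression: sender and receiver share a predictor;
# correctly guessed bytes are not transmitted. The trivial predictor here is
# an illustrative assumption.

def predict(history: bytes) -> int:
    return history[-1] if history else 0        # guess: same as the last byte

def compress(data: bytes) -> list:
    history, out = b"", []                      # None = "receiver can guess this"
    for value in data:
        out.append(None if value == predict(history) else value)
        history += bytes([value])
    return out

def decompress(stream: list) -> bytes:
    history = b""
    for item in stream:
        value = predict(history) if item is None else item
        history += bytes([value])
    return history

assert decompress(compress(b"aaaabbbbcccc")) == b"aaaabbbbcccc"
```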

4 HEADER COMPRESSION

Bulk compression achieves little benefit when used on protocol header information. The structure of this information varies from packet to packet, and from field to field within the packet headers. Standard bulk compression algorithms cannot take advantage of this structure; however, a compression algorithm that understands the syntax (and possibly semantics) of the packet headers may achieve considerable benefit by exploiting the redundancy that is often present in successive packets of the same IP flow. This is called header compression.

4.1 Van Jacobson Header Compression (VJHC)

The Van Jacobson Header Compression scheme [6] relies on knowledge of the TCP/IPv4 headers. The algorithm first classifies packets into individual flows (i.e. packets that share the same set of {IP addresses, IP protocol type, and TCP port numbers}). State (a context) is then created for each flow, and a Context ID (CID) is assigned to identify the flow at the compressor and decompressor. The sender then omits fields in the header that remain unchanged between successive packets in an IP flow (these may be deduced by using the CID in each packet to refer to the context state at the decompressor). VJHC compresses the IPv4 and TCP headers together as a combined set of fields. Figure 1 shows the TCP/IPv4 header fields. Within an IP flow, more than half of the fields are unchanged between successive packets. The “total length” and “identification” fields are expected to be handled by link framing protocols. The “IP checksum” can also be re-calculated at the receiver. By suppressing these fields at the compressor and restoring them at the decompressor, VJHC can significantly improve transmission efficiency for the packet header.

Figure 1 TCP/IPv4’s header behaviour
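The sketch below illustrates the flow-classification step just described. The FlowKey fields and the dict-based context are illustrative assumptions, not RFC 1144's actual data structures or wire encoding.

```python
# Sketch of VJHC-style flow classification and unchanged-field suppression.
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int

cids = {}       # FlowKey -> Context ID
contexts = {}   # CID -> last full header seen for this flow

def classify(header: dict) -> int:
    """Return the CID for this packet's flow, creating context if needed."""
    key = FlowKey(header["src_ip"], header["dst_ip"], header["protocol"],
                  header["src_port"], header["dst_port"])
    if key not in cids:
        cids[key] = len(cids)                  # assign the next free CID
        contexts[cids[key]] = dict(header)     # full header seeds the context
    return cids[key]

def unchanged_fields(header: dict, cid: int) -> set:
    """Fields equal to the context can be omitted and restored via the CID."""
    return {name for name, value in header.items()
            if contexts[cid].get(name) == value}
```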

Furthermore, the remaining changing fields do not frequently change at the same time, and the compressed header can thus omit these in most cases. The remaining fields usually change only by a small amount in successive packets. A further saving can be achieved by transmitting the difference in the value of the field (i.e. differential encoding), rather than the entire field. VJHC relies on two types of error detection: a CRC at the link layer (to detect corruption of the compressed packet) and the TCP checksum at the transport layer (to detect corruption in the decompressed packet). When errors are detected, the receiver discards the erroneous packet. This creates another problem in the decompression process. Since differential compression techniques are applied, the receiver also loses the context state. The next packet after the discarded packet therefore cannot be decompressed correctly, and must also be discarded. All subsequent packets will be discarded until the next synchronisation (i.e. an uncompressed packet is received, restoring the context state). To overcome this error propagation, the receiver should apply the differential sequence number change from the incoming compressed packet to the sequence number of the last correctly received packet, and so generate a correct sequence number for the packet following the discarded packet.
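A minimal sketch of the differential encoding idea follows; VJHC's actual wire encoding differs, so this is illustrative only.

```python
# Differential (delta) encoding of a changing header field, e.g. a TCP
# sequence number. Illustrative only.

def encode_delta(current: int, context_value: int) -> int:
    return current - context_value      # transmit only the small difference

def decode_delta(delta: int, context_value: int) -> int:
    return context_value + delta        # decompressor rebuilds the full field

# Sequence numbers 1000, 2460, 3920 are sent as deltas 1460, 1460: small,
# regular values. If the context is lost, every later delta decodes wrongly,
# which is exactly the error propagation described above.
assert decode_delta(encode_delta(2460, 1000), 1000) == 2460
```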

Errors in the value of the TCP checksum of packets received by the destination end host must also be considered. When the end host fails to receive TCP data segments (on the forward path), no TCP acknowledgement is sent. The sender eventually suffers a TCP timeout and resends the missing segment(s); these retransmissions also trigger the compressor to resynchronise the context state.

The combined IPv4 and TCP headers can typically be reduced using VJHC from 40 B to 4 B (i.e. 10% of the original size). The technique significantly improves performance over low speed (300 to 19,200 bps) serial links. The main disadvantages of VJHC are the impact of a loss of synchronisation (when not used with a reliable link protocol) and the combined compression of the TCP and IPv4 headers. The scheme does not support recent changes to IP (e.g. ECN, Diffserv) or TCP (e.g. SACK, ECN, the TS option, LFN). The combined compression also prevents the VJHC algorithm from compressing packets with additional headers placed between the IP and TCP headers (e.g. IPSEC AH, or tunnel encapsulations), and it will not compress IPv6, or other transport protocols (such as UDP).

4.2 IP Header Compression (IPHC)

Internet Protocol Header Compression [7] is another scheme that uses the same mechanism as VJHC, omitting the unchanged fields in the header. The main difference between IPHC and VJHC is that IPHC compresses only the IP header. This supports any transport protocol or tunnel encapsulation, as well as ECN and IPv6. IPHC also allows extension to multicast and multi-access links. Like VJHC, a 16 bit CID is used for TCP flows, but a larger CID space is assigned to non-TCP flows.

IPHC handles errors in a similar way to VJHC. IPHC can also use other recovery methods to recover from a TCP checksum failure (e.g. the TWICE algorithm [8]). For non-TCP packets, a periodic uncompressed packet is sent to improve the probability of the context information being correct. This additional information reduces efficiency, but helps bound the impact of packet loss.

IPHC may reduce the IP header to 2 bytes for a non-TCP session and 4 bytes for a TCP session. The main advantage of IPHC is its independence from the transport layer protocols. The drawback is that IPHC is only half as efficient as VJHC for TCP packets.

5 ROBUST HEADER COMPRESSION (ROHC)

The RObust Header Compression scheme [4] is a new header compression scheme being developed in the ROHC Working Group of the IETF. Compared with the previous schemes (in section 4), its major advantages are high robustness and improved efficiency.

A key feature is that the ROHC framework is extensible. This means that new protocols can be added without the need to design a completely new compression protocol. The monolithic approach of VJHC now seems inappropriate, considering the widespread use of tunnel encapsulations, the increasing use of security protocols, and the emergence of IPv6. The penalty for this flexibility is that ROHC is a complicated technique, absorbing all the existing compression techniques and adding a more sophisticated mechanism to achieve robustness and reliability.

Compression and decompression are treated as finite state machines, and can be broken into a series of states. At the compressor, these are Initialisation & Refresh (IR), First Order (FO) and Second Order (SO). At the decompressor, there are three states: No Context (NC), Static Context (SC) and Full Context (FC).

The link starts with no context state for a flow (CID), and therefore can only perform limited compression. Once the compressor has successfully installed context state at the receiver, it may transit to the next state (e.g. the compressor will gradually transit forward from IR → FO → SO as it becomes sufficiently confident that the decompressor has all the information needed to decompress the header).

ROHC defines three operation modes [4]: Unidirectional (U-mode), Bidirectional Optimistic (O-mode) and Bidirectional Reliable (R-mode). All ROHC operation starts in U-mode, and may then transit to O-mode or R-mode depending on the feedback information. U-mode uses a periodic refresh and a time out to update the context (as in IPHC for UDP), O-mode uses the feedback channel for error correction, and R-mode utilises all the available resources to prevent any context loss or loss of synchronisation.

Each operation mode contains all three states (IR, FO and SO), as shown in Figure 2 below. Any errors that occur will force the compressor back to a lower compression state, or gradually to a lower operation mode if more errors are detected or propagated.

Figure 2 ROHC Operation modes
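The simplified sketch below models the compressor side of this state machine. The confidence counter and threshold are illustrative assumptions, not the exact transition logic of RFC 3095.

```python
# Simplified sketch of the ROHC compressor state machine (IR -> FO -> SO).
from enum import Enum

class State(Enum):
    IR = 1   # Initialisation & Refresh: send full headers
    FO = 2   # First Order: send changes against the static context
    SO = 3   # Second Order: send minimal headers

class Compressor:
    def __init__(self, threshold: int = 3):
        self.state = State.IR
        self.confidence = 0          # packets sent since the last transition
        self.threshold = threshold   # confidence needed to move upward

    def packet_sent(self) -> None:
        """Move to a higher compression state once sufficiently confident."""
        self.confidence += 1
        if self.confidence >= self.threshold and self.state is not State.SO:
            self.state = State(self.state.value + 1)
            self.confidence = 0

    def error_reported(self) -> None:
        """Feedback reporting an error forces a return to a lower state."""
        self.state = State.IR
        self.confidence = 0
```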

The basic ROHC packet format is shown in Figure 3; it consists of padding, feedback, header information and payload. These four generic fields allow the decompressor to generate feedback information and send it back to the compressor quickly.

Figure 3 ROHC basic packet structure

Padding: located at the beginning of the compressed header; this aligns headers to byte boundaries.
Feedback: feedback information travels from the decompressor to the compressor, to assist in synchronisation of the context state.
Header: compressed header information.
Payload: packet payload data.

As with the earlier header compression techniques (section 4), ROHC still relies on a link layer CRC to verify the integrity of the compressed packet. To gain the full benefit, ROHC relies on a feedback acknowledgement channel for error discovery and recovery. In some cases, a CRC is also used to protect an entire segment, such as the initialisation and refresh header and ROHC fragmentation.

The main benefit of ROHC is that it may be designed to compress any type of header. For links that may experience loss, it may also be made more robust than existing schemes such as VJHC and IPHC. Moreover, it offers significant benefits for small packets, such as TCP ACKs or VoIP packets, when sent over a link with limited capacity or a low transmission speed.

6 EXPERIMENTAL RESULTS

6.1 Packet by packet bulk compression

This section investigates the performance of several bulk compression algorithms (section 3) when used over the entire IP packet. 10,000 TCP/IPv4 packets were captured from an Ethernet LAN and compressed by both the Huffman [3] and Lempel-Ziv Welch (LZW) [9] algorithms. The relationship between the compression ratio and the original packet length for each compression algorithm is shown in Figures 7(a) and 7(b) respectively. Figure 7(a) shows two curves corresponding to two distinct types of packet payload. The compression ratio of the packets in the lower curve is consistently lower than 1 throughout the entire range of packet lengths (i.e. these packets are expanded in size by the compression algorithm, rather than being reduced). This class includes packets that contain random (or pre-compressed) data, which cannot be successfully compressed by the link compressor using dictionary based compression techniques such as Huffman coding and LZW. The upper curve in Figure 7(a) shows an increase in the compression ratio as the packet length increases, yielding a compression gain for packets whose original length is larger than 550 B. In Huffman coding, the data represented by entries at the beginning of the dictionary are smaller than the corresponding code size, whereas near the end of the dictionary the data size is much bigger than the code size. When the packet size is small, most data in a packet are represented by the first entries in the dictionary; thus, the compression ratio is less than 1. For larger packet sizes, good compression ratios are achieved. A similar result can be found for LZW coding in Figure 7(b); however, LZW maps blocks of variable length into blocks of fixed size, which allows it to obtain better compression than Huffman coding.

Figure 7 Packet by packet compression of TCP/IP packets: (a) Huffman coding, (b) Lempel-Ziv Welch
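The sketch below illustrates the kind of per-packet measurement used in this section; it is an assumed reconstruction, with zlib standing in for the Huffman and LZW coders of the paper.

```python
# Compress each packet independently and compute
# ratio = original size / compressed size (ratio < 1 means expansion).
import os
import zlib

def compression_ratio(packet: bytes) -> float:
    return len(packet) / len(zlib.compress(packet))

text_like = b"GET /index.html HTTP/1.0\r\nHost: example.com\r\n" * 20
random_like = os.urandom(len(text_like))   # models pre-compressed payloads

print(f"text-like packet:   {compression_ratio(text_like):.2f}")   # > 1: gain
print(f"random-like packet: {compression_ratio(random_like):.2f}") # < 1: expands
```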

Figure 8 plots the compression ratio against packet size for 10,000 UDP/IPv4 packets compressed by either the Huffman or LZW algorithm. The compression ratio is below 1 throughout the entire range of original packet sizes, meaning that no compression gain is obtained. Further analysis showed that most of the UDP traffic captured was streaming data (e.g. audio or video), which had already been compressed by a video/audio codec. Some small (control) packets were compressible.

Figure 8 Packet by packet compression of UDP/IP packets: (a) Huffman coding, (b) Lempel-Ziv Welch

6.2 Comparison of techniques

Figure 4 presents a theoretical comparison of the original packet length and the compressed packet length for several types of header. ROHC shows a considerable advantage, and is consistently capable of compressing all of the headers to less than 10 B.

Figure 4 Length of original packets and of packets compressed by several compression schemes

The compression ratios of several compression schemes are compared in Figure 5. VJHC, IPHC and ROHC show a similar relationship between the compression ratio and the original packet length. When the packet length is less than 180 bytes, the header compression schemes achieve a higher compression ratio than bulk compression. In the case of larger packets, bulk compression can outperform the other compression schemes, but only where the data has not already been compressed.

Figure 5 Comparison of the compression ratios of several compression schemes

The use of IPSEC encryption (ESP with an encrypted payload) is also an obstacle to compression. This prevents the use of VJHC, and forces ROHC to compress only the outer (unencrypted) packet headers. Bulk compression will also fail to compress an encrypted payload. One mitigation is to use a bulk compression algorithm prior to applying the IPSEC encryption.
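The ordering matters: the sketch below contrasts compressing before and after encryption. The fake_encrypt helper and payload are hypothetical stand-ins for ESP and real traffic, modelling ciphertext as equal-length random bytes, since good ciphertext is effectively incompressible.

```python
# Why compression must be applied before IPSEC encryption (illustrative only).
import os
import zlib

def fake_encrypt(data: bytes) -> bytes:
    return os.urandom(len(data))            # stand-in for ESP encryption

payload = b"sensor-reading: 21.5C; status: OK\n" * 50

compress_first = fake_encrypt(zlib.compress(payload))  # link carries few bytes
encrypt_first = zlib.compress(fake_encrypt(payload))   # compressor sees noise

print(len(payload), len(compress_first), len(encrypt_first))
# encrypt_first is slightly LARGER than the payload: no gain, wasted CPU.
```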

6.3 Application performance

Header compression increases the efficiency of information delivery, and for low speed transmission may also reduce the packet transit time. The benefit when using small packets (such as VoIP or TCP ACKs) may be very noticeable.

To investigate the overall benefit requires a study of the range of packet sizes encountered for different applications. 10,000 packets were captured from an Ethernet LAN, corresponding to IP flows between several applications, and the resulting data were analysed to determine the distribution of packet sizes.

Figure 6 shows the predicted bit rate saving of header compression for each application. Since Telnet session packets are usually small, they offer good compression. HTTP, FTP and multimedia flows usually consist of large packets, so significant efficiency savings cannot be obtained by solely compressing the protocol headers.

Figure 6 Bit rate saving based on application

7 CONCLUSION

Bulk compression is useful when packets carry large volumes of uncompressed data (especially when the payload is larger than 500 bytes). However, when the IP traffic uses encryption or higher layer compression (e.g. ZIP files, IPCOMP, or multimedia CODECs), there is no benefit from compression. In fact, attempted compression simply wastes sender computing resources.

Header compression is an effective way to reduce the packet header size for small packets (less than approximately 200 B), where compression can provide a significant saving in overall packet size. VJHC can only be applied to TCP/IPv4 packets without any options. IPHC cannot achieve a high compression ratio by compressing just the IP header, since the IP header is only one of multiple headers used in a packet. The existing schemes have a number of weaknesses in the packet headers they can compress, and in their robustness to loss of packets. In the future, it can be expected that header compression techniques based on the ROHC framework will be able to achieve a reasonable compression ratio without a significant loss of robustness to packet loss. At the moment, compression schemes for TCP using ROHC have not been defined, and their behaviour under loss is still to be examined. The practical performance of ROHC is therefore an item for further work.

REFERENCES

[1] S. Deering and R. Hinden, “Internet Protocol, Version 6 (IPv6) Specification,” RFC 1883, 1995.
[2] University of Southern California, Information Sciences Institute, “Internet Protocol,” RFC 791, 1981.
[3] D. A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes,” Proceedings of the IRE, 1952.
[4] C. Bormann et al., “RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed,” RFC 3095, 2001.
[5] HP Company, “HP Case Study: WAN Link Compression on HP Routers,” 1995.
[6] V. Jacobson, “Compressing TCP/IP Headers for Low-Speed Serial Links,” RFC 1144, 1990.
[7] M. Degermark, B. Nordgren, and S. Pink, “IP Header Compression,” RFC 2507, 1999.
[8] M. Degermark, M. Engan, B. Nordgren, and S. Pink, “Low-loss TCP/IP header compression for wireless networks,” ACM MobiCom, 1996.
[9] T. A. Welch, “A Technique for High-Performance Data Compression,” Computer, pp. 8-18, 1984.

