Chapter 22. Internet Protocol Version 4 (IPv4): Handling Fragmentation

Fragmentation and defragmentation are complex tasks because of the variety of inputs that the IP layer of a host can receive, both when fragmenting and when defragmenting a packet. We have seen much of the work that goes into fragmentation as part of the functions shown in previous chapters on IPv4. This chapter describes the ip_fragment function, defined in net/ipv4/ip_output.c, where all of these efforts culminate in separate packets ready to transmit. It also describes the corresponding ip_defrag function, defined in net/ipv4/ip_fragment.c, which reassembles incoming fragments into a complete packet before ip_local_deliver passes it to L4. Helper functions are described in each section as well.

These two functions can be used by other subsystems besides IPv4. For example, Netfilter uses them when it is forced to defragment (and refragment) an IP packet to be able to access header fields above the L3 layer. This is necessary mostly for forwarded packets and was discussed in the section "The ip_push_pending_frames Function" in Chapter 21.

How does the IP layer recognize that a packet is a fragment of a larger packet? Based on what we saw in Chapter 17, we need both the Offset and MF fields of the IP header to tell. If the packet has not been fragmented, Offset=0 and MF=0. If instead we have a fragment on our hands, the following is true:

  • The first fragment has Offset=0 and MF=1.

  • All the fragments between the first and the last one have MF=1 and a nonzero Offset.

  • The last fragment has MF=0 and Offset nonzero.
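The rules above can be sketched as a small user-space classifier. This is only an illustration, not kernel code: the function classify_frag and the enum frag_kind are hypothetical names introduced here, and the IP_MF and IP_OFFSET masks are redefined locally with the same values the kernel uses so that the example compiles on its own.

```c
#include <stdint.h>
#include <arpa/inet.h>   /* ntohs */

/* Local copies of the masks, assumed to match the kernel's values. */
#define IP_MF     0x2000 /* "More Fragments" flag */
#define IP_OFFSET 0x1FFF /* 13-bit fragment offset, in 8-byte units */

/* Hypothetical classification of a packet by its frag_off field. */
enum frag_kind { FRAG_NONE, FRAG_FIRST, FRAG_MIDDLE, FRAG_LAST };

/* frag_off_net is the IP header's frag_off field in network byte order. */
static enum frag_kind classify_frag(uint16_t frag_off_net)
{
    uint16_t v = ntohs(frag_off_net);
    int mf = (v & IP_MF) != 0;
    unsigned int offset = v & IP_OFFSET;

    if (!mf && offset == 0)
        return FRAG_NONE;    /* Offset=0, MF=0: not fragmented */
    if (mf && offset == 0)
        return FRAG_FIRST;   /* Offset=0, MF=1 */
    if (mf)
        return FRAG_MIDDLE;  /* Offset nonzero, MF=1 */
    return FRAG_LAST;        /* Offset nonzero, MF=0 */
}
```

Note that the DF (Don't Fragment) flag shares the same 16-bit field but is deliberately masked out: it plays no part in deciding whether a packet is a fragment.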

We said earlier that ip_local_deliver is one of the places where defragmentation could take place. Here is a snapshot from the function that shows how a fragment is recognized and passed to ip_defrag based on the considerations just listed:

        if (skb->nh.iph->frag_off & htons(IP_MF|IP_OFFSET)) {
            skb = ip_defrag(skb);
            if (!skb)
                return 0;
        }

Similar logic can be found in fragmentation code to correctly tag fragments.
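As a rough illustration of the tagging side, the following user-space sketch computes the frag_off value each fragment would carry when a payload is split. It is not the ip_fragment implementation: tag_fragments is a hypothetical helper, the masks are local copies assumed to match the kernel's values, and the sketch assumes the packet being split is not itself a fragment (so the first offset is 0 and the last fragment clears MF).

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons */

#define IP_MF     0x2000 /* "More Fragments" flag */
#define IP_OFFSET 0x1FFF /* 13-bit fragment offset, in 8-byte units */

/*
 * Hypothetical helper: fill frag_off_out[] with the network-order frag_off
 * value for each fragment of a payload_len-byte payload, where each
 * fragment may carry at most max_frag_payload bytes of payload.
 * Returns the number of fragments, or 0 on invalid input or overflow.
 */
static size_t tag_fragments(size_t payload_len, size_t max_frag_payload,
                            uint16_t *frag_off_out, size_t out_cap)
{
    /* Every fragment except the last must carry a multiple of 8 bytes. */
    size_t chunk = max_frag_payload & ~(size_t)7;
    size_t off = 0, n = 0;

    if (chunk == 0 || payload_len == 0)
        return 0;

    while (off < payload_len) {
        size_t left = payload_len - off;
        size_t units = off >> 3;          /* offset in 8-byte units */
        uint16_t v;

        if (n == out_cap || units > IP_OFFSET)
            return 0;

        v = (uint16_t)units;
        if (left > chunk)
            v |= IP_MF;                   /* more fragments follow */
        frag_off_out[n++] = htons(v);

        off += (left > chunk) ? chunk : left;
    }
    return n;
}
```

For example, a 4000-byte payload split with at most 1480 bytes per fragment yields three fragments at byte offsets 0, 1480, and 2960; the first two carry MF=1 and the last carries MF=0 with a nonzero Offset. A payload that fits in a single chunk yields one entry with Offset=0 and MF=0, which is the unfragmented case.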

The fragmentation/defragmentation subsystem is initialized by ipfrag_init, which is invoked at boot time by inet_init. The initialization function does not do much; it mainly starts a timer and initializes one variable to a random value. Both of these tasks are needed to handle an optimization added to protect the kernel from a possible Denial of Service (DoS) attack; see the section "Hash Table Reorganization" for details.

