28.9. Processing Ingress ARP Packets

As explained in the section "ARP Protocol Initialization," ARP registers the arp_rcv routine as its protocol handler. Let's see how this handler processes incoming ARP packets. The ARP packet can be accessed from the skb buffer that is the function's input argument; in particular, the ARP header is at skb->nh.arph. The function's first task is to make sure the ARP packet is not fragmented; that is, that it can be accessed linearly in memory. This task is necessary because sometimes the skb buffer is fragmented in memory. If it is, arp_rcv calls the generic routine pskb_may_pull to make sure there is enough room in the main buffer for the ARP header and payload.
    int arp_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt)
    {
        struct arphdr *arp;

        /* ARP header, plus 2 device addresses, plus 2 IP addresses. */
        if (!pskb_may_pull(skb, (sizeof(struct arphdr) +
                                 (2 * dev->addr_len) +
                                 (2 * sizeof(u32)))))
            goto freeskb;

An input ARP packet is dropped by arp_rcv if one of the following conditions is met:
In case the buffer was shared (that is, someone else holds a reference to it), arp_rcv clones the buffer with skb_share_check. Cloning is necessary to make sure that no one will change the content of skb (in particular, its header pointers) while processing the ARP packet. See the section "Cloning and copying buffers" in Chapter 2 for more details. Refer to the section "ARP Packet Format" for the meaning of SIP and TIP.

Once an ingress ARP packet is ready to be processed, supposing Netfilter does not kidnap it, arp_process takes care of it, as shown in Figure 28-13.

Figure 28-16 shows the structure of the arp_process function. It starts with some sanity checks common to all the ARP packet types it understands, and then continues with operations specific to particular packet types. The final part of the function is another common piece of code that updates the cache with the new information, unless the entry to update is locked (see the section "Final Common Processing").

Requests for multicast IP addresses are dropped because they are illegal: we saw in the section "Special Cases" in Chapter 26 that multicast IP addresses do not need the use of ARP to be translated to link layer addresses.

28.9.1. Initial Common Processing

arp_process processes both ARPOP_REQUEST and ARPOP_REPLY packet types. Any other ARP packet type is dropped. Packets with a loopback or multicast destination address, which can be detected with the LOOPBACK and MULTICAST macros, are also dropped because ARP is not needed for them, as described in the earlier section "Destination Address Types for ARP Packets," and the section "Special Cases" in Chapter 26.
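As a rough illustration of the address test described above, the following user-space sketch mimics what the LOOPBACK and MULTICAST macros check. The helper names (is_loopback, is_multicast, arp_target_needs_resolution, address_self_check) are hypothetical and are not kernel identifiers. The bit masks assume the standard 127.0.0.0/8 and 224.0.0.0/4 ranges, and addresses are assumed to be in network byte order, as they are in the ARP payload.

```c
#include <arpa/inet.h>  /* htonl */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's LOOPBACK test: 127.0.0.0/8. */
static int is_loopback(uint32_t addr)
{
    return (addr & htonl(0xff000000u)) == htonl(0x7f000000u);
}

/* Hypothetical stand-in for the kernel's MULTICAST test: 224.0.0.0/4. */
static int is_multicast(uint32_t addr)
{
    return (addr & htonl(0xf0000000u)) == htonl(0xe0000000u);
}

/* Returns nonzero when ARP processing should continue for this target. */
static int arp_target_needs_resolution(uint32_t tip)
{
    return !is_loopback(tip) && !is_multicast(tip);
}

/* Small self-check exercising the three helpers. */
static void address_self_check(void)
{
    assert(is_loopback(htonl(0x7f000001u)));                 /* 127.0.0.1 */
    assert(is_multicast(htonl(0xe0000001u)));                /* 224.0.0.1 */
    assert(arp_target_needs_resolution(htonl(0xc0a80001u))); /* 192.168.0.1 */
}
```

Calling address_self_check() from a test program is enough to confirm that the masks behave as described; an ordinary unicast address such as 192.168.0.1 passes the filter, while loopback and multicast targets are rejected.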
Figure 28-16. arp_process function

Some device types are supported by the kernel only when it has been explicitly compiled with support for them. They are not included by default because they are not used very often, so the kernel developers decided to reduce the kernel size by making their support optional. The switch statement shown here simply goes one by one through these device types (using a #ifdef to make sure each one has been compiled into the kernel) and checks whether the protocol specified on the ARP packet is correct for that device type. This part of the code is long and repetitive.

    switch (dev_type) {
    default:
        if (arp->ar_pro != htons(ETH_P_IP) ||
            htons(dev_type) != arp->ar_hrd)
            goto out;
        break;
    #ifdef CONFIG_NET_ETHERNET
    case ARPHRD_ETHER:
        ... ... ...
        if ((arp->ar_hrd != htons(ARPHRD_ETHER) &&
             arp->ar_hrd != htons(ARPHRD_IEEE802)) ||
            arp->ar_pro != htons(ETH_P_IP))
            goto out;
        break;
    #endif
    #ifdef CONFIG_TR
    case ARPHRD_IEEE802_TR:
        ... ... ...
    #endif
        ... ... ...
    #endif
    }

The last task in this section of arp_process is to initialize a few local variables from fields of the ARP header to make later code cleaner. This part of the function is not shown here, but is fairly easy to understand by consulting Figure 28-1. arp_ptr points to the end of the hardware header.

28.9.2. Processing ARPOP_REQUEST Packets

Figure 28-17 is a high-level description of how ARPOP_REQUEST packets are processed by arp_process. arp_process processes both requests for local IP addresses and requests for nonlocal IP addresses. The latter case (that is, the left side of the figure) is described in the section "Proxy ARP." Table 28-4 explains the meanings of SIP and TIP.
An ARPOP_REQUEST is processed only if all of the following are true:
If everything is OK, arp_process calls arp_send with the right input parameters. arp_send was described in the section "Transmitting ARP Packets: Introduction to arp_send."

    rt = (struct rtable*)skb->dst;
    addr_type = rt->rt_type;

    if (addr_type == RTN_LOCAL) {
        n = neigh_event_ns(&arp_tbl, sha, &sip, dev);
        if (n) {
            int dont_send = 0;

            if (!dont_send)
                dont_send |= arp_ignore(in_dev,dev,sip,tip);
            if (!dont_send && IN_DEV_ARPFILTER(in_dev))
                dont_send |= arp_filter(sip,tip,dev);
            if (!dont_send)
                arp_send(ARPOP_REPLY,ETH_P_ARP,sip,dev,
                         tip,sha,dev->dev_addr,sha);

            neigh_release(n);
        }
        goto out;
    } else {
        /* Handle Proxy ARP if all the required conditions */
        /* are met. See the section "Proxy ARP" */

28.9.2.1. Passive learning and ARP optimization

The section "Creating a neighbour Entry" in Chapter 27 mentioned that at the end of an ARP transaction, both the requester and the replier learn something. The sender achieves its essential goal of learning the target's address from the ARPOP_REPLY; this is called active learning. But the target host that receives the ARPOP_REQUEST learns the sender's address from the request itself; this is called passive learning. It is a valuable optimization of the neighboring protocol.

Passive learning is taken care of by neigh_event_ns. This function checks whether the cache already has an entry associated with the requester; it updates the existing entry, or creates a new one if none exists. In both cases, the function sets the state of the neighbor to NUD_STALE. ARP does not take the optimistic step of calling it NUD_REACHABLE because that state is reserved for hosts that have provided proof of reachability, a stricter requirement described in Chapter 27.

neigh_event_ns returns NULL when it fails to create an entry (usually because of a lack of memory; that is, no space is available in the cache). In this case, a reply is not sent to the requester.
This policy is a little conservative; a more aggressive approach would be to reply anyway on the basis that even though we are temporarily unable to create an entry on our system for the neighbor, we should not deprive it of the ability to transmit data to us.

neigh_event_ns calls one of the lookup functions described in the section "Caching" in Chapter 27. Because these always increment the entry's reference counter when the search succeeds, the reference must be released once the entry is no longer needed, which is why arp_process calls neigh_release on the entry returned by neigh_event_ns.

28.9.2.2. Requests with zero addresses

When the source IP address in an ARP request is set to 0 (0.0.0.0 in standard quad notation), it could be a corrupted packet, because 0.0.0.0 is not a valid IP address. However, it could also be a special packet used by DHCP to detect duplicated addresses. See the earlier section "Duplicate Address Detection" for the conditions under which these packets are sent, and RFC 2131, section 2.2, for the use of a 0 address.

A DHCP server or client can optionally send an ARPOP_REQUEST for a DHCP-assigned IP address to double-check whether, by mistake, the same address is already in use by another host. That special ARPOP_REQUEST is sent with a source IP address of 0.0.0.0 so that it will not create any trouble for the other hosts on the subnet.

The following code in arp_process runs when the source IP address (sip) is 0, and lets the local host claim an address when the packet's sender is making this type of request:

    if (sip == 0) {
        if (arp->ar_op == htons(ARPOP_REQUEST) &&
            inet_addr_type(tip) == RTN_LOCAL &&
            !arp_ignore(in_dev, dev, sip, tip))
            arp_send(ARPOP_REPLY,ETH_P_ARP,tip,dev,tip,sha,
                     dev->dev_addr,dev->dev_addr);
        goto out;
    }

28.9.3. Processing ARPOP_REPLY Packets

Incoming ARPOP_REPLY packets are processed if one of the following conditions is met:
The right and left sides of Figure 28-18, respectively, show how these two cases are handled.

Figure 28-18. ARPOP_REPLY handling by arp_process

Regardless of why the packet is accepted, the existing neighbour entry is updated by the common code described in the next section (and is shown in the dotted box in the figure) to reflect the information in the ARPOP_REPLY packet.

28.9.4. Final Common Processing

The last part of arp_process is executed for all ARPOP_REPLY packets, and for ARPOP_REQUEST packets that have not been processed because they did not meet the conditions listed in the section "Processing ARPOP_REQUEST Packets." Remember that when a host replies to an ARPOP_REQUEST, it inverts the source and destination fields of the ARP header, as well as fills in the empty spaces.

Another concept to understand, in reading this code, is the locktime. This is unrelated to the semaphore type of locking used frequently by the kernel. Rather, it's a simple kind of timeout that takes care of the chance that a host could receive more than one ARPOP_REPLY for the same ARPOP_REQUEST. This could happen if there is some kind of misconfiguration or if there are multiple proxy ARP servers on the same LAN; the arp_process function reacts by using only the first reply and rejecting subsequent replies.

The mechanism is as follows: the neighboring subsystem introduces the locktime parameter in the neigh_parms structure; the parameter can also be tuned by /proc. The following code sets override to true only when more than locktime has elapsed since the entry was last updated. (locktime is expressed in jiffies, so a value of HZ means 1 second.) As a result, neigh_update is allowed to overwrite the link layer address of an existing entry only if that entry was not updated during the preceding locktime. Thus, the final code is:

    n = __neigh_lookup(&arp_tbl, &sip, dev, 0);
    ...
    if (n) {
        int state = NUD_REACHABLE;
        int override;

        override = time_after(jiffies, n->updated + n->parms->locktime);

        if (arp->ar_op != htons(ARPOP_REPLY) ||
            skb->pkt_type != PACKET_HOST)
            state = NUD_STALE;
        neigh_update(n, sha, state, override ? NEIGH_UPDATE_F_OVERRIDE : 0);
        neigh_release(n);
    }

The code has to select the right state to assign to the neighbour entry being updated. As explained in the section "Reachability" in Chapter 26, unicast and broadcast replies have different levels of authority. A unicast reply (PACKET_HOST) sets the neighbor state to NUD_REACHABLE, and a broadcast reply sets it to NUD_STALE. Updates caused by ARPOP_REQUEST packets always set the state to NUD_STALE.
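The two decisions made by the code above, which state to assign and whether the locktime window allows an override, can be restated as a small user-space sketch. Everything here is a stand-in for illustration: the enum, ticks_after, locktime_expired, pick_state, and update_self_check are hypothetical names, and ticks_after only imitates the wraparound-safe comparison that the kernel's time_after is assumed to perform on the jiffies counter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two neighbour states discussed here. */
enum sketch_state { SKETCH_STALE, SKETCH_REACHABLE };

/* True if tick count a is later than b, tolerating counter wraparound.
 * Relies on the usual two's-complement conversion to long. */
static bool ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Mirrors "override": true once more than locktime ticks have passed
 * since the entry was last updated. */
static bool locktime_expired(unsigned long now, unsigned long updated,
                             unsigned long locktime)
{
    return ticks_after(now, updated + locktime);
}

/* Only a unicast ARPOP_REPLY counts as proof of reachability. */
static enum sketch_state pick_state(bool is_reply, bool addressed_to_us)
{
    return (is_reply && addressed_to_us) ? SKETCH_REACHABLE : SKETCH_STALE;
}

/* Small self-check covering the boundary, wraparound, and state cases. */
static void update_self_check(void)
{
    assert(!locktime_expired(150, 50, 100));            /* exactly at the boundary */
    assert(locktime_expired(151, 50, 100));             /* window has passed */
    assert(locktime_expired(90, ULONG_MAX - 10, 100));  /* wrapped counter */
    assert(pick_state(true, true) == SKETCH_REACHABLE); /* unicast reply */
    assert(pick_state(true, false) == SKETCH_STALE);    /* broadcast reply */
    assert(pick_state(false, true) == SKETCH_STALE);    /* request */
}
```

The sketch keeps the two decisions separate on purpose: the state depends only on the kind of packet received, while the override flag depends only on how recently the entry was updated.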