26.6. Neighbor States and Neighbor Unreachability Detection (NUD)

Figure 26-11 is a simplified summary of the steps the kernel has to go through when transmitting a packet to a given L3 address.

Figure 26-12 is a simplified model that shows the states a neighbor can go through.

The two simple models in Figures 26-11 and 26-12 would work in most cases, but the Linux kernel uses a more sophisticated model to handle all possible states. The next section will expand the model in Figure 26-12, and later sections will focus on the details in Figure 26-11.

As you can see, an important part of managing neighbors is to know whether they are reachable.

Figure 26-11. L3-to-L2 address resolution steps


Figure 26-12. States of an L3-to-L2 mapping


26.6.1. Reachability

Reachability, from the neighboring subsystem's perspective, can be described through a real-life analogy. Suppose you are in a dark room with other people, including me. If you say "Everybody out of the room!" everybody will leave the room because they all can hear you. But if you want only me to go out, you will need one more piece of information: my name.

Thus, a solicitation reply sent to a broadcast destination address does not carry the same amount of information as one with a unicast destination address: anyone can receive a broadcast, but you need the exact address if you want to talk to a given recipient.

From the neighboring perspective, a host is considered reachable if the kernel has proof that the neighbor can correctly receive frames addressed to its unicast address, and that the local host can in turn receive the frames the neighbor sends back. In other words, you need bidirectional reachability for the kernel to consider a neighbor reachable. In the rest of this chapter, we will therefore use the term reachable to mean bidirectional reachability. We will see in the section "Reachability Confirmation" that there are two possible ways in which reachability can be confirmed: L4 confirmation and a solicitation reply.

26.6.2. Transitions Between NUD States

IPv6 defines an NUD mechanism that can help determine quickly whether neighbors have disconnected or gone down. The Linux kernel uses the same mechanism for both IPv4 and IPv6. Similar models are used by the other protocols we will not cover in the book, such as DECnet.

Figure 26-13 summarizes the states a neighbor can assume and the conditions that can trigger a change of state. An entry can be created by several events, including the request to transmit a data packet to a neighbor, or the reception of a solicitation request from a neighbor.

The state of an entry may change several times during its lifetime, and the same state can be entered multiple times by one entry. Different protocols may carry out different transitions, including some not shown in the figure, to take advantage of special conditions. For example, the transition that puts a newly created entry directly into NUD_STALE is used by IPv4, but not by IPv6.

A description of the states in Figure 26-13 follows. The possible values are grouped based on some common properties. This description will be followed by a discussion of the transitions in the graph, and in particular the NUD mechanism.

26.6.2.1. Basic states

The states in Figure 26-13 are defined as follows. We start with the default state of a newly created entry:


NUD_NONE

The neighbor entry has just been created and no state is available yet.

Figure 26-13. Transitions among NUD states

This next set comes from the IPv6 neighboring definition and has been adopted by the latest Linux ARP/IPv4 implementation as well:


NUD_INCOMPLETE

A solicitation has been sent, but no reply has been received yet. In this state, there is no hardware address to use (not even an old one, as there is with NUD_STALE).


NUD_REACHABLE

The address of the neighbor is cached and the latter is known to be reachable (there has been a proof of reachability).


NUD_FAILED

Marks a neighbor as unreachable because of a failed solicitation request, either the one generated when the entry was created or the one triggered by the NUD_PROBE state.


NUD_STALE


NUD_DELAY


NUD_PROBE

Transitional states; they will be resolved when the local host determines whether the neighbor is reachable. See the section "Reachability Confirmation."

The next set of values represents a group of special states that usually never change once assigned:


NUD_NOARP

This state is used to mark neighbors that do not need any protocol to resolve the L3-to-L2 mapping (see the section "Special Cases"). The section "Start of the arp_constructor Function" in Chapter 28 shows how and why this state is set in IPv4/ARP. But even though the name of this state suggests that it applies only to ARP, it can actually be used by any neighboring protocol.


NUD_PERMANENT

The L2 address of the neighbor has been statically configured (i.e., with user-space commands) and therefore there is no need to use any neighboring protocol to take care of it. See the section "System Administration of Neighbors" in Chapter 29.

26.6.2.2. Derived states

In addition to the basic states listed in the previous section, the following derived values are defined just to make the code clearer when there is a need to refer to multiple states with something in common:


NUD_VALID

An entry is considered to be in the NUD_VALID state if its state is any one of the following, which represent neighbors believed to have an available address:

NUD_PERMANENT
NUD_NOARP
NUD_REACHABLE
NUD_PROBE
NUD_STALE
NUD_DELAY

NUD_CONNECTED

This is used for the subset of NUD_VALID states that do not have a confirmation process pending:

NUD_PERMANENT
NUD_NOARP
NUD_REACHABLE

NUD_IN_TIMER

The neighboring subsystem is running a timer for this entry, which happens when the status is unclear. The basic states that correspond to this are:

NUD_INCOMPLETE
NUD_DELAY
NUD_PROBE

Let's look at an example of why a derived state is useful in kernel code. When a neighbor instance is removed, the host needs to stop all the pending timers associated with that data structure. Instead of comparing the neighbor's state to the three states known to have a pending timer associated with them, it is just cleaner to define NUD_IN_TIMER and compare the neighbor's state against it using the bitwise operator &.
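The bitwise test just described can be sketched in a few lines of user-space C. This is only an illustration: the numeric values assume one bit per basic state, the derived masks follow the lists given in this section, and the helper functions (nud_has_pending_timer, nud_is_connected, nud_masks_self_check) are hypothetical names invented for the example, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Basic NUD states, one bit each. The values are assumptions made
 * for this sketch; check the kernel headers for the real ones. */
enum {
    NUD_NONE       = 0x00,
    NUD_INCOMPLETE = 0x01,
    NUD_REACHABLE  = 0x02,
    NUD_STALE      = 0x04,
    NUD_DELAY      = 0x08,
    NUD_PROBE      = 0x10,
    NUD_FAILED     = 0x20,
    NUD_NOARP      = 0x40,
    NUD_PERMANENT  = 0x80
};

/* Derived states: the bitwise OR of the basic states listed in the text. */
#define NUD_IN_TIMER  (NUD_INCOMPLETE | NUD_DELAY | NUD_PROBE)
#define NUD_VALID     (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE | \
                       NUD_PROBE | NUD_STALE | NUD_DELAY)
#define NUD_CONNECTED (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE)

/* Hypothetical helper: one AND replaces three equality comparisons. */
bool nud_has_pending_timer(unsigned int nud_state)
{
    return (nud_state & NUD_IN_TIMER) != 0;
}

/* Hypothetical helper: true when no confirmation process is pending. */
bool nud_is_connected(unsigned int nud_state)
{
    return (nud_state & NUD_CONNECTED) != 0;
}

/* Hypothetical sanity check: NUD_CONNECTED must be a subset of NUD_VALID. */
void nud_masks_self_check(void)
{
    assert((NUD_CONNECTED & ~NUD_VALID) == 0);
}
```

Because each basic state occupies its own bit, adding a state to a derived group later means changing one macro rather than every comparison that uses it.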

26.6.2.3. Initial state

When a neighbor instance is created, the NUD_NONE state is assigned to it by default, but the state can also be explicitly set to something different when the creation is caused by an explicit user command (see Chapter 29).

As explained in the section "Neighbor Initialization" in Chapter 27, the protocol's constructor method may also change the state depending on the characteristics of the associated device (e.g., point-to-point) and L3 address (e.g., broadcast).

26.6.3. Reachability Confirmation

We saw in the section "Why Static Assignment of Addresses Is Not Sufficient" that it is possible for an L3-to-L2 mapping to change. Because of this, it makes sense to confirm the information stored in the cache regularly, if the information has not been used for some time. This is called reachability confirmation.

Note that a change in reachability status is not necessarily due to the reasons listed in the section "Reasons That Neighboring Protocols Are Needed"; a router, bridge, or other network device may just be experiencing some problems. While the reachability confirmation is in progress, the cached information is temporarily used under the assumption that it is most likely still valid.

The three NUD states NUD_STALE, NUD_DELAY, and NUD_PROBE support the task of reachability confirmation. The key reason for the use of these states is that there is no need to start a reachability confirmation process until a packet needs to be sent to the associated neighbor.

Let's define once again the exact meaning of these three NUD states, and then look at the two ways a mapping can be confirmed:


NUD_STALE

The cache contains the address of the neighbor, but the latter has not been confirmed for a certain amount of time (see the discussion of reachable_time in the section "neigh_parms Structure" in Chapter 29). The next time a packet is sent to the neighbor, the reachability verification process will be started.


NUD_DELAY

This state, closely tied to NUD_STALE, represents an optimization that can reduce the number of transmissions of solicitation requests.

This state is entered when a packet is sent to a neighbor whose associated entry is in the NUD_STALE state. The NUD_DELAY state represents a window of time where external sources could confirm the reachability of the neighbor. The simplest sort of external confirmation is when the neighbor in question sends a packet, thus indicating that it is running and accessible.

This state gives some time to the upper network layers to provide a reachability confirmation, which may relieve the kernel from sending a solicitation request and thus save both bandwidth and CPU usage. This state may look like a small optimization, but if you think in terms of big networks, you can imagine the gain it can provide.

If no confirmation is received, the entry is put into the next state, NUD_PROBE, which resolves the status of the neighbor through explicit solicitation requests or whatever other mechanism a protocol might use.


NUD_PROBE

When the neighbor has been in the NUD_DELAY state for the allotted amount of time and no proof of reachability has been received, its state is changed to NUD_PROBE and the solicitation process starts.
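The relationship among these three states can be summarized as a small transition function. The following is a toy model, not kernel code: the enum names, the event names, and toy_next_state are all hypothetical, and the sketch leaves out the move to NUD_FAILED when probing gets no answer.

```c
#include <assert.h>

/* Hypothetical names used only in this sketch. */
enum toy_state { TOY_STALE, TOY_DELAY, TOY_PROBE, TOY_REACHABLE };

enum toy_event {
    TOY_EV_TRANSMIT,      /* a packet is sent to the neighbor        */
    TOY_EV_CONFIRMED,     /* proof of reachability has arrived       */
    TOY_EV_DELAY_EXPIRED  /* the delay window ended without a proof  */
};

/* Return the state that follows state s when event ev occurs. */
enum toy_state toy_next_state(enum toy_state s, enum toy_event ev)
{
    assert(ev <= TOY_EV_DELAY_EXPIRED);

    /* In this simplified model, a proof of reachability settles the
     * question from any of the three transitional states. */
    if (ev == TOY_EV_CONFIRMED)
        return TOY_REACHABLE;

    switch (s) {
    case TOY_STALE:
        /* Nothing is verified until traffic actually needs the neighbor. */
        return ev == TOY_EV_TRANSMIT ? TOY_DELAY : s;
    case TOY_DELAY:
        /* No proof arrived in time: start soliciting. */
        return ev == TOY_EV_DELAY_EXPIRED ? TOY_PROBE : s;
    default:
        /* TOY_PROBE keeps soliciting; TOY_REACHABLE is left alone here. */
        return s;
    }
}
```

For instance, toy_next_state(TOY_STALE, TOY_EV_TRANSMIT) yields TOY_DELAY, and a later TOY_EV_DELAY_EXPIRED moves the entry on to TOY_PROBE.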

The reachability status of a neighbor can be confirmed in two main ways. As we will see, these two methods do not have the same level of authority. They are:


Confirmation from a unicast solicitation's reply

When your host receives a solicitation reply in answer to a solicitation request it previously sent out, it means that the neighbor received the request and was able to send back a reply; this in turn means that either it already had your L2 address or it learned your address from your request (see the section "Creating a neighbour Entry" in Chapter 27). It also means that there is a working path in both directions. Note, however, that this is true only when the solicitation's reply is sent as a unicast packet. The reception of a broadcast reply would move the state to NUD_STALE rather than NUD_REACHABLE. (You can find more discussion of this from the standpoint of ARP in the section "Processing Ingress ARP Packets" in Chapter 28.)


External confirmation

If your host is sure it received a packet from the neighbor in response to something previously sent, it can assume the neighbor is still reachable. Figure 26-14 shows an example, where the TCP layer of Host A confirms the reachability of Host B when it receives a SYN/ACK in reply to its SYN. Note that if Host B was not a neighbor of Host A, the reception of the SYN/ACK from Host B would confirm the reachability of the next hop gateway used by Host A to reach Host B.

Figure 26-14. Example of external neighbor reachability confirmation

Confirmation is done via dst_confirm, which confirms the validity of the routing table cache entry used to route the SYN packet toward Host B. dst_confirm is a simple wrapper around neigh_confirm, which accomplishes the task we described earlier: it confirms the reachability of the neighbor and therefore the L3-to-L2 mapping. Note that neigh_confirm only updates the neigh->confirmed timestamp; it will be the neigh_periodic_timer function (which is executed by the expiration of the timer started when the neighbor entered the NUD_DELAY state) that actually upgrades the neighbor entry's state to NUD_REACHABLE.[*]

[*] The delay between the reception of the confirmation from the L4 layer and the setting of the state to NUD_REACHABLE does not affect traffic in any way.
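The split between recording a confirmation and acting on it can be illustrated with a simplified model. Everything here is an assumption made for the example: struct toy_neigh, its fields other than confirmed, and the two functions are hypothetical stand-ins, and timestamp wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a neighbor entry. */
struct toy_neigh {
    unsigned long confirmed;   /* time of the last reachability proof      */
    unsigned long delay_start; /* time the entry entered the delay state   */
    bool reachable;            /* stands in for the NUD_REACHABLE state    */
    bool probing;              /* stands in for the NUD_PROBE state        */
};

/* Models the confirmation path: record the time and touch nothing else. */
void toy_confirm(struct toy_neigh *n, unsigned long now)
{
    assert(n != NULL);
    n->confirmed = now;
}

/* Models the timer handler that runs when the delay window ends. */
void toy_delay_timer(struct toy_neigh *n)
{
    assert(n != NULL);
    if (n->confirmed >= n->delay_start)
        n->reachable = true;   /* a proof arrived during the window */
    else
        n->probing = true;     /* no proof: fall back to soliciting */
}
```

Keeping the confirmation path down to a single store is what makes it cheap enough to call from the L4 receive path; the state change is deferred to the timer, which has to run anyway.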

Note that the correlation between the two packets in Figure 26-14 could not be performed at the IP layer because the latter doesn't have any knowledge of data streams. This is why the L4 layer takes care of the confirmation. TCP SYN/ACK exchanges are only one example of an L4 protocol providing external confirmation. Given a socket, and therefore the associated routing cache entry and its next-hop gateway, a user-space application can confirm the reachability of the gateway by using the MSG_CONFIRM option with transmission calls such as send and sendmsg.
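From user space, passing MSG_CONFIRM might look like the following sketch for a UDP socket on Linux. The wrapper name send_confirming is hypothetical; sendto and the MSG_CONFIRM flag are the standard Linux socket interface. An application should set the flag only when it has actually seen a reply from the peer, since the kernel takes the flag as the application's word that forward progress happened.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical wrapper: send one datagram to peer and tell the kernel,
 * via MSG_CONFIRM, that the application has seen forward progress with
 * this destination. Returns the number of bytes sent, or -1 on error
 * with errno set by sendto().
 */
ssize_t send_confirming(int fd, const void *buf, size_t len,
                        const struct sockaddr_in *peer)
{
    assert(buf != NULL && peer != NULL);
    return sendto(fd, buf, len, MSG_CONFIRM,
                  (const struct sockaddr *)peer, sizeof(*peer));
}
```

A typical caller would use a plain send for the first request and switch to a wrapper like this one only for datagrams sent after a valid response from the same peer has been received.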

While the reception of a solicitation's reply can move the state to NUD_REACHABLE regardless of the current state, external confirmations can be used only when the current state is NUD_STALE. This means that if the entry had just been created and it was in the NUD_INCOMPLETE state, external confirmations would not be allowed to confirm the reachability of the neighbor (see Figure 26-13).

Note that NUD_DELAY/NUD_PROBE and NUD_NONE can all lead to NUD_REACHABLE, as shown in Figure 26-13; however, getting from NUD_NONE to NUD_REACHABLE requires full proof of reachability, while from NUD_DELAY or NUD_PROBE any kind of confirmation is sufficient.

