13.1. Overview of Network Stack
Readers of this book are expected to be familiar with the basic TCP/IP protocols, but there are some other protocols in common use, such as Logical Link Control (LLC) and Subnetwork Access Protocol (SNAP), that you may not know. This section introduces key protocols and shows their relationships.
The two best-known models for network protocols are the seven-layer OSI model and the five-layer TCP/IP model, shown in Figure 13-1. The OSI model remains an important reference point for networking discussions even though it never took off for a variety of reasons. The TCP/IP model covers most of the protocols used by computers today.
Figure 13-1. OSI and TCP/IP models
At each layer, numerous protocols are available. At the lowest level, where interfaces exchange data, the protocol in use is predetermined. A driver for that protocol is associated with the interface, and all data that comes in on the interface is assumed to follow the protocol (e.g., Ethernet); if it doesn't, errors are reported and no communication takes place.
But once the driver has to hand over data to a higher layer, a choice of protocols ensues. Should data at L3 be handled by IPv4, IPv6, IPX (the Novell NetWare protocol), DECnet, or some other network-layer protocol? And a similar choice must be made going from L3 to L4, where TCP, UDP, ICMP, and other protocols reside.
This chapter deals with the lower three layers and briefly touches on the fourth one.
An individual package of transmitted data is commonly called a frame on the link layer, L2; a packet on the network layer; a segment on the transport layer; and a message on the application layer.
The layers are often called the network stack, because communication travels down the layers until it is physically transmitted across the wire (or wireless bands) and then travels back up. Headers are also added and removed in a LIFO manner.
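The LIFO handling of headers can be illustrated with a toy sketch. The bracketed tags below are stand-ins for real headers, and the layer names and function names are assumptions made only for this example; nothing here mirrors kernel code.

```python
# Order in which headers are added when a message travels down the stack.
LAYERS_DOWN = ["TCP", "IP", "ETH"]

def encapsulate(message: bytes) -> bytes:
    """Prepend one toy header per layer, going down the stack."""
    data = message
    for name in LAYERS_DOWN:
        data = f"[{name}]".encode() + data
    return data

def decapsulate(frame: bytes):
    """Strip the toy headers going up the stack; the last one added comes off first."""
    removed = []
    data = frame
    for name in reversed(LAYERS_DOWN):
        tag = f"[{name}]".encode()
        if not data.startswith(tag):
            raise ValueError(f"expected {name} header")
        data = data[len(tag):]
        removed.append(name)
    return removed, data

frame = encapsulate(b"GET / HTTP/1.1")
removed, payload = decapsulate(frame)
```

The outermost header on the wire is the one added last (the link layer header), and it is the first one removed by the receiver.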
13.1.1. The Big Picture
Figure 13-2 builds on the TCP/IP model in Figure 13-1 and shows which chapter covers each interface between adjacent layers. Some of these interfaces involve communication down the stack, whereas others involve communication upward:
Going up in the stack (for receiving a message)
This chapter describes how ingress traffic is handed to the right protocol handler. (The meaning of ptype_base and ptype_all will become clear in the section "Protocol Handler Organization.")
Chapter 10 describes how device drivers notify the kernel about the reception of ingress frames.
Chapter 24 describes how the IPv4 protocol delivers ingress IPv4 packets to the right L4 protocol (IPv4 is the only network layer protocol we cover in the book). The IPv4 receive routine is described in Chapter 19.
Going down in the stack (for sending a message)
Chapter 21 describes the functions provided by the IPv4 layer for transmission.
Part VI describes how the neighboring layer interfaces the L3 protocols to the transmitting routine dev_queue_xmit. The latter is described in Chapter 11.
As shown in Figure 13-2, the socket interface is not covered in this book. However, there is one point worth mentioning about the AF_PACKET socket type. It's the Linux way to capture frames at the link layer and inject frames into the link layer, directly bypassing all the intermediate protocol layers. Network sniffers such as tcpdump and Ethereal are common users of AF_PACKET sockets. You can see from the figure that AF_PACKET sockets hand frames directly to dev_queue_xmit, and receive ingress frames directly from the network protocol dispatcher routine (this latter point is addressed in Chapter 10).
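From user space, an AF_PACKET socket might be used roughly as in the following sketch. It assumes a Linux host, sufficient privileges (root or CAP_NET_RAW), and an interface name supplied by the caller; the helper names are made up for this example. Only the header parser can be exercised without privileges.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # from <linux/if_ether.h>: receive frames of every protocol

def format_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet_header(frame: bytes):
    """Split the 14-byte Ethernet header into destination, source, and protocol."""
    dst, src, protocol = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), protocol

def capture_one(ifname: str):
    """Receive a single link layer frame on ifname (Linux only, needs CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as s:
        s.bind((ifname, 0))
        frame = s.recv(65535)
    return parse_ethernet_header(frame)

# A hand-built broadcast frame, used here instead of live traffic.
sample = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
header = parse_ethernet_header(sample)
```

Because the socket delivers the whole frame including the link layer header, the application sees the data before any L3 or L4 processing.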
Figure 13-2 shows only two protocol families (PF_INET, PF_PACKET), but several others are implemented in the Linux kernel. Among them are:
PF_NETLINK
Used as the preferred interface for network configuration. See Chapter 3.
PF_KEY
Used as a key management interface for network security services. IPsec is one of these services.
PF_LLC
See the section "Logical Link Control (LLC)."
13.1.2. Link Layer Choices for Ethernet (LLC and SNAP)
Although the link layer protocol is fairly fixed by the hardware in use, the Ethernet standard allows some choice between protocols. The first attempt at standardizing this choice was called Logical Link Control (LLC). Since it offered very limited options, it never saw much use. The IEEE 802 committee then standardized the Subnetwork Access Protocol (SNAP), which is found in use fairly often. The implementation of both of these subprotocols is described later in this chapter.
In LLC, the header contains a field specifying the protocol for the Source Service Access Point (SSAP) and the protocol for the Destination Service Access Point (DSAP). Each field, however, contains only 8 bits, one of which is dedicated to a flag that indicates whether multicast is in use and another dedicated to a flag that indicates whether the address is local to one network or is recognized worldwide. Therefore, with 6 bits left to specify a protocol, LLC supports a maximum of 64 protocols, which is too few to make the technology popular.
Figure 13-2. The big picture
Therefore, the IEEE 802 committee extended LLC by providing a special value in the SSAP and DSAP fields that indicates that the protocol in use by that source or destination is identified by another 5 bytes in the header. With this extension, called SNAP, there are 40 bits that can be assigned to various protocols.
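The arithmetic and the header layout can be sketched as follows. The text above does not name the special SAP value or how the 5 extra bytes are divided; the value 0xAA and the split into a 3-byte organization code plus a 2-byte protocol ID come from the IEEE 802 definition of SNAP. The function name and the 1-byte control field are assumptions made for this example.

```python
import struct

MAX_LLC_PROTOCOLS = 2 ** 6   # 6 usable bits per SAP field
SNAP_ID_BITS = 5 * 8         # 5 extra header bytes
SNAP_SAP = 0xAA              # SAP value signaling that a SNAP header follows

def parse_llc_snap(data: bytes) -> dict:
    """Parse an LLC header and, if present, the SNAP extension behind it.

    Assumes a 1-byte control field (the unnumbered format used with SNAP).
    """
    if len(data) < 3:
        raise ValueError("truncated LLC header")
    dsap, ssap, _control = struct.unpack("!BBB", data[:3])
    if dsap == SNAP_SAP and ssap == SNAP_SAP:
        if len(data) < 8:
            raise ValueError("truncated SNAP header")
        (protocol_id,) = struct.unpack("!H", data[6:8])
        return {"snap": True, "oui": data[3:6].hex(),
                "protocol_id": protocol_id, "payload": data[8:]}
    return {"snap": False, "dsap": dsap, "ssap": ssap, "payload": data[3:]}

snap_frame = bytes([0xAA, 0xAA, 0x03, 0x00, 0x00, 0x00, 0x08, 0x00]) + b"ip"
parsed = parse_llc_snap(snap_frame)
```

In this sample the protocol ID carries 0x0800, the same value Ethernet uses to identify IPv4.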
13.1.3. How the Network Stack Operates
Let's briefly examine a sample communication to see how choices are made at communication points.
In Figure 13-3, assume that a user at Host X wants to download an HTML page using a web browser from the web server running on Server Y. Some of the questions to answer include the following:
Figure 13-3. Example of communication between two remote stations (Host X and Server Y)
Because Host X and Server Y are on different local area networks, how will they be able to talk to each other?
Because Host X does not know where Server Y is physically located, how will it find out where to send its request?
If Server Y is running more than one application (not just the web server), how can its operating system determine which application should handle the request from Host X?
If Host X is running more than one application (not just the browser), how can its operating system determine which application receives the data that returns?
Let's follow the request for a web page through the network stack to see how these questions are answered. We'll use Figures 13-3 and 13-4 as references.
Application layer, Host X
The browser reads the URL requested by the user; suppose it is http://www.oreilly.com. The browser uses the Domain Name System (a topic beyond the scope of this book) to resolve the domain www.oreilly.com to an IP address, which we'll suppose is 22.214.171.124. It is up to the IP protocol (L3, the network layer) to find a path between Host X and Server Y using this address.
The browser now initiates an HTTP session on the application layer to 22.214.171.124. It then invokes TCP to carry the traffic to the remote web server. (TCP is used instead of UDP because HTTP requires a reliable channel that can deliver large amounts of data without corrupting it.) The request is now traveling down the network stack.
Transport layer, Host X
The TCP layer breaks the HTTP message request into segments, if needed, and adds a TCP header to each. Among other things, TCP adds the source and destination ports. The port number lets the operating system direct the request to the proper application. The web server on Server Y listens on the default HTTP port 80 unless it is explicitly configured to use a different port number, and picks up all traffic there. Server Y directs responses back to Host X's port 5000, which is the source port number the server got from the request received from the host.
Port numbers are an L4 concept, so a separate set of ports exists for TCP and UDP.
The TCP layer on Host X knows the destination port is 80 because the browser uses the default port assigned to the HTTP protocol unless a different one is provided in the URL. The source port assigned to the browser (which will be used to identify the target application when processing ingress traffic) is assigned by the OS (unless a specific one is requested by the application). Let's assume that port was 5000. Different ports can be used for the two sides of the conversation. Network Address Translation (NAT) and proxying firewalls complicate the issue even further, but the outlines of how applications reach each other should be clear from this discussion.
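How the two port numbers are chosen can be sketched from user space. The function names and the small table of default ports are assumptions for this example; binding to port 0 is the standard way to ask the OS to pick a free source port.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # well-known defaults per URL scheme

def destination_port(url: str) -> int:
    """Use the port given in the URL, or fall back to the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

def os_assigned_source_port() -> int:
    """Bind a TCP socket to port 0 so that the OS chooses the source port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))
        return s.getsockname()[1]

default_port = destination_port("http://www.oreilly.com")
explicit_port = destination_port("http://www.oreilly.com:8080/index.html")
```

The value returned by os_assigned_source_port varies from call to call, which is why the text simply assumes 5000 for the rest of the example.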
The TCP layer does not know how to get the segments to the destination system. To accomplish that, the TCP layer invokes the IP layer, passing the destination IP address in each transmission request.
Network layer, Host X
The IP layer does not care about applications or ports. All it does is examine the IP addresses on the packets and the network options related to IP. Its big task is to consult routing tables (a complex process discussed in detail in Part VII) to discover that the packet should go through Router RT1. The IPv4 protocol is described in detail in Part V.
The packet is going to drop down another layer to be sent to the router, but the IP layer has to find the right address on this layer for the router. Since L2 involves communication between neighboring hosts (such as hosts sharing a LAN or a point-to-point link), the process used by the IP layer to find the L2 address associated with a given IP address is called a neighbor protocol. It is discussed in Part VI.
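The two lookups made by the IP layer can be sketched with small tables. All addresses, the table contents, and the function names below are hypothetical; the kernel's routing and neighbor subsystems are far more elaborate, and a real neighbor protocol (ARP, in the case of IPv4) queries the network when no entry is cached.

```python
import ipaddress

# Hypothetical routing table on Host X: (destination network, gateway or None).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), None),  # directly connected LAN
    (ipaddress.ip_network("0.0.0.0/0"),
     ipaddress.ip_address("192.168.1.1")),           # default route via RT1
]

# Hypothetical neighbor cache: L3 address -> L2 address.
NEIGHBORS = {
    ipaddress.ip_address("192.168.1.1"): "00:11:22:33:44:55",
    ipaddress.ip_address("192.168.1.20"): "00:11:22:33:44:66",
}

def next_hop(destination: str):
    """Pick the most specific matching route and return the next-hop IP address."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in ROUTES if dst in net]
    if not matches:
        raise LookupError(f"no route to {dst}")
    _net, gateway = max(matches, key=lambda m: m[0].prefixlen)
    # On a directly connected network the destination itself is the next hop.
    return gateway if gateway is not None else dst

def l2_address(ip) -> str:
    """Return the cached L2 address of a neighbor."""
    return NEIGHBORS[ip]

remote_hop = next_hop("203.0.113.10")
local_hop = next_hop("192.168.1.20")
```

A remote destination resolves to the router's L2 address, whereas a host on the same LAN is reached directly.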
Link layer, Host X and Router RT1
This layer is implemented partly by a device driver. On LANs, Ethernet is the most common protocol, but ATM, Token Ring, FDDI, and others exist. Long-distance links use dedicated copper or fiber lines; the simplest of these is the dial-up connection that millions of home and small-office users still establish with their ISPs. LANs use their own (L2) addressing schemes that have nothing to do with TCP/IP; on Ethernet (and in IEEE 802 networks in general), addresses are 6 octets long and are commonly called MAC addresses. On a dedicated line (e.g., dial-up), no L2 addressing is needed at all because each side simply sends to the other side.
Different types of headers might be used on different links, because each is hardware-specific. These headers do not carry any information that is meaningful for the browser and server at the application layer.
Routers RT1, RT2, etc.
Each router in the path, except for the last, goes through the following process to forward the packet to its final destination:
It removes the link layer header.
It can see that the L3 protocol is IP thanks to a specific field in the link layer header, discussed later in this chapter.
It determines that the local system is not the destination of the packet because the destination IP address included in the IP header is not one of its own IP addresses.
It forwards the IP packet to the next router on the path toward Server Y. To do this, it consults its routing tables to select the next hop router and creates a new link layer header (see Figure 13-4(e)). The last step is described in detail in Chapter 35.
Normally, the information on L3 (the IP header) does not change as the packet goes from system to system. Different L2 headers are used on each link.
When the packet finally arrives at Router RT3, the latter realizes that Server Y is directly connected and that there is no need to route the packet another hop.
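The forwarding steps listed above can be condensed into a short sketch. The MAC and IP addresses, the function signature, and the callback used for the next-hop lookup are all assumptions for illustration. The sketch leaves the IP packet untouched, as the text describes; a real router also updates fields such as the TTL.

```python
import socket
import struct

ETH_HLEN = 14      # destination MAC (6) + source MAC (6) + protocol field (2)
ETH_P_IP = 0x0800  # value of the L2 protocol field that identifies IPv4

def forward(frame, local_ips, next_hop_mac, out_mac):
    """Return the frame to send on the next link, or None if it is not forwarded."""
    _dst, _src, l3_proto = struct.unpack("!6s6sH", frame[:ETH_HLEN])
    packet = frame[ETH_HLEN:]                 # remove the link layer header
    if l3_proto != ETH_P_IP:                  # identify the L3 protocol
        return None
    dst_ip = socket.inet_ntoa(packet[16:20])  # destination field of the IPv4 header
    if dst_ip in local_ips:                   # addressed to this system: no forwarding
        return None
    new_header = struct.pack("!6s6sH", next_hop_mac(dst_ip), out_mac, ETH_P_IP)
    return new_header + packet                # new L2 header, same L3 packet

# Hypothetical addresses for Host X, RT1, and RT2.
host_x_mac = bytes.fromhex("020000000001")
rt1_mac = bytes.fromhex("02000000000a")
rt2_mac = bytes.fromhex("02000000000b")
packet = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                     socket.inet_aton("10.0.0.5"),
                     socket.inet_aton("203.0.113.10"))
frame_in = struct.pack("!6s6sH", rt1_mac, host_x_mac, ETH_P_IP) + packet
frame_out = forward(frame_in, {"10.0.0.1"}, lambda ip: rt2_mac, rt1_mac)
```

Comparing frame_in and frame_out shows the point made in the text: the L2 header differs on each link, while the L3 packet is carried along unchanged.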
Once the message reaches the destination server, it traverses the network stack again from the bottom upward:
Link layer, Server Y
Stripping off the L2 header, this layer checks a field to see which protocol handles the L3 layer. Finding that L3 is handled by IP, the link layer invokes the appropriate function to continue handling the L3 packet (i.e., L2 payload). Most of this chapter discusses the manner in which protocols register themselves and handle the key field indicating which protocol to use.
Network layer, Server Y
This layer recognizes that its own system's IP address, 22.214.171.124, is the destination address in the packet and therefore that the packet should be handled locally. The network layer strips off the L3 header and once again checks a field to see what protocol handles L4. Chapter 24 offers an in-depth description of the interface between L3 and L4 for ingress traffic.
Figure 13-4 shows how a header is added by each network layer as each one takes the data from a higher layer. The last step, from Figure 13-4(d) to Figure 13-4(e), shows the difference between the original frame transmitted to Router RT1 by Host X and the one between Router RT1 and Router RT2.
Figure 13-4. Headers compiled by layers: (a...d) on Host X as we travel down the stack; (e) on Router RT1
As we have seen, each layer provides a variety of protocols. Each protocol is handled by a different set of kernel functions. Thus, as the packet travels back up the stack, each protocol must figure out which protocol is being used by the next-higher layer, and invoke the proper kernel function to handle the packet.
On the lowest software layer, L2, the hardware in use defines the protocol. If a frame is received on an Ethernet interface, the receiver knows it contains an Ethernet header; if it is received on a Token Ring interface, the receiver knows it contains a Token Ring header; and so on. There is no ambiguity unless LLC or SNAP is specified. LLC and SNAP are discussed later in this chapter.
But as the packet travels up the network stack, each protocol needs a field in its header to tell it which protocol should handle the next stage of processing. The progress is shown in Figure 13-5. Thus, the transition from L2 in Figure 13-5(a) to L3 in Figure 13-5(b) depends on L2 checking an "Above protocol" field in the L2 header. Similarly, the L3 layer checks a field in its header to facilitate the transition to L4, shown in Figure 13-5(b) and Figure 13-5(c). Finally, L4 uses the Destination Port field of the packet to take the packet out of the kernel and find the process, such as a web server, that handles the packet on the local host.
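The chain of lookups can be modeled with three small tables, one per transition. This is a user-space toy, not the kernel's implementation: the table and function names are invented for the example, and the header offsets assume Ethernet, IPv4, and TCP.

```python
import struct

L3_HANDLERS = {}  # L2 protocol field -> L3 handler
L4_HANDLERS = {}  # IPv4 protocol field -> L4 handler
LISTENERS = {}    # TCP destination port -> application

def handle_frame(frame: bytes) -> str:
    """L2 to L3: dispatch on the protocol field of the Ethernet header."""
    (l3_proto,) = struct.unpack("!H", frame[12:14])
    handler = L3_HANDLERS.get(l3_proto)
    return handler(frame[14:]) if handler else "dropped: unknown L3 protocol"

def handle_ipv4(packet: bytes) -> str:
    """L3 to L4: dispatch on the protocol field of the IPv4 header."""
    header_len = (packet[0] & 0x0F) * 4
    handler = L4_HANDLERS.get(packet[9])
    return handler(packet[header_len:]) if handler else "dropped: unknown L4 protocol"

def handle_tcp(segment: bytes) -> str:
    """L4 to application: dispatch on the Destination Port field."""
    (dst_port,) = struct.unpack("!H", segment[2:4])
    return LISTENERS.get(dst_port, "dropped: no listener")

L3_HANDLERS[0x0800] = handle_ipv4  # IPv4
L4_HANDLERS[6] = handle_tcp        # TCP
LISTENERS[80] = "web server"

# A minimal frame: Ethernet header, IPv4 header (protocol 6), TCP header (5000 -> 80).
eth = bytes(12) + struct.pack("!H", 0x0800)
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0, bytes(4), bytes(4))
tcp = struct.pack("!HHIIBBHHH", 5000, 80, 0, 0, 5 << 4, 0, 0, 0, 0)
result = handle_frame(eth + ip + tcp)
```

Registering a handler is just an entry in the table for its layer, so support for another protocol can be added without touching the layer below.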