23.4. IP Configuration
The Linux IP protocol can be tuned and configured manually by a system administrator in different ways. This tuning covers both the protocol itself and device configuration. The four main interfaces are:
The last set of protocols in the preceding list has an interesting twist. These protocols are normally implemented in user space, but Linux also has a simple kernel-space implementation that is useful together with the nfsroot boot option. The latter allows the kernel to mount the root directory (/) via NFS. To do that, the kernel needs an IP configuration at boot time, before the system is able to initialize the IP configuration from user space (which, by the way, could be stored in a remote partition and not even be available to the system when it mounts the root directory). Via kernel boot options, it is possible to give nfsroot a static configuration, or to specify which protocols (yes, more than one can be used concurrently) to use to obtain the configuration. The IP configuration code is in net/ipv4/ipconfig.c, and the code used by nfsroot is in fs/nfs/nfsroot.c. The two files cross-reference variables and functions, but they are actually simple to read. We will not cover them, because network filesystems and user-space clients are outside the scope of this book. Once you know how to read __setup macros (described in Chapter 7), reading the code should be straightforward. It is clear and well commented.
The third item in the list, /proc, is covered later in the section "Tuning via /proc Filesystem."
In this section, I will say a bit about the kernel interfaces that support the behavior of the first two items, ifconfig and ip. The purpose here is not to cover the internals of the user-space commands or the associated kernel counterparts that handle configuration requests. It is to show how user space and kernel space communicate, and the kernel functions that are invoked in response to a user-space command.
23.4.1. Main Functions That Manipulate IP Addresses and Configuration
In net/ipv4/devinet.c, you can find several functions that can be used to add an IP address to a network interface, delete an address from an interface, modify an address, retrieve the IP configuration of a device given its device index or net_device data structure, etc. Here I introduce only a few of the functions that will be useful, to help you to understand the functions described later when we talk about the ip and ifconfig user-space tools.
Before reading these descriptions of functions, it would be worthwhile reviewing the key data structures used by the IP layer, introduced in Chapter 19 and described in detail later in this chapter. For instance, a single IP address is represented by an in_ifaddr structure and the complete IPv4 configuration of a device by an in_device structure.
Many other, smaller functions can be used to make the code more readable. Here are a few of them:
23.4.2. Change Notification: rtmsg_ifa
Netlink provides the RTMGRP_IPV4_IFADDR multicast group to user-space applications interested in changes to the locally configured IP addresses. The kernel uses the rtmsg_ifa function to notify those applications that registered to the group when any change takes place on the local IP addresses. The function can be called when two types of events occur:
The generated message is initialized with inet_fill_ifaddr, the same function used to handle dump requests from user space (with commands such as ip addr list). The message includes the address being added or removed, and the device associated with it.
So, who is interested in this kind of notification? Routing protocols are a major example. If you are using Zebra, the routing protocols you have configured would like to remove all of the routes that are directly or indirectly dependent on an address that has gone away. In Chapter 31, you will learn more about the way routing protocols interact with the kernel routing subsystem.
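To make the user-space side of this notification concrete, here is a minimal sketch in C of a listener that joins the RTMGRP_IPV4_IFADDR group and decodes the RTM_NEWADDR and RTM_DELADDR messages it receives. The function names (open_ifaddr_listener, format_ifaddr_event, watch_ifaddr_events) and the output format are made up for this illustration; only the Netlink constants, structures, and macros come from the standard Linux headers. It reads only the IFA_LOCAL attribute and ignores everything else the kernel places in the message.

```c
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a NETLINK_ROUTE socket that joins the RTMGRP_IPV4_IFADDR group.
 * Returns the descriptor, or -1 on failure. */
int open_ifaddr_listener(void)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_IPV4_IFADDR;   /* multicast group to subscribe to */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Render one RTM_NEWADDR or RTM_DELADDR message as text, for example
 * "add 192.0.2.1/24 ifindex 2". Returns 0 on success, -1 if the message
 * is not an IPv4 address message or carries no IFA_LOCAL attribute. */
int format_ifaddr_event(const struct nlmsghdr *nh, char *out, size_t outlen)
{
    const struct ifaddrmsg *ifa;
    struct rtattr *rta;
    char addr[INET_ADDRSTRLEN];
    int len;

    if (nh->nlmsg_type != RTM_NEWADDR && nh->nlmsg_type != RTM_DELADDR)
        return -1;
    if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifa)))
        return -1;
    ifa = NLMSG_DATA(nh);
    if (ifa->ifa_family != AF_INET)
        return -1;

    /* Walk the attributes that follow the fixed ifaddrmsg header. */
    len = (int)(nh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifa)));
    for (rta = IFA_RTA(ifa); RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        if (rta->rta_type != IFA_LOCAL ||
            RTA_PAYLOAD(rta) != sizeof(struct in_addr))
            continue;
        if (!inet_ntop(AF_INET, RTA_DATA(rta), addr, sizeof(addr)))
            return -1;
        snprintf(out, outlen, "%s %s/%u ifindex %u",
                 nh->nlmsg_type == RTM_NEWADDR ? "add" : "del",
                 addr, (unsigned)ifa->ifa_prefixlen,
                 (unsigned)ifa->ifa_index);
        return 0;
    }
    return -1;
}

/* Print every address change received on fd until recv fails. */
void watch_ifaddr_events(int fd)
{
    union { struct nlmsghdr nh; char raw[8192]; } buf;
    char line[64];
    ssize_t n;

    while ((n = recv(fd, &buf, sizeof(buf), 0)) > 0) {
        struct nlmsghdr *nh = &buf.nh;
        int left = (int)n;

        for (; NLMSG_OK(nh, left); nh = NLMSG_NEXT(nh, left))
            if (format_ifaddr_event(nh, line, sizeof(line)) == 0)
                puts(line);
    }
}
```

A caller would pass the descriptor returned by open_ifaddr_listener to watch_ifaddr_events and then add or remove an address from another terminal. A routing daemon would react to these events by updating its routes rather than printing them.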
23.4.3. inetaddr_chain Notification Chain
The IP subsystem uses the inetaddr_chain notification chain to notify other kernel subsystems about changes to the IP configuration of the local devices. A kernel subsystem can register and unregister itself with inetaddr_chain by means of the register_inetaddr_notifier and unregister_inetaddr_notifier functions. Here are two examples of users for this notification chain:
The NETDEV_DOWN and NETDEV_UP events are sent, respectively, when an IP address is removed from a local device and when one is added. These notifications are generated by the inet_del_ifa and inet_insert_ifa routines introduced in the section "Main Functions That Manipulate IP Addresses and Configuration."
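The register, notify, and unregister cycle can be illustrated with a small user-space model in C. This is not the kernel code: every name beginning with toy_ or TOY_ is hypothetical, the event values are arbitrary, and the real chain also handles details such as priorities and callback return codes that are left out here. The sketch only shows the general shape of the mechanism, in which a subsystem hangs a callback on a shared list and the address code walks that list when an address comes or goes.

```c
#include <stddef.h>

/* Toy stand-ins for the NETDEV_UP and NETDEV_DOWN events. */
enum toy_event { TOY_NETDEV_UP = 1, TOY_NETDEV_DOWN = 2 };

/* One registered listener: a callback plus a link to the next listener. */
struct toy_notifier {
    int (*call)(struct toy_notifier *self, enum toy_event ev,
                const char *addr);
    struct toy_notifier *next;
    int seen_up;     /* per-listener state updated by the callback */
    int seen_down;
};

/* Head of the toy chain, playing the role of inetaddr_chain. */
static struct toy_notifier *toy_inetaddr_chain;

/* Add a listener at the head of the chain. */
void toy_register(struct toy_notifier *nb)
{
    nb->next = toy_inetaddr_chain;
    toy_inetaddr_chain = nb;
}

/* Remove a listener. Returns 0 if it was found, -1 otherwise. */
int toy_unregister(struct toy_notifier *nb)
{
    struct toy_notifier **pp;

    for (pp = &toy_inetaddr_chain; *pp; pp = &(*pp)->next) {
        if (*pp == nb) {
            *pp = nb->next;
            nb->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Deliver an event to every listener. Returns how many were called. */
int toy_call_chain(enum toy_event ev, const char *addr)
{
    struct toy_notifier *nb;
    int called = 0;

    for (nb = toy_inetaddr_chain; nb; nb = nb->next) {
        nb->call(nb, ev, addr);
        called++;
    }
    return called;
}

/* Example callback: count the events seen by this listener. */
int toy_count_events(struct toy_notifier *self, enum toy_event ev,
                     const char *addr)
{
    (void)addr;      /* a real listener would inspect the address */
    if (ev == TOY_NETDEV_UP)
        self->seen_up++;
    else if (ev == TOY_NETDEV_DOWN)
        self->seen_down++;
    return 0;
}
```

In this model, the code that adds an address would call toy_call_chain(TOY_NETDEV_UP, ...) and the code that deletes one would call toy_call_chain(TOY_NETDEV_DOWN, ...), mirroring the roles the text assigns to inet_insert_ifa and inet_del_ifa.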
23.4.4. IP Configuration via ip
Traditionally, Unix system administrators configured interfaces and routes manually using ifconfig, route, and other commands. Currently Linux provides an umbrella ip command to handle IP configuration, with a number of subcommands.
In this section we will see how IPROUTE2 handles the main addressing operations, such as adding and removing an address. Once you are familiar with these operations, you can easily understand and read through the code for the others.
Figure 23-2 shows the files and the main functions of the IPROUTE2 package that are involved in IP address configuration. The labels on the lines are ip keywords, and each node shows the function invoked and the file it belongs to. For instance, the command ip address add would be handled by ipaddr_modify.
Figure 23-2. IPROUTE2 files and functions for address configuration
Table 23-1 shows the association between the operation specified with a command-line keyword (e.g., add) and the handler run by the kernel. For instance, when the kernel receives a request for an RTM_NEWADDR operation, it knows the request is associated with an add command and therefore invokes inet_rtm_newaddr. Some kernel operations are overloaded, and for these, the kernel needs extra flags to figure out exactly what the user-space command is asking for. See Chapter 36 for an example. This association is defined in net/ipv4/devinet.c in the inet_rtnetlink_table structure. For an introduction to RTNetlink, refer to Chapter 3.
The list and flush commands need some explanation. list is simply a request to the kernel to dump information, for instance, about a given device, and flush is a request to clear the entire IP configuration on the device.
The two functions inet_rtm_newaddr and inet_rtm_deladdr are wrappers for the generic functions inet_insert_ifa and inet_del_ifa that we introduced in the section "Main Functions That Manipulate IP Addresses and Configuration." All the wrappers do is translate the request that comes from user space into an input understandable by the two more-general functions. They also filter bad requests that are associated with nonexistent devices.
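To show what such a request looks like before the kernel wrappers translate it, the following C sketch builds an RTM_NEWADDR message by hand. The addr_request structure and the build_newaddr_request and send_addr_request functions are names invented for this example, and the message carries only an IFA_LOCAL attribute, which is less than a full configuration tool would normally send. Treat it as an illustration of the message layout under those assumptions, not as the IPROUTE2 implementation.

```c
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Netlink header, fixed address header, then room for attributes. */
struct addr_request {
    struct nlmsghdr nh;
    struct ifaddrmsg ifa;
    char attrs[64];
};

/* Fill req with a request to add addr/prefixlen to device ifindex.
 * Returns 0 on success, -1 if the address or prefix length is invalid. */
int build_newaddr_request(struct addr_request *req, unsigned int ifindex,
                          const char *addr, unsigned char prefixlen)
{
    struct in_addr ip;
    struct rtattr *rta;

    if (prefixlen > 32 || inet_pton(AF_INET, addr, &ip) != 1)
        return -1;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
    req->nh.nlmsg_type = RTM_NEWADDR;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL |
                          NLM_F_ACK;
    req->ifa.ifa_family = AF_INET;
    req->ifa.ifa_prefixlen = prefixlen;
    req->ifa.ifa_index = ifindex;

    /* Append the IFA_LOCAL attribute right after the fixed headers. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
    rta->rta_type = IFA_LOCAL;
    rta->rta_len = RTA_LENGTH(sizeof(ip));
    memcpy(RTA_DATA(rta), &ip, sizeof(ip));
    req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + rta->rta_len;
    return 0;
}

/* Send a prepared request to the kernel. Changing addresses normally
 * requires administrative privileges. The kernel's acknowledgment is
 * not read here. Returns 0 if the message was sent, -1 otherwise. */
int send_addr_request(const struct addr_request *req)
{
    struct sockaddr_nl kernel;
    ssize_t sent;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&kernel, 0, sizeof(kernel));
    kernel.nl_family = AF_NETLINK;       /* nl_pid 0 addresses the kernel */
    sent = sendto(fd, req, req->nh.nlmsg_len, 0,
                  (struct sockaddr *)&kernel, sizeof(kernel));
    close(fd);
    return sent == (ssize_t)req->nh.nlmsg_len ? 0 : -1;
}
```

Using RTM_DELADDR as the message type (without the creation flags) would turn the same layout into a delete request, which is the operation the text associates with inet_rtm_deladdr.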
23.4.5. IP Configuration via ifconfig
ifconfig is implemented in the ifconfig.c user-space file (part of the net-tools package). Unlike ip, ifconfig uses ioctl calls to interface with the kernel. However, a set of functions is shared by the ip and ifconfig handlers. In Chapter 3, we had an overview of how ioctl calls are handled by the kernel. Here all we need to know is that the requests related to IPv4 configuration are handled by the inet_ioctl function in net/ipv4/af_inet.c. Based on the ioctl code, you can see which helper functions inet_ioctl uses to process the user-space commands (e.g., devinet_ioctl).
As for IPROUTE2, user-space requests from ifconfig are handled on the kernel side by wrappers that end up calling the functions in the section "Main Functions That Manipulate IP Addresses and Configuration."