

Chapter 15. Network Interface Cards



Connectivity imparts intelligence. You rarely come across a computer system today that does not support some form of networking. In this chapter, let's focus on device drivers for network interface cards (NICs) that carry Internet Protocol (IP) traffic on a local area network (LAN). Most of the chapter is bus agnostic, but wherever bus specifics are necessary, it assumes PCI. To give you a flavor of other network technologies, we also touch on Asynchronous Transfer Mode (ATM). We end the chapter by looking at performance and throughput.

NIC drivers are different from other driver classes in that they do not rely on /dev or /sys to communicate with user space. Rather, applications interact with a NIC driver via a network interface (for example, eth0 for the first Ethernet interface) that abstracts an underlying protocol stack.


Driver Data Structures

When you write a device driver for a NIC, you have to operate on three classes of data structures:

  1. Structures that form the building blocks of the network protocol stack. The socket buffer or struct sk_buff defined in include/linux/skbuff.h is the key structure used by the kernel's TCP/IP stack.

  2. Structures that define the interface between the NIC driver and the protocol stack. struct net_device defined in include/linux/netdevice.h is the core structure that constitutes this interface.

  3. Structures related to the I/O bus. PCI and its derivatives are common buses used by today's NICs.

We take a detailed look at socket buffers and the net_device interface in the next two sections. We covered PCI data structures in Chapter 10, "Peripheral Component Interconnect," so we won't revisit them here.

Socket Buffers

sk_buffs provide efficient buffer handling and flow-control mechanisms to Linux networking layers. Like DMA descriptors that contain metadata on DMA buffers, sk_buffs hold control information describing attached memory buffers that carry network packets (see Figure 15.1). sk_buffs are enormous structures having dozens of elements, but in this chapter we confine ourselves to those that interest the network device driver writer. An sk_buff links itself to its associated packet buffer using five main fields: head, data, tail, end, and len.

Figure 15.1. sk_buff operations.


Assume skb points to an sk_buff. skb->head, skb->data, skb->tail, and skb->end slide over the associated packet buffer as the packet traverses the protocol stack in either direction. skb->data, for example, points to the header of the protocol that is currently processing the packet. When a packet reaches the IP layer via the receive path, skb->data points to the IP header; when the packet passes on to TCP, however, skb->data moves to the start of the TCP header. And as the packet moves through various protocols that add or discard header data, skb->len gets updated, too. sk_buffs also contain pointers other than the four major ones previously mentioned. skb->nh, for example, remembers the position of the network protocol header irrespective of the current position of skb->data.

To illustrate how a NIC driver works with sk_buffs, Figure 15.1 shows data transitions on the receive data path. For convenience of illustration, the figure simplistically assumes that the operations shown are executed in sequence. However, for operational efficiency in the real world, the first two steps (dev_alloc_skb() and skb_reserve()) are performed while initially preallocating a ring of receive buffers; the third step is accomplished by the NIC hardware as it directly DMAs the received packet into a preallocated sk_buff; and the final two steps (skb_put() and netif_rx()) are executed from the receive interrupt handler.

To create an sk_buff to hold a received packet, Figure 15.1 uses dev_alloc_skb(). This is an interrupt-safe routine that allocates memory for an sk_buff and associates it with a packet payload buffer. dev_kfree_skb() accomplishes the reverse of dev_alloc_skb(). Figure 15.1 next calls skb_reserve() to add a 2-byte padding between the start of the packet buffer and the beginning of the payload. This starts the IP header at a performance-friendly 16-byte boundary because the preceding Ethernet headers are 14 bytes long. The rest of the code statements in Figure 15.1 fill the payload buffer with the received packet and move skb->data, skb->tail, and skb->len to reflect this operation.
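The pointer movements described above can be tried out in user space. What follows is only a minimal sketch, not kernel code: struct mini_skb and the mini_* helpers are hypothetical stand-ins that mimic what dev_alloc_skb(), skb_reserve(), skb_put(), and skb_pull() do to head, data, tail, end, and len. It ignores any extra headroom the real allocator may add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the kernel's struct sk_buff. */
struct mini_skb {
    unsigned char *head;  /* start of the allocated packet buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned char *tail;  /* end of the valid data */
    unsigned char *end;   /* end of the allocated packet buffer */
    size_t len;           /* bytes of valid data (tail - data) */
};

/* Models dev_alloc_skb(): an empty buffer with data == tail == head. */
static struct mini_skb *mini_alloc_skb(size_t size)
{
    struct mini_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

/* Models skb_reserve(): open up headroom in a still-empty buffer. */
static void mini_skb_reserve(struct mini_skb *skb, size_t n)
{
    assert(skb->len == 0 && n <= (size_t)(skb->end - skb->tail));
    skb->data += n;
    skb->tail += n;
}

/* Models skb_put(): grow the data area at the tail; returns the old tail. */
static unsigned char *mini_skb_put(struct mini_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert(n <= (size_t)(skb->end - skb->tail));
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Models skb_pull(): strip n bytes of header from the front. */
static unsigned char *mini_skb_pull(struct mini_skb *skb, size_t n)
{
    assert(n <= skb->len);
    skb->data += n;
    skb->len -= n;
    return skb->data;
}

/* Models dev_kfree_skb(): release the buffer and its descriptor. */
static void mini_free_skb(struct mini_skb *skb)
{
    free(skb->head);
    free(skb);
}
```

Reserving 2 bytes, putting a 14-byte Ethernet header plus an IP packet, and then pulling the 14 header bytes leaves data exactly 16 bytes past head, which is the alignment argument made above.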

There are more sk_buff access routines relevant to some NIC drivers. skb_clone(), for example, creates a copy of a supplied sk_buff without copying the contents of the associated packet buffer. Look inside net/core/skbuff.c for the full list of sk_buff library functions.
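To see what "a copy without copying the contents" means, here is a small user-space sketch. It assumes a reference-counted packet buffer shared between descriptors; struct shared_buf, struct shared_skb, and the shared_skb_* helpers are hypothetical names invented for this illustration and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer shared by several descriptors. */
struct shared_buf {
    unsigned char *bytes;
    size_t size;
    int refs;             /* number of descriptors using this buffer */
};

/* Hypothetical descriptor, playing the role of an sk_buff. */
struct shared_skb {
    struct shared_buf *buf;
    size_t data_off;      /* offset of valid data within buf->bytes */
    size_t len;           /* bytes of valid data */
};

static struct shared_skb *shared_skb_alloc(size_t size)
{
    struct shared_skb *skb = calloc(1, sizeof(*skb));
    struct shared_buf *buf = calloc(1, sizeof(*buf));

    if (skb && buf)
        buf->bytes = calloc(1, size);
    if (!skb || !buf || !buf->bytes) {
        free(buf);
        free(skb);
        return NULL;
    }
    buf->size = size;
    buf->refs = 1;
    skb->buf = buf;
    return skb;
}

/* Clone: duplicate the descriptor only and share the packet buffer. */
static struct shared_skb *shared_skb_clone(const struct shared_skb *skb)
{
    struct shared_skb *clone = malloc(sizeof(*clone));

    if (!clone)
        return NULL;
    *clone = *skb;
    clone->buf->refs++;
    return clone;
}

/* Free a descriptor; the buffer goes away with its last user. */
static void shared_skb_free(struct shared_skb *skb)
{
    if (--skb->buf->refs == 0) {
        free(skb->buf->bytes);
        free(skb->buf);
    }
    free(skb);
}
```

Because both descriptors point at the same bytes, a clone is cheap, but neither owner should modify the shared payload without first taking a private copy.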

The Net Device Interface

NIC drivers use a standard interface to interact with the TCP/IP stack. The net_device structure, which is even more gigantic than the sk_buff structure, defines this communication interface. To prepare ourselves for exploring the innards of the net_device structure, let's first follow the steps traced by a NIC driver during initialization. Refer to init_mycard() in Listing 15.1 as we move along.

Let's now look at the methods that define the net_device interface. We categorize them under six headings for simplicity. Wherever relevant, this section points you to the example NIC driver developed in Listing 15.1 of the section "Device Example: Ethernet NIC."

Activation

The net_device interface requires conventional methods such as open(), close(), and ioctl(). The kernel opens an interface when you activate it using a tool such as ifconfig:

bash> ifconfig eth0 up

open() sets up receive and transmit DMA descriptors and other driver data structures. It also registers the NIC's interrupt handler by calling request_irq(). The net_device structure is passed as the dev_id argument to request_irq() so that the interrupt handler gets direct access to the associated net_device. (See mycard_open() and mycard_interrupt() in Listing 15.1 to find out how this is done.)

The kernel calls close() when you pull down an active network interface. This accomplishes the reverse of open().
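The idea of handing the device structure to the interrupt handler as an opaque cookie can be modeled outside the kernel. The sketch below is an assumption-laden illustration: mini_request_irq(), mini_free_irq(), mini_raise_irq(), struct mini_netdev, and the *_model functions are all hypothetical and only imitate the pattern of registering a handler in open() and releasing it in close().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct mini_netdev {
    const char *name;
    unsigned long rx_interrupts;
};

typedef int (*mini_irq_handler_t)(int irq, void *dev_id);

#define MINI_NR_IRQS 16

/* One registered handler and its cookie per interrupt line. */
static struct {
    mini_irq_handler_t handler;
    void *dev_id;
} mini_irq_table[MINI_NR_IRQS];

/* Models request_irq(): remember the handler together with dev_id. */
static int mini_request_irq(int irq, mini_irq_handler_t handler, void *dev_id)
{
    if (irq < 0 || irq >= MINI_NR_IRQS || !handler)
        return -1;
    if (mini_irq_table[irq].handler)
        return -1;                     /* line already taken */
    mini_irq_table[irq].handler = handler;
    mini_irq_table[irq].dev_id = dev_id;
    return 0;
}

/* Models free_irq(): release the line if dev_id matches. */
static void mini_free_irq(int irq, void *dev_id)
{
    if (irq >= 0 && irq < MINI_NR_IRQS &&
        mini_irq_table[irq].dev_id == dev_id) {
        mini_irq_table[irq].handler = NULL;
        mini_irq_table[irq].dev_id = NULL;
    }
}

/* Simulates the hardware raising an interrupt on a line. */
static int mini_raise_irq(int irq)
{
    if (irq < 0 || irq >= MINI_NR_IRQS || !mini_irq_table[irq].handler)
        return -1;
    return mini_irq_table[irq].handler(irq, mini_irq_table[irq].dev_id);
}

/* The handler recovers its device from the cookie it was given. */
static int mycard_interrupt_model(int irq, void *dev_id)
{
    struct mini_netdev *netdev = dev_id;

    (void)irq;
    netdev->rx_interrupts++;
    return 1;                          /* interrupt handled */
}

static int mycard_open_model(struct mini_netdev *netdev, int irq)
{
    return mini_request_irq(irq, mycard_interrupt_model, netdev);
}

static void mycard_close_model(struct mini_netdev *netdev, int irq)
{
    mini_free_irq(irq, netdev);
}
```

With two devices opened on different lines, raising one line touches only the counters of the device whose pointer was registered for it, which is why no global lookup is needed inside the handler.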

Data Transfer

Data transfer methods form the crux of the net_device interface. In the transmit path, the driver supplies a method called hard_start_xmit, which the protocol layer invokes to pass packets down for onward transmission:

netdev->hard_start_xmit = &mycard_xmit_frame; /* Transmit Method. See Listing 15.1 */

					  

Until recently, network drivers didn't provide a net_device method for collecting received data. Instead, they asynchronously interrupted the protocol layer with packet payload. This old interface has, however, given way to a New API (NAPI) that is a mixture of an interrupt-driven driver push and a poll-driven protocol pull. A NAPI-aware driver thus needs to supply a poll() method and an associated weight that controls polling fairness:

netdev->poll   = &mycard_poll; /* Poll Method. See Listing 15.1 */
netdev->weight = 64;

We elaborate on data-transfer methods in the section "Talking with Protocol Layers."
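As a rough mental model of how a weight bounds the work done per poll, consider the following user-space sketch. It is deliberately simplified and does not reproduce the kernel's poll() signature: struct mini_nic, mini_rx_interrupt(), and mini_poll() are hypothetical, and the budget parameter is an assumed overall allowance handed in by the caller.

```c
#include <assert.h>

/* Hypothetical NIC state for the model. */
struct mini_nic {
    int rx_pending;            /* packets waiting in the receive ring */
    int weight;                /* most packets handled per poll call */
    int irq_enabled;           /* 1 if receive interrupts are unmasked */
    unsigned long rx_packets;  /* packets handed to the stack so far */
};

/* Interrupt side: note the new packets, mask interrupts, await polling. */
static void mini_rx_interrupt(struct mini_nic *nic, int new_packets)
{
    nic->rx_pending += new_packets;
    nic->irq_enabled = 0;
}

/*
 * Poll side: handle at most min(weight, budget) packets. Interrupts are
 * re-enabled only once the ring has been drained. Returns the number of
 * packets processed in this call.
 */
static int mini_poll(struct mini_nic *nic, int budget)
{
    int limit = budget < nic->weight ? budget : nic->weight;
    int done = 0;

    while (done < limit && nic->rx_pending > 0) {
        nic->rx_pending--;
        nic->rx_packets++;
        done++;
    }
    if (nic->rx_pending == 0)
        nic->irq_enabled = 1;
    return done;
}
```

With a weight of 64 and a burst of 100 packets, the first poll handles 64 and leaves interrupts masked; the second handles the remaining 36 and unmasks them. A busy interface therefore cannot monopolize the CPU in a single poll call.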

Watchdog

The net_device interface provides a hook to return an unresponsive NIC to operational state. If the protocol layer senses no transmissions for a predetermined amount of time, it assumes that the NIC has hung and invokes a driver-supplied recovery method to reset the card. The driver sets the watchdog timeout through netdev->watchdog_timeo and registers the address of the recovery function via netdev->tx_timeout:

netdev->tx_timeout = &mycard_timeout; /* Method to reset the NIC */
netdev->watchdog_timeo = 8*HZ;        /* Reset if no response
                                         detected for 8 seconds */

Because the recovery method executes in timer-interrupt context, it usually schedules a task outside of that context to reset the NIC.
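The timeout check itself is simple arithmetic on tick counts. The sketch below models it in user space under stated assumptions: MINI_HZ, struct mini_wd_dev, mini_watchdog_tick(), and mycard_timeout_model() are hypothetical names, and the model treats a stopped transmit queue as the sign that transmissions have stalled.

```c
#include <assert.h>

#define MINI_HZ 100UL  /* assumed tick rate for the model */

/* Hypothetical stand-in for the watchdog-related net_device fields. */
struct mini_wd_dev {
    unsigned long trans_start;     /* tick of the last transmission */
    unsigned long watchdog_timeo;  /* allowed silence, in ticks */
    int queue_stopped;             /* 1 while the transmit queue is stalled */
    unsigned int resets;           /* number of recoveries performed */
    void (*tx_timeout)(struct mini_wd_dev *dev);
};

/* Recovery method: reset the (imaginary) card and restart the queue. */
static void mycard_timeout_model(struct mini_wd_dev *dev)
{
    dev->resets++;
    dev->queue_stopped = 0;
}

/*
 * Called periodically with the current tick count. Unsigned subtraction
 * keeps the comparison correct even if the tick counter wraps around.
 * Returns 1 if the recovery method was invoked, 0 otherwise.
 */
static int mini_watchdog_tick(struct mini_wd_dev *dev, unsigned long now)
{
    if (dev->queue_stopped && dev->tx_timeout &&
        now - dev->trans_start > dev->watchdog_timeo) {
        dev->tx_timeout(dev);
        return 1;
    }
    return 0;
}
```

With watchdog_timeo set to 8 * MINI_HZ, the recovery method fires on the first tick after eight seconds of silence on a stalled queue, and not before.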

Statistics

To enable user land to collect network statistics, the NIC driver populates a net_device_stats structure and provides a get_stats() method to retrieve it. Essentially the driver does the following:

  1. Updates different types of statistics from relevant entry points:

    #include <linux/netdevice.h>
    struct net_device_stats mycard_stats;
    
    static irqreturn_t
    mycard_interrupt(int irq, void *dev_id)
    {
      /* ... */
      if (packet_received_without_errors) {
        mycard_stats.rx_packets++;   /* One more received
                                        packet */
      }
      /* ... */
    }

  2. Implements the get_stats() method to retrieve the statistics:

    static struct net_device_stats
    *mycard_get_stats(struct net_device *netdev)
    {
       /* House keeping */
       /* ... */
       return(&mycard_stats);
    }

  3. Supplies the retrieve method to higher layers:

    netdev->get_stats = &mycard_get_stats;
    /* ... */
    register_netdev(netdev);

To collect statistics from your NIC, trigger invocation of mycard_get_stats() by executing an appropriate user mode command. For example, to find the number of packets received through the eth0 interface, do this:

bash> cat /sys/class/net/eth0/statistics/rx_packets
124664
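A monitoring program can read the same counter from C. The following is a minimal user-space sketch; parse_counter() and read_counter() are hypothetical helper names, and the only assumption about the file is that it holds one unsigned decimal number, as in the output above.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one unsigned decimal counter from an open stream. */
static int parse_counter(FILE *fp, unsigned long long *value)
{
    return fscanf(fp, "%llu", value) == 1 ? 0 : -1;
}

/*
 * Read a counter from a file such as
 * /sys/class/net/eth0/statistics/rx_packets.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_counter(const char *path, unsigned long long *value)
{
    FILE *fp = fopen(path, "r");
    int ret;

    if (!fp)
        return -1;
    ret = parse_counter(fp, value);
    fclose(fp);
    return ret;
}
```

Calling read_counter() twice, a second apart, and subtracting the two values gives a simple packets-per-second figure for the interface.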

WiFi drivers need to track several parameters not relevant to conventional NICs, so they implement a statistic collection method called get_wireless_stats() in addition to get_stats(). The mechanism for registering get_wireless_stats() for the benefit of WiFi-aware user space utilities is discussed in the section "WiFi" in the next chapter.

Configuration

NIC drivers need to support user space tools that are responsible for setting and getting device parameters. Ethtool configures parameters for Ethernet NICs. To support ethtool, the underlying NIC driver does the following:

  1. Populates an ethtool_ops structure, defined in include/linux/ethtool.h, with prescribed entry points:

    #include <linux/ethtool.h>
    
    /* Ethtool_ops methods */
    struct ethtool_ops mycard_ethtool_ops = {
      /* ... */
      .get_eeprom = mycard_get_eeprom, /* Dump EEPROM
                                          contents */
      /* ... */
    };

  2. Implements the methods that are part of ethtool_ops:

    static int
    mycard_get_eeprom(struct net_device *netdev,
                      struct ethtool_eeprom *eeprom,
                      uint8_t *bytes)
    {
       /* Access the accompanying EEPROM and pull out data */
       /* ... */
    }

  3. Exports the address of its ethtool_ops:

    netdev->ethtool_ops = &mycard_ethtool_ops;
    /* ... */
    register_netdev(netdev);

After these are done, ethtool can operate over your Ethernet NIC. To dump EEPROM contents using ethtool, do this:

bash> ethtool -e eth0
Offset          Values
------          ------
0x0000          00 0d 60 79 32 0a 00 0b ff ff 10 20 ff ff ff ff
...

Ethtool comes packaged with some distributions; but if you don't have it, download it from http://sourceforge.net/projects/gkernel/. Refer to the man page for its full capabilities.
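The body of a get_eeprom-style method usually comes down to copying a bounds-checked window of EEPROM bytes into the caller's buffer. The sketch below models that in user space. struct mini_eeprom_req, mini_eeprom_image, and mini_get_eeprom() are hypothetical; the image simply reuses the sample bytes from the ethtool output above, and a real driver would read the bytes from the card instead of from an array.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MINI_EEPROM_SIZE 16

/* Hypothetical request: which window of the EEPROM to dump. */
struct mini_eeprom_req {
    uint32_t offset;  /* first byte to read */
    uint32_t len;     /* number of bytes to read */
};

/* Sample image; the values are the ones shown in the dump above. */
static const uint8_t mini_eeprom_image[MINI_EEPROM_SIZE] = {
    0x00, 0x0d, 0x60, 0x79, 0x32, 0x0a, 0x00, 0x0b,
    0xff, 0xff, 0x10, 0x20, 0xff, 0xff, 0xff, 0xff
};

/*
 * Copy the requested window into bytes. Returns 0 on success or
 * -EINVAL if the window is empty or runs past the end of the EEPROM.
 */
static int mini_get_eeprom(const struct mini_eeprom_req *req, uint8_t *bytes)
{
    if (req->len == 0 || req->offset >= MINI_EEPROM_SIZE ||
        req->len > MINI_EEPROM_SIZE - req->offset)
        return -EINVAL;
    memcpy(bytes, mini_eeprom_image + req->offset, req->len);
    return 0;
}
```

Validating offset and len before the copy matters because both values originate in user space.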

There are more configuration-related methods that a NIC driver provides to higher layers. An example is the method to change the MTU size of the network interface. To support this, supply the relevant method to net_device:

netdev->change_mtu = &mycard_change_mtu;
/* ... */
register_netdev(netdev);

The kernel invokes mycard_change_mtu() when you execute a suitable user command to alter the MTU of your card:

bash> echo 1500 > /sys/class/net/eth0/mtu
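A change_mtu method typically validates the requested size before accepting it. Here is a user-space sketch of that check, assuming the conventional limits for standard Ethernet (68 to 1500 bytes); a card that supports jumbo frames would use a larger upper bound. struct mini_mtu_dev, the MINI_* constants, and mycard_change_mtu_model() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

#define MINI_MIN_MTU 68    /* assumed lower bound (minimum IPv4 MTU) */
#define MINI_MAX_MTU 1500  /* assumed upper bound (standard Ethernet payload) */

/* Hypothetical stand-in for the mtu field of struct net_device. */
struct mini_mtu_dev {
    int mtu;
};

/*
 * Accept the new MTU only if it lies within the supported range.
 * Returns 0 on success and -EINVAL otherwise, leaving mtu untouched.
 */
static int mycard_change_mtu_model(struct mini_mtu_dev *netdev, int new_mtu)
{
    if (new_mtu < MINI_MIN_MTU || new_mtu > MINI_MAX_MTU)
        return -EINVAL;
    netdev->mtu = new_mtu;
    return 0;
}
```

A real driver may also need to resize its receive buffers or reprogram the hardware when the MTU changes; the model leaves that out.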

Bus Specific

Next come bus-specific details such as the start address and size of the NIC's on-card memory. For a PCI NIC driver, this configuration will look like this:

netdev->mem_start = pci_resource_start(pdev, 0);
netdev->mem_end   = netdev->mem_start + pci_resource_len(pdev, 0);

We discussed PCI resource functions in Chapter 10.
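The arithmetic behind those two assignments can be checked with a small user-space model. It assumes, as the kernel's resource bookkeeping does, that a region is recorded by its first and last address inclusive, so its length is end - start + 1. struct mini_resource and mini_resource_len() are hypothetical stand-ins, and the addresses used to exercise them are made up.

```c
#include <assert.h>

/* Hypothetical stand-in for one PCI base address region. */
struct mini_resource {
    unsigned long start;  /* first address of the region */
    unsigned long end;    /* last address of the region (inclusive) */
};

/* Length of the region; an all-zero entry models an unassigned BAR. */
static unsigned long mini_resource_len(const struct mini_resource *res)
{
    if (res->start == 0 && res->end == 0)
        return 0;
    return res->end - res->start + 1;
}
```

For a 4KB region starting at 0xf0000000, the length is 0x1000, so mem_start plus the length is 0xf0001000. In other words, mem_end as computed above is the address just past the last byte of on-card memory.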
