chap3: interface layer

3.1 intro

chapter is mostly overview of interfaces
PLUS internal ifnet and ifaddr structures, which are set when one does
something like this:

    # ifconfig ed0 inet 131.252.215.2 netmask 255.255.255.240

which is done at boot in /etc/rc.conf,
or via pccardd via script/setup,
or by hand

E.g.,
    ed0 - ethernet driver
    xl0 - 3com 100BASE driver
    en0 - intel gbit driver
    sk0 - syskonnect gbit driver
    wi0 - 802.11 lucent/prism2 driver
    an0 - cisco 802.11 driver
    ppp0 - ppp driver

driver may or may not be modular.

    # ifconfig ed0 1.2.3.4 ...
    # ifconfig -a    (list all interfaces)

Note some are virtual, some are "real":

    lp0: flags=8810 mtu 1500
    mvif0: flags=80 mtu 16384
    lo0: flags=8049 mtu 16384
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
    ppp0: flags=8010 mtu 1500
    sl0: flags=c010 mtu 552
    faith0: flags=8002 mtu 1500
    wi0: ...
    ed0: ...
    en0: ...

    # netstat -i    (interface stats)

stack/interface layer provides for devices:
    1. set of interface functions
    2. standard set of stats/control flags
    3. device-independent way of storing protocol addresses
    4. standard queueing method for packets going out (going in)

No reliable delivery: best effort (if out of space, darn)

net/3: in this book
    amd 7990 lance ethernet interface.
    slip interface (sl0) (built on top of rs-232 driver)
    loopback interface (lo0)

3.2 introduction

    sys/socket.h - address structure definitions
    net/if.h - if structure definition
    net/if_dl.h - link-level structure definitions
    kern/init_main.c - device init
    net/if.c - generic interface code
    net/if_loop.c - loopback driver
    net/if_sl.c - slip driver
    hp300/dev/if_le.c - lance ethernet driver

more likely places:
    /sys/dev - modular drivers (sound, scsi, ethernet)
        /sys/dev/wi ...
        /sys/dev/ed
    i386/isa ...
        isa bus drivers that are not modular
    pci/if_sk.c - syskonnect gE driver

global variables
    struct ifnet * ifnet_addrs - link-level if addresses
    hz (usually 100)

snmp variables
    system
    interfaces
    address translation (dead)
    ip
    icmp
    tcp
    udp
    transmission - ethernet
    snmp - snmp itself

snmpd - from net-snmp (was ucd-snmp)
    usually snmpd daemon for linux/ net box.
    sucks counters out of kernel for various things.
snmpwalk - app used to talk to snmpd

variables consist of:
    1. simple integer counters (bytes in/bytes out) on interface
    2. lists, e.g., individual routing entry
    3. list of lists, routing table itself

ignore awful ISODE mentions ... net-snmp ...

interfaces MIB
    interfaces.ifNumber    if_index + 1 ... # number of interfaces on this box.

3.3 ifnet structure

interface structure: basic for any interface
    struct ifnet
    list of associated addresses
    # e.g.,
    # ifconfig ed0 1.2.3.4 netmask 0xff000000
    # ifconfig ed0 alias 2.3.4.5 netmask 0xffff0000
    now 1 interface has 2 ip addresses
    OR one interface may support ipv4/ipv6, and have two addresses
    for that reason.

ifnet structure
    implementation info
    hardware info
    interface stats
    function pointers (implemented by driver)
    output queue

struct ifnet chained together
    ifconfig adds ...

ifnet
    if_next -> to next ifnet
    if_addrlist - list of addresses for this if
    if_name - string name of interface (ed0)
    if_unit - in case interface has sub-units
    if_index - integer index for interface
    if_flags - capabilities of i/f
        operational state. up/not up.
        # ifconfig sl0 down
        see figure 3.7
        # ifconfig sl0
        sl0: flags=c010 mtu 552

***************************
flags
    SIOCGIFFLAGS/SIOCSIFFLAGS as ioctl access these
    see figure 3.7
    note: flags used internally by driver for state like OACTIVE
    (xmit in progress)
    note: carrier indication may also exist with ethernet i/fs;
    e.g., full-duplex, half-duplex, etc.

    if_timer - time till watchdog timeout function called
    if_pcount - count of prom.
        mode listeners
    if_bpf - bpf packet filter structure

interface timer
    if_timer - watchdog timer

bsd packet filter
    Berkeley Packet Filter -- chap. 31
    if_pcount - number of prom. mode listeners
    if_bpf - pointer to bpf structure

figure 3.8 hw interface characteristics
    note how #define if_mtu is short-form substitution for longer glarp
    see net/if_types.h
    if_type - IFT_ETHER, length 6, hdr length 6 + 6 + 2
    if_mtu - ethernet < 1 gbit, 1500; gbit can be in 9k range, but that
        requires switch support and won't last long
    note: usoft ppp has 1500 MTU, 1500 is sacred number in some sense.

interface stats: figure 3.10
    note: # netstat -in shows some of this info
    inputs/input errors
    outputs/output errs
    collisions
    if_collisions ... no telling if this means 1 collision or
        collision max, therefore pkt not sent. (16)
    what does "collision" mean on radio interface?

    example 1: /sys/dev/wi/if_wi.c

        ifp->if_collisions =
            sc->wi_stats.wi_tx_single_retries +
            sc->wi_stats.wi_tx_multi_retries +
            sc->wi_stats.wi_tx_retry_limit;

    example 2: intel fxp ethernet
    updated once a second in timeout driver routine:

        ifp->if_collisions += sp->tx_total_collisions;

    example 3: ed0 ethernet device

        ifp->if_collisions += collisions;
        switch(collisions) {
        case 0:
        case 16:
            break;
        case 1:
            sc->mibdata.dot3StatsSingleCollisionFrames++;
            sc->mibdata.dot3StatsCollFrequencies[0]++;
            break;
        default:
            sc->mibdata.dot3StatsMultipleCollisionFrames++;
            sc->mibdata.dot3StatsCollFrequencies[collisions-1]++;
            break;
        }

------------------------------------------------------------------
note: /* XXX */ means "hack hack hack!"

ifnet structure has 7 function pointers that should be set at
init time by driver
    if_init - initialize
    if_output - queue packets for xmit
    if_start - start xmit
    if_done - cleanup after xmit (not used)
    if_ioctl - ioctl ... misc functions (e.g., set IP address)
    if_reset - reset hw
    if_watchdog - periodic timeout

    (*ifp->if_start) (ifp);    does what ...

Figure 3.13: structure ifqueue ... output queue (where is input queue?)
struct ifqueue (linked list of mbuf chains)
    head
    tail
    len
    maxlen
    ifq_drops ... drop count
    ifq_maxlen = IFQ_MAXLEN (50 at this point)

Figure 3.14 ... general queue handling macros/routines
net/if.c
    IF_QFULL - is q full
    IF_DROP - increment drop count, caller drops pkt
    IF_ENQUEUE - add to end of q
    IF_PREPEND - put at front of q
    IF_DEQUEUE - take 1st pkt from q
    if_qflush - wipe out q as interface is taken down

3.4 ifaddr structure

1 interface has linked list of ifaddr structures:
2 ip addresses, ipv6 address, etc.
see figure 3.15
    ifa_addr ...
    ifa_dstaddr ... overloaded: broadcast address, or the other end if ptp
    ifa_netmask

note that
    # ifconfig ed0 1.2.3.4
installs 1/8 as route in routing table.

3.5 sockaddr structure

struct sockaddr
    sa_len
    sa_family
    sa_data[14..253] (variable length)
note: 16 bytes
note: sa_data may be up to 253 bytes long.

3.6 ifnet/ifaddr specialization

each driver may have "softc" structure with layout as follows:
see figure 3.20
    le_softc[0]
        ifnet
        arpcom
        driver specific parts
generic high-level stuff comes first, followed by possible
hw specific fields/structures
arpcom used by all users of arp

ifaddr list contains addresses including link-level address (MAC if IEEE)
sockaddr_dl contains MAC and possible mask for it
See figure 3.11

3.7 network init overview

software only devices:
    slip (barring hw), loopback, tun tunneling device, gif tunneling device

Figure 3.23 main function
    line 173: attach pseudo-devices

probe:
    BSD drivers traditionally have had boot "probe":
    os speaks to hw, probes bus for devices.
    on pc, lets you know if e.g., isa IRQ setup is working.
    non-modular devices will have bus probe (e.g., pci/isa) routine
    called at boot.
    task is to determine if hw is resident and initialize it ...
    e.g., copy in MAC address (good test of hw)
    modular devices may do the same when loaded by kldload
    pccard devices may do the same when initialized by pccardd
    call this "hw ready state"

attach:
    essentially setup software ready state: ifconfig drives this usually
    if_attach called to insert ifnet/softc into list of interfaces (if list)

if_le.c declares 1 or more le_softc structures with NLE elements
    starts with arpcom
    note figure 3.26 what arpcom has in it

See figure 3.27 for attach-time code
    note: this function combines probe-time and attach in one function.
    SIMPLEX means what? (it means it is ... old ... :->)

3.9 slip init

Figure 3.28 entire slip softc structure
    note struct tty ... input/output queues, ...
    an internal function like putc can be called to talk to rs-232 hw
    TCP header compression also known as VJ compression
    if_mtu for slip is small (296 bytes)

figure 3.30 slip attach function ... does for all of them.
    slattach(8) runs to bind hw/sw driver together,
    set baud rate, do ioctls on hw ...

3.10 loopback init

... most minimal ...

    # ifconfig lo0
    lo0: flags=8049 mtu 16384
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000

3.11 if_attach function

if_attach in net/if.c does work of binding together if list

Figure 3.32 ifnet list
    ifnet ---> linked list of ifs
    ifnet_addrs ---> linked list of addresses
        hw addresses at this point. later ifconfig really makes
        hw ready to go and binds IP addresses in.

Why ifnet_addrs:
    list of ip addresses used by ip_input to determine if packet
    is for "us"

p. 86 Link-level address for ethernet device is 48 bit MAC.

Figure 3.33: struct sockaddr_dl
    sdl_index: assigned as ifs added to iflist.
        links have numbers, link #1, etc.
        This is the "link index", and can be returned with recvmsg(2).
    sdl_data contains both i/f name and LL address

    # netstat -in    (note Link#?)
    Name   Mtu    Network      Address            Ipkts Ierrs Opkts Oerrs Coll
    lp0*   1500                                       0     0     0     0    0
    mvif0  16384                                      0     0     0     0    0
    lo0    16384                                    210     0   210     0    0
    lo0    16384  ::1/128      ::1                    0     -     0     -    -
    lo0    16384  fe80:3::1/6  fe80:3::1              0     -     0     -    -
    lo0    16384  127          127.0.0.1            210     -   210     -    -
    ppp0*  1500                                       0     0     0     0    0
    sl0*   552                                        0     0     0     0    0
    faith  1500                                       0     0     0     0    0

pccard card insertion adds:

    wi0    1500                00:60:b3:68:b8:3f      0     0     3     0    0

ifconfig wi0 1.2.3.4 adds:

    wi0    1500   1            1.2.3.4                0     -     0     -    -
    wi0    1500   fe80:7::260  fe80:7::260:b3ff:      0     -     0     -    -

----------------------------------------------------------------
Figure 3.34 if_attach code
    find end of interface list
    assign next index
    first time code to allocate space for ifnet_addrs
    note malloc/free/bcopy

Figure 3.35 more if_attach
    construct name from base name and unit, wi + 0 == wi0
    space for MAC

Figure 3.41 ether_ifattach
    called by device driver attach routine ... (more or less last)
    generic part of ethernet ifp setup
    plugs in MAC address into relevant dl structure

3.12 ifinit function

kernel arranges to set max q size to 50
and must start watchdog timers at some point
with automagic if_slowtimo facility, which calls watchdogs
1 time per second

driver can set this up during attach:
    set watchdog function
    then at some point set if_timer = 1 or 0 to turn the
    watchdog timer on/off

drivers CAN arrange to call themselves
    to call:    timeout(ed_tick, sc, hz);
    to uncall:  untimeout(ed_tick, sc, hz);
from kernel managed timeout Q:
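
sketch (not kernel source): the once-per-second watchdog countdown
described in 3.12 can be modeled in user space roughly as below.
Assumptions: the struct ifnet here is cut down to the fields these notes
mention; ifnet_head, watchdog_calls, sketch_watchdog, sketch_if_attach and
sketch_if_slowtimo are hypothetical names made up for illustration; the
watchdog callback takes the unit number, which may differ from kernel to
kernel; the real routine also reschedules itself through timeout(), which
is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct ifnet (assumption: only the
 * fields needed to show the watchdog countdown are kept). */
struct ifnet {
    struct ifnet *if_next;          /* next interface in the list */
    const char   *if_name;          /* base name, e.g. "ed" */
    short         if_unit;          /* unit number, e.g. 0 for ed0 */
    short         if_timer;         /* seconds until watchdog fires; 0 = off */
    void        (*if_watchdog)(int unit);
};

static struct ifnet *ifnet_head;    /* hypothetical head of the if list */
static int watchdog_calls[8];       /* hypothetical per-unit call counter */

/* Example watchdog: just counts how often it was called for each unit. */
static void sketch_watchdog(int unit)
{
    if (unit >= 0 && unit < 8)
        watchdog_calls[unit]++;
}

/* Append an interface to the end of the list, as the notes describe
 * for if_attach ("find end of interface list"). */
static void sketch_if_attach(struct ifnet *ifp)
{
    struct ifnet **p = &ifnet_head;

    assert(ifp != NULL);
    while (*p != NULL)
        p = &(*p)->if_next;
    ifp->if_next = NULL;
    *p = ifp;
}

/* One tick of the slow timer: skip interfaces whose timer is off (0),
 * decrement the rest, and call the watchdog when a timer reaches 0. */
static void sketch_if_slowtimo(void)
{
    struct ifnet *ifp;

    for (ifp = ifnet_head; ifp != NULL; ifp = ifp->if_next) {
        if (ifp->if_timer == 0 || --ifp->if_timer != 0)
            continue;
        if (ifp->if_watchdog != NULL)
            (*ifp->if_watchdog)(ifp->if_unit);
    }
}
```

usage: attach an interface with if_timer = 2 and call
sketch_if_slowtimo() twice; the callback fires once, on the second tick.
a timer left at 0 is never decremented, which is how a driver keeps its
watchdog off.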