Routing in the Internet

Routing is the technique by which data finds its way from one host computer to another. In the Internet context there are three major aspects of routing:

  1. Physical Address Determination
  2. Selection of inter-network gateways
  3. Symbolic and Numeric Addresses

The first of these is necessary when an IP datagram is to be transmitted from a computer. It is necessary to encapsulate the IP datagram within whatever frame format is in use on the local network or networks to which the computer is attached. This encapsulation clearly requires the inclusion of a local network address or physical address within the frame.

The second of these is necessary because the Internet consists of a number of local networks interconnected by one or more gateways. Such gateways, generally known as routers, sometimes have physical connections or ports onto many networks. The determination of the appropriate gateway and port for a particular IP datagram is called routing and also involves gateways interchanging information in standard ways.

The third aspect, which involves address translation from a reasonably human-friendly form to numeric IP addresses, is performed by a system known as the Domain Name System, or DNS for short. It is not considered further at this stage.

Physical Address Determination

If a computer wishes to transmit an IP datagram it needs to encapsulate it in a frame appropriate to the physical medium of the network it is attached to. For the successful transmission of such a frame it is necessary to determine the physical address of the destination computer. This can be achieved fairly simply using a table that maps IP addresses to physical addresses. Such a table may include addresses for IP nets and a default address as well as the physical addresses corresponding to the IP addresses of locally connected computers.

Such a table could be configured into a file and read into memory at boot up time. However, it is normal practice for a computer to use a protocol known as ARP (Address Resolution Protocol), defined by RFC 826. This operates dynamically to maintain the translation table, which is known as the ARP cache.

On most Unix systems the contents of the ARP cache can be displayed using the command arp -a.

Here is typical output from the arp -a command:

scitsc16.wlv.ac.uk (134.220.4.16) at 8:0:20:b:ca:2
scitsc17.wlv.ac.uk (134.220.4.17) at 8:0:20:c:41:70
ccuf.wlv.ac.uk (134.220.4.202) at 8:0:20:10:e6:6
scit-sun-gw1.wlv.ac.uk (134.220.4.203) at 0:0:c0:fd:80:a4
scitsd.wlv.ac.uk (134.220.4.205) at 8:0:20:77:cf:18
scitsc31.wlv.ac.uk (134.220.4.31) at 8:0:20:4:96:83

This was obtained on the scitsc.wlv.ac.uk host at 0845 on May 7th 1996.

A computer determines its own physical address at boot up by examining the hardware, and its own IP address by reading a configuration file, but the ARP cache still has to be filled. The computer does this by broadcasting an ARP request whenever it encounters an IP address that cannot be mapped to a physical address by consulting the cache.

The format of an ARP request or reply on an Ethernet is:

Section            Field                          Bytes  Typical values
Ethernet header    Ethernet Destination Address   6      A broadcast address (for a request)
                   Ethernet Source Address        6      Identifies computer making request
                   Frame Type                     2      0x0806 for both ARP requests and replies (0x8035 identifies RARP)
ARP request/reply  Hardware Type                  2      1 for Ethernet
                   Protocol Type                  2      0x0800 for IP addresses
                   Hardware Address Size          1      6 for Ethernet
                   Protocol Address Size          1      4 for IP
                   Operation                      2      1 for request, 2 for reply
                   Sender Ethernet Address        6      -
                   Sender IP Address              4      -
                   Destination Ethernet Address   6      Not filled in on ARP request
                   Destination IP Address         4      -
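As an illustration of the layout above, the following Python sketch packs an ARP request into the 42 bytes of an Ethernet frame. The function names mac_to_bytes and build_arp_request are invented for this example, and the sketch only builds the bytes; actually transmitting them would need a raw socket and suitable privileges, which are not shown.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806          # frame type for ARP
BROADCAST_MAC = b"\xff" * 6     # Ethernet broadcast address


def mac_to_bytes(mac: str) -> bytes:
    """Convert text such as '8:0:20:b:ca:2' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated hex fields: " + mac)
    return bytes(int(part, 16) for part in parts)


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request (hypothetical helper)."""
    src = mac_to_bytes(sender_mac)
    # Ethernet header: destination, source, frame type.
    ether_header = struct.pack("!6s6sH", BROADCAST_MAC, src, ETHERTYPE_ARP)
    # ARP body, field by field in the order of the table.
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IP
        6,                            # hardware address size
        4,                            # protocol address size
        1,                            # operation: request
        src,                          # sender Ethernet address
        socket.inet_aton(sender_ip),  # sender IP address
        b"\x00" * 6,                  # destination Ethernet address (unknown)
        socket.inet_aton(target_ip),  # destination IP address
    )
    return ether_header + arp_body


frame = build_arp_request("8:0:20:b:ca:2", "134.220.4.16", "134.220.4.203")
print(len(frame), frame.hex())
```

The "!" prefix in the struct format selects network byte order with no padding, so the packed sizes match the byte counts in the table exactly.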

By making such requests a host can fill up the ARP cache. ARP cache entries will eventually time-out and a new query will have to be made. This allows a computer to respond to changing topology. Typical timeouts are about 20 minutes. An ARP request to a non-existent computer may be repeated after a few seconds up to a modest maximum number of times.
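The time-out behaviour can be pictured with a small Python sketch. The ArpCache class below is purely illustrative and is not how any particular operating system implements its cache; the 20 minute default is taken from the typical figure quoted above, and the injectable clock parameter is an assumption made so that the example can be exercised without waiting.

```python
import time


class ArpCache:
    """Toy ARP cache: IP address -> physical address, with expiry."""

    def __init__(self, timeout_seconds=20 * 60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries = {}  # ip -> (physical address, time stored)

    def add(self, ip, mac):
        """Record a mapping learned from an ARP reply or broadcast."""
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip):
        """Return the physical address, or None if absent or timed out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, stored_at = entry
        if self._clock() - stored_at > self._timeout:
            # Stale entry: discard it so the caller issues a fresh ARP request.
            del self._entries[ip]
            return None
        return mac


cache = ArpCache()
cache.add("134.220.4.203", "0:0:c0:fd:80:a4")
print(cache.lookup("134.220.4.203"))
```

A lookup that returns None is the point at which a real host would broadcast a new ARP request.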

If a computer is connected to more than one network via separate ports then a separate ARP cache will be maintained for each interface. Alternatively, a single cache may be kept in which each entry carries a further field identifying the interface it belongs to.

It may be thought that ARP requests will be made for every Internet computer a computer wishes to contact. This is not true. A datagram for an IP address that is not on a local or directly connected network is sent instead to an IP router whose own IP address is on a directly connected network, so only the router's physical address has to be resolved.

Since ARP requests are broadcast, any computer maintaining an ARP cache can monitor all such broadcasts, extract the sending computer's physical and IP addresses and update its own ARP cache as necessary. When a computer boots up it can send an ARP request (perhaps for its own IP address!) as a means of announcing its presence on the local network.

It is possible to associate more than one IP address with a single physical address.

Note that the ARP request format is designed to be capable of supporting protocols other than IP and Ethernet as long as it is possible to broadcast on the local network.

Reverse Address Resolution Protocol

Discless workstations were once widely used. These had a local processor and RAM but all disc space was supplied from a server using NFS or some similar system. In the absence of local configuration files, boot-up involved the use of a very simple file transfer protocol known as TFTP. However, before this could be used the workstation needed to know its IP address. In order to determine this, the Reverse Address Resolution Protocol (RARP) described in RFC 903 was used. This used the same message format as ARP but used operation types 3 and 4 for requests and responses. Only suitably configured RARP servers would reply to such requests.

RARP may still be encountered in conjunction with devices such as laser printers.

Internet Routing - Internal Routing Tables

Within any host there will be a routing table that the host uses to determine which physical interface address to use for outgoing IP datagrams. Once this table has been consulted the ARP cache(s) will be consulted to determine the physical address.

If a computer receives an IP datagram on any interface there are two possibilities. One is that the datagram is intended for that computer, in which case it will be passed to the relevant application. The other is that the datagram is addressed to some other computer, in which case the computer will attempt to re-transmit it on one or other of the available interfaces.

On Unix systems the command netstat -nr can usually be used to display the state of the routing table.

Here is typical output from the netstat -nr command:

Routing tables
Destination          Gateway              Flags    Refcnt Use        Interface
127.0.0.1            127.0.0.1            UH       6      1748676    lo0
default              134.220.4.203        UG       74     17345705   le0
134.220.40.0         134.220.4.203        UG       0      0          le0
134.220.32.0         134.220.4.203        UG       0      15516      le0
134.220.8.0          134.220.4.203        UG       0      359006     le0
134.220.17.0         134.220.4.203        UG       0      0          le0
134.220.1.0          134.220.4.203        UG       3      1346065    le0
134.220.18.0         134.220.4.203        UG       0      4708       le0
134.220.10.0         134.220.4.203        UG       0      103836     le0
134.220.35.0         134.220.4.203        UG       0      0          le0
134.220.3.0          134.220.4.203        UG       0      643        le0
134.220.19.0         134.220.4.203        UG       0      469        le0
134.220.11.0         134.220.4.203        UG       0      211689     le0
134.220.20.0         134.220.4.203        UG       0      6525       le0
134.220.12.0         134.220.4.203        UG       0      107309     le0
134.220.4.0          134.220.4.1          U        114    28841321   le0
134.220.13.0         134.220.4.203        UG       0      8748       le0
134.220.37.0         134.220.4.204        UG       0      567        le0
134.220.6.0          134.220.4.203        UG       0      1202340    le0
134.220.15.0         134.220.4.203        UG       0      2566       le0
134.220.7.0          134.220.4.203        UG       7      1207070    le0
134.220.39.0         134.220.4.203        UG       0      0          le0

This was obtained on the scitsc.wlv.ac.uk host at 0859 on May 7th, 1996.

So if, for example, the host wanted to send an IP datagram to 134.220.6.12, it would use the above table to determine that it had to go via 134.220.4.203 (a gateway) and then use the ARP cache to determine the physical address of the gateway (it was 0:0:c0:fd:80:a4). The datagram is then sent to the gateway, which uses a similar table to determine the physical interface for the datagram and then uses its ARP cache to determine the physical address for the datagram.

There are four basic items of information in such a table:

  1. A destination IP address.

  2. A gateway IP address. This will be the same as the destination IP address for directly connected destinations.

  3. Various flags usually displayed as U, G, H and sometimes D and M. U means the route is up. G means the route is via a gateway. H means the destination address is a host address as distinct from a network address. D means the route was created dynamically by an ICMP redirect and M means it was modified by one.

  4. The physical interface identification.

The destination address may appear as "default".

The host first looks in the routing table for the destination address as a host address. If it is not found, the host looks for the destination's network address, and if that is not found either it uses one of the default routes (there may be several).
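The lookup order just described (host route, then network route, then default) can be sketched in Python. The ROUTES dictionary holds a handful of rows copied from the netstat -nr output above; the function names lookup_route and next_hop are invented for this example, and the 24-bit network prefix is an assumption inferred from the x.y.z.0 destinations in that table rather than something the table itself states.

```python
import ipaddress

# destination -> (gateway, flags, interface), a few rows from the table above
ROUTES = {
    "127.0.0.1": ("127.0.0.1", "UH", "lo0"),
    "default": ("134.220.4.203", "UG", "le0"),
    "134.220.6.0": ("134.220.4.203", "UG", "le0"),
    "134.220.37.0": ("134.220.4.204", "UG", "le0"),
    "134.220.4.0": ("134.220.4.1", "U", "le0"),
}


def lookup_route(dest_ip, routes=ROUTES, prefix_len=24):
    """Return the matching table row, or None if there is no route at all."""
    # 1. Look for the destination as a host address (H flag).
    entry = routes.get(dest_ip)
    if entry is not None and "H" in entry[1]:
        return entry
    # 2. Look for the destination's network address (assumed /24 here).
    network = ipaddress.ip_network(f"{dest_ip}/{prefix_len}", strict=False)
    entry = routes.get(str(network.network_address))
    if entry is not None and "H" not in entry[1]:
        return entry
    # 3. Fall back to the default route, if one exists.
    return routes.get("default")


def next_hop(dest_ip, routes=ROUTES):
    """Return (IP address to resolve with ARP, interface) for a datagram."""
    entry = lookup_route(dest_ip, routes)
    if entry is None:
        raise LookupError("no route to " + dest_ip)
    gateway, flags, interface = entry
    # With the G flag the frame goes to the gateway; otherwise the
    # destination is directly connected and is resolved itself.
    return (gateway if "G" in flags else dest_ip), interface


print(next_hop("134.220.6.12"))   # via the gateway 134.220.4.203
print(next_hop("134.220.4.17"))   # directly connected
```

The address returned by next_hop is the one that would then be looked up in the ARP cache.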

A host dedicated to providing a gateway service between several networks is known as a router and may have a very large routing table (64 MB is not unknown) and will run special protocols to interchange routing information with other hosts and routers.

A general purpose host may have connections to at most two or three networks and a correspondingly simple table.

Communication between routers

The complete Internet consists of a large number of interconnected autonomous systems (ASs), each of which constitutes a distinct routing domain. Such autonomous systems are usually run by a single organisation such as a company or university. Within an AS, routers communicate with each other using one of several possible intra-domain routing protocols, also known as interior gateway protocols. ASs are connected via gateways, which exchange information using inter-domain routing protocols, also known as exterior gateway protocols.

The commonest interior gateway protocols are the Routing Information Protocol (RIP) defined in RFC 1058 and the more recent Open Shortest Path First (OSPF) protocol defined in RFC 1247. The purpose of these protocols is to enable routers to exchange locally obtained information so that all routers within an AS have a coherent and up to date picture of how to reach any host within the AS.

Whenever a host receives routing information it is expected to revise its routing tables in the light of the new information. This update may cause the host to send new routing information to further hosts so that changes will propagate across the network.

The RIP (RFC 1058) protocol

Using RIP, a host will periodically broadcast (or send to all neighbouring routers if there is no broadcast facility) its entire routing table or those parts that have changed recently. RIP information is transmitted over UDP/IP using messages of the form:

Field           Bytes  Typical values
Command         1      1 = Request, 2 = Reply, 3 = Obsolete, 4 = Obsolete, 5 = Poll, 6 = Poll Entry
Version         1      1 or 2
Reserved        2      Must be zero
Address Family  2      2 for IP addresses
Reserved        2      Must be zero
IP Address      4      Address of host
Reserved        8      Must be zero
Metric          4      A number in the range 1 to 16

The metric is the hop-count to the host whose IP address is quoted. A value of 16 implies the host is unreachable. The 20 bytes specifying address family, IP address and metric may be repeated up to 25 times. An IP address of 0.0.0.0 is regarded as a default address.
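The message layout can be made concrete with a short Python sketch that packs and unpacks a RIP reply: a 4 byte header followed by up to 25 entries of 20 bytes each. The function names pack_rip_reply and unpack_rip are invented for this example; it handles only the fields listed above and is not a complete RIP implementation.

```python
import socket
import struct

RIP_HEADER = struct.Struct("!BBH")     # command, version, reserved
RIP_ENTRY = struct.Struct("!HH4s8xI")  # family, reserved, IP, 8 zero bytes, metric

COMMAND_REPLY = 2
FAMILY_IP = 2
MAX_ENTRIES = 25


def pack_rip_reply(routes, version=1):
    """Pack a list of (ip, metric) pairs into a RIP reply message."""
    if len(routes) > MAX_ENTRIES:
        raise ValueError("at most 25 entries fit in one message")
    parts = [RIP_HEADER.pack(COMMAND_REPLY, version, 0)]
    for ip, metric in routes:
        if not 1 <= metric <= 16:
            raise ValueError("metric must be in the range 1 to 16")
        parts.append(RIP_ENTRY.pack(FAMILY_IP, 0, socket.inet_aton(ip), metric))
    return b"".join(parts)


def unpack_rip(message):
    """Return (command, version, [(ip, metric), ...]) from a RIP message."""
    command, version, _reserved = RIP_HEADER.unpack_from(message, 0)
    routes = []
    for offset in range(RIP_HEADER.size, len(message), RIP_ENTRY.size):
        _family, _zero, raw_ip, metric = RIP_ENTRY.unpack_from(message, offset)
        routes.append((socket.inet_ntoa(raw_ip), metric))
    return command, version, routes


message = pack_rip_reply([("134.220.6.0", 2), ("0.0.0.0", 1)])
print(len(message), unpack_rip(message))
```

The "8x" in the entry format writes eight zero bytes when packing and skips them when unpacking, which corresponds to the 8 byte reserved field.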

Routers will receive RIP information and will use it to determine their shortest route to a particular host. RIP information is sent to neighbours or broadcast every 30 seconds. RIP information is processed by daemon processes (either routed or gated on Unix hosts) listening on the well known port number 520.

RIP suffers from very slow convergence in the face of topology changes because routers are under no obligation to identify failed links (and, more importantly, their consequences) and to propagate the facts to other routers.

RIP is an example of a distance vector protocol.
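The essence of a distance vector update can be sketched in a few lines of Python. This is a simplified illustration of the general idea, not the full procedure of RFC 1058: timers, split horizon and triggered updates are all omitted, and the function name merge_advertisement and the table representation are invented for the example.

```python
INFINITY = 16  # a metric of 16 means "unreachable"


def merge_advertisement(table, neighbour, advertised):
    """Update a routing table from one neighbour's advertisement.

    table:      dict mapping destination -> (metric, next hop)
    neighbour:  address of the router that sent the advertisement
    advertised: dict mapping destination -> metric as seen by the neighbour
    Returns True if the table changed (and so should be re-advertised).
    """
    changed = False
    for dest, metric in advertised.items():
        # Reaching dest via the neighbour costs one extra hop.
        new_metric = min(metric + 1, INFINITY)
        current = table.get(dest)
        if current is None:
            if new_metric < INFINITY:
                table[dest] = (new_metric, neighbour)
                changed = True
        elif current[1] == neighbour:
            # News from the router already in use always replaces the
            # old value, even when the route has become worse.
            if current[0] != new_metric:
                table[dest] = (new_metric, neighbour)
                changed = True
        elif new_metric < current[0]:
            # A strictly shorter route via a different neighbour.
            table[dest] = (new_metric, neighbour)
            changed = True
    return changed


table = {"134.220.4.0": (1, "direct")}
merge_advertisement(table, "134.220.4.203", {"134.220.6.0": 1})
print(table)
```

Because each router only learns what its neighbours choose to tell it, bad news such as a metric of 16 spreads one exchange at a time, which is one way to picture the slow convergence mentioned above.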

The OSPF (RFC 1247) Protocol

The O stands for open, i.e. the protocol is non-proprietary.

OSPF is a link state protocol (LSP). This means that each router maintains link status information and this is exchanged between routers wishing to build routing tables. Unlike RIP, OSPF uses IP directly, OSPF packets being identified by a special value (89) in the IP datagram protocol field.
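Once a router holds link state information for the whole area it can compute routes locally. The "shortest path first" in the protocol's name refers to Dijkstra's algorithm, and the Python sketch below shows that calculation on a made-up four-router topology. The router names, link costs and the function name shortest_paths are all assumptions for illustration; a real OSPF implementation works on a considerably richer link state database.

```python
import heapq
import itertools


def shortest_paths(links, source):
    """Dijkstra's algorithm over a link state map.

    links:  dict mapping router -> {neighbour: link cost}
    Returns a dict mapping each reachable router -> (total cost, first hop).
    """
    best = {}
    tie_breaker = itertools.count()  # keeps heap entries comparable
    queue = [(0, next(tie_breaker), source, None)]
    while queue:
        cost, _, router, first_hop = heapq.heappop(queue)
        if router in best:
            continue  # already settled via a cheaper path
        best[router] = (cost, first_hop)
        for neighbour, link_cost in links.get(router, {}).items():
            if neighbour not in best:
                # The first hop is fixed when leaving the source.
                hop = neighbour if router == source else first_hop
                heapq.heappush(
                    queue, (cost + link_cost, next(tie_breaker), neighbour, hop)
                )
    return best


# Hypothetical topology: the direct A-C link is expensive.
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1, "D": 2},
    "D": {"C": 2},
}
print(shortest_paths(links, "A"))
```

The first hop recorded for each destination is what would end up in the gateway column of the router's routing table.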

All OSPF messages begin with a common 24-byte header:

Field                Bytes  Typical values
Version              1      2
Packet Type          1      1 = Hello, 2 = Database Description, 3 = Link State Request, 4 = Link State Update, 5 = Link State Acknowledgment
Packet Length        2      Packet length in bytes
Router ID            4      IP address of sending host
Area ID              4      ID of area to which packet belongs
Checksum             2      As for IP datagram
Authentication Type  2      0 = No authentication, 1 = Simple password
Authentication Data  8      For type 1 only
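The field sizes listed above add up to 24 bytes, and the common header can be decoded with a single struct format. The following Python sketch is an illustration only: the function name parse_ospf_header and the dictionary keys are invented here, and no checksum or authentication checking is attempted.

```python
import socket
import struct

# version, packet type, length, router ID, area ID,
# checksum, authentication type, authentication data
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgment",
}


def parse_ospf_header(packet):
    """Decode the common OSPF header at the start of a packet."""
    if len(packet) < OSPF_HEADER.size:
        raise ValueError("packet shorter than the OSPF common header")
    (version, packet_type, length, router_id, area_id,
     checksum, auth_type, auth_data) = OSPF_HEADER.unpack_from(packet)
    return {
        "version": version,
        "packet_type": PACKET_TYPES.get(packet_type, "unknown"),
        "length": length,
        "router_id": socket.inet_ntoa(router_id),
        "area_id": socket.inet_ntoa(area_id),
        "checksum": checksum,
        "auth_type": auth_type,
        "auth_data": auth_data,
    }


# A hand-built header for a Hello packet, used only to exercise the parser.
sample = OSPF_HEADER.pack(
    2, 1, 44,
    socket.inet_aton("134.220.4.203"),
    socket.inet_aton("0.0.0.0"),
    0, 0, b"\x00" * 8,
)
print(parse_ospf_header(sample))
```

The router ID and area ID are printed in dotted form here simply because both are 4 byte values; the sample values are made up.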


These pages were produced to support a communication systems module that is no longer taught. Further communication systems notes are available on-line.


Author : Peter Burden