Route Control

 

Routing was originally by node number. It was intolerant of loops. Every data packet contained an eight-bit destination node number and the number of a task within the destination node. Normal data packets bore no indication of the originating node.

 

Link control software delivered error-free data packets to the routing software. The routing required indexing into a route table to find the destination link:

            DESTLINK := ROUTETABLE[ destination_node ];

If the destination link was uninitiated, the data packet was discarded. If the destination link was initiated, the data packet was queued for transmission over that link.

 

If the destination node was self, the data packet was delivered to a local task. Task numbers greater than eight represented local ports (usually referred to as low-order lines). Task numbers less than eight were used for sundry purposes including call establishment, call disestablishment, logging and debugging.
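
A minimal sketch in C of this original forwarding step, reconstructed from the description above; the names, the table representation and the treatment of task number eight are illustrative rather than taken from the actual Alpha or 3705 code:

/* Original node-number routing: one table lookup per data packet. */
typedef struct { unsigned char dest_node, dest_task; /* no source node */ } Packet;

extern int  self_node;
extern int  route_table[256];        /* destination node -> link, 0 = uninitiated */
extern void queue_on_link(int link, Packet *p);
extern void deliver_to_port(int task, Packet *p);      /* local port (low order line) */
extern void deliver_to_service(int task, Packet *p);   /* call setup, logging, debugging */
extern void discard(Packet *p);

void route_packet(Packet *p)
{
    if (p->dest_node == self_node) {
        if (p->dest_task > 8)
            deliver_to_port(p->dest_task, p);
        else
            deliver_to_service(p->dest_task, p);
        return;
    }
    int link = route_table[p->dest_node];    /* DESTLINK := ROUTETABLE[destination_node] */
    if (link == 0)
        discard(p);                          /* destination link uninitiated */
    else
        queue_on_link(link, p);              /* queue for transmission over that link */
}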


Route control had one additional duty. When data link control detected a data link failure, it notified route control. Route control had to propagate this failure indication to two kinds of listeners:

A] All local ports which had established a call via the failed link were notified.

B] Other network nodes were notified that a group of nodes were now unreachable.

 

Local port notification required an iteration through all active ports to find those with a destination node reached via the failed link. Remote nodes were notified via a “hard drop packet” which was addressed to node zero. It originally included the node number for a single unreachable destination. This was later changed to a vector of unreachable destinations.

 

The node which detected the original failure simply blasted out hard drop packets to all initiated links. Forwarding of a received hard drop was slightly more complicated. It obviously should not be returned to the link from which it was received. When the routing software was adjusted to be slightly tolerant of loops, some censorship of the received hard drop was required. For every node number mentioned in the hard drop packet, the route table entry was examined. If it pointed back to the link from which the hard drop was received, the node number was valid. A route table entry which pointed to another link suggested that two paths to the named node converged in the receiver of the hard drop. These node numbers in the hard drop packet had to be ignored (and deleted before forwarding the hard drop).
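
The censorship step can be sketched in C roughly as follows; the packet carries a vector of unreachable node numbers as in the text, while the helper names are assumptions:

/* Hard drop forwarding with censorship against the local route table. */
extern int  route_table[256];                   /* destination node -> link */
extern void broadcast_except(int skip_link, const unsigned char *nodes, int count);

void forward_hard_drop(int from_link, unsigned char *nodes, int count)
{
    int kept = 0;
    for (int i = 0; i < count; i++) {
        /* A node number is valid only if our route to it points back at the
         * link the hard drop arrived on; otherwise another path to that node
         * converges here and the entry is deleted before forwarding. */
        if (route_table[nodes[i]] == from_link)
            nodes[kept++] = nodes[i];
    }
    if (kept > 0)
        broadcast_except(from_link, nodes, kept);   /* never echoed to its source link */
}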

 

Assuming globally consistent route tables in the network, this hard drop scheme notified both ends of a virtual call when an intervening link failed. At the terminal end, a LINES DOWN message was printed on the local terminal. At the APL end, the T-task was bounced and a CONTINUE workspace was saved. Inconsistent route tables always led to split hard drop situations in which only one end of the call was informed of the termination. Assume a network of four nodes arranged in a square with vertices A, B, C and D. If the route from A to C is A-B-C, then the route tables between A and C are consistent if the return route is C-B-A.

 

Some network designers consider the ability to make minor changes in network topology with minimal disturbance to users to be desirable. (SNA/VTAM joke: It is a good thing that IBM did not invent the telephone; otherwise all callers would have to hang up whenever a new subscriber was added.)

 

Maintaining route tables became a problem when the network expanded beyond two nodes. For an end node, all other nodes could be assumed reachable via the first (and only) high order line.

 

Call establishment used two messages: OutBoard Allocation (OBA) and InBoard Allocation (IBA). OBA was sent from an Alpha towards an APL host. It named the requesting outboard line via node and line numbers; terminal type (2741 or ASCII) and terminal speed were also included. If the designated host had resources to accept one more call, an IBA was returned giving the assigned line number in the host (this was a pseudo-line with no physical manifestation). There was also a negative form of IBA which indicated that a call could not be established. A code which printed a “SYSTEM FULL” message on the terminal was generated when the host was accessible but unable to accept the call. If the host was unreachable, an intermediate node generated a negative response to print a “LINES DOWN” message on the terminal.
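
As a rough illustration only, the information carried by the two allocation messages might be represented as below; the actual field layouts and names are not given in the text and are assumptions here:

/* Hypothetical layouts for the call-establishment messages. */
typedef struct {
    unsigned char outboard_node;     /* requesting node number                     */
    unsigned char outboard_line;     /* requesting outboard line number            */
    unsigned char terminal_type;     /* 2741 or ASCII                              */
    unsigned char terminal_speed;
} OBA;                               /* OutBoard Allocation, sent towards the host */

typedef struct {
    unsigned char accepted;          /* zero marks the negative form               */
    unsigned char inboard_line;      /* pseudo-line assigned in the host           */
    unsigned char reason;            /* SYSTEM FULL versus LINES DOWN              */
} IBA;                               /* InBoard Allocation, returned to the Alpha  */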

 

Initially the only way to adjust route tables was the debug patch facility. This wasn’t supported for the 3705 and had obvious pitfalls.

 

An early improvement was for intermediate nodes to examine the outboard node number in passing OBA messages. The link from which the OBA was received was assumed to be the link by which the outboard node could be reached. This was used to update the local route table entry for the outboard node. This scheme worked quite well in a loopfree network.

 

When a second trans-Atlantic line was added between London and Toronto, a loop was created. The route tables in London and Toronto provided a static division of traffic between the two lines. This worked well until one line failed (the lines were spread over CANTAT2 and CANTAT1). With one line inoperative it was desirable to switch all of the traffic to the remaining line. Switching required patching route table entries for those APL hosts which could be reached from Europe. The resultant overload gave some interpacket pausing to users but no serious problems.

 

The problem came when the failed link was repaired. It was desirable to switch traffic back to the original configuration to eliminate the overload. It was eventually determined that patching was not a reliable way to restore service. Outboard nodes which rarely established new calls were the culprits. Given the sequence of events:

1] Call establishment from node 7 via link A (the associated OBA adjusted the 3705 route table).

2] Telephone company restores service on link B.

3] Node 4 (main London node) route table patched to send 3705 traffic via link B.

Patching created an inconsistent route table situation.

4] Establishing a call from node 4 to the 3705 repaired the 3705 route table entry for node 4 but not for node 7.

5] A link failure anywhere in the loop will now create a split hard drop situation for a call from node 7 to the 3705. (Calls between node 4 and the 3705 are exempt because the call establishment of step 4 adjusted the 3705 route table.)

 

The consequences of a split hard drop are unpleasant for the network operator.

 

Assume the terminal (outboard) end is dropped while the APL T-task gets stuck due to no response from the other end. This usually resulted in a request to the APL system operator to forcibly terminate the T-task. Termination of the T-task sent logoff confirmation and a soft drop request to the original outboard line. If a new call has been established from that line, these packets will interfere with the new call. Some people would consider this a security breach.

 

Assume the T-task (inboard) end is dropped while the terminal (outboard) remains unaware of the drop. Subsequent traffic from the outboard end towards the T-task (or other inboard service) will be addressed to the old inboard line. If this line has been reused for a new call, there will be a clash between the old and new calls. In a proper datagram network this is usually avoided by including the source port number as well as the destination port number in every datagram. This allows the destination port to provide some validation of received packets.

 

The 3705 had some protection against a split hard drop. An inboard line which was the target of a hard drop was quarantined for sixty seconds. Traffic to a quarantined line generated an error message. After the quarantine period expired, more brutal action was taken. In addition to the error message, the high order line from which the offending traffic was received was reset. This was guaranteed to clean up after the split hard drop. (NB: Maybe a proof should be included here. RDM)
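
A sketch of the quarantine rule; the sixty second figure is from the text, while the data structures, the clock interface and the point at which the quarantine record would be cleared are assumptions:

/* 3705 protection against a split hard drop. */
#define QUARANTINE_SECONDS 60

extern long now_seconds(void);
extern void log_error(int inboard_line);
extern void reset_high_order_line(int link);

static long quarantine_end[256];                /* per inboard line, 0 = not quarantined */

void on_hard_drop_of_line(int inboard_line)
{
    quarantine_end[inboard_line] = now_seconds() + QUARANTINE_SECONDS;
}

void on_traffic_to_dropped_line(int inboard_line, int from_link)
{
    if (quarantine_end[inboard_line] == 0)
        return;                                 /* line not under suspicion */
    log_error(inboard_line);                    /* always complain */
    if (now_seconds() > quarantine_end[inboard_line])
        reset_high_order_line(from_link);       /* brutal cleanup after the quarantine expires */
}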

 

As network operators became aware of the problems associated with route table patching and recovery, operating procedures in London were changed. The new method of switching traffic back to the restored link was to reload the node which had been patched.

 

The difficulties of managing a network topology which included loops motivated a change in network routing. The efficiency associated with a 16 bit packet address appeared desirable. Another consideration was that eight-bit node numbers would eventually constrain network expansion. Split hard drop experience led to some distrust of statistical routing schemes such as the one ARPANET used. The idea of an all virtual call network where all traffic ran on rails seemed preferable to a Brownian motion model.

Data packet forwarding in the new routing scheme was slightly more expensive than in the old route table scheme. Every node had a three column matrix called the virtual call table (VCT). Every virtual call had a zero (outboard) end and a one (inboard) end. A virtual call was designated by a virtual call number of either 15 or 16 bits (VCN15 or VCN16). VCN16 was formed by (2 x VCN15) + direction_bit. Within a node there were two possible packet sources: remote and local. The links to adjacent nodes were the remote sources; sundry local ports and tasks were the local sources. This traffic source was viewed as a direction to reach the zero or one end of a call. The Alpha and 3705 used a simple rule to encode remote and local sources in a small integer: remote sources ranged from 1 to 7 and local sources from 8 to 255 (or less). With this simple numbering scheme for source/destination, the VCT was constructed as follows:

VCT[ndx;0] is direction to zero end of the call

VCT[ndx;1] is direction to one end

VCT[ndx;2] is 15 bit VCN (VCN15)

 

To process a received data packet, the 16 bit VCN was extracted and split into VCN15 and a direction_bit. Then

ndx := VCT[;2] IOTA VCN15 {APL IOTA finds an ndx such that VCT[ndx;2] = VCN15 }

The packet source can now be validated with the expression:

            source = VCT[ndx ; NOT direction_bit]

If this is true then the packet has come from the expected source and is assumed valid. If this is false the packet is discarded with an error message.

 

The packet can then be delivered to VCT[ndx ; direction_bit ]. With the encoding scheme described above the destination can be either remote or local.
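
Putting the VCT lookup, source validation and delivery together, the forwarding step can be sketched in C as follows; the table size, entry layout and helper names are assumptions, while the logic follows the description above:

/* Virtual call forwarding: validate the source, then deliver to the other end. */
#define VCT_SIZE 32                         /* assumed per-node pool size        */

struct vct_entry {
    unsigned char  dir[2];                  /* [0] toward the zero end, [1] toward the one end */
    unsigned short vcn15;                   /* 15 bit virtual call number        */
    int            in_use;
};
static struct vct_entry vct[VCT_SIZE];

extern void send_toward(int direction, void *pkt);   /* 1-7 remote links, 8 and up local */
extern void discard_with_error(void *pkt);

void forward_data_packet(unsigned short vcn16, int source, void *pkt)
{
    unsigned short vcn15   = vcn16 >> 1;    /* VCN16 = 2 x VCN15 + direction_bit */
    int            dir_bit = vcn16 & 1;

    for (int ndx = 0; ndx < VCT_SIZE; ndx++) {        /* the APL IOTA lookup */
        if (vct[ndx].in_use && vct[ndx].vcn15 == vcn15) {
            if (source != vct[ndx].dir[!dir_bit])     /* must arrive from the far end */
                break;
            send_toward(vct[ndx].dir[dir_bit], pkt);
            return;
        }
    }
    discard_with_error(pkt);                /* unknown VCN or wrong source */
}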

 

The forwarding algorithm above was the foundation stone of the all virtual call routing protocol. Route control packets were required to create and destroy virtual call table entries. An important contribution to reliability was that a node began operation with an empty virtual call table. When all virtual calls passing through the node were terminated, the virtual call table was again empty. An empty array cannot contain obsolete information.

 

There were two call establishment mechanisms in the virtual call network. They differed in performance, complexity and resiliency.

 

IBNL routing assumed the outboard end possessed an explicit recipe for reaching the inboard destination. This recipe took the form of an integer vector which named every node in the route from the outboard to the inboard node. Assuming the recipe described a feasible and reasonable route this was the preferred method of call establishment.

 

IBR routing required no prior knowledge of network topology by any node. A simple distributed computation through most of the network found a route to the inboard destination. This route was usually the best feasible route.

 

IBNL processing was fairly simple. The IBNL included a VCN15 and a description of the requested resource in the inboard node. A node which received an IBNL made a few simple checks. If VCN15 already existed in VCT[ ;2] the IBNL was invalid. The receiving node had to be in the node list. Within the node list the predecessor and successor of the receiving node were of interest. The source of the IBNL had to be the link for which the predecessor was the adjacent node. If the receiving node was not the inboard node, the successor had to be adjacent to the receiving node. If the receiving node was the inboard node, the requested resource had to be available. If any of these conditions failed, a hard drop naming VCN16 was sent to the IBNL source so that VCT entries towards the outboard end could be deleted. If the above conditions were met, a new VCT entry was created with directions to both the zero and one ends. If the inboard end was remote, the IBNL was forwarded. If the destination was local, call routing was complete. The outboard end was unaware of this and so an interend (as opposed to a route control) packet was sent to the outboard end. Receipt of this interend packet informed the outboard end that data transmission could begin.
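
The acceptance checks can be sketched in C as follows; the IBNL layout and the helper routines are assumptions, while the rules themselves are those listed above:

/* IBNL validity checks at a receiving node. */
struct ibnl {
    unsigned short vcn15;
    unsigned short node_list[16];     /* route recipe, outboard first, inboard last */
    int            node_count;
    /* description of the requested inboard resource ... */
};

extern int self_node;
extern int vcn15_in_use(unsigned short vcn15);     /* already present in VCT[;2]?   */
extern int link_to_adjacent_node(int node);        /* 0 if the node is not adjacent */
extern int resource_available(const struct ibnl *r);

/* Returns nonzero if the IBNL may proceed; zero means reject it with a
 * hard drop naming VCN16 back towards the IBNL source. */
int ibnl_acceptable(const struct ibnl *r, int source_link)
{
    if (vcn15_in_use(r->vcn15))
        return 0;                                  /* VCN15 already exists               */
    int me = -1;
    for (int i = 0; i < r->node_count; i++)
        if (r->node_list[i] == self_node) { me = i; break; }
    if (me <= 0)
        return 0;                                  /* not in the list, or no predecessor */
    if (link_to_adjacent_node(r->node_list[me - 1]) != source_link)
        return 0;                                  /* predecessor must be the source link */
    if (me == r->node_count - 1)
        return resource_available(r);              /* we are the inboard node            */
    return link_to_adjacent_node(r->node_list[me + 1]) != 0;  /* successor must be adjacent */
}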

 

Hard drop packets were the only implemented method of removing VCT entries (the unimplemented reroute scheme removed or altered some VCT entries). A hard drop packet contained a list of 16 bit virtual call numbers. Each VCN16 was subjected to the same tests as a VCN16 for a data packet as described above. An unknown VCN was passed on to the IBR processing system where it acted like an IBNF. An otherwise invalid VCN was simply ignored.

 

Processing of a valid hard drop to a remote destination had three parts:

A] The VCT entry was deleted.

B] Any data packets addressed to VCN16 were purged from the transmission queue. This guaranteed that obsolete packets associated with VCN16 were not going to materialize to corrupt a future call using the same VCN.

C] The hard drop was forwarded towards the destination.

 

Processing of a valid hard drop to a local destination had two parts:

A] The VCT entry was deleted.

B] The local task was signalled.
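
Combining the two cases, hard drop processing for a valid VCN16 can be sketched as follows; the helper names are assumptions:

/* Hard drop handling once the VCN16 has passed the normal source checks. */
extern void delete_vct_entry(int ndx);
extern void purge_transmission_queue(unsigned short vcn16);   /* drop queued data for this call */
extern void forward_hard_drop_toward(int direction, unsigned short vcn16);
extern void signal_local_task(int task, unsigned short vcn16);

void process_valid_hard_drop(int ndx, unsigned short vcn16, int dest_direction)
{
    delete_vct_entry(ndx);                                    /* A] in both cases               */
    if (dest_direction <= 7) {                                /* remote destination (a link)    */
        purge_transmission_queue(vcn16);                      /* B] no obsolete packets survive */
        forward_hard_drop_toward(dest_direction, vcn16);      /* C] propagate towards that end  */
    } else {                                                  /* local destination              */
        signal_local_task(dest_direction, vcn16);             /* B] notify the local task       */
    }
}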

 

IBR routing was more complex but more resilient than IBNL route establishment. It also operated without any overall control such as the deus ex machina which provided node list recipes. The underlying principle of IBR processing was that a request for a service was broadcast throughout the network. If the inboard destination was reachable and the resource was available, a virtual call was established.

 

The outboard node originated an IBR. The packet contained a virtual call number drawn from the outboard node’s pool, outboard node and line numbers, inboard node (or generic address), requested resource and a weight (IBRW). There were two possible forms for the IBR destination: node number or four character generic address. Typical generic addresses were “IPS “, “XER”, “MFT “, “ZEXT”.

 

The outboard node broadcast the IBR on all high order lines. (An IBR to self was trapped and handled locally.) The initial value of IBRW was zero. The outboard line then awaited an IBOK packet or IBNF replies to all the IBR packets which it had sent.

 

An intermediate node which received an IBR tended to forward an altered copy to all high order lines except the IBR source line. The alteration increased the value of IBRW by approximately sixteen. Plans to have this increment vary depending on high order excess capacity were never implemented and so the increment was exactly sixteen.

 

An end node which was not capable of providing the requested service (usually because the address was wrong) returned an IBNF to the IBR source line.

 

A destination node which was capable of providing the requested resource waited 500 milsec and then replied with an IBOK. During this delay, the inboard node behaved rather like an intermediate node as described below.

 

An outboard, intermediate or destination node created an IBR setup record. It recorded the IBR source line, VCN, IBRW, timestamp and a short Boolean vector. The Boolean vector (IBRSNT) indicated high order lines to which IBRs had been forwarded without any reply. Receipt of an IBNF cleared the corresponding bit in IBRSNT. When all IBRSNT bits became zero (or a 60 second timer expired), an IBNF was passed to the IBR source line. Receipt (or initial emission) of an IBOK indicated that a virtual call could be established. The setup table entry was replaced by a normal VCT entry with VCT[x;0] as IBR source and VCT[x;1] as IBOK source.

 

Receipt of an IBR referring to a virtual call number which was already in setup indicated that there were two paths to the outboard node. The path with smaller IBRW value was assumed to be the preferred route. If the second IBR was preferred, setup record fields (IBR source line and IBRW) were altered and the first route was rejected with an IBNF. If the first IBR was preferred, the second IBR was rejected. Possible late receipt of a preferred IBR was the reason for the 500 milsec delay at the inboard node.
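
The setup record bookkeeping can be sketched as follows; the record layout and helper names are assumptions, and the preference rule (keep the path with the smaller IBRW) follows the text:

/* IBR setup record and two of the events that act on it. */
struct ibr_setup {
    unsigned short vcn;          /* virtual call number from the outboard pool         */
    int            src_line;     /* line the currently preferred IBR arrived on        */
    unsigned short ibrw;         /* weight of the preferred path, lower is better      */
    unsigned char  ibrsnt;       /* one bit per high order line with an unanswered IBR */
    long           started;      /* timestamp for the 60 second give-up timer          */
};

extern void send_ibnf(int line, unsigned short vcn);

/* An IBNF came back from one of the lines the IBR was forwarded to. */
void on_ibnf(struct ibr_setup *s, int from_line)
{
    s->ibrsnt &= ~(1u << from_line);         /* that path has given up           */
    if (s->ibrsnt == 0)
        send_ibnf(s->src_line, s->vcn);      /* every forwarded IBR was refused  */
}

/* A second IBR arrived for a call already in setup: two paths to the outboard node. */
void on_duplicate_ibr(struct ibr_setup *s, int new_src_line, unsigned short new_ibrw)
{
    if (new_ibrw < s->ibrw) {
        send_ibnf(s->src_line, s->vcn);      /* reject the first, adopt the shorter path */
        s->src_line = new_src_line;
        s->ibrw     = new_ibrw;
    } else {
        send_ibnf(new_src_line, s->vcn);     /* first path stands, reject the newcomer   */
    }
}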

 

The IBR process provided a persistent search for a feasible route. This was a mixed blessing as the next two anecdotes indicate:

A] One Sunday afternoon David Chivers and I were using 30cps terminals in our Toronto office. Our terminals were hard-wired to different Alphas about 100 metres away in the computer room. I noticed that the response time seemed a little sluggish and David had the same complaint. I traced the calls from our terminals to the host APL system. My call was routed via our two trans-Atlantic lines from an Alpha to a 3705 which was about ten metres from that Alpha. David’s call had a similar routing but included a detour through Rochester, New York. Before we logged on, the Toronto communications staff had started reworking the null modems which interconnected the various nodes in the Toronto datacentre. Thus the only feasible routes involved rather long distances.

 

B] Massey Ferguson had their own APL host in Toronto, a leased line to their Stoneleigh, UK facility and connections to the main IPSANET in both England and Toronto. There were so many intervening nodes that non-Massey traffic did not normally use the Massey Ferguson private line. One day both of the I P Sharp trans-Atlantic lines failed. IBR routing tried to put all of the I P Sharp load on the Massey link which led to major delays in packet forwarding. The only network management tool to alleviate the situation was to pull the plug on one of the interconnections between Massey and IPSA.

 

Generic addressing did provide some automatic load balancing between 3705s. This was particularly convenient when the Toronto datacentre was moved across King St in December 1980. The network configuration changed daily. There was no need to inform all the nodes of changes in Toronto.

 

The security in this scheme was mixed. The check for proper packet source in the basic routing algorithm made it impossible for an intruder to corrupt an established call. Generic addressing would have made it very easy for an intruder to spoof an address. (This assumes a high order connection to the network, suitable hardware and sufficient knowledge to patch a configuration parameter in the software.) If a general interconnect service had been offered, some censorship of IBRs at the gateway nodes would have been required.

 

Hindsight showed that the IBNF packet wasn’t quite right. A hard drop had the same effect on the IBR setup table entry as did the IBNF. An attempt to hard drop an established call via an HD from the “wrong” source was ignored. A hard drop packet could mention multiple calls and thus had less overhead than an IBNF. Some minor simplicity would have resulted from eliminating the IBNF packet.

 

In the mid-80s IBRs were deemed unsatisfactory for routing calls to high-volume inboard destinations. There was a desire to balance the load on the three IPSA trans-Atlantic lines and some other loops. An N-task on the main APL system examined network statistics and calculated new IBNL recipes. These were then distributed via the logtask. Users of a single Alpha sent the bulk of their calls to one or two hosts. Historical network data was used to determine the popular destinations accessed from a specific Alpha. An Alpha could store many recipes but this was undesirable as each stored recipe reduced the size of the buffer pool by one.

 

At least two extensions to routing were considered but never implemented (to my knowledge).

 

I had hoped to implement a scheme for changing the route of an established call. Two similar reroute mechanisms with different objectives were proposed:

1] Discretionary reroute would have allowed network operations staff to gracefully remove a link from service. This would have allowed maintenance of modems and intermediate nodes.

2] Emergency reroute, where one link in a loop was lost, would have allowed graceful recovery. The network normally terminated these calls with a hard drop. There was a protocol such that the nodes on opposite ends of the failed link could clean up their transmission systems and then forward data packets around the loop to their proper destination.

 

Although all of the Alpha and 3705 code was written and partially tested, this reroute scheme was never implemented. The crux of the problem was that Alpha storage was restrictive and some node types were barely able to function with 16K memory.

 

Steven Crouch proposed a scheme to greatly increase the theoretical capacity of the network. The virtual call scheme had a global restriction of about 2**15 calls. Assuming perfect allocation of numbers from a central pool such that every node has exactly the number of VCNs which it needs at the present time, this limit was achievable. In practice VCN pools were allocated in a static fashion with 32 entries per Alpha and perhaps less for a 3705. This led to some inefficiencies in allocation.

 

The proposal was to renumber virtual calls at every intermediate node. This would have allowed 2**15 calls per link and 2**15 calls per outboard node. For IBR setup purposes, a 32 bit virtual call number was required. This was formed by catenation of a 16 bit outboard node number and a 16 bit outboard call number local to that node. I don’t think this was ever implemented.
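
As a small illustration only (the proposal was apparently never implemented), the setup identifier would have been formed roughly like this:

/* 32 bit identifier used during IBR setup in the proposed renumbering scheme:
 * a 16 bit outboard node number catenated with a 16 bit call number local to
 * that node, giving a value unique across the network while the call is set up. */
unsigned long setup_vcn32(unsigned short outboard_node, unsigned short outboard_call)
{
    return ((unsigned long)outboard_node << 16) | outboard_call;
}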