Packet Switching at Philips Research Laboratories

The packet switching network at PRL is described. The purpose of the project is explained and the adoption of packet switching techniques justified. The structure of packets, their types and their uses are covered with the help of diagrams, as are the attempts made within the network to detect and recover from fault conditions. Some results that have been achieved and further developments envisaged are stated.

Keywords: Computer networks, data communication, packet switching, local network. Philips, CORAL 66, co-operating processes.

D.J. Burnett received his B.A. degree in Natural Sciences from Cambridge University in 1962. Since then he has been at Philips Research Laboratories in Redhill, England, working initially on linear accelerators and, subsequently, in the field of computer programming. He has worked on the topics of optical character recognition, computer graphics, communications networks and robot control.

H.R. Sethi received his B.A. degree in Mathematics in 1960 from Panjab University, India and B. Eng. degree in electronic engineering in 1966 from the University of Sheffield, England. From 1966 to 1969 at the University of Sheffield he worked as a research assistant carrying out post graduate research into electron flow systems, receiving the degree of Ph.D. in 1972. From 1969 he has been at Philips Research Laboratories, Redhill, England working on various aspects of computing-computer graphics, device independent graphics, utility packages and communications networks. He is a member of the Institution of Electrical Engineers and the British Computer Society.

1. Purpose of the project

One of the major innovations in scientific and technical work has been the use of computers and, especially in more recent years, the use of minicomputers such as the Honeywell 516 and the Philips P800 series dedicated to an individual project. As part of this development at Philips Research Laboratories a need is now arising for these minis to exchange data with each other and with the laboratories' main computer, an ICL 1904S. This need appears in several forms, for example:

a) Sending source or data to the 1904S or other minis for processing by a set of programs.

b) Using the 1904S filestore as a secure archive for data and programs.

c) Interactive computer graphics with some processing done centrally and some done locally on the mini to which the displays are attached.

d) Connect exotic peripherals (e.g. dot matrix printer/plotter) to the network.

e) Teletype access to the 1904S operating system with regard to the above activities.

In general data communication consists of file transfers and of interactive work mingled in varying degrees. With the increasing use of minis and now of microprocessors, data communication between the minis and micros is likely to become of increasing importance.

Fig. 1. shows the small network now in existence at PRL. It consists of intelligent and non-intelligent end nodes, all connected to a switch node in a star-shaped network. An end node in this context is conceived of as a physical host computer with its disc filestore (if any) which has a data communication link with the switch node. To permit a software multiplex use of the link each packet of data sent or received must also have logical channel numbers attached to it, one for each end (sender and receiver). Each end chooses its own channel number when the connection between the two ends is first set up, and each such pair of channel numbers forms an established 'connection'. One end node is rather different from the remainder; the switch node computer also contains a virtual end node within it whose function is to make available the services of the printer/plotter to any other end node. The network provides a service for three projects; visual scene analysis (VSA) for use in the automation of production, computer graphics and the materials group who work in the field of solid state physics. It was to the problem of providing this service, and to looking at the whole subject in general, that we applied ourselves.

2. Why packet switching?

One of the early decisions we made was to adopt a packet switching philosophy. This was not simply the pursuit of a current fashion, but follows inescapably from the problem itself.

The conditions it was necessary to fulfill may be itemised as follows:

a) a connection should be possible between any pair of end nodes.

b) the nodes have (or are allowed) only a limited capacity so there is a limit to the number of simultaneous connections that can be maintained. Once a connection has been broken off the resources it used should be available for reallocation.

c) it happens that an end node may require more than one connection at the same time to the same destination, to support different types of traffic.

d) for reasons of cost the most popular destination (the 1904S) could support only a single hardware entry port.

e) to make the network resilient to error on the part of an end node it was decided:

i) to keep a record of all current connections in switch node in order to be able unambiguously to allocate every packet to such a connection (or to reject it).

ii) When an end node became inoperable to close down all current connections related to it in a defined manner and to release the resources no longer in use.

This could be achieved by the remaining end nodes timing out and closing down their own individual connections to the inoperable node, but it was decided to adopt the virtual call approach since it allows the switch node to detect more effectively any errors and failures on the part of end nodes. This places less of a burden on the end node programs and allows them to be developed and corrected with a greater degree of mutual isolation. Packet switching seems to be the only technique capable of meeting these conditions.

Packet switching is, of course, a widely practised technique [1], and the question arose as to whether it would have been of advantage to have adopted an existing protocol. The decision not to was taken on the grounds that none of the existing systems was particularly well tailored to our needs, so if one cheerfully plundered them of their attractive features but retained freedom of design, the project could be realised more economically. Some systems such as ARPA network [3] were too sophisticated, being intended to solve an altogether more difficult problem, while others such as the NPL [2] system at Teddington met slightly different needs and used specialised hardware which was not available.

3. Packet structure

Fig. 2 shows the structure of a packet. The packet consists of a header carrying system information, to which is appended the user defined data (if any). The header carries information as to the packet type and to its size. There are several types, to be described shortly, and the maximum size is 252 bytes. Of these the first twelve form the header and the remainder are the user defined data. Both the destination and source are fully specified, not only as to physical computer (or, in network terms, end node) but also as to logical channel number. Such a channel pair, one for each end node or computer, constitutes a connection. The parameter bytes carry system information, and their exact significance depends on the packet type. Checksum applies to both header and user data. Its purpose is to detect any errors in the packet so that any software or hardware fault in one part of the network does not cause further errors in any other part.

The checksum is in a fixed position in the packet header because checksum calculations are performed by software and it is a (mild) convenience to have it easily found, particularly for the smaller packet types. Finally, the user data is a variable length byte stream of text or binary information whose safe delivery is the justification for the whole exercise. It will be noted that all packets are byte-oriented. This is principally for hardware reasons, and in particular to facilitate transfer of information between computers of 16-bit and 24-bit word length.

4. Packet types and protocol rules

C	Connect
CA	Ack.of C
T	Terminate

Connection and
Termination

D	Data
A	Ack.of D
I	Interrupt

Data Exchange

SR	Status Request
SA	Ack of SR
R	Repeat

Error Detection
and Recovery

E	Error Report
RUT	Are you there?
IN	Initialize

Miscellaneous

Fig. 3. Packet types.

Fig. 3 lists the various packet types. It is the correct use of these that constitute the core of the whole communications system. They are grouped together according to their functions, and each group will be discussed separately insofar as that is possible. The discussion assumes that communication is to take place between two end nodes called Alpha and Beta. In the case of a non-intelligent node physically interfaced to the switch node, the software to control it shares the same storage and central processor as the switching software though it is logically distinct from it.

5. Establishing and terminating a connection

The packet types of interest are C (which stands for 'connect'), CA (for 'connect acknowledge') and T (for 'terminate'). If Alpha wants a connection he selects a free channel number for himself and sends it in a C packet to Beta. The parameter bytes (see fig. 2) are used to indicate the purpose of the requested connection, e.g., a file transfer. Beta notes Alpha's channel number, selects his own channel number and sends it in a CA packet back to Alpha (fig. 4). This exchange tells both ends that the other end is ready and informs each end of the two channel numbers to be used. It also allows the switch node to register the creation of a new virtual call. At this point the exchange of data commences. When either end wishes to terminate the connection it sends a T packet to notify the other end and also, in passing, the switch node. Each end then releases its own channel number.

6. Data exchange

The packets involved in data exchange are those called D, A and I. The process of data exchange involves the acknowledgement of packets received, both to protect against loss of data and to prevent a 'fast' node from flooding a 'slow' node with data at an excessive rate. Each data (D) packet carries two sequence numbers in its parameter bytes, one to identify itself and the other to acknowledge those D's that have already been received in the opposite direction. When a D has been sent, it is put in a retransmit queue until it has been acknowledged, at which point it is discarded and the buffer returned to the free pool. A and I packets are very similar to D packets but serve a more restricted function. The A packet is used for acknowledgement when no D packet is available to do the job, while the I or interrupt packet generates a break in, a concept which is closely connected to the 1904S Operating System (GEORGE 3) [4].

The interrupt packet is of value when the purpose of the connection is to make use of the 1904S multi-terminal time sharing system. During such sessions it is sometimes desired to break in on, or interrupt, some activity initiated by the terminal user by means of pressing a 'break in key'. This action must be notified to the 1904S and, since it is not a standard data transfer, it is conveniently done by means of a special packet, the I packet.

The process of data exchange is illustrated in fig. 5. When D packets are exchanged more or less packet-for-packet no problem arises, but when the flow is very one-sided then A packets must be used (in this case by Beta). The general rule is that either end (in this case Alpha) may send up to three D packets before waiting, but the other end, Beta, must (eventually) send an acknowledgement, either D or A, after receiving two D packets.

This arrangement is intended to achieve three purposes. Firstly, by delaying the sending of an A packet. Beta can control the rate at which further D packets are issued by Alpha. Secondly, the limited number of unacknowledged D packets that can be sent, combined with the acknowledge sequence number in the A packets when they are received, limits the maximum lengths of retransmit queues. Thirdly, at maximum flow rate the fact that Beta will acknowledge (after two D packets) before Alpha has ceased sending (after three D packets) limits the effect of network end-to-end delays on the overall transmission rate.

7. Error detection and recovery

An error arises when a packet is corrupted and rejected, perhaps because its checksum is no longer valid. This can lead to one of two situations.

a) the next packet is received and its sequence number is no longer in sequence.

b) both ends wait for each other until one end times out.

Three packet types have been invented to cope with these situations: R (for 'repeat'), and a pair SR and SA (for 'status request' and 'status acknowledge'). Their use is illustrated in fig. 6. In the first case the use of R by Beta quite simply causes Alpha to restart transmission starting with the packet that was lost. In the second case Beta has timed out and sent an SR to Alpha. When Beta receives the SA it can deduce what has happened and sends an R to resume normal data exchange. A number of such error situations can be imagined, and these mechanisms can resolve most of them, but there always exists a number of possibilities that are too subtle or too complex to be resolved except at further complexity and expense, which we did not regard as justified.

8. Miscellaneous

There are three further types of packets. E packets are used to notify errors, while RUT (are you there?) packets are sent out by the switch node at regular intervals and returned by end nodes, thus allowing the switch node to keep note of which end nodes are working. This enables the switch node to warn an end node that it is attempting to connect to a non-working node or is connected to a node that has become non-working. IN (initialise) packets are used by end node programs to announce their activation or re-activation to the switch node. This can be important when an end node program has crashed and been restarted so quickly that its failure has not been detected by the switch node.

9. Programming of computers

All the computers in the network have been programmed in the real time language called CORAL 66 [5]. The programs have been written in the form of co-operating parallel processes, and since no suitable real time operating system exists in any of the computers one has had to be created for this work. This comprises semaphores, a scheduler, interrupt handler, free store handler and a library of useful procedures, and is of a very simple construction. The general design of such systems is well described in chapter 13 of ref.[1].

All processes have the same priority (except interrupt response coding). Processes may be active (waiting for the scheduler to allocate CPU time), running or waiting (at a semaphore). A process at the end of the active queue advances steadily to the head, is set running, and runs until it has to wait because a semaphore is set to 'wait'. It is queued at this semaphore until released by the action of some other process, when it goes to the end of the active queue again. The normal idling state of the system is for all processes to be waiting at semaphores with the CPU looping in the scheduler, and it is activated by the operation of some peripheral (teletype or inter-computer link for example) causing interrupt response code to release one of the processes.

To make code shareable between two or more processes requires some care because of the single stack nature of CORAL 66. All data of a permanent nature, i.e. whose value must be preserved through the interval between suspension at a semaphore and eventual resumption, must be kept in blocks specific to that process since the code may be reused by another process in the meanwhile and the values of 'local' variables modified. This data must be preserved either in the process control block (i.e. the data block that is actually queued at semaphores) or in separate rows of a two dimensional array. The policy followed has been to keep the main body of data in an array but to keep 'system' data such as the return address on procedure entry in the process control block. The need to be able to save data in safe places means that no process should be involuntarily suspended to give place to one of higher priority. Though strictly speaking the requirement is merely that processes sharing common code must have the same priority, the widespread use of useful procedures make the wider restriction the normal state of affairs. In practice so long as the central processor is not overloaded it is not clear what priority structure of processes would help to speed things up, so the use of a single priority level probably imposes little penalty. The only requirement is that no process should monopolise the central processor for too long. In this sort of application the processes are so interdependent for their data (the packets of the system) that this problem does not arise, provided only that input and output to physical devices are done on an interrupt basis.

9.1. Programming of end nodes

Apart from processes concerned with input and output to physical devices the end node program consists of two blocks of shareable program code (i.e. software processes), one concerned with processing interactive or conversational connections while the other is concerned with file transfers (fig. 7).

The file transfer process accesses the disc based filestore of its host computer directly, packages it into packets and transmits it. It is in this sort of transfer that speed, that is the number of D packets per second transferred, is of interest. Because of the autonomous nature of these connections it is important that, in the case of errors, the transfer is concluded in an orderly manner and the end user informed what error has been detected. Because the network is not homogenous either in computer type or word-length, when data not in source format such as ASCII is sent (e.g. a binary file of 16 bit words for archiving) it is split into four bit units, converted to the appropriate hexadecimal character 0-9, A-F and send as text (8 bit characters). The end of a file transfer is signaled by the node sending the file sending a T (terminate) packet. The premature sending of such a packet is the only error not detectable by the recipient. During a file transfer one third of all the packets carried by the system are A packets acknowledging the main flow of D packets in the opposite direction.

File transfers from one mini to another are much less common but are performed from time to time. For this a slightly different process has been designed to allow either mini to request to send or receive a file or to act as the recipient of such a request.

The main differences are that:

a) the mini may be the non-initiating partner in the file transfer, and

b) the man--machine interface is slightly different since, for instance, file names take a different format.

The specification of file and directory names originates from a brief dialogue between an end node program and a human user, and is passed to the other end node in the first D packet of the file transfer.

The decision between sending a file as text or as hexadecimal depends on circumstances. All transfers between identical file stores (mini to mini) are sent and received as hexadecimal, so the file is simply replicated. The 1904S treats all files as text strings, regardless of whether the strings are 'readable' or are of hexadecimal characters. When files are transferred to the 1904S by a mini, or vice versa, the program in the mini must decide whether or not local hexadecimal conversion is required. It does this on the basis of the aforementioned dialogue with the user. An error at this point by the user does not crash the network but may result in an error message or in the local storage of the file in an unexpected and truncated format.

It is possible in unusual cases for the 'file store' of the mini to consist in reality of a paper tape punch and reader. Interactive connections use a different process but conform to the same protocol rules. In the minis, data is received from a teletype (or VDU) input process and sent down the link, while data received over the link is routed to the teletype output process. In the 1904S the comparable source/receptor is a software port into the operating system called a command issuer, a separate one being assigned for each end user when a connection is established, and released when it is terminated. Some of the information from this command issuer, for example whether more input is required, is indicated by the setting of bits in the reply word to the supervisor call, and this must be passed on to the mini by the use of bits in the header parameter bytes of the next D packet. The scheduling of processes in the 1904S works to a different design since it is clearly unreasonable to have the CPU looping in a scheduler which is only one program in a time sharing system. In this case the scheduler, on failing to find any process ready to run, suspends the whole program awaiting an interrupt, thereby allowing the computer's main scheduler to resume work on another job for the time being. The close interaction with the 1904A Operating System GEORGE 3 needed to allow devices to be run on an interrupt basis requires a detailed knowledge and careful use of facilities available in the 1904S but not normally known to or used by the average programmer.

9.2. Programming of the switch node

The functions of the switch node, apart from simply passing packets on, are to check their format, do any code conversion needed, monitor the availability of the end nodes, and collect statistics. The purpose of checking packet formats is to prevent hardware or software faults in one end node inducing program failure in another, and for this purpose a comprehensive set of tests are made (refer to figures 2 and 3).

i) Is l ≤ type ≤ 12?

ii) Is length ≤ 240?

iii) Is checksum correct?

iv) Does source node quoted correspond to the physical link on which the packet was received?

v) Does destination node quoted exist?

vi) For C packet: Does destination node have enough capacity for another connection? (The number of connections that can be routed to any end node is set by prior agreement, primarily to minimise the work load of the 1904S).

vii) For CA packet: Enter new connection (channel pair for this node pair).

viii) For T packet: delete connection.

ix) For all packets except C, CA, T: Does such a connection already exist?

If any of the above conditions is not satisfied, the packet is discarded. The rigour of the above tests ensured that the end nodes need test little more than the checksum of all packets received before acting on them.

After the received packet has been vetted for errors, it is then attached to the output queue for its destination node unless it is going to, or has come from, the 1904S. Since characters are stored in the 1904S in an internal 6-bit form and not ASCII, the user data of packets must be converted. To minimise the workload on the 1904S this task has been transferred to the switch node, so such packets are queued to a code conversion process which converts ASCII to ICL 6-bit internal code or vice versa before passing them on for output (see fig. 8).

The switching node also contains a clock driven process which generates RUT packets, and also T packets in those cases where one end of a connection has become inoperable (i.e. has failed to return an RUT within the time out period) leaving the other end 'hanging'.

This node is also used to gather statistics. It compiles a histogram of packet types, a note of what type of connection is made (i.e. interactive, sending a file or receiving a file, detected by examining the parameter bytes of passing C packets) and which node sought the connection, and a count of any hardware faults on each individual link.

The software for the printer/plotter, which is also based in this node, consists of further processes which perform the functions of an end node with respect to the protocol rules and which control the physical output of data. These processes share the same scheduler and library routines as the switch node software and for 'link input and output' have direct access to each others input and output queues, but otherwise they are independent of the 'logical switch node'.

10. Conclusions

The system described has already been in use as a working tool since 1975. The maximum speed of data transfers between mini's (i.e. during a file transfer) is about 25 packets per second, though this is almost doubled if disc accesses are 'short circuited" out. The upper limit in throughput of the network alone is-believed to lie in the acknowledgement mechanism. For transfers between mini and 1904S the top speed (when the 1904S has no other jobs being processed) is about 35 packets a second. The removal of disc accesses does not significantly alter this speed, so the limiting factor in this case is probably a combination of the code conversion (in the switch node) and the time needed by the 1904S to respond to interrupts and supervisor calls.

The communication system is expandable and it is envisaged to attach new minis and perhaps a microprocessor system to the network when they become available.

A particular feature of the system from its first conception has been the attempt to make the system robust, i.e. difficult to break and reliable by a careful checking for errors and an attempt to recover from error situations. Early programming errors have tested both these features to a considerable degree.

We feel that communications facilities such as these now form a valuable part of the working environment of those projects which use local computing hardware to achieve their objectives. The time taken to realise this project was of the order of four man years, though clearly had the authors had previous experience in the field this might have been reduced. Were the task to be undertaken again, greater use would now be made of existing standards, for example CCITT 'X' recommendations (e.g. X25).

References

[1] D.W. Davies and D.L.A. Barber, 'Communication Networks for Computers', (J. Wiley and Sons, 1973).

[2] R.A. Scantlebury and P.T. Wilkinson, 'The National Physical Laboratory Data Communications Network'. Proceedings of the Second International Conference on Computer Communications, Stockholm, (August, 1974) T3245 pp 223-28.

[3] F.E. Heart, R.E. Kahn, S.M. Ornstein, W.R. Crowther and D.C. Walden, "The Interface Message Processor for the ARPA Computer Network'. Proc. AFIPS SJCC 36 pp 551-567, June 1970.

[4] ICL Operating System GEORGE 3 and 4 1900 series, International Computers Ltd, London (1971).

[5] Official Definitions of CORAL 66, Her Majesty's Stationary Office, London.