_P_a_c_k_e_t _D_r_i_v_e_r _P_r_o_t_o_c_o_l G. L. Chesson Bell Laboratories _A_b_s_t_r_a_c_t These notes describe the packet driver link proto- col that was supplied with the Seventh Edition of UNIX* and is used by the UUCP program. _G_e_n_e_r_a_l Information flow between a pair of machines may be regulated by first representing the data as sequence- numbered _p_a_c_k_e_t_s of data and then establishing conventions that govern the use of sequence numbers. The _P_K, or _p_a_c_k_e_t _d_r_i_v_e_r, protocol is a particular instance of this type of flow-control discipline. The technique depends on the notion of a transmission _w_i_n_d_o_w to determine upper and lower bounds for valid sequence numbers. The transmitter is allowed to retransmit packets having sequence numbers within the window until the receiver indicates that packets have been correctly received. Positive acknowledgement from the receiver moves the window; negative acknowledgement or no acknowledgement causes retransmission. The receiver must ignore duplicate transmission, detect the various errors that may occur, and inform the transmitter when packets are correctly or incorrectly received. The following paragraphs describe the packet formats, message exchanges, and framing used by the protocol as coded in the UUCP program and the UNIX kernel. Although no attempt will be made here to present internal details of the algorithms that were used, the checksum routine is supplied for the benefit of other implementors. _P_a_c_k_e_t _F_o_r_m_a_t_s The protocol is defined in terms of message transmis- sions of 8-bit bytes. Each message includes one _c_o_n_t_r_o_l byte plus a _d_a_t_a _s_e_g_m_e_n_t of zero or more information bytes. The allowed data segment sizes range between 32 and 4096 as determined by the formula 32(28k9) where k is a 3-bit number. The packet sequence numbers are likewise constrained to 3- bits; i.e. counting proceeds modulo-8. The control byte is partitioned into three fields as depicted below. __________________________ *UNIX is a Trademark of AT&T Bell Laboratories. October 5, 1988 - 2 - bit 7 6 5 4 3 2 1 0 t t x x x y y y The _t bits indicate a packet type and determine the interpretation to be placed on the _x_x_x and _y_y_y fields. The various interpretations are as follows: _t_t _i_n_t_e_r_p_r_e_t_a_t_i_o_n 00 control packet 10 data packet 11 `short' data packet 01 alternate channel A data segment accompanies all non-control packets. Each transmitter is constrained to observe the maximum data seg- ment size established during initial synchronization by the receiver that it sends to. Type 10 packets have maximal size data segments. Type 11, or `short', packets have zero or more data bytes but less than the maximum. The first one or two bytes of the data segment of a short packet are `count' bytes that indicate the difference between the max- imum size and the number of bytes in the short segment. If the difference is less than 127, one count byte is used. If the difference exceeds 127, then the low-order seven bits of the difference are put in the first data byte and the high- order bit is set as an indicator that the remaining bits of the difference are in the second byte. Type 01 packets are never used by UUCP and need not be discussed in detail here. The sequence number of a non-control packet is given by the _x_x_x field. Control packets are not sequenced. The newest sequence number, excluding duplicate transmissions, accepted by a receiver is placed in the _y_y_y field of non- control packets sent to the `other' receiver. There are no data bytes associated with a control packet, the _x_x_x field is interpreted as a control message, and the _y_y_y field is a value accompanying the control mes- sage. The control messages are listed below in decreasing priority. That is, if several control messages are to be sent, the lower-numbered ones are sent first. _x_x_x _n_a_m_e _y_y_y 1 CLOSE n/a 2 RJ last correctly received sequence number 3 SRJ sequence number to retransmit 4 RR last correctly received sequence number 5 INITC window size 6 INITB data segment size 7 INITA window size October 5, 1988 - 3 - The CLOSE message indicates that the communications channel is to be shut down. The RJ, or _r_e_j_e_c_t, message indicates that the receiver has detected an error and the sender should retransmit after using the _y_y_y field to update the window. This mode of retransmission is usually referred to as a `go-back-N' procedure. The SRJ, or _s_e_l_e_c_t_i_v_e _r_e_j_e_c_t, message carries with it the sequence number of a particular packet to be retransmitted. The RR, or _r_e_c_e_i_v_e_r _r_e_a_d_y, message indicates that the receiver has detected no errors; the _y_y_y field updates the sender's window. The INITA/B/C messages are used to set window and data segment sizes. Segment sizes are calculated by the formula 32(28yyy9) as mentioned above, and window sizes may range between 1 and 7. Measurements of the protocol running on communication links at rates up to 9600 baud showed that a window size of 2 is optimal given a packet size greater than 32 bytes. This means that the link bandwidth can be fully utilized by the software. For this reason the SRJ message is not as important as it might otherwise be. Therefore the UNIX implementations no longer generate or respond to SRJ mes- sages. It is mentioned here for historical accuracy only, and one may assume that SRJ is no longer part of the proto- col. _M_e_s_s_a_g_e _E_x_c_h_a_n_g_e_s _I_n_i_t_i_a_l_i_z_a_t_i_o_n Messages are exchanged between four cooperating enti- ties: two senders and two receivers. This means that the communication channel is thought of as two independent half-duplex data paths. For example the window and segment sizes need not be the same in each direction. Initial synchronization is accomplished with two 3-way handshakes: two each of INITA/INITB/INITC. Each sender transmits INITA messages repeatedly. When an INITA message is received, INITB is sent in return. When an INITB message is received _a_n_d an INITB message has been sent, an INITC message is sent. The INITA and INITB messages carry with them the packet and window size that each receiver wants to use, and the senders are supposed to comply. When a receiver has seen all three INIT messages, the channel is considered to be open. It is possible to design a protocol that starts up using fewer messages than the interlocked handshakes described above. The advantage of the more complicated design lies in its use as a research vehicle: the initial handshake sequence is completely symmetric, a handshake can be initiated by one side of the link while the connection is in use, and the software to do this can utilize code that October 5, 1988 - 4 - would ordinarily be used only once at connection setup time. These properties were used in experiments with dynamically adjusted parameters. That is attempts were made to adapt the window and segment sizes to changes observed in traffic while a link was in use. Other experiments used the initial handshake in a different way for restarting the protocol without data loss after machine crashes. These experiments never worked well in the packet driver and basically pro- vided the impetus for other protocol designs. The result as far as UUCP is concerned is that initial synchronization uses the two 3-way handshakes, and the INIT messages are ignored elsewhere. _D_a_t_a _T_r_a_n_s_p_o_r_t After initial synchronization each receiver sets a modulo-8 incrementing counter R to 0; each sender sets a similar counter S to 1. The value of R is always the number of the most recent correctly received packet. The value of S is always the first sequence number in the output window. Let W denote window size. Note that the value of W may be different for each sender. A sender may transmit packets with sequence numbers in the range S to (S+W-1) mod-8. At any particular time a receiver expects arriving packets to have numbers in the range (R+1) mod-8 to (R+W) mod-8. Packets must arrive in sequence number order are are only acknowledged in order. That is, the `next' packet a receiver will acknowledge must have sequence number (R+1) mod-8. A receiver acknowledges receipt of data packets by arranging for the value of its R counter to be sent across the channel where it will be used to update an S counter. This is done in two ways. If data is flowing in both direc- tions across a channel then each receiver's current R value is carried in the _y_y_y field of non-control packets. Other- wise when there is no bidirectional data flow, each receiver's R value is transmitted across the link as the _y_y_y field of an RR control packet. Error handling is up to the discretion of the receiver. It can ignore all errors in which case transmitter timeouts must provide for retransmission. The receiver may also gen- erate RJ error control packets. The _y_y_y field of an incom- ing RJ message replaces the S value of the local sender and constitutes a request for retransmission to start at that sequence number. The _y_y_y field of an incoming SRJ message selects a particular packet for retransmission. The resemblance between the flow control procedure in the packet driver and that defined for X.25 is no accident. The packet driver protocol began life as an attempt at cleaning up X.25. That is why, for example, control October 5, 1988 - 5 - information is uniform in length (one byte), there is no RNR message (not needed), and there is but one timeout defined in the sender. _T_e_r_m_i_n_a_t_i_o_n The CLOSE message is used to terminate communications. Software on either or both ends of the communication channel may initiate termination. In any case when one end wants to terminate it sends CLOSE messages until one is received from the other end or until a programmable limit on the number of CLOSE messages is reached. Receipt of a CLOSE message causes a CLOSE message to be sent. In the UNIX environment it also causes the SIGPIPE or `broken pipe' signal to be sent to the local process using the communication channel. _F_r_a_m_i_n_g The term _f_r_a_m_i_n_g is used to denote the technique by which the beginning and end of a message is detected in a byte stream; _e_r_r_o_r _c_o_n_t_r_o_l denotes the method by which transmission errors are detected. Strategies for framing and error control depend upon additional information being transmitted along with the control byte and data segment, and the choice of a particular strategy usually depends on characteristics of input/output devices and transmission media. Several framing techniques are in used in support of PK protocol implementations, not all of which can be described in detail here. The technique used on asynchronous serial lines will be described. A six byte framing _e_n_v_e_l_o_p_e is constructed using the control byte C of a packet and five other bytes as depicted below. The symbol denotes the ASCII ctrl/P character. If the envelope is to be followed by a data segment, has the value log928(size)-4; i.e. 1 <_ k <_ 8. If k is 9, then the envelope represents a control packet. The and bytes are the low-order and high-order bytes respectively of 0xAAAA minus a 16-bit checksum. For control packets, this 16-bit checksum is the same as the control byte C. For data packets, the checksum is calculated by the program below. The byte is the exclusive-or of . Error control is accomplished by checking a received framing envelope for compliance with the definition, and comparing a checksum function of the data segment with . This particular framing strategy assumes data segments are constant-sized: the `unused' bytes in a short packet are actually transmitted. This creates a certain amount of overhead which can be eliminated by a more complicated October 5, 1988 - 6 - framing technique. The advantage of this strategy is that i/o devices can be programmed to take advantage of the constant-sized framing envelopes and data segments. October 5, 1988 - 7 - The checksum calculation is displayed below as a C function. Note that the code is not truly portable because the definitions of _s_h_o_r_t and _c_h_a_r are not necessarily uni- form across all machines that might support this language. This code assumes that _s_h_o_r_t and _c_h_a_r are 16 and 8-bits respectively. /* [Original document's version corrected to actual version] */ chksum(s,n) register char *s; register n; { register short sum; register unsigned short t; register short x; sum = -1; x = 0; do { if (sum<0) { sum <<= 1; sum++; } else sum <<= 1; t = sum; sum += (unsigned)*s++ & 0377; x += sum^n; if ((unsigned short)sum <= t) { sum ^= x; } } while (--n > 0); return(sum); } The checksum routine used in gnuucp has been updated to avoid depending on the particular sizes of _c_h_a_r and _s_h_o_r_t variables. As long as a _c_h_a_r holds 8 bits or more, and a _s_h_o_r_t holds 16 bits or more, the code will work. To test it, uncomment the ``#define short long'' below. A good com- piler produces the same code from this function as from the less portable version. #define HIGHBIT16 0x8000 #define JUST16BITS 0xFFFF #define JUST8BITS 0x00FF #define MAGIC 0125252 /* checksum is subtracted from this */ int pktchksum(msg, bytes) unsigned char *msg; int bytes; { October 5, 1988 - 8 - return (JUST16BITS & (MAGIC - (chksum(&msg[6], bytes) ^ (JUST8BITS & msg[4])))); } int chksum(s,n) register unsigned char *s; register n; { /* #define short long /* To make sure it works with shorts > 16 bits */ register short sum; register unsigned short t; register short x; sum = (-1) & JUST16BITS; x = 0; do { /* Rotate "sum" left by 1 bit, in a 16-bit barrel */ if (sum & HIGHBIT16) { sum = (1 + (sum << 1)) & JUST16BITS; } else sum <<= 1; t = sum; sum = (sum + (*s++ & JUST8BITS)) & JUST16BITS; x += sum ^ n; if ((unsigned short)sum <= t) sum = (sum ^ x) & JUST16BITS; } while (--n > 0); return(sum); #undef short /* End of debugging check */ } Volume-Number: Volume 14, Number 13 October 5, 1988