MSBAAA.HLP          MS-DOS KERMIT 3.0 "BOOTSTRAPPING"          June 1990

MS-DOS Kermit 3.0 is distributed in binary executable form on a 5.25-inch
IBM DOS-format diskette included with the book "Using MS-DOS Kermit", by
Christine M. Gianone, published by Digital Press, Bedford, MA, 1990, order
number EY-C204E-DP (phone 1-800-343-8321).  It is also available from
Kermit Distribution at Columbia University and wherever computer books are
sold.  The MSB*.* files are for people who cannot get MS-DOS Kermit on
diskette.

MS-DOS Kermit, like many other Kermit programs, is often distributed using
a special encoding called "boo" (short for "bootstrap") format, developed
especially for distribution of MS-DOS Kermit over networks, e-mail, and
communication lines.

MS-DOS Kermit has grown to have so many features that the binary program
image (the .EXE file) has become quite large.  But binary files are
generally not compatible with the common labeled tape formats (e.g. ANSI
D), electronic mail, or raw downloading -- methods commonly used for Kermit
software distribution.

A common practice is to encode .EXE and other binary files into printable
characters, such as hexadecimal digits, for transportability.  A simple
"hex" encoding results in two characters per 8-bit binary byte, plus CRLFs
added every 80 (or fewer) hex characters to allow the file to pass through
card-oriented networks like BITNET.  A hex file is therefore more than
twice as large as the original binary file.

A .BOO file is a more compact, but somewhat more complicated, encoding.
Every three binary bytes (24 bits) are split up into four 6-bit values with
48 (ASCII character "0") added to each, resulting in four ASCII characters
ranging from "0" (ASCII 48) to "o" (ASCII 111), with CRLFs added at or near
"column 76".  The resulting file size would therefore be about 4/3 the .EXE
file size.

This is still quite large, so .BOO files also compress consecutive null
(zero) bytes.  Up to 78 consecutive nulls are compressed into two
characters.
Tilde ("~") is the null-compression lead-in, and the following character
indicates how many nulls are represented (subtract 48 from this character's
ASCII value).  For instance "~A" means 17 consecutive nulls; "~~" means 78
of them.  Repeated nulls are very common in .EXE files.

4-for-3 encoding combined with null compression reduces the size of the
encoded file to approximately the same size as the original .EXE file, and
sometimes even smaller.

The first line of a .BOO file is the name (in plain text) of the original
file.  Here's what the first few lines of a typical .BOO file look like:

MSVIBM.EXE
CEYP0Id05@0P~3oomo2Y01FWeP8@007P000040HB4001`W~28bL005\W~2JBP00722V0ZHPYP:
\8:H2]R2V0[`PYP:68>H2S23V0YHPiP:Xg800;Qd~2UWD006Yg~2Ogl009]o~2L8000;20~~~~
~~~~~~~:R2H008TV?P761T410
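
The encoding rules described above can be illustrated with a short Python
sketch.  This is not one of the programs from the Kermit distribution; the
function names encode_triple and decode_boo are hypothetical and exist only
for this example.  The sketch assumes that a tilde pair only ever appears
where a new four-character group would otherwise begin, that CR and LF
characters carry no data, and that the 24-bit group is packed with the
first byte in the high-order position.  It makes no attempt to deal with
any padding at the end of a file.

```python
def encode_triple(b0, b1, b2):
    """Pack three 8-bit bytes into four printable characters ("0".."o")."""
    bits = (b0 << 16) | (b1 << 8) | b2           # 24-bit group, first byte highest
    sixes = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(chr(s + 48) for s in sixes)   # add ASCII "0" (48) to each value


def decode_boo(text):
    """Return (original_file_name, data_bytes) for the text of a .BOO file."""
    name, _, body = text.partition("\n")         # first line is the file name
    chars = [c for c in body if c not in "\r\n"]  # line breaks carry no data
    out = bytearray()
    i = 0
    while i < len(chars):
        if chars[i] == "~":                      # null-compression lead-in
            if i + 1 >= len(chars):
                raise ValueError("tilde without a count character")
            out.extend(b"\x00" * (ord(chars[i + 1]) - 48))
            i += 2
        else:                                    # ordinary 4-for-3 group
            group = chars[i:i + 4]
            if len(group) < 4:
                raise ValueError("truncated four-character group")
            bits = 0
            for c in group:
                value = ord(c) - 48
                if not 0 <= value <= 63:
                    raise ValueError("character outside the range 0..o: %r" % c)
                bits = (bits << 6) | value
            out.extend(bits.to_bytes(3, "big"))
            i += 4
    return name.strip(), bytes(out)


if __name__ == "__main__":
    # The first eight data characters of the MSVIBM.EXE sample shown above.
    name, data = decode_boo("MSVIBM.EXE\r\nCEYP0Id0\r\n")
    print(name, data.hex())                      # prints: MSVIBM.EXE 4d5a60019d00
```

Under these assumptions the first four characters of the sample, "CEYP",
decode to the bytes 4D 5A 60 (hex).  The first two are the letters "MZ",
the signature that begins a DOS .EXE file, which suggests the bit ordering
assumed here matches the sample.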