The PC Software Interest Group, aka PC-SIG, began advertising a directory of public domain software for the IBM PC in early 1983. A year later, their collection featured over 135 diskettes, which you could purchase for $6 per diskette, plus $4 per order for shipping. Each diskette was numbered, to make it easy to specify which diskettes you wanted. By 1986, PC-SIG was one of the most prominent advertisers of user-supported software (“shareware”), sold annual memberships, and had a collection of nearly 500 diskettes.
As CD-ROM technology became popular in the late 1980s, PC-SIG began distributing their library on CD-ROM as well. As of 1987, the library contained over 700 diskettes, so this was a much more convenient and economical way to get the collection.
So far, I’ve seen only five of their CD-ROM releases:
I recently acquired the 8th Edition CD-ROM, previously unavailable except as a collection of files on Jason Scott’s website, cd.textfiles.com (see “PC-SIG Library Eighth Edition From PC-SIG (April 1990)”), so I uploaded a copy of the CD-ROM to the Internet Archive.
I’m sure there were more than those five CD-ROMs. There was probably an 11th Edition released in 1992. And “CD ROM” releases were mentioned in their advertisements going back to at least September 1987, so the 8th Edition was definitely not the first CD-ROM release.
PC-SIG “editions” were not limited to CD-ROM releases, because at some point, their printed directories were labeled as “editions”, too. Another recent acquisition was a copy of “The PC-SIG Library 4th Edition”, published in March 1987, which I’ve also uploaded to the Internet Archive.
On page 424 of that directory, there is a picture of their “February 1987” CD-ROM, and while the CD-ROM does not appear to mention any “edition”, it would likely be considered the “PC-SIG Library 4th Edition” CD-ROM. It was probably the first CD-ROM that PC-SIG produced, too, based on that early date. At $295, it was also rather expensive, which means that not many people may have bought it, which may explain why the disc hasn’t shown up anywhere – at least not yet.
The layout of the PC-SIG CD-ROMs changed over time. In the 8th Edition, the contents of each diskette were deposited in their own folder. For example, the folder for Disk #1 contained:
      512 Oct  3 1985 ASK.COM
     8320 Jan  1 1970 BLACKJCK.BAS
     1681 Jun 11 1986 CASPAR.BAT
     4376 Jun 11 1986 CASPAR.HLP
     1177 Jun 11 1986 CASPAR0.HLP
     1359 Jun 11 1986 CASPAR1.HLP
     1139 Jun 11 1986 CASPAR2.HLP
      704 Jun 11 1986 CASPAR3.HLP
     1024 Jan  1 1970 CIRCLES.BAS
      198 Mar 26 1983 COPYOVER.BAT
      450 Mar 26 1983 COPYOVER.DOC
     1408 Feb  2 1982 DOTS.BAS
     1299 May 28 1987 FILES001.TXT
     1664 Jan  1 1970 HATDANCE.BAS
     2560 Mar 26 1982 KALEID.BAS
     4608 Mar 26 1982 MAXIT.BAS
     4096 Dec 24 1982 MENU.BAS
     6912 Jan  1 1970 OTHELLO.BAS
     4096 Sep  8 1982 PATTERNS.BAS
     4992 Jan  1 1970 PONGPONG.BAS
      128 Mar 26 1983 SAMPLES.BAS
      512 Mar 26 1982 STRINGS.BAS
    19712 Jun 17 1986 WOMBATS.BAS
    26880 Jun 10 1986 WOMBATS.DOC
    25933 Jun 10 1986 WOMBATS.WP
    15104 Mar 26 1982 YAHTZEE.BAS
However, beginning with the 9th Edition, diskette contents were stored on the CD-ROM as ZIP files (eg, DISK0001.ZIP through DISK2485.ZIP).
As PC-SIG explained, ZIP files were initially used on a case-by-case basis to preserve filenames that were not allowed on CD-ROM, but eventually it was done for all diskettes to save space as well:
“This zipping was necessary because the High Sierra Format and MS-DOS Extensions allow only the characters A through Z, 0 through 9, and _(underscore) in filenames. Some of our nearly 30,000 files include other characters, so rather than change the original filename given by the author, we elected to zip those files and give the newly created zipped files allowable filenames. (Every disks’ files were zipped on the Ninth Edition in order to fit, as we had approached the maximum for a CD on the Eighth Edition.)”
[More details about the PC-SIG CD-ROM can be found in the CD-ROM Booklet that I included with my upload].
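The character restriction PC-SIG describes is easy to check mechanically. The following is a minimal Python sketch, not anything PC-SIG used: the function name `needs_zipping` is hypothetical, and it assumes DOS-style 8.3 names with a single optional extension.

```python
import re

# Characters PC-SIG says High Sierra with the MS-DOS Extensions allows:
# A through Z, 0 through 9, and underscore. The dot separating name and
# extension, and the 8.3 length limits, are assumptions of this sketch.
ALLOWED = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{0,3})?")

def needs_zipping(filename: str) -> bool:
    """Return True if the DOS filename uses characters outside the allowed set."""
    return ALLOWED.fullmatch(filename.upper()) is None

for name in ["ASK.COM", "MG-C.EXE", "┌─┬─┬─┐"]:
    print(name, needs_zipping(name))
```

Under these assumptions, a name such as `MG-C.EXE` from Disk #798 would already be rejected because of its hyphen, before any line-drawing characters come into play.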
It turns out there was another good reason for using ZIP files: on a CD-ROM, all files in a directory are stored in ascending (alphabetical) order, which would ruin the “DIR” listing order that some diskette authors depended on. For example, Disk #798 used filenames with line-drawing characters unique to the IBM PC:
    A:\>DIR

     Volume in drive A has no label
     Directory of A:\

    DEMO     BAT      114   5-02-87  11:44a
    PMENU    DOC    25407   8-01-88  11:28a
    BRUN40   EXE    76816  10-08-87   5:57p
    MENGEN   EXE    67847   8-01-88  10:32a
    MG-C     EXE     6723   8-01-88  10:34a
    MG-M     EXE     5261   8-01-88  10:35a
    MGHLP    EXE    26805   8-01-88  10:37a
    PLIC     EXE     4791   8-01-88  10:38a
    PMENU    EXE    21215   7-30-88   4:57p
    DEMO     HLP     8562   5-05-87  12:16p
    DEMO     MEN     5115   5-02-87  12:36p
    M        MEN     1371   7-30-88  12:55p
    FILES798 TXT     2178   8-25-88   2:09p
    GO       BAT       38   7-02-87  11:29a
    GO       TXT      463   7-02-87  11:29a
    ┌─┬─┬─┐             0
    │T│G│S│             0
    │Y│O│T│             0
    │P│                 0
    │E│T│R│             0
    │                   0
    └─┴─┴─┘             0
           22 File(s)     60416 bytes free
While ZIP files did incidentally preserve file order, the ZIP format was not specifically designed for disk preservation (no volume labels, for example), and in the particular case of DISK0798, some filenames were still mangled at some point (though not necessarily by PKZIP), because the directory listing was probably intended to display as:
    ┌─┬─┬─┐
    │T│G│S│
    │Y│O│T│
    │P│ A│
    │E│T│R│
    │ │O│T│
    └─┴─┴─┘
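To see why an ascending directory sort would scramble a box like this, here is a small Python sketch. It assumes the sort compares CP437 byte values, which is an approximation chosen for illustration; `cdrom_order` is a hypothetical helper, and only the rows that appear unambiguously in the text are used.

```python
# Filenames in the order the diskette author placed them on disk.
on_disk = ["┌─┬─┬─┐", "│T│G│S│", "│Y│O│T│", "│E│T│R│", "└─┴─┴─┘"]

def cdrom_order(names):
    """Sort names by their CP437 byte values (an assumed ascending directory sort)."""
    return sorted(names, key=lambda n: n.encode("cp437"))

for name in cdrom_order(on_disk):
    print(name)
```

In CP437 the vertical bar (0xB3) sorts before the bottom-left corner (0xC0), which sorts before the top-left corner (0xDA), so the top edge of the box ends up at the bottom of the sorted listing.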
Another interesting difference between CD-ROM editions was their degree of “completeness”. At first glance, the 9th Edition CD-ROM would appear to include every diskette from 0001 through 2485, but it turns out that at least 165 diskettes had their contents removed from the library. The ZIP file for each of those 165 diskettes contains nothing more than a NOTE.TXT that says:
    ╔═════════════════════════════════════════════════════════════════════════╗
    ║ This disk has been withdrawn by the author from the PC-SIG library. ║
    ╚═════════════════════════════════════════════════════════════════════════╝
Later CD-ROM editions didn’t bother including folders or ZIP files for removed or obsolete diskettes, which is why I noted above that the 12th Edition had only 2929 disks available (out of 3404 total), and the 13th Edition had only 3064 available (out of 4313 total).
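Placeholder archives like the 9th Edition's withdrawn diskettes can be spotted without unpacking anything. This is a minimal sketch using Python's standard `zipfile` module; `is_withdrawn` is a hypothetical helper, and it only checks that `NOTE.TXT` is the sole member, not the wording of the note.

```python
import io
import zipfile

def is_withdrawn(zip_source) -> bool:
    """Return True if the archive holds nothing but a NOTE.TXT placeholder.

    zip_source may be a path or a seekable file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        names = [info.filename.upper() for info in zf.infolist()]
    return names == ["NOTE.TXT"]
```

Running a check like this over DISK0001.ZIP through DISK2485.ZIP would be one way to count the placeholders, assuming every withdrawn diskette follows the same pattern.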
Some years ago, I added part of the PC-SIG Library to PCjs; specifically, the PC-SIG Library 8th Edition CD-ROM (April 1990). That process involved recreating 2121 diskette images from folders, and then producing four web pages, each with an IBM PC and directory listings for 500+ diskettes, along with buttons to load the desired diskette into the PC.
But I wanted to do something better – something more comprehensive and user-friendly.
So I’ve added the PC-SIG Diskette Library: The (Almost) Complete Collection, featuring over 4000 diskettes from all the available CD-ROMs, and supplemented with diskette images of actual PC-SIG diskettes where available.
Every disk now has its own page, and the PC-SIG Diskette Library page offers a very rudimentary search capability to help you find the program or diskette you’re looking for. I’ve also started adding PC-SIG documentation for each diskette to the individual pages. Work on both search and documentation is on-going.
Although I prefer original disk images, ZIP files are the next best thing, and while the standard practice of “unzipping” a ZIP file into a directory and then running DiskImage with the --dir option works, it’s not ideal for several reasons, some of which were touched on above:
For all those reasons, I decided to update DiskImage with a new --zip option. It can now read and decompress the contents of an entire ZIP file into memory and then create a disk image, preserving the order of files, as well as the original attributes, dates, times, and filenames.
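The key point is that a ZIP archive records its members in a fixed order along with their timestamps, and a reader can keep both. The sketch below illustrates that with Python's standard `zipfile` module; it is not DiskImage's implementation, and `list_in_stored_order` is a hypothetical name.

```python
import io
import zipfile

def list_in_stored_order(zip_source):
    """Return (name, timestamp, size) tuples in the order the archive records them."""
    with zipfile.ZipFile(zip_source) as zf:
        # infolist() returns members in the same order as their entries in
        # the ZIP file; nothing is sorted here.
        return [(i.filename, i.date_time, i.file_size) for i in zf.infolist()]

# Build a tiny archive in memory whose members are deliberately not alphabetical.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(zipfile.ZipInfo("PMENU.DOC", date_time=(1988, 8, 1, 11, 28, 0)), b"doc")
    zf.writestr(zipfile.ZipInfo("DEMO.BAT", date_time=(1987, 5, 2, 11, 44, 0)), b"bat")
buf.seek(0)

for name, stamp, size in list_in_stored_order(buf):
    print(name, stamp, size)
```

Extracting to a directory first and then reading the directory back loses this ordering, since the file system decides the enumeration order.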
Next, I updated DiskImage’s --extract option to automatically convert CP437 filenames to UTF-8 filenames, so by combining --zip with --extract, you can effectively “unzip” a ZIP file into your file system and not get mangled filenames, even if they included PC graphics characters.
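The filename conversion itself amounts to decoding the raw name bytes as CP437 and re-encoding them as UTF-8. Here is a short Python sketch using the built-in `cp437` codec; it shows the idea only and is not how DiskImage does it, and `dos_name_to_unicode` is a hypothetical name.

```python
def dos_name_to_unicode(raw: bytes) -> str:
    """Decode a raw DOS filename, assuming it is CP437-encoded."""
    return raw.decode("cp437")

# The top edge of the Disk #798 box, written out as CP437 bytes.
top_edge = bytes([0xDA, 0xC4, 0xC2, 0xC4, 0xC2, 0xC4, 0xBF])

name = dos_name_to_unicode(top_edge)
print(name)                    # the box-drawing filename
print(name.encode("utf-8"))    # the bytes a UTF-8 file system would store
```

One caveat: Python's codec maps bytes 0x00 through 0x1F to control characters rather than to the IBM PC's graphic glyphs for those values, so a complete converter would need its own table for that range.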
I also decided to “expand” on that feature with a new --expand option, so that any .ZIP files inside a ZIP file can be decompressed, too. For example, if you have a BACKUP.ZIP that contains a DOCS.ZIP, the --expand option (in conjunction with --extract) will create a directory named DOCS.ZIP containing the decompressed contents of the DOCS ZIP file. This feature should work for any number of nested ZIP files.
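Nested expansion is naturally recursive. The following Python sketch shows the general idea in memory, returning a dictionary of paths instead of writing files; it is an illustration under my own assumptions, not the DiskImage code, and `expand` here is a hypothetical function rather than the option of the same name.

```python
import io
import zipfile

def expand(zip_bytes: bytes, prefix: str = "") -> dict:
    """Return {path: bytes} for every file, descending into nested .ZIP members."""
    files = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            data = zf.read(info)
            path = prefix + info.filename
            is_nested = (info.filename.upper().endswith(".ZIP")
                         and zipfile.is_zipfile(io.BytesIO(data)))
            if is_nested:
                # Use the nested archive's own name as the directory name.
                files.update(expand(data, path + "/"))
            else:
                files[path] = data
    return files
```

Note that Python's `zipfile` cannot decompress the older ZIP methods, so this sketch would raise an error on many archives from the period.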
Last but not least, the PC-SIG collection contained a large number of BASIC programs, and I thought it would be nice to include listings of them on the individual PC-SIG diskette pages. Unfortunately, in those days, .BAS files were usually stored in tokenized (non-ASCII) format, since that was the default SAVE behavior and the files were slightly smaller.
So, I added yet another DiskImage option (--normalize) to automatically convert tokenized .BAS files to CP437 text files (or UTF-8 files if extracting to your local file system). As an added bonus, I included the ability to detect “protected” (encrypted) .BAS files and “de-protect” them in the process.
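Before any conversion can happen, a tool has to decide which kind of .BAS file it is looking at. A common approach is to inspect the first byte. The sketch below assumes the marker values given in widely circulated descriptions of the GW-BASIC file format (0xFF for tokenized, 0xFE for protected); `classify_bas` is a hypothetical helper, not a function from BASFile.js.

```python
def classify_bas(data: bytes) -> str:
    """Guess how a .BAS file was saved from its first byte."""
    if not data:
        return "empty"
    if data[0] == 0xFF:
        return "tokenized"   # assumed marker for a normal (binary) SAVE
    if data[0] == 0xFE:
        return "protected"   # assumed marker for a protected SAVE
    return "ascii"           # anything else is treated as plain text

print(classify_bas(b'10 PRINT "HELLO"\r\n'))
```

Only the detection step is shown here; decrypting a protected file and expanding tokens are separate and considerably more involved.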
These “de-tokenization” and “de-protection” processes seemed straight-forward at first, thanks to several useful online resources that I credit in the source code (eg, https://github.com/rwtodd/bascat and https://slions.net), but de-tokenization is actually a bit trickier than most people realize, in part because they didn’t know how inventive BASIC programmers were in the early days of the IBM PC.
For example, some programmers liked to include PC graphics characters inside their strings, comments, and DATA statements – which BASIC was perfectly fine with. However, all the de-tokenization code and pseudo-code I saw would misidentify those characters as BASIC tokens. There were also a few other tricky details, like rendering floating-point constants with the correct precision, appending ‘#’ to double-precision constants, etc.
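The misidentification problem comes down to context: a byte of 0x80 or above is a token only outside of strings, comments, and DATA text. The Python sketch below tracks that context for a single line body. It is a simplified illustration, not the BASFile.js algorithm: the three token values are assumptions taken from commonly published GW-BASIC token lists, and numeric constants, line numbers, and multi-byte tokens are deliberately omitted.

```python
# Illustrative subset of a token table; the byte values are assumptions,
# not taken from the PCjs source.
TOKENS = {0x84: "DATA", 0x8F: "REM", 0x91: "PRINT"}

def detokenize_body(body: bytes, tokens=TOKENS) -> str:
    """Expand tokens in one line body, leaving string, REM, and DATA text alone."""
    out = []
    in_string = False   # inside "..."
    in_rem = False      # after REM: literal text to end of line
    in_data = False     # after DATA: literal text until an unquoted ':'
    for b in body:
        ch = bytes([b]).decode("cp437")
        if in_rem:
            out.append(ch)
        elif b == 0x22:                  # a double quote toggles string state
            in_string = not in_string
            out.append(ch)
        elif in_string:
            out.append(ch)
        elif in_data:
            if b == 0x3A:                # ':' ends the DATA statement
                in_data = False
            out.append(ch)
        elif b >= 0x80:
            word = tokens.get(b, f"<{b:02X}>")   # unknown tokens shown as hex
            out.append(word)
            in_rem = word == "REM"
            in_data = word == "DATA"
        else:
            out.append(ch)
    return "".join(out)

print(detokenize_body(bytes([0x91, 0x22, 0xC9, 0xCD, 0xBB, 0x22])))
```

A decoder without the `in_string`, `in_rem`, and `in_data` flags would look up 0xC9, 0xCD, and 0xBB in its token table and emit keywords in the middle of the string.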
Here’s just one example of the use of non-ASCII characters inside strings and comments, from PC-SIG Disk #241:
    3300 REM ▬ OTHER OTHELLO BOARD
    3310 CLS:LOCATE 1,10:PRINT "O T H E L L O"
    3320 LOCATE 3,5:PRINT"1 2 3 4 5 6 7 8"
    3330 FOR N=1 TO 8:LOCATE 3+2*N,1:PRINT CHR$(N+64):NEXT
    3340 LOCATE 4,3 :PRINT"╔═══╦═══╦═══╦═══╦═══╦═══╦═══╦═══╗":FOR N=1 TO 13STEP 2
    3350 LOCATE 4+N,3:PRINT"║ ║ ║ ║ ║ ║ ║ ║ ║"
    3360 LOCATE 5+N,3:PRINT"╠═══╬═══╬═══╬═══╬═══╬═══╬═══╬═══╣":NEXT
    3370 LOCATE 4+N,3:PRINT"║ ║ ║ ║ ║ ║ ║ ║ ║"
    3380 LOCATE 5+N,3:PRINT"╚═══╩═══╩═══╩═══╩═══╩═══╩═══╩═══╝"
    3390 FOR I= 1TO 8
    3400 FOR J= 1 TO 8:LOCATE 2* J+ 3,4* I+ 1:FACE= (A(I,J)+ 3)/2
    3410 IF FACE = 1.5 THEN PRINT" " ELSE PRINT CHR$(FACE)
    3420 NEXT J,I
    3430 GOSUB 3250
    3440 RETURN
In short, the collection of PC-SIG .BAS files provided an excellent set of test cases for ZIP decompression and BASIC de-tokenization, and the PCjs DiskImage utility now has some handy new capabilities.
I started with node-stream-zip, added support for the ARC file format, and then extended its decompression support; like most modern ZIP utilities, it uses zlib, which supports only Deflate compression.
StreamZip that adds support for:
The combination of StreamZip and LegacyZip should be able to decompress any old ARC or ZIP archive, so test it out with the new --zip options in the DiskImage utility, and if you find one that doesn’t work, let me know.
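If you want to know in advance whether an archive needs more than Deflate, the compression method of each member is recorded in the ZIP directory and can be read without decompressing anything. This Python sketch uses the method numbers from the published ZIP format specification; `legacy_members` is a hypothetical helper and has nothing to do with how StreamZip or LegacyZip are implemented.

```python
import io
import zipfile

# Compression method IDs as listed in the ZIP format specification.
METHODS = {0: "Stored", 1: "Shrunk", 2: "Reduced (factor 1)", 3: "Reduced (factor 2)",
           4: "Reduced (factor 3)", 5: "Reduced (factor 4)", 6: "Imploded", 8: "Deflated"}

def legacy_members(zip_source):
    """List (name, method) for members that are neither stored nor deflated."""
    with zipfile.ZipFile(zip_source) as zf:
        return [(i.filename, METHODS.get(i.compress_type, f"method {i.compress_type}"))
                for i in zf.infolist() if i.compress_type not in (0, 8)]
```

Any member this reports is one that a zlib-only tool would be unable to extract.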
Finally, I’ve also added a BASIC Conversion Utility page that loads BASFile.js in your web browser. It should be able to convert any old IBM PC BASIC file to plain text, but again, if you run into any interesting discrepancies, let me know.
Apr 6, 2023