DiskImage is a Node command-line application that reads and writes PCjs v2 disk images, using the DiskInfo PCx86 machine module to parse the data. It supersedes the older PCjs DiskDump utility.
PCjs v2 (version 2.x) disk images are JSON objects with the following properties:
Sector objects contain the raw sector data, using 32-bit signed decimal values, and for sectors that contain data from a file in fileTable, the object will also contain an index into fileTable, along with the data’s offset within the file.
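As a rough illustration of how packed 32-bit signed values relate to raw bytes, the sketch below unpacks an array of such values into a byte array. The little-endian byte order and the function name `dwordsToBytes` are assumptions made for this example; they are not taken from the DiskInfo source.

```javascript
// Unpack 32-bit signed integers into individual bytes.
// ASSUMPTION: little-endian order (low byte first); the text above does not say.
function dwordsToBytes(dwords) {
    const bytes = new Uint8Array(dwords.length * 4);
    const view = new DataView(bytes.buffer);
    dwords.forEach((dw, i) => view.setInt32(i * 4, dw, true));
    return bytes;
}

// -1 is 0xFFFFFFFF and 1094861636 is 0x41424344.
console.log(Array.from(dwordsToBytes([-1, 1094861636])));
```

Signed values appear because JavaScript bitwise operations yield signed 32-bit results, so any value with the top bit set shows up as a negative decimal number.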
For example, look at the PC DOS 2.00 diskette and examine the following sector object:
The f property is a zero-based index into fileTable that refers to the following file entry:
and the o property indicates that sector’s byte offset (1024) within COMMAND.COM. Having all this information about the FAT file system encoded in the disk image makes it easy to observe low-level I/O and immediately know which portions of which files are being accessed. Other tasks, such as cataloging the contents of disk images, locating identical files, finding files older or newer than a certain date, or extracting other information about the disks or their files, become much simpler as well.
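The lookup described above can be sketched in a few lines. The `f` and `o` properties come from the text; the `path` field on file entries and the sample data are hypothetical and stand in for whatever the real fileTable entries contain.

```javascript
// Describe which file (if any) a sector belongs to.
// `f` and `o` are the sector properties described above; `path` is a
// hypothetical field name used only for this sketch.
function describeSector(sector, fileTable) {
    if (sector.f === undefined) return "not part of any file";
    const file = fileTable[sector.f];
    return `${file.path} at byte offset ${sector.o}`;
}

// Made-up sample data, not copied from the actual PC DOS 2.00 image.
const sampleFileTable = [
    { path: "/IBMBIO.COM" },
    { path: "/IBMDOS.COM" },
    { path: "/COMMAND.COM" },
];
console.log(describeSector({ f: 2, o: 1024 }, sampleFileTable));
console.log(describeSector({}, sampleFileTable));
```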
Unusual sectors, like those used in copy-protection schemes, may have non-standard sector IDs or additional properties that simulate special behavior; for example, sectors with a dataError property can trigger a read or write failure at a certain point within the sector. Some of these properties have been discussed in PCjs blog posts and may be documented more fully at a later date.
Older PCjs v1 (version 1.x) disk images were basically just an array of CHS sector data (what is now called the diskData object), without any other information. Such disk images are still supported, but all the disk images now stored on PCjs disk servers, such as diskettes.pcjs.org, have been converted to the v2 format.
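A consumer that has to accept both formats might tell them apart as sketched below. This assumes that a v1 image parses to a plain array and that a v2 image is an object whose `diskData` property holds the equivalent array; the function name is illustrative.

```javascript
// Return the sector data regardless of image version.
// ASSUMPTION: v1 images are bare arrays; v2 images keep the same data
// in a `diskData` property.
function getDiskData(image) {
    if (Array.isArray(image)) return image;              // v1: the array itself
    if (image && Array.isArray(image.diskData)) {
        return image.diskData;                           // v2: wrapped in an object
    }
    throw new Error("unrecognized disk image");
}

console.log(getDiskData([[], []]).length);               // v1-style input
console.log(getDiskData({ diskData: [[]] }).length);     // v2-style input
```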
To build a PCjs disk image, such as this PC DOS 2.00 diskette, from an IMG file:
node diskimage.js /diskettes/pcx86/sys/dos/2.00/PCDOS200-DISK1.img PCDOS200-DISK1.json
In addition to IMG files, DiskImage also includes (experimental) support for PSI (PCE Sector Image) files, which can in turn be built from Kryoflux RAW files. Here are the basic steps, using tools from PCE:
which translates to these commands (using a 360K PC diskette named “disk1” as an example):
pfi disk1.00.0.raw disk1.pfi
pfi disk1.pfi -p double-step -r 600000 -p decode pri disk1.pri
pri disk1.pri -c 40-99 -p delete disk1.pri
pri disk1.pri -p decode mfm disk1.psi
node diskimage.js disk1.psi disk1.json
To build a VisiCalc diskette from a directory containing VC.COM, specify the name of the directory, including a trailing slash, like so:
node diskimage.js /miscdisks/pcx86/app/other/visicalc/1981/VISICALC-1981/ VISICALC-1981.json
By default, the diskette will be given an 11-character volume label derived from the directory name (eg, “VISICALC-19”); however, you can use --label to specify your own label (eg, --label=VISICALC81), or use --label=none to suppress the volume label.
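The default label in the example is consistent with simply truncating the directory name. A minimal sketch of that idea, assuming the label is the upper-cased base name cut to 11 characters (the real derivation may differ, and `defaultLabel` is a hypothetical name):

```javascript
const path = require("node:path");

// Derive a volume label from a directory name.
// ASSUMPTION: upper-case the base name and keep the first 11 characters.
function defaultLabel(dir) {
    return path.posix.basename(dir).toUpperCase().slice(0, 11);
}

console.log(defaultLabel("/miscdisks/pcx86/app/other/visicalc/1981/VISICALC-1981/"));
```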
The smallest standard PC diskette format that can accommodate all the files will be automatically selected, but you can specify a different target size (in KB) using --target=N, where N is 160, 180, 320, 360, 720, 1200, or 1440. For example, if your diskette must work with PC DOS 1.0, which supported only 160KB diskettes, use --target=160.
Another useful option is --normalize, which will transform the line-endings in all recognized text files from LF to CR/LF. A recognized text file is any file ending with one of these extensions (.MD, .ME, .BAS, .BAT, .ASM, .LRF, .MAK, .TXT, or .XML) AND which contains only 7-bit ASCII characters, since some files, like .BAS files, can contain either ASCII or non-ASCII data. The list of recognized text file extensions is likely to grow over time.
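The two-part test for a recognized text file, followed by the line-ending conversion, could look roughly like this. The extension list is taken from the text above; the function names are illustrative and the actual DiskImage logic may differ in detail.

```javascript
// Extensions listed above as recognized text files.
const TEXT_EXTS = [".MD", ".ME", ".BAS", ".BAT", ".ASM", ".LRF", ".MAK", ".TXT", ".XML"];

// A file qualifies only if its extension is listed AND every byte is 7-bit ASCII.
function isRecognizedText(name, data) {
    const dot = name.lastIndexOf(".");
    const ext = dot < 0 ? "" : name.slice(dot).toUpperCase();
    return TEXT_EXTS.includes(ext) && data.every((b) => b < 0x80);
}

// Convert bare LF to CR/LF without doubling the CR of existing CR/LF pairs.
function normalizeLineEndings(text) {
    return text.replace(/\r?\n/g, "\r\n");
}

console.log(isRecognizedText("README.TXT", Buffer.from("a\nb")));      // plain ASCII text
console.log(isRecognizedText("GAME.BAS", Buffer.from([0xff, 0x41])));  // non-ASCII bytes
console.log(JSON.stringify(normalizeLineEndings("a\nb\r\nc")));
```

Checking the bytes as well as the extension is what keeps binary files with a text-like extension from being altered.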
There are many large software collections where the diskette contents have been archived as ZIP files rather than as disk images, and in theory, it’s trivial to unzip them into separate folders and then use DiskImage to build new images from those folders (see above).
For example, I originally recreated all the PC-SIG Library diskette images from the “PC-SIG Library Eighth Edition” CD-ROM files stored at cd.textfiles.com. Some of the diskettes on the CD-ROM had been completely archived as single ZIP files, probably because the diskettes contained filenames that were not allowed on CD-ROM, so I used unzip on macOS to extract those ZIP files to folders, and then recreated disk images from those folders.
However, this process doesn’t always work well. DISK0798 highlights a few issues that have already been discussed on GitHub.
First, the original order of the filenames was not preserved. Modern operating systems (eg, macOS) list files alphabetically, and as a result, the files on the recreated diskettes were sorted alphabetically as well.
Second, while the ZIP archives appeared to more-or-less preserve non-ASCII filenames, unzip did not. IBM PCs used a character set now known as Code Page 437 (CP437), which included a variety of line-drawing characters and other symbols that unzip failed to translate to their modern (UTF-8) counterparts.
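To make the translation problem concrete, here is a small sketch that maps CP437 bytes to Unicode using the standard upper-half table. It is not DiskImage's own code: the function name is hypothetical, and bytes below 0x80 are treated as plain ASCII, so the special glyphs CP437 assigns to control codes are not handled.

```javascript
// Upper half (0x80-0xFF) of Code Page 437, 16 characters per row.
const CP437_HIGH = [
    "ÇüéâäàåçêëèïîìÄÅ",
    "ÉæÆôöòûùÿÖÜ¢£¥₧ƒ",
    "áíóúñÑªº¿⌐¬½¼¡«»",
    "░▒▓│┤╡╢╖╕╣║╗╝╜╛┐",
    "└┴┬├─┼╞╟╚╔╩╦╠═╬╧",
    "╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀",
    "αßΓπΣσµτΦΘΩδ∞φε∩",
    "≡±≥≤⌠⌡÷≈°∙·√ⁿ²■\u00A0",
].join("");

// Translate an array of CP437 bytes to a JavaScript (Unicode) string.
function cp437ToString(bytes) {
    return Array.from(bytes, (b) =>
        b < 0x80 ? String.fromCharCode(b) : CP437_HIGH[b - 0x80]
    ).join("");
}

// 0xC9 0xCD 0xBB form the top edge of a double-line box.
console.log(cp437ToString([0xC9, 0xCD, 0xBB]));
```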
To resolve all these issues, I updated DiskImage with an option (--zip) to read ZIP archives directly. I started with an NPM package called node-stream-zip, which is essentially a module that understands the ZIP file format, identifies all the compressed files inside the ZIP file, and uses Node’s built-in zlib functionality to decompress them.
Here’s an example of --zip in action:
node diskimage.js --zip=/Volumes/PCSIG_13B/BBS/DISK0042.ZIP --output=DISK0042.json --verbose
DiskImage v2.11
Copyright © 2012-2023 Jeff Parsons <Jeff@pcjs.org>
options: --zip=/Volumes/PCSIG_13B/BBS/DISK0042.ZIP --output=DISK0042.json --verbose
/Volumes/PCSIG_13B/BBS/DISK0042.ZIP
Filename        Length  Method    Size  Ratio  Date        Time      CRC
--------        ------  ------    ----  -----  ----        ----      ---
MSVIBM.EXE      131392  Implode  73651   44%   1990-02-19  19:30:30  ac9163ba
MSR300.UPD       20338  Implode   8627   58%   1990-02-19  19:35:12  61372fc6
MSKERM.HLP       35263  Implode  13799   61%   1990-02-19  19:39:16  1c61d95c
MSKERM.BWR       27985  Implode  12132   57%   1990-02-19  19:42:28  353a76ed
MSKERMIT.INI      4760  Implode   2309   51%   1990-02-19  20:07:20  00e884a5
GO.BAT              40  Shrink      38    5%   1980-01-01  06:00:08  75d72756
FILE0042.TXT      3870  Implode    896   77%   1990-11-12  01:46:16  3a817bda
GO.TXT            1002  Implode    307   69%   1990-11-09  06:21:54  e64455e9
processing DISK0042: 327680 bytes (checksum -1217186896, hash bba045788185bc8284f5e4cde0929b70)
writing DISK0042.json...
The --verbose option generates the PKZIP-style file listing, displaying the individual file names, compressed and uncompressed file sizes, compression ratio, etc.
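The Ratio column in that listing matches the percentage of space saved by compression. A small sketch of the calculation, assuming the result is rounded to the nearest whole percent (the function name is illustrative):

```javascript
// Percentage of space saved: length is the uncompressed size,
// size is the compressed size.
// ASSUMPTION: rounded to the nearest whole percent.
function compressionRatio(length, size) {
    return length ? Math.round(100 * (1 - size / length)) : 0;
}

console.log(compressionRatio(131392, 73651) + "%");  // MSVIBM.EXE
console.log(compressionRatio(40, 38) + "%");         // GO.BAT
```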
In fact, creating a disk image is entirely optional; you can use DiskImage to simply examine the contents of a ZIP archive:
node diskimage.js --zip=/Volumes/PCSIG_13B/BBS/DISK0042.ZIP --verbose
To simplify dealing with large collections of files, I also added an --all option:
node diskimage.js --all="/Volumes/PCSIG_13B/**/*.ZIP" --verbose
That command will locate all matching ZIP files and process each one with any other options you specify (eg, --verbose to display their contents).
--all also supports file extensions
--disk is assumed for any file ending with one of those extensions, whereas
--zip is assumed for any file ending with a
If you want to create a disk image for every matching ZIP file:
node diskimage.js --all="/Volumes/PCSIG_13B/**/*.ZIP" --output=tmp --type=img
--output specifies the output folder and --type specifies the output file type (either IMG or JSON). Each output file will have the same basename as the ZIP file. You can also use “%d” anywhere in the --output value to represent the directory of the corresponding input file (eg,
ZIP files inside disk images can be automatically expanded during disk image processing as well; just add the new --expand option. Each ZIP file will be replaced with a folder of the same name, and that folder will contain the entire uncompressed contents of the archive; the original ZIP file will not be included in the disk image:
node diskimage.js --all="/Volumes/PCSIG_13B/**/*.ZIP" --expand --output=tmp
Finally, support for the ARC file format (ZIP’s predecessor) is now available. Just use --arc instead of --zip, or specify input files with .ARC extensions instead of .ZIP. All the same capabilities apply.
You can extract the contents of a single disk image to your current directory, or to a specific directory using --extdir:
node diskimage.js DISK0001.IMG --extract
node diskimage.js DISK0001.IMG --extract --extdir=tmp
You can also extract the contents of an entire collection of disk images, placing the contents of each either in the same directory as the original disk image or in a specific directory:
node diskimage.js --all="*.IMG" --extract --extdir=%d
node diskimage.js --all="*.IMG" --extract --extdir=tmp
You can also expand any ZIP files during the extraction process, by including the --expand option:
node diskimage.js --all="*.IMG" --extract --expand --extdir=tmp
Also, while the --normalize option was originally created to “normalize” files read from the host (eg, to convert LF to CR/LF in text files), it can also be used during extraction now, when files are being written to the host.
For example, if you want any filenames with CP437 characters to be created properly on the host, or you want the contents of any CP437 text files, BASIC files, etc, to be stored in readable form on the host, use the --normalize option along with the --extract option; eg:
node diskimage.js --all="/Volumes/PCSIG_13B/**/*.ZIP" --extract --expand --normalize --extdir=tmp
In addition to converting line-endings back from CR/LF to LF, --normalize will also convert any tokenized .BAS files to plain-text UTF-8 files on the host, as well as decrypt any .BAS files that have been “protected” by BASIC with the P option of the SAVE command.
The --output option is available with all of the above commands as well, but that option only affects disk image creation, not file extraction. If you don’t want any disk images created at the same time, don’t use --output.
Both local and remote diskette images can be examined. To examine a remote image, you must use the --disk option with either an explicit URL, as in:
node diskimage.js --disk=https://diskettes.pcjs.org/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json
or with one of PCjs’ implicit diskette paths, such as /diskettes, which currently maps to disk server https://diskettes.pcjs.org:
node diskimage.js --disk=/diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json
If you happen to have a local file that exists in the same location as the implicit diskette path, use --server to force the server mapping. The list of implicit paths for PC disks currently includes (but is not limited to):
NOTE: Implicit disk paths should normally begin with /disks, because when running PCjs locally, that’s where any local copies of PCjs disk repositories are assumed to exist; however, PCjs and DiskImage will allow you to omit that portion, for convenience (and backward compatibility). In other words, /diskettes will be automatically mapped to /disks/diskettes for local access and https://diskettes.pcjs.org for remote access.
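The mapping in that note can be sketched as a small lookup. Only the /diskettes entry comes from the text; the table layout, the function name `mapDiskPath`, and the handling of unknown paths are assumptions made for illustration.

```javascript
// Map an implicit disk path to its local and remote locations.
// Only the /diskettes mapping is taken from the text above.
const DISK_SERVERS = { "/diskettes": "https://diskettes.pcjs.org" };

function mapDiskPath(diskPath) {
    // The leading /disks portion is optional, so strip it if present.
    const p = diskPath.replace(/^\/disks(?=\/)/, "");
    const root = Object.keys(DISK_SERVERS).find((r) => p.startsWith(r + "/"));
    if (!root) return null;     // not an implicit path known to this sketch
    return {
        local: "/disks" + p,
        remote: DISK_SERVERS[root] + p.slice(root.length),
    };
}

console.log(mapDiskPath("/diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json"));
```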
To get a DOS-compatible directory listing of a disk image:
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --list
To display all the unused bytes of a disk image (JSON-encoded disk images only):
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --list=unused
NOTE: Unused bytes are a superset of free bytes. Free bytes are always measured in terms of unused clusters, multiplied by the cluster size, whereas unused bytes are the combination of all completely unused cluster space plus any partially unused cluster space. Being able to see all the unused bytes on a disk can be useful for studying disk image usage, or simply making sure that a disk is free of any unwanted data.
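The difference between the two measurements can be shown with a little arithmetic. This is a sketch under stated assumptions: the parameter names are illustrative, the sample numbers are made up, and FAT and directory sectors are ignored.

```javascript
// Compare free bytes with unused bytes for a FAT volume.
// Free bytes count only whole unused clusters; unused bytes also include
// the slack at the end of each file's last cluster.
function spaceReport(clusterSize, totalClusters, fileSizes) {
    let usedClusters = 0;
    let slack = 0;
    for (const size of fileSizes) {
        const clusters = Math.ceil(size / clusterSize);
        usedClusters += clusters;
        slack += clusters * clusterSize - size;   // partially unused cluster space
    }
    const free = (totalClusters - usedClusters) * clusterSize;
    return { free, unused: free + slack };
}

// Example values only: 1024-byte clusters, 354 clusters, two files.
console.log(spaceReport(1024, 354, [17664, 100]));
```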
TODO: Update the unused byte report to include unused bytes, if any, in all FAT sectors and directory sectors.
To extract all the files from a disk image:
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --extract
To extract a specific file from a disk image:
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --extract=COMMAND.COM
To extract files from a disk image into a specific directory (eg, tmp):
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --extract --extdir=tmp
To dump a specific (C:H:S) sector from a disk image:
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --dump=0:0:1
To dump multiple (C:H:S) sectors from a disk image track, follow the C:H:S values with a sector count; eg:
node diskimage.js /diskettes/pcx86/sys/dos/ibm/2.00/PCDOS200-DISK1.json --dump=0:0:1:4
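For readers scripting around these commands, a --dump value can be split into its parts as sketched below. The function name and the validation rules are assumptions; in particular, this sketch treats a missing count as a single sector.

```javascript
// Parse a --dump value of the form C:H:S or C:H:S:count.
// ASSUMPTION: a missing count means one sector.
function parseDumpSpec(spec) {
    const parts = spec.split(":").map(Number);
    const valid = parts.length >= 3 && parts.length <= 4 &&
        parts.every((n) => Number.isInteger(n) && n >= 0);
    if (!valid) throw new Error(`invalid sector spec: ${spec}`);
    const [cylinder, head, sector, count = 1] = parts;
    return { cylinder, head, sector, count };
}

console.log(parseDumpSpec("0:0:1"));
console.log(parseDumpSpec("0:0:1:4"));
```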