This will probably be the final post in my “trilogy” of posts about the PC.js command-line utility, now that I’ve (hopefully) finished ironing out most of the wrinkles. I’ve improved FAT12 and FAT16 support, so when pc.js builds either a standard or custom FAT disk image, it should be compatible with whatever version of PC DOS, MS-DOS, or COMPAQ DOS you select.
BTW, there’s no FAT32 support. FAT32 arrived on the scene in 1996, and my focus here at “PCjs HQ” is generally on stuff older than that and/or stuff that personally interests me. So, apologies in advance.
Let’s start with a quick recap of what you can now do with PC.js:
… (ls, vi, cp) and access local files from the DOS command prompt
… pc.js disk drive
In some ways, pc.js is a culmination of what I started 11 years ago with JavaScript-based machine emulation on the PCjs website (although back then, the website was jsmachines.net; it didn’t become pcjs.org until 2013).
Because I chose to write everything in vanilla JavaScript (sometimes jokingly referred to as Vanilla JS) and not some other language that could be compiled natively and transpiled to JavaScript or WebAssembly, creating a command-line version of PCjs has meant relying on Node.js, and while that works well, it is a rather heavyweight solution.
But I’m OK with that, because PCjs has always been focused on machines from the 1980s and early 1990s, which didn’t run very fast anyway. 16MHz was pretty respectable back then, and it turns out that JavaScript has no problem simulating that level of performance.
Also, since Node.js, several interesting JavaScript runtime alternatives have popped up, like Deno and Bun. I haven’t tried running pc.js with either of them yet, but I look forward to giving both of them a try. Browsers still rely on JavaScript, and JavaScript doesn’t appear to be going away any time soon.
And JavaScript runtime overhead isn’t actually responsible for the biggest part of PCjs startup time. That honor goes to the PC’s own ROM BIOS “POST” or Power-On Self-Test – which PCjs faithfully emulates.
Fortunately, there’s a simple work-around for reducing PC startup time: the default machine used by pc.js – a COMPAQ DeskPro 386 – comes with a state file that restores the machine’s state to the point where it’s about to read the boot sector from the machine’s hard disk. So not only are all power-on diagnostics bypassed, but all the floppy drive seek test and boot operations are skipped as well.
I really wanted to leverage all the browser-based code I had written over the years to solve a fairly simple problem: compile/assemble/link and run/debug DOS software from a modern non-DOS command-line. In other words, I wanted to integrate a simple retro-software-development environment into my existing macOS dev environment.
And while working on pc.js, I ran into the perfect example. I needed to create my own slightly modified version of the DOS Master Boot Record (MBR), which meant I needed to disassemble an existing MBR, recreate the source as an .ASM file, and then run it through the usual DOS tool chain (masm, link, exe2bin). And since it would be an iterative process (assemble, test, fix, re-assemble), the easier the process, the better.
Here’s how I was able to do that using pc.js:
In the screen capture above, I start by using my diskimage.js utility to download and extract a complete DOS dev environment from an existing PCjs hard disk image into a directory named test.
Then I cd test and run pc.js, which builds a bootable hard disk image with all the files and folders in test, and then boots it. From that point on, until I run the quit command, I’m running DOS (the default is MS-DOS 3.30). And when I quit, any files that were added/changed/deleted are automatically updated in the local file system.
I can also run a selection of non-DOS commands to access files from the local file system. Every time I type ls or vi, for example, a lot of work happens behind the scenes: the machine is shut down, all file changes are synced, then the non-DOS command is run, and when it exits, the machine’s hard disk is recreated and the machine is restarted.
The --sys and --ver options control the version of DOS used to build the bootable hard disk – or bootable floppy disk if you specify --floppy.
By default, pc.js will build a 10Mb hard disk – or the largest floppy disk that your chosen version of DOS supports. If you want something bigger or smaller, use the --target option; it supports either kilobytes (eg, --target=160K) or megabytes (eg, --target=20M).
pc.js will “auto-build” only one disk for a machine: either a single-partition hard disk (C:) or a floppy disk (A:). If you want your machine to automatically load other disks into other drives, you’ll have to create a custom machine configuration file that uses prebuilt disk images.
The whole point of the “auto-build” drive is to mirror the files in your current directory (or whatever directory you specify with --dir). Since you only have one current directory at any point in time, it doesn’t really make sense for pc.js to manage more than one automatically built drive.
However, you can load any diskette image into drive A: or B: at any time using the load command that pc.js includes with every disk image it builds. Details of the load command were covered in an earlier blog post.
The IBM PC XT, IBM PC AT, and COMPAQ DeskPro 386 are three examples of machines that supported hard disks, and PCjs includes built-in support for all their respective Hard Drive Types. For example, if you want your machine to be a PC AT with a Type 2 hard disk, run:
% pc.js ibm5170 --drivetype=2
[Press CTRL-D to enter command mode]
C:\>load info
AT drive type 2, CHS 615:4:17, 20.4Mb
16-bit FAT, 2048-byte clusters, 10398 clusters
82 FAT sectors (x2), 64 root sectors (1024 entries)
41752 total sectors, 41523 data sectors, 21295104 data bytes
NOTE: load info is a variation of the load command that displays information about the built-in disk image. I had considered making a separate info utility to do that, since the purpose of the load utility is loading diskette images into drives A: and B:, but I was lazy.
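The size that load info reports can be reproduced from the CHS geometry alone. Here is a minimal sketch, assuming 512-byte sectors and "Mb" meaning 1024 x 1024 bytes; chsCapacity is a hypothetical helper of mine for illustration, not something pc.js exposes:

```javascript
// Raw capacity implied by a cylinders:heads:sectors geometry.
// Assumes 512-byte sectors unless told otherwise.
function chsCapacity(cylinders, heads, sectorsPerTrack, bytesPerSector = 512) {
    const sectors = cylinders * heads * sectorsPerTrack;
    const bytes = sectors * bytesPerSector;
    // Report megabytes the same way the listings do: one decimal place
    return { sectors, bytes, mb: (bytes / (1024 * 1024)).toFixed(1) };
}

// PC AT drive type 2 is CHS 615:4:17
console.log(chsCapacity(615, 4, 17));
```

Note that this is the raw size of the drive. The "total sectors" figure in the listing above (41752) is smaller than the raw count (41820) by exactly one cylinder's worth of sectors (4 x 17).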
If you don’t remember your favorite PC AT drive type, you can just give pc.js a target size and let it search the machine’s drive table for the closest match:
% pc.js ibm5170 --target=30.6M
warning: 62 FAT sectors allocated, but only 61 are required
[Press CTRL-D to enter command mode]
C:\>load info
AT drive type 3, CHS 615:6:17, 30.6Mb
16-bit FAT, 2048-byte clusters, 15608 clusters
124 FAT sectors (x2), 64 root sectors (1024 entries)
62628 total sectors, 62315 data sectors, 31965184 data bytes
Note that the drive tables of AT-class machines usually didn’t define any drives smaller than 10Mb, and the smallest drive type the PC XT defined was 5Mb, so using --target with smaller sizes won’t give you smaller drives on those machines. To do that, you must bypass the machine’s drive table by adding --controller=pcjs to the command-line. For example:
% pc.js load info --sys=pcdos --ver=3.0 --controller=pcjs --target=1M
[Press CTRL-D to enter command mode]
C>ECHO OFF
PCJS drive type 0, CHS 61:2:17, 1.0Mb
12-bit FAT, 1024-byte clusters, 1012 clusters
3 FAT sectors (x2), 7 root sectors (112 entries)
2040 total sectors, 2026 data sectors, 1036288 data bytes
pc.js will calculate a drive geometry that matches your target size as closely as possible (1Mb in this case), and then set up a custom PCJS drive type.
Finally, for complete control of a custom drive type, you can choose any drive geometry you want by passing a “cylinders:heads:sectors” (CHS) triplet to the --drivetype parameter, and since that automatically uses a PCJS drive type, you don’t need to include --controller=pcjs.
For example:
% pc.js ibm5170 --drivetype=615:5:17
[Press CTRL-D to enter command mode]
C:\>load info
PCJS drive type 0, CHS 615:5:17, 25.5Mb
16-bit FAT, 2048-byte clusters, 13003 clusters
102 FAT sectors (x2), 64 root sectors (1024 entries)
52190 total sectors, 51921 data sectors, 26630144 data bytes
C:\>chkdsk
Volume TEST created Aug 30, 2023 11:10a
26630144 bytes total disk space
90112 bytes in 9 hidden files
45056 bytes in 22 directories
4306944 bytes in 257 user files
22188032 bytes available on disk
655360 bytes total memory
586256 bytes free
Custom drive types work thanks to the Master Boot Record that I created specifically for pc.js. It operates exactly like a normal DOS MBR, with the additional ability to install custom drive tables. And since this is a fully PC-compatible solution, it should work in the “real world” as well – as long as you don’t select a drive geometry that doesn’t work with your actual hardware.
One example of a broken machine/drive combination would be any IBM PC XT machine attempting to boot from a drive using something other than 17 sectors/track. The XT BIOS does support custom drive tables, but that support is hard-coded to disks with only 17 sectors/track.
In the past, PCjs only needed to build a limited number of FAT volume types, so it used a predefined set of BPBs. BPBs (BIOS Parameter Blocks) are data structures introduced by DOS 2.00 and embedded in the DOS boot sector to describe the structure of the disk and the FAT file system on it. For reference, here’s how PCjs defines the BPB (see diskinfo.js):
OPCODE: 0x000, // 1 byte for an x86 JMP opcode, followed by a 1 or 2-byte offset
OEM: 0x003, // 8 bytes (eg, "IBM 2.0")
SECBYTES: 0x00B, // 2 bytes: bytes per sector (eg, 0x200 or 512)
CLUSSECS: 0x00D, // 1 byte: sectors per cluster (eg, 1)
RESSECS: 0x00E, // 2 bytes: reserved sectors; ie, # sectors preceding the first FAT, usually just the boot sector (eg, 1)
FATS: 0x010, // 1 byte: FAT copies (eg, 2)
DIRENTS: 0x011, // 2 bytes: root directory entries (eg, 0x40 or 64)
DISKSECS: 0x013, // 2 bytes: number of sectors (eg, 0x140 or 320); if zero, refer to LARGESECS
MEDIA: 0x015, // 1 byte: media ID (eg, 0xF8); should also match the first byte of the FAT (aka FAT ID)
FATSECS: 0x016, // 2 bytes: sectors per FAT (eg, 1)
TRACKSECS: 0x018, // 2 bytes: sectors per track (eg, 8)
DRIVEHEADS: 0x01A, // 2 bytes: number of heads (eg, 1)
HIDDENSECS: 0x01C, // 2 bytes (DOS 2.x) or 4 bytes (DOS 3.31 and up): number of hidden sectors (0 for non-partitioned media)
BOOTDRIVE: 0x01E, // 1 byte (DOS 2.x): BIOS boot drive # (eg, 0x00 or 0x80)
BOOTHEAD: 0x01F, // 1 byte (DOS 2.x): BIOS boot head # (0-based)
/*
* NOTE: DOS 2.0 also stores the number of sectors in the BIOS file (eg, IO.SYS, IBMBIO.COM) in the byte at offset
* 0x020 (LARGESECS), followed by a custom 11-byte Diskette Parameter Table (DPT) at offsets 0x021 through 0x02B, which
* it promptly points the DPT vector 0x1E (0:0078h) to.
*/
LARGESECS: 0x020, // 4 bytes (DOS 3.31 and up): number of sectors if DISKSECS is zero
Most BPB definitions start at offset 0x00B (ie, they don’t include the OPCODE or OEM fields), but I do include them because those fields are actually used by DOS for BPB validation purposes, so their purpose is really inseparable from the rest of the BPB.
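To make those offsets a little more concrete, here is a minimal sketch of decoding a BPB from a boot sector held in a Uint8Array. The offsets are copied from the definition above, but readBPB and the synthetic sector are my own illustration (not PCjs code), and the field values are loosely modeled on a 10Mb FAT12 disk:

```javascript
// Field offsets copied from the BPB definition above (subset only).
const BPB = {
    OEM: 0x003, SECBYTES: 0x00B, CLUSSECS: 0x00D, RESSECS: 0x00E,
    FATS: 0x010, DIRENTS: 0x011, DISKSECS: 0x013, MEDIA: 0x015,
    FATSECS: 0x016, LARGESECS: 0x020
};

// readBPB() is a hypothetical helper, not a PCjs function. It decodes the
// little-endian fields of a boot sector passed in as a Uint8Array.
function readBPB(sector) {
    const view = new DataView(sector.buffer, sector.byteOffset, sector.byteLength);
    const diskSecs = view.getUint16(BPB.DISKSECS, true);
    return {
        oem: String.fromCharCode(...sector.subarray(BPB.OEM, BPB.OEM + 8)),
        bytesPerSector: view.getUint16(BPB.SECBYTES, true),
        sectorsPerCluster: view.getUint8(BPB.CLUSSECS),
        reservedSectors: view.getUint16(BPB.RESSECS, true),
        fats: view.getUint8(BPB.FATS),
        rootEntries: view.getUint16(BPB.DIRENTS, true),
        mediaID: view.getUint8(BPB.MEDIA),
        sectorsPerFAT: view.getUint16(BPB.FATSECS, true),
        // LARGESECS is only meaningful for DOS 3.31 and up (see the NOTE above)
        totalSectors: diskSecs || view.getUint32(BPB.LARGESECS, true)
    };
}

// Example: a synthetic boot sector with a few fields filled in by hand.
const sector = new Uint8Array(512);
const dv = new DataView(sector.buffer);
sector.set([..."IBM  2.0"].map((c) => c.charCodeAt(0)), BPB.OEM);
dv.setUint16(BPB.SECBYTES, 512, true);
sector[BPB.CLUSSECS] = 8;
dv.setUint16(BPB.RESSECS, 1, true);
sector[BPB.FATS] = 2;
dv.setUint16(BPB.DIRENTS, 512, true);
dv.setUint16(BPB.DISKSECS, 20740, true);
sector[BPB.MEDIA] = 0xF8;
dv.setUint16(BPB.FATSECS, 8, true);
console.log(readBPB(sector));
```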
My predefined BPBs included all the standard PC floppy disk formats (160K, 180K, 320K, 360K, 720K, 1200K, and 1440K) as well as two hard disk formats (10Mb and 20Mb) – all formats that also used 12-bit FATs.
However, I wanted the pc.js utility to be able to accommodate a much wider range of hard disk formats, along with any version of DOS that supported those formats. This, in turn, meant adding some features I hadn’t needed until now: support for 16-bit FATs (added in DOS 3.00), and support for disks with more than 64K total sectors (added in DOS 3.31).
Neither of those features was difficult to add. Difficulties only arose when actually trying to boot FAT volumes built with custom settings. My troubles started with PC DOS 2.00 – the very first version of DOS to support hard disks – because even though it faithfully places a BPB on every hard disk it formats, it doesn’t actually pay attention to many of the values stored in the BPB.
In PC DOS 2.00, the code that determines a disk’s cluster size and root directory size begins below, where it first scans the MBR for FAT12 partition type 01h. I’ve disassembled and commented the code to make it a bit more readable:
&0070:093B BBC201 MOV BX,01C2 ; BX -> partition record + 4 (it actually starts at 1BEh)
&0070:093E 26803F01 CMP ES:[BX],01 ; is partition type 01h?
&0070:0942 740B JZ 094F ; if so, jump
&0070:0944 83C310 ADD BX,0010 ; otherwise, BX -> next partition record
&0070:0947 81FB0202 CMP BX,0202 ; is BX now beyond the end of the partition table?
&0070:094B 75F1 JNZ 093E ; jump if not
&0070:094D F9 STC ; otherwise, error (partition with type 01h is missing)
&0070:094E C3 RET ;
&0070:094F 268B4704 MOV AX,ES:[BX+04] ; AX = LBA of first partition sector (assumes a 16-bit value)
&0070:0953 894511 MOV [DI+11],AX ; (stash LBA)
&0070:0956 268B4708 MOV AX,ES:[BX+08] ; AX = total sectors in partition (assumes a 16-bit value)
&0070:095A 3D4000 CMP AX,0040 ; is total less than 64 sectors?
&0070:095D 72EE JC 094D ; if so, jump (error)
&0070:095F 894508 MOV [DI+08],AX ; (stash total sectors)
&0070:0962 B90001 MOV CX,0100 ; CH = default sectors/cluster, CL = clusters-to-sectors shift count
&0070:0965 BA4000 MOV DX,0040 ; DX = default number of root directory entries
&0070:0968 3D0002 CMP AX,0200 ; is total sectors <= 512?
&0070:096B 7629 JBE 0996 ; yes (that's only a 256Kb disk)
&0070:096D 02ED ADD CH,CH ; otherwise, default sectors/cluster is now 2 (1K cluster size)
&0070:096F FEC1 INC CL ; shift count to match
&0070:0971 BA7000 MOV DX,0070 ; DX = 70h (112 root directory entries)
&0070:0974 3D0008 CMP AX,0800 ; is total sectors <= 2048?
&0070:0977 761D JBE 0996 ; yes (that's only a 1Mb disk)
&0070:0979 02ED ADD CH,CH ; otherwise, default sectors/cluster is now 4 (2K cluster size)
&0070:097B FEC1 INC CL ; shift count to match
&0070:097D BA0001 MOV DX,0100 ; DX = 100h (256 root directory entries)
&0070:0980 3D0020 CMP AX,2000 ; is total sectors <= 8192?
&0070:0983 7611 JBE 0996 ; yes (that's only a 4Mb disk)
&0070:0985 02ED ADD CH,CH ; otherwise, default sectors/cluster is now 8 (4K cluster size)
&0070:0987 FEC1 INC CL ; shift count to match
&0070:0989 03D2 ADD DX,DX ; DX = 200h (512 root directory entries)
&0070:098B 3DA87F CMP AX,7FA8 ; is total sectors <= 32680?
&0070:098E 7606 JBE 0996 ; yes (that's only a 16Mb disk)
&0070:0990 02ED ADD CH,CH ; otherwise, default sectors/cluster is now 16 (8K cluster size)
&0070:0992 FEC1 INC CL ; shift count to match
&0070:0994 03D2 ADD DX,DX ; DX = 400h (1024 root directory entries)
&0070:0996 895506 MOV [DI+06],DX ; (stash defaults in internal structure)
&0070:0999 886D02 MOV [DI+02],CH
So, based solely on total sectors, DOS 2.00 has decided what the cluster size and root directory should be – which means that no matter what values your FAT volume actually uses and has actually stored in the BPB, DOS 2.00 will crash if those values don’t match its defaults. There will be no error message, and unless you have a debugger running, the machine will appear hung.
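For anyone who would rather not read x86 assembly, the size checks in the listing above can be paraphrased in a few lines of JavaScript. This is only my reading of the disassembly, and dos200Defaults is a hypothetical name:

```javascript
// Paraphrase of the PC DOS 2.00 checks at 0070:095A through 0070:0994.
// Returns null where the original code sets carry (fewer than 64 sectors).
function dos200Defaults(totalSectors) {
    if (totalSectors < 0x40) return null;
    let sectorsPerCluster = 1;      // CH
    let shift = 0;                  // CL (clusters-to-sectors shift count)
    let rootEntries = 0x40;         // DX
    // Each row: [sector limit tested with JBE, root entries if limit exceeded]
    const steps = [[0x200, 0x70], [0x800, 0x100], [0x2000, 0x200], [0x7FA8, 0x400]];
    for (const [limit, entries] of steps) {
        if (totalSectors <= limit) break;
        sectorsPerCluster *= 2;     // ADD CH,CH
        shift++;                    // INC CL
        rootEntries = entries;
    }
    return { sectorsPerCluster, shift, rootEntries };
}

// A partition of roughly 11000 sectors (about 5.4Mb)
console.log(dos200Defaults(11000));
```

For that example, the result is 8 sectors per cluster (4K clusters) and 512 root directory entries, regardless of what the BPB on the disk says.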
DOS 2.00 then goes on to calculate how many “sectors per FAT” there should be. It does this by dividing “total sectors” by “sectors per cluster” (which it does by shifting right, since “sectors per cluster” is always a power of 2). That provides an upper limit for total clusters, which it then rounds up to the nearest even number, multiplies by 1.5 (to yield total cluster bytes), and then divides by 512:
&0070:099C 33DB XOR BX,BX
&0070:099E 8ADD MOV BL,CH
&0070:09A0 4B DEC BX
&0070:09A1 03D8 ADD BX,AX
&0070:09A3 D3EB SHR BX,CL
&0070:09A5 43 INC BX
&0070:09A6 80E3FE AND BL,FE
&0070:09A9 8BF3 MOV SI,BX
&0070:09AB D1EB SHR BX,1
&0070:09AD 03DE ADD BX,SI
&0070:09AF 81C3FF01 ADD BX,01FF
&0070:09B3 D0EF SHR BH,1
&0070:09B5 887D0B MOV [DI+0B],BH
This is also one of the many places where DOS assumes 512-byte sectors, ignoring the “bytes per sector” field in the BPB.
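The “sectors per FAT” listing above translates almost line for line into JavaScript. The sketch below is my own paraphrase (dos200SectorsPerFAT is a hypothetical name), with 16-bit masking added to mimic the register width:

```javascript
// Paraphrase of 0070:099C through 0070:09B5: estimate sectors per FAT for a
// 12-bit FAT from total sectors alone, assuming 512-byte sectors.
function dos200SectorsPerFAT(totalSectors, sectorsPerCluster, shift) {
    let bx = (totalSectors + sectorsPerCluster - 1) & 0xFFFF; // DEC BX; ADD BX,AX
    bx >>= shift;                       // SHR BX,CL: upper limit for total clusters
    bx = (bx + 1) & 0xFFFE;             // INC BX; AND BL,FE: round up to even
    bx = (bx + (bx >> 1)) & 0xFFFF;     // multiply by 1.5: bytes of FAT entries
    bx = (bx + 0x1FF) & 0xFFFF;         // round up to a whole sector
    return bx >> 9;                     // SHR BH,1: divide by 512, result in BH
}

// 162:4:17 geometry with 8 sectors per cluster (shift count 3). I'm using the
// raw CHS sector count as a stand-in for the partition's total sectors.
console.log(dos200SectorsPerFAT(162 * 4 * 17, 8, 3));
```

With those inputs the function returns 5, which matches the number of sectors per FAT that PC DOS 2.00 insists on for that drive.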
Since the above calculation of “sectors per FAT” doesn’t take into account any other sector usage (eg, reserved sectors, root directory sectors, not to mention the FAT sectors themselves, of which there are invariably 2 copies), there are going to be situations where DOS allocates more sectors per FAT than the number of available clusters actually requires. And when presented with such a situation, if we don’t over-allocate our FAT as well, DOS 2.00 will again crash – because, in addition to ignoring “bytes per sector”, “sectors per cluster”, and “root directory entries” in the BPB, DOS 2.00 also ignores “sectors per FAT”, relying instead on its own calculation.
To confirm this, I tested a drive configuration (162 cylinders, 4 heads, and 17 sectors/track) which PC DOS 2.00 would format as a 12-bit FAT with 4K clusters, 32 root directory sectors (for a total of 512 entries), and 5 sectors per FAT – yielding a disk with 1361 available clusters. Except that 1361 clusters only requires a 2045-byte FAT, which fits perfectly in 4 sectors, with 3 bytes to spare. However, when I built such a disk with only 4 sectors per FAT, DOS failed to boot.
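The arithmetic behind that 2045-byte figure is easy to check. A minimal sketch, assuming the usual FAT12 layout in which each entry takes 12 bits and the first two entries (cluster numbers 0 and 1) are reserved; fat12SectorsNeeded is a hypothetical helper:

```javascript
// Smallest 12-bit FAT that can describe a given number of data clusters.
function fat12SectorsNeeded(clusters, bytesPerSector = 512) {
    const entries = clusters + 2;               // data clusters are numbered from 2
    const bytes = Math.ceil(entries * 3 / 2);   // 1.5 bytes per 12-bit entry
    const sectors = Math.ceil(bytes / bytesPerSector);
    return { bytes, sectors, spare: sectors * bytesPerSector - bytes };
}

console.log(fat12SectorsNeeded(1361));
```

For 1361 clusters this gives 2045 bytes, 4 sectors, and 3 spare bytes.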
Here’s how I built it using pc.js:
% pc.js --sys=pcdos --ver=2.00 --drivetype=162:4:17 --trim --save=test.img
The undocumented --trim flag tells pc.js to bypass its normal DOS-compatibility rules and build/format the disk with an “optimized” 12-bit FAT, and the --save option saves the disk image without starting a machine. And although DOS 2.00 didn’t like it, test.img was an otherwise perfectly valid and usable disk image, and fsck_msdos on macOS reported no problems.
It’s understandable that DOS 2.00 would be skeptical of its own BPBs, in part because BPBs were a new feature that probably evolved during the development of DOS 2.00, so they would have been dealing with disks with no BPBs, out-dated BPBs, or even invalid BPBs. However, perhaps the biggest problem was FDISK, because whenever FDISK created a DOS partition, it would simply update the partition table in the Master Boot Record and then reboot, leaving the partition’s boot sector and any BPB it previously contained in place. And that old BPB might be completely inappropriate.
On the other hand, I think DOS 2.00 could have made an effort to validate the BPB, ensuring that its drive parameters matched those in the Master Boot Record before relying on it. But that didn’t happen. To avoid a potential mess, DOS took the easy way out and simply used a set of FAT parameters based entirely on disk size.
Before diving into more examples of how picky DOS was about mounting FAT disks, I want to take a step back for a minute and talk about a document that Microsoft first released in the late 1990s called FAT: General Overview of On-Disk Format. The version here (1.03) is from 2000, but as far as FAT12 and FAT16 support is concerned, there are no substantive differences between it and the 1999 (1.02) version. Of course, in typical fashion, Microsoft no longer makes any version of the document available.
Anyway, while working on pc.js, I referred extensively to the FAT12 and FAT16 sections of that document. It reinforced some very important rules, such as:
This is the one and only way that FAT type is determined. There is no such thing as a FAT12 volume that has more than 4084 clusters. There is no such thing as a FAT16 volume that has less than 4085 clusters or more than 65,524 clusters. There is no such thing as a FAT32 volume that has less than 65,525 clusters. If you try to make a FAT volume that violates this rule, Microsoft operating systems will not handle them correctly because they will think the volume has a different type of FAT than what you think it does.
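The rule quoted above boils down to two comparisons. Here is a minimal sketch (fatType is a hypothetical name), using the thresholds exactly as the document states them:

```javascript
// FAT type is determined by the count of data clusters and nothing else.
// Note the comparisons are "<", not "<=".
function fatType(countOfClusters) {
    if (countOfClusters < 4085) return 12;
    if (countOfClusters < 65525) return 16;
    return 32;
}

console.log(fatType(4084), fatType(4085), fatType(65524), fatType(65525));
```

So a volume with 2586 clusters is FAT12, and one with 5164 clusters is FAT16, no matter what the partition type in the MBR claims.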
This document was as close to a “gold standard” for the FAT file system as the world ever got – at least from Microsoft. Even so, it does gloss over some important details.
One such detail is when and how a volume should be formatted with FAT12 or FAT16. Actually, let’s break that question into smaller questions:
Regarding question #1, the document does say this:
There is no dynamic computation for FAT12. For the FAT12 formats, all the computation for BPB_SecPerClus and BPB_FATSz16 was worked out by hand on a piece of paper and recorded in the table (being careful of course that the resultant cluster count was always less than 4085). If your media is larger than 4 MB, do not bother with FAT12. Use smaller BPB_SecPerClus values so that the volume will be FAT16.
So that kind of answers the question: maybe use FAT12 if the volume is 4Mb or smaller.
With respect to question #2, there’s no sign of “the table” that DOS used to choose cluster size for FAT12. In DOS 2.00, as we saw above, there was only a series of disk size comparisons, which can be summarized in a table:
Disk Size Cluster Size Root Directory Size
--------- ------------ -------------------
<= 256Kb 1 sector 64 root entries
<= 1Mb 2 sectors 112 root entries
<= 4Mb 4 sectors 256 root entries
<= 16Mb 8 sectors 512 root entries
> 16Mb 16 sectors 1024 root entries
And starting with DOS 3.00, since FAT16 was now preferred, “the table” for FAT12 was reduced to a single entry that always selected 4K clusters and was used only if the disk contained 32680 (0x7FA8) sectors or less (ie, 16340Kb or approximately 16Mb).
As for FAT16 cluster sizes, the document does provide a table:
Disk Size Cluster Size Root Directory Size
---------- ------------ -------------------
<= 4200Kb N/A Use FAT12
<= 16340Kb 2 sectors 512 root entries
<= 128Mb 4 sectors 512 root entries
<= 256Mb 8 sectors 512 root entries
<= 512Mb 16 sectors 512 root entries
<= 1024Mb 32 sectors 512 root entries
<= 2048Mb 64 sectors 512 root entries
And except for the fact that DOS always used FAT12 in the second case as well, that table more or less matches the criteria that DOS eventually used, once disks with those sizes became available.
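The FAT16 table above is simple enough to express as a lookup. This is only a sketch of the table as I summarized it, not code from the document or from DOS, and fat16SectorsPerCluster is a hypothetical name:

```javascript
// Each row: [maximum disk size in Kb, sectors per cluster].
// A value of 0 means "use FAT12 instead", per the first row of the table.
const FAT16_TABLE = [
    [4200, 0],
    [16340, 2],
    [128 * 1024, 4],
    [256 * 1024, 8],
    [512 * 1024, 16],
    [1024 * 1024, 32],
    [2048 * 1024, 64]
];

function fat16SectorsPerCluster(diskKb) {
    for (const [limitKb, sectorsPerCluster] of FAT16_TABLE) {
        if (diskKb <= limitKb) return sectorsPerCluster;
    }
    return null;    // larger than anything in the table
}

// A 20Mb disk: 4 sectors per cluster, ie, 2K clusters
console.log(fat16SectorsPerCluster(20 * 1024));
```

Keep in mind the caveat above: for disks in the second row (up to 16340Kb), DOS itself stuck with FAT12 and 4K clusters rather than using the 2-sector clusters that the table suggests.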
Note that when DOS 3.00 introduced FAT16 in 1984, the PC BIOS could not support drives with more than 1024 cylinders, 16 heads, and 63 sectors per track, imposing an upper limit of 504Mb. On top of that, DOS versions 2.00 through 3.30 couldn’t support partitions with more than 64K sectors, which imposed a volume limit of 32Mb. This is why DOS 3.00 didn’t perform any cluster size calculations: if the disk was big enough to warrant FAT16, then it simply always used 2K clusters. However, as demand for larger disks grew, techniques like CHS translation (that allowed a maximum of 256 logical heads without requiring any changes to the PC BIOS interface) provided support for much larger disks (up to 8192Mb or 8Gb), and then when DOS 3.31 added support for volumes with more than 64K sectors, later versions of DOS had to make more judicious decisions about cluster size.
Question #3 is never really discussed. You might assume the rule is simply that if the partition type in the MBR is 01h, the partition is FAT12, and if it’s 04h (or 06h for disks with more than 64K sectors), the partition is FAT16. But it’s not that simple, and sadly, no one ever thought to add anything to the BPB to clearly indicate the type of FAT – a rather vital piece of information. I’ve learned the hard way that a 10Mb disk formatted with FAT16 and more than 4084 clusters and clearly marked with partition type 04h will still be treated by DOS 3.x as a FAT12 volume by default. More on that later.
Finally, even though Microsoft “sanitized” the document (by which I mean sprinkled a liberal dose of “legalese” and then completely removed any hint of who actually wrote or contributed to it), it’s pretty clear to me who wrote it: Aaron Reynolds.
I’m sure there are still living current or former Microsoft employees who could easily confirm that Aaron wrote the document, and maybe one will pop out of the woodwork and do just that, but sadly, Aaron isn’t one of them. He passed away in 2008.
Aaron had a long and legendary career at Microsoft, working on DOS 1.1 and subsequent versions, as well as many versions of Windows. His name appears several times in “The MS-DOS Encyclopedia”, which even includes a hand-drawn diagram by him:
My path occasionally crossed Aaron’s while we were both working on Windows 95, and I even share credit with him on the “Common Name Space For Long And Short Filenames” Patent (No. 5,579,517) – although to be honest, my contribution to that “invention” was pretty minimal. All I recall are hallway conversations about the viability of combinations of attribute bits, including the volume label bit, to hide long filename entries from previous versions of DOS.
Aaron was very smart, opinionated, and intense. One way that intensity came out was a tendency to really “hammer” on certain points, probably because he was tired of seeing other people repeatedly making common mistakes. And there are passages in the FAT: General Overview of On-Disk Format document that strike me as “vintage Aaron Reynolds”.
For example:
There is considerable confusion over exactly how [FAT Type Determination] works, which leads to many “off by 1”, “off by 2”, “off by 10”, and “massively off” errors. It is really quite simple how this works. The FAT type–one of FAT12, FAT16, or FAT32–is determined by the count of clusters on the volume and nothing else.
Please read everything in this section carefully, all of the words are important. For example, note that the statement was “count of clusters.” This is not the same thing as “maximum valid cluster number,” because the first data cluster is 2 and not 0 or 1.
and this:
Now we can determine the FAT type. Please note carefully or you will commit an off-by-one error! In the following example, when it says <, it does not mean <=. Note also that the numbers are correct. The first number for FAT12 is 4085; the second number for FAT16 is 65525. These numbers and the ‘<’ signs are not wrong.
There’s even one paragraph that simply says:
Please don’t draw an incorrect conclusion here.
I can’t read any of those passages without smiling.
There isn’t much about Aaron online, although I did find a nice tribute to him from Charles Wright Academy. I’ll end my own tiny tribute with a short clip of Aaron from July 14, 1995, as he was looking forward to the imminent release of Windows 95 and talking about “harsh email” (I don’t know who taped this, but I’m hopeful they won’t mind it being shared here).
I’ve already beaten PC DOS 2.00 to death, so let’s move on to PC DOS 3.00. I’d mentioned this earlier:
A 10Mb disk formatted with FAT16 (ie, with more than 4084 clusters and clearly marked with partition type 04h) will still be treated by DOS 3.x as a FAT12 volume by default.
To test this, I ran pc.js in a directory with a small number of files, requesting a drive with PC DOS 3.00 and a 16-bit FAT:
% pc.js ibm5170 --sys=pcdos --ver=3.00 --fat=16
warning: 16-bit FAT replaced with 12-bit FAT
[Press CTRL-D to enter command mode]
C:\>load info
AT drive type 1, CHS 306:4:17, 10.2Mb
12-bit FAT, 4096-byte clusters, 2586 clusters
8 FAT sectors (x2), 32 root sectors (512 entries)
20740 total sectors, 20691 data sectors, 10592256 data bytes
You can see we successfully booted to a C:\> prompt, but load info told us that the disk was built as FAT12 instead of FAT16. This was because pc.js tries to stick to historical defaults, and 10Mb disks were “historically” formatted as FAT12.
You may recall that the FAT: General Overview of On-Disk Format said:
If your media is larger than 4 MB, do not bother with FAT12.
Except that PC DOS 2.00 did bother with FAT12 on a 10Mb disk – because, well, FAT12 was all it could do. But even when PC DOS 3.00 introduced FAT16, it would still format a 10Mb disk as FAT12. It had to use 4K clusters in order to keep total clusters under 4085, but it preferred FAT12 over 2K clusters – future recommendations notwithstanding.
Anyway, we can force pc.js to build a FAT16 disk. We just have to also specify a cluster size (2K) that will produce too many clusters for FAT12 to handle, forcing the use of FAT16:
% pc.js ibm5170 --sys=pcdos --ver=3.00 --fat=16:2048
[Press CTRL-D to enter command mode]
stopped (32233718 cycles, 4041 ms, 7976669 hz)
AX=0000 BX=FFFF CX=0342 DX=4F03 SP=00C2 BP=0004 SI=FFFF DI=1F64
SS=9E98 DS=0070 ES=9C72 PS=0246 V0 D0 I1 T0 S0 Z1 A0 P1 C0
&017D:4159 0000 ADD [BX+SI],AL
[Type help for list of commands, CTRL-C to terminate]
>> load info
AT drive type 1, CHS 306:4:17, 10.2Mb
16-bit FAT, 2048-byte clusters, 5164 clusters
21 FAT sectors (x2), 32 root sectors (512 entries)
20740 total sectors, 20665 data sectors, 10575872 data bytes
And the machine “crashes” (well, it executes a suspicious instruction at 17D:4159, so the PCjs debugger stops it).
It turns out this happened because I put “IBM 3.0” in the BPB, which seemed logical, since “IBM 2.0” would imply that PC DOS 2.x formatted the disk, but since the disk was using FAT16, it couldn’t have. And yet, ironically, if I put the “old” OEM signature in the BPB instead, PC DOS 3.00 honors the BPB values, and then later notices that “total clusters” is greater than 4085, so it marks the volume as FAT16 after all, and life is good.
Here’s the code where PC DOS 3.00 inspects the MBR and then the BPB in the boot sector. It appears that the code originally intended to honor a BPB with an “IBM 3.0” signature, but a mistake in the code made that impossible. Again, I’ve sprinkled my own comments throughout the disassembled code to help make it more readable:
&0070:1438 BBC203 MOV BX,03C2 ; BX -> partition table entry + 4
&0070:143B 26803F01 CMP ES:[BX],01 ; type 1? (FAT12)
&0070:143F 7411 JZ 1452 ; yes
&0070:1441 26803F04 CMP ES:[BX],04 ; type 4? (FAT16)
&0070:1445 740B JZ 1452 ; yes
&0070:1447 83C310 ADD BX,0010 ; BX -> next entry
&0070:144A 81FB0204 CMP BX,0402 ; more entries?
&0070:144E 75EB JNZ 143B ; yes
&0070:1450 F9 STC ; no, and we never found an entry, so return error (carry set)
&0070:1451 C3 RET
&0070:1452 52 PUSH DX ; save drive (DL)
&0070:1453 268B4704 MOV AX,ES:[BX+04] ;
&0070:1457 268B5706 MOV DX,ES:[BX+06] ; DX:AX = starting LBA of partition
&0070:145B 26034708 ADD AX,ES:[BX+08] ;
&0070:145F 2613570A ADC DX,ES:[BX+0A] ; DX:AX = ending LBA of partition + 1
&0070:1463 7405 JZ 146A ; are the top 16 bits of the sum zero?
&0070:1465 800ECC1080 OR [10CC],80 ; no, so set 80h in "drive byte" (possible error bit)
&0070:146A 5A POP DX ; recover drive (DL)
&0070:146B 268B4704 MOV AX,ES:[BX+04] ; AX = starting LBA (which we presume to be only 16 bits)
&0070:146F 894511 MOV [DI+11],AX ; stash it
&0070:1472 268B4708 MOV AX,ES:[BX+08] ; AX = total sectors in partition
&0070:1476 3D4000 CMP AX,0040 ; is total less than 64 sectors?
&0070:1479 72D5 JC 1450 ; if so, jump (error)
&0070:147B 894508 MOV [DI+08],AX ; (stash total sectors)
;
; Change from DOS 2.00: Read the partition's boot sector so we can access the BPB.
; What follows is a bunch of instructions that should have been a subroutine (ie, to
; convert the partition's starting LBA to C:H:S values).
;
; Also, it's a bit unfortunate that the volume's boot sector is already sitting in memory.
; Of course, that's only true for the first disk drive, but in the case of the first drive,
; this code is a complete waste of time.
;
&0070:147E 50 PUSH AX
&0070:147F 52 PUSH DX
&0070:1480 8B4511 MOV AX,[DI+11] ; AX = starting LBA again
&0070:1483 33D2 XOR DX,DX ;
&0070:1485 8AFE MOV BH,DH
&0070:1487 8A5D0D MOV BL,[DI+0D] ; BX = sectors per track
&0070:148A F7F3 DIV BX ; divide DX:AX by sectors per track
&0070:148C 8ACA MOV CL,DL
&0070:148E FEC1 INC CL
&0070:1490 99 CWD
&0070:1491 8A5D0F MOV BL,[DI+0F] ; BX = number of heads
&0070:1494 F7F3 DIV BX
&0070:1496 D0CC ROR AH,1
&0070:1498 D0CC ROR AH,1
&0070:149A 80E4C0 AND AH,C0
&0070:149D 0ACC OR CL,AH
&0070:149F 8AE8 MOV CH,AL
&0070:14A1 58 POP AX
&0070:14A2 8AF2 MOV DH,DL
&0070:14A4 8AD0 MOV DL,AL
&0070:14A6 33DB XOR BX,BX
&0070:14A8 B80102 MOV AX,0201
&0070:14AB CD13 INT 13
&0070:14AD 58 POP AX
;
; At this point, the partition's boot sector should be at ES:0 (since BX was zero);
; these instructions could have been a bit smaller if they had used ES:BX addressing.
;
&0070:14AE 26813E03004942 CMP ES:[0003],4249 ; does the OEM signature start with "IB"?
&0070:14B5 751C JNZ 14D3 ; no
&0070:14B7 26813E05004D20 CMP ES:[0005],204D ; does it continue with "M "?
&0070:14BE 7513 JNZ 14D3 ; no
&0070:14C0 26813E0800322E CMP ES:[0008],2E32 ; does it continue with "2."?
&0070:14C7 750A JNZ 14D3 ; no
&0070:14C9 26803E0A0030 CMP ES:[000A],30 ; is the "2." followed by "0"?
&0070:14CF 7505 JNZ 14D6 ; no
&0070:14D1 EB14 JMP 14E7 ; yes
&0070:14D3 EB4C JMP 1521
&0070:14D5 90 NOP
;
; This code is prepared to deal with a signature of "IBM 3.0" and jump to 14E7
; just as previous code did for "IBM 2.0", but technically, that will never happen,
; because previous code already gave up when the signature didn't contain "2.".
;
&0070:14D6 26813E0800332E CMP ES:[0008],2E33
&0070:14DD 75F4 JNZ 14D3
&0070:14DF 26803E0A0030 CMP ES:[000A],30
&0070:14E5 75EC JNZ 14D3
;
; This code is executed ONLY if the OEM signature contained "IBM 2.0"
; (well, technically, the previous code didn't care what came before the "2";
; normally it's a space but it could be anything).
;
; Anyway, this is the only code that actually honors the BPB values.
;
&0070:14E7 26A11300 MOV AX,ES:[0013] ; AX = total sectors
&0070:14EB 48 DEC AX ; subtract 1 (assumes reserved sectors == 1?)
&0070:14EC 268B161600 MOV DX,ES:[0016] ; DX = sectors per FAT
&0070:14F1 89550B MOV [DI+0B],DX ; (stash it)
&0070:14F4 D1E2 SHL DX,1 ; double it (assumes number of FATs == 2?)
&0070:14F6 2BC2 SUB AX,DX ; subtract from total sectors
&0070:14F8 268B161100 MOV DX,ES:[0011] ; DX = number of directory entries
&0070:14FD 895506 MOV [DI+06],DX ; (stash it)
&0070:1500 B104 MOV CL,04 ; CL = shift count
&0070:1502 D3EA SHR DX,CL ; DX /= 16
&0070:1504 2BC2 SUB AX,DX ; subtract from total sectors
&0070:1506 268A0E0D00 MOV CL,ES:[000D] ; CL = sectors per cluster
&0070:150B 884D02 MOV [DI+02],CL ; (stash it)
&0070:150E 33D2 XOR DX,DX ; DX:AX = remaining (data) sectors
&0070:1510 8AEE MOV CH,DH ; CX = sectors per cluster
&0070:1512 F7F1 DIV CX ; AX = number of whole clusters
&0070:1514 3DF60F CMP AX,0FF6 ; is number of clusters < 4086? (technically, that should be 4085)
&0070:1517 7205 JC 151E ; yes, so it's a FAT12 volume
&0070:1519 800ECC1040 OR [10CC],40 ; no, so mark it as FAT16
&0070:151E EB43 JMP 1563 ; wrap up
&0070:1520 90 NOP
;
; We arrive here if the OEM signature was ANYTHING other than "IBM 2.0".
;
; SI will point to a table of 8-byte entries. There are only two sets of entries.
;
; SI+0: total sectors threshold (1 word) 7FA8 FFFF
; SI+2: sectors per cluster shift count (1 byte) 03 02
; SI+3: sectors per cluster byte count (1 byte) 08 04
; SI+4: root directory entries (1 word) 0200 0200
; SI+6: flags (eg, 0040h implies FAT16) 0000 0040
;
&0070:1521 BEF910 MOV SI,10F9 ; SI -> threshold table
&0070:1524 3B04 CMP AX,[SI] ; AX <= sector threshold?
&0070:1526 7605 JBE 152D ; yes
&0070:1528 83C608 ADD SI,0008 ; no, advance to next entry
&0070:152B EBF7 JMP 1524 ; try again
&0070:152D 8A4C06 MOV CL,[SI+06] ; load flags
&0070:1530 080ECC10 OR [10CC],CL ; save in "drive flags"
&0070:1534 8B4C02 MOV CX,[SI+02] ; CX = sectors per cluster info
&0070:1537 8B5404 MOV DX,[SI+04] ; DX = root directory entries
&0070:153A 895506 MOV [DI+06],DX ; (stash DX: root directory entries)
&0070:153D 886D02 MOV [DI+02],CH ; (stash CH: sectors per cluster byte count)
&0070:1540 F606CC1040 TEST [10CC],40 ; FAT16?
&0070:1545 7525 JNZ 156C ; yes
;
; FAT12 "sectors per FAT" calculation is performed identically to DOS 2.00
; (see the code at 70:099C from DOS 2.00, above)
;
&0070:1547 33DB XOR BX,BX
&0070:1549 8ADD MOV BL,CH
&0070:154B 4B DEC BX
&0070:154C 03D8 ADD BX,AX
&0070:154E D3EB SHR BX,CL
&0070:1550 43 INC BX
&0070:1551 80E3FE AND BL,FE
&0070:1554 8BF3 MOV SI,BX
&0070:1556 D1EB SHR BX,1
&0070:1558 03DE ADD BX,SI
&0070:155A 81C3FF01 ADD BX,01FF
&0070:155E D0EF SHR BH,1
&0070:1560 887D0B MOV [DI+0B],BH
&0070:1563 8A1ECC10 MOV BL,[10CC]
&0070:1567 885D13 MOV [DI+13],BL
&0070:156A F8 CLC
&0070:156B C3 RET
;
; FAT16 calculations (with some duplication of the logic at 70:1500):
;
&0070:156C B104 MOV CL,04 ; CL = shift count
&0070:156E D3EA SHR DX,CL ; DX /= 16
&0070:1570 2BC2 SUB AX,DX ; subtract from total sectors
&0070:1572 48 DEC AX ; subtract 1 (assumes reserved sectors == 1?)
&0070:1573 B302 MOV BL,02
&0070:1575 8A7D02 MOV BH,[DI+02] ; BX = sectors per cluster * 256 + 2
&0070:1578 33D2 XOR DX,DX
&0070:157A 03C3 ADD AX,BX
&0070:157C 83D200 ADC DX,0000 ; DX:AX = sector count + BX
&0070:157F 2D0100 SUB AX,0001
&0070:1582 83DA00 SBB DX,0000 ; DX:AX = sector count + BX - 1
&0070:1585 F7F3 DIV BX ; AX = (sector count + BX - 1) / BX
&0070:1587 89450B MOV [DI+0B],AX
&0070:158A EBD7 JMP 1563
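For anyone who would rather not trace registers, the inline LBA-to-C:H:S conversion at 70:1480 can be sketched in JavaScript roughly as follows. This is only an illustration of the arithmetic: the function names `lbaToCHS` and `packInt13`, and the sample geometry (17 sectors per track, 4 heads, drive 80h), are assumptions made for this example and are not taken from DOS or from pc.js.

```javascript
// Hypothetical helper: convert a zero-based LBA into cylinder/head/sector.
// Sector numbers are one-based, as INT 13h expects.
function lbaToCHS(lba, sectorsPerTrack, heads) {
    const sector = (lba % sectorsPerTrack) + 1;       // first DIV BX: remainder in DL, then INC CL
    const track = Math.floor(lba / sectorsPerTrack);  // first DIV BX: quotient in AX
    const head = track % heads;                       // second DIV BX: remainder in DL
    const cylinder = Math.floor(track / heads);       // second DIV BX: quotient in AX
    return { cylinder, head, sector };
}

// Hypothetical helper: pack C:H:S plus a drive number into CX and DX
// the way the listing does before calling INT 13h with AX=0201h.
function packInt13(chs, drive) {
    // ROR AH,1 / ROR AH,1 / AND AH,C0 moves cylinder bits 8-9 into bits 6-7 of CL
    const cl = (chs.sector & 0x3F) | ((chs.cylinder >> 2) & 0xC0);
    const ch = chs.cylinder & 0xFF;                   // MOV CH,AL (low 8 bits of the cylinder)
    return { cx: (ch << 8) | cl, dx: ((chs.head & 0xFF) << 8) | (drive & 0xFF) };
}

// Sample values only (an assumed 17-sector, 4-head drive)
const chs = lbaToCHS(20438, 17, 4);
console.log(chs);                                     // { cylinder: 300, head: 2, sector: 5 }
const regs = packInt13(chs, 0x80);
console.log(regs.cx.toString(16), regs.dx.toString(16));  // 2c45 280
```

Note that the listing uses CWD to clear DX before the second DIV, which only works if the first quotient is below 8000h; the sketch above has no such limit because JavaScript numbers are not 16-bit registers.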
PC DOS 3.00 introduces the same FAT16 “sectors per FAT” calculation that is described in FAT: General Overview of On-Disk Format:
RootDirSectors = ((BPB_RootEntCnt * 32) + (BPB_BytsPerSec - 1)) / BPB_BytsPerSec;
TmpVal1 = DskSize - (BPB_ResvdSecCnt + RootDirSectors);
TmpVal2 = (256 * BPB_SecPerClus) + BPB_NumFATs;
FATSz = (TmpVal1 + (TmpVal2 - 1)) / TmpVal2;
The document goes on to say:
Do not spend too much time trying to figure out why this math works. The basis for the computation is complicated; the important point is that this is how Microsoft operating systems do it, and it works. Note, however, that this math does not work perfectly. It will occasionally set a FATSz that is up to 2 sectors too large for FAT16, and occasionally up to 8 sectors too large for FAT32. It will never compute a FATSz value that is too small, however. Because it is OK to have a FATSz that is too large, at the expense of wasting a few sectors, the fact that this computation is surprisingly simple more than makes up for it being off in a safe way in some cases.
However, that formula is actually very similar to how DOS 2.00 calculated sectors per FAT for FAT12 volumes. The similarity is just obscured by how the formula assumes a sector size of 512 to arrive at the 256 multiplier, whereas it did not assume a sector size of 512 at the beginning of the formula (it used BPB_BytsPerSec instead).
Ultimately, “sectors per FAT” has to be derived from “total clusters” divided by “number of cluster entries that fit in one FAT sector” (noting that there are 256 16-bit entries in one 512-byte sector). The only thing that’s “complicated” about the formula is the rationale for adding BPB_NumFATs (which is invariably 2) to the TmpVal2 divisor; it was probably added as a way of compensating for the FAT sectors themselves, since they have not yet been subtracted from the sector total in TmpVal1.
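To make the comparison concrete, here is a minimal JavaScript sketch of the document's FAT16 formula next to a step-by-step transcription of the DOS 3.00 code at 70:156C. The function names `fatSizeSpec` and `fatSizeDOS300` and the sample partition size are assumptions made for this example. One way to read the divisor: each cluster costs BPB_SecPerClus data sectors plus one 2-byte entry in each FAT copy, which is BPB_NumFATs/256 of a 512-byte sector, so dividing the available sectors by (256 * BPB_SecPerClus + BPB_NumFATs) gives the number of 256-entry FAT sectors needed.

```javascript
// Sketch of the FAT16 formula from the Microsoft document, with integer division.
function fatSizeSpec({ totalSectors, reservedSectors, rootEntries,
                       sectorsPerCluster, numFATs, bytesPerSector = 512 }) {
    const rootDirSectors = Math.floor((rootEntries * 32 + (bytesPerSector - 1)) / bytesPerSector);
    const tmpVal1 = totalSectors - (reservedSectors + rootDirSectors);
    const tmpVal2 = 256 * sectorsPerCluster + numFATs;
    return Math.floor((tmpVal1 + (tmpVal2 - 1)) / tmpVal2);
}

// Transcription of the listing at 70:156C. On entry, AX = total sectors,
// DX = root directory entries, and [DI+02] = sectors per cluster.
// The code hard-wires 1 reserved sector, 2 FATs, and 512-byte sectors.
function fatSizeDOS300(totalSectors, rootEntries, sectorsPerCluster) {
    let ax = totalSectors - (rootEntries >> 4);   // SHR DX,CL (CL=4) / SUB AX,DX
    ax -= 1;                                      // DEC AX (one reserved sector)
    const bx = (sectorsPerCluster << 8) | 2;      // BH = sectors per cluster, BL = 2
    return Math.floor((ax + bx - 1) / bx);        // ADD/ADC, SUB/SBB, DIV BX
}

// Sample values only: a hypothetical 65000-sector partition with 4 sectors per cluster
const sample = { totalSectors: 65000, reservedSectors: 1, rootEntries: 512,
                 sectorsPerCluster: 4, numFATs: 2 };
console.log(fatSizeSpec(sample));                 // 64
console.log(fatSizeDOS300(65000, 512, 4));        // 64
```

The two agree whenever there is 1 reserved sector, 2 FATs, 512-byte sectors, and a root directory entry count that is a multiple of 16, which are exactly the assumptions baked into the DOS 3.00 code.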
One final observation I’ll make about the FAT: General Overview of On-Disk Format is that, since it was written many years after the earliest versions of DOS supporting FAT12 and FAT16 hard disks were released, it tended to gloss over details that were probably considered irrelevant at that point.
The OEM signature field in the BPB is a good example. The document really only had this to say about it:
There are many misconceptions about this field. It is only a name string. Microsoft operating systems don’t pay any attention to this field. Some FAT drivers do. This is the reason that the indicated string, “MSWIN4.1”, is the recommended setting, because it is the setting least likely to cause compatibility problems. If you want to put something else in here, that is your option, but the result may be that some FAT drivers might not recognize the volume. Typically this is some indication of what system formatted the volume.
Unfortunately, rather than dispelling misconceptions, the document actually added to them, because as we saw above, beginning with DOS 3.00 and – as far as I know – continuing in every version of DOS since then, the OEM signature played a role, sometimes a vital role, in determining the format of the disk and whether the BPB would be used.
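As a rough illustration of how much weight the signature carried in DOS 3.00, the checks at 70:14AE and the fallback table at 70:10F9 can be modeled in a few lines of JavaScript. The function names (`dos300HonorsBPB`, `dos300Fallback`, `makeBootSector`) and the object field names are hypothetical; only the offsets, constants, and table values come from the listing above.

```javascript
// Returns true if DOS 3.00, per the listing, would trust the BPB in this boot sector.
// It compares bytes 3-6 with "IBM ", bytes 8-9 with "2.", and byte 10 with "0";
// byte 7 is never examined.
function dos300HonorsBPB(bootSector) {
    const word = (off) => bootSector[off] | (bootSector[off + 1] << 8);
    return word(0x03) === 0x4249 &&    // "IB"
           word(0x05) === 0x204D &&    // "M "
           word(0x08) === 0x2E32 &&    // "2."
           bootSector[0x0A] === 0x30;  // "0"
}

// The two 8-byte table entries used when the signature is anything else.
const FALLBACK_TABLE = [
    { maxSectors: 0x7FA8, shift: 3, sectorsPerCluster: 8, rootEntries: 0x200, flags: 0x00 },
    { maxSectors: 0xFFFF, shift: 2, sectorsPerCluster: 4, rootEntries: 0x200, flags: 0x40 }  // 40h = FAT16
];

// Pick the first entry whose threshold is >= the 16-bit sector total (CMP AX,[SI] / JBE).
function dos300Fallback(totalSectors) {
    return FALLBACK_TABLE.find((entry) => totalSectors <= entry.maxSectors);
}

// Test helper: a blank 512-byte sector with an 8-character OEM signature at offset 3.
function makeBootSector(oem) {
    const sector = new Uint8Array(512);
    for (let i = 0; i < 8; i++) sector[3 + i] = oem.charCodeAt(i);
    return sector;
}

console.log(dos300HonorsBPB(makeBootSector("IBM  2.0")));  // true
console.log(dos300HonorsBPB(makeBootSector("IBM  3.0")));  // false (the "3." check is unreachable)
console.log(dos300HonorsBPB(makeBootSector("MSWIN4.1")));  // false
console.log(dos300Fallback(0x7FA9).flags === 0x40);        // true (treated as FAT16)
```

In other words, under this reading of the listing, a volume stamped with the recommended “MSWIN4.1” would have its BPB ignored by DOS 3.00 and its layout guessed from the partition size alone.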
pc.js, and the diskinfo.js module in particular, works hard to make sure the custom disk images it builds will work with the selected version of DOS. But I’m sure there are still combinations of drive geometries, FAT sizes, cluster sizes, DOS versions – and OEM signatures – that I haven’t tested and will fail to boot. Hopefully the volumes themselves, at least, will always be valid.
Fortunately, pc.js makes it easy to debug those situations. Just add --halt to the pc.js command line and you’ll be dropped into the PCjs debugger before the machine starts booting.
Have fun!
Jeff Parsons
Sep 5, 2023