Advanced MS-DOS Programming


════════════════════════════════════════════════════════════════════════════


Advanced MS-DOS Programming

The Microsoft(R) Guide for Assembly Language and C Programmers

By Ray Duncan


════════════════════════════════════════════════════════════════════════════


  PUBLISHED BY
  Microsoft Press
  A Division of Microsoft Corporation
  16011 NE 36th Way, Box 97017, Redmond, Washington 98073-9717
  Copyright (C) 1986, 1988 by Ray Duncan
  Published 1986. Second edition 1988.
  All rights reserved. No part of the contents of this book may be
  reproduced or transmitted in any form or by any means without the written
  permission of the publisher.
  Library of Congress Cataloging in Publication Data

  Duncan, Ray, 1952-
  Advanced MS-DOS programming.
  Rev. ed. of: Advanced MS-DOS. (C)1986.
  Includes index.
  1. MS-DOS (Computer operating system)  2. Assembler language
  (Computer program language)  3. C (Computer program language)
  I. Duncan, Ray, 1952-    Advanced MS-DOS.    II. Title.
  QA76.76.063D858      1988      005.4'46      88-1251
  ISBN 1-55615-157-8
  Printed and bound in the United States of America.

  1 2 3 4 5 6 7 8 9    FGFG    3 2 1 0 9 8

  Distributed to the book trade in the United States by Harper & Row.

  Distributed to the book trade in Canada by General Publishing Company,
  Ltd.

  Penguin Books Ltd., Harmondsworth, Middlesex, England
  Penguin Books Australia Ltd., Ringwood, Victoria, Australia
  Penguin Books N.Z. Ltd., 182-190 Wairau Road, Auckland 10, New Zealand

  British Cataloging in Publication Data available

  IBM(R), PC/AT(R), and PS/2(R) are registered trademarks of International
  Business Machines Corporation. CodeView(R), Microsoft(R), MS-DOS(R), and
  XENIX(R) are registered trademarks and InPort(TM) is a trademark of
  Microsoft Corporation.

  ──────────────────────────────────────────────────────────────────────────
      Technical Editor: Mike Halvorson  Production Editor: Mary Ann Jones
  ──────────────────────────────────────────────────────────────────────────



                                  Dedication

                                  For Carolyn



────────────────────────────────────────────────────────────────────────────
Contents

  Road Map to Figures and Tables

  Acknowledgments

  Introduction

  SECTION 1   PROGRAMMING FOR MS-DOS

  Chapter 1   Genealogy of MS-DOS

  Chapter 2   MS-DOS in Operation

  Chapter 3   Structure of MS-DOS Application Programs

  Chapter 4   MS-DOS Programming Tools

  Chapter 5   Keyboard and Mouse Input

  Chapter 6   Video Display

  Chapter 7   Printer and Serial Port

  Chapter 8   File Management

  Chapter 9   Volumes and Directories

  Chapter 10  Disk Internals

  Chapter 11  Memory Management

  Chapter 12  The EXEC Function

  Chapter 13  Interrupt Handlers

  Chapter 14  Installable Device Drivers

  Chapter 15  Filters

  Chapter 16  Compatibility and Portability

  SECTION 2   MS-DOS FUNCTIONS REFERENCE

  SECTION 3   IBM ROM BIOS AND MOUSE FUNCTIONS REFERENCE

  SECTION 4   LOTUS/INTEL/MICROSOFT EMS FUNCTIONS REFERENCE

  Index




────────────────────────────────────────────────────────────────────────────
Road Map to Figures and Tables

  MS-DOS versions and release dates

  MS-DOS memory map

  Structure of program segment prefix (PSP)

  Structure of .EXE load module

  Register conditions at program entry

  Segments, groups, and classes

  Macro Assembler switches

  C Compiler switches

  Linker switches

  MAKE switches

  ANSI escape sequences

  Video attributes

  Structure of normal file control block (FCB)

  Structure of extended file control block

  MS-DOS error codes

  Structure of boot sector

  Structure of directory entry

  Structure of fixed-disk master block

  LIM EMS error codes

  Intel 80x86 internal interrupts (faults)

  Intel 80x86, MS-DOS, and ROM BIOS interrupts

  Device-driver attribute word

  Device-driver command codes

  Structure of BIOS parameter block (BPB)

  Media descriptor byte



────────────────────────────────────────────────────────────────────────────
Acknowledgments

  My renewed thanks to the outstanding editors and production staff at
  Microsoft Press, who make beautiful books happen, and to the talented
  Microsoft developers, who create great programs to write books about.
  Special thanks to Mike Halvorson, Jeff Hinsch, Mary Ann Jones, Claudette
  Moore, Dori Shattuck, and Mark Zbikowski; if this book has anything unique
  to offer, these people deserve most of the credit.



────────────────────────────────────────────────────────────────────────────
Introduction

  Advanced MS-DOS Programming is written for the experienced C or
  assembly-language programmer. It provides all the information you need to
  write robust, high-performance applications under the MS-DOS operating
  system. Because I believe that working, well-documented programs are
  unbeatable learning tools, I have included detailed programming examples
  throughout──including complete utility programs that you can adapt to your
  own needs.

  This book is both a tutorial and a reference and is divided into four
  sections, so that you can find information more easily. Section 1
  discusses MS-DOS capabilities and services by functional group in the
  context of common programming issues, such as user input, control of the
  display, memory management, and file handling. Special classes of
  programs, such as interrupt handlers, device drivers, and filters, have
  their own chapters.

  Section 2 provides a complete reference guide to MS-DOS function calls,
  organized so that you can see the calling sequence, results, and version
  dependencies of each function at a glance. I have also included notes,
  where relevant, about quirks and special uses of functions as well as
  cross-references to related functions. An assembly-language example is
  included for each entry in Section 2.

  Sections 3 and 4 are references to IBM ROM BIOS, Microsoft Mouse driver,
  and Lotus/Intel/Microsoft Expanded Memory Specification functions. The
  entries in these two sections have the same form as in Section 2, except
  that individual programming examples have been omitted.

  The programs in this book were written with the marvelous Brief editor
  from Solution Systems and assembled or compiled with Microsoft Macro
  Assembler version 5.1 and Microsoft C Compiler version 5.1. They have been
  tested under MS-DOS versions 2.1, 3.1, 3.3, and 4.0 on an 8088-based IBM
  PC, an 80286-based IBM PC/AT, and an 80386-based IBM PS/2 Model 80. As far
  as I am aware, they do not contain any software or hardware dependencies
  that will prevent them from running properly on any IBM PC-compatible
  machine running MS-DOS version 2.0 or later.

Changes from the First Edition

  Readers who are familiar with the first edition will find many changes in
  the second edition, but the general structure of the book remains the
  same. Most of the material comparing MS-DOS to CP/M and UNIX/XENIX has
  been removed; although these comparisons were helpful a few years ago,
  MS-DOS has become its own universe and deserves to be considered on its
  own terms.

  The previously monolithic chapter on character devices has been broken
  into three more manageable chapters focusing on the keyboard and mouse,
  the display, and the serial port and printer. Hardware-dependent video
  techniques have been de-emphasized; although this topic is more important
  than ever, it has grown so complex that it requires a book of its own. A
  new chapter discusses compatibility and portability of MS-DOS applications
  and also contains a brief introduction to Microsoft OS/2, the new
  multitasking, protected-mode operating system.

  A road map to vital figures and tables has been added, following the Table
  of Contents, to help you quickly locate the layouts of the program segment
  prefix, file control block, and the like.

  The reference sections at the back of the book have been extensively
  updated and enlarged and are now complete through MS-DOS version 4.0, the
  IBM PS/2 Model 80 ROM BIOS and the VGA video adapter, the Microsoft Mouse
  driver version 6.0, and the Lotus/Intel/Microsoft Expanded Memory
  Specification version 4.0.

  In the two years since Advanced MS-DOS Programming was first published,
  hundreds of readers have been kind enough to send me their comments, and I
  have tried to incorporate many of their suggestions in this new edition.
  As before, please feel free to contact me via MCI Mail (user name LMI),
  CompuServe (user ID 72406,1577), or BIX (user name rduncan).

  Ray Duncan  Los Angeles, California  September 1988



────────────────────────────────────────────────────────────────────────────
SECTION 1  PROGRAMMING FOR MS-DOS
────────────────────────────────────────────────────────────────────────────



────────────────────────────────────────────────────────────────────────────
Chapter 1  Genealogy of MS-DOS

  In only seven years, MS-DOS has evolved from a simple program loader into
  a sophisticated, stable operating system for personal computers that are
  based on the Intel 8086 family of microprocessors (Figure 1-1). MS-DOS
  supports networking, graphical user interfaces, and storage devices of
  every description; it serves as the platform for thousands of application
  programs; and it has over 10 million licensed users──dwarfing the combined
  user bases of all of its competitors.

  The progenitor of MS-DOS was an operating system called 86-DOS, which was
  written by Tim Paterson for Seattle Computer Products in mid-1980. At that
  time, Digital Research's CP/M-80 was the operating system most commonly
  used on microcomputers based on the Intel 8080 and Zilog Z-80
  microprocessors, and a wide range of application software (word
  processors, database managers, and so forth) was available for use with
  CP/M-80.

  To ease the process of porting 8-bit CP/M-80 applications into the new
  16-bit environment, 86-DOS was originally designed to mimic CP/M-80 in
  both available functions and style of operation. Consequently, the
  structures of 86-DOS's file control blocks, program segment prefixes, and
  executable files were nearly identical to those of CP/M-80. Existing
  CP/M-80 programs could be converted mechanically (by processing their
  source-code files through a special translator program) and, after
  conversion, would run under 86-DOS either immediately or with very little
  hand editing.

  Because 86-DOS was marketed as a proprietary operating system for Seattle
  Computer Products' line of S-100 bus, 8086-based microcomputers, it made
  very little impact on the microcomputer world in general. Other vendors of
  8086-based microcomputers were understandably reluctant to adopt a
  competitor's operating system and continued to wait impatiently for the
  release of Digital Research's CP/M-86.

  In October 1980, IBM approached the major microcomputer-software houses in
  search of an operating system for the new line of personal computers it
  was designing. Microsoft had no operating system of its own to offer
  (other than a stand-alone version of Microsoft BASIC) but paid a fee to
  Seattle Computer Products for the right to sell Paterson's 86-DOS. (At
  that time, Seattle Computer Products received a license to use and sell
  Microsoft's languages and all 8086 versions of Microsoft's operating
  system.) In July 1981, Microsoft purchased all rights to 86-DOS, made
  substantial alterations to it, and renamed it MS-DOS. When the first IBM
  PC was released in the fall of 1981, IBM offered MS-DOS (referred to as
  PC-DOS 1.0) as its primary operating system.

  IBM also selected Digital Research's CP/M-86 and Softech's P-system as
  alternative operating systems for the PC. However, they were both very
  slow to appear at IBM PC dealers and suffered the additional disadvantages
  of higher prices and lack of available programming languages. IBM threw
  its considerable weight behind PC-DOS by releasing all the IBM-logo PC
  application software and development tools to run under it. Consequently,
  most third-party software developers targeted their products for PC-DOS
  from the start, and CP/M-86 and P-system never became significant factors
  in the IBM PC-compatible market.

  In spite of some superficial similarities to its ancestor CP/M-80, MS-DOS
  version 1.0 contained a number of improvements over CP/M-80, including the
  following:

  ■  An improved disk-directory structure that included information about a
     file's attributes (such as whether it was a system or a hidden file),
     its exact size in bytes, and the date that the file was created or last
     modified

  ■  A superior disk-space allocation and management method, allowing
     extremely fast sequential or random record access and program loading

  ■  An expanded set of operating-system services, including
     hardware-independent function calls to set or read the date and time, a
     filename parser, multiple-block record I/O, and variable record sizes

  ■  An AUTOEXEC.BAT batch file to perform a user-defined series of commands
     when the system was started or reset

  IBM was the only major computer manufacturer (sometimes referred to as
  OEM, for original equipment manufacturer) to ship MS-DOS version 1.0 (as
  PC-DOS 1.0) with its products. MS-DOS version 1.25 (equivalent to IBM
  PC-DOS 1.1) was released in June 1982 to fix a number of bugs and also to
  support double-sided disks and improved hardware independence in the DOS
  kernel. This version was shipped by several vendors besides IBM, including
  Texas Instruments, COMPAQ, and Columbia, who all entered the personal
  computer market early. Due to rapid decreases in the prices of RAM and
  fixed disks, MS-DOS version 1 is no longer in common use.

  MS-DOS version 2.0 (equivalent to PC-DOS 2.0) was first released in March
  1983. It was, in retrospect, a new operating system (though great care was
  taken to maintain compatibility with MS-DOS version 1). It contained many
  significant innovations and enhanced features, including the following:

  ■  Support for both larger-capacity floppy disks and hard disks

  ■  Many UNIX/XENIX-like features, including a hierarchical file structure,
     file handles, I/O redirection, pipes, and filters

  ■  Background printing (print spooling)

  ■  Volume labels, plus additional file attributes

  ■  Installable device drivers

  ■  A user-customizable system-configuration file that controlled the
     loading of additional device drivers, the number of system disk
     buffers, and so forth

  ■  Maintenance of environment blocks that could be used to pass
     information between programs

  ■  An optional ANSI display driver that allowed programs to position the
     cursor and control display characteristics in a hardware-independent
     manner

  ■  Support for the dynamic allocation, modification, and release of memory
     by application programs

  ■  Support for customized user command interpreters (shells)

  ■  System tables to assist application software in modifying its currency,
     time, and date formats (known as international support)

  MS-DOS version 2.11 was subsequently released to improve international
  support (table-driven currency symbols, date formats, decimal-point
  symbols, currency separators, and so forth), to add support for 16-bit
  Kanji characters throughout, and to fix a few minor bugs. Version 2.11
  rapidly became the base version shipped for 8086/8088-based personal
  computers by every major OEM, including Hewlett-Packard, Wang, Digital
  Equipment Corporation, Texas Instruments, COMPAQ, and Tandy.

  MS-DOS version 2.25, released in October 1985, was distributed in the Far
  East but was never shipped by OEMs in the United States and Europe. In
  this version, the international support for Japanese and Korean character
  sets was extended even further, additional bugs were repaired, and many of
  the system utilities were made compatible with MS-DOS version 3.0.

  MS-DOS version 3.0 was introduced by IBM in August 1984 with the release
  of the 80286-based PC/AT machines. It represented another major rewrite of
  the entire operating system and included the following important new
  features:

  ■  Direct control of the print spooler by application software

  ■  Further expansion of international support for currency formats

  ■  Extended error reporting, including a code that suggests a recovery
     strategy to the application program

  ■  Support for file and record locking and sharing

  ■  Support for larger fixed disks

  MS-DOS version 3.1, which was released in November 1984, added support for
  the sharing of files and printers across a network. Beginning with version
  3.1, a new operating-system module called the redirector intercepts an
  application program's requests for I/O and filters out the requests that
  are directed to network devices, passing these requests to another machine
  for processing.

  Since version 3.1, the changes to MS-DOS have been evolutionary rather
  than revolutionary. Version 3.2, which appeared in 1986, generalized the
  definition of device drivers so that new media types (such as 3.5-inch
  floppy disks) could be supported more easily. Version 3.3 was released in
  1987, concurrently with the new IBM line of PS/2 personal computers, and
  drastically expanded MS-DOS's multilanguage support for keyboard mappings,
  printer character sets, and display fonts. Version 4.0, delivered in 1988,
  was enhanced with a visual shell as well as support for very large file
  systems.

  While MS-DOS has been evolving, Microsoft has also put intense efforts
  into the areas of user interfaces and multitasking operating systems.
  Microsoft Windows, first shipped in 1985, provides a multitasking,
  graphical user "desktop" for MS-DOS systems. Windows has won widespread
  support among developers of complex graphics applications such as desktop
  publishing and computer-aided design because it allows their programs to
  take full advantage of whatever output devices are available without
  introducing any hardware dependence.

  Microsoft Operating System/2 (MS OS/2), released in 1987, represents a new
  standard for application developers: a protected-mode, multitasking,
  virtual-memory system specifically designed for applications requiring
  high-performance graphics, networking, and interprocess communications.
  Although MS OS/2 is a new product and is not a derivative of MS-DOS, its
  user interface and file system are compatible with MS-DOS and Microsoft
  Windows, and it offers the ability to run one real-mode (MS-DOS)
  application alongside MS OS/2 protected-mode applications. This
  compatibility allows users to move between the MS-DOS and OS/2
  environments with a minimum of difficulty.

  ┌─────────────┐
  │ MS-DOS 1.0  │ 1981: First operating system on IBM PC
  │ PC-DOS 1.0  │
  └──────┬──────┘
         │
  ┌──────▼──────┐
  │ MS-DOS 1.25 │ Double-sided disk support and bug fixes added:
  │ PC-DOS 1.1  │ widely distributed by OEMs other than IBM
  └──────┬──────┘
         │
  ┌──────▼──────┐ 1983: Introduced with IBM PC/XT;
  │ MS-DOS 2.0  │ support for UNIX/XENIX-like hierarchical
  │ PC-DOS 2.0  │ file structure and hard disks added
  └──────┬──────┘
         ├──────────────────────────────────────┐
  ┌──────▼──────┐                        ┌──────▼──────┐
  │ MS-DOS 2.01 │ 2.0 with international │ PC-DOS 2.1  │ Introduced with PCjr
  └──────┬──────┘ support                └─────────────┘ 2.0 with bug fixes
         │
  ┌──────▼──────┐
  │ MS-DOS 2.11 │ 2.01 with bug fixes
  └──────┬──────┘
         ├──────────────────────────────────────┐
  ┌──────▼──────┐ 1984: Introduced with  ┌──────▼──────┐ 1985: Far East OEMs;
  │ MS-DOS 3.0  │ PC/AT; support for     │ MS-DOS 2.25 │ support for extended
  │ PC-DOS 3.0  │ 1.2 MB floppy disk,    └─────────────┘ character sets
  └──────┬──────┘ larger hard disk added
         │
  ┌──────▼──────┐
  │ MS-DOS 3.1  │ Support for Microsoft  ┌─────────────┐ 1985: Graphical
  │ PC-DOS 3.1  │ Networks added         │   Windows   │ user interface
  └──────┬──────┘                        │     1.0     │ for MS-DOS
         │                               └──────┬──────┘
  ┌──────▼──────┐                               │
  │ MS-DOS 3.2  │ 1986: Support for 3.5-        │
  │ PC-DOS 3.2  │ inch disks added              │
  └──────┬──────┘                               │
         │                               ┌──────▼──────┐ 1987: Compatibility
  ┌──────▼──────┐ 1987: Introduced with  │   Windows   │ with OS/2
  │ MS-DOS 3.3  │ IBM PS/2; generalized  │     2.0     │ Presentation Manager
  │ PC-DOS 3.3  │ code-page (font)       └─────────────┘
  └──────┬──────┘ support
         │
  ┌──────▼──────┐ 1988: Support for
  │ MS-DOS 4.0  │ logical volumes larger
  │ PC-DOS 4.0  │ than 32 MB; visual shell
  └─────────────┘

  Figure 1-1.  The evolution of MS-DOS.

  What does the future hold for MS-DOS? Only the long-range planning teams
  at Microsoft and IBM know for sure. But it seems safe to assume that
  MS-DOS, with its relatively small memory requirements, adaptability to
  diverse hardware configurations, and enormous base of users, will remain
  important to programmers and software publishers for years to come.



────────────────────────────────────────────────────────────────────────────
Chapter 2  MS-DOS in Operation

  It is unlikely that you will ever be called upon to configure the MS-DOS
  software for a new model of computer. Still, an acquaintance with the
  general structure of MS-DOS can often be very helpful in understanding the
  behavior of the system as a whole. In this chapter, we will discuss how
  MS-DOS is organized and how it is loaded into memory when the computer is
  turned on.


The Structure of MS-DOS

  MS-DOS is partitioned into several layers that serve to isolate the kernel
  logic of the operating system, and the user's perception of the system,
  from the hardware it is running on. These layers are

  ■  The BIOS (Basic Input/Output System)

  ■  The DOS kernel

  ■  The command processor (shell)

  We'll discuss the functions of each of these layers separately.

The BIOS Module

  The BIOS is specific to the individual computer system and is provided by
  the manufacturer of the system. It contains the default resident
  hardware-dependent drivers for the following devices:

  ■  Console display and keyboard (CON)

  ■  Line printer (PRN)

  ■  Auxiliary device (AUX)

  ■  Date and time (CLOCK$)

  ■  Boot disk device (block device)

  The MS-DOS kernel communicates with these device drivers through I/O
  request packets; the drivers then translate these requests into the proper
  commands for the various hardware controllers. In many MS-DOS systems,
  including the IBM PC, the most primitive parts of the hardware drivers are
  located in read-only memory (ROM) so that they can be used by stand-alone
  applications, diagnostics, and the system startup program.

  The terms resident and installable are used to distinguish between the
  drivers built into the BIOS and the drivers installed during system
  initialization by DEVICE commands in the CONFIG.SYS file. (Installable
  drivers will be discussed in more detail later in this chapter and in
  Chapter 14.)

  The BIOS is read into random-access memory (RAM) during system
  initialization as part of a file named IO.SYS. (In PC-DOS, the file is
  called IBMBIO.COM.) This file is marked with the special attributes hidden
  and system.

The DOS Kernel

  The DOS kernel implements MS-DOS as it is seen by application programs.
  The kernel is a proprietary program supplied by Microsoft Corporation and
  provides a collection of hardware-independent services called system
  functions. These functions include the following:

  ■  File and record management

  ■  Memory management

  ■  Character-device input/output

  ■  Spawning of other programs

  ■  Access to the real-time clock

  Programs can access system functions by loading registers with
  function-specific parameters and then transferring to the operating system
  by means of a software interrupt.

  The DOS kernel is read into memory during system initialization from the
  MSDOS.SYS file on the boot disk. (The file is called IBMDOS.COM in
  PC-DOS.) This file is marked with the attributes hidden and system.

The Command Processor

  The command processor, or shell, is the user's interface to the operating
  system. It is responsible for parsing and carrying out user commands,
  including the loading and execution of other programs from a disk or other
  mass-storage device.

  The default shell that is provided with MS-DOS is found in a file called
  COMMAND.COM. Although COMMAND.COM prompts and responses constitute the
  ordinary user's complete perception of MS-DOS, it is important to realize
  that COMMAND.COM is not the operating system, but simply a special class
  of program running under the control of MS-DOS.

  COMMAND.COM can be replaced with a shell of the programmer's own design by
  simply adding a SHELL directive to the system-configuration file
  (CONFIG.SYS) on the system startup disk. The product COMMAND-PLUS from ESP
  Systems is an example of such an alternative shell.

  More about COMMAND.COM

  The default MS-DOS shell, COMMAND.COM, is divided into three parts:

  ■  A resident portion

  ■  An initialization section

  ■  A transient module

  The resident portion is loaded in lower memory, above the DOS kernel and
  its buffers and tables. It contains the routines to process Ctrl-C and
  Ctrl-Break, critical errors, and the termination (final exit) of other
  transient programs. This part of COMMAND.COM issues error messages and is
  responsible for the familiar prompt

  Abort, Retry, Ignore?

  The resident portion also contains the code required to reload the
  transient portion of COMMAND.COM when necessary.

  The initialization section of COMMAND.COM is loaded above the resident
  portion when the system is started. It processes the AUTOEXEC.BAT batch
  file (the user's list of commands to execute at system startup), if one is
  present, and is then discarded.

  The transient portion of COMMAND.COM is loaded at the high end of memory,
  and its memory can also be used for other purposes by application
  programs. The transient module issues the user prompt, reads the commands
  from the keyboard or batch file, and causes them to be executed. When an
  application program terminates, the resident portion of COMMAND.COM does a
  checksum of the transient module to determine whether it has been
  destroyed and fetches a fresh copy from the disk if necessary.

  The user commands that are accepted by COMMAND.COM fall into three
  categories:

  ■  Internal commands

  ■  External commands

  ■  Batch files

  Internal commands, sometimes called intrinsic commands, are those carried
  out by code embedded in COMMAND.COM itself. Commands in this category
  include COPY, REN(AME), DIR(ECTORY), and DEL(ETE). The routines for the
  internal commands are included in the transient part of COMMAND.COM.

  External commands, sometimes called extrinsic commands or transient
  programs, are the names of programs stored in disk files. Before these
  programs can be executed, they must be loaded from the disk into the
  transient program area (TPA) of memory. (See "How MS-DOS Is Loaded" in
  this chapter.) Familiar examples of external commands are CHKDSK, BACKUP,
  and RESTORE. As soon as an external command has completed its work, it is
  discarded from memory; hence, it must be reloaded from disk each time it
  is invoked.

  Batch files are text files that contain lists of other intrinsic,
  extrinsic, or batch commands. These files are processed by a special
  interpreter that is built into the transient portion of COMMAND.COM. The
  interpreter reads the batch file one line at a time and carries out each
  of the specified operations in order.

  In order to interpret a user's command, COMMAND.COM first looks to see if
  the user typed the name of a built-in (intrinsic) command that it can
  carry out directly. If not, it searches for an external command
  (executable program file) or batch file by the same name. The search is
  carried out first in the current directory of the current disk drive and
  then in each of the directories specified in the most recent PATH command.
  In each directory inspected, COMMAND.COM first tries to find a file with
  the extension .COM, then .EXE, and finally .BAT. If the search fails for
  all three file types in all of the possible locations, COMMAND.COM
  displays the familiar message

  Bad command or file name

  If a .COM file or a .EXE file is found, COMMAND.COM uses the MS-DOS EXEC
  function to load and execute it. The EXEC function builds a special data
  structure called a program segment prefix (PSP) above the resident portion
  of COMMAND.COM in the transient program area. The PSP contains various
  linkages and pointers needed by the application program. Next, the EXEC
  function loads the program itself, just above the PSP, and performs any
  relocation that may be necessary. Finally, it sets up the registers
  appropriately and transfers control to the entry point for the program.
  (Both the PSP and the EXEC function will be discussed in more detail in
  Chapters 3 and 12.) When the transient program has finished its job, it
  calls a special MS-DOS termination function that releases the transient
  program's memory and returns control to the program that caused the
  transient program to be loaded (COMMAND.COM, in this case).

  A transient program has nearly complete control of the system's resources
  while it is executing. The only other tasks that are accomplished are
  those performed by interrupt handlers (such as the keyboard input driver
  and the real-time clock) and operations that the transient program
  requests from the operating system. MS-DOS does not support sharing of the
  central processor among several tasks executing concurrently, nor can it
  wrest control away from a program when it crashes or executes for too
  long. Such capabilities are the province of MS OS/2, which is a
  protected-mode system with preemptive multitasking (time-slicing).


How MS-DOS Is Loaded

  When the system is started or reset, program execution begins at address
  0FFFF0H. This is a feature of the 8086/8088 family of microprocessors and
  has nothing to do with MS-DOS. Systems based on these processors are
  designed so that address 0FFFF0H lies within an area of ROM and contains a
  jump machine instruction to transfer control to system test code and the
  ROM bootstrap routine (Figure 2-1).

  The ROM bootstrap routine reads the disk bootstrap routine from the first
  sector of the system startup disk (the boot sector) into memory at some
  arbitrary address and then transfers control to it (Figure 2-2). (The
  boot sector also contains a table of information about the disk format.)

  The disk bootstrap routine checks to see if the disk contains a copy of
  MS-DOS. It does this by reading the first sector of the root directory and
  determining whether the first two files are IO.SYS and MSDOS.SYS (or
  IBMBIO.COM and IBMDOS.COM), in that order. If these files are not present,
  the user is prompted to change disks and strike any key to try again.

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         │                                               │
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-1.  A typical 8086/8088-based computer system immediately after
  system startup or reset. Execution begins at location 0FFFF0H, which
  contains a jump instruction that directs program control to the ROM
  bootstrap routine.

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         ├───────────────────────────────────────────────┤
         │           Disk bootstrap routine              │
         ├───────────────────────────────────────────────┤ ◄ Arbitrary
         │                                               │   load location
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-2.  The ROM bootstrap routine loads the disk bootstrap routine
  into memory from the first sector of the system startup disk and then
  transfers control to it.

  If the two system files are found, the disk bootstrap reads them into
  memory and transfers control to the initial entry point of IO.SYS (Figure
  2-3). (In some implementations, the disk bootstrap reads only IO.SYS into
  memory, and IO.SYS in turn loads the MSDOS.SYS file.)

  The IO.SYS file that is loaded from the disk actually consists of two
  separate modules. The first is the BIOS, which contains the linked set of
  resident device drivers for the console, auxiliary port, printer, block,
  and clock devices, plus some hardware-specific initialization code that is
  run only at system startup. The second module, SYSINIT, is supplied by
  Microsoft and linked into the IO.SYS file, along with the BIOS, by the
  computer manufacturer.

  SYSINIT is called by the manufacturer's BIOS initialization code. It
  determines the amount of contiguous memory present in the system and then
  relocates itself to high memory. Then it moves the DOS kernel, MSDOS.SYS,
  from its original load location to its final memory location, overlaying
  the original SYSINIT code and any other expendable initialization code
  that was contained in the IO.SYS file (Figure 2-4).

  Next, SYSINIT calls the initialization code in MSDOS.SYS. The DOS kernel
  initializes its internal tables and work areas, sets up the interrupt
  vectors 20H through 2FH, and traces through the linked list of resident
  device drivers, calling the initialization function for each. (See Chapter
  14.)

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         ├───────────────────────────────────────────────┤
         │           Disk bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         ├───────────────────────────────────────────────┤
         │          DOS kernel (from MSDOS.SYS)          │
         ├───────────────────────────────────────────────┤ ◄ In temporary
         │             SYSINIT (from IO.SYS)             │   location
         ├───────────────────────────────────────────────┤
         │              BIOS (from IO.SYS)               │
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-3.  The disk bootstrap reads the file IO.SYS into memory. This
  file contains the MS-DOS BIOS (resident device drivers) and the SYSINIT
  module. Either the disk bootstrap or the BIOS (depending upon the
  manufacturer's implementation) then reads the DOS kernel into memory from
  the MSDOS.SYS file.

  These driver functions determine the equipment status, perform any
  necessary hardware initialization, and set up the vectors for any external
  hardware interrupts the drivers will service.

  As part of the initialization sequence, the DOS kernel examines the
  disk-parameter blocks returned by the resident block-device drivers,
  determines the largest sector size that will be used in the system, builds
  some drive-parameter blocks, and allocates a disk sector buffer. Control
  then returns to SYSINIT.

  When the DOS kernel has been initialized and all resident device drivers
  are available, SYSINIT can call on the normal MS-DOS file services to open
  the CONFIG.SYS file. This optional file can contain a variety of commands
  that enable the user to customize the MS-DOS environment. For instance,
  the user can specify additional hardware device drivers, the number of
  disk buffers, the maximum number of files that can be open at one time,
  and the filename of the command processor (shell).

  If it is found, the entire CONFIG.SYS file is loaded into memory for
  processing. All lowercase characters are converted to uppercase, and the
  file is interpreted one line at a time to process the commands. Memory is
  allocated for the disk buffer cache and the internal file control blocks
  used by the handle file and record system functions. (See Chapter 8.) Any
  device drivers indicated in the CONFIG.SYS file are sequentially loaded
  into memory, initialized by calls to their init modules, and linked into
  the device-driver list. The init function of each driver tells SYSINIT how
  much memory to reserve for that driver.
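
  The two processing steps just described, case conversion followed by
  line-at-a-time interpretation, can be illustrated with a small C sketch.
  It is not SYSINIT's actual logic. The function names upcase_line and
  split_directive are hypothetical, and the sketch assumes that each command
  has the form KEYWORD=VALUE (for example, BUFFERS=20).

```c
#include <ctype.h>
#include <string.h>

/* Convert a NUL-terminated line to uppercase in place. */
void upcase_line(char *line)
{
    for (; *line != '\0'; ++line)
        *line = (char)toupper((unsigned char)*line);
}

/* Split "KEYWORD=VALUE" in place at the first '='. On success, line is
   left holding KEYWORD and the return value points to VALUE. Returns
   NULL if the line contains no '=' separator. */
char *split_directive(char *line)
{
    char *eq = strchr(line, '=');

    if (eq == NULL)
        return NULL;
    *eq = '\0';                 /* terminate the keyword */
    return eq + 1;              /* value follows the '=' */
}
```

  Converting the whole line first means that a keyword can be compared
  against a single uppercase spelling, however the user typed it.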

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │               SYSINIT module                  │
         ├───────────────────────────────────────────────┤
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         ├───────────────────────────────────────────────┤
         │              Installable drivers              │
         ├───────────────────────────────────────────────┤
         │              File control blocks              │
         ├───────────────────────────────────────────────┤
         │               Disk buffer cache               │
         ├───────────────────────────────────────────────┤
         │                  DOS kernel                   │
         ├───────────────────────────────────────────────┤ ◄ In final
         │                     BIOS                      │   location
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-4.  SYSINIT moves itself to high memory and relocates the DOS
  kernel, MSDOS.SYS, downward to its final address. The MS-DOS disk buffer
  cache and file control block areas are allocated, and then the installable
  device drivers specified in the CONFIG.SYS file are loaded and linked into
  the system.

  After all installable device drivers have been loaded, SYSINIT closes all
  file handles and reopens the console (CON), printer (PRN), and auxiliary
  (AUX) devices as the standard input, standard output, standard error,
  standard list, and standard auxiliary devices. This allows a
  user-installed character-device driver to override the BIOS's resident
  drivers for the standard devices.

  Finally, SYSINIT calls the MS-DOS EXEC function to load the command
  interpreter, or shell. (The default shell is COMMAND.COM, but another
  shell can be substituted by means of the CONFIG.SYS file.) Once the shell
  is loaded, it displays a prompt and waits for the user to enter a command.
  MS-DOS is now ready for business, and the SYSINIT module is discarded
  (Figure 2-5).

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │         Transient part of COMMAND.COM         │
         ├───────────────────────────────────────────────┤
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │            Transient program area             │
         ├───────────────────────────────────────────────┤
         │         Resident part of COMMAND.COM          │
         ├───────────────────────────────────────────────┤
         │              Installable drivers              │
         ├───────────────────────────────────────────────┤
         │              File control blocks              │
         ├───────────────────────────────────────────────┤
         │               Disk buffer cache               │
         ├───────────────────────────────────────────────┤
         │                  DOS kernel                   │
         ├───────────────────────────────────────────────┤
         │                     BIOS                      │
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-5.  The final result of the MS-DOS startup process for a typical
  system. The resident portion of COMMAND.COM lies in low memory, above the
  DOS kernel. The transient portion containing the batch-file interpreter
  and intrinsic commands is placed in high memory, where it can be overlaid
  by extrinsic commands and application programs running in the transient
  program area.



────────────────────────────────────────────────────────────────────────────
Chapter 3  Structure of MS-DOS Application Programs

  Programs that run under MS-DOS come in two basic flavors: .COM programs,
  which have a maximum size of approximately 64 KB, and .EXE programs, which
  can be as large as available memory. In Intel 8086 parlance, .COM programs
  fit the tiny model, in which all segment registers contain the same value;
  that is, the code and data are mixed together. In contrast, .EXE programs
  fit the small, medium, or large model, in which the segment registers
  contain different values; that is, the code, data, and stack reside in
  separate segments. .EXE programs can have multiple code and data segments,
  which are respectively addressed by long calls and by manipulation of the
  data segment (DS) register.

  A .COM-type program resides on the disk as an absolute memory image, in a
  file with the extension .COM. The file does not have a header or any other
  internal identifying information. A .EXE program, on the other hand,
  resides on the disk in a special type of file with a unique header, a
  relocation map, a checksum, and other information that is (or can be) used
  by MS-DOS.

  Both .COM and .EXE programs are brought into memory for execution by the
  same mechanism: the EXEC function, which constitutes the MS-DOS loader.
  EXEC can be called with the filename of a program to be loaded by
  COMMAND.COM (the normal MS-DOS command interpreter), by other shells or
  user interfaces, or by another program that was previously loaded by EXEC.
  If there is sufficient free memory in the transient program area, EXEC
  allocates a block of memory to hold the new program, builds the program
  segment prefix (PSP) at its base, and then reads the program into memory
  immediately above the PSP. Finally, EXEC sets up the segment registers and
  the stack and transfers control to the program.

  When it is invoked, EXEC can be given the addresses of additional
  information, such as a command tail, file control blocks, and an
  environment block; if supplied, this information will be passed on to the
  new program. (The exact procedure for using the EXEC function in your own
  programs is discussed, with examples, in Chapter 12.)

  .COM and .EXE programs are often referred to as transient programs. A
  transient program "owns" the memory block it has been allocated and has
  nearly total control of the system's resources while it is executing. When
  the program terminates, either because it is aborted by the operating
  system or because it has completed its work and systematically performed a
  final exit back to MS-DOS, the memory block is then freed (hence the term
  transient) and can be used by the next program in line to be loaded.


The Program Segment Prefix

  A thorough understanding of the program segment prefix is vital to
  successful programming under MS-DOS. It is a reserved area, 256 bytes
  long, that is set up by MS-DOS at the base of the memory block allocated
  to a transient program. The PSP contains some linkages to MS-DOS that can
  be used by the transient program, some information MS-DOS saves for its
  own purposes, and some information MS-DOS passes to the transient
  program──to be used or not, as the program requires (Figure 3-1).

  Offset
  0000H ┌────────────────────────────────────────────────────────┐
        │                        Int 20H                         │
  0002H ├────────────────────────────────────────────────────────┤
        │            Segment, end of allocation block            │
  0004H ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  0005H ├────────────────────────────────────────────────────────┤
        │        Long call to MS-DOS function dispatcher         │
  000AH ├────────────────────────────────────────────────────────┤
        │        Previous contents of termination handler        │
        │               interrupt vector (Int 22H)               │
  000EH ├────────────────────────────────────────────────────────┤
        │ Previous contents of Ctrl-C interrupt vector (Int 23H) │
  0012H ├────────────────────────────────────────────────────────┤
        │      Previous contents of critical-error handler       │
        │               interrupt vector (Int 24H)               │
  0016H ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  002CH ├────────────────────────────────────────────────────────┤
        │          Segment address of environment block          │
  002EH ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  005CH ├────────────────────────────────────────────────────────┤
        │             Default file control block #1              │
  006CH ├────────────────────────────────────────────────────────┤
        │             Default file control block #2              │
        │              (overlaid if FCB #1 opened)               │
  0080H ├────────────────────────────────────────────────────────┤
        └──────────────────────────┐                             │
        ┌────────────────────────┐ └─────────────────────────────┘
        │                        └───────────────────────────────┐
        │  Command tail and default disk transfer area (buffer)  │
  00FFH └────────────────────────────────────────────────────────┘

  Figure 3-1.  The structure of the program segment prefix.
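
  For readers who think more readily in C, the layout in Figure 3-1 can be
  written as a structure. This is a sketch only: the field names are
  hypothetical (MS-DOS does not define them), and the compile-time checks
  are there to confirm that the compiler inserted no padding, so that each
  field really does fall at the offset shown in the figure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field names; the offsets follow Figure 3-1. */
struct psp {
    uint8_t  int20[2];          /* 0000H  Int 20H instruction           */
    uint16_t alloc_end_seg;     /* 0002H  segment, end of allocation    */
    uint8_t  reserved1;         /* 0004H  reserved                      */
    uint8_t  dos_far_call[5];   /* 0005H  long call to dispatcher       */
    uint8_t  int22_vector[4];   /* 000AH  previous Int 22H vector       */
    uint8_t  int23_vector[4];   /* 000EH  previous Int 23H vector       */
    uint8_t  int24_vector[4];   /* 0012H  previous Int 24H vector       */
    uint8_t  reserved2[22];     /* 0016H  reserved                      */
    uint16_t env_seg;           /* 002CH  segment of environment block  */
    uint8_t  reserved3[46];     /* 002EH  reserved                      */
    uint8_t  fcb1[16];          /* 005CH  default FCB #1                */
    uint8_t  fcb2[20];          /* 006CH  default FCB #2                */
    uint8_t  tail_len;          /* 0080H  length of command tail        */
    uint8_t  tail[127];         /* 0081H  command tail / default DTA    */
};

/* Verify the layout against the figure at compile time. */
_Static_assert(offsetof(struct psp, alloc_end_seg) == 0x02, "offset 0002H");
_Static_assert(offsetof(struct psp, dos_far_call)  == 0x05, "offset 0005H");
_Static_assert(offsetof(struct psp, int22_vector)  == 0x0A, "offset 000AH");
_Static_assert(offsetof(struct psp, env_seg)       == 0x2C, "offset 002CH");
_Static_assert(offsetof(struct psp, fcb1)          == 0x5C, "offset 005CH");
_Static_assert(offsetof(struct psp, fcb2)          == 0x6C, "offset 006CH");
_Static_assert(offsetof(struct psp, tail_len)      == 0x80, "offset 0080H");
_Static_assert(sizeof(struct psp) == 0x100, "the PSP is 256 bytes long");
```

  The interrupt vectors are declared as byte arrays, not as 32-bit
  integers, so that the structure contains no field that a compiler might
  want to align on a 4-byte boundary.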

  In the first versions of MS-DOS, the PSP was designed to be compatible
  with a control area that was built beneath transient programs under
  Digital Research's venerable CP/M operating system, so that programs could
  be ported to MS-DOS without extensive logical changes. Although MS-DOS has
  evolved considerably since those early days, the structure of the PSP is
  still recognizably similar to its CP/M equivalent. For example, offset
  0000H in the PSP contains a linkage to the MS-DOS process-termination
  handler, which cleans up after the program has finished its job and
  performs a final exit. Similarly, offset 0005H in the PSP contains a
  linkage to the MS-DOS function dispatcher, which performs disk operations,
  console input/output, and other such services at the request of the
  transient program. Thus, calls to PSP:0000 and PSP:0005 have the same
  effect as CALL 0000 and CALL 0005 under CP/M. (These linkages are not the
  "approved" means of obtaining these services, however.)

  The word at offset 0002H in the PSP contains the segment address of the
  top of the transient program's allocated memory block. The program can use
  this value to determine whether it should request more memory to do its
  job or whether it has extra memory that it can release for use by other
  processes.

  Offsets 000AH through 0015H in the PSP contain the previous contents of
  the interrupt vectors for the termination, Ctrl-C, and critical-error
  handlers. If the transient program alters these vectors for its own
  purposes, MS-DOS restores the original values saved in the PSP when the
  program terminates.

  The word at PSP offset 002CH holds the segment address of the environment
  block, which contains a series of ASCIIZ strings (sequences of ASCII
  characters terminated by a null, or zero, byte). The environment block is
  inherited from the program that called the EXEC function to load the
  currently executing program. It contains such information as the current
  search path used by COMMAND.COM to find executable programs, the location
  on the disk of COMMAND.COM itself, and the format of the user prompt used
  by COMMAND.COM.
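
  A program that wants one of these strings must search the block itself.
  The following C sketch shows one way to do so. It assumes that each
  string has the form NAME=value and that the series of strings is
  terminated by an additional zero byte, a detail not stated above. The
  function name env_lookup is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Search an environment block for NAME and return a pointer to the text
   that follows "NAME=", or NULL if the block holds no such variable.
   The block is a series of ASCIIZ strings; this sketch assumes an extra
   zero byte (an empty string) marks the end of the series. */
const char *env_lookup(const char *env, const char *name)
{
    size_t name_len = strlen(name);

    while (*env != '\0') {                      /* empty string = end  */
        if (strncmp(env, name, name_len) == 0 && env[name_len] == '=')
            return env + name_len + 1;          /* text after the '='  */
        env += strlen(env) + 1;                 /* skip string and NUL */
    }
    return NULL;
}
```

  In a real-mode program, the pointer passed as env would be a far pointer
  built from the segment found at PSP offset 002CH and an offset of zero;
  the sketch leaves that step out so that it stays portable.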

  The command tail──the remainder of the command line that invoked the
  transient program, after the program's name──is copied into the PSP
  starting at offset 0081H. The length of the command tail, not including
  the return character at its end, is placed in the byte at offset 0080H.
  Redirection or piping parameters and their associated filenames do not
  appear in the portion of the command line (the command tail) that is
  passed to the transient program, because redirection is transparent to
  applications.
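
  Because the command tail is stored as a length byte followed by the text,
  not as an ASCIIZ string, a C program usually copies it out before use.
  Here is a minimal sketch that operates on a 256-byte image of the PSP. The
  function name copy_command_tail and the two offset macros are
  hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define PSP_TAIL_LEN  0x80   /* byte holding the command-tail length */
#define PSP_TAIL_TEXT 0x81   /* first character of the command tail  */

/* Copy the command tail from a 256-byte PSP image into out as a
   NUL-terminated string. Returns the number of characters copied,
   which excludes the carriage return that follows the tail. */
size_t copy_command_tail(const unsigned char *psp, char *out, size_t out_size)
{
    size_t len = psp[PSP_TAIL_LEN];

    if (out_size == 0)
        return 0;
    if (len > 127)
        len = 127;                  /* never read past offset 00FFH */
    if (len > out_size - 1)
        len = out_size - 1;         /* leave room for the NUL       */

    memcpy(out, psp + PSP_TAIL_TEXT, len);
    out[len] = '\0';
    return len;
}
```

  The sketch trusts the length byte and does not scan for the carriage
  return; clamping the length keeps a corrupted value from carrying the
  copy beyond the end of the PSP.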

  To provide compatibility with CP/M, MS-DOS parses the first two parameters
  in the command tail into two default file control blocks (FCBs) at
  PSP:005CH and PSP:006CH, under the assumption that they may be filenames.
  However, if the parameters are filenames that include a path
  specification, only the drive code will be valid in these default FCBs,
  because FCB-type file- and record-access functions do not support
  hierarchical file structures. Although the default FCBs were an aid in
  earlier years, when compatibility with CP/M was more of a concern, they
  are essentially useless in modern MS-DOS application programs that must
  provide full path support. (File control blocks are discussed in detail in
  Chapter 8 and hierarchical file structures are discussed in Chapter 9.)

  The 128-byte area from 0080H through 00FFH in the PSP also serves as the
  default disk transfer area (DTA), which is set by MS-DOS before passing
  control to the transient program. If the program does not explicitly
  change the DTA, any file read or write operations requested with the FCB
  group of function calls automatically use this area as a data buffer. This
  is rarely useful and is another facet of MS-DOS's handling of the PSP that
  is present only for compatibility with CP/M.

  ──────────────────────────────────────────────────────────────────────────
  WARNING
    Programs must not alter any part of the PSP below offset 005CH.
  ──────────────────────────────────────────────────────────────────────────


Introduction to .COM Programs

  Programs of the .COM persuasion are stored in disk files that hold an
  absolute image of the machine instructions to be executed. Because the
  files contain no relocation information, they are more compact, and are
  loaded for execution slightly faster, than equivalent .EXE files. Note
  that MS-DOS does not attempt to ascertain whether a .COM file actually
  contains executable code (there is no signature or checksum, as in the
  case of a .EXE file); it simply brings any file with the .COM extension
  into memory and jumps to it.

  Because .COM programs are loaded immediately above the program segment
  prefix and do not have a header that can specify another entry point, they
  must always have an origin of 0100H, which is the length of the PSP.
  Location 0100H must contain an executable instruction. The maximum length
  of a .COM program is 65,536 bytes, minus the length of the PSP (256 bytes)
  and a mandatory word of stack (2 bytes).
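
  The size limit is simple arithmetic, shown here as a small C sketch. The
  macro and function names are hypothetical; the three figures come straight
  from the paragraph above.

```c
#define SEGMENT_BYTES 65536UL   /* one 8086 segment           */
#define PSP_BYTES       256UL   /* program segment prefix     */
#define STACK_BYTES       2UL   /* mandatory word of stack    */

/* Largest .COM file image that still fits in a single segment with the
   PSP below it and one word of stack above it. */
unsigned long com_max_image_bytes(void)
{
    return SEGMENT_BYTES - PSP_BYTES - STACK_BYTES;
}
```

  The result is 65,278 bytes (0FEFEH). A program that needs more than one
  word of stack has correspondingly less room for code and data.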

  When control is transferred to the .COM program from MS-DOS, all of the
  segment registers point to the PSP (Figure 3-2). The stack pointer
  register contains 0FFFEH if memory allows; otherwise, it is set as high as
  possible in memory minus 2 bytes. (MS-DOS pushes a zero word on the stack
  before entry.)

     SS:SP  ┌────────────────────────────────────────────────────────┐
            │                                                        │
            │       Stack grows downward from top of segment         │
            │                           │                            │
            │                           ▼                            │
            │                           ▲                            │
            │                           │                            │
            │                 Program code and data                  │
            │                                                        │
  CS:0100H  ├────────────────────────────────────────────────────────┤
            │                 Program segment prefix                 │
  CS:0000H  └────────────────────────────────────────────────────────┘
  DS:0000H
  ES:0000H
  SS:0000H

  Figure 3-2.  A memory image of a typical .COM-type program after loading.
  The contents of the .COM file are brought into memory just above the
  program segment prefix. Program code and data are mixed together in the
  same segment, and all segment registers contain the same value.

  Although the size of an executable .COM file can't exceed 64 KB, the
  current versions of MS-DOS allocate all of the transient program area to
  .COM programs when they are loaded. Because many such programs date from
  the early days of MS-DOS and are not necessarily "well-behaved" in their
  approach to memory management, the operating system simply makes the
  worst-case assumption and gives .COM programs everything that is
  available. If a .COM program wants to use the EXEC function to invoke
  another process, it must first shrink down its memory allocation to the
  minimum memory it needs in order to continue, taking care to protect its
  stack. (This is discussed in more detail in Chapter 12.)
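
  Shrinking the allocation requires the program to express the memory it
  wants to keep as a count of 16-byte paragraphs, measured from the base of
  the PSP. The following C sketch shows only that calculation. It assumes
  the resize request (Int 21H Function 4AH) takes its size in paragraphs,
  which is not stated in this chapter, and the function name
  paragraphs_needed is hypothetical.

```c
#include <stdint.h>

/* Number of 16-byte paragraphs needed to hold `bytes` bytes, counted
   from the start of the PSP. Rounds up so that a partly used paragraph
   is retained, not released. */
uint16_t paragraphs_needed(uint32_t bytes)
{
    return (uint16_t)((bytes + 15UL) >> 4);
}
```

  The byte count passed in must cover the PSP, the program's code and data,
  and its stack, so the stack pointer normally has to be moved down into
  the retained region before the remainder is released.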

  When a .COM program finishes executing, it can return control to MS-DOS by
  several means. The preferred method is Int 21H Function 4CH, which allows
  the program to pass a return code back to the program, shell, or batch
  file that invoked it. However, if the program is running under MS-DOS
  version 1, it must exit by means of Int 20H, Int 21H Function 0, or a
  NEAR RETURN. (Because a word of zero was pushed onto the stack at entry, a
  NEAR RETURN causes a transfer to PSP:0000, which contains an Int 20H
  instruction.)

  A .COM-type application can be linked together from many separate object
  modules. All of the modules must use the same code-segment name and class
  name, and the module with the entry point at offset 0100H within the
  segment must be linked first. In addition, all of the procedures within a
  .COM program should have the NEAR attribute, because all executable code
  resides in one segment.

  When linking a .COM program, the linker will display the message

  Warning: no stack segment

  This message can be ignored. The linker output is a .EXE file, which must
  be converted into a .COM file with the MS-DOS EXE2BIN utility before
  execution. You can then delete the .EXE file. (An example of this process
  is provided in Chapter 4.)

An Example .COM Program

  The HELLO.COM program listed in Figure 3-3 demonstrates the structure of
  a simple assembly-language program that is destined to become a .COM file.
  (You may find it helpful to compare this listing with the HELLO.EXE
  program later in this chapter.) Because this program is so short and
  simple, a relatively high proportion of the source code is actually
  assembler directives that do not result in any executable code.

  The NAME statement simply provides a module name for use during the
  linkage process. This aids understanding of the map that the linker
  produces. In MASM versions 5.0 and later, the module name is always the
  same as the filename, and the NAME statement is ignored.

  The PAGE command, when used with two operands, as in line 2, defines the
  length and width of the page. These default respectively to 66 lines and
  80 characters. If you use the PAGE command without any operands, a
  formfeed is sent to the printer and a heading is printed. In larger
  programs, use the PAGE command liberally to place each of your subroutines
  on separate pages for easy reading.

  The TITLE command, in line 3, specifies the text string (limited to 60
  characters) that is to be printed at the upper left corner of each page.
  The TITLE command is optional and cannot be used more than once in each
  assembly-language source file.

  ──────────────────────────────────────────────────────────────────────────
   1:          name    hello
   2:          page    55,132
   3:          title   HELLO.COM--print hello on terminal
   4:
   5:  ;
   6:  ; HELLO.COM:    demonstrates various components
   7:  ;               of a functional .COM-type assembly-
   8:  ;               language program, and an MS-DOS
   9:  ;               function call.
  10:  ;
  11:  ; Ray Duncan, May 1988
  12:  ;
  13:
  14:  stdin   equ     0               ; standard input handle
  15:  stdout  equ     1               ; standard output handle
  16:  stderr  equ     2               ; standard error handle
  17:
  18:  cr      equ     0dh             ; ASCII carriage return
  19:  lf      equ     0ah             ; ASCII linefeed
  20:
  21:
  22:  _TEXT   segment word public 'CODE'
  23:
  24:          org     100h            ; .COM files always have
  25:                                  ; an origin of 100h
  26:
  27:          assume  cs:_TEXT,ds:_TEXT,es:_TEXT,ss:_TEXT
  28:
  29:  print   proc    near            ; entry point from MS-DOS
  30:
  31:          mov     ah,40h          ; function 40h = write
  32:          mov     bx,stdout       ; handle for standard output
  33:          mov     cx,msg_len      ; length of message
  34:          mov     dx,offset msg   ; address of message
  35:          int     21h             ; transfer to MS-DOS
  36:
  37:          mov     ax,4c00h        ; exit, return code = 0
  38:          int     21h             ; transfer to MS-DOS
  39:
  40:  print   endp
  41:
  42:
  43:  msg     db      cr,lf           ; message to display
  44:          db      'Hello World!',cr,lf
  45:
  46:  msg_len equ     $-msg           ; length of message
  47:
  48:
  49:  _TEXT   ends
  50:
  51:          end     print           ; defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-3.  The HELLO.COM program listing.

  Dropping down past a few comments and EQU statements, we come to a
  declaration of a code segment that begins in line 22 with a SEGMENT
  command and ends in line 49 with an ENDS command. The label in the
  leftmost field of line 22 gives the code segment the name _TEXT. The
  operand fields at the right end of the line give the segment the
  attributes WORD, PUBLIC, and 'CODE'. (You might find it helpful to read
  the Microsoft Macro Assembler manual for detailed explanations of each
  possible segment attribute.)

  Because this program is going to be converted into a .COM file, all of its
  executable code and data areas must lie within one code segment. The
  program must also have its origin at offset 0100H (immediately above the
  program segment prefix), which is taken care of by the ORG statement
  in line 24.

  Following the ORG statement, we encounter an ASSUME statement on line
  27. The concept of ASSUME often baffles new assembly-language programmers.
  In a way, ASSUME doesn't "do" anything; it simply tells the assembler
  which segment registers you are going to use to point to the various
  segments of your program, so that the assembler can provide segment
  overrides when they are necessary. It's important to notice that the
  ASSUME statement doesn't take care of loading the segment registers with
  the proper values; it merely notifies the assembler of your intent to do
  that within the program. (Remember that, in the case of a .COM program,
  MS-DOS initializes all the segment registers before entry to point to the
  PSP.)

  Within the code segment, we come to another type of block declaration that
  begins with the PROC command on line 29 and closes with ENDP on line 40.
  These two directives declare the beginning and end of a procedure, a
  block of executable code that performs a single distinct function. The
  label in the leftmost field of the PROC statement (in this case, print)
  gives the procedure a name. The operand field gives it an attribute. If
  the procedure carries the NEAR attribute, only other code in the same
  segment can call it, whereas if it carries the FAR attribute, code located
  anywhere in the CPU's memory-addressing space can call it. In .COM
  programs, all procedures carry the NEAR attribute.

  For the purposes of this example program, I have kept the print procedure
  ridiculously simple. It calls MS-DOS Int 21H Function 40H to write the
  message Hello World! to the standard output handle (normally the video
  screen) and then calls Int 21H Function 4CH to terminate the program.

  The END statement in line 51 tells the assembler that it has reached the
  end of the source file and also specifies the entry point for the program.
  If the entry point is not a label located at offset 0100H, the .EXE file
  resulting from the assembly and linkage of this source program cannot be
  converted into a .COM file.


Introduction to .EXE Programs

  We have just discussed a program that was written in such a way that it
  could be assembled into a .COM file. Such a program is simple in
  structure, so a programmer who needs to put together this kind of quick
  utility can concentrate on the program logic and do a minimum amount of
  worrying about control of the assembler. However, .COM-type programs have
  some definite disadvantages, and so most serious assembly-language efforts
  for MS-DOS are written to be converted into .EXE files.

  Although .COM programs are effectively restricted to a total size of 64 KB
  for machine code, data, and stack combined, .EXE programs can be
  practically unlimited in size (up to the limit of the computer's available
  memory). .EXE programs also place the code, data, and stack in separate
  parts of the file. Although the normal MS-DOS program loader does not take
  advantage of this feature of .EXE files, the ability to load different
  parts of large programs into several separate memory fragments, as well as
  the opportunity to designate a "pure" code portion of your program that
  can be shared by several tasks, is very significant in multitasking
  environments such as Microsoft Windows.

  The MS-DOS loader always brings a .EXE program into memory immediately
  above the program segment prefix, although the order of the code, data,
  and stack segments may vary (Figure 3-4). The .EXE file has a header, or
  block of control information, with a characteristic format (Figures 3-5
  and 3-6). The size of this header varies according to the number of
  program instructions that need to be relocated at load time, but it is
  always a multiple of 512 bytes.

  Before MS-DOS transfers control to the program, the initial values of the
  code segment (CS) register and instruction pointer (IP) register are
  calculated from the entry-point information in the .EXE file header and
  the program's load address. This information derives from an END statement
  in the source code for one of the program's modules. The data segment (DS)
  and extra segment (ES) registers are made to point to the PSP so that the
  program can access the environment-block pointer, command tail, and other
  useful information contained there.
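
  The register setup just described can be sketched in C. This is only an
  illustration, not MS-DOS's actual loader code: it assumes the program
  image starts at the paragraph immediately above the 256-byte PSP (PSP
  segment plus 10H), and the structure and function names are hypothetical.
  The header fields it uses are those shown in Figure 3-5.

```c
#include <stdint.h>

/* Entry-related words from the .EXE header (offsets as in Figure 3-5). */
struct exe_entry_info {
    uint16_t ss_disp;   /* 000EH: segment displacement of stack module */
    uint16_t sp_init;   /* 0010H: contents of SP register at entry     */
    uint16_t ip_init;   /* 0014H: contents of IP register at entry     */
    uint16_t cs_disp;   /* 0016H: segment displacement of code module  */
};

/* Register values handed to the program when it receives control. */
struct entry_regs {
    uint16_t cs, ip, ss, sp, ds, es;
};

/* Hypothetical helper: derive the entry registers from the PSP segment
   and the header. Assumes the image is loaded directly above the PSP. */
struct entry_regs exe_entry_regs(uint16_t psp_seg, struct exe_entry_info h)
{
    struct entry_regs r;
    uint16_t start_seg = (uint16_t)(psp_seg + 0x10); /* PSP = 16 paragraphs */

    r.cs = (uint16_t)(start_seg + h.cs_disp);  /* relocated code segment  */
    r.ip = h.ip_init;                          /* offset of entry point   */
    r.ss = (uint16_t)(start_seg + h.ss_disp);  /* relocated stack segment */
    r.sp = h.sp_init;                          /* initial stack pointer   */
    r.ds = psp_seg;                            /* DS and ES point to PSP  */
    r.es = psp_seg;
    return r;
}
```

  For example, with a PSP at segment 2000H and the header values of the
  HELLO.EXE file shown in Figure 3-6, this sketch yields CS = 2010H,
  SS = 2013H, and SP = 0080H.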

     SS:SP ┌────────────────────────────────────────────────────────┐
           │                                                        │
           │                     Stack segment:                     │
           │        stack grows downward from top of segment        │
           │                                                        │
           │                                                        │
  SS:0000H ├────────────────────────────────────────────────────────┤
           │                      Data segment                      │
           ├────────────────────────────────────────────────────────┤
           │                      Program code                      │
  CS:0000H ├────────────────────────────────────────────────────────┤
           │                 Program segment prefix                 │
  DS:0000H └────────────────────────────────────────────────────────┘
  ES:0000H

  Figure 3-4.  A memory image of a typical .EXE-type program immediately
  after loading. The contents of the .EXE file are relocated and brought
  into memory above the program segment prefix. Code, data, and stack reside
  in separate segments and need not be in the order shown here. The entry
  point can be anywhere in the code segment and is specified by the END
  statement in the main module of the program. When the program receives
  control, the DS (data segment) and ES (extra segment) registers point to
  the program segment prefix; the program usually saves this value and then
  resets the DS and ES registers to point to its data area.

  The initial contents of the stack segment (SS) and stack pointer (SP)
  registers come from the header. This information derives from the
  declaration of a segment with the attribute STACK somewhere in the
  program's source code. The memory space allocated for the stack may be
  initialized or uninitialized, depending on the stack-segment definition;
  many programmers like to initialize the stack memory with a recognizable
  data pattern so that they can inspect memory dumps and determine how much
  stack space is actually used by the program.
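
  The stack-pattern technique can be sketched in C as follows. The fill
  value 0A5H and both function names are arbitrary choices made for this
  illustration; a real program would fill the stack segment itself (for
  example, with a DUP expression in the segment definition) and inspect it
  with a debugger or memory dump.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5   /* arbitrary, easily recognized fill byte */

/* Fill the entire stack area with the pattern before the program runs. */
void stack_fill(uint8_t *stack, size_t size)
{
    for (size_t i = 0; i < size; i++)
        stack[i] = STACK_FILL;
}

/* The stack grows downward from the top of the segment, so bytes that
   were never touched still hold the pattern at the low addresses.
   Count those bytes and report everything above them as used. */
size_t stack_bytes_used(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && stack[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

  The result is a lower bound rather than an exact figure: if the deepest
  value pushed happens to equal the fill byte, it is indistinguishable from
  unused stack.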

  When a .EXE program finishes processing, it should return control to
  MS-DOS through Int 21H Function 4CH. Other methods are available, but
  they offer no advantages and are considerably less convenient (because
  they usually require the CS register to point to the PSP).

  Byte
  offset
  0000H ┌────────────────────────────────────────────────────────┐
        │        First part of .EXE file signature (4DH)         │
  0001H ├────────────────────────────────────────────────────────┤
        │        Second part of .EXE file signature (5AH)        │
  0002H ├────────────────────────────────────────────────────────┤
        │                 Length of file MOD 512                 │
  0004H ├────────────────────────────────────────────────────────┤
        │    Size of file in 512-byte pages, including header    │
  0006H ├────────────────────────────────────────────────────────┤
        │            Number of relocation-table items            │
  0008H ├────────────────────────────────────────────────────────┤
        │      Size of header in paragraphs (16-byte units)      │
  000AH ├────────────────────────────────────────────────────────┤
        │   Minimum number of paragraphs needed above program    │
  000CH ├────────────────────────────────────────────────────────┤
        │   Maximum number of paragraphs desired above program   │
  000EH ├────────────────────────────────────────────────────────┤
        │          Segment displacement of stack module          │
  0010H ├────────────────────────────────────────────────────────┤
        │            Contents of SP register at entry            │
  0012H ├────────────────────────────────────────────────────────┤
        │                     Word checksum                      │
  0014H ├────────────────────────────────────────────────────────┤
        │            Contents of IP register at entry            │
  0016H ├────────────────────────────────────────────────────────┤
        │          Segment displacement of code module           │
  0018H ├────────────────────────────────────────────────────────┤
        │        Offset of first relocation item in file         │
  001AH ├────────────────────────────────────────────────────────┤
        │    Overlay number (0 for resident part of program)     │
  001CH ├────────────────────────────────────────────────────────┤
        │                Variable reserved space                 │
        ├────────────────────────────────────────────────────────┤
        │                    Relocation table                    │
        ├────────────────────────────────────────────────────────┤
        │                Variable reserved space                 │
        ├────────────────────────────────────────────────────────┤
        │               Program and data segments                │
        ├────────────────────────────────────────────────────────┤
        │                     Stack segment                      │
        └────────────────────────────────────────────────────────┘

  Figure 3-5.  The format of a .EXE load module.

  The input to the linker for a .EXE-type program can be many separate
  object modules. Each module can use a unique code-segment name, and the
  procedures can carry either the NEAR or the FAR attribute, depending on
  naming conventions and the size of the executable code. The programmer
  must take care that the modules linked together contain only one segment
  with the STACK attribute and only one entry point defined with an END
  assembler directive. The output from the linker is a file with a .EXE
  extension. This file can be executed immediately.

  ──────────────────────────────────────────────────────────────────────────
  C>DUMP HELLO.EXE
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  4D 5A 28 00 02 00 01 00 20 00 09 00 FF FF 03 00  MZ(..... .......
  0010  80 00 20 05 00 00 00 00 1E 00 00 00 01 00 01 00  .. .............
  0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0050  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        .
        .
        .
  0200  B8 01 00 8E D8 B4 40 BB 01 00 B9 10 00 90 BA 08  ......@.........
  0210  00 CD 21 B8 00 4C CD 21 0D 0A 48 65 6C 6C 6F 20  ..!..L.!..Hello
  0220  57 6F 72 6C 64 21 0D 0A                          World!..
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-6.  A hex dump of the HELLO.EXE program, demonstrating the
  contents of a simple .EXE load module. Note the following interesting
  values: the .EXE signature in bytes 0000H and 0001H, the number of
  relocation-table items in bytes 0006H and 0007H, the minimum extra memory
  allocation (MIN_ALLOC) in bytes 000AH and 000BH, the maximum extra memory
  allocation (MAX_ALLOC) in bytes 000CH and 000DH, and the initial IP
  (instruction pointer) register value in bytes 0014H and 0015H. See also
  Figure 3-5.
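
  As a cross-check on Figures 3-5 and 3-6, the following C sketch decodes a
  few header words from the first 32 bytes of the dump. The byte values are
  copied from Figure 3-6; the function names are hypothetical, and the code
  is meant only to show how the little-endian words are read and combined.

```c
#include <stddef.h>
#include <stdint.h>

/* First 32 bytes of HELLO.EXE, copied from the dump in Figure 3-6. */
const uint8_t hello_header[32] = {
    0x4D, 0x5A, 0x28, 0x00, 0x02, 0x00, 0x01, 0x00,
    0x20, 0x00, 0x09, 0x00, 0xFF, 0xFF, 0x03, 0x00,
    0x80, 0x00, 0x20, 0x05, 0x00, 0x00, 0x00, 0x00,
    0x1E, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00
};

/* Read a 16-bit word stored low byte first at the given byte offset. */
uint16_t word_at(const uint8_t *p, size_t off)
{
    return (uint16_t)(p[off] | (p[off + 1] << 8));
}

/* Nonzero if the header begins with the .EXE signature 4DH 5AH ("MZ"). */
int exe_signature_ok(const uint8_t *hdr)
{
    return hdr[0] == 0x4D && hdr[1] == 0x5A;
}

/* File length in bytes, from the page count at 0004H and the length
   MOD 512 at 0002H. A nonzero remainder means the last page is partial. */
uint32_t exe_file_bytes(const uint8_t *hdr)
{
    uint16_t last  = word_at(hdr, 0x02);
    uint16_t pages = word_at(hdr, 0x04);

    if (last != 0)
        return (uint32_t)(pages - 1) * 512u + last;
    return (uint32_t)pages * 512u;
}

/* Header size in bytes, from the paragraph count at 0008H. */
uint32_t exe_header_bytes(const uint8_t *hdr)
{
    return (uint32_t)word_at(hdr, 0x08) * 16u;
}
```

  Applied to hello_header, these routines report a 512-byte header and a
  552-byte (0228H) file, which agrees with the last byte of the dump lying
  at offset 0227H. Reading the words at 0006H, 000AH, and 000CH gives one
  relocation item, a MIN_ALLOC of 0009H, and a MAX_ALLOC of 0FFFFH.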

An Example .EXE Program

  The HELLO.EXE program in Figure 3-7 demonstrates the fundamental
  structure of an assembly-language program that is destined to become a
  .EXE file. At minimum, it should have a module name, a code segment, a
  stack segment, and a primary procedure that receives control of the
  computer from MS-DOS after the program is loaded. The HELLO.EXE program
  also contains a data segment to provide a more complete example.

  The NAME, TITLE, and PAGE directives were covered in the HELLO.COM example
  program and are used in the same manner here, so we'll move to the first
  new item of interest. After a few comments and EQU statements, we come to
  a declaration of a code segment that begins on line 21 with a SEGMENT
  directive and ends on line 41 with an ENDS directive. As in the HELLO.COM
  example program, the label in the leftmost field of the line gives the
  code segment the name _TEXT. The operand fields at the right end of the
  line give the attributes WORD, PUBLIC, and 'CODE'.

  Following the code-segment declaration, we find an ASSUME statement on
  line 23. Notice that, unlike the equivalent statement in the HELLO.COM
  program, the ASSUME statement in this program specifies several different
  segment names. Again, remember that this statement has no direct effect on
  the contents of the segment registers but affects only the operation of
  the assembler itself.

  ──────────────────────────────────────────────────────────────────────────
   1:          name    hello
   2:          page    55,132
   3:          title   HELLO.EXE--print Hello on terminal
   4:  ;
   5:  ; HELLO.EXE:    demonstrates various components
   6:  ;               of a functional .EXE-type assembly-
   7:  ;               language program, use of segments,
   8:  ;               and an MS-DOS function call.
   9:  ;
  10:  ; Ray Duncan, May 1988
  11:  ;
  12:
  13:  stdin   equ     0               ; standard input handle
  14:  stdout  equ     1               ; standard output handle
  15:  stderr  equ     2               ; standard error handle
  16:
  17:  cr      equ     0dh             ; ASCII carriage return
  18:  lf      equ     0ah             ; ASCII linefeed
  19:
  20:
  21:  _TEXT   segment word public 'CODE'
  22:
  23:          assume  cs:_TEXT,ds:_DATA,ss:STACK
  24:
  25:  print   proc    far             ; entry point from MS-DOS
  26:
  27:          mov     ax,_DATA        ; make our data segment
  28:          mov     ds,ax           ; addressable...
  29:
  30:          mov     ah,40h          ; function 40h = write
  31:          mov     bx,stdout       ; standard output handle
  32:          mov     cx,msg_len      ; length of message
  33:          mov     dx,offset msg   ; address of message
  34:          int     21h             ; transfer to MS-DOS
  35:
  36:          mov     ax,4c00h        ; exit, return code = 0
  37:          int     21h             ; transfer to MS-DOS
  38:
  39:  print   endp
  40:
  41:  _TEXT   ends
  42:
  43:
  44:  _DATA   segment word public 'DATA'
  45:
  46:  msg     db      cr,lf           ; message to display
  47:          db      'Hello World!',cr,lf
  48:
  49:  msg_len equ     $-msg           ; length of message
  50:
  51:  _DATA   ends
  52:
  53:
  54:  STACK   segment para stack 'STACK'
  55:
  56:          db      128 dup (?)
  57:
  58:  STACK   ends
  59:
  60:          end     print           ; defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-7.  The HELLO.EXE program listing.

  Within the code segment, the main print procedure is declared by the PROC
  directive on line 25 and closed with ENDP on line 39. Because the procedure
  resides in a .EXE file, we have given it the FAR attribute as an example,
  but the attribute is really irrelevant because the program is so small and
  the procedure is not called by anything else in the same program.

  The print procedure first initializes the DS register, as indicated in the
  earlier ASSUME statement, loading it with a value that causes it to point
  to the base of the data area. (MS-DOS automatically sets up the CS and SS
  registers.) Next, the procedure uses MS-DOS Int 21H Function 40H to
  display the message Hello World! on the screen, just as in the HELLO.COM
  program. Finally, the procedure exits back to MS-DOS with an Int 21H
  Function 4CH on lines 36 and 37, passing a return code of zero (which by
  convention means success).

  Lines 44 through 51 declare a data segment named _DATA, which contains the
  variables and constants the program will use. If the various modules of a
  program contain multiple data segments with the same name, the linker will
  collect them and place them in the same physical memory segment.

  Lines 54 through 58 establish a stack segment; PUSH and POP instructions
  will access this area of scratch memory. Before MS-DOS transfers control
  to a .EXE program, it sets up the SS and SP registers according to the
  declared size and location of the stack segment. Be sure to allow enough
  room for the maximum stack depth that can occur at runtime, plus a safe
  number of extra words for registers pushed onto the stack during an MS-DOS
  service call. If the stack overflows, it may damage your other code and
  data segments and cause your program to behave strangely or even to crash
  altogether!

  The END statement on line 60 winds up our brief HELLO.EXE program, telling
  the assembler that it has reached the end of the source file and providing
  the label of the program's point of entry from MS-DOS.

  The differences between .COM and .EXE programs are summarized in Figure
  3-8.

                     .COM program               .EXE program
  ──────────────────────────────────────────────────────────────────────────
  Maximum size       65,536 bytes minus 256     No limit
                     bytes for PSP and 2 bytes
                     for stack

  Entry point        PSP:0100H                  Defined by END statement

  AL at entry        00H if default FCB #1 has  Same
                     valid drive, 0FFH if
                     invalid drive

  AH at entry        00H if default FCB #2 has  Same
                     valid drive, 0FFH if
                     invalid drive

  CS at entry        PSP                        Segment containing module
                                                with entry point

  IP at entry        0100H                      Offset of entry point within
                                                its segment

  DS at entry        PSP                        PSP

  ES at entry        PSP                        PSP

  SS at entry        PSP                        Segment with STACK attribute

  SP at entry        0FFFEH or top word in      Size of segment defined with
                     available memory,          STACK attribute
                     whichever is lower

  Stack at entry     Zero word                  Initialized or uninitialized

  Stack size         65,536 bytes minus 256     Defined in segment with
                     bytes for PSP and size of  STACK attribute
                     executable code and data

  Subroutine calls   Usually NEAR               NEAR or FAR

  Exit method        Int 21H Function 4CH       Int 21H Function 4CH
                     preferred, NEAR RET if     preferred
                     MS-DOS version 1

  Size of file       Exact size of program      Size of program plus header
                                                (multiple of 512 bytes)
  ──────────────────────────────────────────────────────────────────────────


  Figure 3-8.  Summary of the differences between .COM and .EXE programs,
  including their entry conditions.


More About Assembly-Language Programs

  Now that we've looked at working examples of .COM and .EXE
  assembly-language programs, let's backtrack and discuss their elements a
  little more formally. The following discussion is based on the Microsoft
  Macro Assembler, hereafter referred to as MASM. If you are familiar with
  MASM and are an experienced assembly-language programmer, you may want to
  skip this section.

  MASM programs can be thought of as having three structural levels:

    The module level

    The segment level

    The procedure level

  Modules are simply chunks of source code that can be independently
  maintained and assembled. Segments are physical groupings of like items
  (machine code or data) within a program and a corresponding segregation of
  dissimilar items. Procedures are functional subdivisions of an executable
  program──routines that carry out a particular task.

Program Modules

  Under MS-DOS, the module-level structure consists of files containing the
  source code for individual routines. Each source file is translated by the
  assembler into a relocatable object module. An object module can reside
  alone in an individual file or with many other object modules in an
  object-module library of frequently used or related routines. The
  Microsoft Object Linker (LINK) combines object-module files, often with
  additional object modules extracted from libraries, into an executable
  program file.

  Using modules and object-module libraries reduces the size of your
  application source files (and vastly increases your productivity), because
  these files need not contain the source code for routines they have in
  common with other programs. This technique also allows you to maintain the
  routines more easily, because you need to alter only one copy of their
  source code stored in one place, instead of many copies stored in
  different applications. When you improve (or fix) one of these routines,
  you can simply reassemble it, put its object module back into the library,
  relink all of the programs that use the routine, and voilà: instant
  upgrade.

Program Segments

  The term segments refers to two discrete programming concepts: physical
  segments and logical segments.

  Physical segments are 64 KB blocks of memory. The Intel 8086/8088 and
  80286 microprocessors have four segment registers, which are essentially
  used as pointers to these blocks. (The 80386 has six segment registers,
  which are a superset of those found on the 8086/8088 and 80286.) Each
  segment register can point to the bottom of a different 64 KB area of
  memory. Thus, a program can address any location in memory by appropriate
  manipulation of the segment registers, but the maximum amount of memory
  that it can address simultaneously is 256 KB.
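
  The relationship between a segment register value and the block of memory
  it points to can be sketched in C. On the 8086 family in real mode, the
  processor forms a physical address by multiplying the segment value by 16
  and adding the 16-bit offset. The function name is hypothetical, and the
  sketch ignores the wraparound that occurs above 1 MB on the 8086/8088.

```c
#include <stdint.h>

/* Physical address for a real-mode segment:offset pair: the segment
   value is shifted left 4 bits (multiplied by 16) and the offset added. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

  Because the offset is limited to 16 bits, each segment register reaches
  at most 65,536 bytes from its base, and four segment registers therefore
  cover at most 4 x 64 KB = 256 KB at one time. Note also that different
  segment:offset pairs can name the same byte: 1234H:0010H and 1235H:0000H
  both refer to physical address 12350H.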

  As we discussed earlier in the chapter, .COM programs assume that all four
  segment registers always point to the same place──the bottom of the
  program. Thus, they are limited to a maximum size of 64 KB. .EXE programs,
  on the other hand, can address many different physical segments and can
  reset the segment registers to point to each segment as it is needed.
  Consequently, the only practical limit on the size of a .EXE program is
  the amount of available memory. The example programs throughout the
  remainder of this book focus on .EXE programs.

  Logical segments are the program components. A typical .EXE program
  declares at least three logical segments: a code segment, a data
  segment, and a stack segment. Programs with more than 64 KB of code or
  data have more than one code or data segment. The routines or data that
  are used most frequently are put into the primary code and data segments
  for speed, and routines or data that are used less frequently are put into
  secondary code and data segments.

  Segments are declared with the SEGMENT and ENDS directives in the
  following form:

  name   SEGMENT attributes
  .
  .
  .
  name   ENDS

  The attributes of a segment include its align type (BYTE, WORD, or PARA),
  combine type (PUBLIC, PRIVATE, COMMON, or STACK), and class type. The
  segment attributes are used by the linker when it is combining logical
  segments to create the physical segments of an executable program. Most of
  the time, you can get by just fine using a small selection of attributes
  in a rather stereotypical way. However, if you want to use the full range
  of attributes, you might want to read the detailed explanation in the MASM
  manual.

  Programs are classified into one memory model or another based on the
  number of their code and data segments. The most commonly used memory
  model for assembly-language programs is the small model, which has one
  code and one data segment, but you can also use the medium, compact, and
  large models (Figure 3-9). (Two additional models exist with which we
  will not be concerning ourselves further: the tiny model, which consists
  of intermixed code and data in a single segment──for example, a .COM file
  under MS-DOS; and the huge model, which is supported by the Microsoft C
  Optimizing Compiler and which allows use of data structures larger than 64
  KB.)

  Model                    Code segments           Data segments
  ──────────────────────────────────────────────────────────────────────────
  Small                    One                     One
  Medium                   Multiple                One
  Compact                  One                     Multiple
  Large                    Multiple                Multiple
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-9.  Memory models commonly used in assembly-language and C
  programs.

  For each memory model, Microsoft has established certain segment and class
  names that are used by all its high-level-language compilers (Figure
  3-10). Because segment names are arbitrary, you may as well adopt the
  Microsoft conventions. Their use will make it easier for you to integrate
  your assembly-language routines into programs written in languages such as
  C, or to use routines from high-level-language libraries in your
  assembly-language programs.

  Another important Microsoft high-level-language convention is to use the
  GROUP directive to name the near data segment (the segment the program
  expects to address with offsets from the DS register) and the stack
  segment as members of DGROUP (the automatic data group), a special name
  recognized by the linker and also by the program loaders in Microsoft
  Windows and Microsoft OS/2. The GROUP directive causes logical segments
  with different names to be combined into a single physical segment so that
  they can be addressed using the same segment base address. In C programs,
  DGROUP also contains the local heap, which is used by the C runtime
  library for dynamic allocation of small amounts of memory.

  Memory      Segment      Align       Combine     Class        Group
  model       name         type        type        type
  ──────────────────────────────────────────────────────────────────────────
  Small       _TEXT        WORD        PUBLIC      CODE
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Medium      module_TEXT  WORD        PUBLIC      CODE
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Compact     _TEXT        WORD        PUBLIC      CODE
              data         PARA        PRIVATE     FAR_DATA
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Large       module_TEXT  WORD        PUBLIC      CODE
              .
              .
              .
              data         PARA        PRIVATE     FAR_DATA
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP
  ──────────────────────────────────────────────────────────────────────────


  Figure 3-10.  Segments, groups, and classes for the standard memory models
  as used with assembly-language programs. The Microsoft C Optimizing
  Compiler and other high-level-language compilers use a superset of these
  segments and classes.

  For pure assembly-language programs that will run under MS-DOS, you can
  ignore DGROUP. However, if you plan to integrate assembly-language
  routines and programs written in high-level languages, you'll want to
  follow the Microsoft DGROUP convention. For example, if you are planning
  to link routines from a C library into an assembly-language program, you
  should include the line

  DGROUP group _DATA,STACK

  near the beginning of the program.

  The final Microsoft convention of interest in creating .EXE programs is
  segment order. The high-level compilers assume that code segments always
  come first, followed by far data segments, followed by the near data
  segment, with the stack and heap last. This order won't concern you much
  until you begin integrating assembly-language code with routines from
  high-level-language libraries, but it is easiest to learn to use the
  convention right from the start.

Program Procedures

  The procedure level of program structure is partly real and partly
  conceptual. Procedures are basically just a fancy guise for subroutines.

  Procedures within a program are declared with the PROC and ENDP directives
  in the following form:

  name   PROC attribute
  .
  .
  .
         RET
  name   ENDP

  The attribute carried by a PROC declaration, which is either NEAR or FAR,
  tells the assembler what type of call you expect to use to enter the
  procedure──that is, whether the procedure will be called from other
  routines in the same segment or from routines in other segments. When the
  assembler encounters a RET instruction within the procedure, it uses the
  attribute information to generate the correct opcode for either a near
  (intra-segment) or far (inter-segment) return.
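
  The assembler's choice of return opcode can be modeled with a short C
  sketch. The opcode values are those of the 8086 instruction set (0C3H for
  a near return, 0CBH for a far return); the type and function names are
  hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* The attribute given in a PROC declaration. */
enum proc_attr { PROC_NEAR, PROC_FAR };

/* Opcode emitted for a plain RET inside a procedure with this attribute. */
uint8_t ret_opcode(enum proc_attr attr)
{
    return attr == PROC_FAR ? 0xCB : 0xC3;
}

/* Bytes of return address removed from the stack by that RET:
   a near return pops IP only; a far return pops IP and then CS. */
unsigned ret_bytes_popped(enum proc_attr attr)
{
    return attr == PROC_FAR ? 4u : 2u;
}
```

  The difference in bytes popped is why the attribute must match the type
  of CALL used to enter the procedure: a far return paired with a near call
  would remove two bytes too many from the stack.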

  Each program should have a main procedure that receives control from
  MS-DOS. You specify the entry point for the program by including the name
  of the main procedure in the END statement in one of the program's source
  files. The main procedure's attribute (NEAR or FAR) is really not too
  important, because the program returns control to MS-DOS with a function
  call rather than a RET instruction. However, by convention, most
  programmers assign the main procedure the FAR attribute anyway.

  You should break the remainder of the program into procedures in an
  orderly way, with each procedure performing a well-defined single
  function, returning its results to its caller, and avoiding actions that
  have global effects within the program. Ideally, procedures invoke each
  other only by CALL instructions, have only one entry point and one exit
  point, and always exit by means of a RET instruction, never by jumping to
  some other location within the program.

  For ease of understanding and maintenance, a procedure should not exceed
  one page (about 60 lines); if it is longer than a page, it is probably too
  complex and you should delegate some of its function to one or more
  subsidiary procedures. You should preface the source code for each
  procedure with a detailed comment that states the procedure's calling
  sequence, results returned, registers affected, and any data items
  accessed or modified. The effort invested in making your procedures
  compact, clean, flexible, and well-documented will be repaid many times
  over when you reuse the procedures in other programs.



────────────────────────────────────────────────────────────────────────────
Chapter 4  MS-DOS Programming Tools

  Preparing a new program to run under MS-DOS is an iterative process with
  four basic steps:

  ■  Use of a text editor to create or modify an ASCII source-code file

  ■  Use of an assembler or high-level-language compiler (such as the
     Microsoft Macro Assembler or the Microsoft C Optimizing Compiler) to
     translate the source file into relocatable object code

  ■  Use of a linker to transform the relocatable object code into an
     executable MS-DOS load module

  ■  Use of a debugger to methodically test and debug the program

  Additional utilities the MS-DOS software developer may find necessary or
  helpful include the following:

  ■  LIB, which creates and maintains object-module libraries

  ■  CREF, which generates a cross-reference listing

  ■  EXE2BIN, which converts .EXE files to .COM files

  ■  MAKE, which compares dates of files and carries out operations based on
     the result of the comparison

  This chapter gives an operational overview of the Microsoft programming
  tools for MS-DOS, including the assembler, the C compiler, the linker, and
  the librarian. In general, the information provided here also applies to
  the IBM programming tools for MS-DOS, which are really the Microsoft
  products with minor variations and different version numbers. Even if your
  preferred programming language is not C or assembly language, you will
  need at least a passing familiarity with these tools because all of the
  examples in the IBM and Microsoft DOS reference manuals are written in one
  of these languages.

  The survey in this chapter, together with the example programs and
  reference section elsewhere in the book, should provide the experienced
  programmer with sufficient information to immediately begin writing useful
  programs. Readers who do not have a background in C, assembly language, or
  the Intel 80x86 microprocessor architecture should refer to the tutorial
  and reference works listed at the end of this chapter.


File Types

  The MS-DOS programming tools can create and process many different file
  types. The following extensions are used by convention for these files:

  Extension  File type
  ──────────────────────────────────────────────────────────────────────────
  .ASM       Assembly-language source file

  .C         C source file

  .COM       MS-DOS executable load module that does not require relocation
             at runtime

  .CRF       Cross-reference information file produced by the assembler for
             processing by CREF.EXE

  .DEF       Module-definition file describing a program's segment behavior
             (MS OS/2 and Microsoft Windows programs only; not relevant to
             normal MS-DOS applications)

  .EXE       MS-DOS executable load module that requires relocation at
             runtime

  .H         C header file containing C source code for constants, macros,
             and functions; merged into another C program with the #include
             directive

  .INC       Include file for assembly-language programs, typically
             containing macros and/or equates for systemwide values such as
             error codes

  .LIB       Object-module library file made up of one or more .OBJ files;
             indexed and manipulated by LIB.EXE

  .LST       Program listing, produced by the assembler, that includes
             memory locations, machine code, the original program text, and
             error messages

  .MAP       Listing of symbols and their locations within a load module;
             produced by the linker

  .OBJ       Relocatable-object-code file produced by an assembler or
             compiler

  .REF       Cross-reference listing produced by CREF.EXE from the
             information in a .CRF file
  ──────────────────────────────────────────────────────────────────────────



The Microsoft Macro Assembler

  The Microsoft Macro Assembler (MASM) is distributed as the file MASM.EXE.
  When beginning a program translation, MASM needs the following
  information:

  ■  The name of the file containing the source program

  ■  The filename for the object program to be created

  ■  The destination of the program listing

  ■  The filename for the information that is later processed by the
     cross-reference utility (CREF.EXE)

  You can invoke MASM in two ways. If you enter the name of the assembler
  alone, it prompts you for the name of each input and output file. The
  assembler supplies reasonable defaults for all the responses except the
  source filename, as shown in the following example:

  C>MASM  <Enter>

  Microsoft (R) Macro Assembler Version 5.10
  Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.

  Source filename [.ASM]: HELLO  <Enter>
  Object filename [HELLO.OBJ]:  <Enter>
  Source listing  [NUL.LST]:  <Enter>
  Cross-reference [NUL.CRF]:  <Enter>

    49006 Bytes symbol space free

        0 Warning Errors
        0 Severe Errors

  C>

  You can use a logical device name (such as PRN or COM1) at any of the MASM
  prompts to send that part of the assembler's output to a character device
  rather than to a file. Note that the default for the listing and
  cross-reference files is the NUL device──that is, no file is created. If
  you end any response with a semicolon, MASM assumes that the remaining
  responses are all to be the default.

  A more efficient way to use MASM is to supply all parameters in the
  command line, as follows:

    MASM [options] source,[object],[listing],[crossref]

  For example, the following command lines are equivalent to the preceding
  interactive session:

  C>MASM HELLO,,NUL,NUL  <Enter>

  or

  C>MASM HELLO;  <Enter>

  These commands use the file HELLO.ASM as the source, generate the
  object-code file HELLO.OBJ, and send the listing and cross-reference files
  to the bit bucket.
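
  The way empty fields and a terminating semicolon fall back to defaults
  can be sketched in C. This is only an illustration of the rules described
  above, not MASM's own logic: the structure masm_files and the functions
  take_field and parse_masm_fields are hypothetical names, switches are not
  handled, filenames are assumed to carry no drive or path, and the sketch
  assumes that a name given without an extension receives the default
  extension for its field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 64

/* Hypothetical container for the four filenames MASM needs. */
struct masm_files {
    char source[NAME_LEN];
    char object[NAME_LEN];
    char listing[NAME_LEN];
    char crossref[NAME_LEN];
};

/* Copy one field (text up to ',' or ';' or end of string) into out.
   An empty field takes deflt; a name with no '.' gets ext appended.
   Returns a pointer to the character that ended the field. */
static const char *take_field(const char *p, char *out,
                              const char *deflt, const char *ext)
{
    size_t n = strcspn(p, ",;");

    if (n == 0) {
        snprintf(out, NAME_LEN, "%s", deflt);
    } else {
        int has_ext = memchr(p, '.', n) != NULL;
        snprintf(out, NAME_LEN, "%.*s%s", (int)n, p, has_ext ? "" : ext);
    }
    return p + n;
}

/* Split "source,[object],[listing],[crossref]" into its four names.
   Returns 0 on success, -1 if the source name is missing or too long. */
int parse_masm_fields(const char *args, struct masm_files *f)
{
    char default_obj[NAME_LEN];
    const char *p;
    size_t base_len;

    assert(args != NULL && f != NULL);

    /* The object file defaults to the source name with .OBJ. */
    base_len = strcspn(args, ",;.");
    if (base_len == 0 || base_len > NAME_LEN - 5)
        return -1;
    snprintf(default_obj, sizeof default_obj, "%.*s.OBJ",
             (int)base_len, args);

    /* A comma advances to the next field; at ';' or end of string the
       pointer stays put, so every remaining field reads as empty. */
    p = take_field(args, f->source, "", ".ASM");
    if (*p == ',') p++;
    p = take_field(p, f->object, default_obj, ".OBJ");
    if (*p == ',') p++;
    p = take_field(p, f->listing, "NUL.LST", ".LST");
    if (*p == ',') p++;
    take_field(p, f->crossref, "NUL.CRF", ".CRF");
    return 0;
}
```

  Under these assumptions, the strings HELLO,,NUL,NUL and HELLO; produce the
  same four names, which mirrors the equivalence of the two command lines
  shown above.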

  MASM accepts several optional switches in the command line to control
  code generation and output files. Figure 4-1 lists the switches accepted
  by MASM version 5.1. As shown in the following example, you can put
  frequently used options in a MASM environment variable, where they will be
  found automatically by the assembler:

  C>SET MASM=/T /Zi  <Enter>

  The switches in the environment variable will be overridden by any that
  you enter in the command line.

  In other versions of the Microsoft Macro Assembler, additional or fewer
  switches may be available. For exact instructions, see the manual for the
  version of MASM that you are using.

  Switch     Meaning
  ──────────────────────────────────────────────────────────────────────────
  /A         Arrange segments in alphabetic order.
  /Bn        Set size of source-file buffer (in KB).
  /C         Force creation of a cross-reference (.CRF) file.
  /D         Produce listing on both passes (to find phase errors).
  /Dsymbol   Define symbol as a null text string (symbol can be referenced
             by conditional assembly directives in file).
  /E         Assemble for 80x87 numeric coprocessor emulator using IEEE
             real-number format.
  /Ipath     Set search path for include files.
  /L         Force creation of a program-listing file.
  /LA        Force listing of all generated code.
  /ML        Preserve case sensitivity in all names (uppercase names
             distinct from their lowercase equivalents).
  /MX        Preserve lowercase in external names only (names defined with
             PUBLIC or EXTRN directives).
  /MU        Convert all lowercase names to uppercase.
  /N         Suppress generation of tables of macros, structures, records,
             segments, groups, and symbols at the end of the listing.
  /P         Check for impure code in 80286/80386 protected mode.
  /S         Arrange segments in order of occurrence (default).
  /T         "Terse" mode; suppress all messages unless errors are
             encountered during the assembly.
  /V         "Verbose" mode; report number of lines and symbols at end of
             assembly.
  /Wn        Set error display (warning) level; n=0-2.
  /X         Force listing of false conditionals.
  /Z         Display source lines containing errors on the screen.
  /Zd        Include line-number information in .OBJ file.
  /Zi        Include line-number and symbol information in .OBJ file.
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-1.  Microsoft Macro Assembler version 5.1 switches.

  MASM allows you to override the default extensions on any file──a feature
  that can be rather dangerous. For example, if in the preceding example you
  had responded to the Object filename prompt with HELLO.ASM, the assembler
  would have accepted the entry without comment and destroyed your source
  file. This is not too likely to happen in the interactive command mode,
  but you must be very careful with file extensions when MASM is used in a
  batch file.
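
  One way to guard against this accident is to compare the resolved source
  and object names before the assembler is started. The following C sketch
  is a hypothetical helper, not part of MASM: the function names are
  invented for illustration, the names are assumed to carry no drive or
  path, and the comparison ignores case because MS-DOS filenames are not
  case sensitive.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Append ext to name unless name already carries an extension. */
static void with_default_ext(const char *name, const char *ext,
                             char *out, size_t outsz)
{
    snprintf(out, outsz, "%s%s", name, strchr(name, '.') ? "" : ext);
}

/* Compare two filenames without regard to case. */
static int same_name(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns 1 if the object filename resolves to the source file,
   which would let the assembler overwrite the source. */
int object_clobbers_source(const char *source, const char *object)
{
    char src[80], obj[80];

    assert(source != NULL && object != NULL);
    with_default_ext(source, ".ASM", src, sizeof src);
    with_default_ext(object, ".OBJ", obj, sizeof obj);
    return same_name(src, obj);
}
```

  A program that builds MASM command lines on behalf of a batch process
  could call object_clobbers_source and refuse to continue when it returns
  1.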


The Microsoft C Optimizing Compiler

  The Microsoft C Optimizing Compiler consists of three executable files──
  C1.EXE, C2.EXE, and C3.EXE──that implement the C preprocessor, language
  translator, code generator, and code optimizer. An additional control
  program, CL.EXE, executes the three compiler files in order, passing each
  the necessary information about filenames and compilation options.

  Before using the C compiler and the linker, you need to set up four
  environment variables:

  Variable                 Action
  ──────────────────────────────────────────────────────────────────────────
  PATH=path                Specifies the location of the three executable C
                           compiler files (C1, C2, and C3) if they are not
                           in the current directory; used by CL.EXE.

  INCLUDE=path             Specifies the location of #include files (default
                           extension .H) that are not found in the current
                           directory.

  LIB=path                 Specifies the location(s) for object-code
                           libraries that are not found in the current
                           directory.

  TMP=path                 Specifies the location for temporary working
                           files created by the C compiler and linker.
  ──────────────────────────────────────────────────────────────────────────
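
  A small C program can confirm that these four variables are present
  before a build is attempted. The sketch below uses only the standard
  getenv function; the name count_missing_vars is hypothetical, and the
  check says nothing about whether the directories named actually exist.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The four variables named in the table above. */
static const char *const required_vars[] = {
    "PATH", "INCLUDE", "LIB", "TMP"
};

/* Write one line per variable to report and return the number of
   variables that are missing or empty. */
int count_missing_vars(FILE *report)
{
    int missing = 0;
    size_t i;

    assert(report != NULL);
    for (i = 0; i < sizeof required_vars / sizeof required_vars[0]; i++) {
        const char *value = getenv(required_vars[i]);

        if (value == NULL || *value == '\0') {
            fprintf(report, "%s is not set\n", required_vars[i]);
            missing++;
        } else {
            fprintf(report, "%s=%s\n", required_vars[i], value);
        }
    }
    return missing;
}
```

  Calling count_missing_vars(stdout) from main and returning the result as
  the exit code would let a batch file test the outcome with IF ERRORLEVEL.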

  CL.EXE does not support an interactive mode or response files. You always
  invoke it with a command line of the following form:

    CL [options] file [file ...]

  You may list any number of files──if a file has a .C extension, it will be
  compiled into a relocatable-object-module (.OBJ) file. Ordinarily, if the
  compiler encounters no errors, it automatically passes all resulting .OBJ
  files and any additional .OBJ files specified in the command line to the
  linker, along with the names of the appropriate runtime libraries.

  The C compiler has many optional switches controlling its memory models,
  output files, code generation, and code optimization. These are summarized
  in Figure 4-2. The C compiler's arcane switch syntax is derived largely
  from UNIX/XENIX, so don't expect it to make any sense.

  Switch                   Meaning
  ──────────────────────────────────────────────────────────────────────────
  /Ax                      Select memory model:
                           C = compact model
                           H = huge model
                           L = large model
                           M = medium model
                           S = small model (default)
  /c                       Compile only; do not invoke linker.
  /C                       Do not strip comments.
  /D<name>[=text]          Define macro.
  /E                       Send preprocessor output to standard output.
  /EP                      Send preprocessor output to standard output
                           without line numbers.
  /F<n>                    Set stack size (in hexadecimal bytes).
  /Fa [filename]           Generate assembly listing.
  /Fc [filename]           Generate mixed source/object listing.
  /Fe [filename]           Force executable filename.
  /Fl [filename]           Generate object listing.
  /Fm [filename]           Generate map file.
  /Fo [filename]           Force object-module filename.
  /FPx                     Select floating-point control:
                           a = calls with alternate math library
                           c = calls with emulator library
                           c87 = calls with 8087 library
                           i = in-line with emulator (default)
                           i87 = in-line with 8087
  /Fs [filename]           Generate source listing.
  /Gx                      Select code generation:
                           0 = 8086 instructions (default)
                           1 = 186 instructions
                           2 = 286 instructions
                           c = Pascal style function calls
                           s = no stack checking
                           t[n] = data size threshold
  /H<n>                    Specify external name length.
  /I<path>                 Specify additional #include path.
  /J                       Specify default char type as unsigned.
  /link [options]          Pass switches and library names to linker.
  /Ox                      Select optimization:
                           a = ignore aliasing
                           d = disable optimizations
                           i = enable intrinsic functions
                           l = enable loop optimizations
                           n = disable "unsafe" optimizations
                           p = enable precision optimizations
                           r = disable in-line return
                           s = optimize for space
                           t = optimize for speed (default)
                           w = ignore aliasing except across function
                           calls
                           x = enable maximum optimization (equivalent to
                           /Oailt /Gs)
  /P                       Send preprocessor output to file.
  /Sx                      Select source-listing control:
                           l<columns> = set line width
                           p<lines> = set page length
                           s<string> = set subtitle string
                           t<string> = set title string
  /Tc<file>                Compile file without .C extension.
  /u                       Remove all predefined macros.
  /U<name>                 Remove specified predefined macro.
  /V<string>               Set version string.
  /W<n>                    Set warning level (0-3).
  /X                       Ignore "standard places" for include files.
  /Zx                      Select miscellaneous compilation control:
                           a = disable extensions
                           c = make Pascal functions case-insensitive
                           d = include line-number information
                           e = enable extensions (default)
                           g = generate declarations
                           i = include symbolic debugging information
                           l = remove default library info
                           p<n> = pack structures on n-byte boundary
                           s = check syntax only
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-2.  Microsoft C Optimizing Compiler version 5.1 switches.


The Microsoft Object Linker

  The object module produced by MASM from a source file is in a form that
  contains relocation information and may also contain unresolved references
  to external locations or subroutines. It is written in a common format
  that is also produced by the various high-level compilers (such as FORTRAN
  and C) that run under MS-DOS. The computer cannot execute object modules
  without further processing.

  The Microsoft Object Linker (LINK), distributed as the file LINK.EXE,
  accepts one or more of these object modules, resolves external references,
  includes any necessary routines from designated libraries, performs any
  necessary offset relocations, and writes a file that can be loaded and
  executed by MS-DOS. The output of LINK is always in .EXE load-module
  format. (See Chapter 3.)

  As with MASM, you can give LINK its parameters interactively or by
  entering all the required information in a single command line. If you
  enter the name of the linker alone, the following type of dialog ensues:

  C>LINK  <Enter>

  Microsoft (R) Overlay Linker  Version 3.61
  Copyright (C) Microsoft Corp 1983-1987. All rights reserved.

  Object Modules [.OBJ]: HELLO  <Enter>
  Run File [HELLO.EXE]:  <Enter>
  List File [NUL.MAP]: HELLO  <Enter>
  Libraries [.LIB]:  <Enter>

  C>

  If you are using LINK version 4.0 or later, the linker also asks for the
  name of a module-definition (.DEF) file. Simply press the Enter key in
  response to such a prompt. Module-definition files are used when building
  Microsoft Windows or MS OS/2 "new .EXE" executable files but are not
  relevant in normal MS-DOS applications.

  The input file for this example was HELLO.OBJ; the output files were
  HELLO.EXE (the executable program) and HELLO.MAP (the load map produced by
  the linker after all references and addresses were resolved). Figure 4-3
  shows the load map.

  ──────────────────────────────────────────────────────────────────────────
   Start  Stop   Length Name                   Class
   00000H 00017H 00018H _TEXT                  CODE
   00018H 00027H 00010H _DATA                  DATA
   00030H 000AFH 00080H STACK                  STACK
   000B0H 000BBH 0000CH $$TYPES                DEBTYP
   000C0H 000D6H 00017H $$SYMBOLS              DEBSYM

    Address         Publics by Name

    Address         Publics by Value

  Program entry point at 0000:0000
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-3.  Map produced by the Microsoft Object Linker (LINK) during the
  generation of the HELLO.EXE program from Chapter 3. The program contains
  one CODE, one DATA, and one STACK segment. The first instruction to be
  executed lies in the first byte of the CODE segment. The $$TYPES and
  $$SYMBOLS segments contain information for the CodeView debugger and are
  not part of the program; these segments are ignored by the normal MS-DOS
  loader.
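
  The three numeric columns of the map are related by simple arithmetic:
  Stop equals Start plus Length minus 1. The following C sketch checks that
  relationship for the values in Figure 4-3 and computes the filler bytes
  between adjacent segments. The structure and function names are
  hypothetical, and the check assumes that no segment has a length of zero.

```c
#include <assert.h>
#include <stddef.h>

struct map_entry {
    unsigned long start, stop, length;
    const char *name;
};

/* Values copied from the Start, Stop, and Length columns of Figure 4-3. */
static const struct map_entry hello_map[] = {
    { 0x00000UL, 0x00017UL, 0x00018UL, "_TEXT"     },
    { 0x00018UL, 0x00027UL, 0x00010UL, "_DATA"     },
    { 0x00030UL, 0x000AFUL, 0x00080UL, "STACK"     },
    { 0x000B0UL, 0x000BBUL, 0x0000CUL, "$$TYPES"   },
    { 0x000C0UL, 0x000D6UL, 0x00017UL, "$$SYMBOLS" }
};

/* Returns 1 when every entry satisfies stop == start + length - 1 and
   no entry begins before the previous one ends. */
int map_is_consistent(const struct map_entry *m, size_t count)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < count; i++) {
        if (m[i].length == 0 ||
            m[i].stop != m[i].start + m[i].length - 1)
            return 0;
        if (i > 0 && m[i].start <= m[i - 1].stop)
            return 0;
    }
    return 1;
}

/* Number of unused bytes between entry i and the entry before it. */
unsigned long padding_before(const struct map_entry *m, size_t i)
{
    assert(m != NULL && i > 0);
    return m[i].start - m[i - 1].stop - 1;
}
```

  For the STACK entry, padding_before reports 8 bytes: _DATA ends at 00027H
  and STACK begins at 00030H, which is consistent with the stack segment
  being aligned on a 16-byte paragraph boundary.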

  You can obtain the same result more quickly by entering all parameters in
  the command line, in the following form:

    LINK [options] objectfile, [exefile], [mapfile], [libraries]

  Thus, the command-line equivalent to the preceding interactive session is

  C>LINK HELLO,HELLO,HELLO,,  <Enter>

  or

  C>LINK HELLO,,HELLO;  <Enter>

  If you enter a semicolon as the last character in the command line, LINK
  assumes the default values for all further parameters.

  A third method of commanding LINK is with a response file. A response file
  contains lines of text that correspond to the responses you would give the
  linker interactively. You specify the name of the response file in the
  command line with a leading @ character, as follows:

    LINK @filename

  You can also enter the name of a response file at any prompt. If the
  response file is not complete, LINK will prompt you for the missing
  information.

  When entering linker commands, you can specify multiple object files with
  the + operator or with spaces, as in the following example:

  C>LINK HELLO+VMODE+DOSINT,MYPROG,,GRAPHICS;  <Enter>

  This command would link the files HELLO.OBJ, VMODE.OBJ, and DOSINT.OBJ,
  searching the library file GRAPHICS.LIB to resolve any references to
  symbols not defined in the specified object files, and would produce a
  file named MYPROG.EXE. LINK uses the current drive and directory when they
  are not explicitly included in a filename; it will not automatically use
  the same drive and directory you specified for a previous file in the same
  command line.

  By using the + operator or space characters in the libraries field, you
  can specify up to 32 library files to be searched. Each high-level-
  language compiler provides default libraries that are searched
  automatically during the linkage process if the linker can find them
  (unless they are explicitly excluded with the /NOD switch). LINK looks for
  libraries first in the current directory of the default disk drive, then
  along any paths that were provided in the command line, and finally along
  the path(s) specified by the LIB variable if it is present in the
  environment.
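
  The search order just described can be sketched in C as a function that
  lists, in order, the places where a library would be looked for. This is
  an illustration only, not the linker's own code: library_candidates and
  join are hypothetical names, and the sketch assumes that the LIB variable
  separates its directories with semicolons in the same way as PATH.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CANDIDATES 16
#define PATH_LEN 128

/* Join the first dirlen characters of dir with file, adding a backslash
   unless dir is empty or already ends in a backslash or drive colon. */
static void join(const char *dir, size_t dirlen, const char *file,
                 char *out)
{
    int need_sep = dirlen > 0 &&
                   dir[dirlen - 1] != '\\' && dir[dirlen - 1] != ':';

    snprintf(out, PATH_LEN, "%.*s%s%s",
             (int)dirlen, dir, need_sep ? "\\" : "", file);
}

/* Fill out[] with candidate filenames in search order and return
   the number of candidates stored. */
int library_candidates(const char *libname,
                       const char *const *cmd_paths, size_t n_cmd,
                       const char *lib_env, char out[][PATH_LEN])
{
    int count = 0;
    size_t i;

    assert(libname != NULL && out != NULL);

    /* 1. Current directory of the default drive. */
    join("", 0, libname, out[count++]);

    /* 2. Paths given in the command line, in the order supplied. */
    for (i = 0; i < n_cmd && count < MAX_CANDIDATES; i++)
        join(cmd_paths[i], strlen(cmd_paths[i]), libname, out[count++]);

    /* 3. Each directory in the LIB environment variable, if present. */
    while (lib_env != NULL && *lib_env != '\0' && count < MAX_CANDIDATES) {
        size_t len = strcspn(lib_env, ";");

        if (len > 0)
            join(lib_env, len, libname, out[count++]);
        lib_env += len;
        if (*lib_env == ';')
            lib_env++;
    }
    return count;
}
```

  A caller would normally pass getenv("LIB") as the lib_env argument and
  then try to open each candidate in turn, stopping at the first one that
  exists.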

  LINK accepts several optional switches as part of the command line or at
  the end of any interactive prompt. Figure 4-4 lists these switches. The
  number of switches available and their actions vary among different
  versions of LINK. See your Microsoft Object Linker instruction manual for
  detailed information about your particular version.

  Switch   Full form                   Meaning
  ──────────────────────────────────────────────────────────────────────────
  /A:n     /ALIGNMENT:n                Set segment sector alignment factor.
                                       n must be a power of 2 (default =
                                       512). Not related to logical-segment
                                       alignment (BYTE, WORD, PARA, PAGE,
                                       and so forth). Relevant to segmented
                                       executable files (Microsoft Windows
                                       and MS OS/2) only.

  /B       /BATCH                      Suppress linker prompt if a library
                                       cannot be found in the current
                                       directory or in the locations
                                       specified by the LIB environment
                                       variable.

  /CO      /CODEVIEW                   Include symbolic debugging
                                       information in the .EXE file for use
                                       by CodeView.

  /CP      /CPARMAXALLOC               Set the field in the .EXE file header
                                       controlling the amount of memory
                                       allocated to the program in addition
                                       to the memory required for the
                                       program's code, stack, and
                                       initialized data.

  /DO      /DOSSEG                     Use standard Microsoft segment naming
                                       and ordering conventions.

  /DS      /DSALLOCATE                 Load data at high end of the data
                                       segment. Relevant to real-mode
                                       programs only.

  /E       /EXEPACK                    Pack executable file by removing
                                       sequences of repeated bytes and
                                       optimizing relocation table.

  /F       /FARCALLTRANSLATION         Optimize far calls to labels within
                                       the same physical segment for speed
                                       by replacing them with near calls and
                                       NOPs.

  /HE      /HELP                       Display information about available
                                       options.

  /HI      /HIGH                       Load program as high in memory as
                                       possible.

  /I       /INFORMATION                Display information about progress of
                                       linking, including pass numbers and
                                       the names of object files being
                                       linked.

  /INC     /INCREMENTAL                Force production of .SYM and .ILK
                                       files for subsequent use by ILINK
                                       (incremental linker). May not be used
                                       with /EXEPACK. Relevant to segmented
                                       executable files (Microsoft Windows
                                       and MS OS/2) only.

  /LI      /LINENUMBERS                Write address of the first
                                       instruction that corresponds to each
                                       source-code line to the map file. Has
                                       no effect if the compiler does not
                                       include line-number information in
                                       the object module. Force creation of
                                       a map file.

  /M[:n]   /MAP[:n]                    Force creation of a .MAP file listing
                                       all public symbols, sorted by name
                                       and by location. The optional value n
                                       is the maximum number of symbols that
                                       can be sorted (default = 2048); when
                                       n is supplied, the alphabetically
                                       sorted list is omitted.

  /NOD     /NODEFAULTLIBRARYSEARCH     Skip search of any default compiler
                                       libraries specified in the .OBJ file.

  /NOE     /NOEXTENDEDDICTSEARCH       Ignore extended library dictionary
                                       (if it is present). The extended
                                       dictionary ordinarily provides the
                                       linker with information about
                                       inter-module dependencies, to speed
                                       up linking.

  /NOF     /NOFARCALLTRANSLATION       Disable optimization of far calls to
                                       labels within the same segment.

  /NOG     /NOGROUPASSOCIATION         Ignore group associations when
                                       assigning addresses to data and code
                                       items.

  /NOI     /NOIGNORECASE               Do not ignore case in names during
                                       linking.

  /NON     /NONULLSDOSSEG              Arrange segments as for /DOSSEG but
                                       do not insert 16 null bytes at start
                                       of _TEXT segment.

  /NOP     /NOPACKCODE                 Do not pack contiguous logical code
                                       segments into a single physical
                                       segment.

  /O:n     /OVERLAYINTERRUPT:n         Use interrupt number n with the
                                       overlay manager supplied with some
                                       Microsoft high-level languages.

  /PAC[:n] /PACKCODE[:n]               Pack contiguous logical code segments
                                       into a single physical code segment.
                                       The optional value n is the maximum
                                       size for each packed physical code
                                       segment (default = 65,536 bytes).
                                       Segments in different groups are not
                                       packed.

  /PADC:n  /PADCODE:n                  Add n filler bytes to end of each
                                       code module so that a larger module
                                       can be inserted later with ILINK.
                                       Relevant to segmented executable
                                       files (Windows and MS OS/2) only.

  /PADD:n  /PADDATA:n                  Add n filler bytes to end of each
                                       data module so that a larger module
                                       can be inserted later with ILINK.
                                       Relevant to segmented executable
                                       files (Microsoft Windows and MS OS/2)
                                       only.

  /PAU     /PAUSE                      Pause during linking, allowing a
                                       change of disks before .EXE file is
                                       written.

  /SE:n    /SEGMENTS:n                 Set maximum number of segments in
                                       linked program (default = 128).

  /ST:n    /STACK:n                    Set stack size of program in bytes;
                                       ignore stack segment size
                                       declarations within object modules
                                       and definition file.

  /W       /WARNFIXUP                  Display warning messages for offsets
                                       relative to a segment base that is
                                       not the same as the group base.
                                       Relevant to segmented executable
                                       files (Microsoft Windows and MS OS/2)
                                       only.
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-4.  Switches accepted by the Microsoft Object Linker (LINK)
  version 5.0. Earlier versions use a subset of these switches. Note that
  any abbreviation for a switch is acceptable as long as it is sufficient to
  specify the switch uniquely.
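
  The rule that an abbreviation must identify a switch uniquely amounts to
  a prefix match against the list of full switch names. The C sketch below
  applies a pure prefix rule to a subset of the names in Figure 4-4. It is
  not the linker's own algorithm: expand_switch and is_prefix are
  hypothetical names, treating switches as case insensitive is an
  assumption, and the table itself suggests that the real linker has
  special cases (for example, /I is listed for /INFORMATION even though
  /INCREMENTAL begins with the same letter).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A subset of the full switch names from Figure 4-4. */
static const char *const switch_names[] = {
    "CODEVIEW", "CPARMAXALLOC", "EXEPACK", "HELP", "HIGH",
    "MAP", "SEGMENTS", "STACK"
};

/* Returns 1 if abbrev, in any case, is a leading substring of full. */
static int is_prefix(const char *abbrev, const char *full)
{
    while (*abbrev != '\0') {
        if (toupper((unsigned char)*abbrev) != (unsigned char)*full)
            return 0;
        abbrev++;
        full++;
    }
    return 1;
}

/* Returns the full switch name if exactly one name begins with abbrev,
   or NULL if no name matches or the abbreviation is ambiguous. */
const char *expand_switch(const char *abbrev)
{
    const char *match = NULL;
    size_t i;

    assert(abbrev != NULL);
    if (*abbrev == '\0')
        return NULL;
    for (i = 0; i < sizeof switch_names / sizeof switch_names[0]; i++) {
        if (is_prefix(abbrev, switch_names[i])) {
            if (match != NULL)
                return NULL;        /* second match: ambiguous */
            match = switch_names[i];
        }
    }
    return match;
}
```

  With this subset, CO expands to CODEVIEW and ST to STACK, whereas C, H,
  and S are rejected as ambiguous because each matches two names.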


The EXE2BIN Utility

  The EXE2BIN utility (EXE2BIN.EXE) transforms a .EXE file created by LINK
  into an executable .COM file, if the program meets the following
  prerequisites:

  ■  It cannot contain more than one declared segment and cannot
     define a stack.

  ■  It must be less than 64 KB in length.

  ■  It must have an origin at 0100H.

  ■  The first location in the file must be specified as the entry point
     in the source code's END directive.
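
  As a minimal sketch of a program that meets all four requirements,
  consider the following skeleton. The segment name, the labels, and the
  message text are illustrative only and are not taken from the HELLO
  example in Chapter 3. LINK normally warns that such a program has no
  stack segment; for a program that is destined to become a .COM file, that
  warning is expected.

  ──────────────────────────────────────────────────────────────────────────
  _TEXT   segment word public 'CODE'   ; the only segment; no stack
          assume  cs:_TEXT,ds:_TEXT

          org     100h                 ; origin at 0100H

  start:  mov     ah,9                 ; function 09h = display string
          mov     dx,offset msg        ; DS:DX = address of message
          int     21h                  ; transfer to MS-DOS
          mov     ax,4c00h             ; function 4ch = terminate,
          int     21h                  ; return code = 0

  msg     db      'Hello',0dh,0ah,'$'  ; string ends with '$'

  _TEXT   ends

          end     start                ; entry point = first location
  ──────────────────────────────────────────────────────────────────────────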

  Although .COM files are somewhat more compact than .EXE files, you should
  avoid using them. Programs that use separate segments for code, data, and
  stack are much easier to port to protected-mode environments such as MS
  OS/2; in addition, .COM files do not support the symbolic debugging
  information used by CodeView.

  Another use for the EXE2BIN utility is to convert an installable device
  driver──after it is assembled and linked into a .EXE file──into a
  memory-image .BIN or .SYS file with an origin of zero. This conversion is
  required in MS-DOS version 2, which cannot load device drivers as .EXE
  files. The process of writing an installable device driver is discussed in
  more detail in Chapter 14.

  Unlike most of the other programming utilities, EXE2BIN does not have an
  interactive mode. It always takes its source and destination filenames,
  separated by spaces, from the MS-DOS command line, as follows:

    EXE2BIN sourcefile [destinationfile]

  If you do not supply the source-file extension, it defaults to .EXE; the
  destination-file extension defaults to .BIN. If you do not specify a name
  for the destination file, EXE2BIN gives it the same name as the source
  file, with a .BIN extension.

  For example, to convert the file HELLO.EXE into HELLO.COM, you would use
  the following command line:

  C>EXE2BIN HELLO.EXE HELLO.COM  <Enter>

  The EXE2BIN program also has other capabilities, such as pure binary
  conversion with segment fixup for creating program images to be placed in
  ROM; but because these features are rarely used during MS-DOS application
  development, they will not be discussed here.


The CREF Utility

  The CREF cross-reference utility (CREF.EXE) processes a .CRF file produced
  by MASM, creating an ASCII text file with the default extension .REF. The
  file contains a cross-reference listing of all symbols declared in the
  program and the line numbers in which they are referenced. (See Figure
  4-5.) Such a listing is very useful when debugging large
  assembly-language programs with many interdependent procedures and
  variables.

  CREF may be supplied with its parameters interactively or in a single
  command line. If you enter the utility name alone, CREF prompts you for
  the input and output filenames, as shown in the following example:

  C>CREF  <Enter>

  Microsoft (R) Cross-Reference Utility  Version 5.10
  Copyright (C) Microsoft Corp 1981-1985, 1987. All rights reserved.

  Cross-reference [.CRF]: HELLO  <Enter>
  Listing [HELLO.REF]:

  15 Symbols

  C>

  ──────────────────────────────────────────────────────────────────────────
  Microsoft Cross-Reference  Version 5.10       Thu May 26 11:09:34 1988
  HELLO.EXE --- print Hello on terminal

    Symbol Cross-Reference    (# definition, + modification)    Cref-1

  @CPU . . . . . . . . . . . . . .   1#
  @VERSION . . . . . . . . . . . .   1#

  CODE . . . . . . . . . . . . . .  21
  CR . . . . . . . . . . . . . . .  17#    46     47

  DATA . . . . . . . . . . . . . .  44

  LF . . . . . . . . . . . . . . .  18#    46     47

  MSG. . . . . . . . . . . . . . .  33     46#
  MSG_LEN. . . . . . . . . . . . .  32     49#

  PRINT. . . . . . . . . . . . . .  25#    39     60

  STACK. . . . . . . . . . . . . .  23     54#    54     58
  STDERR . . . . . . . . . . . . .  15#
  STDIN. . . . . . . . . . . . . .  13#
  STDOUT . . . . . . . . . . . . .  14#    31

  _DATA. . . . . . . . . . . . . .  23     27     44#    51
  _TEXT. . . . . . . . . . . . . .  21#    23     41

   15 Symbols
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-5.  Cross-reference listing HELLO.REF produced by the CREF
  utility from the file HELLO.CRF, for the HELLO.EXE program example from
  Chapter 3. The symbols declared in the program are listed on the left in
  alphabetic order. To the right of each symbol is a list of all the lines
  where that symbol is referenced. The number with a # sign after it denotes
  the line where the symbol is declared. Numbers followed by a + sign
  indicate that the symbol is modified at the specified line. The line
  numbers given in the cross-reference listing correspond to the line
  numbers generated by the assembler in the program-listing (.LST) file, not
  to any physical line count in the original source file.

  The parameters may also be entered in the command line in the following
  form:

    CREF CRF_file, listing_file

  For example, the command-line equivalent to the preceding interactive
  session is:

  C>CREF HELLO,HELLO  <Enter>

  If CREF cannot find the specified .CRF file, it displays an error message.
  Otherwise, it leaves the cross-reference listing in the specified file on
  the disk. You can send the file to the printer with the COPY command, in
  the following form:

    COPY listing_file PRN:

  You can also send the cross-reference listing directly to a character
  device as it is generated by responding to the Listing prompt with the
  name of the device.


The Microsoft Library Manager

  Although the object modules that are produced by MASM or by high-level-
  language compilers can be linked directly into executable load modules,
  they can also be collected into special files called object-module
  libraries. The modules in a library are indexed by name and by the public
  symbols they contain, so that they can be extracted by the linker to
  satisfy external references in a program.

  The Microsoft Library Manager (LIB) is distributed as the file LIB.EXE.
  LIB creates and maintains program libraries, adding, updating, and
  deleting object files as necessary. LIB can also check a library file for
  internal consistency or print a table of its contents (Figure 4-6).

  LIB follows the command conventions of most other Microsoft programming
  tools. You must supply it with the name of a library file to work on, one
  or more operations to perform, the name of a listing file or device, and
  (optionally) the name of the output library. If you do not specify a name
  for the output library, LIB gives it the same name as the input library
  and changes the extension of the input library to .BAK.

  The LIB operations are simply the names of object files, with a prefix
  character that specifies the action to be taken:

  Prefix     Meaning
  ──────────────────────────────────────────────────────────────────────────
  -          Delete an object module from the library.
  *          Extract a module and place it in a separate .OBJ file.
  +          Add an object module or the entire contents of another library
             to the library.
  ──────────────────────────────────────────────────────────────────────────

  You can combine command prefixes. For example, -+ replaces a module, and
  *- extracts a module into a new file and then deletes it from the library.

  ──────────────────────────────────────────────────────────────────────────
  _abort............abort             _abs..............abs
  _access...........access            _asctime..........asctime
  _atof.............atof              _atoi.............atoi
  _atol.............atol              _bdos.............bdos
  _brk..............brk               _brkctl...........brkctl
  _bsearch..........bsearch           _calloc...........calloc
  _cgets............cgets             _chdir............dir
  _chmod............chmod             _chsize...........chsize
       .
       .
       .
  _exit             Offset: 00000010H  Code and data size: 44H
    __exit

  _filbuf           Offset: 00000160H  Code and data size: BBH
    __filbuf

  _file             Offset: 00000300H  Code and data size: CAH
    __iob             __iob2            __lastiob
       .
       .
       .
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-6.  Extract from the table-of-contents listing produced by the
  Microsoft Library Manager (LIB) for the Microsoft C library SLIBC.LIB. The
  first part of the listing is an alphabetic list of all public names
  declared in all of the modules in the library. Each name is associated
  with the object module to which it belongs. The second part of the listing
  is an alphabetic list of the object-module names in the library, each
  followed by its offset within the library file and the actual size of the
  module in bytes. The entry for each module is followed by a summary of the
  public names that are declared within it.

  When you invoke LIB with its name alone, it requests the other information
  it needs interactively, as shown in the following example:

  C>LIB  <Enter>

  Microsoft (R) Library Manager  Version 3.08
  Copyright (C) Microsoft Corp 1983-1987. All rights reserved.

  Library name:  SLIBC  <Enter>
  Operations: +VIDEO  <Enter>
  List file:  SLIBC.LST  <Enter>
  Output library:  SLIBC2  <Enter>

  C>

  In this example, LIB added the object module VIDEO.OBJ to the library
  SLIBC.LIB, wrote a library table of contents into the file SLIBC.LST, and
  named the resulting new library SLIBC2.LIB.

  The Library Manager can also be run with a command line of the following
  form:

    LIB library [commands],[list],[newlibrary]

  For example, the following command line is equivalent to the preceding
  interactive session:

  C>LIB SLIBC +VIDEO,SLIBC.LST,SLIBC2;  <Enter>

  As with the other Microsoft utilities, a semicolon at the end of the
  command line causes LIB to use the default responses for any parameters
  that are omitted.
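
  The operation prefixes described earlier can be combined in the same way
  on the command line. As a sketch that reuses the VIDEO module from the
  preceding example, the first command below replaces the copy of VIDEO in
  SLIBC.LIB with the contents of VIDEO.OBJ, and the second extracts the
  module into VIDEO.OBJ and then deletes it from the library. Because no
  output library is named, each command leaves the previous version of the
  library in SLIBC.BAK.

  C>LIB SLIBC -+VIDEO;  <Enter>

  C>LIB SLIBC *-VIDEO;  <Enter>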

  Like LINK, LIB can also accept its commands from a response file. The
  contents of the file are lines of text that correspond exactly to the
  responses you would give LIB interactively. You specify the name of the
  response file in the command line with a leading @ character, as follows:

    LIB @filename
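
  For illustration, a response file equivalent to the interactive session
  shown earlier might contain the following four lines, one for each prompt.
  The filename SLIBC.RSP used here is hypothetical; any legal filename
  should serve.

    SLIBC
    +VIDEO
    SLIBC.LST
    SLIBC2

  The library would then be updated with the command

  C>LIB @SLIBC.RSP  <Enter>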

  LIB has only three switches: /I (/IGNORECASE), /N (/NOIGNORECASE), and
  /PAGESIZE:number. The /IGNORECASE switch is the default. The /NOIGNORECASE
  switch causes LIB to regard as distinct any symbols that differ only in
  the case of their component letters. You should place the /PAGESIZE
  switch, which defines the size of a unit of allocation space for a given
  library, immediately after the library filename. The library page size is
  in bytes and must be a power of 2 between 16 and 32,768 (16, 32, 64, and
  so forth); the default is 16 bytes. Because the index to a library is
  always a fixed number of pages, setting a larger page size allows you to
  store more object modules in that library; on the other hand, it will
  result in more wasted space within the file.


The MAKE Utility

  The MAKE utility (MAKE.EXE) compares dates of files and carries out
  commands based on the result of that comparison. Because of this single,
  rather basic capability, MAKE can be used to maintain complex programs
  built from many modules. The dates of source, object, and executable files
  are simply compared in a logical sequence; the assembler, compiler,
  linker, and other programming tools are invoked as appropriate.

  The MAKE utility processes a plain ASCII text file called, as you might
  expect, a make file. You start the utility with a command-line entry in
  the following form:

    MAKE makefile [options]

  By convention, a make file has the same name as the executable file that
  is being maintained, but without an extension. The available MAKE switches
  are listed in Figure 4-7.

  A simple make file contains one or more dependency statements separated by
  blank lines. Each dependency statement can be followed by a list of MS-DOS
  commands, in the following form:

    targetfile : sourcefile ...
      command
      command
      .
      .
      .

  If the date and time of any source file are later than those of the target
  file, the accompanying list of commands is carried out. You may use
  comment lines, which begin with a # character, freely in a make file. MAKE
  can also process inference rules and macro definitions. For further
  details on these advanced capabilities, see the Microsoft or IBM
  documentation.
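
  As an illustration of the format, the following make file sketches the
  maintenance of a program built from two assembly-language modules. The
  program and module names (DEMO, MAIN, and VIDEO) are hypothetical. The
  statements for the object modules are placed ahead of the statement for
  the executable file, in the same order that the steps would be carried
  out by hand.

    # DEMO -- hypothetical two-module program
    main.obj : main.asm
     masm main;

    video.obj : video.asm
     masm video;

    demo.exe : main.obj video.obj
     link main+video,demo;

  With this file in place, editing VIDEO.ASM alone should cause only
  VIDEO.OBJ and DEMO.EXE to be rebuilt, because MAIN.OBJ is still newer than
  its source file.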

  Switch     Meaning
  ──────────────────────────────────────────────────────────────────────────
  /D         Display last modification date of each file as it is processed.
  /I         Ignore exit (return) codes returned by commands and programs
             executed as a result of dependency statements.
  /N         Display commands that would be executed as a result of
             dependency statements but do not execute those commands.
  /S         Do not display commands as they are executed.
  /X <filename>
             Direct error messages from MAKE, or any program that MAKE runs,
             to the specified file. If filename is a hyphen (-), direct
             error messages to the standard output.
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-7.  Switches for the MAKE utility.


A Complete Example

  Let's put together everything we've learned about using the MS-DOS
  programming tools so far. Figure 4-8 shows a sketch of the overall
  process of building an executable program.

  Assume that we have the source code for the HELLO.EXE program from Chapter
  3 in the file HELLO.ASM. To assemble the source program into the
  relocatable object module HELLO.OBJ with symbolic debugging information
  included, also producing a program listing in the file HELLO.LST and a
  cross-reference data file HELLO.CRF, we would enter

  C>MASM /C /L /Zi /T HELLO;  <Enter>

  To convert the cross-reference raw-data file HELLO.CRF into a
  cross-reference listing in the file HELLO.REF, we would enter

  C>CREF HELLO,HELLO  <Enter>

  ┌───────────────┐             ┌───────────────┐
  │     MASM      │             │  C or other   │
  │  source-code  │             │  HLL source-  │
  │     file      │             │   code file   │
  └───┬───────────┘             └───┬───────────┘
      │       ┌─────────────────────┘  Compiler
  ┌───▼───────▼───┐
  │  Relocatable  │
  │ object-module ├────┐
  │  file (.OBJ)  │    │
  └───┬───────────┘    │
      │ LIB            │
  ┌───▼───────────┐    │        ┌───────────────┐
  │ Object-module │    │  LINK  │  Executable   │
  │   libraries   ├─────────────►   program     │
  │    (.LIB)     │      │      │    (.EXE)     │
  └───────────────┘      │      └───┬───────────┘
                         │          │ EXE2BIN
  ┌───────────────┐      │      ┌───▼───────────┐
  │     HLL       │      │      │   Executable  │
  │   runtime     ├──────┘      │    program    │
  │  libraries    │             │     (.COM)    │
  └───────────────┘             └───────────────┘

  Figure 4-8.  Creation of an MS-DOS application program, from source code
  to executable file.

  To convert the relocatable object file HELLO.OBJ into the executable file
  HELLO.EXE, creating a load map in the file HELLO.MAP and appending
  symbolic debugging information to the executable file, we would enter

  C>LINK /MAP /CODEVIEW HELLO;  <Enter>

  We could also automate the entire process just described by creating a
  make file named HELLO (with no extension) and including the following
  instructions:

  hello.obj : hello.asm
   masm /C /L /Zi /T hello;
   cref hello,hello

  hello.exe : hello.obj
   link /MAP /CODEVIEW hello;

  Then, when we have made some change to HELLO.ASM and want to rebuild the
  executable HELLO.EXE file, we need only enter

  C>MAKE HELLO  <Enter>


Programming Resources and References

  The literature on IBM PC-compatible personal computers, the Intel 80x86
  microprocessor family, and assembly-language and C programming is vast.
  The list below contains a selection of those books that I have found to be
  useful and reliable. The list should not be construed as an endorsement by
  Microsoft Corporation.

MASM Tutorials

  Assembly Language Primer for the IBM PC and XT, by Robert Lafore. New
  American Library, New York, NY, 1984. ISBN 0-452-25711-5.

  8086/8088/80286 Assembly Language, by Leo Scanlon. Brady Books, Simon and
  Schuster, New York, NY, 1988. ISBN 0-13-246919-7.

C Tutorials

  Microsoft C Programming for the IBM, by Robert Lafore. Howard W. Sams &
  Co., Indianapolis, IN, 1987. ISBN 0-672-22515-8.

  Proficient C, by Augie Hansen. Microsoft Press, Redmond, WA, 1987. ISBN
  1-55615-007-5.

Intel 80x86 Microprocessor References

  iAPX 88 Book. Intel Corporation, Literature Department SV3-3, 3065 Bowers
  Ave., Santa Clara, CA 95051. Order no. 210200.

  iAPX 286 Programmer's Reference Manual. Intel Corporation, Literature
  Department SV3-3, 3065 Bowers Ave., Santa Clara, CA 95051. Order no.
  210498.

  iAPX 386 Programmer's Reference Manual. Intel Corporation, Literature
  Department SV3-3, 3065 Bowers Ave., Santa Clara, CA 95051. Order no.
  230985.

PC, PC/AT, and PS/2 Architecture

  The IBM Personal Computer from the Inside Out (Revised Edition), by Murray
  Sargent and Richard L. Shoemaker. Addison-Wesley Publishing Company,
  Reading, MA, 1986. ISBN 0-201-06918-0.

  Programmer's Guide to PC & PS/2 Video Systems, by Richard Wilton.
  Microsoft Press, Redmond, WA, 1987. ISBN 1-55615-103-9.

  Personal Computer Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6322507.

  Personal Computer AT Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6280070.

  Options and Adapters Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6322509.

  Personal System/2 Model 30 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2201.

  Personal System/2 Model 50/60 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2224.

  Personal System/2 Model 80 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2256.



────────────────────────────────────────────────────────────────────────────
Chapter 5  Keyboard and Mouse Input

  The fundamental means of user input under MS-DOS is the keyboard. This
  follows naturally from the MS-DOS command-line interface, whose lineage
  can be traced directly to minicomputer operating systems with Teletype
  consoles. During the first few years of MS-DOS's existence, when
  8088/8086-based machines were the norm, nearly every popular application
  program used key-driven menus and text-mode displays.

  However, as high-resolution graphics adapters (and 80286/80386-based
  machines with enough power to drive them) have become less expensive,
  programs that support windows and a graphical user interface have steadily
  grown more popular. Such programs typically rely on a pointing device such
  as a mouse, stylus, joystick, or light pen to let the user navigate in a
  "point-and-shoot" manner, reducing keyboard entry to a minimum. As a
  result, support for pointing devices has become an important consideration
  for all software developers.


Keyboard Input Methods

  Applications running under MS-DOS on IBM PC-compatible machines can use
  several methods to obtain keyboard input:

  ■  MS-DOS handle-oriented functions

  ■  MS-DOS traditional character functions

  ■  IBM ROM BIOS keyboard-driver functions

  These methods offer different degrees of flexibility, portability, and
  hardware independence.

  The handle, or stream-oriented, functions are philosophically derived from
  UNIX/XENIX and were first introduced in MS-DOS version 2.0. A program uses
  these functions by supplying a handle, or token, for the desired device,
  plus the address and length of a buffer.

  When a program begins executing, MS-DOS supplies it with predefined
  handles for certain commonly used character devices, including the
  keyboard:

  Handle             Device name                          Opened to
  ──────────────────────────────────────────────────────────────────────────
  0                  Standard input (stdin)               CON
  1                  Standard output (stdout)             CON
  2                  Standard error (stderr)              CON
  3                  Standard auxiliary (stdaux)          AUX
  4                  Standard printer (stdprn)            PRN
  ──────────────────────────────────────────────────────────────────────────

  These handles can be used for read and write operations without further
  preliminaries. A program can also obtain a handle for a character device
  by explicitly opening the device for input or output using its logical
  name (as though it were a file). The handle functions support I/O
  redirection, allowing a program to take its input from another device or
  file instead of the keyboard, for example. Redirection is discussed in
  detail in Chapter 15.

  The traditional character-input functions are a superset of the character
  I/O functions that were present in CP/M. Originally included in MS-DOS
  simply to facilitate the porting of existing applications from CP/M, they
  are still widely used. In MS-DOS versions 2.0 and later, most of the
  traditional functions also support I/O redirection (although not as well
  as the handle functions do).

  Use of the IBM ROM BIOS keyboard functions presupposes that the program is
  running on an IBM PC-compatible machine. The ROM BIOS keyboard driver
  operates at a much more primitive level than the MS-DOS functions and
  allows a program to circumvent I/O redirection or MS-DOS's special
  handling of certain control characters. Programs that use the ROM BIOS
  keyboard driver are inherently less portable than those that use the
  MS-DOS functions and may interfere with the proper operation of other
  programs; many of the popular terminate-and-stay-resident (TSR) utilities
  fall into this category.

Keyboard Input with Handles

  The principal MS-DOS function for keyboard input using handles is Int 21H
  Function 3FH (Read File or Device). The parameters for this function are
  a handle, the segment and offset of a buffer, and the length of the
  buffer. (For a more detailed explanation of this function, see Section
  II of this book, "MS-DOS Functions Reference.")

  As an example, let's use the predefined standard input handle (0) and Int
  21H Function 3FH to read a line from the keyboard:

  ──────────────────────────────────────────────────────────────────────────
  buffer  db   80 dup (?)     ; keyboard input buffer
          .
          .
          .
          mov  ah,3fh         ; function 3fh = read file or device
          mov  bx,0           ; handle for standard input
          mov  cx,80          ; maximum bytes to read
          mov  dx,seg buffer  ; DS:DX = buffer address
          mov  ds,dx
          mov  dx,offset buffer
          int  21h            ; transfer to MS-DOS
          jc   error          ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  When control returns from Int 21H Function 3FH, the carry flag is clear if
  the function was successful, and AX contains the number of characters
  read. If there was an error, the carry flag is set and AX contains an
  error code; however, this should never occur when reading the keyboard.
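
  The count returned in AX is what tells the program where the input ends.
  In ASCII mode the Enter key arrives as a carriage return and linefeed, so
  the count usually includes those 2 bytes. The following sketch continues
  from the successful return in the preceding example and assumes that DS
  still addresses the segment containing buffer; the label gotlen is
  hypothetical. It leaves the length of the text proper in BX.

  ──────────────────────────────────────────────────────────────────────────
          mov  bx,ax          ; BX = number of bytes actually read
          cmp  bx,2           ; at least 2 bytes in buffer?
          jb   gotlen         ; no, cannot end with CR-LF
                              ; does input end with 0dh,0ah?
          cmp  word ptr buffer[bx-2],0a0dh
          jne  gotlen         ; no, use count unchanged
          sub  bx,2           ; yes, exclude CR-LF from length
  gotlen:                     ; BX = length of text in buffer
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The test for the terminator matters because a line longer than the buffer
  is returned in pieces, and only the last piece ends with the carriage
  return and linefeed.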

  The standard input is redirectable, so the code just shown is not a
  foolproof way of obtaining input from the keyboard. Depending upon whether
  a redirection parameter was included in the command line by the user,
  program input might be coming from the keyboard, a file, another character
  device, or even the bit bucket (NUL device). To bypass redirection and be
  absolutely certain where your input is coming from, you can ignore the
  predefined standard input handle and open the console as though it were a
  file, using the handle obtained from that open operation to perform your
  keyboard input, as in the following example:

  ──────────────────────────────────────────────────────────────────────────
  buffer  db     80 dup (?)   ; keyboard input buffer
  fname   db     'CON',0      ; keyboard device name
  handle  dw     0            ; keyboard device handle
          .
          .
          .
          mov    ah,3dh       ; function 3dh = open
          mov    al,0         ; mode = read
          mov    dx,seg fname ; DS:DX = device name
          mov    ds,dx
          mov    dx,offset fname
          int    21h          ; transfer to MS-DOS
          jc     error        ; jump if open failed
          mov    handle,ax    ; save handle for CON
          .
          .
          .
          mov    ah,3fh       ; function 3fh = read file or device
          mov    bx,handle    ; BX = handle for CON
          mov    cx,80        ; maximum bytes to read
          mov    dx,offset buffer ; DS:DX = buffer address
          int    21h          ; transfer to MS-DOS
          jc     error        ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  When a programmer uses Int 21H Function 3FH to read from the keyboard, the
  exact result depends on whether MS-DOS regards the handle to be in ASCII
  mode or binary mode (sometimes known as cooked mode and raw mode). ASCII
  mode is the default, although binary mode can be selected with Int 21H
  Function 44H (IOCTL) when necessary.

  In ASCII mode, MS-DOS initially places characters obtained from the
  keyboard in a 128-byte internal buffer, and the user can edit the input
  with the Backspace key and the special function keys. MS-DOS automatically
  echoes the characters to the standard output, expanding tab characters to
  spaces (although they are left as the ASCII code 09H in the buffer). The
  Ctrl-C, Ctrl-S, and Ctrl-P key combinations receive special handling, and
  the Enter key is translated to a carriage return-linefeed pair. When the
  user presses Enter or Ctrl-Z, MS-DOS copies the requested number of
  characters (or the actual number of characters entered, if less than the
  number requested) out of the internal buffer into the calling program's
  buffer.

  In binary mode, MS-DOS never echoes input characters. It passes the
  Ctrl-C, Ctrl-S, Ctrl-P, and Ctrl-Z key combinations and the Enter key
  through to the application unchanged, and Int 21H Function 3FH does not
  return control to the application until the exact number of characters
  requested has been received.
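
  As a sketch of how a handle might be switched into binary mode, the
  following sequence uses Int 21H Function 44H to read the device
  information word for the standard input handle, set the raw-mode bit, and
  write the word back. The bit assignments assumed here (bit 7 = character
  device, bit 5 = binary mode) should be checked against the description of
  Function 44H in Section II, and the labels error and notdev are
  hypothetical.

  ──────────────────────────────────────────────────────────────────────────
          mov  ax,4400h       ; function 44h subfunction 00h =
                              ; get device information
          mov  bx,0           ; handle for standard input
          int  21h            ; transfer to MS-DOS
          jc   error          ; jump if error detected
          test dl,80h         ; bit 7 set = character device?
          jz   notdev         ; no, handle refers to a file
          mov  dh,0           ; DH must be 0 for subfunction 01h
          or   dl,20h         ; set bit 5 = binary (raw) mode
          mov  ax,4401h       ; function 44h subfunction 01h =
                              ; set device information
          mov  bx,0           ; handle for standard input
          int  21h            ; transfer to MS-DOS
          jc   error          ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Replacing the OR instruction with and dl,0dfh returns the handle to ASCII
  mode. A program that changes the mode of a predefined handle should
  probably restore the original setting before it terminates.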

  Ctrl-C checking is discussed in more detail at the end of this chapter.
  For now, simply note that the application programmer can substitute a
  custom handler for the default MS-DOS Ctrl-C handler and thereby avoid
  having the application program lose control of the machine when the user
  enters a Ctrl-C or Ctrl-Break.

Keyboard Input with Traditional Calls

  The MS-DOS traditional keyboard functions offer a variety of character and
  line-oriented services with or without echo and Ctrl-C detection. These
  functions are summarized in the following table.

  Int 21H Function   Action                               Ctrl-C checking
  ──────────────────────────────────────────────────────────────────────────
  01H                Keyboard input with echo             Yes
  06H                Direct console I/O                   No
  07H                Keyboard input without echo          No
  08H                Keyboard input without echo          Yes
  0AH                Buffered keyboard input              Yes
  0BH                Input-status check                   Yes
  0CH                Input-buffer reset and input         Varies
  ──────────────────────────────────────────────────────────────────────────

  In MS-DOS versions 2.0 and later, redirection of the standard input
  affects all these functions. In other words, they act as though they were
  special cases of an Int 21H Function 3FH call using the predefined
  standard input handle (0).

  The character-input functions (01H, 06H, 07H, and 08H) all return a
  character in the AL register. For example, the following sequence waits
  until a key is pressed and then returns it in AL:

  ──────────────────────────────────────────────────────────────────────────
          mov     ah,1        ; function 01h = read keyboard
          int     21h         ; transfer to MS-DOS
  ──────────────────────────────────────────────────────────────────────────

  The character-input functions differ in whether the input is echoed to the
  screen and whether they are sensitive to Ctrl-C interrupts. Although
  MS-DOS provides no pure keyboard-status function that is immune to Ctrl-C,
  a program can read keyboard status (somewhat circuitously) without
  interference by using Int 21H Function 06H. Extended keys, such as the
  IBM PC keyboard's special function keys, require two calls to a
  character-input function.
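
  The following sketch combines both points: it polls the keyboard with Int
  21H Function 06H, which does not check for Ctrl-C, and makes a second call
  when the first returns zero, indicating an extended key. The labels ascii
  and extkey are hypothetical, and a real program would normally do other
  work between polls rather than loop continuously.

  ──────────────────────────────────────────────────────────────────────────
  poll:   mov  ah,6           ; function 06h = direct console I/O
          mov  dl,0ffh        ; DL = 0ffh requests input
          int  21h            ; transfer to MS-DOS
          jz   poll           ; zero flag set = no key waiting
          or   al,al          ; AL = 0 means extended key
          jnz  ascii          ; otherwise AL = ASCII character
          mov  ah,6           ; call again to obtain the
          mov  dl,0ffh        ; second byte of the key
          int  21h            ; AL = extended key code
          jmp  extkey         ; go process extended key
  ──────────────────────────────────────────────────────────────────────────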

  As an alternative to single-character input, a program can use
  buffered-line input (Int 21H Function 0AH) to obtain an entire line from
  the keyboard in one operation. MS-DOS builds up buffered lines in an
  internal buffer and does not pass them to the calling program until the
  user presses the Enter key. While the line is being entered, all the usual
  editing keys are active and are handled by the MS-DOS keyboard driver. You
  use Int 21H Function 0AH as follows:

  ──────────────────────────────────────────────────────────────────────────
  buff    db      81          ; maximum length of input
          db      0           ; actual length (from MS-DOS)
          db      81 dup (0)  ; receives keyboard input
          .
          .
          .
          mov     ah,0ah      ; function 0ah = read buffered line
          mov     dx,seg buff ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Int 21H Function 0AH differs from Int 21H Function 3FH in several
  important ways. First, the maximum length is passed in the first byte of
  the buffer, rather than in the CX register. Second, the actual length is
  returned in the second byte of the structure, rather than in the AX
  register. Finally, when the user has entered one less than the specified
  maximum number of characters, MS-DOS ignores all subsequent characters and
  sounds a warning beep until the Enter key is pressed.
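
  Because the length returned in the second byte does not include the
  carriage return that ends the line, it can serve directly as an index to
  the end of the text. As a sketch, assuming DS still addresses the segment
  containing buff, the following sequence overwrites the carriage return
  with a null byte to form an ASCIIZ string beginning at buff+2:

  ──────────────────────────────────────────────────────────────────────────
          mov     bl,buff+1   ; BL = actual length from MS-DOS
          mov     bh,0        ; BX = length as 16-bit value
                              ; replace carriage return with null
          mov     byte ptr buff[bx+2],0
  ──────────────────────────────────────────────────────────────────────────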

  For detailed information about each of the traditional keyboard-input
  functions, see Section II of this book, "MS-DOS Functions Reference."

Keyboard Input with ROM BIOS Functions

  Programmers writing applications for IBM PC compatibles can bypass the
  MS-DOS keyboard functions and choose from two hardware-dependent
  techniques for keyboard input.

  The first method is to call the ROM BIOS keyboard driver using Int 16H.
  For example, the following sequence reads a single character from the
  keyboard input buffer and returns it in the AL register:

  ──────────────────────────────────────────────────────────────────────────
          mov    ah,0         ; function 0=read keyboard
          int    16h          ; transfer to ROM BIOS
  ──────────────────────────────────────────────────────────────────────────

  Int 16H Function 00H also returns the keyboard scan code in the AH
  register, allowing the program to detect key codes that are not ordinarily
  returned by MS-DOS. Other Int 16H services return the keyboard status
  (that is, whether a character is waiting) or the keyboard shift state
  (from the ROM BIOS data area 0000:0417H). For a more detailed explanation
  of ROM BIOS keyboard functions, see Section III of this book, "IBM ROM
  BIOS and Mouse Functions Reference."
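
  As a sketch of the status and shift-state services, the following
  sequence reads a key only if one is waiting and otherwise tests whether
  either Shift key is held down. The function numbers and flag bits shown
  (Function 01H for status, Function 02H for shift state, bits 0 and 1 for
  the right and left Shift keys) should be confirmed in Section III, and the
  labels nokey and shifted are hypothetical.

  ──────────────────────────────────────────────────────────────────────────
          mov    ah,1         ; function 01h = get keyboard status
          int    16h          ; transfer to ROM BIOS
          jz     nokey        ; zero flag set = no key waiting
          mov    ah,0         ; function 00h = read keyboard
          int    16h          ; AL = character, AH = scan code
          .
          .
          .
  nokey:  mov    ah,2         ; function 02h = get shift state
          int    16h          ; AL = keyboard flags byte
          test   al,03h       ; bit 0 = right Shift, bit 1 = left
          jnz    shifted      ; jump if either Shift key is down
  ──────────────────────────────────────────────────────────────────────────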

  You should consider carefully before building ROM BIOS dependence into an
  application. Although this technique allows you to bypass any I/O
  redirection that may be in effect, ways exist to do this without
  introducing dependence on the ROM BIOS. And there are real disadvantages
  to calling the ROM BIOS keyboard driver:

  ■  It always bypasses I/O redirection, which sometimes may not be
     desirable.

  ■  It is dependent on IBM PC compatibility and does not work correctly,
     unchanged, on some older machines such as the Hewlett-Packard
     TouchScreen or the Wang Professional Computer.

  ■  It may introduce complicated interactions with TSR utilities.

  The other and more hardware-dependent method of keyboard input on an IBM
  PC is to write a new handler for ROM BIOS Int 09H and service the keyboard
  controller's interrupts directly. This involves translation of scan codes
  to ASCII characters and maintenance of the type-ahead buffer. In ordinary
  PC applications, there is no reason to take over keyboard I/O at this
  level; therefore, I will not discuss this method further here. If you are
  curious about the techniques that would be required, the best reference is
  the listing for the ROM BIOS Int 09H handler in the IBM PC or PC/AT
  technical reference manual.


Ctrl-C and Ctrl-Break Handlers

  In the discussion of keyboard input with the MS-DOS handle and traditional
  functions, I made some passing references to the fact that Ctrl-C entries
  can interfere with the expected behavior of those functions. Let's look at
  this subject in more detail now.

  During most character I/O operations, MS-DOS checks for a Ctrl-C (ASCII
  code 03H) waiting at the keyboard and executes an Int 23H if one is
  detected. If the system break flag is on, MS-DOS also checks for a Ctrl-C
  entry during certain other operations (such as file reads and writes).
  Ordinarily, the Int 23H vector points to a routine that simply terminates
  the currently active process and returns control to the parent process──
  usually the MS-DOS command interpreter.

  In other words, if your program is executing and you enter a Ctrl-C,
  accidentally or intentionally, MS-DOS simply aborts the program. Any files
  the program has opened using file control blocks will not be closed
  properly, any interrupt vectors it has altered may not be restored
  correctly, and if it is performing any direct I/O operations (for example,
  if it contains an interrupt driver for the serial port), all kinds of
  unexpected events may occur.

  Although you can use a number of partially effective methods to defeat
  Ctrl-C checking, such as performing keyboard input with Int 21H Functions
  06H and 07H, placing all character devices into binary mode, or turning
  off the system break flag with Int 21H Function 33H, none of these is
  completely foolproof. The simplest and most elegant way to defeat Ctrl-C
  checking is to substitute your own Int 23H handler, which can take
  some action appropriate to your program. When the program terminates,
  MS-DOS automatically restores the previous contents of the Int 23H vector
  from information saved in the program segment prefix. The following
  example shows how to install your own Ctrl-C handler (which in this case
  does nothing at all):

  ──────────────────────────────────────────────────────────────────────────
          push    ds          ; save data segment
                              ; set int 23h vector...
          mov     ax,2523h    ; function 25h = set interrupt
                              ; int 23h = vector for
                              ; Ctrl-C handler
          mov     dx,seg handler ; DS:DX = handler address
          mov     ds,dx
          mov     dx,offset handler
          int     21h         ; transfer to MS-DOS

          pop     ds          ; restore data segment
          .
          .
          .
  handler:                    ; a Ctrl-C handler
          iret                ; that does nothing
  ──────────────────────────────────────────────────────────────────────────

  The first part of the code (which alters the contents of the Int 23H
  vector) would be executed in the initialization part of the application.
  The handler receives control whenever MS-DOS detects a Ctrl-C at the
  keyboard. (Because this handler consists only of an interrupt return, the
  Ctrl-C will remain in the keyboard input stream and will be passed to the
  application when it requests a character from the keyboard, appearing on
  the screen as ^C.)

  When an Int 23H handler is called, MS-DOS is in a stable state. Thus, the
  handler can call any MS-DOS function. It can also reset the segment
  registers and the stack pointer and transfer control to some other point
  in the application without ever returning control to MS-DOS with an IRET.

  On IBM PC compatibles, an additional interrupt handler must be taken into
  consideration. Whenever the ROM BIOS keyboard driver detects the key
  combination Ctrl-Break, it calls a handler whose address is stored in the
  vector for Int 1BH. The default ROM BIOS Int 1BH handler does nothing.
  MS-DOS alters the Int 1BH vector to point to its own handler, which sets a
  flag and returns; the net effect is to remap the Ctrl-Break into a Ctrl-C
  that is forced ahead of any other characters waiting in the keyboard
  buffer.

  Taking over the Int 1BH vector in an application is somewhat tricky but
  extremely useful. Because the keyboard is interrupt driven, a press of
  Ctrl-Break lets the application regain control under almost any
  circumstance──often, even if the program has crashed or is in an endless
  loop.

  You cannot, in general, use the same handler for Int 1BH that you use for
  Int 23H. The Int 1BH handler is more limited in what it can do, because it
  has been called as a result of a hardware interrupt and MS-DOS may have
  been executing a critical section of code at the time the interrupt was
  serviced. Thus, all registers except CS:IP are in an unknown state; they
  may have to be saved and then modified before your interrupt handler can
  execute. Similarly, the depth of the stack in use when the Int 1BH handler
  is called is unknown, and if the handler is to perform stack-intensive
  operations, it may have to save the stack segment and the stack pointer
  and switch to a new stack that is known to have sufficient depth.

  In normal application programs, your Int 1BH handler should probably
  return with an IRET rather than retain control. Because of
  subtle differences among non-IBM ROM BIOSes, it is difficult to predict
  the state of the keyboard controller and the 8259 Programmable Interrupt
  Controller (PIC) when the Int 1BH handler begins executing. Also, MS-DOS
  itself may not be in a stable state at the point of interrupt, a situation
  that can manifest itself in unexpected critical errors during subsequent
  I/O operations. Finally, MS-DOS versions 3.2 and later allocate a stack
  from an internal pool for use by the Int 09H handler. If the Int 1BH
  handler never returns, the Int 09H handler never returns either, and
  repeated entries of Ctrl-Break will eventually exhaust the stack pool,
  halting the system.

  Because Int 1BH is a ROM BIOS interrupt and not an MS-DOS interrupt,
  MS-DOS does not restore the previous contents of the Int 1BH vector when a
  program exits. If your program modifies this vector, it must save the
  original value and restore it before terminating. Otherwise, the vector
  will be left pointing to some random area in the next program that runs,
  and the next time the user presses Ctrl-Break a system crash is the best
  you can hope for.
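
  The save-and-restore discipline for the Int 1BH vector can be sketched in
  portable C against a simulated vector table. Everything here is an
  illustrative assumption: the array stands in for the real table at
  0000:0000H, and get_vector and set_vector merely imitate what Int 21H
  Functions 35H and 25H do. The one hard fact it relies on is that each
  vector occupies 4 bytes (offset word, then segment word).

```c
#include <assert.h>

typedef struct {
    unsigned short offset;    /* low word of the vector  */
    unsigned short segment;   /* high word of the vector */
} far_ptr;

static far_ptr vector_table[256];   /* simulated interrupt vector table */
static far_ptr saved_int1b;         /* previous owner of Int 1BH        */
static int     installed = 0;

/* Imitates Int 21H Function 35H (get interrupt vector). */
far_ptr get_vector(unsigned char n) { return vector_table[n]; }

/* Imitates Int 21H Function 25H (set interrupt vector). */
void set_vector(unsigned char n, far_ptr p) { vector_table[n] = p; }

/* Physical address of vector n in the real table. */
unsigned long vector_address(unsigned char n) { return 4UL * n; }

/* Save the current Int 1BH vector, then point it at our handler. */
void install_break_handler(far_ptr handler)
{
    saved_int1b = get_vector(0x1B);
    set_vector(0x1B, handler);
    installed = 1;
}

/* Must run before the program terminates: MS-DOS will not do it. */
void remove_break_handler(void)
{
    assert(installed);
    set_vector(0x1B, saved_int1b);
    installed = 0;
}
```

  In a real program the call to remove_break_handler belongs on every exit
  path, including error exits, because nothing else will put the original
  vector back.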

Ctrl-C and Ctrl-Break Handlers and High-Level Languages

  Capturing the Ctrl-C and Ctrl-Break interrupts is straightforward when you
  are programming in assembly language. The process is only slightly more
  difficult with high-level languages, as long as you have enough
  information about the language's calling conventions that you can link in
  a small assembly-language routine as part of the program.

  The BREAK.ASM listing in Figure 5-1 contains source code for a Ctrl-Break
  handler that can be linked with small-model Microsoft C programs running
  on an IBM PC compatible. The short C program in Figure 5-2 demonstrates
  use of the handler. (This code should be readily portable to other C
  compilers.)

  ──────────────────────────────────────────────────────────────────────────
          page    55,132
          title   Ctrl-C & Ctrl-Break Handlers
          name    break

  ;
  ; Ctrl-C and Ctrl-Break handler for Microsoft C
  ; programs running on IBM PC compatibles
  ;
  ; by Ray Duncan
  ;
  ; Assemble with:  C>MASM /Mx BREAK;
  ;
  ; This module allows C programs to retain control
  ; when the user enters a Ctrl-Break or Ctrl-C.
  ; It uses Microsoft C parameter-passing conventions
  ; and assumes the C small memory model.
  ;
  ; The procedure _capture is called to install
  ; a new handler for the Ctrl-C and Ctrl-Break
  ; interrupts (1bh and 23h).  _capture is passed
  ; the address of a static variable, which will be
  ; set to true by the handler whenever a Ctrl-C
  ; or Ctrl-Break is detected.  The C syntax is:
  ;
  ;               static int flag;
  ;               capture(&flag);
  ;
  ; The procedure _release is called by the C program
  ; to restore the original Ctrl-Break and Ctrl-C
  ; handler. The C syntax is:
  ;               release();
  ;
  ; The procedure ctrlbrk is the actual interrupt
  ; handler.  It receives control when a software
  ; int 1bh is executed by the ROM BIOS or int 23h
  ; is executed by MS-DOS.  It simply sets the C
  ; program's variable to true (1) and returns.
  ;

  args    equ     4               ; stack offset of arguments,
                                  ; C small memory model

  cr      equ     0dh             ; ASCII carriage return
  lf      equ     0ah             ; ASCII linefeed

  _TEXT   segment word public 'CODE'

          assume cs:_TEXT


          public  _capture
  _capture proc   near            ; take over Ctrl-Break
                                  ; and Ctrl-C interrupt vectors

          push    bp              ; set up stack frame
          mov     bp,sp

          push    ds              ; save registers
          push    di
          push    si

                                  ; save address of
                                  ; calling program's "flag"
          mov     ax,word ptr [bp+args]
          mov     word ptr cs:flag,ax
          mov     word ptr cs:flag+2,ds

                                  ; save address of original
          mov     ax,3523h        ; int 23h handler
          int     21h
          mov     word ptr cs:int23,bx
          mov     word ptr cs:int23+2,es
          mov     ax,351bh        ; save address of original
          int     21h             ; int 1bh handler
          mov     word ptr cs:int1b,bx
          mov     word ptr cs:int1b+2,es
          push    cs              ; set DS:DX = address
          pop     ds              ; of new handler
          mov     dx,offset _TEXT:ctrlbrk

          mov     ax,02523h       ; set int 23h vector
          int     21h

          mov     ax,0251bh       ; set int 1bh vector
          int     21h

          pop     si              ; restore registers
          pop     di
          pop     ds

          pop     bp              ; discard stack frame
          ret                     ; and return to caller

  _capture endp


          public  _release
  _release proc   near            ; restore original Ctrl-C
                                  ; and Ctrl-Break handlers

          push    bp              ; save registers
          push    ds
          push    di
          push    si

          lds     dx,cs:int1b     ; get address of previous
                                  ; int 1bh handler

          mov     ax,251bh        ; set int 1bh vector
          int     21h

          lds     dx,cs:int23     ; get address of previous
                                  ; int 23h handler

          mov     ax,2523h        ; set int 23h vector
          int     21h

          pop     si              ; restore registers
          pop     di              ; and return to caller
          pop     ds
          pop     bp
          ret
  _release endp

  ctrlbrk proc    far             ; Ctrl-C and Ctrl-Break
                                  ; interrupt handler

          push    bx              ; save registers
          push    ds

          lds     bx,cs:flag      ; get address of C program's
                                  ; "flag variable"

                                  ; and set the flag "true"
          mov     word ptr ds:[bx],1

          pop     ds              ; restore registers
          pop     bx

          iret                    ; return from handler

  ctrlbrk endp

  flag    dd      0               ; far pointer to caller's
                                  ; Ctrl-Break or Ctrl-C flag

  int23   dd      0               ; address of original
                                  ; Ctrl-C handler

  int1b   dd      0               ; address of original
                                  ; Ctrl-Break handler

  _TEXT   ends

          end
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-1.  BREAK.ASM: A Ctrl-C and Ctrl-Break interrupt handler that can
  be linked with Microsoft C programs.

  ──────────────────────────────────────────────────────────────────────────
  /*
      TRYBREAK.C

      Demo of BREAK.ASM Ctrl-Break and Ctrl-C
      interrupt handler, by Ray Duncan

      To create the executable file TRYBREAK.EXE, enter:

      MASM /Mx BREAK;
      CL TRYBREAK.C BREAK.OBJ
  */

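  #include <stdio.h>
  #include <conio.h>                   /* kbhit, getch, putch     */

  void capture(int *);                 /* defined in BREAK.ASM    */
  void release(void);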

  main(int argc, char *argv[])
  {
      int hit = 0;                     /* flag for key press      */
      int c = 0;                       /* character from keyboard */
      static int flag = 0;             /* true if Ctrl-Break
                                          or Ctrl-C detected      */

      puts("\n*** TRYBREAK.C running ***\n");
      puts("Press Ctrl-C or Ctrl-Break to test handler,");
      puts("Press the Esc key to exit TRYBREAK.\n");

      capture(&flag);                  /* install new Ctrl-C and
                                          Ctrl-Break handler and
                                          pass address of flag    */

      puts("TRYBREAK has captured interrupt vectors.\n");

      while(1)
      {
          hit = kbhit();               /* check for key press     */
                                       /* (MS-DOS sees Ctrl-C
                                           when keyboard polled)  */

          if(flag != 0)                /* if flag is true, an     */
          {                            /* interrupt has occurred  */
              puts("\nControl-Break detected.\n");
              flag = 0;                /* reset interrupt flag    */
          }
          if(hit != 0)                 /* if any key waiting      */
          {
              c = getch();             /* read key, exit if Esc   */
              if( (c & 0x7f) == 0x1b) break;
              putch(c);                /* otherwise display it    */
          }
      }
      release();                       /* restore original Ctrl-C
                                          and Ctrl-Break handlers */

      puts("\n\nTRYBREAK has released interrupt vectors.");
  }
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-2.  TRYBREAK.C: A simple Microsoft C program that demonstrates
  use of the interrupt handler BREAK.ASM from Figure 5-1.

  In the example handler, the procedure named capture is called with the
  address of an integer variable within the C program. It saves the address
  of the variable, points the Int 1BH and Int 23H vectors to its own
  interrupt handler, and then returns.

  When MS-DOS detects a Ctrl-C or Ctrl-Break, the interrupt handler sets the
  integer variable within the C program to true (1) and returns. The C
  program can then poll this variable at its leisure. Of course, to detect
  more than one Ctrl-C, the program must reset the variable to zero again.

  The procedure named release simply restores the Int 1BH and Int 23H
  vectors to their original values, thereby disabling the interrupt handler.
  Although it is not strictly necessary for release to do anything about Int
  23H, this action does give the C program the option of restoring the
  default handler for Int 23H without terminating.
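
  The same set-a-flag-and-poll design can be expressed in standard C with
  the signal facility. This is an analogy rather than a replacement for
  BREAK.ASM: the function names below are hypothetical, and how a given
  compiler's run-time library connects SIGINT to Int 23H or Int 1BH is an
  assumption you would have to check against that compiler's documentation.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t break_flag = 0;   /* set by the handler  */
static int captured = 0;                       /* handler installed?  */

/* The handler does as little as ctrlbrk does: set a flag and return. */
static void on_break(int signo)
{
    (void)signo;
    break_flag = 1;
}

/* Counterpart of capture(): returns nonzero on success. */
int capture_break(void)
{
    captured = (signal(SIGINT, on_break) != SIG_ERR);
    return captured;
}

/* Poll the flag and reset it so that the next break is also seen. */
int break_pending(void)
{
    if (break_flag) {
        break_flag = 0;
        return 1;
    }
    return 0;
}

/* Counterpart of release(): restore the default action. */
void release_break(void)
{
    assert(captured);
    signal(SIGINT, SIG_DFL);
    captured = 0;
}
```

  As in TRYBREAK.C, the main loop would call break_pending at its leisure;
  the handler itself performs no I/O.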


Pointing Devices

  Device drivers for pointing devices are supplied by the hardware
  manufacturer and are loaded with a DEVICE statement in the CONFIG.SYS
  file. Although the hardware characteristics of the available pointing
  devices differ greatly, nearly all of their drivers present the same
  software interface to application programs: the Int 33H protocol used by
  the Microsoft Mouse driver. Version 6 of the Microsoft Mouse driver (which
  was current as this was written) offers the following functions:

  Function          Meaning
  ──────────────────────────────────────────────────────────────────────────
  00H               Reset mouse and get status.
  01H               Show mouse pointer.
  02H               Hide mouse pointer.
  03H               Get button status and pointer position.
  04H               Set pointer position.
  05H               Get button-press information.
  06H               Get button-release information.
  07H               Set horizontal limits for pointer.
  08H               Set vertical limits for pointer.
  09H               Set graphics pointer type.
  0AH               Set text pointer type.
  0BH               Read mouse-motion counters.
  0CH               Install interrupt handler for mouse events.
  0DH               Turn on light pen emulation.
  0EH               Turn off light pen emulation.
  0FH               Set mickeys to pixel ratio.
  10H               Set pointer exclusion area.
  13H               Set double-speed threshold.
  14H               Swap mouse-event interrupt routines.
  15H               Get buffer size for mouse-driver state.
  16H               Save mouse-driver state.
  17H               Restore mouse-driver state.
  18H               Install alternate handler for mouse events.
  19H               Get address of alternate handler.
  1AH               Set mouse sensitivity.
  1BH               Get mouse sensitivity.
  1CH               Set mouse interrupt rate.
  1DH               Select display page for pointer.
  1EH               Get display page for pointer.
  1FH               Disable mouse driver.
  20H               Enable mouse driver.
  21H               Reset mouse driver.
  22H               Set language for mouse-driver messages.
  23H               Get language number.
  24H               Get driver version, mouse type, and IRQ number.
  ──────────────────────────────────────────────────────────────────────────

  Although this list of mouse functions may appear intimidating, the average
  application will only need a few of them.

  A program first calls Int 33H Function 00H to initialize the mouse driver
  for the current display mode and to check its status. At this point, the
  mouse is "alive" and the application can obtain its state and position;
  however, the pointer does not become visible until the process calls Int
  33H Function 01H.

  The program can then call Int 33H Functions 03H, 05H, and 06H to
  monitor the mouse position and the status of the mouse buttons.
  Alternatively, the program can register an interrupt handler for mouse
  events, using Int 33H Function 0CH. This latter technique eliminates the
  need to poll the mouse driver; the driver will notify the program by
  calling the interrupt handler whenever the mouse is moved or a button is
  pressed or released.

  When the application is finished with the mouse, it can call Int 33H
  Function 02H to hide the mouse pointer. If the program has registered an
  interrupt handler for mouse events, it should disable further calls to the
  handler by resetting the mouse driver again with Int 33H Function 00H.

  For a complete description of the mouse-driver functions, see Section
  III of this book, "IBM ROM BIOS and Mouse Functions Reference." Figure
  5-3 shows a small demonstration program that polls the mouse continually,
  to display its position and status.

  ──────────────────────────────────────────────────────────────────────────
  /*
      Simple Demo of Int 33H Mouse Driver
      (C) 1988 Ray Duncan

      Compile with: CL MOUDEMO.C
  */

  #include <stdio.h>
  #include <stdlib.h>                 /* exit                      */
  #include <dos.h>

  union REGS regs;

  void cls(void);                     /* function prototypes       */
  void gotoxy(int, int);

  main(int argc, char *argv[])
  {
      int x,y,buttons;                /* some scratch variables    */
                                      /* for the mouse state       */

      regs.x.ax = 0;                  /* reset mouse driver        */
      int86(0x33, &regs, &regs);      /* and check status          */

      if(regs.x.ax == 0)              /* exit if no mouse          */
      {   printf("\nMouse not available\n");
          exit(1);
      }

      cls();                          /* clear the screen          */
      gotoxy(45,0);                   /* and show help info        */
      puts("Press Both Mouse Buttons To Exit");

      regs.x.ax = 1;                  /* display mouse cursor      */
      int86(0x33, &regs, &regs);

      do {
          regs.x.ax = 3;              /* get mouse position        */
          int86(0x33, &regs, &regs);  /* and button status         */
          buttons = regs.x.bx & 3;
          x = regs.x.cx;
          y = regs.x.dx;
          gotoxy(0,0);                 /* display mouse position    */
          printf("X = %3d  Y = %3d", x, y);

      } while(buttons != 3);           /* exit if both buttons down */

      regs.x.ax = 2;                   /* hide mouse cursor         */
      int86(0x33, &regs, &regs);

      cls();                           /* display message and exit  */
      gotoxy(0,0);
      puts("Have a Mice Day!");
  }

  /*
      Clear the screen
  */
  void cls(void)
  {
      regs.x.ax = 0x0600;              /* ROM BIOS video driver     */
      regs.h.bh = 7;                   /* int 10h function 06h      */
      regs.x.cx = 0;                   /* initializes a window      */
      regs.h.dh = 24;
      regs.h.dl = 79;
      int86(0x10, &regs, &regs);
  }

  /*
      Position cursor to (x,y)
  */
  void gotoxy(int x, int y)
  {
      regs.h.dl = x;                   /* ROM BIOS video driver     */
      regs.h.dh = y;                   /* int 10h function 02h      */
      regs.h.bh = 0;                   /* positions the cursor      */
      regs.h.ah = 2;
      int86(0x10, &regs, &regs);
  }
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-3.  MOUDEMO.C: A simple Microsoft C program that polls the mouse
  and continually displays the coordinates of the mouse pointer in the upper
  left corner of the screen. The program uses the ROM BIOS video driver,
  which is discussed in Chapter 6, to clear the screen and position the
  text cursor.
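
  The values that Int 33H Function 03H returns in BX, CX, and DX usually
  need a little decoding before an application can use them. The portable
  sketch below assumes an 80-by-25 text mode, in which the mouse driver
  reports positions on a 640-by-200 virtual screen with 8-by-8 cells; the
  structure and function names are hypothetical, and the register values
  are passed in as arguments rather than obtained from the driver.

```c
#include <assert.h>

enum { MOUSE_LEFT = 0x01, MOUSE_RIGHT = 0x02 };   /* bits in BX */

struct mouse_state {
    int left;    /* nonzero if left button is down   */
    int right;   /* nonzero if right button is down  */
    int col;     /* text column, 0-79                */
    int row;     /* text row, 0-24                   */
};

/* Decode BX (buttons), CX (horizontal), and DX (vertical) as returned
   by Int 33H Function 03H, assuming an 80-by-25 text mode. */
struct mouse_state decode_mouse(unsigned bx, unsigned cx, unsigned dx)
{
    struct mouse_state m;

    assert(cx < 640 && dx < 200);     /* limits of the virtual screen */
    m.left  = (bx & MOUSE_LEFT)  != 0;
    m.right = (bx & MOUSE_RIGHT) != 0;
    m.col   = (int)(cx / 8);          /* 8 virtual pixels per column  */
    m.row   = (int)(dx / 8);          /* 8 virtual pixels per row     */
    return m;
}
```

  A position of CX = 632, DX = 192 therefore corresponds to the lower right
  character cell, column 79 of row 24.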



────────────────────────────────────────────────────────────────────────────
Chapter 6  Video Display

  The visual presentation of an application program is one of its most
  important elements. Users frequently base their conclusions about a
  program's performance and "polish" on the speed and attractiveness of its
  displays. Therefore, a feel for the computer system's display facilities
  and capabilities at all levels, from MS-DOS down to the bare hardware, is
  important to you as a programmer.


Video Display Adapters

  The video display adapters found in IBM PC-compatible computers have a
  hybrid interface to the central processor. The overall display
  characteristics, such as vertical and horizontal resolution, background
  color, and palette, are controlled by values written to I/O ports whose
  addresses are hardwired on the adapter, whereas the appearance of each
  individual character or graphics pixel on the display is controlled by a
  specific location within an area of memory called the regen buffer or
  refresh buffer. Both the CPU and the video controller access this memory;
  the software updates the display by simply writing character codes or bit
  patterns directly into the regen buffer. (This is called memory-mapped
  I/O.)
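
  To give a feel for what memory-mapped display I/O means in practice, here
  is a portable C sketch that works on a simulated regen buffer. It assumes
  an 80-by-25 text mode, where each character cell occupies two bytes (the
  character code followed by its attribute); the array and the function
  names are hypothetical stand-ins, and the segment value B800H used in the
  example is the color text buffer location.

```c
#include <assert.h>

enum { TEXT_COLS = 80, TEXT_ROWS = 25 };

/* Simulated regen buffer: 80 x 25 cells x 2 bytes = 4000 bytes. */
static unsigned char regen[TEXT_COLS * TEXT_ROWS * 2];

/* Byte offset of a character cell within the regen buffer. */
unsigned cell_offset(unsigned row, unsigned col)
{
    assert(row < TEXT_ROWS && col < TEXT_COLS);
    return (row * TEXT_COLS + col) * 2;
}

/* Convert a real-mode segment:offset pair to a physical address. */
unsigned long physical_address(unsigned segment, unsigned offset)
{
    return ((unsigned long)segment << 4) + offset;
}

/* "Display" a character by storing its code and attribute. */
void put_cell(unsigned row, unsigned col, char ch, unsigned char attr)
{
    unsigned o = cell_offset(row, col);

    regen[o]     = (unsigned char)ch;   /* even byte: character code */
    regen[o + 1] = attr;                /* odd byte: attribute       */
}
```

  The first cell of the second row lies at offset 160, which for a buffer
  at segment B800H is physical address B80A0H.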

  The following adapters are in common use as this book is being written:

  ■  Monochrome/Printer Display Adapter (MDA). Introduced with the original
     IBM PC in 1981, this adapter supports 80-by-25 text display on a green
     (monochrome) screen and has no graphics capabilities at all.

  ■  Color/Graphics Adapter (CGA). Also introduced by IBM in 1981, this
     adapter supports 40-by-25 and 80-by-25 text modes and 320-by-200,
     4-color or 640-by-200, 2-color graphics (all-points-addressable, or
     APA) modes on composite or digital RGB monitors.

  ■  Enhanced Graphics Adapter (EGA). Introduced by IBM in 1985 and upwardly
     compatible from the CGA, this adapter adds support for 640-by-350,
     16-color graphics modes on digital RGB monitors. It also supports an
     MDA-compatible text mode.

  ■  Multi-Color Graphics Array (MCGA). Introduced by IBM in 1987 with the
     Personal System/2 (PS/2) models 25 and 30, this adapter is partially
     compatible with the CGA and EGA and supports 640-by-480, 2-color or
     320-by-200, 256-color graphics on analog RGB monitors.

  ■  Video Graphics Array (VGA). Introduced by IBM in 1987 with the PS/2
     models 50, 60, and 80, this adapter is upwardly compatible from the EGA
     and supports 640-by-480, 16-color or 320-by-200, 256-color graphics on
     analog RGB monitors. It also supports an MDA-compatible text mode.

  ■  Hercules Graphics Card, Graphics CardPlus, and InColor Cards. These are
     upwardly compatible from the MDA for text display but offer graphics
     capabilities that are incompatible with all of the IBM adapters.

  The locations of the regen buffers for the various IBM PC-compatible
  adapters are shown in Figure 6-1.

         ┌───────────────────────────────────────────────────────┐
         │                       ROM BIOS                        │
  FE000H ├───────────────────────────────────────────────────────┤
         │          System ROM, Stand-alone BASIC, etc.          │
  F4000H ├───────────────────────────────────────────────────────┤
         │             Reserved for BIOS extensions              │
         │             (hard-disk controller, etc.)              │
  C0000H ├───────────────────────────────────────────────────────┤
         │                       Reserved                        │
  BC000H ├───────────────────────────────────────────────────────┤
         │    16 KB regen buffer for CGA, EGA, MCGA, and VGA     │
         │       in text modes and 200-line graphics modes       │
  B8000H ├───────────────────────────────────────────────────────┤
         │                       Reserved                        │
  B1000H ├───────────────────────────────────────────────────────┤
         │         4 KB Monochrome Adapter regen buffer          │
  B0000H ├───────────────────────────────────────────────────────┤
         │       Regen buffer area for EGA, MCGA, and VGA        │
         │        in 350-line or 480-line graphics modes         │
  A0000H ├───────────────────────────────────────────────────────┤
         │             Transient part of COMMAND.COM             │
         ├───────────────────────────────────────────────────────┤
         │                Transient program area                 │
  varies ├───────────────────────────────────────────────────────┤
         │                MS-DOS and its buffers,                │
         │              tables, and device drivers               │
  00400H ├───────────────────────────────────────────────────────┤
         │                   Interrupt vectors                   │
  00000H └───────────────────────────────────────────────────────┘

  Figure 6-1.  Memory diagram of an IBM PC-compatible personal computer,
  showing the locations of the regen buffers for various adapters.
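
  The three regen buffer areas in Figure 6-1 can also be written down as a
  small lookup table. The sketch below is illustrative only: the table and
  function names are hypothetical, and the address ranges are simply those
  shown in the figure.

```c
#include <assert.h>
#include <stddef.h>

struct region {
    unsigned long start;   /* first physical address in the region */
    unsigned long end;     /* last physical address in the region  */
    const char   *use;     /* what Figure 6-1 says lives there     */
};

static const struct region video_regions[] = {
    { 0xA0000UL, 0xAFFFFUL,
      "EGA, MCGA, VGA: 350-line or 480-line graphics modes" },
    { 0xB0000UL, 0xB0FFFUL,
      "Monochrome Adapter: 4 KB regen buffer" },
    { 0xB8000UL, 0xBBFFFUL,
      "CGA, EGA, MCGA, VGA: text and 200-line graphics modes" }
};

/* Return the description of the regen buffer area that contains the
   physical address, or NULL if it is not in any of them. */
const char *video_region(unsigned long addr)
{
    size_t i;

    assert(addr <= 0xFFFFFUL);          /* 1 MB real-mode address space */
    for (i = 0; i < sizeof video_regions / sizeof video_regions[0]; i++)
        if (addr >= video_regions[i].start && addr <= video_regions[i].end)
            return video_regions[i].use;
    return NULL;
}
```

  An address such as B1000H falls in the reserved gap above the Monochrome
  Adapter buffer, so the function returns NULL for it.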


Support Considerations

  MS-DOS offers several functions to transfer text to the display. Version 1
  supported only Teletype-like output capabilities; version 2 added an
  optional ANSI console driver to allow the programmer to clear the screen,
  position the cursor, and select colors and attributes with standard escape
  sequences embedded in the output. Programs that use only the MS-DOS
  functions will operate properly on any computer system that runs MS-DOS,
  regardless of the level of IBM hardware compatibility.
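
  As a sketch of what "standard escape sequences" means here, the following
  portable C functions build three common ANSI sequences as strings: clear
  screen, position cursor, and select attribute. The function names are
  hypothetical, and snprintf is a modern C library routine used for
  convenience. Whether the sequences have any effect depends on an ANSI
  driver being installed; the sketch only constructs them.

```c
#include <assert.h>
#include <stdio.h>

#define ESC "\x1b"      /* ASCII escape character, 1BH */

/* ESC [ 2 J : clear the screen. Returns the sequence length. */
int ansi_clear(char *out, size_t size)
{
    return snprintf(out, size, ESC "[2J");
}

/* ESC [ row ; col H : position the cursor (both values start at 1). */
int ansi_goto(char *out, size_t size, int row, int col)
{
    assert(row >= 1 && col >= 1);
    return snprintf(out, size, ESC "[%d;%dH", row, col);
}

/* ESC [ n m : select a display attribute (0 = normal, 7 = reverse). */
int ansi_attr(char *out, size_t size, int attr)
{
    assert(attr >= 0);
    return snprintf(out, size, ESC "[%dm", attr);
}
```

  A program would write the resulting string to the standard output with
  the same function it uses for ordinary text; the driver, not the
  application, interprets it.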

  On IBM PC-compatible machines, the ROM BIOS contains a video driver that
  programs can invoke directly, bypassing MS-DOS. The ROM BIOS functions
  allow a program to write text or individual pixels to the screen or to
  select display modes, video pages, palette, and foreground and background
  colors. These functions are relatively efficient (compared with the MS-DOS
  functions, at least), although the graphics support is primitive.

  Unfortunately, the display functions of both MS-DOS and the ROM BIOS were
  designed around the model of a cursor-addressable terminal and therefore
  do not fully exploit the capabilities of the memory-mapped, high-bandwidth
  display adapters used on IBM PC-compatible machines. As a result, nearly
  every popular interactive application with full-screen displays or
  graphics capability ignores both MS-DOS and the ROM BIOS and writes
  directly to the video controller's registers and regen buffer.

  Programs that control the hardware directly are sometimes called
  "ill-behaved," because they are performing operations that are normally
  reserved for operating-system device drivers. These programs are a severe
  management problem in multitasking real-mode environments such as DesqView
  and Microsoft Windows, and they are the main reason why such environments
  are not used more widely. It could be argued, however, that the blame for
  such problematic behavior lies not with the application programs but with
  the failure of MS-DOS and the ROM BIOS──even six years after the first
  appearance of the IBM PC──to provide display functions of adequate range
  and power.


MS-DOS Display Functions

  Under MS-DOS versions 2.0 and later, the preferred method for sending text
  to the display is to use handle-based Int 21H Function 40H (Write File or
  Device). When an application program receives control, MS-DOS has already
  assigned it handles for the standard output (1) and standard error (2)
  devices, and these handles can be used immediately. For example, the
  following sequence writes the message hello to the display using the
  standard output handle:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message to display
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,1        ; BX = standard output handle
          mov     cx,msg_len  ; CX = message length
          mov     dx,seg msg  ; DS:DX = address of message
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  If there is no error, the function returns the carry flag cleared and the
  number of characters actually transferred in register AX. Unless a Ctrl-Z
  is embedded in the text or the standard output is redirected to a disk
  file and the disk is full, this number should equal the number of
  characters requested.

  As in the case of keyboard input, the user can specify command-line
  redirection parameters that are invisible to the application, so if you
  use the predefined standard output handle, you can't always be sure where
  your output is going. To ensure that your output actually goes to the
  display, you can use the predefined standard error handle, which is
  always opened to the CON (logical console) device and is not
  redirectable.

  As an alternative to the standard output and standard error handles, you
  can bypass any output redirection and open a separate channel to CON,
  using the handle obtained from that open operation for character output.
  For example, the following code opens the console display for output and
  then writes the string hello to it:

  ──────────────────────────────────────────────────────────────────────────
  fname   db      'CON',0      ; name of CON device
  handle  dw      0            ; handle for CON device
  msg     db      'hello'      ; message to display
  msg_len equ     $-msg        ; length of message
          .
          .
          .
          mov     ax,3d02h     ; AH = function 3dh = open
                               ; AL = mode = read/write
          mov     dx,seg fname ; DS:DX = device name
          mov     ds,dx
          mov     dx,offset fname
          int     21h          ; transfer to MS-DOS
          jc      error        ; jump if open failed
          mov     handle,ax    ; save handle for CON
          .
          .
          .
          mov     ah,40h       ; function 40h = write
          mov     cx,msg_len   ; CX = message length
          mov     dx,seg msg   ; DS:DX = address of message
          mov     ds,dx
          mov     dx,offset msg
          mov     bx,handle    ; BX = CON device handle
          int     21h          ; transfer to MS-DOS
          jc      error        ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  As with the keyboard input functions, MS-DOS also supports traditional
  display functions that are upwardly compatible from the corresponding CP/M
  output calls:

  ■  Int 21H Function 02H sends the character in the DL register to the
     standard output device. It is sensitive to Ctrl-C interrupts, and it
     handles carriage returns, linefeeds, bell codes, and backspaces
     appropriately.

  ■  Int 21H Function 06H transfers the character in the DL register to the
     standard output device, but it is not sensitive to Ctrl-C interrupts.
     You must take care when using this function, because it can also be
     used for input and for status requests.

  ■  Int 21H Function 09H sends a string to the standard output device. The
     string is terminated by the $ character.

  With MS-DOS version 2 or later, these three traditional functions are
  converted internally to handle-based writes to the standard output and
  thus are susceptible to output redirection.

  The following sequence sounds a warning beep by sending an ASCII bell
  code (07H) to the display driver using the traditional character-output
  call Int 21H Function 02H:

  ──────────────────────────────────────────────────────────────────────────
          .
          .
          .
          mov     dl,7        ; 07h = ASCII bell code
          mov     ah,2        ; function 02h = display character
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The following sequence uses the traditional string-output call Int 21H
  Function 09H to display a string:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello$'
          .
          .
          .
          mov     dx,seg msg  ; DS:DX = message address
          mov     ds,dx
          mov     dx,offset msg
          mov     ah,9        ; function 09h = write string
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Note that MS-DOS detects the $ character as a terminator and does not
  display it on the screen.

Screen Control with MS-DOS Functions

  With version 2.0 or later, if MS-DOS loads the optional device driver
  ANSI.SYS in response to a DEVICE directive in the CONFIG.SYS file,
  programs can clear the screen, control the cursor position, and select
  foreground and background colors by embedding escape sequences in the text
  output. Escape sequences are so called because they begin with an escape
  character (1BH), which alerts the driver to intercept and interpret the
  subsequent characters in the sequence. When the ANSI driver is not loaded,
  MS-DOS simply passes the escape sequence to the display like any other
  text, usually resulting in a chaotic screen.

  The escape sequences that can be used with the ANSI driver for screen
  control are a subset of those defined in the ANSI X3.64-1979 Standard.
  These standard sequences are summarized in Figure 6-2. Note that case is
  significant for the last character in an escape sequence and that numbers
  must always be represented as ASCII digit strings, not as their binary
  values. (A separate set of escape sequences supported by ANSI.SYS, but not
  compatible with the ANSI standard, may be used for reprogramming and
  remapping the keyboard.)

  Escape sequence    Meaning
  ──────────────────────────────────────────────────────────────────────────
  Esc[2J             Clear screen; place cursor in upper left corner (home
                     position).
  Esc[K              Clear from cursor to end of line.
  Esc[row;colH       Position cursor. (Row is the y coordinate in the range
                     1-25 and col is the x coordinate in the range 1-80 for
                     80-by-25 text display modes.) Escape sequences
                     terminated with the letter f instead of H have the same
                     effect.
  Esc[nA             Move cursor up n rows.
  Esc[nB             Move cursor down n rows.
  Esc[nC             Move cursor right n columns.
  Esc[nD             Move cursor left n columns.
  Esc[s              Save current cursor position.
  Esc[u              Restore cursor to saved position.
  Esc[6n             Return current cursor position on the standard input
                     handle in the format Esc[row;colR.
  Esc[nm             Select character attributes:
                      0 = no special attributes
                      1 = high intensity
                      2 = low intensity
                      3 = italic
                      4 = underline
                      5 = blink
                      6 = rapid blink
                      7 = reverse video
                      8 = concealed text (no display)
                     30 = foreground black
                     31 = foreground red
                     32 = foreground green
                     33 = foreground yellow
                     34 = foreground blue
                     35 = foreground magenta
                     36 = foreground cyan
                     37 = foreground white
                     40 = background black
                     41 = background red
                     42 = background green
                     43 = background yellow
                     44 = background blue
                     45 = background magenta
                     46 = background cyan
                     47 = background white
  Esc[=nh            Select display mode:
                      0 = 40-by-25, 16-color text (color burst off)
                      1 = 40-by-25, 16-color text
                      2 = 80-by-25, 16-color text (color burst off)
                      3 = 80-by-25, 16-color text
                      4 = 320-by-200, 4-color graphics
                      5 = 320-by-200, 4-color graphics (color burst off)
                      6 = 640-by-200, 2-color graphics
                     14 = 640-by-200, 16-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     15 = 640-by-350, 2-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     16 = 640-by-350, 16-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     17 = 640-by-480, 2-color graphics (MCGA and VGA,
                     MS-DOS 4.0)
                     18 = 640-by-480, 16-color graphics (VGA, MS-DOS 4.0)
                     19 = 320-by-200, 256-color graphics (MCGA and VGA,
                     MS-DOS 4.0)
                     Escape sequences terminated with l instead of h have
                     the same effect.
  Esc[=7h            Enable line wrap.
  Esc[=7l            Disable line wrap.
  ──────────────────────────────────────────────────────────────────────────


  Figure 6-2.  The ANSI escape sequences supported by the MS-DOS ANSI.SYS
  driver. Programs running under MS-DOS 2.0 or later may use these
  functions, if ANSI.SYS is loaded, to control the appearance of the display
  in a hardware-independent manner. The symbol Esc indicates an ASCII escape
  code──a character with the value 1BH. Note that cursor positions in ANSI
  escape sequences are one-based, unlike the cursor coordinates used by the
  IBM ROM BIOS, which are zero-based. Numbers embedded in an escape sequence
  must always be represented as a string of ASCII digits, not as their
  binary values.

Binary Output Mode

  Under MS-DOS version 2 or later, you can substantially increase display
  speeds for well-behaved application programs without sacrificing hardware
  independence by selecting binary (raw) mode for the standard output. In
  binary mode, MS-DOS does not check between each character it transfers to
  the output device for a Ctrl-C waiting at the keyboard, nor does it filter
  the output string for certain characters such as Ctrl-Z.

  Bit 5 in the device information word associated with a device handle
  controls binary mode. Programs access the device information word by using
  Subfunctions 00H and 01H of the MS-DOS IOCTL function (I/O Control, Int
  21H Function 44H). For example, the following sequence places the
  standard output handle into binary mode.

  ──────────────────────────────────────────────────────────────────────────
                              ; get device information...
          mov     bx,1        ; standard output handle
          mov     ax,4400h    ; function 44h subfunction 00h
          int     21h         ; transfer to MS-DOS

          mov     dh,0        ; set upper byte of DX = 0
          or      dl,20h      ; set binary mode bit in DL

                              ; write device information...
                              ; (BX still has handle)
          mov     ax,4401h    ; function 44h subfunction 01h
          int     21h         ; transfer to MS-DOS
  ──────────────────────────────────────────────────────────────────────────

  Note that if a program changes the mode of any of the standard handles, it
  should restore those handles to ASCII (cooked) mode before it exits.
  Otherwise, subsequent application programs may behave in unexpected ways.
  For more detailed information on the IOCTL function, see Section II of
  this book, "MS-DOS Functions Reference."
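
  The bit manipulation that sits between the two IOCTL calls can be
  sketched in C as follows. This is only a model of the arithmetic: the
  Int 21H calls themselves are compiler-specific and are not shown, and the
  names raw_mode_word, cooked_mode_word, and is_raw are hypothetical
  helpers chosen for this example.

```c
#include <assert.h>

#define RAW_BIT 0x20u           /* bit 5 of the device information word */

/* Value to load into DX for IOCTL subfunction 01H to select binary mode.
   Like the assembly sequence, this clears the upper byte first. */
unsigned raw_mode_word(unsigned info)
{
    assert(info <= 0xFFFFu);    /* the device information is one word */
    return (info & 0x00FFu) | RAW_BIT;
}

/* Value to load into DX to restore ASCII (cooked) mode before exiting. */
unsigned cooked_mode_word(unsigned info)
{
    assert(info <= 0xFFFFu);
    return (info & 0x00FFu) & ~RAW_BIT;
}

/* Nonzero if the word returned by subfunction 00H shows binary mode. */
int is_raw(unsigned info)
{
    return (info & RAW_BIT) != 0;
}
```

  A program might save the word returned by Subfunction 00H at startup,
  write raw_mode_word of that value, and write cooked_mode_word of it again
  just before terminating.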


The ROM BIOS Display Functions

  You can somewhat improve the display performance of programs that are
  intended for use only on IBM PC-compatible machines by using the ROM BIOS
  video driver instead of the MS-DOS output functions. Accessed by means of
  Int 10H, the ROM BIOS driver supports the following functions for all of
  the currently available IBM display adapters:

  Function           Action
  ──────────────────────────────────────────────────────────────────────────
  Display mode control
  00H               Set display mode.
  0FH               Get display mode.

  Cursor control
  01H               Set cursor size.
  02H               Set cursor position.
  03H               Get cursor position and size.

  Writing to the display
  09H               Write character and attribute at cursor.
  0AH               Write character-only at cursor.
  0EH               Write character in teletype mode.

  Reading from the display
  08H               Read character and attribute at cursor.

  Graphics support
  0CH               Write pixel.
  0DH               Read pixel.

  Scroll or clear display
  06H               Scroll up or initialize window.
  07H               Scroll down or initialize window.

  Miscellaneous
  04H               Read light pen.
  05H               Select display page.
  0BH               Select palette/set border color.
  ──────────────────────────────────────────────────────────────────────────


  Additional ROM BIOS functions are available on the EGA, MCGA, VGA, and
  PCjr to support the enhanced features of these adapters, such as
  programmable palettes and character sets (fonts). Some of the functions
  are valid only in certain display modes.

  Each display mode is characterized by the number of colors it can display,
  its vertical resolution, its horizontal resolution, and whether it
  supports text or graphics memory mapping. The ROM BIOS identifies it with
  a unique number. Section III of this book, "IBM ROM BIOS and Mouse
  Functions Reference," documents all of the ROM BIOS Int 10H functions and
  display modes.

  As you can see from the preceding list, the ROM BIOS offers several
  desirable capabilities that are not available from MS-DOS, including
  initialization or scrolling of selected screen windows, modification of
  the cursor shape, and reading back the character being displayed at an
  arbitrary screen location. These functions can be used to isolate your
  program from the hardware on any IBM PC-compatible adapter. However, the
  ROM BIOS functions do not suffice for the needs of a high-performance,
  interactive, full-screen program such as a word processor. They do not
  support the rapid display of character strings at an arbitrary screen
  position, and they do not implement graphics operations at the level
  normally required by applications (for example, bit-block transfers and
  rapid drawing of lines, circles, and filled polygons). And, of course,
  they are of no use whatsoever in non-IBM display modes such as the
  monochrome graphics mode of the Hercules Graphics Card.

  Let's look at a simple example of a call to the ROM BIOS video driver. The
  following sequence writes the string hello to the screen:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'
  msg_len equ     $-msg
  color   equ     7           ; example foreground color value
          .
          .
          .
          mov     si,seg msg  ; DS:SI = message address
          mov     ds,si
          mov     si,offset msg
          mov     cx,msg_len  ; CX = message length
          cld
  next:   lodsb               ; get AL = next character
          push    si          ; save message pointer
          mov     ah,0eh      ; int 10h function 0eh = write
                              ; character in teletype mode
          mov     bh,0        ; assume video page 0
          mov     bl,color    ; (use in graphics modes only)
          int     10h         ; transfer to ROM BIOS
          pop     si          ; restore message pointer
          loop    next        ; loop until message done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  (Note that the SI and DI registers are not necessarily preserved across a
  call to a ROM BIOS video function.)


Memory-mapped Display Techniques

  Display performance is best when an application program takes over
  complete control of the video adapter and the refresh buffer. Because the
  display is memory-mapped, the speed at which characters can be put on the
  screen is limited only by the CPU's ability to copy bytes from one
  location in memory to another. The trade-off for this performance is that
  such programs are highly sensitive to hardware compatibility and do not
  always function properly on "clones" or even on new models of IBM video
  adapters.

Text Mode

  Direct programming of the IBM PC-compatible video adapters in their text
  display modes (sometimes also called alphanumeric display modes) is
  straightforward. The character set is the same for all, and the cursor
  home position──(x,y) = (0,0)──is defined to be the upper left corner of
  the screen (Figure 6-3). The MDA uses 4 KB of memory starting at segment
  B000H as a regen buffer, and the various adapters with both text and
  graphics capabilities (CGA, EGA, MCGA, and VGA) use 16 KB of memory
  starting at segment B800H. (See Figure 6-1.) In the latter case, the 16
  KB is divided into "pages" that can be independently updated and
  displayed.

   (0,0)┌─────────────────────────────────┐(79,0)
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
  (0,24)└─────────────────────────────────┘(79,24)

  Figure 6-3.  Cursor addressing for 80-by-25 text display modes (IBM ROM
  BIOS modes 2, 3, and 7).

  Each character-display position is allotted 2 bytes in the regen buffer.
  The first byte (even address) contains the ASCII code of the character,
  which is translated by a special hardware character generator into a
  dot-matrix pattern for the screen. The second byte (odd address) is the
  attribute byte. Several bit fields in this byte control such features as
  blinking, intensity (highlighting), and reverse video, depending on the
  adapter type and display mode (Figures 6-4 and 6-5). Figure 6-6 shows a
  hex and ASCII dump of part of the video map for the MDA.

  Display                  Background              Foreground
  ──────────────────────────────────────────────────────────────────────────
  No display (black)       000                     000
  No display (white)       111                     111
  Underline                000                     001
  Normal video             000                     111
  Reverse video            111                     000
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-4.  Attribute byte for 80-by-25 monochrome text display mode on
  the MDA, Hercules cards, EGA, and VGA (IBM ROM BIOS mode 7).

  Value              Color
  ──────────────────────────────────────────────────────────────────────────
   0                 Black
   1                 Blue
   2                 Green
   3                 Cyan
   4                 Red
   5                 Magenta
   6                 Brown
   7                 White
   8                 Gray
   9                 Light blue
  10                 Light green
  11                 Light cyan
  12                 Light red
  13                 Light magenta
  14                 Yellow
  15                 Intense white
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-5.  Attribute byte for the 40-by-25 and 80-by-25 text display
  modes on the CGA, EGA, MCGA, and VGA (IBM ROM BIOS modes 0-3). The table
  of color values assumes default palette programming and that the B or I
  bit controls intensity.
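
  The following C sketch shows one way to assemble an attribute byte from
  its fields. It assumes the conventional layout for these adapters, which
  is not spelled out in the tables above: bit 7 is the B (blink) bit, bits
  4-6 hold the background value, bit 3 is the I (intensity) bit, and bits
  0-2 hold the foreground value. The function name make_attr is a
  hypothetical helper for this example.

```c
#include <assert.h>

/* Compose a text-mode attribute byte.
   fg:    foreground value 0-15 (values 8-15 have the intensity bit set)
   bg:    background value 0-7
   blink: nonzero sets bit 7 */
unsigned char make_attr(int fg, int bg, int blink)
{
    assert(fg >= 0 && fg <= 15);
    assert(bg >= 0 && bg <= 7);
    return (unsigned char)((blink ? 0x80 : 0) | (bg << 4) | fg);
}
```

  Under these assumptions, make_attr(7, 0, 0) gives 07H (normal video),
  make_attr(0, 7, 0) gives 70H (reverse video), and make_attr(14, 1, 0)
  gives 1EH (yellow on blue with the default palette).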

  ──────────────────────────────────────────────────────────────────────────
  B000:0000 3e 07 73 07 65 07 6c 07 65 07 63 07 74 07 20 07
  B000:0010 74 07 65 07 6d 07 70 07 20 07 20 07 20 07 20 07
  B000:0020 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0030 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0040 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0050 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0060 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0070 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0080 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0090 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-6.  Example dump of the first 160 bytes of the MDA's regen
  buffer. These bytes correspond to the first visible line on the screen.
  Note that ASCII character codes are stored in even bytes and their
  respective character attributes in odd bytes; all the characters in this
  example line have the attribute normal video.

  You can calculate the memory offset of any character on the display as the
  line number (y coordinate) times 80 characters per line times 2 bytes per
  character, plus the column number (x coordinate) times 2 bytes per
  character, plus (for the text/graphics adapters) the page number times the
  size of the page (4 KB per page in 80-by-25 modes; 2 KB per page in
  40-by-25 modes). In short, the formula for the offset of the
  character-attribute pair for a given screen position (x,y) in 80-by-25
  text modes is

    offset = ((y * 50H + x) * 2) + (page * 1000H)

  In 40-by-25 text modes, where each line holds 40 (28H) characters, the
  formula is

    offset = ((y * 28H + x) * 2) + (page * 0800H)

  Of course, the segment register being used to address the video buffer
  must be set appropriately, depending on the type of display adapter.
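
  The offset calculation can be expressed as a small C function. This is a
  sketch rather than production code: text_offset is a hypothetical name,
  and the caller supplies the number of columns (80 or 40) and the page
  size (1000H or 0800H) for the mode in use. Note that the row multiplier
  is the line width in characters, so it is 50H in 80-column modes and 28H
  in 40-column modes.

```c
#include <assert.h>

/* Offset of the character-attribute pair at zero-based (x,y) on a page.
   columns:   characters per line (80 or 40)
   page_size: bytes per page (0x1000 for 80-by-25, 0x0800 for 40-by-25) */
unsigned text_offset(unsigned x, unsigned y, unsigned page,
                     unsigned columns, unsigned page_size)
{
    assert(x < columns && y < 25);
    return ((y * columns + x) * 2) + (page * page_size);
}
```

  For instance, the lower right corner (79,24) of page 0 in an 80-by-25
  mode lies at offset 3998 (0F9EH), the last character cell before the
  unused bytes at the end of the 4 KB page.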

  As a simple example, assume that the character to be displayed is in the
  AL register, the desired attribute byte for the character is in the AH
  register, the x coordinate (column) is in the BX register, and the y
  coordinate (row) is in the CX register. The following code stores the
  character and attribute byte into the MDA's video refresh buffer at the
  proper location:

  ──────────────────────────────────────────────────────────────────────────
          push    ax          ; save char and attribute
          mov     ax,160
          mul     cx          ; DX:AX = Y * 160
          shl     bx,1        ; multiply X by 2
          add     bx,ax       ; BX = (Y*160) + (X*2)
          mov     ax,0b000h   ; ES = segment of monochrome
          mov     es,ax       ; adapter refresh buffer
          pop     ax          ; restore char and attribute
          mov     es:[bx],ax  ; write them to video buffer
  ──────────────────────────────────────────────────────────────────────────

  More frequently, we wish to move entire strings into the refresh buffer,
  starting at a given coordinate. In the next example, assume that the DS:SI
  registers point to the source string, the ES:DI registers point to the
  starting position in the video buffer (calculated as shown in the previous
  example), the AH register contains the attribute byte to be assigned to
  every character in the string, and the CX register contains the length of
  the string. The following code moves the entire string into the refresh
  buffer:

  ──────────────────────────────────────────────────────────────────────────
          cld                 ; make SI and DI auto-increment
  xfer:   lodsb               ; fetch next character
          stosw               ; store char + attribute
          loop    xfer        ; until all chars moved
  ──────────────────────────────────────────────────────────────────────────

  Of course, the video drivers written for actual application programs must
  take into account many additional factors, such as checking for special
  control codes (linefeeds, carriage returns, tabs), line wrap, and
  scrolling.
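
  The string transfer can be modeled in C by treating the regen buffer as
  an ordinary array of bytes. The sketch below is an illustration only:
  put_string is a hypothetical helper, the buffer is a simulated one in
  normal memory, and on a real machine the pointer would have to be a
  compiler-specific far pointer to segment B000H or B800H. Like the
  assembly loop, it makes no attempt at wrapping, scrolling, or
  control-code handling.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 80
#define ROWS 25

/* Copy len characters of s into an 80-by-25 regen buffer image, starting
   at zero-based (x,y), giving every character the same attribute. */
void put_string(unsigned char *regen, unsigned x, unsigned y,
                unsigned char attr, const char *s, size_t len)
{
    unsigned char *p;
    size_t i;

    assert(x < COLS && y < ROWS);
    assert(len <= (size_t)(COLS * ROWS) - (y * COLS + x));

    p = regen + (y * COLS + x) * 2;
    for (i = 0; i < len; i++) {
        *p++ = (unsigned char)s[i];     /* even address: character */
        *p++ = attr;                    /* odd address: attribute  */
    }
}
```

  Calling put_string(regen, 0, 0, 0x07, ">select temp", 12) on a cleared
  buffer produces the same leading bytes as the dump in Figure 6-6:
  3e 07 73 07 65 07 and so on.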

  Programs that write characters directly to the CGA regen buffer in text
  modes must deal with an additional complicating factor──they must examine
  the video controller's status port and access the refresh buffer only
  during the horizontal retrace or vertical retrace intervals. (A retrace
  interval is the period when the electron beam that illuminates the screen
  phosphors is being repositioned to the start of a new scan line.)
  Otherwise, the contention for memory between the CPU and the video
  controller is manifest as unsightly "snow" on the display. (If you are
  writing programs for any of the other IBM PC-compatible video adapters,
  such as the MDA, EGA, MCGA, or VGA, you can ignore the retrace intervals;
  snow is not a problem with these video controllers.)

  A program can detect the occurrence of a retrace interval by monitoring
  certain bits in the video controller's status register. For example,
  assume that the offset for the desired character position has been
  calculated as in the preceding example and placed in the BX register, the
  segment for the CGA's refresh buffer is in the ES register, and an ASCII
  character code to be displayed is in the CL register. The following code
  waits for the beginning of a new horizontal retrace interval and then
  writes the character into the buffer:

  ──────────────────────────────────────────────────────────────────────────
          mov     dx,03dah    ; DX = video controller's
                              ; status port address
          cli                 ; disable interrupts

                              ; if retrace is already
                              ; in progress, wait for
                              ; it to end...
  wait1:  in      al,dx       ; read status port
          and     al,1        ; check if retrace bit on
          jnz     wait1       ; yes, wait

                              ; wait for new retrace
                              ; interval to start...
  wait2:  in      al,dx       ; read status port
          and     al,1        ; retrace bit on yet?
          jz      wait2       ; jump if not yet on

          mov     es:[bx],cl  ; write character to
                              ; the regen buffer
          sti                 ; enable interrupts again
  ──────────────────────────────────────────────────────────────────────────

  The first wait loop "synchronizes" the code to the beginning of a
  horizontal retrace interval. If only the second wait loop were used (that
  is, if a character were written when a retrace interval was already in
  progress), the write would occasionally begin so close to the end of a
  horizontal retrace "window" that it would partially miss the retrace,
  resulting in scattered snow at the left edge of the display. Notice that
  the code also disables interrupts during accesses to the video buffer, so
  that service of a hardware interrupt won't disrupt the synchronization
  process.
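
  The logic of the two wait loops can be examined away from the hardware
  with a small C model. Everything here is an assumption made for the sake
  of the sketch: read_status stands in for reading port 3DAH, fake_status
  replays a made-up pattern of retrace-bit values, and the disabling of
  interrupts and the buffer write itself are left out.

```c
#include <assert.h>

typedef unsigned char (*status_reader)(void);

/* Wait for the start of a new retrace interval: first let any retrace
   already in progress finish, then wait for the retrace bit (bit 0) to
   come on. Returns the number of status reads that were needed. */
unsigned wait_for_retrace(status_reader read_status)
{
    unsigned reads = 0;

    do {                        /* wait1: retrace already in progress? */
        reads++;
    } while (read_status() & 1);

    do {                        /* wait2: wait for a new retrace */
        reads++;
    } while (!(read_status() & 1));

    return reads;
}

/* Simulated status port: cycles through a fixed, invented pattern. */
static const unsigned char pattern[] = { 1, 1, 0, 0, 0, 1 };
static unsigned pattern_pos = 0;

unsigned char fake_status(void)
{
    unsigned char value = pattern[pattern_pos];

    pattern_pos = (pattern_pos + 1) % (sizeof pattern / sizeof pattern[0]);
    return value;
}
```

  With this pattern the first loop consumes the tail of the retrace that
  was already under way (1, 1, 0) and the second loop returns only when the
  bit turns on again, so the write that follows would begin at the very
  start of a retrace window.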

  Because of the retrace-interval constraints just outlined, the rate at
  which you can update the CGA in text modes is severely limited when the
  updating is done one character at a time. You can obtain better results by
  calculating all the relevant addresses and setting up the appropriate
  registers, disabling the video controller by writing to register 3D8H,
  moving the entire string to the buffer with a REP MOVSW operation, and
  then reenabling the video controller. If the string is of reasonable
  length, the user won't even notice a flicker in the display. Of course,
  this procedure introduces additional hardware dependence into your code
  because it requires much greater knowledge of the 6845 controller.
  Luckily, snow is not a problem in CGA graphics modes.

Graphics Mode

  Graphics-mode memory-mapped programming for IBM PC-compatible adapters is
  considerably more complicated than text-mode programming. Each bit or
  group of bits in the regen buffer corresponds to an addressable point, or
  pixel, on the screen. The mapping of bits to pixels differs for each of
  the available graphics modes, with their differences in resolution and
  number of supported colors. The newer adapters (EGA, MCGA, and VGA) also
  use the concept of bit planes, where bits of a pixel are segregated into
  multiple banks of memory mapped at the same address; you must manipulate
  these bit planes by a combination of memory-mapped I/O and port
  addressing.

  IBM-video-systems graphics programming is a subject large enough for a
  book of its own, but we can use the 640-by-200, 2-color graphics display
  mode of the CGA (which is also supported by all subsequent IBM
  text/graphics adapters) to illustrate a few of the techniques involved.
  This mode is simple to deal with because each pixel is represented by a
  single bit. The pixels are assigned (x,y) coordinates in the range (0,0)
  through (639,199), where x is the horizontal displacement, y is the
  vertical displacement, and the home position (0,0) is the upper left
  corner of the display. (See Figure 6-7.)

    (0,0)┌─────────────────────────────────┐(639,0)
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
  (0,199)└─────────────────────────────────┘(639,199)

  Figure 6-7.  Point addressing for 640-by-200, 2-color graphics modes on
  the CGA, EGA, MCGA, and VGA (IBM ROM BIOS mode 6).

  Each successive group of 80 bytes (640 bits) represents one horizontal
  scan line. Within each byte, the bits map one-for-one onto pixels, with
  the most significant bit corresponding to the leftmost displayed pixel of
  a set of eight pixels and the least significant bit corresponding to the
  rightmost displayed pixel of the set. The memory map is set up so that all
  the even y coordinates are scanned as a set and all the odd y coordinates
  are scanned as a set; this mapping is referred to as the memory interlace.

  To find the regen buffer offset for a particular (x,y) coordinate, you
  would use the following formula:

    offset = ((y AND 1) * 2000H) + (y/2 * 50H) + (x/8)

  The assembly-language implementation of this formula is as follows:

  ──────────────────────────────────────────────────────────────────────────
                              ; assume AX = Y, BX = X
          shr     bx,1        ; divide X by 8
          shr     bx,1
          shr     bx,1
          push    ax          ; save copy of Y
          shr     ax,1        ; find (Y/2) * 50h
          mov     cx,50h      ; with product in DX:AX
          mul     cx
          add     bx,ax       ; add product to X/8
          pop     ax          ; add (Y AND 1) * 2000h
          and     ax,1
          jz      label1
          add     bx,2000h
  label1:                     ; now BX = offset into
                              ; video buffer
  ──────────────────────────────────────────────────────────────────────────
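
  For readers working in C, the same offset calculation can be sketched as
  a small function. This is only an illustration of the formula above; the
  name cga_offset is hypothetical and does not appear elsewhere in this
  book.

```c
#include <stdint.h>

/* Byte offset of pixel (x, y) within the mode 6 regen buffer.
   Even scan lines start at offset 0000H, odd scan lines at 2000H,
   and each scan line occupies 80 (50H) bytes.
   cga_offset is a hypothetical name used only for this sketch. */
uint16_t cga_offset(unsigned x, unsigned y)
{
    return (uint16_t)(((y & 1u) * 0x2000u)   /* select interlace bank  */
                    + ((y / 2u) * 0x50u)     /* scan line within bank  */
                    + (x / 8u));             /* byte within scan line  */
}
```

  With this sketch, the pixel at (8,2) falls in byte 51H of the even bank
  and the pixel at (639,199) falls in byte 3F3FH.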

  After calculating the correct byte address, you can use the following
  formula to calculate the bit position for a given pixel coordinate:

    bit = 7 - (x MOD 8)

  where bit 7 is the most significant bit and bit 0 is the least significant
  bit. It is easiest to build an 8-byte table, or array of bit masks, and
  use the operation X AND 7 to extract the appropriate entry from the table:

  (X AND 7)          Bit mask          (X AND 7)          Bit mask
  ──────────────────────────────────────────────────────────────────────────
  0                  80H               4                  08H
  1                  40H               5                  04H
  2                  20H               6                  02H
  3                  10H               7                  01H
  ──────────────────────────────────────────────────────────────────────────

  The assembly-language implementation of this second calculation is as
  follows:

  ──────────────────────────────────────────────────────────────────────────
  table   db      80h         ; X AND 7 = offset 0
          db      40h         ; X AND 7 = offset 1
          db      20h         ; X AND 7 = offset 2
          db      10h         ; X AND 7 = offset 3
          db      08h         ; X AND 7 = offset 4
          db      04h         ; X AND 7 = offset 5
          db      02h         ; X AND 7 = offset 6
          db      01h         ; X AND 7 = offset 7
          .
          .
          .
                              ; assume BX = X coordinate
          and     bx,7        ; isolate 0-7 offset
          mov     al,[bx+table]
                              ; now AL = mask from table
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The program can then use the mask, together with the byte offset
  previously calculated, to set or clear the appropriate bit in the video
  controller's regen buffer.
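
  The two calculations can be combined into one routine. The C sketch below
  is a minimal illustration that operates on an ordinary 16 KB array
  standing in for the regen buffer, so it makes no assumption about how the
  program reaches video memory; the names put_pixel and bit_mask are
  hypothetical.

```c
#include <stdint.h>

/* Mask table indexed by (x AND 7); bit 7 is the leftmost pixel. */
static const uint8_t bit_mask[8] = {
    0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01
};

/* Set (on != 0) or clear (on == 0) the pixel at (x, y).
   regen points to a 16 KB buffer laid out like the mode 6 regen
   buffer; in this sketch it is an ordinary array, not video memory. */
void put_pixel(uint8_t *regen, unsigned x, unsigned y, int on)
{
    unsigned offset = ((y & 1u) * 0x2000u) + ((y / 2u) * 0x50u) + (x / 8u);
    uint8_t  mask   = bit_mask[x & 7u];

    if (on)
        regen[offset] |= mask;              /* OR the bit in   */
    else
        regen[offset] &= (uint8_t)~mask;    /* AND the bit out */
}
```

  Setting a pixel ORs the mask into the byte, and clearing it ANDs the byte
  with the complement of the mask, so the other seven pixels that share the
  byte are left untouched.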



────────────────────────────────────────────────────────────────────────────
Chapter 7  Printer and Serial Port

  MS-DOS supports printers, plotters, modems, and other hard-copy output or
  communication devices with device drivers for parallel ports and serial
  ports. Parallel ports are so named because they transfer a byte──8 bits──
  in parallel to the destination device over eight separate physical paths
  (plus additional status and handshaking signals). The serial port, on the
  other hand, communicates with the CPU with bytes but sends data to or
  receives data from its destination device serially──a bit at a time──over
  a single physical connection.

  Parallel ports are typically used for high-speed output devices, such as
  line printers, over relatively short distances (less than 50 feet). They
  are rarely used for devices that require two-way communication with the
  computer. Serial ports are used for lower-speed devices, such as modems
  and terminals, that require two-way communication (although some printers
  also have serial interfaces). A serial port can drive its device reliably
  over much greater distances (up to 1000 feet) over as few as three wires──
  transmit, receive, and ground.

  The most commonly used type of serial interface follows a standard called
  RS-232. This standard specifies a 25-wire interface with certain
  electrical characteristics, the use of various handshaking signals, and a
  standard DB-25 connector. Other serial-interface standards exist──for
  example, the RS-422, which is capable of considerably higher speeds than
  the RS-232──but these are rarely used in personal computers (except for
  the Apple Macintosh) at this time.

  MS-DOS has built-in device drivers for three parallel adapters, and for
  two serial adapters on the PC or PC/AT and three serial adapters on the
  PS/2. The logical names for these devices are LPT1, LPT2, LPT3, COM1,
  COM2, and COM3. The standard printer (PRN) and standard auxiliary (AUX)
  devices are normally aliased to LPT1 and COM1, but you can redirect PRN to
  one of the serial ports with the MS-DOS MODE command.

  As with keyboard and video display I/O, you can manage printer and
  serial-port I/O at several levels that offer different degrees of
  flexibility and hardware independence:

  ■  MS-DOS handle-oriented functions

  ■  MS-DOS traditional character functions

  ■  IBM ROM BIOS driver functions

  In the case of the serial port, direct control of the hardware by
  application programs is also common. I will discuss each of these I/O
  methods briefly, with examples, in the following pages.


Printer Output

  The preferred method of printer output is to use the handle write function
  (Int 21H Function 40H) with the predefined standard printer handle (4).
  For example, you could write the string hello to the printer as follows:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,4        ; BX = standard printer handle
          mov     cx,msg_len  ; CX = length of string
          mov     dx,seg msg  ; DS:DX = string address
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  If there is no error, the function returns the carry flag cleared and the
  number of characters actually transferred to the list device in register
  AX. Under normal circumstances, this number should always be the same as
  the length requested and the carry flag indicating an error should never
  be set. However, the output will terminate early if your data contains an
  end-of-file mark (Ctrl-Z).

  You can write independently to several list devices (for example, LPT1,
  LPT2) by issuing a specific open request (Int 21H Function 3DH) for each
  device and using the handles returned to access the printers individually
  with Int 21H Function 40H. You have already seen this general approach in
  Chapters 5 and 6.

  An alternative method of printer output is to use the traditional Int 21H
  Function 05H, which transfers the character in the DL register to the
  printer. (This function is sensitive to Ctrl-C interrupts.) For example,
  the following assembly-language code sequence would write the string
  hello to the line printer:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = string length

  next:   mov     dl,[bx]     ; get next character
          mov     ah,5        ; function 05h = printer output
          int     21h         ; transfer to MS-DOS
          inc     bx          ; bump string pointer
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Programs that run on IBM PC─compatible machines can obtain improved
  printer throughput by bypassing MS-DOS and calling the ROM BIOS printer
  driver directly by means of Int 17H. Section III of this book, "IBM ROM
  BIOS and Mouse Functions Reference," documents the Int 17H functions in
  detail. Use of the ROM BIOS functions also allows your program to test
  whether the printer is off line or out of paper, a capability that MS-DOS
  does not offer.
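
  As a rough illustration of that status test, the C sketch below decodes
  the status byte that the ROM BIOS printer driver returns in register AH.
  The bit assignments are the ones I understand the IBM ROM BIOS to use;
  treat Section III as the authority. The macro and function names are
  hypothetical.

```c
#include <stdint.h>

/* Printer status bits as returned in AH by Int 17H (assumed layout). */
#define PRN_TIMEOUT       0x01u   /* printer timed out       */
#define PRN_IO_ERROR      0x08u   /* I/O error               */
#define PRN_SELECTED      0x10u   /* printer selected        */
#define PRN_OUT_OF_PAPER  0x20u   /* out of paper            */
#define PRN_NOT_BUSY      0x80u   /* printer not busy        */

/* Nonzero if the paper-out bit is set. */
int printer_out_of_paper(uint8_t status)
{
    return (status & PRN_OUT_OF_PAPER) != 0;
}

/* Nonzero if the printer is not selected, that is, off line. */
int printer_off_line(uint8_t status)
{
    return (status & PRN_SELECTED) == 0;
}

/* Nonzero if the printer is selected, idle, and reports no fault. */
int printer_ready(uint8_t status)
{
    const unsigned good = PRN_NOT_BUSY | PRN_SELECTED;
    const unsigned bad  = PRN_OUT_OF_PAPER | PRN_IO_ERROR | PRN_TIMEOUT;

    return (status & good) == good && (status & bad) == 0;
}
```

  A program would typically obtain the status byte first and send data only
  if a check such as printer_ready succeeds.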

  For example, the following sequence of instructions calls the ROM BIOS
  printer driver to send the string hello to the line printer:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = string length
          mov     dx,0        ; DX = printer number

  next:   mov     al,[bx]     ; AL = character to print
          mov     ah,0        ; function 00h = printer output
          int     17h         ; transfer to ROM BIOS
          inc     bx          ; bump string pointer
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Note that the printer numbers used by the ROM BIOS are zero-based, whereas
  the printer numbers in MS-DOS logical-device names are one-based. For
  example, ROM BIOS printer 0 corresponds to LPT1.

  Finally, the most hardware-dependent technique of printer output is to
  access the printer controller directly. Considering the functionality
  already provided in MS-DOS and the IBM ROM BIOS, as well as the speeds of
  the devices involved, I cannot see any justification for using direct
  hardware control in this case. The disadvantage of introducing such
  extreme hardware dependence for such a low-speed device would far outweigh
  any small performance gains that might be obtained.


The Serial Port

  MS-DOS support for serial ports (often referred to as the auxiliary device
  in MS-DOS manuals) is weak compared with its keyboard, video-display, and
  printer support. This is one area where the application programmer is
  justified in making programs hardware dependent to extract adequate
  performance.

  Programs that restrict themselves to MS-DOS functions to ensure
  portability can use the handle read and write functions (Int 21H Functions
  3FH and 40H), with the predefined standard auxiliary handle (3) to
  access the serial port. For example, the following code writes the string
  hello to the serial port that is currently defined as the AUX device:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for serial port
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,3        ; BX = standard aux handle
          mov     cx,msg_len  ; CX = string length
          mov     dx,seg msg  ; DS:DX = string address
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The standard auxiliary handle gives access to only the first serial port
  (COM1). If you want to read or write COM2 and COM3 using the handle calls,
  you must issue an open request (Int 21H Function 3DH) for the desired
  serial port and use the handle returned by that function with Int 21H
  Functions 3FH and 40H.

  Some versions of MS-DOS have a bug in character-device handling that
  manifests itself as follows: If you issue a read request with Int 21H
  Function 3FH for the exact number of characters that are waiting in the
  driver's buffer, the length returned in the AX register is the number of
  characters transferred minus one. You can circumvent this problem by
  always requesting more characters than you expect to receive or by placing
  the device handle into binary mode using Int 21H Function 44H.
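
  The second workaround comes down to one bit in the handle's device
  information word. The C sketch below shows only the arithmetic, on the
  assumption that the program reads the word with Function 44H Subfunction
  00H and writes it back with Subfunction 01H, which expects the upper byte
  to be zero. The names are hypothetical.

```c
#include <stdint.h>

#define DEV_RAW_MODE 0x0020u   /* bit 5: binary (raw) mode */

/* Given the device information word read for a character-device
   handle, return the value to write back in order to select
   binary mode. The upper byte is cleared because the set-device-
   information call is assumed to require DH = 0. */
uint16_t raw_mode_info(uint16_t info)
{
    return (uint16_t)((info & 0x00FFu) | DEV_RAW_MODE);
}
```

  For instance, a device information word of 80C0H would be written back as
  00E0H.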

  MS-DOS also supports two traditional functions for serial-port I/O. Int
  21H Function 03H inputs a character from COM1 and returns it in the AL
  register; Int 21H Function 04H transmits the character in the DL register
  to COM1. Like the other traditional calls, these two are direct
  descendants of the CP/M auxiliary-device functions.

  For example, the following code sends the string hello to COM1 using the
  traditional Int 21H Function 04H:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for serial port
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = length of string

  next:   mov     dl,[bx]     ; get next character
          mov     ah,4        ; function 04h = aux output
          int     21h         ; transfer to MS-DOS
          inc     bx          ; bump pointer to string
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  MS-DOS translates the traditional auxiliary-device functions into calls on
  the same device driver used by the handle calls. Therefore, it is
  generally preferable to use the handle functions in the first place,
  because they allow very long strings to be read or written in one
  operation, they give access to serial ports other than COM1, and they are
  symmetrical with the handle video-display, keyboard, printer, and file I/O
  methods described elsewhere in this book.

  Although the handle or traditional serial-port functions allow you to
  write programs that are portable to any machine running MS-DOS, they have
  a number of disadvantages:

  ■  The built-in MS-DOS serial-port driver is slow and is not interrupt
     driven.

  ■  MS-DOS serial-port I/O is not buffered.

  ■  Determining the status of the auxiliary device requires a separate call
     to the IOCTL function (Int 21H Function 44H)──if you request input and
     no characters are ready, your program will simply hang.

  ■  MS-DOS offers no standardized function to configure the serial port
     from within a program.

  For programs that are going to run on the IBM PC or compatibles, a more
  flexible technique for serial-port I/O is to call the IBM ROM BIOS
  serial-port driver by means of Int 14H. You can use this driver to
  initialize the serial port to a desired configuration and baud rate,
  examine the status of the controller, and read or write characters.
  Section III of this book, "IBM ROM BIOS and Mouse Functions Reference,"
  documents the functions available from the ROM BIOS serial-port driver.
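
  To make the configuration step more concrete, the C sketch below builds
  the parameter byte that the ROM BIOS initialization function (Int 14H
  Function 00H) expects in register AL. The field layout is the one I
  understand the PC and PC/AT ROM BIOS to use; check it against Section III
  before relying on it. The names are hypothetical.

```c
#include <stdint.h>

/* Parity codes for bits 4-3 of the parameter byte (assumed layout). */
enum parity { PARITY_NONE = 0, PARITY_ODD = 1, PARITY_EVEN = 3 };

/* Build the initialization byte:
     bits 7-5  baud rate code (110 through 9600 baud)
     bits 4-3  parity
     bit  2    stop bits (0 = 1 stop bit, 1 = 2 stop bits)
     bits 1-0  word length (10B = 7 bits, 11B = 8 bits)
   Returns -1 if a parameter cannot be encoded. */
int com_init_byte(unsigned baud, enum parity par,
                  unsigned stop_bits, unsigned word_len)
{
    static const unsigned rates[8] = {
        110, 150, 300, 600, 1200, 2400, 4800, 9600
    };
    unsigned code;

    for (code = 0; code < 8; code++)
        if (rates[code] == baud)
            break;
    if (code == 8)
        return -1;                          /* unsupported baud rate */
    if (stop_bits != 1 && stop_bits != 2)
        return -1;
    if (word_len != 7 && word_len != 8)
        return -1;

    return (int)((code << 5) | ((unsigned)par << 3)
               | ((stop_bits - 1u) << 2) | (word_len - 5u));
}
```

  Under these assumptions, 2400 baud with no parity, 8 data bits, and 1
  stop bit encodes as A3H, and 1200 baud with even parity, 7 data bits, and
  1 stop bit encodes as 9AH.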

  For example, the following sequence sends the character X to the first
  serial port (COM1):

  ──────────────────────────────────────────────────────────────────────────
          .
          .
          .
          mov     ah,1        ; function 01h = send character
          mov     al,'X'      ; AL = character to transmit
          mov     dx,0        ; DX = serial-port number
          int     14h         ; transfer to ROM BIOS
          and     ah,80h      ; did transmit fail?
          jnz     error       ; jump if transmit error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────
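
  The sequence above tests only bit 7 of the status returned in AH. As a
  hedged illustration, the C sketch below names the remaining line-status
  bits as I understand them; the same byte is what the ROM BIOS status
  function reports, and bit 5 is the transmit-buffer-empty bit that a
  polling output routine waits for. The names are hypothetical.

```c
#include <stdint.h>

/* Line-status bits returned in AH by Int 14H (assumed layout). */
#define LS_DATA_READY  0x01u   /* received data ready               */
#define LS_OVERRUN     0x02u   /* overrun error                     */
#define LS_PARITY      0x04u   /* parity error                      */
#define LS_FRAMING     0x08u   /* framing error                     */
#define LS_BREAK       0x10u   /* break detected                    */
#define LS_THR_EMPTY   0x20u   /* transmit holding register empty   */
#define LS_TSR_EMPTY   0x40u   /* transmit shift register empty     */
#define LS_TIMEOUT     0x80u   /* timed out: operation failed       */

/* Nonzero if a send or receive request failed (bit 7 set). */
int com_request_failed(uint8_t ah)
{
    return (ah & LS_TIMEOUT) != 0;
}

/* Nonzero if the status reports an overrun, parity, or framing error. */
int com_rx_error(uint8_t ah)
{
    return (ah & (LS_OVERRUN | LS_PARITY | LS_FRAMING)) != 0;
}
```

  A status of 60H, for example, means both transmit registers are empty and
  no error is pending.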

  As with the ROM BIOS printer driver, the serial-port numbers used by the
  ROM BIOS are zero-based, whereas the serial-port numbers in MS-DOS
  logical-device names are one-based. In this example, serial port 0
  corresponds to COM1.

  Unfortunately, like the MS-DOS auxiliary-device driver, the ROM BIOS
  serial-port driver is not interrupt driven. Although it will support
  higher transfer speeds than the MS-DOS functions, at rates greater than
  2400 baud it may still lose characters. Consequently, most programmers
  writing high-performance applications that use a serial port (such as
  telecommunications programs) take complete control of the serial-port
  controller and provide their own interrupt driver. The built-in functions
  provided by MS-DOS, and by the ROM BIOS in the case of the IBM PC, are
  simply not adequate.

  Writing such programs requires a good understanding of the hardware. In
  the case of the IBM PC, the chips to study are the INS8250 Asynchronous
  Communications Controller and the Intel 8259A Programmable Interrupt
  Controller. The IBM technical reference documentation for these chips is a
  bit disorganized, but most of the necessary information is there if you
  look for it.
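
  Much of the 8259A programming needed for a serial-port driver reduces to
  simple bit arithmetic. The C sketch below shows how the interrupt number
  and mask bit for a hardware interrupt request line (IRQ) are derived, on
  the assumption that IRQ0 through IRQ7 are mapped to interrupts 08H
  through 0FH as on the IBM PC. The function names are hypothetical; the
  values for IRQ4 and IRQ3 match the COM1 and COM2 equates in the TALK
  listing later in this chapter.

```c
#include <stdint.h>

/* Interrupt number for a hardware IRQ line (0-7), assuming the
   IBM PC mapping of IRQ0-IRQ7 onto interrupts 08H-0FH. */
uint8_t irq_vector(unsigned irq)
{
    return (uint8_t)(0x08u + irq);
}

/* Bit that represents the IRQ line in the 8259A mask register. */
uint8_t irq_mask_bit(unsigned irq)
{
    return (uint8_t)(1u << irq);
}

/* New mask value with the IRQ enabled (its mask bit cleared). */
uint8_t pic_enable(uint8_t mask, unsigned irq)
{
    return (uint8_t)(mask & ~irq_mask_bit(irq));
}

/* New mask value with the IRQ disabled (its mask bit set). */
uint8_t pic_disable(uint8_t mask, unsigned irq)
{
    return (uint8_t)(mask | irq_mask_bit(irq));
}
```

  Note that a cleared bit in the mask register enables the corresponding
  interrupt and a set bit disables it, which is why enabling uses AND with
  the complement of the bit.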


The TALK Program

  The simple terminal-emulator program TALK.ASM (Figure 7-1) is an example
  of a useful program that performs screen, keyboard, and serial-port I/O.
  This program recapitulates all of the topics discussed in Chapters 5
  through 7. TALK uses the IBM PC's ROM BIOS video driver to put characters
  on the screen, to clear the display, and to position the cursor; it uses
  the MS-DOS character-input calls to read the keyboard; and it contains its
  own interrupt driver for the serial-port controller.
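
  The heart of such an interrupt driver is a ring buffer shared between the
  interrupt handler and the main program. The C sketch below is a
  simplified model of that idea, not a translation of TALK: the get side
  follows the pointer comparison and wraparound used by TALK's com_stat and
  com_in routines, while the put side is my assumption of what the
  receiving handler does. All of the names are hypothetical.

```c
#include <stdint.h>

#define RING_SIZE 4096u          /* same size as TALK's bufsiz */

struct ring {
    volatile unsigned in;        /* next slot the handler will fill  */
    volatile unsigned out;       /* next slot the program will read  */
    uint8_t buf[RING_SIZE];
};

/* Interrupt side (assumed): store one received character. */
void ring_put(struct ring *r, uint8_t c)
{
    unsigned next = r->in;

    r->buf[next] = c;
    if (++next == RING_SIZE)     /* wrap at end of buffer */
        next = 0;
    r->in = next;
}

/* Nonzero if at least one character is waiting. */
int ring_ready(const struct ring *r)
{
    return r->in != r->out;
}

/* Program side: return the next character, or -1 if none is
   waiting. (TALK's com_in waits in a loop instead.) */
int ring_get(struct ring *r)
{
    unsigned next = r->out;
    uint8_t  c;

    if (r->in == next)
        return -1;
    c = r->buf[next];
    if (++next == RING_SIZE)     /* wrap at end of buffer */
        next = 0;
    r->out = next;
    return c;
}
```

  Because only the handler changes the input pointer and only the main
  program changes the output pointer, the two sides need no further
  coordination in this simple model. The sketch does not detect overflow: if
  the handler gets a full buffer ahead of the program, unread characters
  are lost.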

  ──────────────────────────────────────────────────────────────────────────
          name      talk
          page      55,132
          .lfcond             ; List false conditionals too
          title     TALK--Simple terminal emulator

  ;
  ; TALK.ASM--Simple IBM PC terminal emulator
  ;
  ; Copyright (c) 1988 Ray Duncan
  ;
  ; To assemble and link this program into TALK.EXE:
  ;
  ;       C>MASM TALK;
  ;       C>LINK TALK;
  ;

  stdin   equ     0               ; standard input handle
  stdout  equ     1               ; standard output handle
  stderr  equ     2               ; standard error handle

  cr      equ     0dh             ; ASCII carriage return
  lf      equ     0ah             ; ASCII linefeed
  bsp     equ     08h             ; ASCII backspace
  escape  equ     1bh             ; ASCII escape code

  dattr   equ     07h             ; display attribute to use
                                  ; while in emulation mode

  bufsiz  equ     4096            ; size of serial-port buffer

  echo    equ     0               ; 0 = full-duplex, -1 = half-duplex
  true    equ     -1
  false   equ     0

  com1    equ     true            ; use COM1 if nonzero
  com2    equ     not com1        ; use COM2 if nonzero

  pic_mask  equ   21h             ; 8259 interrupt mask port
  pic_eoi   equ   20h             ; 8259 EOI port

          if      com1
  com_data equ    03f8h           ; port assignments for COM1
  com_ier  equ    03f9h
  com_mcr  equ    03fch
  com_sts  equ    03fdh
  com_int  equ    0ch             ; COM1 interrupt number
  int_mask equ    10h             ; IRQ4 mask for 8259
          endif

          if      com2
  com_data equ    02f8h           ; port assignments for COM2
  com_ier  equ    02f9h
  com_mcr  equ    02fch
  com_sts  equ    02fdh
  com_int  equ    0bh             ; COM2 interrupt number
  int_mask equ    08h             ; IRQ3 mask for 8259
          endif

  _TEXT   segment word public 'CODE'

          assume  cs:_TEXT,ds:_DATA,es:_DATA,ss:STACK

  talk    proc    far             ; entry point from MS-DOS

          mov     ax,_DATA        ; make data segment addressable
          mov     ds,ax
          mov     es,ax
                                  ; initialize display for
                                  ; terminal emulator mode...

          mov     ah,15           ; get display width and
          int     10h             ; current display mode
          dec     ah              ; save display width for use
          mov     columns,ah      ; by the screen-clear routine

          cmp     al,7            ; enforce text display mode
          je      talk2           ; mode 7 ok, proceed
          cmp     al,3
          jbe     talk2           ; modes 0-3 ok, proceed

          mov     dx,offset msg1
          mov     cx,msg1_len
          jmp     talk6           ; print error message and exit

  talk2:  mov     bh,dattr        ; clear screen and home cursor
          call    cls

          call    asc_enb         ; capture serial-port interrupt
                                  ; vector and enable interrupts

          mov     dx,offset msg2  ; display message
          mov     cx,msg2_len     ; 'terminal emulator running'
          mov     bx,stdout       ; BX = standard output handle
          mov     ah,40h          ; function 40h = write file or device
          int     21h             ; transfer to MS-DOS

  talk3:  call    pc_stat         ; keyboard character waiting?
          jz      talk4           ; nothing waiting, jump

          call    pc_in           ; read keyboard character

          cmp     al,0            ; is it a function key?
          jne     talk32          ; not function key, jump

          call    pc_in           ; function key, discard 2nd
                                  ; character of sequence
          jmp     talk5           ; then terminate program

  talk32:                         ; keyboard character received
          if      echo
          push    ax              ; if half-duplex, echo
          call    pc_out          ; character to PC display
          pop     ax
          endif

          call    com_out         ; write char to serial port

  talk4:  call    com_stat        ; serial-port character waiting?
          jz      talk3           ; nothing waiting, jump

          call    com_in          ; read serial-port character

          cmp     al,20h          ; is it control code?
          jae     talk45          ; jump if not
          call    ctrl_code       ; control code, process it

          jmp     talk3           ; check keyboard again

  talk45:                         ; noncontrol char received,
          call    pc_out          ; write it to PC display

          jmp     talk4           ; see if any more waiting

  talk5:                          ; function key detected,
                                  ; prepare to terminate...

          mov     bh,07h          ; clear screen and home cursor
          call    cls

          mov     dx,offset msg3  ; display farewell message
          mov     cx,msg3_len

  talk6:  push    dx              ; save message address
          push    cx              ; and message length

          call    asc_dsb         ; disable serial-port interrupts
                                  ; and release interrupt vector

          pop     cx              ; restore message length
          pop     dx              ; and address

          mov     bx,stdout       ; handle for standard output
          mov     ah,40h          ; function 40h = write device
          int     21h             ; transfer to MS-DOS

          mov     ax,4c00h        ; terminate program with
          int     21h             ; return code = 0

  talk    endp

  com_stat proc   near            ; check asynch status; returns
                                  ; Z = false if character ready
                                  ; Z = true if nothing waiting
          push    ax
          mov     ax,asc_in       ; compare ring buffer pointers
          cmp     ax,asc_out
          pop     ax
          ret                     ; return to caller
  com_stat endp

  com_in  proc    near            ; get character from serial-
                                  ; port buffer; returns
                                  ; new character in AL

          push    bx              ; save register BX

  com_in1:                        ; if no char waiting, wait
          mov     bx,asc_out      ; until one is received
          cmp     bx,asc_in
          je      com_in1         ; jump, nothing waiting

          mov     al,[bx+asc_buf] ; character is ready,
                                  ; extract it from buffer

          inc     bx              ; update buffer pointer
          cmp     bx,bufsiz
          jne     com_in2
          xor     bx,bx           ; reset pointer if wrapped
  com_in2:
          mov     asc_out,bx      ; store updated pointer
          pop     bx              ; restore register BX
          ret                     ; and return to caller

  com_in  endp

  com_out proc    near            ; write character in AL
                                  ; to serial port

          push    dx              ; save register DX
          push    ax              ; save character to send
          mov     dx,com_sts      ; DX = status port address

  com_out1:                       ; check if transmit buffer
          in      al,dx           ; is empty (TBE bit = set)
          and     al,20h
          jz      com_out1        ; no, must wait

          pop     ax              ; get character to send
          mov     dx,com_data     ; DX = data port address
          out     dx,al           ; transmit the character
          pop     dx              ; restore register DX
          ret                     ; and return to caller

  com_out endp

  pc_stat proc    near            ; read keyboard status; returns
                                  ; Z = false if character ready
                                  ; Z = true if nothing waiting
                                  ; register DX destroyed

          mov     al,in_flag      ; if character already
          or      al,al           ; waiting, return status
          jnz     pc_stat1

          mov     ah,6            ; otherwise call MS-DOS to
          mov     dl,0ffh         ; determine keyboard status
          int     21h

          jz      pc_stat1        ; jump if no key ready

          mov     in_char,al      ; got key, save it for
          mov     in_flag,0ffh    ; "pc_in" routine

  pc_stat1:                       ; return to caller with
          ret                     ; Z flag set appropriately

  pc_stat endp

  pc_in   proc    near            ; read keyboard character,
                                  ; return it in AL
                                  ; DX may be destroyed

          mov     al,in_flag      ; key already waiting?
          or      al,al
          jnz     pc_in1          ; yes, return it to caller

          call    pc_stat         ; try to read a character
          jmp     pc_in

  pc_in1: mov     in_flag,0       ; clear char-waiting flag
          mov     al,in_char      ; and return AL = character
          ret

  pc_in   endp

  pc_out  proc    near            ; write character in AL
                                  ; to the PC's display

          mov     ah,0eh          ; ROM BIOS function 0eh =
                                  ; "teletype output"
          push    bx              ; save register BX
          xor     bx,bx           ; assume page 0
          int     10h             ; transfer to ROM BIOS
          pop     bx              ; restore register BX
          ret                     ; and return to caller

  pc_out  endp


  cls     proc    near            ; clear display using
                                  ; char attribute in BH
                                  ; registers AX, CX,
                                  ; and DX destroyed

          mov     dl,columns      ; set DL,DH = X,Y of
          mov     dh,24           ; lower right corner
          mov     cx,0            ; set CL,CH = X,Y of
                                  ; upper left corner
          mov     ax,600h         ; ROM BIOS function 06h =
                                  ; "scroll or initialize
                                  ; window"
          int     10h             ; transfer to ROM BIOS
          call    home            ; set cursor at (0,0)
          ret                     ; and return to caller

  cls     endp

  clreol  proc    near            ; clear from cursor to end
                                  ; of line using attribute
                                  ; in BH, registers AX, CX,
                                  ; and DX destroyed

          call    getxy           ; get current cursor position
          mov     cx,dx           ; current position = "upper
                                  ; left corner" of window;
          mov     dl,columns      ; "lower right corner" X is
                                  ; max columns, Y is same
                                  ; as upper left corner
          mov     ax,600h         ; ROM BIOS function 06h =
                                  ; "scroll or initialize
                                  ; window"
          int     10h             ; transfer to ROM BIOS
          ret                     ; return to caller

  clreol  endp

  home    proc    near            ; put cursor at home position

          mov     dx,0            ; set (X,Y) = (0,0)
          call    gotoxy          ; position the cursor
          ret                     ; return to caller

  home    endp

  gotoxy  proc    near            ; position the cursor
                                  ; call with DL = X, DH = Y

          push    bx              ; save registers
          push    ax

          mov     bh,0            ; assume page 0
          mov     ah,2            ; ROM BIOS function 02h =
                                  ; set cursor position
          int     10h             ; transfer to ROM BIOS

          pop     ax              ; restore registers
          pop     bx
          ret                     ; and return to caller

  gotoxy  endp


  getxy   proc    near            ; get cursor position,
                                  ; returns DL = X, DH = Y

          push    ax              ; save registers
          push    bx
          push    cx

          mov     ah,3            ; ROM BIOS function 03h =
                                  ; get cursor position
          mov     bh,0            ; assume page 0
          int     10h             ; transfer to ROM BIOS

          pop     cx              ; restore registers
          pop     bx
          pop     ax
          ret                     ; and return to caller

  getxy   endp

  ctrl_code proc  near            ; process control code
                                  ; call with AL = char

          cmp     al,cr           ; if carriage return
          je      ctrl8           ; just send it

          cmp     al,lf           ; if linefeed
          je      ctrl8           ; just send it

          cmp     al,bsp          ; if backspace
          je      ctrl8           ; just send it

          cmp     al,26           ; is it cls control code?
          jne     ctrl7           ; no, jump

          mov     bh,dattr        ; cls control code, clear
          call    cls             ; screen and home cursor

          jmp     ctrl9

  ctrl7:
          cmp     al,escape       ; is it Escape character?
          jne     ctrl9           ; no, throw it away

          call    esc_seq         ; yes, emulate CRT terminal
          jmp     ctrl9

  ctrl8:  call    pc_out          ; send CR, LF, or backspace
                                  ; to the display

  ctrl9:  ret                     ; return to caller

  ctrl_code endp


  esc_seq proc    near            ; decode Televideo 950 escape
                                  ; sequence for screen control

          call    com_in          ; get next character
          cmp     al,84           ; is it clear to end of line?
          jne     esc_seq1        ; no, jump

          mov     bh,dattr        ; yes, clear to end of line
          call    clreol
          jmp     esc_seq2        ; then exit
  esc_seq1:
          cmp     al,61           ; is it cursor positioning?
          jne     esc_seq2        ; no, jump

          call    com_in          ; yes, get Y parameter
          sub     al,33           ; and remove offset
          mov     dh,al

          call    com_in          ; get X parameter
          sub     al,33           ; and remove offset
          mov     dl,al
          call    gotoxy          ; position the cursor

  esc_seq2:                       ; return to caller
          ret

  esc_seq endp


  asc_enb proc    near            ; capture serial-port interrupt
                                  ; vector and enable interrupt

                                  ; save address of previous
                                  ; interrupt handler...
          mov     ax,3500h+com_int ; function 35h = get vector
          int     21h             ; transfer to MS-DOS
          mov     word ptr oldvec+2,es
          mov     word ptr oldvec,bx

                                  ; now install our handler...
          push    ds              ; save our data segment
          mov     ax,cs           ; set DS:DX = address
          mov     ds,ax           ; of our interrupt handler
          mov     dx,offset asc_int
          mov     ax,2500h+com_int ; function 25h = set vector
          int     21h             ; transfer to MS-DOS
          pop     ds              ; restore data segment

          mov     dx,com_mcr      ; set modem-control register
          mov     al,0bh          ; DTR, RTS, and OUT2 bits
          out     dx,al

          mov     dx,com_ier      ; set interrupt-enable
          mov     al,1            ; register on serial-
          out     dx,al           ; port controller

          in      al,pic_mask     ; read current 8259 mask
          and     al,not int_mask ; unmask IRQ for COM port
          out     pic_mask,al     ; write new 8259 mask

          ret                     ; back to caller

  asc_enb endp


  asc_dsb proc    near            ; disable interrupt and
                                  ; release interrupt vector

          in      al,pic_mask     ; read current 8259 mask
          or      al,int_mask     ; mask IRQ for COM port
          out     pic_mask,al     ; write new 8259 mask

          push    ds              ; save our data segment
          lds     dx,oldvec       ; load address of
                                  ; previous interrupt handler
          mov     ax,2500h+com_int ; function 25h = set vector
          int     21h             ; transfer to MS-DOS
          pop     ds              ; restore data segment

          ret                     ; back to caller

  asc_dsb endp


  asc_int proc    far             ; interrupt service routine
                                  ; for serial port

          sti                     ; turn interrupts back on

          push    ax              ; save registers
          push    bx
          push    dx
          push    ds

          mov     ax,_DATA        ; make our data segment
          mov     ds,ax           ; addressable

          cli                     ; clear interrupts for
                                  ; pointer manipulation

          mov     dx,com_data     ; DX = data port address
          in      al,dx           ; read this character
          mov     bx,asc_in       ; get buffer pointer
          mov     [asc_buf+bx],al ; store this character
          inc     bx              ; bump pointer
          cmp     bx,bufsiz       ; time for wrap?
          jne     asc_int1        ; no, jump
          xor     bx,bx           ; yes, reset pointer

  asc_int1:                       ; store updated pointer
          mov     asc_in,bx

          sti                     ; turn interrupts back on

          mov     al,20h          ; send EOI to 8259
          out     pic_eoi,al

          pop     ds              ; restore all registers
          pop     dx
          pop     bx
          pop     ax

          iret                    ; return from interrupt

  asc_int endp

  _TEXT   ends


  _DATA   segment word public 'DATA'

  in_char db      0               ; PC keyboard input char
  in_flag db      0               ; <>0 if char waiting

  columns db      0               ; highest numbered column in
                                  ; current display mode (39 or 79)

  msg1    db      cr,lf
          db      'Display must be text mode.'
          db      cr,lf
  msg1_len equ $-msg1

  msg2    db      'Terminal emulator running...'
          db      cr,lf
  msg2_len equ $-msg2

  msg3    db      'Exit from terminal emulator.'
          db      cr,lf
  msg3_len equ $-msg3

  oldvec  dd      0               ; original contents of serial-
                                  ; port interrupt vector

  asc_in  dw      0               ; input pointer to ring buffer
  asc_out dw      0               ; output pointer to ring buffer

  asc_buf db      bufsiz dup (?)  ; communications buffer

  _DATA   ends


  STACK   segment para stack 'STACK'

          db      128 dup (?)

  STACK   ends

          end     talk            ;  defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 7-1.  TALK.ASM: A simple terminal-emulator program for IBM
  PC-compatible computers. This program demonstrates use of the MS-DOS and
  ROM BIOS video and keyboard functions and direct control of the
  serial-communications adapter.

  The TALK program illustrates the methods that an application should use to
  take over and service interrupts from the serial port without running
  afoul of MS-DOS conventions.

  The program begins with some equates and conditional assembly statements
  that configure the program for half- or full-duplex and for the desired
  serial port (COM1 or COM2). At entry from MS-DOS, the main routine of the
  program──the procedure named talk──checks the status of the serial port,
  initializes the display, and calls the asc_enb routine to take over the
  serial-port interrupt vector and enable interrupts. The talk procedure
  then enters a loop that reads the keyboard and sends the characters out
  the serial port and then reads the serial port and puts the characters on
  the display──in other words, it causes the PC to emulate a simple CRT
  terminal.

  The TALK program intercepts and handles control codes (carriage return,
  linefeed, and so forth) appropriately. It detects escape sequences and
  handles them as a subset of the Televideo 950 terminal capabilities. (You
  can easily modify the program to emulate any other cursor-addressable
  terminal.) When one of the PC's special function keys is pressed, the
  program disables serial-port interrupts, releases the serial-port
  interrupt vector, and exits back to MS-DOS.

  Several TALK program procedures are worth your attention because they can
  easily be incorporated into other programs. These are listed in the
  following table.

  Procedure          Action
  ──────────────────────────────────────────────────────────────────────────
  asc_enb            Takes over the serial-port interrupt vector and enables
                     interrupts by writing to the modem-control and
                     interrupt-enable registers of the INS8250 and the
                     interrupt-mask register of the 8259A.

  asc_dsb            Restores the original state of the serial-port
                     interrupt vector and disables interrupts by writing to
                     the interrupt-mask register of the 8259A.

  asc_int            Services serial-port interrupts, placing received
                     characters into a ring buffer.

  com_stat           Tests whether characters from the serial port are
                     waiting in the ring buffer.

  com_in             Removes characters from the interrupt handler's ring
                     buffer and increments the buffer pointers
                     appropriately.

  com_out            Sends one character to the serial port.

  cls                Calls the ROM BIOS video driver to clear the screen.

  clreol             Calls the ROM BIOS video driver to clear from the
                     current cursor position to the end of the line.

  home               Places the cursor in the upper left corner of the
                     screen.

  gotoxy             Positions the cursor at the desired position on the
                     display.

  getxy              Obtains the current cursor position.

  pc_out             Sends one character to the PC's display.

  pc_stat            Gets status for the PC's keyboard.

  pc_in              Returns a character from the PC's keyboard.
  ──────────────────────────────────────────────────────────────────────────
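
  The ring-buffer technique shared by asc_int, com_stat, and com_in can be
  sketched in portable C. This is only an illustration of the pointer
  arithmetic, not a translation of TALK.ASM: the names ring_put, ring_stat,
  and ring_get and the buffer size of 8 are assumptions made for the example,
  and a real interrupt handler would also need the interrupt masking that
  the listing performs around its pointer updates.

```c
#include <stddef.h>

#define RING_SIZE 8              /* assumed size; TALK.ASM uses bufsiz */

static unsigned char ring_buf[RING_SIZE];
static size_t ring_in  = 0;      /* next slot the "handler" fills      */
static size_t ring_out = 0;      /* next slot the main program reads   */

/* Mirrors asc_int: store a character, bump the pointer, wrap at the end. */
void ring_put(unsigned char ch)
{
    ring_buf[ring_in] = ch;
    if (++ring_in == RING_SIZE)
        ring_in = 0;
}

/* Mirrors the described role of com_stat: nonzero if a character waits. */
int ring_stat(void)
{
    return ring_in != ring_out;
}

/* Mirrors the described role of com_in: the caller is expected to
   check ring_stat() first, because an empty buffer is not detected here. */
unsigned char ring_get(void)
{
    unsigned char ch = ring_buf[ring_out];
    if (++ring_out == RING_SIZE)
        ring_out = 0;
    return ch;
}
```

  Like the listing, this sketch makes no attempt to detect overflow: if the
  producer gets a full buffer ahead of the consumer, the two pointers become
  equal again and the buffer appears to be empty.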





────────────────────────────────────────────────────────────────────────────
Chapter 8  File Management

  The dual heritage of MS-DOS──CP/M and UNIX/XENIX──is perhaps most clearly
  demonstrated in its file-management services. In general, MS-DOS provides
  at least two distinct operating-system calls for each major file or record
  operation. This chapter breaks this overlapping battery of functions into
  two groups and explains the usage, advantages, and disadvantages of each.

  I will refer to the set of file and record functions that are compatible
  with CP/M as FCB functions. These functions rely on a data structure
  called a file control block (hence, FCB) to maintain certain bookkeeping
  information about open files. This structure resides in the application
  program's memory space. The FCB functions allow the programmer to create,
  open, close, and delete files and to read or write records of any size at
  any record position within such files. These functions do not support the
  hierarchical (treelike) file structure that was first introduced in MS-DOS
  version 2.0, so they can be used only to access files in the current
  subdirectory for a given disk drive.

  I will refer to the set of file and record functions that provide
  compatibility with UNIX/XENIX as the handle functions. These functions
  allow the programmer to open or create files by passing MS-DOS a
  null-terminated string that describes the file's location in the
  hierarchical file structure (the drive and path), the file's name, and its
  extension. If the open or create operation is successful, MS-DOS returns a
  16-bit token, or handle, that is saved by the application program and used
  to specify the file in subsequent operations.

  When you use the handle functions, the operating system maintains the data
  structures that contain bookkeeping information about the file inside its
  own memory space, and these structures are not accessible to the
  application program. The handle functions fully support the hierarchical
  file structure, allowing the programmer to create, open, close, and delete
  files in any subdirectory on any disk drive and to read or write records
  of any size at any byte offset within such files.

  Although we are discussing the FCB functions first in this chapter for
  historical reasons, new MS-DOS applications should always be written using
  the more powerful handle functions. Use of the FCB functions in new
  programs should be avoided, unless compatibility with MS-DOS version 1.0
  is needed.


Using the FCB Functions

  Understanding the structure of the file control block is the key to
  success with the FCB family of file and record functions. An FCB is a
  37-byte data structure allocated within the application program's memory
  space; it is divided into many fields (Figure 8-1). Typically, the
  program initializes an FCB with a drive code, a filename, and an extension
  (conveniently accomplished with the parse-filename service, Int 21H
  Function 29H) and then passes the address of the FCB to MS-DOS to open or
  create the file. If the file is successfully opened or created, MS-DOS
  fills in certain fields of the FCB with information from the file's entry
  in the disk directory. This information includes the file's exact size in
  bytes and the date and time the file was created or last updated. MS-DOS
  also places certain other information within a reserved area of the FCB;
  however, this area is used by the operating system for its own purposes
  and varies among different versions of MS-DOS. Application programs should
  never modify the reserved area.
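
  To make the layout of the name fields concrete, here is a minimal C sketch
  that fills the 8-byte filename and 3-byte extension fields the way an FCB
  expects them: uppercase, left justified, and padded with blanks. It is a
  deliberately simplified stand-in and not a reimplementation of Int 21H
  Function 29H; it ignores drive prefixes, wildcards, and separator
  characters, and the function name fcb_set_name is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Fill the name and extension fields of an FCB from a string such as
   "myfile.dat".  Returns 0 on success, or -1 if the name is empty, a
   part is too long, or the string contains more than one period. */
int fcb_set_name(unsigned char name[8], unsigned char ext[3], const char *s)
{
    const char *dot = strchr(s, '.');
    size_t nlen = dot ? (size_t)(dot - s) : strlen(s);
    size_t elen = dot ? strlen(dot + 1) : 0;
    size_t i;

    if (nlen == 0 || nlen > 8 || elen > 3)
        return -1;
    if (dot && strchr(dot + 1, '.'))
        return -1;

    memset(name, ' ', 8);               /* pad both fields with blanks */
    memset(ext, ' ', 3);
    for (i = 0; i < nlen; i++)          /* left justify, force uppercase */
        name[i] = (unsigned char)toupper((unsigned char)s[i]);
    for (i = 0; i < elen; i++)
        ext[i] = (unsigned char)toupper((unsigned char)dot[1 + i]);
    return 0;
}
```

  Given "myfile.dat", the routine produces the bytes 4D 59 46 49 4C 45 20 20
  for the name and 44 41 54 for the extension.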

  For compatibility with CP/M, MS-DOS automatically sets the record-size
  field of the FCB to 128 bytes. If the program does not want to use this
  default record size, it must place the desired size (in bytes) into the
  record-size field after the open or create operation. Subsequently, when
  the program needs to read or write records from the file, it must pass the
  address of the FCB to MS-DOS; MS-DOS, in turn, keeps the FCB updated with
  information about the current position of the file pointer and the size of
  the file. Data is always read into or written from the current disk transfer
  area (DTA), whose address is set with Int 21H Function 1AH. If the
  application program wants to perform random record access, it must set the
  record number into the FCB before issuing each function call; when
  sequential record access is being used, MS-DOS maintains the FCB and no
  special intervention is needed from the application.
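
  The arithmetic behind random record access is simple enough to sketch in
  C. The fragment below assumes the usual FCB convention that one block
  contains 128 records; that constant and the function names are assumptions
  of this example and do not appear in the MS-DOS interface itself.

```c
#include <stdint.h>

#define RECORDS_PER_BLOCK 128u   /* assumption: one FCB block = 128 records */

/* Byte offset within the file of a given relative record. */
uint32_t fcb_byte_offset(uint32_t rel_record, uint16_t rec_size)
{
    return rel_record * (uint32_t)rec_size;
}

/* Split a relative-record number into the current-block and
   current-record values that sequential access works with. */
void fcb_split_record(uint32_t rel_record,
                      uint16_t *cur_block, uint8_t *cur_record)
{
    *cur_block  = (uint16_t)(rel_record / RECORDS_PER_BLOCK);
    *cur_record = (uint8_t)(rel_record % RECORDS_PER_BLOCK);
}
```

  With a record size of 1024 bytes, for example, relative record 3 begins at
  byte offset 3072, and relative record 300 corresponds to block 2,
  record 44.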

  Byte offset
  00H ┌───────────────────────────────────────────────────────┐
      │                 Drive identification                  │ Note 1
  01H ├───────────────────────────────────────────────────────┤
      │                Filename (8 characters)                │ Note 2
  09H ├───────────────────────────────────────────────────────┤
      │               Extension (3 characters)                │ Note 2
  0CH ├───────────────────────────────────────────────────────┤
      │                 Current-block number                  │ Note 9
  0EH ├───────────────────────────────────────────────────────┤
      │                      Record size                      │ Note 10
  10H ├───────────────────────────────────────────────────────┤
      │                  File size (4 bytes)                  │ Notes 3, 6
  14H ├───────────────────────────────────────────────────────┤
      │                 Date created/updated                  │ Note 7
  16H ├───────────────────────────────────────────────────────┤
      │                 Time created/updated                  │ Note 8
  18H ├───────────────────────────────────────────────────────┤
      │                       Reserved                        │
  20H ├───────────────────────────────────────────────────────┤
      │                 Current-record number                 │ Note 9
  21H ├───────────────────────────────────────────────────────┤
      │           Relative-record number (4 bytes)            │ Note 5
      └───────────────────────────────────────────────────────┘

  Figure 8-1.  Normal file control block. Total length is 37 bytes (25H
  bytes). See "Notes for Figures 8-1 and 8-3" later in this chapter.
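
  For C programmers, the layout in Figure 8-1 can be written down as a
  structure. The following is a sketch rather than a declaration taken from
  any compiler's header files: the structure and member names are
  assumptions, and every field is declared as a byte array so that the
  compiler inserts no padding and the code does not depend on the host's
  byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Normal FCB, laid out as in Figure 8-1. */
struct fcb {
    uint8_t drive;          /* 00H: 0 = default, 1 = A:, 2 = B:, ...  */
    uint8_t name[8];        /* 01H: left justified, blank padded      */
    uint8_t ext[3];         /* 09H: left justified, blank padded      */
    uint8_t cur_block[2];   /* 0CH                                    */
    uint8_t rec_size[2];    /* 0EH: set to 128 by open and create     */
    uint8_t file_size[4];   /* 10H                                    */
    uint8_t date[2];        /* 14H                                    */
    uint8_t time[2];        /* 16H                                    */
    uint8_t reserved[8];    /* 18H: never modify                      */
    uint8_t cur_record;     /* 20H                                    */
    uint8_t rel_record[4];  /* 21H                                    */
};

/* Compile-time checks against the offsets shown in the figure. */
_Static_assert(sizeof(struct fcb) == 37, "FCB must be 37 bytes");
_Static_assert(offsetof(struct fcb, rec_size) == 0x0E, "record size at 0EH");
_Static_assert(offsetof(struct fcb, rel_record) == 0x21, "rel. record at 21H");

/* Word fields are stored least significant byte first. */
uint16_t get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

void put_word(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}
```

  Setting a record size of 1024 with put_word(f.rec_size, 1024) stores the
  bytes 00H and 04H at offsets 0EH and 0FH.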

  In general, MS-DOS functions that use FCBs accept the full address of the
  FCB in the DS:DX registers and pass back a return code in the AL register
  (Figure 8-2). For file-management calls (open, close, create, and
  delete), this return code is zero if the function was successful and 0FFH
  (255) if the function failed. For the FCB-type record read and write
  functions, the success code returned in the AL register is again zero, but
  there are several failure codes. Under MS-DOS version 3.0 or later, more
  detailed error reporting can be obtained by calling Int 21H Function 59H
  (Get Extended Error Information) after a failed FCB function call.

  When a program is loaded under MS-DOS, the operating system sets up two
  FCBs in the program segment prefix, at offsets 005CH and 006CH. These are
  often referred to as the default FCBs, and they are included to provide
  upward compatibility from CP/M. MS-DOS parses the first two parameters in
  the command line that invokes the program (excluding any redirection
  directives) into the default FCBs, under the assumption that they may be
  file specifications. The application must determine whether they really
  are filenames or not. In addition, because the default FCBs overlap and
  are not in a particularly convenient location (especially for .EXE
  programs), they usually must be copied elsewhere in order to be used
  safely. (See Chapter 3.)
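
  The overlap is easy to confirm with a little arithmetic. The C sketch
  below is illustrative only; the macro and function names are assumptions
  of this example, and the value 0080H is the PSP offset at which the
  default disk transfer area begins.

```c
#define PSP_FCB1  0x5C    /* first default FCB                  */
#define PSP_FCB2  0x6C    /* second default FCB                 */
#define PSP_DTA   0x80    /* default disk transfer area         */
#define FCB_SIZE  37      /* length of a normal FCB             */

/* Number of bytes by which a full-length FCB starting at 'start'
   runs into the region beginning at 'next'; 0 if it stops short. */
int fcb_overrun(int start, int next)
{
    int end = start + FCB_SIZE;     /* first byte past the FCB */
    return end > next ? end - next : 0;
}
```

  Here fcb_overrun(PSP_FCB1, PSP_FCB2) yields 21, so opening a file with the
  first default FCB in place overwrites most of the second one, and
  fcb_overrun(PSP_FCB2, PSP_DTA) yields 17, so a full FCB at 006CH extends
  into the default disk transfer area.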

  ──────────────────────────────────────────────────────────────────────────
                                               ; filename was previously
                                               ; parsed into "my_fcb"
                  mov   dx,seg my_fcb          ; DS:DX = address of
                  mov   ds,dx                  ; file control block
                  mov   dx,offset my_fcb
                  mov   ah,0fh                 ; function 0fh = open
                  int   21h
                  or    al,al                  ; was open successful?
                  jnz   error                  ; no, jump to error routine
                  .
                  .
                  .
  my_fcb          db    37 dup (0)             ; file control block
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-2.  A typical FCB file operation. This sequence of code attempts
  to open the file whose name was previously parsed into the FCB named
  my_fcb.

  Note that the structures of FCBs under CP/M and MS-DOS are not identical.
  However, the differences lie chiefly in the reserved areas of the FCBs
  (which should not be manipulated by application programs in any case), so
  well-behaved CP/M applications should be relatively easy to port into
  MS-DOS. It seems, however, that few such applications exist. Many of the
  tricks that were played by clever CP/M programmers to increase performance
  or circumvent the limitations of that operating system can cause severe
  problems under MS-DOS, particularly in networking environments. At any
  rate, much better performance can be achieved by thoroughly rewriting the
  CP/M applications to take advantage of the superior capabilities of
  MS-DOS.

  You can use a special FCB variant called an extended file control block to
  create or access files with special attributes (such as hidden or
  read-only files), volume labels, and subdirectories. An extended FCB has a
  7-byte header followed by the 37-byte structure of a normal FCB (Figure
  8-3). The first byte contains 0FFH, which could never be a legal drive
  code and thus indicates to MS-DOS that an extended FCB is being used. The
  next 5 bytes are reserved and are unused in current versions of MS-DOS.
  The seventh byte contains the attribute of the special file type that is
  being accessed. (Attribute bytes are discussed in more detail in Chapter
  9.) Any MS-DOS function that uses a normal FCB can also use an extended
  FCB.

  The FCB file- and record-management functions may be gathered into the
  following broad classifications:

  Byte
  offset
  00H ┌───────────────────────────────────────────────────────┐
      │                         0FFH                          │ Note 11
  01H ├───────────────────────────────────────────────────────┤
      │           Reserved (5 bytes, must be zero)            │
  06H ├───────────────────────────────────────────────────────┤
      │                    Attribute byte                     │ Note 12
  07H ├───────────────────────────────────────────────────────┤
      │                 Drive identification                  │ Note 1
  08H ├───────────────────────────────────────────────────────┤
      │                Filename (8 characters)                │ Note 2
  10H ├───────────────────────────────────────────────────────┤
      │               Extension (3 characters)                │ Note 2
  13H ├───────────────────────────────────────────────────────┤
      │                 Current-block number                  │ Note 9
  15H ├───────────────────────────────────────────────────────┤
      │                      Record size                      │ Note 10
  17H ├───────────────────────────────────────────────────────┤
      │                  File size (4 bytes)                  │ Notes 3, 6
  1BH ├───────────────────────────────────────────────────────┤
      │                 Date created/updated                  │ Note 7
  1DH ├───────────────────────────────────────────────────────┤
      │                 Time created/updated                  │ Note 8
  1FH ├───────────────────────────────────────────────────────┤
      │                       Reserved                        │
  27H ├───────────────────────────────────────────────────────┤
      │                 Current-record number                 │ Note 9
  28H ├───────────────────────────────────────────────────────┤
      │           Relative-record number (4 bytes)            │ Note 5
      └───────────────────────────────────────────────────────┘

  Figure 8-3.  Extended file control block. Total length is 44 bytes (2CH
  bytes). See "Notes for Figures 8-1 and 8-3" later in this chapter.
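
  The relationship between the two FCB forms can be sketched in C as a
  7-byte header placed in front of a normal FCB. The names used here are
  hypothetical, and the normal FCB is treated as an opaque 37-byte array to
  keep the example short.

```c
#include <stdint.h>
#include <string.h>

#define FCB_SIZE 37

/* Extended FCB, laid out as in Figure 8-3. */
struct ext_fcb {
    uint8_t flag;            /* 00H: always 0FFH                  */
    uint8_t reserved[5];     /* 01H: must be zero                 */
    uint8_t attribute;       /* 06H: attribute of file wanted     */
    uint8_t fcb[FCB_SIZE];   /* 07H: normal FCB begins here       */
};

_Static_assert(sizeof(struct ext_fcb) == 44, "extended FCB is 44 bytes");

/* Zero the whole structure, then mark it as extended. */
void ext_fcb_init(struct ext_fcb *x, uint8_t attribute)
{
    memset(x, 0, sizeof *x);
    x->flag = 0xFF;
    x->attribute = attribute;
}

/* 0FFH can never be a legal drive code, so the first byte alone
   distinguishes an extended FCB from a normal one. */
int is_extended_fcb(const uint8_t *p)
{
    return p[0] == 0xFF;
}
```

  A program that wanted to find a hidden file might call
  ext_fcb_init(&x, 0x02), because 02H is the hidden bit of the attribute
  byte, and then fill in the drive, filename, and extension inside x.fcb
  exactly as it would for a normal FCB.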

  Function                 Action
  ──────────────────────────────────────────────────────────────────────────
  Common FCB file operations
  0FH                     Open file.
  10H                     Close file.
  16H                     Create file.

  Common FCB record operations
  14H                     Perform sequential read.
  15H                     Perform sequential write.
  21H                     Perform random read.
  22H                     Perform random write.
  27H                     Perform random block read.
  28H                     Perform random block write.

  Other vital FCB operations
  1AH                     Set disk transfer address.
  29H                     Parse filename.

  Less commonly used FCB file operations
  13H                     Delete file.
  17H                     Rename file.

  Less commonly used FCB record operations
  23H                     Obtain file size.
  24H                     Set relative-record number.
  ──────────────────────────────────────────────────────────────────────────


  Several of these functions have special properties. For example, Int 21H
  Functions 27H (Random Block Read) and 28H (Random Block Write) allow
  reading and writing of multiple records of any size and also update the
  relative-record field automatically (unlike Int 21H Functions 21H and
  22H). Int 21H Function 28H can truncate a file to any desired size, and
  Int 21H Function 17H used with an extended FCB can alter a volume label
  or rename a subdirectory.

  Section 2 of this book, "MS-DOS Functions Reference," gives detailed
  specifications for each of the FCB file and record functions, along with
  assembly-language examples. It is also instructive to compare the
  preceding groups with the corresponding groups of handle-type functions
  listed on pages 140─41.

  ──────────────────────────────────────────────────────────────────────────
  Notes for Figures 8-1 and 8-3
    1.  The drive identification is a binary number: 00=default drive,
        01=drive A:, 02=drive B:, and so on. If the application program
        supplies the drive code as zero (default drive), MS-DOS fills in the
        code for the actual current disk drive after a successful open or
        create call.

    2.  File and extension names must be left justified and padded with
        blanks.

    3.  The file size, date, time, and reserved fields should not be
        modified by applications.

    4.  All word fields are stored with the least significant byte at the
        lower address.

    5.  The relative-record field is treated as 4 bytes if the record size
        is less than 64 bytes; otherwise, only the first 3 bytes of this
        field are used.

    6.  The file-size field is in the same format as in the directory, with
        the less significant word at the lower address.

    7.  The date field is mapped as in the directory. Viewed as a 16-bit
        word (as it would appear in a register), the field is broken down as
        follows:

      F  E  D  C  B  A  9   8     7     6     5    4   3   2   1   0
    ┌─────────────────────┬─────────────────────┬─────────────────────┐
    │        Year         │        Month        │         Day         │
    └─────────────────────┴─────────────────────┴─────────────────────┘

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           Day (1─31)
    05H─08H           Month (1─12)
    09H─0FH           Year, relative to 1980
    ────────────────────────────────────────────────────────────────────────

    8.  The time field is mapped as in the directory. Viewed as a 16-bit
        word (as it would appear in a register), the field is broken down as
        follows:

      F   E   D   C   B   A   9   8   7   6   5   4   3   2   1   0
    ┌───────────────────┬───────────────────────┬─────────────────────┐
    │     Hours         │        Minutes        │ 2-second increments │
    └───────────────────┴───────────────────────┴─────────────────────┘

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           2-second increments (0─29)
    05H─0AH           Minutes (0─59)
    0BH─0FH           Hours (0─23)
    ────────────────────────────────────────────────────────────────────────

    9.  The current-block and current-record numbers are used together on
        sequential reads and writes. This simulates the behavior of CP/M.

    10. The Int 21H open (0FH) and create (16H) functions set the
        record-size field to 128 bytes, to provide compatibility with CP/M.
        If you use another record size, you must fill it in after the open
        or create operation.

    11. An 0FFH (255) in the first byte of the structure signifies that it
        is an extended file control block. You can use extended FCBs with
        any of the functions that accept an ordinary FCB. (See also note
        12.)

    12. The attribute byte in an extended FCB allows access to files with
        the special characteristics hidden, system, or read-only. You can
        also use extended FCBs to read volume labels and the contents of
        special subdirectory files.

  ──────────────────────────────────────────────────────────────────────────
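
  The bit fields described in notes 7 and 8 can be unpacked with a few
  shifts and masks. The following C sketch shows one way to do it; the
  structure and function names are assumptions of this example.

```c
#include <stdint.h>

struct dos_date { unsigned year, month, day; };
struct dos_time { unsigned hour, minute, second; };

/* Note 7: bits 0-4 = day, bits 5-8 = month, bits 9-15 = year - 1980. */
struct dos_date decode_dos_date(uint16_t w)
{
    struct dos_date d;
    d.day   =   w        & 0x1F;
    d.month =  (w >> 5)  & 0x0F;
    d.year  = ((w >> 9)  & 0x7F) + 1980;
    return d;
}

/* Note 8: bits 0-4 = 2-second increments, bits 5-10 = minutes,
   bits 11-15 = hours. */
struct dos_time decode_dos_time(uint16_t w)
{
    struct dos_time t;
    t.second = (w & 0x1F) * 2;
    t.minute = (w >> 5)  & 0x3F;
    t.hour   = (w >> 11) & 0x1F;
    return t;
}
```

  For example, the date word 106FH decodes to March 15, 1988, and the time
  word 73DDH decodes to 14:30:58. Because seconds are stored in 2-second
  increments, odd values of the seconds cannot be represented.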

FCB File-Access Skeleton

  The following is a typical program sequence to access a file using the
  FCB, or traditional, functions (Figure 8-4):

  1.  Zero out the prospective FCB.

  2.  Obtain the filename from the user, from the default FCBs, or from the
      command tail in the PSP.

  3.  If the filename was not obtained from one of the default FCBs, parse
      the filename into the new FCB using Int 21H Function 29H.

  4.  Open the file (Int 21H Function 0FH) or, if writing new data only,
      create the file or truncate any existing file of the same name to zero
      length (Int 21H Function 16H).

  5.  Set the record-size field in the FCB, unless you are using the default
      record size. Recall that it is important to do this after a successful
      open or create operation. (See Figure 8-5.)

  6.  Set the relative-record field in the FCB if you are performing random
      record I/O.

  7.  Set the disk transfer area address using Int 21H Function 1AH, unless
      the buffer address has not been changed since the last call to this
      function. If the application never performs a set DTA, the DTA address
      defaults to offset 0080H in the PSP.

  8.  Request the needed read- or write-record operation (Int 21H Function
      14H─Sequential Read, 15H─Sequential Write, 21H─Random Read,
      22H─Random Write, 27H─Random Block Read, 28H─Random Block Write).

  9.  If the program is not finished processing the file, go to step 6;
      otherwise, close the file (Int 21H Function 10H). If the file was
      used for reading only, you can skip the close operation under early
      versions of MS-DOS. However, this shortcut can cause problems under
      MS-DOS versions 3.0 and later, especially when the files are being
      accessed across a network.

  ──────────────────────────────────────────────────────────────────────────
  recsize      equ   1024                   ; file record size
               .
               .
               .
               mov   ah,29h                 ; parse input filename
               mov   al,1                   ; skip leading blanks
               mov   si,offset fname1       ; address of filename
               mov   di,offset fcb1         ; address of FCB
               int   21h
               or    al,al                  ; jump if name
               jnz   name_err               ; was bad
               .
               .
               .
               mov   ah,29h                 ; parse output filename
               mov   al,1                   ; skip leading blanks
               mov   si,offset fname2       ; address of filename
               mov   di,offset fcb2         ; address of FCB
               int   21h
               or    al,al                  ; jump if name
               jnz   name_err               ; was bad
               .
               .
               .
               mov   ah,0fh                 ; open input file
               mov   dx,offset fcb1
               int   21h
               or    al,al                  ; open successful?
               jnz   no_file                ; no, jump
               .
               .
               .
               mov   ah,16h                 ; create and open
               mov   dx,offset fcb2         ; output file
               int   21h
               or    al,al                  ; create successful?
               jnz   disk_full              ; no, jump
               .
               .
               .                            ; set record sizes
               mov   word ptr fcb1+0eh,recsize
               mov   word ptr fcb2+0eh,recsize
               .
               .
               .
               mov   ah,1ah                 ; set disk transfer
               mov   dx,offset buffer       ; address for reads
               int   21h                    ; and writes
               .
  next:        .                            ; process next record
               .
               mov   ah,14h                 ; sequential read from
               mov   dx,offset fcb1         ; input file
               int   21h
               cmp   al,01                  ; check for end of file
               je    file_end               ; jump if end of file
               cmp   al,03                  ; partial record at end
               je    file_end               ; of file? treat as end
               or    al,al                  ; other read fault?
               jnz   bad_read               ; jump if bad read
               .
               .
               .
               mov   ah,15h                 ; sequential write to
               mov   dx,offset fcb2         ; output file
               int   21h
               or    al,al                  ; write successful?
               jnz   bad_write              ; jump if write failed
               .
               .
               .
               jmp   next                   ; process next record
               .
  file_end:    .                            ; reached end of input
               .
               mov   ah,10h                 ; close input file
               mov   dx,offset fcb1
               int   21h
               .
               .
               .
               mov   ah,10h                 ; close output file
               mov   dx,offset fcb2
               int   21h
               .
               .
               .
               mov   ax,4c00h               ; exit with return
               int   21h                    ; code of zero
               .
               .
               .
  fname1       db    'OLDFILE.DAT',0        ; name of input file
  fname2       db    'NEWFILE.DAT',0        ; name of output file
  fcb1         db    37 dup (0)             ; FCB for input file
  fcb2         db    37 dup (0)             ; FCB for output file
  buffer       db    recsize dup (?)        ; buffer for file I/O
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-4.  Skeleton of an assembly-language program that performs file
  and record I/O using the FCB family of functions.

  Byte Offset  FCB before open       FCB contents       FCB after open
           ┌────────────────────┬────────────────────┬────────────────────┐
       00H │         00         │       Drive        │         03         │
           ├────────────────────┼────────────────────┼────────────────────┤
       01H │         4D         │                    │         4D         │
       02H │         59         │                    │         59         │
       03H │         46         │                    │         46         │
       04H │         49         │      Filename      │         49         │
       05H │         4C         │                    │         4C         │
       06H │         45         │                    │         45         │
       07H │         20         │                    │         20         │
       08H │         20         │                    │         20         │
           ├────────────────────┼────────────────────┼────────────────────┤
       09H │         44         │                    │         44         │
       0AH │         41         │     Extension      │         41         │
       0BH │         54         │                    │         54         │
           ├────────────────────┼────────────────────┼────────────────────┤
       0CH │         00         │                    │         00         │
       0DH │         00         │   Current block    │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       0EH │         00         │                    │         80         │
       0FH │         00         │    Record size     │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       10H │         00         │                    │         80         │
       11H │         00         │                    │         3D         │
       12H │         00         │     File size      │         00         │
       13H │         00         │                    │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       14H │         00         │                    │         43         │
       15H │         00         │     File date      │         0B         │
           ├────────────────────┼────────────────────┼────────────────────┤
       16H │         00         │                    │         A1         │
       17H │         00         │     File time      │         52         │
           ├────────────────────┼────────────────────┼────────────────────┤
       18H │         00         │                    │         03         │
       19H │         00         │                    │         02         │
       1AH │         00         │                    │         42         │
       1BH │         00         │                    │         73         │
       1CH │         00         │      Reserved      │         00         │
       1DH │         00         │                    │         01         │
       1EH │         00         │                    │         35         │
       1FH │         00         │                    │         0F         │
           ├────────────────────┼────────────────────┼────────────────────┤
       20H │         00         │   Current record   │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       21H │         00         │                    │         00         │
       22H │         00         │  Relative-record   │         00         │
       23H │         00         │       number       │         00         │
       24H │         00         │                    │         00         │
           └────────────────────┴────────────────────┴────────────────────┘

  Figure 8-5.  A typical file control block before and after a successful
  open call (Int 21H Function 0FH).
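
  As a rough illustration of how a program might inspect the fields that
  MS-DOS fills in, the following C sketch decodes the record size, file
  size, date, and time from the "FCB after open" column of Figure 8-5. The
  names fcb_word, fcb_dword, fcb_stamp, and fcb_timestamp are invented for
  this example. The sketch assumes the packed directory date and time
  formats used by MS-DOS: years since 1980 in bits 9─15, month in bits 5─8,
  and day in bits 0─4 of the date; hours in bits 11─15, minutes in bits
  5─10, and two-second units in bits 0─4 of the time.

```c
#include <stdint.h>

/* "FCB after open" column of Figure 8-5, offsets 00H through 24H. */
static const uint8_t fig_8_5_fcb[37] = {
    0x03,                                           /* 00H drive           */
    0x4D, 0x59, 0x46, 0x49, 0x4C, 0x45, 0x20, 0x20, /* 01H filename        */
    0x44, 0x41, 0x54,                               /* 09H extension       */
    0x00, 0x00,                                     /* 0CH current block   */
    0x80, 0x00,                                     /* 0EH record size     */
    0x80, 0x3D, 0x00, 0x00,                         /* 10H file size       */
    0x43, 0x0B,                                     /* 14H file date       */
    0xA1, 0x52,                                     /* 16H file time       */
    0x03, 0x02, 0x42, 0x73, 0x00, 0x01, 0x35, 0x0F, /* 18H reserved        */
    0x00,                                           /* 20H current record  */
    0x00, 0x00, 0x00, 0x00                          /* 21H relative record */
};

/* Read a little-endian 16-bit field at byte offset off. */
static uint16_t fcb_word(const uint8_t *fcb, int off)
{
    return (uint16_t)(fcb[off] | (fcb[off + 1] << 8));
}

/* Read a little-endian 32-bit field at byte offset off. */
static uint32_t fcb_dword(const uint8_t *fcb, int off)
{
    return (uint32_t)fcb_word(fcb, off) |
           ((uint32_t)fcb_word(fcb, off + 2) << 16);
}

typedef struct {
    int year, month, day, hour, minute, second;
} fcb_stamp;

/* Unpack the date (offset 14H) and time (offset 16H) fields.
   The bit layout is an assumption stated in the text above. */
static fcb_stamp fcb_timestamp(const uint8_t *fcb)
{
    uint16_t d = fcb_word(fcb, 0x14);
    uint16_t t = fcb_word(fcb, 0x16);
    fcb_stamp s;

    s.year   = 1980 + (d >> 9);
    s.month  = (d >> 5) & 0x0F;
    s.day    = d & 0x1F;
    s.hour   = t >> 11;
    s.minute = (t >> 5) & 0x3F;
    s.second = (t & 0x1F) * 2;
    return s;
}
```

  For the bytes in the figure, fcb_word(fig_8_5_fcb, 0x0E) yields a record
  size of 128, fcb_dword(fig_8_5_fcb, 0x10) yields a file size of 15,744
  bytes, and fcb_timestamp reports October 3, 1985, at 10:21:02.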

Points to Remember

  Here is a summary of the pros and cons of using the FCB-related file and
  record functions in your programs.

  Advantages:

  ■  Under MS-DOS versions 1 and 2, the number of files that can be open
     concurrently when using FCBs is unlimited. (This is not true under
     MS-DOS versions 3.0 and later, especially if networking software is
     running.)

  ■  File-access methods using FCBs are familiar to programmers with a CP/M
     background, and well-behaved CP/M applications require little change in
     logical flow to run under MS-DOS.

  ■  MS-DOS supplies the size, time, and date for a file to its FCB after
     the file is opened. The calling program can inspect this information.

  Disadvantages:

  ■  FCBs take up room in the application program's memory space.

  ■  FCBs offer no support for the hierarchical file structure (no access to
     files outside the current directory).

  ■  FCBs provide no support for file locking/sharing or record locking in
     networking environments.

  ■  In addition to the read or write call itself, file reads or writes
     using FCBs require manipulation of the FCB to set record size and
     record number, plus a previous call to a separate MS-DOS function to
     set the DTA address.

  ■  Random record I/O using FCBs for a file containing variable-length
     records is very clumsy and inconvenient.

  ■  You must use extended FCBs, which are incompatible with CP/M anyway, to
     access or create files with special attributes such as hidden,
     read-only, or system.

  ■  The FCB file functions have poor error reporting. This situation has
     been improved somewhat in MS-DOS version 3 because a program can call
     the added Int 21H Function 59H (Get Extended Error Information) after
     a failed FCB function to obtain additional information.

  ■  Microsoft discourages use of FCBs. FCBs will make your program more
     difficult to port to MS OS/2 later because MS OS/2 does not support
     FCBs in protected mode at all.


Using the Handle Functions

  The handle file- and record-management functions access files in a fashion
  similar to that used under the UNIX/XENIX operating system. Files are
  designated by an ASCIIZ string (an ASCII character string terminated by a
  null, or zero, byte) that can contain a drive designator, path, filename,
  and extension. For example, the file specification

  C:\SYSTEM\COMMAND.COM

  would appear in memory as the following sequence of bytes:

  43 3A 5C 53 59 53 54 45 4D 5C 43 4F 4D 4D 41 4E 44 2E 43 4F 4D 00
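
  C programmers get ASCIIZ strings without extra effort, because a C string
  literal is stored with a terminating null byte. The following sketch
  formats a string together with its terminator so that the result can be
  compared with the byte sequence shown above. The function name asciiz_hex
  and the MAX_SPEC limit are invented for this illustration, and the sketch
  assumes an ASCII execution character set.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SPEC 128    /* arbitrary length limit chosen for this sketch */

/* Return the bytes of s, including the terminating null, as two-digit
   hexadecimal values separated by spaces. The result lives in a static
   buffer that is overwritten by the next call. Returns NULL if s is
   longer than the sketch allows. */
static const char *asciiz_hex(const char *s)
{
    static char out[MAX_SPEC * 3 + 1];
    size_t n = strlen(s) + 1;           /* +1 counts the null terminator */
    size_t i;

    if (n > MAX_SPEC)
        return NULL;
    for (i = 0; i < n; i++)
        sprintf(out + i * 3, "%02X ", (unsigned)(unsigned char)s[i]);
    out[n * 3 - 1] = '\0';              /* drop the trailing space */
    return out;
}
```

  Note that the backslashes in the path must be doubled in C source code:
  the call asciiz_hex("C:\\SYSTEM\\COMMAND.COM") reproduces the 22 bytes
  listed above, ending in 00.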

  When a program wishes to open or create a file, it passes the address of
  the ASCIIZ string specifying the file to MS-DOS in the DS:DX registers
  (Figure 8-6). If the operation is successful, MS-DOS returns a 16-bit
  handle to the program in the AX register. The program must save this
  handle for further reference.

  ──────────────────────────────────────────────────────────────────────────
               mov   ah,3dh                  ; function 3dh = open
               mov   al,2                    ; mode 2 = read/write
               mov   dx,seg filename         ; address of ASCIIZ
               mov   ds,dx                   ; file specification
               mov   dx,offset filename
               int   21h                     ; request open from DOS
               jc    error                   ; jump if open failed
               mov   handle,ax               ; save file handle
               .
               .
               .
  filename     db    'C:\MYDIR\MYFILE.DAT',0 ; filename
  handle       dw    0                       ; file handle
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-6.  A typical handle file operation. This sequence of code
  attempts to open the file designated in the ASCIIZ string whose address is
  passed to MS-DOS in the DS:DX registers.

  When the program requests subsequent operations on the file, it usually
  places the handle in the BX register before the call to MS-DOS. All the
  handle functions return with the CPU's carry flag cleared if the operation
  was successful, or set if the operation failed; in the latter case, the AX
  register contains a code describing the failure.

  MS-DOS restricts the number of handles that can be active at any one
  time──that is, the number of files and devices that can be open
  concurrently when using the handle family of functions──in two different
  ways:

  ■  The maximum number of concurrently open files in the system, for all
     active processes combined, is specified by the entry

     FILES=nn

     in the CONFIG.SYS file. This entry determines the number of entries
     to be allocated in the system file table; under MS-DOS version 3, the
     default value is 8 and the maximum is 255. After MS-DOS is booted and
     running, you cannot expand this table to increase the total number of
     files that can be open. You must use an editor to modify the CONFIG.SYS
     file and then restart the system.

  ■  The maximum number of concurrently open files for a single process is
     20, assuming that sufficient entries are also available in the system
     file table. When a program is loaded, MS-DOS preassigns 5 of its
     potential 20 handles to the standard devices. Each time the process
     issues an open or create call, MS-DOS assigns a handle from the
     process's private allocation of 20, until all the handles are used up
     or the system file table is full. In MS-DOS versions 3.3 and later, you
     can expand the per-process limit of 20 handles with a call to Int 21H
     Function 67H (Set Handle Count).

  The handle file- and record-management calls may be gathered into the
  following broad classifications for study:

  Function                 Action
  ──────────────────────────────────────────────────────────────────────────
  Common handle file operations
  3CH                     Create file (requires ASCIIZ string).
  3DH                     Open file (requires ASCIIZ string).
  3EH                     Close file.

  Common handle record operations
  42H                     Set file pointer (also used to find file size).
  3FH                     Read file.
  40H                     Write file.

  Less commonly used handle operations
  41H                     Delete file.
  43H                     Get or modify file attributes.
  44H                     IOCTL (I/O Control).
  45H                     Duplicate handle.
  46H                     Redirect handle.
  56H                     Rename file.
  57H                     Get or set file date and time.
  5AH                     Create temporary file (versions 3.0 and later).
  5BH                     Create file (fails if file already exists;
                           versions 3.0 and later).
  5CH                     Lock or unlock file region (versions 3.0 and
                           later).
  67H                     Set handle count (versions 3.3 and later).
  68H                     Commit file (versions 3.3 and later).
  6CH                     Extended open file (version 4).
  ──────────────────────────────────────────────────────────────────────────


  Compare the groups of handle-type functions in the preceding table with
  the groups of FCB functions outlined earlier, noting the degree of
  functional overlap. Section 2 of this book, "MS-DOS Functions Reference,"
  gives detailed specifications for each of the handle functions, along with
  assembly-language examples.

Handle File-Access Skeleton

  The following is a typical program sequence to access a file using the
  handle family of functions (Figure 8-7):

  1.  Get the filename from the user by means of the buffered input service
      (Int 21H Function 0AH) or from the command tail supplied by MS-DOS in
      the PSP.

  2.  Put a zero at the end of the file specification in order to create an
      ASCIIZ string.

  3.  Open the file using Int 21H Function 3DH and mode 2 (read/write
      access), or create the file using Int 21H Function 3CH. (Be sure to
      set the CX register to zero, so that you don't accidentally make a
      file with special attributes.) Save the handle that is returned.

  4.  Set the file pointer using Int 21H Function 42H. You may set the
      file-pointer position relative to one of three different locations:
      the start of the file, the current pointer position, or the end of the
      file. If you are performing sequential record I/O, you can usually
      skip this step because MS-DOS will maintain the file pointer for you
      automatically.

  5.  Read from the file (Int 21H Function 3FH) or write to the file (Int
      21H Function 40H). Both of these functions require that the BX
      register contain the file's handle, the CX register contain the length
      of the record, and the DS:DX registers point to the data being
      transferred. Both return the actual number of bytes transferred in the
      AX register.

      In a read operation, if the number of bytes read is less than the
      number requested, the end of the file has been reached. In a write
      operation, if the number of bytes written is less than the number
      requested, the disk containing the file is full. Neither of these
      conditions is returned as an error code; that is, the carry flag is
      not set.

  6.  If the program is not finished processing the file, go to step 4;
      otherwise, close the file (Int 21H Function 3EH). Any normal exit
      from the program will also close all active handles.
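
  Steps 1 and 2 above can be sketched in C as follows. The sketch assumes
  the usual command-tail layout in the PSP: a length byte (at offset 80H)
  followed by the characters of the tail and a carriage return. The
  function name tail_to_asciiz and the sample data are hypothetical; a real
  program would pass the address of its own PSP command tail.

```c
#include <stddef.h>

/* Hypothetical command tail: length byte, " MYFILE.DAT", carriage return. */
static const unsigned char sample_tail[] = {
    11, ' ', 'M', 'Y', 'F', 'I', 'L', 'E', '.', 'D', 'A', 'T', '\r'
};

/* Copy the first blank-delimited word of a command tail into out and
   terminate it with a zero byte, producing an ASCIIZ string.
   tail points at the length byte. Returns the length of the name,
   or -1 if no name was found or out is too small. */
static int tail_to_asciiz(const unsigned char *tail, char *out, size_t outsize)
{
    const unsigned char *p = tail + 1;
    const unsigned char *end = p + tail[0];
    size_t n = 0;

    if (outsize == 0)
        return -1;
    while (p < end && (*p == ' ' || *p == '\t'))    /* skip leading blanks */
        p++;
    while (p < end && *p != ' ' && *p != '\t' && *p != '\r') {
        if (n + 1 >= outsize)                       /* keep room for the 0 */
            return -1;
        out[n++] = (char)*p++;
    }
    out[n] = '\0';                                  /* step 2: add the zero */
    return n ? (int)n : -1;
}
```

  The resulting buffer can then be handed to the open or create function
  in step 3 as the file specification.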

  ──────────────────────────────────────────────────────────────────────────
  recsize      equ     1024                 ; file record size
               .
               .
               .
               mov   ah,3dh                 ; open input file
               mov   al,0                   ; mode = read only
               mov   dx,offset fname1       ; name of input file
               int   21h
               jc    no_file                ; jump if no file
               mov   handle1,ax             ; save token for file
               .
               .
               .
               mov   ah,3ch                 ; create output file
               mov   cx,0                   ; attribute = normal
               mov   dx,offset fname2       ; name of output file
               int   21h
               jc    disk_full              ; jump if create fails
               mov   handle2,ax             ; save token for file
               .
  next:        .                            ; process next record
               .
               mov   ah,3fh                 ; sequential read from
               mov   bx,handle1             ; input file
               mov   cx,recsize
               mov   dx,offset buffer
               int   21h
               jc    bad_read               ; jump if read error
               or    ax,ax                  ; check bytes transferred
               jz    file_end               ; jump if end of file
               .
               .
               .
               mov   ah,40h                 ; sequential write to
               mov   bx,handle2             ; output file
               mov   cx,recsize
               mov   dx,offset buffer
               int   21h
               jc    bad_write              ; jump if write error
               cmp   ax,recsize             ; whole record written?
               jne   disk_full              ; jump if disk is full
               .
               .
               .
               jmp   next                   ; process next record
               .
  file_end:    .                            ; reached end of input
               .
               mov   ah,3eh                 ; close input file
               mov   bx,handle1
               int   21h
               .
               .
               .
               mov   ah,3eh                 ; close output file
               mov   bx,handle2
               int   21h
               .
               .
               .
               mov   ax,4c00h               ; exit with return
               int   21h                    ; code of zero
               .
               .
               .
  fname1       db    'OLDFILE.DAT',0        ; name of input file
  fname2       db    'NEWFILE.DAT',0        ; name of output file
  handle1      dw    0                      ; token for input file
  handle2      dw    0                      ; token for output file
  buffer       db    recsize dup (?)        ; buffer for file I/O
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-7.  Skeleton of an assembly-language program that performs
  sequential processing on an input file and writes the results to an output
  file using the handle file and record functions. This code assumes that
  the DS and ES registers have already been set to point to the segment
  containing the buffers and filenames.
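
  The checks that follow each read and write call can also be summarized in
  C. The sketch below is only a model of the rules described in step 5:
  carry and ax are ordinary parameters standing in for the carry flag and
  the AX register after Int 21H Function 3FH or 40H, and the type and
  function names are invented for this example.

```c
typedef enum {
    XFER_OK,            /* full record transferred                 */
    XFER_END_OF_FILE,   /* read returned fewer bytes than asked    */
    XFER_DISK_FULL,     /* write stored fewer bytes than asked     */
    XFER_ERROR          /* carry flag set; AX holds an error code  */
} xfer_status;

/* Interpret the result of a handle read (Function 3FH). */
static xfer_status check_read(int carry, unsigned ax, unsigned requested)
{
    if (carry)
        return XFER_ERROR;
    if (ax < requested)
        return XFER_END_OF_FILE;    /* not reported through the carry flag */
    return XFER_OK;
}

/* Interpret the result of a handle write (Function 40H). */
static xfer_status check_write(int carry, unsigned ax, unsigned requested)
{
    if (carry)
        return XFER_ERROR;
    if (ax < requested)
        return XFER_DISK_FULL;      /* not reported through the carry flag */
    return XFER_OK;
}
```

  Figure 8-7 tests only for a read of zero bytes. A short but nonzero read
  still delivers ax bytes of data, which a program may need to process
  before it treats the file as finished.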

Points to Remember

  Here is a summary of the pros and cons of using the handle file and record
  operations in your program. Compare this list with the one given earlier
  in the chapter for the FCB family of functions.

  Advantages:

  ■  The handle calls provide direct support for I/O redirection and pipes
     with the standard input and output devices in a manner functionally
     similar to that used by UNIX/XENIX.

  ■  The handle functions provide direct support for directories (the
     hierarchical file structure) and special file attributes.

  ■  The handle calls support file sharing/locking and record locking in
     networking environments.

  ■  Using the handle functions, the programmer can open channels to
     character devices and treat them as files.

  ■  The handle calls make the use of random record access extremely easy.
     The current file pointer can be moved to any byte offset relative to
     the start of the file, the end of the file, or the current pointer
     position. Records of any length, up to an entire segment (65,535
     bytes), can be read to any memory address in one operation.

  ■  The handle functions have relatively good error reporting in MS-DOS
     version 2, and error reporting has been enhanced even further in MS-DOS
     versions 3.0 and later.

  ■  Microsoft strongly encourages use of the handle family of functions in
     order to provide upward compatibility with MS OS/2.

  Disadvantages:

  ■  There is a limit per program of 20 concurrently open files and devices
     using handles in MS-DOS versions 2.0 through 3.2.

  ■  Minor gaps still exist in the implementation of the handle functions.
     For example, you must still use extended FCBs to change volume labels
     and to access the contents of the special files that implement
     directories.
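
  To make the point about random record access concrete, the following C
  sketch computes the byte offset of a fixed-length record and splits it
  into the two 16-bit halves that a program would load before calling Int
  21H Function 42H. It assumes the usual register convention for that
  function (high word of the offset in CX, low word in DX, seek method in
  AL). The names seek_regs and record_offset are invented for this example.

```c
#include <stdint.h>

typedef struct {
    uint16_t cx;    /* high word of the 32-bit file offset */
    uint16_t dx;    /* low word of the 32-bit file offset  */
} seek_regs;

/* Compute the offset of record recnum (counting from zero) in a file of
   fixed-length records, relative to the start of the file. The product
   is truncated to 32 bits, so very large files are outside the scope of
   this sketch. */
static seek_regs record_offset(uint32_t recnum, uint16_t recsize)
{
    uint32_t offset = recnum * (uint32_t)recsize;
    seek_regs r;

    r.cx = (uint16_t)(offset >> 16);
    r.dx = (uint16_t)(offset & 0xFFFFu);
    return r;
}
```

  With the 1024-byte records of Figure 8-7, record 100 starts at byte
  offset 102,400 (19000H), so the sketch yields CX = 0001H and DX = 9000H.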


MS-DOS Error Codes

  When one of the handle file functions fails with the carry flag set, or
  when a program calls Int 21H Function 59H (Get Extended Error
  Information) following a failed FCB function or other system service, one
  of the following error codes may be returned:

  Value                    Meaning
  ──────────────────────────────────────────────────────────────────────────
  MS-DOS version 2 error codes
  01H                      Function number invalid
  02H                      File not found
  03H                      Path not found
  04H                      Too many open files
  05H                      Access denied
  06H                      Handle invalid
  07H                      Memory control blocks destroyed
  08H                      Insufficient memory
  09H                      Memory block address invalid
  0AH (10)                 Environment invalid
  0BH (11)                 Format invalid
  0CH (12)                 Access code invalid
  0DH (13)                 Data invalid
  0EH (14)                 Unknown unit
  0FH (15)                 Disk drive invalid
  10H (16)                 Attempted to remove current directory
  11H (17)                 Not same device
  12H (18)                 No more files

  Mappings to critical-error codes
  13H (19)                 Write-protected disk
  14H (20)                 Unknown unit
  15H (21)                 Drive not ready
  16H (22)                 Unknown command
  17H (23)                 Data error (CRC)
  18H (24)                 Bad request-structure length
  19H (25)                 Seek error
  1AH (26)                 Unknown media type
  1BH (27)                 Sector not found
  1CH (28)                 Printer out of paper
  1DH (29)                 Write fault
  1EH (30)                 Read fault
  1FH (31)                 General failure

  MS-DOS version 3 and later extended error codes
  20H (32)                 Sharing violation
  21H (33)                 File-lock violation
  22H (34)                 Disk change invalid
  23H (35)                 FCB unavailable
  24H (36)                 Sharing buffer exceeded
  25H─31H (37─49)          Reserved
  32H (50)                 Unsupported network request
  33H (51)                 Remote machine not listening
  34H (52)                 Duplicate name on network
  35H (53)                 Network name not found
  36H (54)                 Network busy
  37H (55)                 Device no longer exists on network
  38H (56)                 NetBIOS command limit exceeded
  39H (57)                 Error in network adapter hardware
  3AH (58)                 Incorrect response from network
  3BH (59)                 Unexpected network error
  3CH (60)                 Remote adapter incompatible
  3DH (61)                 Print queue full
  3EH (62)                 Not enough room for print file
  3FH (63)                 Print file was deleted
  40H (64)                 Network name deleted
  41H (65)                 Network access denied
  42H (66)                 Incorrect network device type
  43H (67)                 Network name not found
  44H (68)                 Network name limit exceeded
  45H (69)                 NetBIOS session limit exceeded
  46H (70)                 Temporary pause
  47H (71)                 Network request not accepted
  48H (72)                 Print or disk redirection paused
  49H─4FH (73─79)          Reserved
  50H (80)                 File already exists
  51H (81)                 Reserved
  52H (82)                 Cannot make directory
  53H (83)                 Fail on Int 24H (critical error)
  54H (84)                 Too many redirections
  55H (85)                 Duplicate redirection
  56H (86)                 Invalid password
  57H (87)                 Invalid parameter
  58H (88)                 Net write fault
  ──────────────────────────────────────────────────────────────────────────


  Under MS-DOS versions 3.0 and later, you can also use Int 21H Function
  59H to obtain other information about the error, such as the error locus
  and the recommended recovery action.
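
  A program that reports errors to the user may find it convenient to sort
  a returned code into the groups used in the preceding table. The C sketch
  below does so. The names err_group, error_group, and critical_to_extended
  are invented for this example, the treatment of code zero as "no error"
  is an assumption, and the second function assumes the conventional
  mapping in which driver error code n (as passed to a critical-error
  handler) corresponds to error code 13H + n in the table.

```c
typedef enum {
    ERR_NONE,       /* code 0: assumed to mean "no error"            */
    ERR_DOS2,       /* 01H-12H: MS-DOS version 2 error codes         */
    ERR_CRITICAL,   /* 13H-1FH: mappings to critical-error codes     */
    ERR_DOS3,       /* 20H-58H: version 3 and later extended codes   */
    ERR_RESERVED,   /* 25H-31H, 49H-4FH, 51H: reserved in the table  */
    ERR_UNKNOWN     /* anything beyond the table                     */
} err_group;

/* Classify an error code according to the table above. */
static err_group error_group(unsigned code)
{
    if (code == 0)
        return ERR_NONE;
    if (code <= 0x12)
        return ERR_DOS2;
    if (code <= 0x1F)
        return ERR_CRITICAL;
    if ((code >= 0x25 && code <= 0x31) ||
        (code >= 0x49 && code <= 0x4F) || code == 0x51)
        return ERR_RESERVED;
    if (code <= 0x58)
        return ERR_DOS3;
    return ERR_UNKNOWN;
}

/* Convert a driver error code (low byte of DI in a critical-error
   handler) to the corresponding code in the table. Assumed mapping. */
static unsigned critical_to_extended(unsigned di)
{
    return (di & 0xFFu) + 0x13u;
}
```

  For example, error_group(0x20) returns ERR_DOS3 (sharing violation), and
  critical_to_extended(2) returns 15H, which the table lists as "Drive not
  ready."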

Critical-Error Handlers

  In Chapter 5, we discussed how an application program can take over the
  Ctrl-C handler vector (Int 23H) and replace the MS-DOS default handler, to
  avoid losing control of the computer when the user enters a Ctrl-C or
  Ctrl-Break at the keyboard. Similarly, MS-DOS provides a
  critical-error-handler vector (Int 24H) that defines the routine to be
  called when unrecoverable hardware faults occur. The default MS-DOS
  critical-error handler is the routine that displays a message describing
  the error type and the cue

  Abort, Retry, Ignore?

  This message appears after such actions as the following:

  ■  Attempting to open a file on a disk drive that doesn't contain a floppy
     disk or whose door isn't closed

  ■  Trying to read a disk sector that contains a CRC error

  ■  Trying to print when the printer is off line

  The unpleasant thing about MS-DOS's default critical-error handler is, of
  course, that if the user enters an A for Abort, the application that is
  currently executing is terminated abruptly and never has a chance to clean
  up and make a graceful exit. Intermediate files may be left on the disk,
  files that have been extended using FCBs are not properly closed (so the
  directory is never updated), interrupt vectors may be left pointing into
  the transient program area, and so forth.

  To write a truly bombproof MS-DOS application, you must take over the
  critical-error-handler vector and point it to your own routine, so that
  your program intercepts all catastrophic hardware errors and handles them
  appropriately. You can use MS-DOS Int 21H Function 25H to alter the Int
  24H vector in a well-behaved manner. When your application exits, MS-DOS
  will automatically restore the previous contents of the Int 24H vector
  from information saved in the program segment prefix.

  MS-DOS calls the critical-error handler for two general classes of
  errors──disk-related and non-disk-related──and passes different
  information to the handler in the registers for each of these classes.

  For disk-related errors, MS-DOS sets the registers as shown in the
  following table. (Bits 3─5 of the AH register are relevant only in MS-DOS
  versions 3.1 and later.)

  Register           Bit(s)            Significance
  ──────────────────────────────────────────────────────────────────────────
  AH                 7                 0, to signify disk error
                     6                 Reserved
                     5                 0 = ignore response not allowed
                                       1 = ignore response allowed
                     4                 0 = retry response not allowed
                                       1 = retry response allowed
                     3                 0 = fail response not allowed
                                       1 = fail response allowed
                     1─2               Area where disk error occurred
                                       00 = MS-DOS area
                                       01 = file allocation table
                                       10 = root directory
                                       11 = files area
                     0                 0 = read operation
                                       1 = write operation
  AL                 0─7               Drive code (0 = A, 1 = B, and so
                                       forth)
  DI                 0─7               Driver error code
                     8─15              Not used
  BP:SI                                Segment:offset of device-driver
                                       header
  ──────────────────────────────────────────────────────────────────────────
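
  The bit assignments in the preceding table can be unpacked as shown in
  the following C sketch. It is a model only: ah and al are ordinary
  parameters holding the values of the AH and AL registers at entry to the
  handler, and the names crit_info, decode_critical, and area_name are
  invented for this example. Remember that the three "allowed" flags are
  meaningful only under MS-DOS versions 3.1 and later.

```c
typedef struct {
    int is_disk_error;   /* bit 7 of AH clear                      */
    int ignore_allowed;  /* bit 5 of AH                            */
    int retry_allowed;   /* bit 4 of AH                            */
    int fail_allowed;    /* bit 3 of AH                            */
    int area;            /* bits 1-2 of AH, 0 through 3            */
    int is_write;        /* bit 0 of AH                            */
    int drive_letter;    /* 'A' + AL; valid only for disk errors   */
} crit_info;

/* Unpack AH and AL as described in the table above. */
static crit_info decode_critical(unsigned ah, unsigned al)
{
    crit_info c;

    c.is_disk_error  = (ah & 0x80u) == 0;
    c.ignore_allowed = (int)((ah >> 5) & 1u);
    c.retry_allowed  = (int)((ah >> 4) & 1u);
    c.fail_allowed   = (int)((ah >> 3) & 1u);
    c.area           = (int)((ah >> 1) & 3u);
    c.is_write       = (int)(ah & 1u);
    c.drive_letter   = 'A' + (int)(al & 0xFFu);
    return c;
}

/* Text for the two-bit area code in bits 1-2 of AH. */
static const char *area_name(int area)
{
    static const char *const names[4] = {
        "MS-DOS area", "file allocation table",
        "root directory", "files area"
    };
    return names[area & 3];
}
```

  For example, AH = 3EH and AL = 00H describe a read error in the files
  area of drive A, with the ignore, retry, and fail responses all allowed.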


  For non-disk-related errors, the interrupt was generated either as the
  result of a character-device error or because a corrupted memory image of
  the file allocation table was detected. In this case, MS-DOS sets the
  registers as follows:

  Register           Bit(s)            Significance
  ──────────────────────────────────────────────────────────────────────────
  AH                 7                 1, to signify a non-disk error
  DI                 0─7               Driver error code
                     8─15              Not used
  BP:SI                                Segment:offset of device-driver
                                       header
  ──────────────────────────────────────────────────────────────────────────

  To determine whether the critical error was caused by a character device,
  use the address in the BP:SI registers to examine the device attribute
  word at offset 0004H in the presumed device-driver header. If bit 15 is
  set, then the error was indeed caused by a character device, and the
  program can inspect the name field of the driver's header to determine the
  device.
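
  The test just described can be sketched in C as follows. The sketch works
  on a copy of the header bytes rather than on a far pointer built from
  BP:SI, so that it stays independent of any particular compiler. It
  assumes the standard device-header layout, in which the 8-byte name field
  of a character device begins at offset 000AH. The function names and the
  sample header (whose entry-point offsets are simply zero) are
  hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 18-byte header for a character device named PRN. */
static const uint8_t sample_hdr[18] = {
    0xFF, 0xFF, 0xFF, 0xFF,     /* 00H link to next driver             */
    0x00, 0x80,                 /* 04H attribute word, bit 15 set      */
    0x00, 0x00,                 /* 06H strategy entry (placeholder)    */
    0x00, 0x00,                 /* 08H interrupt entry (placeholder)   */
    'P', 'R', 'N', ' ', ' ', ' ', ' ', ' '  /* 0AH device name         */
};

/* Return nonzero if bit 15 of the attribute word at offset 0004H is set. */
static int is_char_device(const uint8_t *hdr)
{
    uint16_t attr = (uint16_t)(hdr[4] | (hdr[5] << 8));
    return (attr & 0x8000u) != 0;
}

/* Copy the blank-padded name field into name as a C string. */
static void device_name(const uint8_t *hdr, char name[9])
{
    int n = 8;

    memcpy(name, hdr + 0x0A, 8);
    while (n > 0 && name[n - 1] == ' ')     /* strip trailing blanks */
        n--;
    name[n] = '\0';
}
```

  A handler would call device_name only after is_char_device has confirmed
  that the header really belongs to a character device.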

  At entry to a critical-error handler, MS-DOS has already disabled
  interrupts and set up the stack as shown in Figure 8-8. A critical-error
  handler cannot use any MS-DOS services except Int 21H Functions 01H
  through 0CH (Traditional Character I/O), Int 21H Function 30H (Get MS-DOS
  Version), and Int 21H Function 59H (Get Extended Error Information).
  These functions use a special stack so that the context of the original
  function (which generated the critical error) will not be lost.

  ┌───────┐─┐
  │ Flags │ │
  ├───────┤ │  Flags and CS:IP pushed
  │  CS   │ ├─ on stack by original
  ├───────┤ │  Int 21H call
  │  IP   │ │
  ├───────┤═╡◄─SS:SP on entry to
  │  ES   │ │  Int 21H handler
  ├───────┤ │
  │  DS   │ │
  ├───────┤ │
  │  BP   │ │
  ├───────┤ │
  │  DI   │ │
  ├───────┤ ├─ Registers at point of
  │  SI   │ │  original Int 21H call
  ├───────┤ │
  │  DX   │ │
  ├───────┤ │
  │  CX   │ │
  ├───────┤ │
  │  BX   │ │
  ├───────┤ │
  │  AX   │ │
  ├───────┤═╡
  │ Flags │ │
  ├───────┤ │
    CS