MS C 6.0 Advanced Programming Techniques

The following document is from the Microsoft Programmer’s Library 1.3 CD-ROM.

Microsoft C - Advanced Programming Techniques

────────────────────────────────────────────────────────────────────────────
Microsoft (R) C - Advanced Programming Techniques

FOR MS (R) OS/2 AND MS-DOS (R)
OPERATING SYSTEMS
────────────────────────────────────────────────────────────────────────────

MICROSOFT CORPORATION

Information in this document is subject to change without notice and does
not represent a commitment on the part of Microsoft Corporation. The
software described in this document is furnished under a license agreement
or nondisclosure agreement. The software may be used or copied only in
accordance with the terms of the agreement. It is against the law to copy
the software on any medium except as specifically allowed in the license or
nondisclosure agreement. No part of this manual may be reproduced or
transmitted in any form or by any means, electronic or mechanical, including
photocopying and recording, for any purpose without the express written
permission of Microsoft.

Printed and bound in the United States of America.

Microsoft, MS, MS-DOS, CodeView, InPort, and XENIX are
registered trademarks and Windows is a trademark of Microsoft Corporation.

Apple and Macintosh are registered trademarks and Finder
is a trademark of Apple Computer, Inc.

AT&T is a registered trademark of American Telephone
and Telegraph Company.

Hercules is a registered trademark and InColor is a trademark
of Hercules Computer Technology.

IBM is a registered trademark of International Business
Machines Corporation.

Intel is a registered trademark of Intel Corporation.

Olivetti is a registered trademark of Ing. C. Olivetti.

PDP-11 and VAX-11 are registered trademarks of Digital
Equipment Corporation.

WANG is a registered trademark of Wang Laboratories.

Z8000 is a registered trademark of Zilog, Inc.

Document No. LN06514-1189 OEMO711-6Z

────────────────────────────────────────────────────────────────────────────

Introduction
Scope of This Book
Document Conventions

PART I  Improving Program Performance
────────────────────────────────────────────────────────────────────────────

Chapter 1  Optimizing C Programs

1.1   Controlling Optimization from the Programmer's WorkBench
1.2   Controlling Optimization from the Command Line
1.3   Controlling Optimization with Pragmas
1.4   Default Optimization
1.4.1    Common Subexpression Elimination
1.4.2    Dead-Store Elimination
1.4.3    Constant Propagation
1.5   Customizing Your Optimizations
1.5.1    Choosing Speed or Size (/Ot and /Os)
1.5.2    Generating Intrinsic Functions (/Oi)
1.5.3    Assuming No Aliasing (/Oa and /Ow)
1.5.4    Performing Loop Optimizations (/Ol)
1.5.5    Disabling Unsafe Loop Optimizations (/On)
1.5.6    Enabling Aggressive Optimizations (/Oz)
1.5.7    Removing Stack Probes (/Gs)
1.5.8    Enabling Global Register Allocation (/Oe)
1.5.9    Enabling Common Subexpression Optimization (/Oc and /Og)
1.5.10   Achieving Consistent Floating-Point Results (/Op)
1.5.11   Using the 80186, 80188, or 80286 Processor (/G0, /G1, /G2)
1.5.12   Optimizing for Maximum Efficiency (/Ox)
1.6   Linker (LINK) Options that Control Optimization
1.6.1    Enabling Far Call Optimization (/FARCALLTRANSLATION)
1.6.2    Packing Code (/PACKCODE)
1.6.3    Packing Data (/PACKDATA)
1.6.4    Packing the Executable File (/EXEPACK)
1.7   Optimizing in Different Environments
1.7.1    Optimizing in DOS
1.7.2    Optimizing in OS/2
1.7.3    Optimizing in Microsoft Windows(tm)
1.8   Choosing Function-Calling Conventions
1.8.1    The C Calling Convention (/Gd)
1.8.2    The FORTRAN/Pascal Calling Convention (/Gc)
1.8.3    The Register Calling Convention (/Gr)
1.8.4    The _fastcall Calling Convention

Chapter 2  Managing Memory

2.1   Pointer Sizes
2.1.1    Pointers and 64K Segments
2.1.2    Near Pointers
2.1.3    Far Pointers
2.1.4    Huge Pointers
2.2   Selecting a Standard Memory Model
2.2.1    The Six Standard Memory Models
2.2.2    Limitations on Code Size and Data Size
2.2.3    The Tiny Memory Model
2.2.4    The Huge Memory Model
2.2.5    Null Pointers
2.2.6    Specifying a Memory Model
2.3   Mixing Memory Models
2.3.1    Pointer Problems
2.3.2    Declaring Near, Far, Huge, and Based Variables
2.3.3    Declaring Near and Far Functions
2.3.4    Pointer Conversions
2.4   Customizing Memory Models
2.4.1    Setting a Size for Code Pointers
2.4.2    Setting a Size for Data Pointers
2.4.3    Setting Up Segments
2.4.4    Library Support for Customized Memory Models
2.4.5    Setting the Data Threshold
2.4.6    Naming Modules and Segments

Chapter 3  Using the In-Line Assembler

3.1   Advantages of In-Line Assembly
3.2   The _asm Keyword
3.3   Using Assembly Language in _asm Blocks
3.4   Using C in _asm Blocks
3.4.1    Using Operators
3.4.2    Using C Symbols
3.4.3    Accessing C Data
3.4.4    Writing Functions
3.5   Using and Preserving Registers
3.6   Jumping to Labels
3.7   Calling C Functions
3.8   Defining _asm Blocks as C Macros
3.9   Optimizing

Chapter 4  Controlling Floating-Point Math Operations

4.1   Declaring Floating-Point Types
4.1.1    Declaring Variables as Floating-Point Types
4.1.2    Declaring Functions that Return Floating-Point Types
4.2   C Run-Time Library Support of Type long double
4.3   Summary of Math Packages
4.3.1    Emulator Package
4.3.2    Math Coprocessor Package
4.3.3    Alternate Math Package
4.4   Selecting Floating-Point Options (/FP)
4.4.1    In-Line Emulator Option (/FPi)
4.4.2    In-Line Math Coprocessor Instructions Option (/FPi87)
4.4.3    Calls to Emulator Option (/FPc)
4.4.4    Calls to Math Coprocessor Option (/FPc87)
4.4.5    Use Alternate Math Option (/FPa)
4.5   Library Considerations for Floating-Point Options
4.5.1    Using One Standard Library for Linking
4.5.2    In-Line Instructions or Calls
4.6   Compatibility between Floating-Point Options
4.7   Using the NO87 Environment Variable
4.8   Incompatibility Issues

PART II  Improving Programmer Productivity
────────────────────────────────────────────────────────────────────────────

Chapter 5  Compiling and Linking Quickly

5.1   Compiling Quickly
5.1.1    Quick Compiler
5.1.2    Incremental Compile Option
5.2.1    Preparing for Incremental Linking
5.2.2    Incremental Violations

Chapter 6  Managing Development Projects with NMAKE

6.1   Overview of NMAKE
6.2   The NMAKE Command
6.3   NMAKE Description Files
6.3.1    Description Blocks
6.3.3    Macros
6.3.4    Inference Rules
6.3.5    Directives
6.3.6    Pseudotargets
6.3.7    PWB's extmake Syntax
6.4   Command-Line Options
6.5   NMAKE Command Files
6.6   The TOOLS.INI File
6.7   In-Line Files
6.8   NMAKE Operations Sequence
6.9   Differences between NMAKE and MAKE

Chapter 7  Creating Help Files with HELPMAKE

7.1   Structure and Contents of a Help Database
7.1.1    Contents of a Help File
7.1.2    Help File Formats
7.2   Invoking HELPMAKE
7.3   HELPMAKE Options
7.3.1    Options for Encoding
7.3.2    Options for Decoding
7.4   Creating a Help Database
7.5   Help Text Conventions
7.5.1    Structure of the Help Text File
7.5.2    Local Contexts
7.5.3    Context Prefixes
7.6   Using Help Database Formats
7.6.1    QuickHelp Format
7.6.2    Minimally Formatted ASCII Format
7.6.3    Rich Text Format (RTF)

Chapter 8  Customizing the Microsoft Programmer's WorkBench

8.1   Setting Switches
8.1.1    Editing the <assign> Pseudofile
8.1.2    Editing the TOOLS.INI Initialization File
8.2   Assigning Keystrokes
8.3   Writing Macros
8.3.1    Macro Syntax
8.3.2    Macro Responses
8.3.3    Macro Arguments
8.3.4    Macro Conditionals
8.3.5    Temporary Macros
8.3.6    Macro Recordings
8.4   Writing and Building C Extensions
8.4.1    Building Real-Mode Extensions
8.4.2    Building Protected-Mode Extensions
8.4.3    Describing Functions and Switches
8.4.4    Initializing Functions
8.4.5    Prototyping Functions
8.4.6    Receiving Parameters
8.4.7    Calling PWB Functions
8.4.8    Calling C Library Functions

Chapter 9  Debugging C Programs with CodeView

9.1   Understanding CodeView Windows
9.2   Overview of Debugging Techniques
9.3   Viewing and Modifying Program Data
9.3.1    Displaying Variables in the Watch Window
9.3.2    Displaying Expressions in the Watch Window
9.3.3    Displaying Arrays and Structures
9.3.4    Displaying Array Elements Dynamically
9.3.5    Using Quick Watch
9.3.6    Displaying Memory
9.3.7    Displaying the Processor Registers
9.3.8    Modifying the Values of Variables, Registers, and Memory
9.4   Controlling Execution
9.4.1    Continuous Execution
9.4.2    Single-Stepping
9.5   Replaying a Debug Session
9.6   Advanced CodeView Techniques
9.7   Controlling CodeView with Command-Line Options
9.8   Customizing CodeView with the TOOLS.INI File

PART III  Special Environments
────────────────────────────────────────────────────────────────────────────

Chapter 10  Communicating with Graphics

10.1  Video Modes
10.1.1    Sample Low-Level Graphics Program
10.1.2    Setting a Video Mode
10.1.3    Reading the videoconfig Structure
10.1.4    Maximizing Resolution or Color
10.1.5    Selecting Your Own Video Modes
10.2  Mixing Colors and Changing Palettes
10.2.1    CGA Palettes
10.2.2    Olivetti(R) Palettes
10.2.3    VGA Palettes
10.2.4    MCGA Palettes
10.2.5    EGA Palettes
10.2.6    Symbolic Constants
10.3  Specifying Points within Coordinate Systems
10.3.1    Physical Coordinates
10.3.2    Viewport Coordinates
10.3.3    Window Coordinates
10.3.4    Screen Locations
10.3.5    Bounding Rectangles
10.3.6    The Pixel Cursor
10.4  Graphics Functions
10.4.1    Controlling Video Modes
10.4.2    Changing Colors
10.4.3    Drawing Points, Lines, and Shapes
10.4.4    Defining Patterns
10.4.5    Manipulating Images
10.5  Using Graphic Fonts
10.5.1    Using the C Font Library
10.5.2    Registering the Fonts
10.5.3    Setting the Current Font
10.5.4    Displaying Text
10.5.5    A Sample Program
10.5.6    Using Fonts Effectively

Chapter 11  Creating Charts and Graphs

11.1  Overview of Presentation Graphics
11.2  Parts of a Graph
11.3  Writing a Presentation Graphics Program
11.3.1    Pie Chart
11.3.2    Bar, Column, and Line Charts
11.3.3    Scatter Diagram
11.4  Manipulating Colors and Patterns
11.4.1    Color Pool
11.4.2    Style Pool
11.4.3    Pattern Pool
11.4.4    Character Pool
11.5  Customizing the Chart Environment
11.5.1    titletype Structures
11.5.2    axistype Structures
11.5.3    windowtype Structures
11.5.4    legendtype Structures
11.5.5    chartenv Structures

Chapter 12  Programming with Mixed Languages

12.1  Making Mixed-Language Calls
12.2  Language Convention Requirements
12.2.1    Naming Convention Requirement
12.2.2    Calling Convention Requirement
12.2.3    Parameter-Passing Requirement
12.3  Compiling and Linking
12.3.1    Compiling with Correct Memory Models
12.3.2    Linking with Language Libraries
12.4  C Calls to High-Level Languages
12.5  C Calls to BASIC
12.6  C Calls to FORTRAN
12.6.1    Calling a FORTRAN Subroutine from C
12.6.2    Calling a FORTRAN Function from C
12.7  C Calls to Pascal
12.7.1    Calling a Pascal Procedure from C
12.7.2    Calling a Pascal Function from C
12.8  C Calls to Assembly Language
12.8.1    Writing the Assembly-Language Procedure
12.8.2    Setting Up the Procedure
12.8.3    Entering the Procedure
12.8.4    Allocating Local Data
12.8.5    Preserving Register Values
12.8.6    Accessing Parameters
12.8.7    Returning a Value
12.8.8    Exiting the Procedure
12.9  Handling Data in Mixed-Language Programming
12.9.1    Default Naming and Calling Conventions
12.9.2    Numeric Data Representation
12.9.3    Strings
12.9.4    Arrays
12.9.5    Array Declaration and Indexing
12.9.6    Structures, Records, and User-Defined Types
12.9.7    External Data
12.9.8    Pointers and Address Variables
12.9.9    Common Blocks
12.9.10   Using a Varying Number of Parameters

Chapter 13  Writing Portable Programs

13.1  Assumptions about Hardware
13.1.1    Size of Basic Types
13.1.2    Storage Order and Alignment
13.1.3    Byte Order in a Word
13.1.4    Reading and Writing Structures
13.1.5    Bit Fields in Structures
13.1.6    Processor Arithmetic Mode
13.1.7    Pointers
13.1.9    Character Set
13.2  Assumptions about the Compiler
13.2.1    Sign Extension
13.2.2    Length and Case of Identifiers
13.2.3    Register Variables
13.2.4    Functions with a Variable Number of Arguments
13.2.5    Evaluation Order
13.2.6    Function and Macro Arguments with Side Effects
13.2.7    Environment Differences
13.3  Portability of Data Files
13.4  Portability Concerns Specific to Microsoft C
13.5  Microsoft C Byte Ordering

PART IV  OS/2 Support
────────────────────────────────────────────────────────────────────────────

Chapter 14  Building OS/2 Applications

14.1  The OS/2 Applications Program Interface
14.1.1    Calling the OS/2 API
14.1.2    Including the OS/2 Header Files
14.1.3    Creating Dual-Mode Programs as Family Applications
14.2  Compile Options for the CL Command
14.2.1    The Link Mode Options (/Lp, /Lr, and /Lc)
14.2.2    Creating Bound Programs Option (/Fb)
14.2.3    Library Selection Options (/MT, /ML, /MD, /Zl)
14.2.4    Memory-Model Options (/Ax)
14.3  Module-Definition Files and Import Libraries
14.3.1    Adding a Module-Definition File to the LINK Command
14.3.2    Creating Dynamic-Link Libraries (DLLs)
14.3.3    Creating Programs with I/O Privileges
14.3.4    Creating Presentation Manager Applications
14.3.5    Creating Import Libraries with the IMPLIB Utility
14.4  Link Command-Line Options
14.5  The BIND Utility

Chapter 15  Creating Multithread OS/2 Applications

15.1.1    Library Support
15.1.2    Include Files
15.1.3    C Run-Time Library Functions for Thread Control
15.2  Sample Multithread C Program
15.3  Writing a Multithread Program
15.4  Compiling and Linking
15.5  Avoiding Problem Areas
15.6  Using the Protected-Mode CodeView Debugger
15.6.1    Compiling with the /Zi Option
15.6.2    Prompt for Thread Number
15.6.4    Screen Groups Used by CodeView

Chapter 16  Dynamic Linking with OS/2

16.1  Overview of Dynamic Linking
16.1.2    Application Programs and DLLs
16.1.3    DLLs and Microsoft C Run-Time Libraries
16.2  Designing and Writing DLLs
16.2.1    Floating-Point Math Requirements
16.2.2    Initialization and Termination Requirements
16.2.3    Making the DLL Re-Entrant
16.2.4    Signal Handling
16.2.5    Using Microsoft C Keywords
16.2.6    Compile Options for Dynamic-Link Libraries
16.3  Building DLLs with Microsoft C
16.3.1    DLLs with Static C Run-Time Library Functions
16.3.2    DLLs without C Run-Time Library Functions
16.3.3    Programs and DLLs with a C Run-Time DLL
16.3.4    Using CodeView to Debug Dynamic-Link Libraries

Appendix A  Using Exit Codes

A.1   The exit Function
A.2   Testing Exit Codes from Command and Batch Files
A.3   Accessing Exit Codes from Other Programs

Appendix B  Differences between C Versions 5.1 and 6.0

B.1   Modifications for ANSI Compatibility
B.1.1    ANSI-Mandated New Features
B.1.2    Integer Promotion Rules
B.1.3    Defining NULL as a Pointer
B.1.4    Shift Operators
B.1.5    Pointers to Typedefs
B.1.6    Identifying Nonstandard Keywords
B.1.7    Trigraphs
B.1.8    ANSI Nonconformance
B.2   New Keywords and Functions
B.2.1    In-Line Assembler
B.2.2    Based Pointers and Objects
B.2.3    Based Heap Allocation Support
B.2.4    Releasing Unused Heap Memory
B.2.5    Making Static Data Available to the Heap
B.2.6    Long Doubles
B.2.7    Long Double Functions
B.2.8    Model-Independent String and Memory Functions
B.2.9    Mixed-Model Memory Allocation Support
B.2.10   The _fastcall Attribute (/Gr Option)
B.2.11   Drive and Directory Functions
B.2.12   Text Output Functions for OS/2
B.3   New Features
B.3.1    Strings and Macros
B.3.2    CL Options
B.3.3    Tiny Memory Model (.COM Files)
B.3.4    The Optimize Pragma
B.3.5    Nameless Structures and Unions
B.3.6    Unsized Arrays as the Last Member of a Structure
B.3.7    Improved Warnings
B.3.8    Macros
B.3.9    Improved Multithread Support in OS/2
B.3.10   Pipe Support in OS/2
B.4   Differences in Code Generation
B.4.1    Speed and Space Improvements
B.4.2    Code Quality
B.4.3    Floating-Point Code Generation
B.4.4    Intrinsic Functions
B.5   Changes and Deletions
B.5.1    Deleted Features
B.5.2    Evaluation of Real Expressions
B.5.3    Default Optimizations
B.5.4    Sign Extension of char Arguments
B.5.5    Conditional Compilation and Signed Values
B.5.6    The const and volatile Qualifiers
B.5.7    Memory Allocation
B.5.8    Memory Used by Command-Line Arguments
B.5.9    Format Specifiers in printf
B.5.10   Functions that Return Float Values

Appendix C  Implementation-Defined Behavior

C.1   Translation
C.1.1    Diagnostics
C.2   Environment
C.2.1    Arguments to main
C.2.2    Interactive Devices
C.3   Identifiers
C.3.1    Significant Characters without External Linkage
C.3.2    Significant Characters with External Linkage
C.3.3    Upper- and Lowercase
C.4   Characters
C.4.1    The ASCII Character Set
C.4.2    Multibyte Characters
C.4.3    Bits per Character
C.4.4    Character Sets
C.4.5    Unrepresented Character Constants
C.4.6    Wide Characters
C.4.7    Converting Multibyte Characters
C.4.8    Range of char Values
C.5   Integers
C.5.1    Range of Integer Values
C.5.2    Demotion of Integers
C.5.3    Signed Bitwise Operations
C.5.4    Remainders
C.5.5    Right Shifts
C.6   Floating-Point Math
C.6.1    Values
C.6.2    Casting Integers to Floating-Point Values
C.6.3    Truncation of Floating-Point Values
C.7   Arrays and Pointers
C.7.1    Largest Array Size
C.7.2    Casting Pointers
C.7.3    Pointer Subtraction
C.8   Registers
C.8.1    Availability of Registers
C.9   Structures, Unions, Enumerations, and Bit Fields
C.9.2    Sign of Bit Fields
C.9.3    Storage of Bit Fields
C.9.4    Alignment of Bit Fields
C.9.5    The enum Type
C.10  Qualifiers
C.11  Declarators
C.11.1   Maximum Number
C.12  Statements
C.12.1   Limits on Switch Statements
C.13  Preprocessing Directives
C.13.1   Character Constants and Conditional Inclusion
C.13.2   Including Bracketed File Names
C.13.3   Including Quoted File Names
C.13.4   Character Sequences
C.13.5   Pragmas
C.13.6   Default Date and Time
C.14  Library Functions
C.14.1   NULL Macro
C.14.2   Diagnostic Printed by the assert Function
C.14.3   Character Testing
C.14.4   Domain Errors
C.14.5   Underflow of Floating-Point Values
C.14.6   The fmod Function
C.14.7   The signal Function
C.14.8   Default Signals
C.14.9   The SIGILL Signal
C.14.10  Terminating Newline Characters
C.14.11  Blank Lines
C.14.12  Null Characters
C.14.13  File Position in Append Mode
C.14.14  Truncation of Text Files
C.14.15  File Buffering
C.14.16  Zero-Length Files
C.14.17  File Names
C.14.18  File Access Limits
C.14.19  Deleting Open Files
C.14.20  Renaming with a Name that Exists
C.14.21  Printing Pointer Values
C.14.22  Reading Pointer Values
C.14.24  File Position Errors
C.14.25  Messages Generated by the perror Function
C.14.26  Allocating Zero Memory
C.14.27  The abort Function
C.14.28  The atexit Function
C.14.29  Environment Names
C.14.30  The system Function
C.14.31  The strerror Function
C.14.32  The Time Zone
C.14.33  The clock Function

Index

Introduction
────────────────────────────────────────────────────────────────────────────

Advanced Programming Techniques describes how to get the most out of the
Microsoft(R) C Professional Development System with its new integrated
development environment─the Microsoft Programmer's WorkBench─and
source-level debugging tool─the CodeView(R) debugger.

In this manual, you will see how all the components of the Microsoft C
Professional Development System work together to provide you with the most
powerful development environment available. A key element in the power of
the Professional Development System is your ability to customize it to suit
your individual needs as a programmer.

Because this book is arranged by topic, it answers questions about using
Microsoft C version 6.0, rather than providing lists of options. If you have
specific questions about menu items in the CodeView debugger, the
Programmer's WorkBench, or any of the command-line utilities included in the
Professional Development System, you can use the Microsoft C Advisor
(on-line help) or the C Reference manual.

Advanced Programming Techniques shows you how tools and utilities all fit
together.

Scope of This Book

Advanced Programming Techniques is divided into four parts. Part 1,
"Improving Program Performance," helps you write more efficient programs. It
provides specific information about optimizing─when and why to use various
optimizing options. It also explains new memory management options and when
to use them. For example, Chapter 3 describes the in-line assembler, a new
feature that lets you mix assembly language with your C source code.

Part 2, "Improving Programmer Productivity," will help you perform
programming tasks more quickly and efficiently. Chapter 8 explains the
different ways you can customize the new Programmer's WorkBench (PWB)─an
editor and integrated development environment that allows you to

■   Create new programs

■   Modify existing programs

■   Browse source files

■   Obtain help about PWB, the C language, and the C run-time libraries

■   Set program build lists

■   Build programs

■   Debug programs with the CodeView debugger

Chapter 8 also describes how to change PWB behavior to suit your programming
style by making keyboard assignments, recording or writing macros, and
writing C extensions.

Also in Part 2 is a chapter about the Microsoft Program Maintenance Utility,
NMAKE. NMAKE is a new program maintenance facility that allows you to use
program lists as input, which provides extra flexibility in your program
build process. It is a superset of the Microsoft XENIX(R) MAKE utility and
is substantially more powerful than previous versions of MAKE.

Chapter 9 in Part 2 describes the CodeView debugger, which is even more
powerful than in previous releases. With CodeView version 3.0, you get many
new features, including the ability to record a debugging session, then play
it back (history and dynamic replay).

Part 3, "Special Environments," describes new graphics capabilities. It also
shows how to program in mixed languages and offers tips to make your
programs more portable. Microsoft C helps you create graphics applications
easily. The Microsoft C run-time libraries contain graphics functions for
low-level graphics operations, such as drawing lines, rectangles, and
circles. The libraries also contain functions for creating presentation
graphics, such as pie charts and bar charts.

Part 4, "OS/2 Support," describes how the Professional Development System
helps you build OS/2 applications. The three chapters in Part 4 provide
information about building dual-mode applications, creating multithread
applications, and creating dynamic-link libraries.

A postage-paid documentation feedback card is at the end of this manual.
After you have had a chance to become familiar with Microsoft C 6.0 and its
documentation, please give us your opinion. Your ideas will help us as we
develop future documentation. Also at the end of this book is a Product
Assistance Request form. If you need to call Microsoft for assistance, use
this form first to compile and organize pertinent information.

Document Conventions

────────────────────────────────────────────────────────────────────────────
NOTE

The pages that follow use the term "OS/2" to refer to the OS/2
systems─Microsoft Operating System/2 (MS(R) OS/2) and IBM(R) OS/2.
Similarly, the term "DOS" refers to both the MS-DOS(R) and IBM Personal
Computer DOS operating systems. The name of a specific operating system is
used when it is necessary to note features that are unique to the system.

────────────────────────────────────────────────────────────────────────────

Example                           Description
────────────────────────────────────────────────────────────────────────────
STDIO.H                           Uppercase letters indicate file names,
                                  segment names, registers, and terms used
                                  at the DOS- or OS/2-command level.

_cdecl                            Boldface letters indicate C keywords,
                                  operators, language-specific characters,
                                  and library functions, as well as OS/2
                                  functions. Within discussions of syntax,
                                  bold type indicates that the text must
                                  be entered exactly as shown.

expression                        Words in italics indicate placeholders
                                  for information you must supply, such as
                                  a file name. Italics are also
                                  occasionally used for emphasis in the
                                  text.

«option»                          Items inside double square brackets are
                                  optional.

#pragma pack {1|2}                Braces and a vertical bar indicate a
                                  choice among two or more items. You must
                                  choose one of these items unless double
                                  square brackets surround the braces.

CL A.C B.C C.OBJ                  This font is used for examples, user
                                  input, program output, and error
                                  messages in text.

CL options « files...»            A horizontal ellipsis following an item
                                  indicates that more items having the
                                  same form may follow.

while(   )                        A vertical ellipsis tells you that part
{                                 of the example program has been
   .                              intentionally omitted.
   .
   .
}

CTRL+ENTER                        Small capital letters are used for the
                                  names of keys on the keyboard. When you
                                  see a plus sign (+) between two key
                                  names, you should hold down the first
                                  key while pressing the second.

                                  The carriage-return key (sometimes
                                  appearing as a bent arrow on the
                                  keyboard) is called ENTER.

                                  The cursor-movement keys (sometimes
                                  called direction keys) are called the
                                  ARROW keys. Individual keys are referred
                                  to by their direction (LEFT, UP) or by
                                  the name on the key (PGUP).

"argument"                        Quotation marks enclose a new term the
                                  first time it is defined in text.

Enhanced Graphics Adapter (EGA)   The first time an acronym is used, it is
                                  often spelled out.

PART I  Improving Program Performance
────────────────────────────────────────────────────────────────────────────

The Microsoft C Professional Development System helps you create the
fastest, smallest applications using its sophisticated optimizer and
enhanced memory management capabilities.

Chapter 1 tells when to use certain optimizations and describes how
Microsoft C generates code that is efficient in execution speed and size.
Chapter 2 explains the sophisticated tools Microsoft C gives you to allocate
and manage program memory, including the new _based type. For cases where
your program requires localized optimization, you can use the in-line
assembler, described in Chapter 3, to introduce the tightest possible code.
If your application requires floating-point math computations, you will find
Chapter 4 helpful in explaining the options in the Microsoft C math
packages; it explains which floating-point options yield the fastest,
smallest, and most flexible code.

Chapter 1  Optimizing C Programs
────────────────────────────────────────────────────────────────────────────

The Microsoft C compiler translates C source statements into
machine-executable instructions. In addition, the compiler rewrites, or
"optimizes," parts of your program to make it more efficient in ways that
are not apparent at the source level.

The compiler performs three general types of optimization:

1.  It modifies or moves sections of code so that fewer instructions are
used, or so that the instructions used make more efficient use of the
processor.

2.  It moves code and combines operations to maximize use of registers
because operations on data stored in processor registers are far
faster than the same operations on data stored in memory.

3.  It eliminates sections of code that are redundant or unused.

This chapter explains the various ways you can control how the Microsoft C
compiler optimizes your code.

1.1  Controlling Optimization from the Programmer's WorkBench

The Programmer's WorkBench (PWB) is an integrated development environment
for editing, building, and debugging applications written in Microsoft C.
For more information on the PWB, see Installing and Using the Microsoft C
Professional Development System.

There are two ways to compile from inside the Programmer's WorkBench:

1.  Debug compile. In a default debug compile, the compiler performs no
optimizations at all.

2.  Release compile. In a default release compile, the compiler performs
most optimizations.

By modifying the settings in C Global Build Options, C Debug Build Options,
and C Release Build Options (on the Options menu), you can fine-tune
optimization by individually enabling or disabling any of the optimizations
the compiler performs.

Each optimization in the Build Options dialog boxes corresponds to a
command-line option to CL. (In fact, the PWB constructs a command line from
your input and passes it to CL.)

────────────────────────────────────────────────────────────────────────────
NOTE

In this chapter, optimization options are discussed in terms of the effect
of the optimization, the command-line option to invoke the optimization, and
pragmas that control the optimization. All of these optimizations can be
controlled at the compilation-unit (file) level using the Build Options
dialog boxes.
────────────────────────────────────────────────────────────────────────────

1.2  Controlling Optimization from the Command Line

Controlling optimization from the command line requires that you determine
which optimizations you need for your application. You then specify those
optimizations using command-line options that begin with /O (and in some
cases /G).

If there is any conflict between options, the compiler uses the last option
specified on the command line. The command line

CL /Oa /Ol /Ot TEST.C

compiles the program TEST.C. It specifies that the compiler can

■   Optimize on the assumption that you are doing no aliasing (/Oa)

■   Perform loop optimization (/Ol)

■   Perform other general speed-enhancing optimizations (/Ot)

The preceding command line can also be written

CL /Oalt TEST.C

1.3  Controlling Optimization with Pragmas

Occasionally you will need to exercise a fine level of control over compiler
optimizations. Command-line options allow you to control optimization over
an entire compilation unit (file). In addition, Microsoft C supports several
pragmas that allow you to exercise such control on a per-function basis.

The pragmas that control optimization are described in this chapter under
the type of optimization they affect.

The optimize pragma is new to version 6.0.

In version 6.0, you can control each of the following optimization
parameters on a function-by-function basis using the optimize pragma:

■   Behavior of code with respect to aliasing (a and w)

■   Reduction of local common subexpressions (c)

■   Reduction of global common subexpressions (g)

■   Global register allocation (e)

■   Loop optimization (l)

■   Aggressiveness of optimizations (z)

■   Disabling of unsafe optimizations (n)

■   Achieving consistent floating-point results (p)

■   Optimizing for smaller code size or for faster execution speed (t)

Any optimization or combination of optimizations can be enabled or disabled using
the optimize pragma. For example, if you have one function that uses aliases
heavily, you need to inhibit optimizations that could cause problems with
aliases. You do not, however, want to inhibit these optimizations for code
that does not do aliasing. To do this, use the optimize pragma as follows:

/* Function(s) that do not do aliasing. */
.
.
.
#pragma optimize( "a", off )
/* Function(s) that do aliasing. */
.
.
.
#pragma optimize( "a", on )
/* More function(s) that do not do aliasing. */

The parameters to the optimize pragma can be combined in a string to enable
or disable multiple options at once. For example,

#pragma optimize( "lge", off )

disables loop optimization, global common subexpression optimization, and
global register allocation.

1.4  Default Optimization

Many optimizations are not explicitly disabled by any command-line option
except /Od (disable optimizations). These optimizations are small in scope
and are almost always helpful. They include

■   Short range common subexpression elimination

■   Constant propagation

1.4.1  Common Subexpression Elimination

In common subexpression elimination, the compiler finds code containing
repeated subexpressions and produces modified code in which the
subexpressions are evaluated only once. Subexpression elimination is usually
done with temporary variables as shown in the following example:

a = b + c * d;
x = c * d / y;

The preceding two lines contain the common subexpression  c * d. This code
can be modified to evaluate  c * d  only once; the result is placed in a
temporary variable (usually a register):

tmp = c * d;
a = b + tmp;
x = tmp / y;

1.4.2  Dead-Store Elimination

Dead-store elimination is an extension of common subexpression elimination.
Variables that contain the same value in a short piece of code can be
combined into a single temporary variable.

In the following code fragment, the compiler detects that the expression
func( x )  is equivalent to  func( a + b ):

x = a + b;
x = func( x );

Thus, the compiler can rewrite the code as follows:

x = func( a + b );

1.4.3  Constant Propagation

When doing constant propagation, the compiler analyzes variable assignments
and determines if they can be changed to constant assignments. In the
following example, the variable  i  must have a value of  7  when it is
assigned to  j:

i = 7;
j = i;

Instead of assigning  i  to  j, the constant  7  can be assigned to  j:

i = 7;
j = 7;

While you could make any of these changes in the source file, doing so might
reduce the readability of the program. In many cases, optimizations not only
increase the efficiency of the program but allow you to write more readable
code without any actual efficiency loss.

Remove optimization before using a symbolic debugger.

In some cases, you might want to disable even the default optimizations.
Because optimizations may rearrange code in the object file, it can become
difficult to recognize parts of your code during debugging. It is usually
best to remove all optimization before using a symbolic debugger. You can
remove all optimization with the /Od (disable optimizations) option.

You can disable all optimizations for a function by including the statement
#pragma optimize( "", off ). To restore optimization to its former state,
use the statement  #pragma optimize( "", on ).
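
As a minimal sketch of the pragma form, the fragment below brackets a single
function so that it alone is compiled without optimization, for example while
you step through it in a debugger. The function name  checksum  and the
_MSC_VER guard are illustrative assumptions and do not come from the text
above; the guard only keeps other compilers from seeing a Microsoft-specific
pragma.

```c
#ifdef _MSC_VER
#pragma optimize( "", off )     /* Disable all optimizations from here on. */
#endif

/* Hypothetical function under investigation: sums the bytes of a buffer. */
unsigned int checksum( const unsigned char *buf, unsigned int len )
{
    unsigned int sum = 0;
    unsigned int i;

    for( i = 0; i < len; ++i )
        sum += buf[i];
    return( sum );
}

#ifdef _MSC_VER
#pragma optimize( "", on )      /* Restore the previous optimization state. */
#endif
```

Functions that precede the first pragma or follow the second one are compiled
with whatever optimization options were given on the command line.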

1.5  Customizing Your Optimizations

The default optimizations are sufficient for many applications, but you may
want to tune your programs according to criteria not known to the compiler.
The optimization options offer you a way of providing the compiler specific
goals for optimizing your code.

1.5.1  Choosing Speed or Size (/Ot and /Os)

In addition to the default optimizations, the Microsoft C compiler also
automatically uses the /Ot option, which optimizes for speed. The /Ot option
enables optimizations that increase speed but may also increase size. If you
would rather optimize for program size, use the /Os option. The /Os option
enables optimizations that decrease program size but may also decrease
program speed.

To optimize for speed or size on a per-function basis, use the optimize
pragma with the t option. The on setting instructs the compiler to optimize
for speed; the off setting instructs the compiler to optimize for
compactness of code. For example,

#pragma optimize( "t", off )    /* Optimize for smallest code. */
.
.
.
#pragma optimize( "t", on )     /* Optimize for fastest code. */

1.5.2  Generating Intrinsic Functions (/Oi)

In place of some normal function calls, the C compiler can insert "intrinsic
functions," which operate more quickly. Every time a function is called, a
set of instructions must be executed to store parameters and to create space
for local variables. When the function returns, more code must be executed
to release space used by local variables and parameters and to return values
to the calling routine. These instructions take time to execute. In the
context of an average-sized function, the additional code is minimal, but if
the function is only a line or two, the additional code can comprise almost
half of the function's compiled code.

One way to avoid this type of code expansion is to avoid such short
functions, especially in often-used sections of code where speed is
critical. But many library functions contain only a line or two of code. The
compiler provides two forms of certain library functions. One form is a
standard C function, which requires the overhead of a function call. The
other form is a set of instructions that performs the same action as the
function without issuing a function call. This second form is called an
intrinsic function. Intrinsic functions are always faster than their
function-call equivalents and can provide significant optimizations at the
object-code level.

For example, the function strcpy might be written as follows:

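char *strcpy( char *dest, const char *source )
{
    char *start = dest;         /* Remember the destination for the return. */

    while( (*dest++ = *source++) != '\0' )
        ;                       /* Copy through the terminating null. */
    return( start );
}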

The compiler contains an intrinsic form of strcpy. If you instruct the
compiler to generate intrinsic functions, any call to strcpy will be
replaced with this intrinsic form.

────────────────────────────────────────────────────────────────────────────
NOTE

While the example above is written in C for clarity, most of the library
functions use assembly language to take full advantage of the 80x86
instruction set. Intrinsic functions are not simply library functions
defined as macros.
────────────────────────────────────────────────────────────────────────────

Compiling with the /Oi option causes the compiler to use the intrinsic forms
of the following functions:

abs             labs            memset          strcat
_disable        lrotl           outp            strcmp
_enable         lrotr           outpw           strcpy
fabs            memcmp          rotl            strlen
inp             memcpy          rotr            strset
inpw

While the following floating-point functions do not have true intrinsic
forms, they do have versions that pass arguments directly to the
floating-point chip instead of pushing them on the normal argument stack:

acos            fmod            acosl           fmodl
asin            log             asinl           logl
atan            log10           atanl           log10l
atan2           pow             atan2l          powl
ceil            sin             ceill           sinl
cos             sinh            cosl            sinhl
cosh            sqrt            coshl           sqrtl
exp             tan             expl            tanl
floor           tanh            floorl          tanhl

────────────────────────────────────────────────────────────────────────────
WARNING

The compiler performs optimizations assuming math intrinsics have no side
effects. This assumption is true except if you have written your own matherr
function and that function alters global variables. If you have written a
matherr function to handle floating-point errors, and your function has side
effects, use the function pragma to instruct the compiler not to generate
intrinsic code for math functions.
────────────────────────────────────────────────────────────────────────────

If you want the compiler to generate intrinsic functions for only a subset
of the functions listed above, use the intrinsic pragma rather than the /Oi
option. The intrinsic pragma has the following format:

#pragma intrinsic( function1, ... )

If you want to have intrinsic functions generated for most of the functions
above and function calls for only a few, compile with the /Oi option and
force function use with the function pragma. The function pragma has the
following format:

#pragma function( function1, ... )
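
A minimal sketch of the function pragma follows. It assumes a program
compiled with /Oi in which one routine should still call the library version
of strlen. The routine name  count_chars  and the _MSC_VER guard are
illustrative assumptions, not part of the original example set.

```c
#include <string.h>

#ifdef _MSC_VER
#pragma function( strlen )      /* Force a real call to the library strlen. */
#endif

/* Hypothetical helper: returns the combined length of two strings. */
unsigned int count_chars( const char *first, const char *second )
{
    return( (unsigned int)( strlen( first ) + strlen( second ) ) );
}
```

The result is the same with either form of strlen; only the generated code
for the call changes.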

The following code illustrates the use of the intrinsic pragma:

#pragma intrinsic(abs)

void main( void )
{
int i, j;

i = big_routine_1();
j = abs( i );
big_routine_2( j );
}

Generating intrinsic functions for this program causes the call to abs to be
replaced with assembly-language code that takes the absolute value of a
number. The program will execute more quickly because the function-calling
overhead is no longer required when abs is called.

In the previous example, the overall speed increase is small because there
is only a single call to abs. In the following example, where the call to
abs is in a loop and there are many calls, you can save a significant amount
of execution time by generating intrinsic functions.

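#include <stdio.h>
#include <stdlib.h>

#pragma intrinsic( abs )

void main( void )
{
    int i, j;
    long x = 0L;    /* Initialized; long because the sum exceeds a 16-bit int. */

    for( j = 0; j < 1000; j++ )
    {
        for( i = 0; i < 1000; i++ )
        {
            x += abs( i - j );
        }
    }
    printf( "The value of x is %ld\n", x );
}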

The following is a list of restrictions on using the intrinsic forms of
function calls:

■   Do not use the intrinsic forms of the floating-point math functions
with the alternate math libraries (mLIBCAy.LIB).

■   Do not use the intrinsic forms of the floating-point math functions in
OS/2 dynamic-link libraries (DLLs) because you must use the alternate
math library with LLIBCDLL.LIB.

■   If you use the /Ox (maximum optimization) option, you are enabling the
/Oi (generate intrinsic functions) option. Be careful that your use of
/Ox does not conflict with the points listed previously.

────────────────────────────────────────────────────────────────────────────
NOTE

Intrinsic versions of _enable, _disable, inp, outp, inpw, and outpw
do not work under OS/2. You must use the library versions. You can use the
function pragma to force these functions to become library calls.
────────────────────────────────────────────────────────────────────────────

1.5.3  Assuming No Aliasing (/Oa and /Ow)

An "alias" is a name used to refer to a memory location already referred to
by a different name. Because a memory access takes more time than it takes
to access the CPU's registers, the compiler tries to store frequently used
variables in registers. However, aliasing reduces the extent to which a
compiler can keep variables in registers.

A pointer is a reference to a memory location. Because the value of a
pointer is not determined until the program is run, the compiler has no way
of knowing which memory location will be modified when the program executes;
it could be a reference to a variable. Therefore, the compiler must assume
that any time the value pointed to by any pointer changes, the value of any
variable might also change. This limits the extent to which the compiler can
move values from memory to registers.

The /Oa option tells the compiler to ignore the possibility of multiple
aliases for a memory location. In the list that follows, the term
"reference" means read or write; that is, whether a variable is on the
left-hand side of an assignment statement or the right-hand side, you are
still referring to it. In addition, any function calls that use a variable
as a parameter are references to that variable. When you tell the compiler
to assume that you are not doing aliasing, it expects that the following
rules are being followed for any variable not declared as volatile:

■   If a variable is used directly, no pointers are used to reference that
variable.

■   If a pointer is used to refer to a variable, that variable is not
referred to directly.

■   If a pointer is used to modify a memory location, no other pointers
are used to access the same memory location.

To clarify how these rules affect your code, consider the following example:

char    p;
char    *ptr_p;

ptr_p = &p;    /* Take the address of p. */

You can now refer either to  *ptr_p  or to  p, but not to both within the
same function. If you must refer to the variable by both names, you are
using aliases.

Code referring to the same location with two pointers uses aliases. For
example,

char    *p_buf;
char    *p_alias;

if( (p_alias = p_buf = malloc( 5000 )) == NULL )
return;
else
{
.
.
.
}

The code in the example above is common. It demonstrates dynamically
allocating a block of memory from the heap, and preserving the original
address in  p_buf. The program then performs all pointer arithmetic on the
alias  p_alias. When the function finishes with the block of memory,  p_buf
is a valid argument for the free function because it still contains the
original address.
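
A fuller sketch of this pattern follows. The function name  fill_block  and
its fill-and-count body are hypothetical, added only to make the fragment
complete. Because  p_buf  and  p_alias  refer to the same block, the
discussion above treats a function like this as one that uses aliases.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates size bytes, fills them with ch through an
 * alias, and returns how many bytes were written (0 if allocation fails). */
unsigned int fill_block( unsigned int size, char ch )
{
    char *p_buf;                /* Keeps the original address for free. */
    char *p_alias;              /* Alias used for all pointer arithmetic. */
    unsigned int written = 0;

    if( (p_alias = p_buf = malloc( size )) == NULL )
        return( 0 );

    while( written < size )
    {
        *p_alias++ = ch;        /* Only the alias is advanced. */
        ++written;
    }

    free( p_buf );              /* p_buf still holds the original address. */
    return( written );
}
```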

The /Oa and /Ow options tell the compiler that you have not used aliases in
your code.

The difference between the /Oa and the /Ow option is that when you use /Oa
you specify that you will not be doing aliasing (which allows the compiler
to perform significant optimizations that might not otherwise have been
possible), and that function calls are safe. The /Ow option is similar to
the /Oa option, except that after a function call, pointer variables must be
reloaded from memory.

Here is an example of a program that would be a poor candidate for the /Oa
or /Ow optimization option:

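int g;

int add_em( int *p );   /* Prototype; add_em is defined after main. */

void main( void )
{
    add_em( &g );       /* Pass the address of g, so *p is an alias for g. */
}

int add_em( int *p )
{
    *p = 2;          /* Assign a value to an alias for g. */
    g = 3;           /* Assign a value directly to g. */
    return( *p + g );
}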

In the function  add_em, both  g  and  *p  refer to the same memory
location. This location is first assigned  2, then  3. The value pointed to
by  *p  (the alias for  g) is then added to  g, and the result is returned
to the main program. If you do not use the /Oa command-line option, the
compiler assumes that the reference to  *p  could refer to the same memory
location as does  g  and makes no attempt to use a register to store the
value of either. If, however, you do specify the /Oa option, the compiler
assumes that  g  and  *p  refer to different memory locations and stores
each in a different register. At the return statement,  g  will have a
different value than  *p, even though both aliases should actually contain
the same value.

Note that the compiler keeps values in registers for only a limited time. If
different aliases to a memory location occur in different functions, for
example, they will not cause unexpected results. When in doubt, avoid
aliasing.

Bugs involving aliasing are difficult to spot.

Aliasing bugs most frequently show up as corruption of data. If you find
that global or local variables are being assigned seemingly random values,
take the following steps to determine if you have a problem with
optimization and aliasing:

■   Compile the program with /Od (disable optimizations).

■   If the program works when compiled with the /Od option, check your
normal compile options for the /Oa option (assume no aliasing).

■   If you were using the /Oa option, fix your compile options so that /Oa
is not specified.

────────────────────────────────────────────────────────────────────────────
NOTE

You can instruct the compiler to disable optimizations that are unsafe with
code that does aliasing by using the optimize pragma with the a or w option.
────────────────────────────────────────────────────────────────────────────

1.5.4  Performing Loop Optimizations (/Ol)

The /Ol option enables a set of optimizations involving loops. Because loops
involve sections of code that are executed repeatedly, they are targets for
optimization. These optimizations all involve moving code or rewriting code
so that it executes faster.

Loop optimization can be turned on with the /Ol option or with the loop_opt
pragma. The following line enables loop optimization for all subsequent
functions:

#pragma loop_opt( on )

The following line turns it off:

#pragma loop_opt( off )

The /Ol option removes invariant code.

An optimal loop contains only expressions whose values change through each
execution of the loop. Any subexpression whose value is constant should be
evaluated before the body of the loop is executed. Unfortunately, these
subexpressions are not always readily apparent. The optimizer can remove
many of these expressions from the body of a loop at compile time. This
example illustrates invariant code in a loop:

i = -100;
while( i < 0 )
{
i += x + y;
}

In the preceding example, the expression  x + y  does not change in the loop
body. Loop optimization removes this subexpression from the body of the loop
so that it is only executed once, not every time the loop body is executed.
The optimizer will change the code to the following fragment:

i = -100;
t = x + y;
while( i < 0 )
{
i += t;
}

Loop optimization is much more effective when the compiler can assume no
aliasing. While you can use loop optimization without the /Oa or /Ow option,
specifying /Oa (when your code does no aliasing) lets the compiler perform as
many loop optimizations as possible.

Here is a code fragment that could have an aliasing problem:

i = -100;
while( i < 0 )
{
i += x + y;
*p = i;
}

If you do not specify the /Oa option, the compiler must assume that either
x  or  y  could be modified by the assignment to  *p. Therefore, the
compiler cannot assume the subexpression  x + y  is constant for each loop
iteration. If you specify that you are not doing any aliasing (with the /Oa
option), the compiler assumes that modifying  *p  cannot affect either  x
or  y, and that the subexpression is indeed constant and can be removed from
the loop, as in the previous example.
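
The effect can be sketched in portable C by writing the hoisted form by hand.
The function names and the small driver below are hypothetical, and the loop
is a simplified counting loop rather than the fragment above. The first
function rereads  *x  on every pass; the second evaluates  *x + y  once,
which is what a compiler may do when told that no aliasing takes place.

```c
/* Straightforward form: *x is reread on every pass, so a store through p
 * that happens to hit x is seen by the next iteration. */
int sum_reread( int *x, int y, int *p, int count )
{
    int total = 0;
    int n;

    for( n = 0; n < count; ++n )
    {
        total += *x + y;
        *p = total;
    }
    return( total );
}

/* Hand-hoisted form: the invariant is computed once, before the loop. */
int sum_hoisted( int *x, int y, int *p, int count )
{
    int total = 0;
    int t = *x + y;             /* Evaluated once, before the loop. */
    int n;

    for( n = 0; n < count; ++n )
    {
        total += t;
        *p = total;
    }
    return( total );
}

/* Runs both forms with x = 1, y = 1, count = 3.  When aliased is nonzero,
 * p points at x itself; otherwise p points at a separate variable.
 * Returns 1 if the two forms produce the same total. */
int forms_agree( int aliased )
{
    int x1 = 1, x2 = 1, other = 0;
    int a = sum_reread( &x1, 1, aliased ? &x1 : &other, 3 );
    int b = sum_hoisted( &x2, 1, aliased ? &x2 : &other, 3 );

    return( a == b );
}
```

Without aliasing, both forms return 6. When  p  points at  x, the rereading
form returns 11 and the hoisted form still returns 6, which is the kind of
discrepancy that /Oa can introduce in code that does use aliases.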

────────────────────────────────────────────────────────────────────────────
NOTE

All loop optimizations specified by the /Ol option or the loop_opt pragma
are safe optimizations. To enable aggressive loop optimizations, you must
use the enable aggressive optimizations (/Oz) option. While the
optimizations enabled by the combination of /Ol and /Oz are not safe for all
cases, they will work properly for most programs.
────────────────────────────────────────────────────────────────────────────

1.5.5  Disabling Unsafe Loop Optimizations (/On)

The disable unsafe loop optimizations (/On) option is an obsolescent option
and is only retained for compatibility with existing makefiles. Loop
optimizations are, by default, safe optimizations. The /On option is the
default and has the opposite effect of the /Oz (enable aggressive
optimizations) option.

1.5.6  Enabling Aggressive Optimizations (/Oz)

The compiler can perform extremely aggressive optimizations. These
optimizations produce high code quality both in terms of speed and size.
Certain programs, however, cannot be optimized with the technologies enabled
by the /Oz option. For these programs, you should not specify this option;
you can still use all other optimization options.

Because the optimization strategies enabled by the /Oz option are so
aggressive, they are not part of the maximum optimization (/Ox) option.

Examples of the effects of the /Oz option are

■   Loop optimization (/Ol). Loop optimization enables a technology that
anticipates program flow and tries to remove invariant expressions
from loops. When you specify the enable aggressive optimizations
option (/Oz), the compiler removes invariant expressions even when it
might cause an error. Errors with the enable aggressive optimizations
option occur most often when an invariant expression that can cause an
exception is protected by an if statement. The invariant expression is
hoisted out of the loop body, causing it to be evaluated prior to the
evaluation of the if statement that was designed to protect it. Here
are two examples that illustrate this problem:

    for( i = 0; i < 100; ++i )
        if( float_val != 0.0F )
            /* Protect against divide-by-zero. */
            float_result = pi / float_val;

    while( condition )
        if( ptr_val != NULL )
            /* Protect pointer dereference. */
            char_var = *ptr_val;

■   Global register allocation (/Oe). The enable aggressive optimizations
option enables some register allocation strategies that can cause
invalid segment selectors to be placed in registers. Although this
problem is benign in DOS, it causes protection faults in OS/2.
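
The loop hazard described in the first item can be sketched as follows. The
function name  last_valid_char  is hypothetical, and the comment at the end
shows only the general shape of a hoisted load, not actual compiler output.

```c
#include <stddef.h>

/* Hypothetical example: returns the character that ptr_val points to, or -1
 * if ptr_val is NULL.  The load is loop-invariant but guarded by the test. */
int last_valid_char( const char *ptr_val, int n )
{
    int char_var = -1;
    int i;

    for( i = 0; i < n; ++i )
    {
        if( ptr_val != NULL )       /* Guard protects the dereference. */
            char_var = *ptr_val;
    }
    return( char_var );
}

/* With aggressive hoisting, the load may effectively become
 *
 *     tmp = *ptr_val;              (evaluated before the loop and the test)
 *     for( ... )
 *         if( ptr_val != NULL )
 *             char_var = tmp;
 *
 * which dereferences a null pointer when ptr_val is NULL. */
```

If a function depends on a guard like this one, leaving /Oz off for that
function keeps the dereference behind the test.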

────────────────────────────────────────────────────────────────────────────
NOTE

You can instruct the compiler to enable aggressive optimizations on a
function-by-function basis by using the optimize pragma with the z option.
────────────────────────────────────────────────────────────────────────────

1.5.7  Removing Stack Probes (/Gs)

Every time a function is called, the stack provides space for all parameters
and local variables declared in that function. A short assembly function
that checks for a stack overflow condition is then called. Stack overflows
are usually caused either by infinite loops or by runaway recursive
routines. Such errors can also be caused by extremely large parameters or
local variables.

Stack probes can be important during program development. Stack-overflow
errors alert you to problems in your code. When the program has been tested,
however, stack checking often becomes unnecessary. The compiler allows you
to remove stack-checking code with either the /Gs option or the check_stack
pragma. Eliminating stack probes produces programs that are smaller and that
run more quickly.
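
A minimal sketch of the pragma form follows. The text above names the
check_stack pragma without giving its syntax, so the on/off argument form
shown here is an assumption, as are the function name  clamp  and the
_MSC_VER guard.

```c
#ifdef _MSC_VER
#pragma check_stack( off )      /* No stack probe for functions that follow. */
#endif

/* Hypothetical, already-tested leaf routine that is called very often. */
int clamp( int value, int low, int high )
{
    if( value < low )
        return( low );
    if( value > high )
        return( high );
    return( value );
}

#ifdef _MSC_VER
#pragma check_stack( on )       /* Restore stack probes for later functions. */
#endif
```

Small, frequently called routines with few local variables are the usual
candidates, because the probe is a larger share of their total cost.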

1.5.8  Enabling Global Register Allocation (/Oe)

The global register allocation option (/Oe) instructs the compiler to
analyze your program and allocate CPU registers as efficiently as possible.
Without the global register allocation option, the compiler uses the CPU's
registers for several purposes:

■   Holding temporary copies of variables

■   Holding variables declared with the register keyword

■   Passing parameters to functions declared with the _fastcall keyword
(or functions in programs compiled with the /Gr command-line option)

When you enable global register allocation, the compiler ignores the
register keyword and allocates register storage to variables (and possibly
to common subexpressions). The compiler allocates register storage to
variables or subexpressions according to frequency of use. Because of the
limited number of physical registers, variables held in registers are
sometimes placed back in memory to free the register for another use. Here
is a C program example that demonstrates how the compiler might rewrite your
code to accomplish this:

/* Original program */

func()
{
    int i, j;
    char *pc;

    for( i = 0; i < 1000; ++i )
    {
        j = i / 3;
        *pc++ = (char)i;
    }

    for( j = 0, --pc; j < 1000; ++j, --pc )
        *pc--;
}

/* Example of how the compiler might optimize the
 * code to move i and j in and out of registers */

func()
{
    int i, j;
    char *pc;

    {
        register int i; /* i is in a register for this block. */
        for( i = 0; i < 1000; ++i )
        {
            j = i / 3;
            *pc++ = (char)i;
        }
    }

    {
        register int j; /* j is in a register for this block. */
        for( j = 0, --pc; j < 1000; ++j, --pc )
            *pc--;
    }
}

In the preceding example, there are blocks (enclosed in curly braces) whose
only purpose is to delimit the span of code across which variables should
remain in registers.

────────────────────────────────────────────────────────────────────────────
NOTE

You can enable or disable global register allocation on a
function-by-function basis using the optimize pragma with the e option.
────────────────────────────────────────────────────────────────────────────

1.5.9  Enabling Common Subexpression Optimization (/Oc and /Og)

When you use option /Og (enable global common subexpression optimizations),
the compiler searches entire functions for common subexpressions. Option /Oc
(default common subexpression optimization) examines only short sections of
code for common subexpressions. You can disable default common subexpression
optimization with the /Od option. For more information about common
subexpression optimization, see Section 1.4, "Default Optimization."

────────────────────────────────────────────────────────────────────────────
NOTE

You can enable or disable block-scope common subexpression optimization on a
function-by-function basis using the optimize pragma with the c option. You
can enable or disable global common subexpression optimization on a
function-by-function basis using the optimize pragma with the g option.
────────────────────────────────────────────────────────────────────────────

1.5.10  Achieving Consistent Floating-Point Results (/Op)

Floating-point numbers stored in memory use either 32, 64, or 80 bits,
depending on whether they are of type float, type double, or type long
double. The 80x87 family of coprocessors uses 80-bit registers for all
operations. If a value of type float or type double is kept in these
registers through a number of operations, it will be more accurate than if
that value is moved to and from memory between operations.

Because of the difference in precision between memory and register
representation of a floating-point number, a value stored in memory is not
always equal to the same value in the 80x87 register.

The difference in precision primarily affects strict equality or strict
inequality tests (== and !=); however, relational tests of magnitude (>, >=,
<, and <=) can behave erroneously if the coprocessor is able to maintain
significant digits that memory variables cannot.

You can avoid the difference in precision by using the /Op option. This
option forces floating-point values to be written to memory between
floating-point operations. While storing these values to memory reduces the
precision of floating-point expressions, it also ensures that these
expressions will produce consistent results regardless of the rest of the
code.

You can change the handling of floating-point results on a
function-by-function basis using the optimize pragma with the p option.

────────────────────────────────────────────────────────────────────────────
NOTE

Using the /Op option suppresses other optimizations because the
floating-point registers are not available for storage of intermediate
results. Because you suppress these optimizations, code compiled with the
/Op option executes more slowly than code compiled without this option.
Careful coding practices, especially in tests of strict equality and
inequality, can alleviate the need for this option.
────────────────────────────────────────────────────────────────────────────
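
One such coding practice is to compare floating-point values against a
tolerance instead of testing for strict equality. The helper below is a
minimal sketch; the name  nearly_equal  and the choice of an absolute
tolerance are assumptions, and a suitable tolerance depends on the magnitude
of the values in your program.

```c
/* Hypothetical helper: returns 1 if a and b differ by no more than tol. */
int nearly_equal( double a, double b, double tol )
{
    double diff = a - b;

    if( diff < 0.0 )
        diff = -diff;           /* Absolute difference without a library call. */
    return( diff <= tol );
}
```

A test such as  nearly_equal( x, y, 1e-9 )  gives the same answer whether an
operand was kept in an 80-bit register or rounded to 64 bits in memory, as
long as the rounding difference is smaller than the tolerance.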

1.5.11  Using the 80186, 80188, or 80286 Processor (/G0, /G1, /G2)

The compiler generates 8086 object code (/G0) unless you take special steps.
Because the newer processors (the 80186, 80188, and 80286) are
backward-compatible with the 8086 instruction set, using this instruction set
ensures compatibility with all 80x86-based computers. While you gain
compatibility across the entire family of 80x86 processors, you lose the
advantage of some of the more powerful instructions in the newer processors.

If you know your program will only be running on an 80186, 80188, or 80286
processor, you can cause the compiler to generate instructions specific to
these processors. These instructions increase the speed of your program, but
you lose compatibility with machines that use older processors in the 80x86
family. Table 1.1 lists the options for processor-specific code generation:

Table 1.1  Processor Compatibility

Command-Line Option    Compatible Processors
────────────────────────────────────────────────────────────────────────────
/G0                    8088, 8086, 80188, 80186, 80286, 80386, 80486
/G1                    80188, 80186, 80286, 80386, 80486
/G2                    80286, 80386, 80486
────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────
NOTE

When developing only for OS/2, always use the /G2 option, because OS/2 does
not run on the 8086, 8088, 80186, or 80188. Do not use /G2 for Family
Applications because they might be run on machines with 8088, 8086, 80188,
or 80186 processors.
────────────────────────────────────────────────────────────────────────────

1.5.12  Optimizing for Maximum Efficiency (/Ox)

The /Ox option combines a number of different optimizations:

■   Enable global register allocation (/Oe)

■   Enable global common subexpression optimization (/Og)

■   Enable block-scoped common subexpression optimization (/Oc)

■   Generate intrinsic functions (/Oi)

■   Perform loop optimizations (/Ol)

■   Optimize for speed (/Ot)

■   Remove stack probes (/Gs)

Use /Ozax /Gr to get the fastest program.

The /Ox option does not include several optimizations that can improve code
efficiency: /Oa (assume no aliasing), /Oz (enable aggressive optimizations),
and /Gr (use fastcall calling convention). Before enabling these
optimizations, you should read the sections that describe the /Oa and /Oz
options and the fastcall calling convention to determine if they are
appropriate for your application.

Use the optimize pragma to reduce code size.

If you are more concerned with executable file size than execution time, use
the /Ox and /Gs options, then issue the optimize pragma as follows:

#pragma optimize( "t", off )

This set of options produces the smallest possible code, while also
performing some speed optimizations.

1.6  Linker (LINK) Options that Control Optimization

Most code optimization is performed before the object file is produced.
There are four optimizations that the linker can perform to speed program
execution and reduce the disk space used by an executable file.

1.6.1  Enabling Far Call Optimization (/FARCALLTRANSLATION)

You can call a function two ways. In a far call, the function is called
using both the segment and the offset of the function. This allows a program
to call a routine outside a 64K segment. In a near call, both the calling
statement and the function must be located in the same segment. Only the
offset is used to access the function; the segment address is implicit. You
can only use near calls to routines located in the same segment.

Because of the architecture of the processor, near function calls execute
faster than far calls. The decision to declare functions as near or far is
often made when selecting a memory model. As it is difficult to determine
where the linker will place a given function in memory, it is impractical
for the programmer to choose the way a function is called.

Use /FARCALLTRANSLATION with medium, large, and huge model programs.

The /FARCALLTRANSLATION option enables far call optimization. When you use
this option, any function calls within the same segment as the function
being called are converted to near calls. This optimization has no effect if
you have selected the tiny, small, or compact model, because all calls are
already near calls in those models.

The abbreviation for the /FARCALLTRANSLATION option is /F.

How /FARCALLTRANSLATION Affects Your Code

The linker can perform a form of post-optimization (an optimization that
occurs after most of the actual code generation is complete) that translates
far calls into near calls when possible. This optimization allows a given
function to be called with both near and far calls in the same program. To
perform this translation, the linker takes a section of object code such as

CALL    FAR    _func

where  func  is defined in the current segment, and replaces it with the
following code:

PUSH    CS
CALL    NEAR    _func
NOP

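This substitution works because  PUSH CS  places the current code segment on
the stack before the near call pushes the return offset. Together they form
the far return address that the function's far return instruction expects.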

Use /FARCALLTRANSLATION with /PACKCODE.

The /FARCALLTRANSLATION option is most effective when used in conjunction
with the /PACKCODE option discussed in Section 1.6.2. Using the /PACKCODE
option causes far calls that were intersegment to become intrasegment calls.
The /FARCALLTRANSLATION feature can then take advantage of the new grouping
to translate all intrasegment far calls into near calls.

Benefits of /FARCALLTRANSLATION

The /FARCALLTRANSLATION option is of significant benefit to protected-mode
programs. Table 1.2 illustrates why.

Table 1.2  Processor Clock Cycles for Calling Sequence

                        Cycles (Real Mode)    Cycles (Protected Mode)
Instructions            286        386        286        386
────────────────────────────────────────────────────────────────────────────
Far Function Call

CALL FAR PTR _func      13         17         26         34

Total                   13         17         26         34
────────────────────────────────────────────────────────────────────────────
Near Function Call

PUSH CS                 3          2          3          2

CALL NEAR PTR _func     7          7          7          7

NOP                     3          3          3          3

Total                   13         12         13         12
────────────────────────────────────────────────────────────────────────────
Savings                 0          5          13         22
────────────────────────────────────────────────────────────────────────────
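
The Savings row of Table 1.2 is the far-call cost minus the sum of the three
translated instructions. A small sketch of that arithmetic follows; the
structure and function names are illustrative only, and the cycle counts are
copied from the table.

```c
/* Cycle counts for one processor/mode column of Table 1.2. */
struct call_cost {
    int far_call;    /* CALL FAR PTR _func  */
    int push_cs;     /* PUSH CS             */
    int near_call;   /* CALL NEAR PTR _func */
    int nop;         /* NOP                 */
};

/* Cost of the translated three-instruction sequence. */
int translated_cost(const struct call_cost *c)
{
    return c->push_cs + c->near_call + c->nop;
}

/* Cycles saved by translating one far call. */
int cycles_saved(const struct call_cost *c)
{
    return c->far_call - translated_cost(c);
}

static const struct call_cost real_286 = { 13, 3, 7, 3 };
static const struct call_cost real_386 = { 17, 2, 7, 3 };
static const struct call_cost prot_286 = { 26, 3, 7, 3 };
static const struct call_cost prot_386 = { 34, 2, 7, 3 };
```

In protected mode the far call costs twice as much as in real mode while the
translated sequence costs the same, which is why the savings are largest
there.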

1.6.2  Packing Code (/PACKCODE)

The /PACKCODE linker option groups neighboring code segments together. When
used with the /F option, the /PACKCODE option greatly increases the number
of near calls that can be made to a function. This option can be followed
with a limit (expressed in bytes) at which to stop packing and to begin a
new group. Here is the syntax for the /PACKCODE option:

/PACKCODE:number

where number is an optional hexadecimal, octal, or decimal number that
specifies the limit for packing. The radix (octal, decimal, or hexadecimal)
is specified just as you would specify it to a C program.

Radix                             Rules for Specification
────────────────────────────────────────────────────────────────────────────
Octal                             Specify the octal number with a leading
                                  0. You can only use the digits 0 through
                                  7 in an octal number. For example, 07777.

Decimal                           Specify the decimal number without a
                                  leading 0. For example, 65530.

Hexadecimal                       Specify the hexadecimal number with a
                                  leading 0x. For example, 0x3FFF.

If you omit the packing limit, the linker supplies a default value of
65,530.
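
The radix rules above are the same ones the standard C library function
strtoul applies when it is given a base of 0. The sketch below uses that
function to interpret a packing limit; parse_pack_limit is a hypothetical
helper written for illustration and is not part of the linker.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses a packing limit written in C notation. Returns 0 on success and
   stores the value in *limit; returns -1 if the text is not a valid
   number. */
int parse_pack_limit(const char *text, unsigned long *limit)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 0);  /* base 0: 0x = hex, leading 0 = octal */
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;
    *limit = value;
    return 0;
}
```

With this helper, 07777 yields 4095, 65530 yields 65530, and 0x3FFF yields
16383. A string such as 089 is rejected, because 8 and 9 are not octal
digits.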

The abbreviation for the /PACKCODE option is /PACKC.

1.6.3  Packing Data (/PACKDATA)

The /PACKDATA option is analogous to the /PACKCODE option, except that it
groups together neighboring data segments instead of code segments. This
option is most useful when you have a large-model program that exceeds the
OS/2 limitation of 255 segments. By using /PACKDATA, you can group segments,
thereby reducing the total number OS/2 has to manage. Here is the syntax for
the /PACKDATA option:

/PACKDATA:number

where number is an optional hexadecimal, octal, or decimal number that
specifies the limit for packing. The radix (hexadecimal, octal, or decimal)
is specified just as you would specify it to a C program. For more
information on specifying hexadecimal, octal, or decimal numbers, see
Section 1.6.2 above.

If the packing limit is omitted, the linker supplies a default value of
65,535 (0xFFFF).

The abbreviation for the /PACKDATA option is /PACKD.

1.6.4  Packing the Executable File (/EXEPACK)

The executable file created by the compiler often contains sequences of
repeated bytes. You can remove these repeated sequences with the /EXEPACK
option. This decreases the size of the resulting executable file as well as
the time required to load it.

────────────────────────────────────────────────────────────────────────────
WARNING

Because the /EXEPACK option removes debug information from the executable
file, you should not use it with the /CODEVIEW option.
────────────────────────────────────────────────────────────────────────────

1.7  Optimizing in Different Environments

The environment in which you plan to use a program can have a bearing on the
types of optimizations that you should use.

1.7.1  Optimizing in DOS

You need not take special precautions for programs written under DOS unless
you are writing a terminate-and-stay-resident (TSR) program. If an
interrupt-driven routine could modify a memory location in a program, you
should declare that variable volatile.
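
As a minimal sketch of this rule, the fragment below shares a flag between a
main program and a routine that stands in for an interrupt handler. The names
key_ready, on_key_interrupt, and poll_for_key are hypothetical, and the
"handler" is an ordinary function so that the example stays portable. Without
volatile, an optimizing compiler could keep the flag in a register and never
see the handler's update.

```c
/* Set by the (hypothetical) interrupt handler, read by the main program.
   volatile forces a fresh load from memory on every access. */
static volatile int key_ready = 0;

/* Stand-in for an interrupt service routine. */
void on_key_interrupt(void)
{
    key_ready = 1;
}

/* Polls the flag up to max_polls times; returns 1 once it is set and
   0 if it was never set. */
int poll_for_key(long max_polls)
{
    while (max_polls-- > 0) {
        if (key_ready)
            return 1;
    }
    return 0;
}
```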

1.7.2  Optimizing in OS/2

Many of the rules for interrupt routines apply to OS/2. If one thread can
modify variables in another thread, declare these variables as volatile.

1.7.3  Optimizing in Microsoft Windows(tm)

Microsoft Windows(tm) can move segments dynamically. As a result of dynamic
heap compaction, pointers maintained in registers can be invalidated. The
/Ow option instructs the compiler that you will not be using aliases, but
that Windows might cause certain optimizations to be unsafe across function
calls.

Even if you are not using any aliases, you must still use the /Ow option with
Windows programs. See Section 1.5.3, "Assuming No Aliasing (/Oa and /Ow),"
for more information.

1.8  Choosing Function-Calling Conventions

In Microsoft C, version 6.0, functions can call other functions using three
different conventions. Note that, while no calling convention has been
defined as "standard," most C compilers use conventions similar to those
described here. The C calling convention requires the most object code to
set up, but it is the only calling convention that supports functions with
variable-length argument lists. The FORTRAN/Pascal calling convention is
more compact, but does not allow for variable-length argument lists. The
_fastcall, or register, calling convention is the fastest of the three
calling conventions, but it does not support variable-length argument lists
or mixed-language program interfaces.

1.8.1  The C Calling Convention (/Gd)

Because C allows functions to have a variable number of parameters,
parameters must be pushed onto the stack from right to left. (If parameters
were pushed from left to right, it would be difficult for the compiler to
determine which parameter was first.) If you do not specify command-line
options that modify the function-calling convention, the C calling
convention is used; otherwise, the _cdecl keyword must be used before any
function using the C calling convention.
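
A function with a variable number of parameters relies on finding its first
parameter at a known place, which is what right-to-left pushing provides. The
sketch below uses the standard stdarg facilities; sum_ints is an illustrative
name, not a library routine.

```c
#include <stdarg.h>

/* Adds up `count` int arguments. The fixed parameter `count` tells the
   function how many more arguments to read. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);      /* start just past the last fixed parameter */
    while (count-- > 0)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. Only the caller knows how many
arguments it passed, so a function like this must use the C calling
convention.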

If, for example, you use the /Gr (register calling convention) option when
you compile, and the function  add_two  must have the C calling convention,
declare  add_two  as follows:

int _cdecl add_two( int x, int y );

1.8.2  The FORTRAN/Pascal Calling Convention (/Gc)

Use the FORTRAN/Pascal calling convention for any functions declared with
either the _fortran or _pascal keywords. (The two keywords currently produce
identical results.) Parameters to these functions are always pushed on the
stack from left to right. While any function can be declared with the
FORTRAN/Pascal convention, it is used primarily for prototypes to Pascal or
FORTRAN routines called from within C programs. This calling convention can
also produce smaller, faster programs.

The /Gc option (generate Pascal-style function calls) can be used to make
all functions in a file observe the FORTRAN/Pascal calling convention.

Note that C run-time library routines must still be called using C calling
conventions. Because these routines are declared using the _cdecl keyword in
the header files, you must include the appropriate header files in any program
using run-time library routines.

Functions with variable-length parameter lists (such as printf) cannot use
the FORTRAN/Pascal calling convention.

────────────────────────────────────────────────────────────────────────────
NOTE

The /ML, /MD, and /MT options cause all floating-point functions to be
declared as FORTRAN/Pascal. See Chapter 16, "Dynamic Linking with OS/2," for
more information.
────────────────────────────────────────────────────────────────────────────

1.8.3  The Register Calling Convention (/Gr)

You can decrease execution time if parameters to functions are passed in
registers rather than on the stack. Compiling with the /Gr command-line
option enables the register calling convention for an entire file. The
_fastcall keyword enables the register calling convention on a
function-by-function basis.

Because the 80x86 processor has a limited number of registers, only the
first three parameters are allocated to registers; the rest are passed using
the FORTRAN/Pascal calling convention. The register calling convention can
increase the speed of a program.

────────────────────────────────────────────────────────────────────────────
NOTE

The compiler allocates different registers for variables declared as
register and for passing arguments using the register calling convention.
This calling convention will not conflict with any register variables that
you may have declared.
────────────────────────────────────────────────────────────────────────────

Exercise caution when using the register calling convention for any function
written in in-line assembly language. Your use of registers in
assembly language could conflict with the compiler's use of registers for
storing parameters.

1.8.4  The _fastcall Calling Convention

This section describes the details of the _fastcall calling convention. The
information is for the use of assembly-language programmers who are
interested in using either the in-line assembler or the Microsoft Macro
Assembler (MASM) to write functions declared as _fastcall. Functions
declared as _fastcall accept arguments in registers rather than on the
stack; functions declared as _cdecl or _pascal accept parameters only on the
stack.

────────────────────────────────────────────────────────────────────────────
WARNING

The register usage documented here applies only to Microsoft C, version 6.0.
It may change in future releases of the compiler.
────────────────────────────────────────────────────────────────────────────

Argument-Passing Convention

The _fastcall calling convention is a "strongly typed" register calling
convention. This typing allows the compiler to generate better code by
passing arguments in registers that correspond to the data type you are
passing. Because the compiler chooses registers depending on the type of the
argument and not in a strict linear order, the calling program and called
function must agree on the types of the arguments in order to communicate
data correctly.

For each type of argument there is a list of register candidates. The
arguments are allocated to registers or, if no suitable register remains
unused, are pushed onto the stack left-to-right. Each argument is put in the
first register candidate that does not already contain an argument. Table
1.3 shows the basic types and the register candidate list for each.

Table 1.3  Register Candidates

Type                             Register Candidates
────────────────────────────────────────────────────────────────────────────
character                        AL, DL, BL
unsigned character               AL, DL, BL
integer                          AX, DX, BX
unsigned integer                 AX, DX, BX
long integer                     DX:AX
unsigned long integer            DX:AX
near pointer                     BX, AX, DX
far or huge pointer              passed on the stack
────────────────────────────────────────────────────────────────────────────

All far and huge pointers are pushed on the stack, as are all structures,
unions, and floating-point types.
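
The allocation rule can be modeled in a few lines. This is a sketch under
stated assumptions, not the compiler's algorithm: it assumes that AL and AX
(and likewise DL/DX, BL/BX) count as the same register for occupancy, and
that a long integer goes in DX:AX only when both halves are still free. All
type and function names are hypothetical.

```c
#include <stddef.h>

enum arg_type { ARG_CHAR, ARG_INT, ARG_LONG, ARG_NEAR_PTR, ARG_OTHER };

/* Where one argument ends up. LOC_A means AX (or AL), and so on. */
enum arg_loc { LOC_A, LOC_D, LOC_B, LOC_DX_AX, LOC_STACK };

/* Assigns a location to each of n arguments, working left to right. */
void assign_fastcall(const enum arg_type *types, enum arg_loc *locs, size_t n)
{
    /* Candidate order from Table 1.3; char uses the low byte of each. */
    static const enum arg_loc int_order[3] = { LOC_A, LOC_D, LOC_B };
    static const enum arg_loc ptr_order[3] = { LOC_B, LOC_A, LOC_D };
    int used[3] = { 0, 0, 0 };   /* indexed by LOC_A, LOC_D, LOC_B */
    size_t i;
    int k;

    for (i = 0; i < n; i++) {
        const enum arg_loc *order = NULL;
        locs[i] = LOC_STACK;     /* default when no candidate is free */

        if (types[i] == ARG_CHAR || types[i] == ARG_INT)
            order = int_order;
        else if (types[i] == ARG_NEAR_PTR)
            order = ptr_order;
        else if (types[i] == ARG_LONG && !used[LOC_A] && !used[LOC_D]) {
            used[LOC_A] = used[LOC_D] = 1;
            locs[i] = LOC_DX_AX;
        }

        /* Take the first candidate that holds no argument yet. */
        for (k = 0; order != NULL && k < 3; k++) {
            if (!used[order[k]]) {
                used[order[k]] = 1;
                locs[i] = order[k];
                break;
            }
        }
    }
}
```

Under these assumptions, the argument list (int, char, long, near pointer)
is assigned AX, DL, the stack, and BX respectively, which shows why the
caller and the called function must agree on the argument types.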

Return Value Convention

The _fastcall return value convention is based on the size of the return
value, except with floating-point types. All floating point types are
returned on the top of the NDP stack. For more information about the NDP
stack and returning floating-point values, see Chapter 4, "Controlling
Floating-Point Math Operations." The following list shows how values 4 bytes
or smaller, including unions and structures, are returned from a _fastcall
function.

Size                              Return Convention
────────────────────────────────────────────────────────────────────────────
1 Byte                            AL Register

2 Bytes                           AX Register

4 Bytes                           DX, AX Registers (for pointers, the
                                  segment is returned in DX, the offset in
                                  AX; for long integers, the
                                  most-significant word is returned in DX,
                                  the least-significant word in AX)

Note that the protocol for returning values 4 bytes or smaller is the same
as for functions declared as _cdecl. To return structures and unions larger
than 4 bytes, the calling program passes a hidden parameter as the last item
pushed. This parameter is a near pointer, implicitly SS-relative, to a
buffer in which the value is to be returned. A far pointer to
SS:hidden-param must be returned in DX:AX. This is the same convention for
returning structures as _pascal.

Unlike functions declared as _cdecl, functions declared as _fastcall must
pop the arguments off the stack. The calling program does not adjust the
stack after function return.

Register Preservation Requirement

All functions must preserve the DS, BP, SI, and DI registers. Your _fastcall
function can modify the values in AX, BX, CX, DX, and ES.

Function-Naming Convention

The public name put into the object file for a function declared as
_fastcall is the name given by the user with a leading "at sign" (@). No
case translation is performed on the function name. The function declaration

int _fastcall FCFunc( void );

causes the compiler to place the public symbol  @FCFunc  in your object file
at every location  FCFunc  is referenced in your program.
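
When tracking down an unresolved external, it can help to compare the public
names each convention produces. The helper below is a hypothetical sketch:
the @ prefix comes from this section, the leading underscore for the C
convention matches the _func symbol shown earlier, and the uppercase form for
the FORTRAN/Pascal convention is an assumption not stated in this section.

```c
#include <ctype.h>
#include <stdio.h>

enum convention { CONV_CDECL, CONV_PASCAL, CONV_FASTCALL };

/* Writes the expected public symbol for `name` into out (out_size bytes). */
void public_name(enum convention conv, const char *name,
                 char *out, size_t out_size)
{
    size_t i;

    if (out_size == 0)
        return;

    switch (conv) {
    case CONV_FASTCALL:
        snprintf(out, out_size, "@%s", name);   /* leading @, case kept */
        break;
    case CONV_PASCAL:
        snprintf(out, out_size, "%s", name);    /* assumed: upper case  */
        for (i = 0; out[i] != '\0'; i++)
            out[i] = (char)toupper((unsigned char)out[i]);
        break;
    default:
        snprintf(out, out_size, "_%s", name);   /* leading underscore   */
        break;
    }
}
```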

If you do not declare the function as _fastcall in your C program, the
compiler assumes the default calling convention. The default is usually the
C calling convention but can be changed by the /Gc (Pascal Calling
Convention), /Gr (Register Calling Convention), or /Gd (C Calling
Convention) options. If the linker gives you an unresolved external
reference, you may have failed to declare an external _fastcall function
properly. For more information about calling conventions, see Chapter 12,
"Programming with Mixed Languages."

Chapter 2  Managing Memory
────────────────────────────────────────────────────────────────────────────

When you develop advanced applications in Microsoft C, you must pay
attention to memory management, that is, how data and code are stored and
accessed in memory. A well-thought-out memory strategy will make your
programs run faster and occupy less memory.

You can follow one or more of these memory management strategies:

■   Choose a standard memory model.

■   Create a mixed-model program with the _near, _far, _huge, and _based
keywords.

■   Create your own customized memory model.

■   Allocate memory as you need it with the malloc family of functions.

This chapter explains pointers, memory models (including the new tiny
model), variations such as custom memory models and mixed models, and based
pointers.

2.1  Pointer Sizes

One of the strengths of the C language is that it allows you to use pointers
to directly access memory locations.

Every Microsoft C program has at least two parts: the code (function
definitions) and the data (variables and constants). As a program runs, it
refers to elements of the code or the data by their addresses. These
addresses can be stored in pointer variables.

Pointer variables can fit into 16 bits or 32 bits, depending on the distance
of the object to which they refer.

2.1.1  Pointers and 64K Segments

IBM personal computers and compatibles use the Intel(R) 8086, 80186, 80286,
or 80386 processors (collectively called the 80x86 family). These processors
have a "segmented" architecture, which means they all have a mode that
treats memory as a series of segments, each of which occupies up to 64K of
memory. An offset from the base of the segment allows you to access
information within a given segment. Moving to a new segment requires
loading a new segment address into a segment register.

A 16-bit pointer can address up to 65,536 locations.

The 64K limit is necessary because the 80x86 registers are 16 bits (2 bytes)
wide. A single register can address only 65,536 (64K) unique memory
locations.

A pointer variable that fully specifies a memory address needs 16 bits for
the segment location and another 16 bits for the offset within the segment,
a total of 32 bits. However, if you have several variables in the same
general area, your program can set the segment register once and treat the
pointers as smaller 16-bit quantities.

The 80x86 register CS holds the base for the code segment; the register DS
holds the base for the data segment. Two other segment registers are
available: the stack segment register (SS) and the extra segment register
(ES). (The 80386 has additional segment registers: FS and GS.)
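
The relationship between a segment, an offset, and a memory location can be
sketched as follows. The formula applies to real mode only (DOS); in
protected mode the segment value is a selector and the calculation is done by
the processor through descriptor tables. The function names are illustrative.

```c
#include <stdint.h>

/* Real-mode physical address: segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* A near reference supplies only the 16-bit offset; the segment comes
   from a register (DS for data) that was loaded earlier. */
uint32_t near_data_address(uint16_t ds, uint16_t offset)
{
    return real_mode_address(ds, offset);
}
```

For example, segment 0x1234 with offset 0x0010 refers to physical address
0x12350. A near pointer stores only the 0x0010 part.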

2.1.2  Near Pointers

If you don't explicitly specify a memory model, Microsoft C defaults to the
small model, which allots up to 64K for the code and another 64K for the
data (see Figure 2.1).

(This figure may be found in the printed book.)

When a small-model program runs, the CS and DS segment registers never
change. All code pointers and all data pointers contain 16 bits because they
remain within the 64K range.

These 16-bit pointers to objects within a single 64K segment are called
"near pointers." Accessing a near object is called "near addressing."

2.1.3  Far Pointers

If your program needs more than 64K for code or data, at least some of the
pointers must specify the memory segment, which means these pointers occupy
32 bits instead of 16 bits.

These larger 32-bit pointers that can point anywhere in memory are called
"far pointers." Accessing a far object is called "far addressing."

Far pointers can address any location, but they are bigger and slower.

The advantage of far pointers is that they can address any available
memory location, up to 640K in DOS or several megabytes in OS/2. The
disadvantages of the larger far pointers are that they take up more memory
(four bytes instead of two) and that any use of the pointers (assigning,
modifying, or otherwise accessing values) takes more time.

Allowing either code or data to expand beyond 64K makes your programs larger
and slower.

2.1.4  Huge Pointers

A third type of pointer in Microsoft C is the "huge" pointer, which applies
only to data pointers. Code pointers cannot be declared as huge.

A huge address is similar to a far address in that both contain 32 bits,
made up of a segment value and an offset value. They differ only in the way
pointer arithmetic is performed.

For far pointers, Microsoft C assumes that code and data objects lie
completely within the segment in which they start, so pointer arithmetic
operates only on the offset portion of the address. Limiting the size of any
single item to 64K makes pointer arithmetic faster.

Huge pointers overcome this size limitation; pointer arithmetic is performed
on all 32 bits of the data item's address, thus allowing data items
referenced by huge pointers to span more than one segment. In this code
fragment,

int _huge *hp;
int _far *fp;
.
.
.
hp++;
fp++;

both  hp  and  fp  are incremented. The huge pointer is incremented as a
32-bit value that represents the combined segment and offset. Only the
offset part of the far pointer (a 16-bit value) is incremented.

Extending the size of pointer arithmetic from 16 to 32 bits causes such
arithmetic to execute more slowly. You gain the use of larger arrays by
paying a price in execution speed.
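
The difference between the two kinds of arithmetic can be simulated in
portable C. This is a sketch, not the code the compiler generates: it assumes
real mode, where moving 64K forward means adding 0x1000 to the segment value,
and the structure and function names are hypothetical.

```c
#include <stdint.h>

struct seg_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Far arithmetic: only the 16-bit offset changes, so it wraps at 64K. */
struct seg_ptr far_add(struct seg_ptr p, uint32_t bytes)
{
    p.off = (uint16_t)(p.off + bytes);
    return p;
}

/* Huge arithmetic (real-mode sketch): a carry out of the offset moves
   the segment forward by 0x1000 paragraphs, that is, 64K bytes. */
struct seg_ptr huge_add(struct seg_ptr p, uint32_t bytes)
{
    uint32_t sum = (uint32_t)p.off + bytes;

    p.off = (uint16_t)(sum & 0xFFFFu);
    p.seg = (uint16_t)(p.seg + 0x1000u * (sum >> 16));
    return p;
}
```

Starting from 2000:FFFE, adding 2 with far arithmetic wraps to 2000:0000,
back at the start of the same segment. Huge arithmetic gives 3000:0000, the
next byte in memory.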

When you declare near, far, and huge variables, the Microsoft C compiler and
linker automatically manage details such as allocating memory and keeping
track of segments.

A "based pointer" is a fourth kind of pointer that operates as a 16-bit
offset from a base that you specify. In this respect, based addressing
differs from near, far, or huge addressing; you're responsible for naming
the base, instead of letting the compiler decide.

Based pointers are new to version 6.0 of Microsoft C. They are explained in
more detail in Section 2.5, "Using Based Variables."

2.2  Selecting a Standard Memory Model

If you want to choose one size for all pointers, there's no need to declare
each variable as near or far. Instead, you select a standard memory model
and your choice applies to all variables in the program.

One advantage of using standard memory models is simplicity. You specify the
way the compiler allocates storage for code and data only once.

A standard memory model assumes all pointers are the same size.

Another advantage is that the standard memory models do not require the use
of Microsoft-specific keywords such as _near and _far, so they are best for
writing code that is portable to other (non-DOS) systems.

The disadvantage of standard memory models is that, because they make global
assumptions about the environment, they do not always produce the most
efficient code.

2.2.1  The Six Standard Memory Models

The six Microsoft C memory models are shown in Table 2.1.

Table 2.1  Memory Models

          Maximum Total Memory
Model     Code                  Data       Data Arrays
────────────────────────────────────────────────────────────────────────────
Tiny      <64K                  <64K       <64K
Small     64K                   64K        64K
Medium    No limit              64K        64K
Compact   64K                   No limit   64K
Large     No limit              No limit   64K
Huge      No limit              No limit   No limit
────────────────────────────────────────────────────────────────────────────

The SETUP program creates the libraries that support the six standard memory
models.

When you choose one of the standard memory models, the compiler inserts the
name of the corresponding C run-time library in the object file so the
linker chooses it automatically. Each memory model has its own library,
except for the huge memory model (which uses the large-model library) and
the tiny model (which uses the small-model library).

2.2.2  Limitations on Code Size and Data Size

When writing a program in Microsoft C, keep in mind two limitations that
apply to all six memory models:

■   No single source module can generate 64K or more of code. You must
break large programs into modules and link their individual .OBJ files
to create the .EXE file.

■   No single data item can exceed 64K unless it appears in a huge-model
program or it has been declared with the _huge keyword.

2.2.3  The Tiny Memory Model

The tiny memory model is new to Microsoft C. It resembles the small model
with three exceptions:

■   The tiny model cannot exceed 64K per program (including both code and
data). A small-model program, on the other hand, can occupy up to
128K: 64K for code and 64K for data.

■   The tiny model produces .COM, rather than .EXE, files. To produce .COM
files, compile with the /AT option. Then link with the /TINY option
and link in CRTCOM.OBJ.

■   The tiny model applies to DOS only; it is not available in OS/2.

Although the tiny model imposes the most severe limits on code and data
size, it produces the smallest programs. Its only speed advantage over the
small model is at load time; both models produce the fastest-running
programs.

2.2.4  The Huge Memory Model

The huge memory model is nearly identical to the large model. The only
difference is that the huge model permits individual arrays to exceed 64K in
size. For example, an int uses two bytes, so an array of 40,000 integers,
occupying 80,000 bytes of memory, would be permitted in the huge model. All
other models limit each array, structure, or other data object to no more
than 64K.

────────────────────────────────────────────────────────────────────────────
NOTE
Automatic arrays cannot be declared huge. Only static arrays and arrays
occupying memory allocated by the halloc function can be huge.
────────────────────────────────────────────────────────────────────────────

The huge model lifts the limits on arrays.

Although the huge model lifts the limits on arrays, some size restrictions
do apply. To maintain efficient addressing, no individual array element is
allowed to cross a segment boundary. This has the following implications:

■   No single element of an array can be larger than 64K. An array can be
larger than 64K, but its individual elements cannot.

■   For any array larger than 128K, all elements must have a size in bytes
equal to a power of 2: 2 bytes, 4 bytes, 8 bytes, 16 bytes, and so on.
If the array is 128K or smaller, its elements can be any size, up to
and including 64K.
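
The two rules above can be expressed as a simple check. The function below is
an illustrative sketch with a hypothetical name; it does not reproduce any
diagnostic the compiler issues. A power-of-2 element size divides 65,536
evenly, which keeps every element on one side of a segment boundary.

```c
/* Returns 1 if an array of `count` elements of `elem_size` bytes satisfies
   the two huge-array rules, 0 otherwise. */
int huge_array_ok(unsigned long elem_size, unsigned long count)
{
    const unsigned long K64  = 65536UL;
    const unsigned long K128 = 131072UL;
    unsigned long total = elem_size * count;

    if (elem_size == 0 || elem_size > K64)
        return 0;                       /* an element cannot exceed 64K */
    if (total > K128)
        return (elem_size & (elem_size - 1)) == 0;  /* power of 2 only */
    return 1;                           /* 128K or smaller: any size    */
}
```

For example, 30,000 elements of 6 bytes (180,000 bytes) fail the check, while
30,000 elements of 8 bytes pass.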

Pointer arithmetic changes within the huge model, as well. In particular,
the sizeof operator may return an incorrect value. The ANSI draft standard
for C defines the value returned by sizeof to be of type size_t (which, in
Microsoft C, is an unsigned int). The size in bytes of a huge array is an
unsigned long value, however. To find the correct value, you must use a
type cast:

(unsigned long)sizeof(monster_array)

Similarly, the C language defines the result of subtracting two pointers as
ptrdiff_t (a signed int in Microsoft C). Subtracting two huge pointers will
yield a long value. Microsoft C gives the correct result with the following
type cast:

(long)(ptr1_huge - ptr2_huge)
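
To see why the casts matter, consider what a 16-bit result keeps of a large
byte count. The sketch below models the truncation in portable C; it is an
illustration of the arithmetic, not the compiler's implementation of sizeof,
and the function names are hypothetical.

```c
#include <stdint.h>

/* What a 16-bit unsigned result keeps of a byte count: the low 16 bits. */
uint16_t size_as_16_bits(unsigned long bytes)
{
    return (uint16_t)(bytes & 0xFFFFUL);
}

/* Element count times element size, carried out in unsigned long. */
unsigned long size_as_long(unsigned long elements, unsigned long elem_size)
{
    return elements * elem_size;
}
```

An array of 40,000 two-byte integers occupies 80,000 bytes, but only 14,464
(80,000 minus 65,536) survives in 16 bits.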

When you select huge model, all extern arrays are treated as _huge.
Operations on data declared as _huge can be less efficient than the same
operations on data declared as _far.

2.2.5  Null Pointers

Within the medium and compact models, code pointers and data pointers differ
in size: one is 16 bits wide and the other is 32 bits wide. When using these
memory models, you should be careful in your use of the manifest constant
NULL.

NULL represents a null data pointer. The C include files define it as

#define NULL ((void *) 0)

There can be problems in models with different sizes of code and data
pointers.

In memory models where data pointers have the same size as code pointers,
the actual size of a null pointer doesn't matter. In memory models where
code and data pointers are different sizes, problems can occur. Consider
this example:

void main()
{
func1( NULL );
func2( NULL );
}

func1( char *dp )
{
.
.
.
}

func2( char (*fp)( void ) )
{
.
.
.
}

In the absence of function prototypes for  func1  and  func2, the compiler
always assumes that NULL refers to data and not code.

The example above works correctly in tiny, small, large, and huge models
because, in those models, a data pointer is the same size as a code pointer.
Under medium or compact model, however,  main  passes  NULL to  func2  as a
null data pointer rather than as a null code pointer (a pointer to a
function), which means the pointer is the wrong size.

To ensure that your code works properly in all models, declare each function
with a prototype. For example, before  main, include these two lines:

int func1( char *dp );
int func2( char (*fp)( void ));

If you add these prototypes to the example, the code works properly in all
memory models. Prototypes force the compiler to coerce code pointers to the
correct size. Prototypes also enable strong type-checking of parameters.
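
A complete version of the example with prototypes might look like the sketch
below. The function bodies and the helper call_both_with_null are added for
illustration; the original example leaves the bodies unspecified.

```c
#include <stddef.h>

/* Prototypes: the compiler now knows that func1 takes a data pointer and
   func2 takes a code (function) pointer. */
int func1(char *dp);
int func2(char (*fp)(void));

int func1(char *dp)
{
    return dp == NULL;          /* 1 when passed a null data pointer */
}

int func2(char (*fp)(void))
{
    return fp == NULL;          /* 1 when passed a null code pointer */
}

/* Passes NULL to both functions; returns 1 if each saw a null pointer. */
int call_both_with_null(void)
{
    return func1(NULL) && func2(NULL);
}
```

Because each prototype names the parameter type, NULL is converted to a
pointer of the right kind and size at each call, whatever the memory model.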

2.2.6  Specifying a Memory Model

If you do not specify a memory model, Microsoft C defaults to the small
model, which is adequate for many small to mid-sized programs.

You can select a memory model from the Programmer's WorkBench or from the
command line.

Selecting from within PWB

If you're compiling from the Programmer's WorkBench, open the Options menu
and choose C Global Build Options. The available memory models appear in the
upper left corner. Choose one of the six standard models or choose
Customized and type in the options for a customized model.

Selecting from the Command Line

You can choose a memory model by including an option on the command line.
For example, to compile CLICK.C as a compact-model program, type this:

CL /AC CLICK.C

The /AC option selects the compact memory model. The six options and four
libraries are listed below:

Option                            Memory Model: Library
────────────────────────────────────────────────────────────────────────────
/AT                               Tiny Model: SLIBCxx.LIB (plus CRTCOM.OBJ)

/AS                               Small Model: SLIBCxx.LIB

/AM                               Medium Model: MLIBCxx.LIB

/AC                               Compact Model: CLIBCxx.LIB

/AL                               Large Model: LLIBCxx.LIB

/AH                               Huge Model: LLIBCxx.LIB

2.3  Mixing Memory Models

In standard memory models, explained above, all data pointers are the same
size and all code pointers are the same size.

A mixed memory model selectively combines different types of pointers within
the same program. A mixed model extends the limits of a given memory model
while retaining its benefits.

A mixed memory model lets you mix near and far pointers.

For example, imagine a programming situation where you add an array to a
small-model program, pushing the data segment past the 64K limit.

You could solve the problem by moving up from the small to the compact
memory model. Doing so would bump all data pointers from two to four bytes.
The .EXE file would grow accordingly. Execution time would slow.

A second and perhaps better solution is to stay within the standard small
memory model, which uses near pointers, but to declare the new array as far.
You mix near pointers and far pointers, creating a mixed model.

Microsoft C lets you override the standard addressing convention for a given
memory model by specifying that certain items are _near, _far, _huge, or
_based. These keywords are not a standard part of the C language; they are
Microsoft extensions, meaningful only on systems that use 80x86
microprocessors. Using these keywords may affect the portability of your
code.

────────────────────────────────────────────────────────────────────────────
NOTE

Previous versions of the Microsoft C Compiler accepted the keywords near,
far, and huge without an initial underscore. Since the ANSI draft standard
for C permits compiler implementors to reserve identifiers that begin with
underscores, an underscore was added to these keywords to mark them as
Microsoft-specific. To maintain compatibility with existing source code, the
compiler still recognizes the obsolescent versions of these keywords.
────────────────────────────────────────────────────────────────────────────

You can compile a program in the small model, for example, but declare a
certain array to be _far. At run time, the address of that array occupies
four bytes. The program may slow slightly when accessing items in that
particular far array, but throughout the rest of the program, all addressing
would be near. Note that all pointers to elements of an array declared as
_far must also be declared as _far.

Table 2.2 lists the effects of these keywords on data pointers, code
pointers, and pointer arithmetic.

Table 2.2  Addressing Declared with Microsoft Keywords

Keyword  Data                  Code                  Arithmetic
────────────────────────────────────────────────────────────────────────────
_near    Data reside in        Functions reside in   16 bits
         default data          current code
         segment; 16-bit       segment; 16-bit
         addresses             addresses

_far     Data can be anywhere  Functions can be      16 bits
         in memory, not        called from anywhere
         necessarily in the    in memory; 32-bit
         default data          addresses
         segment; 32-bit
         addresses

_huge    Data can be anywhere  Not applicable;       32 bits
         in memory, not        code cannot be        (data only)
         necessarily in the    declared _huge
         default data segment.
         Individual data
         items (arrays) can
         exceed 64K in size;
         32-bit addresses

_based   Data can be anywhere  Not applicable;       16 bits
         in memory, not        code cannot be        (data only)
         necessarily in the    declared _based
         default data
         segment; 16-bit
         addresses plus a
         known base provide
         the range of 32-bit
         addresses
────────────────────────────────────────────────────────────────────────────

2.3.1  Pointer Problems

When you declare items to be _near, _far, _huge, or _based, you can link
with a standard run-time library. Be aware, however, that in some cases, the
modified pointers will be incompatible with standard library functions.
Watch for these problems that affect pointers:

■   A library function that expects a 16-bit pointer as an argument will
not function properly with modified variables that occupy 32 bits. In
other words, you can cast a near pointer to a far pointer, because it
adds the segment value and maintains the integrity of the address. If
you cast a far pointer to near, however, the compiler generates a
warning message because the offset may not lie within the default data
segment, in which case the original far address is irretrievably
lost.

■   A library function that returns a pointer will return a pointer of the
default size for the memory model. This is only a problem if you are
assigning the return value to a pointer of a smaller size. For
example, there may be difficulties if you compile with a model that
selects far data pointers, but you have explicitly declared the
variable to receive the return value _near.

This warning does not apply to all functions. See Section B.2.8 in
Appendix B for a list of model-independent string and memory functions
such as _fstrcat, the far version of strcat.

■   Based pointers pose a special problem. Based pointers are passed to
other functions as is (without normalization). Certain functions
expect to receive based pointers, but most do not. Therefore, in most
cases, you must either explicitly cast a based pointer to a far
pointer or make sure that all functions that receive based pointers
are prototyped.

Some run-time library functions support near, far, huge, and based
variables. For example, halloc allocates memory for a huge data array.

You can always pass the value (but not the address) of a far item to a
small-model library routine. For example,

/* Compile in small model */
#include <stdio.h>
#include <time.h>
long _far time_val;

void main()
{
    time( &time_val );             /* Illegal far address */
    printf( "%ld\n", time_val );   /* Legal value */
}

When you use a mixed memory model, you should include function prototypes
with argument-type lists to ensure that all pointer arguments are passed to
functions correctly.
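
One way to avoid the illegal call in the small-model example above is to let time fill a near local variable and then copy the value into the far item. The following is a minimal sketch, not a prescribed method. The name stamp_far_time is hypothetical, and the M_I86 test with the empty _far definition is an assumption added only so that the sketch also builds with compilers that lack the keyword.

```c
#include <time.h>

#ifndef M_I86          /* assumption: predefined only by 16-bit Microsoft C */
#define _far           /* let other compilers ignore the keyword */
#endif

long _far time_val;    /* far item, as in the example above */

/* Hypothetical helper: fetch the time into a near local, then copy
 * the value (not the address) into the far variable. */
long stamp_far_time( void )
{
    time_t now;            /* local, so its address is a near address */

    time( &now );          /* legal: near pointer passed to the library */
    time_val = (long)now;  /* legal: only the value crosses into far data */
    return time_val;
}
```

Only the value is copied into the far item, so no far address is ever handed to the small-model library routine.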

2.3.2  Declaring Near, Far, Huge, and Based Variables

The _near, _far, _huge, and _based keywords modify either objects or
pointers to objects. When using them to declare variables, keep these rules
in mind:

■   The keyword always modifies the object or pointer immediately to its
right. In complex declarations, think of the _far keyword and the item
to its right as being a single unit. For example, in the case of the
declaration

char _far * _near *p;

p is a near pointer to a far pointer to char, which resides in the
default data segment for the memory model being used.

By contrast, the declaration

char _far * _near p;

is a far pointer to char that will always be stored in DGROUP,
regardless of the memory model being used.

■   If the item immediately to the right of the keyword is an identifier,
the keyword determines whether the item will be allocated in the
default data segment ( _near) or a separate data segment ( _far,
_huge, or _based). For example,

char _far a;

allocates a as an item of type char with a _far address.

■   If the item immediately to the right of the keyword is a pointer, the
keyword determines whether the pointer will hold a near address (16
bits), a based address (16 bits), a far address (32 bits), or a huge
address (also 32 bits). For example,

char _huge *p;

allocates p as a huge pointer (32 bits) to an item of type char. Any
arithmetic performed on the huge pointer p will affect all 32 bits.
That is, the instruction p++ increments the pointer as a 32-bit
entity.

2.3.3  Declaring Near and Far Functions

You cannot declare functions as _huge or _based. The rules for using the
_near and _far keywords for functions are similar to those for using them
with data:

■   The keyword always modifies the function or pointer immediately to its
right.

■   If the item immediately to the right of the keyword is a function, the
keyword determines whether the function will be allocated as near or
far. For example,

char _far fun();

defines fun as a function with a 32-bit address that returns a char.
The function may be located in near memory or far memory, but it is
called with the full 32-bit address. The _far keyword applies to the
function, not to the return type.

■   If the item immediately to the right of the keyword is a pointer to a
function, the keyword determines whether the function will be called
using a near (16-bit) or far (32-bit) address. For example,

char (_far *pfun)( );

defines pfun as a far pointer (32 bits) to a function returning type
char.

■   Function declarations must match function definitions.

■   The _huge and _based keywords do not apply to functions. That is, a
function cannot be huge (larger than 64K) or based. A function can
return a huge data pointer to the calling function. A function can
return a based pointer unless it is a pointer based on _self (see
Section 2.5.2, "Declaring Based Variables").

The example below declares fun1 as a far function returning type char:

char _far fun1(void);              /* small model */
char _far fun1(void)
{
    .
    .
    .
}

Here, the fun2 function is a near function that returns a far pointer to
type char:

char _far * _near fun2( );        /* large model */
char _far * _near fun2( )
{
    .
    .
    .
}

The example below declares pfun as a far pointer to a function that has an
int return type, assigns the address of printf to pfun, and prints "Hello
world." twice.

/* Compile in medium, large, or huge model */

#include <stdio.h>
int (_far *pfun)( const char *, ... );

void main()
{
    pfun = printf;
    pfun( "Hello world.\n" );
    (*pfun)( "Hello world.\n" );
}

2.3.4  Pointer Conversions

Passing near or far pointers as arguments to functions can cause automatic
conversions in the size of the pointer argument. Passing a pointer to an
unprototyped function forces the pointer size to the larger of the following
two sizes:

■   The default pointer size for that type, as defined by the memory model
selected during compilation.

For example, in medium-model programs, data pointer arguments are near
by default, and code pointer arguments are far by default.

■   The size of the type of the argument.

Note that if you supply a based pointer as an argument to a function and do
not specifically cast it to a far pointer type, a 16-bit offset from the
base segment is passed.

Function prototypes prevent problems that may occur in mixed memory models.

If you provide a function prototype with complete argument types, the
compiler performs type-checking and enforces the conversion of actual
arguments to the declared type of the corresponding formal argument.
However, if no declaration is present or the argument-type list is empty,
the compiler will convert nonbased pointer arguments automatically to the
default type or the type of the argument, whichever is larger. To avoid
mismatched arguments, always use a prototype with the argument types.

For example, the following program produces unexpected results in
compact-model, large-model, or huge-model programs.

void main( )
{
    int _near *x;
    char _far *y;
    int z = 1;

    test_fun( x, y, z );   /* x is coerced to far
                              pointer in compact,
                              large, or huge model */
}

int test_fun( int _near *ptr1, char _far *ptr2, int a)
{
    printf("Value of a = %d\n", a);
}

If the preceding example is compiled as a tiny, small, or medium program,
the size of x is 16 bits, the size of y is 32 bits, and the value
printed for a is 1.

However, if the example is compiled in compact, large, or huge model, both
x and y are automatically converted to far pointers when they are passed
to test_fun. Since ptr1, the first parameter of test_fun, is defined as a
near pointer argument, it takes only 16 bits of the 32 bits passed to it.
The next parameter, ptr2, takes the remaining 16 bits passed to ptr1, plus
16 bits of the 32 bits passed to it. Finally, the third parameter, a, takes
the leftover 16 bits from ptr2, instead of the value of z in the main
function.

This shifting process does not generate an error message, because both the
function call and the function definition are legal. In this case the
program does not work as intended, however, since the value assigned to a
is not the value intended.
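
The shifting can be traced with a small simulation that treats the argument area as an array of 16-bit words. This is only an illustrative sketch, not compiler output. It assumes the C calling convention, in which the first argument occupies the lowest stack address, and assumes that a far pointer is stored offset word first. The names word, push_args, read_params, and struct seen are hypothetical.

```c
typedef unsigned short word;   /* stands in for one 16-bit stack slot */

/* What the unprototyped call pushes in compact, large, or huge model:
 * x widened to far (offset, segment), y (offset, segment), then z. */
void push_args( word frame[5], word ds, word x_off,
                word y_seg, word y_off, word z )
{
    frame[0] = x_off;   /* x: offset word                      */
    frame[1] = ds;      /* x: segment word added by the caller */
    frame[2] = y_off;   /* y: offset word                      */
    frame[3] = y_seg;   /* y: segment word                     */
    frame[4] = z;       /* z                                   */
}

/* What test_fun reads, given its declared parameter sizes. */
struct seen { word ptr1; word ptr2_off; word ptr2_seg; word a; };

struct seen read_params( const word frame[5] )
{
    struct seen s;

    s.ptr1     = frame[0];  /* near pointer: one word          */
    s.ptr2_off = frame[1];  /* far pointer: two words, but ... */
    s.ptr2_seg = frame[2];  /* ... shifted by one slot         */
    s.a        = frame[3];  /* int: gets the segment of y      */
    return s;
}
```

In this model the fifth word, which holds z, is never read by the called function. That is why the value printed for a is unrelated to z.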

To pass ptr1 as a near pointer, you should include a function prototype
that specifically declares this argument for test_fun as a near pointer,
as shown below:

/* First, prototype test_fun so the compiler
 * knows in advance about the near pointer argument:
 */
int test_fun (int _near*, char _far *, int);

main ( )
{
    int _near *x;
    char _far *y;
    int z = 1;

    test_fun ( x, y, z );    /* now, x is not coerced
                              * to a far pointer; it is
                              * passed as a near pointer,
                              * no matter which memory
                              * model is used
                              */
}

int test_fun ( int _near *ptr1, char _far *ptr2, int a)
{
    printf ( "Value of a = %d\n", a );
}

2.4  Customizing Memory Models

A third way to manage memory is to combine different features from standard
memory models to create your own customized memory model. You should have a
thorough understanding of C memory models and the architecture of 80x86
processors before creating your own nonstandard memory models.

In a customized model, you select the size of code pointers and data
pointers.

The /Astring option lets you change the attributes of the standard memory
models to create your own memory models. The three letters in string
correspond to the code pointer size, the data pointer size, and the stack
and data segment setup, respectively. Because the letter allowed in each
field is unique to that field, you can give the letters in any order after
/A. All three letters must be present.

The standard memory-model options (/AT, /AS, /AM, /AC, /AL, and /AH) can be
specified in the /Astring form. As an example of how to construct memory
models, the standard memory-model options are listed below with their
/Astring equivalents:

Standard                          Custom Equivalent
────────────────────────────────────────────────────────────────────────────
/AT                               /Asnd

/AS                               /Asnd

/AM                               /Alnd

/AC                               /Asfd

/AL                               /Alfd

/AH                               /Alhd

For example, you might want to create a huge-compact model. This model would
allow huge data items but only one code segment. The option for specifying
this model would be /Ashd.
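
The rule that each field has its own set of letters, so that the three letters may appear in any order, can be modeled with a short routine. The sketch below is an illustration of that rule only; it does not describe how the compiler itself parses the option. The names struct model and decode_model are hypothetical.

```c
#include <string.h>

struct model {
    char code;   /* 's' = near code pointers, 'l' = far */
    char data;   /* 'n' = near, 'f' = far, 'h' = huge data pointers */
    char setup;  /* 'd' = SS == DS, 'u' or 'w' = SS != DS */
};

/* Returns 1 and fills *m if letters holds exactly one letter from
 * each field, in any order; returns 0 otherwise. */
int decode_model( const char *letters, struct model *m )
{
    int i;

    m->code = m->data = m->setup = 0;
    if ( strlen( letters ) != 3 )
        return 0;                       /* all three letters are required */
    for ( i = 0; i < 3; i++ ) {
        char c = letters[i];

        if ( strchr( "sl", c ) && !m->code )
            m->code = c;
        else if ( strchr( "nfh", c ) && !m->data )
            m->data = c;
        else if ( strchr( "duw", c ) && !m->setup )
            m->setup = c;
        else
            return 0;                   /* unknown or repeated field */
    }
    return 1;
}
```

With this sketch, the strings "shd" and "dhs" decode to the same huge-compact combination, while "sn" is rejected because the segment setup letter is missing.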

────────────────────────────────────────────────────────────────────────────
NOTE

Tiny model is identical to small model except that it causes the linker to
search for CRTCOM.LIB. The executable file generated when you specify tiny
model is a .COM file rather than a .EXE.
────────────────────────────────────────────────────────────────────────────

2.4.1  Setting a Size for Code Pointers

Within a custom memory model, you choose whether code pointers are short or
long:

Option                            Size
────────────────────────────────────────────────────────────────────────────
/Asxx                             Short (near) code pointers

/Alxx                             Long (far) code pointers

The /As (short) option tells the compiler to generate near 16-bit pointers
and addresses for all functions. This is the default for tiny-, small-, and
compact-model programs.

The /Al (long) option means that far 32-bit pointers and addresses are used
to address all functions. Far pointers are the default for medium-, large-,
and huge-model programs.

2.4.2  Setting a Size for Data Pointers

Data pointers can be near, far, or huge:

Option                            Size
────────────────────────────────────────────────────────────────────────────
/Axnx                             Near data pointers

/Axfx                             Far data pointers

/Axhx                             Huge data pointers

The /An (near) option tells the compiler to use 16-bit pointers and
addresses for all data. This is the default for tiny-, small-, and
medium-model programs.

The /Af (far) option specifies that all data pointers and addresses are 32
bits. This is the default for compact- and large-model programs.

The /Ah (huge) option specifies that all data pointers and addresses are far
(32-bit) and that arrays are permitted to extend beyond a 64K segment. This
is the default for huge-model programs.

With far data pointers, no single data item can be larger than a segment
(64K) because address arithmetic is performed only on 16 bits (the offset
portion) of the address. When huge data pointers are used, individual data
items can be larger than a segment (64K) because address arithmetic is
performed on both the segment and the offset.
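
The difference between the two kinds of arithmetic can be simulated with ordinary integers. The sketch below assumes real-mode addressing, in which the physical address is the segment times 16 plus the offset. It is a model of the behavior described above, not the code the compiler generates, and the names word, struct addr, linear, far_add, and huge_add are hypothetical.

```c
typedef unsigned short word;

struct addr { word seg; word off; };

/* Real-mode physical address: segment * 16 + offset. */
unsigned long linear( struct addr p )
{
    return ( (unsigned long)p.seg << 4 ) + p.off;
}

/* Far-style arithmetic: only the 16-bit offset changes, so it wraps. */
struct addr far_add( struct addr p, unsigned long n )
{
    p.off = (word)( ( p.off + n ) & 0xFFFFUL );
    return p;
}

/* Huge-style arithmetic: a carry out of the offset moves the segment
 * forward by 64K, which is 0x1000 paragraphs in real mode. */
struct addr huge_add( struct addr p, unsigned long n )
{
    unsigned long sum = (unsigned long)p.off + n;

    p.off = (word)( sum & 0xFFFFUL );
    p.seg = (word)( p.seg + ( sum >> 16 ) * 0x1000UL );
    return p;
}
```

Adding 1 to the address 2000:FFFF with far_add wraps back to 2000:0000, the start of the same segment. With huge_add the result is 3000:0000, the next byte in memory.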

2.4.3  Setting Up Segments

Within a customized model, you can choose to make the stack segment (SS)
equal the data segment (DS), in which case they overlap:

Option                            Effect
────────────────────────────────────────────────────────────────────────────
/Axxd                             SS == DS

/A«xx»u                           SS != DS; DS reloaded on function entry

/A«xx»w                           SS != DS; DS not reloaded on function
entry

Segment Setup Option /Ad

The option /Ad tells the compiler that the segment addresses stored in the
SS and DS registers are equal. The stack segment and the default data
segment are combined into a single segment. This is the default for all
standard-model programs. In small- and medium-model programs, the stack plus
all data must occupy less than 64K; thus, any data item is accessed with
only a 16-bit offset from the segment address in the SS and DS registers.

In compact-, large-, and huge-model programs, initialized global and static
data are placed in the default data segment up to a certain threshold. The
address of this segment is stored in the DS and SS registers. All pointers
to data, including pointers to local data (the stack), are full 32-bit
addresses. This is important to remember when passing pointers as arguments
in multiple-segment programs. Although you may have more than 64K of total
data in these models, no more than 64K of data can occupy the default
segment. The /Gt and /ND options control allocation of items in the default
data segment if a program exceeds this limit.

Segment Setup Option /Au

The option /Au tells the compiler that the stack segment does not
necessarily coincide with the data segment. In addition, it adds the _loadds
attribute to all functions within a module, forcing the compiler to generate
code to load the DS register with the correct value prior to entering the
function body. Combine the /ND option with /Au to name data segments other
than the default. When /Au is combined with /ND, the address in the DS
register is saved upon entry to each function, and the new DS value for the
module in which the function was defined is loaded into the register. The
previous DS value is restored on exit from the function. Therefore, only one
data segment is accessible at any given time. The /ND option lets you
combine these segments into a single segment.

If a standard memory-model option precedes it on the command line, the /Au
option can be specified without any letters indicating data pointer or code
pointer sizes. The program uses a standard memory model, but different
segments are set up for the stack and data segments.

The /Au option is useful for OS/2 or Microsoft Windows dynamic-link
libraries (DLLs), since it forces DS to be loaded on entry to each function.
It is also useful for writing extensions to the Programmer's WorkBench.
Reloading DS on every function entry is costly, however, so consider using
the /Aw option instead.

Segment Setup Option /Aw

The option /Aw, like /Au, causes the compiler to assume that the stack
segment is separate from the data segment. The compiler does not
automatically load the DS register at each function entry point. The /Aw
option is useful in creating applications that interface with an operating
system or with a program running at the operating-system level. The
operating system or the program running under the operating system actually
receives the data intended for the application program and places that data
in a segment; then the operating system or program must load the DS register
with the segment address for the application program.

As with the /Au option, the /Aw option can be specified without data pointer
and code pointer letters if a standard memory-model option precedes it on
the command line. In such a case, the program uses the specified memory
model just as with /Au, but the DS register is not reloaded at each function
entry point.

Even though /Au and /Aw indicate that the stack may be in a separate
segment, the stack's size is still fixed at the default size unless this is
overridden with the /F compiler option or the /STACK linker option.

The /Aw option is useful for writing OS/2 and Microsoft Windows dynamic-link
libraries (DLLs), but care must be taken when it is used. Declare all entry
points to the dynamic-link library as _loadds to force DS to be loaded on
entry to the function (exactly like the /Au option). The other functions
will then be more efficient, though, because they will not have to perform
redundant loads of the DS register. For example,

_export _loadds _far pascal LibFunc( void )
{
    .
    .
    .
    HelperFunc();
}

HelperFunc( void )
{
    .
    .
    .
}

The library entry point, LibFunc, is declared as _loadds to force the DS
register to be loaded on entry. The function HelperFunc, which is private
to the dynamic-link library, is declared as a normal C function. Since it
cannot be called from outside of the module, HelperFunc does not need to
load the DS register on entry.

If you choose one of the options that specifies that the stack segment is
not equal to the data segment (SS != DS), you cannot pass the address of
frame variables as arguments to functions that take near pointers. That is,
in tiny, small, and medium models, you cannot pass the address of a local
variable (which is allocated on the stack) as an argument, because the
receiving function will assume the pointer is relative to the data segment.
However, the receiving function could solve this problem by declaring the
pointer to be the following:

_based(_segname("_STACK"))

Another solution would be to cast the pointer to a far pointer in both
locations as follows:

/* Call func with an explicit cast to far */
func( (char _far *)frame_var );
.
.
.
void func( char _far *formal_var )

2.4.4  Library Support for Customized Memory Models

Most C programs make function calls to the routines in the C run-time
library. When you write mixed-model programs, you are responsible for
determining which library (if any) is suitable for your program and for
ensuring that the appropriate library is linked. Table 2.3 shows the
libraries from which to extract the start-up routine for each customized
memory model.

Table 2.3  Start-Up Routines for Customized Memory Models

╓┌────────────────────────────────────────────────┌──────────────────────────╖
Memory-Model Option                              From Library
────────────────────────────────────────────────────────────────────────────
/Asnx; /AS plus /Ax                              SLIBCf.LIB
/Asfx; /Ashx; /AC plus /Ax                       CLIBCf.LIB
/Alnx; /AM plus /Ax                              MLIBCf.LIB
/Alfx; /Alhx; /AL plus /Ax; /AH plus /Ax         LLIBCf.LIB
────────────────────────────────────────────────────────────────────────────

The /Ax option represents either /Au or /Aw. In the library names, f is
either E (emulator library), 7 (8087/80287 library), or A (alternate math
library).

2.4.5  Setting the Data Threshold

Option                            Effect
────────────────────────────────────────────────────────────────────────────
/Gt«number»                       Sets the threshold

The /Gt option causes all data items whose size is greater than or equal to
number bytes to be allocated to a new data segment. When number is specified, it
must follow the /Gt option immediately, with no intervening spaces. When
number is omitted, the default threshold value is 256. When the /Gt option
is omitted, the default threshold value is 32,767.

The /Gt option applies only to compact-, large-, and huge-model programs,
since small- and medium-model programs have only one data segment. The
option is particularly useful with programs that have more than 64K of
initialized static and global data in small data items, because otherwise
you run out of memory in the default data segment and can't link the
program. The /Gt option has no effect on uninitialized global data.

2.4.6  Naming Modules and Segments

Option                            Effect
────────────────────────────────────────────────────────────────────────────
/NM modulename                    Names the module

/NT textsegment                   Names the code segment

/ND datasegment                   Names the data segment

"Module" is another name for an object file created by the C compiler from a
single source file. Every module has a name. The compiler uses this name in
error messages if problems are encountered during processing. The module
name is usually the same as the source-file name. You can change this name
using the /NM (name module) option. The new modulename can include any
combination of letters and digits. The space between /NM and modulename is
optional.

Every module has at least two segments: a code segment (sometimes called the
text segment) containing the program instructions, and a data segment
containing the program data.

The compiler normally creates the code and data segment names. The default
names depend on the memory model chosen for the program. For example, in
small-model programs the code segment is named _TEXT and the data segment is
named _DATA.

Table 2.4 summarizes the naming conventions for code and data segments.

Table 2.4  Segment-Naming Conventions

╓┌─────────┌─────────────┌───────┌───────────────────────────────────────────╖
Model     Code          Data    Module
────────────────────────────────────────────────────────────────────────────
Tiny      _TEXT         _DATA   ---
Small     _TEXT         _DATA   ---
Medium    module_TEXT   _DATA   filename
Compact   _TEXT         _DATA   filename
Large     module_TEXT   _DATA   filename
Huge      module_TEXT   _DATA   filename
────────────────────────────────────────────────────────────────────────────

In memory models that contain multiple data segments (compact, large, and
huge), _DATA is the name of the default data segment. Other data segments
have unique private names. You can override the default names with the
options /NT (name text) and /ND (name data).

The /ND option is commonly used to create and compile modules that contain
data only. Such modules can be accessed from other parts of the program by
declaring their variables as external.

If you change the name of the default data segment with /ND, your program
must load the DS register with the segment selector of your named data
segment before it accesses it. You must therefore compile your program
either with the /Astring form of the memory-model option and the /Au option
for the segment setup, or with the /A option for a s

Chapter 3  Using the In-Line Assembler
────────────────────────────────────────────────────────────────────────────

This chapter explains how to use the Microsoft C in-line assembler. Assembly
language serves many purposes, such as improving program speed, reducing
memory needs, and controlling hardware. The in-line assembler lets you embed
assembly-language instructions directly in your C source programs without
extra assembly and link steps. The in-line assembler is built into the
compiler, so you don't need a separate assembler such as the Microsoft Macro
Assembler (MASM).

3.1  Advantages of In-Line Assembly

Because the in-line assembler doesn't require separate assembly and link
steps, it is more convenient than a separate assembler. In-line assembly
code can use any C variable or function name that is in scope, so it is easy
to integrate it with your program's C code. And because the assembly code
can be mixed in-line with C statements, it can do tasks that are cumbersome
or impossible in C alone.

The uses of in-line assembly include

■   Writing functions in assembly language

■   Spot-optimizing speed-critical sections of code

■   Calling DOS and BIOS routines with the INT instruction

■   Creating TSR (terminate-and-stay-resident) code or handler routines
that require knowledge of processor states

In-line assembly is a special-purpose tool. If you plan to transport an
application, you'll probably want to place machine-specific code in a
separate module. And because the in-line assembler doesn't support all of
MASM's macro and data directives, you may find it more convenient to use
MASM for such modules.

3.2  The _asm Keyword

The _asm keyword invokes the in-line assembler and can appear wherever a C
statement is legal. It cannot appear by itself. It must be followed by an
assembly instruction, a group of instructions enclosed in braces, or, at the
very least, an empty pair of braces. The term "_asm block" here refers to
any instruction or group of instructions, whether or not in braces.

Below is a simple _asm block enclosed in braces. (The code prints the "beep"
character, ASCII 7.)

_asm
{
mov ah, 2
mov dl, 7
int 21h
}

Alternatively, you can put _asm in front of each assembly instruction:

_asm mov ah, 2
_asm mov dl, 7
_asm int 21h

Since the _asm keyword is a statement separator, you can also put assembly
instructions on the same line:

_asm mov ah, 2   _asm mov dl, 7   _asm int 21h

Braces can prevent ambiguity and needless repetition.

All three examples generate the same code, but the first style (enclosing the
_asm block in braces) has some advantages. The braces clearly separate
assembly code from C code and avoid needless repetition of the _asm keyword.
Braces can also prevent ambiguities. If you want to put a C statement on the
same line as an _asm block, you must enclose the block in braces. Without
the braces, the compiler cannot tell where assembly code stops and C
statements begin. Finally, since the text in braces has the same format as
ordinary MASM text, you can easily cut and paste text from existing MASM
source files.

The braces enclosing an _asm block don't affect variable scope, as do braces
in C. You can also nest _asm blocks, but the nesting doesn't affect variable
scope.
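
To show an _asm block in the context of a complete C function, the sketch below wraps the beep example in a function. The name beep is hypothetical. The M_I86 test is an assumption used to select the in-line assembly only under a 16-bit Microsoft compiler; other compilers take the putchar branch, which stands in for the DOS call.

```c
#include <stdio.h>

/* Hypothetical helper: sounds the bell and returns the character used. */
int beep( void )
{
#ifdef M_I86                /* assumption: 16-bit Microsoft C only */
    _asm
    {
        mov ah, 2           ; DOS function 2: display character
        mov dl, 7           ; ASCII 7, the bell character
        int 21h
    }
#else
    putchar( '\a' );        /* portable stand-in for other compilers */
#endif
    return '\a';
}
```

The _asm block occupies the position of an ordinary statement inside the function body, and the C statement that follows it is unaffected by the braces of the block.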

3.3  Using Assembly Language in _asm Blocks

The in-line assembler has much in common with other assemblers. For example,
it accepts any expression that is legal in MASM, and it supports all 80286
and 80287 instructions. This section describes the use of assembly-language
features in _asm blocks.

Instruction Set

The in-line assembler supports the full instruction set of the Intel 80286
and 80287 processors. It does not recognize 80386- and 80387-specific
instructions. To use 80286 or 80287 instructions, compile with the /G2
option.

Expressions

In-line assembly code can use any MASM expression, that is, any combination
of operands and operators that evaluates to a single value or address.

Data Directives and Operators

Although an _asm block can reference C data types and objects, it cannot
define data objects with MASM directives or operators. Specifically, you
cannot use the definition directives DB, DW, DD, DQ, DT, and DF, or the
operators DUP or THIS. Nor are MASM structures and records available. The
in-line assembler doesn't accept the directives STRUC, RECORD, WIDTH, or
MASK.
EVEN and ALIGN Directives

While the in-line assembler doesn't support most MASM directives, it does
support EVEN and ALIGN. These directives put NOP (no operation) instructions
in the assembly code as needed to align labels to specific boundaries. This
makes instruction-fetch operations more efficient for some processors (not
including processors with an eight-bit data bus, such as the Intel 8088).

Macros

The in-line assembler is not a macro assembler. You cannot use MASM macro
directives (MACRO, REPT, IRPC, IRP, and ENDM) or macro operators ( <>, !, &,
%, and .TYPE). An _asm block can use C preprocessor directives, however. See
Section 3.4, "Using C in _asm Blocks" for more information.

Segment References

You must refer to segments by register rather than by name (the segment name
_TEXT is invalid, for instance). Segment overrides must use the register
explicitly, as in ES:[BX].

Type and Variable Sizes

The LENGTH, SIZE, and TYPE operators have a limited meaning in in-line
assembly. They cannot be used at all with the DUP operator (because you
cannot define data with MASM directives or operators). But you can use them
to find the size of C variables or types:

■   The LENGTH operator can return the number of elements in an array. It
returns the value 1 for nonarray variables.

■   The SIZE operator can return the size of a C variable. A variable's
size is the product of its LENGTH and TYPE.

■   The TYPE operator can return the size of a C type or variable. If the
variable is an array, TYPE returns the size of a single element of the
array.

For instance, if your program has an eight-element int array,

int arr[8];

the following C and assembly expressions yield the size of arr and its
elements:

╓┌───────────┌──────────────────────────┌────────────────────────────────────╖
_asm        C                          Size
────────────────────────────────────────────────────────────────────────────
LENGTH arr  sizeof(arr)/sizeof(arr[0])  8
SIZE arr    sizeof(arr)                 16
TYPE arr    sizeof(arr[0])              2
────────────────────────────────────────────────────────────────────────────
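
The C column of the table can be packaged as macros, as in the sketch below. The macro names C_LENGTH, C_SIZE, and C_TYPE and the function size_rule_holds are hypothetical. The sizes 16 and 2 in the table assume a 16-bit int; with a compiler whose int is 32 bits, the same expressions yield 32 and 4, while the element count stays 8.

```c
/* Hypothetical macros mirroring the three in-line assembler operators. */
#define C_LENGTH( a )  ( sizeof( a ) / sizeof( (a)[0] ) )  /* element count  */
#define C_SIZE( a )    ( sizeof( a ) )                     /* bytes in total */
#define C_TYPE( a )    ( sizeof( (a)[0] ) )                /* bytes/element  */

int arr[8];

/* Returns 1 when SIZE equals LENGTH times TYPE for arr. */
int size_rule_holds( void )
{
    return C_SIZE( arr ) == C_LENGTH( arr ) * C_TYPE( arr );
}
```

The function checks the relationship stated earlier: the size of a variable is the product of its LENGTH and TYPE.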

Comments

Instructions in an _asm block can use assembly-language comments:

_asm mov ax, offset buff ; Load address of buff

Because C macros expand into a single logical line, avoid using
assembly-language comments in macros (see Section 3.8, "Defining _asm Blocks
as C Macros"). An _asm block can also contain C-style comments, as noted
below.

The _emit Pseudoinstruction

The _emit pseudoinstruction is similar to the DB directive of MASM. It
allows you to define a single immediate byte at the current location in the
current text segment. However, _emit can define only one byte at a time, and
it can only define bytes in the text segment. It uses the same syntax as the
INT instruction.

One use for _emit is to define 80386-specific instructions, which the
in-line assembler does not support. The following fragment, for instance,
defines the 80386 CWDE instruction:

/* Assumes 16-bit mode */
#define cwde _asm _emit 0x66 _asm _emit 0x98
.
.
.
_asm {
cwde
}

Debugging and Listings

In-line assembly code can be debugged with CodeView.

Programs containing in-line assembly code can be debugged with the CodeView
debugger, assuming you compile with the /Zi option.

Within CodeView, you can set breakpoints on both C and assembly-language
lines. If you enable mixed assembly and C mode, you can display both the
source and disassembled form of the assembly code.

Note that putting multiple assembly instructions or C statements on one line
can hamper debugging with CodeView. In source mode, the CodeView debugger
lets you set breakpoints on a single line but not on individual statements
on the same line. The same principle applies to an _asm block defined as a C
macro, which expands to a single logical line.

If you create a mixed source and assembly listing with the /Fc compiler
option, the listing contains both the source and assembly forms of each
assembly-language line. Macros are not expanded in listings, but they are
expanded during compilation.

See Chapter 9, "Debugging C Programs with CodeView," for more information.

3.4  Using C in _asm Blocks

Because in-line assembly instructions can be mixed with C statements, they
can refer to C variables by name and use many other elements of C. An _asm
block can use the following C language elements:

■   Symbols, including labels and variable and function names

■   Constants, including symbolic constants and enum members

■   Macros and preprocessor directives

■   Comments (both  /* */  and  // )

■   Type names (wherever a MASM type would be legal)

■   typedef names, generally used with operators such as PTR and TYPE or
to specify structure or union members

Within an _asm block, you can specify integer constants with either C
notation or assembler radix notation (0x100 and 100h are equivalent, for
instance). This allows you to define (using #define) a constant in C, and
use it in both C and assembly portions of the program. You can also specify
constants in octal by preceding them with a 0. For example, 0777 specifies
an octal constant.

3.4.1  Using Operators

An _asm block cannot use C-specific operators, such as the << operator.
However, operators shared by C and MASM, such as the * operator, are
interpreted as assembly-language operators. For instance, outside an _asm
block, square brackets ( [] ) are interpreted as enclosing array subscripts,
which C automatically scales to the size of an element in the array. Inside
an _asm block, they are seen as the MASM index operator, which yields an
unscaled byte offset from any data object or label (not just an array). The
following code illustrates the difference:

int array[10];

_asm mov array[6], bx ;  Store BX at array+6 (not scaled)

array[6] = 0;         /* Store 0 at array+12 (scaled) */

The first reference to  array  is not scaled, but the second is. Note that
you can use the TYPE operator to achieve scaling based on a constant. For
instance, the following statements are equivalent:

_asm mov array[6 * TYPE int], 0 ; Store 0 at array + 12

array[6] = 0;                   /* Store 0 at array + 12 */
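
The difference between the two kinds of indexing can be modeled in portable
C with byte pointers. This is a sketch under stated assumptions: the
function names are hypothetical, and the array is declared as short to
stand in for the 16-bit int of the example above.

```c
#include <stddef.h>

/* Assumption: short models the 16-bit int used in the manual's example. */
short array[10];

/* C subscripting: array[6] is scaled by the element size. */
size_t c_subscript_offset( void )
{
    return (size_t)( (char *)&array[6] - (char *)array );
}

/* MASM-style indexing in an _asm block: array[6] is a raw byte offset. */
size_t masm_index_offset( void )
{
    return (size_t)( ( (char *)array + 6 ) - (char *)array );
}

/* array[6 * TYPE int]: the scaling is restored by hand. */
size_t masm_scaled_offset( void )
{
    return 6 * sizeof( array[0] );
}
```

With 2-byte elements, the first and third functions return 12 and the
second returns 6, matching the comments in the fragments above.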

3.4.2  Using C Symbols

An _asm block can refer to any C symbol in scope where the block appears. (C
symbols are variable names, function names, and labels─in other words, names
that aren't symbolic constants or enum members.)

A few restrictions apply to the use of C symbols:

■   Each assembly-language statement can contain only one C symbol.
Multiple symbols can appear in the same assembly instruction only with
LENGTH, TYPE, and SIZE expressions.

■   Functions referenced in an _asm block must be declared (prototyped)
earlier in the program. Otherwise, the compiler cannot distinguish
between function names and labels in the _asm block.

■   An _asm block cannot use any C symbols with the same spelling as MASM
reserved words (regardless of case). MASM reserved words include
instruction names such as PUSH and register names such as SI.

■   Structure and union tags are not recognized in _asm blocks.

3.4.3  Accessing C Data

A great convenience of in-line assembly is the ability to refer to C
variables by name. An _asm block can refer to any symbols─including variable
names─that are in scope where the block appears. For instance, if the C
variable  var  is in scope, the instruction

_asm mov ax, var

stores the value of  var  in AX.

If a structure or union member has a unique name, an _asm block can refer to
it using only the member name, without specifying the C variable or typedef
name before the period (.) operator. If the member name is not unique,
however, you must place a variable or typedef name immediately before the
period (.) operator. For instance, the following structure types share
same_name  as their member name:

struct first_type
{
char *weasel;
int same_name;
};

struct second_type
{
int wonton;
long same_name;
};

If you declare variables with the types

struct first_type hal;
struct second_type oat;

all references to the member  same_name  must use the variable name, because
same_name  is not unique. But the member  weasel  has a unique name, so you
can refer to it using only its member name:

_asm
{
mov bx, OFFSET hal
mov cx, [bx]hal.same_name ; Must use 'hal'
mov si, [bx].weasel       ; Can omit 'hal'
}

Note that omitting the variable name is merely a coding convenience. The
same assembly instructions are generated whether or not it is present.
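
A member name in an _asm block stands for a byte offset from the start of
its structure, which is why a name shared by two structure types needs a
qualifier. The following sketch uses the standard offsetof macro to show
those offsets; the function names are hypothetical, and the values returned
depend on the sizes and alignment the compiler uses.

```c
#include <stddef.h>

struct first_type
{
    char *weasel;
    int same_name;
};

struct second_type
{
    int wonton;
    long same_name;
};

/* Offset that [bx].weasel adds to BX; weasel is the first member. */
size_t weasel_offset( void )
{
    return offsetof( struct first_type, weasel );
}

/* Offset of same_name when it is qualified by a first_type variable. */
size_t first_same_name_offset( void )
{
    return offsetof( struct first_type, same_name );
}

/* Offset of same_name when it is qualified by a second_type variable. */
size_t second_same_name_offset( void )
{
    return offsetof( struct second_type, same_name );
}
```

The two same_name members also differ in type (int and long), so the
qualifier tells the assembler which size as well as which offset to use.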

3.4.4  Writing Functions

If you write a function with in-line assembly code, it's a simple matter to
pass arguments to the function and return a value from it. The following
examples compare a function first written for a separate assembler and then
rewritten for the in-line assembler. The function, called  power2, receives
two parameters, multiplying the first parameter by 2 to the power of the
second parameter. Written for a separate assembler, the function might look
like this:

; POWER.ASM
; Compute the power of an integer
;
PUBLIC _power2
_TEXT SEGMENT WORD PUBLIC 'CODE'
_power2 PROC

push bp         ; Save BP
mov bp, sp      ; Move SP into BP so we can refer
;   to arguments on the stack
mov ax, [bp+4]  ; Get first argument
mov cx, [bp+6]  ; Get second argument
shl ax, cl      ; AX = AX * ( 2 ^ CL )
pop bp          ; Restore BP
ret             ; Return with result in AX

_power2 ENDP
_TEXT   ENDS
END

Function arguments are usually passed on the stack.

Since it's written for a separate assembler, the function requires a
separate source file and assembly and link steps. C function arguments
usually are passed on the stack, so this version of the  power2  function
accesses its arguments by their positions on the stack. (Note that the MODEL
directive, available in MASM and some other assemblers, also allows you to
access stack arguments and local stack variables by name.)

The POWER2.C program below writes the  power2  function with in-line
assembly code:

/* POWER2.C */
#include <stdio.h>

int power2( int num, int power );

void main( void )
{
printf( "3 times 2 to the power of 5 is %d\n", \
power2( 3, 5) );
}

int power2( int num, int power )
{
_asm
{
mov ax, num    ; Get first argument
mov cx, power  ; Get second argument
shl ax, cl     ; AX = AX * ( 2 to the power of CL )
}
/* Return with result in AX */
}

The in-line version of the  power2  function refers to its arguments by name
and appears in the same source file as the rest of the program. This version
also requires fewer assembly instructions. Since C automatically preserves
BP, the _asm block doesn't need to do so. It can also dispense with the RET
instruction, since the C part of the function performs the return.

Because the in-line version of  power2  doesn't execute a C return
statement, it causes a harmless warning if you compile at warning levels 2
or higher:

warning C4035: 'power2' : no return value

The function does return a value, but the compiler cannot tell that in the
absence of a return statement. Simply ignore the warning in this context.
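
For comparison, the same computation can be written without any assembly
code. This version is not from the original example; it is a minimal
sketch that assumes non-negative arguments small enough for the result to
fit in an int.

```c
/* Portable C model of power2: num times 2 to the power of 'power',
   which is what SHL AX, CL computes in the _asm version.
   Assumption: num >= 0 and the shifted result fits in an int. */
int power2( int num, int power )
{
    return num << power;    /* explicit return statement */
}
```

Because this version ends with a return statement, the compiler can see
that the function returns a value.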

3.5  Using and Preserving Registers

In general, you should not assume that a register will have a given value
when an _asm block begins. An _asm block inherits whatever register values
happen to result from the normal flow of control.

If you use the _fastcall calling convention, the compiler passes function
arguments in registers instead of the stack. This can create problems in
functions with _asm blocks, since a function has no way to tell which
parameter is in which register. If the function happens to receive a
parameter in AX and immediately stores something else in AX, the parameter
is lost. In addition, you must preserve the CX and ES registers in any
function declared with _fastcall.

Don't use the _fastcall calling convention for functions with _asm blocks.

To avoid such register conflicts, don't use the _fastcall convention for
functions that contain an _asm block. If you specify the _fastcall
convention globally with the /Gr compiler option, declare every function
containing an _asm block with _cdecl. (The _cdecl attribute tells the
compiler to use the normal C calling convention for that function.) If you
are not compiling with /Gr, avoid declaring the function with the _fastcall
attribute.

As you may have noticed in the POWER2.C example in Section 3.4.4, the
power2  function doesn't preserve the value in the AX register. When you
write a function in assembly language, you don't need to preserve the AX,
BX, CX, DX, ES, and flags registers. However, you should preserve any other
registers you use (DI, SI, DS, SS, SP, and BP).

────────────────────────────────────────────────────────────────────────────
WARNING

If your in-line assembly code changes the direction flag using the STD or
CLD instructions, you must restore the flag to its original value.
────────────────────────────────────────────────────────────────────────────

Functions return values in the AX and DX registers.

The POWER2.C example in Section 3.4.4 also shows that functions return
values in registers. This is true whether the function is written in
assembly language or in C.

If the return value is short (a char, int, or near pointer), it is stored in
AX. The POWER2.C example returned a value by terminating with the desired
value in AX.

If the return value is long, store the high word in DX and the low word in
AX. To return a longer value (such as a floating-point value), store the
value in memory and return a pointer to the value (in AX if near or in DX:AX
if far).

Assembly instructions that appear in-line with C statements are free to
alter the AX, BX, CX, and DX registers. C doesn't expect these registers to
be maintained between statements, so you don't need to preserve them. The
same is true of the SI and DI registers, with some exceptions (see Section
3.9, "Optimizing"). You should preserve the SP and BP registers unless you
have some reason to change them─to switch stacks, for instance.
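
The DX:AX convention for long return values can be pictured as splitting a
32-bit value into two 16-bit words. The sketch below models that split in
portable C; the function names are hypothetical and do not correspond to
any library routine.

```c
/* Assumption: unsigned long holds at least 32 bits and unsigned short at
   least 16 bits, which standard C guarantees. */

/* The word a function returning long would leave in DX. */
unsigned short high_word( unsigned long value )
{
    return (unsigned short)( ( value >> 16 ) & 0xFFFFUL );
}

/* The word a function returning long would leave in AX. */
unsigned short low_word( unsigned long value )
{
    return (unsigned short)( value & 0xFFFFUL );
}

/* Reassemble the value the caller reads from DX:AX. */
unsigned long join_words( unsigned short dx, unsigned short ax )
{
    return ( (unsigned long)dx << 16 ) | ax;
}
```

For the value 0x12345678, the high word is 0x1234 and the low word is
0x5678.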

3.6  Jumping to Labels

Like an ordinary C label, a label in an _asm block has scope throughout the
function in which it is defined (not only in the block). Both assembly
instructions and C goto statements can jump to labels inside or outside the
_asm block.

Labels in _asm blocks have function scope and are not case sensitive.

Unlike C labels, labels defined in _asm blocks are not case sensitive, even
when used in C statements. C labels are not case sensitive in an _asm block,
either. (Outside an _asm block, a C label is case sensitive as usual.) The
following do-nothing code shows all the permutations:

void func( void )
{
goto C_Dest;  /* legal */
goto c_dest;  /* error */

goto A_Dest;  /* legal */
goto a_dest;  /* legal */

_asm
{
jmp C_Dest ; legal
jmp c_dest ; legal

jmp A_Dest ; legal
jmp a_dest ; legal

a_dest:    ; _asm label
}

C_Dest:       /* C label */
return;
}

Don't use C library function names as labels in _asm blocks. For instance,
you might be tempted to use  exit  as a label,

jne exit
.
.
.
exit:
; More _asm code follows

forgetting that exit is the name of a C library function. The code doesn't
cause a compiler error, but it might cause a jump to the exit function
instead of the desired location.

As in MASM programs, the dollar symbol ($) serves as the current location
counter, a label for the instruction currently being assembled. In _asm
blocks, its main use is to make long conditional jumps:

jne $+5 ; next instruction is 5 bytes long
jmp farlabel
; $+5
.
.
.
farlabel:

3.7  Calling C Functions

An _asm block can call C functions, including C library routines. The
following example calls the printf library routine:

#include <stdio.h>

char format[] = "%s %s\n";
char hello[] = "Hello";
char world[] = "world";

void main( void )
{
_asm
{
mov  ax, offset world
push ax
mov  ax, offset hello
push ax
mov  ax, offset format
push ax
call printf
}
}

Since function arguments are passed on the stack, you simply push the needed
arguments (string pointers, in the example above) before calling the
function. The arguments are pushed in reverse order, so they come off the
stack in the desired order. To emulate the C statement

printf( format, hello, world );

the example pushes pointers to world, hello, and format, in that order, then
calls printf.

3.8  Defining _asm Blocks as C Macros

C macros offer a convenient way to insert assembly code into C code, but
they demand extra care because a macro expands into a single logical line.
To create trouble-free macros, follow these rules:

■   Enclose the _asm block in braces.

■   Put the _asm keyword in front of each assembly instruction.

■   Use old-style C comments ( /* comment */ ) instead of assembly-style
comments ( ; comment ) or single-line C comments ( // comment ).

To illustrate, the following example defines a simple macro:

#define BEEP _asm \
/* Beep sound */  \
{                 \
_asm mov ah, 2    \
_asm mov dl, 7    \
_asm int 21h      \
}

At first glance, the last three _asm keywords seem superfluous. They are
needed, however, because the macro expands into a single line:

_asm /* Beep sound */ { _asm mov ah, 2 _asm mov dl, 7 _asm int 21h }

The third and fourth _asm keywords are needed as statement separators. The
only statement separators recognized in _asm blocks are the newline
character and the _asm keyword. And since a block defined as a macro is one
logical line, you must separate each instruction with _asm.

The braces are essential as well.
If you omit them, the compiler can be confused by C statements on the same
line to the right of the macro invocation. Without the closing brace, the
compiler cannot tell where assembly code stops, and it sees C statements
after the _asm block as assembly instructions.

Use C comments in _asm blocks written as macros.

Assembly-style comments that start with a semicolon (;) continue to the end
of the line. This causes problems in macros because the compiler ignores
everything after the comment, all the way to the end of the logical line.
The same is true of single-line C comments ( // comment ). To prevent
errors, use old-style C comments ( /* comment */ ) in _asm blocks defined as
macros.

An _asm block written as a C macro can take arguments but cannot return a
value.

An _asm block written as a C macro can take arguments. Unlike an ordinary C
macro, however, an _asm macro cannot return a value. So you cannot use such
macros in C expressions.

Be careful not to invoke macros of this type indiscriminately. For instance,
invoking an assembly-language macro in a function declared with the
_fastcall convention may cause unexpected results. (See Section 3.5, "Using
and Preserving Registers.")

You can convert MASM macros to C macros.

Note that some MASM-style macros can be written as C macros. Below is a MASM
macro that sets the video page to the value specified in the page argument:

setpage MACRO page
mov ah, 5
mov al, page
int 10h
ENDM

The following code defines setpage as a C macro:

#define setpage( page ) _asm \
{                            \
_asm mov ah, 5               \
_asm mov al, page            \
_asm int 10h                 \
}

Both macros do the same job.

3.9  Optimizing

The presence of an _asm block in a function affects optimization in a few
different ways. First, as you might expect, the compiler doesn't try to
optimize the _asm block itself. What you write in assembly language is
exactly what you get.

Second, the presence of an _asm block affects register variable storage.
Under normal circumstances (unless you suppress optimization with the /Od
option) the compiler automatically stores variables in registers. This is
not done, however, in any function that contains an _asm block. To get
register variable storage in such a function, you must request it with the
register keyword.

Since the compiler stores register variables in the SI and DI registers,
these registers represent variables in functions that request register
storage. The first eligible variable is stored in SI and the second in DI.
Preserve SI and DI in such functions unless you want to change the register
variables.

Keep in mind that the name of a variable declared with register translates
directly into a register reference (assuming a register is available for
such use). For instance, if you declare

register int sample;

and the variable sample happens to be stored in SI, then the _asm
instruction

_asm mov ax, sample

is equivalent to

_asm mov ax, si

If you declare a variable with register and the compiler cannot store the
variable in a register, the compiler issues a warning to that effect at
compile time. The solution is to remove the register declaration from that
variable.

Register variables form a slight exception to the general rule that an
assembly-language statement can contain no more than one C symbol. If one of
the symbols is a register variable, for example,

register int v1;
int v2;

then an instruction can use two C symbols, as in

mov v1, v2

Finally, the presence of in-line assembly code inhibits the following
optimizations for the entire function in which the code appears:

■   Loop ( /Ol )

■   Global register allocation ( /Oe )

■   Global optimizations and common subexpressions ( /Og )

These optimizations are suppressed no matter which compiler options you use.

Chapter 4  Controlling Floating-Point Math Operations
────────────────────────────────────────────────────────────────────────────

This chapter describes how to control the way your Microsoft C programs
perform floating-point math operations. It describes the math packages that
you can include in C libraries when you run the SETUP program, then
discusses the options you can specify in the Programmer's WorkBench (PWB) or
on the CL command line to choose the appropriate library for linking and
controlling floating-point instructions.

This chapter also explains how to override floating-point options by
changing libraries at link time, and how to control use of the Intel math
coprocessor (80x87) using the NO87 environment variable.

4.1  Declaring Floating-Point Types

Microsoft C supports three floating-point types that conform to the
Institute of Electrical and Electronics Engineers (IEEE) standard 754
format:

1.  Type float, a 32-bit floating-point quantity

2.  Type double, a 64-bit floating-point quantity

3.  Type long double, an 80-bit floating-point quantity

You can declare variables as any of these types. You can also declare
functions that return any of these types.

4.1.1  Declaring Variables as Floating-Point Types

You can declare variables as float, double, or long double, depending on the
needs of your application. The principal differences between the three types
are the significance they can represent, the storage they require, and their
range. Table 4.1 shows the relationship between significance and storage
requirements.

Table 4.1  Floating-Point Types

╓┌─────────────┌───────────────────┌─────────────────────────────────────────╖
Type          Significant Digits  Number of Bytes
────────────────────────────────────────────────────────────────────────────
float         6-7                 4
double        15-16               8
long double   19                  10
────────────────────────────────────────────────────────────────────────────

Floating-point variables are represented by a mantissa, which contains the
value of the number, and an exponent, which contains the order of magnitude
of the number.

Table 4.2 shows the number of bits allocated to the mantissa and the
exponent for each floating-point type. The most-significant bit of any
float, double, or long double is always the sign bit. If it is 1, the number
is considered negative; otherwise, it is considered a positive number.

Table 4.2  Lengths of Exponents and Mantissas

╓┌─────────────┌────────────────┌────────────────────────────────────────────╖
Type          Exponent Length  Mantissa Length
────────────────────────────────────────────────────────────────────────────
float         8 bits           23 bits
double        11 bits          52 bits
long double   15 bits          64 bits
────────────────────────────────────────────────────────────────────────────

Because exponents are stored in an unsigned form, the exponent is biased by
half its possible value. For type float, the bias is 127; for type double,
it is 1,023; for type long double, it is 16,383. You can compute the actual
exponent value by subtracting the bias value from the exponent value.

The mantissa is stored as a binary fraction greater than or equal to 1 and
less than 2.
For types float and double, there is an implied leading 1 in the mantissa in
the most-significant bit position, so the mantissas are actually 24 and 53
bits long, respectively, even though the most-significant bit is never
stored in memory.

Instead of the storage method just described, the floating-point package can
store binary floating-point numbers as denormalized numbers. Denormalized
numbers are nonzero floating-point numbers with reserved exponent values in
which the most-significant bit of the mantissa is zero. By using
denormalized format, the range of a floating-point number can be extended at
the cost of precision. You cannot control whether a floating-point number is
represented in normalized or denormalized form; the floating-point package
determines the representation. The floating-point packages never use
denormalized form unless the exponent becomes less than the minimum that can
be represented in a normalized form.

Table 4.3 shows the minimum and maximum value you can store in variables of
each floating-point type. The values listed in this table apply only to
normalized floating-point numbers; denormalized floating-point numbers have
a smaller minimum value. Note that numbers retained in 80x87 registers are
always represented in 80-bit normal form; numbers can only be represented in
denormal form when stored in 32- or 64-bit floating-point variables (type
float and type double).

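
The sign, biased exponent, and mantissa fields described above can be pulled
apart in C. The following is a sketch under stated assumptions: the names
float_parts and split_float are hypothetical, the code relies on the
<stdint.h> header of later C standards (which this compiler does not
supply), and it assumes that type float uses the 32-bit IEEE format.

```c
#include <stdint.h>
#include <string.h>

#define FLOAT_BIAS 127          /* exponent bias for type float */

/* Hypothetical container for the three fields of a float. */
struct float_parts
{
    unsigned sign;              /* 1 if the number is negative        */
    int      exponent;          /* actual exponent, bias removed      */
    uint32_t mantissa;          /* the 23 stored mantissa bits        */
};

/* Split a normalized float into its fields.
   Assumption: float is a 32-bit IEEE 754 value. */
struct float_parts split_float( float value )
{
    uint32_t bits;
    struct float_parts parts;

    memcpy( &bits, &value, sizeof bits );   /* copy the raw bit pattern */
    parts.sign     = (unsigned)( bits >> 31 );
    parts.exponent = (int)( ( bits >> 23 ) & 0xFF ) - FLOAT_BIAS;
    parts.mantissa = bits & 0x7FFFFFUL;     /* implied leading 1 omitted */
    return parts;
}
```

For the value 1.0, the stored exponent is 127, so the actual exponent is 0
and the stored mantissa bits are all zero; the leading 1 is implied.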
Table 4.3  Range of Floating-Point Types

╓┌─────────────┌──────────────────────────────┌──────────────────────────────╖
Type          Minimum Value                  Maximum Value
────────────────────────────────────────────────────────────────────────────
float         1.175494351E-38                3.402823466E+38
double        2.2250738585072014E-308        1.7976931348623158E+308
long double   3.362103143112093503E-4932     1.189731495357231765E+4932
────────────────────────────────────────────────────────────────────────────

If precision is less of a concern than storage, consider using type float
for floating-point variables. Conversely, if precision is the most important
criterion, use type long double.

Microsoft C observes type-widening rules. Floating-point variables can be
promoted to a type of greater significance (for example, from type float to
type double). Promotion often occurs when you perform arithmetic on
floating-point variables. This arithmetic is always done in as high a degree
of precision as the variable with the highest degree of precision. For
example, consider the following type declarations:

float f_short;
double f_long;
long double f_longer;

f_short = f_short * f_long;

In the preceding example, the variable f_short is promoted to type double
and multiplied by f_long; then the result is rounded to type float before
being assigned to f_short.

In the example below (which uses the declarations from the preceding
example), the arithmetic is done in float (32-bit) precision on the
variables; the result is then promoted to type long double.

f_longer = f_short * f_short;

4.1.2  Declaring Functions that Return Floating-Point Types

You can declare functions that return the floating-point types float,
double, and long double. Functions that return types float or double do not
place their return values in registers; they place their return values in a
global location called the floating-point accumulator (fac).

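
The widening rules of Section 4.1.1 apply equally inside functions that
return floating-point types. The following sketch shows two such functions;
the names scale and square_wide are hypothetical, and the parameter names
simply reuse those of the earlier declarations.

```c
/* Hypothetical example functions; the names are not from the library. */

/* f_short is promoted to double and multiplied by f_long; the product is
   then rounded to float for the return value. */
float scale( float f_short, double f_long )
{
    return (float)( f_short * f_long );
}

/* The multiplication involves only float operands; the float result is
   then promoted to long double for the return value. */
long double square_wide( float f_short )
{
    return (long double)( f_short * f_short );
}
```

The casts only make explicit the conversions that the compiler would
perform anyway when the expression is returned.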
When declaring a function as a floating-point type in a multithreaded
program for OS/2, you should use the _pascal keyword to specify the
FORTRAN/Pascal calling convention. Declaring the function as _pascal causes
the return value to be placed on the stack, rather than in the
floating-point accumulator, fac.

You can write re-entrant functions that return floating-point types.

Using the current thread's private stack to return values allows you to
write re-entrant functions by eliminating possible contention between
threads for the floating-point accumulator.

────────────────────────────────────────────────────────────────────────────
NOTE

Functions that return type long double always place their return values on
the stack. You need not use the _pascal keyword with functions declared as
long double.
────────────────────────────────────────────────────────────────────────────

4.2  C Run-Time Library Support of Type long double

All of the Microsoft C run-time libraries support type long double. Each of
the normal floating-point math functions has a special version that supports
type long double. These functions have the same name as the functions that
support type float and type double, except that they end with l. For
example, the function that returns the absolute value of a variable of type
float or type double is fabs. The long double equivalent function is fabsl.
The two exceptions to this rule are the _atold and _strtold functions.

4.3  Summary of Math Packages

The Microsoft C compiler offers a choice of the following three math
packages for handling floating-point operations:

1.  Emulator (default)

2.  Math coprocessor (a library that supports the Intel 80x87 family of
math coprocessors)

3.  Alternate math

When you install Microsoft C, the SETUP program allows you to build combined
libraries. These libraries include the floating-point math library that you
choose.
Any programs linked with that library use the math package included in the
library; you must use the appropriate PWB or CL option to make sure that the
library you want is used at link time.

The following descriptions of these math packages are designed to help you
choose the appropriate math option for your needs when you build a library
using SETUP. For more information about SETUP and about building combined
libraries, see Installing and Using the Microsoft C Professional Development
System.

Note that this chapter does not describe mode-specific libraries. For
simplicity, the base names of libraries are noted in their default form;
that is, mLIBCf.LIB, where m is the model designator and f is the
floating-point math package designator. For information about mode-specific
libraries, see Chapter 14, "Building OS/2 Applications," or Installing and
Using the Microsoft C Professional Development System.

4.3.1  Emulator Package

Programs created using the emulator math package automatically detect and
use an 80x87 numeric coprocessor if one is installed. If no coprocessor is
installed, these 80x87 instructions are carried out in software. The
emulator package is the default math package; SETUP uses it if you do not
explicitly choose another package. Also, the emulator math option is the
option selected by default by the compiler if no other floating-point math
option is specified.

Use the emulator math package to maximize accuracy on systems without math
coprocessors or if your program will be run on some systems with
coprocessors and some systems without coprocessors.

The emulator package performs basic operations to the same degree of
accuracy as a math coprocessor. However, the emulator routines used for
transcendental math functions (such as sin, cos, tan) differ slightly from
the corresponding functions performed on a coprocessor.
This difference can cause a slight discrepancy (usually within two bits)
between the results of these operations when performed with the software
emulation instead of with a math coprocessor.

When you use the emulator package, some floating-point exceptions are
masked.

When you use a math coprocessor or the emulator floating-point math package,
interrupt-enable, precision, underflow, and denormalized-operand exceptions
are masked by default. The remaining floating-point exceptions are unmasked.
See the discussion of the _control87 function in on-line help for more
information about 80x87 floating-point exceptions.

4.3.2  Math Coprocessor Package

The math coprocessor package utilizes the 80x87 math coprocessor exclusively
for floating-point calculations. If you use the math coprocessor package,
the machine on which your application is to run must have an 80x87
coprocessor to perform floating-point operations. This package gives you the
fastest, smallest programs possible for handling floating-point math.

4.3.3  Alternate Math Package

The alternate math package gives you the smallest and fastest programs
possible without a coprocessor. However, the program results are not as
accurate as results given by the emulator package. The alternate math
package uses the same format as the IEEE standard-format numbers with less
precision and weaker error checking. The alternate math package does not
support infinities, NANs ("not a number"), and denormal numbers.

You must always use the alternate math package when developing routines that
are to be placed in an OS/2 dynamic-link library (DLL) using LLIBCDLL.LIB.
Do not, however, use the alternate math package for building the C run-time
DLL using CDLLOBJS.LIB; instead, use the emulator math package. For more
information about creating dynamic-link libraries for OS/2, see Chapter 16.

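
To make concrete what the alternate math package gives up, the sketch below
produces an infinity, a NAN, and a denormal number and tests for each. It is
illustrative only: the function names are hypothetical, and the code assumes
a math package with full IEEE support for type double (the emulator or
coprocessor package, or any other IEEE 754 environment).

```c
#include <float.h>

/* Hypothetical helpers; assumption: double follows IEEE 754 in full. */

/* Overflowing the largest double yields positive infinity. */
double make_infinity( void )
{
    volatile double big = DBL_MAX;
    return big * 2.0;
}

/* Infinity minus infinity has no defined value, so the result is a NAN. */
double make_nan( void )
{
    volatile double inf = make_infinity();
    return inf - inf;
}

/* Halving the smallest normalized double yields a denormal number. */
double make_denormal( void )
{
    volatile double tiny = DBL_MIN;
    return tiny / 2.0;
}

/* An infinity lies outside the range of finite doubles. */
int is_infinite( double x ) { return x > DBL_MAX || x < -DBL_MAX; }

/* A NAN is the only value that compares unequal to itself. */
int is_nan( double x ) { return x != x; }

/* A denormal is nonzero but smaller in magnitude than DBL_MIN. */
int is_denormal( double x )
{
    return x != 0.0 && x < DBL_MIN && x > -DBL_MIN;
}
```

Under a package that lacks these values, the same operations cannot be
expected to give these results.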
4.4  Selecting Floating-Point Options (/FP)

You can select a floating-point library and the method of accessing
floating-point routines by setting options in PWB or by specifying
command-line options to CL. You can choose between the emulator, alternate,
or math coprocessor library. You can also access the floating-point routines
by issuing a function call (or calls) or by generating in-line 80x87
instructions to execute the floating-point operation.

The smallest and the fastest floating-point math option is the in-line math
coprocessor package because the compiler generates true 80x87 coprocessor
instructions. If, however, you cannot depend on the target computer having a
coprocessor, you must use either the emulator or alternate math options.

To specify floating-point options on the CL command line, you must specify
an option from the list in Table 4.4. You specify these options to CL
starting with the floating-point option string /FP.

Based on the floating-point option and the memory-model option you choose,
the compiler embeds a library name in the object file that it creates. This
library is then considered the default library; that is, the linker searches
in the standard places for a library with that name. If it finds a library
with that name, the linker uses the library to resolve external references
in the object file being linked. Otherwise, it displays a message indicating
that it could not find the library.

This mechanism allows the linker to automatically link object files with the
appropriate library. However, you can link with a different library in some
cases. See Table 4.4 and Section 4.5, "Library Considerations for
Floating-Point Options," for more information about linking with different
libraries.

Table 4.4 summarizes the floating-point options and their effects. These
options are described in detail in the following sections.

Table 4.4 Summary of Floating-Point Options

Option                                       Use of                 Combined  Sel
for CL   Method   Effect                     Coprocessor            Lib       for PWB
─────────────────────────────────────────────────────────────────────────────────────────────
/FPi     In-line  Default; larger than       Uses coprocessor if    mLI       In-Line
                  /FPi87, but can work       present(1)                       Emulation
                  without a coprocessor;
                  most efficient way to get
                  maximum precision without
                  a coprocessor

/FPi87   In-line  Smallest and fastest       Requires coprocessor   mLI       In-Line Math
                  option available with a                                     Coprocessor
                  coprocessor

/FPc     Calls    Slower than /FPi, but      Uses coprocessor if    mLI       Calls to
                  allows use of alternate    present(1)                       Emulator
                  math library at link time

/FPc87   Calls    Slower than /FPi87, but    Requires coprocessor   mLI       Calls to Math
                  allows use of alternate    unless library changed           Coprocessor
                  math library at link time  at link time(5)

/FPa     Calls    Fastest and smallest       Ignores coprocessor    mLI       Alternate Math
                  option available without
                  a coprocessor, but
                  sacrifices some accuracy
                  for speed
─────────────────────────────────────────────────────────────────────────────────────────────
(1) Use of the coprocessor can be suppressed by setting NO87.
(2) Can be linked explicitly with mLIBC7.LIB at link time.
(3) Can be linked explicitly with mLIBCA.LIB at link time.
(4) Can be linked explicitly with mLIBCE.LIB at link time.
(5) Use of the coprocessor can be suppressed by setting NO87 if you change to the emulator library at link time.
Optimizations such as constant propagation and constant subexpression elimination can cause some expressions to be evaluated at compile time. Such evaluations always use IEEE format and are unaffected by the floating-point option you choose. For more information about optimizing, see Chapter 1, "Optimizing C Programs."

You can also specify floating-point options in the Programmer's WorkBench. To do so, modify the C Global Build Options (available on the Options menu). In the C Global Build Options dialog box, select one of the following floating-point math options:

Option                  Effect
────────────────────────────────────────────────────────────────────────────
Emulation Calls         Generates calls; makes emulator math library the
                        default (/FPc)

80x87 Calls             Generates calls; makes math coprocessor library the
                        default (/FPc87)

Fast Alternate Math     Generates calls; makes alternate math library the
                        default (/FPa)

Inline Emulation        Generates in-line instructions; makes emulator math
                        library the default (/FPi); this is the default
                        option

Inline 80x87            Generates in-line instructions; selects math
Instructions            coprocessor library (/FPi87)

4.4.1 In-Line Emulator Option (/FPi)

The in-line emulator option (/FPi) generates in-line instructions for an 80x87 coprocessor and places the name of the emulator library (mLIBCE.LIB) in the object file. At link time, you can specify the math coprocessor library (mLIBC7.LIB) instead. If you do not choose a floating-point option, the compiler uses the in-line emulator option by default.

The in-line emulator option is useful if you cannot be sure that an 80x87 coprocessor will be available on the target computer. Programs compiled using the in-line emulator option work as described below:

■ If a coprocessor is present at run time, the program uses the coprocessor.

■ If no coprocessor is present, the program uses the emulator.
In this case, the in-line emulator option offers the most efficient way to get maximum precision in floating-point results. When you use the in-line emulator option, the compiler does not generate in-line 80x87 instructions. For real-mode code, the compiler generates software interrupts to library code, which then fixes up the interrupts to use either the emulator or the coprocessor, depending on whether a coprocessor is present. For protected-mode code, the compiler generates no such interrupts; it generates 80x87 instructions. If the target computer does not have a coprocessor, an "unsupported extension" exception occurs, which is vectored to library code. If you want true in-line 80x87 instructions, use the in-line math coprocessor option (/FPi87). ──────────────────────────────────────────────────────────────────────────── NOTE In an OS/2 dynamic-link library built with LLIBCDLL.LIB, you cannot use code that requires the emulator library. You must use the alternate math library instead. ──────────────────────────────────────────────────────────────────────────── 4.4.2 In-Line Math Coprocessor Instructions Option (/FPi87) The in-line math coprocessor instructions option (/FPi87) instructs the compiler to place 80x87 coprocessor instructions in your code for many math operations. It also causes the name of a math coprocessor library (mLIBC7.LIB) to be embedded in the object file. If you use the in-line math coprocessor instructions option and link with the library mLIBC7.LIB, an 80x87 coprocessor must be present at run time, or the program fails and the following error message is displayed: run-time error R6002 - floating point not loaded Compiling with the in-line math coprocessor instructions option results in the smallest, fastest programs possible for handling floating-point results. 
4.4.3 Calls to Emulator Option (/FPc)

The calls to emulator option (/FPc) generates floating-point calls to the emulator library and places the name of an emulator library (mLIBCE.LIB) in the object file. At link time, you can specify a math coprocessor library (mLIBC7.LIB) or an alternate math library (mLIBCA.LIB) instead. Thus, the calls to emulator option gives you more flexibility in the libraries you can use for linking than the in-line emulator option.

Using the calls to emulator option is also recommended in the following cases:

■ If you compile modules that perform floating-point operations and plan to include these modules in a library

■ If you compile modules that you want to link with libraries other than the libraries provided with Microsoft C

You cannot link with an alternate math library if your program uses the intrinsic forms of floating-point library routines (that is, if you have compiled the program with the /Oi or /Ox option, selected the Generate Intrinsic Functions option from the Debug Build Options or Release Build Options dialog box in the Programmer's WorkBench, or specified math functions in an intrinsic pragma).

4.4.4 Calls to Math Coprocessor Option (/FPc87)

The calls to math coprocessor option (/FPc87) generates function calls to routines in the math coprocessor library (mLIBC7.LIB) that issue the corresponding 80x87 instructions. As with the in-line math coprocessor instructions option (/FPi87), at link time you can choose to link with an emulator library (mLIBCE.LIB). However, /FPc87 offers more flexibility in choosing libraries, since you can change your mind and link with the appropriate alternate math library (mLIBCA.LIB) as well.

The disadvantages of using the calls to math coprocessor option as opposed to the in-line coprocessor option are the following:

■ Your executable size is larger because a call requires more instructions than a true coprocessor instruction.
■ Your program does not execute as fast because you must issue a function call for each floating-point operation.

You cannot link with an alternate math library if your program uses the intrinsic forms of floating-point library routines (that is, if you have compiled the program with the /Oi or /Ox option, selected the Generate Intrinsic Functions option from the Debug Build Options or Release Build Options dialog box in the Programmer's WorkBench, or specified math functions in an intrinsic pragma).

You must have a math coprocessor installed to run programs compiled with the /FPc87 option and linked with a math coprocessor library. Otherwise, the program fails and the following error message is displayed:

run-time error R6002 - floating point not loaded

────────────────────────────────────────────────────────────────────────────
NOTE
Certain optimizations are not performed when you use the calls to math coprocessor option. This can reduce the efficiency of your code; also, since arithmetic of different precision can result, there may be slight differences in your results.
────────────────────────────────────────────────────────────────────────────

4.4.5 Use Alternate Math Option (/FPa)

The use alternate math option (/FPa) generates floating-point calls and selects the alternate math library for the appropriate memory model (mLIBCA.LIB). Calls to this library provide the fastest and smallest option for code intended to run on a machine without an 80x87 coprocessor. With this option, you can choose an emulator library (mLIBCE.LIB) or a math coprocessor library (mLIBC7.LIB) at link time.

You cannot link with an alternate math library if your program uses the intrinsic forms of floating-point library routines (that is, if you have compiled the program with the /Oi or /Ox option, selected the Generate Intrinsic Functions option from the Debug Build Options or Release Build Options dialog box in the Programmer's WorkBench, or specified math functions in an intrinsic pragma).
4.5 Library Considerations for Floating-Point Options You may want to use libraries in addition to the default library for the floating-point option you have chosen in your compile options. For example, you may want to create your own libraries (or other collections of subprograms in object-file form), then link these libraries at a later time with object files that you have compiled using different options. The following sections describe these cases and ways to handle them. Although the discussion assumes that you are putting your object files into libraries, the same considerations apply if you are simply using individual object files. 4.5.1 Using One Standard Library for Linking You must use only one standard C run-time library when you link. You can control which library is used in one of two ways: 1. In the Programmer's WorkBench, add the name of the C run-time library file you want to the program list using the Edit Program List option from the Make menu. You must also modify the Linker Options (from the Make menu) by specifying No Default Library Search. 2. From the LINK command line, give the /NODEFAULTLIBRARYSEARCH (/NOD) option and then specify the name of the combined library file you want to use in the link-libinfo field of the CL command line. This overrides the library names embedded in the object files. 4.5.2 In-Line Instructions or Calls When deciding on a floating-point option, you should decide whether you want to use in-line instructions. If you do, compile with the in-line math coprocessor instructions (/FPi87) or in-line emulator (/FPi) option. Otherwise, compile for floating-point function calls using the calls to math coprocessor (/FPc87), calls to emulator (/FPc), or alternate math (/FPa) option. If you choose to use in-line instructions for your precompiled object files, you cannot link with an alternate math library (mLIBCA.LIB). 
However, in-line instructions achieve the best performance from your programs on machines that have an 80x87 coprocessor installed. If you choose to use calls, your programs are slower, but at link time you can switch to any standard C run-time library (that is, any library created by the SETUP program) that supports the memory model you have chosen.

4.6 Compatibility between Floating-Point Options

Each time you compile a source file, you can specify a floating-point option. When you compile and link two or more source files to produce an executable program file, you must ensure that floating-point operations are handled consistently and that the environment is set up properly to allow the linker to find the required library.

If you are building libraries of C routines that contain floating-point operations, the calls to emulator option (/FPc) provides the most flexibility.

The examples that follow illustrate how you can link your program with a library other than the default. In each example, the floating-point option and the substitute library are compatible.

The example below compiles the program CALC.C with the medium-model option (/AM). Because no floating-point option is specified, the default in-line emulator option (/FPi) is used. The in-line emulator option generates 80x87 instructions and specifies the emulator library MLIBCE.LIB in the object file. The /LINK field specifies the /NODEFAULTLIBRARYSEARCH (/NOD) option and the name of the medium-model math coprocessor library. Specifying the math coprocessor library forces the program to use an 80x87 coprocessor; the program fails if a coprocessor is not present.

CL /AM CALC.C /link MLIBC7 /NOD

The example below compiles CALC.C using the small (default) memory model and the alternate math option (/FPa). The /LINK field specifies the /NOD option and the library SLIBCE.LIB. Specifying the emulator library causes all floating-point calls to refer to the emulator library instead of the alternate math library.
CL /FPa CALC.C /link SLIBCE /NOD The example below compiles CALC.C with the calls to math coprocessor option (/FPc87), which places the library name SLIBC7.LIB in the object file. The /LINK field overrides this default-library specification by giving the /NOD option and the name of the small-model alternate math library (SLIBCA.LIB). CL /FPc87 CALC.C /link SLIBCA.LIB/NOD 4.7 Using the NO87 Environment Variable Programs compiled using either the calls to emulator (/FPc) or the in-line emulator (/FPi) option automatically use an 80x87 coprocessor at run time if one is installed. You can override this and force the use of the software emulator by setting an environment variable named NO87. Use the NO87 environment variable to suppress use of the 80x87 coprocessor at run time. If NO87 is set to any value when the program is executed, use of the coprocessor is suppressed. The value of the NO87 setting is printed on the standard output as a message. The message is printed only if a coprocessor is present and suppressed; if no coprocessor is present, no message appears. If you don't want a message to be printed, set NO87 equal to one or more spaces. A blank string for NO87 causes a blank line to be printed. Note that only the presence or absence of the NO87 definition is important in suppressing use of the coprocessor. The actual value of the NO87 setting is used only for printing the message. The NO87 variable takes effect with any program linked with an emulator library (mLIBCE.LIB). It has no effect on programs linked with math coprocessor libraries (mLIBC7.LIB) or programs linked with alternate math libraries (mLIBCA.LIB). When a program that uses an emulator library is executed and an 80x87 coprocessor is present, the example below causes the message Use of coprocessor suppressed to appear. SET NO87=Use of coprocessor suppressed The syntax below sets the NO87 variable to the space character. Use of the coprocessor is still suppressed, but no message is displayed. 
SET NO87=space 4.8 Incompatibility Issues The exception handler in the libraries for 80x87 floating-point calculations (mLIBCE.LIB and mLIBC7.LIB) is designed to work without modification on the IBM PC family of computers and on closely compatible computers, including the WANG(R) PC, the AT&T(R) 6300, and the Olivetti(R) personal computers. Also, the libraries need not be modified for the Texas Instruments(R) Professional Computer, even though it is not compatible. Any machine that uses nonmaskable interrupts (NMI) for 80x87 exceptions will run with the unmodified libraries. If your computer is not one of these, and if you are not sure whether it is completely compatible, you may need to modify the math coprocessor libraries. All Microsoft languages that support 80x87 coprocessors intercept 80x87 exceptions in order to produce accurate results and properly detect error conditions. To make the libraries work correctly on incompatible machines, you can modify the libraries. To make this easier, an assembly-language source file, EMOEM.ASM, is included on the C 6.0 distribution disk. Any machine that sends the 80x87 exception to an 8259 Priority Interrupt Controller (master or master/slave) can be supported by a simple table change to the EMOEM.ASM module. The source file contains further instructions about how to modify EMOEM.ASM, patch libraries, and executable files. PART II Improving Programmer Productivity ──────────────────────────────────────────────────────────────────────────── The Microsoft C Professional Development System helps you write and debug software rapidly. Chapter 5 describes the quick compile and incremental compile options, both of which can save you time when compiling programs. Chapter 5 also describes the incremental linker, ILINK, which can save you time when you link your application. Chapter 6 describes NMAKE, a powerful new program maintenance utility that automates your program build process. 
Chapter 7 describes how to build help files with HELPMAKE, the help-file maintenance utility. When you need to share documentation in a readily accessible form, you can add it to the Microsoft Advisor on-line help system using the information in Chapter 7. Chapter 8 explains how to customize the Programmer's WorkBench to make it a personalized development platform. Chapter 9 offers procedures (and some tips) for using the CodeView debugger to find errors in your programs. Chapter 5 Compiling and Linking Quickly ──────────────────────────────────────────────────────────────────────────── The fundamental processes of compiling and linking take time to perform. The larger your application grows, the longer it takes to compile and link. This chapter describes how you can speed up compiling by using the quick compiler and incremental compile option, and how you can speed up linking by using ILINK, the Incremental Linker. 5.1 Compiling Quickly This section describes two ways to speed up the compiling process: using the quick compiler and using the incremental compile option. 5.1.1 Quick Compiler The Microsoft C Professional Development System includes two separate C compilers: the full compiler and the quick compiler. If you don't specify otherwise, your program is compiled by the full compiler. You access the quick compiler by specifying the /qc command-line option for CL or by selecting the Quick Compile option from the C Release Build or C Debug Build Options dialogs in the PWB Options menu. The quick compiler cannot perform as many optimizations as the full compiler, but it is much faster. You can use it to save time during development, whenever optimizations are not critical. When your application is finished, you can compile with the full compiler, using all the desired optimizations. On-line help for the /qc option describes which optimizations the quick compiler can perform. 
5.1.2 Incremental Compile Option You can speed up compiling even more by compiling incrementally. Incremental compilation means that the compiler compiles only those functions that have changed since you last compiled. The incremental compile option is available only with the quick compiler (see the previous section). You can access it from within PWB or from the DOS command line. Within PWB, select the Incremental Compile option in the C Release Build dialog box or in the C Debug Build Options dialog box. From the DOS command line, specify the /Gi option for CL. The incremental compile option automatically triggers another time-saving feature: the Incremental Linker, which is described in the next section. 5.2 Linking Quickly with ILINK ILINK links only those modules that have changed since the last link. The Incremental Linker (ILINK) offers the same advantage in linking that the incremental compile option offers in compiling. Rather than link every module in an application, as LINK does, ILINK links only those modules that have changed since the last link. The more modules your application contains, the more time ILINK can potentially save. In a normal development scenario, you use LINK at the beginning and end of the process, and use ILINK in the middle. In the early stages of development, when your application contains only a few modules, ILINK offers no speed advantage over LINK. Once your application contains several modules, you can save time by using ILINK. You must link once with LINK to prepare for incremental linking. To prepare for incremental linking, you must run LINK using /INCREMENTAL, as described in Section 5.2.1. At the same time, you have the option of adding padding bytes to code or data segments by specifying the /PADCODE and /PADDATA options. Padding allows ILINK to expand a segment without relinking the entire module in which it is contained. Now you can link with ILINK during the rest of development. 
If changes in your code require a full link, ILINK invokes LINK automatically. When the application is finished, you link a last time with LINK to produce the final executable file. You can use ILINK with programs compiled for any memory model except tiny model. (Memory models are described in Chapter 2, "Managing Memory.") Typically, ILINK is not efficient for small- or compact-model programs unless they were compiled with the incremental compile option, which is described in Section 5.1.2. 5.2.1 Preparing for Incremental Linking There are three LINK options that relate to the use of ILINK. One of them (/INCREMENTAL) is mandatory; the other two (/PADCODE and /PADDATA) are optional. This section explains the LINK options that prepare for ILINK. See on-line help for a complete list of LINK options. The /INCREMENTAL Option The /INCREMENTAL (/INC) option prepares an object file for incremental linking. You must always run LINK using this option before using ILINK. When you specify /INC, the linker produces two extra files: a symbol file (.SYM) and an ILINK support file (.ILK). The .SYM and .ILK files tell ILINK which parts of the executable file need to be updated. You must use /INCREMENTAL whenever you use the /PADCODE and /PADDATA options, which are described below. The /PADCODE Option The /PADCODE option causes LINK to add padding bytes at the end of a module's code segment. The padding bytes leave room for the code segment to grow in subsequent links, allowing ILINK to update only that module. You can use the /PADCODE option only when /INC is also specified. Code padding is usually necessary for programs using the small memory model. It is also recommended for compact- or mixed-model programs. You do not need to specify /PADCODE for other memory models (medium, large, or huge). If you don't specify /PADCODE, LINK doesn't pad the code segment at all. To add padding, specify the desired number of bytes. 
The optimum amount of padding depends on how much your code changes from one link to the next. If you expect to add only a little code, choose a relatively small amount of padding, say 32 to 64 bytes. If ILINK issues the message padding exceeded and performs a full link more often than desired, increase the padding by a small amount, say 32 bytes. In any case, remember that the total size of a code segment, including padding bytes, cannot exceed 64K (65,535) bytes. The /PADDATA Option Like /PADCODE, the /PADDATA option causes LINK to add padding bytes that leave room for the segment to grow in subsequent links. However, the /PADDATA option pads the end of the data segment rather than the code segment. You can use /PADDATA only when /INC is also specified. If you don't specify /PADDATA, LINK adds 16 bytes of padding by default. The default padding amount should suffice in many cases, since public variables are added less frequently than code. If you need more padding, specify the desired number of bytes. Remember that the total size of a data segment, including padding bytes, cannot exceed 64K (65,535) bytes. 5.2.2 Incremental Violations ILINK can generate two kinds of errors: real errors and incremental violations. Real errors are errors such as undefined symbols that cannot be resolved by a full link. If ILINK detects a real error, it displays an error message (real errors are documented in on-line help). Incremental violations are caused by code changes you have made that go beyond the scope of incremental linking. When an incremental violation occurs, ILINK invokes LINK automatically. The following sections describe the incremental violations. Changing Libraries An incremental violation occurs when a library changes. Furthermore, if an altered module shares a code segment with a library, ILINK needs access to the library as well as to the altered module. 
If you add a function, procedure, or subroutine call to a library that has never been called before, ILINK invokes LINK automatically. Exceeding Code/Data Padding An incremental violation occurs if two or more modules contribute to the same physical segment and either module exceeds its padding. The padding allows the module to increase the specified number of bytes before another full link is required. Moving or Deleting Data Symbols An incremental violation occurs if a data symbol is moved or deleted. To add new data symbols without requiring a full link, add the new symbols at the end of all other data symbols in the module. Deleting Code Symbols You can move or add code symbols, but an incremental violation occurs if you delete any code symbols from a module. Code symbols can be moved within a module but cannot be moved between modules. Changing Segment Definitions An incremental violation results if you add, delete, or change the order of segment definitions. Adding CodeView(R) Debugger Information If you include CodeView debugger information for a module when you fully link (by compiling and linking with CodeView debugger support), ILINK supports CodeView debugger information for the module. ILINK maintains symbolic information for current symbols, and it adds information for any new symbols. However, if you try to add CodeView debugger information for a module that did not previously have CodeView debugger support, an incremental violation occurs. See Chapter 9, "Debugging C Programs with CodeView," for more information about CodeView. Chapter 6 Managing Development Projects with NMAKE ──────────────────────────────────────────────────────────────────────────── The Microsoft Program-Maintenance Utility (NMAKE) is a sophisticated command processor that can save time and simplify project management. By determining which project files depend on others, NMAKE can automatically execute the commands needed to update your project when any project file has changed. 
The advantage of using NMAKE over simple batch files is that NMAKE does only what is needed. You don't waste time rebuilding files that are already up-to-date. NMAKE also has advanced features, such as macros, that help you manage complex projects. This chapter provides complete documentation for NMAKE. Information about NMAKE is also available in on-line help. If you are familiar with MAKE, the predecessor of NMAKE, be sure to read Section 6.9, "Differences Between NMAKE and MAKE." There are some important differences between the two utilities. 6.1 Overview of NMAKE NMAKE works by comparing the times and dates of two sets of files, which are called "targets" and "dependents." A target is normally a file that you want to create, such as an executable file. A dependent is a file used to create a target, such as a C source file. When you run NMAKE, it reads a "description file" that you supply. The description file consists of one or more blocks. Each block typically lists a target, the target's dependents, and the command that builds the target. NMAKE compares the date and time of the target to those of its dependents. If any dependent has changed more recently than the target, NMAKE updates the target by executing the command listed in the block. NMAKE's main purpose is to help you update applications quickly and simply. However, it can execute any command, so it is not limited to compiling and linking. NMAKE can also make backups, move files, and do many other project management tasks. 6.2 The NMAKE Command When you run NMAKE, you can supply the description-file name and other arguments using the following syntax: NMAKE «options» «macros» «targets» «descriptfile» All of the command-line fields are optional. If you don't supply any arguments, NMAKE looks for a default description file named MAKEFILE and follows various other defaults that are described in this chapter. 
The options field lists NMAKE options, which are described in Section 6.4, "Command-Line Options." The macros field lists macro definitions, which allow you to replace text in the description file. Macros are described in Section 6.3.3. The targets field lists targets to build. If you do not list any targets, NMAKE builds only the first target in the description file. (This is a significant departure from the behavior of MAKE, NMAKE's predecessor. See Section 6.9, "Differences between NMAKE and MAKE.") The descriptfile field specifies a description file. If this field is absent, NMAKE automatically looks for a file named MAKEFILE in the current directory. You can also specify the description file with the /F option (for information, see Section 6.4, "Command-Line Options"). Below is a typical NMAKE command: NMAKE /S "program = sample" sort.exe search.exe The command supplies four arguments: an option (/S), a macro definition ("program = sample"), and two target specifications (sort.exe search.exe). Because the command does not specify a description file, NMAKE looks for the default description file, MAKEFILE. The /S option tells NMAKE to suppress the display of commands as they are executed. The macro definition performs a text substitution throughout the description file, replacing every instance of program with sample. The target specifications tell NMAKE to update the targets SORT.EXE and SEARCH.EXE. 6.3 NMAKE Description Files You must always supply NMAKE with a description file. In addition to description blocks, which tell NMAKE how to build your project's target files, the description file can contain comments, macros, inference rules, and directives. This section describes all the elements of description files. 6.3.1 Description Blocks Description blocks form the heart of the description file. Figure 6.1 illustrates a typical NMAKE description block, including the three parts: targets, dependents, and commands. 
(This figure may be found in the printed book.)

A target is a file that you want to build. The targets part of the description block lists one or more files to build. The line that lists targets and dependents is called the "dependency line." The example in Figure 6.1 tells NMAKE to build a single target, MYAPP.EXE. Although single targets are common, you can also list multiple targets; separate each target name with a space. If the rightmost target name is one character long, put a space between the name and the colon.

The target is normally a file, but it can also be a "pseudotarget," a name that allows you to build groups of files or execute a group of commands. See Section 6.3.6, "Pseudotargets."

A dependent is a file used to build a target. The dependents part of the description block lists one or more files from which the target is built. It is separated from the targets part by a colon. The example in Figure 6.1 lists three dependents:

myapp.exe : myapp.obj another.obj myapp.def

The example tells NMAKE to build the target MYAPP.EXE whenever MYAPP.OBJ, ANOTHER.OBJ, or MYAPP.DEF has changed more recently than MYAPP.EXE.

If any dependents of a target are listed as targets in other description blocks, then NMAKE builds those files before it builds the original target. Essentially, NMAKE evaluates a "dependency tree" for the entire description file. It builds files in the order needed to update the original target, never building a target until all the files it depends on are up-to-date.

The dependent list can also include a list of directories in which NMAKE should search for dependents. The directory list is enclosed in curly braces ( {} ) and precedes the dependent list. NMAKE searches the current directory first, then the directories you list:

forward.exe : {\src\alpha;d:\proj}pass.obj

In the line above, the target, FORWARD.EXE, has one dependent: PASS.OBJ.
The directory list specifies two directories:

{\src\alpha;d:\proj}

NMAKE begins searching for PASS.OBJ in the current directory. If it is not
found, NMAKE searches the \SRC\ALPHA directory, then the D:\PROJ directory.
If NMAKE cannot find a dependent in the current directory or a listed
directory, it looks for an inference rule that describes how to create the
dependent (see Section 6.3.4, "Inference Rules").

The commands part of a description block lists one or more commands that
NMAKE should use to build the target. A command can be any command that you
can execute from the command line. The example tells NMAKE to build
MYAPP.EXE using the following LINK command:

     LINK myapp another.obj, /align:16, NUL, os2, myapp

Notice that the line above is indented. NMAKE uses indentation to
distinguish between the dependency line and the command line. If the command
appears on a separate line, as here, it must be indented at least one space
or tab. The dependency line must not be indented (it cannot start with a
space or tab).

Many targets are built with a single command, but you can place more than
one command after the dependency line. A long command can span several lines
if each line ends with a backslash ( \ ).

You can also place the command at the end of the dependency line. Separate
the command from the rightmost dependent with a semicolon.

In OS/2 description files, NMAKE imposes a slight restriction on the use of
the CD, CHDIR, and SET commands. Do not place any of these commands on a
command line that uses the ampersand (&) to execute multiple commands.
For instance, the following command line is legal in an OS/2 description
file,

DIR & COPY sample.c backup.c

but this line is not legal because it places a CD command after the
ampersand:

DIR & CD \mydir

To use CD, CHDIR, or SET in a description block, place the command on a
separate line:

DIR
CD \mydir

Your OS/2 user's documentation contains more information about using the
ampersand in command lines.

Wild Cards

You can use DOS wild-card characters (* and ?) to specify target and
dependent file names. NMAKE expands wild cards in target names when it reads
the description file. It expands wild cards in the dependent names when it
builds the target. For example, the following description block compiles all
source files with the .C extension:

bondo.exe : *.c
     CL *.c

Command Modifiers

Command modifiers provide extra control over the command listed in a
description block. They are special characters that appear in front of a
command. You can use more than one modifier for a single command. Table 6.1
describes the three NMAKE command modifiers.

Table 6.1  Command Modifiers

Character                         Action
────────────────────────────────────────────────────────────────────────────
At sign (@)                       Prevents NMAKE from displaying the
                                  command as it executes. In the example
                                  below, NMAKE does not display the ECHO
                                  command line:

                                  sort.exe : sort.obj
                                       @ECHO sorting

                                  The output of the ECHO command appears
                                  as usual.

Dash (-)                          Turns off error checking for the
                                  command. If the dash is followed by a
                                  number, NMAKE stops only if the error
                                  level returned by the command is greater
                                  than the number. In the following
                                  example, if the program sample returns
                                  an error code, NMAKE does not stop but
                                  continues to execute commands:

                                  light.lst : light.txt
                                       -sample light.txt

Exclamation point (!)             Executes the command for each dependent
                                  file if the command uses the predefined
                                  macros $? or $**. The $? macro refers to
                                  all dependent files that are out-of-date
                                  with respect to the target. The $**
                                  macro refers to all dependent files in
                                  the description block (see Section
                                  6.3.3, "Macros"). For example,

                                  print : hop.asm skip.bas jump.c
                                       !print $** lpt1:

                                  generates the following commands:

                                  print hop.asm lpt1:
                                  print skip.bas lpt1:
                                  print jump.c lpt1:

────────────────────────────────────────────────────────────────────────────
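
The per-dependent expansion performed by the exclamation-point modifier can
be modeled in a few lines of Python. This is only an illustrative sketch
under stated assumptions, not NMAKE's implementation: the function name
expand_for_each is hypothetical, and the model handles only the $** macro.

```python
def expand_for_each(command, dependents):
    """Model the '!' modifier: emit one command per dependent file.

    Hypothetical helper for illustration only; each copy of the command
    has $** replaced by a single dependent name.
    """
    return [command.replace("$**", dependent) for dependent in dependents]


commands = expand_for_each("print $** lpt1:",
                           ["hop.asm", "skip.bas", "jump.c"])
for line in commands:
    print(line)
```

Without the modifier, $** would expand to the whole dependent list on a
single command line.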

Using Control Characters as Literals

Occasionally, you may need to list a file name that contains a character
that NMAKE uses as a control character. These characters are

# ( ) $ ^ \ { } ! @ -

To use an NMAKE control character as a literal character, place a caret (^)
in front of it. For example, say that you define a macro that ends with a
backslash:

exepath=c:\bin\

The line above is intended to define a macro named exepath with the value
c:\bin\. But the second backslash causes unexpected results. Since the
backslash is the NMAKE line-continuation character, the line actually
defines the macro exepath as c:\bin followed by whatever appears on the next
line of the description file. You can solve the problem by placing a caret
in front of the second backslash:

exepath=c:\bin^\

You can also use a caret to place a literal newline character in a
description file. This feature can be useful in macro definitions:

XYZ=abc^
def

NMAKE interprets the example as if you assigned the C-style string abc\ndef
to the XYZ macro. This effect differs from using the backslash ( \ ) to
continue a line. A newline character that follows a backslash is replaced
with a space.

Carets that precede noncontrol characters are ignored. The line

ign^ore : these ca^rets

is interpreted as

ignore : these carets

A caret that appears in quotation marks is treated as a literal caret
character.

Listing a Target in Multiple Description Blocks

You can specify more than one description block for the same target by
placing two colons (::) after the target. This feature can be useful for
building a complex target, such as a library, that contains components
created with different commands. For example,

target.lib :: a.asm b.asm c.asm
     CL a.asm b.asm c.asm
     LIB target -+a.obj -+b.obj -+c.obj;

target.lib :: d.c e.c
     CL /c d.c e.c
     LIB target -+d.obj -+e.obj;

Both description blocks update the library named TARGET.LIB. If any of the
assembly-language files have changed more recently than the library, NMAKE
executes the commands in the first block to assemble the source files and
update the library.
Similarly, if any of the C-language files have changed, NMAKE executes the
second group of commands, which compile the C files and update the library.

If you use a single colon in the example above, NMAKE issues an error
message. It is legal, however, to use single colons if commands are listed
in only one block. In this case, dependency lines are cumulative. For
example,

target: jump.bas
target: up.c
     echo Building target...

is equivalent to

target: jump.bas up.c
     echo Building target...

6.3.2  Comments

You can place comments in a description file by preceding them with a number
sign (#):

# This comment appears on its own line
huey.exe : huey.obj dewey.obj   # Comment on the same line
     link huey.obj dewey.obj;

A comment extends to the end of the line in which it appears. Command lines
cannot contain comments.

6.3.3  Macros

Macros allow you to do text replacements throughout the description file.
They offer a convenient way to replace a string in the description file
with another string. The text is automatically replaced each time you run
NMAKE. Macros are useful in a variety of tasks, including the following:

■   To create a standard description file for several projects. The macro
    represents the file names used in commands. These file names are then
    defined when you run NMAKE. When you switch to a different project, you
    can change file names throughout the description file by changing a
    single macro.

■   To control the options that NMAKE passes to the compiler or linker. When
    you specify options in a macro, you can change options throughout the
    description file in one easy step.

You can define your own macros or use predefined macros. This section begins
by describing user-defined macros.

User-Defined Macros

You can define a macro with

macroname = string

The macroname can be any combination of letters, digits, and the underscore
( _ ) character. Macro names are case sensitive. NMAKE interprets MyMacro
and MYMACRO as different macro names.
The string can be any string, including a null string. For example,

command = LINK

defines a macro named command and assigns it the string LINK.

You can define macros in the description file or on the command line. In the
description file, you must define each macro on a separate line; the line
cannot start with a space or tab. The string can contain embedded spaces,
and NMAKE ignores spaces on either side of the equal sign. You do not need
to enclose string in quotation marks (if you do, they become part of the
string).

Slightly different rules apply when you define a macro on the command line,
because of the way that the command line handles spaces. You must enclose
string in quotation marks if it contains embedded spaces. No spaces can
surround the equal sign. You can also enclose the entire macro definition,
macroname and string, in quotation marks. For example,

NMAKE "program=sample"

defines the macro program, assigning it the value sample.

Once you have defined a macro, you can "undefine" it with the !UNDEF
directive (see Section 6.3.5, "Directives").

Invoking Macros

You invoke a macro by enclosing its name in parentheses preceded by a dollar
sign ($). (The parentheses are optional if macroname is one character long.)
For example, you can invoke the command macro defined above as

$(command)

When NMAKE runs, it replaces every occurrence of $(command) with LINK.

The following description file defines and uses three macros:

program = sample
c       = LINK
options =

$(program).exe : $(program).obj
     $c $(options) $(program).obj;

NMAKE interprets the description block as

sample.exe : sample.obj
     LINK sample.obj;

NMAKE replaces every occurrence of $(program) with sample, every instance
of $c with LINK, and every instance of $(options) with a null string.
Because c is only one character long, you do not need to enclose it in
parentheses.

If you invoke a macro that is not defined, NMAKE treats the macro as a null
string.
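
The replacement rules above can be sketched as a small Python model. It is
an illustration under stated assumptions rather than a description of
NMAKE's internals: the function name expand_macros is hypothetical, and the
sketch covers only the $(name) and one-character $x forms, ignoring special
macros such as $@ and the literal $$.

```python
import re

# Matches $(name), or $x where x is a single letter, digit, or underscore.
MACRO_REFERENCE = re.compile(r"\$\(([A-Za-z0-9_]+)\)|\$([A-Za-z0-9_])")


def expand_macros(text, macros):
    """Replace each macro invocation with its value.

    Hypothetical model: an undefined macro expands to a null string,
    as the text above describes.
    """
    def replace(match):
        name = match.group(1) or match.group(2)
        return macros.get(name, "")

    return MACRO_REFERENCE.sub(replace, text)


macros = {"program": "sample", "c": "LINK", "options": ""}
dependency_line = expand_macros("$(program).exe : $(program).obj", macros)
command_line = expand_macros("$c $(options) $(program).obj;", macros)
undefined = expand_macros("[$(nosuch)]", macros)

print(dependency_line)
print(command_line)
```

Because options expands to a null string, the model leaves two spaces
between LINK and sample.obj; the command is otherwise the one shown in the
example.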

Occasionally, you may need to use the dollar sign ($) as a literal
character. Use two signs ($$), or precede it with a caret (^).

Predefined Macros

NMAKE provides several predefined macros, which represent various file names
and commands. Predefined macros are useful in their own right, and they are
also employed in predefined inference rules, which are described later in
this chapter. Table 6.2 lists NMAKE predefined macros.

Table 6.2  Predefined Macros

Macro                             Meaning
────────────────────────────────────────────────────────────────────────────
$@                                The current target's full name.

$*                                The current target's base name (full
                                  name minus the file extension).

$**                               The dependents of the current target.

$?                                The dependents that are out-of-date with
                                  respect to the current target.

$$@                               The target that NMAKE is currently
                                  evaluating. You can only use this macro
                                  to specify a dependent.

$<                                The dependent file that is out-of-date
                                  with respect to the current target
                                  (evaluated only for inference rules).

$(CC)                             The command to invoke the C compiler. By
                                  default, $(CC) is predefined as CC = cl,
                                  which invokes the optimizing compiler.

$(AS)                             The command that invokes the Microsoft
                                  Macro Assembler. NMAKE predefines this
                                  macro as AS = masm.

$(MAKE)                           The name with which the NMAKE utility is
                                  invoked. This macro is used to invoke
                                  NMAKE recursively. It causes the line on
                                  which it appears to be executed even if
                                  the /N option is on. You can redefine
                                  this macro if you want to execute
                                  another program.

                                  The $(MAKE) macro is useful for building
                                  different versions of a program. The
                                  following description file invokes NMAKE
                                  recursively to build targets in the
                                  VERS1 and VERS2 directories.

                                  all : vers1 vers2
                                  vers1 :
                                       cd vers1
                                       $(MAKE)
                                       cd ..
                                  vers2 :
                                       cd vers2
                                       $(MAKE)
                                       cd ..

                                  The example changes to the VERS1
                                  directory, then invokes NMAKE
                                  recursively, causing NMAKE to process
                                  the file MAKEFILE in that directory.
                                  Then it changes to the VERS2 directory
                                  and invokes NMAKE again, processing the
                                  file MAKEFILE in that directory.

                                  Deeply recursive build procedures can
                                  exhaust NMAKE's run-time stack, causing
                                  a run-time error. To eliminate the
                                  error, use the EXEHDR utility to
                                  increase NMAKE's run-time stack. The
                                  following command, for example, gives
                                  NMAKE.EXE a stack size of 16,384
                                  (0x4000) bytes:

                                  exehdr /stack:0x4000 nmake.exe

$(MAKEFLAGS)                      The NMAKE options currently in effect.
                                  If you invoke NMAKE recursively, you
                                  should use the command:
                                  $(MAKE) $(MAKEFLAGS). You cannot
                                  redefine this macro.

$(MAKEDIR)                        The directory from which NMAKE is
                                  invoked.
────────────────────────────────────────────────────────────────────────────

Like user-defined macro names, predefined macro names are case sensitive.
NMAKE interprets CC and cc as different macro names.

Macro modifiers allow you to specify parts of predefined macros representing
file names. You can append characters to any of the first six macros in
Table 6.2 to modify its meaning. Appending a D specifies the directory part
of the file name only, an F specifies the file name, a B specifies just the
base name, and an R specifies the complete file name without the extension.
If you add one of these characters, you must enclose the macro name in
parentheses. (The predefined macros $$@ and $** are the only exceptions to
the rule that macro names more than one character long must be enclosed in
parentheses.)

For example, assume that $@ has the value C:\SOURCE\PROG\SORT.OBJ. The list
below shows the effect of combining the special characters with $@:

Macro                             Value
────────────────────────────────────────────────────────────────────────────
$(@D)                             C:\SOURCE\PROG

$(@F)                             SORT.OBJ

$(@B)                             SORT

$(@R)                             C:\SOURCE\PROG\SORT
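
The four modifiers correspond to familiar path-splitting operations. The
following Python sketch reproduces the values in the list above using the
standard ntpath module, which applies DOS-style path rules on any platform.
The function name apply_modifier is hypothetical; this is a model of the
results, not of how NMAKE computes them.

```python
import ntpath


def apply_modifier(full_name, modifier):
    """Return the part of full_name selected by a D, F, B, or R modifier.

    Hypothetical helper for illustration only.
    """
    directory, file_name = ntpath.split(full_name)
    parts = {
        "D": directory,                        # directory part only
        "F": file_name,                        # file name with extension
        "B": ntpath.splitext(file_name)[0],    # base name only
        "R": ntpath.splitext(full_name)[0],    # full name minus extension
    }
    return parts[modifier]


target = r"C:\SOURCE\PROG\SORT.OBJ"
results = {m: apply_modifier(target, m) for m in "DFBR"}
for modifier, value in results.items():
    print("$(@" + modifier + ")", value)
```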

For example, in the code below, the macro $? represents the names of all
dependents that are more recent than the target. The exclamation point
causes NMAKE to execute the LIB command once for each dependent in the list.
As a result, the LIB command is executed up to three times, each time
replacing a module with a newer version.

trig.lib : sin.obj cos.obj arctan.obj
     !LIB trig.lib -+$?;

In the following example, NMAKE updates a group of include files:

# Include files depend on versions in current directory
DIR=c:\include
$(DIR)\globals.h : globals.h
     COPY globals.h $@
$(DIR)\types.h : types.h
     COPY types.h $@
$(DIR)\macros.h : macros.h
     COPY macros.h $@

Each of the files GLOBALS.H, TYPES.H, and MACROS.H in the directory
C:\INCLUDE depends on its counterpart in the current directory. If one of
the include files is out-of-date, NMAKE replaces it with the file of the
same name from the current directory.

Substitution within Macros

Just as macros allow you to substitute text in a description file, you can
also substitute text within a macro itself. Use the following form:

$(macroname:string1 = string2)

You can replace text in a macro, as well as in the description file. Every
occurrence of string1 is replaced by string2 in the macro macroname. Do not
put any spaces or tabs between macroname and the colon. Spaces between the
colon and string1 are made part of string1. If string2 is a null string, all
occurrences of string1 are deleted from the macroname macro.

The following description file illustrates macro substitution:

SRCS = prog.c sub1.c sub2.c

prog.exe : $(SRCS:.c=.obj)
     LINK $**;

DUP : $(SRCS)
     !COPY $** c:\backup

The predefined macro $** stands for the names of all the dependent files
(see the previous section). If you invoke the example file with a command
line that specifies both targets, NMAKE executes the following commands:

LINK prog.obj sub1.obj sub2.obj;

COPY prog.c c:\backup
COPY sub1.c c:\backup
COPY sub2.c c:\backup

The macro substitution does not alter the definition of the SRCS macro;
rather, it simply replaces the listed characters. When NMAKE builds the
target PROG.EXE, it gets the definition for the predefined macro $** (the
dependent list) from the dependency line, which specifies the macro
substitution in SRCS. The same is true for the second target, DUP. In this
case, however, no macro substitution is requested, so SRCS retains its
original value, and $** represents the names of the C source files. (In the
example above, the target  DUP  is a pseudotarget; Section 6.3.6 describes
pseudotargets.)
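
Macro substitution behaves like an ordinary replace-all over the macro's
value that leaves the stored definition untouched. A minimal Python sketch
of that behavior follows; the function name substitute is hypothetical, and
the sketch does not model NMAKE's parsing of the $(macroname:string1=string2)
form itself.

```python
def substitute(value, string1, string2):
    """Return value with every occurrence of string1 replaced by string2.

    Hypothetical helper: it returns a new string, so the original macro
    value is unchanged, mirroring how SRCS keeps its definition.
    """
    return value.replace(string1, string2)


SRCS = "prog.c sub1.c sub2.c"
objects = substitute(SRCS, ".c", ".obj")   # models $(SRCS:.c=.obj)
deleted = substitute(SRCS, ".c", "")       # a null string2 deletes string1

print(objects)
print(SRCS)
```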

You can also perform substitution in the following predefined macros: $@,
$*, $**, $?, and $<. The principle is the same as for other macros. The
command in the following description block substitutes within a predefined
macro:

target.abc : depend.xyz
     echo $(@:targ=blank)

If dependent  depend.xyz  is out-of-date relative to target  target.abc,
then NMAKE executes the command

echo blanket.abc

The example uses the predefined macro $@, which equals the full name of the
current target (target.abc). It substitutes blank for targ in the target
name, resulting in blanket.abc. Note that you do not put the usual dollar
sign in front of the predefined macro. The example uses

$(@:targ=blank)

instead of

$($@:targ=blank)

to substitute within the predefined macro $@.

Inherited Macros

When NMAKE executes, it creates macros equivalent to every current
environment variable. These are called "inherited" macros because they have
the same names and values as the corresponding environment variables. (The
inherited macro is all uppercase, however, even if the corresponding
environment variable is not.)

Inherited macros can be used like other macros. You can also redefine them.
The following example redefines the inherited macro PATH:

PATH = c:\tools\bin

sample.obj : sample.c
     CL /c sample.c

Inherited macros take their definitions from environment variables. No
matter what value PATH had in the DOS environment, it has the value
c:\tools\bin when NMAKE executes the CL command in this description block.
Redefining the inherited macro does not affect the original environment
variable; when NMAKE terminates, PATH has its original value.

The /E option defeats macro inheritance. If you supply this option, NMAKE
ignores any attempt to redefine a macro that derives from an environment
variable.

Precedence among Macro Definitions

If you define the same macro in more than one place, NMAKE uses the macro
with the highest precedence. The precedence from highest to lowest is as
follows:

1.  Macros defined on the command line

2.  Macros defined in a description file or include file

3.  Inherited macros

4.  Macros defined in the TOOLS.INI file

5.  Predefined macros such as CC and AS

The /E option defeats any attempt to redefine inherited macros. If you run
NMAKE with this option, macros inherited from environment variables override
any same-named macros in the description file.

6.3.4  Inference Rules

Inference rules are templates that NMAKE uses to create files with a given
extension. For instance, when NMAKE encounters a description block with no
commands, it tries to apply an inference rule that tells how to create the
target from the dependent files, given the two extensions.
Similarly, if a dependent file does not exist, NMAKE tries to apply an
inference rule that tells how to create the missing dependent from another
file with the same base name.

Inference rules tell NMAKE how to create files with a certain extension.
They provide a convenient shorthand for common operations. For instance, you
can use an inference rule to avoid repeating the same command in several
description blocks. You can define your own inference rules or use
predefined inference rules. This section begins by describing user-defined
inference rules.

User-Defined Inference Rules

You can define inference rules in the description file or in the TOOLS.INI
file. An inference-rule definition lists two file extensions and one or more
commands. For instance, the following inference rule tells NMAKE how to
build a .OBJ file using a .C file:

.C.OBJ:
     CL /c $<;

The first line lists two extensions. The second extension (.OBJ) specifies
the type of the desired file and the first (.C) specifies the type of the
desired file's dependent. The second line lists the command used to build
the desired file. Here, the predefined macro $< represents the name of a
dependent that is out-of-date relative to the target.

NMAKE could apply the above inference rule to the following description
block:

sample.obj :

The description block lists only a target, SAMPLE.OBJ. Both the dependent
and the command are missing. However, given the target's base name and
extension, plus the above inference rule, NMAKE has enough information to
build the target.

NMAKE first looks for a .C file with the same base name as the target. If
SAMPLE.C exists, NMAKE compares its date to that of SAMPLE.OBJ (the
comparison is triggered by the predefined macro $<). If SAMPLE.C has changed
more recently, NMAKE compiles it using the CL command listed in the
inference rule:

CL /c sample.c
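
The steps NMAKE follows when it applies an inference rule can be sketched in
Python. This is a simplified model under stated assumptions, not NMAKE's
algorithm: the function name infer_command and the dictionary layout of the
rules are hypothetical, and the sketch also runs the rule when the target
does not exist yet.

```python
import os
import tempfile


def infer_command(target, rules):
    """Return the command an inference rule would run for target, or None.

    rules maps (from_ext, to_ext) to a command template containing $<.
    Hypothetical model: a rule applies only if a file with the target's
    base name and the from_ext extension already exists.
    """
    base, to_ext = os.path.splitext(target)
    for (from_ext, rule_to_ext), template in rules.items():
        if rule_to_ext.lower() != to_ext.lower():
            continue
        source = base + from_ext
        if not os.path.exists(source):
            continue
        if (not os.path.exists(target)
                or os.path.getmtime(source) > os.path.getmtime(target)):
            return template.replace("$<", source)
    return None


rules = {(".c", ".obj"): "CL /c $<"}
with tempfile.TemporaryDirectory() as work_dir:
    source = os.path.join(work_dir, "sample.c")
    target = os.path.join(work_dir, "sample.obj")
    open(source, "w").close()
    missing_target = infer_command(target, rules)   # no sample.obj yet
    open(target, "w").close()
    os.utime(source, (1000, 1000))
    os.utime(target, (2000, 2000))                  # target is newer
    up_to_date = infer_command(target, rules)
    os.utime(source, (3000, 3000))                  # source is newer
    stale_target = infer_command(target, rules)
```

The timestamps are set explicitly with os.utime so that the comparison does
not depend on the resolution of the file system clock.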

────────────────────────────────────────────────────────────────────────────
NOTE

NMAKE applies an inference rule only if the base name of the file it is
trying to create matches the base name of a file that already exists. Thus,
inference rules are useful only when there is a one-to-one correspondence
between the desired file and its dependent. You cannot define an inference
rule that replaces several modules in a library, for example.
────────────────────────────────────────────────────────────────────────────

Extension Search Paths

If an inference rule does not specify a search path, as in the example
above, NMAKE looks for files in the current directory. You can specify a
single path for each of the extensions, using the following form:

{frompath}.fromext{topath}.toext:
     commands

NMAKE searches in the frompath directory for files with the fromext
extension. It uses commands to create files with the toext extension in the
topath directory.

Predefined Inference Rules

NMAKE provides predefined inference rules to perform these common tasks:

■   Creating an .OBJ file by compiling a .C file

■   Creating an .OBJ file by assembling an .ASM file

■   Creating an .EXE file by compiling a .C file and linking the resulting
.OBJ file

Table 6.3 describes the predefined inference rules.

Table   6.3 Predefined Inference Rules

╓┌───────────────┌─────────────────────────┌─────────────────────────────────╖
Inference Rule  Command                   Default Action
────────────────────────────────────────────────────────────────────────────
.c.obj          $(CC) $(CFLAGS) /c $*.c   cl /c $*.c
.asm.obj        $(AS) $(AFLAGS) $*;       masm $*;
.c.exe          $(CC) $(CFLAGS) $*.c      cl $*.c
────────────────────────────────────────────────────────────────────────────

For example, say that you have the following description file:

sample.exe :

Like the previous example, this description block lists a target without any
dependents or commands. NMAKE looks at the target's extension (.EXE) and
checks for an inference rule that describes how to create a .EXE file. The
last rule in Table 6.3 provides this information:

.c.exe:
     $(CC) $(CFLAGS) $*.c

To apply this rule, NMAKE first looks for a file with the same base name as
the target (SAMPLE) and the .C extension. If SAMPLE.C exists in the current
directory, NMAKE executes the CL command given in the rule. The command
compiles SAMPLE.C and links the resulting file SAMPLE.OBJ to create
SAMPLE.EXE.

Precedence among Inference Rules

If the same inference rule is defined in more than one place, NMAKE uses the
rule with the highest precedence. The precedence from highest to lowest is

1.  Inference rules defined in the description file

2.  Inference rules defined in the TOOLS.INI file

3.  Predefined inference rules

NMAKE uses a predefined inference rule only if no user-defined inference
rule exists for the desired operation.

6.3.5  Directives

Directives allow you to write description files that are similar to batch
files. Directives can execute commands conditionally, display error
messages, include other files, and turn on or off certain options.

NMAKE directives are similar to C preprocessor directives. A directive
begins with an exclamation point (!), which must appear at the beginning of
the line. You can place spaces between the exclamation point and the
directive keyword. (See Table 6.4.)

Table 6.4  Directives

Directive                Description
────────────────────────────────────────────────────────────────────────────
!CMDSWITCHES             Turns on or off one of four NMAKE options: /D,
{+|-}opt...              /I, /N, and /S. If no options are specified, the
                         options are reset to the way they were when
                         NMAKE started. Turn an option on by preceding it
                         with a plus sign (+), or turn it off by preceding
                         it with a minus sign (-). Using this keyword
                         updates the MAKEFLAGS macro.

!ELSE                    Executes the statements between the !ELSE and
                         !ENDIF keywords if the statements preceding the
                         !ELSE keyword were not executed.

!ENDIF                   Marks the end of the !IF, !IFDEF, or !IFNDEF
                         block of statements.

!ERROR text              Causes text to be printed and then stops
                         execution.

!IF constantexpression   Executes the statements between the !IF keyword
                         and the next !ELSE or !ENDIF keyword if
                         constantexpression evaluates to a nonzero value.

!IFDEF macroname         Executes the statements between the !IFDEF
                         keyword and the next !ELSE or !ENDIF keyword if
                         macroname is defined. NMAKE considers a macro
                         with a null value to be defined.

!IFNDEF macroname        Executes the statements between the !IFNDEF
                         keyword and the next !ELSE or !ENDIF keyword if
                         macroname is not defined.

!INCLUDE filename        Reads and evaluates the file filename before
                         continuing with the current description file. If
                         filename is enclosed by angle brackets (< >),
                         NMAKE searches for the file in the directories
                         specified by the INCLUDE macro. Otherwise, it
                         looks only in the current directory. The INCLUDE
                         macro is initially set to the value of the
                         INCLUDE environment variable.

!UNDEF macroname         Marks macroname as being undefined in NMAKE's
                         symbol table.
────────────────────────────────────────────────────────────────────────────

The constantexpression used with the !IF directive can consist of integer
constants, string constants, or program invocations. Integer constants can
use the C unary operators for numerical negation (-), one's complement (~),
and logical negation (!). They can also use any of the C binary operators
listed in Table 6.5.
Table 6.5  Directive Operators

Operator              Description
────────────────────────────────────────────────────────────────────────────
+                     Addition
-                     Subtraction
*                     Multiplication
/                     Division
%                     Modulus
&                     Bitwise AND
|                     Bitwise OR
^^                    Bitwise XOR
&&                    Logical AND
||                    Logical OR
<<                    Left shift
>>                    Right shift
==                    Equality
!=                    Inequality
<                     Less than
>                     Greater than
<=                    Less than or equal to
>=                    Greater than or equal to
────────────────────────────────────────────────────────────────────────────

You can group expressions using parentheses. NMAKE treats numbers as decimal
unless they start with 0 (octal) or 0x (hexadecimal). Use the equality (==)
operator to compare two strings for equality or the inequality (!=) operator
to compare for inequality. Enclose strings with quotes. Program invocations
must be in square brackets ([ ]).

The following example illustrates directives:

!INCLUDE <infrules.txt>
!CMDSWITCHES +D
winner.exe:winner.obj
!IFDEF debug
!  IF "$(debug)"=="y"
     LINK /CO winner.obj;
!  ELSE
     LINK winner.obj;
!  ENDIF
!ELSE
!  ERROR Macro named debug is not defined.
!ENDIF

The !INCLUDE directive causes NMAKE to insert the file INFRULES.TXT into the
description file. The !CMDSWITCHES directive turns on the /D option, which
displays the dates of the files as they are checked. If WINNER.EXE is
out-of-date with respect to WINNER.OBJ, the !IFDEF directive checks to see
if the macro  debug  is defined. If it is defined, the !IF directive checks
to see if it is set to  y. If it is, the linker is invoked with the /CO
option; otherwise it is invoked without. If the  debug  macro is not
defined, the !ERROR directive prints the message and NMAKE stops.
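
The rule for integer constants in !IF expressions (decimal unless the number
starts with 0 for octal or 0x for hexadecimal) follows the C convention. A
minimal Python sketch of that rule is shown here; the function name
parse_integer_constant is hypothetical, and the sketch covers only unsigned
literals, not operators or full expressions.

```python
def parse_integer_constant(text):
    """Convert a literal to an integer using the C-style prefix rule.

    Hypothetical helper: 0x means hexadecimal, a leading 0 means octal,
    and anything else is read as decimal.
    """
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text[1:], 8)
    return int(text, 10)


values = [parse_integer_constant(literal) for literal in ("16", "020", "0x10")]
print(values)
```

All three literals in the usage line denote the same value, which is a
common source of surprise when a decimal number is written with a leading
zero.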

6.3.6  Pseudotargets

Pseudotargets are useful for building a group of files or executing a group
of commands.

A "pseudotarget" is similar to a target, but it is not a file. It is a name
that serves as a "handle" for building a group of files or executing a group
of commands. In the following example,  UPDATE  is a pseudotarget.

UPDATE: *.*
     !COPY $** a:\product

When NMAKE evaluates a pseudotarget, it always considers the dependents to
be out-of-date. In the example, NMAKE copies each of the dependent files to
the specified drive and directory.

Like macro names, pseudotarget names are case sensitive. Predefined
pseudotarget names are all uppercase.

The pseudotargets in Table 6.6 are predefined to provide special rules in a
description file. You can use their names on the command line, in a
description file, or in the TOOLS.INI file. You need not specify them as
targets; NMAKE uses the rules they define no matter where they appear.

Table 6.6  Pseudotargets

Pseudotarget                      Action
────────────────────────────────────────────────────────────────────────────
.IGNORE:                          Ignores exit codes returned by programs
                                  called from the description file. Same
                                  effect as invoking NMAKE with the /I
                                  option.

.PRECIOUS: target(s)              Tells NMAKE not to delete target(s) if
                                  the commands that build it are quit or
                                  interrupted. Using this pseudotarget
                                  overrides the NMAKE default. By default,
                                  NMAKE deletes the target if it cannot be
                                  sure the target is built successfully.

                                  The .PRECIOUS pseudotarget is rarely
                                  needed. Like most professional tools,
                                  Microsoft language tools clean up by
                                  themselves when errors occur.

.SILENT:                          Does not display lines as they are
                                  executed. Same effect as invoking NMAKE
                                  with the /S option.

.SUFFIXES:list                    Lists file suffixes for NMAKE to try
                                  when building a target file for which no
                                  dependents are specified. This list is
                                  used together with inference rules. See
                                  Section 6.3.4, "Inference Rules."

                                  When NMAKE finds a target without any
                                  dependents, it searches the current
                                  directory for a file with the same base
                                  name as the target and a suffix from the
                                  list. If NMAKE finds such a file, and if
                                  an inference rule applies to the file,
                                  then NMAKE treats the file as a
                                  dependent of the target. The order of
                                  the suffixes in the list defines the
                                  order in which NMAKE searches for the
                                  file. The list is predefined as follows:

                                  .SUFFIXES: .obj .exe .c .asm

                                  To add suffixes to the list, specify
                                  .SUFFIXES : followed by the new
                                  suffixes. To clear the list, specify

                                  .SUFFIXES:
────────────────────────────────────────────────────────────────────────────

6.3.7  PWB's extmake Syntax

NMAKE description files can use the same syntax as the extmake switch of PWB
(see Chapter 8, "Customizing the Microsoft Programmer's WorkBench"). This
syntax allows you to determine the drive, path, base name, and extension of
the first dependent, information that is not otherwise available.

The file name, and parts of its name, are represented using the syntax

%|partsF

where parts is one or more of the following:

Letter                            Description
────────────────────────────────────────────────────────────────────────────
d                                 Drive
e                                 File extension
f                                 File base name
p                                 Path
s                                 Complete name

The following example uses extmake syntax:

sample.obj : sample.c
     CL /Fod:%|pfF %|dfeF

In this example, the sequence %|pfF represents the path and base name of the
first dependent file, while the sequence %|dfeF represents the drive, base
name, and extension of the same file. The example, then, compiles the file
and writes the output to a file on the same path but with the default .OBJ
extension.

The percent symbol (%) is a replacement character in DOS and OS/2 command
lines in the description file.
To use extmake syntax in command-line arguments, specify each percent symbol as a double percent symbol (%%). 6.4 Command-Line Options NMAKE accepts a number of options, which are listed in Table 6.7. You can specify options in uppercase or lowercase and use either a slash or dash. For example, -A, /A, -a, and /a all represent the same option. Table 6.7 NMAKE Options ╓┌─────────────────────────────────┌─────────────────────────────────────────╖ Option Action ──────────────────────────────────────────────────────────────────────────── /A Builds all of the requested targets even if they are not out-of-date. /C Suppresses nonfatal error or warning messages and the NMAKE logo display. /D Displays the modification date of each file. /E Causes environment variables to override Option Action ──────────────────────────────────────────────────────────────────────────── /E Causes environment variables to override macro definitions in description files. See Section 6.3.3, "Macros." /F filename Specifies filename as the name of the description file. If you supply a dash (-) instead of a file name, NMAKE gets input from the standard input device instead of the description file. /HELP Calls the QuickHelp utility. If the QuickHelp program is not available, NMAKE displays the most commonly used NMAKE options. /I Ignores return codes from commands listed in the description file. NMAKE processes the whole description file even if errors occur. Option Action ──────────────────────────────────────────────────────────────────────────── even if errors occur. /N Displays but does not execute the description file's commands. This option is useful for debugging description files and checking which targets are out-of-date. /NOLOGO Suppresses the NMAKE logo display. /P Displays all macro definitions and target descriptions on the standard output device. /Q Returns zero if the target is up-to-date and nonzero if it is not. This option is useful when running NMAKE from a batch file. 
Option Action ──────────────────────────────────────────────────────────────────────────── /R Ignores inference rules and macros that are predefined or defined in the TOOLS.INI file. /S Suppresses the display of commands listed in the description file. /T Changes the modification dates for out-of-date target files to the current date. /X filename Sends all error output to filename, which can be a file or a device. If you supply a dash (-) instead of a file name, the error output is sent to the standard output device. /Z Used for internal communication between Option Action ──────────────────────────────────────────────────────────────────────────── /Z Used for internal communication between NMAKE and PWB. /? Displays a brief summary of NMAKE syntax and exits to the operating system. ──────────────────────────────────────────────────────────────────────────── The following command specifies two NMAKE options: NMAKE /f sample.mak /c targ1 targ2 The /f option tells NMAKE to read the description file SAMPLE.MAK. The /c option tells NMAKE not to display nonfatal error messages and warnings. The command lists two targets (targ1 and targ2) to update. NMAKE /D /N targ1 targ1.mak In the example above, NMAKE updates the target targ1. If the current directory does not contain a file named MAKEFILE, NMAKE reads the file TARG1.MAK as the description file. The /D option displays the modification date of each file; the /N option displays the commands without executing them. 6.5 NMAKE Command Files Occasionally, you may need to give NMAKE a long list of command-line arguments that exceeds the maximum length of a command line (128 characters in DOS, 256 in OS/2). To do this, place the command arguments in a file, then give the name of the file when you run NMAKE. 
For instance, say that you create a file named UPDATE, which consists of this line: /S "program = sample" sort.exe search.exe If you start NMAKE with the command NMAKE @update NMAKE reads its command-line arguments from UPDATE. The at sign (@) tells NMAKE to read arguments from the file. The effect is the same as if you typed the arguments directly on the command line: NMAKE /S "program = sample" sort.exe search.exe Within the file, line breaks between arguments are treated as spaces. Macro definitions that contain spaces must be enclosed in quotation marks, just as if you typed them on the command line. You can continue a macro definition across multiple lines by ending each line except the last with a backslash ( \ ): /S "program \ = sample" sort.exe search.exe This file is equivalent to the first example. The backslash in the example allows the macro definition ("program = sample" ) to span two lines. 6.6 The TOOLS.INI File You can customize NMAKE by placing commonly used macros and inference rules in the TOOLS.INI initialization file. Settings for NMAKE must follow a line that begins with [NMAKE]. This part of the initialization file can contain macro definitions, .SUFFIXES lists, and inference rules. For example, [NMAKE] CC=cl CFLAGS=-Gc -Gs -W3 -Oat .c.obj:$(CC) -c $(CFLAGS)$*.c

If TOOLS.INI contains the code above, NMAKE reads and applies the lines
following [NMAKE]. The example defines the macros CC and CFLAGS and
redefines the inference rule for making .OBJ files from .C sources.

NMAKE looks for TOOLS.INI in the current directory. If it is not found
there, NMAKE searches the directory specified by the INIT environment
variable.
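
The search order just described can be sketched in a few lines of Python.
This is only an illustration of the two-step lookup, not NMAKE's actual
code; the function name find_tools_ini is hypothetical, and the sketch
assumes INIT names a single directory.

```python
import os

def find_tools_ini(current_dir, environ):
    """Return the path of TOOLS.INI, or None if neither location has it."""
    # First choice: the current directory.
    candidates = [os.path.join(current_dir, "TOOLS.INI")]
    # Second choice: the directory named by INIT, if that variable is set.
    # (Assumption: INIT holds exactly one directory.)
    init_dir = environ.get("INIT")
    if init_dir:
        candidates.append(os.path.join(init_dir, "TOOLS.INI"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Calling find_tools_ini(os.getcwd(), os.environ) mirrors the lookup for the
current session.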

6.7  In-Line Files

NMAKE can write "in-line files," which can contain any text you specify. One
use for in-line files is to write a response file for another utility such
as LIB. (Response files are useful when you need to supply a program with a
long list of arguments that exceeds the maximum length of the command line.)

Use this syntax to create an in-line file:

target : dependents
    command << «filename»
inlinetext
<<«KEEP | NOKEEP»

All of the text between the two sets of double angle brackets (<<) is placed
in the in-line file. The filename is optional. If you don't supply filename,
NMAKE gives the in-line file a unique name. NMAKE places the in-line file in
the current directory or, if the TMP environment variable is defined, in the
directory specified by TMP.

The in-line file can be temporary or permanent. If you don't specify
otherwise, or if you specify NOKEEP, it is temporary. Specify KEEP to retain
the file.

The following example creates a LIB response file named LIB.LRF:

math.lib : add.obj sub.obj mul.obj div.obj
    LIB @<<lib.lrf
math.lib
listing
<<KEEP

The resulting response file tells LIB which library to use, the commands to
execute, and the listing file to produce:

math.lib
listing

The in-line file specification can create more than one in-line file. For
instance,

target.abc : depend.xyz
    cat <<file1 <<file2
I am the contents of file1.
<<KEEP
I am the contents of file2.
<<KEEP

The example creates two in-line files named FILE1 and FILE2; then NMAKE
executes the command:

CAT file1 file2

The KEEP keywords tell NMAKE not to delete FILE1 and FILE2 when done.
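
To make the mechanics concrete, the following Python sketch models how a
command containing << markers might be split into the final command and its
in-line files. It is a simplified model based only on the description above;
the function name expand_inline_files and the generated name pattern
nm0001.tmp are hypothetical, and NMAKE's real unique names will differ.

```python
import re

def expand_inline_files(command, following_lines):
    """Return (expanded command, {file name: text}, set of kept files)."""
    names = []
    counter = 0

    def name_marker(match):
        # Replace each "<<name" marker with the file name itself.
        nonlocal counter
        name = match.group(1)
        if not name:
            counter += 1
            name = "nm%04d.tmp" % counter  # hypothetical unique name
        names.append(name)
        return name

    expanded = re.sub(r"<<(\S*)", name_marker, command)

    files, kept = {}, set()
    lines = iter(following_lines)
    for name in names:
        body = []
        # Collect text up to the closing "<<" line for this file.
        for line in lines:
            if line.startswith("<<"):
                if line[2:].strip().upper() == "KEEP":
                    kept.add(name)
                break
            body.append(line)
        files[name] = "\n".join(body) + "\n"
    return expanded, files, kept
```

Applied to the two-file example, the sketch returns the command
"cat file1 file2", the text of each file, and both names in the kept set.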

6.8  NMAKE Operations Sequence

If you are writing a complex description file, you may need to know the
exact order of steps that NMAKE follows. This section describes those steps
in order.

When you run NMAKE from the command line, its first task is to find the
description file, following these steps:

1.  If NMAKE is invoked with the /F option, it uses the file name
specified in the option.

2.  If /F is not specified, NMAKE looks for a file named MAKEFILE in the
current directory. If such a file exists, it is used as a description
file.

3.  If MAKEFILE is not in the current directory, NMAKE parses the command
line for the first string that is not an option or a macro definition
and treats this string as a file name. If the file-name extension does
not appear in the .SUFFIXES list, NMAKE uses the file as the
description file. If the extension appears in the .SUFFIXES list,
NMAKE tries additional strings until it finds a suitable file. (See
Section 6.3.6, "Pseudotargets," for a description of the .SUFFIXES
list.)

4.  If NMAKE still has not found a description file, it returns an error.

NMAKE stops searching for a description file as soon as it finds one, even
if other potential description files exist. If you specify /F, NMAKE uses
the file specified by that option even if MAKEFILE exists in the current
directory. Similarly, if NMAKE uses MAKEFILE, any description file listed in
the command line is treated as a target.
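
The four steps above can be sketched as a small Python function. This is a
reading of the documented steps rather than NMAKE's implementation. The name
find_description_file is hypothetical; the sketch handles only the "/F
filename" form with a space, and it assumes a command-line string qualifies
in step 3 only if a file of that name exists (the text does not say so, but
the assumption matches the earlier NMAKE /D /N targ1 targ1.mak example).

```python
import os

DEFAULT_SUFFIXES = (".obj", ".exe", ".c", ".asm")  # predefined .SUFFIXES list

def find_description_file(argv, file_exists=os.path.isfile,
                          suffixes=DEFAULT_SUFFIXES):
    """Return the description file NMAKE would pick, or None (an error)."""
    # Step 1: an explicit /F (or -F) option wins outright.
    for i, arg in enumerate(argv[:-1]):
        if arg.lower() in ("/f", "-f"):
            return argv[i + 1]
    # Step 2: otherwise use MAKEFILE from the current directory if present.
    if file_exists("MAKEFILE"):
        return "MAKEFILE"
    # Step 3: otherwise try each string that is neither an option nor a
    # macro definition and whose extension is not in the .SUFFIXES list.
    for arg in argv:
        if arg.startswith(("/", "-")) or "=" in arg:
            continue
        extension = os.path.splitext(arg)[1].lower()
        # Assumption: the candidate must exist as a file.
        if extension not in suffixes and file_exists(arg):
            return arg
    # Step 4: nothing found.
    return None
```

Passing file_exists as a parameter keeps the sketch testable without
touching the disk.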

Next, NMAKE updates every target listed on the command line. If none is
listed, NMAKE updates only the first target in the description file. (This
behavior differs from the older MAKE program's default; see Section 6.9,
"Differences between NMAKE and MAKE.")

NMAKE then applies macro definitions and inference rules in the following
order, from highest to lowest priority:

1.  Macros defined on the command line

2.  Macros defined in a description file or include file

3.  Inherited macros

4.  Macros defined in the TOOLS.INI file

5.  Predefined macros such as CC and AS

Definitions higher in this list take precedence over definitions lower in
the list. The /E option, however, causes inherited macros to override macros
defined in a description file. The /R option causes NMAKE to ignore macros
and inference rules that are predefined or defined in TOOLS.INI.
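
The priority list can be modeled as a lookup through an ordered sequence of
dictionaries. The sketch below is illustrative only: resolve_macro is a
hypothetical name, each source is assumed to be a plain dictionary, and the
/E behavior follows the description in Table 6.7 (environment variables
override definitions in description files).

```python
def resolve_macro(name, command_line, description_file, inherited,
                  tools_ini, predefined, e_option=False, r_option=False):
    """Return the value of a macro from the highest-priority source."""
    if e_option:
        # /E: inherited (environment) macros outrank the description file.
        sources = [command_line, inherited, description_file]
    else:
        sources = [command_line, description_file, inherited]
    if not r_option:
        # /R drops TOOLS.INI definitions and predefined macros entirely.
        sources += [tools_ini, predefined]
    for source in sources:
        if name in source:
            return source[name]
    return None  # the macro is not defined anywhere
```

For example, with CFLAGS defined as -W3 in the description file and -Od in
the environment, the sketch returns -W3 normally and -Od when e_option is
set.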

Now NMAKE updates each target in the order in which it appears in the
description file. It compares the date and time of each dependent with that
of the target and performs the commands needed to update the target. If you
specify the /A option or if the target is a pseudotarget, NMAKE updates the
target even if its dependents are not out-of-date.

If the target has no explicit dependents, NMAKE looks in the current
directory for one or more files whose extensions are in the .SUFFIXES list.
If it finds such files, NMAKE treats them as dependents and updates the
target according to the commands.

If no commands are given to update the target or if the dependents cannot be
found, NMAKE applies inference rules to build the target. By default, it
tries to build .EXE files from .OBJ files; and it tries to build .OBJ files
from .C and .ASM sources. In practice, this means you should specify .OBJ
files as dependents, because NMAKE compiles your source files when it can't
find the .OBJ files.

NMAKE normally quits processing the description file when a command returns
an error. In addition, if it cannot tell that the target was built
successfully, NMAKE deletes the partially created target. If you use the /I
command-line option, NMAKE ignores exit codes and attempts to continue
processing. The .IGNORE pseudotarget has the same effect. To prevent NMAKE
from deleting the partially created target, specify the target name in the
.PRECIOUS pseudotarget.

Alternatively, you can use the dash (-) command modifier to ignore the error
code for an individual command. An optional number after the dash tells
NMAKE to continue if the command returns an error code that is less than or
equal to the number, and to stop if the error code is greater than the
number.
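
The rule for the dash modifier reduces to a short comparison. The Python
sketch below illustrates it under the assumption that the modifier is
written as a dash, an optional number, and then the command; the function
name continues_after is hypothetical.

```python
import re

def continues_after(command_line, exit_code):
    """Return True if processing would continue after this command."""
    match = re.match(r"-(\d+)?\s*", command_line.lstrip())
    if match is None:
        # No dash modifier: any nonzero exit code stops processing.
        return exit_code == 0
    if match.group(1) is None:
        # A bare dash: the exit code is ignored.
        return True
    # Dash with a number: continue only up to that exit code.
    return exit_code <= int(match.group(1))
```

With the command line "-2 LIB math;", an exit code of 2 lets processing
continue, while an exit code of 3 stops it.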

You can help document errors by using the !ERROR directive to print
descriptive text. The directive causes NMAKE to print some text, then stop,
even if you use /I, .IGNORE, or the dash (-) modifier.

6.9  Differences between NMAKE and MAKE

As its name implies, NMAKE is a new utility that replaces the older
Microsoft MAKE program. NMAKE differs from MAKE in the following ways:

■   NMAKE does not evaluate targets sequentially. Instead, NMAKE updates
the targets you specify when you invoke it, regardless of their
positions in the description file. If no targets are specified, NMAKE
updates only the first target in the file.

■   NMAKE accepts command-line arguments from a file.

■   NMAKE provides more command-line options.

■   NMAKE provides more predefined macros.

■   NMAKE permits substitutions within macros.

■   NMAKE supports directives placed in the description file.

■   NMAKE allows you to specify include files in the description file.

The first item in the list deserves special emphasis. While MAKE normally
builds every target, working from beginning to end of the description file,
NMAKE expects you to specify targets on the command line. If you do not,
NMAKE builds only the first target in the description file.

The difference is clear if you run NMAKE using a typical MAKE description
file, which lists a series of subordinate targets followed by a higher-level
target that depends on the subordinates:

pmapp.obj : pmapp.c
    CL /c /G2sw /W3 pmapp.c

pmapp.exe : pmapp.obj pmapp.def
    LINK pmapp, /align:16, NUL, os2, pmapp

MAKE builds both targets (PMAPP.OBJ and PMAPP.EXE), but NMAKE builds only
the first target (PMAPP.OBJ).

Because of this difference in behavior, you may want to convert MAKE files
to NMAKE files. MAKE description files are easy to convert. A simple method
is to create a new description block at the beginning of the file. Give this
block a pseudotarget named ALL and list the top-level target as a dependent
of ALL. To build ALL, NMAKE must update every target upon which ALL
depends:

ALL : pmapp.exe

pmapp.obj : pmapp.c
    CL /c /G2sw /W3 pmapp.c

pmapp.exe : pmapp.obj pmapp.def
    LINK pmapp, /align:16, NUL, os2, pmapp

If the above file is named MAKEFILE, you can update the target PMAPP.EXE
with the command

NMAKE

or the command

NMAKE ALL

Note that it is not necessary to list PMAPP.OBJ as a dependent of ALL.
NMAKE builds a dependency tree for the entire description file, and builds
whatever files are needed to update PMAPP.EXE. So if PMAPP.C is out-of-date
with respect to PMAPP.OBJ, NMAKE compiles PMAPP.C to create PMAPP.OBJ, then
links PMAPP.OBJ to create PMAPP.EXE.
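
The walk through the dependency tree can be sketched as a depth-first
traversal. The Python below is a simplified model, not NMAKE's algorithm:
build_order is a hypothetical name, rules maps each target to its list of
dependents, and mtimes maps existing files to illustrative modification
times. A name with no entry in mtimes (such as the pseudotarget ALL) is
treated as always out-of-date.

```python
def build_order(target, rules, mtimes, force_all=False):
    """Return the targets whose commands must run, dependents first."""
    order = []
    rebuilt = set()
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        dependents = rules.get(name)
        if dependents is None:
            return  # a plain source file: no commands to run
        for dependent in dependents:
            visit(dependent)
        missing = name not in mtimes  # pseudotarget or file not yet built
        stale = any(d in rebuilt or mtimes.get(d, 0) > mtimes.get(name, 0)
                    for d in dependents)
        if force_all or missing or stale:  # force_all models the /A option
            rebuilt.add(name)
            order.append(name)

    visit(target)
    return order
```

Starting from ALL with PMAPP.C newer than PMAPP.OBJ, the sketch lists
pmapp.obj, then pmapp.exe, then ALL, which matches the sequence described
above.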

The same technique is suitable for description files with more than one
top-level target. List all of the top-level targets as dependents of ALL:

ALL : pmapp.exe second.exe another.exe

The example updates the targets PMAPP.EXE, SECOND.EXE, and ANOTHER.EXE.

If the description file lists a single, top-level target, you can use an
even simpler technique. Move the top-level block to the beginning of the
file:

pmapp.exe : pmapp.obj pmapp.def
    LINK pmapp, /align:16, NUL, os2, pmapp

pmapp.obj : pmapp.c
    CL /c /G2sw /W3 pmapp.c

NMAKE updates the second target (PMAPP.OBJ) whenever needed to keep the
first target (PMAPP.EXE) current.

Chapter 7  Creating Help Files with HELPMAKE
────────────────────────────────────────────────────────────────────────────

If you have used PWB or other Microsoft language products such as QuickC,
you are familiar with the many advantages of on-line help. The Microsoft
Help-File-Creation Utility (HELPMAKE) allows you to create your own help
files for use with Microsoft products. It also allows you to customize the
help files supplied with Microsoft language products.

HELPMAKE translates help text files into a help database accessible from
within the following:

■   Microsoft C 6.0 Programmer's WorkBench (PWB)

■   QuickHelp Utility

■   Microsoft Editor 1.02

■   Microsoft QuickC 2.0

■   Microsoft QuickPascal 1.0

■   Microsoft QuickBASIC 4.5

This chapter describes how to create and modify help files using the
HELPMAKE utility.

7.1  Structure and Contents of a Help Database

HELPMAKE creates a help database from one or more input files that contain
information formatted for the help system. This section defines some of the
terms involved in formatting and outlines the formats that HELPMAKE can
process.

7.1.1  Contents of a Help File

As you might expect, each help text file starts with a topic and some
information about the topic, then lists another topic and some information
about it, and so on. In HELPMAKE terminology, topics are called "contexts";
the information is called "topic text."

The .context command introduces a context. In the source file for C 6.0
help, for example, this line introduces help for the open function:

.context open

The .context command and other formatting elements are described in Section
7.5, "Help Text Conventions."

Whether a context is one or several words depends on the application.
QuickBASIC, for example, considers spaces to be delimiters, so in QuickBASIC
help files contexts are limited to a single word. Other applications, such
as the Microsoft Editor, can handle contexts that span several words. Either
way, the application simply hands the context to an internal "help engine,"
which searches the database for information.

Often, especially with library routines, the same information applies to
more than one subject. For example, the string-to-number functions strtod,
strtol, and strtoul share the same help text. The help file lists all three
function names as contexts for one block of topic text. The converse,
however, is not true. You cannot specify different blocks of topic text, in
different places in the help file, to describe a single subject.
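
The many-contexts-to-one-topic relationship can be pictured as an index in
which several keys point at the same text. The Python sketch below builds
such an index from lines that use only the .context command; it is an
illustration, not the help engine's actual data structure. The function name
index_contexts is hypothetical, and the topic text in the usage note is made
up.

```python
def index_contexts(lines):
    """Map each context string to the topic text that follows it."""
    index = {}
    current = None   # topic lines shared by the contexts being collected
    in_text = True   # True once topic text has started for `current`
    for line in lines:
        if line.startswith(".context "):
            if in_text:
                # A .context line after topic text starts a new block.
                current = []
                in_text = False
            context = line[len(".context "):].strip()
            if context in index:
                # One subject cannot have two separate blocks of text.
                raise ValueError("duplicate context: " + context)
            index[context] = current
        elif current is not None:
            current.append(line)
            in_text = True
    return {context: "\n".join(text) for context, text in index.items()}
```

Given three consecutive .context lines for strtod, strtol, and strtoul
followed by one paragraph, all three keys return that same paragraph.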

Cross-references help you navigate through a help database.

Cross-references make it possible to view information about related topics,
including header files and code examples. The help for the open function,
for example, references the access function and the ASCII header file
FCNTL.H. Cross-references can point to other contexts in the same help
database, to contexts in other help databases, or to ASCII files outside the
database.

Help files can have two kinds of cross-references:

■   Implicit

■   Explicit, or hyperlinks

Implicit cross-references are coded with an ordinary .context command.

The word "open" is an implicit cross-reference throughout C 6.0 help. If you
select the word "open" anywhere in C 6.0 help, the help system displays
information on the open function. As illustrated above, the context for open
begins with an ordinary .context command. As a result, anywhere that you
select "open," the help system references this context.

Hyperlinks are explicit cross-references marked by invisible text.

A "hyperlink" is an explicit cross-reference tied to a word or phrase at a
specific location in the help file. You create hyperlinks when you write the
help text. The hyperlink consists of a word or phrase followed by invisible
text that gives the context to which the hyperlink refers.

For example, to cause an instance of the word "formatting" to display help
on the printf function, you would create an explicit cross-reference from
the word "formatting" to the context "printf." Elsewhere in the file,
"formatting" has no special significance but, at that one position, it
references the help for printf. Section 7.5.4 describes how to create
hyperlinks.

Formatting flags let you change the appearance of text.

Help text can also include formatting flags to control the appearance of the
text on the screen. Using these flags, you can make certain words appear in
various colors, inverse video, and so forth, depending on the application
displaying help and the graphics capabilities of the host computer.

7.1.2  Help File Formats

You can create help files using any of three formats:

■   QuickHelp format

■   Rich Text Format (RTF)

■   Minimally formatted ASCII

In addition, you can reference unformatted ASCII files, such as include
files, from within a help database.

An entire help system (such as the one supplied with Microsoft C or
QuickBASIC) can use any combination of files formatted with different format
types. With C, for example, the README.DOC information file is encoded as
minimally formatted ASCII; the help files for the PWB, C language, and
run-time library are encoded in the QuickHelp format. The database also
cross-references the header (include) files, which are unformatted ASCII
files stored outside the database.

QuickHelp

QuickHelp format is the default and is the format into which HELPMAKE
decodes help databases. Use any text editor to create a QuickHelp-format
help text file. QuickHelp format also lends itself to a relatively easy
automated translation from other document formats.

QuickHelp files can contain any kind of cross-reference or formatting
attribute. Typically, you use QuickHelp format for any changes to a database
supplied by Microsoft.

RTF

Rich Text Format (RTF) is a Microsoft word-processing format that many other
word processors also support. You can create RTF help text with any word
processor that generates RTF output. You can also use any utility program
that takes word-processor output and produces an RTF file.

Use RTF when you want to transfer help files from one application to another
while retaining formatting information. You can format RTF files directly
with the word-processing program; you need not edit them to insert any
special commands or tags. Like QuickHelp files, RTF files can contain
formatting attributes and cross-references.

Minimally Formatted ASCII

Minimally formatted ASCII files simply define contexts and their topic text.
These files cannot contain screen-formatting commands or explicit
cross-references (implicit cross-references are allowed). They are often used
to display text such as README.DOC and small help files that do not require
compression.

Unformatted ASCII

Unformatted ASCII files are exactly what their name implies: regular ASCII
files with no special formatting commands, context definitions, or special
information. An unformatted ASCII file does not become part of the help
database. Only its name is used as the object of a cross-reference. The
standard C header (include) files are unformatted ASCII files used for
cross-references by the help system for the C run-time library. Unformatted
ASCII files are also useful for storing program examples.

7.2  Invoking HELPMAKE

The HELPMAKE program can encode or decode help files, allowing you to create
new help files or modify existing ones. Encoding converts a text file to a
compressed help database. HELPMAKE can encode text files written in
QuickHelp, RTF, and minimally formatted ASCII format. Decoding converts a
help database to a text file for editing. HELPMAKE always decodes a help
database into a QuickHelp format text file.

Invoke HELPMAKE with the following syntax:

HELPMAKE «options» { /E«n» | /D } { sourcefiles }

The options modify the action of HELPMAKE; they are described in Section
7.3.

Use the /E option to encode with HELPMAKE and use the /D option to decode.

You must supply either the /E (encode) or the /D (decode) option. When
encoding (/E) to create a help database, you must use the /O option to
specify the file name of the database.

The sourcefiles field is required. It specifies the input files for
HELPMAKE. If you use the /D (decode) option, sourcefiles can be one or more
help database files (such as QC.HLP). HELPMAKE decodes the database files
into a single text file. If you use the /E (encode) option, sourcefiles can
be one or more help text files (such as QC.SRC). Separate file names with a
space. Standard wild-card characters can also be used.

The example below invokes HELPMAKE with the /V, /E, and /O options (see
Section 7.3.1, "Options for Encoding"). HELPMAKE reads input from the text
file my.txt and writes the compressed help database to the file my.hlp.
Because /E is given without a value, HELPMAKE applies maximum compression.
Note that the DOS redirection symbol (>) sends a log of HELPMAKE activity to
the file my.log. You may find it helpful to redirect the log because, in its
more verbose modes (given by /V), HELPMAKE may generate a lengthy log.

HELPMAKE /V /E /Omy.hlp my.txt > my.log

The example below invokes HELPMAKE to decode the help database my.hlp into
the text file my.src, given with the /O option. Once again, the /V option
results in verbose output, and the output is directed to the log file
my.log. Section 7.3.2 describes additional options for decoding.

HELPMAKE /V /D /Omy.src my.hlp > my.log

7.3  HELPMAKE Options

HELPMAKE accepts a number of command-line options, which are described
below. You can specify options in uppercase or lowercase letters, and
precede them with either a forward slash ( / ) or a dash (-). For example,
-L, /L, -l, and /l all represent the same option. Most options apply only to
encoding; others apply only to decoding; and a few apply to both.

7.3.1  Options for Encoding

When you encode a file─that is, when you build a help database─you must
specify the /E option. In addition, you can supply various other options
that control the way HELPMAKE works. All the options that apply when
encoding are listed below:

Option                            Action
────────────────────────────────────────────────────────────────────────────
/Ac                               Specifies c as an application-specific
                                  control character for the help database
                                  file. The character marks a line that
                                  contains special information for
                                  internal use by the application. For
                                  example, QuickC uses the colon (:).

/C                                Indicates that the context strings for
                                  this help file are case sensitive. At
                                  run time, all searches for help topics
                                  are case sensitive if the help database
                                  was built with the /C option in effect.

/E«n»                             Creates (encodes) a help database from a
                                  specified text file. The optional n
                                  indicates the amount of compression to
                                  take place. If n is omitted, HELPMAKE
                                  compresses the file as much as possible,
                                  thereby reducing the size of the file by
                                  about 50%. The more compression
                                  requested, the longer HELPMAKE takes to
                                  create a database file. The value of n
                                  is a number in the range 0 - 15. It is
                                  the sum of successive powers of 2
                                  representing various compression
                                  techniques, as listed below:

                                  Value       Technique
                                  ──────────────────────────────────────────
                                  0           No compression

                                  1           Run-length compression

                                  2           Key word compression

                                  4           Extended key word
                                              compression

                                  8           Huffman compression

                                  Add values to combine compression
                                  techniques. For example, use /E3 to get
                                  run-length and key word compression.
                                  This is useful in the testing stages of
                                  creating a help database when you need
                                  to create the database quickly and are
                                  not too concerned with size.

/H                                Displays a summary of HELPMAKE syntax
                                  and exits.

/HELP                             Invokes QH.EXE, the QuickHelp utility,
                                  for help about HELPMAKE. If QuickHelp is
                                  not available, displays the same
                                  information as the /H option.

/K filename                       Optimizes key word compression by
                                  supplying a list of characters that act
                                  as word separators. The filename is a
                                  file containing your list of separator
                                  characters.

                                  When you select key word compression,
                                  HELPMAKE scans the help file to identify
                                  "key words." A key word is any word that
                                  occurs often enough to justify replacing
                                  it with a shorter character sequence.
                                  HELPMAKE normally uses the following
                                  characters as word separators:

                                  ■ All characters from 0-32 (including
                                  the space)

                                  ■ !"#&'( )*+'-, /:;<=>?@[\]^_{|}~

                                  ■ 127

                                  When performing key word compression,
                                  HELPMAKE treats as a word any series of
                                  characters not appearing in the
                                  separator list.

                                  Depending on the content of your help
                                  file, you may be able to improve key
                                  word compression by using the /K option
                                  to specify a different list of separator
                                  characters. For instance, the default
                                  separator list contains the number sign
                                  (#). If your help file contains #include
                                  directives, HELPMAKE normally treats
                                  #include as the word include without a
                                  number sign. To cause HELPMAKE to treat
                                  #include as a word, you could specify
                                  the following separator list:

                                  !"&'()*+'-,/:;<=>?@[\]^_{|}~

                                  The list above does not include the
                                  number sign. HELPMAKE always treats
                                  characters in the range 0-32 as
                                  separators, so you do not need to
                                  include them. Your list must include
                                  all the other characters you want
                                  HELPMAKE to use as separators, including
                                  the space.

/L                                Locks the generated file so that it
                                  cannot be decoded by HELPMAKE at a later
                                  time.

/Odestfile                        Specifies destfile as the name of the
                                  help database.

/Sn                               Specifies the type of input file,
                                  according to the following n values:

                                  Option      File Type
                                  ──────────────────────────────────────────
                                  /S1         Rich Text Format (RTF)

                                  /S2         QuickHelp (default)

                                  /S3         Minimally formatted ASCII

/T                                Translates dot commands into internal
                                  format. If your help file contains dot
                                  commands other than .context, you should
                                  supply this option when encoding it. Dot
                                  commands are described in Section 7.6.1,
                                  "QuickHelp Format," and in later
                                  sections.

/V«n»                             Indicates the verbosity of diagnostic
                                  and informational output, depending on
                                  the value of n. Increasing the value
                                  adds more information to the output. If
                                  you specify only /V, HELPMAKE gives you
                                  its most verbose output; if you omit
                                  the option, HELPMAKE prints only its
                                  banner (/V1). The possible values of n
                                  are listed below:

                                  Option      Effect
                                  ──────────────────────────────────────────
                                  /V          Maximum diagnostic output

                                  /V0         No diagnostic output and no
                                              banner

                                  /V1         Prints only HELPMAKE banner
                                              (default)

                                  /V2         Prints pass names

                                  /V3         Prints contexts on first
                                              pass

                                  /V4         Prints contexts on each pass

                                  /V5         Prints any intermediate
                                              steps within each pass

                                  /V6         Prints statistics on help
                                              file and compression

/Wwidth                           Indicates the fixed width of the
                                  resulting help text in number of
                                  characters. The values of width can
                                  range from 11 to 255. If the /W option
                                  is omitted, the default is 76. When
                                  encoding RTF source (/S1), HELPMAKE
                                  automatically formats the text to width.
                                  When encoding QuickHelp (/S2) or
                                  minimally formatted ASCII (/S3) files,
                                  HELPMAKE truncates lines to this width.
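
Two entries in the table above lend themselves to a short illustration: the
/E value as a sum of powers of 2, and the effect of the /K separator list on
what counts as a word. The Python sketch below models both. The function
names are hypothetical, treating an omitted n as 15 is an assumption (the
text says only that HELPMAKE compresses as much as possible), and the
punctuation string is copied from the second list above with the number sign
added back.

```python
TECHNIQUES = {
    1: "run-length",
    2: "key word",
    4: "extended key word",
    8: "Huffman",
}

def compression_techniques(n=15):
    """List the techniques selected by /En (n omitted is assumed to be 15)."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0-15")
    # Each power of 2 present in n switches on one technique.
    return [name for bit, name in TECHNIQUES.items() if n & bit]

DEFAULT_PUNCTUATION = "!\"#&'()*+'-,/:;<=>?@[\\]^_{|}~"

def split_words(text, punctuation=DEFAULT_PUNCTUATION):
    """Split text into words using a HELPMAKE-style separator list."""
    # Characters 0-32 and 127 always separate words.
    separators = set(punctuation) | {chr(c) for c in range(33)} | {chr(127)}
    words, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words
```

With the default list, split_words("#include <stdio.h>") yields include and
stdio.h; passing DEFAULT_PUNCTUATION.replace("#", "") instead keeps #include
together as one word, which is the effect the /K example aims for.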

7.3.2  Options for Decoding

To decode a help database into QuickHelp files, you must use the /D option.
In addition, HELPMAKE accepts other options to control the decoding process.
The list below shows all the options that are valid when decoding:

Option                            Action
────────────────────────────────────────────────────────────────────────────
/D«letter»                        Decodes the input file into its original
                                  text or component parts. If a
                                  destination file is not specified with
                                  the /O option, the help file is decoded
                                  to stdout. HELPMAKE decodes the file
                                  differently depending on the letter
                                  specified:

                                  Letter      Effect
                                  ──────────────────────────────────────────
                                  /D          "Decode." Fully decodes the
                                              help database, leaving all
                                              cross-references and
                                              formatting information
                                              intact.

                                  /DS         "Decode split." Splits the
                                              concatenated, compressed
                                              help database into its
                                              components using their
                                              original names. If the
                                              database was created without
                                              concatenation (the default),
                                              HELPMAKE simply copies it to
                                              a file with its original
                                              name. No decompression
                                              occurs.

                                  /DU         "Decode unformatted."
                                              Decompresses the database
                                              and removes all screen
                                              formatting and
                                              cross-references. The output
                                              can still be used later for
                                              input and recompression, but
                                              all screen formatting and
                                              cross-references are lost.

/H                                Displays a summary of HELPMAKE syntax
                                  and exits without encoding or decoding
                                  any files.

/HELP                             Invokes QH.EXE, the QuickHelp utility,
                                  for information about HELPMAKE. If
                                  QuickHelp is not available, displays the
                                  same information as the /H option.

/Odestfile                        Specifies destfile for the decoded
                                  output from HELPMAKE. If destfile is
                                  omitted, the help database is decoded to
                                  stdout. HELPMAKE always decodes help
                                  database files into QuickHelp format.

/T                                Translates dot commands from internal
                                  format into dot-command format. You
                                  should always supply this option when
                                  decoding a help database that contains
                                  dot commands other than .context.

/V«n»                             Indicates the verbosity of diagnostic
                                  and informational output depending on
                                  the value of n. The possible values are
                                  listed below. If you omit this option or
                                  specify only /V, HELPMAKE gives you its
                                  most verbose output.

                                  Option      Effect
                                  ──────────────────────────────────────────
                                  /V          Maximum diagnostic output

                                  /V0         No diagnostic output and no
                                              banner

                                  /V1         Prints only the HELPMAKE
                                              banner

                                  /V2         Prints pass names

                                  /V3         Prints contexts on first
                                              pass

7.4  Creating a Help Database

You can create a Microsoft-compatible help database by either of two
methods.

The first method is to decompress an existing help database, modify the
resulting help text file, and recompress the help text file to form a new
database.

The second and simpler method is to append a new help database to an
existing help database. This method involves the following steps:

1.  Create a help text file in QuickHelp format, RTF, or minimally
formatted ASCII.

2.  Use HELPMAKE to create a help database file. The example below invokes
HELPMAKE, using SAMPLE.TXT as the input file and producing a help
database file named  sample.hlp:

HELPMAKE /V /E /Osample.hlp sample.txt > sample.log

3.  Make a backup copy of the existing database file (for safety's sake).

4.  Append the new help database file to the existing help database. The
example below concatenates the new database  sample.hlp  onto the end
of the CLANG.HLP database:

COPY clang.hlp /b + sample.hlp /b

5.  Test the database. The  sample.hlp  database contains the context
sample. If you type the word "sample" in the PWB and request help on
it, the help window displays the text associated with the context
sample.
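Step 4 works because a help database can be extended by plain binary concatenation. For readers who want to perform that step from a program rather than from the command line, the following is a minimal C sketch of the same operation as `COPY clang.hlp /b + sample.hlp /b`. The function name `append_file` is hypothetical; it is not part of HELPMAKE or of any Microsoft library.

```c
#include <stdio.h>

/* Append the raw bytes of src to the end of dest in binary mode,
   mirroring COPY dest /b + src /b.
   Returns 0 on success, -1 on any I/O error.
   append_file is an illustrative name, not a HELPMAKE or DOS API. */
int append_file(const char *dest, const char *src)
{
    FILE *in, *out;
    char buf[512];
    size_t n;
    int status = 0;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dest, "ab");            /* "ab": append, binary */
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;                /* short write */
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)
        status = -1;
    return status;
}
```

A call such as `append_file("clang.hlp", "sample.hlp")` would correspond to the COPY command in step 4. As with that command, make the backup described in step 3 first.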

7.5  Help Text Conventions

Microsoft help databases have a common structure and follow certain
organizational conventions. You should follow the same conventions to create
Microsoft-compatible help files.

7.5.1  Structure of the Help Text File

The help-retrieval capability that is built into Microsoft products is
simply a data-retrieval tool. It imposes no restrictions on the content and
format of the help text. The HELPMAKE utility and the display routines built
into Microsoft language environments, however, make certain assumptions
about the format of help text. This section provides some guidelines for
creating help text files compatible with those assumptions.

In all three help text formats, the help text source file is a sequence of
topics, each preceded by one or more unique context definitions. The
following list specifies the various formats and the corresponding context
definition statements:

Format                            Context Definition
────────────────────────────────────────────────────────────────────────────
QuickHelp                         .context context

RTF                               \par >>context \par

Minimally formatted ASCII         >>context

In QuickHelp format, each topic begins with one or more .context statements
that define the context strings that map to the topic text. Subsequent lines
up to the next .context statement constitute the topic text.

In RTF format, each context definition must be in a paragraph of its own
(denoted by \par), beginning with the help delimiter (>>). Subsequent
paragraphs up to the next context definition constitute the topic text.

In minimally formatted ASCII, each context definition must be on a separate
line, and each must begin with the help delimiter (>>). As in RTF and
QuickHelp files, subsequent lines up to the next context definition
constitute the topic text.

See Section 7.6, "Using Help Database Formats," for detailed information
about these three formats.
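As an illustration of the context definitions described above, the following C sketch recognizes a context definition line in QuickHelp or minimally formatted ASCII source and returns the context string. It is not HELPMAKE's own parser: the function name `context_of` is hypothetical, the match on `.context` is assumed to be exact and lowercase, and RTF input is not handled.

```c
#include <string.h>

/* Illustrative helper (not part of HELPMAKE): if line is a context
   definition, return a pointer to the context string inside line;
   otherwise return NULL.
   Recognizes ".context name" (QuickHelp) and ">>name" (minimally
   formatted ASCII). */
const char *context_of(const char *line)
{
    static const char dotcmd[] = ".context";
    size_t len = sizeof dotcmd - 1;

    if (strncmp(line, ">>", 2) == 0)
        return line + 2;                /* text after the delimiter */
    if (strncmp(line, dotcmd, len) == 0 &&
        (line[len] == ' ' || line[len] == '\t')) {
        line += len;
        while (*line == ' ' || *line == '\t')
            line++;                     /* skip blanks before the name */
        return line;
    }
    return NULL;                        /* ordinary topic text */
}
```

A program that scans a help text file line by line could treat every line for which `context_of` returns a non-null pointer as the start (or continuation) of a group of context definitions, and every other line as topic text.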

7.5.2  Local Contexts

Context strings that begin with an "at" sign (@) are defined as "local" and
have no implicit cross-references. They are used in cross-references instead
of the context string that otherwise is generated.

When you use a local context, HELPMAKE does not generate a global context
string (a context string that is known throughout the help file). Instead,
it embeds an encoded cross-reference that has meaning only within the
current context. For example,

.context normal
This is a normal topic, accessible by the context string "normal."
[button\v@local\v] is a cross-reference to the following topic.

.context @local

This topic can be reached only if the user browses
sequentially through the file or uses the cross-reference
in the previous topic.

In the example above, the text  [button\v@local\v]  defines  local  as a
local context. If the user selects the text  [button]  or scrolls through
the file, the help system displays the topic text that follows the context
definition for  local. Because  local  is defined with the "at" sign (@), it
can be accessed only by a hyperlink within the help file or by sequentially
browsing through the file. Making a context local saves file space and
speeds access.

7.5.3  Context Prefixes

Microsoft help databases use several context prefixes. A "context prefix" is
a single letter followed by a period. It appears before a context string
that has a predefined meaning. If you decode a Microsoft help database, many
of these contexts may appear in the resulting text file.

Except for the h. prefix, which is described below, context prefixes are
internal. You do not need to add them in help files that you write.

You can use the h. prefix to identify standard help-file contexts. For
instance, h.default identifies the default help screen: the screen that
normally appears when you select "top-level" help. Table 7.1 lists the
standard h. contexts.

Table 7.1  Standard h. Contexts

Context                           Description
────────────────────────────────────────────────────────────────────────────
h.contents                        The table of contents for the help file.
                                  You should also define the string
                                  "contents" for direct reference to this
                                  context.

h.default                         The default help screen, typically
                                  displayed when the user presses SHIFT+F1
                                  at the "top level" in most applications.
                                  The contents are generally devoted to
                                  information about using help.

h.index                           The index for the help file. You can
                                  also define the string "index" for
                                  direct reference to this context.

h.notfound                        The help text that is displayed when the
                                  help system cannot find information
                                  about the requested context. The text
                                  could be an index of contexts, a topical
                                  list, or general information about using
                                  help.

h.pg#                             A specific page within the help file.
                                  This is used in response to a "go to
                                  page #" request.

h.pg$                             The help text that is logically last in
                                  the file. This is used by some
                                  applications in response to a "go to the
                                  end" request made within the help window.

h.pg1                             The help text that is logically first in
                                  the file. This is used by some
                                  applications in response to a "go to the
                                  beginning" request made within the help
                                  window.

h.title                           The title of the help database.

────────────────────────────────────────────────────────────────────────────

The context prefixes in Table 7.2 are internal to Microsoft products. They
appear in decompressed databases, but you do not need to use them.

Table 7.2  Microsoft Product Context Prefixes

Prefix                            Purpose
────────────────────────────────────────────────────────────────────────────
d.                                Dialog box. Each dialog box is assigned
                                  a number. Its help context string is d.
                                  followed by the number (for example,
                                  d.12).

e.                                Error number. If a product supports the
                                  error-numbering scheme used by Microsoft
                                  languages, it displays help for each
                                  error using this prefix. For example,
                                  the context  e.c1234  refers to the C
                                  compiler error message number C1234.

m.                                Menu item. Contexts that relate to
                                  product menu items are defined by their
                                  accelerator keys. For example, the Exit
                                  selection on the FILE menu item is
                                  accessed by ALT+F X and is referenced in
                                  help by  m.f.x.

n.                                Message number. Each message box is
                                  assigned a number. Its help context
                                  string is n. plus the number (for
                                  example,  n.5 ).

────────────────────────────────────────────────────────────────────────────
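The prefix conventions of Sections 7.5.2 and 7.5.3 can be summarized in a few lines of C. The sketch below classifies a context string by its leading characters. It is only an illustration of the naming scheme: the enumeration and the function name `classify_context` are hypothetical, and the real help engine may apply further rules that this manual does not describe.

```c
/* Kinds of context string described in Sections 7.5.2 and 7.5.3.
   The enum and function names are illustrative only. */
enum context_kind {
    CTX_GLOBAL,     /* ordinary context string, e.g. "open"  */
    CTX_LOCAL,      /* begins with '@', e.g. "@local"        */
    CTX_HELP,       /* h. prefix, e.g. "h.default"           */
    CTX_DIALOG,     /* d. prefix, e.g. "d.12"                */
    CTX_ERROR,      /* e. prefix, e.g. "e.c1234"             */
    CTX_MENU,       /* m. prefix, e.g. "m.f.x"               */
    CTX_MESSAGE     /* n. prefix, e.g. "n.5"                 */
};

enum context_kind classify_context(const char *ctx)
{
    if (ctx[0] == '@')
        return CTX_LOCAL;
    /* A context prefix is a single letter followed by a period. */
    if (ctx[0] != '\0' && ctx[1] == '.') {
        switch (ctx[0]) {
        case 'h': return CTX_HELP;
        case 'd': return CTX_DIALOG;
        case 'e': return CTX_ERROR;
        case 'm': return CTX_MENU;
        case 'n': return CTX_MESSAGE;
        default:  break;                /* unknown prefix: treat as global */
        }
    }
    return CTX_GLOBAL;
}
```

A tool that post-processes a decoded Microsoft database could use a test like this to separate the internal d., e., m., and n. contexts from the contexts an author is expected to maintain.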

Explicit cross-references, or hyperlinks, in the help text file are marked
with invisible text. A hyperlink comprises a word or phrase followed by
invisible text that gives the context to which the hyperlink refers.

The keystroke that activates the hyperlink depends on the application.
Consult the documentation for each product to find the specific keystroke
needed.

When the user activates the hyperlink, the help system displays the topic
named by the invisible text. The invisible cross-reference text is formatted
as one of the following:

────────────────────────────────────────────────────────────────────────────
contextstring                     Causes the help topic associated with
                                  contextstring to be displayed. For
                                  example,  exeformat  results in the
                                  display of the help topic associated
                                  with the context  exeformat.

filename!                         Treats filename as a single topic to be
                                  displayed. For example,
                                  $INCLUDE:stdio.h!  searches the INCLUDE
                                  environment variable for file STDIO.H
                                  and displays it as a single help topic.

filename!contextstring            Works the same way as contextstring
                                  above, except that only the help file
                                  filename is searched for the context. If
                                  the file is not already open, the help
                                  system finds it (by searching either the
                                  current path or an explicit environment
                                  variable) and opens it. For example,
                                  $BIN:readme.doc!patches  searches for
                                  readme.doc  in the BIN environment
                                  variable and displays the topic
                                  associated with  patches.
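The three forms of invisible cross-reference text differ only in where the exclamation point falls. The following C sketch splits such text into a file part and a context part. It is an assumption-laden illustration rather than the help engine's code: `split_xref` is a hypothetical name, and the sketch does not resolve environment-variable prefixes such as `$INCLUDE:` or handle any application-specific forms.

```c
#include <string.h>

/* Split invisible cross-reference text into its file and context parts.
       contextstring            -> file "",         context "contextstring"
       filename!                -> file "filename", context ""
       filename!contextstring   -> file "filename", context "contextstring"
   Both output buffers must hold at least size bytes. Returns 0 on
   success or -1 if a part does not fit.
   split_xref is an illustrative name, not a help-engine API. */
int split_xref(const char *xref, char *file, char *context, size_t size)
{
    const char *bang = strchr(xref, '!');
    size_t flen = bang ? (size_t)(bang - xref) : 0;
    const char *ctx = bang ? bang + 1 : xref;

    if (flen >= size || strlen(ctx) >= size)
        return -1;
    memcpy(file, xref, flen);           /* text before '!', if any */
    file[flen] = '\0';
    strcpy(context, ctx);               /* text after '!', or all of it */
    return 0;
}
```

For `$BIN:readme.doc!patches`, the sketch yields the file part `$BIN:readme.doc` and the context part `patches`. An empty file part means the context is looked up in the usual way, and an empty context part means the whole file is shown as one topic.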

In the following example, the word  Example  is a hyperlink:

\bSee also:\p Example\vopen.ex\v

The hyperlink refers to  open.ex. If you select any of the letters of
Example, the help system displays the topic whose context is  open.ex. On
the screen, this line appears as follows:

See also: Example

An application might display  See also:  and  Example  in different colors
or character types, depending on such factors as your default color
selection and type of monitor.

When a hyperlink needs to cross-reference more than one word, you must use
an anchor, as in the following example:

vfprintf, vprintf, vsprintf
\aformatting table\vprintf.table\v

This part of the example is an anchored hyperlink:

\aformatting table\vprintf.table\v

Anchored hyperlinks must fit on a single line.

The \a flag creates an anchor for the cross-reference. In the example, the
phrase following the \a flag (formatting table) is the hyperlink. It refers
to the context  printf.table. The first \v flag marks both the end of the
hyperlink and the beginning of the invisible text. The name  printf.table
is invisible; it does not appear on the screen when the help is displayed.
The second \v flag ends the invisible text.
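The structure of an anchored hyperlink (the \a flag, the visible phrase, the first \v flag, the invisible context, and the second \v flag) can be made concrete with a short C sketch that pulls the two parts out of a source line. This is not how HELPMAKE is implemented: `parse_anchor` is a hypothetical name, and the sketch looks only for lowercase `\a` and `\v` and does not give `\\` any special treatment.

```c
#include <string.h>

/* Extract the first anchored hyperlink of the form
       \a<hyperlink text>\v<context>\v
   from a QuickHelp source line. Copies the visible hyperlink text to
   link and the invisible context to context (each buffer holds size
   bytes). Returns 1 if a complete hyperlink was found and fits,
   0 otherwise. Illustrative sketch only. */
int parse_anchor(const char *line, char *link, char *context, size_t size)
{
    const char *a, *v1, *v2;
    size_t n;

    a = strstr(line, "\\a");
    if (a == NULL)
        return 0;
    a += 2;                             /* first char of hyperlink text */
    v1 = strstr(a, "\\v");              /* ends text, starts context */
    if (v1 == NULL)
        return 0;
    v2 = strstr(v1 + 2, "\\v");         /* ends the invisible context */
    if (v2 == NULL)
        return 0;

    n = (size_t)(v1 - a);
    if (n >= size || (size_t)(v2 - (v1 + 2)) >= size)
        return 0;
    memcpy(link, a, n);
    link[n] = '\0';
    n = (size_t)(v2 - (v1 + 2));
    memcpy(context, v1 + 2, n);
    context[n] = '\0';
    return 1;
}
```

Applied to the line `\aformatting table\vprintf.table\v`, the sketch reports the hyperlink `formatting table` and the context `printf.table`, which matches the description above.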

7.6  Using Help Database Formats

The text format of the database can be any of three types. The list below
briefly describes these types. Sections 7.6.1-7.6.3 describe the formatting
types in detail.

An entire help system (such as the one supplied with the Professional
Development System or QuickC) can use any combination of files formatted
with different format types. With C, for example, the README.DOC information
file is encoded as minimally formatted ASCII; and the help files for the C
language and run-time library are encoded in the QuickHelp format. The
database also cross-references the header (include) files, which are
unformatted ASCII files stored outside the database.

Type                              Characteristics
────────────────────────────────────────────────────────────────────────────
QuickHelp                         Uses dot commands and embedded
                                  formatting characters (the default
                                  formatting type expected by HELPMAKE);
                                  supports highlighting, color, and
                                  cross-references. This format must be
                                  compressed before using.

Minimally formatted ASCII         Uses a help delimiter (>>) to define
                                  help contexts; does not support
                                  highlighting, color, or cross-references.
                                  This format can be compressed, but
                                  compression is not required.

RTF                               Uses a subset of standard RTF; supports
                                  highlighting, color, and
                                  cross-references; supports dot commands.
                                  This format must be compressed before
                                  using.

7.6.1  QuickHelp Format

The QuickHelp format uses dot commands and embedded formatting flags to
convey information to HELPMAKE.

QuickHelp Dot Commands

QuickHelp supports a number of dot commands, which identify topics and
convey other topic-related information to the help system. If your help file
contains dot commands other than .context, you must supply the /T option
when encoding and decoding with HELPMAKE.

The most important dot command is the .context command. Every topic in a
QuickHelp file begins with one or more .context commands. Each .context
command defines a context string for the topic text. You can define more
than one context for a single topic, as long as you do not place any topic
text between them.

Typical dot commands are shown below. The first defines a context for the
#include C preprocessor directive. The second set illustrates multiple
contexts for one block of topic text. In this case, the same topic text
explains all of the string-to-number conversion routines in C.

.context #include
.
. description of #include goes here
.
.context strtod
.context strtol
.context strtoul
.
. description of string-to-number functions goes here
.

The QuickHelp format supports several other dot commands. Table 7.3 lists
all of the dot commands available in QuickHelp format.

Table 7.3  QuickHelp Dot Commands

Command                           Action
────────────────────────────────────────────────────────────────────────────
.category string                  Lists the category in which the current
                                  topic appears and its position in the
                                  list of topics. The category name is
                                  used by the QuickHelp Topic command,
                                  which brings up the list of topics to
                                  which the current topic belongs. Some
                                  applications, such as the PWB, use this
                                  name as a pointer to the applicable

.command                          Indicates that the topic text is not a
                                  displayable help topic. Use this command
                                  to hide hyperlink topics and other
                                  internal information. Hyperlink topics
                                  are described in Section 7.5.5,

.comment string                   The string is a comment that appears
                                  only in the help source file. Comments
                                  are especially useful for documenting
                                  the purpose of cross-references.

                                  Because comments are not inserted in the
                                  help database, they are not restored
                                  when you decompress a help file.

.context string                   The string introduces a topic.

.end                              Ends a paste section. See the .paste
                                  command below.

.freeze numlines                  Indicates that the first numlines lines
                                  should be frozen as the top line of the
                                  help screen. This is normally used to
                                  freeze a row of cross-reference buttons
                                  at the top of a help topic that might be
                                  scrolled.

.length topiclength               Indicates the default window size, in
                                  topiclength lines, of the topic about to
                                  be displayed. This command is always the
                                  first line in the topic if present.

.list                             Indicates that the current topic
                                  contains a list of topics. QuickHelp
                                  displays a highlighted line; you can
                                  choose a topic by moving the highlighted
                                  line over the desired topic and pressing
                                  ENTER. Help searches for the first word
                                  of the line.

.mark name «column»               Defines a mark immediately preceding the
                                  following line of text. This command can
                                  be used in help script commands to
                                  indicate that the display of a
                                  particular topic begins at the marked
                                  line. The name identifies the mark. The
                                  optional column value is an integer that
                                  indicates a column location within the
                                  specified line.

.next context                     Tells the help system to look up the
                                  next topic using context instead of the
                                  next topic's name. You can use this
                                  command to skip large blocks of .command
                                  or .popup topics.

.previous context                 Tells the help system to look up the
                                  previous topic using context instead of
                                  the previous topic's name. You can use
                                  this command to skip large blocks of
                                  .command or .popup topics.

.paste pastename                  Begins a paste section. The pastename
                                  appears in the QuickHelp Paste menu.

.popup                            Tells the help system to display the
                                  current topic as a popup instead of a
                                  normal, scrollable topic.

.ref string(s)                    Tells the help system to display the
                                  list of string topics in the Reference
                                  menu. You can list as many topics as
                                  needed; separate each additional string
                                  with a comma.

.topic text                       Defines text as the name or title to be
                                  displayed in place of the context string
                                  if the application help displays a title.
                                  This command is always the first line in
                                  the context unless you also use the
                                  .length command.

────────────────────────────────────────────────────────────────────────────

QuickHelp Formatting Flags

The QuickHelp format supports a number of formatting flags that are used to
highlight parts of the help database and to mark hyperlinks in the help
text.

Each formatting flag consists of a backslash ( \ ) followed by a character.
Table 7.4 lists the formatting flags.

Table 7.4  Formatting Flags

Formatting Flag                   Action
────────────────────────────────────────────────────────────────────────────
\a                                Anchors text for cross-references

\b, \B                            Turns boldface on or off

\i, \I                            Turns italics on or off

\p, \P                            Turns off all attributes

\u, \U                            Turns underlining on or off

\v, \V                            Turns invisibility on or off (hides
                                  cross-references in text)

\\                                Inserts a single backslash in text

────────────────────────────────────────────────────────────────────────────

On monochrome monitors, text labeled with the bold, italic, and underlining
attributes appears in various ways, depending on the application (for
example, high intensity and reverse video are commonly displayed). On color
monitors, these attributes are translated by the application into suitable
colors, depending on the user's default color selections.

The \b, \i, \u, and \v flags are toggles that turn their respective
attributes on and off. You can use several of these on the same text. Use
the \p flag to turn off all attributes. Use the \v flag to hide
cross-references and hyperlinks in the text.

HELPMAKE truncates the lines in QuickHelp files to the width specified with
the /W option. (See Section 7.3.1, "Options for Encoding," for more
information.) Only visible characters count toward the character-width
limit. Lines that begin with an application-specific control character are
truncated to 255 characters regardless of the width specification. See
characters.
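Because only visible characters count toward the width limit, it can be useful to measure a source line the way the text above describes. The following C sketch counts the visible characters of a QuickHelp line. It rests on stated assumptions and is not HELPMAKE's actual code: `visible_width` is a hypothetical name, every backslash followed by one character is treated as a zero-width flag except `\\`, and only \v and \V are taken to switch invisibility on and off.

```c
#include <stddef.h>

/* Count the characters of a QuickHelp source line that would be
   visible on screen. Assumptions of this sketch:
     - a backslash plus one character is a formatting flag, width 0;
     - "\\" produces one visible backslash;
     - \v or \V toggles invisible text. */
size_t visible_width(const char *line)
{
    size_t width = 0;
    int hidden = 0;                     /* nonzero inside \v ... \v */

    while (*line != '\0') {
        if (line[0] == '\\' && line[1] != '\0') {
            char flag = line[1];
            if (flag == '\\') {
                if (!hidden)
                    width++;            /* escaped backslash */
            } else if (flag == 'v' || flag == 'V') {
                hidden = !hidden;
            }
            line += 2;                  /* skip the two-character flag */
        } else {
            if (!hidden)
                width++;
            line++;
        }
    }
    return width;
}
```

Under these assumptions, the source text `\aExample \vsample_prog\v` has a visible width of 8, because the flags and the context name between the \v flags occupy no screen columns.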

In the example below, the \b flag initiates boldface text for  Returns:, and
the \p flag that follows the word reverts to plain text for the remainder of
the line.

\bReturns:\p    a handle if successful, or -1 if not.
errno:  EACCES, EEXIST, EMFILE, ENOENT

In the example below, \a anchors text for the hyperlink  Example . The \v
flags define the cross-reference to be  sample_prog  and cause the text
between the flags to be invisible. Cross-references are described in the
following section.

\aExample \vsample_prog\v

QuickHelp Cross-References

Help databases contain two types of cross-references: implicit
cross-references and explicit cross-references. They are described in
Section 7.1.1, "Contents of a Help File."

An implicit cross-reference is any word that appears both in the topic text
and as a context in the help file. For example, any time you request help on
the word "close," the help window displays help on the close function. You
don't need to code implicit cross-references in your help text files.

Explicit cross-references (hyperlinks) are words or phrases on the screen
that are associated with a context. For example, the word "Example" in the
initial help-screen area for any C function is an explicit cross-reference
to the C program example for that function. You must insert formatting flags
in your help text files to mark explicit cross-references.

If the hyperlink consists of a single word, you can use invisible text to
flag it in the source file. The \v formatting flag creates invisible text,
as follows:

hyperlink\vcontext\v

Specify the first \v flag immediately following the word you want to use as
the hyperlink. Following the flag, insert the context that the hyperlink
cross-references. The second \v flag marks the end of the context; that is,
the end of the invisible text. HELPMAKE generates a cross-reference whose
context is the invisible text, and whose hyperlink is the entire word.

If the hyperlink consists of a phrase, rather than a single word, you must
use anchored text to create explicit cross-references. Use the \a and \v
flags to create anchored text as follows:

\ahyperlink-words\vcontext\v

The \a flag marks an anchor for the cross-reference. The text that follows
the \a flag is the hyperlink. The hyperlink must fit entirely on one line.
The first \v flag marks both the end of the hyperlink and the beginning of
the invisible text that contains the cross-reference context. The second \v
flag marks the end of the invisible text.

The following example contains three implicit cross-references to the C
routines abs, cabs, and fabs.

\bSee also:\p abs, cabs, fabs

The following example shows the encoding for an explicit cross-reference to
an example program and a function template from the help database for the C
run-time library:

\bSee also:\p Example\vopen.ex\v, Template\vopen.tm\v, close

Here, the hyperlinks are  Example  and  Template, which reference the
contexts  open.ex  and  open.tm. The example also contains an implicit
cross-reference to the close function.

The following example shows the encoding for an explicit cross-reference to
an entire family of functions:

\bSee also:\p \ais... functions\vis_functions\v, atoi

The cross-reference uses anchored text to associate a phrase, rather than
just a word, with a context. In this example, the hyperlink is the anchored
phrase  is... functions, and it cross-references the context  is_functions.
In addition, the example contains an implicit cross-reference to the atoi
routine.

The code below is an example in QuickHelp format that contains a single
entry:

.context open
.length 13
\bInclude:\p   <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>

\bPrototype:\p  int open(char *path, int flag[, int mode]);
    flag:  O_APPEND  O_BINARY  O_CREAT  O_EXCL  O_RDONLY
           O_RDWR    O_TEXT    O_TRUNC  O_WRONLY
           (can be joined by |)

\bReturns:\p    a handle if successful, or -1 if not.
    errno:  EACCES, EEXIST, EMFILE, ENOENT

\bSee also:\p  Example\vopen.ex\v, Template\vopen.tp\v,
    access, chmod, close, creat, dup, dup2, fopen, sopen, umask

The .length command near the beginning of the example specifies the size of
the initial window for the help text. Here, the initial window displays 13
lines.

The manifest constants (such as O_WRONLY and EEXIST), the C keywords (such
as int and char), and the other functions (such as sopen and access) are
implicit cross-references. The words  Example  and  Template  are explicit
cross-references to the example  open.ex  and to the open template  open.tp,
respectively. Note the use of double backslashes in the include file names.

7.6.2  Minimally Formatted ASCII Format

A minimally formatted ASCII text file comprises a sequence of topics, each
preceded by one or more unique context definitions. Each context definition
must be on a separate line beginning with a help delimiter (>>). Subsequent
lines up to the next context definition constitute the topic text.

Minimally formatted ASCII files can be used in two ways. You can compress
the file with HELPMAKE, creating a help database, or an application can
access the uncompressed file directly. Uncompressed files are somewhat
larger and slower to search, however. Minimally formatted ASCII files are of
fixed width, and they cannot contain highlighting (or other nondefault
attributes) or cross-references.

The following example, coded in minimally formatted ASCII, shows the same
text as the QuickHelp example in the previous section. The first line of the
example defines  open  as a context string. The minimally formatted ASCII
help file must begin with the help delimiter (>>), so that HELPMAKE or the
application can verify that the file is indeed an ASCII help file.

>>open

Include:    <fcntl.h>, <io.h>, <sys\types.h>, <sys\stat.h>

Prototype:  int open(char *path, int flag[, int mode]);
    flag:  O_APPEND  O_BINARY  O_CREAT  O_EXCL  O_RDONLY
           O_RDWR    O_TEXT    O_TRUNC  O_WRONLY
           (can be joined by |)

Returns:    a handle if successful, or -1 if not.
    errno:  EACCES, EEXIST, EMFILE, ENOENT

See also:  access, chmod, close, creat, dup, dup2, fopen, sopen, umask

When displayed, the help information appears exactly as it is typed into the
file. Any formatting codes are treated as ASCII text. Note that you do not
need to escape backslashes in minimally formatted ASCII files.

If you compress minimally formatted ASCII files, they are smaller and faster
to search.

7.6.3  Rich Text Format (RTF)

RTF is a Microsoft word-processing format supported by many other word
processors. It allows documents to be transferred from one application to
another without losing any formatting information. The HELPMAKE utility
recognizes a subset of the full RTF syntax. If your file contains any RTF
code that is not part of the subset, HELPMAKE ignores the code and strips it
out of the file.

Certain word-processing and file-conversion programs generate the RTF code
automatically as output. You need not worry about inserting RTF codes
yourself; you can simply format your help files directly with a
word-processor that generates RTF, using the attributes supported by the
subset. The only items you need to insert are the help delimiter (>>) and
context string that start each entry.

HELPMAKE recognizes the subset of RTF listed below:

RTF Code                          Action
────────────────────────────────────────────────────────────────────────────
\b                                Boldface. The application decides how to
                                  display this; often it is intensified
                                  text.

\fi <nnn>                         Paragraph first-line indent.

\i                                Italic. The application decides how to
                                  display this; often it is reverse video.

\li <nnn>                         Paragraph indent from left margin.

\line                             New line (not new paragraph).

\par                              End of paragraph.

\pard                             Default paragraph formatting.

\plain                            Default attributes. On most screens this
                                  is nonblinking normal intensity.

\tab                              Tab character.

\ul                               Underline. The application decides how
                                  to display this; some adapters that do
                                  not support underlining display it as
                                  blue text.

\v                                Hidden text. Hidden text is used for
                                  cross-reference information and for some
                                  application-specific communications; it
                                  is not displayed.

Using the word-processing program, you can break the topic text into
paragraphs. When HELPMAKE compresses the file, it formats the text to the
width given with the /W option, ignoring the paragraph formats.

As with the other text formats, each entry in the database source consists
of one or more context strings, followed by topic text. An RTF file can
contain QuickHelp dot commands.

The help delimiter (>>) at the beginning of any paragraph denotes the
beginning of a new help entry. The text that follows on the same line is
defined as a context for the topic. If the next paragraph also begins with
the help delimiter, it also defines a context string for the same topic
text. You can define any number of contexts for a block of topic text. The
topic text comprises all subsequent paragraphs up to the next paragraph that
begins with the help delimiter.

The code below is an example of a help database that contains a single entry
using subset RTF text. Note that RTF uses curly braces ({}) for nesting.
Thus, the entire file is enclosed in curly braces, as is each specially
formatted text item.

{\rtf1
\pard >>open\par
{\b Include:}    <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>\par
\par
{\b Syntax:}     int open( char * filename, int oflag[, int pmode ]
);\par
oflag:  O_APPEND  O_BINARY  O_CREAT  O_EXCL  O_RDONLY\par
O_RDWR    O_TEXT    O_TRUNC  O_WRONLY\par
(may be joined by |)\par
\par
{\b Returns:}    a handle if successful, or -1 if not.\par
errno:  EACCES, EEXIST, EMFILE, ENOENT\par
\par
{\b See also:}  Examples{\v open.ex}, access, chmod, close, creat,
dup,\par
dup2, fopen, sopen, umask\par
>>open.ex\par
To build this help file, use the following command:\par
\par
HELPMAKE /S1 /E15 /OOPEN.HLP OPEN.RTF\par
\par

< Back >{\v !B}
}

Actual RTF output normally contains additional information that is not
visible to the user; HELPMAKE ignores this extra information.

Chapter 8  Customizing the Microsoft Programmer's WorkBench
────────────────────────────────────────────────────────────────────────────

Designed with flexibility in mind, the Microsoft Programmer's WorkBench
(PWB) provides a highly extensible development platform for the Microsoft C
Professional Development System. Using PWB it is easy to change basic
environment features such as screen colors and key assignments, and you can
add powerful new functions of your own using macros and C-language
extensions.

This chapter explains four methods for customizing the Programmer's
WorkBench: setting switches, assigning keystrokes, writing macros, and
writing C extensions. While it explains customization methods, the chapter
does not document every customizable feature of the Programmer's WorkBench.
Use on-line help as your primary source of information about these and other
PWB features.

This chapter assumes you are familiar with basic PWB operations and
terminology. If you are not, read "Using the Programmer's WorkBench" in
Installing and Using the Microsoft C Professional Development System.

8.1  Setting Switches

The Programmer's WorkBench has a number of "switches," or user-configurable
options, that control features such as screen colors. Each switch has a name
and can be assigned a value.

There are two ways to set PWB switches. The easiest way is by choosing
Editor Settings in the Options menu. You can also edit the TOOLS.INI
initialization file. These methods can also be used for more elaborate
customizations, such as writing macros.

8.1.1  Editing the <assign> Pseudofile

If you choose Editor Settings in the Options menu, PWB changes to the
<assign> pseudofile and displays it in the current window. (A pseudofile is
constructed dynamically by PWB; it exists only in memory.) The <assign>
file lists all the current PWB settings.

To change a switch, edit the line where it appears. For instance, the
vscroll switch controls how many lines PWB scrolls vertically; its default
setting is 1. To change it, move to the corresponding line:

vscroll:1

Change the 1 to 3 and move the cursor to another line. PWB highlights the
line to indicate the change is legal. (If you make an illegal change, PWB
signals an error.) The change takes effect immediately: now PWB scrolls text
three vertical lines at a time.

If you don't explicitly save a change, it disappears at the end of the
current session. You can save a change by saving <assign> as you would any
other file (by pressing ALT+A ALT+A F2). When you exit PWB, you are asked if
you want to save TOOLS.INI, the PWB initialization file, which records
customizations. Answer yes (type Y) to save the change.

You can also use this method for more elaborate customizations, such as
writing macros (see Section 8.3, "Writing Macros"). Simply insert a few
blank lines in <assign> and enter the new information in them. Note that PWB
only pays attention to lines you change or add to <assign>. Deleting a line
has no effect.

8.1.2  Editing the TOOLS.INI Initialization File

Another way to customize PWB is by editing TOOLS.INI, the initialization
file used by PWB and other Microsoft language tools. This method is useful
if you customize PWB extensively.

While the <assign> file lists every customizable PWB item, the TOOLS.INI
file contains lines only for items you have customized. Those items not
mentioned in TOOLS.INI are set to a default value.

Dividing TOOLS.INI into Sections

Since several tools can use TOOLS.INI, the file may contain information that
doesn't relate to PWB. If you customize more than one tool, TOOLS.INI is
divided into sections, one for each tool. Each section begins with a tag
consisting of the tool's base name enclosed in square brackets:  [PWB]  for
PWB.EXE, [NMAKE]  for NMAKE.EXE, and so on.

For example, say you set the vscroll switch to 3 and save the change, but
you have not customized PWB in any other way. Your TOOLS.INI file will
contain this section:

[PWB]
vscroll:3

Settings following this tag are put in effect by PWB every time it starts.

You can also create sections of TOOLS.INI that PWB reads only in certain
circumstances. You can create sections for different video adapters,
file-name extensions, and operating system versions.

If you use more than one video display, TOOLS.INI can have a different
section for each display:

■   [PWB-mono]

■   [PWB-cga]

■   [PWB-ega]

■   [PWB-vga]

After each tag, you can set different screen colors, dimensions, and other
display-specific switches.

You can also create a section for files with specific extensions. For
instance, your TOOLS.INI file could contain a section beginning with the tag

[PWB-.C]

for C source files, and

[PWB-.ASM]

for assembly-language (.ASM) source files. Each time you load a file with
the designated extension, PWB reads the appropriate section of TOOLS.INI.
For each file type, you could use a different set of macros and other
customizations.

TOOLS.INI can also contain sections specific to operating system versions.
The following tag introduces a section specific to DOS version 3.20, for
instance:

[PWB-3.20]

You can combine tags as needed. For example, the tag

[PWB-3.20 PWB-10.10R]

applies to DOS version 3.20 and OS/2 version 1.1 real mode.
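
As a rough illustration of how a combined tag can be read, the following C sketch checks whether a section header names a given tag. The function names tag_matches and same_nocase, the case-insensitive comparison, and the use of spaces as separators are assumptions made for this example; they are not a description of how PWB itself parses TOOLS.INI.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Compare n characters of a and b, ignoring case (assumed behavior). */
static int same_nocase( const char *a, const char *b, size_t n )
{
    size_t i;

    for( i = 0; i < n; i++ )
        if( tolower( (unsigned char)a[i] ) != tolower( (unsigned char)b[i] ) )
            return 0;
    return 1;
}

/* Return 1 if the header line, such as "[PWB-3.20 PWB-10.10R]",
   lists the tag named by want; otherwise return 0. */
int tag_matches( const char *header, const char *want )
{
    const char *p, *end;
    size_t wantLen;

    assert( header != NULL && want != NULL );
    wantLen = strlen( want );

    if( header[0] != '[' || (end = strchr( header, ']' )) == NULL )
        return 0;                       /* not a section header */

    p = header + 1;
    while( p < end )
    {
        const char *start;

        while( p < end && *p == ' ' )   /* skip separators */
            p++;
        start = p;
        while( p < end && *p != ' ' )   /* scan one tag */
            p++;
        if( wantLen > 0 && (size_t)( p - start ) == wantLen &&
            same_nocase( start, want, wantLen ) )
            return 1;
    }
    return 0;
}
```

With this sketch, the header [PWB-3.20 PWB-10.10R] matches either of its two tags but does not match the plain PWB tag, because each tag is compared as a whole.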

You can also create a section in TOOLS.INI containing switches for a
user-written extension. See Section 8.4.3, "Describing Functions and
Switches." On-line help contains additional information about TOOLS.INI
tags.

8.2  Assigning Keystrokes

PWB allows you to assign any editing function to almost any keystroke.
Reassigning keystrokes doesn't change the PWB graphic interface, however.

Keystrokes, like switches, are listed in the <assign> pseudofile (choose Key
Assignments in the Options menu) and can be changed there. For example, say
you want to assign the home cursor function to the SHIFT+HOME keystroke. The
default keystroke assignment for home is:

home:ctrl+home

If you change the assignment to

home:shift+home

SHIFT+HOME moves the cursor to the home (upper left) window position.

It is legal to assign more than one keystroke to the same function. For
example, many keystrokes invoke the select function, which selects a text
region. Thus, the previous example adds a new keystroke (SHIFT+HOME) for the
home function; it does not remove the previous assignment (CTRL+HOME).

There are two limitations on keystroke assignments:

■   You can't reassign a keystroke that PWB is using for a menu. For
instance, if ALT+F pulls down the File menu, PWB ignores any attempt
to reassign ALT+F.

■   You can't reassign ALT plus the number keys 1 - 9 (ALT+1, ALT+2, and
so on). These keystrokes are reserved for the file history menu items.

Each keystroke can invoke only one function. If you mistakenly assign a
keystroke to more than one function, PWB uses the most recent assignment.
For example,

home:ctrl+a
setfile:ctrl+a

assigns the CTRL+A keystroke to two different functions, home and setfile.
The second assignment overrides the first, assigning CTRL+A to setfile.
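
The two rules above (several keystrokes may share a function, but each keystroke maps to only one function) can be sketched with a small lookup table in C. The names keyBinding, assign_key, and lookup_key, and the fixed table size, are hypothetical; the sketch only models the behavior described here and says nothing about PWB's internal data structures.

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS 64

struct keyBinding
{
    const char *key;       /* keystroke name, such as "ctrl+a" */
    const char *function;  /* function name, such as "home"    */
};

static struct keyBinding bindings[MAX_KEYS];
static int bindingCount = 0;

/* Bind key to function. An existing binding for the same key is
   replaced, so the most recent assignment wins. The strings are not
   copied and must outlive the table. Returns 0 if the table is full. */
int assign_key( const char *function, const char *key )
{
    int i;

    assert( function != NULL && key != NULL );
    for( i = 0; i < bindingCount; i++ )
        if( strcmp( bindings[i].key, key ) == 0 )
        {
            bindings[i].function = function;
            return 1;
        }
    if( bindingCount == MAX_KEYS )
        return 0;
    bindings[bindingCount].key = key;
    bindings[bindingCount].function = function;
    bindingCount++;
    return 1;
}

/* Return the function bound to key, or NULL if the key is unassigned. */
const char *lookup_key( const char *key )
{
    int i;

    assert( key != NULL );
    for( i = 0; i < bindingCount; i++ )
        if( strcmp( bindings[i].key, key ) == 0 )
            return bindings[i].function;
    return NULL;
}
```

Assigning home and then setfile to ctrl+a leaves a single entry for ctrl+a that names setfile, while ctrl+home and shift+home can both name home.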

Occasionally, you may want to "unassign," or disable, a keystroke. This is
done by assigning the unassigned function to the keystroke. For example,

unassigned:ctrl+a

disables CTRL+A. PWB signals an error when you press any unassigned key.

8.3  Writing Macros

The fastest way to create a new editing function for PWB is to write a
macro. The function can be as simple as inserting a long word or phrase, or
it can perform complex tasks by invoking PWB functions and other macros.

8.3.1  Macro Syntax

A macro can contain any combination of PWB functions, literal text, and
macro operators. You can define as many as 1,024 macros at one time.

Literal text is case sensitive.

Literal text is anything inside double quotes. Inside literal text, you can
represent a double quote as  \" and a backslash as  \\. Text is case
sensitive inside quotes and case insensitive outside them.
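
The two escape sequences can be illustrated with a short C routine that converts the body of a quoted literal into the text it represents. This is only a sketch of the rule stated above; the function name unescape_literal and its buffer-size parameter are inventions for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the body of a quoted macro literal from src to dst, converting
   \" to " and \\ to \. src points just past the opening quote. Copying
   stops at the closing quote, at the end of src, or when dst is full.
   Returns the number of characters written, not counting the
   terminating null. */
size_t unescape_literal( const char *src, char *dst, size_t dstSize )
{
    size_t n = 0;

    assert( src != NULL && dst != NULL && dstSize > 0 );
    while( *src != '\0' && *src != '"' && n + 1 < dstSize )
    {
        if( src[0] == '\\' && ( src[1] == '"' || src[1] == '\\' ) )
            src++;              /* drop the backslash, keep the next char */
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}
```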

The following macro comments out a line of C source code:

comment:=begline "/* " endline " */"
comment:alt+c

The first line names the macro and tells what it does. The begline and
endline editor functions move the cursor, while the text inside quotes is
printed at the current cursor position. The second line assigns a keystroke
(ALT+C) to the macro.

A macro definition must fit on one logical line. If necessary, you can use
the backslash ( \ ) to continue the definition on the next line. For
instance, the definition

comment:=begline "/* " endline " */"

could be written as

comment:=begline  \
"/* " endline  \
" */"

Notice the extra space before each backslash. If you want a space between
the end of one line and the beginning of the other, you must precede the
backslash with two spaces.

You can use the arg function to pass arguments to functions. For example,
the following macro passes the argument  15  to the plines function (which
scrolls text down):

movedown:=arg "15" plines

Because arg precedes the literal text, the text doesn't appear on the
screen. Instead, it is passed as an argument to the next function, plines.
The macro scrolls the current text down 15 lines.

Arguments can use regular expression syntax, as well (regular expressions
are documented in on-line help):

endword:=arg arg "( !.!$!\\:!;!\\)!\\(!,)" psearch

The arg arg sequence directs the psearch function to treat the text argument
as a regular expression search pattern. This search pattern tells PWB to
search for the next period, end of line ($), colon, semicolon, close
parenthesis, open parenthesis, or comma.

A macro can invoke other macros:

lcomment:= "/* "
rcomment:= " */"
commentout:=begline lcomment endline rcomment
commentout:alt+z

The  commentout  macro invokes the previously defined macros  lcomment  and
rcomment.

In addition to standard PWB functions, macros can invoke user-defined
(extension) functions. See Section 8.4, "Writing and Building C Extensions."

8.3.2  Macro Responses

Some PWB functions ask you for confirmation. For example, the meta exit
(quit without saving) function normally asks if you really want to exit.
Such questions always take the answer "yes" (y) or "no" (n).

When you invoke such a function in a macro, the function assumes an answer
of yes and does not ask for confirmation. For example, the macro definition

quit:=meta exit
quit:alt+x

invokes meta exit when you press ALT+X. Because the meta exit function is
invoked from a macro, PWB exits without asking for confirmation.

The following operators allow you to restore normal prompting or change the
default responses:

Operator                          Description
────────────────────────────────────────────────────────────────────────────
<                                 Asks for confirmation; if not followed
                                  by another < operator, prompts for all
                                  further questions

<y                                Assumes a response of yes

<n                                Assumes a response of no

A response operator applies to the function immediately preceding it. For
instance, you can add the < operator to the quit macro definition to
restore the usual prompt:

quit:=meta exit <
quit:alt+x

Now the macro prompts for a response before it exits.

8.3.3  Macro Arguments

If you enter an argument in PWB and then invoke a macro, the argument is
passed to the first function in the macro that takes an argument:

tripleit:=copy paste paste

The  tripleit  macro invokes the copy and paste editing functions. If you
highlight a text area and then invoke the macro, your highlighted argument
is passed to the copy function, which copies the argument to the clipboard.
The macro then invokes paste twice. The effect is to insert two copies of
the highlighted text.

You cannot pass more than one argument from PWB to a macro, even if the
macro invokes more than one function that can accept an argument. The
argument always goes to the first function in the macro that takes an
argument.

You can also prompt for input inside a macro and pass the input as an
argument using the prompt function as shown below:

newfile:=arg "Next file: " prompt setfile <
newfile:alt+n

The  newfile  macro prompts for a file name and then switches to the
specified file. The sequence  arg "Next file: " passes a text argument to
prompt, which prints the text on the dialog line and waits for input. The
input is passed as a text argument to the setfile function, which switches
to that file. For more information on the prompt function, see on-line help.

8.3.4  Macro Conditionals

Macros can take different actions depending on certain conditions. Such
macros take advantage of the fact that PWB editing functions generally
return values─a TRUE (nonzero) value if successful or FALSE (zero) if
unsuccessful.

Macros can use four conditional operators:

Operator                          Description
────────────────────────────────────────────────────────────────────────────
:>label                           Defines a label that can be targeted by
                                  other operators

=>label                           Jumps to label

+>label                           Jumps to label if the previous function
                                  returns TRUE

->label                           Jumps to label if the previous function
                                  returns FALSE

For example, the  leftmarg  macro moves the cursor to the left margin of the
editing window:

leftmarg:=:>leftmore left +>leftmore

The macro above invokes the left function repeatedly (jumping to the label
leftmore) until it returns FALSE, indicating the cursor has reached the left
margin.

The label must appear immediately after the conditional operator, with no
intervening spaces. A conditional operator without a label exits the macro
immediately if the condition is true. If the condition is false, the macro
continues execution. The following example demonstrates this:

turnon:=insertmode +> insertmode

This macro turns on insert mode regardless of whether insert mode is
currently on or off. If insert mode is off, the first invocation of
insertmode toggles the mode on and returns TRUE, causing the +> operator to
terminate the macro. If insert mode is currently on, the first invocation of
insertmode turns insert mode off and returns FALSE. The macro then invokes
insertmode a second time, turning insert mode back on.
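
For readers who think in C, the control flow of the leftmarg and turnon macros can be paraphrased as follows. The functions left and insertmode here are stand-ins that model only the return values described above, and the variables cursorCol and insertOn are hypothetical; none of this is PWB code.

```c
#include <assert.h>

#define TRUE  1
#define FALSE 0

static int cursorCol = 0;       /* stand-in for the cursor column     */
static int insertOn  = FALSE;   /* stand-in for the insert-mode state */

/* Stand-in for the left function: moves one column left and returns
   TRUE, or returns FALSE if the cursor is already at the left margin. */
static int left( void )
{
    if( cursorCol == 0 )
        return FALSE;
    cursorCol--;
    return TRUE;
}

/* Stand-in for insertmode: toggles the mode and returns the new state. */
static int insertmode( void )
{
    insertOn = !insertOn;
    return insertOn;
}

/* leftmarg:=:>leftmore left +>leftmore */
void leftmarg( void )
{
leftmore:
    if( left() )
        goto leftmore;      /* +>leftmore: jump while left returns TRUE */
    assert( cursorCol == 0 );
}

/* turnon:=insertmode +> insertmode */
void turnon( void )
{
    if( insertmode() )
        return;             /* +> with no label: exit the macro on TRUE */
    insertmode();           /* mode was on and is now off; toggle it back */
}
```

Whatever the starting state, turnon leaves insertOn set to TRUE, which mirrors the behavior of the macro.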

8.3.5  Temporary Macros

Occasionally, you may want to create a macro that lasts only through the
current session. This can be done with the assign function. For example, the
following steps create the  comment  macro described above.

To create the macro:

■   Press ALT+A

■   Type  comment:=begline "/* " endline " */"

■   Press ALT+=

To assign the ALT+C keystroke to the macro:

■   Press ALT+A

■   Type  comment:alt+c

■   Press ALT+=

The macro is available immediately and then disappears at the end of the
current session.

8.3.6  Macro Recordings

Another way to create a macro is by recording your own actions. The entire
sequence of actions is saved and can be replayed later by pressing a key.

You start the recording by invoking the record function. PWB names the
resulting macro recordvalue by default, but you can use other names as well.
To record a macro:

■   Choose Record On from the Edit menu to start the recording.

■   Perform the actions you want to record.

■   Choose Record On again to end the recording.

■   If recordvalue is not already assigned, assign it to a keystroke as
described above.

After you complete these steps, a macro named recordvalue is available
through the keystroke you assigned in the last step above. When you press
this key, PWB replays the actions you recorded.

If you don't do anything more, the recorded macro is temporary─it disappears
when you exit PWB. To save the macro permanently:

■   Open the <record> pseudofile (press ALT+A, type  <record>, press F2).

■   Copy the macro definition in <record>.

■   Paste the definition into the  [PWB]  section of your TOOLS.INI file.

Studying recorded macros can teach you a lot about macros and editor
functions. If you open the <record> pseudofile in a second window before you
record, you can watch PWB write the macro definition function by function.

If you save a recorded macro, you'll want to name it something other than
recordvalue, the default name. To do this, pass the new name as an argument
when you start the recording:

■   Press ALT+A ALT+A.

■   Type the new name.

■   Choose Record On from the Edit menu to start recording.

■   Complete the recording as usual.

You can expand an existing macro using the same process. If you supply the
name of an existing macro, PWB appends the recorded commands to the macro
instead of replacing it.

You can record a series of actions without executing them.

You can also make a "silent" recording, which records a series of actions
without executing them. Start the recording with a meta record command
(press F9 SHIFT+CTRL+R). Then complete the recording process as described
above.

8.4  Writing and Building C Extensions

An "extension" is a file containing one or more user-written functions. PWB
loads extensions at run time. Once the extension has been loaded, its
functions can be assigned their own keystrokes, given arguments, and invoked
in macros, exactly like other PWB functions.

User-written functions execute more quickly than macros.

The ability to load and call user-written functions makes PWB highly
extensible. Because they consist of compiled C code, your functions can
perform more complex jobs than macros can, and they execute many times
faster.

An extension contains executable code, but it differs from a normal
executable file in some important ways:

■   It does not contain the usual C start-up code.

■   It contains special data structures that describe its functions to
PWB.

■   Its functions are declared in a form that allows PWB to call them and
pass arguments to them.

■   Its functions can call native PWB functions, and some, but not all, C
library functions.

This section explains how to build, load, and invoke a PWB extension. The
example, CENTER.C, serves as a basis for discussion throughout the rest of
this chapter.

The CENTER.C extension contains one extension function,  CenterLine, which
centers a line or range of lines in the current file.

/* CENTER.C: Sample PWB extension */

#define LINE_LENGTH 80 /* Assumes 80-column screen */

#include <string.h>
/* PWB extension header file */
#include "ext.h"

PWBFUNC CenterLine( unsigned argData,
                    ARG _far *pArg,
                    flagType fMeta );

/* Switch Table */
struct swiDesc   swiTable[] =
{
    { NULL, NULL, 0 }
};

/* Command Table */
struct cmdDesc   cmdTable[] =
{
    { "CenterLine", CenterLine, 0, NOARG | LINEARG },
    { NULL, NULL, 0, 0 }
};

/* Initialization Function */
void EXTERNAL WhenLoaded( void )
{
    /* Display a loading message (the message text here is assumed) */
    DoMessage( "CENTER extension loaded" );
}

/* Extension (user-written) function */
PWBFUNC CenterLine( unsigned argData,
                    ARG _far *pArg,
                    flagType fMeta )
{
    PFILE pFile;
    LINE  yStart, yEnd;
    int   len;
    char *pBuf, buf[BUFLEN];

    /* Get a handle to the current file */
    pFile = FileNameToHandle( "", "" );

    /* Handle various argument types */
    switch( pArg->argType )
    {
        case NOARG:  /* No argument. Center current line */
            yStart = yEnd = pArg->arg.noarg.y;
            break;

        case LINEARG:  /* Center range of lines */
            yStart = pArg->arg.linearg.yStart;
            yEnd = pArg->arg.linearg.yEnd;
            break;
    }

    /* Center current line or range of lines */
    for( ; yStart <= yEnd; yStart++ )
    {
        /* Get a line from the current file */
        len = GetLine( yStart, buf, pFile );

        if( len > 0 )
        {
            /* Center the text in this line */
            pBuf = buf + strspn( buf, " \t" );
            len = strlen( pBuf );
            memmove( buf + (LINE_LENGTH - len) / 2, pBuf, len + 1 );
            memset( buf, ' ', (LINE_LENGTH - len) / 2 );

            /* Write modified line back to the current file */
            PutLine( yStart, buf, pFile );
        }
    }
    return TRUE;
}
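
The centering arithmetic inside CenterLine can be tried outside PWB with standard C alone. The sketch below isolates it in a hypothetical helper named center_text. Unlike the listing above, it adds a guard for text that is already LINE_LENGTH characters or longer, which is an addition made for this example rather than part of CENTER.C.

```c
#include <assert.h>
#include <string.h>

#define LINE_LENGTH 80  /* Assumes 80-column screen */

/* Center the text in buf within LINE_LENGTH columns. buf must have room
   for at least LINE_LENGTH + 1 characters. Text that is already
   LINE_LENGTH characters or longer only loses its leading white space.
   Returns the number of spaces placed in front of the text. */
int center_text( char *buf )
{
    char *text;
    int   len, pad;

    assert( buf != NULL );
    text = buf + strspn( buf, " \t" );      /* skip leading white space */
    len  = (int)strlen( text );
    pad  = ( len < LINE_LENGTH ) ? ( LINE_LENGTH - len ) / 2 : 0;

    memmove( buf + pad, text, (size_t)len + 1 );  /* regions may overlap */
    memset( buf, ' ', (size_t)pad );
    return pad;
}
```

For the five-character word "hello", the padding is (80 - 5) / 2, or 37 spaces. memmove is used rather than strcpy because the source and destination regions can overlap within the same buffer.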

Building and using a PWB extension involves four basic steps:

1.  Compiling

2.  Linking

3.  Loading the extension into PWB

4.  Assigning a keystroke to each function in the extension

You can build extensions for both real mode (DOS) and OS/2 protected mode.

8.4.1  Building Real-Mode Extensions

This section describes how to build extensions for real mode.

Compiling

The source (.C) file for an extension must include EXT.H, the extension
header file. Since an extension is not a stand-alone executable file, it
doesn't have a main function; so its source file is compiled with the /c
(compile, but don't link) option:

CL /c /Gs /ACw CENTER.C

The /Gs option turns off stack checking; the /ACw option selects the
required custom memory model.

The PWB extension interface is designed for C programmers. However, you can
write extensions in assembly language or other languages if you simulate the
required C memory model (in which SS is not assumed to equal DS).

Linking

The first object file in the link command must be the stub EXTHDR.OBJ:

link exthdr center, center.mxt;

PWB can load a file with any name, but most programmers use the .MXT
extension to distinguish a PWB extension from a normal .EXE file.

Loading the Extension

Once the extension is built, you can cause PWB to load it by adding a load
command to your TOOLS.INI file:

load:center.mxt

You don't need to supply a file extension; PWB assumes the correct file
extension. To specify a path, supply the path name preceded by a dollar sign
($):

load:$INIT:center

The example tells PWB to search the directories specified in the INIT
environment variable. If listed, the environment variable must be in
uppercase.

TOOLS.INI can contain multiple load commands for different extensions.
However, loading each extension involves a certain amount of memory
overhead, and there is no way to unload an extension from memory. To
conserve memory, place all frequently used functions in a single extension
and load only that extension.

Assigning Keystrokes to Functions

After an extension has been loaded, you must provide some way to invoke its
functions from inside PWB. A keystroke is the most common means, although
extension functions, like native PWB functions, can be invoked in various
ways.

You can assign the ALT+C keystroke to the  CenterLine  function with:

CenterLine:alt+c

Once the  CenterLine  function has been assigned to this keystroke, you can
invoke it by pressing ALT+C.

8.4.2  Building Protected-Mode Extensions

The build process for OS/2 protected mode differs only slightly from the
real-mode build process.

Compiling

The source (.C) file for an extension must include EXT.H, the extension
header file. Since an extension is not a stand-alone executable file, it
doesn't have a main function; so its source file is compiled with the /c
(compile, but don't link) option:

CL /c /Gs /ACw CENTER.C

The /Gs option turns off stack checking; the /ACw option selects the
required custom memory model.

The PWB extension interface is designed for C programmers. However, you can
write extensions in assembly language or other languages if you simulate the
required C memory model (in which SS is not assumed to equal DS).

Linking

Link with EXTHDRP.OBJ instead of EXTHDR.OBJ. Specify the .PXT extension for
the output file. List the EXT.DEF definitions file:

link exthdrp center, center.pxt,, os2, ext.def

In protected mode, PWB assumes the .PXT file extension. If your extension is
not found, PWB assumes the .DLL file extension.

You cannot create a bound extension.

There is no way to create a bound extension (one that runs in both real and
protected mode). However, you can build separate versions of an extension
and use a single TOOLS.INI load command to load the correct extension in
each mode. PWB loads the real-mode file (.MXT) in real mode and the
protected-mode file (.PXT or .DLL) in protected mode.

Assigning Keystrokes to Functions

After an extension has been loaded, you must provide some way to invoke its
functions from inside PWB. A keystroke is the most common means, although
extension functions, like native PWB functions, can be invoked in various
ways.

You can assign the ALT+C keystroke to the  CenterLine  function with:

CenterLine:alt+c

Once the  CenterLine  function has been assigned to this keystroke, you can
invoke it by pressing ALT+C.

8.4.3  Describing Functions and Switches

To call functions in your extension, PWB must know certain information about
each function, such as the name and address of the function, what types of
arguments it accepts, and what switches (if any) it employs. You provide
this information in a pair of arrays─cmdTable and swiTable─that must be
present in every PWB extension.

The cmdTable Array

Every extension must contain an array of structures named cmdTable. This
array provides the information PWB needs to call the extension's functions.

The cmdTable array is an array of structures of type cmdDesc (which is
declared in EXT.H). Each structure in the array describes one function in
the extension. The array is terminated with a structure whose members are
all null.

For instance, the CENTER.C extension has one function, named  CenterLine, so
its cmdTable array contains two structures (one for  CenterLine  and the
other to terminate the table):

struct cmdDesc cmdTable[] =
{
    { "CenterLine", CenterLine, 0, NOARG | LINEARG },
    { NULL, NULL, 0, 0 }
};

Each cmdDesc structure in cmdTable contains these members:

■   The function's name

■   The function's address

■   Reserved item (must be 0)

■   The argument types the function accepts

The last member in the list is an integer containing bitflags representing
types of arguments that your function accepts. You can combine more than one
bitflag using the OR ( | ) operator.

For instance, the  CenterLine  function can handle an argument of the type
LINEARG, or no arguments (NOARG). So it lists the types:

NOARG | LINEARG

There are many argument types in addition to these. For information about
specific argument types, see the Extensions topic in on-line help.
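
To make the bitflag idea concrete, here is a stand-alone C sketch of a null-terminated command table and a lookup that tests one flag. Everything prefixed with demo or DEMO_ is hypothetical, including the flag values and the simplified function signature. The real cmdDesc structure and argument-type constants are declared in EXT.H.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real ones are defined in EXT.H. */
#define DEMO_NOARG    0x0001
#define DEMO_LINEARG  0x0002
#define DEMO_TEXTARG  0x0004

typedef int (*demoFunc)( void );

struct demoCmd
{
    const char *name;      /* function name                 */
    demoFunc    func;      /* function address              */
    unsigned    reserved;  /* must be 0                     */
    unsigned    argTypes;  /* OR of accepted argument types */
};

static int DemoCenter( void ) { return 1; }

static struct demoCmd demoTable[] =
{
    { "CenterLine", DemoCenter, 0, DEMO_NOARG | DEMO_LINEARG },
    { NULL, NULL, 0, 0 }   /* terminator */
};

/* Return 1 if the named command exists and accepts argType, else 0. */
int accepts_arg( const struct demoCmd *table, const char *name,
                 unsigned argType )
{
    assert( table != NULL && name != NULL );
    for( ; table->name != NULL; table++ )
        if( strcmp( table->name, name ) == 0 )
            return ( table->argTypes & argType ) != 0;
    return 0;
}
```

Because each argument type occupies its own bit, a single AND operation is enough to decide whether a command accepts a given type.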

The swiTable Array

Extension functions, like native PWB functions, can respond to
user-configurable switches. From the viewpoint of an extension function, a
switch is usually a variable that the user can change at run time. Your
function must be ready to respond to these changes, and PWB must have some
way to convey them. The vehicle for this interchange is an array of
structures named swiTable.

The swiTable array is similar to the cmdTable array described above. It is
an array of structures, terminated by a structure whose members are all
null. Each structure in swiTable describes one switch used by a function in
the extension.

The CENTER.C extension doesn't take any switches, so its swiTable array only
contains a terminating null structure:

struct swiDesc swiTable[] =
{
    { NULL, NULL, 0 }
};

Each structure in swiTable is of type swiDesc, whose members are

■   A pointer to the switch name

■   A pointer to the switch or a function

■   A flag that indicates the type of the switch

A switch can be one of three types: SWI_BOOLEAN for TRUE/FALSE conditions,
SWI_NUMERIC for numerics, or SWI_SPECIAL for strings.

The second member of swiDesc is a pointer. It points to the switch itself if
the switch is type SWI_BOOLEAN or SWI_NUMERIC, or to a string-handling
function if the switch is type SWI_SPECIAL.

For instance, the following code creates a numeric switch with the default
value 27:

static int n = 27;

struct swiDesc swiTable[] =
{
    { "newswitch", &n, SWI_NUMERIC | RADIX10 },
    { NULL,  NULL, 0 }
};

The first structure in the example above contains the name of the switch
("newswitch"), a pointer to the variable that contains the switch's value
(&n), and the switch's type (SWI_NUMERIC).

In this example, the third structure member contains another constant,
RADIX10. If a switch is type SWI_NUMERIC, you must supply a second constant
to tell PWB whether to interpret user-assigned values as decimal (RADIX10)
or hexadecimal (RADIX16).

If the switch is type SWI_SPECIAL, the second member of swiDesc is a pointer
to an additional string-handling function that you write. This function must
be of type int far _pascal. Each time the text switch changes, PWB calls
your function, passing it the address of the updated string as a char far
pointer. The following code stores the updated string in a buffer named
mystring:

char mystring[BUFLEN];

int far _pascal setstr( char far *ptr )
{
    strcpy( mystring, ptr );
    return TRUE;  /* assumed: a nonzero return reports success */
}

If desired, you can list switches for extension functions separately from
other switches. Whenever PWB loads an extension, it looks in TOOLS.INI for a
section with this form:

[PWB-ext]

where ext is the base name of the extension. If the section exists, PWB
recognizes the settings immediately following the tag. For instance, if your
extension SAMPLE.MXT uses a numeric switch named numbills, you can set
numbills to the value 66 with:

[PWB-SAMPLE]
numbills:66
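
The following stand-alone C sketch shows one way a host program could apply a name:value setting to a null-terminated switch table. It is a model of the idea only: the demoSwi structure, the DEMO_ constants, and the apply_setting and set_title functions are hypothetical and do not reproduce the swiDesc layout in EXT.H or PWB's own parsing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the switch types declared in EXT.H. */
enum demoSwiType { DEMO_NUMERIC, DEMO_SPECIAL };

typedef int (*demoHandler)( const char *value );

struct demoSwi
{
    const char       *name;     /* switch name                     */
    enum demoSwiType  type;     /* kind of switch                  */
    int               radix;    /* 10 or 16, for DEMO_NUMERIC only */
    int              *ival;     /* variable for DEMO_NUMERIC       */
    demoHandler       handler;  /* function for DEMO_SPECIAL       */
};

static int  numbills = 27;      /* default value */
static char title[32] = "";

/* String handler: store the new value, truncating if necessary. */
static int set_title( const char *value )
{
    strncpy( title, value, sizeof title - 1 );
    title[sizeof title - 1] = '\0';
    return 1;
}

static struct demoSwi demoSwitches[] =
{
    { "numbills", DEMO_NUMERIC, 10, &numbills, NULL },
    { "title",    DEMO_SPECIAL,  0, NULL,      set_title },
    { NULL,       DEMO_NUMERIC,  0, NULL,      NULL }   /* terminator */
};

/* Apply one "name:value" line to the table. Returns 1 on success,
   or 0 if the line has no colon or names an unknown switch. */
int apply_setting( const struct demoSwi *table, const char *line )
{
    const char *colon;
    size_t nameLen;

    assert( table != NULL && line != NULL );
    colon = strchr( line, ':' );
    if( colon == NULL )
        return 0;
    nameLen = (size_t)( colon - line );

    for( ; table->name != NULL; table++ )
    {
        if( strlen( table->name ) != nameLen ||
            strncmp( table->name, line, nameLen ) != 0 )
            continue;
        if( table->type == DEMO_SPECIAL )
            return table->handler( colon + 1 );
        *table->ival = (int)strtol( colon + 1, NULL, table->radix );
        return 1;
    }
    return 0;
}
```

A numeric switch is updated by writing through the stored pointer, while a string switch hands the new text to its handler, which parallels the distinction between SWI_NUMERIC and SWI_SPECIAL described above.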

8.4.4  Initializing Functions

Every PWB extension must contain a function named WhenLoaded, which PWB
calls when it loads the extension. WhenLoaded provides a chance to do any
initialization that your functions require. (If your functions don't need
any initialization, WhenLoaded can simply return.)

The CENTER.C extension uses WhenLoaded to display a loading message:

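void EXTERNAL WhenLoaded( void )
{
    /* Display a loading message (the message text here is assumed) */
    DoMessage( "CENTER extension loaded" );
}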

DoMessage is a PWB function that displays a message on the dialog line.
Section 8.4.7, "Calling PWB Functions," lists PWB functions and explains how
to call them.

8.4.5  Prototyping Functions

To be called by PWB, each extension function must be declared as type
PWBFUNC and accept the parameters argData, pArg, and fMeta. The CenterLine
function in the section of CENTER.C code below follows this model:

PWBFUNC CenterLine( unsigned argData,
                    ARG _far *pArg,
                    flagType fMeta )

The PWBFUNC type is actually a macro that evaluates to flagType _pascal
_loadds _far. The flagType return type declares that the function returns
either TRUE (nonzero) or FALSE (zero). Your function should return a value
so that it can be used in a macro with conditionals. The modifiers _pascal,
_loadds, and _far specify the calling conventions PWB expects editor
functions to have.

8.4.6  Receiving Parameters

Like native PWB functions, extension functions can receive parameters from
the user. The CENTER.C example allows you to select a range of lines to
center, for example. The selected range is passed as a parameter to the
CenterLine  function.

Extension functions receive parameters in much the same way ordinary C
programs receive command-line parameters. In both cases, the parameters are
passed in a predefined data construct─argc and argv for a normal C program,
and the following parameters for an extension function:

Parameter                         Description
────────────────────────────────────────────────────────────────────────────
argData                           The keystroke used to invoke your
                                  function

pArg                              A pointer to a structure containing
                                  arguments passed to your function

fMeta                             TRUE (nonzero) if meta precedes the
                                  argument, otherwise FALSE (zero)

The first parameter is rarely used. Most extension functions receive all
their parameter data in the second parameter, pArg. This parameter is a
pointer to a structure of type ARG, which contains:

Member                            Description
────────────────────────────────────────────────────────────────────────────
argType                           An integer that indicates the argument
                                  type

arg                               A union of structures, one structure for
                                  each argument type

Typically, your function tests pArg->argType to find out what type of
parameter PWB has passed. Once the type is known, the function responds
accordingly. The following code from CENTER.C handles two argument types:

switch( pArg->argType )
{
    case NOARG:  /* No argument. Center current line */
        yStart = yEnd = pArg->arg.noarg.y;
        break;

    case LINEARG:  /* Center range of lines */
        yStart = pArg->arg.linearg.yStart;
        yEnd = pArg->arg.linearg.yEnd;
        break;
}

PWB rejects invalid arguments.

If your function accepts only one argument type, it doesn't need to test
pArg->argType at all. PWB knows beforehand what argument types your function
accepts (via cmdDesc) and rejects any invalid arguments.

Once the argument type is known, your function can access the parameters
through pArg->arg, a structure whose members differ for each argument type.
In the NOARG (no arguments) case, it contains x and y values identifying the
cursor position in the current file:

struct noargType
{               /* no argument    */
    LINE y;     /* cursor line    */
    COL  x;     /* cursor column  */
};

The CENTER.C example uses the y value in this structure (noarg.y, the cursor
line) to center the current line:

case NOARG:  /* No argument. Center current line */
    yStart = yEnd = pArg->arg.noarg.y;
    break;

Similarly, in the LINEARG case, the pArg->arg structure contains three
values:

struct lineargType
{                    /* line argument specified */
    int  cArg;       /* count of args pressed   */
    LINE yStart;     /* starting line of range  */
    LINE yEnd;       /* ending line of range    */
};

The CENTER.C example uses the starting and ending values in this structure
(yStart and yEnd) to center a range of selected lines:

    case LINEARG:  /* Center range of lines */
        yStart = pArg->arg.linearg.yStart;
        yEnd = pArg->arg.linearg.yEnd;
        break;

The method is the same for other argument types. The pArg->arg structures
for all argument types are described in on-line help.
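
The argument-handling pattern described above can be tried outside PWB with a
small stand-alone sketch. The type definitions below are simplified stand-ins
written for this illustration; the numeric values given to NOARG and LINEARG,
the underlying types chosen for LINE and COL, and the helper name
get_line_range are assumptions, not the declarations PWB supplies.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations (assumed for illustration only). */
typedef long LINE;
typedef int  COL;

enum { NOARG = 1, LINEARG = 2 };     /* hypothetical tag values */

struct noargType   { LINE y; COL x; };
struct lineargType { int cArg; LINE yStart; LINE yEnd; };

typedef struct
{
    int argType;                     /* tells which union member is valid */
    union
    {
        struct noargType   noarg;
        struct lineargType linearg;
    } arg;
} ARG;

/* Hypothetical helper: work out the range of lines to process.
   Returns 1 on success, 0 if the argument type is not handled. */
int get_line_range( const ARG *pArg, LINE *yStart, LINE *yEnd )
{
    assert( pArg != NULL && yStart != NULL && yEnd != NULL );

    switch( pArg->argType )
    {
        case NOARG:                  /* current line only */
            *yStart = *yEnd = pArg->arg.noarg.y;
            return 1;

        case LINEARG:                /* selected range of lines */
            *yStart = pArg->arg.linearg.yStart;
            *yEnd   = pArg->arg.linearg.yEnd;
            return 1;

        default:                     /* type this function does not accept */
            return 0;
    }
}
```

The tag in argType is the only safe guide to which member of the union may be
read, which is why the switch comes before any access to pArg->arg.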

8.4.7  Calling PWB Functions

Many of PWB's internal functions are public. Your extension function can
call them for the same purposes that PWB itself does. This section
demonstrates the most commonly used PWB functions─those that manipulate the
current file.

A list of callable PWB functions appears near the end of this section. For
complete information on specific PWB functions, consult on-line help.

Getting a File Handle

Extension functions can do many different tasks, but they typically
manipulate a file in some way. The extension function in the CENTER.C
example rewrites a line or lines in the current file, for example. The
current file is the one that appears in the editing window. Since it is
already open for editing, you can access the current file without opening
it. Simply assign its file handle to a variable in your function.

PWB file-handling functions use file handles of type PFILE. The CENTER.C
example declares the following handle variable:

PFILE pFile;

The FileNameToHandle function gets a handle to a file that is already open
for editing:

pFile = FileNameToHandle( "", "" );

The function takes two string arguments. If the first string is empty, as
here, the FileNameToHandle function returns a handle to the current file.
You can use the AddFile function to get handles to other files (in which
case you may need to use other PWB functions such as FileRead).

Reading a Line From the File

Once your function has a file handle, it can read from the file with the
GetLine function, which reads one line at a time:

len = GetLine( yStart, buf, pFile );

The first argument is a line number, the second a pointer to a buffer, and
the third a file handle. So the above call reads line number  yStart  from
the file whose handle is  pFile  into the buffer  buf. Note that the first
line in a file is line 0, not line 1.

Once you have read a line into a local buffer, you can manipulate it as
desired. CENTER.C uses its buffer  buf  to center the line's text.

Writing a Line to the File

After modifying a line, you can write it back to the file. The PutLine
function writes one line at a time:

PutLine( yStart, buf, pFile );

PutLine takes the same arguments as GetLine─a line number, buffer pointer,
and file handle. In CENTER.C, the above call writes the line from  buf  to
line  yStart  in the file whose handle is  pFile.
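
The step between the GetLine and PutLine calls is ordinary buffer work. The
following sketch shows one way a line could be centered in a local buffer. It
is not the code from CENTER.C; the function name center_line and its width
parameter are assumptions made for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Center the text held in buf within `width` columns, in place.
   buf must have room for the longer of its current contents and
   `width` characters, plus the terminating null.
   Returns the length of the resulting string. */
size_t center_line( char *buf, size_t width )
{
    size_t start = 0;
    size_t len;
    size_t pad;

    assert( buf != NULL );
    len = strlen( buf );

    /* Ignore trailing white space. */
    while( len > 0 && isspace( (unsigned char)buf[len - 1] ) )
        len--;

    /* Ignore leading white space. */
    while( start < len && isspace( (unsigned char)buf[start] ) )
        start++;
    len -= start;

    /* Half of the unused columns go to the left of the text. */
    pad = ( len < width ) ? ( width - len ) / 2 : 0;

    memmove( buf + pad, buf + start, len );   /* regions may overlap */
    memset( buf, ' ', pad );
    buf[pad + len] = '\0';

    return pad + len;
}
```

An extension would call a helper like this on the buffer filled by GetLine,
then pass the same buffer to PutLine.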

Summary of PWB Functions

If you understand how CENTER.C works, you know the basics of using PWB
functions in your own functions. The rest is just a matter of learning the
details of individual functions. Table 8.1 lists the PWB functions, grouping
them by category. For additional information on specific functions, consult
on-line help.

Table 8.1  Callable PWB Functions

Category           Function           Description
────────────────────────────────────────────────────────────────────────────
Block Operations   CopyBox            Insert rectangular area
                   CopyLine           Insert range of lines
                   CopyStream         Insert stream of text
                   DelBox             Delete rectangular area
                   DelLine            Delete range of lines
                   DelStream          Delete stream of text

Build              fGetMake           Get extmake setting
                   fSetMake           Set extmake setting

Color              GetColor           Get color of specified line
                   PutColor           Set color of specified line

Cursor             GetCursor          Get cursor position
                   MoveCur            Move cursor

Dialog             DoMessageBox       Create message dialog
                   PopUpBox           Display text in dialog window

Display            BadArg             Report that argument was invalid
                   Display            Update screen
                   DoMessage          Display message on dialog line

File               AddFile            Open new file and get file handle
                   DelFile            Delete contents of file buffer
                   fChangeFile        Change current file to named file
                   FileNameToHandle   Get handle to open file
                   FileRead           Copy disk file to file buffer
                   FileWrite          Copy file buffer to disk file
                   pFileToTop         Make specified file the current file
                   RemoveFile         Remove file from memory

Keyboard           KbHook             Restore keyboard control to PWB
                   KbUnHook           Remove keyboard control from PWB
                   ReadChar           Get information on next keystroke

Format             ReadCmd            Get keystroke information in CmdDesc

Line               FileLength         Get length of file
                   GetLine            Get line from file
                   PutLine            Write line to file

List               GetListEntry       Get item from list
                   ScanList           Process list

Memory             Falloc             Allocate far memory
                   Fdalloc            Deallocate far memory

Miscellaneous      fExecute           Execute macro
                   FindSwitch         Get information about switch
                   GetEditorObject    Get internal PWB data item
                   GetString          Get input from dialog line
                   mgetenv            Get environment string
                   NameToFunc         Get information about function or macro
                   NameToKeys         Get key(s) assigned to specified function
                   Replace            Replace character
                   SetEditorObject    Set internal PWB data item
                   SetKey             Assign function to keystroke

Search             REsearch           Search for regular expression
                   search             Search for string

Virtual Memory     fpbtoVM            Copy data to virtual memory
                   VMalloc            Allocate virtual memory
                   VMFree             Free virtual memory
                   VMtofpb            Copy data from virtual memory

Window             CloseWnd           Close window
                   Resize             Resize window
                   SplitWnd           Split window
────────────────────────────────────────────────────────────────────────────

8.4.8  Calling C Library Functions

You can write many useful extension functions using only PWB functions
listed in the previous section. It is also possible to call C library
routines, with some limitations. An extension written for OS/2 protected
mode can call any C library routine if it is linked with EXTHDRP.OBJ and the
.DLL C run-time library. The list of usable routines is shorter for
real-mode (DOS) extensions linked with the non-.DLL run-time library.

Before you call a C library routine, ask whether the task can be done with a
PWB function. If the answer is yes, you should always call a PWB function in
preference to the C library routine. This practice ensures compatibility
between your functions and PWB.

The following categories of C library routines are always safe to use in
real mode:

■   Buffer manipulation

■   Character classification and conversion

■   Data conversion

■   String manipulation

This list includes the library routines you are most likely to need in an
extension function. If your extension function calls C library functions,
you must link with the compact-model C library.

The following routines should not be used in real mode:

■   Routines that need C start-up support (most input/output functions)

■   Memory management routines, such as malloc, and routines that call
them

■   Process control routines such as spawn and exec

If you are in doubt about a particular C library routine, you can always use
it and see what happens. If the linker displays the following message,

error L2044: __acrtused : symbol multiply defined, use /NOE

the routine requires C start-up support and should not be used.

Chapter 9  Debugging C Programs with CodeView
────────────────────────────────────────────────────────────────────────────

Even experienced programmers occasionally find bugs in their programs. This
chapter explores techniques that will help you locate these errors quickly,
using the Microsoft CodeView debugger.

This chapter describes:

■   How to display and modify variables and memory

■   How to control the flow of execution while debugging

■   Advanced CodeView debugging techniques

■   How to control CodeView's behavior with command-line switches and the
TOOLS.INI file

CodeView supports the Microsoft mouse (or any fully compatible pointing
device). All operations are described first using the mouse; the keyboard
command follows.

For information about debugging OS/2 programs that use threads or processes,
see Chapter 15, "Creating OS/2 Multithread Applications."

9.1  Understanding CodeView Windows

CodeView divides the screen into logically separate sections called windows,
so that a large amount of information can be displayed in an organized and
easy-to-read fashion. Each window is a discrete section of the display that
operates independently of the other windows.

Each window displays a different type of data.

Each CodeView window has a distinct function. The name of each window
described below appears in the top of the window's frame:

■   The Source window displays the source code. You can open a second
Source window to view an include file, another source file, or the
same source file at a different location.

■   The Command window accepts debugging commands.

■   The Watch window displays the current values of selected variables.

■   The Local window lists the values of all variables local to the
current function or block.

■   The Memory window shows the contents of memory. You can open a second
Memory window to view a different section of memory.

■   The Register window displays the contents of the microprocessor's
registers, as well as the processor flags.

■   The 8087 window displays the registers of the coprocessor or its
software emulator.

CodeView starts running with three windows displayed. The Local window is at
the top, the Source window fills the middle of the screen, and the Command
window is at the bottom.

There are two ways to open windows. You can choose the desired window from
the View menu. (Note that you can open more than one of certain windows,
such as Source or Memory.) In addition, some operations (such as selecting a
Watch variable) open the appropriate window automatically, if it is not
already open.

All displays are updated automatically.

CodeView continually and automatically updates the contents of all windows.
However, if you want to interact with a particular window (for instance, to
enter a command, set a breakpoint, or modify a variable), you must select
that window as the focus of user interaction.

The selected window is called the "current" window. The current window is
marked in three ways:

■   The window's name is highlighted in white.

■   The text cursor appears in the window.

■   The vertical and horizontal scroll bars are moved into the window.

To select a new current window, click left in the window (position the mouse
cursor in the window and press the left mouse button) that you want to be
current. You can also press F6 or SHIFT+F6 to move the focus from one window
to the next.

Windows often contain more information than can be displayed in the area
allotted to the window. There are two ways to view these additional
contents. You can drag on the window's horizontal or vertical scroll bars.
(Position the mouse pointer on the bar and, while holding down the left
mouse button, drag the mouse in the appropriate direction.) You can also use
the direction keys (LEFT, RIGHT, UP, DOWN) to move the text cursor.

Typing commands into the Source window causes CodeView to temporarily shift
its focus to the Command window. Whatever you type is appended to the last
line in the Command window. If the Command window is closed, CodeView beeps
in response to your entry and ignores the input.

Although you cannot change the relative positions of the windows, you can
change their size or remove them. The Maximize, Size, and Close commands
from the View menu perform these functions, or you can press CTRL+F10,
CTRL+F8, and CTRL+F4, respectively. Window manipulations are especially easy
with a mouse:

■   To maximize a window (enlarge it so it fills the screen), click left
on the up arrow at the right end of the window's top border. To
restore the window to its previous size and position, click left on
the double arrow at the right end of the top border.

■   To change the size of a window, position the mouse pointer anywhere
along the white line at the top of the window. Press and hold down the
left mouse button. When two double arrows appear on the line, you can
drag the mouse to enlarge or reduce the window. The same action on a
vertical border widens or narrows the window.

■   To close a window, click left on the dot at the left end of the top
border. You can also close any window in the View menu whose name has
a dot next to it by selecting that window from the menu or by pressing
that window's accelerator key. The adjacent windows automatically
expand to recover the empty space.

CodeView stores session information in a file called CURRENT.STS, which is
created in the directory pointed to by the INIT environment variable. The
session information includes such items as the name of the program being
debugged, which CodeView windows were open, and the breakpoint locations.
This information becomes the default status the next time you run CodeView.

9.2  Overview of Debugging Techniques

There is no single best approach to debugging for all programs or users.
CodeView offers a variety of debugging tools that let you pick a method
appropriate to the program or your work habits. The following section may
help you decide how to approach a particular program.

Broadly speaking, two things can go wrong in a program:

■   The program doesn't manipulate the data the way you expected it to.

■   The flow of execution is incorrect.

These problems occasionally overlap. Incorrect execution can corrupt the
data, and bad data can cause execution to take an unexpected turn. Because
CodeView allows you to trace program execution and display whatever
combination of variables you want simultaneously, you don't have to know
ahead of time whether the problem is bad data manipulation, a bad execution
path, or some combination of these.

CodeView has features that deal specifically with the problems of bad data
and incorrect execution:

■   You can view and modify any program variable, any section of memory,
or any processor register.

■   You can monitor the path of execution and precisely control where
execution pauses.

The following sections explain how to view and modify data and describe how
execution is controlled.

9.3  Viewing and Modifying Program Data

The CodeView debugger offers a variety of ways to display program variables,
processor registers, and memory. You can also modify the values of all these
items as the program executes. This section shows how to display and modify
variables, registers, and memory.

9.3.1  Displaying Variables in the Watch Window

To add a variable to the Watch window, position the cursor on the name of
the variable using either the mouse or the direction keys (LEFT, RIGHT, UP,
DOWN). Then select the Add Watch command from the Watch menu, or press
CTRL+W.

A dialog box appears with the selected variable's name displayed in the
Expression field. If you don't want to watch the variable shown, type in the
name of the variable you want to watch. Pressing ENTER or clicking left on
the OK button adds this variable to the Watch window.

The Watch window appears at the top of the screen. Adding a Watch variable
automatically opens the Watch window if the window doesn't already exist.

A newly added variable may be followed by the message:

<Watch Expression Not in Context>

This message appears when program execution has not yet reached the block
where the variable is defined. (A block is a section of code enclosed in
curly braces.) Global variables (those declared outside C functions) never
cause CodeView to display this message; they can be watched from anywhere in
the program.

To remove a variable from the Watch window, use the Delete Watch command
from the Watch menu, and select the variable to be removed using the list in
the dialog box. You can also position the cursor on any line in the Watch
window and press CTRL+Y to delete the line.

There is no limit to how many variables you can watch.

You can place as many variables as you like in the Watch window; the
quantity is limited only by available memory. You can scroll through the
Watch window to position it at those variables you want to view. CodeView
automatically updates all watched variables as the program runs, including
those not currently visible.

Loops (do, for, or while) cause problems when they don't terminate
correctly. Displaying loop variables in the Watch window is an easy way to
determine whether a loop variable achieves its proper value.

9.3.2  Displaying Expressions in the Watch Window

You may have noticed that the Add Watch dialog box prompts for an
expression, not simply a variable name. As this suggests, you can enter an
expression (that is, any valid combination of variables, constants, and
operators) for CodeView to evaluate and display.

Expressions can use the syntax of other languages.

You are not limited to evaluating C expressions. The Language command of the
Options menu offers a choice of BASIC or FORTRAN expression evaluation, if
one of these languages better suits your needs. The ability to select the
language evaluator is especially useful when debugging mixed-language
programs. Remember that C-specific features, such as type casting or pointer
conversions, are not available in other languages.

You can display more information with expressions than with individual
variables.

By reducing several variables to a single, easily read value, an expression
can be easier to interpret than the components that make it up. Imagine a
for loop with two variables whose ratio is supposed to remain constant. You
suspect that one of these variables (you aren't sure which) sometimes takes
the wrong value. With (var1 / var2)  displayed as an expression in the Watch
window, you can easily see when this single value changes; you don't have to
mentally divide two numbers.

You can also display Boolean expressions. For example, if a variable is
never supposed to be larger than 100 or less than 25,  (var < 25 || var >
100)  evaluates to 1 (true) when  var  goes out-of-bounds.
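
To make the two Watch expressions above concrete, the following sketch
provides code they could be tried against. The function names, the starting
values, and the deliberately planted error are assumptions made for this
illustration. While stepping through first_ratio_break, the expression
(var1 / var2) stays at 2 until the faulty update occurs.

```c
#include <assert.h>

/* Same test as the Watch expression (var < 25 || var > 100):
   returns 1 (true) when var is out-of-bounds, 0 otherwise. */
int out_of_bounds( int var )
{
    return ( var < 25 || var > 100 );
}

/* Steps two variables that are meant to keep a 2:1 ratio.
   At iteration bad_step, var2 is advanced by the wrong amount.
   Returns the first iteration at which var1 / var2 is no longer 2,
   or -1 if the ratio holds for all `steps` iterations. */
int first_ratio_break( int steps, int bad_step )
{
    int var1 = 2;
    int var2 = 1;
    int i;

    assert( steps >= 0 );

    for( i = 0; i < steps; i++ )
    {
        if( var1 / var2 != 2 )       /* the value to watch */
            return i;

        var1 += 2;
        var2 += ( i == bad_step ) ? 2 : 1;   /* planted error */
    }
    return -1;
}
```

Watching the single expression is enough to find the iteration where the
ratio fails; the individual variables need to be examined only after that.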

9.3.3  Displaying Arrays and Structures

Most program variables are scalar quantities─a single character or a single
integer or floating-point value. These appear in the Watch window with the
variable name to the left, followed by an equal sign (=) and the current
value.

You can view arrays and structures in expanded form.

Arrays and structures contain multiple values, arranged in one or more
layers. They are often referred to as "aggregate" data items. CodeView lets
you control how much of these variables is shown; that is, whether all,
part, or none of their internal structure is displayed.

An array initially appears in the Watch window in this form:

+wordholder[]  = [...]

The brackets indicate that this variable contains more than one element. The
plus sign (+) indicates that the variable has not yet been expanded to
display its components.

To expand the array, double-click anywhere on the line. You can also
position the cursor on the line and press ENTER. For example, if  wordholder
is a six-character array containing the word "Basic," the Watch window
display changes to the following:

-wordholder[]
   [0]  =  66 'B'
   [1]  =  97 'a'
   [2]  =  115 's'
   [3]  =  105 'i'
   [4]  =  99 'c'
   [5]  =  0 ''

Note that both the individual character values and their ASCII decimal
equivalents are listed. The minus sign (-) indicates no further expansion is
possible. To contract the array, double-click on its line (or position the
cursor on the line and press ENTER) again.

If it is inconvenient to view a character array in this form, cast the
variable's name to a character pointer by placing  (char *)  in front of the
name. The character array is then displayed as a string delimited by
apostrophes.

You can display arrays with more than one dimension. Imagine a 5 x 5 integer
array named  matrix, whose diagonal elements are the numbers 1 through 5 and
whose other elements are zero. Unexpanded, the array is displayed like this:

+matrix[]  = [...]

Double-clicking on  matrix  (or pressing ENTER) changes the display:

-matrix[]
   +[0][]  =  [...]
   +[1][]  =  [...]
   +[2][]  =  [...]
   +[3][]  =  [...]
   +[4][]  =  [...]

The actual values of the elements are not shown yet. You have to descend one
more level to see them. To view the elements of the third row of the array,
position the cursor anywhere on the fourth line and press ENTER:

-matrix[]
   +[0][]  =  [...]
   +[1][]  =  [...]
   -[2][]
      [0]  = 0
      [1]  = 0
      [2]  = 3
      [3]  = 0
      [4]  = 0
   +[3][]  =  [...]
   +[4][]  =  [...]

Expanding the fifth row of the array produces this display:

-matrix[]
   +[0][]  =  [...]
   +[1][]  =  [...]
   -[2][]
      [0]  = 0
      [1]  = 0
      [2]  = 3
      [3]  = 0
      [4]  = 0
   +[3][]  =  [...]
   -[4][]
      [0]  = 0
      [1]  = 0
      [2]  = 0
      [3]  = 0
      [4]  = 5

You can view individual elements instead of the entire array.

Any element of an array (or structure) can be independently expanded or
contracted. If you only want to view one or two elements of a large array,
specify the particular array or structure elements in the Expression field
of the Add Watch dialog box; you need not display every element of the
variable.
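
For reference, variables matching the wordholder and matrix displays in this
section could be declared as in the following sketch. Only the variable names
and contents come from the examples above; the macro N and the function
fill_diagonal are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 5

/* Six characters: five letters plus the terminating null. */
char wordholder[6] = "Basic";

/* 5 x 5 array; static storage, so every element starts at zero. */
int matrix[N][N];

/* Set the diagonal elements to 1 through N and all others to zero. */
void fill_diagonal( int m[N][N] )
{
    int i;
    int j;

    assert( m != NULL );

    for( i = 0; i < N; i++ )
        for( j = 0; j < N; j++ )
            m[i][j] = ( i == j ) ? i + 1 : 0;
}
```

After fill_diagonal( matrix ) has run, entering  matrix[2][2]  in the
Expression field watches just the one element that holds 3, rather than the
whole array.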

You can dereference pointers.

You can dereference a pointer in the same way as you expand an array or
structure. The pointer address is displayed, followed by all the elements of
the variable to which the pointer currently refers. Multiple levels of
indirection (that is, pointers referencing other pointers) can be displayed
simultaneously.

9.3.4  Displaying Array Elements Dynamically

You do not have to display every element of an array. If specific subscripts
are given, the corresponding element is displayed.

You can also specify a dynamic array element, which changes as some other
variable changes. For example, suppose that the loop variable  p  is a
subscript for the array variable  catalogprice. The Watch window expression
catalogprice[p]  displays only the array element currently specified by p,
not the entire array.

You can mix constant and variable subscripts. For example, the expression
bigarray[3][i]  displays only the element of row 3 (the fourth row, since
subscripts start at 0) that the index variable  i  selects.
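
A loop of the following kind is a natural place to use a dynamic array
element. The array contents, the macro ITEMS, and the function name
most_expensive are assumptions made for this sketch; only the names
catalogprice  and  p  come from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define ITEMS 4

double catalogprice[ITEMS] = { 9.95, 14.50, 3.25, 20.00 };

/* Returns the subscript of the highest price in the array.
   With catalogprice[p] in the Watch window, the displayed value
   follows p as the loop advances. */
int most_expensive( const double *price, int count )
{
    int p;
    int best = 0;

    assert( price != NULL && count > 0 );

    for( p = 1; p < count; p++ )
        if( price[p] > price[best] )
            best = p;

    return best;
}
```

Adding  price[best]  as a second Watch expression shows the running maximum
beside the element currently being examined.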

9.3.5  Using Quick Watch

Selecting the Quick Watch command from the Watch menu (or pressing SHIFT+F9)
displays the Quick Watch dialog box. If the text cursor is in the Source,
Local, or Watch window, the variable at the current cursor position appears
in the dialog box. If this is not the item you wish to display, type in the
desired expression or variable, then press ENTER. The selected item is
displayed immediately.

The Quick Watch display automatically expands arrays and structures to their
first level. For example, an array with three dimensions is expanded to the
first dimension. You can expand or contract an element just as you would in
the Watch window: position the cursor on the appropriate line and press
ENTER. If the array needs more lines than the Quick Watch window can
display, drag the mouse along the scroll bar, or press DOWN or PGDN to view
the rest of the array.

You can add Quick Watch variables to the Watch window.

If you decide to add a Quick Watch item to the Watch window, select the Add
Watch button. Arrays and structures appear in the Watch window expanded as
they were displayed in the Quick Watch box.

Quick Watch is a convenient way to take a quick look at a variable or
expression. Since only one Quick Watch variable can be viewed at a time, you
would not use Quick Watch for most of the variables you want to view.

9.3.6  Displaying Memory

Selecting the Memory command from the View menu opens a Memory window. Up to
two Memory windows can be open at one time.

By default, memory is displayed as hexadecimal byte values, with 16 bytes
per line. At the end of each line is a second display of the same memory in
ASCII form. Values that correspond to printable ASCII characters (decimal 32
through 127) are displayed in that form. Values outside this range are shown
as periods.
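
The default layout described above can be imitated with a short routine,
which may help when reading the Memory window for the first time. This is a
sketch only: the function name format_dump_line is an assumption, the address
column is omitted, and the exact spacing CodeView uses is not reproduced.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format up to 16 bytes as two-digit hexadecimal values followed by
   the same bytes in ASCII form. Values from 32 through 127 appear as
   characters; all others appear as periods.
   out must have room for at least 66 characters. */
void format_dump_line( const unsigned char *mem, size_t count, char *out )
{
    size_t i;
    char *p = out;

    assert( mem != NULL && out != NULL && count <= 16 );

    /* Hexadecimal column: always 16 slots of three characters. */
    for( i = 0; i < 16; i++ )
    {
        if( i < count )
            p += sprintf( p, "%02X ", mem[i] );
        else
            p += sprintf( p, "   " );
    }

    *p++ = ' ';

    /* ASCII column. */
    for( i = 0; i < count; i++ )
        *p++ = ( mem[i] >= 32 && mem[i] <= 127 ) ? (char)mem[i] : '.';

    *p = '\0';
}
```

For the three bytes 48 69 00, the ASCII column produced by this routine reads
"Hi." because the null byte falls outside the displayable range.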

You can display memory values in any form.

Byte values are not always the most convenient way to view memory. If the
area of memory you're examining contains character strings or floating-point
values, you might prefer to view them in a directly readable form. The
Memory Window command of the Options menu displays a dialog box with a
variety of display options:

■   ASCII characters

■   Byte, word, or double-word binary values

■   Signed or unsigned integer decimal values

■   Short (32 bit), long (64 bit), or ten-byte (80 bit) floating-point
values

You can also directly cycle through these display formats by pressing F3.

If a section of memory cannot be displayed as a valid floating-point number,
the number shown includes the characters  NAN  (not a number).

Displaying Variables with a Live Expression

Section 9.3.4, "Displaying Array Elements Dynamically," explains how to
display a specific array element by adding the appropriate expression to the
Watch window. It is also possible to watch a particular memory area that
your program uses to store data in the Memory window. This CodeView display
feature is called a "live expression."

"Live" means that the area of memory displayed changes to reflect the value
of a pointer or subscript. For example, if  buffer  is an array and  pbuf
is a pointer into that array, then  *pbuf  is the array element that  pbuf
currently references. A live expression displays the section of memory
beginning with this element. If your program changes the value of  pbuf,
CodeView dynamically adjusts the Memory window display.

Live expressions are displayed in a Memory window, not in the Watch window.
To create a live expression, select the Memory Window command of the Options
menu, then select the Live Expression check box. Enter the name of the
element you want to view. For example, if  strgptr  is a pointer to an array
of characters, and you want to see what it currently points at, enter
*strgptr. Then select the OK button or press ENTER to view that memory area.

A new Memory window opens. The first memory location in the window is the
first memory location of the live expression. The section of memory
displayed changes to the section the pointer currently references.

You can use the Memory Window command of the Options menu to display the
value of the live expression in a directly readable form. This is especially
convenient when the live expression represents strings or floating-point
values, which are difficult to interpret in hexadecimal form.

It is usually more convenient to view an item in the Watch window than as a
live expression. However, some items are more easily viewed as live
expressions. For example, you can examine what is currently on top of the
stack. Enter SS:SP as the live expression.
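
The following sketch gives a pointer worth following with a live expression.
The array contents and the function name find_char are assumptions made for
this illustration. With  *pbuf  entered as the live expression, the Memory
window would be expected to advance through  buffer  as the loop increments
the pointer.

```c
#include <assert.h>
#include <stddef.h>

char buffer[] = "live expression";

/* Advance pbuf until it references ch or the terminating null,
   and return the final pointer value. */
char *find_char( char *pbuf, char ch )
{
    assert( pbuf != NULL );

    while( *pbuf != '\0' && *pbuf != ch )
        pbuf++;                      /* the live expression follows this */

    return pbuf;
}
```

Because the pointer changes on every pass, a fixed memory address would fall
behind after the first iteration; the live expression avoids retyping the
address.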

9.3.7  Displaying the Processor Registers

Selecting the Register command from the View menu (or pressing F2) opens a
window on the right side of the screen. The current values of the
microprocessor's registers appear in this window.

At the bottom of the window is a group of mnemonics representing the
processor flags. When you first open the Register window, all values are
shown in normal-intensity video. Any subsequent changes are marked in
high-intensity video. For example, suppose the overflow flag is not set when
the Register window is first opened. The corresponding mnemonic is NV and it
appears in light gray. If the overflow flag is subsequently set, the
mnemonic changes to OV and appears in bright white.

Selecting the 386 Instructions command from the Options menu displays the
registers as 32-bit values, but only if your computer uses an 80386
processor, and only when running the real-mode version of CodeView.
Selecting this command a second time toggles back to a 16-bit display.

You can also display the registers of an 8087/287/387 coprocessor in a
separate window by selecting the 8087 command from the View menu. If your
program uses the coprocessor emulator, the emulated registers are displayed
instead.

9.3.8  Modifying the Values of Variables, Registers, and Memory

You can easily change the values of variables, memory locations, or
registers displayed in the Watch, Local, Memory, Register, or 8087 windows.
Simply position the cursor at the value you want to change and edit it to
the appropriate value. If you change your mind, press ALT+BKSP to undo the
last change you made.

The starting address of each line of memory displayed is shown at the left
of the Memory window, in segment:offset form. Altering the address automatically
shifts the display to the corresponding section of memory. If that section
is not used by your program, memory locations are displayed as double
question marks (??).

Byte display form is different from other forms.

When you select Byte display from the Memory Window Options dialog box,
CodeView presents both a hexadecimal and an ASCII representation of the data
in memory. (Byte display is the default.) You can change data in memory
either by entering new hex values over the hexadecimal representation of
your data or by entering character values over the character representation.

To toggle a processor flag, click left on its mnemonic. You can also
position the cursor on a mnemonic, then press any key (except TAB or SPACE).
Repeat to restore the flag to its previous setting.

Be cautious when modifying memory or a register.

The effect of changing a register, flag, or memory location may vary from no
effect at all, to crashing the operating system. You should be cautious when
altering "machine-level" values; most of the items you would want to change
can be altered from the Watch window.

One instance where direct manipulation of register values can be valuable is
when you are debugging in-line assembly code. You can change register values
to test assumptions before making changes in your source code and
recompiling.

9.4  Controlling Execution

There are two forms of program execution under CodeView:

■   Continuous; the program executes until either a previously specified
"breakpoint" has been reached or the program terminates normally.

■   Single-step; the program pauses after each line of code has been
executed.

Sections 9.4.1 and 9.4.2 explain how each form of execution works and the
most effective way to use each.

9.4.1  Continuous Execution

Continuous execution lets you quickly execute the bug-free sections of code,
which would otherwise take a long time to execute a single step at a time.

The simplest form of continuous execution is to click right (position the
mouse pointer and press the right mouse button) anywhere on the line of code
you want to debug or examine in more detail. The program executes at full
speed up to the beginning of this line, then pauses. You can do the same
thing by positioning the text cursor on this line, then pressing F7.

You can also pause execution at a specific line of code with a "breakpoint."
There are several types of breakpoints. Breakpoints are explained in the
following section.

Selecting Breakpoint Lines

Breakpoints can be tied to lines of code.

You can skip over the parts of the program that you don't want to examine by
specifying one or more lines as "breakpoints." The program executes at full
speed up to the first breakpoint, then pauses. Pressing F5 continues program
execution up to the next breakpoint, and so on. (You can halt execution at
any time by pressing CTRL+BREAK or ALT+SYSRQ.)

There is no limit to the number of breakpoints.

You can set as many breakpoints as you like (limited only by available
memory). There are several ways to set breakpoints:

■   Double-click anywhere on the desired breakpoint line. The selected
line is highlighted to show that it is a breakpoint. To remove the
breakpoint, double-click on the line a second time.

■   Position the cursor anywhere on the line at which you want execution
to pause. Press F9 to select the line as a breakpoint. (CodeView
highlights lines that have been selected as breakpoints.) Press F9 a
second time to remove the breakpoint.

■   Display the Set Breakpoint dialog box by selecting Set Breakpoint from
the Watch menu. Choose one of the breakpoint options that permits a
line ("location") to be specified. The line on which the text cursor
currently rests is the default breakpoint line in the Location field.
If this line is not the desired breakpoint, enter the line number
desired. (The line number must begin with a period.) Use F9 or the
Edit Breakpoints screen of the Watch menu to remove the breakpoint.

Not every line can be a breakpoint.

A breakpoint line must be a program line that represents executable code.
You cannot select a blank line, a comment line, or a declaration line (such
as a variable declaration or a preprocessor statement) as a breakpoint.

A breakpoint can also be set at a function or an explicit address. To set a
breakpoint at a function, simply enter its name in the Set Breakpoint dialog
box. To set a breakpoint at an address, enter the address in CS:IP form.

────────────────────────────────────────────────────────────────────────────
NOTE

By default, Microsoft compilers optimize your code. In the process of
optimization, some lines of code may be repositioned or reorganized for more
efficient execution. These changes can prevent CodeView from recognizing the
corresponding lines of source code as breakpoints. Therefore, it is a good
idea to disable optimization during development (use the /Od switch). You
can restore optimization once debugging is completed.
────────────────────────────────────────────────────────────────────────────

Once execution has paused, you can continue execution by pressing F5 or
clicking left on the <F5> button in the display.

Setting Breakpoint Values

Breakpoints can be tied to variables.

Breakpoints are not limited to specific lines of code. CodeView can also
break execution when a variable reaches a particular value, or just changes
value. You can also combine these value breakpoints with line breakpoints,
so that execution stops at a specific line only if a variable has
simultaneously reached a particular value, or changed value. You must use
the check boxes in the Set Breakpoint dialog box to select these other types
of breakpoints.

To pause execution when an expression reaches a particular value, enter that
expression in the Expression field of the Set Breakpoint dialog box. For
example, assume you have declared a tree structure as follows:

struct TAGtree
{
    char * s;                /* Pointer to a string */
    struct TAGtree * left;   /* Pointer to left branch */
    struct TAGtree * right;  /* Pointer to right branch */
};

struct TAGtree t;

You can then pause execution when your tree traversal reaches a terminal
node by entering the expression  (t.left == NULL) || (t.right == NULL).
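
The sketch below shows, under stated assumptions, which nodes make that expression true. The helper name expr_is_true is hypothetical and exists only for illustration; it is not part of CodeView or the C library. It evaluates the same condition through a pointer to a node.

```c
#include <assert.h>
#include <stddef.h>

struct TAGtree
{
    char *s;                 /* Pointer to a string */
    struct TAGtree *left;    /* Pointer to left branch */
    struct TAGtree *right;   /* Pointer to right branch */
};

/* Hypothetical helper: returns nonzero exactly when the breakpoint
   expression (t.left == NULL) || (t.right == NULL) is true for the
   node that t points to. */
int expr_is_true(const struct TAGtree *t)
{
    assert(t != NULL);
    return (t->left == NULL) || (t->right == NULL);
}
```

Because the expression uses ||, it is also true for a node that has only one branch. If you want to pause only at nodes with no branches at all, the same expression written with && is one possible alternative.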

To pause execution when a variable changes value, you need to enter only the
name of the variable in the Expression field. For large variables (such as
arrays or character strings), you can specify the number of bytes you want
checked (up to 32K) in the Length field.

────────────────────────────────────────────────────────────────────────────
NOTE

When a breakpoint is tied to a variable, CodeView must check the variable's
value after each machine instruction is executed. This slows execution
greatly. For maximum speed when debugging, either tie conditional
breakpoints to specific lines, or set conditional breakpoints only after you
have reached the section of code that needs to be debugged.
────────────────────────────────────────────────────────────────────────────

Using Breakpoints

Here are several examples that show how breakpoints can help you find the
cause of a problem.

One of the most common bugs is a for loop that executes too many or too few
times. If you set a breakpoint that encloses the loop statements, the
program pauses after each iteration. With the loop variable or critical
program variables in the Watch or Local windows, it should be easy to see
what the loop is doing wrong.
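
As a minimal sketch of the kind of loop error this technique exposes, the two hypothetical functions below differ only in the loop test. Pausing on each iteration with i and passes in the Watch window would show the first version making one pass more than intended.

```c
#include <assert.h>

/* Hypothetical example: counts how many times the loop body runs.
   The <= comparison makes the body run n + 1 times, not n times. */
int passes_buggy(int n)
{
    int i, passes = 0;

    assert(n >= 0);
    for (i = 0; i <= n; i++)    /* bug: the test should be i < n */
        passes++;
    return passes;
}

/* Corrected version: the body runs exactly n times. */
int passes_fixed(int n)
{
    int i, passes = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        passes++;
    return passes;
}
```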

You can specify how many times a breakpoint line is executed.

You do not have to pause at a breakpoint the first time execution reaches
it. CodeView lets you specify the number of times you want to ignore the
breakpoint condition before pausing. Enter the decimal number in the Pass
Count field of the Set Breakpoint dialog box of the Watch menu.

For example, suppose your program repeatedly calls a function to create a
binary tree. You suspect that something goes wrong with the process about
halfway through. You could mark the line that calls the function as the
breakpoint, then specify how many times this line is to execute before
execution pauses. Running the program creates a representative (but
unfinished) tree structure that can be examined from the Watch window. You
can then continue your analysis using single-stepping.

Another programming error is erroneously assigning a value to a variable.
Enter the variable in the Expression field of the Set Breakpoint dialog box.
Execution breaks whenever this variable changes value.

You can assign new values to variables while execution is paused.

Breakpoints are a convenient way to pause the program so you can assign new
values to variables. For example, if a limit value is set by a variable, you
can change the value to see whether program execution is affected.
Similarly, you can pass a variety of values to a switch statement to see if
they are correctly processed.

This ability to alter variables is an especially convenient way to test new
functions without having to write a stand-alone test program.

9.4.2  Single-Stepping

In single-stepping, CodeView pauses after each line of code is executed. (If
a line contains more than one executable statement, CodeView executes all
the statements on the line before pausing.) The next line to be executed is
highlighted in reverse video.

There are two ways to single-step.

You can single-step through a program with the Step and Trace functions.
Step (executed by pressing F10) steps over function calls. All the code in
the function is executed but, to you, the function appears to execute as a
single step. Trace (executed by pressing F8) traces through every step of
all functions for which CodeView has symbolic information. Each line of the
function is executed as a separate step. (CodeView has no symbolic
information about run-time functions; therefore, they are executed as a
single step.)

You can alternate between Trace and Step as you like. The method you use
depends only on whether you want to see what happens within a particular
function.

You can Trace through the program continuously (without having to press F8),
using the Animate command of the Run menu. The speed of execution is
controlled by the Trace Speed command from the Options menu. You can halt
animated execution at any time by pressing any key.

9.5  Replaying a Debug Session

CodeView can automatically create a "tape" (a disk file) with all the
debugging instructions and input data you entered when testing a program.
The tape is then "replayed" to repeat the debugging process. This dynamic
replay feature is unique to the CodeView debugger and is activated by
selecting the History On command from the Run menu. Selecting History On a
second time terminates recording.

You can use the recording as a bookmark. You can quit after a long debugging
session, then pick up the session later in the same place.

Dynamic replay makes it easy to correct a mistake.

The principal use of dynamic replay is to allow you to back up when you make
an error or overshoot the section of code with the bug. This feature is
important because not all bugs are located when executing the program in a
linear fashion.

For example, you may have to manually execute a function many times before
its bug appears. If you then enter a command that alters the machine's or
program's status and thereby lose the information you need to find the cause
of the bug, you would have to restart the program and manually repeat
every debugging step to return to that point. Even worse, if you don't
remember the exact sequence of events that exposed the bug, it could take
hours to find your way back.

Dynamic replay eliminates this problem. Selecting the Undo command from the
Run menu automatically restarts the program and rapidly executes every debug
command up to (but not including) the last one you entered. You can repeat
this process as many times as you like until you return to the desired point
in execution.

To add additional steps to an existing tape, select History On, then select
Replay. When replay has completed, perform whatever new debugging steps you
want, then select History On a second time to terminate recording. The new
tape contains both the original and the added commands.

────────────────────────────────────────────────────────────────────────────
NOTE

CodeView records only those mouse commands that apply to CodeView. Mouse
commands recognized by the application being debugged are not recorded.
────────────────────────────────────────────────────────────────────────────

Replay Limitations under OS/2

There are some limitations to dynamic replay when debugging under OS/2:

■   The program must not respond to asynchronous events.

■   Breakpoints must be specified at specific source lines or for specific
symbols (rather than by absolute addresses), or replay may fail.

■   Single-thread programs behave normally during replay. However, one of
the threads in a multithread program may cause an asynchronous event,
violating the first restriction. Multithread programs are, therefore,
more likely to fail during replay.

■   Multiprocess replay will fail. Each new process invokes a new CodeView
session. The existence of multiple sessions makes it impractical to
record the sequence of events if you execute commands in a session
other than the original.

■   Replay under Presentation Manager is not currently supported because
it violates the first restriction.

9.6  Advanced CodeView Techniques

Once you are comfortable displaying and changing variables, stepping through
the program, and using dynamic replay, you might want to experiment with the
advanced techniques explained below.

Setting Command-Line Arguments

If your program retrieves command-line arguments, you can specify them with
the Set Runtime Arguments command from the Run menu. Enter the arguments in
the Command Line field before you begin execution. (Arguments entered after
execution begins cause an automatic restart.)

Multiple Source Windows

You can open two Source windows at the same time. The windows can display
two different sections of the same program, or one can show the high-level
listing and the other the assembly-language listing. In the latter case, the
contents of the windows track, with the next assembly-language instruction
to be executed matching the next line of source code.

You can move freely between these windows, executing a single line of source
code or a single assembly instruction at a time. The assembly-language
window must be opened in CS:IP mode.

Calling Functions

Any C function in your program (whether user-written or from the library)
can be called from the Command window or the Watch window, using the
following format:

?funcname (varlist)

The function is evaluated and the returned value is displayed in the Command
window.

The function does not have to be called by your program to be available for
evaluation. For example, all the .OBJ code specified in the linker input
response file is linked. The functions in this code can then be evaluated
from the Command window.

This feature allows you to run functions from within CodeView that you would
not normally include in the final version of your program. For example, you
could include the OS/2 API functions that control semaphores, then execute
them from the Command window to manipulate the run-time environment at any
point in the debugging process.
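
A debug-only helper is one way to use this feature. The sketch below is an assumption made for illustration: count_nodes is a hypothetical function that the program itself never calls, and the tree type repeats the declaration shown in Section 9.4.1. Once it is linked in, it could be evaluated from the Command window with a command such as ?count_nodes (root), where root is a hypothetical pointer variable in the program being debugged.

```c
#include <assert.h>
#include <stddef.h>

struct TAGtree
{
    char *s;                 /* Pointer to a string */
    struct TAGtree *left;    /* Pointer to left branch */
    struct TAGtree *right;   /* Pointer to right branch */
};

/* Hypothetical debug-only helper: returns the number of nodes in the
   subtree that starts at node. An empty subtree (NULL) has 0 nodes. */
int count_nodes(const struct TAGtree *node)
{
    int n;

    if (node == NULL)
        return 0;
    n = 1 + count_nodes(node->left) + count_nodes(node->right);
    assert(n > 0);   /* a non-empty subtree has at least one node */
    return n;
}
```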

Checking for Undefined Pointers

Until a pointer has been explicitly assigned a value, its value is
undefined. That is, its value may be completely random, or it may be some
consistent value that does not point to a useful data address (such as -1).

Accessing data through an uninitialized pointer will cause unpredictable
program behavior and, under OS/2, will usually result in a protection
violation. Because many C programs use pointers heavily, tracking down
exactly which pointer variable was left uninitialized is tedious.

CodeView can help locate the problem quickly. If you use an uninitialized
pointer (or "null pointer" under OS/2) the operating system will generate a
protection violation. By examining the Calls menu, you can determine the
last line of your code that was executed before the protection violation
occurred.

Under DOS, you can take advantage of the fact that global or static
variables are initialized to 0 to track down uninitialized pointers. Set a
conditional breakpoint that stops when location 0 changes, then start
execution. Execution will pause when your program makes an assignment to
that location.

────────────────────────────────────────────────────────────────────────────
NOTE

For near pointers, location 0 is DS:0000; for far pointers, location 0 is
0000:0000.
────────────────────────────────────────────────────────────────────────────

Using Breakpoints Efficiently

Breakpoints slow execution when debugging. You can increase CodeView's speed
by using the /R command-line switch if you have an 80386-based computer.
This switch enables the 386's four debug registers, which support breakpoint
checking in hardware rather than in software.

Printing Selected Items

You can print all or part of the contents of any window with the Print
command from the File menu. The check box lets you print the complete
contents of the window, only the material that is currently viewable in the
window, or selected text from the window. Text is selected by dragging the
mouse across it, or by holding down the SHIFT key and pressing the direction
keys (LEFT, RIGHT, UP, DOWN).

By default, print output is to the file CODEVIEW.LST in the current
directory. You can choose whether the new material will be appended to an
existing file or overwrite it, using the Append/Overwrite check box. If you
would like print output to go to a different file, type its name in the To
File Name field. If you want the output to go to a printer, enter the
appropriate device name, such as LPT1 or COM2.

Handling Register Variables

A register variable is stored in one of the microprocessor's registers,
rather than in RAM. This speeds access to the variable.

There are two ways for a conventional variable to become a register
variable. One way is declaring the variable as a register variable; if a
register is free, the compiler will store the variable there. The other way
occurs during optimization, when the compiler stores an often-used variable
(such as a loop variable) in a register to speed up execution.
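
The first of these two ways looks like the sketch below. The function name sum_to is hypothetical. The register keyword is only a request; the compiler is free to keep the variable in memory instead.

```c
#include <assert.h>

/* Sums the integers 1 through n. The loop counter is declared with the
   register storage class, which asks (but does not force) the compiler
   to keep it in a processor register. */
long sum_to(int n)
{
    register int i;
    long total = 0;

    assert(n >= 0);
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```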

Register variables can cause problems during debugging. As with local
variables, they are only visible within the function where they are defined.
In addition, a register variable may not always be displayed with its
current value.

In general, it is a good idea to turn off all optimization and to avoid
declaring register variables until the program has been fully debugged. Any
side effects produced by optimization or register variables can then be
easily isolated.

Redirecting CodeView Input and Output

The Command window accepts DOS-like commands that redirect input and output.
These commands can also be included on the command line that invokes
CodeView. Whatever follows the /C option in the command line is treated as
CodeView commands that are immediately executed at start-up.

CV /c "<infile; t >outfile" myprog

Input is redirected to infile, which can contain start-up commands for
CodeView. When CodeView exhausts all commands in the input file, focus
automatically shifts to the Command window. Output is sent to outfile and
echoed to the Command window. The t must precede the > command for
output to be sent to the Command window.

Redirection is a useful way to automate CodeView start-up. It also lets you
keep a viewable record of command-line input and output, a feature not
available with dynamic replay. (No record is kept of mouse operations.) Some
applications (particularly interactive ones) may need modification to allow
for redirection of input to the application itself.

Using CodeView with Additional Memory

If your computer uses expanded or extended memory, you can increase
CodeView's functionality by selecting the /X or /E option. CodeView moves as
much as it can of itself, the debugging table, and the program to higher
memory (above the first megabyte).

The /X option uses extended memory and gives the greatest speed increase.
This option requires the HIMEM.SYS driver, which is included on your
distribution disks. Add  DEVICE = HIMEM.SYS  to your CONFIG.SYS file to load
HIMEM.SYS at boot time.

The /E option uses expanded memory. The speed increase is not as great as
that supplied by the /X option. The expanded memory manager (EMM) must be
LIM 4.0, and no single module's debug information can exceed 48K. If the
symbol table exceeds this limit, try reducing file-name information by not
specifying paths at compile time and using /Zi only with those sections of
the program that need debugging (use /Zd otherwise).

If you do not specify either /X or /E (or the /D disk-overlay option),
CodeView automatically searches for the HIMEM.SYS driver and extended memory
so it can implement the /X option. If it fails, CodeView searches for
expanded memory to implement the /E option. If that search fails, CodeView
uses a default disk overlay of 64K. (See the description of the /D option
below.)

9.7  Controlling CodeView with Command-Line Options

The following options can be added to the command line that invokes
CodeView:

Option                            Effect
────────────────────────────────────────────────────────────────────────────
/2                                Two-monitor debugging. The display
adapters must be configured for
different addresses. One display shows
the output of the application; the other
shows CodeView.

/25                               Display in 25-line mode.

/43                               Display in 43-line mode (EGA or VGA
only).

/50                               Display in 50-line mode (VGA only).

/B                                Display in black and white. This assures
that the display is readable when a
color display is not used.

/Ccommands                        All items following this switch are
treated as CodeView commands to be
executed immediately on start-up.
Commands must be separated with a
semicolon (;).

/D«ddd»                           Use disk overlays, where ddd is the
decimal size of the overlay buffer, in
kilobytes. The acceptable range is 16K
to 128K. The default size is 64K. DOS
only.

/E                                Use expanded memory for symbolic
information. DOS only.

/F                                Flip screen video pages. When your
application does not use graphics, eight
video screen pages are available.
Switching from CodeView to the output
screen is accomplished more quickly than
swapping (/S) by directly selecting the
appropriate video page. Cannot be used
with /S. DOS only.

/Inumber                          Turns nonmaskable interrupts and
8259-interrupt trapping on (/I1) or off
(/I2).

/K                                Disables installation of keyboard
monitors for the program being debugged.

/Ldlls                            Load DLLs specified. DLLs must be
separated by a semicolon (;). OS/2 only.

/M                                Disable the mouse.

/Nnumber                          /N0 tells CodeView to trap; /N1 tells it
not to.

/O                                Debug child processes ("offspring").
OS/2 only.

/R                                Use 386 hardware debug registers. DOS
only.

/S                                Swap screen in buffers. When your
program uses graphics, all eight screen
buffers must be used. Switching from
CodeView to the output screen is
accomplished by saving the previous
screen in a buffer. Cannot be used with
/F. DOS only.

/X                                Use extended memory for symbolic
information. DOS only.

9.8  Customizing CodeView with the TOOLS.INI File

The TOOLS.INI file customizes the behavior and user interface of several
Microsoft products. The TOOLS.INI file is a plain ASCII text file. You
should place it in a directory pointed to by the INIT environment variable.
(If you do not use the INIT environment variable, CodeView looks for
TOOLS.INI only in its source directory.)

The CodeView section of TOOLS.INI is preceded by the following line:

[cv]

If you are running the protected-mode version of CodeView, use [cvp]
instead. If you run both versions, include both: [cv cvp].

Most of the TOOLS.INI customizations control screen colors, but you can also
specify such things as start-up commands or the name of the file that
receives CodeView output. On-line help contains full information about all
TOOLS.INI switches for CodeView.

PART III  Special Environments
────────────────────────────────────────────────────────────────────────────

The Microsoft C Professional Development System provides a platform from
which you can build graphics applications and interface with programs
written in other languages.

Chapter 10 discusses using the real-world graphics functions to set video
modes, draw basic shapes, and use graphic fonts. Chapter 11 describes
"presentation graphics," sophisticated charts and graphics that show data
relationships. Chapter 12 explains how to write C programs so that they
interface with assembly language routines or routines written in other
languages. Chapter 13 describes portability of Microsoft C to other
environments.

Chapter 10  Communicating with Graphics
────────────────────────────────────────────────────────────────────────────

A map, a chart, an illustration, a graph, or some other visual aid often can
communicate more information more quickly and more vividly than would
several screens of text.

The extensive Microsoft C graphics library allows you to communicate your
ideas graphically. The functions range from the simple to the complex; from
functions that turn on a pixel to functions that draw graphs and charts
complete with labels and legends.

This chapter describes low-level graphics functions that draw basic shapes
such as lines, circles, and rectangles. It introduces video modes, color
palettes, coordinate systems, and synopses of the graphics and font
functions. For complete function prototypes and example programs, use
on-line help.

────────────────────────────────────────────────────────────────────────────
NOTE

The ANSI C standard does not define any standard graphics functions. The
functions described in this section are unique to Microsoft C and are not
portable to other implementations of C.
────────────────────────────────────────────────────────────────────────────

10.1  Video Modes

Graphics adapters are boards or cards inside the computer that are
responsible for displaying text and graphics on the screen. Commonly used
adapters include the following:

■   CGA (Color Graphics Adapter)

■   EGA (Enhanced Graphics Adapter)

■   HGC (Hercules Graphics Card)

■   MCGA (Multicolor Graphics Array)

■   MDPA (Monochrome Display Printer Adapter)

■   VGA (Video Graphics Array)

In addition, there are Olivetti versions of the CGA, EGA, and VGA (called
OCGA, OEGA, and OVGA in this chapter).

The video modes available at run time depend on your graphics adapter and
monitor.

Adapters can enter one or more "video modes." The video mode controls the
resolution and number of colors on the video display. Microsoft C supports
17 video modes, which fall into two broad categories:

■   "Text modes," where characters are displayed

■   "Graphics modes," where individual pixels can be turned on and off

The graphics adapter and the type of monitor in use determine which of the
17 video modes are available at run time. See Section 10.1.2 for a list of
video modes.

10.1.1  Sample Low-Level Graphics Program

The program ERESBOX.C below shows, in a few lines, the steps you follow to
enter and exit a graphics mode. It sets the video mode _ERESCOLOR, draws a
box, waits for a keypress, and returns to default mode, which is the video
mode in effect when the program began running.

/* ERESBOX.C -- Enters _ERESCOLOR mode and draws a box */

#include <graph.h> /* graphics functions */
#include <stdio.h> /* puts */
#include <conio.h> /* getch */

main()
{
    if( _setvideomode( _ERESCOLOR ) ) /* EGA 640x350 mode */
    {
        _rectangle( _GBORDER, 10, 10, 110, 110 ); /* draw */
        getch(); /* wait for a keypress */
        _setvideomode( _DEFAULTMODE ); /* return to default */
    }
    else
        puts( "Can't enter _ERESCOLOR graphics mode." );
}

The program above illustrates the steps you follow to display graphics:

■   Include the header file GRAPH.H. It contains function prototypes,
macros, useful structures, and symbolic constants such as _ERESCOLOR,
_GBORDER, and _DEFAULTMODE.

#include <graph.h>

■   Call the _setvideomode function, which sets the desired video mode.
The function returns 0 if the hardware does not support the requested
mode. (See Section 10.1.2, "Setting a Video Mode.")

if( _setvideomode( _ERESCOLOR ) )

■   Draw the graphics on the screen. The example program calls the
_rectangle function. (See Section 10.4.3, "Drawing Points, Lines, and
Shapes.")

_rectangle( _GBORDER, 10, 10, 110, 110 );

■   Exit the graphics mode and return to whatever video mode was in effect
before the program began. Call _setvideomode, passing the constant
_DEFAULTMODE. In some cases, you might want to skip this step, exiting
the program with the graphics screen still in place.

_setvideomode( _DEFAULTMODE );

In addition, you must link with the GRAPHICS.LIB library, which contains the
function code. If you use window-coordinate functions (which require
floating-point calculations) and if you have not created a standard combined
library containing a floating-point component, you must explicitly link with
a floating-point math library.

10.1.2  Setting a Video Mode

The _setvideomode function turns on one of the 17 available video modes.
Pass it a single integer that tells it which mode to display. The constants
in Table 10.1 are defined in the GRAPH.H file. The dimensions are listed in
pixels for video graphics mode and in columns for video text mode.

Table 10.1  Constants that Represent Video Modes

╓┌────────────────┌───────────────────────────────────────┌──────────────────╖
Constant (Name)  Description                             Mode/Hardware
────────────────────────────────────────────────────────────────────────────
_DEFAULTMODE     Restores the original mode              All/All

_ERESCOLOR       640  x  350, 4 or 16 color              Graphics/EGA

_ERESNOCOLOR     640  x  350, BW                         Graphics/EGA

_HRES16COLOR     640  x  200, 16 color                   Graphics/EGA

_HERCMONO*       720  x  348, BW                         Graphics/HGC

_HRESBW          640  x  200, BW                         Graphics/CGA

_MAXCOLORMODE    Graphics mode with the most colors      Graphics/All┼

_MAXRESMODE      Graphics mode with the highest          Graphics/All┼
                 resolution

_MRES4COLOR      320  x  200, 4 color                    Graphics/All

_MRES16COLOR     320  x  200, 16 color                   Graphics/EGA

_MRES256COLOR    320  x  200, 256 color                  Graphics/VGA

_MRESNOCOLOR     320  x  200, 4 gray                     Graphics/CGA

_ORESCOLOR       640  x  400, 1 of 16 colors             Graphics/Olivetti

_TEXTBW40        40 column text, 16 gray                 Text/CGA

_TEXTBW80        80 column text, 16 gray                 Text/CGA

_TEXTC40         40 column text, 16/8 color              Text/CGA

_TEXTC80         80 column text, 16/8 color              Text/CGA

_TEXTMONO        80 column text, BW                      Text/MDPA

_VRES2COLOR      640  x  480, BW                         Graphics/VGA

_VRES16COLOR     640  x  480, 16 color                   Graphics/VGA

────────────────────────────────────────────────────────────────────────────

* Before attempting to enter _HERCMONO mode, you must install the
terminate-and-stay-resident program MSHERC.COM, which comes in the Microsoft
C package. If you have both a Hercules adapter and an additional graphics
adapter in the same computer, use the /H option to put the Hercules into
HALF mode to avoid unpredictable and undesirable results.
┼ _MAXRESMODE and _MAXCOLORMODE support all adapters except the MDPA. See
Section

If the hardware does not support the selected mode, _setvideomode returns 0.

Some graphics adapters are able to enter additional video modes:

■   EGA adapters can display all CGA modes.

■   HGC adapters can enter _TEXTMONO mode.

■   MCGA adapters can display all CGA modes, plus _VRES2COLOR and
_MRES256COLOR.

■   VGA adapters can display all EGA and CGA modes.
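
Because _setvideomode returns 0 for an unsupported mode, a program can try several modes in order of preference and keep the first one the hardware accepts. The sketch below is a hardware-independent illustration of that idea, not code from GRAPH.H: first_supported_index, setmode_fn, fake_ega_setmode, and the SK_ constants are all hypothetical names, and the SK_ values are placeholders rather than the real mode constants.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder mode codes for this sketch only; they are NOT the values
   that GRAPH.H assigns to _VRES16COLOR, _ERESCOLOR, and _HRESBW. */
enum { SK_VRES16COLOR = 1, SK_ERESCOLOR = 2, SK_HRESBW = 3 };

/* A mode setter returns nonzero on success and 0 on failure, the same
   convention the text gives for _setvideomode. */
typedef int (*setmode_fn)(int mode);

/* Hypothetical helper: try each mode in order of preference and return
   the index of the first one accepted, or -1 if none is accepted. */
int first_supported_index(const int *modes, size_t count, setmode_fn setmode)
{
    size_t i;

    assert(modes != NULL && setmode != NULL);
    for (i = 0; i < count; i++)
    {
        if (setmode(modes[i]) != 0)
            return (int)i;
    }
    return -1;
}

/* Stand-in setter that imitates an EGA: it accepts the EGA mode and the
   CGA mode but rejects the VGA-only mode. */
int fake_ega_setmode(int mode)
{
    return mode == SK_ERESCOLOR || mode == SK_HRESBW;
}
```

On the target system, the setter passed in would presumably be a small wrapper that calls _setvideomode and returns its result. The helper returns an index rather than a mode number so that the failure value cannot be confused with any mode constant.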

10.1.3  Reading the videoconfig Structure

At any time, you can inquire about the current video configuration by
passing the _getvideoconfig function a structure of type videoconfig. The
structure contains 11 members, all of which are short integers. They are
listed in Table 10.2.

Table 10.2  Members of a videoconfig Structure

╓┌──────────────────────┌────────────────────────────────────────────────────╖
Member                 Description
────────────────────────────────────────────────────────────────────────────
adapter*               Active display adapter
bitsperpixel           Number of bits per pixel
memory                 Adapter video memory in kilobytes
mode*                  Current video mode
monitor*               Active display monitor
numcolors              Number of color indexes
numtextcols            Number of text columns available
numtextrows            Number of text rows available
numvideopages          Number of video pages available
numxpixels             Number of pixels on the x axis
numypixels             Number of pixels on the y axis
────────────────────────────────────────────────────────────────────────────

* Possible values for the mode, adapter, and monitor items are listed in the
GRAPH.H file.

The _getvideoconfig function initializes these values. Most of the values
are self-explanatory. For example, if numxpixels holds 640, the current
video mode contains 640 horizontal pixels, numbered 0 - 639.

The READVC.C example program below illustrates how to initialize and examine
a videoconfig structure:

/* READVC.C -- Reads the videoconfig structure */

#include <graph.h>
#include <stdio.h>

main()
{
    struct videoconfig vc;

    _getvideoconfig( &vc );
    printf( "Text Rows = %i.\n", vc.numtextrows );
}

First, the program declares a structure vc of type videoconfig. Next, it
calls _getvideoconfig to initialize the structure. Finally, it prints a
member of the structure.

10.1.4  Maximizing Resolution or Color

Two symbolic constants are new to Microsoft C 6.0: _MAXRESMODE and
_MAXCOLORMODE. The first selects the highest possible resolution for the
graphics adapter and monitor currently in use. The second selects the
graphics mode with the greatest number of colors. The constants work with
all graphics adapters except the MDPA. (See Table 10.3.)

Table 10.3  Constants for Maximum Resolution and Color

╓┌────────────────┌──────────────┌───────────────────────────────────────────╖
Adapter/Monitor  _MAXRESMODE    _MAXCOLORMODE
────────────────────────────────────────────────────────────────────────────
CGA              _HRESBW        _MRES4COLOR
EGA color        _HRES16COLOR   _HRES16COLOR
EGA ecd 64K      _ERESCOLOR     _HRES16COLOR
EGA ecd 256K     _ERESCOLOR     _ERESCOLOR
EGA mono         _ERESNOCOLOR   _ERESNOCOLOR
HGC              _HERCMONO      _HERCMONO
MCGA             _VRES2COLOR    _MRES256COLOR
MDPA             Fails          Fails
OCGA             _ORESCOLOR     _MRES4COLOR
OEGA color       _ORESCOLOR     _ERESCOLOR
VGA/OVGA         _VRES16COLOR   _MRES256COLOR
────────────────────────────────────────────────────────────────────────────

10.1.5  Selecting Your Own Video Modes

A program that will run only on a single machine with a known graphics
adapter can enter the appropriate video mode immediately. However, if you
attempt to run the program on another machine with a different adapter, it
may not run correctly, if at all.

If your program might run on a variety of computers and you prefer to select
your own video modes, initialize a videoconfig structure by calling the
_getvideoconfig function. Then check the adapter member and use a switch
statement to enter the selected video mode.

For example, suppose you know that a program will run on monochrome systems
equipped with either an EGA adapter or a Hercules adapter. To enter the
appropriate mode, use code such as this:

struct videoconfig vc;

_getvideoconfig( &vc );

switch( vc.adapter )
{
    case _EGA:
        _setvideomode( _ERESNOCOLOR );
        break;
    case _HGC:
        _setvideomode( _HERCMONO );
        break;
}

10.2  Mixing Colors and Changing Palettes

Depending on the graphics card installed and the video mode in effect, you
can display 2, 4, 8, 16, or 256 colors on the screen at the same time. You
specify a color by selecting a color index (sometimes called a "pixel value"
or "color attribute"). The color indexes are numbered from 0 to n-1, where n
is the number of colors in the palette.

CGA adapters offer four different palettes containing predefined fixed color
sets.

All video modes that support color offer a color palette.

EGA, MCGA, and VGA adapters have palettes that can be redefined to suit your
needs. You can change the visible color associated with any color index by
remapping to a color index a color value that describes the true color (the
amount of red, green, and blue) you want to display.

Olivetti adapters (OCGA, OEGA, and OVGA) support the standard CGA, EGA, and
VGA modes (and palettes), plus an additional Olivetti mode described in
Section 10.2.2, "Olivetti Palettes."

────────────────────────────────────────────────────────────────────────────
NOTE

The distinction between a color index and a color value is important. A
color index is always a short integer. A color value is always a long
integer. The only exception to this rule involves _setbkcolor, which uses a
color index cast to a long integer in CGA and text modes.
────────────────────────────────────────────────────────────────────────────

10.2.1  CGA Palettes

The CGA (Color Graphics Adapter) supports two color video modes: _MRES4COLOR
and _MRESNOCOLOR, which display four colors selected from one of several
predefined palettes of colors. They display these foreground colors against
a background color that can be any one of the 16 available colors. With the
CGA hardware, the palette of foreground colors is predefined and cannot be
changed. Each palette number is an integer. (See Table 10.4.)

Table 10.4  CGA Palettes in _MRES4COLOR Mode

╓┌───────────────┌─────────────┌───────────────┌─────────────────────────────╖
                Color Index
Palette Number  1             2               3
────────────────────────────────────────────────────────────────────────────
0               Green         Red             Brown
1               Cyan          Magenta         Light Gray
2               Light Green   Light Red       Yellow
3               Light Cyan    Light Magenta   White
────────────────────────────────────────────────────────────────────────────

The _MRESNOCOLOR video mode produces palettes containing various shades of
gray on monochrome monitors. However, the _MRESNOCOLOR mode displays colors
when used with a color display. Only two palettes are available in this
mode. Table 10.5 shows the colors available in the two palettes.

Table 10.5  CGA Palettes in _MRESNOCOLOR Mode

╓┌───────────────┌────────────┌─────────────┌────────────────────────────────╖
                Color Index
Palette Number  1            2             3
────────────────────────────────────────────────────────────────────────────
0               Blue         Red           Light Gray
1               Light Blue   Light Red     White
────────────────────────────────────────────────────────────────────────────

You can use the _selectpalette function only in the _MRES4COLOR,
_MRESNOCOLOR, and _ORESCOLOR graphics modes. To change palettes in other
video modes, use the _remappalette or _remapallpalette functions.

10.2.2  Olivetti(R) Palettes

Olivetti graphics adapters are found in most Olivetti computers (including
the M24, M28, M240, M280, and M380) and in the AT&T 6300 series computers.
These adapters function the same as their non-Olivetti equivalents; that is,
the OCGA, OEGA, and OVGA adapters support CGA, EGA, and VGA modes,
respectively. In addition, Olivetti adapters can enter the high resolution
_ORESCOLOR mode.

In _ORESCOLOR mode, you can choose one of 16 foreground colors by passing a
value in the range 0 - 15 to the _selectpalette function. The background
color is always black.

10.2.3  VGA Palettes

Depending on the video mode currently in effect, a VGA (Video Graphics
Array) screen has 2, 16, or 256 color indexes chosen from a pool of 262,144
(256K) color values.

To name a color value, specify a level of intensity ranging from 0 - 63 for
each of the red, green, and blue components. The long integer that defines a
color value contains four bytes (32 bits):

(This figure may be found in the printed book.)

The most-significant byte should contain zeros. The two high bits in the
remaining three bytes should also be zero (these bits are ignored).

To mix a light red (pink), turn red all the way up, and mix in some green
and blue:

(This figure may be found in the printed book.)

The number 0x0020203FL represents this value in hexadecimal notation. You
can also use the following macro:

#define RGB( r, g, b ) (0x3F3F3FL & ((long)(b) << 16 | (g) << 8 | (r)))

To create pure yellow (100% red plus 100% green) and assign it to a variable
yel, use this line:

yel = RGB( 63, 63, 0 );

For white, turn all the colors on: RGB( 63, 63, 63 ). For black, set all
colors to 0:  RGB( 0, 0, 0 ).
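The same packing can be written as ordinary functions, which makes the byte layout easier to inspect. The following is a minimal sketch in portable C. The names vga_color and vga_component are hypothetical and are not part of GRAPH.H; the layout (blue in the third byte, green in the second, red in the lowest) is taken from the RGB macro above.

```c
/* Packs red, green, and blue intensities (0 - 63 each) into a long
   integer: blue in bits 16 - 21, green in bits 8 - 13, red in
   bits 0 - 5. Bits outside each six-bit field are masked off. */
long vga_color( int r, int g, int b )
{
    return 0x3F3F3FL & ( (long)b << 16 | (long)g << 8 | (long)r );
}

/* Extracts one component from a color value. Pass a shift of 0 for
   red, 8 for green, or 16 for blue. */
int vga_component( long color, int shift )
{
    return (int)( ( color >> shift ) & 0x3F );
}
```

With these definitions, vga_color( 63, 32, 32 ) yields the pink value 0x0020203FL mentioned earlier, and vga_color( 63, 63, 0 ) yields the yellow value 0x00003F3FL.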

Once you have the color value,

■   Call _remappalette, passing a color index and a color value.

■   Call _setcolor to make that color index the current color.

■   Draw something.

The program YELLOW.C below shows how to remap a color. It draws a rectangle
in color index 3 and then changes index 3 to the color value 0x00003F3FL
(yellow).

/* YELLOW.C -- Draws a yellow box on the screen */
/* Requires VGA or EGA */

#include <graph.h> /* graphics functions */
#include <conio.h> /* getch */

main()
{
    short int index3 = 3;
    long int yellow = 0x00003F3FL;
    long int old3;

    if( _setvideomode( _HRES16COLOR ) )
    {
        /* set current color to index 3 */
        _setcolor( index3 );
        /* draw a rectangle in that color */
        _rectangle( _GBORDER, 10, 10, 110, 110 );
        /* wait for a keypress */
        getch();
        /* change index 3 to yellow */
        old3 = _remappalette( index3, yellow );
        /* wait for a keypress */
        getch();
        /* restore the old color */
        _remappalette( index3, old3 );
        getch();
        /* back to default mode */
        _setvideomode( _DEFAULTMODE );
    }
    else
        _outtext( "This program requires EGA or VGA." );
}

10.2.4  MCGA Palettes

In terms of color mixing, the MCGA (Multicolor Graphics Array) adapter is
the same as the VGA. It can display any of 256K colors. It cannot enter all
of the VGA video modes, however. It is limited to CGA modes and _VRES2COLOR
and _MRES256COLOR.

10.2.5  EGA Palettes

Mixing colors in EGA (Enhanced Graphics Adapter) is similar to the VGA
mixing described in Section 10.2.3, but there are fewer levels of intensity
for the red, green, and blue (RGB) components. In the modes that offer 64
colors, the RGB values include two bits and can range in value from 0 - 3.
The long integer that defines a color value looks like this:

(This figure may be found in the printed book.)

The bits marked 0 should be zeros; the bits marked ? are ignored. EGA
color values are defined this way to maintain compatibility with VGA color
values.

To form a pure red color value, use the constant 0x00000030L. For cyan (blue
plus green), use 0x00303000L. The RGB macro defined above for VGA color
mixing can be used as is, or you can modify it for EGA monitors:

#define EGARGB( r, g, b ) (0x303030L & ((long)(b) << 20 | (g) << 12 | (r) << 4))

In this macro, you would pass values in the range 0 - 3 instead of 0 - 63.

For an example program that remaps a color index to a color value, see
YELLOW.C in Section 10.2.3, "VGA Palettes."
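The EGA packing described in this section can also be sketched as functions in portable C. The names ega_color and vga_to_ega_level are hypothetical and are not part of GRAPH.H. The second function illustrates why a VGA-style value can be passed unchanged: only the top two bits of each six-bit field are significant on the EGA.

```c
/* Packs red, green, and blue levels (0 - 3 each) into an EGA color
   value. Each level lands in the top two bits of the six-bit field
   that the VGA layout uses for the same component. */
long ega_color( int r, int g, int b )
{
    return 0x303030L & ( (long)b << 20 | (long)g << 12 | (long)r << 4 );
}

/* Reduces a VGA intensity (0 - 63) to an EGA level (0 - 3) by keeping
   only the two high bits of the six-bit value. */
int vga_to_ega_level( int level )
{
    return ( level & 0x3F ) >> 4;
}
```

For example, ega_color( 3, 0, 0 ) gives the pure red constant 0x00000030L, and ega_color( 0, 3, 3 ) gives the cyan constant 0x00303000L.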

10.2.6  Symbolic Constants

The GRAPH.H file defines the following constants, which can be used as color values:

╓┌─────────────┌──────────────┌──────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_BLACK        _GREEN         _LIGHTYELLOW
_BLUE         _LIGHTBLUE     _MAGENTA
_BRIGHTWHITE  _LIGHTCYAN     _RED
_BROWN        _LIGHTGREEN    _WHITE
_CYAN         _LIGHTMAGENTA
_GRAY         _LIGHTRED

For example, to change color index 1 to red, use the line

_remappalette( 1, _RED );

which causes any object currently drawn with color index 1 to change to red.
The default color value associated with index 1 is blue.

10.3  Specifying Points within Coordinate Systems

A coordinate system describes points on the screen in terms of their
horizontal (x) and vertical (y) positions. You specify a certain location by
providing two values that map to a unique position.

Graphics functions usually use viewport and window coordinates.

Coordinates on the physical screen never change. Only five functions, listed
in Section 10.3.1, use physical coordinates. All other graphics functions
use one of these two coordinate systems:

■   Viewport coordinates (short integers)

■   Window coordinates (double-precision floating-point numbers)

Viewports and windows can occupy all of the physical screen or just part of
it. The three coordinate systems and conventions for naming points and
regions of the screen are described below.

10.3.1  Physical Coordinates

Within the physical screen, the upper left corner is called the "origin."
The x and y coordinates for the origin are always (0, 0). The x axis extends
in the positive direction left to right, while the y axis extends in the
positive direction top to bottom.

For example, the video mode _VRES16COLOR has a resolution of 640 x 480,
which means the x axis contains the values 0 - 639 (left to right), and the
y axis contains 0 - 479 (top to bottom). (See Figure 10.1.)

(This figure may be found in the printed book.)

Only five functions use physical coordinates: _setcliprgn, _setvieworg,
_setviewport, _getviewcoord, and _getphyscoord.

The _setcliprgn function establishes a "clipping region." Attempts to draw
inside the region succeed, while attempts to draw outside the region are
clipped (ignored). When you first enter a graphics mode, the clipping region
defaults to the entire screen.

The _setvieworg function changes the current location of the origin. When a
program first enters a graphics mode, the physical origin and the viewport
origin are in the upper left corner. The following code moves the viewport
origin to the physical screen location (50, 100):

_setvieworg( 50, 100 );

The effect on the screen is illustrated in Figure 10.2. Note that the number
of pixels remains constant, but the range of legal x values changes from a
range of 0 to 639 (physical screen) to -50 to 589. The legal y values change
as well.
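The arithmetic behind the new ranges is a simple subtraction of the origin. The following is a minimal sketch in portable C; it does not call the graphics library. The structure point_xy and the functions phys_to_view and view_to_phys are hypothetical names chosen for illustration, modeled loosely on the xycoord structure and on the conversions that _getviewcoord and _getphyscoord are described as performing.

```c
/* A point held as two short integers. The name is hypothetical. */
struct point_xy
{
    short x;
    short y;
};

/* Converts a physical screen position to viewport coordinates, given
   the physical location of the viewport origin. */
struct point_xy phys_to_view( struct point_xy origin, short px, short py )
{
    struct point_xy v;

    v.x = (short)( px - origin.x );
    v.y = (short)( py - origin.y );
    return v;
}

/* The inverse conversion: viewport coordinates back to physical ones. */
struct point_xy view_to_phys( struct point_xy origin, short vx, short vy )
{
    struct point_xy p;

    p.x = (short)( vx + origin.x );
    p.y = (short)( vy + origin.y );
    return p;
}
```

With the origin at (50, 100) on a 640 x 480 screen, the physical corners (0, 0) and (639, 479) become (-50, -100) and (589, 379) in viewport coordinates.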

(This figure may be found in the printed book.)

All graphics functions are affected by the new origin, including _arc,
_ellipse, _lineto, _moveto, _outgtext, _pie, and _rectangle.

The third function that uses physical coordinates is _setviewport, described
below, which establishes the boundaries of the current viewport.

10.3.2  Viewport Coordinates

The default viewport coordinate system is identical to the physical screen
coordinate system. The _setviewport function creates a new viewport within
the boundaries of the physical screen. A standard viewport has two
distinguishing features:

■   The origin of a viewport initially lies in the upper left corner of
the viewport, not the upper left corner of the physical screen.

■   The clipping region matches the outer boundaries of the viewport.

In other words, the _setviewport function does the same thing as would two
separate calls to _setvieworg and _setcliprgn. All graphics output functions
require values that are either viewport coordinates or window coordinates.

For example,

_setviewport( 50, 50, 200, 100 );

creates the viewport illustrated in Figure 10.3. The values passed to the
_setviewport function are physical screen locations of opposite corners.
After the viewport is created, the viewport origin lies in the upper left
corner.
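To see how the origin and the clipping region combine, the following sketch tests whether a viewport coordinate falls inside a viewport. It is plain C and does not call the graphics library. The structure viewport and the function in_viewport are hypothetical names, and treating both corners as inclusive is an assumption of the sketch.

```c
/* Hypothetical description of a viewport by its physical corners. */
struct viewport
{
    short x1, y1;   /* upper left corner, physical coordinates  */
    short x2, y2;   /* lower right corner, physical coordinates */
};

/* Returns 1 if the viewport coordinate (vx, vy) lies inside the
   clipping region and 0 if it would be clipped. Because the origin
   sits at the upper left corner, legal values start at 0. */
int in_viewport( const struct viewport *vp, short vx, short vy )
{
    return vx >= 0 && vx <= vp->x2 - vp->x1
        && vy >= 0 && vy <= vp->y2 - vp->y1;
}
```

For the viewport created by the call above, the viewport coordinate (150, 50) maps to the physical corner (200, 100) and is visible, while (151, 0) lies one pixel beyond the right edge and is clipped.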

(This figure may be found in the printed book.)

10.3.3  Window Coordinates

The _setwindow function allows you to use floating-point coordinates instead
of integers. More importantly, it scales the screen coordinates to almost
any size within the current viewport. Window functions take double-precision
arguments and have names that end with the suffixes _w or _wxy. The function
_lineto_w is the window-coordinate equivalent of the viewport function
_lineto.

To create a window for charting 12 months of average temperatures ranging
from -40 to 100, use this line:

_setwindow( TRUE, 1.0, -40.0, 12.0, 100.0 );

The first argument is the invert flag, which puts the lowest y value at the
bottom of the screen instead of the top. The minimum and maximum coordinates
follow. The new organization of the screen is shown in Figure 10.4.

(This figure may be found in the printed book.)

If you plot a point with _setpixel_w or draw a line with _lineto_w, the
values are automatically scaled to the established window.
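The scaling can be pictured as a linear mapping from window values to pixels. The following is a minimal sketch in portable C, not the library's own algorithm: the structure window_map and both functions are hypothetical, and the round-to-nearest step is an assumption. It is intended only for points that lie inside the window.

```c
/* Hypothetical window description: the extreme window values mapped
   onto a viewport that is width x height pixels in size. */
struct window_map
{
    int    invert;         /* nonzero: y increases from bottom to top */
    double wx1, wy1;       /* minimum window coordinates */
    double wx2, wy2;       /* maximum window coordinates */
    int    width, height;  /* viewport size in pixels */
};

/* Maps a window x value to a viewport column, rounding to nearest. */
int window_to_view_x( const struct window_map *w, double wx )
{
    double t = ( wx - w->wx1 ) / ( w->wx2 - w->wx1 );

    return (int)( t * ( w->width - 1 ) + 0.5 );
}

/* Maps a window y value to a viewport row. When the invert flag is
   set, the lowest y value lands on the bottom row. */
int window_to_view_y( const struct window_map *w, double wy )
{
    double t = ( wy - w->wy1 ) / ( w->wy2 - w->wy1 );

    if( w->invert )
        t = 1.0 - t;
    return (int)( t * ( w->height - 1 ) + 0.5 );
}
```

For the temperature window above on a 640 x 480 viewport, month 1 maps to column 0, month 12 to column 639, -40 degrees to row 479 (the bottom), and 100 degrees to row 0 (the top).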

Window-coordinate graphics provide a lot of flexibility. You can fit an axis
into a small range (such as 151.25 to 151.45) or into a large range (-50,000
to 80,000), depending on the type of data to be graphed. In addition, by
changing the window coordinates and redrawing a figure, you can create the
effects of zooming in or panning across a figure.

10.3.4  Screen Locations

A coordinate system needs two values (a horizontal and a vertical position)
to describe the location of a point on the screen. There are times, however,
when it is more convenient to use one variable instead of two.

Some graphics functions require you to pass the location of a point on the
screen. Others return a value that represents a location. The GRAPH.H file
defines two structures that allow you to refer to a point with a single
variable.

■   An xycoord structure contains two short integers called xcoord and
ycoord for use in viewport graphics.

■   A _wxycoord structure contains two doubles called wx and wy for use in
window-coordinate graphics.

For example, you pass four doubles to the _rectangle_w function: an x and y
position for the upper left corner of the rectangle and an x and y position
for the lower right corner. The _rectangle_wxy function takes two _wxycoord
structures.

10.3.5  Bounding Rectangles

Certain figures such as arcs and ellipses are centered within a "bounding
rectangle," specified by two points that define the opposite corners of the
rectangle. The center of the rectangle becomes the center of the figure, and
the rectangle's borders determine the size of the figure. Figure 10.5 shows
start and end vectors and a bounding rectangle in which a pie shape has been
drawn with the _pie function. The first two sets of coordinates are x1, y1,
x2, and y2. They define the boundaries of the rectangle. The pie shape needs
two other points, x3, y3, x4, and y4, which indicate the starting and ending
lines.
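The relationship between the corners and the figure can be sketched with a little integer arithmetic. The functions below are hypothetical helpers written in portable C; they are not part of GRAPH.H and only illustrate how the center and the two radii follow from the corner points.

```c
/* Computes the center of a bounding rectangle from two opposite
   corners. Integer division truncates, so an odd sum drops the
   half pixel. */
void bounding_center( short x1, short y1, short x2, short y2,
                      short *cx, short *cy )
{
    *cx = (short)( ( x1 + x2 ) / 2 );
    *cy = (short)( ( y1 + y2 ) / 2 );
}

/* Computes half the width and half the height of the rectangle,
   which are the two radii of the ellipse that fits inside it.
   The corners may be given in either order. */
void bounding_radii( short x1, short y1, short x2, short y2,
                     short *rx, short *ry )
{
    *rx = (short)( ( x2 > x1 ? x2 - x1 : x1 - x2 ) / 2 );
    *ry = (short)( ( y2 > y1 ? y2 - y1 : y1 - y2 ) / 2 );
}
```

A rectangle with corners (10, 10) and (110, 110) has its center at (60, 60) and radii of 50 pixels in each direction, so an ellipse drawn in it is a circle, subject to the aspect ratio of the video mode.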

(This figure may be found in the printed book.)

10.3.6  The Pixel Cursor

A "pixel cursor" is a location on the screen. The _moveto function positions
this cursor at a given spot. Nothing visible appears. If you call _lineto, a
line is drawn from the current pixel cursor to another point. The _lineto
function also changes the location of the pixel cursor. When you call
_outgtext to display fonted text, the characters are drawn at the current
pixel cursor location.

To draw a series of connected lines, call _lineto several times.

The _getcurrentposition function returns the cursor location in an xycoord
structure.

10.4  Graphics Functions

This section lists the functions that work in one or more bit-mapped
graphics modes. Most of these functions are present in several forms. The
function names that end with _w use double values as arguments and the
window coordinate system. Functions that end with _wxy use the window
coordinate system and a _wxycoord structure to define the coordinates.
Functions with no suffix use the viewport coordinate system.

10.4.1  Controlling Video Modes

The functions described below affect the current video mode, coordinate
systems, clipping regions, viewports, and windows. For more information, use
on-line help.

_clearscreen - Erases the text or graphics screen and fills it with the
current background color (note that setting the video mode automatically
clears the screen). Pass one of the constants _GCLEARSCREEN, _GVIEWPORT, or
_GWINDOW. No return value.

_getphyscoord - Converts viewport coordinates to physical coordinates. Pass
an x and y coordinate from the viewport. The function returns an xycoord
structure, which includes an x and a y position from the physical screen.

_getvideoconfig - Obtains the status of the current graphics environment.
Pass it the address of a structure of type videoconfig. See Section 10.1.3,
"Reading the videoconfig Structure."

_getviewcoord - Converts physical coordinates to viewport coordinates. Pass
two integers: an x and y coordinate. The function returns an xycoord
structure containing the equivalent position within the viewport.

_getviewcoord_w - Converts window coordinates to viewport coordinates. Pass
two doubles that name points within the window. Returns the equivalent
viewport coordinates as an xycoord structure.

_getviewcoord_wxy - Converts window coordinates to viewport coordinates in
an xycoord structure. Pass a _wxycoord structure.

_getwindowcoord - Converts viewport coordinates to window coordinates. Pass
two integers representing viewport coordinates. Returns a _wxycoord
structure.

_setcliprgn - Limits graphic output to part of the screen, called the
"clipping region." Pass four values: the x and y coordinate of the upper
left corner (on the physical screen) and the coordinates of the lower right
corner. The default clipping region is the entire screen. See Section
10.3.1, "Physical Coordinates."

_setvideomode - Selects an operating mode for the display screen. Pass a
constant, such as _HRES16COLOR. Returns 0 if the video mode selected is not
supported by the hardware. See Section 10.1.2, "Setting a Video Mode."

_setvideomoderows - Sets the video mode and the number of rows for text
operations. Pass two values: a video mode and the desired number of text
rows (25, 30, 43, 50, or 60). Pass the symbolic constant _MAXTEXTROWS to get
the largest available number of rows. Returns the number of rows or 0 if
unsuccessful.

_setvieworg - Repositions the viewport origin. Pass an x and y position: the
physical screen location that will become the new origin. Returns the
previous origin in an xycoord structure.

_setviewport - Creates a viewport, including a clipping region and a new
origin in the upper left corner of the viewport. Subsequent calls to
graphics routines will be limited to the viewport area. Pass four short
integers that indicate the physical screen locations of the x and y
coordinates in the upper left and lower right corners of the viewport. No
return value.

_setwindow - Defines a window coordinate system. Pass five values: a short
invert flag (TRUE or FALSE) and four doubles that represent the extreme
values in the upper left and lower right portions of the current viewport.
See Section 10.3.3, "Window Coordinates."

10.4.2  Changing Colors

The functions below control colors and color palettes. For an introduction
to this topic, see Section 10.2, "Mixing Colors and Changing Palettes." For
function prototypes and more information, consult on-line help.

_getbkcolor - Reports the current background color as a long integer. In
EGA, MCGA, and VGA video modes, this is a color value. In CGA and text
modes, it is a color index.

_getcolor - Returns the current color index.

_remapallpalette - Assigns new color values to all color indexes. Pass a
pointer to an array of color values. Returns 0 if unsuccessful.

_remappalette - Assigns a color value to a specific color index. Pass a
short color index and a long color value (which specifies the amount of red,
green, and blue). Returns the previous color value for that index or -1 if
unsuccessful. See Section 10.2.3, "VGA Palettes."

_selectpalette - Selects a predefined palette. This function applies only to
the CGA video modes _MRES4COLOR and _MRESNOCOLOR and the Olivetti graphics
mode _ORESCOLOR. To change palettes in other color video modes, use
_remappalette instead. Pass a short integer in the range 0 - 3 for CGA, or
0 - 15 for Olivetti mode. Returns the value of the previous palette.

_setbkcolor - Sets the current background color. Always pass a long integer.
In EGA, MCGA, and VGA modes, this value is a color value. In CGA and text
modes, this is a color index cast to a long integer. Returns the old
background color or -1 if unsuccessful.

_setcolor - Sets the color index to be used for graphic output. It affects
later calls to functions such as _arc, _ellipse, _floodfill, _lineto,
_outgtext, _outtext, _pie, _rectangle, and _setpixel. Returns the previous
color or -1 if unsuccessful.

10.4.3  Drawing Points, Lines, and Shapes

The functions described below draw points, lines, and shapes. For a
definition of bounding rectangle and pixel cursor, see Sections 10.3.5 and
10.3.6.

_arc - Draws an elliptical arc. Pass eight short integers: four pairs of x
and y coordinates. The first two pairs are the corners of the bounding
rectangle. The third and fourth are the starting and ending points of the
arc. Returns 0 if unsuccessful.

_arc_wxy - Draws an arc within the window. Pass four _wxycoord structures.
The first two are the corners of the bounding rectangle. The third and
fourth are the starting and ending points of the arc. Returns 0 if
unsuccessful.

_ellipse - Draws an ellipse or a circle. Pass a short fill flag ( _GBORDER
or _GFILLINTERIOR) and four short integers representing the corners of the
bounding rectangle. Returns 0 if unsuccessful.

_ellipse_w - Draws an ellipse or a circle within a window. Pass a short fill
flag ( _GBORDER or _GFILLINTERIOR) and four doubles representing the corners
of the bounding rectangle. Returns 0 if unsuccessful.

_ellipse_wxy - Draws an ellipse or a circle. Pass a short fill flag ( _GBORDER
or _GFILLINTERIOR) and two _wxycoord structures representing the two corners
of the bounding rectangle. Returns 0 if unsuccessful.

_getcurrentposition - Returns the current pixel cursor position in viewport
coordinates as an xycoord structure. The current position can be changed by
_arc, _lineto, and _moveto. The default position is the center of the
viewport.

_getcurrentposition_w - Returns the current position of the pixel cursor as
a _wxycoord structure containing the x and y coordinates. Pass nothing.

_getpixel - Returns a pixel's color index. Pass a short x and y coordinate
(in viewport coordinates). If the point is outside the clipping region, the
function returns -1.

_getpixel_w - Returns a pixel's color index. Pass two doubles: an x and y
coordinate.

_lineto - Draws a line from the current pixel cursor position to a specified
point. Pass a short x and a short y position. Returns 0 if unsuccessful.

_lineto_w - Draws a line from the current pixel position to a specified
window coordinate point. Pass a double x and y position. Returns 0 if
unsuccessful.

_moveto - Moves the pixel cursor to a specified point (with no graphic
output). Pass an x and y position. Returns the coordinates of the previous
position in an xycoord structure.

_moveto_w - Moves the pixel cursor to a specified point in a window. Pass
two doubles: an x and a y coordinate. Returns the previous position as a
_wxycoord structure.

_pie - Draws a figure shaped like a pie slice. Pass a short fill flag and
eight short integers. The first four describe the bounding rectangle. The
final four represent the starting vector and ending vector. Returns 0 if
unsuccessful.

_pie_wxy - Draws a pie-slice figure within a window. Pass a short fill flag
and four _wxycoord structures. The first two describe the bounding
rectangle. The second two represent the starting vector and ending vector.
Returns 0 if unsuccessful.

_rectangle - Draws a rectangle in the current line style. Pass a short fill
flag ( _GFILLINTERIOR or _GBORDER) and four short integers: the x and y
coordinates of opposite corners. Returns 0 if unsuccessful.

_rectangle_w - Draws a rectangle in the current line style. Pass a short
fill flag ( _GFILLINTERIOR or _GBORDER) and four doubles: the x and y window
coordinates of opposite corners. Returns 0 if unsuccessful.

_rectangle_wxy - Draws a rectangle in the current line style. Pass a short
fill flag ( _GFILLINTERIOR or _GBORDER) and two _wxycoord structures
describing the x and y coordinates of opposite corners. Returns 0 if
unsuccessful.

_setpixel - Sets a pixel to the current color (which is selected by
_setcolor). Pass it integer x and y coordinates. Returns the previous value
of the pixel or -1 if unsuccessful.

_setpixel_w - Sets a pixel to the current color (which is selected by
_setcolor). Pass it double x and y coordinates describing a position within
the window. Returns the previous value of the pixel or -1 if unsuccessful.

10.4.4  Defining Patterns

The following functions control the style in which straight lines are drawn
and the fill pattern used for solid shapes. For more information, use
on-line help.

_floodfill - Fills a bounded shape with the fill pattern set by _setfillmask
in the current color established by _setcolor. Pass an x and y coordinate
and a boundary color (the color index that marks the edge of the shape to be
filled). Returns 0 if unsuccessful.

_floodfill_w - Fills a bounded shape with the fill pattern set by
_setfillmask. Pass doubles that describe an x and y position within the
window and a boundary color (the color index that marks the edge of the
shape to be filled). Returns 0 if unsuccessful.

_getfillmask - Returns the address of the current fill mask, an
eight-character array, or 0 if the fill mask is not currently defined.

_getlinestyle - Returns the line style, a short integer whose bits
correspond to the screen pixels turned on or off within a line.

_setfillmask - Sets the current fill mask used by _floodfill and functions
that draw solid shapes (_ellipse, _pie, and _rectangle). Pass the address of
an array of eight unsigned characters, where each bit represents a pixel.
The pixels are drawn in the current color. No return value.

_setlinestyle - Sets the current style, which is used to draw the straight
lines within _lineto, _rectangle, and _pie. Pass an unsigned short integer
within which the bits correspond to the pixels on screen. For example,
0xFFFF represents a solid line, 0xAAAA is a dotted line, and 0xF0F0 is
dashed.
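Both the line style and the fill mask are bit patterns in which a 1 bit means "draw this pixel." The following sketch, in portable C, shows one way to read those patterns. The function names are hypothetical, and the bit order (most-significant bit first, repeating every 16 pixels for lines and every 8 pixels in each direction for fills) is an assumption of the sketch rather than a statement about the library.

```c
/* Returns 1 if the pixel at position i (0 or greater) along a line
   would be drawn under the given 16-bit line style. */
int line_style_on( unsigned short style, int i )
{
    return ( style >> ( 15 - ( i % 16 ) ) ) & 1;
}

/* Returns 1 if the pixel at (x, y) would be drawn under an 8 x 8
   fill mask, where each byte of the array is one row of the tile. */
int fill_mask_on( const unsigned char mask[8], int x, int y )
{
    return ( mask[y % 8] >> ( 7 - ( x % 8 ) ) ) & 1;
}
```

Under these assumptions, 0xF0F0 draws four pixels, skips four, and repeats, while a mask whose rows alternate between 0xAA and 0x55 produces a checkerboard.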

10.4.5  Manipulating Images

The functions described below can be used to create animated graphics. The
_getimage and _putimage functions act like a rubber stamp; after capturing a
shape, you can make copies anywhere on the screen.

_getimage - Stores a screen image in memory. Pass four integers (the
coordinates of the bounding rectangle) and a pointer to a storage buffer.
Call _imagesize to find out how much memory is required. No return value.

_getimage_w - Stores a screen image in memory. Pass four doubles (the
coordinates of the bounding rectangle) and a pointer to a storage buffer.
Call _imagesize_w to find out how much memory is required. No return value.

_getimage_wxy - Same as _getimage_w, but you pass two _wxycoord structures
and a pointer to memory.

_imagesize - Returns a long integer representing the size of an image in
bytes. Call this function in preparation for a call to _getimage. Pass four
integers: the x and y coordinates of opposite corners of the portion of the
screen to be saved.

_imagesize_w - Returns the size of an image in bytes in preparation for a
call to _getimage_w and _putimage_w. Pass four doubles: the x and y window
coordinates of opposite corners of the portion of the screen to be saved.

_imagesize_wxy - Same as _imagesize_w, but you pass two _wxycoord
structures.

_putimage - Retrieves an image from memory and displays it on the active
screen page. The image should previously have been saved to memory with
_getimage. Pass two short integers (coordinates where the image is to be
placed), a pointer to the image, and a short integer indicating what kind of
action to take: _GAND, _GOR, _GPRESET, _GPSET, or _GXOR. No return value.

_putimage_w - Displays an image from memory within a window. The image
should previously have been saved to memory with _getimage_w. Pass two
doubles (coordinates where the image is to be placed), a pointer to the
image, and a short integer indicating what kind of action to take: _GAND,
_GOR, _GPRESET, _GPSET, or _GXOR. No return value.
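
The action argument decides how each stored pixel is combined with the pixel already on the screen. The sketch below models that combination in portable C so the effect of each action can be checked away from the graphics library. The enum names are hypothetical stand-ins for the GRAPH.H constants, and the 4-bit pixel width is an assumption (a 16-color mode).

```c
/* Hypothetical stand-ins for the _putimage action verbs; the real
 * constants (_GAND, _GOR, _GPRESET, _GPSET, _GXOR) come from GRAPH.H. */
enum action { ACT_AND, ACT_OR, ACT_PRESET, ACT_PSET, ACT_XOR };

/* Combines one stored image pixel with the pixel already on the screen.
 * Pixels are treated as 4-bit color indexes (assumed 16-color mode). */
unsigned combine_pixel(unsigned screen, unsigned image, enum action act)
{
    unsigned result;

    switch (act) {
    case ACT_AND:    result = screen & image; break;
    case ACT_OR:     result = screen | image; break;
    case ACT_XOR:    result = screen ^ image; break;
    case ACT_PRESET: result = ~image;         break;  /* inverted copy */
    case ACT_PSET:
    default:         result = image;          break;  /* exact copy */
    }
    return result & 0x0Fu;
}
```

The exclusive-OR action is the usual choice for animation: applying the same image twice at the same position restores the original background, so a shape can be drawn, erased, and redrawn elsewhere without saving the screen underneath it.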

10.5  Using Graphic Fonts

A "font" is a collection of stylized text characters. Each font consists of
a typeface with several type sizes.

A "typeface" is the name of the displayed text─Courier, for example, or
Roman. The list later in this section shows six of the typefaces available
with the Microsoft C font library.

"Type size" measures the screen area occupied by individual characters in
units of screen pixels. For example, "Courier 12 x 9" denotes text of
Courier typeface, with each character occupying a screen area of 12 vertical
pixels by 9 horizontal pixels.

A font's spacing can be fixed or proportional. "Fixed" means that all
characters have the same width in pixels. "Proportional" means the width
varies. An i, for example, is thinner than an M.

The Microsoft C font functions use two methods to create fonts. The first
technique generates Courier, Helv, and Tms Rmn fonts through a "bit-mapping"
(or "raster-mapping") technique. Bit-mapping defines character images with
binary data. Each bit in the map corresponds to a screen pixel. If a bit is
1, its associated pixel is set to the current screen color.

The second method creates the remaining three type styles─Modern, Script,
and Roman─as "vector-mapped" fonts. Vector-mapping represents each character
in terms of lines and arcs.

Each method has advantages and disadvantages. Bit-mapped characters are more
completely formed since the pixel mapping is predetermined. However, they
cannot be scaled. Vector-mapped text can be scaled to any size, but the
characters tend to lack the solid appearance of the bit-mapped characters.

The following list shows six sample typefaces:

(This figure may be found in the printed book.)

Table 10.6 lists available sizes for each font. Note that the bit-mapped
fonts come in preset sizes as measured in pixels. The vector-mapped fonts
can be scaled to any size.

Table 10.6  Typefaces and Type Sizes in the C Library

╓┌─────────┌────────┌─────────────────┌──────────────────────────────────────╖
Typeface  Mapping  Size (in pixels)  Spacing
────────────────────────────────────────────────────────────────────────────
Courier   Bit      10 x 8, 12 x 9,   Fixed
                   15 x 12

Helv      Bit      10 x 5, 12 x 7,   Proportional
                   15 x 8, 18 x 9,
                   22 x 12, 28 x 16

Tms Rmn   Bit      10 x 5, 12 x 6,   Proportional
                   15 x 8, 16 x 9,
                   20 x 12, 26 x 16

Modern    Vector   Scaled            Proportional

Script    Vector   Scaled            Proportional

Roman     Vector   Scaled            Proportional

────────────────────────────────────────────────────────────────────────────

10.5.1  Using the C Font Library

Data for both bit-mapped and vector-mapped fonts reside in .FON files. For
example, the files MODERN.FON, ROMAN.FON, and SCRIPT.FON hold data for the
three vector-mapped fonts.

You can use Microsoft Windows .FON files.

The Microsoft C .FON files are identical to the .FON files used in the
Microsoft Windows operating environment. If you have access to Windows, you
can use any of its .FON files with Microsoft C font functions. In addition,
several vendors offer software that creates or modifies .FON files, allowing
you to design your own fonts.

Your programs should follow these three steps to display fonted text:

1.  Register the fonts.

2.  Set the current font from the register.

3.  Display text using the current font.

The following sections describe each of the three steps in detail. An
example program in Section 10.5.5 demonstrates these steps.

10.5.2  Registering the Fonts

The fonts must first be organized into a list in memory, a process called
"registering." Register fonts by calling the function _registerfonts. This
function reads header information from specified .FON files, building a list
of file information but not reading any mapping data from the files.

The GRAPH.H file prototypes the _registerfonts function as

short far _registerfonts( unsigned char far * );

The argument points to a string containing a file name. The file name is the
name of the .FON file for the desired font. The file name can include wild
cards, allowing you to register several fonts with one call to
_registerfonts.

If it successfully reads one or more .FON files, _registerfonts returns the
number of fonts registered. If the function fails, it returns a negative
error code.

10.5.3  Setting the Current Font

Call the function _setfont to select a current font. This function checks to
see if the requested font is registered, then reads the mapping data from
the appropriate .FON file. A font must be registered and marked current
before your program can display text in that font.

The GRAPH.H file prototypes the _setfont function as

short far _setfont( unsigned char far * );

The function's argument is a pointer to a character string. The string
consists of letter codes that describe the desired font, as outlined here:

Option Code                       Meaning
────────────────────────────────────────────────────────────────────────────
b                                 The best fit from the registered fonts.
                                  This option instructs _setfont to accept
                                  the closest-fitting font if a font of
                                  the specified size is not registered.

                                  If at least one font is registered, the
                                  b option always sets a current font. If
                                  you do not specify the b option and an
                                  exact matching font is not registered,
                                  the _setfont function will fail. In this
                                  case, any existing current font remains
                                  current. Refer to on-line help for a
                                  description of error codes returned by
                                  _setfont.

                                  The _setfont function uses four criteria
                                  for selecting the best fit. In
                                  descending order of precedence, the four
                                  criteria are pixel height, typeface,
                                  pixel width, and spacing (fixed or
                                  proportional). If you request a
                                  vector-mapped font, _setfont sizes the
                                  font to correspond with the specified
                                  pixel height and width. If you request a
                                  raster-mapped (bit-mapped) font,
                                  _setfont chooses the closest available
                                  size. If the requested type size for a
                                  raster-mapped font fits exactly between
                                  two registered fonts, the smaller size
                                  takes precedence.

f                                 Fixed-spaced font.

hy                                Character height, where y is the height
                                  in pixels.

nx                                Font number x, where x is less than or
                                  equal to the value returned by
                                  _registerfonts. For example, the option
                                  n3 makes the third registered font
                                  current, if three or more fonts are
                                  registered.

p                                 Proportional-spaced font.

r                                 Raster-mapped (bit-mapped) font.

t'fontname'                       Typeface of the font in single quotes.
                                  The fontname string is one of the
                                  following:

                                  courier  modern  helv  script  tms rmn  roman

                                  Note the space in tms rmn. Additional
                                  font files use other names for fontname.
                                  Refer to the vendor's documentation for
                                  these names.

v                                 Vector-mapped font.

wx                                Character width, where x is the width in
                                  pixels.

Option codes are not case sensitive and can be listed in any order. You can
separate codes with spaces or any other character that is not a valid option
code. The _setfont function ignores all invalid codes.
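
Because the option string is ordinary text, a program can assemble it at run time. The following is a minimal sketch, assuming a modern C compiler that provides snprintf (C99, which is newer than the compiler this manual covers). The helper build_font_options is hypothetical and is not part of GRAPH.H; it only builds the string that would be passed to _setfont.

```c
#include <stdio.h>

/* Hypothetical helper (not part of GRAPH.H): assembles a _setfont option
 * string such as "t'helv'h30w24b" from its parts.  Returns 0 on success,
 * or -1 if the buffer is too small to hold the whole string. */
int build_font_options(char *buf, size_t size, const char *face,
                       int height, int width, int best_fit)
{
    int n = snprintf(buf, size, "t'%s'h%dw%d%s",
                     face, height, width, best_fit ? "b" : "");

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For the face name tms rmn, a height of 30, a width of 24, and the best-fit flag set, the helper produces t'tms rmn'h30w24b. Checking the return value guards against overrunning a short buffer.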

The _setfont function updates a data area with parameters of the current
font. The data area is in the form of a structure, defined in GRAPH.H as
follows:

struct _fontinfo
{
    int     type;          /* set = vector, clear = bit map */
    int     ascent;        /* pix dist from top to base */
    int     pixwidth;      /* character width in pixels */
    int     pixheight;     /* character height in pixels */
    int     avgwidth;      /* average character width */
    char    filename[81];  /* file name including path */
    char    faceName[32];  /* font name */
};

If you want to retrieve the parameters of the current font, call the
function _getfontinfo.
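
The size rule given for the b option (closest available size, with the smaller size winning an exact tie) can be sketched in portable C. This is an illustration only: closest_height is a hypothetical helper that compares pixel height alone, whereas _setfont also weighs typeface, pixel width, and spacing.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the index of the registered height closest
 * to the requested one.  On an exact tie the smaller size wins, as the
 * description of the b option states.  Only pixel height is compared. */
int closest_height(const int *heights, int count, int wanted)
{
    int i, best = -1;

    for (i = 0; i < count; i++) {
        int d, bd;

        if (best < 0) {
            best = i;
            continue;
        }
        d  = abs(heights[i] - wanted);
        bd = abs(heights[best] - wanted);
        if (d < bd || (d == bd && heights[i] < heights[best]))
            best = i;
    }
    return best;    /* -1 if no heights were supplied */
}
```

With the Helv heights from Table 10.6 (10, 12, 15, 18, 22, and 28), a request for height 20 falls exactly between 18 and 22, so the sketch selects 18.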

10.5.4  Displaying Text

The last step, displaying text, consists of two parts. First you must select
a screen position for the text with the graphics function _moveto. Then
display fonted text at that position with the function _outgtext. The
_moveto function takes pixel coordinates as arguments. The coordinates
locate the top left of the first character in the text string.

10.5.5  A Sample Program

The program SAMPLER.C displays sample text in all the available fonts, then
exits when a key is pressed. Make sure the .FON files are in the current
directory before running the program.

/* SAMPLER.C: Displays sample text in various fonts. */

#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <graph.h>
#include <string.h>
#define NFONTS 6

main()
{
    static unsigned char *text[2*NFONTS] =
    {
        "COURIER",        "courier",
        "HELV",           "helv",
        "TMS RMN",        "tms rmn",
        "MODERN",         "modern",
        "SCRIPT",         "script",
        "ROMAN",          "roman"
    };
    static unsigned char *face[NFONTS] =
    {
        "t'courier'",
        "t'helv'",
        "t'tms rmn'",
        "t'modern'",
        "t'script'",
        "t'roman'"
    };

    static unsigned char list[20];
    struct videoconfig vc;
    int mode = _VRES16COLOR;
    register int i;

    /*   Read header info from all .FON files in
     *   current directory
     */

    if( _registerfonts( "*.FON" ) < 0 )
    {
        _outtext( "Error: can't register fonts" );
        exit( 0 );
    }

    /*   Set highest available video mode */

    if( _setvideomode( _MAXRESMODE ) == 0 )
        exit ( 0 );

    /*   Copy video configuration into structure vc */

    _getvideoconfig( &vc );

    /*   Display six lines of sample text */

    for( i = 0; i < NFONTS; i++ )
    {
        /* Build the option string: typeface, then size and best fit */
        strcpy( list, face[i] );
        strcat( list, "h30w24b" );

        if( _setfont( list ) >= 0 )
        {
            _setcolor( i + 1 );
            _moveto( 0, (i * vc.numypixels) / NFONTS );
            _outgtext( text[i * 2] );
            _moveto( vc.numxpixels / 2,
                     (i * vc.numypixels) / NFONTS );
            _outgtext( text[(i * 2) + 1] );
        }
        else
        {
            _setvideomode( _DEFAULTMODE );
            _outtext( "Error: can't set font" );
            exit( 0 );
        }
    }

    getch();
    _setvideomode( _DEFAULTMODE );

    /* Return memory when finished with fonts */

    _unregisterfonts();
    exit( 0 );
}
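
SAMPLER.C spaces its six lines evenly by passing (i * vc.numypixels) / NFONTS to _moveto. The arithmetic can be checked on its own with the sketch below. The helper line_top is hypothetical and the screen heights in the usage note are assumed values, not values read from a video adapter.

```c
/* Hypothetical helper mirroring the arithmetic in SAMPLER.C: returns the
 * y coordinate of the top of text line i when the screen height is
 * divided evenly among nlines lines.  Integer division truncates. */
int line_top(int i, int screen_height, int nlines)
{
    return (i * screen_height) / nlines;
}
```

On a screen 480 pixels high, the fourth of six lines (i equal to 3) starts at y coordinate 240. Because _moveto locates the top left of the first character, each band must be at least as tall as the font for the lines not to overlap.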

10.5.6  Using Fonts Effectively

Displaying fonts is simply another form of graphics; using fonts effectively
requires little programming effort. Still, there are a few things to watch:

■   Remember that the video mode should be set only once. If you generate
an image with presentation graphics and want to add text to it, do not
reset the video mode prior to calling the font routines. Doing so will
blank the screen, destroying the original image.

■   The _setfont function reads specified .FON files to obtain mapping
data for the current font. Each call to _setfont causes a disk access
and overwrites the old font data in memory. If you want to show text
of different styles on the same screen, display all text of one font
before moving on to the others. Minimizing the number of calls to
_setfont saves time spent in disk I/O and memory reloads.

■   When your program finishes using the fonts library, you may want to
free the memory occupied by the register list by calling
_unregisterfonts. This function frees the memory allocated by
_registerfonts. The register information for each type size of each
font takes up approximately 140 bytes of memory.

■   Aesthetic suggestions for the printed page also apply to screen text.
Typefaces are more effective when they do not compete with each other
for attention. Restricting the number of styles per screen to one or
two generally results in a more pleasing, less cluttered image.
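
The advice about minimizing calls to _setfont amounts to grouping text by font before drawing it. The following is a minimal sketch in portable C, assuming the program keeps its screen text in an array; the text_item structure and both functions are hypothetical and do not call the font library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one piece of text and the _setfont options it needs. */
struct text_item {
    const char *font;   /* for example "t'helv'h12b" */
    const char *text;
};

/* qsort comparison: orders items by their font option string. */
static int by_font(const void *a, const void *b)
{
    const struct text_item *x = a;
    const struct text_item *y = b;

    return strcmp(x->font, y->font);
}

/* Counts how many _setfont calls a drawing loop would make if it calls
 * _setfont only when the font differs from the previous item's font. */
int count_font_loads(const struct text_item *items, int n)
{
    int i, loads = 0;

    for (i = 0; i < n; i++)
        if (i == 0 || strcmp(items[i].font, items[i - 1].font) != 0)
            loads++;
    return loads;
}

/* Reorders the items so that all text in one font is adjacent. */
void group_by_font(struct text_item *items, int n)
{
    qsort(items, (size_t)n, sizeof items[0], by_font);
}
```

Four items that alternate between two fonts need four font loads when drawn in their original order and two after grouping. Note that qsort does not promise to keep items of the same font in their original relative order, which is harmless here because each item would carry its own screen position.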

Chapter 11  Creating Charts and Graphs
────────────────────────────────────────────────────────────────────────────

The low-level graphics functions described in Chapter 10, "Communicating
with Graphics," draw points, lines, and shapes. Although it is possible to
use them to generate charts and graphs, an additional set of high-level
graphics functions is better suited to this task.

"Presentation graphics" is a set of high-level functions that displays
presentation-quality graphics. These functions transform numeric data into
pie charts, bar and column charts, line graphs, and scatter diagrams.

This chapter describes how to use presentation graphics.

11.1  Overview of Presentation Graphics

The presentation graphics library PGCHART.LIB contains 22 functions. They
are listed in Table 11.1 for convenient reference.

Table 11.1  Presentation Graphics Functions

╓┌───────────────────┌─────────────────────┌─────────────────────────────────╖
Primary Functions   Secondary Functions
────────────────────────────────────────────────────────────────────────────
_pg_chart           _pg_analyzechart      _pg_hlabelchart
_pg_chartms         _pg_analyzechartms    _pg_resetpalette
_pg_chartpie        _pg_analyzepie        _pg_resetstyleset
_pg_chartscatter    _pg_analyzescatter    _pg_setchardef
_pg_chartscatterms  _pg_analyzescatterms  _pg_setpalette
_pg_defaultchart    _pg_getchardef        _pg_setstyleset
_pg_initchart       _pg_getpalette        _pg_vlabelchart
                    _pg_getstyleset
────────────────────────────────────────────────────────────────────────────

In most cases, you will be using only seven "primary functions." These
functions initialize variables and display selected chart types. The 15
"secondary functions" of presentation graphics do not directly display
charts. Most of them retrieve or set data in the presentation graphics chart
environment.

Among the secondary functions are the "analysis functions," identified by
the prefix _pg_analyze. These five functions calculate default values that
pertain to a given chart type and data set. Calling an analysis function has
the same effect as calling a corresponding primary function, except that the
chart is not displayed. This allows you to pass on to the library the burden
of calculating values. You can then make modifications to the resulting
values and call a primary routine to display the chart.

Use the _pg_hlabelchart and _pg_vlabelchart functions to display text that
is not part of a title or axis label on your chart. These functions enable
you to attach notes or other messages to your chart.

11.2  Parts of a Graph

This section describes the terms used to refer to the different kinds of
information that can be plotted. The various types of charts and graphs are
also defined.

Data Series

Data that are related by a common idea or purpose constitute a "series." For
example, the prices of a futures commodity over the course of a year form a
single series of data. The volume forms a second data series.

When you include several series in one chart, characteristics such as color
and pattern can help distinguish one from another. You can more readily
differentiate series on a color monitor than you can on a monochrome
monitor. The number of series that can appear on the same chart depends on
the chart type and the number of available colors.

Categories

"Categories" are nonnumeric data. A set of categories forms a frame of
reference for the comparison of numeric data. For example, the months of the
year are categories against which numeric data such as inches of rainfall
can be plotted.

Regional sales provide another example. A chart can compare a company's
sales in different parts of the country. Each region forms a category.

Values

"Values" are numeric data. Sales, stock prices, air temperatures, and
populations are all series of values that can be plotted against categories
or against other values.

Presentation graphics allows you to overlay different series of value data
on a single graph. For example, average monthly temperatures or monthly
sales of heating oil during different years─or a combination of temperatures
and sales─can be plotted together on the same graph.

Pie Charts

"Pie charts" are used to represent data by showing the relationship of each
part to the whole. A good example is a company's annual budget. A pie chart
allows you to view each area of revenue or spending by its relative size
within the context of the entire company budget.

Presentation graphics can display either a standard or an "exploded" pie
chart. The exploded view shows the pie with one or more pieces separated for
emphasis. You can label each slice of a pie chart with a percentage figure
if you wish.
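
The percentage shown on a slice is that value's share of the total. A minimal sketch of the calculation in portable C follows; slice_percentages is a hypothetical helper, and the library performs its own calculation internally when percentages are requested.

```c
/* Hypothetical helper: fills pct[] with each value's share of the total,
 * in percent.  Returns the total; if the total is not positive, every
 * percentage is set to 0. */
float slice_percentages(const float *value, int n, float *pct)
{
    float total = 0.0f;
    int i;

    for (i = 0; i < n; i++)
        total += value[i];
    for (i = 0; i < n; i++)
        pct[i] = (total > 0.0f) ? 100.0f * value[i] / total : 0.0f;
    return total;
}
```

For budget items of 50, 30, and 20 units, the slices represent 50, 30, and 20 percent of the pie.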

Bar and Column Charts

As the name implies, a "bar chart" shows data as horizontal bars. Bar charts
show comparisons among items rather than absolute value.

"Column charts" are vertical bar charts. Column charts are frequently used
to show variations over a period of time, since they suggest time flow
better than a bar chart.

Line Graphs

"Line graphs" illustrate trends or changes in data. They show how a series
of values varies against a particular category─for example, average
temperatures throughout one year.

Traditionally, line graphs show a collection of data points connected by
lines. Presentation graphics can also plot points that are not connected by
lines.

Scatter Diagrams

A "scatter diagram" is the only type of graph available in presentation
graphics that directly compares values with values. A scatter diagram simply
plots points.

Scatter diagrams illustrate the relationship between numeric values in
different groups of data. They graphically show trends and correlations not
easily detected from rows and columns of raw numbers.

Scatter diagrams are most useful with large amounts of data. Consider, for
example, the relationship between personal income and family size. If you
poll one thousand wage earners for their income and family size, you have a
scatter diagram with one thousand points. If you combine your results so
that you are left with one average income for each family size, you have a
line graph.
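
Reducing the survey to one average per family size is a simple grouping step. The sketch below shows one way to do it in portable C. The function name, the MAX_FAMILY limit, and the array layout are assumptions made for illustration; the averaged values would then be passed to a line-chart call in place of the raw points.

```c
#define MAX_FAMILY 10   /* assumed upper bound on family size */

/* Hypothetical helper: averages income for each family size from 1 to
 * MAX_FAMILY.  avg[s] receives the mean income for size s, or 0 if no
 * one reported that size.  Returns the number of family sizes that had
 * at least one response. */
int average_by_family_size(const int *size, const float *income, int n,
                           float avg[MAX_FAMILY + 1])
{
    int count[MAX_FAMILY + 1] = {0};
    int i, s, used = 0;

    for (s = 0; s <= MAX_FAMILY; s++)
        avg[s] = 0.0f;
    for (i = 0; i < n; i++) {
        if (size[i] < 1 || size[i] > MAX_FAMILY)
            continue;               /* ignore out-of-range answers */
        avg[size[i]] += income[i];
        count[size[i]]++;
    }
    for (s = 1; s <= MAX_FAMILY; s++) {
        if (count[s] > 0) {
            avg[s] /= (float)count[s];
            used++;
        }
    }
    return used;
}
```

One thousand scatter points collapse to at most MAX_FAMILY points, one per family size, which is the data set a line graph would plot.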

Axes

All presentation graphics charts except pie charts are displayed with two
perpendicular reference axes. The vertical, or y, axis runs from top to
bottom of the chart and is placed against the left side of the screen. The
horizontal, or x, axis runs from left to right across the bottom of the
screen.

The chart type determines the axis used for category data and the axis for
value data.

The x axis is the category axis for column and line charts and the value
axis for bar charts. The y axis is the value axis for column and line charts
and the category axis for bar charts.
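
The axis assignment can be summarized as a small lookup. The sketch below uses hypothetical chart-type tags rather than the PGCHART.H constants, and it is meant only as a reminder of which axis title belongs to which kind of data.

```c
/* Hypothetical chart-type tags; PGCHART.H defines the real constants
 * (_PG_BARCHART, _PG_COLUMNCHART, _PG_LINECHART, and so on). */
enum chart_kind { KIND_BAR, KIND_COLUMN, KIND_LINE };

/* Returns 'x' or 'y': the axis that carries the category labels. */
char category_axis(enum chart_kind kind)
{
    return (kind == KIND_BAR) ? 'y' : 'x';
}

/* Returns the other axis, which carries the numeric values. */
char value_axis(enum chart_kind kind)
{
    return (category_axis(kind) == 'x') ? 'y' : 'x';
}
```

This is why a program that switches from a bar chart to a column chart must also swap the titles it assigns to the x axis and the y axis.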

Chart Windows

The "chart window" defines that part of the screen on which the chart is
drawn. By default, the window fills the entire screen, but presentation
graphics allows you to resize the window for smaller graphs. By redefining
the chart window to different screen locations, you can view separate graphs
together on the same screen.

Data Windows

While the chart window defines the entire graph including axes and labels,
the "data window" defines only the actual plotting area. This is the portion
of the graph to the right of the y axis and above the x axis. You cannot
specify or adjust the size of the data window. Presentation graphics
automatically determines its size based on the dimensions of the chart
window.

Chart Styles

Each of the five types of presentation graphics charts can appear in two
different "chart styles," as described in Table 11.2.

Table 11.2  Presentation Graphics Chart Styles

╓┌───────────┌───────────────────┌───────────────────────────────────────────╖
Chart Type  Chart Style #1      Chart Style #2
────────────────────────────────────────────────────────────────────────────
Pie         With percentages    Without percentages
Bar         Side-by-side        Stacked
Column      Side-by-side        Stacked
Line        Points with lines   Points only
Scatter     Points with lines   Points only
────────────────────────────────────────────────────────────────────────────

Bar and column charts have only one style when displaying a single series of
data. The styles "side-by-side" and "stacked" are applicable when more than
one series appears on the same chart. The first style arranges the bars or
columns for the different series side by side, showing relative heights or
lengths. The stacked style, illustrated for a column chart in Figure 11.3,
emphasizes relative sizes between bars or columns.

Legends

Legends help identify individual data series.

When displaying more than one data series on a chart, presentation graphics
uses different colors, line styles, or patterns to differentiate them.
Presentation graphics also can display a "legend" that labels the different
series of a chart. For a pie chart, the legend labels individual slices of
the pie.

A sample of the color and pattern used to graph the series appears next to
the series label. This identifies the set of data to which the labels
belong.

You may change the font displayed by calling the _registerfonts and _setfont
functions (see Section 10.5 for more information about using fonts). If you
don't select a font, presentation graphics defaults to an internal font.

11.3  Writing a Presentation Graphics Program

To write a C program that uses presentation graphics, follow these steps:

1.  Include the required header files, GRAPH.H and PGCHART.H, as well as
any other header files your program may need.

2.  Set the video mode to a graphics mode. See Chapter 10, "Communicating
with Graphics," for a description of video modes.

3.  Initialize the presentation graphics chart environment. Presentation
graphics places charting parameters in data structures. The amount of
initialization that must be done by your program depends on how
extensively it relies on the defaults.

4.  Assemble the plot data. Data can be collected in a variety of ways: by
calculating it elsewhere in the program, reading it from files, or
entering it from the keyboard. All plot data must be assembled in
arrays because the presentation graphics functions locate them through
pointers.

5.  Call presentation graphics functions to display the chart. Pause while
the chart is on the screen.

6.  Reset the video mode. When your program detects the signal to
continue, it should reset the video to its original (default) mode.

After compiling the program, link it to the library modules PGCHART.LIB and
GRAPHICS.LIB.

The sample programs in Sections 11.3.1-11.3.3 use 5 of the 22 presentation
graphics functions: _pg_initchart, _pg_defaultchart, _pg_chartpie,
_pg_chart, and _pg_chartscatter. Each program is commented so that you can
recognize the steps given in this section.

11.3.1  Pie Chart

The following program uses presentation graphics to display a pie chart for
monthly sales of orange juice over a year. The chart, which is shown in
Figure 11.1, remains on the screen until a key is pressed.

/* PIE.C:  Create sample pie chart.  */

#include <conio.h>
#include <stdlib.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>

#define MONTHS 12

typedef enum {FALSE, TRUE} boolean;

float far value[MONTHS] =
{
    33.0, 27.0, 42.0, 64.0,106.0,157.0,
    182.0,217.0,128.0, 62.0, 43.0, 36.0
};
char far *category[MONTHS] =
{
    "Jan", "Feb", "Mar", "Apr",
    "May", "Jun", "Jly", "Aug",
    "Sep", "Oct", "Nov", "Dec"
};
short far explode[MONTHS] = {0};

main()
{
    chartenv env;
    int mode = _VRES16COLOR;

    /* Set highest video mode available */

    if( _setvideomode( _MAXRESMODE ) == 0 )
        exit( 0 );

    /* Initialize chart library and a default pie chart */

    _pg_initchart();
    _pg_defaultchart( &env, _PG_PIECHART, _PG_PERCENT );

    /* Add titles and some chart options */

    strcpy( env.maintitle.title, "Good Neighbor Grocery" );
    env.maintitle.titlecolor = 6;
    env.maintitle.justify = _PG_RIGHT;
    strcpy( env.subtitle.title, "Orange Juice Sales" );
    env.subtitle.titlecolor = 6;
    env.subtitle.justify = _PG_RIGHT;
    env.chartwindow.border = FALSE;

    /* Parameters for call to _pg_chartpie are:
     *
     *    env        - Environment variable
     *    category   - Category labels
     *    value      - Data to chart
     *    explode    - Separated pieces
     *    MONTHS     - Number of data values
     */
    if( _pg_chartpie( &env, category, value,
                      explode, MONTHS ) )
    {
        _setvideomode( _DEFAULTMODE );
        _outtext( "Error:  can't draw chart" );
    }
    else
    {
        getch();
        _setvideomode( _DEFAULTMODE );
    }
    return( 0 );
}

(This figure may be found in the printed book.)

11.3.2  Bar, Column, and Line Charts

The code for the PIE.C program needs only minor alterations to produce bar,
column, and line charts for the same data:

■   Replace the call to _pg_chartpie with _pg_chart. This function
produces bar, column, and line charts depending on the value of the
second argument for _pg_defaultchart.

■   Give new arguments to _pg_defaultchart that specify chart type and
style.

■   Assign titles for the x axis and y axis in the structure env.

■   Remove references to the array explode, which is applicable only to pie
charts.

The following example produces a bar chart for the store owner's data. The
result is shown in Figure 11.2.

/* BAR.C:  Create sample bar chart. */

#include <conio.h>
#include <stdlib.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>

#define MONTHS 12

typedef enum {FALSE, TRUE} boolean;

float far value[MONTHS] =
{
    33.0, 27.0, 42.0, 64.0,106.0,157.0,
    182.0,217.0,128.0, 62.0, 43.0, 36.0
};
char far *category[MONTHS] =
{
    "Jan", "Feb", "Mar", "Apr",
    "May", "Jun", "Jly", "Aug",
    "Sep", "Oct", "Nov", "Dec"
};

main()
{
    chartenv env;
    int mode = _VRES16COLOR;

    /* Set highest video mode available */

    if( _setvideomode( _MAXRESMODE ) == 0 )
        exit( 0 );

    /* Initialize chart library and a default bar chart */
    _pg_initchart();
    _pg_defaultchart( &env, _PG_BARCHART, _PG_PLAINBARS );

    /* Add titles and some chart options */

    strcpy( env.maintitle.title, "Good Neighbor Grocery" );
    env.maintitle.titlecolor = 6;
    env.maintitle.justify = _PG_RIGHT;
    strcpy( env.subtitle.title, "Orange Juice Sales" );
    env.subtitle.titlecolor = 6;
    env.subtitle.justify = _PG_RIGHT;
    strcpy( env.yaxis.axistitle.title, "Months" );
    strcpy( env.xaxis.axistitle.title, "Quantity (cases)" );
    env.chartwindow.border = FALSE;

    /* Parameters for call to _pg_chart are:
     *    env        - Environment variable
     *    category   - Category labels
     *    value      - Data to chart
     *    MONTHS     - Number of data values
     */
    if( _pg_chart( &env, category, value, MONTHS ) )
    {
        _setvideomode( _DEFAULTMODE );
        _outtext( "Error:  can't draw chart" );
    }
    else
    {
        getch();
        _setvideomode( _DEFAULTMODE );
    }
    return( 0 );
}

(This figure may be found in the printed book.)

The grocer's bar chart becomes a column chart in two easy steps. Simply
specify the new chart type when calling _pg_defaultchart and change the axis
titles. To produce a column chart for the grocer's data, replace the call to
_pg_defaultchart with

_pg_defaultchart( &env, _PG_COLUMNCHART, _PG_PLAINBARS );

Replace the last two calls to strcpy with

strcpy( env.xaxis.axistitle.title, "Months" );
strcpy( env.yaxis.axistitle.title, "Quantity (cases)" );

Note that now the x axis is labeled "Months" and the y axis is labeled
"Quantity (cases)." Figure 11.3 shows the resulting column chart.

(This figure may be found in the printed book.)

Creating an equivalent line chart requires only one change. Use the same
code as for the column chart and replace the call to _pg_defaultchart with

_pg_defaultchart( &env, _PG_LINECHART, _PG_POINTANDLINE );

Figure 11.4 shows the line chart for the grocer's data.

(This figure may be found in the printed book.)

11.3.3  Scatter Diagram

The program SCATTER.C displays a scatter diagram that illustrates the
relationship between the sales of orange juice and hot chocolate throughout
a 12-month period. Figure 11.5 shows the results of SCATTER.C. Notice that
the scatter points form a slightly curved line, indicating that a
correlation exists between the sales of the two products. The demand for
orange juice is roughly inverse to the demand for hot chocolate.

/* SCATTER.C:  Create sample scatter diagram. */

#include <conio.h>
#include <stdlib.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>

#define MONTHS 12

typedef enum {FALSE, TRUE} boolean;

/* Orange juice sales */

float far xvalue[MONTHS] =
{
    33.0, 27.0, 42.0, 64.0,106.0,157.0,
    182.0,217.0,128.0, 62.0, 43.0, 36.0
};

/* Hot chocolate sales */

float far yvalue[MONTHS] =
{
    37.0, 37.0, 30.0, 19.0, 10.0,  5.0,
    2.0,  1.0,  7.0, 15.0, 28.0, 39.0
};

main()
{
    chartenv env;
    int mode = _VRES16COLOR;

    /* Set highest video mode available */

    if( _setvideomode( _MAXRESMODE ) == 0 )
        exit( 0 );

    /* Initialize chart library and default
     * scatter diagram
     */
    _pg_initchart();
    _pg_defaultchart( &env, _PG_SCATTERCHART,
                      _PG_POINTONLY );

    /* Add titles and some chart options */

    strcpy( env.maintitle.title, "Good Neighbor Grocery" );
    env.maintitle.titlecolor = 6;
    env.maintitle.justify = _PG_RIGHT;
    strcpy( env.subtitle.title,
            "Orange Juice vs Hot Chocolate" );
    env.subtitle.titlecolor = 6;
    env.subtitle.justify = _PG_RIGHT;
    env.yaxis.grid = TRUE;
    strcpy( env.xaxis.axistitle.title,
            "Orange Juice Sales" );
    strcpy( env.yaxis.axistitle.title,
            "Hot Chocolate Sales" );
    env.chartwindow.border = FALSE;

    /* Parameters for call to _pg_chartscatter are:
     *    env        - Environment variable
     *    xvalue     - X-axis data
     *    yvalue     - Y-axis data
     *    MONTHS     - Number of data values
     */
    if( _pg_chartscatter( &env, xvalue,
                          yvalue, MONTHS ) )
    {
        _setvideomode( _DEFAULTMODE );
        _outtext( "Error:  can't draw chart" );
    }
    else
    {
        getch();
        _setvideomode( _DEFAULTMODE );
    }
    return( 0 );
}

(This figure may be found in the printed book.)
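
The inverse relationship that the scatter diagram suggests can also be checked numerically. The sketch below is portable C and is not part of presentation graphics. The helper signed_r_squared is hypothetical; it returns the squared correlation coefficient with the sign of the covariance attached, which avoids the need for a square root routine.

```c
/* Hypothetical helper: returns the squared correlation coefficient of two
 * series, carrying the sign of their covariance.  A result near -1 means
 * a strong inverse relationship; a result near +1, a strong direct one. */
double signed_r_squared(const float *x, const float *y, int n)
{
    double mx = 0.0, my = 0.0, sxy = 0.0, sxx = 0.0, syy = 0.0, r2;
    int i;

    if (n < 2)
        return 0.0;                 /* too few points to correlate */
    for (i = 0; i < n; i++) {
        mx += x[i];
        my += y[i];
    }
    mx /= n;
    my /= n;
    for (i = 0; i < n; i++) {
        double dx = x[i] - mx;
        double dy = y[i] - my;

        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if (sxx == 0.0 || syy == 0.0)
        return 0.0;                 /* a constant series has no correlation */
    r2 = (sxy * sxy) / (sxx * syy);
    return (sxy < 0.0) ? -r2 : r2;
}
```

Applied to the xvalue and yvalue arrays from SCATTER.C, the result is clearly negative, which agrees with the downward trend of the plotted points.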

11.4  Manipulating Colors and Patterns

Presentation graphics displays each data series in a way that makes it
discernible from other series. It does this by defining a separate "palette"
for every data series in a chart. Palettes consist of entries that determine
color, line style, fill pattern, and point character used to graph the
series.

Presentation graphics maintains its palettes as an array of structures. The
header file PGCHART.H defines the palette structures as shown below:

/* Typedef for pattern bitmap */
typedef unsigned char fillmap[8];

/* Typedef for palette entry definition */
typedef struct
{
unsigned short color;
unsigned short style;
fillmap        fill;
char           plotchar;
} paletteentry;

/* Typedef for palette definition */
typedef paletteentry palettetype[_PG_PALETTELEN];

Do not confuse the presentation graphics palettes with the adapter display
palettes, which are register values kept by the video controller. The
function _selectpalette described in Chapter 10, "Communicating with
Graphics," sets the display palette. It does not define the data series
palettes used by presentation graphics.

11.4.1  Color Pool

The color pool determines the colors of graphic elements (axes, labels,
legends, titles).

Presentation graphics organizes all chart colors into a "color pool." The
color pool holds the color index values valid for the current graphics mode.
(Refer to Chapter 10, "Communicating with Graphics," for more information
about the color index.) Palette structures contain color codes that refer to
the color pool. A palette's color index determines the colors used to graph
the data series associated with the palette. The colors of labels, titles,
legends, and axes are determined by the contents of the color pool.

The first element of the color pool is always 0, which is the color index
for the screen background color. The second element is always the highest
color index available for the graphics mode. The remaining elements repeat
the sequences of available pixel values, beginning with 1.

As shown in the example in Section 11.4, the first member of a palette data
structure is

unsigned short color;

This member defines the color index for the data series associated with the
palette.

An example should make this clearer. A graphics mode of _MRES4COLOR (320 by
200 pixels) provides four colors for display. Color index values from 0 to 3
determine the possible colors─say, black, green, red, and brown,
respectively. The first eight elements of this color pool are shown below.

╓┌─────────────────┌────────────┌────────────────────────────────────────────╖
Color Pool Index  Color Index  Color
────────────────────────────────────────────────────────────────────────────
0                 0            Black
1                 3            Brown
2                 1            Green
3                 2            Red
4                 3            Brown
5                 1            Green
6                 2            Red
7                 3            Brown
────────────────────────────────────────────────────────────────────────────

Notice that the sequence of available foreground colors repeats from the
third element. The first data series in this case would be plotted in brown,
the second series in green, the third series in red, the fourth series again
in brown, and so forth.
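
The ordering rule described above can be sketched in a few lines of C. This
is only an illustration of the rule, not the library's own code; the function
name build_color_pool and its parameters are hypothetical and do not appear
in PGCHART.H.

```c
#include <stddef.h>

/* Fill pool[0..count-1] following the rule described in the text:
   element 0 is the background (color index 0), element 1 is the
   highest color index, and the remaining elements repeat the
   sequence 1..max_index.
   build_color_pool is a hypothetical name used only for this sketch. */
void build_color_pool(unsigned short *pool, size_t count,
                      unsigned short max_index)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (i == 0 || max_index == 0)
            pool[i] = 0;                 /* background color */
        else if (i == 1)
            pool[i] = max_index;         /* highest available index */
        else
            pool[i] = (unsigned short)((i - 2) % max_index + 1);
    }
}
```

Called with max_index set to 3, as in the _MRES4COLOR example, the first
eight elements match the Color Index column of the table above.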

Video adapters such as the EGA or the Hercules(R) InColor(tm) Card allow 16
on-screen colors. This allows presentation graphics to graph more series
without duplicating colors.

11.4.2  Style Pool

Presentation graphics matches the color pool with a collection of different
line styles called the "style pool." Entries in the style pool define the
appearance of lines such as axes and grids. Lines can be solid, dotted,
dashed, or some combination of styles.

The second member of a palette structure defines a style code as

unsigned short style;

Each palette contains a style code that refers to an entry in the style pool
in the same way that it contains a color code that refers to an entry in the
color pool. The style code value in a palette is applicable only to line
graphs and lined scatter diagrams. The style code determines the appearance
of the lines drawn between points.

Use the different line styles in the style pool to differentiate series.

The palette's style code adds further variety to the lines of a multiseries
graph. It is most useful when the number of lines in a chart exceeds the
number of available colors. For example, a graph of nine different data
series must repeat colors if only three foreground colors are available for
the display. However, the style code for each color repetition will be
different, ensuring that none of the lines looks the same.
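
One way to picture the nine-series example is the sketch below. It assumes
that colors cycle first and that the style advances once per color
repetition; the text does not spell out the exact order the library uses, so
the type series_look and the function look_for_series are hypothetical.

```c
/* Hypothetical pairing of color and line style for a data series.
   series is 1-based; ncolors is the number of foreground colors. */
typedef struct {
    int color_slot;   /* 1..ncolors: which foreground color is reused */
    int style_slot;   /* 1, 2, 3, ...: advances on each color repetition */
} series_look;

series_look look_for_series(int series, int ncolors)
{
    series_look look;

    look.color_slot = (series - 1) % ncolors + 1;
    look.style_slot = (series - 1) / ncolors + 1;
    return look;
}
```

Under this assumption, series 1, 4, and 7 share a color when three
foreground colors are available, but each gets a different style slot, so no
two of the nine lines look the same.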

11.4.3  Pattern Pool

Presentation graphics also maintains a pool of "fill patterns" that
determine the fill design for column, bar, and pie charts. The third member
of the palette structure holds the fill pattern. The pattern member is an
array:

fillmap fill;

where  fillmap  is type-defined as

typedef unsigned char fillmap[8];

Each fill pattern array holds an 8-by-8 bit map that defines the fill
pattern for the data series associated with the palette. Table 11.3 shows
how a fill pattern of diagonal stripes is created with the  fill  pattern
array.

The bit map in Table 11.3 corresponds to screen pixels. Each of the eight
layers of the map is a binary number, where a solid circle signifies 1 and
an open circle signifies 0. Thus the first layer of the map─that is, the
first byte─represents the binary number 10011001, which is the decimal
number 153.

Table   11.3 Fill Patterns

╓┌───────────────────────────────────┌───────────────────────────────────────╖
Bit Map                             Value in Fill
────────────────────────────────────────────────────────────────────────────
☼  ∙  ∙  ☼  ☼  ∙  ∙  ☼              fill[0] = 153
☼  ☼  ∙  ∙  ☼  ☼  ∙  ∙              fill[1] = 204
∙  ☼  ☼  ∙  ∙  ☼  ☼  ∙              fill[2] = 102
∙  ∙  ☼  ☼  ∙  ∙  ☼  ☼              fill[3] =   51
☼  ∙  ∙  ☼  ☼  ∙  ∙  ☼              fill[4] = 153
☼  ☼  ∙  ∙  ☼  ☼  ∙  ∙              fill[5] = 204
∙  ☼  ☼  ∙  ∙  ☼  ☼  ∙              fill[6] = 102
∙  ∙  ☼  ☼  ∙  ∙  ☼  ☼              fill[7] =   51
────────────────────────────────────────────────────────────────────────────

For example, if you want to create the pattern in Table 11.3 for your
chart's first data series, you must reset the  fill array for the first
palette structure. You can do this in five steps:

1.  Declare a structure of type palettetype to hold the palette
parameters.

2.  Call _pg_initchart to initialize the palettes with default values.

3.  Call the presentation graphics function _pg_getpalette to retrieve a
copy of the current palette data.

4.  Assign the values given in Table 11.3 to the array  fill  for the
first palette.

5.  Call the presentation graphics function _pg_setpalette to load the
modified palette values.

The following lines of code demonstrate these five steps:

/* Declare a structure array for palette data. */

palettetype palette_struct;
.
.
.
/* Initialize chart library */

_pg_initchart();
.
.
.
/* Copy current palette data into palette_struct */

_pg_getpalette( palette_struct );

/* Reinitialize fill pattern for first palette using
values in Table 11.3 */

palette_struct[1].fill[0] = 153;
palette_struct[1].fill[1] = 204;
palette_struct[1].fill[2] = 102;
palette_struct[1].fill[3] =  51;
palette_struct[1].fill[4] = 153;
palette_struct[1].fill[5] = 204;
palette_struct[1].fill[6] = 102;
palette_struct[1].fill[7] =  51;

/* Load new palette data */

_pg_setpalette( palette_struct );

Now when you display your bar or column chart, the first series appears
filled with the striped pattern shown in Table 11.3.
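
If you prefer to draw a pattern rather than work out each decimal value by
hand, a small helper can convert a picture of the bit map into the eight
bytes. The sketch below is not part of presentation graphics; row_to_byte and
fill_from_rows are hypothetical names, and the fillmap typedef is repeated
only so the fragment stands on its own.

```c
typedef unsigned char fillmap[8];   /* same definition as in PGCHART.H */

/* Convert one row of up to eight characters into a byte. A '#' stands
   for a solid circle (bit set); any other character is an open circle.
   The leftmost character is the most significant bit. */
unsigned char row_to_byte(const char *row)
{
    unsigned char value = 0;
    int bit;

    for (bit = 0; bit < 8 && row[bit] != '\0'; bit++) {
        if (row[bit] == '#')
            value |= (unsigned char)(0x80 >> bit);
    }
    return value;
}

/* Fill all eight bytes of a fill pattern from eight text rows. */
void fill_from_rows(fillmap fill, const char *const rows[8])
{
    int i;

    for (i = 0; i < 8; i++)
        fill[i] = row_to_byte(rows[i]);
}
```

Passing the rows "#..##..#", "##..##..", ".##..##.", and "..##..##" twice in
succession reproduces the values 153, 204, 102, and 51 listed in Table 11.3.
The resulting array can then be copied into the fill member of a palette
before the call to _pg_setpalette.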

Palette structures are used differently with pie charts. Instead of
distinguishing multiple data series, the fill patterns, line styles, and
colors in the palette structures distinguish individual slices in a pie chart.
Palettes are recycled if the number of slices exceeds _PG_PALETTELEN. Thus,
the first palette dictates not only the appearance of the first slice, but
of slice number _PG_PALETTELEN as well. The second palette determines the
appearance of both the second slice and of slice number _PG_PALETTELEN + 1,
and so forth.
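
The recycling rule can be written as a one-line calculation. The sketch below
assumes that palettes are numbered from 1 and that slice number
_PG_PALETTELEN reuses the first palette, as the paragraph above states. The
constant PALETTE_LEN is a stand-in for _PG_PALETTELEN, and its value of 16 is
an assumption; palette_for_slice is a hypothetical name.

```c
#define PALETTE_LEN 16   /* stand-in for _PG_PALETTELEN; value assumed */

/* Return the palette number (1-based) used for a pie slice (1-based),
   wrapping so that slice PALETTE_LEN reuses palette 1. */
int palette_for_slice(int slice)
{
    int usable = PALETTE_LEN - 1;   /* palettes 1 .. PALETTE_LEN - 1 */

    return (slice - 1) % usable + 1;
}
```

With these assumptions, slice PALETTE_LEN maps to palette 1 and slice
PALETTE_LEN + 1 maps to palette 2, matching the description above.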

11.4.4  Character Pool

The last member of a palette structure is an index number in a pool of ASCII
characters:

char plotchar;

The member plotchar represents plot points on line graphs and scatter
diagrams. Each palette uses a different character to distinguish plot points
between data series.

11.5  Customizing the Chart Environment

The presentation graphics functions are designed to be flexible. You can use
the system of default values to produce professional-looking charts with a
minimum of programming effort. Or you can fine-tune the appearance of your
charts by overriding default values and initializing variables explicitly in
your program.

The header file PGCHART.H defines a structure type chartenv, which organizes
the chart environment variables. The chart environment describes everything
about a chart except the plots themselves. It is the blank page, in other
words, ready for plotting data. The environment determines the appearance of
text, axes, grid lines, and legends.

Colors and line styles in the chart environment are taken from palettes. In
this way, the appearance of titles and axis lines matches the colors and
line styles of plotted data series.

You can reset any variable in the environment.

Calling the _pg_defaultchart function fills the chart environment with
default values. Presentation graphics allows you to reset any variable in
the environment before displaying a chart. Except for adjusting the palette
values, all initialization of data is done through a chartenv type
structure.

The sample chart programs provided in Section 11.3, "Writing a Presentation
Graphics Program," illustrate how to adjust variables in the chart
environment. These programs create a structure  env  of type chartenv. The
structure  env  contains the chart environment variables, initialized by the
call to the _pg_defaultchart function. Environment variables such as the
chart title are then given specific values, as in

strcpy( env.maintitle.title, "Good Neighbor Grocery" );

Environment variables that determine colors and line styles deserve special
mention. The chart environment holds several such variables, which can be
recognized by their names. For example, the variable titlecolor specifies
the color of title text. Similarly, the variable gridstyle specifies the
line style used to draw the chart grid.

These variables are index numbers, but do not refer directly to the color
pool or style pool. They correspond instead to palette numbers. If you set
titlecolor to 2, presentation graphics uses the color code in the second
palette to determine the title's color. Thus, the title in this case would
be the same color as the chart's second data series. If you change the color
code in the palette, you'll also change the title's color.
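
The indirection can be illustrated with a small stand-alone sketch. The
palette types below repeat the definitions from Section 11.4, with
PALETTE_LEN standing in for _PG_PALETTELEN (the value 16 is an assumption).
The function color_code_for is hypothetical; it only shows the lookup that
the paragraph above describes and is not a presentation graphics function.

```c
#define PALETTE_LEN 16   /* stand-in for _PG_PALETTELEN; value assumed */

typedef unsigned char fillmap[8];

typedef struct
{
    unsigned short color;
    unsigned short style;
    fillmap        fill;
    char           plotchar;
} paletteentry;

typedef paletteentry palettetype[PALETTE_LEN];

/* An environment variable such as titlecolor holds a palette number.
   The color code actually used is the one stored in that palette. */
unsigned short color_code_for(const paletteentry *palettes,
                              short palette_number)
{
    return palettes[palette_number].color;
}
```

If titlecolor is 2, the title takes whatever color code is stored in palette
2. Changing that palette's color member changes the title and the second
data series together.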

A structure of type chartenv consists of four types of secondary structures.
The file PGCHART.H type-defines these secondary structures: titletype,
axistype, windowtype, and legendtype.

The remainder of this section describes the chart environment of
presentation graphics. It first examines the four secondary structure types
that make up the chart environment structure. The section
concludes with a description of the chartenv structure type. Each section
begins with a brief explanation of the structure's purpose, followed by a
listing of the structure type definition as it appears in the PGCHART.H
file. All symbolic constants are defined in the file PGCHART.H.

11.5.1  titletype Structures

Structures of type titletype determine text, color, and placement of titles
appearing in the graph. The PGCHART.H file defines the structure type as

typedef struct
{
    char     title[_PG_TITLELEN];  /* Title text */
    short    titlecolor;           /* Palette color for title text */
    short    justify;              /* _PG_LEFT, _PG_CENTER, _PG_RIGHT */
} titletype;

The following list describes titletype members:

Member Variable                   Description
────────────────────────────────────────────────────────────────────────────
justify                           An integer specifying how the title is
justified within the chart window. The
symbolic constants defined in PGCHART.H
for this variable are _PG_LEFT,
_PG_CENTER, and _PG_RIGHT.

titlecolor                        An integer between 1 and _PG_PALETTELEN
that specifies a title's color. The
default value for titlecolor is 1.

title[_PG_TITLELEN]               A character array containing title text.
For example, if  env  is a structure of
type chartenv, then  env.maintitle.title
holds the character string used for the
main title of the chart. Similarly,
env.xaxis.axistitle.title  contains the
x axis title. The number of characters
in a title must be one less than
_PG_TITLELEN to allow room for a null
terminator.
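
Because the title array has a fixed size, a program that builds titles at run
time may want to copy them with an explicit length limit rather than with a
bare strcpy. The following is a minimal sketch; set_title is a hypothetical
helper, not part of presentation graphics.

```c
#include <stddef.h>
#include <string.h>

/* Copy text into a fixed-size title buffer, truncating if necessary
   and always leaving room for the null terminator. */
void set_title(char *dest, size_t dest_size, const char *text)
{
    if (dest_size == 0)
        return;
    strncpy(dest, text, dest_size - 1);
    dest[dest_size - 1] = '\0';
}
```

For a chartenv structure named env, the call would take the form
set_title( env.maintitle.title, sizeof( env.maintitle.title ), text ), so the
limit follows the declared array size without repeating the constant.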

11.5.2  axistype Structures

Structures of type axistype contain variables for the axes such as color,
scale, grid style, and tick marks. The PGCHART.H file defines the structure
type as the following:

typedef struct
{
    short       grid;          /* TRUE=grid lines drawn; FALSE=no lines */
    short       gridstyle;     /* Style bytes for grid */
    titletype   axistitle;     /* Title definition for axis */
    short       axiscolor;     /* Color for axis */
    short       labeled;       /* TRUE=tick marks and titles drawn */
    short       rangetype;     /* _PG_LINEARAXIS, _PG_LOGAXIS */
    float       logbase;       /* Base used if log axis */
    short       autoscale;     /* TRUE=next 7 values calculated by system */
    float       scalemin;      /* Minimum value of scale */
    float       scalemax;      /* Maximum value of scale */
    float       scalefactor;   /* Scale factor for data on this axis */
    titletype   scaletitle;    /* Title definition for scaling factor */
    float       ticinterval;   /* Distance between tick marks (world coord.) */
    short       ticformat;     /* _PG_EXPFORMAT or _PG_DECFORMAT */
    short       ticdecimals;   /* Number of decimals for tick labels (max=9) */
} axistype;

The following list describes axistype member variables:

Member Variable                   Description
────────────────────────────────────────────────────────────────────────────
autoscale                         A Boolean variable. If autoscale is set
to TRUE, presentation graphics
automatically determines values for
scalefactor, scalemax, scalemin,
scaletitle, ticdecimals, ticformat, and
ticinterval (see below). If autoscale
equals FALSE, these seven variables must
be specified in your program.

axiscolor                         An integer between 1 and _PG_PALETTELEN
that specifies the color used for the
axis and parallel grid lines. (See
description for gridstyle below.) Note
that this member does not determine the
color of the axis title. That selection
is made through the axistitle structure.

axistitle                         A titletype structure that defines the
title of the associated axis. The title
of the y axis displays vertically to the
left of the y axis, and the title of the
x axis displays horizontally below the x
axis.

grid                              A Boolean true/false value that
determines whether grid lines are drawn
for the associated axis. Grid lines span
the data window perpendicular to the
axis.

gridstyle                         An integer between 1 and _PG_PALETTELEN
that specifies the grid's line style.
Lines can be solid, dashed, dotted, or
some combination. The default value for
gridstyle is 1.

Note that the color of the parallel axis
determines the color of the grid lines.
Thus, the x axis grid is the same color
as the y axis, and the y axis grid is
the same color as the x axis.

labeled                           A Boolean value that determines whether
tick marks and labels are drawn on the
axis. Axis labels should not be confused
with axis titles. Axis labels are
numbers or descriptions such as "23.2"
or "January" attached to each tick mark.

logbase                           If rangetype is logarithmic, the logbase
variable determines the log base used to
scale the axis. The default value is 10.

rangetype                         An integer that determines whether the
scale of the axis is linear or
logarithmic. The variable rangetype
applies only to value data.

Specify a linear scale with
_PG_LINEARAXIS. A linear scale is best
when the difference between axis minimum
and maximum is relatively small. For
example, a linear axis range 0 - 10
results in 10 tick marks evenly spaced
along the axis.

Use _PG_LOGAXIS to specify a logarithmic
rangetype. Logarithmic scales are useful
when the range is very large or when the
data varies exponentially. Line graphs
of exponentially varying data can be
made straight with a logarithmic
rangetype.

scalefactor                       All numeric data are scaled by dividing
each value by scalefactor. For
relatively small values, scalefactor
should be 1, which is the default. But
data with large values should be scaled
by an appropriate factor. For example,
data in the range 2 million - 20 million
should be plotted with scalemin set to
2, scalemax set to 20, and scalefactor
set to 1 million.

If autoscale is set to TRUE,
presentation graphics automatically
determines a suitable value for
scalefactor based on the range of data
to be plotted. Presentation graphics
selects only values that are a power of
1 thousand, that is, values such as 1
thousand, 1 million, or 1 billion. It
then labels the scaletitle appropriately
(see below). If you desire some other
value for scaling, you must set
autoscale to FALSE and set scalefactor
to the desired scaling value.

scalemax                          Highest value represented by the axis.

scalemin                          Lowest value represented by the axis.

scaletitle                        A titletype structure defining a string
of text that describes the value of
scalefactor. If autoscale is TRUE,
presentation graphics automatically
writes a scale description to
scaletitle. If autoscale equals FALSE
and scalefactor is 1, scaletitle.title
should be blank. Otherwise your program
should copy an appropriate scale
description to scaletitle.title, such as
"(x 1000)," "(in millions of units),"
or "times 10 thousand dollars."

For the y axis, the scaletitle text
displays vertically between the axis
title and the y axis. For the x axis,
the scale title appears below the x axis
title.

ticdecimals                       Number of digits to display after the
decimal point in tick labels. Maximum
value is 9. (This variable applies only
to axes with value data and is ignored
for the category axis.)

ticformat                         An integer that determines the format of
the labels assigned to each tick mark.
Set ticformat to _PG_EXPFORMAT for
exponential format or to _PG_DECFORMAT
for decimal. The default is
_PG_DECFORMAT. (This variable applies
only to axes with value data and is
ignored for the category axis.)

ticinterval                       Sets interval between tick marks on the
axis. The tick interval is measured in
the same units as the numeric data
associated with the axis. For example,
if 2 sequential tick marks correspond to
the values 20 and 25, the tick interval
between them is 5. (This variable
applies only to axes with value data and
is ignored for the category axis.)
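
The scalefactor arithmetic described above is simple enough to sketch. The
rule used here for choosing a factor (the largest of 1, 1 thousand, 1
million, or 1 billion that does not exceed the largest data magnitude) is an
assumption made for illustration; the text says only that autoscaling picks
such values, not how. Both function names are hypothetical.

```c
/* Pick 1, 1e3, 1e6, or 1e9: the largest of these that does not exceed
   largest_magnitude. This selection rule is assumed, not documented. */
double pick_scalefactor(double largest_magnitude)
{
    double factor = 1.0;

    while (factor < 1e9 && largest_magnitude >= factor * 1000.0)
        factor *= 1000.0;
    return factor;
}

/* Scale one data value the way the text describes: divide by the factor. */
double scaled(double value, double scalefactor)
{
    return value / scalefactor;
}
```

For data in the range 2 million to 20 million, this sketch picks a factor of
1 million, and the scaled endpoints are 2 and 20, the scalemin and scalemax
values given in the example above.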

11.5.3  windowtype Structures

Structures of type windowtype contain sizes, locations, and color codes for
the three windows produced by presentation graphics: the chart window, the
data window, and the legend. Windows are located on the screen relative to
the screen's logical origin. By changing the logical origin, you can display
charts that are partly or completely off the screen.

The PGCHART.H file defines windowtype as the following:

typedef struct
{
    short  x1;            /* Left edge of window in pixels */
    short  y1;            /* Top edge of window in pixels */
    short  x2;            /* Right edge of window in pixels */
    short  y2;            /* Bottom edge of window in pixels */
    short  border;        /* TRUE for border, FALSE otherwise */
    short  background;    /* Internal palette color for window background */
    short  borderstyle;   /* Style bytes for window border */
    short  bordercolor;   /* Internal palette color for window border */
} windowtype;

The following list describes windowtype member variables:

Member Variable                   Description
────────────────────────────────────────────────────────────────────────────
background                        An integer between 1 and _PG_PALETTELEN
that specifies the window's background
color. The default value for background
is 1.

border                            A Boolean variable that determines
whether a border frame is drawn around a
window.

bordercolor                       An integer between 1 and _PG_PALETTELEN
that specifies the color of the window's
border frame. The default value is 1.

borderstyle                       An integer between 1 and _PG_PALETTELEN
that specifies the line style of the
window's border frame. The default value
is 1.

x1, y1, x2, y2                    Window coordinates in pixels. The
ordered pair (x1, y1) specifies the
coordinate of the upper left corner of
the window. The ordered pair (x2, y2)
specifies the coordinate of the lower
right corner.

The reference point for the coordinates
depends on the type of window. The chart
window is located relative to the
logical origin, usually the upper left
corner of the screen. The data and
legend windows are located relative to
the upper left corner of the chart
window. This allows you to change the
position of the chart window without
having to redefine coordinates for the
other two windows.
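
The relationship between the reference points can be sketched as follows.
The rect type and the to_screen function are hypothetical; they use only the
four coordinate members of a windowtype structure and assume the logical
origin is the upper left corner of the screen.

```c
/* Only the coordinate members of a window, for illustration. */
typedef struct
{
    short x1, y1, x2, y2;
} rect;

/* Convert a data or legend window, given relative to the chart
   window's upper left corner, into screen coordinates. */
rect to_screen(rect chart, rect inner)
{
    rect out;

    out.x1 = (short)(chart.x1 + inner.x1);
    out.y1 = (short)(chart.y1 + inner.y1);
    out.x2 = (short)(chart.x1 + inner.x2);
    out.y2 = (short)(chart.y1 + inner.y2);
    return out;
}
```

Moving the chart window changes only chart.x1 and chart.y1; the inner
coordinates stay the same, which is why the data and legend windows need not
be redefined.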

11.5.4  legendtype Structures

Structures of type legendtype contain size, location, and colors of the
chart legend. The PGCHART.H file defines the structure type as the
following:

typedef struct
{
    short      legend;        /* TRUE=draw legend; FALSE=no legend */
    short      place;         /* _PG_RIGHT, _PG_BOTTOM, _PG_OVERLAY */
    short      textcolor;     /* Palette color for text */
    short      autosize;      /* TRUE=system calculates legend size */
    windowtype legendwindow;  /* Window definition for legend */
} legendtype;

The following list describes legendtype member variables:

Member Variable                   Description
────────────────────────────────────────────────────────────────────────────
autosize                          A Boolean true/false variable that
determines whether presentation graphics
is to automatically calculate the size
of the legend. If autosize equals FALSE,
the legend window must be specified in
the legendwindow structure (see below).

legend                            A Boolean true/false variable that
determines whether a legend is to appear
on the chart. The legend variable is
ignored by functions that graph
single-series charts.

legendwindow                      A windowtype structure that defines
coordinates, background color, and
border frame for the legend. Coordinates
given in legendwindow are ignored if
autosize is set to TRUE.

place                             An integer that specifies the location
of the legend relative to the data
window. Setting place equal to the
constant _PG_RIGHT positions the legend
to the right of the data window. Setting
place to _PG_BOTTOM positions the legend
below the data window. Setting place to
_PG_OVERLAY positions the legend within
the data window.

These settings influence the size of the
data window. If place equals _PG_RIGHT
or _PG_BOTTOM, presentation graphics
automatically sizes the data window to
accommodate the legend. If place equals
_PG_OVERLAY, the data window is sized
without regard to the legend.

textcolor                         An integer between 1 and _PG_PALETTELEN
that specifies the color of text within
the legend window.

11.5.5  chartenv Structures

A structure of type chartenv defines the chart environment. The following
listing shows that a chartenv type structure consists almost entirely of
structures of the four types described above.

The PGCHART.H file defines the chartenv structure type as the following:

typedef struct
{
    short       charttype;     /* Chart type */
    short       chartstyle;    /* Chart style */
    windowtype  chartwindow;   /* Window definition for overall chart */
    windowtype  datawindow;    /* Window definition for data part of chart */
    titletype   maintitle;     /* Main chart title */
    titletype   subtitle;      /* Chart subtitle */
    axistype    xaxis;         /* Definition for x axis */
    axistype    yaxis;         /* Definition for y axis */
    legendtype  legend;        /* Definition for legend */
} chartenv;

Initialize the chart environment with the _pg_defaultchart function.

The data in a chartenv type structure is initialized by calling the function
_pg_defaultchart. If your program does not call _pg_defaultchart, it must
explicitly define every variable in the chart environment─a tedious
procedure. The recommended method for adjusting the appearance of your chart
is to initialize variables for the proper chart type by calling the
_pg_defaultchart function, and then to reassign selected environment
variables such as titles.

The following list describes chartenv member variables:

Member Variable                   Description
────────────────────────────────────────────────────────────────────────────
chartstyle                        An integer that determines the style of
the chart (see Table 11.2). Legal values
for chartstyle are _PG_PERCENT and
_PG_NOPERCENT for pie charts;
_PG_PLAINBARS and _PG_STACKEDBARS for
bar and column charts; and _PG_POINTONLY
and _PG_POINTANDLINE for line graphs
and scatter diagrams. This variable
corresponds to the third argument for
the _pg_defaultchart function.

charttype                         An integer that determines the type of
chart displayed. The value of charttype
is _PG_BARCHART, _PG_COLUMNCHART,
_PG_LINECHART, _PG_SCATTERCHART, or
_PG_PIECHART. This variable corresponds
to the second argument for the
_pg_defaultchart function.

chartwindow                       A windowtype structure that defines the
appearance of the chart window.

datawindow                        A windowtype structure that defines the
appearance of the data window.

legend                            A legendtype structure that defines the
appearance of the legend window.

maintitle                         A titletype structure that defines the
appearance of the main title of the
chart.

subtitle                          A titletype structure that defines the
appearance of the chart's subtitle.

xaxis                             An axistype structure that defines the
appearance of the x axis. (This variable
is not applicable for pie charts.)

yaxis                             An axistype structure that defines the
appearance of the y axis. (This variable
is not applicable for pie charts.)

Chapter 12  Programming with Mixed Languages
────────────────────────────────────────────────────────────────────────────

There are times when your Microsoft C programs need to call programs written
in other languages or when programs written in other languages need to call
your C functions. This is called mixed-language programming. For example,
when a particular subprogram is available commercially in a language other
than C or when algorithms are described more naturally in a different
language, you need to use more than one language.

This chapter describes the elements of mixed-language programming─how to
make calls from programs written in one language to routines written in
another.

12.1  Making Mixed-Language Calls

Mixed-language programming always involves a call to a function, procedure,
or subroutine. For example, a BASIC main module may need to execute a
specific task that you would like to program separately. Instead of calling
a BASIC subprogram, however, you decide to call a C function.

Mixed-language calls involve calling functions in separate modules. Instead
of compiling all of your source modules with the same compiler, you use
different compilers. In the instance mentioned above, you would compile the
main module source file with the BASIC compiler, another source file (written
in C) with the C compiler, and then link the two object files.

Figure 12.1 illustrates how the syntax of a mixed-language call works, using
the instance mentioned above.

(This figure may be found in the printed book.)

In Figure 12.1, the BASIC call to C is  CALL Prn, similar to a call to a
BASIC subprogram. There are two differences between this mixed-language call
and a call between two BASIC modules:

1.  The subprogram  Prn  is implemented in C, using standard C syntax.

2.  The implementation of the call in BASIC is affected by the DECLARE
statement, which uses the CDECL keyword to create compatibility with
C. The DECLARE statement (which is described in detail in the
Microsoft BASIC Language Reference and the Microsoft BASIC
Programmer's Guide) is an example of a mixed-language "interface"
statement. These interface statements override default naming and
calling conventions. Each language provides its own form of interface.

You can make mixed-language calls to routines regardless of whether they
have return values. (In this chapter, "routine" refers to any function,
procedure, or subroutine that can be called from another module.)

Table 12.1 shows the correspondence between calls to routines in different
languages.

Table 12.1  Language Equivalents for Routine Calls

╓┌───────────────────┌────────────────────┌──────────────────────────────────╖
Language            Return Value         No Return Value
────────────────────────────────────────────────────────────────────────────
Assembly Language   Procedure            Procedure
BASIC               FUNCTION procedure   Subprogram
C                   function             (void) function
FORTRAN             FUNCTION             SUBROUTINE
Pascal              Function             Procedure
────────────────────────────────────────────────────────────────────────────

For example, a C module can make a subprogram call to a FORTRAN subroutine.
You can prototype a FORTRAN subroutine as a function with a  void type.

────────────────────────────────────────────────────────────────────────────
NOTE

BASIC DEF FN functions and GOSUB subroutines cannot be called from another
language.
────────────────────────────────────────────────────────────────────────────

12.2  Language Convention Requirements

To mix languages, the calling program must observe the same conventions as
the called program. The conventions described in this section govern the
following:

■   How compilers treat identifiers, including function and variable names
(naming convention)

■   How the subprogram call is implemented (calling convention)

■   How parameters are passed (parameter-passing convention)

12.2.1  Naming Convention Requirement

Both the calling program and the called subprogram must agree on the names
of identifiers. Identifiers can refer to subprograms (functions, procedures,
and subroutines) or to variables that have a public or global scope. Each
language alters the names of identifiers.

The term "naming convention" refers to the way a compiler alters the name of
the routine before placing it in an object file. Languages may alter the
identifier names differently. You can choose between several naming
conventions to ensure that the names in the calling program agree with those
in the called program. If the names of called routines are stored
differently in each object file, the linker will not be able to find a
match. It will instead report unresolved external references.

Microsoft compilers place machine code into object files; they also place
the names of all publicly accessed routines and variables in object files.
The linker can then compare the name of a routine called in one module with
the name of a routine defined in another module, and recognize a match.
Names are stored in the ASCII (American Standard Code for Information
Interchange) character set.

Some languages translate names to uppercase.

BASIC, FORTRAN, and Pascal use similar naming conventions. They translate
each letter to uppercase. BASIC type declaration characters (%, &, !, #, $)
are dropped.

Each language recognizes a different number of characters. FORTRAN
recognizes the first 31 characters of any name (unless identifier names are
truncated), Pascal the first 8, and BASIC the first 40. If a name is longer
than the language will recognize, additional characters are simply not
placed in the object file.

────────────────────────────────────────────────────────────────────────────
NOTE

Versions of Microsoft FORTRAN previous to version 5.0 truncated identifiers
to six characters. As of version 5.0, FORTRAN retains up to 31 characters of
significance unless you use the /4Yt option.
────────────────────────────────────────────────────────────────────────────

C is a case-sensitive language. The C compiler does not translate any
letters to uppercase. It inserts a leading underscore ( _ ) in front of the
name of each routine. C recognizes the first 31 characters of a name.

Differences in naming conventions are dealt with automatically by
mixed-language keywords, as long as you follow two rules:

1.  If you use any FORTRAN routines that were compiled with the /4Yt
command-line option or with the $TRUNCATE metacommand enabled, make
all names 6 characters or less. Make all names 6 characters or less
when using FORTRAN routines compiled with versions of the FORTRAN
compiler prior to 5.0.

2.  Do not use the /NOIGNORECASE linker option (which causes the linker to
treat identifiers in a case-sensitive manner). With C modules, this
means that you must be careful not to rely upon differences between
uppercase and lowercase letters when programming.

CL automatically uses the /NOIGNORECASE option when linking. To solve
the problems created by this behavior, either link separately with the
LINK utility, or use all lowercase letters in your C function names
and public variables (global variables that are not declared as
static).

────────────────────────────────────────────────────────────────────────────
NOTE

If you use the command-line option /Gc (generate Pascal-style function
calls) when you compile, or if you declare a function or variable with the
_pascal keyword, the compiler will translate your identifiers to uppercase.
────────────────────────────────────────────────────────────────────────────
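
The naming rules above can be illustrated with a short C sketch. The
function below is hypothetical ( object_file_name  and the CONV_ constants
are invented for this illustration and are not part of any Microsoft
product). It models only the transformations described in this section: a
leading underscore with case preserved for C, and conversion to uppercase
for BASIC, FORTRAN, and Pascal, each with its own significant length. It
assumes that the underscore and any dropped BASIC type-declaration character
do not count toward the limit, and it ignores /4Yt, /Gc, and the CDECL
keyword.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the significant lengths come from the text above. */
enum convention { CONV_C, CONV_FORTRAN, CONV_PASCAL, CONV_BASIC };

/* Returns nonzero if ch is a BASIC type-declaration character. */
static int is_basic_type_char(char ch)
{
    return ch != '\0' && strchr("%&!#$", ch) != NULL;
}

/*
 * Writes into out (capacity out_size) the name that a compiler following
 * the given convention would place in the object file for source_name.
 * Returns out.
 */
char *object_file_name(enum convention conv, const char *source_name,
                       char *out, size_t out_size)
{
    size_t limit;       /* significant characters for this language */
    size_t n = 0;       /* characters written so far */
    size_t kept = 0;    /* significant characters kept so far */
    const char *p;

    switch (conv) {
    case CONV_PASCAL: limit = 8;  break;
    case CONV_BASIC:  limit = 40; break;
    default:          limit = 31; break;    /* C and FORTRAN */
    }

    if (out_size == 0)
        return out;

    if (conv == CONV_C && n + 1 < out_size)
        out[n++] = '_';                     /* leading underscore */

    for (p = source_name;
         *p != '\0' && kept < limit && n + 1 < out_size; ++p) {
        if (conv == CONV_BASIC && is_basic_type_char(*p))
            continue;                       /* type character is dropped */
        out[n++] = (conv == CONV_C) ? *p
                                    : (char)toupper((unsigned char)*p);
        ++kept;
    }
    out[n] = '\0';
    return out;
}
```

Under these assumptions, the C name  Calc  is stored as  _Calc, and the
Pascal name  Maxparameter  is stored as  MAXPARAM, because only the first
eight characters are significant.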

Figure 12.2 illustrates a complete mixed-language development example,
showing how naming conventions enter into the process.

(This figure may be found in the printed book.)

In Figure 12.2, note that the BASIC compiler inserts a leading underscore in
front of  Prn  as it places the name into the object file, because the CDECL
keyword directs the BASIC compiler to use the C naming convention. BASIC
will also convert all letters to lowercase when this keyword is used.
(Converting letters to lowercase is not part of the C naming convention;
however, it is consistent with the programming style of many C programs.)

12.2.2  Calling Convention Requirement

The term "calling convention" refers to the way a language implements a
call. The choice of calling convention affects the machine instructions that
a compiler generates to execute (and return from) a function, procedure, or
subroutine call.

It is crucial that the two routines concerned (the routine issuing a call
and the routine being called) use the same protocol. Otherwise, the
processor may receive inconsistent instructions, causing the program to
behave incorrectly.

The use of a calling convention affects programming in three ways:

1.  The calling routine uses a calling convention to determine the order
in which to pass arguments (parameters) to another routine. This
convention can be specified in a mixed-language interface statement or
declaration.

2.  The called routine uses a calling convention to determine the order in
which to receive the parameters passed to it. In most languages, this
convention can be specified in the routine's heading. BASIC, however,
always uses its own convention to receive parameters.

3.  Both the calling routine and the called routine must agree on which of
them is responsible for adjusting the stack after all parameters are
removed.

In other words, each call to a routine uses a certain calling convention;
each routine heading specifies or assumes some calling convention. The two
conventions must be compatible. With all languages except BASIC, it is
possible to change the calling convention at the point of the call or at the
declaration of the called routine. Usually, however, it is easier to adopt
the convention of the called routine. For example, a C function would use
its own convention to call another C function, and would use the Pascal
convention to call Pascal.

BASIC, FORTRAN, and Pascal use the same standard calling convention. C uses
a different convention.

Effects of Calling Conventions

Calling conventions dictate three things:

1.  The way parameters are communicated from one routine to another (in
Microsoft mixed-language programming, parameters or pointers to the
parameters are passed on the stack)

2.  The order in which parameters are passed from one routine to another

3.  The part of the program responsible for adjusting the stack

Some languages pass parameters in a different order than C.

The BASIC, FORTRAN and Pascal calling conventions push parameters onto the
stack in the order in which they appear in the source code. For example, the
BASIC statement

CALL Calc( A, B )

pushes argument  A  onto the stack before it pushes  B. These conventions
also specify that the stack is adjusted by the called routine just before
returning control to the caller.

The C calling convention pushes parameters onto the stack in the reverse
order from their appearance in the source code. For example, the C function
call

calc( a, b );

pushes  b  onto the stack before it pushes  a. In contrast with the other
high-level languages, the C calling convention specifies that a calling
routine always adjusts the stack immediately after the called routine
returns control.
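
The two push orders can be compared with a small simulation. The sketch
below is an illustration only: the  toy_stack  type and its functions are
hypothetical, each argument is treated as a single word, and the return
address that a real call instruction pushes after the arguments is left out.

```c
#include <stddef.h>

#define STACK_WORDS 16

/* A toy stack of words that grows downward (illustration only). */
struct toy_stack {
    int words[STACK_WORDS];
    size_t sp;                  /* index of the current top of stack */
};

void toy_init(struct toy_stack *s)
{
    s->sp = STACK_WORDS;        /* empty: top is one past the last word */
}

void toy_push(struct toy_stack *s, int value)
{
    if (s->sp > 0)              /* ignore pushes once the toy stack is full */
        s->words[--s->sp] = value;
}

/* Most recently pushed value; call only after at least one push. */
int toy_top(const struct toy_stack *s)
{
    return s->words[s->sp];
}

/* BASIC/FORTRAN/Pascal order: push arguments left to right. */
void push_pascal_order(struct toy_stack *s, const int *args, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i)
        toy_push(s, args[i]);
}

/* C order: push arguments right to left, so args[0] is pushed last. */
void push_c_order(struct toy_stack *s, const int *args, size_t count)
{
    size_t i;

    for (i = count; i > 0; --i)
        toy_push(s, args[i - 1]);
}
```

With the C order, the first argument ends up on top of the toy stack whether
two or three arguments are pushed. With the BASIC, FORTRAN, and Pascal
order, the top of the stack holds the last argument, so the position of the
first argument depends on how many arguments follow it.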

The BASIC, FORTRAN, and Pascal conventions produce slightly less object
code. However, the C convention makes calling with a variable number of
parameters possible. (Because the first parameter is always the last one
pushed, it is always on the top of the stack; therefore it has the same
address relative to the frame pointer, regardless of how many parameters
were actually passed.)
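
A function that accepts a variable number of parameters might look like the
following sketch. It uses the standard <stdarg.h> macros; the function name
sum_ints  is hypothetical. The fixed first parameter tells the function how
many additional int arguments to read, which works because the function can
always locate that first parameter.

```c
#include <stdarg.h>

/* Sums the `count` int arguments that follow the fixed first parameter. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);            /* start just past the fixed parameter */
    for (i = 0; i < count; ++i)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as  sum_ints( 3, 1, 2, 3 )  returns 6. A function of this kind
cannot be declared with the FORTRAN/Pascal calling convention, because that
convention requires the called routine to know how many bytes of parameters
to remove from the stack.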

────────────────────────────────────────────────────────────────────────────
NOTE

The _fastcall keyword, which specifies that parameters are to be passed in
registers, is incompatible with programs written in other languages. Avoid
using _fastcall or the /Gr command-line option for C functions that you
intend to make public to BASIC, FORTRAN, or Pascal programs.
────────────────────────────────────────────────────────────────────────────

12.2.3  Parameter-Passing Requirement

Your programs must agree on the calling convention and the naming
convention; they must also agree on the order in which they pass parameters.
It is important that your routines send parameters in the same way to ensure
proper data transmission and correct program results.

Microsoft compilers support three methods for passing a parameter:

Method                            Description
────────────────────────────────────────────────────────────────────────────
Near reference                    Passes a variable's near (offset)
                                  address. This address is expressed as an
                                  offset from the default data segment.

                                  This method gives the called routine
                                  direct access to the variable itself.
                                  Any change the routine makes to the
                                  parameter changes the variable in the
                                  calling routine.

Far reference                     Passes a variable's far (segmented)
                                  address.

                                  This method is similar to passing by
                                  near reference, except that a longer
                                  address is passed. This method is slower
                                  than passing by near reference, but is
                                  necessary when you pass data that is
                                  outside the default data segment. (This
                                  is an issue in BASIC or Pascal only if
                                  you have specifically requested far
                                  memory.)

Value                             Passes only the variable's value, not
                                  its address.

                                  With this method, the called routine
                                  knows the value of the parameter but has
                                  no access to the variable itself.
                                  Changes to a parameter passed by value
                                  have no effect on the value of the
                                  parameter in the calling routine.

These different parameter-passing methods mean that you must consider the
following when programming with mixed languages:

■   You need to make sure that the called routine and the calling routine
use the same method for passing each parameter (argument). In most
cases, you will need to check the parameter-passing defaults used by
each language and possibly make adjustments. Each language has
keywords or language features that allow you to change
parameter-passing methods.

■   You may want to choose a specific parameter-passing method rather than
using the defaults of any language.

Table 12.2 summarizes the parameter-passing defaults for each language.

Table 12.2  Parameter-Passing Defaults

Language  Near Reference        Far Reference         By Value
────────────────────────────────────────────────────────────────────────────
BASIC     All                   ---                   ---

C         Near arrays           Far arrays            All data except
                                                      arrays

FORTRAN   All (medium model)    All (large model)     With attributes(1)

Pascal    VAR, CONST            VARS, CONSTS          Other parameters

────────────────────────────────────────────────────────────────────────────

(1)  When a PASCAL or C attribute is applied to a FORTRAN routine, passing
by value becomes the default.
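
The C defaults in Table 12.2 can be seen in ordinary C code. The sketch
below uses hypothetical function names and plain pointers (the _near and
_far keywords are omitted so that the code is not tied to a memory model).
It shows a parameter passed by value, a parameter passed by reference with
an explicit pointer, and an array, which C always passes by reference.

```c
/* Receives a copy: the caller's variable is untouched. */
void double_by_value(int n)
{
    n = n * 2;          /* changes only the local copy */
    (void)n;            /* the result is deliberately discarded */
}

/* Receives an address: the caller's variable is changed. */
void double_by_reference(int *n)
{
    *n = *n * 2;
}

/* An array parameter is really a pointer to the first element. */
void zero_first(int values[])
{
    values[0] = 0;      /* changes the caller's array */
}
```

After  double_by_value( a )  the caller's  a  is unchanged, while
double_by_reference( &a )  doubles it. A routine written in another language
that expects a reference must therefore be called with the address-of (&)
operator or with an array name.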

12.3  Compiling and Linking

After you have written your source files and decided on a naming convention,
a calling convention, and a parameter-passing convention, you are ready to
compile and link individual modules.

12.3.1  Compiling with Correct Memory Models

With BASIC, FORTRAN, and Pascal, no special options are required to compile
source files that are part of a mixed-language program.

With C, not all memory models are compatible with other languages.

BASIC, FORTRAN, and Pascal use only far (segmented) code addresses.
Therefore, you must use one of two techniques with C programs that call one
of these languages: compile C modules in medium, large, or huge model (using
the /AX command-line options), because these models also use far code
addresses; or apply the _far keyword to the definitions of C functions you
make public. If you use the /AX command-line option to specify medium,
large, or huge model, all your function calls become far by default. This
means you don't have to declare your functions explicitly with the _far
keyword.

Choice of memory model affects the default data pointer size in C and
FORTRAN, although this default can be overridden with the _near and _far
keywords. With C and FORTRAN, choice of memory model also affects whether
data objects are located in the default data segment; if a data object is
not located in the default data segment, it cannot be passed by near
reference.

For more information about memory models, see Chapter 2, "Managing Memory."
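
The difference between near and far addresses can be sketched in portable C.
The code below is a deliberately simplified model that assumes real mode
(MS-DOS), where a far address is a 16-bit segment paired with a 16-bit
offset; it does not describe OS/2 protected mode. The structure and function
names are hypothetical.

```c
/* A far address as used in real mode: segment plus offset. */
struct far_address {
    unsigned short segment;
    unsigned short offset;
};

/* Real-mode linear address: segment * 16 + offset. */
unsigned long linear_address(struct far_address addr)
{
    return ((unsigned long)addr.segment << 4) + addr.offset;
}

/*
 * A near address carries only the offset; the segment is implied to be
 * the default data segment. Returns 1 if addr can be expressed as a near
 * address relative to default_segment, 0 otherwise. (Simplified: two
 * different segment values are treated as never overlapping.)
 */
int can_pass_by_near_reference(struct far_address addr,
                               unsigned short default_segment)
{
    return addr.segment == default_segment;
}
```

In this model, an object at segment 0x1234, offset 0x0010 has the linear
address 0x12350. It can be passed by near reference only when 0x1234 is also
the default data segment; otherwise the segment half of the address is
needed, and the object must be passed by far reference.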

12.3.2  Linking with Language Libraries

In most cases, you can easily link modules compiled with different
languages. Do any of the following to ensure that all required libraries
link in the correct order:

■   Put all language libraries in the same directory as the source files.

■   List directories containing all needed libraries in the LIB
environment variable.

■   Let the linker prompt you for libraries.

In each of the cases above, the linker finds libraries in the order that it
requires them. If you enter the library names on the command line, make sure
you enter them in an order that allows the linker to resolve your program's
external references. Here are some points to observe when specifying
libraries on the command line:

■   If you are using FORTRAN to write one of your modules, you need to
link with the /NOD (no default libraries) option and explicitly
specify all the libraries you need on the link command line. You can
also specify these libraries with an automatic-response file (or batch
file), but you cannot use a default-library search.

■   If your program uses both FORTRAN and C, specify the library for the
most recent of the two language products first. In addition, make sure
that you choose a C-compatible library when you install FORTRAN.

■   If you are listing BASIC libraries on the LINK command line, specify
those libraries first.

The following example shows how to link two modules,  mod1  and  mod2, with
a user library, GRAFX, the C run-time library, LLIBCE, and the FORTRAN
run-time library, LLIBFORE:

LINK /NOD mod1 mod2,,,GRAFX+LLIBCE+LLIBFORE

12.4  C Calls to High-Level Languages

Just as you can call Microsoft C routines from other Microsoft languages,
you can call routines written in Microsoft FORTRAN and Pascal from C. With
FORTRAN, Pascal, and C, freestanding routines can be written with no
restriction. When calling BASIC routines, however, you must write the main
program in BASIC; any subprograms are free to call one another, whether they
are written in C or BASIC.

For information about how to pass particular kinds of data, see Section
12.9, "Handling Data in Mixed-Language Programming."

Executing a Mixed-Language Call

The C interface to other languages uses standard C prototypes, with the
_fortran or _pascal keyword. Using either of these keywords causes the
routine to be called with the FORTRAN/Pascal naming and calling convention.
(The FORTRAN/Pascal convention also works for BASIC.) Here are the
recommended steps for executing a mixed-language call from C:

1.  Write a prototype for each mixed-language routine called. The
prototype should declare the routine extern for the purpose of program
documentation.

Instead of using the _fortran or _pascal keyword, you can simply
compile with the Pascal calling convention option (/Gc). The /Gc
option causes all functions in the module to use the FORTRAN/Pascal
naming and calling conventions, except where you apply the _cdecl
keyword.

2.  Pass the values of variables or pointers to variables. You can obtain
a pointer to a variable with the address-of (&) operator.

In C, array names are always passed as pointers to the first element
of the array; they are always passed by reference.

The prototype you declare for your function ensures that you are
passing the correct length address (that is, near or far).

3.  Issue a function call in your program as though you were calling a C
function.

4.  Always compile the C module in either medium, large, or huge model, or
use the _far keyword in your function prototype. This ensures that a
far (intersegment) call is made to the routine.

Using the _fortran or _pascal Keyword

There are two rules of syntax that apply when you use the _fortran or
_pascal keyword:

1.  The _fortran and _pascal keywords modify only the item immediately to
their right.

2.  The _near and _far keywords can be used with the _fortran and _pascal
keywords in prototypes. The sequences _fortran _far and _far _fortran
are equivalent.

The keywords _pascal and _fortran have the same effect on the program; using
one or the other makes no difference except for internal program
documentation. Use _fortran to declare a FORTRAN routine, _pascal to declare
a Pascal routine, and either keyword to declare a BASIC routine.

The following examples demonstrate the syntax rules presented above.

The example below declares func to be a BASIC, Pascal, or FORTRAN function
taking two short parameters and returning a short value.

short _pascal func( short sarg1, short sarg2 );

The example below declares func to be a pointer to a BASIC, Pascal, or FORTRAN
routine that takes a long parameter and returns no value. The keyword void
is appropriate when the called routine is a BASIC subprogram, Pascal
procedure, or FORTRAN subroutine, since it indicates that the function
returns no value.

void ( _fortran * func )( long larg );

The example below declares func to be a _near BASIC, Pascal, or FORTRAN
routine. The routine receives a double parameter by reference (because it
expects a pointer to a double) and returns a short value.

short _near _pascal func( _near double * darg );

The example below is equivalent to the preceding example ( _pascal _near is
equivalent to _near _pascal).

short _pascal _near func( _near double * darg );

You can make C adopt the conventions of other languages.

When you call a BASIC subprogram, you must use the FORTRAN/Pascal
conventions to make the call. When you call FORTRAN or Pascal, however, you
have a choice. You can make C adopt the conventions described in the
previous section, or you can make the FORTRAN or Pascal routine adopt the C
conventions.

To make a FORTRAN or Pascal routine adopt the C conventions, put the C
attribute in the heading of the routine's definition. The following example
shows the syntax for the C attribute in a FORTRAN subroutine-definition
heading:

      SUBROUTINE FFROMC [C] (N)
      INTEGER*2 N

The following example shows the syntax for the C attribute in a Pascal
procedure-definition heading:

PROCEDURE Pfromc( n : INTEGER ) [C];

To make a C function adopt the FORTRAN/Pascal conventions, declare the
function as _fortran or _pascal. For example,

void _pascal CfromP( int n );

12.5  C Calls to BASIC

No BASIC routine can be executed unless the main program is in BASIC,
because a BASIC routine requires the environment to be initialized in a way
that is unique to BASIC. No other language will perform this special
initialization.

However, your program can start up in BASIC, call a C function that does
most of the work of the program, and then call BASIC subprograms and
function procedures as needed. Figure 12.3 illustrates how to do this.

(This figure may be found in the printed book.)

Follow these rules when you call BASIC from C:

1.  Start up in a BASIC main module. You will need to use the DECLARE
statement to provide an interface to the C module.

2.  In the C module, write a prototype for the BASIC routine and include
type information for parameters. Use either the _fortran or _pascal
keyword to modify the routine itself.

3.  Make sure that all data are passed as near pointers. BASIC can pass
data in a variety of ways but is unable to receive data in any form
other than near reference. With near pointers, the program assumes
that the data are in the default data segment. If you want to pass
data that are not in the default data segment, copy the data to a
variable in the default data segment.

4.  Compile the C module in medium or large model to ensure far
(intersegment) calls.
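
Rule 3 suggests copying data into the default data segment before the call.
One way to do that is sketched below in portable C. The function names are
hypothetical, and  basic_style_routine  is only a stand-in for a real BASIC
subprogram. The sketch assumes a standard memory model in which automatic
variables can be addressed with near pointers, and it also copies the result
back afterward, a step that is needed only if the BASIC routine changes its
parameter.

```c
/* Stand-in for a BASIC subprogram that receives a near reference. */
static void basic_style_routine(int *value)
{
    *value = *value * 2;
}

/*
 * Calls the routine for a datum that, in the real program, would live
 * outside the default data segment.
 */
void call_with_local_copy(int *outside_datum)
{
    int local_copy = *outside_datum;    /* copy into a local variable */

    basic_style_routine(&local_copy);   /* pass the local's address */
    *outside_datum = local_copy;        /* copy the result back */
}
```

In a real program,  outside_datum  would be declared as a _far pointer and
the stand-in would be replaced by a prototype that uses the _fortran or
_pascal keyword with a _near pointer parameter.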

The example below demonstrates a BASIC program that calls a C function. The
C function then calls a BASIC function that returns twice the number passed
to it and a BASIC subprogram that prints two numbers.

' BASIC source
'
' The main program is in BASIC because of BASIC's start-up
' requirements.  The BASIC main program calls the C function
' Cprog.
'
' Cprog calls the BASIC function Dbl and the BASIC subprogram
' Printnum.
'
DEFINT A-Z
DECLARE SUB Cprog CDECL()
CALL Cprog
END
'
FUNCTION Dbl(N) STATIC
    Dbl = N*2
END FUNCTION
'
SUB Printnum(A,B) STATIC
    PRINT "The first number is ";A
    PRINT "The second number is ";B
END SUB

/* C source; compile in medium or large model */

#include <stdio.h>     /* For declaration of printf */

int _fortran dbl( int _near * N );
void _fortran printnum( int _near * A, int _near * B );

void cprog()
{
    int a = 5;
    int b = 6;

    printf( "%d times 2 is %d\n", a, dbl( &a ) );
    printnum( &a, &b );
}

In the previous example, note that the addresses of  a  and  b  are passed,
since BASIC expects to receive addresses for parameters. This is important
because C passes parameters by value unless you use the address-of (&)
operator to obtain the address, or are passing an array. Also note that the
function prototype for  printnum  declares the parameters as near pointers.
The prototype causes the variables to be passed by near reference. If  a  or
b  is declared as _far, the C compiler issues a warning that you are
converting a far pointer to a near pointer and that a segment was lost in
the conversion.

Calling and naming conventions are resolved by the CDECL keyword in the
BASIC declaration of Cprog, and by the _fortran keyword in the C declaration
of  dbl  and  printnum.

BASIC can invoke one of your functions as part of the termination
procedure.

Versions of QuickBASIC later than 4.0 provide a "user entry point,"
B_OnExit, which can be called directly from C. The B_OnExit function enables
you to make sure you have performed an orderly termination. The following
code shows how to use B_OnExit.

#include <malloc.h>    /* For declarations of _fmalloc and _ffree */
#include <stdlib.h>    /* For declaration of onexit_t */

/* The prototype for B_OnExit declares it as a function
 * returning type onexit_t that takes one parameter. The
 * parameter is a far pointer to a function that returns
 * no value.
 */
extern onexit_t _pascal _far B_OnExit( onexit_t );
void TermProc( void );

int _far * p_IntArray;     /* Far pointer, to hold the _fmalloc result */

void InitProc( void )
{
    /* Allocate far space for 20-integer array */

    p_IntArray = (int _far *)_fmalloc( 20 * sizeof( int ) );

    /* Log termination routine (TermProc) with BASIC. */

    B_OnExit( TermProc );
}

void TermProc( void )
{
    _ffree( p_IntArray );  /* Release far space allocated */
}                          /* previously by InitProc.     */
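
The same pattern (allocate a resource, then register a routine that
releases it at termination) can be tried outside of BASIC with the standard
C function atexit. The sketch below is an analogy only: it does not call
B_OnExit, and the names  init_work,  release_work, and  work_buffer  are
hypothetical.

```c
#include <stdlib.h>

static int *work_buffer;    /* hypothetical resource released at exit */

/* Termination routine: releases what init_work allocated. */
static void release_work(void)
{
    free(work_buffer);
    work_buffer = NULL;
}

/*
 * Allocates the buffer and registers the termination routine.
 * Returns 0 on success, -1 on failure.
 */
int init_work(size_t count)
{
    work_buffer = malloc(count * sizeof *work_buffer);
    if (work_buffer == NULL)
        return -1;

    if (atexit(release_work) != 0) {    /* registration failed */
        free(work_buffer);
        work_buffer = NULL;
        return -1;
    }
    return 0;
}
```

As with the B_OnExit example, the registration is checked here so that a
failure does not leave the buffer allocated with no routine to release it.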

12.6  C Calls to FORTRAN

This section shows two examples of C-FORTRAN programs. There are two types
of subprogram calls to FORTRAN routines: calls to subroutines and calls to
functions. Functions return a value, while subroutines do not. The examples
in the next sections illustrate how to handle the difference between
function and subroutine calls.

12.6.1  Calling a FORTRAN Subroutine from C

The example below demonstrates a C main module calling a FORTRAN subroutine,
MAXPARAM. This subroutine adjusts the lower of two arguments to be equal to
the higher argument.

/* C source file - calls FORTRAN subroutine
 * Compile in medium or large model
 */

#include <stdio.h>     /* For declaration of printf */

extern void _fortran maxparam( int _near * I, int _near * J );

/* Declare as void, because there is no return value.
 * The _fortran keyword causes C to use FORTRAN/Pascal
 * calling and naming conventions.
 * Two integer parameters, passed by near reference.
 */

main()
{
    int a = 5;
    int b = 7;

    printf( "a = %d, b = %d\n", a, b );
    maxparam( &a, &b );
    printf( "a = %d, b = %d\n", a, b );
}

C   FORTRAN source file, subroutine MAXPARAM
C
$NOTRUNCATE
      SUBROUTINE MAXPARAM (I, J)
      INTEGER*2 I [NEAR]
      INTEGER*2 J [NEAR]
C
C   I and J received by near reference,
C   because of NEAR attribute
C
      IF (I .GT. J) THEN
          J = I
      ELSE
          I = J
      ENDIF
      END

In the previous example, the C program adopts the naming convention and
calling convention of the FORTRAN subroutine. The two programs must agree on
whether parameters are to be passed by reference or by value. The following
keywords affect how the two programs interface:

■   The _fortran keyword directs C to call  maxparam  with the FORTRAN/
Pascal naming convention (as  MAXPARAM); _fortran also directs C to
call  maxparam  with the FORTRAN/Pascal calling convention.

■   Since the FORTRAN subroutine  MAXPARAM  may alter the value of either
parameter, both parameters must be passed by reference. In this case,
near reference was chosen; this method is specified in C by the use of
near pointers, and in FORTRAN by applying the NEAR keyword to the
parameter declarations.

Far reference could have been specified by using far pointers in C. In
that case, you would not declare the FORTRAN subroutine  MAXPARAM  with
the NEAR keyword. If you compile the FORTRAN program in medium model,
declare  MAXPARAM  using the FAR keyword.

12.6.2  Calling a FORTRAN Function from C

The example below demonstrates a C main module calling the FORTRAN function
fact. This function returns the factorial of an integer value.

/* C source file - calls FORTRAN function.
 * Compile in medium or large model.
 */

#include <stdio.h>     /* For declaration of printf */

int _fortran fact( int N );

/* The _fortran keyword causes C to use FORTRAN/Pascal
 * calling and naming conventions.
 * Integer parameter passed by value.
 */

main()
{
    int x = 3;
    int y = 4;

    printf( "The factorial of x   is %4d\n", fact( x ) );
    printf( "The factorial of y   is %4d\n", fact( y ) );
    printf( "The factorial of x+y is %4d\n", fact( x + y ) );
}

C   FORTRAN source file - factorial function
C
$NOTRUNCATE
      INTEGER*2 FUNCTION FACT (N)
      INTEGER*2 N [VALUE]
C
C   N is received by value, because of VALUE attribute
C
      INTEGER*2 I
      FACT = 1
      DO 100 I = 1, N
          FACT = FACT * I
  100 CONTINUE
      RETURN
      END

In the example above, the C program adopts the naming convention and calling
convention of the FORTRAN function. Both programs must agree on whether
parameters are passed by reference or by value. Note that the C program
passes the parameters by value rather than by reference. Passing parameters
by value is the default for C. To accept parameters passed by value, the
keyword VALUE is used in the declaration of  N  in the FORTRAN function. The
_fortran keyword directs C to call  fact  with the FORTRAN/Pascal naming
convention (as  FACT); _fortran also directs C to call  fact  with the
FORTRAN/Pascal calling convention.

When passing a parameter that should not be changed, pass the parameter by
value. Passing by value is the default method in C and is specified in
FORTRAN by applying the VALUE attribute to the parameter declaration.
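
Because  FACT  is declared INTEGER*2, its result is limited to the range of
a 16-bit signed integer. The portable C sketch below (with the hypothetical
names  checked_fact16  and  INT16_LIMIT) computes the same factorial in a
long and reports when the result would no longer fit, which can help you
choose safe test values for the mixed-language example.

```c
#define INT16_LIMIT 32767L      /* largest value an INTEGER*2 can hold */

/*
 * Computes n factorial.
 * Returns the factorial, or -1 if n is negative or the result would
 * exceed INT16_LIMIT.
 */
long checked_fact16(int n)
{
    long result = 1;
    int i;

    if (n < 0)
        return -1;

    for (i = 1; i <= n; ++i) {
        if (result > INT16_LIMIT / i)
            return -1;          /* the next multiplication would overflow */
        result *= i;
    }
    return result;
}
```

The largest argument used in the example above is  x + y, or 7, and 7
factorial is 5040, which fits. The factorial of 8 is 40320, which does not
fit in 16 bits, so  checked_fact16( 8 )  returns -1.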

12.7  C Calls to Pascal

This section shows two examples of C-Pascal programs. There are two types of
subprogram calls to Pascal routines: calls to procedures and calls to
functions. Functions return a value, while procedures do not. The examples
in the next sections illustrate how to handle the difference between
function and procedure calls.

12.7.1  Calling a Pascal Procedure from C

The following example demonstrates a C main module calling a Pascal
procedure,  maxparam. This procedure adjusts the lower of two arguments to
be equal to the higher argument.

/* C source file - calls Pascal procedure.
 * Compile in medium or large model.
 */

#include <stdio.h>     /* For declaration of printf */

void _pascal maxparam( int _near * a, int _near * b );

/* Declare as void, because there is no return value.
 * The _pascal keyword causes C to use FORTRAN/Pascal
 * calling and naming conventions.
 * Two integer params, passed by near reference.
 */

main()
{
    int a = 5;
    int b = 7;

    printf( "a = %d, b = %d\n", a, b );
    maxparam( &a, &b );
    printf( "a = %d, b = %d\n", a, b );
}

{ Pascal source code - Maxparam procedure. }

MODULE Psub;
PROCEDURE Maxparam( VAR a:INTEGER; VAR b:INTEGER );

{ Two integer parameters are received by near reference. }
{ Near reference is specified with the VAR keyword. }

BEGIN
    IF a > b THEN
        b := a
    ELSE
        a := b
END;
END.

In the example above, the C program adopts the Pascal naming convention and
calling convention. Both programs must agree on whether parameters are
passed by reference or by value; the following keywords affect the
conventions:

■   The _pascal keyword directs C to call  Maxparam  with the FORTRAN/
Pascal naming convention (as  MAXPARAM); _pascal also directs C to
call Maxparam  with the FORTRAN/Pascal calling convention.

■   Since the procedure  Maxparam  can alter the value of either
parameter, both parameters must be passed by reference. In this case,
near reference is used; this method is specified in C by the use of
near pointers, and in Pascal with the VAR keyword.

Far reference could have been specified by using far pointers in C. To
specify far reference in Pascal, use the VARS keyword instead of VAR.

12.7.2  Calling a Pascal Function from C

The example below demonstrates a C main module calling Pascal function
fact. This function returns the factorial of an integer value.

/* C source file - calls Pascal function.
 * Compile in medium or large model.
 */

#include <stdio.h>     /* For declaration of printf */

int _pascal fact(int n);

/* The _pascal keyword causes C to use FORTRAN/Pascal
 * calling and naming conventions.
 * Integer parameter passed by value.
 */

main()
{
    int x = 3;
    int y = 4;

    printf( "The factorial of x   is %4d\n", fact( x ) );
    printf( "The factorial of y   is %4d\n", fact( y ) );
    printf( "The factorial of x+y is %4d\n", fact( x + y ) );
}

{ Pascal source code - factorial function. }

MODULE Pfun;
FUNCTION Fact (n : INTEGER) : INTEGER;

{Integer parameters received by value, the Pascal default. }

BEGIN
    Fact := 1;
    WHILE n > 0 DO
        BEGIN
            Fact := Fact * n;
            n := n - 1;          {Parameter n modified.}
        END;
END;
END.

In the example above, the C program adopts the Pascal naming convention and
calling convention. Both programs must agree on whether parameters are
passed by reference or by value. The _pascal keyword directs C to call  fact
with the FORTRAN/Pascal naming convention (as  FACT);  _pascal also directs
C to call fact  with the FORTRAN/Pascal calling convention.

The Pascal function  fact  should receive a parameter by value. Otherwise,
the Pascal function will corrupt the parameter's value in the calling
module. Passing by value is the default method for both C and Pascal.

12.8  C Calls to Assembly Language

In Microsoft C, Version 6.0, you can write assembly-language programs either
by using the in-line assembler or by creating a stand-alone module using the
Microsoft Macro Assembler (MASM). If you use the in-line assembler, you do
not need to take any special precautions other than those outlined in
Chapter 3, "Using the In-Line Assembler." This section explains the
techniques for interfacing your assembly-language routines with your C
program.

When deciding whether to use the in-line assembler or MASM, there are
several considerations. Here is a list of advantages MASM provides over the
in-line assembler:

■   MASM supports declaration of data in MASM format; in-line assembly
does not.

■   MASM has a more powerful macro capability than in-line assembly.

■   Modules written for MASM can be interfaced more easily with modules
written in more than one Microsoft high-level language.

■   MASM assembles large assembly-language programs more quickly than the
in-line assembler.

■   MASM supports assembly-language code written prior to the existence of
the in-line assembler.

■   MASM error messages and warnings are more complete than those of the
in-line assembler.

The in-line assembler is far more efficient for some assembly-language
programming tasks. Here are some of the benefits of the in-line assembler:

■   You can do spot optimizations by including short sections of
assembly-language code in your C programs with the in-line assembler.

■   Code written in in-line assembler does not necessarily incur the
overhead of a function call; code assembled using MASM always does.

■   You can include in-line assembly code in your C source files; code
written for MASM must be in a separate file.

12.8.1  Writing the Assembly-Language Procedure

You must write your assembly-language procedure so that it uses the same
calling conventions and naming conventions as your C program. If you follow
these conventions, you will be able to write recursive procedures
(procedures that call themselves), and you will be able to use the CodeView
debugger to locate errors in the code.

────────────────────────────────────────────────────────────────────────────
NOTE

This section discusses only the simplified segment directives provided with
the Microsoft Macro Assembler, version 5.0. If you are using a version prior
to 5.0, you have to specify complete SEGMENT directives.
────────────────────────────────────────────────────────────────────────────

The standard assembly-language interface method consists of these steps:

1.  Setting up the procedure

2.  Entering the procedure

3.  Allocating local data (optional)

4.  Preserving register values

5.  Accessing parameters

6.  Returning a value (optional)

7.  Exiting the procedure

The next sections describe each of these steps in detail.

12.8.2  Setting Up the Procedure

The linker cannot combine the assembly-language procedure with the C program
unless you define compatible segments and declare the procedure properly.
Perform the following steps to set up the procedure:

1.  Use the .MODEL directive at the beginning of the source file; this
directive automatically causes the appropriate kind of returns to be
generated (NEAR for tiny, small or compact models, FAR for medium,
large, or huge models).

If you are using a version of MASM prior to 5.0, declare the procedure
NEAR for small or compact model, FAR for medium, large, or huge
models.

2.  Use the simplified segment directives .CODE and .DATA to declare the
code and data segments.

If you are using a version of MASM prior to 5.0, declare the segments
using the SEGMENT, GROUP, and ASSUME directives. These directives are
described in the Microsoft Macro Assembler Reference.

3.  Use the PUBLIC directive to declare the procedure label public. This
declaration makes the procedure visible to other modules. Also declare
any data you want to make public as PUBLIC.

4.  Use the EXTRN directive to declare any global data or procedures
accessed by the routine as external. The safest way to use EXTRN is to
place the directive outside any segment definition; however, place
near data inside the data segment.

5.  Observe the C naming convention; precede all procedure names and
global data names with an underscore.

12.8.3  Entering the Procedure

When you enter the procedure, in most cases you will want to set up a "stack
frame." This allows you to access parameters passed on the stack and to
allocate local data on the stack. You do not need to set up the stack frame
if your procedure accepts no arguments and does not use the stack.

To set up the stack frame, issue the instructions:

push    bp
mov     bp,sp

This sequence establishes BP as the frame pointer. You cannot use SP for
this purpose because it is not an index or base register. Also, the value of
SP may change as more data are pushed onto the stack. However, the value of
the base register BP remains constant for the life of the procedure unless
your program changes it, so each parameter can be addressed as an offset
from BP.

The instruction sequence above preserves the value of BP, since it will be
needed in the calling procedure as soon as your assembly-language procedure
returns. It then transfers the value in SP to BP to establish a stack frame
on entry to the procedure.

12.8.4  Allocating Local Data

Your assembly-language procedure can use the same technique for allocating
temporary storage for local data that is used by high-level languages. To
set up local data space, decrease the contents of SP just after setting up
the stack frame. (To ensure correct execution, always increase or decrease
SP by an even number.) Decreasing SP reserves space on the stack for local
data. You must release this space at the end of the procedure. To allocate
the space, use the following entry sequence:

push    bp
mov     bp,sp
sub     sp,space

In the example above,  space  is the total size in bytes of the local data
you want to allocate. Local variables are then accessed as fixed negative
displacements from BP.

In the following example, the entry sequence establishes a stack frame and
allocates temporary local storage for two words (4 bytes) of data. Later in
the example, the program accesses the local storage, initializing both words
to 0.

push    bp          ; Save old stack frame.
mov     bp,sp       ; Set up new stack frame.
sub     sp,4        ; Allocate 4 bytes of local storage.
.
.
.
mov     WORD PTR [bp-2],0
mov     WORD PTR [bp-4],0

Note that local variables are also called dynamic, stack, or automatic
variables.

12.8.5  Preserving Register Values

A procedure called from C should preserve the values of SI, DI, SS, and DS
(in addition to BP, which is already saved). You should push any register
value that your procedure modifies onto the stack after setting up the stack
frame and allocating local storage, but prior to entering the main body of
the procedure. Registers that your procedure does not alter need not be
preserved.

────────────────────────────────────────────────────────────────────────────
WARNING

Routines that your assembly-language procedure calls must not alter the SI,
DI, SS, DS, or BP registers. If they do, and you have not preserved the
registers, they can corrupt the calling program's register variables,
segment registers, and stack frame, causing program failure. If your
procedure modifies the direction flag using the STD or CLD instructions, you
must preserve the flags register.
────────────────────────────────────────────────────────────────────────────

The example below shows an entry sequence that sets up a stack frame,
allocates 4 bytes of local data space on the stack, then preserves the SI,
DI, and flags registers.

push    bp        ; Save caller's stack frame.
mov     bp,sp     ; Establish new stack frame.
sub     sp,4      ; Allocate local data space.
push    si        ; Save SI and DI registers.
push    di
pushf             ; Save the flags register.
.
.
.

In the example above, you must exit the procedure with the following code:

popf              ; Restore the flags register.
pop    di         ; Restore the old value in the DI register.
pop    si         ; Restore the old value in the SI register.
mov    sp,bp      ; Restore the stack pointer.
pop    bp         ; Restore the frame pointer.
ret               ; Return to the calling routine.

If you do not issue the instructions above in the order shown, you will
place incorrect data in registers. Follow the rules below when restoring the
calling program's registers, stack pointer, and frame pointer:

■   Pop all registers that you preserve in the reverse order from which
they were pushed onto the stack. In the example above, SI, DI, and the
flags are pushed in that order, so the flags, DI, and SI are popped.

■   Restore the stack pointer by transferring the value of BP into SP
before restoring the value of the frame pointer.

■   Always restore the frame pointer last.

12.8.6  Accessing Parameters

Once you have established the frame pointer, allocated local storage (if
required), and pushed any registers that need to be preserved, you can write
the main body of the procedure. Figure 12.4 shows how functions that observe
the C calling convention use the stack frame.

(This figure may be found in the printed book.)

The stack frame for the assembly-language procedure shown in Figure 12.4 is
established by the following:

1.  The calling program pushes each of the parameters onto the stack,
after which SP points to the last parameter pushed.

2.  The calling program issues a CALL instruction, which causes the return
address (the place in the calling program to which control will
ultimately return) to be placed on the stack. This address can be
either two bytes long (for near calls) or four bytes long (for far
calls). SP now points to this address.

3.  The first instruction of the called procedure saves the old value of
BP, with the instruction  push bp. SP now points to the saved copy of
BP.

4.  BP is used to hold the current value of SP, with the instruction  mov
bp,sp. BP therefore now points to the old value of BP (saved on the
stack).

5.  While BP remains constant throughout the procedure, SP is often
decreased to provide room on the stack for local data or saved
registers.

In general, the displacement (from BP) for a parameter x is equal to the
size of return address plus 2 plus the total size of parameters between x
and BP.

To calculate the size of parameters between x and BP, add up the sizes of
the parameters to the left of x. Because C pushes parameters from right to
left, the leftmost parameter is pushed last and lies closest to BP. For
example, consider a FAR procedure that has one argument of type int (two
bytes). The displacement of the parameter is

Argument's displacement = size of far return address + 2
= 4 + 2
= 6

The argument can thus be loaded into BX with the following instruction:

mov    bx,[bp+6]
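
The displacement rule above can be expressed as a short C sketch. The
function name param_displacement and the constants NEAR_RETURN and
FAR_RETURN are illustrative and not part of any Microsoft library. The
sketch assumes the C calling convention, in which the leftmost parameter is
pushed last and therefore lies closest to BP.

```c
#include <stddef.h>

/* Size in bytes of the return address that CALL places on the stack. */
enum { NEAR_RETURN = 2, FAR_RETURN = 4 };

/*
 * Displacement from BP of parameter number `index` (0 = leftmost) under
 * the C calling convention. sizes[i] is the number of bytes that
 * parameter i occupies on the stack.
 */
int param_displacement(int return_size, const int *sizes, size_t index)
{
    int disp = return_size + 2;     /* return address plus saved BP */
    size_t i;

    for (i = 0; i < index; i++)     /* parameters lying between x and BP */
        disp += sizes[i];
    return disp;
}
```

For the FAR procedure with one int argument described above, the function
yields 6; for a NEAR procedure the same argument would be found at BP+4.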

Once you determine the displacement of each parameter, you can use EQU
directives or structures to refer to the parameter with a single identifier
name in your assembly source code. For example, you can use a more readable
name to reference the parameter at  BP+6  if you put the following statement
at the beginning of the assembly source file:

Arg1    EQU    [bp+6]

You can then refer to the first parameter in your source as  Arg1  in any
instruction. Use of this feature is optional.

For far (segmented) addresses, Microsoft C pushes the segment address before
pushing the offset address. When pushing arguments larger than two bytes,
high-order words are always pushed before low-order words, and parameters
larger than two bytes are stored on the stack in most-significant,
least-significant order.

This standard for pushing segment addresses before pushing offset addresses
facilitates the use of the assembly-language instructions LDS (load data
segment) and LES (load extra segment).

12.8.7  Returning a Value

Your assembly-language procedure can return a value to a C calling program.
All return values of four bytes or less are passed in registers. Far
pointers to return values larger than four bytes are returned in the DX and
AX registers. The DX register contains the segment address; the AX register
contains the offset relative to the segment contained in DX.

Table 12.3 shows the register conventions for returning simple data types to
a C program.

Table   12.3 Register Conventions for Simple Return Values

╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Data Type                         Registers
────────────────────────────────────────────────────────────────────────────
char                              AL

int, short, _near *               AX

long, _far *                      High-order portion (or segment address)
                                  in DX; low-order portion (or offset
                                  address) in AX

────────────────────────────────────────────────────────────────────────────

Your procedures can return structures.

To return a structure from a procedure that uses the C calling convention,
you must copy the structure to a global variable, then return a pointer to
that variable in the AX register (DX:AX, if you compiled in compact, large,
or huge model).

Procedures that use the FORTRAN/Pascal calling convention return structures
similarly, with the following exceptions:

■   The calling program allocates space for the return value on the stack.

■   The calling program passes a pointer to the location where the return
value is to be placed in a hidden parameter.

■   Instead of copying your structure into a global data item, you copy it
into the location pointed to by the hidden parameter.

■   You must still return the pointer to that location in the AX register
(or DX:AX for far data models).
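
The two structure-return conventions can be modeled in C itself. The
following is only a sketch: the names point_c_style, point_pascal_style, and
hidden_dest are hypothetical, and the hidden parameter, which a compiler
would normally supply for you, is written out as an ordinary argument so
that it can be seen.

```c
struct point { int x; int y; };

/* C convention: copy the result into static storage and return its
 * address (the value that would travel back in AX or DX:AX). */
static struct point c_result;

struct point *point_c_style(int x, int y)
{
    c_result.x = x;
    c_result.y = y;
    return &c_result;
}

/* FORTRAN/Pascal convention: the caller reserves the space and passes
 * its address as a hidden parameter; the callee fills that space and
 * returns the same address. */
struct point *point_pascal_style(struct point *hidden_dest, int x, int y)
{
    hidden_dest->x = x;
    hidden_dest->y = y;
    return hidden_dest;
}
```

In both cases the caller receives a pointer; the difference lies only in who
owns the storage that the pointer refers to.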

You can return floating-point values from your procedures.

Procedures that use the C calling convention and return type float or type
double must always copy their return values into the global variable fac. To
return floating-point values from procedures declared with the
FORTRAN/Pascal calling convention, you must return the result on the stack,
just as you would a structure.

To return a value of type long double, you must place the value on the
NDP(80x87) stack using the FLD instruction. The C run-time math routines
guarantee that the only value on the NDP stack is a return value; your
routines must observe the same rule.

12.8.8  Exiting the Procedure

Before you exit your assembly-language procedure, you must perform several
steps to restore the calling program's environment. Some of these steps are
dependent on actions you took in allocating space for local variables and
preserving registers.

You must follow these steps (if appropriate to your procedure) in the order
shown:

1.  If you saved any of the registers SS, DS, SI, or DI, they must be
popped off the stack in the reverse order from which they were saved.
If you pop these registers in any other order, your program will
behave incorrectly.

2.  If you allocated local data space at the beginning of the procedure,
you must restore SP with the instruction  mov sp,bp.

3.  Restore BP with the instruction  pop bp. This step is always
necessary.

4.  Return to the calling program by issuing the ret instruction.

The following example shows the simplest possible entry and exit sequence.
In the entry sequence, no registers are saved and no local data space is
allocated.

push   bp
mov    bp,sp   ; Set up the new stack frame.
.
.
.
pop    bp      ; Restore the caller's stack frame.
ret

The following example shows an entry and exit sequence for a procedure that
saves SI and DI and allocates local data space on the stack.

push   bp
mov    bp,sp   ; Establish local stack frame.
sub    sp,4    ; Allocate space for local data.
push   si      ; Preserve the SI and DI registers.
push   di
.
.
.
pop    di      ; Pop saved registers.
pop    si
mov    sp,bp   ; Free local data space.
pop    bp      ; Restore old stack frame.
ret

12.9  Handling Data in Mixed-Language Programming

This section contains detailed information about naming and calling
conventions in a mixed-language program. It also describes how various
languages represent strings, numerical data, arrays, and logical data.

12.9.1  Default Naming and Calling Conventions

Each language has its own default naming and calling conventions (Table
12.4).

Table 12.4  Default Naming and Calling Conventions

╓┌─────────┌──────────────────┌──────────────────┌───────────────────────────╖
          Calling            Naming             Parameter
Language  Convention         Convention         Passing
────────────────────────────────────────────────────────────────────────────
BASIC     FORTRAN/Pascal     Case insensitive   Near reference

C         C                  Case sensitive     Value (scalar variables),
                                                reference (arrays and
                                                pointers)

FORTRAN   FORTRAN/Pascal     Case insensitive   Reference

Pascal    FORTRAN/Pascal     Case insensitive   Value

────────────────────────────────────────────────────────────────────────────

BASIC Conventions

When you call BASIC routines from C, you must pass all arguments by near
reference (near pointer). You can modify the conventions observed by BASIC
routines that interface with C functions by using the DECLARE, BYVAL, SEG,
and CALLS keywords. For more information on these keywords, see the
Microsoft BASIC Language Reference or the Microsoft BASIC Programmer's
Guide.

FORTRAN Conventions

You can modify the conventions observed by FORTRAN routines that call C
functions by using the INTERFACE, VALUE, PASCAL, and C keywords. For more
information about the use of these keywords, see the Microsoft FORTRAN
Reference.

Pascal Conventions

You can modify the conventions observed by Pascal routines that interface
with C functions by using the VAR, CONST, ADR, VARS, CONSTS, ADRS, and C
keywords. For more information about the use of these keywords, see the
Microsoft Pascal Compiler User's Guide.

12.9.2  Numeric Data Representation

Table 12.5 shows how to declare numeric variables of similar type in
different languages.

Table 12.5  Equivalent Numeric Data Types

╓┌─────────────┌───────────────────┌──────────────────────┌──────────────────╖
BASIC         C                   FORTRAN                Pascal
────────────────────────────────────────────────────────────────────────────
x%            short               INTEGER*2              INTEGER2

INTEGER       int                 ---                    INTEGER (default)

---           unsigned short(1)   ---                    WORD

---           unsigned            ---                    ---

x&            long                INTEGER*4              INTEGER4

LONG          ---                 INTEGER (default)      ---

---           unsigned long(1)    ---                    ---

x!            float               REAL*4                 REAL4

x (default)   ---                 REAL                   REAL (default)

SINGLE        ---                 ---                    ---

x#            double              REAL*8                 REAL8

DOUBLE        ---                 DOUBLE PRECISION       ---

---           long double         REAL*16                REAL16

---           unsigned char       CHARACTER*1(2)         CHAR

────────────────────────────────────────────────────────────────────────────

(1)  Types unsigned short and unsigned long are not supported by BASIC or
FORTRAN. Type  unsigned long is not supported by Pascal. A signed integral
type can be substituted, but the maximum range will be less.
(2)  The FORTRAN type CHARACTER*1 is not the same as LOGICAL.

The FORTRAN types COMPLEX*8 and COMPLEX*16 are not implemented in C but can
be represented with structures.

The FORTRAN types LOGICAL*2 and LOGICAL*4 are not implemented in C.
LOGICAL*2 is stored as a one-byte Boolean indicator followed by an unused
byte; LOGICAL*4 is stored as a one-byte Boolean indicator followed by three
unused bytes.

12.9.3  Strings

Each language implements strings differently. This section describes the
ways that strings are implemented in Microsoft languages.

C String Format

C stores strings as arrays of bytes and uses a null character ('\0') as
an end-of-string delimiter. For example, consider the following string:

char c_string[] = "C text string";

This string is represented in memory as follows:

(This figure may be found in the printed book.)

Because  c_string  is an array like any other, C passes it by reference in
function calls.

BASIC String Format

BASIC stores strings as four-byte descriptors pointing to the actual string
data. The format of the descriptor is as follows:

(This figure may be found in the printed book.)

The first field of the string descriptor contains an integer indicating the
length (in bytes) of the string. The second field contains the address of
the string in the default data segment.

Do not attempt to alter the length of BASIC strings, because they are
managed by BASIC string-space management routines. You cannot count on a
particular string remaining at a given offset during the execution of a
BASIC program because the BASIC string-space management routines allocate
strings to different areas of memory depending on program requirements.

The format of the string at DS:Address is a simple array of characters. The
string is exactly the length indicated in the descriptor.

To pass a BASIC string to C, append a null character.

Because C needs the null character to delimit the end of the string, you
should append  chr$( 0 )  to your BASIC string before passing it to your C
function. For example,

A$ = "I am a BASIC string"
A$ = A$ + chr$( 0 )
CALL CFunc( SADD(A$) )

Note that the BASIC call is made by near reference using the SADD keyword.

Use a string descriptor to pass a C string to BASIC.

To pass a C string to BASIC, create a structure for the string descriptor.
For example,

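char c_string[] = "C String Data";

struct tagBASICStringDes
{
int      sd_len;     /* Length of the string in bytes */
char    *sd_addr;    /* Near address of the string data (field name assumed) */
} str_des;

str_des.sd_len = strlen( c_string );
str_des.sd_addr = c_string;

BASICFunction( &str_des );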

FORTRAN String Format

FORTRAN stores strings as a series of bytes at a fixed location in memory.
There is no delimiter at the end of the string. Consider the string declared
as follows:

STR = 'FORTRAN STRING'

The string is stored in memory as follows:

(This figure may be found in the printed book.)

FORTRAN passes strings by reference, as it does all other data.

────────────────────────────────────────────────────────────────────────────
NOTE

FORTRAN's variable length strings cannot be used in mixed-language
programming because the temporary variable used to communicate string length
is not accessible to other languages.
────────────────────────────────────────────────────────────────────────────

To pass a C string to FORTRAN (or Pascal), pass the variable by reference as
you normally would. In your FORTRAN or Pascal routine, you must specify the
length of the string; strings that are passed as arguments from one language
to another must be of fixed length.
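
Because a FORTRAN or fixed-length Pascal string carries no delimiter, a C
routine must be told the length and must handle the terminator itself. The
helper functions below are a minimal sketch of that bookkeeping. Their names
are invented for illustration, and padding the unused part of the field with
blanks is an assumption about what the receiving routine expects.

```c
#include <stddef.h>
#include <string.h>

/* Copy a fixed-length, undelimited string of `len` bytes into a C
 * string buffer and add the terminating null character.
 * Returns 0 on success, -1 if the destination is too small. */
int fixed_to_cstr(const char *src, size_t len, char *dst, size_t dst_size)
{
    if (dst_size < len + 1)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Copy a C string into a fixed-length field of `len` bytes, filling
 * the remainder with blanks (assumed padding). No null is stored.
 * Returns 0 on success, -1 if the string does not fit. */
int cstr_to_fixed(const char *src, char *dst, size_t len)
{
    size_t n = strlen(src);

    if (n > len)
        return -1;
    memcpy(dst, src, n);
    memset(dst + n, ' ', len - n);
    return 0;
}
```

The length passed to either function must match the length declared for the
string in the FORTRAN or Pascal routine.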

Pascal String Format

Pascal represents strings as fixed-length arrays of CHAR or as strings with
a length byte followed by the string data.

To pass a fixed-length string to C, append a null character.

To pass a fixed-length string to a C function, use the concatenation
operator (*) to append a null character. Then pass the string to the C
function by reference (by declaring the string as CONST, CONSTS, VAR, or
VARS). For example,

PROGRAM PasStr( input, output );
type
stype15 = string(15);  { fixed-length }
var
str : stype15;

PROCEDURE PasStrToC( VAR s1 : stype15 ) [C]; EXTERN;

BEGIN
str := 'Pass this to C' * chr( 0 );
PasStrToC( str );
END.

A more flexible way to pass Pascal strings to C functions is to declare them
as type ADRMEM or ADSMEM, then pass the address of the string. For example,

PROCEDURE PasStrToC( s1adr : ADRMEM ) [C]; EXTERN;

Then you can call the C function with this code:

PasStrToC( ADR str );

Using this method, you can pass strings of different lengths to C functions.

────────────────────────────────────────────────────────────────────────────
NOTE

The Pascal type LSTRING is not compatible with C; you can pass a string
declared as LSTRING by first assigning it to another variable of type
STRING, then passing that variable.
────────────────────────────────────────────────────────────────────────────

Whenever you pass a variable of type STRING or type LSTRING by value, Pascal
pushes the whole string onto the stack and passes the length of the string
as another parameter. C cannot access strings passed in this manner.

Before passing a string from C to Pascal, make sure enough space is
allocated.

Passing a string from a C function to a Pascal function or procedure is
identical to passing a string from a C function to a FORTRAN routine. The
only provision you must make is to specify the length of the string to your
Pascal function.

12.9.4  Arrays

When you use an array in a program written in a single language, the method
for array handling is consistent. When you mix languages, you need to be
aware of the differences between array-handling techniques in various
languages.

Unlike most Microsoft languages, BASIC keeps an array descriptor, which is
similar to the BASIC string descriptor discussed in Section 12.9.3,
"Strings." This array descriptor is necessary because BASIC handles memory
allocation for arrays dynamically (at run time). Dynamic allocation requires
BASIC to shift arrays in memory.

To pass a BASIC array to a C function, use the VARPTR and VARSEG keywords.

The VARPTR and VARSEG keywords obtain the address of the first element of
the array and its segment, respectively. The example below shows how to call
a C function with a near reference and a far reference to an array:

DIM ARRAY%( 20 )
DECLARE CNearArray CDECL( BYVAL Addr AS INTEGER )
DECLARE CFarArray CDECL( BYVAL Addr AS INTEGER, BYVAL Seg AS INTEGER )
.
.
.
CALL CNearArray( VARPTR( ARRAY%(0) ) )
CALL CFarArray( VARPTR( ARRAY%(0) ), VARSEG( ARRAY%(0) ) )

The C functions receiving  ARRAY  can be declared as follows:

_cdecl CNearArray( int * array );
_cdecl CFarArray( int far * array );

The routine that receives the array must not make a call back to BASIC. If
it does, the location of the array data could change, and the address that
was passed to the routine would become meaningless.

If you only need to pass one member of the array from BASIC to your C
function, you can pass it by value as follows:

CALL CFunc( ARRAY%(8) )

12.9.5  Array Declaration and Indexing

Languages differ in the way that arrays are declared and indexed. Array
indexing is a source-level consideration and involves no transformation of
data. There are two differences in the way elements are indexed by each
language:

1.  The value of the lower array bound is different among Microsoft
languages.

By default, FORTRAN indexes the first element of an array as 1. BASIC
and C index it as 0. Pascal lets you begin indexing at any integer
value. Recent versions of BASIC and FORTRAN also give you the option
of specifying lower bounds at any integer value.

2.  Some languages vary subscripts in row-major order; others vary
subscripts in column-major order.

This issue only affects arrays with more than one dimension. With
row-major order (used by C and Pascal), the rightmost dimension
changes first. With column-major order (used by FORTRAN, and BASIC by
default), the leftmost dimension changes first. Thus, in C, the first
four elements of an array declared as  X[3][3]  are

X[0][0]    X[0][1]    X[0][2]    X[1][0]

In FORTRAN, the four elements are

X(1,1)     X(2,1)     X(3,1)     X(1,2)

The C and FORTRAN arrays shown above illustrate the difference between
row-major and column-major order as well as the difference in the
assumed lower bound between C and FORTRAN. Table 12.6 shows
equivalences for array declarations in each language. In this table, r
is the number of elements of the row dimension (which changes most
slowly), and c is the number of elements of the column dimension
(which changes most quickly).

Table 12.6  Equivalent Array Declarations

╓┌─────────┌────────────────────────────────┌────────────────────────────────╖
Language  Array Declaration                Notes
────────────────────────────────────────────────────────────────────────────
BASIC     DIM x(c-1, r-1)                  With default lower bounds of 0

C         type x[r][c]                     When passed by reference
          struct { type x[r][c]; } x       When passed by value

FORTRAN   type x(c, r)                     With default lower bounds of 1

Pascal    x : ARRAY [a..a+r-1, b..b+c-1]
          OF type

────────────────────────────────────────────────────────────────────────────

The order of indexing extends to any number of dimensions you declare. For
example, the C declaration

int arr1[2][10][15][20];

is equivalent to the FORTRAN declaration

INTEGER*2 ARR1( 20, 15, 10, 2 )

The constants used in a C array declaration represent dimensions, not upper
bounds as they do in other languages. Therefore, the last element in the C
array declared as  int arr[5][5]  is  arr[4][4], not  arr[5][5].
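
The relationship between the two orderings can be checked with a little
arithmetic. The sketch below computes the position of an element, counted in
elements from the start of the array, for a two-dimensional C array and for
a two-dimensional FORTRAN array with default lower bounds of 1. The function
names are illustrative only.

```c
#include <stddef.h>

/* Position of x[i][j] in a C array declared x[rows][cols].
 * Row-major order: the rightmost subscript varies fastest. */
size_t row_major_offset(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

/* Position of X(i,j) in a FORTRAN array declared X(dim1, dim2) with
 * lower bounds of 1. Column-major order: the leftmost subscript
 * varies fastest. */
size_t col_major_offset(size_t i, size_t j, size_t dim1)
{
    return (j - 1) * dim1 + (i - 1);
}
```

For the 3-by-3 arrays shown earlier, X[1][0] in C and X(1,2) in FORTRAN both
lie at position 3, the fourth element. More generally, element [i][j] of a C
array x[r][c] occupies the same position as element (j+1, i+1) of a FORTRAN
array x(c, r).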

12.9.6  Structures, Records, and User-Defined Types

The C struct type, the BASIC user-defined type, the FORTRAN record (defined
with the STRUCTURE keyword), and the Pascal record type are equivalent.
Therefore, these data types can be passed between C, FORTRAN, Pascal, and
BASIC.

These types can be affected by the storage method. By default, C, FORTRAN,
and Pascal use word alignment for all types except those shorter than one
word (type char and unsigned char). This storage method specifies that
occasional bytes can be inserted as padding so that word and double-word
objects start on an even boundary. (In addition, all nested structures and
records start on a word boundary.)

If you are passing a structure or record across a mixed-language interface,
your calling routine and called routine must agree on the storage method and
parameter-passing convention. Otherwise, data will not be interpreted
correctly.

Because Pascal, FORTRAN, and C use the same storage method for structures
and records, you can interchange data between routines without taking any
special precautions unless you modify the storage method. Make sure the
storage methods agree before interchanging data between C, FORTRAN, and
Pascal.

BASIC packs user-defined types, so your C function must also pack structures
(using the /Zp command-line option or the pack pragma) to agree.
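
The effect of the storage method can be observed with the offsetof macro.
The sketch below is written for current compilers: the push and pop forms of
the pack pragma are assumed to be available, as they are in most current
compilers, and a short is assumed to occupy two bytes. The structure and
function names are illustrative.

```c
#include <stddef.h>

#pragma pack(push, 1)           /* byte packing, as BASIC requires */
struct packed_rec {
    char  tag;
    short count;                /* follows tag immediately */
};
#pragma pack(pop)

#pragma pack(push, 2)           /* word alignment */
struct aligned_rec {
    char  tag;
    short count;                /* preceded by one byte of padding */
};
#pragma pack(pop)

size_t packed_count_offset(void)
{
    return offsetof(struct packed_rec, count);
}

size_t aligned_count_offset(void)
{
    return offsetof(struct aligned_rec, count);
}
```

Under these assumptions the count member lies at offset 1 in the packed
structure and at offset 2 in the word-aligned one, so a routine that expects
one layout will misread data written with the other.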

You can pass structures as parameters by value or by reference. Both the
calling program and the called program must agree on the parameter-passing
convention. See Section 12.2.3, "Parameter-Passing Requirement," for more
information about the language you are using.

12.9.7  External Data

External data refers to data that is both static and public; that is, the
data is stored in a set place in memory as opposed to being allocated on the
stack, and the data is visible to other modules.

External data can be defined in C, Pascal, and assembly language. Note that
a data definition is distinct from an external declaration. A data
definition causes a compiler to create a data object; an external
declaration informs a compiler that the object is to be found in another
module. FORTRAN can only define external data in COMMON blocks. (See Section
12.9.9, "Common Blocks," for more information about sharing external data
with FORTRAN programs.)

There are three requirements for programs that share external data between
languages:

1.  One of the modules must define the data.

You can define a static data object in a C module by defining a data
object outside all functions. (If you use the static keyword in C,
however, the data object will not be made public.)

2.  The other modules that will access the data must declare the data as
external.

In C, you can declare data as external by using an extern declaration,
similar to the extern declaration for functions. In FORTRAN and
Pascal, you can declare data as external by adding the EXTERN
attribute to the data declaration.

3.  Resolve naming-convention differences.

In C, you can adopt the FORTRAN/Pascal naming convention by applying
_fortran or _pascal to the data declaration. In FORTRAN and Pascal,
you can adopt the C naming convention by applying the C attribute to
the data declaration.

12.9.8  Pointers and Address Variables

Rather than passing data directly, you may want to pass the address of a
piece of data. Passing the address amounts to passing the data by reference.
In some cases, such as in BASIC arrays, there is no other way to pass a data
item as a parameter.

C programs always pass array variables by address. All other types are
passed by value unless you use the address-of (&) operator to obtain the
address of the data item.

The Pascal ADR and ADS types are equivalent to near and far pointers,
respectively, in C. You can pass ADR and ADS variables as ADRMEM or ADSMEM.
BASIC and FORTRAN do not have formal address types. However, they do provide
ways for storing and passing addresses.

BASIC programs can access a variable's segment address with the VARSEG
function and its offset address with the VARPTR function. The values
returned by these intrinsic functions should then be passed or stored as
ordinary integer variables. If you pass them to another language, pass by
value. Otherwise you will be attempting to pass the address of the address,
rather than the address itself.

To pass a near address, pass only the offset; if you need to pass a far
address, you may have to pass the segment and the offset separately. Pass
the segment address first, unless CDECL is in effect.

FORTRAN programs can determine near and far addresses with the LOC and
LOCFAR functions. Store the result of the LOC function as INTEGER*2 and the
result of the LOCFAR function as INTEGER*4.

As with BASIC, if you pass the result of LOC or LOCFAR to another language,
be sure to pass by value.
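
A far address held in a four-byte integer, such as the result of LOCFAR, can
be taken apart and reassembled with shifts and masks. The sketch below
assumes that the segment occupies the high-order word and the offset the
low-order word, the same layout that Table 12.3 gives for a _far pointer
returned in DX:AX. The function names are illustrative, and real_mode_linear
applies only to real-mode MS-DOS, where the segment value is a paragraph
number; in OS/2 protected mode the segment is a selector and this arithmetic
does not apply.

```c
#include <stdint.h>

/* Combine a segment and an offset into one 32-bit value, with the
 * segment in the high-order word (assumed layout). */
uint32_t make_far_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Recover the two halves of a packed far address. */
uint16_t far_segment(uint32_t far_address)
{
    return (uint16_t)(far_address >> 16);
}

uint16_t far_offset(uint32_t far_address)
{
    return (uint16_t)(far_address & 0xFFFFu);
}

/* Real mode only: the 20-bit physical address is segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

When the segment and offset are passed as two separate integers instead, the
order in which they are pushed determines whether the receiving routine can
treat them as a single far pointer.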

12.9.9  Common Blocks

You can pass individual members of a FORTRAN or BASIC common block in an
argument list, just as you can any data item. However, you can also give a
different language module access to the entire common block at once.

C modules can reference the items of a common block by first declaring a
structure with fields that correspond to the common-block variables. Having
defined a structure with the appropriate fields, the C module must then
connect with the common block itself. The next two sections present methods
for making this connection.

Passing the Address of a Common Block

To pass the address of a common block, simply pass the address of the first
variable in the block. (In other words, pass the first variable by
reference.) The receiving C module should expect to receive a structure by
reference.

In the example below, the C function  initcb  receives the address of the
variable  N, which it considers to be a pointer to a structure with three
fields:

C      FORTRAN SOURCE CODE
C
COMMON /CBLOCK/N, X, Y
INTEGER*2 N
REAL*8    X, Y
.
.
.
CALL INITCB( N )

/* C source code */

/* Explicitly set structure packing to word-alignment */
#pragma pack( 2 )

struct block_type
{
int     n;
double  x;
double  y;
};

initcb( struct block_type * block_hed )
{
block_hed->n = 1;
block_hed->x = 10.0;
block_hed->y = 20.0;
}

Accessing Common Blocks Directly

You can access FORTRAN common blocks directly by defining a structure with
the appropriate fields and then using the methods described in Section
12.9.7, "External Data." Here is an example of accessing common blocks
directly:

struct block_type
{
int    n;
double x;
double y;
};

extern struct block_type fortran cblock;

You cannot access BASIC common blocks directly.

Note that the technique of accessing common blocks directly works with
FORTRAN common blocks, but not with BASIC common blocks. If your C module
must work with both FORTRAN and BASIC common blocks, pass the address of the
common block as a parameter to the function.

12.9.10  Using a Varying Number of Parameters

Some C functions (for example printf) accept a variable number of
parameters. To call such a function from another language, you need to
suppress the type-checking that normally forces a call to be made with a
fixed number of parameters. In BASIC, you can remove this type-checking by
omitting a parameter list from the DECLARE statement. In FORTRAN or Pascal,
you can call routines with a variable number of parameters by including the
VARYING attribute in your interface to the routine, along with the C
attribute. You must use the C attribute because a variable number of
parameters is feasible only with the C calling convention.

Chapter 13  Writing Portable Programs
────────────────────────────────────────────────────────────────────────────

Because C compilers exist on a variety of computers, some C applications
developed for one computer system can be ported to other systems. However,
some aspects of language behavior depend on how a particular C compiler is
implemented and how a specific computer operates. Therefore, when designing
a program to be ported to another system, it is important that you examine
programming assumptions.

This chapter describes programming assumptions that can affect writing
portable programs.

The American National Standards Institute Standard for the C Language (the
ANSI Standard) details every instance where language behavior is defined by
the implementation. Appendix C summarizes implementation-defined behavior
for Microsoft C.

13.1  Assumptions about Hardware

To make C programs portable, you must examine two aspects of your code:
hardware assumptions and compiler dependency. This section deals with
hardware assumptions. Section 13.2, "Assumptions about the Compiler," deals
with compiler dependency.

13.1.1  Size of Basic Types

In C, the size of basic types (char, signed int, unsigned int, float,
double, and long double) is implementation-defined, so relying on a
particular data type to be a given size reduces the portability of a
program.

Don't make assumptions about the size of data types.

Because the size of basic types is left to the implementation, do not make
assumptions about the size or alignment of data types within aggregate
types. Use only the sizeof operator to determine the size or amount of
storage required for a variable or a type.

Following are some rules governing the size of data types.

Type char

Type char is the smallest of the basic types, but it must be large enough to
hold any of the characters in the implementation's basic character set.
Normally, variables of type char are one byte.

Type int and Type short int

Type int and type short int often correspond to the register size of the
target machine. Both int and short are greater than or equal to the size of
type char but less than or equal to the size of type long.

If you assume that type int is a certain size, your code may not be portable
because

■   An int can be defined as a 16-bit (two-byte) or a 32-bit quantity.

■   An int is not always large enough to hold array indexes. For large
arrays, you must use unsigned int; for extremely large arrays, use
long. To be certain your code is portable, define your array indexes
as type size_t. You may not know, before porting your code, the
maximum value to expect an array index of type int to hold. The file
LIMITS.H contains manifest constants, listed below, for the maximum
and minimum values of each basic integral type.

Constant                          Value
────────────────────────────────────────────────────────────────────────────
CHAR_BIT                          Number of bits in a variable of type
char

CHAR_MIN                          Minimum value a variable of type char
can hold

CHAR_MAX                          Maximum value a variable of type char
can hold

SCHAR_MIN                         Minimum value a variable of type signed
char can hold

SCHAR_MAX                         Maximum value a variable of type signed
char can hold

UCHAR_MAX                         Maximum value a variable of type
unsigned char can hold

SHRT_MIN                          Minimum value a variable of type short
can hold

SHRT_MAX                          Maximum value a variable of type short
can hold

USHRT_MAX                         Maximum value a variable of type
unsigned short can hold

INT_MIN                           Minimum value a variable of type int can
hold

INT_MAX                           Maximum value a variable of type int can
hold

UINT_MAX                          Maximum value a variable of type
unsigned int can hold

LONG_MIN                          Minimum value a variable of type long
can hold

LONG_MAX                          Maximum value a variable of type long
can hold

ULONG_MAX                         Maximum value a variable of type
unsigned long can hold
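
As a sketch of how these constants might be used, the function below checks a
value of type long against INT_MIN and INT_MAX before narrowing it to type
int. The name long_to_int and its return convention are assumptions made for
illustration; they are not part of the run-time library.

```c
#include <limits.h>

/* Hypothetical helper: store value in *result only if it fits in an int.
   Returns 1 on success, 0 if the value is out of range for type int. */
int long_to_int( long value, int *result )
{
    if( value < (long)INT_MIN || value > (long)INT_MAX )
        return( 0 );
    *result = (int)value;
    return( 1 );
}
```

Because the range test uses the constants from LIMITS.H rather than literal
values such as 32767, the same source works whether an int is 16 or 32 bits.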

Type float, Type double, and Type long double

Type float is the smallest of the basic floating-point types. Type double is
usually larger than type float, and type long double is usually the largest
of the floating-point types. You can make only these portability assumptions:

■   Any value that can be represented as type float can be represented as
type double (type float is a subset of type double).

■   Any value that can be represented as type double can be represented as
type long double (type double is a subset of type long double).

The file FLOAT.H contains manifest constants, listed below, for the maximum
and minimum values of each basic floating-point type.

Constant                          Value
────────────────────────────────────────────────────────────────────────────
DBL_DIG                           Number of decimal digits of precision a
variable of type double can hold

DBL_MAX                           Maximum value a variable of type double
can hold

DBL_MAX_10_EXP                    Maximum value (base 10) the exponent of
a variable of type double can hold

DBL_MAX_EXP                       Maximum value (base 2) the exponent of a
variable of type double can hold

DBL_MIN                           Minimum positive value a variable of
type double can hold

DBL_MIN_10_EXP                    Minimum value (base 10) the exponent of
a variable of type double can hold

DBL_MIN_EXP                       Minimum value (base 2) the exponent of a
variable of type double can hold

FLT_DIG                           Number of decimal digits of precision a
variable of type float can hold

FLT_MAX                           Maximum value a variable of type float
can hold

FLT_MAX_10_EXP                    Maximum value (base 10) the exponent of
a variable of type float can hold

FLT_MAX_EXP                       Maximum value (base 2) the exponent of a
variable of type float can hold

FLT_MIN                           Minimum positive value a variable of
type float can hold

FLT_MIN_10_EXP                    Minimum value (base 10) the exponent of
a variable of type float can hold

FLT_MIN_EXP                       Minimum value (base 2) the exponent of a
variable of type float can hold

LDBL_DIG                          Number of decimal digits of precision a
variable of type long double can hold

LDBL_MAX                          Maximum value a variable of type long
double can hold

LDBL_MAX_10_EXP                   Maximum value (base 10) the exponent of
a variable of type long double can hold

LDBL_MAX_EXP                      Maximum value (base 2) the exponent of a
variable of type long double can hold

LDBL_MIN                          Minimum positive value a variable of
type long double can hold

LDBL_MIN_10_EXP                   Minimum value (base 10) the exponent of
a variable of type long double can hold

LDBL_MIN_EXP                      Minimum value (base 2) the exponent of a
variable of type long double can hold
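
The sketch below shows one way the FLOAT.H constants might be used: it tests
whether a value of type double can be stored in a float without overflowing.
The function name fits_in_float is hypothetical, and the test covers only
magnitude, not loss of precision.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if the magnitude of d can be stored
   in a float without overflowing, 0 otherwise. */
int fits_in_float( double d )
{
    if( d < 0.0 )
        d = -d;
    return( d <= (double)FLT_MAX );
}
```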

Microsoft C Type Sizes

Table 13.1 summarizes the size of the basic types in Microsoft C.

Table 13.1  Size of Basic Types in Microsoft C

Type                                  Number of Bytes
────────────────────────────────────────────────────────────────────────────
char, unsigned char                   1

int, short, unsigned int,             2
unsigned short

near pointer                          2

long, unsigned long                   4

far pointer                           4

float                                 4

double                                8

long double                           10

────────────────────────────────────────────────────────────────────────────

13.1.2  Storage Order and Alignment

The C language does not define any specific layout for the storage of data
items relative to one another. The layout for storage of structure elements,
or unions within a structure or union, is defined by the implementation.

Some processors require that data longer than one byte be word-aligned
(aligned to an even-byte address). Other processors, such as the 80x86
family, do not have such a restriction.

Structure Order and Alignment

The example below illustrates how alignment can affect your program. In the
example, a structure is cast to type long because the programmer knew the
order in which a particular implementation stored data.

/* Nonportable code */
struct time
{
char hour;     /* 0 <= hour < 24   -- fits in a char */
char minute;   /* 0 <= minute < 60 -- fits in a char */
char second;   /* 0 <= second < 60 -- fits in a char */
};

.
.
.
struct time now, alarm_time;
.
.
.
if ( (long)now >= (long)alarm_time )
{
/* sound an alarm */
}

The preceding code makes these nonportable assumptions:

■   The data for  hour  will be stored in a higher order position than
minute  or  second. Because C does not guarantee storage order or
alignment of structures or unions, the code may not be portable to
other machines.

■   Three variables of type char will be shorter than or the same length
as a variable of type long. Thus, the code is not portable according
to the rules governing the size of basic types, as described in
Section 13.1.1.

If either of these assumptions proves false, the comparison (if statement)
is invalid.

You can write code that makes no assumptions about storage order.

To make the program in the preceding example portable, you can break the
comparison between the two long integers into a component-by-component
comparison. This technique is illustrated in the following example:

/* Portable code */
struct time
{
char hour;     /* 0 <= hour < 24   -- fits in a char */
char minute;   /* 0 <= minute < 60 -- fits in a char */
char second;   /* 0 <= second < 60 -- fits in a char */
};

.
.
.
struct time now, alarm_time;
.
.
.
if ( time_cmp( now, alarm_time ) >= 0 )
{
/* sound an alarm */
}
.
.
.

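/* Returns a negative value, zero, or a positive value when t1 is
 * earlier than, the same as, or later than t2.
 */
int time_cmp( struct time t1, struct time t2 )
{
if( t1.hour != t2.hour )
return( t1.hour - t2.hour );
if( t1.minute != t2.minute )
return( t1.minute - t2.minute );
return( t1.second - t2.second );
}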

Union Order and Alignment

Programmers use unions most often for two purposes: to store data whose
exact type is not known until run time or to access the same data in
different ways.

Unions falling into the second category are usually not portable. For
example, the union below is not portable:

union tag_u
{
char bytes_in_long[4];
long a_long;
};

The intent of the union above is to access the individual bytes of a
variable of type long. However, the union may not work as intended when
ported to other computers because

■   It relies on a constant size for type long.

■   It may assume byte ordering within a variable of type long. (Byte
ordering is described in detail in Section 13.1.3, "Byte Order in a
Word.")

The first problem can be addressed by coding the union as follows:

union tag_u
{
char bytes_in_long[sizeof( long ) / sizeof( char )];
long a_long;
};

Note the use of the sizeof operator to determine the size of a data type.
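
The second problem, byte ordering, can be avoided by extracting the bytes
arithmetically instead of through a union. The following is a minimal sketch;
the function name long_byte is an assumption made for illustration.

```c
#include <limits.h>

/* Hypothetical helper: return byte number n of value, where byte 0 is the
   least-significant byte. The result does not depend on how the machine
   orders bytes in memory. n must be less than sizeof( unsigned long ). */
unsigned char long_byte( unsigned long value, unsigned n )
{
    return( (unsigned char)( ( value >> ( n * CHAR_BIT ) ) & UCHAR_MAX ) );
}
```

With eight-bit bytes, long_byte( 0x12345678UL, 0 ) yields 0x78 regardless of
the machine's byte order, because shifts operate on the value rather than on
its layout in memory.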

13.1.3  Byte Order in a Word

The order of bytes within a word (int or short) or a double word (long) can
vary among machines. Code that assumes an internal order is not portable, as
shown by this example:

/*
* Nonportable structure to access an
* int in bytes.
*/
struct tag_int_bytes
{
char lobyte;
char hibyte;
};

A more portable way to access the individual bytes in a word is to define
two macros that rely on the constant CHAR_BIT, defined in LIMITS.H:

#define LOBYTE(a) (char)((a) & 0xff)
#define HIBYTE(a) (char)((unsigned)(a) >> CHAR_BIT)

The LOBYTE macro is still not completely portable. It assumes that a char is
eight bits long, and it uses the constant  0xff  to mask the high-order
eight bits. Because portable programs cannot rely on a given number of bits
in a byte, consider the revision below:

#define LOBYTE(a) (char)((a) & ((unsigned)~0>>CHAR_BIT))
#define HIBYTE(a) (char)((unsigned)(a) >> CHAR_BIT)

The new LOBYTE macro performs a bitwise complement on 0; that is, all zero
bits are turned into ones. It then takes that unsigned quantity and shifts
it right far enough to create a mask of the correct length for the
implementation.
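
Note that the shifted mask has the width of one byte only where an unsigned
int is exactly two bytes wide. As an alternative sketch, the mask can be taken
from UCHAR_MAX (defined in LIMITS.H). The macro names LOW_BYTE and HIGH_BYTE
are hypothetical and are used here only to avoid confusion with the macros
above.

```c
#include <limits.h>

/* Hypothetical alternatives to LOBYTE and HIBYTE. UCHAR_MAX is a mask of
   exactly CHAR_BIT one bits, whatever the width of an int. HIGH_BYTE
   yields the byte immediately above the low-order byte. */
#define LOW_BYTE(a)   ( (unsigned char)( (unsigned)(a) & UCHAR_MAX ) )
#define HIGH_BYTE(a)  ( (unsigned char)( ( (unsigned)(a) >> CHAR_BIT ) & UCHAR_MAX ) )
```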

The following code assumes that the order of bytes in a word will be
least-significant first:

int c;
.
.
.
fread( &c, sizeof( char ), 1, fp );

The code attempts to read one byte as an int, without converting it from a
char. However, the code will fail in any implementation where the low-order
byte is not the first byte of an int. The following solution is more
portable. In the example below, the data is read into an intermediate
variable of type char before being assigned to the integer variable.

int c;
char ch;
.
.
.
fread( &ch, sizeof( char ), 1, fp );
c = ch;

The example below shows how to use the C run-time function fgetc to return
the value. The fgetc function returns the character as a value of type int
(or EOF at end of file or on error), so the result can be assigned directly
to a variable of type int.

int c;
.
.
.
c = fgetc( fp );

Microsoft C Specific

Microsoft C normally aligns data types longer than one byte to an even-byte
address for improved performance. See the /Zp compiler option and the pack
pragma in the Microsoft C Reference and in on-line help for information
about controlling structure packing in Microsoft C.

13.1.4  Reading and Writing Structures

Many C programs read data from disk into structures and write data to disk
from structures. The functions that perform disk I/O in C require you to
specify the number of bytes to be transferred. You should always use the
sizeof operator to obtain the size of the data to be read or written because
differing data type sizes or alignment schemes may alter the size of a given
structure. For example,

fread( &my_struct, sizeof(my_struct), 1, fp );
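
The sketch below wraps this idea in a pair of functions that also check the
count returned by fwrite and fread. The structure record and the function
names are hypothetical; only the use of sizeof comes from the text above.

```c
#include <stdio.h>

struct record            /* hypothetical record layout */
{
    int  id;
    long balance;
};

/* Write one record; returns 1 on success, 0 on failure. */
int save_record( const struct record *rec, FILE *fp )
{
    return( fwrite( rec, sizeof( *rec ), 1, fp ) == 1 );
}

/* Read one record; returns 1 on success, 0 on failure or end of file. */
int load_record( struct record *rec, FILE *fp )
{
    return( fread( rec, sizeof( *rec ), 1, fp ) == 1 );
}
```

A file written this way can be read back reliably only by a program built
with the same type sizes, byte order, and structure packing; sizeof keeps the
source portable, not the data file.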

Microsoft C Specific

When performing disk input and output in Microsoft C, structures may be
different sizes depending on the structure-packing option you have selected
(see the /Zp compiler option and the pack pragma in the Microsoft C
Reference).

13.1.5  Bit Fields in Structures

The Microsoft C compiler implements bit fields. However, many C compilers
do not.

Bit fields allow you to access the individual bits within a data item. While
the practice of accessing the bits in a data item is inherently nonportable,
you can improve your chances of porting a program that uses bit fields if you
make no assumptions about order of assignment, or size and alignment of bit
fields.

Order of Assignment

The order of assignment of bit fields in memory is left to the
implementation, so you cannot rely on a particular entry in a bit field
structure to be in a higher order position than another. (This problem is
similar to the portability constraint imposed by alignment of basic data
types in structures. The C language does not define any specific layout for
the storage of data items relative to one another.) See Section 13.1.2,
"Storage Order and Alignment" for more information.

Size and Alignment of Bit Fields

The Microsoft C compiler supports bit fields up to the size of the type
long. Each individual member of the bit field structure can be up to the
size of the declared type. Some compilers do not support bit field-structure
elements that are longer than type int.

The example below defines a bit field,  short_bitfield, that is shorter than
type int:

struct short_bitfield
{
unsigned usr_bkup : 1; /* 0 <= usr_bkup < 2  */
unsigned usr_sec  : 4; /* 0 <= usr_sec  < 16 */
};

The example below defines a bit field,  long_bitfield, that has elements
longer than type int:

struct long_bitfield
{
unsigned long disk_pos : 22;  /* 0 <= disk_pos   < 4,194,304 */
unsigned long rec_no   : 10;  /* 0 <= rec_no < 1,024 */
};

The bit field  short_bitfield  is likely to be supported by more
implementations than  long_bitfield.
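
Where a compiler's bit-field support is in doubt, the same packing can be
done with shifts and masks on an unsigned long, which is at least 32 bits
wide in any ANSI implementation. The sketch below is one possible
arrangement; the function names, the macro names, and the choice to put
rec_no in the low-order bits are assumptions made for illustration.

```c
/* Hypothetical layout: rec_no in the low 10 bits, disk_pos in the next 22. */
#define REC_NO_BITS    10
#define REC_NO_MASK    0x3FFUL        /* 10 one bits */
#define DISK_POS_MASK  0x3FFFFFUL     /* 22 one bits */

/* Combine the two fields into one unsigned long. */
unsigned long pack_entry( unsigned long disk_pos, unsigned long rec_no )
{
    return( ( ( disk_pos & DISK_POS_MASK ) << REC_NO_BITS )
            | ( rec_no & REC_NO_MASK ) );
}

/* Recover the disk position from a packed entry. */
unsigned long entry_disk_pos( unsigned long entry )
{
    return( ( entry >> REC_NO_BITS ) & DISK_POS_MASK );
}

/* Recover the record number from a packed entry. */
unsigned long entry_rec_no( unsigned long entry )
{
    return( entry & REC_NO_MASK );
}
```

Because the program itself fixes the position of each field, the layout of
the packed value does not depend on how a compiler assigns bit fields.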

Microsoft C Specific

The example below introduces another portability issue: alignment of data
defined in bit fields. The Microsoft C compiler does not allow an element in
a structure to extend across two words. The first two elements,  day  and
month, take up nine bits. The third,  year, would extend across a word
boundary, so it must begin on the next word boundary.

struct long_bitfield
{
unsigned int day    : 5;  /* 0 <= day   < 32 */
unsigned int month  : 4;  /* 0 <= month < 16 */
unsigned int year   : 11; /* 0 <= year  < 2048 */
};

Figure 13.1 illustrates the example above.

(This figure may be found in the printed book.)

Other compilers may not use the same storage techniques.

13.1.6  Processor Arithmetic Mode

Two types of arithmetic are common on digital computers: one's-complement
arithmetic and two's-complement arithmetic. Some programs assume that all
target computers perform two's-complement arithmetic. If you take advantage
of the fact that a given operation causes a particular bit pattern to be set
on either a one's-complement or two's-complement computer, your program will
not be portable. For example, two's-complement machines represent the
eight-bit integer value -1 as a binary 11111111. A one's-complement machine
represents the same decimal value (-1) as 11111110. Some programmers assume
that -1 will fill a byte or a word with ones, and use it to construct a mask
template that they later shift. This will not work correctly on
one's-complement machines, but the error will not surface until the
least-significant bit is used.

In two's-complement arithmetic, there is only one value that represents
zero. In one's-complement arithmetic, there is a value for zero and a value
for negative zero. Use the C relational operators to handle this anomaly
correctly; if you write code that deliberately circumvents the C relational
operators, tests for zero or NULL may not operate correctly.
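
A mask can be built without relying on the bit pattern of -1 by complementing
an unsigned zero. The sketch below assumes nothing about the arithmetic mode;
the function name low_bits_mask is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: return a mask with the low nbits bits set.
   ~0U is all one bits in an unsigned int regardless of how the machine
   represents negative numbers. */
unsigned low_bits_mask( unsigned nbits )
{
    if( nbits >= sizeof( unsigned ) * CHAR_BIT )
        return( ~0U );          /* shifting by the full width is undefined */
    return( ~( ~0U << nbits ) );
}
```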

Microsoft C Specific

Microsoft C produces code only for the Intel 80x86 processors, which all
perform two's-complement arithmetic.

13.1.7  Pointers

One of the most powerful but potentially dangerous features of the C
language is its use of indirect addressing through pointers. Bugs introduced
by misusing pointers can be difficult to detect and isolate because the
error often corrupts memory unpredictably.

Casting Pointers

Be sure you do not make nonportable assumptions when casting pointers to
different types.

/* Nonportable coercion */
char c[4];
long *lp;

lp = (long *)c;
*lp = 0x12345678L;

This code is nonportable because using a cast to change an array of char to
a pointer to type long assumes a particular byte-ordering scheme. This is
discussed in greater detail in Section 13.1.3, "Byte Order in a Word."

Pointer Size

A pointer can be assigned (or cast) to any integer type large enough to hold
it, but the size of the integer type depends on the machine and the
implementation. (In fact, it can even depend on the memory model.)
Therefore, you cannot assume:

sizeof( char * ) == sizeof( int )

To determine the size of any unmodified data pointer, use

sizeof( void * )

which gives the size of a generic data pointer.

Pointer Subtraction

Code that assumes that pointer subtraction yields an int value is
nonportable. Pointer subtraction yields a result of type ptrdiff_t (defined
in STDDEF.H). Portable code must always use variables of type ptrdiff_t for
storing the result of pointer subtraction.
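
A minimal sketch of this rule follows. The function name element_count is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements from first up to (but not
   including) last. Both pointers must point into the same array. */
ptrdiff_t element_count( const int *first, const int *last )
{
    return( last - first );
}
```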

The Null Pointer

In most implementations, NULL is defined as 0. In Microsoft C, it is defined
as  ((void *)0). Because code pointers and data pointers are often different
sizes, using 0 for the null pointer for both can lead to nonportability. The
difference in size between code pointers and data pointers will cause
problems for functions that expect pointer arguments longer than an int. To
avoid these problems, use the null pointer, as defined in the include file
STDDEF.H; use prototypes; or explicitly cast NULL to the correct data type.
Here is a portable way to use the null pointer:

/* Portable use of the NULL pointer */
main()
{
func1( (char *)NULL );
func2( (void *(*)())NULL );
}

void func1( char * c )
{
}

void func2( void *(* func)() )
{
}

The invocations of  func1  and  func2  explicitly cast NULL to the correct
size. In the case of  func1, NULL is cast to type char *; in the case of
func2, it is cast to a pointer to a function that returns type void *.
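
The prototype approach mentioned above can be sketched as follows. When a
prototype is in scope, the compiler converts NULL to the declared parameter
type, so the call needs no cast. All function names here are hypothetical.

```c
#include <stddef.h>

/* Prototypes placed before the calls tell the compiler the parameter types. */
int is_null_data( char *p );
int is_null_code( void (*f)( void ) );

int is_null_data( char *p )
{
    return( p == NULL );
}

int is_null_code( void (*f)( void ) )
{
    return( f == NULL );
}

/* With the prototypes in scope, NULL is converted to each parameter's
   pointer type, so no explicit cast is needed at the call. */
int both_null( void )
{
    return( is_null_data( NULL ) && is_null_code( NULL ) );
}
```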

Microsoft C Specific

Subtraction of pointers to huge arrays that have more than 32,767 elements
may yield a long result. The _huge keyword is implementation-defined by
Microsoft C and is not portable. Here is how to subtract pointers to huge
arrays:

char _huge *a;
char _huge *b;
long        d;
.
.
.
d = (long)( a - b );

In Microsoft C, the memory model selected and the special keywords _near,
_far, and _huge can change the size of a pointer. The Microsoft memory
models and extended keywords are nonportable, but you should be aware of
their effects.

Sizes of generic pointers and default pointer sizes are shown in Tables 13.2
and 13.3, respectively.

Table 13.2  Size of Generic Pointers

Declaration    Name                   Size
────────────────────────────────────────────────────────────────────────────
void _near *   Generic near pointer   16 bits
void _far *    Generic far pointer    32 bits
void _huge *   Generic huge pointer   32 bits
────────────────────────────────────────────────────────────────────────────

Table 13.3  Default Pointer Sizes

Memory Model  Code Pointer Size  Data Pointer Size
────────────────────────────────────────────────────────────────────────────
Tiny          16 bits            16 bits
Small         16 bits            16 bits
Medium        32 bits            16 bits
Compact       16 bits            32 bits
Large         32 bits            32 bits
Huge          32 bits            32 bits
────────────────────────────────────────────────────────────────────────────

13.1.8  Address Space

The amount of available memory and the address space vary among systems,
depending on many factors outside your control. A program designed with
portability in mind should handle insufficient-memory situations. To ensure
that your program handles these situations, you should always check the
error return from any of the dynamic memory allocation routines, such as
malloc, calloc, strdup, and realloc.

These situations occur not only because of a lack of installed memory but
also because too many other applications are using memory. For example,

■   Installed resident software can cause your program to fail. In DOS,
these programs are usually device drivers or
terminate-and-stay-resident (TSR) utilities.

■   An event or combination of events in a multitasking operating system
such as OS/2 or XENIX can cause your program to fail. These failures
are complex and difficult to predict. Here is an example: the user has
installed a daemon to "pop up" every so often and check the system
status. The user is running your application along with enough other
large applications to cause a critical shortage of memory. When the
daemon pops up, your program may fail on a memory allocation request.

■   An application running under Windows can use an extraordinary amount
of the global heap and not return it to the free pool. This type of
behavior will cause Windows to deny a GlobalAlloc request.
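
The sketch below shows the kind of check recommended above: the return value
of malloc is tested before the memory is used. The function name copy_string
is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a string into newly allocated memory.
   Returns NULL, rather than writing through a bad pointer, if the
   allocation fails. The caller frees the result. */
char *copy_string( const char *src )
{
    size_t len = strlen( src ) + 1;      /* include the terminating null */
    char *dst = (char *)malloc( len );

    if( dst == NULL )
        return( NULL );                  /* insufficient memory */
    memcpy( dst, src, len );
    return( dst );
}
```

The caller must in turn test the value returned by copy_string, so that an
allocation failure is reported instead of causing a fault later.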

13.1.9  Character Set

The C language does not define the character set used in an implementation.
This means that any programs that assume the character set to be ASCII are
nonportable.

The only restrictions on the character set are these:

■   No character in the implementation's character set may be larger than
the size of type char.

■   Each character in the set must be represented as a positive value by
type char, whether it is treated as signed or unsigned. So, in the
case of the ASCII character set and an eight-bit char, the maximum
value is 127 (128 is a negative number when stored in a char
variable).

Character Classification

The standard C run-time support contains a complete set of
character-classification macros and functions. These functions are defined in
the CTYPE.H file and are guaranteed to be portable:

isalnum         isdigit         isprint         isupper
isalpha         isgraph         ispunct         isxdigit
iscntrl         islower         isspace

The following code fragment is not portable to implementations that do not
use the ASCII character set:

/* Nonportable */
if( c >= 'A' && c <= 'Z' )
/* uppercase alphabetic */

Instead, consider using this:

/* Portable */
if( isalpha(c) && isupper(c) )
/* uppercase alphabetic */

The first example above is nonportable, because it assumes that uppercase  A
is represented by a smaller value than uppercase  Z, and that no lowercase
characters fall between the values of  A  and  Z. The second example is
portable, because it uses the character classification functions to perform
the tests.

In a portable program, you should not perform any comparison on variables of
type char except strict equality (==). You cannot assume the character set
follows an increasing sequence; that may not be true on a different machine.
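
As a short sketch of the classification routines in use, the function below
counts uppercase letters without comparing character values. The name
count_upper is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: count the uppercase letters in a string. The cast
   to unsigned char keeps the argument in the range the CTYPE.H routines
   accept, even where type char is signed. */
int count_upper( const char *s )
{
    int n = 0;

    while( *s != '\0' )
    {
        if( isupper( (unsigned char)*s ) )
            n++;
        s++;
    }
    return( n );
}
```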

Case Translation

Translation of characters from upper- to lowercase or from lower- to uppercase
is called "case translation." The following example shows a coding technique
for case translation not portable to implementations using a non-ASCII
character set.

#define make_upper(c) ((c)&0xdf)
#define make_lower(c) ((c)|0x20)

This code takes advantage of the fact that, in the ASCII character set, you
can map uppercase to lowercase simply by changing the state of a single bit
(the bit with value 0x20). It is extremely efficient but nonportable. To write
portable code, use the case-translation macros toupper and tolower (defined
in CTYPE.H).
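
A minimal sketch of portable case translation follows. The function name
upcase_string is hypothetical; toupper is the standard routine named above.

```c
#include <ctype.h>

/* Hypothetical helper: convert a string to uppercase in place. Characters
   that have no uppercase form are left unchanged by toupper. */
void upcase_string( char *s )
{
    while( *s != '\0' )
    {
        *s = (char)toupper( (unsigned char)*s );
        s++;
    }
}
```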

13.2  Assumptions about the Compiler

Different compilers translate C source code into object code in different
ways. The ANSI draft standard for the C programming language defines how
many of these translations must be done; others are implementation-defined.

This section describes assumptions about how the compiler translates your C
code, which can make your programs nonportable. For a complete description
of how Microsoft C handles implementation-defined operations, see Appendix
C, "Implementation-Defined Behavior."

13.2.1  Sign Extension

"Sign extension" is the propagation of the sign bit to fill unoccupied space
when promoting to a more-significant type or when performing bitwise
right-shift operations.

Promotion from Shorter Types

Integral promotions from shorter types occur when you make an assignment,
perform arithmetic, perform a comparison, or perform an explicit cast.

The behavior of integral promotion is well defined, except for type char.
The implementation defines whether type char is treated as signed or
unsigned. The code fragment below is an example of promotion as a result of
assignment:

char c1 = -3;
int  i1;

i1 = c1;

In this example, the expected result of the assignment statement is that  i1
will be set to -3. If the implementation defines type char as unsigned,
however, sign extension will not occur, and  i1  will be 253 (on a
two's-complement machine).
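
When a program needs one particular behavior, it can say so with a cast to
signed char or unsigned char instead of relying on the default. The sketch
below assumes a two's-complement machine for the signed case; both function
names are hypothetical.

```c
/* Hypothetical helpers that make the intended conversion explicit. */

/* Always yields a value from 0 through UCHAR_MAX; no sign extension. */
int char_as_unsigned( char c )
{
    return( (int)(unsigned char)c );
}

/* Yields a negative value when the high-order bit of the character is
   set (on a two's-complement machine); the sign is extended. */
int char_as_signed( char c )
{
    return( (int)(signed char)c );
}
```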

Promotion can also occur as a result of a comparison of different types:

char c;

if( c == 0x80 )
.
.
.

This comparison will never evaluate as true on an implementation that
sign-extends char types but treats hexadecimal constants as unsigned. Use a
character constant of the form '\x80', or explicitly cast the constant to
type char to perform the comparison correctly.

The following comparison, which is an example of promotion as a result of a
cast, is also nonportable:

char c;
unsigned int u;

if( u == (unsigned)c )

There are two problems with this code:

■   The char type may be treated as signed or unsigned, depending on the
implementation.

■   If the char type is treated as signed, it can be converted to unsigned
in two different ways: the char value may first be sign-extended to
int, then converted to unsigned; or the char may be converted to
unsigned char, then zero-extended to int length.

It is always safe to compare a signed int with a char constant because C
requires all character constants to be positive.

Variables of type char are promoted to type int when passed as arguments to
a function. This will cause sign extension on some machines. Consider the
following code:

char c = 128;

printf( "%d\n", c );

On an implementation that treats an eight-bit char as signed, sign extension
typically causes this code to print -128; where char is unsigned, it prints
128.

Microsoft C Specific

Microsoft C allows you to treat type char as signed or unsigned. By default,
a char is considered signed, but if you change the default char type using
the /J compiler option, you can treat it as unsigned.

Bitwise Right-Shift Operations

Positive or unsigned integral types (char, short, int, and long) yield
positive or zero values after a right bitwise shift (>>) operation. For
example,

(char)120 >> 4

yields 7,

(unsigned char)240 >> 8

yields 0,

(int)500 >> 8

yields 1, and

(unsigned int)65535 >> 4

yields 4,095.

Negative-signed integral types yield implementation-defined values after a
bitwise right-shift operation. This means that you must know whether you
want to do a signed or unsigned shift, then code accordingly.

If you don't know how the implementation performs, you may get unexpected
results. For example,  (signed char)0x80 >> 3  yields 0xf0 if the
implementation performs sign extension on right bitwise shifts. If the
implementation does not perform the sign extension, the result is 0x10.

You can use right shifts to speed up division when the divisor can be
represented by powers of 2 and the dividend is positive. To maintain
portability, you should use the division operator.

To perform an unsigned shift, explicitly cast the data to an unsigned type.
To perform a shift that extends the sign bit, use the division operator as
follows: divide by 2^n (2 raised to the power n), where n is the number of
bits you want to shift.
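
Both techniques are sketched below. The function names are hypothetical, and
in each case count must be smaller than the number of bits in the type being
shifted.

```c
/* Hypothetical helpers: each states which behavior is wanted. */

/* Logical shift: vacated high-order bits are always filled with zeros. */
unsigned shift_right_logical( int value, unsigned count )
{
    return( (unsigned)value >> count );
}

/* Division by 2 raised to the power count; the sign of value is kept. */
long divide_by_power_of_2( long value, unsigned count )
{
    return( value / ( 1L << count ) );
}
```

For a negative value that is not an exact multiple of the divisor, division
and a sign-extending shift can round in different directions, so the two are
interchangeable only when the division is exact or the value is positive.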

13.2.2  Length and Case of Identifiers

Some implementations do not support long identifiers. Some allow only 6
characters, while others allow as many as 32. They may report each
identifier that exceeds the maximum length or truncate identifiers to a
given length. Truncation causes serious problems, especially if you have a
number of similarly named variables within the scope of a block of code,
such as the following:

double acct_receivable_30_days;
double acct_receivable_60_days;
double acct_receivable_90_days;
double current_interest_rate;

acct_receivable_30_days *= current_interest_rate;

If your target system retains only six significant characters, you will have
to rename all your  acct_receivable  variables.

Case sensitivity also affects portability. C is usually a case-sensitive
language. That is,  CalculateInterest  is not considered the same identifier
as calculateinterest. Some systems are not case sensitive, however, so to
write portable code, differentiate your identifiers by something other than
case.

These problems with identifiers can occur in two locations: the compiler and
the linker or loader. Even if the compiler can handle long and
case-differentiated identifiers, if the linker or loader cannot, you can get
duplicate definitions or other unexpected errors.

Microsoft C Specific

The Microsoft C compiler issues the /NOIGNORECASE command to the Microsoft
Segmented-Executable Linker (LINK), specifically instructing it to consider
the case of identifiers.

13.2.3  Register Variables

The number and type of register variables in a function depend on the
implementation. You can declare more variables as register than the number
of physical registers the implementation uses. In such a case, the compiler
treats the excess register variables as automatic.

Since the types that qualify for register class differ among
implementations, invalid register declarations are treated as automatic.

If you declare variables as register to optimize performance, declare them
in decreasing order of importance to ensure that the compiler allocates a
register to the most important variables.
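A short sketch of this ordering follows. The function sum_values and the ranking of its variables are assumptions made for illustration; which variables matter most depends on the program.

```c
/* Sums `count` integers starting at `values`.
   The register declarations appear in the order of priority chosen
   by the programmer: if the compiler runs out of registers, the
   declarations listed last are the ones treated as automatic. */
long sum_values(const int *values, int count)
{
    register const int *cursor = values;   /* first priority  */
    register int remaining = count;        /* second priority */
    register long total = 0;               /* third priority  */

    while (remaining-- > 0)
        total += *cursor++;
    return total;
}
```

The code behaves identically whether or not any register is actually assigned, which is what keeps it portable.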

Microsoft C Specific

The compiler ignores register declarations if you select the global register
allocation optimization. You can select global register allocation as
follows:

Environment                       Selection
────────────────────────────────────────────────────────────────────────────
CL command line                   Specify either the /Oe or /Ox option.

PWB                               Select the Global Register Allocation
option in the Debug Build Options or
Release Build Options dialog boxes.

pragma                            Use the optimize pragma with the e
parameter.

13.2.4  Functions with a Variable Number of Arguments

Functions that accept a variable number of arguments are not portable.
Although both the ANSI Standard and The C Programming Language specify how
to write these functions and how they behave, differences still exist among
compiler implementors about how to use variable argument lists.

Many UNIX(R) systems support a standard that differs from the ANSI Standard
for variable arguments. Although this may change, it currently presents a
portability concern.

Microsoft C run-time libraries and macros allow you to use whichever version
of variable argument support you expect to be most portable for your
application.
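As an illustration of the ANSI style, the sketch below uses the macros from the standard header stdarg.h. The function name sum_ints is an assumption made for the example. The older UNIX style is typically based on a different header (varargs.h) and a different way of declaring the function, so code written for one style does not compile unchanged under the other.

```c
#include <stdarg.h>

/* Returns the sum of the `count` int arguments that follow `count`.
   (Hypothetical function, ANSI stdarg.h style.) */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);           /* start after the last named parameter */
    while (count-- > 0)
        total += va_arg(args, int);  /* fetch the next int argument */
    va_end(args);
    return total;
}
```

A call such as sum_ints( 3, 1, 2, 3 ) passes the number of values first so the function knows how many arguments to read.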

13.2.5  Evaluation Order

The C language does not guarantee the evaluation order of most expressions.
Avoid writing constructs that depend on evaluation within an expression to
proceed in a particular manner. For example,

i = 0;
func( i++, i++ );
.
.
.
func( int a, int b )
{

A compiler could evaluate this code fragment and pass 0 as  a  and 1 as  b.
It could also pass 1 as  a  and 0 as  b  and conform equally with the
standards.
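One way to remove the dependency is to perform each increment in its own statement before the call. The sketch below assumes a version of func that simply records its arguments; the names seen_a, seen_b, and call_with_fixed_order are illustrative only.

```c
static int seen_a, seen_b;   /* record what func received */

void func(int a, int b)
{
    seen_a = a;
    seen_b = b;
}

void call_with_fixed_order(void)
{
    int i = 0;
    int first  = i++;   /* each initializer is a full expression, so  */
    int second = i++;   /* the first increment completes before this  */

    func(first, second);   /* always passes 0 as a and 1 as b */
}
```

Because the side effects are finished before the argument list is evaluated, every conforming compiler passes the same values.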

The C language does guarantee that an expression will be completely
evaluated at any given "sequence point." A sequence point is a point in the
syntax of the language at which all side effects of an expression or series
of expressions have been completed.

These are the sequence points in the C language:

1.  The semicolon (;) statement separator

2.  The call to a function after the arguments have been evaluated

3.  The end of the first operand of one of the following:

■   Logical AND (&&)

■   Logical OR (||)

■   Conditional (?)

■   Comma separator (,) when used to separate statements or in
expressions; the comma separator is not a sequence point when it
is used between variables in declaration statements or between
parameters in a function invocation

4.  The end of a full expression, such as

■   An initializer

■   The expression in an expression statement (for example, any
expression inside parentheses)

■   The controlling expression of a while or do statement

■   Any of the three expressions of a for statement

■   The expression in a return statement

13.2.6  Function and Macro Arguments with Side Effects

Run-time support functions can be implemented either as functions or as
macros. Avoid including expressions with side effects inside function
invocations unless you are sure the function will not be implemented as a
macro. Here is an illustration of how an argument with side effects can
cause problems:

#define limit_number(a) ((a>1000)?1000:(a))

a = limit_number( a++ );

If  a > 1000, it is incremented once. If  a ≤ 1000, it is incremented twice,
which is probably not the intended behavior.

A macro can be used safely with an argument that has side effects if it
evaluates its parameter only once. You can determine whether a macro is safe
only by inspecting the code.
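The difference can be observed by counting how often the argument expression runs. In the sketch below, LIMIT_NUMBER is a fully parenthesized variant of the macro above, and next_value, limit_number_fn, and the two evaluations_ functions are hypothetical helpers written for this illustration.

```c
#define LIMIT_NUMBER(a) (((a) > 1000) ? 1000 : (a))

static int calls;            /* number of times next_value has run */

/* An argument expression with a side effect: it counts its calls. */
int next_value(void)
{
    ++calls;
    return 500;
}

/* A true function evaluates its argument exactly once. */
int limit_number_fn(int a)
{
    return (a > 1000) ? 1000 : a;
}

/* Returns how many times the macro evaluated its argument. */
int evaluations_with_macro(void)
{
    calls = 0;
    (void)LIMIT_NUMBER(next_value());
    return calls;
}

/* Returns how many times the function call evaluated its argument. */
int evaluations_with_function(void)
{
    calls = 0;
    (void)limit_number_fn(next_value());
    return calls;
}
```

With a value that does not exceed 1000, the macro evaluates its argument once in the comparison and once more to produce the result, while the function version evaluates it a single time.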

A common example of a run-time support function that is often implemented as
a macro is toupper. You will find your program's behavior confusing if you
use the following code:

char c;

c = toupper( getc( stdin ) );

If  toupper  is implemented as a function,  getc  will be called only once,
and its return value will be translated to uppercase. However, if  toupper
is implemented as a macro that evaluates its argument more than once,  getc
will be called more than once. Consider the following macro example:

#define toupper(c) ( (islower(c)) ? _toupper(c) : (c) )

If you include the toupper macro in your code, the preprocessor expands it
as follows:

/* What you wrote */
c = toupper( getc( stdin ) );

/* Macro expansion */
c = ( (islower( getc( stdin ) )) ? _toupper( getc( stdin ) )
                                 : (getc( stdin )) );

The expansion of the macro shows that the argument to  toupper  is always
evaluated twice: once to determine if the character is lowercase and a
second time to produce the result (translated to uppercase if necessary). In
the example, this double evaluation calls the getc function twice. Because
getc  reads a character from the standard input device as a side effect, the
example requests two characters from standard input.
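A portable way to avoid the double read is to store the character in a variable first and pass only that variable to toupper. The following is a minimal sketch; the function name read_upper is an assumption made for the example.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads one character from `stream` and returns it in uppercase,
   or EOF at end of input. (Hypothetical helper.) */
int read_upper(FILE *stream)
{
    int ch = getc(stream);      /* the only read from the stream */

    if (ch != EOF)
        ch = toupper(ch);       /* the argument has no side effects */
    return ch;
}
```

Whether toupper is a function or a macro, the stream is read exactly once per call because the argument is a plain variable.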

13.2.7  Environment Differences

Many programs perform some file I/O. When writing these programs for
portability, consider the following:

■   Do not hard-code file or path names. Use constants you define either
in a header file or at the beginning of the program.

■   Do not assume the use of any particular file system. For example, the
UNIX-model, hierarchical file system is prevalent on small computers.
On larger systems, the file system often follows a different model.

■   Do not assume a particular display size (number of rows and columns).

■   Do not assume that display attributes exist. Some environments do not
support such attributes as color, underlined text, blinking text,
highlighted text, inverse text, protected text, or dim text.

13.3  Portability of Data Files

Data files are rarely portable across different CPUs. Structures, unions,
and arrays have varying internal layout and alignment requirements on
different machines. In addition, byte ordering within words and actual word
length may vary.

The best way to achieve data-file portability is to write and read data
files as one-dimensional character arrays. This procedure prevents alignment
and padding problems if the data are written and read as characters. The
only portability problem you are likely to encounter if you follow this
course is a conflict in character sets; many computers have character-set
conversion utilities.
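The idea can be sketched for a single 32-bit quantity. The helper names below are hypothetical, and the choice to store the least-significant byte first is an assumption made for the example; any fixed order works as long as the writer and the reader agree on it.

```c
#include <stdio.h>

/* Stores the low 32 bits of `value` in buf[0..3], least-significant
   byte first, regardless of the byte order of the host machine. */
void put_u32(unsigned char *buf, unsigned long value)
{
    buf[0] = (unsigned char)(value & 0xFF);
    buf[1] = (unsigned char)((value >> 8) & 0xFF);
    buf[2] = (unsigned char)((value >> 16) & 0xFF);
    buf[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Rebuilds the value from four bytes stored by put_u32. */
unsigned long get_u32(const unsigned char *buf)
{
    return  (unsigned long)buf[0]
         | ((unsigned long)buf[1] << 8)
         | ((unsigned long)buf[2] << 16)
         | ((unsigned long)buf[3] << 24);
}

/* Writes the value to a file as a character array; returns 1 on success. */
int write_u32(FILE *fp, unsigned long value)
{
    unsigned char buf[4];

    put_u32(buf, value);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf;
}

/* Reads a value written by write_u32; returns 1 on success. */
int read_u32(FILE *fp, unsigned long *value)
{
    unsigned char buf[4];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return 0;
    *value = get_u32(buf);
    return 1;
}
```

Because only characters reach the file, structure padding and alignment never appear in the data, and the byte order is fixed by the code rather than by the CPU. Files should be opened in binary mode so that no character translation takes place.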

13.4  Portability Concerns Specific to Microsoft C

Microsoft C offers extensions that let you take advantage of the full
capabilities of the computer. These extensions are not portable to other
compilers or environments. The following list shows keywords specific to
Microsoft C:

_asm            _far            _huge           pascal
_based          _fastcall       _interrupt      _pascal
cdecl           fortran         near            _saveregs
_cdecl          _fortran        _near           _segment
_export         huge            _loadds         _segname
far

The Microsoft C Reference contains compatibility information for every
function in the run-time library. Any function or macro that does not have
the ANSI box marked may not be portable to other compilers or computer
systems.

13.5  Microsoft C Byte Ordering

Tables 13.4 and 13.5 summarize Microsoft C byte ordering for short and long
types, respectively. In these tables, the least-significant byte of the data
item is b0; the next byte is denoted by b1, and so on.

Since byte ordering is machine specific, any program that uses this byte
ordering will not be portable.

Table 13.4  Byte Ordering for Short Types

CPU                         Byte Order
────────────────────────────────────────────────────────────────────────────
8086                        b0   b1
80286                       b0   b1
PDP-11(R)                   b0   b1
VAX-11(R)                   b0   b1
M68000                      b1   b0
Z8000(R)                    b1   b0
────────────────────────────────────────────────────────────────────────────

Table 13.5  Byte Ordering for Long Types

CPU                      Byte Order
────────────────────────────────────────────────────────────────────────────
8086                     b0  b1  b2  b3
80286                    b0  b1  b2  b3
PDP-11                   b2  b3  b0  b1
VAX-11                   b0  b1  b2  b3
M68000                   b3  b2  b1  b0
Z8000                    b3  b2  b1  b0
────────────────────────────────────────────────────────────────────────────
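A program can discover the ordering of its host machine at run time instead of assuming it. The sketch below is an illustration only: the function name probe_long_byte_order is hypothetical, and the fixed-width type uint32_t comes from the C99 header stdint.h, which is an assumption about the compiler being used (it is not part of Microsoft C 6.0).

```c
#include <stdint.h>
#include <string.h>

/* Fills order[k] with the significance index (0 for b0, 3 for b3) of
   the byte stored at offset k of a 32-bit integer on the host. */
void probe_long_byte_order(unsigned char order[4])
{
    uint32_t probe = 0x03020100UL;   /* byte bN holds the value N */

    memcpy(order, &probe, sizeof probe);
}
```

On a machine that stores the least-significant byte first, the array reads 0 1 2 3, which corresponds to the b0 b1 b2 b3 rows of Table 13.5; other machines produce a different permutation of the same four values.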

PART IV  OS/2 Support
────────────────────────────────────────────────────────────────────────────

The Microsoft C Professional Development System provides support for OS/2
development.

Chapter 14 explains many of the general issues of OS/2 development,
including accessing the OS/2 system functions, creating module-definition
files, and using the OS/2-specific features of utilities such as the linker
and BIND. Chapter 15 focuses on how to create a multithread application,
including information about C run-time library support, potential problem
areas, and how to use CodeView to debug multithread applications. Chapter 16
concentrates on the creation of dynamic-link libraries, including C run-time
library support, application program interface with DLLs, and debugging DLLs
with CodeView.

Chapter 14  Building OS/2 Applications
────────────────────────────────────────────────────────────────────────────

Using Microsoft C 6.0, you can create applications for OS/2. This chapter
explains features in the compiler and the utilities that let you

■   Call the OS/2 operating system directly from C functions

■   Perform multitasking within your program by starting multiple
execution paths known as "threads"

■   Create dynamic-link libraries that can be used by multiple
applications

■   Work in either OS/2 or DOS to create programs for both environments

■   Develop "dual-mode" applications that will run under both OS/2 and DOS
from a single executable program file

This chapter contains information about accessing the OS/2 Applications
Program Interface (API) from your C programs. It also discusses compile
options that affect applications you develop for OS/2, module-definition
files and import libraries, linker options specific to developing OS/2
applications, and using the BIND utility to create dual-mode applications.

Chapters 15 and 16, "Creating Multithread OS/2 Applications" and
"Dynamic-Linking with OS/2," contain detailed information about how
Microsoft C supports these advanced OS/2 features.

14.1  The OS/2 Applications Program Interface

The entire set of OS/2 system calls is known as the OS/2 API. You need to
access the OS/2 API for the low-level functions provided by the operating
system, such as

■   Requests for information about the display

■   Requests to display information

■   Requests for information from the pointing device (mouse)

■   Requests for information from the keyboard

■   Requests for blocks of memory

■   Requests for disk actions, including reading and writing

You can call all of the OS/2 system services directly from programs written
in C. Under DOS, the API operates at a lower level, requiring programs to
set up hardware registers and generate a software interrupt to access the
system services. Under OS/2, programs use function calls to access the
operating system services.

Sections 14.1.1-14.1.3 describe the calling conventions and precautions you
must observe when accessing OS/2 API functions.

14.1.1  Calling the OS/2 API

Your program must declare calls to the OS/2 API with both the _far and
_pascal keywords. Adding the _pascal keyword to the function declaration
ensures that the FORTRAN/Pascal calling convention is used. The _far keyword
directs the compiler to generate an intersegment call instruction. A sample
declaration for the OS/2 API function DosExit follows:

void _far _pascal DosExit( unsigned int, unsigned int );

You must be sure that all pointers passed to OS/2 API functions are far
pointers, even if you are writing a program using the small or medium memory
models. This process can be simplified if you include the OS2.H header file.

OS/2 API function calls are far and must use the FORTRAN/Pascal calling
convention.

OS/2 API functions use the FORTRAN/Pascal language calling convention. They
expect arguments to be pushed onto the stack in left-to-right order, with
the last argument in the list pushed onto the stack last. OS/2 API functions
remove their arguments from the stack before returning to the caller.
Standard C functions push their arguments from right to left, with the first
argument being the last one pushed.

All OS/2 API functions return 0 if the operation is successful. They return
an error code if the operation fails.

14.1.2  Including the OS/2 Header Files

You do not have to construct your own API declarations if you use the OS2.H
header file. It is the first file of a set of header files that supply
function prototypes for every OS/2 API call and definitions of special OS/2
structures, data types, and constants.

The API function prototypes define all functions as far procedures with the
FORTRAN/Pascal calling convention. They also take care of casting all near
pointers to far pointers and other similar type coercions.

Define a constant before including OS2.H.

When you include OS2.H, the most commonly used data types and macros are
automatically defined. To minimize compile time for the C preprocessor,
other definitions are grouped by function. They are included only if your
source file defines the appropriate constant before including OS2.H. The
following list shows how these manifest constants affect functions from the
OS/2 API:

Constant                          Effect
────────────────────────────────────────────────────────────────────────────
INCL_BASE                         All error constants, kernel, keyboard,
                                  video, and mouse definitions (same as
                                  INCL_DOS + INCL_SUB + INCL_DOSERRORS)

INCL_DOS                          All kernel system definitions

INCL_DOSERRORS                    All error constants

INCL_KBD                          All keyboard definitions

INCL_MOU                          All mouse definitions

INCL_SUB                          All keyboard, video, and mouse
                                  definitions (same as INCL_KBD + INCL_VIO
                                  + INCL_MOU)

INCL_VIO                          All video-display definitions

INCL_WIN                          Basic set of Presentation Manager
                                  definitions

The header files have additional constants that let you include smaller
subsets or functions not defined in the standard sets.

The statement #define INCL_DOS affects the functions defined.

The program in the example below calls the OS/2 kernel to request a
nonshareable, nondiscardable memory segment for an 8K buffer. The INCL_DOS
constant in the #define statement instructs the C preprocessor to include
all of the kernel function definitions. The function prototype for
DosAllocSeg declares the first and third arguments as USHORT (unsigned short
integers). The second argument is a far pointer to the OS/2 data type SEL,
which is used for segment selectors.

#define  INCL_DOS
#include <os2.h>
#include <stdio.h>

VOID GetMemorySegment()
{
    SEL    selector;

    /* DosAllocSeg returns 0 on success and an error code on failure. */
    if ( DosAllocSeg( 8192, &selector, 0 ) )
        puts( "Allocation failed\n" );
    else
        puts( "Successful allocation\n" );
}

The function call in the example works correctly even in a small or medium
memory model program where the selector variable is a near data type. All
three arguments are coerced by the function prototype to the proper types,
regardless of the memory model used.

14.1.3  Creating Dual-Mode Programs as Family Applications

The OS/2 API has a subset of system functions that have direct DOS
equivalents. This subset is known as the "Family Applications Program
Interface" (Family API). Programs that use only the Family API can be run
under DOS and the OS/2 compatibility box, as well as under OS/2.

You can build a single executable file for use under both OS/2 and DOS.

By creating a Family API application, you can distribute the same executable
file to both OS/2 and DOS users. The Microsoft C compiler, linker, and
object module librarian are examples of family applications. The benefit of
having a single executable file is offset by a few disadvantages:

■   The executable file is larger, because it includes a special loader
and OS/2 API-simulator routines for running in DOS mode.

■   In real mode, the application loads more slowly than a program created
specifically for either OS/2 or DOS. There is no performance penalty
in loading or running in OS/2 protected mode.

■   When running in real mode, the program cannot use advanced OS/2
features such as multiple threads or system calls that are not part of
the Family API. If you take special precautions (described in Section
14.5, "The BIND Utility"), the program can take advantage of these
features when running in OS/2 protected mode.

Follow the same steps to build both family and protected-mode applications
but add an extra step at the end to create the Family API program. This step
links functions from the dynamic-link libraries directly into a stand-alone
executable file that can run in both real and protected mode.

Restrictions on Family Applications

Programs that use the Family API are subject to certain restrictions:

■   They cannot overcommit memory; they must fit into the DOS 640K
environment.

■   They cannot use advanced OS/2 features, such as threads and
semaphores, that do not have DOS counterparts.

■   They must restrict their use of some calls to the defined common
subset. For example, some of the file-mode options for the DosOpen
function are not available in real mode.

Family API Functions

The system calls that make up Family API are listed below. The calls marked
with an asterisk (*) have different options or behavior, depending on
whether they are running in real mode or protected mode. The Microsoft OS/2
Programmer's Reference explains the functions and the differences between
their real- and protected-mode implementations.

DosAllocHuge*        DosHoldSignal*       DosSubSet
DosAllocSeg*         DosInsMessage*       DosWrite
DosBeep              DosMkDir             KbdCharIn*
DosBufReset          DosMove              KbdFlushBuffer*
DosCaseMap*          DosNewSize           KbdGetStatus*
DosChdir             DosOpen*             KbdPeek*
DosChgFilePtr        DosPutMessage*       KbdSetStatus*
DosCLIAccess         DosQCurDir           KbdStringIn*
DosClose             DosQCurDisk          VioGetBuf
DosCreateCSAlias*    DosQFHandState       VioGetConfig
DosDelete            DosQFileInfo         VioGetCurPos
DosDevConfig         DosQFileMode         VioGetCurType
DosDevIOCtl*         DosQFSInfo           VioGetMode
DosDupHandle         DosQHandType         VioGetPhysBuf
DosExecPgm*          DosReallocHuge*      VioScrLock*
DosExit*             DosReallocSeg*       VioScrollDn
DosFileLocks         DosRmDir             VioScrollLf
DosFindClose         DosSelectDisk        VioScrollRt
DosFindFirst         DosSetCp             VioScrollUp
DosFindNext*         DosSetDateTime       VioScrUnLock
DosFreeSeg*          DosSetFHandState*    VioSetCurPos
DosGetCollate*       DosSetFileInfo       VioSetCurType
DosGetCp             DosSetFileMode       VioSetMode
DosGetCtryInfo*      DosSetFSInfo         VioShowBuf
DosGetDateTime       DosSetSigHandler*    VioWrtCellStr
DosGetDBCSEv*        DosSetVec*           VioWrtCharStr
DosGetEnv            DosSetVerify         VioWrtCharStrAtt
DosGetHugeShift      DosSizeSeg           VioWrtNAttr
DosGetMachineMode    DosSleep             VioWrtNCell
DosGetMessage*       DosSubAlloc          VioWrtNChar
DosGetVersion        DosSubFree           VioWrtTTy

14.2  Compile Options for the CL Command

This section describes the compile options you must specify in the
Programmer's WorkBench or on the CL command line to designate a program's
target environment (OS/2, DOS, or both). It also introduces options you
should use with certain types of OS/2 applications, such as multithread
programs, dynamic-link libraries, and programs calling C function
dynamic-link libraries. For an in-depth discussion of topics that affect
multithread processes and dynamic-link libraries, see Chapter 15, "Creating
Multithread OS/2 Applications," and Chapter 16, "Dynamic-Linking with OS/2."

14.2.1  The Link Mode Options (/Lp, /Lr, and /Lc)

The /Lx options (/Lp, /Lr, and /Lc) provide the flexibility of programming
for both OS/2 and DOS in either environment. Regardless of the host
operating system, you can build applications for either target operating
system. You do not have to switch to the target system to build the program.

The /Lp option produces an OS/2 protected-mode program; the /Lr option
creates a DOS real-mode program. /Lc is a synonym for /Lr.

To use these options, the mode-specific combined libraries must be
installed. Unless you choose a default operating environment, each
mode-specific library has the letter P or R at the end of its base name. For
example, the protected-mode small memory model library with the emulator
floating-point option is named SLIBCEP.LIB. The corresponding real-mode
library is named SLIBCER.LIB. The default name, however, is SLIBCE.LIB.

Installing and Using the Microsoft C Professional Development System
describes how to create mode-specific libraries with the SETUP program. It
also explains how to establish a default target environment by renaming
libraries. A default environment is useful if you work mainly in one mode
(OS/2 or DOS) but sometimes write programs for the other mode. When you set
up OS/2 as the default mode, SLIBCEP.LIB, for example, becomes SLIBCE.LIB.

Don't use /Lx options unless you have mode-specific libraries.

When you use the /Lx options, you instruct the compiler to override the
default library name in the object module's library search record and to
substitute the mode-specific combined library name. The compiler also
generates a link response file with the /NODEFAULTLIBRARYSEARCH (/NOD)
linker option to override the default library. See Section 14.4, "Link

Do not use the /Lp option to specify protected mode when OS/2 is the default
environment. If you do this, the compiler uses the name of the mode-specific
library (e.g., SLIBCEP.LIB). Because SETUP renamed the library to SLIBCE.LIB
to create a default environment, the library search fails. This caution also
applies to specifying /Lr when you have installed DOS as the default
environment.

If you invoke the linker in a separate step from the compilation, you must
specify the /NOD link option.

────────────────────────────────────────────────────────────────────────────
NOTE

There is a special library, LLIBCMT, for building multithread OS/2
applications. Another special library, LLIBCDLL, supports multithread
dynamic-link libraries. If you use LLIBCMT or LLIBCDLL, you must use one of
the library selection options described in Section 14.2.3 instead of /Lp.
────────────────────────────────────────────────────────────────────────────

14.2.2  Creating Bound Programs Option (/Fb)

The /Fb option allows you to compile, link, and bind an application in one
step. Binding an executable file creates a Family API program that can run
under both OS/2 and DOS.

When you use /Fb, the compiler invokes the BIND utility program immediately
after the link step. You can also execute BIND directly (as described in
Section 14.5, "The BIND Utility"). You must have the API.LIB and OS2.LIB
files in the path specified by the LIB environment variable or in your
current working directory.

The syntax for the /Fb option is

/Fb«bound-exe»

You can specify a separate name for a bound-executable file.

The optional bound-exe parameter specifies the name of the bound program. It
must directly follow the /Fb option, without intervening spaces. The
bound-exe name can be a file specification, a drive name, or a directory
specification. If you specify a file name without an extension, the compiler
appends the .EXE extension to the name. If you give a directory
specification for bound-exe, the name must end with a backslash ( \ ) so the
compiler can distinguish it from an ordinary file name. If you do not supply
a name, BIND uses the name of the unbound program and overwrites it.

When creating both bound and protected-mode versions with different names,
consider this example:

CL /Lp /Fbsampleb sample.c

The protected-mode executable file that this command creates is called
SAMPLE.EXE; the bound-executable file is called SAMPLEB.EXE.

You may need to run BIND as a separate step instead of using the /Fb
option.

The /Fb option works only if you are doing a single-step compile and link.
If the CL command line includes the /c (compile without link) option, the
compiler ignores the /Fb option. If you use /c, you must run the BIND
utility as a separate step of the program build.

If your program includes calls to API functions that are not in the FAPI
subset, you must use the /n option of the BIND utility, described in Section
14.5, to build the dual-mode executable file. If you need to use the /n BIND
option, you cannot compile with /Fb. You must compile without linking by
using the /c option at the compile stage; then link the program and run the
BIND utility with the /n option.

14.2.3  Library Selection Options (/MT, /ML, /MD, /Zl)

Special libraries are provided for building OS/2 multithread applications
and dynamic-link libraries. You must not use these libraries with any other
C run-time library.

Special libraries must be the only C run-time libraries linked with your
program.

If you use one of these special libraries, apply one of the library
selection options (/ML, /MD, or /MT) to tell the compiler to replace the
default library name in the object file with the name of the special
library. This ensures that the linker does not bring in code from the
default libraries. If you do not specify one of the options when compiling,
you must link with the /NOD option to prevent search of a default library,
such as SLIBCE.LIB.

If you fail to include any of these options, the linker searches the default
library and may select the wrong version of a library function. It might,
for example, select the single thread version of the printf function for a
multithread program that has more than one thread calling printf.

Because the /Lp option (see Section 14.2.1, "The Link Mode Options")
instructs the compiler to specify the default protected-mode libraries
rather than the special multithread or DLL-specific libraries, do not use it
with /Zl or /Mx.

Multithread Library Option (/MT)

When you specify the /MT option, the compiler embeds the LLIBCMT.LIB
library name in the object file. Chapter 15, "Creating Multithread OS/2
Applications," explains how to build multithread applications using
LLIBCMT.LIB. The /MT option also has the effect of combining these
command-line options:

/ALw /FPi /G2 /D MT

C Run-Time Library for Building DLLs (/ML)

Use the /ML option to specify that you are building a dynamic-link library
that calls functions in LLIBCDLL.LIB, the C run-time library for
dynamic-link  libraries. The library name is embedded in the object file.
The /ML option also has the effect of combining these command-line options:

/ALw /FPa /G2 /D MT

C Run-Time Library for DLLs (/MD)

Use the /MD option to create a dynamic-link library of C run-time routines.
With this option, the object file does not have any library search records.
The /MD option has the effect of combining these command-line options:

/ALw /FPi /G2 /DDLL /D MT

Chapter 16, "Dynamic-Linking with OS/2," describes the process of building
and using dynamic-link libraries with LLIBCDLL.LIB.

Suppress Default Library Option (/Zl)

Use the /Zl option when you want to suppress selection of a default library.
It tells the compiler not to place the default library name in the object
file.

You can specify libraries and additional LINK options on the CL command
line.

You can specify link options or the names of libraries on the CL command
line with the /LINK option. You can also give the library name, with its
.LIB extension, before the /LINK option. Each command below selects the
multithread C run-time library:

CL /Zl myprog.c llibcmt.lib

CL /Zl myprog.c /link llibcmt

If you compile with the /c (compile without link) option, your link command
must include the library name:

LINK myprog, myprog.exe, myprog.map, llibcmt.lib, myprog.def

14.2.4  Memory-Model Options (/Ax)

You must select the memory model appropriate to your application. For
protected-mode applications, the large model provides the most convenient
interface with the special libraries. It provides the additional benefit of
placing code and data into multiple segments, allowing OS/2 to swap parts of
the program to disk efficiently.

Use the large memory model with LLIBCMT (/AL and /MT).

The multithread run-time C library, LLIBCMT.LIB, is a large-model library.
All library function calls must be far calls. In addition, all pointers
passed to functions in the library must be far pointers. If you do not
compile with the /AL option, you must use the keyword _far when
declaring pointers. Variables can be declared either near or far as long as
they are either passed by value or cast to a far address.

If you want to call fopen, for example, you must use code such as the
following:

FILE _far * fp;
fp = fopen( ... );

────────────────────────────────────────────────────────────────────────────
NOTE

If you are using the compact, large, or huge memory model, data pointers are
far by default, so you do not need to explicitly specify _far.
────────────────────────────────────────────────────────────────────────────

Because each thread has its own stack, you have to compile in an SS != DS
model.

Multithread applications require that each thread have its own stack. As a
result, you cannot safely assume that the stack segment is in the default
data group (DGROUP). That means that the stack segment can be different from
the data segment (SS != DS).

To specify that you have selected an SS != DS model, you must use the /Au or
/Aw option. The /MT option is a shorthand way of specifying this combination
of options to the compiler:

/ALw /FPi /G2 /D MT

The /MT option also causes the compiler to place a library search record for
LLIBCMT in the object file.

14.3  Module-Definition Files and Import Libraries

A module-definition file tells the linker about the characteristics of an
application or dynamic-link library. It describes names, segments, memory
requirements, and import and export definitions. Export definitions make
functions in the OS/2 dynamic-link libraries (DLLs) available to other
programs. Each export definition specifies a function name. A program using
these functions must have import definitions in order to find each
dynamic-link function. Each import definition specifies a function name and
the name of the dynamic-link library where the function resides.

The IMPLIB utility generates a library of import definitions that can be
examined during the link. For imported functions, the import library can be
used in place of a module-definition file.

Module-definition files are optional for most OS/2 programs. Two types of
programs must use them:

■   Dynamic-link libraries

■   Programs with I/O privileges

Each module-definition file contains one or more module statements defining
attributes of the executable program. The statements and their associated
attributes are listed below:

Statement                         Attribute
────────────────────────────────────────────────────────────────────────────
CODE                              Gives default attributes for code
                                  segments

DATA                              Gives default attributes for data
                                  segments

DESCRIPTION                       Describes the module in one line

EXETYPE                           Identifies the operating system

EXPORTS                           Defines exported functions

HEAPSIZE                          Specifies local heap size, in bytes

IMPORTS                           Defines imported functions

LIBRARY                           Names a dynamic-link library

NAME                              Names an application

OLD                               Preserves import information from a
                                  previous version of the library

PROTMODE                          Specifies that the module runs only in
                                  OS/2 protected mode

REALMODE                          Relaxes some restrictions that the
                                  linker imposes for protected-mode
                                  programs

SEGMENTS                          Gives attributes for specific segments

STACKSIZE                         Specifies local stack size, in bytes

STUB                              Adds a DOS 3.x executable file to the
                                  beginning of the module, usually to
                                  terminate the program when run in real
                                  mode

In addition to the keywords listed above, each statement includes one or
more fields to complete the attribute description. All keywords must be
entered in uppercase. You can include comments in the module-definition file
by beginning the line with a semicolon (;). For a complete list of the
keywords and their meaning, see on-line help for information about
module-definition files.

14.3.1  Adding a Module-Definition File to the LINK Command

The module-definition file name is the last field of the link command:

LINK objects «,«exe» » «, «map» » «, «lib» » «, «def» » «;»

This example uses the default libraries:

LINK sample, sample.exe, sample.map,,sample.def

When you use a module-definition file, you must use the /c option on the CL
command line and link in a separate step. If you are linking without a
module-definition file, you can use a semicolon after your last entry to
suppress LINK's prompt for the module-definition file name and other missing
parameters.

The segmented-executable linker is the only LINK program that recognizes
module-definition files. Since it is backwards compatible, it should be the
only linker in your path. The QuickC linker does not process these files.

The following sections illustrate ways to use module-definition files.
On-line help describes all of the commands and options available.

14.3.2  Creating Dynamic-Link Libraries (DLLs)

You can build your own dynamic-link libraries. A simple module-definition
file for such a library with one public function is shown below:

LIBRARY Mylib INITINSTANCE

DATA MULTIPLE

EXPORTS
MyProc

You can use the same module-definition file you used to create the
dynamic-link library as input to the IMPLIB utility. IMPLIB generates a
library file with a .LIB extension for use by applications calling your
dynamic-link routines. Section 14.3.5 describes the IMPLIB program. Chapter
16, "Dynamic Linking with OS/2," explains how to build a dynamic-link
library.

The LIBRARY statement tells the linker that this is a dynamic-link library
rather than an application. (Applications use the NAME statement instead of
the LIBRARY statement.)

The EXPORTS statement gives the name of the public function.

You can designate exported functions in a C source file.

The C language keyword _export is an alternative to the EXPORTS statement.
When _export appears in a function declaration or definition, the compiler
puts the function and its parameter size in the object module's export
record. Functions with the _export keyword that are not listed in the
module-definition file cannot have input/output privileges or alias names.

Using generic library names is dangerous.

Since OS/2 systems have many dynamic-link libraries installed, try to pick a
name that uniquely identifies your library. If you choose a generic name,
such as CRT.DLL or WINDOWS.DLL, you run the risk of having your library
overwritten by someone else's dynamic-link library with the same name.

14.3.3  Creating Programs with I/O Privileges

OS/2 programs that must access hardware directly can designate a code
segment with input/output privileges. This segment can then perform a
limited set of I/O instructions but cannot make any calls to dynamic-link
libraries.

You cannot use the C run-time library functions inp and outp for input and
output. Their use is limited to real-mode programs. You can, however, use
in-line assembler code in your C source program to access a port.

The sample module-definition file below shows two segments for a program:

NAME         IOPROG

EXETYPE         OS/2

SEGMENTS
_IOSEG      IOPL
_TEXT       NOIOPL

EXPORTS
CharIn      4
CharOut     4

The first code segment contains the I/O portion of the program and has the
IOPL keyword. The second segment is designated NOIOPL (the default).

The EXPORTS statement for IOPL functions must include parameter size.

The EXPORTS section names two functions in the IOPL segment that can be
called by procedures outside the segment. It also specifies the size of
each function's parameters. Procedures with I/O privileges must specify the
number of words needed for their parameters.
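
As a rough illustration of the parameter-size field, the sketch below
converts a byte count into a count of 16-bit words. The helper name
param_words is hypothetical and is not part of LINK or the C run-time
library. It assumes that a word is 2 bytes and that the parameter total is
rounded up to a whole word.

```c
#include <stddef.h>

/* Hypothetical helper (not part of LINK or the C library): convert the
 * total size of a function's stack parameters from bytes to 16-bit
 * words, rounding up. The result is the kind of number that follows an
 * exported IOPL function name in the EXPORTS statement.
 */
unsigned param_words( size_t param_bytes )
{
    return (unsigned)( ( param_bytes + 1 ) / 2 );
}
```

Under these assumptions, a function whose parameters occupy 8 bytes (for
example, two 4-byte far pointers) needs 4 words, which is the form of the
entries shown for CharIn and CharOut.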

────────────────────────────────────────────────────────────────────────────
NOTE

Unless the user has specified IOPL=YES in the CONFIG.SYS file, the program
────────────────────────────────────────────────────────────────────────────

14.3.4  Creating Presentation Manager Applications

The Presentation Manager calls window and dialog procedures inside a
Presentation Manager application. The sample module-definition file below
exports these procedures and gives the linker additional instructions for
building the program. Module-definition files are optional for Presentation
Manager applications. They can be used to control the way different segments
of the program are loaded.

NAME         PMSAMPLE    WINDOWAPI

EXETYPE          OS/2
STACKSIZE        4096

SEGMENTS

In the preceding example, the NAME statement identifies the program as an
application named PMSAMPLE. The WINDOWAPI keyword tells the linker to mark
the executable file as a Presentation Manager application. Only programs
marked as windows applications or windows-compatible applications can share
the Presentation Manager screen group.

The EXETYPE statement tells the linker to build a program that runs only in
protected mode and to produce the optimal executable file for OS/2.

The STACKSIZE statement allocates 4096 bytes of local stack space. This is
the minimum stack size recommended for Presentation Manager programs.

You can reduce run-time memory requirements.

The SEGMENTS statement controls the way code and data segments are handled.
By default, segments are not brought into physical memory until needed. The
PRELOAD keyword in the example tells the system loader to load the _INIT
segment when the program starts. The _TEXT and _HELP segments are loaded
on demand. You can use the compiler's /NT option to generate your own
segment names, such as _INIT and _HELP. Separate segments are useful for
code that is executed infrequently, such as a help subsystem. This reduces
the amount of run-time memory required for your application, since each
segment will be loaded when and if there is a request for it.

14.3.5  Creating Import Libraries with the IMPLIB Utility

Applications that call dynamic-link library functions must use import
definitions that specify the location of each dynamic-link function. The
definitions consist of a function name and the name of the dynamic-link
library file where it resides.

Although the application can use a module-definition file to create the
import definitions, it is easier to use import libraries built by the IMPLIB
utility.

IMPLIB creates an import library in the form of a file with a .LIB
extension, which is read by the linker. At link time, the .LIB file is
specified in the LINK command line, along with other libraries.

IMPLIB accepts two types of sources:

■   The module-definition file used to create the dynamic-link library

■   The dynamic-link library itself

The IMPLIB command has the syntax:

IMPLIB «/c» libfile deffile «deffile ...»

or

IMPLIB «/c» libfile dynlib «dynlib ...»

The /c option directs IMPLIB to be case sensitive. By default, it is case
insensitive.

The libfile field names the new import library file. The deffile or dynlib
fields name the input files, which are dynamic-link library or
module-definition files.

The following example creates the import library file named MYLIB.LIB from
the MYLIB.DLL dynamic-link library:

IMPLIB mylib.lib mylib.dll

For more information about import libraries and IMPLIB, consult on-line
help.

14.4  Link Command-Line Options

This section describes command-line options that control various aspects of
the linker and the circumstances in which you will need to use them.

/NODEFAULTLIBRARYSEARCH (/NOD)

If you did not compile with /MT, /MD, or /ML, suppress default library
searching.

The /NODEFAULTLIBRARYSEARCH option prevents the linker from searching any
library specified in an object file. When you specify this option, you
should also specify the name of the library to be linked. The minimum
abbreviation for this option is /NOD.

If you are using the multithread library, LLIBCMT, or the dynamic-link
library, LLIBCDLL, you should use this option. Use it with dynamic-link
libraries built with LLIBCDLL. This is mandatory if you did not compile with
the /Zl, /MT, or /ML options.

You can select a specific library by appending the library name to the /NOD
option, as in

/NOD:LLIBCMT.LIB

/NOEXTENDEDDICTSEARCH (/NOE)

The /NOEXTENDEDDICTSEARCH option prevents the linker from searching the
extended dictionary, which is an internal list of symbol locations
maintained by the linker. You need to use this option if a library symbol
(such as _setargv, _binmode, or _varstck) is redefined and you receive error
L2044 from the linker. The minimum abbreviation for this option is /NOE.

/NOIGNORECASE (/NOI)

The /NOIGNORECASE option preserves case sensitivity. By default, LINK maps
all names to uppercase characters. Because many C function names are a mix
of upper- and lowercase letters, it is important to use this option. The
compile option /Zc causes any name declared with the _pascal keyword to be
treated without regard to case at the source level. The minimum abbreviation
is /NOI.
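
The effect of the default uppercase mapping can be modeled in a few lines
of C. The sketch below is only an illustration of why /NOI matters; the
function name collide_without_noi is hypothetical, and the linker's own
symbol handling is not shown here.

```c
#include <ctype.h>

/* Hypothetical helper for illustration only. Returns 1 if two distinct
 * names become identical once every letter is mapped to uppercase (the
 * default behavior described above when /NOI is not used), 0 otherwise.
 */
int collide_without_noi( const char *a, const char *b )
{
    int differ = 0;

    while( *a != '\0' && *b != '\0' )
    {
        if( *a != *b )
        {
            if( toupper( (unsigned char)*a ) != toupper( (unsigned char)*b ) )
                return 0;       /* still different after case mapping */
            differ = 1;         /* different only by case so far */
        }
        a++;
        b++;
    }

    /* Names of different lengths can never collide. */
    return ( *a == '\0' && *b == '\0' ) ? differ : 0;
}
```

Two functions named GetData and getdata, for example, would be
indistinguishable in this model unless case is preserved.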

/PMTYPE

The /PMTYPE option is an alternative to specifying Presentation Manager
compatibility with the NAME statement of a module-definition file. Use the
following syntax:

/PMTYPE:type

Type must be one of the following:

Type                              Effect
────────────────────────────────────────────────────────────────────────────
PM                                The application is an OS/2 Presentation
                                  Manager application using the
                                  Presentation Manager API and running in
                                  the Presentation Manager screen group.
                                  This type corresponds to specifying
                                  WINDOWAPI in the NAME statement of a
                                  module-definition file.

VIO                               The application is compatible with the
                                  OS/2 Presentation Manager and can run in
                                  a window or in a separate screen group.
                                  This type corresponds to specifying
                                  WINDOWCOMPAT in the NAME statement of a
                                  module-definition file.

NOVIO                             The application is not compatible with
                                  the OS/2 Presentation Manager. It must
                                  run in a separate screen group. This
                                  type corresponds to specifying
                                  NOTWINDOWCOMPAT in the NAME statement of
                                  a module-definition file.
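
The correspondence in the table can be written as a small lookup, which may
be convenient in a build script or a checking tool. This is a sketch only;
the function name pmtype_to_name_keyword is hypothetical, and it assumes
the type is supplied in uppercase exactly as listed in the table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: map a /PMTYPE value to the equivalent keyword of
 * the NAME statement, as listed in the table above. Returns NULL for a
 * type that is not in the table.
 */
const char *pmtype_to_name_keyword( const char *type )
{
    if( strcmp( type, "PM" ) == 0 )
        return "WINDOWAPI";
    if( strcmp( type, "VIO" ) == 0 )
        return "WINDOWCOMPAT";
    if( strcmp( type, "NOVIO" ) == 0 )
        return "NOTWINDOWCOMPAT";
    return NULL;
}
```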

14.5  The BIND Utility

The BIND utility converts a protected-mode program into a program that runs
in both OS/2 and DOS environments. It replaces Family API calls to
dynamic-link library functions with DOS emulator routines from the API.LIB
library. (See Section 14.1.3, "Creating Dual-Mode Programs as Family
Applications," for a list of Family API calls.) BIND produces a stand-alone
program file that can run under

■   OS/2 protected mode

■   OS/2 real mode

■   DOS 2.x and DOS 3.x

BIND is an alternative to the C compiler's /Fb option described in Section
14.2.2, "Creating Bound Programs Option." You must use BIND instead of the
/Fb option when you compile with the /c (compile without link) option or
when your program includes functions that operate only in protected mode.

You can include functions in a bound application that are not members of
the Family API.

To include functions available only in protected mode, you must run the BIND
utility with the /n option. Your run-time code must call the Family API
function DosGetMachineMode to determine whether it is running in real or
protected mode. When your program executes in real mode, it will be aborted
if it tries to call a function available only in protected mode.

You might choose to design your application so it executes different
sections of code, depending on the machine mode. For example, the
application may need to keep track of the passage of elapsed time or to
detect time-outs. In real mode, you might use polling or timing loops or
perhaps intercept the timer interrupts. In protected mode, you should use
the OS/2 semaphore and timer services, such as DosSemSet and DosTimerAsync.
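
The mode-dependent design described above might be organized as in the
following sketch. To keep the example independent of the OS/2 header files,
the machine mode is passed in as a parameter; in a real program the value
would come from DosGetMachineMode. The names SKETCH_MODE_REAL,
SKETCH_MODE_PROTECTED, timing_strategy, and choose_timing are hypothetical,
and the numeric mode values are assumptions that should be checked against
the OS/2 headers.

```c
/* Assumed values of the byte that DosGetMachineMode fills in. Verify
 * them against the OS/2 include files before relying on them.
 */
#define SKETCH_MODE_REAL       0
#define SKETCH_MODE_PROTECTED  1

typedef enum
{
    TIMING_POLL_LOOP,       /* real mode: polling or timing loops */
    TIMING_OS2_TIMER        /* protected mode: OS/2 timer services */
} timing_strategy;

/* Hypothetical dispatcher: decide once, at startup, how time-outs will
 * be detected, so that protected-mode-only functions are never reached
 * when the bound program runs in real mode.
 */
timing_strategy choose_timing( unsigned char machine_mode )
{
    if( machine_mode == SKETCH_MODE_PROTECTED )
        return TIMING_OS2_TIMER;

    return TIMING_POLL_LOOP;
}
```

Testing the mode in one place and branching on the result keeps every call
to a protected-mode-only function behind a single check.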

Invoke BIND with the following syntax:

BIND infile «implibs» «linklibs» «/o outfile» «/n @file» «/n names»
«/m mapfile»

The /n option provides a way to include protected-mode functions. It has two
formats:

■   A list of one or more names, separated by spaces.

■   The name of a file, preceded by the at (@) sign. The file should
consist of a list of functions, one name per line.

The /o option specifies a name for the bound-executable file. If it is not
present, the name of the input file is used.

The /m option causes a link map to be generated for the real-mode version of
the executable file.

To bind a program named TIMER that uses DosTimerAsync to manage time-outs
when running in protected mode, invoke BIND as follows:

BIND TIMER /n DosTimerAsync

For more information about BIND and other command-line options, consult
on-line help.

Chapter 15  Creating Multithread OS/2 Applications
────────────────────────────────────────────────────────────────────────────

Microsoft C, version 6.0, provides support for creating multithread
applications under OS/2. You should consider using more than one thread if
your application needs to manage multiple activities, such as simultaneous
keyboard and mouse input. One thread can process keyboard input while a
second thread filters mouse activities. A third thread could update the
display screen based on data from the mouse and keyboard threads. At the
same time, other threads can access disk files or get data from a
communications port.

This chapter explains the features in C 6.0 that support the creation of
multithread programs. It also describes some important ways in which
programming for OS/2 is different from programming for DOS.

OS/2 performs the scheduling and allocation of real hardware resources to
multiple programs, or "processes." It does not actually schedule the
processes themselves; it schedules threads belonging to the processes.

A thread is basically a path of execution through a program. It is also the
smallest unit of execution that OS/2 schedules. A thread consists of a
stack, the state of the CPU registers, and an entry in the execution list of
the system scheduler. Each thread shares all of the process's resources.

A process consists of one or more threads and the code, data, and other
resources of a program in memory. Typical program resources are open files,
semaphores, and dynamically allocated memory. A program executes when the
system scheduler gives one of its threads execution control. The scheduler
determines which threads should run and when they should run. Threads of
lower priority may have to wait while higher-priority threads complete their
work.

Threads operate independently and are unaware of other threads.

All threads in a process operate independently of one another. Unless you
take special steps to make them visible to each other, each thread executes
while completely unaware of the existence of other threads in a process.
Threads sharing common resources, however, must coordinate their work by
using flags, semaphores or some other method of interprocess communication.
See Section 15.3, "Writing a Multithread Program," for more information.

15.1.1  Library Support

All shared functions in a multithread program must be re-entrant.

If one thread is suspended by the OS/2 scheduler while executing the printf
function, one of the program's other threads might start executing. If the
second thread also calls printf, data might be corrupted. To avoid this,
access to static data used by the function must be restricted to one thread
at a time. This process of restricting access to certain data is called
serialization.

You do not need to serialize access to stack-based (automatic) variables
because each thread has a different stack. Therefore, a function that uses
only automatic (stack) variables is re-entrant. The standard C run-time
libraries, such as SLIBCE, have a limited number of re-entrant functions. A
multithread program needing to use C run-time library functions that are
normally not re-entrant should be built with the multithread library
LLIBCMT.LIB.
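
The difference between the two kinds of functions can be seen in the
following sketch. Both functions and the LABEL_LEN constant are
hypothetical and are not part of the C run-time library. The first function
keeps its result in static storage, so every caller shares one buffer; the
second writes only to a buffer supplied by the caller, which can be an
automatic array on the calling thread's own stack.

```c
#include <stdio.h>

#define LABEL_LEN 16    /* large enough for 'T', any int, and the null */

/* NOT re-entrant: the result lives in one static buffer shared by all
 * callers, so a second call overwrites the first caller's result.
 */
const char *label_static( int id )
{
    static char buf[LABEL_LEN];

    sprintf( buf, "T%d", id );
    return buf;
}

/* Re-entrant: uses only its arguments and automatic storage. The caller
 * must supply a buffer of at least LABEL_LEN bytes.
 */
char *label_reentrant( int id, char *buf )
{
    sprintf( buf, "T%d", id );
    return buf;
}
```

Even in a single thread, two calls to label_static return the same address
and only the most recent text survives. With several threads, the overwrite
can happen at any instant unless access is serialized.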

The Multithread C Library LLIBCMT.LIB

The support library LLIBCMT.LIB is a re-entrant large-model library for
creating multithread programs.

A multithread program linked with LLIBCMT.LIB can use any memory model.

All calls to library functions must use the large-model calling interface
(far code pointers, far calls, and far data pointers). When your application
calls functions in this library,

■   All library calls must be far calls.

■   All library calls must use the C calling convention; programs compiled
using the /Gr (fastcall calling convention) or /Gc (Pascal calling
convention) options must use the standard include files for the
run-time library functions they call.

■   All data and code pointers must be far pointers.

■   Variables passed to library functions must either be passed by value
or cast to a far address.

■   Your main function must be declared far if you are compiling with the
small or compact memory models.

You do not need to explicitly declare far pointers if you are using the
compact, large, or huge memory models, since these models use far pointers
as default. For the large and huge memory models, the function calls are
also far by default.

A small-model program calling a library function such as isupper, for
example, must use declarations like the following:

int _far _cdecl isupper( int _c );

Programs built with LLIBCMT.LIB are entirely self-contained.

Programs built with LLIBCMT.LIB do not share C run-time library code or data
with any dynamic-link libraries they call. Chapter 16 explains how to build
DLLs and how to share code and data between processes.

Alternatives to LLIBCMT.LIB

If you choose to build a multithread program without using LLIBCMT.LIB, you
must do the following:

■   Use the standard C libraries and limit library calls to the set of
re-entrant functions.

■   Use the OS/2 API thread management functions, such as DosCreateThread.

■   Provide your own synchronization for functions that are not re-entrant
by using OS/2 services such as semaphores and the DosEnterCritSec and
DosExitCritSec functions.

The C run-time library functions listed below are re-entrant and can be used
in multithread programs linked with the standard libraries.

abs
atoi
atol
bsearch
chdir
getpid
halloc
hfree
itoa
labs
lfind
lsearch
memccpy
memchr
memcmp
memcpy
memicmp
memmove
memset
mkdir
movedata
putch
rmdir
strcat
strchr
strcmp
strcmpi
strcpy
stricmp
strlen
strlwr
strncat
strncmp
strncpy
strnicmp
strnset
strrchr
strrev
strset
strstr
strupr
swab
tolower
toupper
────────────────────────────────────────────────────────────────────────────
WARNING

functions. The _beginthread function performs initialization without which
many C run-time functions will fail. You must use _beginthread instead of
DosCreateThread in C programs built with LLIBCMT.LIB if you intend to call C
run-time functions.
────────────────────────────────────────────────────────────────────────────

The Multithread Library Compile Option (/MT)

The /MT option for the CL command is the best way to build a multithread
program with LLIBCMT.LIB. The /MT option embeds the LLIBCMT library name in
the object file. Using the /MT option automatically specifies the /ALw /FPi
/G2 /D MT options. The following list describes what these options do.

Switch                            Effect
────────────────────────────────────────────────────────────────────────────
/ALw                              Use the large memory model with separate
                                  stack segment; do not reload the DS
                                  register as part of the entry sequence
                                  for every function

/FPi                              Generate in-line floating-point
                                  instructions and select the emulator
                                  math package

/G2                               Use the 80286 processor instruction set

/D MT                             Use the multithread version of the
                                  include files

These options can be combined with other options to specify different memory
models and different relationships between the data segment and the stack.
You can override the /G2 and /FPi options by specifying a different option
later on the command line. The following example shows how to override the
floating-point package option:

CL /MT /FPa /Lp PROG.C

────────────────────────────────────────────────────────────────────────────
NOTE

You cannot replace the /MT option with /ALw /FPi /G2. You must use /MT to
────────────────────────────────────────────────────────────────────────────

15.1.2  Include Files

The Microsoft C 6.0 include files contain conditional sections for
multithread applications using LLIBCMT.LIB. To compile your application with
the appropriate definitions, you can

■   Compile with the /MT option described in Section 15.1.1, "Library
Support."

■   Define the symbolic constant MT in your source file or on the command
line with the /D option.

Always use the standard include files.

Standard include files declare C run-time library functions as they are
implemented in the libraries. If you used the Maximum Optimization (/Ox) or
Register Calling Convention (/Gr) option, the compiler assumes that all
functions should be called using the register calling convention. The
run-time library functions were compiled using either the C or the
FORTRAN/Pascal calling convention, and the declarations in the standard
include files tell the compiler to generate correct external references to
these functions.

See Section 15.4, "Compiling and Linking," for examples of how to use the MT
constant.

15.1.3  C Run-Time Library Functions for Thread Control

All OS/2 programs have at least one thread. Any thread can create additional
threads. A thread can complete its work very quickly and then terminate, or
it can stay active for the life of the program.

The LLIBCMT and LLIBCDLL C run-time libraries provide two functions for
thread control: _beginthread and _endthread.
They also declare the global variable _threadid, which contains the address
of an application's current thread identifier.

The _beginthread function creates a new thread and returns a thread
identifier if the operation is successful. The thread will terminate
automatically when it completes execution, or it can terminate itself with a
call to _endthread.

The global variable _threadid holds the address of the identifier of the
current thread. It is defined in the STDDEF.H file as shown below:

/* define pointer to thread id value */
extern int far * _threadid;

────────────────────────────────────────────────────────────────────────────
WARNING

If you are going to call C run-time routines from a program built with
LLIBCMT.LIB, you must start your threads with the _beginthread function. Do
not use the OS/2 functions DosExit and DosCreateThread. Using
DosSuspendThread can lead to a deadlock when more than one thread
is blocked waiting for the suspended thread to complete its access to a C
run-time data structure.
────────────────────────────────────────────────────────────────────────────

The _beginthread and _endthread functions are described in detail below.
Section 15.2 illustrates their use in a sample multithread program.

All threads in a process can execute concurrently.

The _beginthread function creates a new thread. A thread shares the code and
data segments of a process with other threads in the process but has its own
unique register values, stack space, and current instruction address. The
system gives CPU time to each thread, so that all threads in a process can
execute concurrently. You can find a complete description of _beginthread
and its arguments in on-line help.

The _beginthread function is similar to the DosCreateThread function in the
OS/2 API with these differences:

■   The _beginthread function lets you pass arguments to the thread.

■   The stack address points to the bottom of the stack. It is the address
of the start of an array or of the start of a block of dynamically
allocated memory. When you use the DosCreateThread call, the stack
address points to the top of the stack.

■   If you specify NULL for the stack address, _beginthread manages
allocation and deallocation of the thread stack for you. This option
is advantageous because it is difficult for your program to determine
when a thread has terminated, so you cannot know when to deallocate
the thread stack. However, _beginthread maintains enough information
to know when a thread has terminated and deallocates the thread's
stack the next time its thread ID is used.

The _beginthread function returns the thread ID number of the new thread if
successful or -1 if there was an error. Errors include specifying an
odd-address stack or an odd- or zero-length stack (which is different from
passing NULL for the stack address) or trying to create too many threads.
The multithread library, LLIBCMT.LIB, supports the maximum number of threads
allowed by OS/2.
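
The stack rules above can be summarized in code. The sketch below is not
the library's implementation; stack_args_ok and stack_top are hypothetical
helpers that restate the documented conditions. It assumes that the length
rules apply whether or not the stack address is NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check mirroring the error conditions listed above.
 * Returns 1 if the arguments look acceptable and 0 if _beginthread
 * would be expected to fail with -1. A NULL stack_bottom asks the
 * library to allocate the stack and so passes the address test.
 */
int stack_args_ok( const void *stack_bottom, unsigned stack_size )
{
    if( stack_bottom != NULL && ( (uintptr_t)stack_bottom & 1u ) != 0 )
        return 0;                   /* odd-address stack */

    if( stack_size == 0 || ( stack_size & 1u ) != 0 )
        return 0;                   /* zero-length or odd-length stack */

    return 1;
}

/* _beginthread takes the bottom of the stack (the start of the array or
 * allocated block). DosCreateThread wants the top, which lies
 * stack_size bytes higher.
 */
char *stack_top( char *stack_bottom, unsigned stack_size )
{
    return stack_bottom + stack_size;
}
```

A program that supplies its own stack to _beginthread passes the array
itself; the same array used with DosCreateThread would need the address
computed by a helper such as stack_top.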

Threads terminate automatically when they complete. The _endthread
function is
useful for conditional termination from within a thread. A thread dedicated
to communications processing, for example, can quit if it is unable to get
control of the communications port. You can find a complete description of
_endthread in on-line help.

15.2  Sample Multithread C Program

BOUNCE.C is a sample multithread program that creates a new thread each time
the letter 'a' or 'A' is entered at the keyboard. Each thread bounces a
"happy face" of a different color around the screen. Up to 32 threads can be
created. The program's normal termination occurs when 'q' or 'Q' is entered.
It will also terminate if it receives the CTRL+C or CTRL+BREAK signals. See
Section 15.4, "Compiling and Linking," for details on compiling and linking
BOUNCE.C.

/*  Bounce - Creates a new thread each time the letter 'a' is typed.
*  Each thread bounces a happy face of a different color around the
*  screen.
*  All threads are terminated when the letter 'q' is entered or when
*  the CTRL+C/CTRL+BREAK signals are received.
*
*  This program requires the multithread library. For example, compile
*  with the following command line:
*      CL /MT BOUNCE.C
*/

#define INCL_NOCOMMON                     /* Use only what we need */
#define INCL_NOPM                         /* Don't need PM */
#define INCL_DOSPROCESS                   /* DosBeep and DosSleep */
#define INCL_DOSSEMAPHORES                /* OS/2 semaphore functions */
#define INCL_DOSSIGNALS                   /* OS/2 signal functions */
#define INCL_VIO
#define INCL_KBD
#include <os2.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <process.h>

#define STACK_SIZE   4096
#define MAX_THREADS  32                     /* Limit on bounce threads */

void main( void );                          /* Thread 1: main */
void KbdThread( void );                     /* Thread 2: keyboard input */
void BounceProc( char * MyID );             /* Threads 3 to n: display */
void VioClrScr( void );                     /* Screen clear */
void ShutDown( void );                      /* Program shutdown */
void VioWrtCStr( char *pchString,           /* Write string to display */
unsigned usRow, unsigned usColumn );
void pascal far SigHandler( unsigned SigArg,/* Signal handler */
unsigned SigNum );
/* Screen clear macro */
#define VioClrScr() VioScrollDn( 0, 0, 50, 80, 50, BlankCell, 0 )

struct tagCoords                            /* Display coordinates */
{
int xLoc;
int yLoc;
int xInc;
int yInc;
};

unsigned long  RunFlag = 0;                 /* "Keep Running" semaphore */
unsigned long  ScreenLock = 0;              /* Screen update semaphore  */

char BlankCell[2] = { 0x20, 0x07 };
VIOMODEINFO vmi = { sizeof( VIOMODEINFO ) };/* Mode information */

PFNSIGHANDLER PrevHandler;                  /* for SetSigHandler call */
unsigned int  PrevAction;                   /* for SetSigHandler call */

void main()                                 /* Thread One */
{
/* Get display screen's text row and column sizes & clear the
screen.*/
VioGetMode( &vmi, 0 );
VioClrScr();
VioWrtCStr( "Threads running: 00.  Press 'a' to start another thread",
vmi.row - 1, 0 );

/* Set the "we are running" semaphore. */
DosSemSet( &RunFlag );

/* Start keyboard thread. Let _beginthread allocate memory
*  for the thread's stack.
*/
_beginthread( KbdThread, NULL, STACK_SIZE, NULL );

/* Install signal handler for CTRL+BREAK & CTRL+C. */
DosSetSigHandler( (PFNSIGHANDLER)SigHandler, &PrevHandler,
&PrevAction,
SIGA_ACCEPT, SIG_CTRLC );

/* Wait for "running" semaphore to clear (from signal or 'q' key). */
DosSemWait( &RunFlag, SEM_INDEFINITE_WAIT );

}

void pascal far SigHandler( unsigned int SigArg, unsigned int SigNum )
{
static char BreakMsg[] = "Signal Termination";

ShutDown();
VioWrtCStr( BreakMsg, vmi.row - 1, 0 );
/* Restore original signal handler for CTRL+BREAK & CTRL+C. */
DosSetSigHandler( (PFNSIGHANDLER)PrevHandler, &PrevHandler,
&PrevAction,
PrevAction, SIG_CTRLC );
}
void ShutDown( void )                       /* Clean up display when done */
{
/* Lock out screen updates from BounceProc & clear "running" semaphore */
DosSemWait( &ScreenLock, SEM_INDEFINITE_WAIT );
DosSemSet( &ScreenLock );
VioClrScr();
DosSemClear( &RunFlag );
}

void KbdThread( void )                          /* Thread Two: keyboard */
{
    KBDKEYINFO  KeyInfo;                        /* for KbdCharIn call */
    char        ThreadNr = 0;

    do
    {
        /* Block this thread by waiting for keyboard input. */

        KbdCharIn( &KeyInfo, IO_WAIT, 0 );
        if( tolower( KeyInfo.chChar ) == 'a' && ThreadNr < MAX_THREADS)
        {
            vmi.row - 1, 17, 0 );
        }
    } while( tolower( KeyInfo.chChar ) != 'q' );

    ShutDown();
}

/* getrandom returns a random number between min and max, which must be in
 * integer range.
 */
#define getrandom( min, max ) \
    ((rand() % (int)(((max) + 1) - (min))) + (min))

void BounceProc( char * MyID )                  /* Threads Three to n */
{
    int       xOld, yOld;
    char      MyCell[2];
    char      CurrentCell[2];
    int       CellLen = 2;
    struct tagCoords Coords;

    /* Generate update increments and initial display coordinates. */
    srand( (unsigned) *MyID * 3 );
    Coords.xLoc = getrandom( 0, vmi.col - 1 );
    Coords.yLoc = getrandom( 0, vmi.row - 1 );
    Coords.xInc = getrandom( -3, 3 );
    Coords.yInc = getrandom( -3, 3 );

    /* Give the "old" position a defined value for the first pass. */
    xOld = Coords.xLoc;
    yOld = Coords.yLoc;

    /* Set up "happy face" & generate color attribute from thread number. */
    if( *MyID > 16)
        MyCell[0] = 0x01;          /* outline face */
    else
        MyCell[0] = 0x02;          /* solid face */
    MyCell[1] =  *MyID & 0x0F;     /* force black background */

    for( ;; )
    {
        /* Wait for display to be available, then lock it. */
        DosSemWait( &ScreenLock, SEM_INDEFINITE_WAIT );
        DosSemSet( &ScreenLock );

        /* If we still occupy the old screen position, blank it out. */
        VioReadCellStr( CurrentCell, &CellLen, yOld, xOld, 0 );
        if ( CurrentCell[0] == MyCell[0] && CurrentCell[1] == MyCell[1] )
            VioWrtCellStr( BlankCell, CellLen, yOld, xOld, 0 );

        /* Draw new face, then clear screen lock */
        VioWrtCellStr( MyCell, CellLen, Coords.yLoc, Coords.xLoc, 0 );
        DosSemClear( &ScreenLock );

        /* Increment the coordinates for next placement of the block. */
        xOld = Coords.xLoc;
        yOld = Coords.yLoc;
        Coords.xLoc += Coords.xInc;
        Coords.yLoc += Coords.yInc;

        /* If we are about to go off the screen, reverse direction */
        if( Coords.xLoc < 0 || Coords.xLoc >= vmi.col )
        {
            Coords.xInc = -Coords.xInc;
            DosBeep( 400, 50 );
        }
        if( Coords.yLoc < 0 || Coords.yLoc >= vmi.row )
        {
            Coords.yInc = -Coords.yInc;
            DosBeep( 600, 50 );
        }

        /* Sleep to slow down screen update rate */
        DosSleep( 75L );
    }
}

void VioWrtCStr( char *pchString, unsigned usRow, unsigned usColumn )
{
    VioWrtCharStr( pchString, strlen( pchString ), usRow, usColumn, 0 );
}
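
The getrandom macro in the listing maps the result of rand into an inclusive
range. The stand-alone sketch below repeats the macro and wraps it in a
hypothetical helper, draw_increment (not part of BOUNCE.C), to show the range
that BounceProc uses for its xInc and yInc values. Zero is a possible result,
so a thread can draw an increment that leaves it stationary along one axis.

```c
#include <stdlib.h>

/* Same expression as the getrandom macro in BOUNCE.C. */
#define getrandom( min, max ) \
    ((rand() % (int)(((max) + 1) - (min))) + (min))

/* Hypothetical helper: one update increment, drawn the way BounceProc
 * draws xInc and yInc. The result lies in the range -3 through 3.
 */
int draw_increment( void )
{
    return getrandom( -3, 3 );
}
```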

15.3  Writing a Multithread Program

When you write a program with multiple threads, you must coordinate their
behavior and use of the program's resources. You must also make sure that

Sharing Common Resources

Each thread has its own stack and its own copy of the CPU registers. Other
resources, such as files, static data, and heap memory, are shared by all
threads in the process. Threads using these common resources must coordinate
their work. OS/2 provides semaphores and the DosEnterCritSec and
DosExitCritSec system services for synchronizing resources.

Your program must provide for resource conflicts.

When multiple threads are accessing static data, your program must provide
for possible resource conflicts. Consider a program where one thread updates
a static data structure containing x,y coordinates for items to be displayed
by another thread. If the update thread alters the x coordinate and is
preempted before it can change the y coordinate, the display thread may be
scheduled before the y coordinate is updated. The item would be displayed at
the wrong location. You can avoid this type of problem by using semaphores.

Using semaphores is a way of communicating among threads or processes that
are executing asynchronously of one another. This communication is usually
used to coordinate the activities of multiple threads or processes,
typically by controlling access to a shared resource by "locking" and
"unlocking" the resource. To solve the x,y coordinate update problem
described above, the update thread would set a semaphore indicating that the
data structure is in use before performing the update. It would then clear
the semaphore when both coordinates had been processed. The display thread
must wait for the semaphore to be clear before updating the display. This
process of waiting for a semaphore is often called "blocking" on a semaphore
because the process is blocked and cannot continue until the semaphore
clears.
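
The locking pattern described above can be sketched outside OS/2 as follows.
This is an illustration only, under stated assumptions: it uses the C11
atomic_flag type (not available in Microsoft C 6.0) in place of an OS/2
semaphore so that it can be compiled anywhere, and the names coords,
update_item, and read_item are hypothetical. The point it shows is that both
coordinates are written, and both are read, only while the lock is held.

```c
#include <stdatomic.h>

/* Hypothetical shared structure: an item position read by a display
 * thread and written by an update thread.
 */
struct coords { int x, y; };

static struct coords item;                          /* shared data       */
static atomic_flag   item_lock = ATOMIC_FLAG_INIT;  /* clear = available */

/* Acquire the lock. atomic_flag_test_and_set sets the flag and reports
 * its previous state in one indivisible step, so only one caller at a
 * time can observe "was clear".
 */
static void lock_item( void )
{
    while( atomic_flag_test_and_set( &item_lock ) )
        ;                                   /* spin until released */
}

static void unlock_item( void )
{
    atomic_flag_clear( &item_lock );
}

/* Update thread: both coordinates change while the lock is held. */
void update_item( int x, int y )
{
    lock_item();
    item.x = x;
    item.y = y;
    unlock_item();
}

/* Display thread: both coordinates are copied while the lock is held. */
struct coords read_item( void )
{
    struct coords copy;

    lock_item();
    copy = item;
    unlock_item();
    return copy;
}
```

Unlike this sketch, a thread waiting on an OS/2 semaphore is blocked by the
scheduler rather than spinning. The sketch also tests and sets the lock in a
single step; a wait followed by a separate set leaves a window in which two
threads can both find the resource free.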

RAM semaphores are faster than system semaphores.

OS/2 supports two types of semaphores: system and RAM semaphores. You must
use a system semaphore if more than one process needs to access the
semaphore. You can use the much faster RAM semaphores if their use is
confined to the threads within a process.

The BOUNCE.C program in Section 15.2 uses a RAM semaphore named ScreenLock
to coordinate screen updates. Each time one of the display threads is ready
to write to the screen, it calls DosSemWait with a pointer to ScreenLock
and constant SEM_INDEFINITE_WAIT to indicate that the DosSemWait call should
block on the semaphore and not time out. If the ScreenLock semaphore is
clear, the wait function returns immediately. Otherwise, the thread blocks
until the semaphore clears. When the thread receives control again, it calls
DosSemSet to set the ScreenLock semaphore so other threads cannot interfere
with the display. When the thread completes the display update, it releases
the semaphore by calling DosSemClear.

The ShutDown routine in BOUNCE.C is called from both the keyboard thread
and the signal handler. The routine uses the ScreenLock semaphore to make
sure other threads do not write to the screen after the screen has been
cleared.

Screen displays and static data are only two of the resources requiring
careful management. For example, your program may have multiple threads
accessing the same file. Since another thread may have moved the file
pointer, each thread must reset the file pointer before reading or writing.
In addition, each thread must make sure that it is not preempted between the
time it positions the pointer and the time it accesses the file. These
threads should use a semaphore to coordinate access to the file by
bracketing each file access with DosSemRequest and DosSemClear calls. The
following code fragment illustrates this technique:

HSEM    hsemIOSem;

DosSemRequest( hsemIOSem, SEM_INDEFINITE_WAIT );
fseek( fp, desired_position, SEEK_SET );
fwrite( data, sizeof( data ), 1, fp );
DosSemClear( hsemIOSem );
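
The position-then-write pair inside the semaphore bracket can be packaged as
one helper so that no caller writes without first resetting the file pointer.
The sketch below is one way to do that; write_at is a hypothetical name, and
the helper does no locking of its own. The caller is still assumed to hold
the semaphore for the duration of the call, as in the fragment above.

```c
#include <stdio.h>

/* Hypothetical helper: position the file pointer and write len bytes.
 * Call only while holding the I/O semaphore, so no other thread can
 * move the file pointer between the seek and the write.
 * Returns 0 on success, -1 if either step fails.
 */
int write_at( FILE *fp, long pos, const void *data, size_t len )
{
    if( fseek( fp, pos, SEEK_SET ) != 0 )
        return -1;
    if( fwrite( data, 1, len, fp ) != len )
        return -1;
    return 0;
}
```

Checking both return values matters here because a failed seek would
otherwise let the write land wherever another thread last left the pointer.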

Stack checking is performed for each thread.

All of an application's default stack space is allocated to the first thread
of execution, which is known as thread 1. As a result, you must allocate
memory to provide a separate stack for each additional thread your program
needs. You must do this before creating the thread. Stack checking, if
enabled, is performed for each thread. The keyboard thread in BOUNCE.C calls
the malloc function each time the user wants to start a new display thread.
If the allocation is successful, the _beginthread function is called. The
first argument in the _beginthread call is a pointer to the BounceProc
function, which will execute the threads. The last argument is an ID number
that is passed to BounceProc. BounceProc uses the ID number to seed the
random number generator and to select the thread's color attribute and
display character.

Threads that make calls to the C run-time library or to the OS/2 API must
allow sufficient stack space for the library and API functions they call.
The C printf function requires more than 500 bytes of stack space, and you
should have 2K of stack space available when calling OS/2 API routines. To
be safe, allocate at least 4K for each thread's stack.
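
The 4K guideline can be enforced at the point where the stack is allocated.
The following is a minimal sketch, assuming a hypothetical helper named
alloc_thread_stack; it is not taken from BOUNCE.C. The pointer and size it
produces would then be supplied as the stack arguments of the _beginthread
call.

```c
#include <stdlib.h>

#define MIN_THREAD_STACK 4096u      /* 4K minimum suggested in the text */

/* Hypothetical helper: allocate a thread stack of at least
 * MIN_THREAD_STACK bytes. Stores the size actually allocated in
 * *actual (0 on failure) and returns the block, or NULL on failure.
 */
void *alloc_thread_stack( unsigned requested, unsigned *actual )
{
    unsigned size = requested < MIN_THREAD_STACK ? MIN_THREAD_STACK
                                                 : requested;
    void *stack = malloc( size );

    *actual = stack ? size : 0;
    return stack;
}
```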

Use as little static data as possible.

Since each thread has its own stack, you can avoid potential collisions over
data items by using as little static data as possible. Design your program
to use automatic stack variables for all data that can be private to a
thread. The only global variables in the BOUNCE.C program are either RAM
semaphores or variables that never change once they are initialized.
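
The difference between static and automatic data can be seen in a small
example. Both functions below are hypothetical and are not part of BOUNCE.C.
The first keeps its result in a static buffer that every thread would share;
the second writes into a buffer supplied by the caller, which would normally
be an automatic variable on the calling thread's own stack. The sketch uses
snprintf, which belongs to later C standards than Microsoft C 6.0.

```c
#include <stdio.h>

/* Shared state: every caller gets a pointer to the same static buffer,
 * so a second call overwrites the result of the first.
 */
char *label_static( int id )
{
    static char buf[24];

    snprintf( buf, sizeof buf, "thread %d", id );
    return buf;
}

/* Private state: the caller supplies the buffer, so two threads
 * formatting labels at the same time cannot collide.
 */
char *label_private( int id, char *buf, size_t len )
{
    snprintf( buf, len, "thread %d", id );
    return buf;
}
```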

Signal Handling

Signals are events that interrupt the normal flow of your program's
execution. They are similar to hardware interrupts, but they come from the
operating system or other programs and occur asynchronously. If you do not
provide your own routines, OS/2 will take the default action for each
signal, such as cancelling your program when the user enters CTRL+BREAK. You
can install your own signal handler with the OS/2 API function
DosSetSigHandler.

────────────────────────────────────────────────────────────────────────────
WARNING

The C run-time function signal is not supported in the multithread library
LLIBCMT.LIB.
────────────────────────────────────────────────────────────────────────────

When a signal occurs, OS/2 always suspends thread 1 and gives control to the
signal handler, if installed. As a result, thread 1 must not be executing C
run-time library code when the signal handler gets control or a potential
deadlock condition can occur. In addition, the signal handler must not call
C run-time library functions. Consider the following sequence of events:

1.  Thread 2 is executing printf when the user interrupts it by pressing
CTRL+C. The program has designated a CTRL+C signal handler, so OS/2
immediately transfers control to the signal handler in thread 1.

2.  The signal handler in thread 1 tries to execute the statement:

printf( "^C: Do you want to quit?" );

3.  The printf call in thread 2 has already locked output to the console,
so thread 1's printf must wait for release of that lock.

4.  The thread 2 printf function never regains control because the signal
handler must complete before other processing can continue. As a
result, it is never able to release the lock on console output.

If a situation like this happens, the program will wait indefinitely for
resolution of the two mutually exclusive conditions.

A multithread C program can process signals if it adheres to the following
restrictions:

■   Thread 1 must be dedicated to signal handling and must not call the C
run-time library once it identifies the signal handler to OS/2 using
the API function DosSetSigHandler. When the signal handler gets
control, it should set a semaphore or flag so other threads in the
program can determine that the signal has occurred and is being
processed.

■   The other threads in the process must check the status of semaphores
set by thread 1 and respond accordingly.
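
The division of labor in these restrictions, a handler that only records the
event and other code that polls for it, can be sketched in portable C. Note
the assumption: this sketch uses the ISO C signal function only so that it
can be run outside OS/2. As the warning above states, signal is not supported
in LLIBCMT.LIB; in a multithread OS/2 program the handler is installed with
DosSetSigHandler, and a semaphore such as RunFlag in BOUNCE.C plays the role
of the flag. All function names here are hypothetical.

```c
#include <signal.h>

/* Set by the handler, polled by the worker code. */
static volatile sig_atomic_t stop_requested = 0;

/* The handler records the event and returns. It calls no run-time
 * library functions, so it cannot wait on a lock held elsewhere.
 */
static void on_interrupt( int signum )
{
    (void)signum;
    stop_requested = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_interrupt_handler( void )
{
    return signal( SIGINT, on_interrupt ) == SIG_ERR ? -1 : 0;
}

/* Worker code calls this between units of work and shuts down
 * in an orderly way when it returns nonzero.
 */
int should_stop( void )
{
    return stop_requested != 0;
}
```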

The BOUNCE.C sample program waits until thread 2, the keyboard handler,
starts before installing the signal handler. It then dedicates thread 1 to
signal handling by having the thread wait for a semaphore. Thread 1 blocks
until either the keyboard thread or the signal handler clears the semaphore.
It then calls _endthread to terminate the process, including all the other

15.4  Compiling and Linking

The steps for compiling and linking the multithread program BOUNCE.C are
given below:

1.  Ensure that the files LLIBCMT.LIB and OS2.LIB are in the directory
specified in your LIB environment variable.

The file LLIBCMT.LIB takes the place of the regular C run-time library
files. The file OS2.LIB provides support for OS/2 system calls made in
the program, such as KbdCharIn.

2.  Compile and link the program with the CL command-line option /MT.

The /Lp option instructs the compiler to create a protected-mode
application. The /MT option implies the large memory model with a
separate stack segment (/ALw). The multithread library functions have
their own data segment but use the caller's stack. This option also
sets the library search record to LLIBCMT.LIB and sets the MT symbolic
constant for the multithread versions of the include files. The /link
GRTEXTP option instructs the linker to search GRTEXTP.LIB, the
character-graphics library for protected mode.

To compile and link in a single step, use this CL command line:

CL /Lp /MT BOUNCE.C /link grtextp

For separate compile and link steps, you invoke the compiler and the
linker with this code:

CL /c /Lp /MT BOUNCE.C

3.  If you choose not to use the /MT option, you must take these steps:

■   Ensure that the special multithread include file support is enabled.

■   Use the /Aw option. This is required because the functions in
LLIBCMT.LIB have their own data segment but use the caller's
stack. The /Aw option specifies a segment setup of SS not equal to
DS with DS not reloaded on function entry.

■   Make sure that only far pointers are passed to library functions.

■   Make sure that all variables are either passed by value or cast to
a far address (the large memory model).

■   Specify the multithread library and suppress default library
selection.

The multithread include files are used when you define the symbolic
constant MT. You can do this with the CL command line option /D MT or
within the C source file before any include statements, as shown
below:

#define MT
#include <stdlib.h>

To compile and link in a single step with the default libraries
suppressed, this is the complete CL command line:

CL /Lp /ALw /Zl /D MT BOUNCE.C /link LLIBCMT+OS2

To perform a two-step compile and link with the default libraries
suppressed in the link step, use these commands:

CL /c /Lp /ALw /D MT BOUNCE.C

4.  Run the program under OS/2.

15.5  Avoiding Problem Areas

There are several problems you can encounter in creating, linking, or
executing a multithread C program. Some of the more common ones are
described here.

Problem                           Probable Cause
────────────────────────────────────────────────────────────────────────────
LINK searches for mLIBCf.LIB.     If you omit the /NOD option from the
                                  LINK command line, the linker searches
                                  for the default library. The default
                                  library should not be used with
                                  multithread programs. The /NOD option
                                  tells the linker not to search the
                                  default libraries. This problem can also
                                  be avoided by compiling with the /Zl
                                  option, which suppresses default library
                                  search records in the object files.

You get error SYS1943. A program  Many OS/2 programming errors cause
caused a protection violation.    protection violations. A common cause of
                                  protection violations is the indirect
                                  assignment of data to null pointers.
                                  This results in your program trying to
                                  access memory that does not "belong" to
                                  it, so a protection violation is issued.
                                  Protection violations also occur if your
                                  program gets a memory buffer from the
                                  operating system and then tries to read
                                  or write past the end of the buffer.
                                  Another cause of this error is failing
                                  to specify the condition "SS is not
                                  equal to DS" in the CL command
                                  invocation. Specify the correct
                                  conditions with the /ALw memory model
                                  option.

                                  An easy way to detect the cause of a
                                  protection violation is to compile your
                                  program with CodeView information, then
                                  run it in CodeView. When the protection
                                  fault occurs, OS/2 will transfer control
                                  to CodeView, and the cursor will be
                                  positioned on the line that caused the
                                  problem. See Chapter 9, "Debugging C
                                  Programs with CodeView," for more
                                  information about the CodeView debugger.

Your program generates numerous   If you attempt to compile and link a
                                  program without defining the symbolic
                                  constant MT, many of the definitions
                                  required for the multithread library
                                  will be missing. Define MT on the CL
                                  command line with /MT or /D MT, or use
                                  #define MT in your program.

You can eliminate many potential problems by setting the compiler's warning
level to one of its highest values and heeding the warning messages. By
using the /W3 or /W4 warning level options, you can detect unintentional
data conversions, missing function prototypes, and use of non-ANSI features.
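
Two of the causes of protection violations described in this section, writing
through a null pointer and writing past the end of a buffer, can also be
guarded against in the code itself. The following is a minimal sketch;
fill_buffer is a hypothetical helper used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill count bytes of buf, whose capacity is cap.
 * Returns 0 on success, -1 if the write would be invalid.
 */
int fill_buffer( char *buf, size_t cap, size_t count, char value )
{
    if( buf == NULL )           /* would write through a null pointer */
        return -1;
    if( count > cap )           /* would write past the end of buf    */
        return -1;
    memset( buf, value, count );
    return 0;
}
```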

15.6  Using the Protected-Mode CodeView Debugger

The protected-mode version of CodeView (CVP) has special commands for
debugging multiple processes and threads. It adds Thread and Process items
to the standard Run Menu. Your CONFIG.SYS file must specify IOPL=YES for
protected-mode CodeView to run.

To enable multiple process debugging, invoke CodeView with the /O
(offspring) option. Selecting the Process item from the Run Menu brings up a
list box of child processes associated with the parent process. You choose
the process to be debugged by selecting it with the list box. The Process
item will be grey (unselectable) if you did not specify the /O option. The
/O option applies only to debugging multiple processes. You do not need to
use it to debug multiple threads.

Selecting the Thread item from the Run Menu produces a list box showing the
status of each thread associated with the current process. You can use the
list box to designate a different current thread or to change a thread's
status. There are equivalent keyboard commands for each option.

15.6.1  Compiling with the /Zi Option

The compiler option /Zi causes the compiler to include symbolic information
and line numbers in the object file for debugging with CodeView. If you run
LINK in a separate step, you must invoke it with the /CODEVIEW option, which
can be abbreviated as /CO. To compile and link the sample program BOUNCE.C
in a single step, enter this code:

CL /MT /Zi BOUNCE.C

The following commands are for a two-step compile and link:

CL /c /MT /Zi BOUNCE.C

15.6.2  Prompt for Thread Number

When you debug a protected-mode program with CodeView, the command prompt is
preceded by a three-digit number indicating the current thread. Thread 1 is
always the current thread when you start a program. The prompt appears as

001>

Protected-mode CodeView (CVP) has special commands to control the execution
of threads. The CodeView Thread commands are accessed using the Thread
tilde character (~). Thread commands specify which thread(s) the command
applies to, followed by the command. The syntax of the dialog version of the

~«specifier«command»»

Entering the tilde character by itself displays the status of all threads.
Enter the tilde and a specifier to see the status of particular threads.
Legal values for the specifier field are listed below:

Specifier                         Function
────────────────────────────────────────────────────────────────────────────
(blank)                           Displays the status of all threads

#                                 Specifies the last thread that executed

.                                 Specifies the current thread

*                                 Specifies all threads

n                                 Specifies the number of an existing
                                  thread

The optional command field controls the way specified threads are executed.
If it is omitted, status is displayed, but thread activity is not affected.
Thread commands are summarized below, followed by examples. For more
information about command execution and about how other threads in the
process may be affected, consult on-line help.

Command                           Function
────────────────────────────────────────────────────────────────────────────
(blank)                           Display status

BP                                Set a breakpoint (used with the normal
Breakpoint Set command syntax)

E                                 Execute in slow motion

F                                 Freeze the thread(s)

G                                 Pass control to a thread

P                                 Execute a program step

S                                 Select specified thread as the current
                                  thread

T                                 Trace a thread

U                                 Unfreeze the thread(s)

Controlling a Thread Being Debugged

If your program has multiple threads using the same functions, you may want
to monitor the behavior of one particular thread. The standard Breakpoint
Set command will affect every thread. The thread Breakpoint Set command lets
you limit the breakpoint to one or more threads. The sample program BOUNCE.C
has multiple threads executing the function BounceProc. This function erases
the symbol at the thread's current screen position, writes it to a new
location, computes the display coordinates to be used the next time the
thread receives control, and then sleeps to slow down the rate at which the
display is updated.

Since thread-specific breakpoints can only be set for threads that are
already running, you can set a breakpoint that will be executed after the
target thread starts. In BOUNCE.C, the source line in thread 2 that tests
each character received from the keyboard is a good location for such a
breakpoint (line 113). Since thread 2 is not active when the program begins,
you must first set a breakpoint in thread 1 after it has started thread 2
(line 73). The first breakpoint can be set by conventional methods or by
using the thread breakpoint command:

001>~1BP .73

Once you have reached the first breakpoint, you can set the keyboard test
breakpoint for thread 2:

001>~2BP .113

The BOUNCE.C program starts a new thread each time the letter 'a' is typed.
('A' is also accepted.) Once you have started the desired number of threads,
you can trigger the thread 2 breakpoint without starting a new thread by
pressing another key, such as the space bar. When you reach the breakpoint
in thread 2, you can set breakpoints for the other threads. To set a
breakpoint in thread 3's BounceProc function immediately after it has
updated the screen (source line 168), enter this code:

001>~3BP .168

When this breakpoint is reached, the CodeView prompt will reflect the

003>

You can then set other breakpoints for the thread, execute it in slow motion
without any other threads running in the background, or enter other CodeView
commands, such as Breakpoint Clear.

Freezing and Unfreezing Threads

Frozen threads do not execute.

It can be useful to freeze one or more threads so they don't interfere with
execution of a thread you are debugging. In the BOUNCE.C program, for
example, you can monitor the path of a single bouncing ball by freezing all
but one of the bounce threads. Frozen threads will not be scheduled for
execution.

If you have a large number of threads running, you can freeze all of them in
a single command and then unfreeze the threads you want to monitor. Unfrozen
threads continue to operate normally and will execute any breakpoints they
encounter. The following example freezes all threads, enables threads 1 and
4, and then checks the status of all threads:

001>~*F
001>~1U
001>~4U
001>~

If thread 1 is waiting for a semaphore when the status command is invoked,
the report shows the following:

001 Blocked
002 Frozen
003 Frozen
004 Runnable

Switching to a Particular Thread

The S (select) and E (execute) variations of the Thread command can be used
to switch the current thread. However, when another thread causes the
program to stop by hitting a breakpoint, the debugger will select the thread
that encountered the breakpoint as the current thread.

If you include ~.S in the breakpoint command, CodeView stops the thread that
encounters the breakpoint, then immediately switches back to the current
thread. The following example selects thread 4, sets a breakpoint at line
168 in thread 3, and switches to thread 4 when the breakpoint is hit:

001>~4S
001>~3BP .168 "~.S"
001>G

15.6.4  Screen Groups Used by CodeView

Only one CodeView session at a time is supported in protected mode. You
cannot run multiple copies in concurrent screen groups.

The View Output Screen command ( \ ) works differently in protected mode and
in real mode. In protected mode, your application's output will be displayed
for three seconds. The display will then revert to the CodeView display. To
view the output window for a longer period, specify a different delay
interval, measured in seconds, as follows:

\10

Chapter 16  Dynamic Linking with OS/2
────────────────────────────────────────────────────────────────────────────

An OS/2 dynamic-link library (DLL) is an executable file containing
functions that are available to other programs. In a statically linked
program, you link the program with all its component functions when you
build the executable file. In a dynamically linked program, the
program-build step does not link all of the code. Instead, OS/2 links calls
to functions in dynamic-link libraries at program load time or while the
program is running. The DLL code and data become part of the address space
of each program, even when the DLL is being accessed by several application
programs.

This chapter describes how to build your own dynamic-link libraries and how
to build programs that use them.

16.1  Overview of Dynamic Linking

Dynamic linking is the process of resolving external calls when a program
runs, instead of at link time. It offers several benefits:

■   Multiple programs can use the same dynamic-link library
simultaneously. Since only one copy of the DLL is in memory, there are
fewer demands for physical memory and swap space.

■   Updates to dynamic-link libraries do not affect the programs that use
them, since the only connection between DLLs and application programs
is the function-calling sequence.

■   Application programs require less disk space and memory, since their
executable program files contain the names of DLL functions but not
the code for the functions.

■   Dynamic-link libraries can call other dynamic-link libraries.

■   DLLs can extend the OS/2 operating system to provide new or improved
system services. This is possible because most of OS/2 consists of a
set of dynamic-link libraries.

Dynamic linking can take place both at program load time and while the
program is running. A program can call functions in more than one DLL and

For load-time dynamic linking, build a program that calls DLL functions by
name.

The linker creates special records containing the name of each DLL
subroutine and the name of its DLL file. It does not put any DLL code into
the program's executable file. At load time, OS/2 dynamically links the
program and its DLLs. It brings the program and the DLLs into memory and
updates the program's DLL calls with the address of each DLL routine. If a
DLL is already in memory, it is not reloaded.

With run-time dynamic linking, the program creates the DLL file name and
subroutine names during execution. The program then passes these names to
OS/2 so the operating system can load the dynamic-link library.

An example of a run-time dynamic link is an extension to the Programmer's
WorkBench (PWB). PWB has no information about which extensions it needs
until it reads the initialization file, TOOLS.INI. PWB then sends requests
to OS/2 to demand-load the DLLs that it needs.

16.1.2  Application Programs and DLLs

With static linking, all library code is bound into the executable program
when you link the program. If the library changes, all programs using the
library must be relinked. With the exception of some Microsoft Windows
programs, all DOS programs use static linking.

Updates to parts of a program are easier to deliver using DLLs.

You can create loosely coupled applications and DLLs and modify the DLLs
without relinking the program. For example, if your product has an
underlying database access mechanism, you can package the database access
routines into a DLL. You can then ship improvements or changes to the
database code in a new dynamic-link library. The executable files for the
program do not have to be relinked or redistributed.

The programs calling a DLL are known as the DLL's "clients."

16.1.3  DLLs and Microsoft C Run-Time Libraries

You can construct three types of dynamic-link libraries with the Microsoft C
Professional Development System. All of them can be multithreaded and can
support more than one client at a time. The three types are:

■   A stand-alone dynamic-link library that includes both your routines
and code for the Microsoft C run-time library functions used by your
DLL. This type of DLL is self-contained and completely independent of
the programs that call it.

■   A dynamic-link library that does not use any functions from the
Microsoft C run-time library. This type of DLL is also self-contained.

■   A private dynamic-link library that consists only of selected
functions from the Microsoft C run-time library. This DLL is usually
specific to one program or a closely tied group of programs.
Application programs and dynamic-link libraries using this DLL do not
contain any code for the C run-time library functions.

between the various types of DLLs.

Stand-alone DLLs include C run-time functions.

If you want to call C run-time library functions in your DLL, you can
include the functions you need. These run-time functions are statically
linked in the DLL and the DLL does not rely on the client or any other DLL
for run-time support.

Figure 16.1 illustrates the relationships between this type of DLL, an
application program, and C run-time library functions. Both the application
program and the dynamic-link library have their own copies of functions from
the C run-time library. This ensures that

■   The DLL always has access to the C run-time library routines it needs.

■   The DLL is not dependent on the calling application for any support
code.

■   The programs using the DLL do not depend on the DLL for C run-time
library functions.

Section 16.3.1, "DLLs with Static C Run-Time Library Functions," describes
the steps involved in creating this type of dynamic-link library using the
special library LLIBCDLL.LIB.

(This figure may be found in the printed book.)

DLLs without C Run-Time Library Functions

You can write a dynamic-link library in C without calling any functions from
the C run-time library. Section 16.3.2, "DLLs without C Run-Time Library
Functions," shows how to set up this type of DLL. These DLLs contain only
your code and require no run-time library support; they make no calls to
run-time library functions.

Private C Run-Time DLLs

You can create a custom C run-time DLL.

A C run-time DLL can be shared by multiple programs and their DLLs. You
generate the C run-time DLL in two steps. The first builds a
module-definition file with a list of the C run-time library functions
needed by your application and its DLLs; the second step links the
module-definition file with the special library CDLLOBJS.LIB to create a C
run-time DLL.

The executable files for programs and DLLs linked with a customized C
run-time DLL do not contain any code for the C run-time library functions.
Figure 16.2 shows the relationships of the components.

A private C run-time DLL must be closely tied to its programs and
associated DLLs.

Processes and DLLs that share a private run-time DLL share environment
strings and global C run-time data (for example, file pointers for buffered
I/O and memory allocated with the malloc function). Therefore, the program
and the DLLs must cooperate on the use of this data.

(This figure may be found in the printed book.)

A closely tied structure is suitable for a complex application consisting of
a set of application programs that act as front-end processors to several
DLLs. A word processor, for example, might support one user interface for
beginners, another for intermediate users, and a third for expert users. The
different user interfaces could be implemented in three separate executable
program files. All three programs would share the DLLs that do most of the
real work.

Section 16.3.3, "Programs and DLLs with a C Run-Time DLL," describes the
procedures for building a C run-time library DLL and its associated programs

16.2  Designing and Writing DLLs

Before you write a DLL, you must determine some of the DLL's requirements.
You need to know

■   Floating-point math requirements

■   Special initialization requirements such as allocation of buffers or
registration of special termination routines

■   Termination requirements such as clearing semaphores or releasing
allocated memory

■   Re-entrancy requirements; if the DLL is to be called by more than one
process, it must be re-entrant

This section explains how to design a DLL to take these requirements into
account.

16.2.1  Floating-Point Math Requirements

Stand-alone DLLs built with the LLIBCDLL library are independent of the
programs calling them. They are "black boxes" that must operate without
knowing anything about their client programs and without interfering with
their clients.

One area of potential conflict for stand-alone DLLs is control of the 80x87
math coprocessor. For a DLL to use the 80x87 coprocessor or the emulator
floating-point library, the DLL and all of its client programs must agree on
which process is going to handle floating-point exceptions and on which
process is going to handle emulation if the machine does not have a
coprocessor.

Floating-point emulation is not possible with a genuinely independent DLL. A
stand-alone DLL must use the alternate math library, which ignores the math
coprocessor chip. The alternate math library provides the fastest processing
available without a coprocessor, but results are not as accurate as those
produced by the emulator floating-point library. Because the constraint
applies only to the DLL and not to applications, clients of a stand-alone
DLL can use any floating-point model. Since the DLL uses the alternate math
library, it does not conflict with clients over control of the math
coprocessor.

In contrast, DLLs and programs using a private C run-time DLL are tightly
coupled. This means that the floating-point math option is known when the
program is built. Because these programs and DLLs all use the same C run-time
functions (unlike the stand-alone DLL and its clients, which may incorporate
different C run-time libraries), no contention can arise over control of the
math coprocessor. The same floating-point math library is used for the
entire application.

The only way to use a math coprocessor within a DLL is with a private C
run-time DLL.

A private C run-time DLL uses the CDLLOBJS library and the emulator
floating-point package. The emulator uses the 80x87 math coprocessor if one
is installed; otherwise, it emulates the coprocessor. Floating-point
emulation produces the most accurate results. There is no conflict over use
of the coprocessor, since the C run-time DLL performs all floating-point
math. The programs and DLLs calling the C run-time DLL do not have any C
run-time library code of their own.

16.2.2  Initialization and Termination Requirements

When you design a DLL, you must decide if it has special initialization or
termination requirements. If the DLL needs to initialize variables or
allocate memory buffers when it starts, it needs custom start-up procedures.
If the DLL acquires system resources for a client program, the resources
must be released when the program completes its processing.

Initialization

All DLLs built with the Microsoft C run-time libraries must use per-process
initialization to set up the C run-time data. Per-process initialization
(also known as instance initialization) means that OS/2 calls the DLL's
initialization code each time it loads a program linked with the DLL. For
most DLLs, the default initialization routine is sufficient, and you do not
need to take any other measures.

If your DLL has special requirements, you must provide additional start-up
processing.

The C run-time library initialization function is called each time a new
client is attached to the DLL. To override the default initialization, you
must link your DLL with one of the following object modules, which are
provided with the Microsoft C Professional Development System:

File Name                         Description
────────────────────────────────────────────────────────────────────────────
DLLINIT.OBJ                       Initialization module for DLLs built
                                  with LLIBCDLL.LIB and using C run-time
                                  library code

CRTDLL_I.OBJ                      Initialization module for DLLs using a C
                                  run-time DLL built with CDLLOBJS.LIB
                                  (replaces CRTDLL.OBJ)

In addition, you must declare an entry point for your own DLL initialization
function. Your function, or the application program calling your DLL, must
initialize the C run-time data by calling the library function C_INIT before
any other C run-time library functions are called.

The prototype for C_INIT is

void _far _pascal C_INIT( void );

Designate your initialization function as the DLL's starting point.

To have your custom function recognized as the DLL's default initialization
routine, it must be the starting point for the DLL. This requires an
assembly language file with an END statement naming your function. The
sample file, SETENTRY.ASM, in the following example shows the minimum
assembler code required for specifying a C language function named
SampleInit as the DLL's entry point.

; SETENTRY.ASM
extrn _SampleInit:FAR         ;name of C start-up routine
end   _SampleInit

The following example, SAMPLE.C, shows a simple custom initialization
routine that maintains a count of how many clients it is currently serving.
Since this example overrides the default dynamic-link library
initialization, it must return a nonzero status code to OS/2 to show a
successful start-up. If a DLL initialization function returns a status of 0,
OS/2 will not load the program using the DLL.

/*                    SAMPLE.C                    */
void _far _pascal C_INIT( void );
int  UserCount = 0;

int _export _loadds SampleInit()
{
    UserCount++;        /* increment number of users  */
    C_INIT();           /* initialize C run-time data */
    return( 1 );        /* indicate successful start  */
}

/* code for other DLL functions belongs here */
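
The return value matters most when custom start-up work can fail. The following is a minimal sketch, not part of the product samples, of an initialization routine that allocates a buffer and reports failure to OS/2. The names BufferInit, work_buf, and WORK_SIZE are hypothetical, and C_INIT is replaced by a stand-in so that the fragment compiles outside OS/2.

```c
#include <stdlib.h>

/* Stand-in for the library routine so the sketch compiles anywhere;
   the real C_INIT is declared void _far _pascal C_INIT( void ). */
static int crt_ready = 0;
static void C_INIT( void ) { crt_ready = 1; }

#define WORK_SIZE 512           /* hypothetical buffer size       */
static char *work_buf = NULL;   /* hypothetical per-client buffer */

/* Hypothetical initialization routine: 1 = success, 0 = failure. */
int BufferInit( void )
{
    C_INIT();                   /* must come before malloc */
    work_buf = (char *)malloc( WORK_SIZE );
    if ( work_buf == NULL )
        return 0;               /* OS/2 will not load the client */
    return 1;
}
```

In a real DLL the function would carry the _export and _loadds keywords and would be named in the END statement of the entry-point assembly file, as SampleInit is above.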

All DLLs must be linked with a module-definition file that contains a
LIBRARY statement, such as the following:

LIBRARY SAMPLE INITINSTANCE

The following commands will create object files from the sample files and
link them with DLLINIT.OBJ to make a stand-alone dynamic-link library named
SAMPLE.DLL. The /ML compile option, explained in Section 16.2.6, "Compile
Options for Dynamic-Link Libraries," sets the library search record to
LLIBCDLL.LIB.

MASM /Mx SETENTRY;
CL /c /Gs /ML SAMPLE.C

────────────────────────────────────────────────────────────────────────────
WARNING

For DLLs linked with Microsoft C run-time libraries, the LIBRARY statement
in the DLL's module-definition file must specify INITINSTANCE in the
initialization field. If you omit this, the initialization routine is called
only when the DLL is loaded into memory for the first client program, and
the DLL will not function properly if it is called by additional programs.
────────────────────────────────────────────────────────────────────────────

Termination

You may have to clean up before terminating.

You may need to know when an application using your DLL is finished. If your
DLL has created buffers, semaphores, or other resources for a particular
application, they must be released when the application terminates.

You can have an initialization routine in your DLL that calls the OS/2 API
function DosExitList to register one or more exit subroutines for your DLL.
OS/2 will call the exit routines when the client program finishes. The exit
functions should free any resources your DLL acquired for the client
program.

DLLs built with LLIBCDLL.LIB have a default termination routine.

The start-up routine for dynamic-link libraries built with the LLIBCDLL
library calls DosExitList with a pointer to a default termination function.
To replace the default processing with your own function, link the module
DLLTERM.OBJ into the DLL. This suppresses the call to DosExitList. During
initialization, your DLL must register its own routine by calling
DosExitList unless you are sure the termination routine will be called
explicitly. The termination processing must include a call to the library
function C_TERM.

The prototype for C_TERM is

void _far _pascal C_TERM( void );

There is no equivalent to DLLTERM.OBJ and C_TERM for DLLs using a private C
run-time DLL built with the CDLLOBJS library. If special cleanup processing
is required, these DLLs must provide their own termination function. The
function is registered during initialization by calling either the C
run-time library function atexit or the OS/2 API function DosExitList.

Any DLL that calls DosExitList should also have a termination function.

DLLs that set exit lists must provide termination functions that can be
called by clients when they no longer need the DLL. If a program attaches
itself to the DLL at run-time (using DosLoadModule), it cannot disconnect
from the DLL as long as the exit list points to a function in the
dynamic-link library. The DLL's termination function can perform any
necessary cleanup and call DosExitList to remove itself from the exit list.
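
The pairing of an exit-list routine with a client-callable termination function might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the names SampleSetup, SampleExit, SampleTerm, and buffers_held are hypothetical, and DosExitList is replaced by a simplified stand-in so that the fragment compiles outside OS/2. In a real DLL the declaration and the EXLST_ constants come from the OS/2 header files.

```c
/* Stand-ins so the sketch compiles outside OS/2. In a real DLL,
   include <os2.h> (with INCL_DOSPROCESS defined) and delete these. */
#define EXLST_ADD    1
#define EXLST_REMOVE 2
#define EXLST_EXIT   3
typedef void (*exit_fn)( unsigned short );
static exit_fn registered = 0;
static unsigned short DosExitList( unsigned short code, exit_fn fn )
{
    if ( code == EXLST_ADD )
        registered = fn;
    if ( code == EXLST_REMOVE && registered == fn )
        registered = 0;
    return 0;               /* EXLST_EXIT only matters to real OS/2 */
}

static int buffers_held = 0;    /* hypothetical per-client resource */

/* Runs when the client process ends without calling SampleTerm. */
void SampleExit( unsigned short term_code )
{
    (void)term_code;
    buffers_held = 0;               /* release what the DLL acquired */
    DosExitList( EXLST_EXIT, 0 );   /* pass control to the next routine */
}

/* Called from the DLL's initialization routine. */
int SampleSetup( void )
{
    buffers_held = 1;
    return DosExitList( EXLST_ADD, SampleExit ) == 0;
}

/* Termination function for clients that release the DLL early. */
void SampleTerm( void )
{
    buffers_held = 0;
    DosExitList( EXLST_REMOVE, SampleExit );
}
```

A client that attached with DosLoadModule would call SampleTerm before releasing the module, so that the exit list no longer points into the DLL.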

────────────────────────────────────────────────────────────────────────────
NOTE
There is no special termination procedure for DLLs built with CDLLOBJS.LIB
because the C run-time termination code is called by the exit or _exit
functions. If the process is terminated by a critical error or DosExit, C
run-time termination does not occur.
────────────────────────────────────────────────────────────────────────────

16.2.3  Making the DLL Re-Entrant

Re-entrant code is code that can be shared by multiple programs in a
multitasking environment. DLLs that may be used by more than one program
must be re-entrant. To do this, they must isolate each client program's data
and resources. File handles belonging to one client, for example, must not
be used for other clients. Re-entrancy also means that the DLL cannot allow
itself to be switched to a different thread while it is performing certain
operations.

Global Versus Instance Data

A dynamic-link library can have separate data segments for each program
that calls it.

Separate data segments are known as "instance" data. With instance data
segments, the DLL does not have to keep track of which resources belong to
each client. OS/2 assigns a different data segment to each process calling
the DLL, even though the selectors are the same.

A dynamic-link library can also have a global data segment used for internal
purposes or to support all of the programs using its services.

A DLL providing time and date conversions might, for example, keep the
current date in a global storage area. The same DLL might provide functions
to compute elapsed time, such as the number of minutes between two clock
readings. If static variables are used by the elapsed time functions, they
should be in instance data segments, since the OS/2 scheduler might preempt
the function and schedule another thread that calls the same function with
different arguments before it has completed the first caller's task.
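
The hazard can be seen in a small portable sketch. The function names here are hypothetical and are not part of any Microsoft library. The first pair of functions keeps the start time in one static variable, which models a shared segment; the second pair keeps a separate copy for each caller, which is the effect an instance data segment gives a DLL's static variables.

```c
/* Variant A: one static slot. If the segment holding it is shared,
   every client overwrites the same start time. */
static long shared_start;

void start_timer_shared( long now )
{
    shared_start = now;
}

long elapsed_shared( long now )
{
    return now - shared_start;
}

/* Variant B: each caller owns its own state. */
typedef struct { long start; } timer_state;

void start_timer( timer_state *t, long now )
{
    t->start = now;
}

long elapsed( const timer_state *t, long now )
{
    return now - t->start;
}
```

If client A starts its timer at minute 100 and client B starts at minute 250, a later call to elapsed_shared at minute 300 reports 50 minutes to both clients. With separate state, A receives 200 and B receives 50.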

Data sharing is controlled by DATA and SEGMENTS statements in a dynamic-link
library's module-definition file. By default, a DLL's automatic data segment
(the local stack and heap) is shared by all processes calling the DLL. You
can specify a unique automatic data segment for each client process by
specifying DATA MULTIPLE.

────────────────────────────────────────────────────────────────────────────
WARNING

DLLs built with the LLIBCDLL or CDLLOBJS C run-time libraries must use DATA
MULTIPLE in the module-definition file.
────────────────────────────────────────────────────────────────────────────

You can use SEGMENTS to specify attributes on a segment-by-segment basis.

Using the SEGMENTS statement allows you to have both global and per-process
(instance) data in the same DLL. The C run-time data segment must be
per-process. The following is an example of a C program fragment and
module-definition file that implement both instance and global data:

/* Define static data in the shared segment SHR_SEG */
int _based(_segname("SHR_SEG")) intvar;
char _based(_segname("SHR_SEG")) charvar;

In the module-definition file, define all data segments as nonshareable,
then override that default for SHR_SEG as follows:

DATA MULTIPLE NONSHARED
SEGMENTS
SHR_SEG     CLASS   'FAR_DATA'  SHARED

Global data segments are created when OS/2 brings the dynamic-link library
into memory for its first client process. All of the processes calling the
DLL share the same global variables.

Serializing Nonatomic References

An atomic operation is an operation that can be completed in one machine
language instruction. When writing a re-entrant procedure (in a multithread
program or in a DLL), you must ensure that changes to static or global data
are not preempted by the OS/2 scheduler before the update is complete. To
prevent this, you must explicitly serialize nonatomic references to static
or global data. The following code example is safe from preemption, because
incrementing an integer requires only one machine instruction:

int int_var;
_export _loadds void _far _pascal dynlink_proc( void )
{
    int_var++;
}

The following variation on the same function is not safe because
incrementing a long variable is not atomic; it requires two machine
instructions. Between incrementing the least-significant word and the
most-significant word, another thread could gain control of the processor.
If that thread executes code in your DLL that uses long_var, that data
would be in an indeterminate state.

long long_var;
_export _loadds void _far _pascal dynlink_proc( void )
{
    long_var++;
}
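
The indeterminate state can be modeled in portable C by splitting the variable into its two 16-bit words and performing the increment in two explicit steps. This is only a sketch of the idea; the type split_long and the three functions are hypothetical names introduced for illustration.

```c
/* Model of a 32-bit value stored as two 16-bit words. */
typedef struct {
    unsigned short lo;      /* least-significant word */
    unsigned short hi;      /* most-significant word  */
} split_long;

/* Step 1: increment the low word and report any carry. */
int inc_low( split_long *v )
{
    v->lo = (unsigned short)( v->lo + 1 );
    return v->lo == 0;      /* carry out of the low word */
}

/* Step 2: add the carry to the high word. */
void inc_high( split_long *v, int carry )
{
    v->hi = (unsigned short)( v->hi + carry );
}

/* What another thread would read if it ran at this moment. */
unsigned long observe( const split_long *v )
{
    return ( (unsigned long)v->hi << 16 ) | v->lo;
}
```

Starting from 0x0000FFFF, a thread that runs between the two steps observes 0x00000000 instead of either 0x0000FFFF or 0x00010000.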

Critical Code Sections

A critical code section is a section of code that manipulates a resource
(such as the long variable in the previous example) while blocking all other
threads. When your program enters a critical section, it cannot be preempted
until it performs a DosExitCritSec or until a signal is received. You don't
usually just alter the value of a variable; you alter it and then use it
later. In this case, you must isolate the smallest group of operations that
must occur without interruption. You define these sections with the
DosEnterCritSec and DosExitCritSec OS/2 API functions, as in the following
example:

_export _loadds void _far _pascal dynlink_proc( void )
{
    static int int_var;

    DosEnterCritSec();
    int_var += 7;
    SetLeftCorner( int_var, int_var );
    DosExitCritSec();

    /* Code that does not reference int_var */
}

Keep your critical sections as short as possible.

While in a critical section, all other threads in the process are blocked
from execution. Writing extremely long critical sections can make your
program inefficient and can degrade system performance.
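
Applied to the unsafe long increment shown earlier, a short critical section might look like the following sketch. DosEnterCritSec and DosExitCritSec are replaced here by stand-ins that only count nesting, so that the fragment compiles outside OS/2; crit_depth is a hypothetical variable used for that purpose. In a real DLL the declarations come from the OS/2 header files.

```c
/* Stand-ins so the sketch compiles outside OS/2. In a real DLL,
   include <os2.h> (with INCL_DOSPROCESS defined) and delete these. */
static int crit_depth = 0;
static unsigned short DosEnterCritSec( void ) { crit_depth++; return 0; }
static unsigned short DosExitCritSec( void )  { crit_depth--; return 0; }

static long long_var;

/* In the DLL this would carry _export _loadds and _far _pascal. */
void dynlink_proc( void )
{
    DosEnterCritSec();      /* other threads in this process are held */
    long_var++;             /* both words are updated before release  */
    DosExitCritSec();

    /* Code that does not reference long_var */
}
```

Only the increment is inside the critical section. Because the blocking applies to threads of the calling process, data in a shared segment that several processes update would need some other mechanism, such as a semaphore.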

Although other threads are blocked from execution by DosEnterCritSec and
DosExitCritSec, these functions do not block signal handling.

Note that static variables in DLLs are protected from interference from
other processes if they are in an instance data segment designated as
MULTIPLE in the DATA statement of the DLL's module-definition file. Memory
is "owned" by a process and, unless specifically allocated as shareable,
cannot be altered by any other process.

16.2.4  Signal Handling

The C library function signal is not supported for multithread programs or
for DLLs. If you need to process signals, use the OS/2 API signal functions,
such as DosSetSigHandler.

See Chapter 15, "Creating Multithread OS/2 Applications," for more
information about signal handling in OS/2 programs.

16.2.5  Using Microsoft C Keywords

The _export and _loadds keywords simplify writing DLLs. They are used to
define or declare functions or pointers to functions. In the DLL, an
exported function with a single argument might be defined as

int _export _loadds sample( int )

The _export Keyword

All DLL functions that will be called from outside the library must be
exported.

The _export keyword gives a function the export attribute. Stack checking
must be disabled for exported entry points. You can use the /Gs compile
option or the check_stack pragma to accomplish this.

Using the _export keyword is an alternative to declaring the name of the
function in the EXPORTS section of a module-definition file. It assigns
certain default attributes: no I/O privilege, shared data, load on demand,
and no alias name. If the defaults are not acceptable, you must specify the
proper attributes in the module-definition file.

Not all functions in a DLL are for external use. A DLL can have any number
of utility subroutines supporting the work of the exported functions.
Functions that are private to the DLL should not have the _export keyword.

The _loadds Keyword

At entry to a DLL, the DS (data segment) register points to the calling
program's data segment. To access the DLL's data, the DS register has to be
loaded with the DLL's segment selector. The _loadds keyword causes the
compiler to add prolog and epilog code to the function. The prolog code
initializes the DS register to point to the function's data group. The
epilog code restores the caller's DS register when the function terminates.

Since loading the DS register is a high overhead operation, you should limit
the use of _loadds to the exported functions in your DLL.

────────────────────────────────────────────────────────────────────────────
WARNING

Do not use the _loadds keyword in a function definition if the function uses
only stack variables. If you specify _loadds in a DLL that does not have any
static data, the linker will issue a segment fix-up error.
────────────────────────────────────────────────────────────────────────────

16.2.6  Compile Options for Dynamic-Link Libraries

Dynamic-link libraries must be compiled with specific options that control
linking, memory models, and library selection.

Compile without Linking (/c)

You must use the /c option to build your DLL in separate compile and link
steps. This is necessary because the DLL must be linked with a
module-definition file specifying that the output file is a dynamic-link
library. (The compiler does not pass module-definition file names to the
linker.) The /c option is automatically specified in the makefile generated
by the Programmer's WorkBench.

Large Memory Model with Separate Stack (/ALw)

The /ALw option instructs the compiler to use the large memory model with a
separate stack segment. Because all DLLs use the caller's stack, you must
use /Aw or /Au. The /Aw option sets up separate stack and data segments but
does not cause the DS register to be reloaded at the entry to each function.
This allows you to call private functions (functions that you do not export)
without incurring the overhead of loading the DS register. Functions that
you do export must also be declared using the _loadds keyword, described
above, which sets up the proper DS register handling. If you use the /Au
option, the DS register will be reloaded on entry to every function, which
can cause the function calls in your DLLs to execute more slowly.

All DLL functions are reached using far calls. Pointers passed to and from
the DLL must be far pointers.

Remove Stack Probes (/Gs)

Since the DLL uses the caller's stack, you should usually use the /Gs option
to disable stack checking within the DLL.

Specify 80286 Code (/G2)

Use the /G2 option to designate code generation for the 80286 processor
instruction set, since OS/2 runs only on 80286 and higher model processors.

Link C Run-Time into Stand-Alone DLL (/ML)

Use the /ML option to build a stand-alone dynamic-link library that includes
static code for C run-time library functions. This option has the same
effect as using the /ALw, /FPa, /G2, and /D MT options. It changes the
library search record to LLIBCDLL.LIB. See Section 16.3.1, "DLLs with Static
C Run-Time Library Functions" for more information about these options.

Link Executable or DLL with C Run-Time DLL (/MD)

Use the /MD option to build an executable file or a dynamic-link library
that calls a C run-time DLL. This option has the same effect as using the
/ALw, /FPi, /G2, /D DLL, and /D MT options. It inhibits library search
records. See Section 16.3.3, "Programs and DLLs with a C Run-Time DLL," for
more information about these options.

Suppress Default Library Selection (/Zl)

If you do not compile with the /MD or /ML options described above, compile
with the /Zl option or use the /NOD option when you link in order to inhibit
searches for default libraries.

16.3  Building DLLs with Microsoft C

Building a DLL for OS/2 is like building an executable program file.

To build a DLL, compile and link the dynamic-link library like any other
executable file, but add a module-definition file. This module-definition
file tells the linker that the output is a dynamic-link library.

When you build applications that use a dynamic-link library, you must tell
the linker where to find the library's dynamically linked functions. You use
import libraries and module-definition files for this purpose.

16.3.1  DLLs with Static C Run-Time Library Functions

The LLIBCDLL library is used to create stand-alone DLLs. The library
functions are re-entrant and can be called by multiple threads within a
program as well as by multiple programs. The code for the stand-alone DLL's
C run-time library functions is contained within the DLL. Programs that call
stand-alone DLLs have their own run-time library code.

Building the DLL

The files required to build a stand-alone DLL with the LLIBCDLL library are
listed below:

File Name                         Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB                           OS/2 kernel import library

LLIBCDLL.LIB                      Large-model multithread C run-time
                                  library for DLLs

DLLINIT.OBJ                       Optional initialization module for DLLs
                                  requiring custom initialization

DLLTERM.OBJ                       Optional termination module for DLLs
                                  requiring custom exit processing

userdll.C                         Source code for the DLL you create

userdll.DEF                       Module-definition file for the DLL you
                                  create

The module JUSTIFY.C, below, is an example of source code for a simple
dynamic-link library. The RightJustify routine calls the strlen function
from the C run-time library and right-justifies a caller's buffer. The
function definition includes the _export keyword. The _loadds keyword is
omitted, since this function does not need any static data. If it did, you
would need to specify _loadds.

For simplicity, JUSTIFY.C below shows a DLL with a single function. In
actual practice, you would usually package a group of similar utilities into
one DLL.

/* JUSTIFY.C -- Sample Dynamic-Link Library */

#include <string.h>

/* Right justifies the string in TargetBuff to TargetSize
* and inserts necessary number of FillChars on the left.
*/

#pragma check_stack( off )

int _export RightJustify( char *TargetBuff, int TargetSize,
                          char FillChar )
{
    char *s, *d;

    s = TargetBuff + strlen( TargetBuff );   /* end of the string    */
    d = TargetBuff + TargetSize;             /* end of the field     */
    while ( s >= TargetBuff )                /* move text to the right */
        *d-- = *s--;
    while ( d >= TargetBuff )                /* fill on the left     */
        *d-- = FillChar;

    return( 0 );
}
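
The behavior of the routine can be checked outside a DLL with a portable rendering such as the following sketch. The name right_justify is hypothetical, the Microsoft keywords are omitted, and array indexes replace the pointer comparisons. The length check is an addition that the sample does not make.

```c
#include <string.h>

/* Portable rendering of RightJustify (illustrative only).
   TargetBuff must hold at least TargetSize + 1 characters. */
int right_justify( char *TargetBuff, int TargetSize, char FillChar )
{
    int s = (int)strlen( TargetBuff );  /* index of the terminator  */
    int d = TargetSize;                 /* index where it must land */

    if ( s > d )
        return -1;                      /* string longer than field */

    while ( s >= 0 )                    /* move text, terminator first */
        TargetBuff[d--] = TargetBuff[s--];
    while ( d >= 0 )                    /* fill on the left */
        TargetBuff[d--] = FillChar;

    return 0;
}
```

With a 12-character buffer holding "ABCD", a call with a size of 8 and a fill character of '0' leaves "0000ABCD" in the buffer. The caller is responsible for supplying a buffer large enough for the justified string and its terminating null.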

The steps for creating a stand-alone dynamic-link library with JUSTIFY.C are
given below. The DLL in the example is named JUSTLIB1.DLL.

■   Compile with the /ML option.

DLLs built with LLIBCDLL must be compiled with specific options.

Use the /ML option to set the library search record to LLIBCDLL.LIB
and to indicate that C run-time code is to be included in the DLL.
When you use /ML, the following options take effect:

Option                            Effect
────────────────────────────────────────────────────────────────────────────

/ALw                              Use large memory model with separate
                                  stack segment

/G2                               Use 80286 processor instruction set

/D MT                             Use the multithread version of the
                                  include files

/FPa                              Generate floating-point calls and select
                                  the alternate math library

The /G2 and the /ALw options can be overridden.

You should also use the /Gs option to suppress stack checking and the
/c option to compile without linking. The complete command to compile
the sample file JUSTIFY.C is

CL /ML /Gs /c JUSTIFY.C

■   Create a module-definition file.

Create a module-definition file, JUSTLIB1.DEF, which includes the
following lines:

LIBRARY JUSTLIB1 INITINSTANCE
DATA MULTIPLE

The LIBRARY statement identifies the executable file, JUSTLIB1.DLL, as
a dynamic-link library. DLLs linked with the LLIBCDLL library must
specify INITINSTANCE in the initialization field. You could add an
EXPORTS statement for the RightJustify function in JUSTIFY.C, but it
is optional since the _export keyword was used in the source code.

See Chapter 14, "Building OS/2 Applications," for more information.

■   Link with LLIBCDLL.LIB.

Ensure that the file LLIBCDLL.LIB, which takes the place of the
regular C run-time library, is available.

Create JUSTLIB1.DLL with a command such as

────────────────────────────────────────────────────────────────────────────
WARNING

When you link with LLIBCDLL, you cannot have any other C run-time libraries
────────────────────────────────────────────────────────────────────────────

■   Create an import library.

Applications that call DLLs use import libraries to identify DLL
functions to the linker. The following example uses JUSTLIB1.DLL and
the IMPLIB utility to create an import library named JUSTLIB1.LIB.

IMPLIB justlib1.lib justlib1.dll

For more information about import libraries, see Chapter 14, "Building
OS/2 Applications."

Building Programs that Call the DLL

To link a dynamic-link library with an application, you must have one of the
following:

■   A module-definition file with an IMPORTS statement for each DLL
function called by your program

■   An import library created from the DLL itself or from a
module-definition file

All calls to DLLs must be far calls; all pointers passed must be far data
pointers. If you do not compile with the large memory model option (/AL),
you must cast the DLL function calls and pointers yourself.

The sample file below, TESTJUST.C, is compiled and linked into a small-model
program named SAMPLE1.EXE. TESTJUST.C includes a function prototype that
declares RightJustify as a far function expecting a far pointer as its first
argument. Because of the prototype, the compiler will generate a far call to
RightJustify and coerce the pointer argument to the proper value.

/* TESTJUST.C. Call sample DLL library */

#include <stdio.h>
#include <string.h>

/* DLL function prototype */

int _far RightJustify( char _far *, int, char );

void main( void )
{
    char buff[12];

    strcpy( buff, "ABCD" );

    /* Right justify to 8 characters and zero fill. */
    RightJustify( buff, 8, '0' );
    printf( "Result: %s\n", buff );
}

You need several files to link an application with a stand-alone DLL:

File Name                         Description
────────────────────────────────────────────────────────────────────────────
userdll.LIB                       Import library file for the DLL

userapp.DEF                       Optional module-definition file for your
                                  application that contains an IMPORTS
                                  statement for each DLL function called
                                  (required if not using an import
                                  library)

OS2.LIB                           Optional import library file for the
                                  OS/2 kernel (required if your
                                  application calls the kernel directly or
                                  via a C run-time library function)

userapp.OBJ                       Object module(s) for your application

mLIBCfP.LIB                       Regular C run-time library for protected
                                  mode, where m indicates memory model (S,
                                  C, M, L) and f indicates math package
                                  (A, E, 7)

The following command lines illustrate how TESTJUST.C can be compiled and
linked with the standard libraries, plus the sample dynamic-link library,
JUSTLIB1.DLL. The example uses the small memory model library and the
JUSTLIB1.LIB import library created from JUSTLIB1.DLL to create SAMPLE1.EXE.

CL /AS /G2 /c TESTJUST.C

Make sure that the JUSTLIB1.DLL file is in a directory on your LIBPATH
before executing SAMPLE1.EXE.

16.3.2  DLLs without C Run-Time Library Functions

Building a DLL that does not call any of the C run-time library functions is
similar to creating a stand-alone DLL.

To use the JUSTIFY.C sample program shown in Section 16.3.1, "DLLs with
Static C Run-Time Library Functions," without calling C run-time functions,
one change must be made. You must remove the call to the C run-time library
function strlen. The strlen function was used in the sample program to
calculate a pointer to the end of the caller's buffer. Remove the following
line in the program JUSTIFY.C:

s = TargetBuff + strlen( TargetBuff );

Replace the line above with the following code fragment, which does the same
thing without calling strlen:

s = TargetBuff;
while ( *s )
    s++;
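
When several functions in such a DLL need a string length, the same loop can be packaged as a private helper. The following is a sketch only; the name dll_strlen is hypothetical and is not a library function. Because the helper is private to the DLL, it would not carry the _export keyword.

```c
/* Private replacement for strlen in a DLL that links no C run-time
   library code. Returns the number of characters before the null. */
int dll_strlen( const char *s )
{
    const char *p = s;

    while ( *p )        /* advance to the terminating null */
        p++;
    return (int)( p - s );
}
```

The assignment to s in JUSTIFY.C could then be written as TargetBuff + dll_strlen( TargetBuff ).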

After making this change, you can use the following commands to create a DLL
named JUSTLIB2.DLL and its import library:

CL /c /ALw /G2s /Zl JUSTIFY.C
IMPLIB JUSTLIB2.LIB JUSTLIB2.DLL

Note that object modules compiled with releases of Microsoft C prior to
Version 6.0 refer to the C run-time library variable _acrtused. C 6.0
defines this variable if the main function is present. This causes the
linker to automatically add the C run-time start-up module to the DLL. To
suppress the start-up module, your source file must include a line defining
_acrtused as follows:

int _acrtused = 0;

This is required only if you do not use a C run-time library and if the link
includes object modules built with earlier versions of the compiler.

16.3.3  Programs and DLLs with a C Run-Time DLL

The CDLLOBJS.LIB and CDLLOBJS.DEF files are the foundation for building a
DLL that consists only of C run-time library functions. The application
programs and optional dynamic-link libraries linked with this DLL do not
contain any C run-time library code.

You create an application to use the C run-time DLL in either two or three
phases, depending on whether or not the application has additional DLLs:

■   Build a C run-time DLL.

■   Build any optional DLLs that use the C run-time DLL.

■   Compile and link the application.

The examples in this section use the JUSTIFY.C and TESTJUST.C source files
shown in Section 16.3.1, "DLLs with Static C Run-Time Library Functions."

Building a C Run-Time DLL

The C run-time DLL is derived from the CDLLOBJS.LIB and CDLLOBJS.DEF files
provided with the Microsoft C Professional Development System. The
CDLLOBJS.DEF file includes export definitions for all of the C run-time
library functions.

The steps for creating a C run-time DLL are given below. The C run-time DLL
in the example is named CEXAMPLE.DLL.

1.  Create a module-definition file.

You can use CDLLOBJS.DEF as the basis for your own module-definition
file by copying and editing it. This allows you to create a customized
DLL that contains only the functions your application requires. If you
use the CDLLOBJS.DEF file without modification, every program that
links to your C run-time DLL will get the entire C run-time library.

The following examples create the sample file CEXAMPLE.DEF to define
the custom dynamic link library CEXAMPLE.DLL. The CEXAMPLE.DEF file,
shown below, exports the three C run-time library functions called
from JUSTIFY.C and TESTJUST.C. It also exports functions required by
the C run-time library start-up modules.

LIBRARY CEXAMPLE INITINSTANCE
DESCRIPTION 'Sample Dynamic-link C Run-Time Library'
DATA MULTIPLE
PROTMODE
EXPORTS
_printf
_strlen
_strcpy
__CRT_INIT
__aFchkstk
_exit

2.  Create the C run-time DLL.

The files for creating a C run-time DLL are listed below:

File Name                         Description
────────────────────────────────────────────────────────────────────────────

OS2.LIB                           Import library for the OS/2 kernel

CDLLOBJS.LIB                      Dynamic link C run-time library

CRTLIB.OBJ                        Start-up code for C run-time DLL

yourclib.DEF                      Module-definition file specifying C
                                  run-time library functions for the DLL

The command to create the sample CEXAMPLE.DLL file is

LINK /NOD /NOE /NOI crtlib.obj,cexample.dll,,cdllobjs+os2,cexample.def

3.  Create an import library.

You need to create a library file of import definitions that can be
used by programs that will be linked with your custom DLL. This is a
two-step process. The first phase uses the module-definition file and
the IMPLIB utility to create an interim version of the library, as in
this example:

IMPLIB cexample.lib cexample.def

Note that the IMPLIB utility accepts either a module-definition file
or a DLL as input.

The second step uses the LIB utility to append the file CDLLSUPP.LIB
to the import library. You must append CDLLSUPP.LIB because it
contains some routines that cannot be dynamically linked. The LIB
utility requires the full path name for CDLLSUPP.LIB. If it is in a
directory named C:\LIB, the command to complete the library build for
CEXAMPLE.LIB is

LIB CEXAMPLE.LIB+C:\LIB\CDLLSUPP.LIB;

When you have finished building the custom DLL, be sure to copy it to
a directory specified in the LIBPATH statement of the CONFIG.SYS file.

Building an Application-Specific DLL

You must compile a DLL that calls a C run-time DLL with specific options
and link it with the C run-time DLL's import library. The steps for building
an application-specific DLL named JUSTLIB3.DLL are given below.

1.  Compile with the /MD option.

The easiest way to be sure you choose the proper options is to use the
/MD switch, which indicates that the DLL will be used with a C
run-time DLL. When you use /MD, library search records are suppressed
and the following options are in effect:

Option                            Effect
────────────────────────────────────────────────────────────────────────────

/ALw                              Use large memory model with separate
                                  stack segment

/G2                               Use 80286 processor instruction set

/D MT                             Use the multithread version of the
                                  include files

/D DLL                            Use a C run-time dynamic-link library

/FPi                              Generate in-line floating-point
                                  instructions and select the emulator
                                  math package

The /G2 and /ALw options can be overridden. The /FPi option can be replaced
with /FPi87 or /FPc, but not with /FPa. See Chapter 4 for more information
about floating-point options.

You should also use the /c option to compile without linking. The
command line to compile the sample file JUSTIFY.C is

CL /MD /c JUSTIFY.C

2.  Create a module-definition file.

Create a module-definition file named JUSTLIB3.DEF that includes the
following line:

LIBRARY JUSTLIB3 INITINSTANCE

3.  Link the DLL with the C run-time and OS/2 import libraries.

To create a DLL that will call a C run-time DLL, the following files
must be linked together:

File Name                         Description
────────────────────────────────────────────────────────────────────────────

OS2.LIB                           Import library for the OS/2 kernel

yourclib.LIB                      Import library for your C run-time DLL

CRTDLL.OBJ                        Start-up code for DLLs using a C
                                  run-time DLL

CRTDLL_I.OBJ                      Optional initialization module for DLLs
                                  requiring custom initialization
                                  (replaces CRTDLL.OBJ)

yourdll.OBJ                       Object file for your DLL

yourdll.DEF                       Module-definition file for your DLL

The command for linking these files to create JUSTLIB3.DLL is

4.  Create an import library.

Use JUSTLIB3.DLL and the IMPLIB utility to create an import library
file, JUSTLIB3.LIB, for use by applications calling JUSTLIB3.DLL:

IMPLIB JUSTLIB3.LIB JUSTLIB3.DLL

Remember to copy JUSTLIB3.DLL to a directory named in the LIBPATH
statement in the CONFIG.SYS file.

Using C Run-Time and Application-Specific DLLs

Application programs using a C run-time DLL, such as the sample program
CEXAMPLE.DLL (described earlier in this section), must define the symbolic
constants MT and DLL. These constants cause the compiler to use the
multithread and DLL sections of the include files. You can define the
constants in your source code or with the compiler's /D command-line option.
Since the C run-time DLL uses the large memory model, your program must
either use the same model or declare all C run-time functions and pointers
passed to them as _far. If you use the standard include files for the C
run-time functions in your program, all these declarations are made for you.

The following files are required to link an application that calls a C
run-time DLL:

File Name                         Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB                           Import library for the OS/2 kernel

yourclib.LIB                      Import library for your C run-time DLL

yourdll.LIB                       Import library for each optional
                                  application DLL

CRTEXE.OBJ                        Start-up code for executable files
                                  calling a C run-time DLL

yourapp.OBJ                       Object file(s) for your application

yourapp.DEF                       Optional module-definition file for your
                                  application

The following commands compile and link the TESTJUST.C file from Section
16.3.1 for use with the dynamic-link libraries CEXAMPLE.DLL and
JUSTLIB3.DLL. The link command uses the /NOD option to suppress selection of
the standard large-model library. The result is a program named SAMPLE2.EXE.

CL /AL /D MT /D DLL /G2 /c TESTJUST.C

16.3.4  Using CodeView to Debug Dynamic-Link Libraries

The protected-mode version of CodeView (CVP) supports debugging of
dynamic-link libraries. The /L option lets you name one or more DLLs to be
debugged with your application.

To enable full symbolic debugging, use the CodeView options /Zi when
compiling and /CO when linking. Do this for both the DLL to be debugged and
for the program that calls the DLL.

The syntax for the /L CodeView option is

/L file

At least one space must separate /L from the file name(s). You can enter
multiple DLL names. To debug the JUSTLIB3.DLL dynamic-link library and the
SAMPLE2.EXE program discussed in the previous section, use this command
line:

CVP /L JUSTLIB3.DLL SAMPLE2.EXE

Use the CodeView Trace command (F8) to enter and view DLL code.

A simple way to use CodeView is to place a breakpoint at the instruction
that calls the DLL function you want to debug. When you reach the
breakpoint, press F8 to execute the current source line. CodeView will then
display the DLL function's source code, allowing you to set additional
breakpoints and enter other CodeView commands.

Appendix A  Using Exit Codes
────────────────────────────────────────────────────────────────────────────

When C programs terminate, they return values to the process that started
them. These values are called "exit codes." The process that starts a C
program can be either an operating system, such as DOS or OS/2, or another
program. The process that starts the C program is referred to as the "parent
process"; the program started is referred to as the "child process." The
parent process can interpret return values as an error code sent to the
operating system or use those return values as a form of interprocess
communication (communication between two separate processes).

A.1  The exit Function

The exit function terminates execution of your C program and returns an exit
code (an integer value) to the parent process. The parent process can be the
operating system or another program, depending on how the child process was
executed. Note that a C program always returns an integer, regardless of how
you declare the main function.

Most programs use exit codes to communicate errors to the parent process;
these are called "error codes." By convention, programs return zero if they
complete normally and a nonzero value if they are exiting because of an
error. This error code (the nonzero value) can then be used by the operating
system to control the execution of other programs (for example, from inside
a batch file).

The Microsoft C compiler is a good example of a program that returns an exit
code. It returns 0 if no errors occur in your compile and a positive value
if an error occurs during compilation.

The following program attempts to open a file for reading. If the file
cannot be opened, exit returns 1 to the calling program. Therefore, 1 and 0
are both exit codes.

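#include <errno.h>      /* errno */
#include <stdio.h>      /* fopen, printf */
#include <stdlib.h>     /* exit */

extern char filename[];                  /* assumed to be defined elsewhere */
extern int do_file_access( FILE *fp );   /* assumed to be defined elsewhere */

int main(void)
{
    FILE * fp;

    if( !(fp = fopen( filename, "rb" )) )
    {
        printf("Error %d: Could not open file\n", errno);
        exit(1);
    }

    do_file_access(fp);
}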

In the preceding example, the exit code is unpredictable when the file opens
successfully, because main then ends without calling the exit function. The
value actually returned to the parent process (or to the operating system
shell) is whatever happens to be in the AX register when the program
terminates; in this case, whatever do_file_access returned.
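
One way to make the exit code predictable on every path is to compute it in
one place and pass the result to exit. The following is a minimal sketch
rather than part of the original sample: the helper name exit_code_for and
the use of the standard EXIT_SUCCESS and EXIT_FAILURE macros are assumptions
made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the exit code the program should report. */
int exit_code_for(const char *filename)
{
    FILE *fp = fopen(filename, "rb");

    if (fp == NULL)
    {
        fprintf(stderr, "Could not open %s\n", filename);
        return EXIT_FAILURE;    /* nonzero: reported as an error */
    }

    /* ... work with the open file here ... */
    fclose(fp);
    return EXIT_SUCCESS;        /* zero: normal completion */
}
```

A main function could then end with exit( exit_code_for( name ) ), so that
the parent process receives a defined value whether or not the file opens.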

A.2  Testing Exit Codes from Command and Batch Files

Using the IF ERRORLEVEL command, you can test to see if a program has
executed successfully by checking its exit code. The IF ERRORLEVEL command
is an OS/2 command file or DOS batch file command that tests the exit code
of the most recently executed program.

IF ERRORLEVEL can help you organize program execution. For example, you can
make the execution of one program depend on the exit code of an earlier
program, as tested by IF ERRORLEVEL. You can also use the value of the exit
code to branch to different commands in a batch or command file.

When placed in a batch or command file, the following commands will execute
REPORTS.EXE only if FILEMNG.EXE does not return an error:

echo Running file manager....
FILEMNG.EXE
IF NOT ERRORLEVEL 1 REPORTS.EXE

Despite the name ERRORLEVEL, the exit code does not always denote an error.
You can define exit codes to communicate any information useful to you.

Refer to the Microsoft Operating System/2 User's Guide or the Microsoft
MS-DOS User's Guide and User's Reference for more information about the IF
ERRORLEVEL command.

A.3  Accessing Exit Codes from Other Programs

When you use any of the spawn family of functions with the P_WAIT mode flag
to run a program as the child of another program, the return value of the
spawn function is the exit code of the child program. The following code
performs the same function as the batch file in Section A.2:

#include <process.h>    /* spawnl, P_WAIT */
#include <stddef.h>     /* NULL */

void main( void )
{
    if( !spawnl( P_WAIT, "filemng.exe", "filemng.exe",
                 NULL ) )
        spawnl( P_WAIT, "reports.exe", "reports.exe",
                NULL );
}

The program reports.exe is executed only if the program filemng.exe
terminates with an exit code of 0.

The following code uses the exit code as part of a simple menu system:

void main(void)
{
int option;
int menu_num = 0;  /* Initialize for first execution */

while( (option = spawnl( P_WAIT, "menu.exe",
{
switch( option )
{
case 1 :
menu_num = spawnl( P_WAIT, "program1.exe",
"program1.exe", NULL );
break;
case 2 :
menu_num = spawnl( P_WAIT, "program2.exe",
"program2.exe", NULL );
break;
case 3 :
menu_num = spawnl( P_WAIT, "program3.exe",
"program3.exe", NULL );
break;
default:       /* Guard against a bad option */
break;
}
}
}

The preceding example demonstrates how you could have a program, menu.exe,
that solicits input from a menu of choices. This input is interpreted and
passed back to the main program in the form of an exit code. (The spawnl
function returns the value of the child process's exit code.) This exit code
value is stored in option, which is used as a selector variable in a switch
statement.

Based on the value returned from menu.exe, the main program executes
program1.exe, program2.exe, or program3.exe. Finally, menu_num, the exit
code of the program selected, is used as a parameter to the next execution
of menu.exe.
Appendix B  Differences between C Versions 5.1 and 6.0
────────────────────────────────────────────────────────────────────────────

This appendix describes the differences between versions 5.1 and 6.0 of
Microsoft C, including additions, deletions, and changes. Some of the
changes are required by the American National Standards Institute (ANSI)
draft standard for the C programming language. Other changes improve or
augment the existing capabilities of the compiler.

Many of the changes will have no effect on code that was written and
compiled with previous versions of Microsoft C. In some cases, however, you
may have to modify or correct existing code before compiling with version
6.0.

B.1  Modifications for ANSI Compatibility

A number of changes have been made to the compiler to support the ANSI draft
standard. These include new features (Section B.1.1) and changes (Sections
B.1.2 - B.1.8).

B.1.1  ANSI-Mandated New Features

The following ANSI-mandated features are new to version 6.0:

■   The semantics for volatile have been implemented.

■   Both long and unsigned long values are allowed in switch expressions
and case constants.

■   The compiler supports unsigned long decimal constants. It is now
possible to initialize unsigned long variables with values larger than
LONG_MAX using decimal (rather than hexadecimal or octal) constants.

■   Bit fields are permitted in unions.

■   The address-of operator (&) works correctly on arrays and functions.

■   Storage classes or types (or both) are now required on variable
declarations. The compiler previously assumed that untyped variables
(such as a;) were integers. This declaration now generates a warning.

■   The LOCALE.H header file is new to version 6.0. It declares functions
and structures for describing conventions that vary from one country
to the next, such as the currency symbol and the way calendar dates
are printed.
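
Two of the features listed above, long case constants and unsigned long
decimal constants, can be illustrated with a short sketch. The function
names and the particular values are assumptions chosen for the example.

```c
/* Selects a bucket using long values in the switch expression and in the
   case constants. */
int bucket_for(long value)
{
    switch (value)
    {
    case 100000L:
        return 1;
    case 2000000L:
        return 2;
    default:
        return 0;
    }
}

/* Returns 1 if a decimal unsigned long constant and the equivalent
   hexadecimal constant denote the same value. With a 32-bit long,
   3000000000 is larger than the greatest signed long value. */
int decimal_matches_hex(void)
{
    unsigned long big = 3000000000UL;   /* decimal constant */

    return big == 0xB2D05E00UL;         /* same value written in hex */
}
```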

B.1.2  Integer Promotion Rules

The ANSI draft standard requires a change in the evaluation of some
expressions that mix signed and unsigned integers. Earlier versions of the
compiler attempted to preserve an expression's unsigned nature as much as
possible. Version 6.0 attempts to preserve the expression's value.

In version 5.1, an unsigned char promotes to an unsigned int; an unsigned
int promotes to an unsigned long.

In version 6.0, an unsigned char promotes to a signed int; an unsigned int
promotes to a signed long.

For example,

#include <stdio.h>

void test( long li );

main()
{
    long int li = -256L;
    test( li );
}

void test( long li )
{
    if( li < 0xffff )
        puts( "C 6.0 does a signed compare" );
    else
        puts( "C 5.1 does an unsigned compare" );
}
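
The value-preserving rule can also be seen with unsigned char operands. The
following sketch (the function name is an assumption) relies only on int
being wider than char, so it does not depend on a 16-bit int the way the
example above does.

```c
/* Returns 1 when unsigned char operands are promoted to signed int. */
int uchar_promotes_to_signed(void)
{
    unsigned char small = 1;
    unsigned char large = 2;

    /* Both operands are promoted to int before the subtraction, so the
       difference is -1 rather than a large unsigned value. */
    return (small - large) < 0;
}
```

Under the older unsigned-preserving rule, the difference would have been
computed as an unsigned int and the comparison with 0 would have failed.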

B.1.3  Defining NULL as a Pointer

The constant NULL is now defined as ((void *)0). Previous versions of
Microsoft C defined NULL as 0x0000 in small and medium models and
0x00000000L in compact and large models.

B.1.4  Shift Operators

Shift operators now give a result that is of the same type as the left side.
For example,

short si;
long li;
si = 0x0001;
li = si << 16L;

The compiler previously yielded a result that was the size of the larger of
the two operands. In the example above, the short value would be
automatically converted to a long because 16L is long. The value assigned to
li would be 0x00010000L in Microsoft C 5.1.

To adhere to the ANSI draft standard, Microsoft C 6.0 maintains the size of
the left operand. The variable si has 16 bits. Shifting left 16 times
produces a value of 0, which is then assigned to li.
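
Code that depended on the old behavior can ask for the wider result
explicitly by converting the left operand before shifting. This is a
minimal sketch; the function names are assumptions.

```c
/* Widens the left operand first, so the shift is carried out in a long. */
long shift_widened(short si)
{
    return (long)si << 16;
}

/* Returns 1 if the result of a shift has the type of the (promoted) left
   operand, regardless of the type of the right operand. */
int shift_result_has_left_type(void)
{
    int left = 1;

    return sizeof(left << 1L) == sizeof(int);
}
```

With si equal to 0x0001, shift_widened returns 0x00010000L, the value that
C 5.1 produced without the cast.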

B.1.5  Pointers to Typedefs

The rules for handling pointers to typedefs have changed subtly. For
example, C 5.1 interprets

typedef int far f_int;
f_int *fp_i;

as being equivalent to

int *far fp_i;

which means fp_i is a distant pointer to an integer. The address of fp_i
contains 32 bits. The size of the integer's address is indeterminate.

C 6.0 interprets it as

int far *fp_i;

This means fp_i is a far pointer to an integer. The address of the integer
contains 32 bits. The size of the address of fp_i is indeterminate.

This affects typedefs containing _near, _far, _based, and other modifiers.
Although these are Microsoft-specific keywords, their new behavior is
consistent with what the ANSI draft standard requires for the const and
volatile keywords.
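
The corresponding rule for const can be shown in standard C, without any
Microsoft-specific keywords. In this sketch the names c_int, table, and
second_entry are assumptions; the point is that the qualifier in the typedef
applies to the object pointed to, not to the pointer.

```c
typedef const int c_int;

/* Returns the second entry of a small constant table. */
int second_entry(void)
{
    static const int table[2] = { 10, 20 };
    c_int *p = table;   /* same as: const int *p */

    p++;                /* allowed: the pointer itself is not const */
    /* *p = 0; would be rejected: the int that p points to is const */
    return *p;
}
```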

B.1.6  Identifying Nonstandard Keywords

The following modifiers are specific to Microsoft C; they are not described
in the ANSI draft standard. To identify these implementation-defined
keywords as non-ANSI, an initial underscore has been added.

C 5.1 Keyword                     C 6.0 Keyword
────────────────────────────────────────────────────────────────────────────
far                               _far

huge                              _huge

near                              _near

cdecl                             _cdecl

fortran                           _fortran

interrupt                         _interrupt

pascal                            _pascal

The compiler still accepts the obsolescent versions of these keywords,
unless the /Za option is used.

B.1.7  Trigraphs

To maintain compatibility with and portability to other systems, Microsoft C
6.0 supports the following trigraphs:

Trigraph                          Character
────────────────────────────────────────────────────────────────────────────
??=                               #

??(                               [

??/                               \

??)                               ]

??'                               ^

??<                               {

??!                               |

??>                               }

??-                               ~
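
The table above can be expressed as a lookup on the character that follows
the two question marks. This sketch is only an illustration of the mapping;
the function name is an assumption, and the compiler performs the real
replacement itself before any other processing of the source text.

```c
/* Returns the character that a trigraph stands for, given the character
   following the two question marks, or 0 if there is no such trigraph. */
char trigraph_char(char third)
{
    switch (third)
    {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}
```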

B.1.8  ANSI Nonconformance

This section lists the areas where Microsoft C 6.0 does not conform to the
ANSI draft standard.

■   Microsoft C does not support multibyte characters, wide-character and
string constants, and the related library functions and types.

■   Microsoft C contains some name-space violations in the language
(extended keywords, such as near and far) and in the library (non-ANSI
macros and types in header files and extended library function names,
such as read and write).

B.2  New Keywords and Functions

This section describes keywords and functions that did not exist in previous
versions of Microsoft C. Details about how to use these features can be
found elsewhere in the documentation.

B.2.1  In-Line Assembler

The new _asm keyword allows you to mix assembly instructions with C source
code. This feature includes the _emit function, which lets you enter
arbitrary values into the code stream.

See Chapter 3, "Using the In-Line Assembler."

B.2.2  Based Pointers and Objects

A based pointer is a special, compact form of pointer. It is always
represented as a short offset. The address represented by such a pointer is
calculated by adding the based pointer to its base. The base must be
supplied each time the pointer is dereferenced, either explicitly using a
special operator or implicitly by associating the base value with the
pointer when it is declared. The base can be a far pointer, a near pointer,
or a new type that represents a segment.

Based pointers and objects are declared using the new keyword, _based.

Segment Types

The new type specifier, _segment, specifies a segment.

Any pointer or address can be cast to _segment. If the operand is a near
pointer, the result is the current value of the data segment register (DS).
If the operand is a far pointer, the result is the segment part of the far
pointer.

Segment Names

Segment names are declared using the built-in function _segname. The
compiler recognizes four predefined segment names: _CODE, _CONST, _DATA, and
_STACK.

Each segment name represents a constant of type _segment.

Base Operator

The base operator (:>) associates a base expression (usually a segment) with
a based pointer, to form a far pointer value. For example,

0x0F01:>0x0015

combines the segment 0x0F01 with the offset 0x0015 to form the effective
address 0x0F025. The base operator's precedence falls between ( ) and [ ].
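
The arithmetic behind that effective address can be sketched as follows. The
sketch assumes real-mode addressing, in which the segment value is
multiplied by 16 and added to the offset; in OS/2 protected mode a segment
value is a selector and this calculation does not apply. The function name
is an assumption.

```c
/* Real-mode linear address for a segment:offset pair. */
unsigned long linear_address(unsigned int segment, unsigned int offset)
{
    return ((unsigned long)segment << 4) + offset;
}
```

For segment 0x0F01 and offset 0x0015 this gives 0xF010 + 0x0015 = 0xF025.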

Casting Based Pointers

A based pointer can be cast to a pointer, a long integer, a short integer,
or another based pointer. When a based pointer is converted to a far
pointer, a long integer, a near pointer, or another based pointer having a
different base expression, it is first normalized to a far pointer
(including adding the offset in the base, if present, to the based pointer);
then any additional conversions are applied.

Operations on Based Pointers

Based pointers, for the purpose of arithmetic and dereferencing, are treated
as semantically equivalent to far pointers. When a based pointer mixes with
another integral type (int, long, near pointer, far pointer, or based
pointer), implicit casting is done. In some cases, the compiler can optimize
these references and treat the pointer as an offset.

The value of 0 is treated specially, as it is for near and far pointers. No
conversions are applied to the constant 0 because it is assumed to be a null
pointer.

See Chapter 2, "Managing Memory."

B.2.3  Based Heap Allocation Support

The functions listed below provide support for allocating, expanding, and
freeing memory for based heaps, which dynamically allocate memory for based
items. The functions are prototyped in the MALLOC.H include file.

────────────────────────────────────────────────────────────────────────────
_bcalloc   _bheapchk   _bmalloc
_bexpand   _bheapmin   _bmsize
_bfree     _bheapseg   _brealloc
_bfreeseg  _bheapset

See Chapter 2, "Managing Memory."

B.2.4  Releasing Unused Heap Memory

The following routines release unused heap memory by shortening data
segments. MALLOC.H contains the function prototypes.

────────────────────────────────────────────────────────────────────────────
_fheapmin
_heapmin
_nheapmin

B.2.5  Making Static Data Available to the Heap

The _heapadd function is new. It allows the user to make unused static data
available to the heap.

B.2.6  Long Doubles

Microsoft C version 5.1 treated double and long double as syntactically
different types that were semantically equal. Both types were stored in
memory as 64-bit quantities. For purposes of type-checking, long double and
double have always been different types.

Because the 80x87 family of math coprocessors supports an 80-bit
floating-point type, Microsoft C version 6.0 stores long double variables in
the 80x87 10-byte (80-bit) form.

Certain functions have been modified to handle the long double type. The
printf and scanf family of functions supports long double values with the
trailing l. The library contains new versions of the transcendental
functions as well as intrinsic forms that accept long double arguments.
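
A long double value can be formatted as in the following sketch. It uses the
standard C spelling of the length modifier, an uppercase L placed before the
conversion character; that this is the form meant by the description above
is an assumption, as are the function name and the buffer size.

```c
#include <stdio.h>
#include <string.h>

/* Formats value with three decimal places and returns 1 if the text
   matches expected. The buffer is sized for modest values only. */
int formats_as(long double value, const char *expected)
{
    char buf[64];

    sprintf(buf, "%.3Lf", value);
    return strcmp(buf, expected) == 0;
}
```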

B.2.7  Long Double Functions

All the functions below are declared in the standard include file MATH.H.
They return long double values, and their results and error codes are
analogous to those of the double versions.

────────────────────────────────────────────────────────────────────────────
acosl   expl    _matherrl
asinl   fabsl   modfl
atanl   floorl  powl
atan2l  fmodl   sinl
_atold  frexpl  sinhl
cabsl   hypotl  sqrtl
ceill   ldexpl  tanl
cosl    logl    tanhl
coshl   log10l

B.2.8  Model-Independent String and Memory Functions

The following functions make it easier to write mixed-model programs by
providing model-independent (large model) forms for most of the standard
string and memory functions. These functions can be called from any point in
any program, no matter which memory model has been selected. These functions
take only far pointers as arguments. Thus, any data item, near or far, in
any combination, can be handled.

The names of these functions are the same as the model-dependent forms,
except they include an _f prefix. For example, _fstrlen is the
model-independent version of the strlen function.

The functions listed below are defined in the standard include file
STRING.H.

Memory Functions

────────────────────────────────────────────────────────────────────────────
_fmemccpy                    _fmemcpy
_fmemchr                     _fmemmove
_fmemcmp                     _fmemset
_fmemicmp

String Functions

────────────────────────────────────────────────────────────────────────────
_fstrcat   _fstrlwr    _fstrrchr
_fstrchr   _fstrncat   _fstrrev
_fstrcmp   _fstrncmp   _fstrset
_fstricmp  _fstrnicmp  _fstrspn
_fstrcpy   _fstrncpy   _fstrstr
_fstrcspn  _fstrnset   _fstrtok
_fstrlen   _fstrpbrk   _fstrupr

String Duplication Functions

────────────────────────────────────────────────────────────────────────────
_fstrdup
_nstrdup

B.2.9  Mixed-Model Memory Allocation Support

The following functions are based on realloc, calloc, and expand, but they
affect only near memory or far memory. MALLOC.H contains the function
prototypes.

────────────────────────────────────────────────────────────────────────────
_fcalloc                    _ncalloc
_fexpand                    _nexpand
_frealloc                   _nrealloc

B.2.10  The _fastcall Attribute (/Gr Option)

Individual function prototypes can be declared with the new attribute
_fastcall.

The /Gr option enables the fastcall function-calling convention for all
functions that are not explicitly prototyped with the _cdecl, _pascal, or
_fortran attributes. Using /Gr on the command line causes each function in
the module to compile as _fastcall unless the function is declared with a
conflicting attribute, or the name of the function is main.

When you use the /Gr option, all functions are assumed to use the _fastcall
convention. As a result, to use any run-time library functions, you must
either include the standard include files or explicitly prototype the
function you want to call.

A fastcall function receives up to three 16-bit arguments, passed in
registers rather than on the stack. Arguments are passed in the AX, BX, and
DX registers. This may change in future versions of the compiler.

The argument types and their potential register assignments are

Argument                          Registers
────────────────────────────────────────────────────────────────────────────
character (3)                     AL, DL, BL

short integer (3)                 AX, DX, BX

near pointer (3)                  BX, AX, DX

long integer (1)                  DX:AX

far pointer (1)                   ES:BX

If the registers for a particular class have already been used, or if an
argument is not one of the five types listed above, it is pushed on the
stack as usual. An argument list of types long, float, short would pass the
long in DX:AX, push the float, and pass the short in BX.

The treatment of character arguments depends further on prototypes. If there
is no prototype, the argument is promoted to short and the rules for short
integers apply. Only if the argument is prototyped as a char do the
character rules apply.

The _fastcall convention is not compatible with any of the following
attributes: _interrupt, _saveregs, _export, _cdecl, _fortran, or _pascal.

See Chapter 1, "Optimizing C Programs."

B.2.11  Drive and Directory Functions

Several new functions make it easier to get and set the current drive and
the current directory. The prototypes for the following routines are in
DIRECT.H:

────────────────────────────────────────────────────────────────────────────
_chdrive
_fullpath
_getdrive
_getdcwd

B.2.12  Text Output Functions for OS/2

Several text-mode screen functions have been added to Microsoft C 6.0 for
OS/2. With the exception of the new _scrolltextwindow function, they are
identical to what is defined in real mode, except for any references to
behavior in graphics modes. The following routines are located in
GRTEXT.LIB, and the prototypes are in GRAPH.H:

────────────────────────────────────────────────────────────────────────────
_clearscreen      _getvideoconfig   _settextrows
_displaycursor    _outtext          _settextwindow
_getbkcolor       _setbkcolor       _setvideomode
_gettextcolor     _settextcolor     _setvideomoderows
_gettextcursor    _settextcursor    _scrolltextwindow
_gettextposition  _settextposition  _wrapon

See Part 4 of this manual, "OS/2 Support."

B.3  New Features

The features described in Sections B.3.1-B.3.10 are new to version 6.0.

B.3.1  Strings and Macros

The compiler now allows longer string literals (up to 4K) and longer macro
expansions (up to 6K).

B.3.2  CL Options

The following options are new to Microsoft C 6.0:

Option                            Action
────────────────────────────────────────────────────────────────────────────
/AT                               Compiles in tiny model (.COM files).

/Fr« filename»                    Outputs source browser information file.

/FR« filename»                    Outputs extended source browser
                                  information file.

/Gd                               Forces _cdecl calling conventions.

/Gr                               Enables register (_fastcall)
                                  function-calling conventions.

/MAmasmoption                     Supports invocation of the assembler
                                  using the CL driver. All MASM-supported
                                  options are accepted. In addition, the
                                  compiler recognizes file names with .ASM
                                  suffixes and passes them directly to
                                  MASM.

/MD-                              Uses C run-time as DLL option. Defaults
                                  to /ALw /FPi /G2 /DDLL /DMT and inhibits
                                  library search records.

/ML                               Links C run-time as part of a
                                  dynamic-link library (DLL). Defaults to
                                  /ALw /FPa /G2 /DMT and changes library
                                  search record to LLIBCDLL.LIB.

/MT                               Enables multithread option. Defaults to
                                  /ALw /FPi /G2 /DMT and changes library
                                  search record to LLIBCMT.LIB.

/Oe                               Enables global register allocation.

/Og                               Enables global optimizations and global
                                  common subexpressions (CSEs).

/Ox                               Is now equivalent to /Ocegilt /Gs. Note
                                  that this implies that maximum
                                  optimization includes the _fastcall
                                  function-calling convention.

/Oz                               Enables aggressive optimizations.

/Ta name                          Specifies that name is to be treated as
                                  an assembler input file.

/W4                               Turns on extra warning level which
                                  supports more detailed (LINT-like)
                                  warnings and recognition of ANSI
                                  violations.

/WX                               Causes warnings to be treated as errors.
                                  If a warning occurs, the .OBJ file is
                                  not created.

B.3.3  Tiny Memory Model (.COM Files)

Microsoft C 6.0 now supports the tiny memory model, which produces .COM
rather than .EXE files (for DOS only).

The /AT option selects the tiny model. This forces the linker to use options
/NOE and /TINY. Within the linker, /TINY turns on /FARCALLTRANSLATION to
help eliminate far segment relocations. If you link your own .OBJ files,

B.3.4  The Optimize Pragma

The optimize pragma turns optimizing options on or off:

#pragma optimize("<optimization switch list>",{off|on})

where <optimization switch list> can be an empty list or one or more of the
following: a, c, e, g, l, w, n, p, t, and z. For example,

#pragma optimize("lp",on) /* equivalent to /Olp */
#pragma optimize("",off)  /* turns off all optimization */
#pragma optimize("",on)  /* restores default settings */

See Chapter 1, "Optimizing C Programs."

B.3.5  Nameless Structures and Unions

Both struct and union declarations can now be specified without a declarator
when they are members of another structure or union.

A nameless union would look like this:

struct str
{
    int a,b;
    union           /* unnamed union */
    {
        char c[4];
        long l;
        float f;
    };
    char c_array[10];
} my_str;
.
.
.
my_str.l == 0L;

A nameless structure would look like this:

struct s1
{
    int a,b,c;
};

struct s2
{
    float y;
    struct s1;
    char str[10];
} *p_s2;
.
.
.

p_s2->b = 100;

B.3.6  Unsized Arrays as the Last Member of a Structure

The compiler now allows an unsized or zero-sized array as the last member of
a structure. The declaration of such a structure would look like this:

struct var_length
{
    <set of declarations>;
    <type> array[];
};

Unsized arrays can appear only as the last member of a structure. Structures
containing unsized array declarations can be nested within other structures
as long as no further members are declared in any enclosing structures.
Arrays of such structures are not allowed.

The sizeof operator, when applied to a variable of this type or to the type
itself, assumes 0 for the size of the array.
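
A common use of such a structure is to allocate the fixed part and the array
in a single block. The following is a minimal sketch of that idea, written
with the flexible array member syntax of later C standards. The member name
count and the helper var_length_alloc are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct var_length
{
    size_t count;        /* number of elements that follow */
    int    array[];      /* unsized array: must be the last member */
};

/* Hypothetical helper: allocates the fixed part plus room for n elements.
   The caller releases the result with free(). */
struct var_length *var_length_alloc(size_t n)
{
    struct var_length *p;

    /* sizeof counts only the fixed part; the array contributes 0 bytes.
       (Holds for this layout on common implementations.) */
    assert(sizeof(struct var_length) == offsetof(struct var_length, array));

    p = malloc(sizeof *p + n * sizeof p->array[0]);
    if (p != NULL)
    {
        size_t i;

        p->count = n;
        for (i = 0; i < n; i++)
            p->array[i] = 0;
    }
    return p;
}
```

Because sizeof ignores the array, the element storage must always be added
explicitly when the block is allocated or copied.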

B.3.7  Improved Warnings

A new warning level four (CL option /W4) has been added for the following
warnings:

■   Detection of unused global variables

■   Expressions without side effects

■   Nonportable (non-ANSI) constructs

■   Local variable referenced before being initialized

■   Undefined or implementation-defined constructs

B.3.8  Macros

The number of macros definable with /D options has increased from 20 to 30.

B.3.9  Improved Multithread Support in OS/2

The number of OS/2 threads supported at run time has increased from 32 to
the operating system limit. Three new options aid development of multithread
applications and dynamic-link libraries:

1.  /MT for building multithread programs. It implies /ALw /FPi /G2 /D MT,
and changes the library search record emitted in the object file to
reference LLIBCMT.

2.  /ML for building a DLL that uses the C run-time library. It implies
/ALw /FPa /G2 /D MT, and changes the library search record emitted in
the object file to reference LLIBCDLL.

3.  /MD for building .EXE files and DLLs that share a C run-time DLL. It
implies /ALw /FPi /G2 /DDLL /D MT, and no library search records are
emitted in the object file.

B.3.10  Pipe Support in OS/2

Microsoft C 6.0 supports pipes as part of the file I/O system. The functions
listed below are defined in the standard include file IO.H:

────────────────────────────────────────────────────────────────────────────
_pipe
_popen
_pclose

B.4  Differences in Code Generation

This section lists ways in which the executable files produced by Microsoft
C 6.0 may differ from the files produced by previous versions of the
compiler.

B.4.1  Speed and Space Improvements

Executable files are smaller and faster.

B.4.2  Code Quality

Microsoft C 6.0 generates improved local code in default optimization cases
and, under full optimization, supports global (function level) register
allocation and common subexpressions (CSEs), loop optimizations, parameter
passing through registers, and generation of in-line code for certain
intrinsic functions.

B.4.3  Floating-Point Code Generation

In Microsoft C 6.0, the /FPi87 option suppresses the fixups previously used
for emulation. Pure coprocessor instructions are now emitted. This makes
object files smaller and speeds up linking, in addition to making in-line
assembly easier to use.

In version 5.1, /FPi and /FPi87 generated the same code; the only difference
was the library. In C 6.0, the two options generate different code. It is no
longer possible to force /FPi87 to act like /FPi. If you use /FPi87, the
math coprocessor must be in the computer on which the program is running.

Note that if you use /FPi87 you must link with mLIBC7, not mLIBCE.

B.4.4  Intrinsic Functions

The intrinsic function optimization option (/Oi) causes the compiler to
generate in-line code for the following functions:

────────────────────────────────────────────────────────────────────────────
abs       _lrotl  _rotl
_disable  _lrotr  _rotr
_enable   memcmp  strcat
fabs      memcpy  strcmp
inp       memset  strcpy
inpw      outp    strlen
labs      outpw   strset

The compiler does not generate in-line code for the following functions,
although it will modify the calling convention to pass the arguments on the
floating-point chip:

────────────────────────────────────────────────────────────────────────────
acos   pow     coshl
asin   sin     expl
atan   sinh    floorl
atan2  sqrt    fmodl
ceil   tan     logl
cos    tanh    log10l
cosh   acosl   powl
exp    asinl   sinl
floor  atanl   sinhl
fmod   atan2l  sqrtl
log    ceill   tanl
log10  cosl    tanhl

B.5  Changes and Deletions

The changes and deletions listed in this section have a high probability of
affecting existing programs.

B.5.1  Deleted Features

The data_seg pragma has been deleted.

The memory management routine sbrk has been deleted.

The compiler and tools do not run under DOS 2.1. The run-time files produced
by the compiler and linker will continue to run under DOS 2.1.

B.5.2  Evaluation of Real Expressions

Real expressions inside parentheses are now evaluated according to the
semantics of the parentheses. For example, in the expression

((r1 / r2) * r3)

the division is performed before the multiplication. Previous versions of
the compiler might have reordered the operations.

B.5.3  Default Optimizations

Version 6.0 performs more extensive optimizations than version 5.1. This
implies that code that had aliasing but worked with the /Oa option in 5.1
might not work with version 6.0 and /Oa. Also, because of the improved
optimizations, the /Od option should be used to turn off all optimizing
before you begin debugging with CodeView.

B.5.4  Sign Extension of char Arguments

Previous versions of Microsoft C would sign-extend char arguments to int
size before passing them to a second function. Version 6.0 does not extend
the sign if the function is prototyped and the prototype includes a char
argument. The most-significant byte is considered undefined.

B.5.5  Conditional Compilation and Signed Values

Version 5.1 of Microsoft C treated conditional compilation expressions as
signed long values. Version 6.0 evaluates these expressions using the same
rules as expressions in C. For example,

#if 0xFFFFFFFFL > 1UL
.
.
.
#endif

In version 6.0, the expression evaluates as true. It was evaluated as false
in version 5.1.

B.5.6  The const and volatile Qualifiers

The const and volatile qualifiers must be placed after the type they
qualify. The declaration

int (const *p);

is now treated as a syntax error. Previous versions of the compiler would
accept such a construction.

The following declarations are legal:

int const *p_ci;  /* pointer to constant int */
int const (*p_ci); /* pointer to constant int */
int *const cp_i;  /* constant pointer to int */
int (*const cp_i); /* constant pointer to int */

B.5.7  Memory Allocation

The _fmalloc function attempts to allocate far memory. It previously called
_nmalloc if far memory was not available. Now it returns a null pointer if
far memory isn't available, even if near memory is available.

B.5.8  Memory Used by Command-Line Arguments

Previous versions of the compiler placed the command-line argument strings
and environment strings in the near heap. Now they are allocated through
malloc, which means that they will be in far memory in compact and large
models.

B.5.9  Format Specifiers in printf

The printf format specifier modifiers N, F, h, and l have changed.

The specifier %Np is a synonym for %hp, but the latter is preferred.
Likewise, %Fp is a synonym for %lp.

For scanf, N and F refer to the distance to the object being read in; that
is, whether the pointer itself is allocated near or far. The modifiers h and
l refer to the size of the object (16-bit near pointer or 32-bit far
pointer). In these examples,

scanf("%Nlp", n_fp);
scanf("%Fhp", f_np);

the first line reads in an address that resides in near memory (N) but holds
a 32-bit far pointer variable (lp). The second line reads in a near pointer
value (hp) into a pointer variable that resides in far memory (F).

B.5.10  Functions that Return Float Values

In Microsoft C 5.1, a prototype or definition such as

float funcname();

was interpreted as

double funcname()

Version 6.0 interprets it as

float

Appendix C  Implementation-Defined Behavior
────────────────────────────────────────────────────────────────────────────

The American National Standards Institute (ANSI) Standard for the C
programming language contains an appendix called "Portability Issues." The
ANSI appendix lists areas of the C language that ANSI leaves open to each
particular implementation. This appendix describes how Microsoft C handles
these implementation-defined areas of the C language.

This appendix follows the same order as the ANSI Standard appendix. Each
item covered includes references to the ANSI chapter and section that
explains the implementation-defined behavior.

────────────────────────────────────────────────────────────────────────────
NOTE

This appendix describes the U.S. English-language version of the C compiler
only. Foreign-language implementations of Microsoft C may differ slightly.
────────────────────────────────────────────────────────────────────────────

C.1  Translation

C.1.1  Diagnostics

How a diagnostic is identified (§2.1.1.3)

Microsoft C produces error messages in the form:

filename(line-number) : diagnostic Cnumber message

where filename is the name of the source file in which the error was
encountered; line-number is the line number at which the compiler detected
the error; diagnostic is either "error" or "warning"; number is a unique
four-digit number (preceded by a C) that identifies the error or warning;
message is an explanatory message.

C.2  Environment

C.2.1  Arguments to main

The semantics of the arguments to main (§2.1.2.2)

In Microsoft C, the function called at program start-up is called main.
There is no prototype declared for main, and it can be defined with zero,
two, or three parameters:

int main( void )
int main( int argc, char *argv[] )
int main( int argc, char *argv[], char *envp[] )

The third line above, where main accepts three parameters, is a Microsoft
extension to the ANSI Standard. The third parameter, envp, is an array of
pointers to environment variables. The envp array is terminated by a null
pointer. See on-line help for more information about main and envp.

The variable argc never holds a negative value.

The array of strings ends with argv[argc], which contains a null pointer.

All elements of the argv array are pointers to strings.

A program invoked with no command-line arguments will receive a value of one
for argc, as the name of the executable file is placed in argv[0]. (In DOS
versions prior to 3.0, the executable file name is not available. The letter
"C" is placed in argv[0].) Strings pointed to by argv[1] through argv[argc -
1] represent program parameters.

The parameters argc and argv are modifiable and retain their last-stored
values between program start-up and program termination.
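
Because both argv and envp end with a null pointer, either array can be
walked without reference to argc. The following is a minimal sketch of that
traversal. The helper name count_strings is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the entries in a null-terminated array of
   string pointers, such as argv or envp. */
int count_strings(char *vec[])
{
    int n = 0;

    assert(vec != NULL);
    while (vec[n] != NULL)
        n++;
    return n;
}
```

Called from main, count_strings(argv) yields the same value as argc, and
count_strings(envp) yields the number of environment strings.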

C.2.2  Interactive Devices

What constitutes an interactive device (§2.1.2.3)

Microsoft C defines the keyboard and the display as interactive devices.

C.3  Identifiers

C.3.1  Significant Characters without External Linkage

The number of significant characters without external linkage (§3.1.2)

Identifiers are significant to 31 characters. The compiler does not restrict
the number of characters you can use in an identifier; it simply ignores any
characters beyond the limit.

C.3.2  Significant Characters with External Linkage

The number of significant characters with external linkage (§3.1.2)

Identifiers declared extern in programs compiled with Microsoft C are
significant to 31 characters. You can modify this default to a smaller
number using the /H (restrict length of external names) option. See on-line
help for more information on the syntax of the /H option.

C.3.3  Upper- and Lowercase

Whether case distinctions are significant (§3.1.2)

Microsoft C treats identifiers within a compilation unit as case sensitive.
Externally linked identifiers may or may not be case sensitive, depending on
whether you use the /NOIGNORECASE option when you invoke the linker. The default
for the linker is to ignore case, making externally linked identifiers case
insensitive.

Thus, symbols in source files are sensitive to case. By default, symbols in
object files are not.

Two CL command-line options affect case sensitivity:

1.  The /Gc (generate Pascal-style function calls) command-line option
converts all external identifiers (including function names) to
uppercase.

The _pascal declarator performs the same operation on a
function-by-function basis.

2.  The /Zc (compile case insensitive) command-line option converts all
identifiers (excluding function names) to uppercase.

C.4  Characters

C.4.1  The ASCII Character Set

Members of source and execution character sets (§2.2.1)

The source character set is the set of legal characters that can appear in
source files. For Microsoft C, the source character set is the standard
ASCII character set. Figure C.1 contains an ASCII table.

────────────────────────────────────────────────────────────────────────────
WARNING

Because keyboard and console drivers can remap the character set, programs
intended for international distribution should check the country code.
────────────────────────────────────────────────────────────────────────────

C.4.2  Multibyte Characters

Shift states for multibyte characters (§2.2.1)

Multibyte characters are used by some implementations to represent
foreign-language characters not represented in the base character set.
Microsoft C 6.0 does not support multibyte characters.

C.4.3  Bits per Character

Number of bits in a character (§2.2.4.2)

The number of bits in a character is represented by the manifest constant
CHAR_BIT. The LIMITS.H file defines CHAR_BIT as 8.

C.4.4  Character Sets

Mapping members of the source character set (§3.1.3.4)

The source character set and execution character set include the ANSI ASCII
characters listed in Table C.1. Escape sequences are also shown in Table
C.1.

Table C.1

Escape Sequence  Character         ASCII Value
────────────────────────────────────────────────────────────────────────────
\b               Backspace         8
\f               Form feed         12
\n               Newline           10
\r               Carriage return   13
\t               Horizontal tab    9
\v               Vertical tab      11
\"               Double quotation  34
\'               Single quotation  39
\\               Backslash         92
────────────────────────────────────────────────────────────────────────────

C.4.5  Unrepresented Character Constants

The value of an integer character constant that contains a character or
escape sequence not represented in the basic execution character set or the
extended character set for a wide character constant (§3.1.3.4)

Microsoft C does not support wide characters.

C.4.6  Wide Characters

The value of an integer character constant that contains more than one
character or a wide character constant that contains more than one multibyte
character (§3.1.3.4)

Microsoft C does not support wide characters or multibyte characters.

C.4.7  Converting Multibyte Characters

The current locale used to convert multibyte characters into corresponding
wide characters (codes) for a wide character constant (§3.1.3.4)

Microsoft C does not support multibyte characters.

C.4.8  Range of char Values

Whether a "plain" char has the same range of values as a signed char or an
unsigned char (§3.2.1.1)

All character values range from 0x00 to 0xFF, signed or unsigned. If a char
is not explicitly marked as signed or unsigned, it defaults to the signed
type.

The CL option /J changes the default from signed to unsigned.

C.5  Integers

C.5.1  Range of Integer Values

The representations and sets of values of the various types of integers
(§3.1.2.5)

Short integers contain 16 bits (two bytes). Long integers contain 32 bits
(four bytes). Signed integers are represented in two's-complement form. The
most-significant bit holds the sign: 1 for negative, 0 for positive and zero.
The values are listed below:

Type                              Minimum and Maximum
────────────────────────────────────────────────────────────────────────────
unsigned short                    0 to 65535

signed short                      -32768 to 32767

unsigned long                     0 to 4294967295

signed long                       -2147483648 to 2147483647

C.5.2  Demotion of Integers

The result of converting an integer to a shorter signed integer, or the
result of converting an unsigned integer to a signed integer of equal
length, if the value cannot be represented (§3.2.1.2)

When a long integer is cast to a short, or a short is cast to a char, the
least significant bytes are retained.

For example, this line

short x = (short)0x12345678L;

assigns the value 0x5678 to x, and this line

char y = (char)0x1234;

assigns the value 0x34 to y.

When signed variables are converted to unsigned and vice versa, the bit
patterns remain the same. For example, casting a signed char with the value
-2 (0xFE) to an unsigned char yields 254 (also 0xFE).
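
The demotions described above can be sketched with fixed-width types, which
keep the widths the same on compilers where long is wider than 32 bits. The
function names are hypothetical. Unsigned destination types are used so that
the truncation is fully defined by the language.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the least-significant 16 bits of a 32-bit value. */
uint16_t demote_to_16(uint32_t value)
{
    return (uint16_t)value;       /* 0x12345678 becomes 0x5678 */
}

/* Keep the least-significant 8 bits of a 16-bit value. */
uint8_t demote_to_8(uint16_t value)
{
    return (uint8_t)value;        /* 0x1234 becomes 0x34 */
}

/* Convert a signed 8-bit value to unsigned; the bit pattern is kept. */
uint8_t as_unsigned_8(int8_t value)
{
    return (uint8_t)value;        /* -2 (0xFE) becomes 254 (0xFE) */
}
```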

C.5.3  Signed Bitwise Operations

The results of bitwise operations on signed integers (§3.3)

Bitwise operations on signed integers work the same as bitwise operations on
unsigned integers. For example, -16 & 99 can be expressed in binary as

  11111111 11110000
& 00000000 01100011
  -----------------
  00000000 01100000

The result of the bitwise AND is 96.
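
The same operation can be sketched in code. The function names are
hypothetical, and the signed case assumes a two's-complement representation,
as described in Section C.5.1.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise AND of two signed values. */
int and_signed(int a, int b)
{
    return a & b;
}

/* The same AND applied to 16-bit unsigned bit patterns. */
uint16_t and_unsigned16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a & b);
}
```

Here and_signed(-16, 99) and and_unsigned16(0xFFF0, 0x0063) operate on the
same bit patterns and give the same result.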

C.5.4  Remainders

The sign of the remainder on integer division (§3.3.5)

The sign of the remainder is the same as the sign of the dividend. For
example,

50 / -6 == -8
50 % -6 ==  2
-50 /  6 == -8
-50 %  6 == -2
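
The quotient and remainder are tied together by the identity
(a / b) * b + a % b == a. The following sketch checks that identity before
returning the remainder. The helper name remainder_checked is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: verifies that quotient and remainder are consistent,
   then returns the remainder. */
int remainder_checked(int a, int b)
{
    assert(b != 0);
    assert((a / b) * b + a % b == a);
    return a % b;
}
```

Since the quotient is truncated toward zero, the remainder takes the sign of
the dividend a.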

C.5.5  Right Shifts

The result of a right shift of a negative-value signed integral type
(§3.3.7)

Shifting a negative value to the right yields half the original value,
rounded down (toward negative infinity). For example, -253 (binary 11111111
00000011) shifted right one bit produces -127 (binary 11111111 10000001). A
positive 253 shifts right to produce +126.

Right shifts preserve the sign bit. When a negative signed integer shifts
right, the most-significant bit remains set. When an unsigned integer shifts
right, the most-significant bit is cleared. Thus, if 0xF000 is signed, a
right shift produces 0xF800. If 0xF000 is unsigned, the result is 0x7800.

Shifting a positive number right sixteen times produces 0x0000. Shifting a
negative number right sixteen times produces 0xFFFF.
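
The following sketch contrasts the two cases on 16-bit values. The function
names are hypothetical. The signed case assumes an arithmetic
(sign-preserving) shift, which is the behavior this section describes; the
C language itself leaves the result for negative operands to the
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift of a signed 16-bit value (assumes an arithmetic shift). */
int16_t shift_right_signed(int16_t value, int count)
{
    assert(count >= 0 && count < 16);
    return (int16_t)(value >> count);
}

/* Right shift of an unsigned 16-bit value: vacated bits are always zero. */
uint16_t shift_right_unsigned(uint16_t value, int count)
{
    assert(count >= 0 && count < 16);
    return (uint16_t)(value >> count);
}
```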

C.6  Floating-Point Math

C.6.1  Values

The representations and sets of values of the various types of
floating-point numbers (§3.1.2.5)

The float type contains 32 bits: 1 for the sign, 8 for the exponent, and 23
for the mantissa. Its range is +/- 3.4E38 with at least 7 digits of
precision.

The double type contains 64 bits: 1 for the sign, 11 for the exponent, and
52 for the mantissa. Its range is +/- 1.7E308 with at least 15 digits of
precision.

The long double type is new to Version 6.0 of Microsoft C. It contains 80
bits: 1 for the sign, 15 for the exponent, and 64 for the mantissa. Its
range is +/- 1.2E4932 with at least 17 digits of precision.

C.6.2  Casting Integers to Floating-Point Values

The direction of truncation when an integral number is converted to a
floating-point number that cannot exactly represent the original value
(§3.2.1.3)

When an integral number is cast to a floating-point value that cannot
exactly represent the value, the value is rounded (up or down) to the
nearest suitable value.

For example, casting an unsigned long (with 32 bits of precision) to a float
(whose mantissa has 23 bits of precision) rounds the number to the nearest
multiple of 256. The long values 4294966913 - 4294967167 are all rounded to
the float value 4294967040.
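
The rounding in this example can be reproduced with a short sketch. It
assumes that float is an IEEE single-precision value and that the default
round-to-nearest mode is in effect. The function name to_float is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 32-bit unsigned integer to float. Values this large are
   representable only in steps of 256, so most inputs are rounded. */
float to_float(uint32_t value)
{
    return (float)value;
}
```

Both ends of the range given above, 4294966913 and 4294967167, convert to
the same float value, 4294967040.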

C.6.3  Truncation of Floating-Point Values

The direction of truncation or rounding when a floating-point number is
converted to a narrower floating-point number (§3.2.1.4)

When an underflow occurs, the value of a floating-point variable is rounded
down to zero. An overflow causes a run-time math error.

C.7  Arrays and Pointers

C.7.1  Largest Array Size

The type of integer required to hold the maximum size of an array, that is,
the size of size_t (§3.3.3.4, 4.1.1)

The size_t typedef is an unsigned short, with the range 0x0000 to 0xFFFF.
Huge arrays can exceed this limit if they contain more than 65,535 elements.
Arithmetic operations on huge arrays should therefore cast size_t and the
results of arithmetic operations on pointers to unsigned long.

C.7.2  Casting Pointers

The result of casting a pointer to an integer or vice versa (§3.3.4)

Near pointers are the same size as short integers; casting near to short (or
short to near) has no immediate effect on the value.

Far pointers and huge pointers are the same size as long integers. Casting
far/huge to long (or long to far/huge) has no immediate effect on the value.

When a near pointer is cast to a long, the 16-bit value is "normalized,"
which means the segment (usually DS) and offset are combined to produce a
32-bit memory location.

When a far or huge pointer is cast to a short, the long value is truncated
to a short.

The compiler normalizes based pointers when necessary, unless the based
pointer is a constant zero, in which case it is assumed to be a null
pointer. See Chapter 13, "Writing Portable Programs," for more information

C.7.3  Pointer Subtraction

The type of integer required to hold the difference between two pointers to
elements of the same array, ptrdiff_t (§3.3.6, 4.1.1)

A ptrdiff_t is a signed integer in the range -32768 to 32767, with one
exception. Because huge pointers can address more than 64K of memory,
subtracting one huge pointer from another can yield a result that is a long
integer. The result of subtracting two huge pointers should be cast to a
long.

The compiler normalizes based pointers when necessary. In most cases, based
pointers are treated as far pointers.

C.8  Registers

C.8.1  Availability of Registers

The extent to which objects can actually be placed in registers by use of
the register storage-class specifier (§3.5.1)

Two registers, SI and DI, are available in Microsoft C. Register variables
with a type that has 16 bits may be allocated in these registers.

C.9  Structures, Unions, Enumerations, and Bit Fields

A member of a union object is accessed using a member of a different type
(§3.3.2.3)

If a union of two types is declared and one value is stored, but the union
is accessed with the other type, the results are unreliable.

For example, a union of float and int is declared. A float value is stored,
but the program later accesses the value as an int. In such a situation, the
value would depend on the internal storage of float values. The integer
value would not be reliable.
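
The following sketch illustrates why the integer value is not meaningful as
a number. It assumes a 32-bit float in IEEE single-precision format; the
union and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

union float_or_int
{
    float    f;
    uint32_t i;
};

/* Stores a float, then reads the same bytes back as an integer. The result
   is the float's internal bit pattern, not a converted value. */
uint32_t float_bits(float value)
{
    union float_or_int u;

    assert(sizeof u.f == sizeof u.i);
    u.f = value;
    return u.i;
}
```

Under the stated assumption, storing 1.0 and reading the integer member
yields 0x3F800000 rather than 1.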

C.9.2  Sign of Bit Fields

Whether a "plain" int field is treated as a signed int bit field or as an
unsigned int bit field (§3.5.2.1)

Bit fields can be signed or unsigned. Plain bit fields are treated as
signed.

C.9.3  Storage of Bit Fields

The order of allocation of bit fields within an int (§3.5.2.1)

Bit fields are allocated within a 16-bit integer from least-significant to
most-significant bit. In the following code,

struct mybitfields
{
    unsigned a : 4;
    unsigned b : 5;
    unsigned c : 7;
} test;

void main( void )
{
    test.a = 2;
    test.b = 31;
    test.c = 0;
}

the bits would be arranged as follows:

00000001 11110010
cccccccb bbbbaaaa

Since the 80x86 processors store the low byte of integer values before the
high byte, the integer 0x01F2 above would be stored in physical memory as
0xF2 followed by 0x01.
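
The layout can be checked by packing the same three values by hand with
shifts and masks, which does not depend on how a particular compiler
allocates bit fields. This is a sketch only; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Packs the three fields in the order described above:
   a in bits 0-3, b in bits 4-8, c in bits 9-15. */
uint16_t pack_fields(unsigned a, unsigned b, unsigned c)
{
    assert(a < 16u && b < 32u && c < 128u);
    return (uint16_t)((c << 9) | (b << 4) | a);
}

/* Splits a 16-bit word into bytes in the order an 80x86 stores them. */
void split_bytes(uint16_t word, uint8_t out[2])
{
    out[0] = (uint8_t)(word & 0xFFu);   /* low byte is stored first */
    out[1] = (uint8_t)(word >> 8);      /* high byte is stored second */
}
```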

C.9.4  Alignment of Bit Fields

Whether a bit field can straddle a storage-unit boundary (§3.5.2.1)

Bit fields default to size short, which can cross a byte boundary (see
Section C.9.3 above) but not a 16-bit boundary. If the size and location of
a bit field would cause it to overflow the current integer, the field is
moved to the beginning of the next available integer.

If a bit field is declared as a long, it can hold up to 32 bits.

In either case, an individual field cannot cross a 16- or 32-bit boundary.

C.9.5  The enum Type

The integer type chosen to represent the values of an enumeration type
(§3.5.2.2)

A variable declared as enum is a signed short integer.

C.10  Qualifiers

What constitutes an access to an object that has volatile-qualified type
(§3.5.3)

Any reference to a volatile-qualified type is an access.

C.11  Declarators

C.11.1  Maximum Number

The maximum number of declarators that can modify an arithmetic, structure,
or union type (§3.5.4)

Microsoft C does not limit the number of declarators. The number is limited
only by available memory.

C.12  Statements

C.12.1  Limits on Switch Statements

The maximum number of case values in a switch statement (§3.6.4.2)

Microsoft C does not limit the number of case values in a switch statement.
The number is limited only by available memory.

C.13  Preprocessing Directives

C.13.1  Character Constants and Conditional Inclusion

Whether the value of a single-character character constant in a constant
expression that controls conditional inclusion matches the value of the same
character constant in the execution character set. Whether such a character
constant can have a negative value (§3.8.1)

The character set used in preprocessor statements is the same as the
execution character set. The preprocessor recognizes negative character
values.

C.13.2  Including Bracketed File Names

The method for locating includable source files (§3.8.2)

The preprocessor first searches the directories specified by the CL option
/I. If the /I option is not present or if it fails, the preprocessor uses
the INCLUDE environment variable to find any include files within angle
brackets. If more than one directory appears as part of the /I option or
within the INCLUDE variable, the preprocessor searches them in the order
they appear.

For example, the command

CL /ID:\MSC\INCLUDE MYPROG.C

causes the preprocessor to search the directory D:\MSC\INCLUDE for include
files such as STDIO.H.

The commands

SET INCLUDE=D:\MSC\INCLUDE
CL MYPROG.C

have a similar effect.

If both sets of searches fail, a fatal error is generated.

C.13.3  Including Quoted File Names

The support for quoted names for includable source files (§3.8.2)

If the file name is fully specified, with a path that includes a colon (for
example, F:\C6\SPECIAL\INCL\ORANGE.H), the preprocessor follows the path.

If the file name is not fully specified, the preprocessor searches the
directory of the file that included it. If the file is not found there, the
preprocessor searches the parent directory, the parent's parent, and so on,
terminating with the root directory.

If the include file is not found in any of those directories, the rules for
bracketed file names apply.

C.13.4  Character Sequences

The mapping of source file character sequences (§3.8.2)

Preprocessor statements use the same character set as source file statements
with the exception that escape sequences are not supported.

Thus, to specify a path for an include file, use only one backslash:

#include "path1\path2\myfile"

Within source code, two backslashes are necessary:

fil = fopen( "path1\\path2\\myfile", "rt" );

C.13.5  Pragmas

The behavior on each recognized #pragma directive (§3.8.6)

The following pragmas are defined in the Microsoft C Reference:

#pragma alloc_text                #pragma optimize
#pragma check_pointer             #pragma pack
#pragma check_stack               #pragma page
#pragma comment                   #pragma pagesize
#pragma function                  #pragma same_seg
#pragma intrinsic                 #pragma skip
#pragma linesize                  #pragma subtitle
#pragma loop_opt                  #pragma title
#pragma message

C.13.6  Default Date and Time

The definitions for __DATE__ and __TIME__ when, respectively, the date and
time of translation are not available (§3.8.8)

When a hardware clock is not accessible, the default values for __DATE__ and
__TIME__ are Friday, May 3, 1957 and 5:00 PM.
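
The ANSI predefined macros __DATE__ and __TIME__ expand to string literals
of the forms "Mmm dd yyyy" and "hh:mm:ss". The following sketch joins them
into a single build stamp. The function name build_stamp is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: returns the translation date and time recorded by
   the compiler, joined as "Mmm dd yyyy hh:mm:ss". */
const char *build_stamp(void)
{
    /* Adjacent string literals are concatenated at compile time. */
    static const char stamp[] = __DATE__ " " __TIME__;

    assert(sizeof stamp > 1);
    return stamp;
}
```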

C.14  Library Functions

C.14.1  NULL Macro

The null pointer constant to which the macro NULL expands (§4.1.5)

Several include files define the NULL macro as ((void *)0).

C.14.2  Diagnostic Printed by the assert Function

The diagnostic printed by and the termination behavior of the assert
function (§4.2)

The assert function prints a diagnostic message and calls the abort routine
if the expression is false (0). The diagnostic message has the form

Assertion failed: [expression], file [filename], line [linenumber]

where filename is the name of the source file and linenumber is the line
number of the assertion that failed in the source file. No action is taken
if expression is true (nonzero).

C.14.3  Character Testing

The sets of characters tested for by the isalnum, isalpha, iscntrl, islower,
isprint, and isupper functions (§4.3.1)

Function                          Tests For
────────────────────────────────────────────────────────────────────────────
isalnum                           Characters 0-9, A-Z, a-z
                                  ASCII 48-57, 65-90, 97-122

isalpha                           Characters A-Z, a-z
                                  ASCII 65-90, 97-122

iscntrl                           ASCII 0-31, 127

islower                           Characters a-z
                                  ASCII 97-122

isprint                           Characters A-Z, a-z, 0-9, punctuation,
                                  space
                                  ASCII 32-126

isupper                           Characters A-Z
                                  ASCII 65-90
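
The ranges in this table can be checked by counting how many of the 128
ASCII codes each function accepts. The following is a minimal sketch that
assumes the default "C" locale. The helper name count_ascii is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the ASCII codes (0-127) accepted by one of
   the character-testing functions. */
int count_ascii(int (*test)(int))
{
    int c, n = 0;

    assert(test != NULL);
    for (c = 0; c <= 127; c++)
        if (test(c))
            n++;
    return n;
}
```

For example, count_ascii(isprint) covers codes 32 through 126, and
count_ascii(iscntrl) covers codes 0 through 31 plus 127.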

C.14.4  Domain Errors

The values returned by the mathematics functions on domain errors (§4.5.1)

The ERRNO.H file defines the domain error constant EDOM as 33.

C.14.5  Underflow of Floating-Point Values

Whether the mathematics functions set the integer expression errno to the
value of the macro ERANGE on underflow range errors (§4.5.1)

A floating-point underflow does not set the expression errno to ERANGE. When
a value approaches zero and eventually underflows, the value is set to zero.

C.14.6  The fmod Function

Whether a domain error occurs or zero is returned when the fmod function has
a second argument of zero (§4.5.6.4)

When the fmod function has a second argument of zero, the function returns
zero.

C.14.7  The signal Function

The set of signals for the signal function (§4.7.1.1)

The first argument passed to signal must be one of the symbolic constants
listed below. The constants are defined in SIGNAL.H. Also listed is the
operating mode support for each signal.

Signal Argument                   Description
────────────────────────────────────────────────────────────────────────────
SIGABRT                           Abnormal termination (real and protected
                                  mode).

SIGBREAK                          CTRL+BREAK signal. Terminates the
                                  calling program (protected mode only).

SIGFPE                            Floating-point error, such as overflow,
                                  division by zero, or invalid operation.
                                  Terminates the calling program (real and
                                  protected mode).

SIGILL                            Illegal instruction. Terminates the
                                  calling program (protected mode only).

SIGINT                            CTRL+C interrupt. Issues INT 23H (real
                                  and protected mode).

SIGSEGV                           Illegal storage access. Not generated by
                                  DOS or OS/2, but supported for ANSI
                                  compatibility. Terminates the calling
                                  program (real and protected mode).

SIGTERM                           Termination request sent to the program.
                                  Not generated by DOS or OS/2, but
                                  supported for ANSI compatibility.
                                  Terminates the calling program (real and
                                  protected mode).

SIGUSR1                           OS/2 process flag A (protected mode
                                  only).

SIGUSR2                           OS/2 process flag B (protected mode
                                  only).

SIGUSR3                           OS/2 process flag C (protected mode
                                  only).
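
The following sketch shows one way a handler might be installed for SIGINT
with the ANSI signal and raise functions. The names got_sigint, on_sigint,
and sigint_was_caught are hypothetical. The sketch raises the signal itself
rather than waiting for CTRL+C.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t got_sigint = 0;

/* Handler: records that SIGINT arrived. */
static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;
}

/* Hypothetical helper: installs the handler, raises SIGINT against the
   current program, restores the previous handling, and reports whether the
   handler ran. */
int sigint_was_caught(void)
{
    void (*previous)(int) = signal(SIGINT, on_sigint);

    if (previous == SIG_ERR)
        return 0;
    got_sigint = 0;
    raise(SIGINT);
    signal(SIGINT, previous);
    return got_sigint;
}
```

The handler only sets a flag of type sig_atomic_t; longer processing is
left to the main line of the program.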

C.14.8  Default Signals

If the equivalent of signal (sig, SIG_DFL) is not executed prior to the call
of a signal handler, the blocking of the signal that is performed (§4.7.1.1)

Signals are set to their default status when a program begins running.

C.14.9  The SIGILL Signal

Whether the default handling is reset if the SIGILL signal is received by a
handler specified to the signal function (§4.7.1.1)

The SIGILL signal applies to OS/2 applications only. When SIGILL is
received, the signal handling is not reset to the default SIG_DFL.

C.14.10  Terminating Newline Characters

Whether the last line of a text stream requires a terminating newline
character (§4.9.2)

Stream functions recognize either newline or end-of-file as the terminating
character for a line.

C.14.11  Blank Lines

Whether space characters that are written out to a text stream immediately
before a newline character appear when read in (§4.9.2)

Space characters are preserved.

C.14.12  Null Characters

The number of null characters that can be appended to data written to a
binary stream (§4.9.2)

Any number of null characters can be appended to a binary stream.

C.14.13  File Position in Append Mode

Whether the file position indicator of an append mode stream is initially
positioned at the beginning or end of the file (§4.9.3)

When a file is opened in append mode, the file position indicator initially
points to the end of the file.
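
The effect of append mode can be observed with a short sketch such as the one below, which writes to a new file, reopens it with mode "a", and reads the result back. The function name demo_append and the file name passed to it are assumptions made for illustration.

```c
#include <stdio.h>

/* Writes "abc" to a new file, reopens the file in append mode to add
   "def", and reads the contents back into out.
   Returns 0 on success, -1 on any error. */
int demo_append(const char *path, char *out, size_t outsize)
{
    FILE *fp;
    size_t n;

    if (outsize == 0)
        return -1;

    if ((fp = fopen(path, "w")) == NULL)
        return -1;
    fputs("abc", fp);
    fclose(fp);

    /* Output written through an append-mode stream goes to the end. */
    if ((fp = fopen(path, "a")) == NULL)
        return -1;
    fputs("def", fp);
    fclose(fp);

    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    n = fread(out, 1, outsize - 1, fp);
    out[n] = '\0';
    fclose(fp);

    remove(path);
    return 0;
}
```

After a successful call, out holds the original text followed by the appended text.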

C.14.14  Truncation of Text Files

Whether a write on a text stream causes the associated file to be truncated
beyond that point (§4.9.3)

Writing to a text stream does not truncate the file beyond that point.

C.14.15  File Buffering

The characteristics of file buffering (§4.9.3)

Disk files accessed through standard I/O functions are fully buffered. By
default, the buffer holds 512 bytes. Some of the low-level DOS and BIOS
functions (all of which are non-ANSI) are unbuffered.
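
If the default buffer is too small for a given stream, the ANSI setvbuf function can supply a larger one. The following is a minimal sketch; the 4096-byte size and the names big_buffer and open_with_big_buffer are arbitrary choices for illustration.

```c
#include <stdio.h>

/* Buffer handed to the stream. It must remain valid until the
   stream is closed, so it is given static storage here. */
static char big_buffer[4096];

/* Opens path for writing and requests full buffering through
   big_buffer. Returns the stream, or NULL on failure. */
FILE *open_with_big_buffer(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on fp. */
    if (setvbuf(fp, big_buffer, _IOFBF, sizeof big_buffer) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Because the sketch uses a single static buffer, only one stream at a time should be opened this way.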

C.14.16  Zero-Length Files

Whether a zero-length file actually exists (§4.9.3)

Files with a length of zero are permitted.

C.14.17  File Names

The rules for composing valid file names (§4.9.3)

A file specification can include an optional drive letter (always followed
by a colon), a series of optional directory names (separated by
backslashes), and a file name.

File names and directory names can contain up to eight characters followed
by a period and a three-character extension. Case is ignored. The wild-card
characters * and ? are not permitted within the name or extension.
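
The naming rules above can be checked in code. The sketch below tests a single file name (no drive or directory part) against only the rules stated here. It assumes the extension is optional and may be shorter than three characters, and it does not test for other characters that the operating system may reject. The function name is_valid_83_name is hypothetical.

```c
#include <string.h>

/* Returns 1 if s has a name of 1 to 8 characters, an optional
   extension of at most 3 characters, a single period at most,
   and no wild-card characters. Returns 0 otherwise. */
int is_valid_83_name(const char *s)
{
    const char *dot = strchr(s, '.');
    size_t name_len = dot ? (size_t)(dot - s) : strlen(s);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;

    if (name_len < 1 || name_len > 8)
        return 0;
    if (ext_len > 3)
        return 0;
    if (dot != NULL && strchr(dot + 1, '.') != NULL)
        return 0;                       /* more than one period */
    if (strpbrk(s, "*?") != NULL)
        return 0;                       /* wild cards not allowed */
    return 1;
}
```

For example, README.TXT and MAKEFILE pass the check, while LONGFILENAME.TXT and *.C do not.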

C.14.18  File Access Limits

Whether the same file can be open multiple times (§4.9.3)

Opening a file that is already open is not permitted.

C.14.19  Deleting Open Files

The effect of the remove function on an open file (§4.9.4.1)

The remove function deletes a file, even if the file is open.

C.14.20  Renaming with a Name that Exists

The effect if a file with the new name exists prior to a call to the rename
function (§4.9.4.2)

If you attempt to rename a file using a name that exists, the rename
function fails and returns an error code.
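
Because the result of renaming onto an existing name is implementation-defined, a program that must behave the same way on other systems can test for the target itself. The sketch below does this with a hypothetical helper named rename_no_clobber. The test and the rename are separate steps, so another process could create the file between them.

```c
#include <stdio.h>

/* Renames oldname to newname only if newname cannot be opened.
   Returns 0 on success, -1 if newname exists or rename fails. */
int rename_no_clobber(const char *oldname, const char *newname)
{
    FILE *fp = fopen(newname, "r");

    if (fp != NULL) {           /* target already exists */
        fclose(fp);
        return -1;
    }
    return (rename(oldname, newname) == 0) ? 0 : -1;
}
```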

C.14.21  Printing Pointer Values

The output for %p conversion in the fprintf function (§4.9.6.1)

Microsoft C supports three types of pointer conversions: %p (a pointer), %lp
(a 32-bit far pointer), and %hp (a 16-bit near pointer).

The fprintf function produces hexadecimal values of the form XXXX (an
offset) for near pointers or XXXX:XXXX (a segment plus an offset, separated
by a colon) for far pointers. The output for %p depends on the memory model
in use.
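
A minimal sketch of the %p conversion follows. It uses only the standard %p form, since %lp and %hp are Microsoft extensions, and the exact text printed depends on the compiler and memory model. The function name print_pointer is hypothetical.

```c
#include <stdio.h>

/* Prints the value of p with the %p conversion, followed by a
   newline. Returns the number of characters written, or a
   negative value on error. */
int print_pointer(const void *p)
{
    /* %p expects a pointer to void. */
    return printf("%p\n", (void *)p);
}
```

A call such as print_pointer( &var ) displays the address of var in the implementation's pointer format.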

C.14.22  Reading Pointer Values

The input for %p conversion in the fscanf function (§4.9.6.2)

When the %p format character is specified, the fscanf function converts
pointers from hexadecimal ASCII values into the correct address.

C.14.23  Reading Ranges

The interpretation of a dash (-) character that is neither the first nor the
last character in the scanlist for %[ conversion in the fscanf function
(§4.9.6.2)

The following line

fscanf( fileptr, "%[A-Z]", strptr);

reads any number of characters in the range A-Z into the string to which
strptr points.
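
The same range conversion can be tried on a string with sscanf. The sketch below adds a field width so that the conversion cannot overrun the destination. The function name read_upper_run and the 32-byte buffer size are assumptions made for illustration.

```c
#include <stdio.h>

/* Copies the leading run of characters in the range A-Z from input
   into out, which must hold at least 32 bytes.
   Returns 1 if at least one character matched, 0 otherwise. */
int read_upper_run(const char *input, char *out)
{
    out[0] = '\0';
    /* The width 31 leaves room for the terminating null character. */
    return sscanf(input, "%31[A-Z]", out) == 1;
}
```

Given the input "HELLOworld", the conversion stops at the first character outside the range and stores "HELLO".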

C.14.24  File Position Errors

The value to which the macro errno is set by the fgetpos or ftell function
on failure (§4.9.9.1, 4.9.9.4)

When fgetpos or ftell fails, errno is set to the manifest constant EINVAL if
the position is invalid or EBADF if the file number is bad. The constants
are defined in ERRNO.H.
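
A typical way to use this information is to clear errno, call ftell, and examine errno only if the call fails. The following sketch shows the pattern; the function name checked_tell is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Returns the current file position of fp. On failure, reports
   the reason recorded in errno and returns -1L. */
long checked_tell(FILE *fp)
{
    long pos;

    errno = 0;
    pos = ftell(fp);
    if (pos == -1L)
        fprintf(stderr, "ftell failed: %s\n", strerror(errno));
    return pos;
}
```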

C.14.25  Messages Generated by the perror Function

The messages generated by the perror function (§4.9.10.4)

The perror function generates these messages:

0  Error 0
1
2  No such file or directory
3
4
5
6
7  Arg list too long
8  Exec format error
9  Bad file number
10
11
12 Not enough core
13 Permission denied
14
15
16
17 File exists
18 Cross-device link
19
20
21
22 Invalid argument
23
24 Too many open files
25
26
27
28 No space left on device
29
30
31
32
33 Math argument
34 Result too large
35
36 Resource deadlock would occur

C.14.26  Allocating Zero Memory

The behavior of the calloc, malloc, or realloc function if the size
requested is zero (§4.10.3)

The calloc, malloc, and realloc functions accept zero as an argument. No
actual memory is allocated, but the memory size can be modified later by
realloc.
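
The sketch below starts with a zero-size request and enlarges it with realloc. Since other implementations may return NULL for a zero-size request, the code accepts either result. The function name grow_from_zero is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Requests a zero-size block, then resizes it to newsize bytes and
   clears it. Returns the block (to be released with free), or NULL
   if newsize is zero or memory is unavailable. */
char *grow_from_zero(size_t newsize)
{
    char *p, *q;

    if (newsize == 0)
        return NULL;

    p = malloc(0);              /* may be NULL or a valid pointer */
    q = realloc(p, newsize);    /* realloc(NULL, n) acts like malloc(n) */
    if (q == NULL) {
        free(p);                /* the original block is still owned */
        return NULL;
    }
    memset(q, 0, newsize);
    return q;
}
```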

C.14.27  The abort Function

The behavior of the abort function with regard to open and temporary files
(§4.10.4.1)

The abort function does not close files that are open or temporary. It does
not flush stream buffers.

C.14.28  The atexit Function

The status returned by the atexit function if the value of the argument is
other than zero, EXIT_SUCCESS, or EXIT_FAILURE (§4.10.4.3)

The atexit function returns zero if successful, or a nonzero value if
unsuccessful.
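
The return value can be tested when a termination routine is registered, as in the following sketch. The names say_goodbye and register_cleanup are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical termination routine, called when the program exits
   normally. */
static void say_goodbye(void)
{
    fputs("cleanup done\n", stderr);
}

/* Registers say_goodbye with atexit.
   Returns 1 if registration succeeded, 0 if it failed. */
int register_cleanup(void)
{
    return atexit(say_goodbye) == 0;
}
```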

C.14.29  Environment Names

The set of environment names and the method for altering the environment
list used by the getenv function (§4.10.4.4)

The set of environment names is unlimited.

To change environment variables from within a C program, call the putenv
function. To change environment variables from the DOS command line, use the
SET command (for example, SET LIB=D:\LIBS).

Environment variables exist only as long as their host copy of DOS is
running. For example, the line

system( "SET LIB=D:\\LIBS" );

would run a copy of DOS, set the environment variable LIB, and return to the
C program, exiting the secondary copy of DOS. Exiting that copy of DOS
removes the temporary environment variable LIB.

Likewise, changes made by the putenv function last only until the program
ends.
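
Reading a variable with getenv is straightforward, but the function returns NULL when the name is not in the environment. The sketch below wraps the call so that the program always receives a usable string. The function name env_or_default and the fallback value are assumptions made for illustration.

```c
#include <stdlib.h>

/* Returns the value of the environment variable name, or fallback
   if the variable is not set. The returned string must not be
   modified by the caller. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return (value != NULL) ? value : fallback;
}
```

A call such as env_or_default( "LIB", "D:\\LIBS" ) yields the current setting of LIB if there is one and the default path otherwise.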

C.14.30  The system Function

The contents and mode of execution of the string by the system function
(§4.10.4.5)

The system function executes an internal DOS or OS/2 command, or an EXE,
COM, or BAT file from within a C program rather than from the command line.

It examines the COMSPEC environment variable to find the command
interpreter, which is typically COMMAND.COM in DOS or CMD.EXE in OS/2. The
system function then passes the argument string to the command interpreter.

C.14.31  The strerror Function

The contents of the error message strings returned by the strerror function
(§4.11.6.2)

The strerror function generates these messages:

0   Error 0
1
2   No such file or directory
3
4
5
6
7   Arg list too long
8   Exec format error
9   Bad file number
10
11
12  Not enough core
13  Permission denied
14
15
16
17  File exists
18  Cross-device link
19
20
21
22  Invalid argument
23
24  Too many open files
25
26
27
28  No space left on device
29
30
31
32
33  Math argument
34  Result too large
35
36  Resource deadlock would occur
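
A table like the one above can be produced on any implementation by calling strerror for each error number. The following sketch does so; the function name list_error_messages is hypothetical, and the text printed depends on the run-time library in use.

```c
#include <stdio.h>
#include <string.h>

/* Prints the error number and message for each code from 0 through
   last. Returns the number of lines printed. */
int list_error_messages(int last)
{
    int i;
    int count = 0;

    for (i = 0; i <= last; i++) {
        const char *msg = strerror(i);

        if (msg != NULL) {
            printf("%-3d %s\n", i, msg);
            count++;
        }
    }
    return count;
}
```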

C.14.32  The Time Zone

The local time zone and Daylight Saving Time (§4.12.1)

The local time zone is Pacific Standard Time. Microsoft C supports Daylight
Saving Time.

C.14.33  The clock Function

The era for the clock function (§4.12.2.1)

The clock function's era begins (with a value of 0) when the C program
starts to execute. It returns times measured in units of 1/1000 second.
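
To convert the value returned by clock into seconds without depending on the unit used by a particular implementation, divide by the ANSI constant CLOCKS_PER_SEC. The sketch below shows the conversion; the function name processor_seconds is hypothetical.

```c
#include <time.h>

/* Returns the time reported by clock, converted to seconds, or
   -1.0 if the time is not available. */
double processor_seconds(void)
{
    clock_t ticks = clock();

    if (ticks == (clock_t)-1)
        return -1.0;
    return (double)ticks / CLOCKS_PER_SEC;
}
```

Calling the function before and after a section of code and subtracting the two results gives the time used by that section.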

INDEX
──────────────────────────────────────────────────────────────────────────

80x87 coprocessor
80x87
Detection of

A
Alternate math package

B
_based keyword

C
C extensions, PWB
building protected-mode
building real-mode
calling C library functions
calling C library routines
calling PWB functions
describing functions and switches
initializing functions
prototyping functions
receiving parameters
sample
versus executable files
Calls
Emulator math package
Emulator package
Floating-point
Math coprocessor package
CODEVIEW.LST file
CodeView
debugging
DLLs with
CONFIG.SYS file
CONFIG.SYS files
_control87
Coordinates
overview
physical
screen location
viewport
window
Coprocessor
CURRENT.STS

D
Default math package
Denormalized numbers
Dot commands
double

E
80x87 coprocessor
80x87
Detection of
EMOEM.ASM
Emulator math package
Emulator package
In-line
Environment
NO87 variable

F
_far keyword
_fastcall keyword
Files
.FON
CODEVIEW.LST
CONFIG.SYS
CURRENT.STS
TOOLS.INI See TOOLS.INI
Fill patterns
float
Floating-point accumulator
Floating-point math
requirements, DLLs
Floating-point
Alternate math package
Biased exponent
Compatibility between options
Floating-Point
Controlling
Floating-point
Default math package
Default package
Denormalized numbers
Effect of calls on code size
Effect of calls on speed
Effect on optimization
Emulator package
Exceptions
exponent
Fastest programs
In libraries
Infinities
Interrupt-enable
Library considerations for
Mantissa
Math coprocessor package
Maximizing accuracy
NaNs
On non-IBM compatible computers
Packages
Precision
Program size
Program speed
Sign bit
Smallest programs
Transcendental function results
Underflow
Using in dynamic-link libraries
With libraries not provided by Microsoft
.FON files
/FPa
/FPc
/FPc87
/FPi
/FPi87
Function calls
near call
C calling convention
FORTRAN/Pascal calling convention
register calling convention
_fastcall calling convention
Functions
drive and directory (list)
graphics (list)
initializing
prototyping
Returning floating-point types

G
/Gd option
Graphics
video modes
default mode
graphics mode, defined
text mode, defined

H
Help files
local help context
HIMEM.SYS driver
_huge keyword

I
IEEE
In-line assembly
In-line
Floating-point emulator package
Floating-point emulator
Floating-point instructions
Floating-point math coprocessor package
Floating-point
Institute of Electrical and Electronics Engineers. See IEEE

L
Language conventions
calling conventions
naming conventions
parameter-passing conventions
Libraries
import
special
/EXEPACK option
/FARCALLTRANSLATION option
/NODEFAULTLIBRARYSEARCH (/NOD) option
/NOEXTENDEDDICTSEARCH (/NOE) option
/NOIGNORECASE (/NOI) option
/PACKCODE option
/PACKDATA option
/TINY option
compatibility (/Lc)
PACKCODE option
protected-mode (/Lp)
real-mode (/Lr)
LLIBCDLL.LIB
long double

M
Macros
inherited
Math coprocessor package
In-line

N
_near keyword
NO87

O
Optimization
Effect of floating-point math on
optimize pragma
OS/2
calling

P
Process
child
debugging multiple processes
Programmer's WorkBench. See PWB
Pseudotargets
PWB
80x87 option
Debug Build Options
Emulation calls option
extensions. See C extensions, PWB
Fast alternate math option
Global Compile Options
Inline 80x87 Instructions option
Inline Emulation option
Release Build Options
Selecting floating-point options from

R
Run-time
support of type long double

S
SETUP
SLLs
data segments

T
/TINY option
Type
double
float
long double
Promotion of floating point
Range of floating-point
Significance of floating-point
Storage requirements for floating-point
Widening for floating-point types
Types
double
float
long double

V
Variables
Declaring as floating-point types
Precision with floating-point
Promotion of floating-point
Range of floating-point
Significance of
Storage requirements for