A statistical analysis package for handling numerical data, operated by entering one-line commands and subcommands. Command "batch" files can be created for automatic execution, along with explanatory screen remarks. STATMATE operates on information contained in a database, generated by the program. A user ID is required before entering a database, and for every new user ID, an empty database is created. This feature permits multiple users to work with STATMATE while keeping the data files separated. Data can be extracted from an ASCII text file and loaded into the database for operation. Data is stored in columns and rows, and you can extract portions of the data according to your specifications. As you manipulate the data, the results can be displayed on the screen, printed, or saved on a disk file. The main analytic features are elementary statistics, scatter plots, cross tabulations, histograms, data comparison using the T-Test, correlation, arithmetic operations, distribution functions, curvilinear regression, multiple regression, nonlinear regression, data recoding, and data transformation and manipulation. An on-line help facility is included to give you a detailed description of all the STATMATE commands.
Disk No 863
Program Title: STATMATE/PLUS version 1.3 (Disk 3 of 3)
PC-SIG version 1.1

This is the third disk of the STATMATE package, disks #861-63, and contains the five-part documentation for the program. Please refer to disk #861 for full information.

Usage: Statistics Analysis
System Requirements: 128K memory and two disk drives.
Suggested Registration: $50.00

File Descriptions:

SMPART1  DOC   Documentation, part 1.
SMPART2  DOC   Documentation, part 2.
SMPART3  DOC   Documentation, part 3.
SMPART4  DOC   Documentation, part 4.
SMPART5  DOC   Documentation, part 5.
README         How to get started.

PC-SIG
1030D E Duane Avenue
Sunnyvale Ca. 94086
(408) 730-9291
(c) Copyright 1987,88 PC-SIG Inc.
╔════════════════════════════════════════════════════════════╗
║      <<<< Disk #863 STATMATE/PLUS (Disk 3 of 3) >>>>       ║
╠════════════════════════════════════════════════════════════╣
║  To copy the documentation to your printer, Type:          ║
║          PRINTDOC (press enter)                            ║
╚════════════════════════════════════════════════════════════╝
STATMATE/PLUS
(A Statistical Package)
Version 1.3

Shareware User's Guide
August 1, 1988

The Software Hill
1857 Apple Tree Lane
Mountain View, Ca. 94040

Copyright (C), 1987

COPYRIGHT

The STATMATE/PLUS statistical application package is copyrighted (C) 1987, by The Software Hill. All rights reserved. Non-registered users are granted a limited license to use this product on a trial basis, and to copy the program for trial use by others subject to the following limitations:

1. STATMATE/PLUS is distributed in unmodified form, complete with documentation.
2. No fee, charge or other consideration is requested or accepted.
3. STATMATE/PLUS is not distributed in conjunction with any other product.

If you intend to use STATMATE/PLUS on a regular basis, please show your support by registering the program for a nominal fee. Registration information is given below.

Commercial, business or governmental use by non-registered users is prohibited. If you are interested in multiple copies for use at work, site and corporate licenses are available. Please write for information.

TRADEMARKS

STATMATE/PLUS is a trademark of The Software Hill.
TABLE OF CONTENTS

REGISTRATION
USER-SUPPORTED SOFTWARE
PRODUCT SUPPORT
INTRODUCTION TO STATMATE/PLUS
FEATURES
OPERATION
    STATMATE Example
    Command Summary
INTERACTING WITH STATMATE
    Commands and Subcommands
    Commands
    Subcommands
STATMATE DATABASE CONCEPTS
    STATMATE Database and Directory
    Organization and Manipulation of Data
    Variable Names
    Ways of Referencing Data--Variables and Cases
ENTERING DATA INTO THE SYSTEM
    External Data Entry--Files
    Creating ASCII Files
COMMAND DESCRIPTIONS
    CROSSTABS
    ERASE
    EXECUTE
    EXIT
    GIVE
    HELP
    INPUT
    LET
    PLOT
    PRINT
    QUERY
    REGRESSION
    REMARK
    SET
    SHOW
    STATISTICS
    TTEST
    WHEN-ELSE-END
    WRITE
APPENDIX A: Computation Methods
APPENDIX B: Sample Data
APPENDIX C: Installation and Miscellanea
APPENDIX D: STATMATE Size Limitations
APPENDIX E: HELP
APPENDIX F: Suggested Diskette Organization
APPENDIX G: Invoice and Order Form
References

STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987

REGISTRATION
------------

Feedback on STATMATE/PLUS is an important part of developing a useful and successful software package. Please share your impressions, suggestions and comments by writing to us.

STATMATE/PLUS is distributed as User-Supported Software. You are encouraged to try the program and share it with your friends and colleagues as long as:

1. STATMATE/PLUS is distributed in unmodified form, complete with documentation.
2. No fee, charge or other consideration is requested except by The Software Hill.
3. STATMATE/PLUS is not distributed in conjunction with any other product.

If you use STATMATE/PLUS on a regular basis, please show your support by registering the program. You may register by sending a check or money order for $45 to:

    The Software Hill
    1857 Apple Tree Lane
    Mountain View, Ca. 94040

Registered users will receive (1) notification of major releases of STATMATE/PLUS, newsletters and other information supporting the package and (2) two sort utilities and a high resolution scatter plot utility (for use with CGA, EGA or Hercules graphics cards). Program disks are not included in the registration fee. Note that when you register you receive a $10 coupon applicable to additional purchases.

USER-SUPPORTED SOFTWARE

User-supported software is a means for users to receive quality software while directly supporting software authors. It is based on the following ideas:

a. Users should be able to assess a package immediately through hands-on use, to determine whether it satisfies their personal application needs and operational tastes.
b. The creation and support of independent personal computer software is important to, and desired by, interested application users.
c. Copying of programs should be encouraged rather than restricted, to promote the widest possible development, interest and support by the application's community.

Under the concept of user-supported software, anyone may request a copy of STATMATE/PLUS by sending a blank, DOS formatted, 5-1/4 inch diskette to The Software Hill along with a self-addressed, postage-paid return mailer. You will receive STATMATE and program documentation on the disk by return mail. The program carries a notice suggesting registration, but registration is strictly voluntary. You are encouraged to copy and distribute STATMATE, regardless of whether or not you register, for private and non-commercial use of others.

PRODUCT SUPPORT

As of this date, we are able to provide support to registered owners by mail or phone. A bulletin board system is not presently available. So that we can answer questions or comments, please provide as complete a description of your problem as possible. Include a description of your configuration (hardware and operating system), the steps taken before the problem occurred and any printed material identifying the problem.

The latest version of STATMATE/PLUS may always be found on the PC SIG bulletin board system. As the popularity of the program grows, it will be found on your local bulletin boards, shareware distribution disks and a number of other computer environments.

MACHINE REQUIREMENTS

STATMATE/PLUS requires 128K of memory (RAM). It is best operated from a hard disk but may be operated from two 5 1/4-inch floppies. See the appendix for suggestions on tailoring STATMATE to your system.
It operates under DOS version 2.0 or higher. STATMATE may be used on a PC, XT, AT or compatible. There is no dependence on the type of terminal used, whether monochrome, composite or color. Any terminal type is satisfactory.

INTRODUCTION

This guide briefly describes some of the capabilities of STATMATE/PLUS. Enough descriptive material is given in this guide so that you should be able to understand the capabilities of the package and put STATMATE to use in your applications. For registered owners, a complete guide to STATMATE/PLUS is available for $35.

Probably the best place to start is by reading over the material in this guide. When you are ready to try the program, read the section on operating STATMATE. Try the example discussed there and read Appendix F regarding the suggested organization of STATMATE program files on disk. Once you have completed the example, try the package with the EXECUTE command on the DEMO file. After starting STATMATE and giving the required three-character ID, enter the following in response to the command prompt:

    EXECUTE DEMO

This will cause STATMATE to run through a sequence of commands. A description of what is happening is given as the program proceeds through the commands. The demonstration pauses so that you can read what is happening at your own pace. The command summary given later indicates the commands which will operate in the demonstration program.

FEATURES

A brief summary of some of the many STATMATE operational features is given below.
-Operational Features-

* Help facility
* Data extraction from external files
* Placement of output on results files
* User named variables
* Default names for variables
* Data transformation and manipulation
* Display of selected data
* Missing or not applicable data values
* Multiple user operation
* Database maintenance operations
* Data selection by specified conditions
* ASCII output files to other applications

The analytic features available with the STATMATE package are given below.

-Statistical Features-

* Elementary statistics
* Scatter plots
* Cross tabulations
* Histograms
* T-Test
* Correlation
* Curvilinear regression
* Random Number Generation
* Arithmetic operations
* Distribution functions
* Multiple regression
* Group statistics
* One-way ANOVA
* Two-way ANOVA
* Control chart calculations
* Nonparametric methods
* Nonlinear regression
* Polynomial regression
* Data recoding

OPERATION

Operation of STATMATE is begun by entering:

    SMATE

You will be prompted with the message:

    ENTER ID-

The program expects a three-character identification as a response, made up of alphabetic characters* in upper or lower case. The characters are used as a database identification. Your initials are usually the simplest ID to use. For example,

    ENTER ID-XYZ

In this case XYZ is used as an ID and XYZ is the identified database. Use of another ID would identify another database. This mechanism allows you to create databases for different purposes; for example, one might hold your data and another a colleague's.

(IMPORTANT: Each time you supply a different ID, STATMATE creates an empty database file corresponding to the ID you supplied. These files may be large. It is best to use the same ID each time you use STATMATE or you will quickly exceed your disk capacity. See the use of databases in section STATMATE DATABASE CONCEPTS.)
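The ID rule can be illustrated with a short Python sketch. This is not STATMATE code: the function name parse_id is hypothetical, and the sketch only encodes what the guide states, namely three alphabetic characters in either case, optionally followed by the q or Q that the guide's footnote says suppresses the shareware banner. How STATMATE maps an ID to a database file is not described, so that is left out.

```python
import re

# Three letters, then an optional q/Q (banner-suppression flag per the guide's footnote).
_ID_PATTERN = re.compile(r"([A-Za-z]{3})([qQ]?)")


def parse_id(response):
    """Split an ENTER ID- response into (database_id, quiet).

    Hypothetical helper: raises ValueError when the response does not
    follow the three-letter rule stated in the guide.
    """
    match = _ID_PATTERN.fullmatch(response.strip())
    if match is None:
        raise ValueError("ID must be three alphabetic characters")
    return match.group(1), bool(match.group(2))
```

For example, parse_id("XYZ") yields the ID XYZ with the banner shown, while parse_id("XYZQ") yields the same ID with the banner suppressed.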
After you provide the ID, the program will issue the prompt:

    Command:

At this point, a STATMATE command must be entered in order to continue the operation of STATMATE. A carriage return must be entered at the completion of each line of input. After each command is completed, another 'Command:' prompt will be issued. Entering EXIT will terminate the program.

* By supplying a fourth character, q or Q, with the ID, the shareware banner output after entering the ID is suppressed.

STATMATE EXAMPLE

The following sample illustrates the operation of STATMATE:

    DATA U.S. POPULATION:YEAR,URBAN,RURAL
    1860 ,  6.217 , 25.227
    1870 ,  9.902 , 28.656
    1880 , 14.130 , 36.026
    1890 , 22.106 , 40.841
    1900 , 30.160 , 45.835
    1910 , 41.999 , 49.973
    1920 , 54.158 , 51.553
    1930 , 68.955 , 53.830
    1940 , 74.924 , 57.246
    1950 , 88.927 , 61.770

The data used in the example problem is the U.S. population data for rural and urban areas from 1860 through 1950. The data is on a file called USPOPDEM.DAT (a portion of the USPOP.DAT file supplied with STATMATE) and consists of three fields: year, urban population and rural population; population data is in millions.

The example session is shown after this description. In the session, the ENTER ID- prompt is answered with ABC. This establishes the user's ABC database as the database which is to be used in the example. Next, the ERASE command clears all data from the database. Data is then extracted from the data file USPOPDEM.DAT by the INPUT command. In the INPUT command, the clause OMIT 2 causes the first two fields of the data file to be ignored. The clause KEEP 1 causes the third field, rural population, to be extracted. Hence, only one variable, that is, one column or field of data, containing rural population, is placed in the database. STATMATE only operates on data placed in its database.
There are 10 data points or cases for the extracted variable, which STATMATE reports as 10 CASES at completion of the INPUT command. Initially this variable has the name #1 assigned to it. The GIVE NAME command is used to give #1 an alternate name, RURALPOP. The STATISTICS command is applied to RURALPOP to derive the simple statistics produced by the command. The program is terminated by the EXIT command.

    SMATE
    ENTER ID-ABC

    Command: ERASE #1 THRU END
    10 VARIABLES ERASED

    Command: INPUT USPOPDEM.DAT OMIT 2,KEEP 1
    1 VARIABLE INPUT AT #1     10 CASES

    Command: GIVE NAME #1,RURALPOP
    1 ATTRIBUTES MODIFIED

    Command: STATISTICS RURALPOP

    VARIABLE: RURALPOP     10 CASES     0 MISSING

    CENTRAL TENDENCY      SPREAD                     DISTRIBUTION
    -------------------   ------------------------   --------------------
    MEAN          45.10   STD. DEV.          12.17   MINIMUM        25.23
                          VARIANCE          148.15   MAXIMUM        61.77
                          RANGE              36.54
                          COEFF. VAR.         0.27

    SUMMATIONS                HIGHER MOMENTS
    -----------------------   --------------------
    TOTAL            450.96   SKEWNESS       -0.37
    SUM SQ         21669.59   KURTOSIS        1.93
    SUM SQ(DEV)     1333.37

    Command: EXIT
    END STATMATE

Command Summary

Commands are the basic form of communicating with STATMATE. A summary of the commands available for both STATMATE/PLUS and STATMATE is shown below.
Summary of STATMATE/PLUS Commands

Command Name  Description             Modifiers/Descriptors
------------  ----------------------  ----------------------
BREAKDOWN     Statistics by groups    none
CHART         X- and R-Charts         TYPE, CONTROL, KCENTER, HMARK, VRANGE,
                                      TITLE, HLABEL, KSIGMA, DISPLAY, HFILLER,
                                      VPOSITION, VLABEL
COMPUTE       Fit and forecast        TYPE
CORRELATE     Pair-wise correlation   none
CROSSTABS     Two-way cross tabs      none
CURVE         Ten curve fits          TABLE, BEST, EQUATION
CUSUM         Cusum chart             TARGET, DISPLAY, RESET, HMARK, HFILLER,
                                      VRANGE, VPOSITION, TITLE, VLABEL, HLABEL
EDIT          Database editing        none
ELSE          Reverse WHEN condition  none
END           Remove WHEN condition   none
EXIT          End STATMATE operation  none
ERASE         Remove database data    none
EXECUTE       Multiple command entry  none
GIVE          Give data attributes    none
HELP          Provide command help    none
HISTOGRAM     Histogram               TITLE, RANGE, BARS, VPOSITION
INPUT         Data input              KEEP, OMIT
KOLMOGOROV    Kolmogorov tests        DISTRIB, SPARAM, UPARAM
LET           Arithmetic operations   none
NONLINEAR     Nonlinear regression    MODEL, MAXITER, REPORT, TYPE, CONVERGE
ONEWAY        One-way ANOVA           METHOD, ALPHA
ONPARAM       Nonparametric methods   none
PLOT          Scatter plot            TITLE, HRANGE, VRANGE, HPOS, VPOS,
                                      HLAB, VLAB
POLYNOMIAL    Polynomial regression   TABLE,
PRINT         Data display            none
QUERY         Database status         none
RCORRELATION  Rank correlation        TEST
RECODE        Recode crosstab data    none
REGRESSION    Multiple regression     TABLE, INTERCEPT, DURBIN
REMARK        Allows documentation    none
SET           Set output type         COPY
SHOW          Show internal status    none
STATISTICS    Summary statistics      TABLE
STEPWISE      Stepwise regression     TABLE, MAXSTEP, FORCE, FENTER, FREMOVE,
                                      METHOD
TNPARAM       2-way nonparam ANOVA    TEST
TTEST         Student T-test          none
TWOWAY        Two-way ANOVA           DESIGN
WHEN          Select database view    none
WRITE         Put variables to file   none
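As a cross-check on the STATISTICS output shown in the STATMATE Example, the reported figures for RURALPOP can be recomputed with a short Python sketch. The formulas are an inference, not STATMATE's documented method (the guide defers computation methods to Appendix A): the variance divides by n-1, while skewness and kurtosis use moments divided by n, with kurtosis not reduced by 3. These choices reproduce the printed values to two decimals. The function name summary_statistics is hypothetical.

```python
import math

# Rural population (millions), 1860-1950, from the USPOPDEM.DAT listing.
RURAL_POP = [25.227, 28.656, 36.026, 40.841, 45.835,
             49.973, 51.553, 53.830, 57.246, 61.770]


def summary_statistics(values):
    """Recompute the quantities printed by the STATISTICS example."""
    n = len(values)
    total = sum(values)
    mean = total / n
    sum_sq = sum(v * v for v in values)
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    variance = sum_sq_dev / (n - 1)          # sample variance (assumed)
    std_dev = math.sqrt(variance)
    m2 = sum_sq_dev / n                      # moments about the mean, divided by n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std_dev": std_dev,
        "variance": variance,
        "range": max(values) - min(values),
        "coeff_var": std_dev / mean,
        "total": total,
        "sum_sq": sum_sq,
        "sum_sq_dev": sum_sq_dev,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


if __name__ == "__main__":
    for name, value in summary_statistics(RURAL_POP).items():
        print(f"{name:12s}{value:12.2f}")
```

Choosing the n-1 divisor for the variance matters here: dividing SUM SQ(DEV) by 10 instead of 9 would give 133.34 rather than the 148.15 that STATMATE prints.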
INTERACTING WITH STATMATE

Commands and Subcommands

The primary means by which you communicate with STATMATE is through commands. For example, the PRINT command tells STATMATE to list data which is specified in the command. In some instances, a command requires the use of a subcommand to enter additional information about the operation requested by the command. In STATMATE, subcommands are available, for example, with the CHART, CUSUM, STEPWISE and PLOT commands. Information related to specific commands and subcommands is found in the command description portion of this manual. If you need help while actually entering a command or subcommand, an on-line HELP command is available.

Commands

A command contains a command name and a reference to the variables that it is to operate on. For example,

    STATISTICS URBANPOP

calculates statistics for the variable URBANPOP, representing urban population data. A command name may be abbreviated by using the first three characters of its name. That is,

    STA URBANPOP

is an acceptable command.

Some commands operate on several variables. For example, the scatter plot command, PLOT, requires variables for plotting on the vertical and horizontal plot axes. To help distinguish between the uses of variables, some commands use a keyword which divides the list of variables into easily identifiable pieces. For example, in

    PLOT SALES,WAGES ON YEAR

ON is a keyword that separates the list of vertically plotted variables, SALES and WAGES, from the horizontally plotted variable, YEAR.

Another type of item used in a command is a modifier. A modifier supplies some additional information about how the command should operate. For example,

    PLOT WAGES ON YEAR,TITLE='PLOT OF WAGES BY YEAR'

TITLE is a modifier indicating that the title shown should appear on the scatter plot.
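The abbreviation rule for command names can be sketched in Python. This is an illustration rather than STATMATE's own logic: the function name resolve_command is hypothetical, the name list is a subset taken from the command summary, and the sketch combines the three-character minimum stated above with the rule from the command descriptions section that any additional characters must match the name position by position (so STA and STATI match STATISTICS but STATS does not). Accepting lower case follows the guide's statement that command elements may be entered in either case.

```python
# A subset of command names from the command summary.
COMMANDS = ("CROSSTABS", "ERASE", "EXECUTE", "EXIT", "GIVE", "HELP",
            "INPUT", "LET", "PLOT", "PRINT", "QUERY", "REGRESSION",
            "REMARK", "SET", "SHOW", "STATISTICS", "TTEST", "WHEN", "WRITE")


def resolve_command(word, names=COMMANDS, min_len=3):
    """Return the full command name for an abbreviation, or None.

    The word must have at least min_len characters and must be a
    prefix of exactly one known name.
    """
    word = word.upper()
    if len(word) < min_len:
        return None
    matches = [name for name in names if name.startswith(word)]
    return matches[0] if len(matches) == 1 else None
```

Under these assumptions resolve_command("sta") and resolve_command("STATI") both give STATISTICS, while resolve_command("STATS") gives None because the fifth character does not match.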
Modifiers are followed by an equal sign (=) and are separated from the list of variables and from one another by commas. For example,

    CURVE WAGES ON YEAR,TABLE=ANOVA,EQUATION=LIN,QUA

Some modifiers contain option names following the equal sign to designate which options are to be selected for the modifier. For example,

    TABLE=FIT,PARAMETERS

indicates the FIT and PARAMETERS options are selected for the TABLE modifier. As with command names, only the first three characters of a modifier or option name need be used.

Although some commands accept modifiers, the modifiers do not need to be entered. If a modifier is not entered it assumes a default value. For example, if the TITLE modifier is not given for PLOT, the title is assumed to be blank. In most instances, you cannot specify your own default values; however, with a few commands, STEPWISE and PLOT, for example, you are allowed to change the default settings. This is a very useful feature. For example, in the event that you use the same scatter plot title for much of your work, the plot title can be set and not changed until necessary.

Subcommands

Subcommands are used to specify additional operations for a command. Subcommands are available only with a few of the commands. Usually a subcommand gives you an alternate way of entering information about modifiers. Other subcommands permit the values of current modifier settings to be displayed or saved.

Entering and Using Commands and Subcommands

STATMATE will prompt you with a message to enter a command or subcommand. In the case of a command, the prompt is:

    Command:

To cause STATMATE to execute a command, you only need enter the command as in:

    Command: PLOT URBANPOP ON YEAR

In some instances, the text of a command may be so long that it will not fit on a single line. Commands may be continued by breaking them after a comma.
That is, anywhere that a comma is allowed in a command is a point at which the command may be broken. When a line is terminated by a comma, STATMATE will ask you for an additional line of text with the following prompt:

    Continue:

For example,

    Command: PLOT URBANPOP ON YEAR,
    Continue: TITLE='URBAN POPULATION FROM 1790 TO 1950'

As many as 250 characters of text may be included in a command. Command names, keywords and other elements of a command may be entered in either upper or lower case letters.

In the case of the PLOT and STEPWISE commands, for example, it is not necessary to enter the information following the command name in response to the command prompt. An alternate method is available that some users may find easier. For example, if only the command name PLOT is entered, STATMATE will then prompt you for the names of the variables to be plotted. After the names have been entered, you will be placed in the subcommand mode. With the subcommands, you may enter any of the PLOT modifiers. When you have entered the modifiers you need, entering the CONTINUE subcommand causes the PLOT command to be executed and a plot to be produced.

The following is a sequence of commands and subcommands used to enter the PLOT command discussed above.

    Command: PLOT
    Enter Y-axis variables: URBANPOP
    Enter X-axis variable: YEAR
    Subcommand: TITLE='URBAN POPULATION FROM 1790 TO 1950'
    Subcommand: CONTINUE

Note that TITLE and CONTINUE are subcommands. CONTINUE causes STATMATE to leave the subcommand mode and execute the PLOT command.

STATMATE DATABASE CONCEPTS

STATMATE Database and Directory

An important concept in the operation of STATMATE is the STATMATE database. This is a file containing the data on which the program operates, and is generated by STATMATE. Data is placed on the file by the INPUT command and may be displayed by the PRINT command.
A user identifies which database to operate on when responding to the ID prompt issued by the program. The ID is associated with a database. Usually one database is sufficient for most applications. Since these database files can occupy a lot of disk space, some care should be taken in utilizing a number of different IDs. An empty database is created each time a new ID is specified. A database is reused when the ID given corresponds to an existing database.

When a database is created, it is large enough to accommodate 10 variables with as many as 250 cases per variable. The STATMATE install program, SMINSTLL, may be used to change the size of the databases created by STATMATE.

With a database of 10 variables, the user may store, modify, manipulate and repeatedly use up to 10 variables for analysis. This ability reduces the re-entry of data for each analysis. In situations where several users work with STATMATE, they may want to create their own databases by using different IDs. This feature provides additional security when multiple users operate the package.

Associated with the database is a program generated directory which contains the names of the variables and other attributes of the data contained in the database. The directory may be examined and manipulated by such commands as GIVE, QUERY and ERASE.

Organization and Manipulation of Data within the Database

Data is organized by variables within the database. Variables represent a collection of data on which some analytic or manipulative operation is to be performed; for example, a variable could be the number of houses built each year for 15 years. Each database has a maximum number of variables that can be placed in it, and a maximum number of data values that can be placed in any variable. Initially, a database is empty, but contains space for the maximum number of variables and data values.
Variables in a database are assigned to specific database locations in a sequential fashion, starting with variable 1 and proceeding to the highest numbered variable. For example, the first variable is assigned to database location 1, the second to location 2, and so on. Variables are given predefined names according to their ordered location in the database. Each predefined name is prefixed with a # and is followed by a number. For example, #7 is the name of the seventh variable. The user may also assign alternate names to the predefined variable names.

Data is usually brought into a database with either the INPUT or EDIT commands. The INPUT command allows you to enter data contained on external files. EDIT allows you to enter data from the keyboard.

When bringing variables into a database with the INPUT command, variables are assigned to successive locations in the database. For example, assume that the first five database locations already hold variables. If two variables are input, then the new variables will reside in #6 and #7. New variables are generally placed after existing variables. One form of the INPUT command allows you to place variables at specific locations in a database. See the INPUT command for further details on the entry of data into a database.

Use of the LET command permits data to be moved from one variable to another. Variables may be removed from a database by erasing them, using the ERASE command.

Variable Names

A particularly useful feature of STATMATE is that it allows you to assign names to variables. Thus, it is possible to assign more meaningful names to variables. For example, SALES, AGE, URBANPOP, ACCIDENTS, QUARTER, INVTRY1982, etc. are valid names. These names consist of from 1 to 10 characters. The first character must be alphabetic but the remaining characters may be either numeric or alphabetic. Alphabetic characters must be in upper case letters.
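The naming rules above can be expressed as a small Python sketch. It is only an illustration of the stated rules (1 to 10 characters, first character alphabetic, remaining characters alphabetic or numeric, letters in upper case, plus the predefined #n form); the function name classify_name and the three labels it returns are hypothetical, and STATMATE may apply further checks that this guide does not describe.

```python
import re

# User-assigned name: one upper case letter, then up to nine letters or digits.
_USER_NAME = re.compile(r"[A-Z][A-Z0-9]{0,9}")
# Predefined name: '#' followed by the variable's location number.
_PREDEFINED_NAME = re.compile(r"#[0-9]+")


def classify_name(name):
    """Return 'predefined', 'user' or 'invalid' for a variable name."""
    if _PREDEFINED_NAME.fullmatch(name):
        return "predefined"
    if _USER_NAME.fullmatch(name):
        return "user"
    return "invalid"
```

For instance, INVTRY1982 is accepted at exactly 10 characters, while 1982SALES fails because its first character is not alphabetic.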
All variables have alternate predefined names of the form #n where n is the location number of the variable. For example, #4 and #9 are predefined variable names. The GIVE command allows alternate names to be given to a variable. The variable #4 might, for example, be given the alternate name AGE. Either #4 or AGE could be used to reference the same variable.

A special variable, #0, is available, which provides the data values 1, 2, 3, .... This variable does not occupy any space in the database. It may not be given an alternate name.

Ways of Referencing Data--Variables and Cases

Data is generally regarded as arranged in rows and columns by STATMATE. Although this generalization is not entirely adequate, it is at least a starting point for understanding how STATMATE deals with sets of data. A simple example of such an arrangement is the U.S. population data discussed earlier and contained in data file USPOP.DAT (data in millions):

           <-----Columns----->
     Year   Urban Pop   Rural Pop
     1790     0.202       3.728
     1800     0.322       4.986      ^
     1810     0.525       6.714      |
     1820     0.693       8.945      |
     1830     1.127      11.739     Rows
     1840     1.845      15.224      |
     1850     3.544      19.648      |
     1860     6.217      25.227      V
     1870     9.902      28.656
     1880    14.130      36.026
     ...       ...         ...

The data is arranged so that each column represents some item (variable) that is to be examined in detail. For example, it might be of interest to determine the average value of the item. A row represents an element that the columnar items have in common, in this case the populations recorded for a given year.

In STATMATE the data in a column to be examined or studied is called a variable. The data in a row is referred to as a case or observation. In the above example we have:

              <------- Variables------>
              Year   Urban Pop   Rural Pop
       ^      1790
       |      1800
     cases
       |
       |
       v

Year, urban population and rural population are variables. The data 0.202 and 3.728 represent the case data for 1790.
Note that there is no reason why the dates themselves could not be considered as a variable and the individual years as belonging to a case.

ENTERING DATA INTO THE SYSTEM

External Data Entry--Files

Files are the basic way of entering data into STATMATE. (Another method for entering data, with the STATMATE/PLUS EDIT command, is available.) Data files have names within the Operating System, CP/M, MS DOS or PC DOS. Any of these Operating Systems allows a variety of characters for names; however, STATMATE only recognizes file names and file name extensions which are composed of alphabetic characters or numbers. File names and extensions must begin with an alphabetic character.

ABCDE, uspop.DAT, HIST1980.DT and MYDATA are examples of valid file references within STATMATE. STOCK/82.DAT (/ is a special character and is not allowed), SAL$DATA ($ is a special character) and YEARS.DATA (the file extension DATA is too long) are examples of invalid references. Your Operating System may allow these names, but STATMATE will not accept them if they are used in commands where a file name is needed. Your Operating System's renaming capabilities will solve any difficulties with file names. A file name may be prefixed with a disk drive identifier as in A:XYZDATA.DAT.

Data files are entered into the STATMATE database with the INPUT command. There are two ways of creating files for input. Only one of these will be described here, the use of ASCII files. ASCII data files may be created using a text editor program, such as WORDSTAR or EasyWriter. An ASCII file can be easily printed or listed.

Creating ASCII Data Files

Often the simplest way of producing an ASCII file is to use a text editor. All that is required for preparing a STATMATE input file with a text editor is a basic understanding of how to arrange the text representing the data. This is a simple task and it is addressed below.
The following shows the contents of a file prepared using a text editor.

    DATA MY ENERGY USE - 1981,ELEC CONSUMPTION--JAN TO DEC
    400,250,390,
    280,250,305,
    235,220,230,
    330,450,525

If the file were called ENERGY.DAT, it could be read with the STATMATE INPUT command by entering:

    INPUT ENERGY.DAT KEEP 3

See the INPUT command description for additional information.

The following rules apply to preparing a data file:

1. The word DATA (uppercase or lowercase) must appear as the first word of the first line. Blanks or any other characters may not appear before DATA. Any comments may appear after DATA on the line. MY ENERGY ... is a comment describing the data in the above example.
2. Data items follow on each subsequent line. Each item is separated from another by a comma or blank.
3. All lines must be followed by a carriage return and line feed, including the last line.
4. Alphanumeric information must be enclosed in single quotes (') or begin with an alphabetic character. If alphanumeric data contains an embedded blank, the data must be surrounded by quotes. The MOTOR.DAT file in Appendix B is an example of using alphanumeric data in a DATA file.
5. Data must be arranged on a case by case basis.

It should be noted that although many word processors and editors will produce files which STATMATE will read, there are some word processors and editors which place extra characters at the beginning of a file. Users of EasyWriter, for example, must use the TRANSFER utility to produce a proper ASCII file. EasyWriter users must also use the ENTER key to generate carriage returns after each line. If extra characters are placed before DATA, STATMATE will issue a message that the file is invalid.
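The file rules above, together with the OMIT and KEEP clauses used in the earlier example, can be sketched as a small Python reader. This is an illustration only and not STATMATE's INPUT implementation: the function name read_data_text and the (clause, count) pattern format are hypothetical, and the sketch assumes that items are read as one stream in case order and that the OMIT/KEEP clauses describe the fields of a single case, repeated for every case. Numeric conversion and the treatment of quoted items are likewise assumptions.

```python
import re

# One item: either a single-quoted string or a run without commas/blanks.
_ITEM = re.compile(r"'([^']*)'|([^,\s]+)")


def read_data_text(text, pattern):
    """Return one list of values per kept field.

    pattern is a list such as [("OMIT", 2), ("KEEP", 1)], describing the
    fields of one case from left to right (an assumed interpretation).
    """
    lines = text.splitlines()
    first = lines[0] if lines else ""
    # Rule 1: DATA must be the first word, with nothing before it.
    if first[:4].upper() != "DATA" or (len(first) > 4 and not first[4].isspace()):
        raise ValueError("file is invalid: DATA must start the first line")

    items = []
    for line in lines[1:]:
        for match in _ITEM.finditer(line):
            if match.group(1) is not None:
                items.append(match.group(1))       # quoted alphanumeric item
            else:
                try:
                    items.append(float(match.group(2)))
                except ValueError:
                    items.append(match.group(2))   # bare alphanumeric item

    keep_mask = []
    for clause, count in pattern:
        keep_mask.extend([clause.upper() == "KEEP"] * count)
    width = len(keep_mask)
    if width == 0 or len(items) % width != 0:
        raise ValueError("items do not divide evenly into cases")

    kept = [i for i, keep in enumerate(keep_mask) if keep]
    return [[items[start + i] for start in range(0, len(items), width)]
            for i in kept]


if __name__ == "__main__":
    energy = ("DATA MY ENERGY USE - 1981,ELEC CONSUMPTION--JAN TO DEC\n"
              "400,250,390,\n280,250,305,\n235,220,230,\n330,450,525\n")
    for variable in read_data_text(energy, [("KEEP", 3)]):
        print(variable)
```

Treating the items as a single stream means a trailing comma at the end of a line, as in the ENERGY.DAT listing, is harmless, since empty positions between separators are never produced as items.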
COMMAND DESCRIPTIONS

Each command is described in detail in this section. In order to determine which commands are executable in each of the three STATMATE packages, observe that at the top of each page a subtitle lists the packages. When a package name appears there, then the package includes the command described.

An important part of the command description is the syntax or format of the command. When describing the syntax of the command the following general rules are followed:

1. Uppercase characters should be entered as shown.
2. For brevity, only the first 3 characters of commands, modifier names, etc. need be entered. If additional characters are entered, they should always match the name in every position given from the first to last character entered. For example, STATISTICS is matched by STA or STATI but STATS does not match it.
3. Lowercase is used to describe the type of entry that you must provide. For example, in a description var1 might represent a variable name.
4. Punctuation is entered as shown.
5. An ellipsis (...) means repeat the previous item as needed. For example, num,..., where num represents a number, indicates either that a single number or a list of numbers with separating commas is acceptable.

In addition to a command's syntax, each command is further clarified with descriptive material and detailed examples concerning its use. Input to STATMATE that is entered by the user is shown in boldface in the detailed examples. Remember to enter a carriage return after entering a command line.

Command: CROSSTAB
Purpose: Used to cross tabulate data on two variables.
Syntax: A.
CROSSTAB classavar ON classbvar
where classavar = classification variable A
      classbvar = classification variable B
Defaults: none
Syntax Examples:
CROSSTAB AGE ON WEIGHT
CROSSTAB SMOKING ON SEX
CROSSTAB COMPANY ON SALES
CROSS TAXES ON COUNTY

Description: CROSSTAB performs a two-way cross tabulation on two variables. Each variable contains data which divides the data into classes. For example, the data below was collected on the season and observed color of a botanical specimen. SEASON might be thought of as defining classes for each of the four seasons and COLOR as defining five classes: BLUE, GREN (green), RED, BLCK (black) and YELL (yellow).

SEASON  COLOR
------  -----
FALL    BLUE
FALL    GREN
WINT    GREN
SUMR    RED
SPNG    BLCK
FALL    GREN
SUMR    BLUE
SPNG    YELL

In a two-way cross tabulation, the data might appear as:

                    COLOR
              BLUE BLCK GREN RED  YELL
              ---- ---- ---- ---- ----
        SPNG          1              1
SEASON  SUMR     1             1
        FALL     1         2
        WINT               1

Classes may be defined for either numeric or alphanumeric data. CROSSTAB tabulates data for the class A variable by rows and the class B variable by columns. As many as 20 different classes may be contained in a variable. If a variable contains more than 20 classes, the STATMATE/PLUS RECODE command may be used to reduce the total number of classes.

CROSSTAB produces a table of statistics. Each table entry, or cell, contains a frequency, row percentage, column percentage, and total percentage for the corresponding classes. Percentages and totals for rows and columns are reported. Fisher's exact probabilities for the special case of 2x2 tables are calculated when there are 32 or fewer entries in the table. These probabilities are used to produce Tocher's correction. If a 2x2 table is produced with more than 32 entries, Yates' correction to the Chi-square is output. The Chi-square statistic is produced for other tables.
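The chi-square figure for a table larger than 2x2 can be checked by hand. The sketch below is an illustration only, not STATMATE's own code: it computes the ordinary Pearson chi-square, its degrees of freedom, and a Phi value taken as the square root of chi-square divided by the number of cases. That Phi formula is an assumption, although it reproduces the value printed in the FACTORA and FACTORB detailed example of this section. The counts are copied from that example, and the function name is hypothetical.

```python
import math


def chi_square_table(counts):
    """Return (chi_square, degrees_of_freedom, phi) for a table of cell counts.

    No 2x2 correction is applied, and every row and column total is assumed
    to be greater than zero.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    phi = math.sqrt(chi_sq / n)  # assumed definition of Phi
    return chi_sq, df, phi


# Rows are FACTORA classes 2, 3, 4; columns are FACTORB classes -2, 4, 6, 8.
counts = [
    [1, 2, 1, 3],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
chi_sq, df, phi = chi_square_table(counts)
print(f"CHI SQ: {chi_sq:.4f}  DF: {df}  PHI = {phi:.3f}")
# prints: CHI SQ: 3.7381  DF: 6  PHI = 0.558
```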
A Phi value, a measure independent of the number of cross tabulation entries, is produced for all tables.

If the number of columns and rows in the output table exceeds a reasonable page width and length, the table is output in conveniently divided sections. Basically, the entire table is output in sections from left to right and from top to bottom. Classes are sorted in ascending order before they are output, so classes placed in the output table are easy to find.

Detailed Example: The following example uses CROSSTAB to perform a two-way cross tabulation on the class data contained in FACTORA and FACTORB. Note that some of the class data in FACTORB is defined by a -2 value, and note that any values may be used to identify a class. Class identifiers, numeric values in this instance, are printed before each row and above each column. FACTORB has four classes: -2, 4, 6 and 8. FACTORA has three classes: 2, 3 and 4.

Command: PRINT FACTORA,FACTORB

    FACTORA     FACTORB
----------- -----------
       3.00        4.00
       2.00       -2.00
       3.00        6.00
       2.00        8.00
       2.00        6.00
       4.00        8.00
       2.00        4.00
       2.00        8.00
       2.00        4.00
       2.00        8.00
       4.00        6.00
       3.00       -2.00

Command: CROSSTAB FACTORA ON FACTORB

CLASS A VARIABLE: FACTORA
CLASS B VARIABLE: FACTORB
12 CASES  0 MISSING  0 NOT TABULATED

                       FACTORB
COUNT    |
ROW PCT  |        |        |        |        |
COL PCT  |        |        |        |        |    ROW
TOT PCT  |   -2.00|    4.00|    6.00|    8.00|  TOTAL
FACTORA  |        |        |        |        |
         |--------|--------|--------|--------|
    2.00 |      1 |      2 |      1 |      3 |      7
         |  14.3% |  28.6% |  14.3% |  42.9% |  58.3%
         |  50.0% |  66.7% |  33.3% |  75.0% |
         |   8.3% |  16.7% |   8.3% |  25.0% |
         |--------|--------|--------|--------|
    3.00 |      1 |      1 |      1 |      0 |      3
         |  33.3% |  33.3% |  33.3% |   0.0% |  25.0%
         |  50.0% |  33.3% |  33.3% |   0.0% |
         |   8.3% |   8.3% |   8.3% |   0.0% |
         |--------|--------|--------|--------|
    4.00 |      0 |      0 |      1 |      1 |      2
         |   0.0% |   0.0% |  50.0% |  50.0% |  16.7%
         |   0.0% |   0.0% |  33.3% |  25.0% |
         |   0.0% |   0.0% |   8.3% |   8.3% |
         |--------|--------|--------|--------|
COLUMN          2        3        3        4      12
TOTAL       16.7%    25.0%    25.0%    33.3%  100.0%

CHI SQ: 3.7381   DF: 6   PHI = 0.558

Command: ERASE
Purpose: Erases data variables from the EZPACK database.
Syntax:
A. ERASE vname THRU END
B. ERASE vname
where vname = name of first variable where data is to be erased
Syntax Examples:
ERASE #6 THRU END
ERASE WAGES THRU END
ERASE SALES

Description: In order to erase data from the database, the ERASE command must be used. For the variables specified, the command resets the number of cases to zero, removes any assigned name, and resets the missing value to 1.0E30. Variables where data is erased become numeric variables. Data may be erased variable by variable, or from a given variable through the end of the database. Erasing data does not affect the amount of data stored in the database (since erased data is simply ignored), but does affect the use of the INPUT command (see the INPUT command for details).

Detailed Example: Below, the ERASE command is used to erase data from the database. Erasing begins with the variable TOTALPOP, #4, and proceeds to the last variable, #10.

Command: ERASE TOTALPOP THRU END
7 VARIABLES ERASED

Command: EXECUTE
Purpose: Allows execution of commands from an EXECUTE file.
Syntax: EXECUTE fname
where fname is the name of a file
Syntax Examples:
EXECUTE MYCOMFIL
EXECUTE A:CMDFILE.EXE

Description: The EXECUTE command allows commands placed on an EXECUTE file to be executed by STATMATE. This feature simplifies the entry of often used sequences of commands. A discussion of this feature may be found in the introductory information on the use of EXECUTE files. An EXECUTE command may not appear in an EXECUTE file. When an EXECUTE command is entered, the subsequent Command: prompts that usually are issued by STATMATE are replaced by !Command: until the commands on the file are completely processed.
The Command: prompt is then issued to indicate that STATMATE wants you to enter a command when all commands on the file have been processed. Detailed Example: In the following example, EXECUTE refers to the EXECUTE file REGSTUDY.EXC, found on the C disk (directory). Command: EXECUTE C:REGSTUDY.EXC 26 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 EXIT Command: EXIT Purpose: Returns the user to the Operating System Syntax: EXIT Syntax Examples: EXIT Description: The EXIT command returns the user to the Operating System. Detailed Example: EXIT is entered in response to an STATMATE command request to return to the Operating System. Command: EXIT END STATMATE 27 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 GIVE Command: GIVE Purpose: Assigns user defined attributes (names, missing values, and data types) to variables. Syntax: A. GIVE NAME aname,vname1,... where aname = user or predefined variable name at which assignment begins vname1 = user name for aname vname2 = user name for next variable after aname B. GIVE MISSING aname,missv1,... where aname = user or predefined variable name at which assignment begins missv1 = missing value to be assigned to variable aname missv2 = missing value to be assigned to next variable after aname C. GIVE TYPE aname,atype1,... where aname = user or predefined variable name at which assignment begins atype1 = A (alphanumeric) or N (numeric) to be assigned to variable aname atype2 = type to be assigned to next variable after aname Syntax Examples: GIVE NAME #1,SALES,STOCK,TIME GIVE NAME #4,AGE,SCORE GIVE NAME AGE,NEWAGE GIVE MISSING #1,200.0,200.0,-1000.0,'NA' GIVE MISSING QUARTER,'Q?' GIVE MISSING AGE,0.0,0.0 GIVE TYPE,#15,A,N,A,N,N Description: The GIVE command allows a specific name, missing value or data type to be assigned to a variable. Initially, each variable has a default name, missing value and data type until you change these attributes with GIVE. 
Recall that each database contains a fixed number of variables, even though you have not used INPUT to bring any data into the database. Hence, GIVE can be used to modify variable attributes before data is placed in a variable. GIVE is often used for just this purpose immediately before using INPUT.

A variable name (user assigned name) is used as an alternate name to the predefined name given to every variable. A numeric missing value of 1.0E+30 is assigned to every numeric variable, if it is not changed by GIVE MISSING. A missing value of blanks is assigned to every alphanumeric variable, if it is not changed by GIVE MISSING. Alphanumeric missing values must be enclosed in single quotes ('). The MISSING attribute may not be changed once data is placed in the variable; the ERASE command must be used first before assigning a new MISSING attribute to a variable. The data type may be numeric (N) or alphanumeric (A). In order to place alphanumeric data in the database with INPUT, you must make the appropriate variable an alphanumeric type.

Detailed Example: GIVE NAME is used to assign the names YEAR and AGE to #3 and #4, respectively, then GIVE MISSING is used to assign missing values of 0 and -1 to YEAR and AGE.

Command: GIVE NAME #3,YEAR,AGE
2 ATTRIBUTES MODIFIED
Command: GIVE MISSING YEAR,0,-1
2 ATTRIBUTES MODIFIED

Command: HELP
Purpose: Displays descriptive information regarding the use of all commands.
Syntax: HELP cname
Syntax Examples:
HELP
HELP CURVE
HELP REGRESSION

Description: The HELP command displays descriptive information regarding the use of commands currently implemented. This command provides a reminder of available commands and formats when working interactively at the terminal. If HELP is followed by the name of a command, specific help for the command is displayed.
If HELP is given without a command name, a list of commands is given along with some general information about the package. Help information is contained on a file with the name IFHELP.TXT, which can be modified by the user with a text editor. Detailed Example: The following illustrates the output of the HELP command. Only a part of the output is shown. Command: HELP ---COMMANDS--- COMPUTE CORRELATE ELSE END ERASE EXECUTE GIVE HELP INPUT LET PLOT PRINT QUERY REMARK SET STATISTICS WHEN WRITE ... (more follows but is not shown) 31 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 INPUT Let's look at the record. Al Smith (1938) Command: INPUT Purpose: Allows entry of data into the STATMATE database from external files. Syntax: A. INPUT fname descriptor1, descriptor2, ... B. INPUT fname descriptor1, descriptor2, ... AT vname where fname = a file name descriptor = one of the following field extraction descriptors: KEEP n OMIT n vname = variable name Syntax Examples: INPUT MYDATA KEEP 1 INPUT STOCKFILE KEEP 6, OMIT 4 AT MYSTOCK INPUT POPFILE.DAT KEEP 2, OMIT 1,KEEP 1 INPUT B:XYZ KEEP 5 AT #20 INPUT DATAFILE KEEP 2, OMIT, KEEP Description: INPUT provides one way of entering data into the STATMATE database. Data is read in (kept) or not read in (omitted) from the specified file according to field descriptors given in the command. The descriptors permit any field of a file record to be selectively extracted and inserted into the database as a variable. Data read from the file is placed either beginning at the variable specified after the AT keyword in the command, or at the start of the rightmost block of erased or empty variables at the end of the database. STATMATE assumes that a file is record oriented. A record consists of one or more successive fields . Each record must contain exactly the same number of fields. A file contains one or more records. 
See the introductory section for specific details of the two file types (DATA and PROG) that may be read by the INPUT command. When INPUT is used to read such files, it must be told field by field, for every field of a record, whether the field is to be transferred to a variable in the database. The OMIT and KEEP descriptors are used for this purpose. If several successive fields are to be kept or omitted, a number may follow OMIT and KEEP to indicate the number of such fields. If a number does not follow a descriptor, a value of one is assumed for the number of fields to which the descriptor applies. The total number of fields kept and omitted must be exactly the number of fields in a record.

In terms of fields and records, a case corresponds roughly to a record entry and a variable to all the entries in a specific field. Fields extracted from a data file are placed in successive variables in the database. The location of the first variable in which data is to be placed may be specified by using the form containing the AT keyword. If this form is not used, INPUT will place the first variable extracted from the file at the start of the rightmost block of erased or empty variables at the end of the database. For example, if the last variable in the database is #25, and all variables from #12 through #25 are erased, then #12 will be the first variable to receive data from the INPUT command when the AT form is not used.

Detailed Example: INPUT is used to extract data from the States and population density fields of the MOTOR.DAT file given in Appendix B. The file contains 8 fields, and the first and fourth fields contain the data of interest. All variables from #4 through the end of the database are assumed erased or empty. Since the States field contains alphanumeric data, #4 is given the alphanumeric attribute with the GIVE TYPE command prior to using INPUT.
When INPUT is executed, the two data fields are placed in variables #4 and #5. Each field contains 50 values or cases. WHEN is used to select the first five cases, and PRINT is used to list the first five cases of the two variables read into the database. Command: GIVE TYPE #4,A 1 ATTRIBUTE MODIFIED Command: INPUT MOTOR.DAT KEEP 1,OMIT 2, KEEP 1,OMIT 4 2 VARIABLES INPUT AT #4 50 CASES 33 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 INPUT Command: WHEN #0=1 THRU 5 5 OF 250 CASES Command: PRINT #4,#5 #4 #5 ----------- ----------- AL 64.00 AK 0.40 AZ 12.00 AR 34.00 CA 100.00 34 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 LET Command: LET Purpose: Allows simple arithmetic operations and other manipulations on data. Syntax: A. LET vname = simpexp B. LET vname = fcn(arg) C. LET aname1 = aname2 where vname = numeric variable name simpexp = simple arithmetic expression fcn = STATMATE function name arg = list of arguments for the function aname1 = alphanumeric variable name aname2 = alphanumeric variable name Syntax Examples: LET NEWAGE = AGE-21 LET SALES = COST*AMTSOLD LET WAGES = SALARY-TAXES LET LAGSALES = LAG(SALES,1) LET #4 = #0 LET LENGTH = SQRT(AREA) LET NEWVAR = #2/3.0 LET RESPONSE = LOG10(EXPOSURE) Description: LET is a very useful command for performing arithmetic manipulations on variables, and moving data from one variable to another. In addition to the simple arithmetic operations, several simple functions are available to deal with lagging and other useful operations. Numeric and alphanumeric data may be moved from one variable to another by using a simple assignment of one variable to another. 
The entry of LET NEWYEAR = YEAR + 1900 produces the following:

NEWYEAR  YEAR
-------  ----
   1979    79
   1980    80
   1981    81
   1982    82

Other simple arithmetic operations that may be performed are:

A+B     A-B     A*B     A/B
A+cnst  A-cnst  A*cnst  A/cnst  A**cnst
-A      +A      A
-cnst   +cnst   cnst

where A and B represent any variable names, and cnst refers to any numeric constant. The symbols +, -, * and / represent their normal meanings, * indicating multiplication. The symbol ** indicates the raising of a number to a power. Note that A**-1 is the same as the reciprocal of A (1/A).

LET also accommodates a number of arithmetic functions. The following functions are available:

Function  Description
--------  -------------------------------------------
LAG       Lag or shift time period data a specified number of periods
LOG       Logarithm (base e)
LOG10     Logarithm (base 10)
SQRT      Square root of data
SAM       Select a random sample
MOV       Moving average
EXP       Raise e to a power
NUM       Produce a sequence of numbers
PRD       Produce a repeated sequence of numbers
STP       Produce a sequence of numbers at a given increment
NOR       Normally distributed random numbers
UNI       Uniformly distributed random numbers
DNOR      Normal distribution points
DEXP      Exponential distribution points
DWEI      Weibull distribution points
DCAU      Cauchy distribution points
ABS       Absolute value
CUM       Cumulative sum
INT       Integer or whole number

Functions have the form:

LET vname = fname(a1,a2,a3,...)

where vname is a variable
      a1, a2 and a3 are arguments

For example,

LET GROWTH = LOG(DOSAGE)
LET RESPONSE = NOR(HEIGHT,0,2.0)

The first argument of any function is always the variable name for which the function is to be applied. The first example causes the logarithm, base e, to be taken on all of the data in the variable DOSAGE, and placed into the variable GROWTH. In the second example, note that the NOR function has three arguments.
The second and third arguments represent the mean and standard deviation of the normally distributed random numbers to be generated. Although it is always necessary to provide a variable as the first argument of a function, the values of the variable may not be used by the function. For many functions, the variable is used to determine how many values are to be generated by the function. For example, in the use of NOR cited above, exactly as many normally distributed values are produced as there are cases in DOSAGE. However, if a case in DOSAGE is missing, a missing value will be assigned to corresponding case in the resulting variable. In fact, this is true in general. That is, a missing value found in a variable used on the right side of a LET expression produces a missing value in the assignment variable. Descriptions of the individual functions and their argument lists are given below. Function Description ----------- ---------------------------------------------- LAG(v,p) Lag v by p periods. See the NOTE below for more on LAG. Example: LAG(YEAR,3) -- lag 3 periods SAM(v,n) Select exactly n items randomly from v. The values of the cases selected are placed in the assignment variable. Items not selected are marked with a missing value. Example: SAM(STATES,12) -- randomly select 12 items from STATES MOV(v,n) Compute an n-term moving average. If v contains the values a,b,c and d in the first four cases, a two-term moving average produces four cases in the resulting variable: mv, (a+b)/2, (b+c)/2, (c+d)/2, where mv is a missing value. Example: MOV(SALES,4) 37 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 LET Function Description ----------- ---------------------------------------------- SQRT(v) Take the square root of v. Example: SQRT(X) LOG(v) Compute the log of v using base e. Example: LOG(POPULATON) LOG10(v) Compute the log of v using base 10. Example: LOG10(STEMSIZE) EXP(v) Raise e to the power of the data in v. 
Example: EXP(WGT)

PRD(v,b,i,p)  Produce a repeated sequence of p numbers beginning with b and increasing by i at each step, then repeat the sequence beginning with b again. Example: PRD(SALES,1,1,12) -- 1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,...

STP(v,b,j,p)  Produce the number b exactly p times, then increment b by j and produce b+j exactly p times. Continue adding j to the last sequence of p numbers produced. Example: STP(RESP,4,1,2) -- 4,4,5,5,6,6,... (4=start, 1=jump, 2=repeat)

NUM(v,b,s)  Produce the sequence of numbers b, b+s, b+2*s, b+3*s, ... Example: NUM(MONTH,1,2) -- 1,3,5,7,... (1=start, 2=step)

NOR(v,m,s)  Generate random numbers from a normal distribution with a mean of m and standard deviation of s. Example: NOR(DOSE,5.0,1.4) -- Normal random nos. (mean=5, sd=1.4)

UNI(v,a,b)  Generate random numbers from a uniform distribution on the interval from a to b. Example: UNI(COST,20,44.5) -- Uniform random nos. (left=20, right=44.5)

DNOR(v,m,s)  Compute the probability F(x), given that F is a normal distribution with mean m and standard deviation s. See the KOLMOGOROV command. Example: DNOR(LOSS,-40.2,4)

DEXP(v,s)  Compute the probability F(x), given that F is an exponential distribution with mean s. See the KOLMOGOROV command. Example: DEXP(DOSAGE,3.4)

DWEI(v,u,s)  Compute the probability F(x), given that F is a Weibull distribution with location parameter u and scale parameter s. See the KOLMOGOROV command. Example: DWEI(MACHINE,2.2,4.1)

DCAU(v,u,s)  Compute the probability F(x), given that F is a Cauchy distribution with median given by the parameter u and the 1st quartile given by u-s. See the KOLMOGOROV command. Example: DCAU(CONT,12.4,8.2)

ABS(v)  Take the absolute value of v. Example: ABS(X)

CUM(v)  Produce the cumulative sum of v.
For example, if 10, 20 and 15 are the first three cases of v, the cumulative sum is 10, 30, 45. Example: CUM(COST)

INT(v)  Take the integer portion of v. For example, if the first two cases of v are 22.4 and -15.8, then the result is 22 and -15. Example: INT(WGT)

-------------------------------------------------------------
NOTE: The LAG function provides a very useful way of examining relationships between one time period and another. For example, assume that annual inventory and sales data are available and that a comparison of a year's inventory to a previous year's sales is to be made. A lag period of one year is needed. A lag has the effect of advancing data from earlier periods to recent periods by the specified number of lag periods. Consider the following data, where INVLAG is a variable created by using LET INVLAG = LAG(INV,1)

Case (Year)  SALES  INV  INVLAG
-----------  -----  ---  ------
1975            40  115       ?
1976            55  135     115
1977            65  185     130
1978            90  140     185
1979            70  200     140
...            ...  ...     200

A comparison of SALES and INVLAG shows a simple relationship for any year; SALES is about one-half the inventory figure for the previous year. Ah, to be so lucky! Since data for 1974 did not exist, a missing value (shown as a ?) is indicated. The concept of lagged variables is a particularly useful one in curve fitting, forecasting and regression.
-------------------------------------------------------------

A LET operation sometimes involves missing values. For those variables on the right side of the equal sign whose cases contain a missing value, a missing value is created for the corresponding cases in the left variable.

When a LET operation is performed, new cases or values are formed for the variable appearing on the left of the equal sign. The length of this variable, or the number of cases it contains, can be affected by the operation. Generally the effect is of little concern.
In some instances, some insight into how the length is determined may be useful. In most instances, the length is found by determining the longest variable involved in the operation. If a variable does not appear on the right, as in the case of simply assigning a constant, the length is the same as the length of left variable. If, in assigning a constant, the left variable has a zero length or was erased or never used, its length becomes the maximum length that can be used in the database. If the variable to the left of the equal sign, the assignment variable, has fewer cases than other variables, 40 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 LET its length is not increased. An already existing variable's length can be increased only by erasing the variable and then assigning data to it with the LET. Detailed Example: LET is used to adjust the YEAR variable from the USPOP.DAT file given in Appendix B. Instead of beginning with 1790, the data is transformed to begin with the year 0 and continues in intervals of 10. The new variable is ADJYEAR. The name ADJYEAR was assigned earlier with the GIVE NAME command to some variable. Command: LET ADJYEAR = YEAR-1790 ADJYEAR MODIFIED 17 CASES 0 MISSING 41 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 LET This page deliberately blank 42
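The shifting and sequence functions of LET can be mimicked outside STATMATE to check what a given call should produce. The following Python sketch is an illustration under stated assumptions, not STATMATE's implementation: None stands in for the missing value, results always have the same number of cases as v, the moving average for n greater than 2 is assumed to average the n most recent cases, and CUM is assumed to skip missing cases while keeping its running total. All function names are hypothetical.

```python
MISSING = None  # stand-in for STATMATE's missing value (assumption)


def lag(v, p):
    """LAG(v,p): move each value p cases later; the first p cases become missing."""
    return [MISSING] * min(p, len(v)) + v[:max(len(v) - p, 0)]


def mov(v, n):
    """MOV(v,n): n-term moving average; the first n-1 cases are missing."""
    out = []
    for i in range(len(v)):
        window = v[i - n + 1:i + 1] if i >= n - 1 else []
        if not window or MISSING in window:
            out.append(MISSING)  # assumed: a missing case spoils its window
        else:
            out.append(sum(window) / n)
    return out


def cum(v):
    """CUM(v): cumulative sum; missing cases stay missing (assumed behavior)."""
    total, out = 0, []
    for x in v:
        if x is MISSING:
            out.append(MISSING)
        else:
            total += x
            out.append(total)
    return out


def _generate(v, value_at):
    """Produce one value per case of v, keeping missing cases missing."""
    return [MISSING if x is MISSING else value_at(k) for k, x in enumerate(v)]


def num(v, b, s):
    """NUM(v,b,s): b, b+s, b+2*s, ..."""
    return _generate(v, lambda k: b + k * s)


def prd(v, b, i, p):
    """PRD(v,b,i,p): a sequence of p values stepping by i, repeated."""
    return _generate(v, lambda k: b + (k % p) * i)


def stp(v, b, j, p):
    """STP(v,b,j,p): each value repeated p times, then increased by j."""
    return _generate(v, lambda k: b + (k // p) * j)


def int_part(v):
    """INT(v): drop the fractional part (truncation toward zero)."""
    return [MISSING if x is MISSING else int(x) for x in v]


if __name__ == "__main__":
    print(lag([10, 20, 15], 1))       # [None, 10, 20]
    print(mov([2, 4, 6, 8], 2))       # [None, 3.0, 5.0, 7.0]
    print(cum([10, 20, 15]))          # [10, 30, 45]
    print(stp([0] * 6, 4, 1, 2))      # [4, 4, 5, 5, 6, 6]
    print(int_part([22.4, -15.8]))    # [22, -15]
```

The helper _generate reflects the rule that the first argument only determines how many values are produced, with a missing case in it yielding a missing case in the result.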
Command: PLOT
Purpose: Produces a plot at the terminal.
Syntax:
A. PLOT yvar1,yvar2,... ON xvar
B. PLOT yvar1,yvar2,... ON xvar,mod1,mod2,...
C. PLOT
D. PLOT ?
where yvar1 = name of variable to be plotted vertically
      xvar = name of variable to be plotted horizontally
      mod1 = one of the following:
             HRANGE=h1 THRU h2
             VRANGE=v1 THRU v2
             HRANGE=DATA
             VRANGE=DATA
             HPOSITIONS=number of spaces
             VPOSITIONS=number of lines
             TITLE='title'
             VLABEL='vert-label'
             HLABEL='horiz-label'
Defaults:
HRANGE = DATA (i.e., use minimum and maximum of xvar data)
VRANGE = DATA (i.e., use minimum and maximum of yvar data)
HPOS=50
VPOS=40
Syntax Examples:
PLOT SCORES ON AGE
PLOT SALES ON INVENTORY,HRANGE=0 THRU 100, TITLE='SALES HISTORY'

Description: PLOT produces a plot of one to five variables against another variable in the form of a scatter plot. A number of modifiers provide ways of titling, labeling, and sizing the plot. Multiple curves (variables) are plotted vertically. Vertical and horizontal axes, scale, and legend information are added to the plot. If the ranges are not specified with HRANGE or VRANGE, the range over which the data for either axis is to be displayed is derived from the data. The TITLE, VLABEL and HLABEL modifiers provide a way to add title and axis descriptions to the plot. Note that the descriptive information for these modifiers must be enclosed by single quotes.

Physical sizing of a plot is aided by the HPOSITION and VPOSITION modifiers. They permit the height and width of a plot to be specified in terms of the number of lines and characters per line. A maximum of 60 lines and 120 characters may be specified. These modifiers apply to the portion of the plot exclusive of the axes and descriptive information placed on it. Plot points are shown as +, *, X, @ and # for the first through the fifth curve plotted.
If points are coincident, only the point for the rightmost curve in the list is plotted. Before actually producing the plot, a pause occurs at the terminal to allow positioning and placement of paper in the printer. Once the paper is adjusted, depressing return will cause the plot to appear. Don't forget to enter ctrl-P if you want the output on the printer! See the SET command for an alternate way of placing output on the printer. If PLOT is used without specifying any variable names or modifiers, you will be prompted for the variable names. After entering the variable names, STATMATE will then prompt you for subcommands. The subcommands allow you to change modifier values. The following subcommands are available: Subcommands Purpose -------------------------------- ---------------------------- CONTINUE Execute PLOT or exit command SAVE Save modifier default values SHOW Display modifier values HPOSITION = x Set HPOS modifier VPOSITION = x Set VPOS modifier HRANGE = x.x THRU x.x or DATA Set HRAN modifier VRANGE = x.x THRU x.x or DATA Set VRAN modifier TITLE = 'xxxx' Set TITLE modifier HLABEL = 'xxxx' Set HLABEL modifier VLABEL = 'xxxx' Set VLABEL modifier where x.x is a decimal number, x is whole, xxxx is a string of characters Some examples of subcommands are: HRANGE=30 THRU 200.5 TITLE='POPULATION DENSITY FOR 1980' VPOSITION=30 SAVE VRANGE=DATA CONTINUE VLAB='RESPONSE VALUES' Entering CONTINUE in response to a subcommand prompt, causes STATMATE to produce the desired plot and return to command 44 STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987 PLOT prompting. If you used PLOT ? to enter the subcommand mode, as explained below, CONTINUE will simply return STATMATE to command prompting without executing the PLOT. SHOW displays the current setting of the modifiers. When a modifier is set to a particular value using a subcommand, its value is only temporary. 
That is, when the plot is produced, the value for the modifier will be the temporary value that you assigned it. However, the next time you use PLOT the modifier will revert to its original permanent default value. You may change the default values permanently by using the SAVE subcommand. The SHOW subcommand displays the current values of the modifiers. Setting the default values as permanent is a very useful feature of STATMATE. It is particularly useful when you have titles and labels that you continually use from PLOT to PLOT.

Entering PLOT ? allows you to examine and change default settings of the modifiers without producing a plot. In this case, you will not be prompted for variable names. The default settings will be displayed and the subcommand prompt will appear. Entering CONTINUE causes STATMATE to return to command prompting without producing a PLOT.

Detailed Example: The use of the PLOT command is illustrated by plotting RURALPOP and YEAR data from the USPOP.DAT file found in the appendix. Also plotted are the fitted, FITRURPOP, and forecast, FORRURPOP, values obtained by fitting a linear equation to RURALPOP with the STATMATE CURVE command and then computing these two variables with the COMPUTE command. A similar set of fitted and forecast data could be produced using the STATMATE STEPWISE regression command. RURALPOP, FITRURPOP and FORRURPOP are plotted against YEAR. YEAR has been extended by modifying the USPOP.DAT file with the years 1960 and 1970. The corresponding cases for RURALPOP were assigned missing values. After the plot is made, the PRINT command is used to display the values of these variables. The forecast values are printed as X on the plot. The actual and fitted points at 1810 coincide so only the FITRURPOP point (*) is printed. The command is long enough that the first portion is terminated with a comma to permit continuation of the remainder of the command after the Continue: prompt.
Command: PLOT RURALPOP,FITRURPOP,FORRURPOP ON YEAR,
Continue: VLABEL='POPULATION(MILLIONS)',
Continue: HLABEL='YEARS',
Continue: TITLE='EXAMPLE OF FORECASTING RURAL POPULATION'

19 CASES 19 MISSING
ADJUST PAPER, HIT RETURN

[Printer plot: EXAMPLE OF FORECASTING RURAL POPULATION. Vertical axis: POPULATION(MILLIONS), scaled from -1.14 to 70.50. Horizontal axis: YEARS, scaled from 1790.00 to 1970.00. LEGEND: + RURALPOP, * FITRURPOP, X FORRURPOP]


Command:  PRINT

Purpose:  Prints (lists) data for variables.

Syntax:   A. PRINT vname1,vname2,...

          where vname1 = name of variable to be printed

Defaults: none

Syntax Examples:

     PRINT DOW,MYSTOCK,#8,INVENTORY
     PRINT STOCK,PRICES
     PRINT #0,WEIGHT

Description: The PRINT command is useful for printing (listing) data from the database. Data is not actually sent to the printer; it is displayed on the CRT. When variables with a dissimilar number of cases are printed, an asterisk is printed in place of cases which do not exist.

If the output of PRINT is to be placed on a printer, remember to enter a ctrl-P before pressing the carriage return at the end of the command. See the SET command for an alternate way of placing output on the printer.

Detailed Example: An example of the PRINT command is shown below. It lists the first seven cases of the three variables YEAR, URBANPOP and RURALPOP. The WHEN command is first used to select a view of the first seven cases.
WHEN #0=1 THRU 7
7 OF 250 CASES

Command: PRINT YEAR,URBANPOP,RURALPOP

       YEAR    URBANPOP    RURALPOP
----------- ----------- -----------
    1790.00        0.20        3.73
    1800.00        0.32        4.99
    1810.00        0.52        6.71
    1820.00        0.69        8.95
    1830.00        1.13       11.74
    1840.00        1.84       15.22
    1850.00        3.54       19.65


Command:  QUERY

Purpose:  Provides summary information about database variables.

Syntax:   A. QUERY vname1,vname2,...
          B. QUERY vname1 THRU vname2

          where vnamei = name of variable

Syntax Examples:

     QUERY STOCKF,WAGES,AGE
     QUERY LOANS,#5,#3,REGION
     QUERY #1 THRU #36

Description: The QUERY command displays information about a variable which includes its name, predefined name, data type, number of cases, number of missing cases and missing value code.

Detailed Example: In this example, QUERY is used to display database information about the first four variables in the database.

Command: QUERY #1 THRU #4

NAME  USER NAME   TYPE     # CASES  # MISSING  MISSING VALUE
----  ----------  -------  -------  ---------  -------------
# 1   YEAR        NUMERIC       17          0    1.0000E+030
# 2   URBANPOP    NUMERIC       17          0    1.0000E+030
# 3   RURALPOP    NUMERIC       17          0    1.0000E+030
# 4   STATES      ALPHA         50          0


Command:  REGRESSION

Purpose:  Performs a multiple regression analysis.

Syntax:   A. REGRESSION yvar ON xvar1,xvar2,...
          B. REGRESSION yvar ON xvar1,xvar2,... ,mod1,mod2,...

          where yvar=dependent variable name
          xvari=independent variable name
          modi=one of the following modifiers

          TABLE=tabnam,...
          where tabnam is PARAM, FIT, ANOVA, SEPARAM
          TABLE=ALL
          INTERCEPT=opt
          where opt is YES or NO
          DURBIN=opt

Defaults: TABLE=PARAM,FIT INTER=YES DURBIN=NO

Syntax Examples:

     REGRESSION AUTOSALES ON TREND,GNP,CPIDELTA,SEASON
     REGRESSION SLPLOSS ON DOSAGE,AGE,STRESS,TABLE=ANOVA,PARAM
     REGRESS RESPONSE ON #0,WEIGHT,INTERCEPT=NO
     REGRESS DEMAND ON BINDEX,GASPRICE,DURBIN=YES

Description: The REGRESSION command computes the linear regression equation relating the dependent variable, yvar, to one or more independent variables, xvar1, xvar2, ..., xvarn. It also provides analysis of variance (ANOVA), standard error of the parameters (SEPARAM) and fit (FIT) statistics. The Durbin-Watson (DURBIN) statistic for testing serial correlation among residuals may be calculated. An equation, with or without an intercept, may be fit to the data. In combination with the COMPUTE command, REGRESSION allows residual computations to be derived from the fitted data.

The equation derived by the regression method is:

     Y = a0 + a1 * X1 + a2 * X2 + a3 * X3 + ...

Y is the variable that we wish to determine as a combination of the variables X1, X2, etc. The parameters a0, a1, etc. are determined by the program using the method of least squares. The above equation is sometimes referred to as a model. The form shown is the intercept model. If the a0 parameter is removed from the model, it is referred to as the no-intercept, or origin, model. The use of the word intercept refers to the fact that the equation represents a plane in space which intercepts the coordinate axes of the space. The no-intercept form passes directly through the origin of the coordinate system. REGRESSION computes either form depending on the setting of the INTERCEPT modifier. The above equation, with a0, is the one most often used in data analysis and forecasting.
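
The intercept and no-intercept models described above can be illustrated with a short Python sketch. This is not STATMATE's own code: it assumes the textbook normal-equations form of least squares, and the helper names (solve, least_squares) are hypothetical. The data are the Hald cement values listed in Appendix B, so the intercept fit should land close to the estimates printed in the PARAMETERS TABLE of the detailed example.

```python
# Hald cement data from Appendix B:
# ALUMINATE, SILICATE, FERRITE, DISILICATE, HEATPROD
HALD = [
    (7, 26, 6, 60, 78.5), (1, 29, 15, 52, 74.3), (11, 56, 8, 20, 104.3),
    (11, 31, 8, 47, 87.6), (7, 52, 6, 33, 95.9), (11, 55, 9, 22, 109.2),
    (3, 71, 17, 6, 102.7), (1, 31, 22, 44, 72.5), (2, 54, 18, 22, 93.1),
    (21, 47, 4, 26, 115.9), (1, 40, 23, 34, 83.8), (11, 66, 9, 12, 113.3),
    (10, 68, 8, 12, 109.4),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(xs, ys, intercept=True):
    """Return [a0, a1, ...] when intercept is True, else [a1, a2, ...]."""
    rows = [([1.0] if intercept else []) + [float(v) for v in x] for x in xs]
    k = len(rows[0])
    # Normal equations: (X'X) a = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


xs = [row[:4] for row in HALD]
ys = [row[4] for row in HALD]
with_a0 = least_squares(xs, ys, intercept=True)   # like INTERCEPT=YES
origin = least_squares(xs, ys, intercept=False)   # like INTERCEPT=NO
print("intercept model:", [round(v, 4) for v in with_a0])
print("origin model:   ", [round(v, 4) for v in origin])
```

The normal equations are used here only because they are short to write down; they are numerically fragile for nearly collinear data, and nothing in this guide says which method STATMATE actually uses.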
The TABLE modifier allows any or all of the four output tables, PARAM, FIT, ANOVA, and SEPARAM, to be selected. The ALL option selects all four tables. The INTERCEPT modifier is used to select whether an intercept form of the regression is to be used.

Detailed Example: In the following example, the REGRESSION command is used to find a regression equation in the intercept form for the HALD.DAT data file shown in the appendix. The application involves trying to relate the heat produced, HEATPROD, in cement production to the amount of certain materials present in the cement: ALUMINATE, SILICATE, FERRITE and DISILICATE. Examination of the PARAMETER table output shows the relation to be:

     HEATPROD = 62.41 - 0.144*DISILICATE + 0.102*FERRITE
                + 0.510*SILICATE + 1.551*ALUMINATE

The TABLE=ALL option produces four output tables. The command is long enough that the first portion is terminated with a comma to permit continuation of the remainder of the command after the CONTINUE: prompt.

Command: REGRESSION HEATPROD ON ALUMINATE,SILICATE,FERRITE,
Continue: DISILICATE,TABLE=ALL

13 CASES 0 MISSING

DEPENDENT VARIABLE
     HEATPROD

INDEPENDENT VARIABLES
     ALUMINATE SILICATE FERRITE DISILICATE

PARAMETERS TABLE
----------------
PARAMETER   ESTIMATE
----------  ----------------
ALUMINATE           1.551102
SILICATE            0.510167
FERRITE             0.101909
DISILICATE         -0.144061
INTERCEPT          62.405403

FIT TABLE
---------
R-SQUARE        0.9736
ADJ. R-SQUARE   0.9366
STD. ER. RES    2.4460

ANOVA TABLE
-----------
SOURCE      DF    SUM OF SQUARES  MEAN SQUARE     F-VALUE
----------  ----  --------------  --------------  ----------
REGRESSION     4      2667.89900       666.97460    111.4792
ERROR          8        47.86361         5.98295      0.0264
TOTAL         12      2715.76200       226.31350
P > |F| = 0.0000

STD. ER. OF PARAMETERS TABLE
----------------------------
PARAMETER    ESTIMATE      T-VALUE     STD. ER.     P > |T-VAL|
-----------  ------------  ----------  -----------  -----------
ALUMINATE        1.551102    2.082660      0.74477       0.0758
SILICATE         0.510167    0.704858      0.72379       0.5037
FERRITE          0.101909    0.135031      0.75471       0.8964
DISILICATE      -0.144061   -0.203175      0.70905       0.8448


Command:  REMARK

Purpose:  Allows information remarks to document an ezstep run.

Syntax:   REMARK rmrktxt

          where rmrktxt is any text representing a remark

Syntax Examples:

     REMARK THE FOLLOWING JOB IS FOR JOHN DOE
     REMARK IF I WERE YOU, WHO WOULD BE READING THIS SENTENCE?

Description: The REMARK command does not cause anything to happen other than for ezstep to issue another Command: prompt. Its purpose is to provide a way of documenting an ezstep terminal session. The text following the command may contain any characters.

Detailed Example: In this example REMARK is used to comment on the next use of the STATISTICS command.

Command: REMARK COMPUTE STATISTICS FOR JULY 15 DATA COLLECT
Command: STATISTICS NEWDATA


Command:  SET

Purpose:  Sets destination of ezstep output.

Syntax:   A. SET COPY=opt
          B. SET SEED=sno

          where opt = HARDCOPY, or OFF, or FILE fname
          where fname = file name
          sno = integer for random number seed

Defaults: COPY=OFF

Syntax Examples:

     SET COPY=HARDCOPY
     SET COPY=FILE MYRESULTS.DAT
     SET SEED=1832

Description: When COPY=HARDCOPY, all output results from executed commands are sent to the printer. The output does not appear on the terminal. When COPY=OFF is used, output results appear on the terminal. When COPY=FILE fname is used, the output of commands is placed in the designated file, fname. This file can be edited and displayed by other programs, including text editors such as WORDSTAR.

Unless changed by this command, output results are sent to the terminal. Error messages and command prompts are always directed to the terminal.
Whenever output is first directed to a results file, it replaces already existing data on the file. Data are not appended to the file.

With the SEED modifier, you may set the value of the seed used for random number generation. The seed number initializes the random number generators. Normally, the seed is the same number each time you use ezstep. Hence, it is possible to generate the same random numbers each time you use ezstep. If this is undesirable, you may want to choose different seeds with the SET command. Use any number from 1 to 30000.

Detailed Example: In this example, SET is used to direct results to the printer instead of the terminal.

Command: SET COPY=HARDCOPY
TURN ON THE PRINTER
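
The effect of the SEED modifier can be sketched in Python. The sketch rests on several assumptions: Appendix A only says that the uniform generator is based on a Wichmann article, so the code uses the widely published Wichmann-Hill (AS 183) recurrences as a stand-in, and the way a single seed is expanded into three internal states (seed, seed+1, seed+2) is purely hypothetical. The class name SeededGenerator is also an illustration. The normal() method follows the sum-of-twelve-uniforms approach that Appendix A describes.

```python
class SeededGenerator:
    """Wichmann-Hill style generator; the seeding scheme is a hypothetical stand-in."""

    def __init__(self, seed):
        if not 1 <= seed <= 30000:
            raise ValueError("seed must be from 1 to 30000")
        # Assumption: three nonzero states derived from one seed number.
        self.s1, self.s2, self.s3 = seed, seed + 1, seed + 2

    def uniform(self):
        """Return a pseudo-random number in [0, 1)."""
        self.s1 = (171 * self.s1) % 30269
        self.s2 = (172 * self.s2) % 30307
        self.s3 = (170 * self.s3) % 30323
        return (self.s1 / 30269 + self.s2 / 30307 + self.s3 / 30323) % 1.0

    def normal(self, mean=0.0, sd=1.0):
        """Approximate a normal draw by summing twelve uniform draws."""
        total = sum(self.uniform() for _ in range(12))
        # The sum has mean 6 and variance 1, so shift and scale it.
        return mean + sd * (total - 6.0)


first = SeededGenerator(1832)
second = SeededGenerator(1832)
other = SeededGenerator(1833)
run_a = [first.uniform() for _ in range(5)]
run_b = [second.uniform() for _ in range(5)]
run_c = [other.uniform() for _ in range(5)]
print("same seed, same numbers:", run_a == run_b)
print("different seed, same numbers:", run_a == run_c)
```

Two generators started from the same seed replay the same sequence, which is the behavior the description above attributes to leaving the seed unchanged between sessions.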
Command:  SHOW

Purpose:  Displays internal STATMATE settings.

Syntax:   A. SHOW

Defaults: none

Syntax Examples:

     SHOW

Description: SHOW displays internal information that is set by the user. For example, items displayed are the value of the random number seed used in LET, the current setting of the SET command, limits of the database being used and default limits of any new database created.

The default information displayed is taken from information supplied through the STATMATE install program, EZINST (see appendix C). This information includes default limits for any database created by STATMATE and the location of internal files. Since it is possible to have used the STATMATE install program with STATMATE, default information relevant to program groups used in STATMATE is displayed also.

Detailed Example: The following example illustrates the use of the SHOW command.

Command: SHOW

ABC DATABASE SIZE
     MAXIMUM VARIABLES: 50
     MAXIMUM CASES: 250
     HIGHEST USED VAR: 49

GENERAL
     RANDOM NO. SEED: 8632
     COPY ASSIGNMENT: TERMINAL

INSTALLED SETTINGS

DEFAULT DATABASE SIZE
     MAXIMUM VARIABLES: 10
     MAXIMUM CASES: 1000

DEFAULT GROUP ASSIGNMENTS

GROUP           DISK DRIVE
--------------  ----------
NUCLEUS         CURRENT
INTERNAL FILES  CURRENT
STATISTICS      CURRENT
REGRESSION      CURRENT
MISCELLANEOUS   CURRENT


Command:  STATISTICS

Purpose:  Produces elementary statistics.

Syntax:   A. STATISTICS vname
          B. STATISTICS vname

          where vname=name of variable

Defaults: none

Syntax Examples:

     STATISTICS SALES
     STAT WEIGHT

Description: A number of useful statistical quantities such as totals, averages, and variances may be computed for a single variable from the STATISTICS command.
The statistics may be broken into the following five categories:

Summation                        Spread
     Totals                           Standard Deviation
     Sum of Squares                   Variance
     Sum of Squares about Mean        Range
                                      Coeff. of Variation

Central Tendency                 Distribution
     Average or Mean                  Minimum and Maximum

Higher Moments
     Skewness
     Kurtosis (thickness of tail)

Detailed Example: The STATISTICS command is used to compute a complete set of statistics on 1964 motor death data from the MOTOR.DAT file in the appendix. The variable MTRDEATHS represents the data.

Command: STATISTICS MTRDEATHS

VARIABLE : MTRDEATHS
50 CASES 0 MISSING

CENTRAL TENDENCY      SPREAD                     DISTRIBUTION
-------------------   ------------------------   --------------------
MEAN       926.76     STD. DEV.        889.32    MINIMUM        43.00
                      VARIANCE      790887.37    MAXIMUM      4743.00
                      RANGE           4700.00
                      COEFF. VAR.        0.96
                      MIDSPREAD        789.00

SUMMATIONS                     HIGHER MOMENTS
-----------------------        --------------------
TOTAL            46338.00      SKEWNESS        2.07
SUM SQ        81697687.00      KURTOSIS        8.48
SUM SQ(DEV)   38753481.12


Command:  TTEST

Purpose:  Perform a comparison of two data sets using the T-Test.

Syntax:   TTEST vname ON classvar

          where vname=variable name
          classvar=classification variable name

Syntax Examples:

     TTEST DOSAGE ON SEX
     TTEST VOTE ON POLPARTY

Description: A comparison of the means of two data sets (samples) may be performed using TTEST. The means of the two sets are computed and compared using the Student T-test. Values are calculated using assumptions of both equal and unequal variances. In the case of unequal variances, the degrees of freedom are calculated using Satterthwaite's approximation. A table of means and standard deviations for each of the two sets is produced.

The analysis is performed on the data contained in vname. The variable classvar determines which of the two sets or classes the corresponding value in vname belongs to.
That is, classvar contains codes that indicate which category or class the data in vname belong to. Although not strictly necessary, class codes should be coded as integer (1, 2, ...) values.

Detailed Example: In the example, the PRINT command is used to first display the data to be used in TTEST. CHEMRESULT contains the weight of a substance after a chemical reaction as produced by 10 individual experiments. Two laboratories were involved; 4 experiments were performed at one lab and the remaining 6 at the other lab. The laboratory performing the experiment is coded in LABCODE. TTEST is used to determine if a significant difference exists in the procedures used by the laboratories in performing the experiment.

Command: PRINT CHEMRESULT,LABCODE

 CHEMRESULT     LABCODE
----------- -----------
      40.50        1.00
      42.30        1.00
      50.80        2.00
      46.00        1.00
      47.80        1.00
      42.20        1.00
      49.00        2.00
      36.90        1.00
      44.40        2.00
      39.00        2.00

Command: TTEST CHEMRESULT ON LABCODE

RESPONSE VARIABLE: CHEMRESULT
CLASS VARIABLE: LABCODE
10 CASES 0 MISSING

CLASS STATISTICS

LABCODE     CASES  MEAN         STD. DEV.     MINIMUM     MAXIMUM
----------  -----  -----------  ------------  ----------  ----------
1               6      42.6167        3.8923     36.9000     47.8000
2               4      45.8000        5.2738     39.0000     50.8000

TWO SAMPLE T-TEST RESULTS

VARIANCE  T-VALUE      DF      PROB > |T-VALUE|
--------  -----------  ------  ----------------
EQUAL         -1.1055       8            0.3011
UNEQUAL       -1.0340       6            0.3410


Command:  WHEN--ELSE--END

Purpose:  Permits data to be selected for analysis from the database according to logical and relational conditions

Syntax:   A. WHEN relexp
          B. WHEN relexp1 AND relexp2
          C. WHEN relexp1 OR relexp2

          where relexp is one of the following

          vname relation c1
          vname = c1, c2, ...
          vname = c1 THRU c2

          and vname = variable name
          relation = one of the relations: =, >, <, >=, or <=
          c1 and c2 = numeric or alphanumeric constants

Defaults: none

Syntax Examples:

     WHEN AGE=13 THRU 30
     WHEN COLOR='RED','BLUE','PINK'
     WHEN WIDTH>25.55
     WHEN STOCK<=4500.00
     WHEN #0=40 THRU 80
     WHEN AGE=21 THRU 29 AND WEIGHT=150 THRU 175

Description: The WHEN command is a very useful command for analyzing data according to specific selection criteria. For example, consider a variable containing ages of individuals and another containing their weights. It might be important to find the average weight of this group for all members between the ages of 50 and 65. The WHEN command can be used to select data according to this criterion, and then the STATISTICS command can be used to find the required average.

The ELSE command reverses, or negates, a condition established by WHEN. The END command removes any conditions established by WHEN or ELSE. Only one WHEN or ELSE may be in effect.

Simple forms of WHEN relational expressions include the following relational operators:

Operator  Meaning
--------  ---------------------
=         Equality
>=        Greater than or equal
<=        Less than or equal
>         Greater than
<         Less than

For example, YEAR>1950 means select all cases for which the year is greater than 1950.

Another useful conditional expression is illustrated by AGE=15,20,25,26,27. In this example, all cases which have an age corresponding to 15, 20, 25, 26 and 27 are selected.

A frequently useful conditional expression is illustrated by VOLUME=405.44 THRU 650.25. This form of a conditional expression permits a range of values to be selected. In this example, all cases which have a volume from 405.44 to 650.25 are selected for use.

There are instances when it is desirable to compound conditions with a logical "and" or "or".
STATMATE allows two simple expressions to be logically combined in such a manner by using an AND or OR to separate simple expressions. For example, AGE=15 THRU 40 AND WEIGHT>190.55 selects all cases for which the age is from 15 to 40 and whose weight is greater than 190.55.

The WHEN--ELSE commands operate on a case by case basis. When data is selected with a WHEN, STATMATE effectively is restricted to a window or view of your total data. The view extends across all variables on a case by case basis. That is, any variable in the database may be used, but only those cases selected by the WHEN condition are used.

To better understand how the data view established by a WHEN operates, consider the following variables after the WHEN command WHEN AGE=30 THRU 70 has been used. The cases designated by the small x are part of the data view established by the command.

     AGE  WEIGHT
     ---  ------
      22   140.5
  x   33   177.2
      15   105.4
      10    88.0
  x   54   188.3
  x   38   224.5

Although the selection was made on AGE, only the cases in WEIGHT corresponding to those selected in AGE are accessible when WEIGHT is used by an analytic command such as STATISTICS. That is, STATISTICS WEIGHT would compute the average of 177.2, 188.3 and 224.5. Incidentally, applying ELSE now would put the cases corresponding to ages 22, 15, and 10 in the view.

Some important aspects of the WHEN command should be understood when an attempt is made to write into the database with a COMPUTE, LET or RECODE command while WHEN is in effect. Perhaps the best way to explain the effect is to consider an example. Suppose we consider the four variables AGE, WEIGHT, DOSAGE and NEWWEIGHT shown below. Assume further that WHEN AGE=30 THRU 70 has been applied to the database, as described earlier. The x symbols show the selected cases.
     AGE  WEIGHT  DOSAGE  NEWWEIGHT
     ---  ------  ------  ---------
      22   140.5    15.2  empty
  x   33   177.2    11.4
      15   105.4    17.8
      10    88.0    10.4
  x   54   188.3    11.3
  x   38   224.5    21.5

Note that WEIGHT and DOSAGE contain data, but NEWWEIGHT does not. Assume the following two LET commands are executed: LET NEWWEIGHT=WEIGHT+100 and LET DOSAGE=DOSAGE+2.0. The result is:

     AGE  WEIGHT  DOSAGE  NEWWEIGHT
     ---  ------  ------  ---------
      22   140.5    15.2  missing
  x   33   177.2    13.4  277.2
      15   105.4    17.8  missing
      10    88.0    10.4  missing
  x   54   188.3    13.3  288.3
  x   38   224.5    23.5  324.5

Note that new values have been calculated for cases in the view, but not for cases outside of the view. Further, note that because NEWWEIGHT did not have data, missing values are placed at cases not within the view.

While the WHEN is in effect, some care must be exercised in writing data into variables that have been used to select the view. For example, WHEN AGE=10 THRU 25 followed by LET AGE=AGE+5 changes AGE but does not affect the view. That is, the WHEN selects cases whose age is from 10 to 25, but the subsequent LET recalculates the ages so that the selected cases have ages between 15 and 30. In this situation, STATMATE does not reselect cases to conform to the previous WHEN.

Detailed Example: In the example shown below, WHEN is used to select all cases of YEAR which are greater than 1890. The STATISTICS command is then executed on URBANPOP to find statistics of urban population. The statistics produced are those for the urban population from 1900 to 1950. ELSE is then used to reverse the condition, and statistics are computed on the urban population before 1900. PRINT is used to display YEAR and URBANPOP before and after the use of the WHEN and ELSE. END is used to restore the data view to the full view.
Command: PRINT YEAR,URBANPOP

       YEAR    URBANPOP
----------- -----------
    1790.00        0.20
    1800.00        0.32
    1810.00        0.52
    1820.00        0.69
    1830.00        1.13
    1840.00        1.84
    1850.00        3.54
    1860.00        6.22
    1870.00        9.90
    1880.00       14.13
    1890.00       22.11
    1900.00       30.16
    1910.00       42.00
    1920.00       54.16
    1930.00       68.95
    1940.00       74.92
    1950.00       88.93

Command: WHEN YEAR>1890
6 OF 17 CASES

Command: PRINT YEAR,URBANPOP

       YEAR    URBANPOP
----------- -----------
    1900.00       30.16
    1910.00       42.00
    1920.00       54.16
    1930.00       68.95
    1940.00       74.92
    1950.00       88.93

Command: STATISTICS URBANPOP

VARIABLE: URBANPOP
6 CASES 0 MISSING

CENTRAL TENDENCY   SPREAD                     DISTRIBUTION
----------------   ------------------------   --------------------
MEAN     59.85     STD. DEV.         21.85    MINIMUM        30.16
                   VARIANCE         477.63    MAXIMUM        88.93
                   RANGE             58.77
                   COEFF. VAR.        0.37

SUMMATIONS                 HIGHER MOMENTS
-----------------------    --------------------
TOTAL            359.12    SKEWNESS       -0.07
SUM SQ         23883.03    KURTOSIS        1.74
SUM SQ(DEV)     2388.15

Command: ELSE
244 OF 250 CASES

Command: PRINT YEAR,URBANPOP

       YEAR    URBANPOP
----------- -----------
    1790.00        0.20
    1800.00        0.32
    1810.00        0.52
    1820.00        0.69
    1830.00        1.13
    1840.00        1.84
    1850.00        3.54
    1860.00        6.22
    1870.00        9.90
    1880.00       14.13
    1890.00       22.11

Command: STATISTICS URBANPOP

VARIABLE: URBANPOP
11 CASES 0 MISSING

CENTRAL TENDENCY   SPREAD                     DISTRIBUTION
----------------   ------------------------   --------------------
MEAN      5.51     STD. DEV.          7.14    MINIMUM         0.20
                   VARIANCE          50.92    MAXIMUM        22.11
                   RANGE             21.90
                   COEFF. VAR.        1.29

SUMMATIONS                 HIGHER MOMENTS
-----------------------    --------------------
TOTAL             60.61    SKEWNESS        1.34
SUM SQ           843.17    KURTOSIS        3.61
SUM SQ(DEV)      509.17

Command: END
FULL DATA VIEW RESTORED


Command:  WRITE

Purpose:  Places data from the database onto an external file

Syntax:   A. WRITE vname1,... ON filenm

          where vname1 = variable name
          filenm = a file name

Defaults: none

Syntax Examples:

     WRITE YEAR,URBANPOP ON POPFILE
     WRITE STOCK ON STOCK.DAT
     WRITE #0,DOSAGE,ANIMGROUP ON STUDY.DTA

Description: The WRITE command places data from the database onto a designated file. The list of variables specified in the command are written case by case onto the file in ASCII form. Files in the ASCII format may be printed on your system, or modified by word processors, such as WORDSTAR. With minor modifications, the file may be used as input to other application programs, such as dBASE II (III), which accept ASCII data.

When the output file is written, the very first line of data contains information which enables the file to be read by STATMATE, if it is desirable to re-enter the data into STATMATE again. The format of the first line is the same as discussed in the section ENTERING DATA INTO THE SYSTEM (See the discussion of ASCII files). Also, the names of the variables placed on the file are listed on the first line. Each subsequent line of output contains one case for each variable specified in the command.

Detailed Example: In the example below, WRITE is used to place the data from variables YEAR and URBANPOP on the file POPDATA.DAT.

Command: WRITE YEAR,URBANPOP ON POPDATA.DAT
17 CASES AND 2 VARIABLE(S) WRITTEN
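
For readers who want to pass a written file to another program, the following Python sketch shows one way such a file might be read back. It is a sketch under stated assumptions, not a description of STATMATE's exact output: it assumes the first line is a header and that each later line holds one case with comma-separated values, as in the sample files of Appendix B, and it treats 1E+30 (the missing value code reported by QUERY) as missing. The function name read_cases and the sample text are hypothetical.

```python
import io

MISSING = 1.0e30  # missing value code as displayed by QUERY


def read_cases(stream):
    """Return (header_line, cases); each case is a list of values, None if missing."""
    header = stream.readline().rstrip("\n")
    cases = []
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        row = []
        for field in line.split(","):
            text = field.strip()
            try:
                value = float(text)
            except ValueError:
                row.append(text)  # keep alphanumeric data as text
                continue
            row.append(None if value >= MISSING else value)
        cases.append(row)
    return header, cases


# Hypothetical file contents, modeled on the Appendix B population data.
sample = io.StringIO(
    "DATA U.S. POPULATION:YEAR,URBAN\n"
    "1790 , 0.202\n"
    "1800 , 1E+30\n"
)
header, cases = read_cases(sample)
print(header)
print(cases)
```

Replacing the io.StringIO object with an opened file would apply the same logic to a real file; the delimiter assumption should be checked against an actual WRITE output first.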
APPENDIX A: COMPUTATIONAL METHODS (Abbreviated)

CORRELATE

The computations of CORRELATE may be found in many statistical texts; see in particular Ostle in the references.

STATISTICS

Most of the statistics computed by STATISTICS may be found in any standard text on statistics. Sums and sums of powers are calculated using provisional methods.

LET

The basis for the uniform random number generator used in the LET functions is the Wichmann article cited in the references. Normally distributed numbers are generated by summing twelve numbers generated from a uniform distribution and applying appropriate transformations to scale the result to the desired mean and standard deviation.


APPENDIX B: SAMPLE DATA

This appendix describes three data files which are included with the STATMATE package:

     1. U. S. Population Data
     2. Motor Vehicle Death Data
     3. Hald Cement Data

1. U. S. Population Data

The data listed below represents U. S. Population data from 1790 through 1950.

Column  Description
------  ----------------------------
1       Year
2       Urban population in millions
3       Rural population in millions

DATA U.S. POPULATION:YEAR,URBAN,RURAL--POPULATION IN MILLIONS
1790 , 0.202 , 3.728
1800 , 0.322 , 4.986
1810 , 0.525 , 6.714
1820 , 0.693 , 8.945
1830 , 1.127 , 11.739
1840 , 1.845 , 15.224
1850 , 3.544 , 19.648
1860 , 6.217 , 25.227
1870 , 9.902 , 28.656
1880 , 14.130 , 36.026
1890 , 22.106 , 40.841
1900 , 30.160 , 45.835
1910 , 41.999 , 49.973
1920 , 54.158 , 51.553
1930 , 68.955 , 53.830
1940 , 74.924 , 57.246
1950 , 88.927 , 61.770

2. Motor Vehicle Death Data

The data below may be found in Draper and Smith, page 191, given in the references. Note the presence of a missing value in the rural road mileage for Washington, D. C.
Column  Description
------  ---------------------------------------------
1       States
2       (Y) Motor vehicle deaths in 1964
3       Number of drivers (in units of 10,000)
4       Persons per sq. mi. in 1960
5       Rural road mileage (1,000)
6       Percentage of males greater than females (0=no, 1=yes)
7       January maximum temperature
8       Highway fuel consumption in 1964 (10 million)

DATA MOTOR VEHICLE DATA
AL, 968, 158, 64.0, 66.0, 0, 62, 119.0
AK, 43, 11, 0.4, 5.9, 1, 30, 6.2
AZ, 588, 91, 12.0, 33.0, 1, 64, 65.0
AR, 640, 92, 34.0, 73.0, 0, 51, 74.0
CA, 4743, 952, 100.0, 118.0, 0, 65, 105.0
CO, 566, 109, 17.0, 73.0, 0, 42, 78.0
CT, 325, 167, 518.0, 5.1, 0, 37, 95.0
DE, 118, 30, 226.0, 3.4, 0, 41, 20.0
DC, 115, 35, 12524.0, 1E+30, 0, 44, 23.0
FL, 1545, 298, 91.0, 57.0, 0, 67, 216.0
GA, 1302, 203, 68.0, 83.0, 0, 54, 162.0
ID, 262, 41, 8.1, 40.0, 1, 36, 29.0
IL, 2207, 544, 180.0, 102.0, 0, 33, 350.0
IN, 1410, 254, 129.0, 89.0, 0, 37, 196.0
IA, 833, 150, 49.0, 100.0, 0, 30, 109.0
KS, 669, 136, 27.0, 124.0, 0, 42, 94.0
KY, 911, 147, 76.0, 65.0, 0, 44, 104.0
LA, 1037, 146, 72.0, 40.0, 0, 65, 109.0
ME, 196, 46, 31.0, 19.0, 0, 30, 37.0
MD, 616, 157, 314.0, 29.0, 0, 44, 113.0
MA, 766, 255, 655.0, 17.0, 0, 37, 166.0
MI, 2120, 403, 137.0, 95.0, 0, 33, 306.0
MN, 841, 189, 43.0, 110.0, 0, 22, 132.0
MS, 648, 85, 46.0, 59.0, 0, 57, 77.0
MO, 1289, 234, 63.0, 100.0, 0, 40, 180.0
MT, 259, 38, 4.6, 72.0, 1, 29, 31.0
NB, 450, 89, 18.4, 97.0, 0, 32, 61.0
NE, 215, 23, 2.6, 44.0, 1, 40, 24.0
NH, 158, 37, 67.0, 13.0, 0, 32, 23.0
NJ, 1071, 329, 807.0, 21.0, 0, 43, 231.0
NM, 387, 54, 7.8, 62.0, 1, 46, 48.0
NY, 2745, 744, 350.0, 84.0, 0, 31, 439.0
NC, 1580, 226, 93.0, 71.0, 0, 51, 177.0
ND, 185, 38, 9.1, 102.0, 1, 20, 24.0
OH, 2096, 530, 237.0, 84.0, 0, 41, 358.0
OK, 785, 137, 34.0, 94.0, 0, 46, 107.0
OR, 575, 108, 18.0, 73.0, 0, 45, 81.0
PA, 1889, 570, 252.0, 89.0, 0, 39, 353.0
RI, 100, 46, 812.0, 1.3, 0, 38, 27.0
SC, 870, 122, 79.0, 52.0, 0, 61, 86.0
SD, 270, 40, 9.0, 87.0, 1, 23, 28.0
TN, 1059, 177, 85.0, 67.0, 0, 49, 135.0
TX, 3006, 515, 37.0, 196.0, 0, 50, 448.0
UT, 295, 57, 10.8, 32.0, 0, 37, 38.0
VT, 131, 20, 42.0, 13.0, 0, 25, 15.0
VA, 1050, 208, 100.0, 50.0, 0, 50, 150.0
WA, 730, 160, 43.0, 59.0, 1, 46, 109.0
WV, 467, 88, 77.0, 32.0, 0, 43, 54.0
WI, 1059, 207, 72.0, 87.0, 0, 26, 141.0
WY, 148, 22, 3.4, 67.0, 1, 37, 20.0

3. Hald Cement Data

The data shown below may be found in Draper and Smith, page 630, given in the references.

Column  Description
------  --------------------------------------------
1       Amount of tricalcium aluminate (% clinker wgt)
2       Amount of tricalcium silicate
3       Amount of tetracalcium ferrite
4       Amount of dicalcium silicate
5       (Y) Heat produced in hardening cement

DATA HALD DATA FROM DRAPER AND SMITH
 7,26, 6,60, 78.5
 1,29,15,52, 74.3
11,56, 8,20,104.3
11,31, 8,47, 87.6
 7,52, 6,33, 95.9
11,55, 9,22,109.2
 3,71,17, 6,102.7
 1,31,22,44, 72.5
 2,54,18,22, 93.1
21,47, 4,26,115.9
 1,40,23,34, 83.8
11,66, 9,12,113.3
10,68, 8,12,109.4


APPENDIX C: STATMATE INSTALLATION AND MISCELLANEA

STATMATE Components

The STATMATE package is supplied on several disks. Generally, the disks contain a set of compiled programs with file extensions of .OVL and .COM (.OVL and .EXE for the PC DOS-MS DOS version). A set of sample data sets with file extensions of .DAT are included. As well, the disks contain an SMSA$ file, an SMHELP.TXT file and an SMINSTLL.COM (.EXE for DOS) program file.

It is advisable to make a copy of these disks. Use the copies as your working disks. Once you have made backup copies, you are ready to begin. Make sure that the disk containing SMATE.COM is in your A-drive when you execute STATMATE. See the section on STATMATE Operation for instructions on how to operate the package.

If you have space problems fitting the package onto your system, see the section in this appendix titled Tailoring STATMATE to Your System.
STATMATE Internal Files

In order to operate, STATMATE creates several files for its internal use. These include SMSA$, xxxSM$DI, xxxSM$DB, xxxSM$S1, xxxSM$PL, xxxSM$WH, xxxSM$EQ and xxxSM$PL, xxxSM$CH, SM$CU, where xxx represents the ID provided by the user. Some versions of STATMATE may produce additional files, but they are always prefaced by xxSM$. SMSA$ is a control file containing installation information (problem size, etc.).

Tailoring STATMATE to Your System

STATMATE contains an SMINSTLL program which allows you to tailor STATMATE to meet various disk space needs and to modify problem and data size parameters. Let us examine how SMINSTLL can be used to help solve disk space needs.

In order to accommodate different disk space needs, the STATMATE package provides a way of distributing the various programs over several disks by dividing STATMATE program files into five groups. These groups designate the disk location of the programs for particular commands and files for internal STATMATE use. The following table shows the five groups and the commands and files controlled by the groups:

Group           Controls Disk for
--------------  ----------------------------------
Internal Files  STATMATE files

Nucleus         ERASE, EXECUTE, GIVE, HELP, WRITE
                INPUT, PRINT, QUERY, REMARK
                SET, EXIT, WHEN, ELSE, END

Statistics      STATISTICS, BREAKDOWN
                ONEWAY, TTEST, TWOWAY, CROSSTAB,
                KOLMOGOROV, TNPARAM, ONPARAM, RCORR

Miscellaneous   PLOT, HISTOGRAM, EDIT, LET, RECODE
                CHART, CUSUM

Regression      REGRESSION, CURVE, POLYNOMIAL,
                NONLINEAR, COMPUTE, CORRELATE,
                STEPWISE

For example, the Regression group controls the disk on which STATMATE expects to find the programs for the CURVE and REGRESSION commands. Through the use of the SMINSTLL program, discussed in the next paragraph, the user may alter the disk locations where STATMATE expects to find the programs for its commands.
In order to modify the delivered configuration, it is necessary to use the SMINSTLL program provided with STATMATE. SMINSTLL is an interactive program that asks for the desired disk drive of each of the groups shown above. Once the drives for these groups are specified, SMINSTLL will give instructions on how STATMATE programs should be distributed on your disk drives. Information about the new configuration is placed in the SMSA$ file, which STATMATE uses to determine what configuration is to be used during execution. It is necessary to operate STATMATE in the specified disk configuration, because STATMATE only knows of the current configuration as specified by SMINSTLL.

A second need met by SMINSTLL is the ability to change database and problem size parameters. As delivered, the maximum number of variables allowed in the database is 10 and the maximum number of cases that can be placed in the database is 250. Use SMINSTLL to change these values. SMINSTLL will query you for the values. For the best results, change the maximum number of variables to some multiple of 5, for example, 15.

Once you use SMINSTLL to change these values, all new databases will be of the specified size. STATMATE maintains these size parameters with each database. Hence, previously created databases can be used without any difficulty. The size of an existing database cannot be changed.

Some care should be exercised in specifying the size of the database. A database with a maximum of 10 variables and 250 cases occupies about 5K bytes of file space. This space is assigned immediately, and not as you add data to the database. If you specify a database of 50 variables and 500 cases, then you will use about 50K bytes of file space. If you only work with at most 200 cases and 5 variables, only part of the database will be used. The remainder will nevertheless occupy file space on your disk.
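The size figures quoted above can be checked with a little arithmetic. The Python sketch below is illustrative only and is not part of STATMATE; its function names are hypothetical, and it assumes that "K bytes" means 1024 bytes. The two figures in the text work out to roughly 2 bytes per stored value, whereas the SMINSTLL example later in this appendix quotes about 80K bytes for 40 variables and 500 cases, which is roughly 4 bytes per value. The per-value size is therefore treated as a parameter rather than a known constant.

```python
K = 1024  # assumption: "K bytes" means 1024 bytes


def implied_bytes_per_value(max_variables, max_cases, quoted_kbytes):
    """Bytes per stored value implied by a quoted database size."""
    return quoted_kbytes * K / (max_variables * max_cases)


def estimated_kbytes(max_variables, max_cases, bytes_per_value):
    """Approximate file space, in K bytes, reserved for an empty database."""
    return max_variables * max_cases * bytes_per_value / K


def fraction_used(variables, cases, max_variables, max_cases):
    """Share of the reserved cells that actually hold data."""
    return (variables * cases) / (max_variables * max_cases)


if __name__ == "__main__":
    # Figures quoted in the text: 10 x 250 -> about 5K, 50 x 500 -> about 50K.
    print(implied_bytes_per_value(10, 250, 5))
    print(implied_bytes_per_value(50, 500, 50))
    # Figure quoted by the SMINSTLL example: 40 x 500 -> about 80K.
    print(implied_bytes_per_value(40, 500, 80))
    # 5 variables and 200 cases stored in a 50 x 500 database.
    print(fraction_used(5, 200, 50, 500))
```

The last call shows why oversizing is costly: 5 variables by 200 cases fill only a small fraction of a 50 by 500 database, yet the whole file is reserved on disk.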
Note that there is no way to change individual command limitations on problem sizes. For example, there is no way to increase the size of problems that can be accommodated by ONEWAY.

Users with a hard disk should use SMINSTLL and specify that all groups belong in the current directory. All STATMATE files should then be placed in a single directory.

When using SMINSTLL, it is probably a good idea to print the output so that you will have a record of how your configuration should be installed. The names of program files which are to be placed on specific disk drives may vary with the version of STATMATE. The example below is representative of the interaction with SMINSTLL. In any case, the output instructions from SMINSTLL should be followed when you actually perform the installation.

SMINSTLL Example

In the following example, SMINSTLL is used to create a configuration that establishes databases with a maximum of 40 variables and 500 cases. The location of internal files is specified as the B-disk. We begin by entering the word SMINSTLL at the terminal.

SMINSTLL

STATMATE INSTALL PROGRAM

Note: [ ] denotes default value

SPECIFY MAXIMUM VARIABLES : 40
        MAXIMUM CASES     : 500

NOTE: Your STATMATE database file will occupy about 80K
      bytes of disk storage.

SPECIFY DISK DRIVE (A,B,C,D,E,F OR RETURN=CURRENT) FOR:

STATMATE INTERNAL FILE GROUP [CURRENT]: B

--- File Distribution Settings ---

STATMATE Internal File Group Files Will Be Generated on B Drive

CP/M             PC DOS/MS DOS
--------------   -------------
xxxSM$DB         xxxSM$DB
xxxSM$DI         xxxSM$DI
xxxSM$S1         xxxSM$S1
xxxSM$WH         xxxSM$WH
xxxSM$EQ         xxxSM$EQ
xxxSM$PL         xxxSM$PL
xxxSM$ST         xxxSM$ST
xxxSM$CU         xxxSM$CU
xxxSM$CH         xxxSM$CH

where xxx is the user ID

Press carriage return to continue

Make sure all of the following STATMATE files are on the same diskette:

   CP/M              PC DOS/MS DOS
   ---------------   -------------
1. STATMATE.COM      STATMATE.EXE
2. All .OVL files    .OVR files
3. SMSA$ file        SMSA$ file
4. SMHELP.TXT        SMHELP.TXT

End of installation

APPENDIX D: STATMATE SIZE LIMITATIONS

Command problem size limitations are shown in the table below.

Command Limitations

Command        Cases    Var   Classes   Other Limitations
------------   -----    ----  -------   -------------------
BREAKDOWN      32000    1     100
CHART          32000    1     100       100 subgroups & 20 items per subgroup
COMPUTE        32000*   1     na**
CORRELATE      32000    20    na
CROSSTABS      32000    2     20
CURVE          32000    2     na
CUSUM          32000    1     100       100 subgroups & 20 items per subgroup
EDIT           32000    20    na
ERASE          na       na    na
EXECUTE        na       na    na
EXIT           na       na    na
GIVE           na       na    na        20 attribute modifications
HELP           na       na    na
HISTOGRAM      32000    1     20        20 bars
INPUT          32000    50    na
KOLMOGOROV     500      2     na
LET            32000    2     na
NONLINEAR      250      5     na        5 parameters
ONPARAM        500      2     xx
ONEWAY         32000    2     20
PLOT           32000    6     na        5 on y-axis and 1 on x-axis
POLYNOMIAL     250      2     na        Degree less than 11
PRINT          32000    6     na
QUERY          na       na    na        20 variables per query
RCORRELATION   500      2     na
RECODE         32000    1     na        10 individual values when recoding
                                        a set of specific values
REGRESSION     32000    20    na
REMARK         na       na    na
SET            na       na    na
SHOW           na       na    na
STATISTICS     32000    1     na        500 cases when computing quartiles
STEPWISE       32000    20    na
TNPARAM        na       2     na        20 sets by 20 groups
TTEST          32000    2     na
TWOWAY         32000    3     20
WHEN           32000    2     na
WRITE          32000    20    na

*  32,767 for those who can remember or need it
** not applicable

A STATMATE database may contain as many as 32,000 cases and several hundred variables. When a database is large, the limiting factor becomes the disk capacity of your system. Generally, analyses can process as many as 32,000 cases and 20 variables. Specific limitations are shown in the above table.
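To make the table concrete, the following Python sketch checks a planned analysis against the case and variable limits of a few commands. It is illustrative only and is not part of STATMATE: the dictionary holds just a handful of rows copied from the table above, and the function name is hypothetical.

```python
# (maximum cases, maximum variables) copied from a few rows of the table above.
COMMAND_LIMITS = {
    "REGRESSION": (32000, 20),
    "CORRELATE": (32000, 20),
    "INPUT": (32000, 50),
    "KOLMOGOROV": (500, 2),
    "NONLINEAR": (250, 5),
    "POLYNOMIAL": (250, 2),
}


def exceeded_limits(command, cases, variables):
    """Return a list of messages, one per limit the problem would exceed."""
    max_cases, max_variables = COMMAND_LIMITS[command.upper()]
    problems = []
    if cases > max_cases:
        problems.append(f"{cases} cases exceed the limit of {max_cases}")
    if variables > max_variables:
        problems.append(
            f"{variables} variables exceed the limit of {max_variables}"
        )
    return problems


if __name__ == "__main__":
    # A 300-case nonlinear fit is over the 250-case limit for NONLINEAR.
    print(exceeded_limits("NONLINEAR", 300, 3))
    # The same data set is within the limits for REGRESSION.
    print(exceeded_limits("REGRESSION", 300, 3))
```

An empty list means the problem fits within the listed limits; the "Other Limitations" column (parameters, classes, subgroups) is not modeled here.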
APPENDIX E: HELP

1.  Question: How do I enter alphanumeric data into the database?
    Answer:   Read the INPUT and GIVE TYPE command descriptions.

2.  Question: I keep running out of disk space because the disk contains many STATMATE internal files. What can I do?
    Answer:   Try to keep your use of identifiers in response to the ID prompt to a minimum. Use the install program to create larger databases and use as few databases as possible.

3.  Question: Why won't INPUT read my data?
    Answer:   Check your input file to see that data is in the correct fields and that data items are correct.

4.  Question: What needs to be done to print my output on my printer?
    Answer:   Use the SET COPY=HARDCOPY command.

5.  Question: Why can't the INPUT command find my data file?
    Answer:   Use a disk identifier, such as B:, before the file name to specify the disk the file is on.

6.  Question: Why aren't my PLOT modifiers retained for repeated use?
    Answer:   Use SAVE in the subcommand mode.

7.  Question: When I created one of my databases, I forgot how many variables and cases I specified as the maximum. How can I find out what the maximum is for a database?
    Answer:   Use the SHOW command.

8.  Question: Is it possible to get data from another program, such as dBASE, into STATMATE?
    Answer:   Create an ASCII file with the program and turn it into a 'DATA' file with your word processor or editor.

9.  Question: Almost every time I use the package, I use the same seven or eight commands for my task. Is there any way to reduce the effort?
    Answer:   Use the EXECUTE command.

10. Question: PRINT produces too many cases to view easily. Can I reduce the output some way?
    Answer:   Use the WHEN command to restrict output.

11. Question: How can I control my output to the screen? It scrolls by so fast that it can't be read easily.
    Answer:   Use your operating system's ability to hold screen output with control keys. Alternatively, use the SET COPY=FILE command to output the results to a file for examination later.

12. Question: Do the STATMATE database and internal files and my data files need to be on the same diskette?
    Answer:   No. Internal file usage is controlled by the install program. Your data files may be referenced with the INPUT command by preceding the file name with a disk identifier, if necessary.

APPENDIX F: SUGGESTED DISKETTE ORGANIZATION

STATMATE/PLUS is distributed on diskettes with the contents listed below. If you are executing the program from diskettes, this is the arrangement that you should begin with. Disk 1 should be placed in your A-drive and disk 2 in your B-drive. Disk 3 contains the shareware version of the STATMATE/PLUS user's guide as listable files. It is not needed to execute STATMATE.

The suggested diskette arrangement should leave you with about 30K of free space on disk 1, enough to use STATMATE as configured by the install program, SMINSTLL.EXE. Some of this space will disappear when you use STATMATE for the first time, because STATMATE creates a database file and other internal files as it is used. If you need more space, consider moving SMINSTLL.EXE and the data files (.DAT) to another diskette. If you continue to have space problems, see appendix C for information on installing STATMATE in other configurations.
Suggested Diskette Contents

Disk 1   SMATE.EXE                             --- main program

         SMASKC.OVR  SMATE.EXE   SMBRKD.OVR    --- overlay files
         SMINLX.OVR  SMINPT.OVR  SMKOLM.OVR
         SMONET.OVR  SMONEW.OVR  SMONPR.OVR
         SMRCOR.OVR  SMSETI.OVR  SMSHOW.OVR
         SMSTAT.OVR  SMSYAN.OVR  SMTNPR.OVR
         SMTTES.OVR  SMTWOT.OVR  SMTWOW.OVR

         SMSA$                                 --- configuration file
         SMHELP.TXT                            --- help file
         SMINSTLL.EXE                          --- install program
         DEMO                                  --- example command file

         USPOP.DAT   USPOPDEM.DAT              --- sample data files
         HALD.DAT    MOTOR.DAT   QAMEAS.DAT
         SAMPLE.DAT

         README

Disk 2   SMASUM.OVR  SMCHLX.OVR  SMCHPL.OVR  SMCHRT.OVR   --- overlay files
         SMCOMP.OVR  SMCORR.OVR  SMCORT.OVR  SMCSLX.OVR
         SMCSUM.OVR  SMCURT.OVR  SMCURV.OVR  SMEDIT.OVR
         SMHIST.OVR  SMLETA.OVR  SMLETX.OVR  SMNONL.OVR
         SMNONT.OVR  SMNONX.OVR  SMPLLX.OVR  SMPLOT.OVR
         SMPOLY.OVR  SMRECO.OVR  SMREGR.OVR  SMROUT.OVR
         SMSTEP.OVR  SMSTLX.OVR  SMXTAB.OVR

Disk 3   SMPART1.DOC SMPART2.DOC SMPART3.DOC SMPART4.DOC  --- user's guide
         SMPART5.DOC

APPENDIX G: INVOICE AND ORDER FORM

A sample invoice form and order form are enclosed to simplify ordering. Use the order form to place an order. The invoice form is to be used within your organization to generate payment for STATMATE/PLUS registrations.

STATMATE/PLUS is available on disk for your evaluation and convenience for $35. This fee only covers diskette costs, handling and postage (within the U.S.). It does not cover registration. Please show your support by registering the program if you are using it on a regular basis and find it of value.

Note that a 190+ page user's guide is available for $35, and you can save $10 by purchasing the diskettes and registering at the same time. Remember that registered owners receive several utilities when they register. Further, when you purchase either the diskettes or registration, you receive a coupon worth $10 on your next purchase. If you purchase both, then you receive $10 off now and a $10 coupon. The coupon offer applies only to purchases made directly from The Software Hill.
Business, commercial, governmental or educational institution use of non-registered copies of STATMATE/PLUS is strictly forbidden. Write for details concerning site or corporate licensing.

                      ---INVOICE FORM---
                      for STATMATE/PLUS

Remit to:  The Software Hill
           1857 Apple Tree Lane
           Mt. View, CA 94040
           (415) 969-4233

Sold to: ____________________   Ship to: ______________________
         ____________________            ______________________
         ____________________            ______________________

-------------------------------------------------------------------
Date:                             PO #:
-------------------------------------------------------------------

Items:                                              Qty     Price

[ ] STATMATE Evaluation disks          $ 35         ______  $______.____
[ ] STATMATE/PLUS Registration         $ 45         ______  $______.____
[ ] STATMATE disks & registration      $ 70 (save)  ______  $______.____
[ ] STATMATE/PLUS User's Guide only    $ 35         ______  $______.____
[ ] California Residents add 7% sales tax                   $_____.____

                                        Total (U.S.)        $________.____

Shipping:
[ ] Ship COD via UPS or not U.S. mail; add $15              $______.___
[ ] Ship outside of North America:
      Add $15  [ ] STATMATE/PLUS disks only                 $______.___
      Add $25  [ ] STATMATE/PLUS manual & disks             $______.___
      Add $15  [ ] STATMATE/XG disks and guide              $______.___

                      ---ORDER FORM---
                      for STATMATE/PLUS

           The Software Hill
           1857 Apple Tree Lane
           Mt. View, CA 94040
           (415) 969-4233

Operating System and Computer:
[ ] PC DOS   [ ] MS DOS   Version _____   Computer ______________

Items:                                              Qty     Price

[ ] STATMATE Evaluation disks          $ 35         ______  $_____.___
[ ] STATMATE/PLUS Registration         $ 45         ______  $_____.___
[ ] STATMATE disks & registration      $ 70 (save)  ______  $______.__
[ ] STATMATE/PLUS User's Guide only -- $ 35         ______  $_____.___
[ ] California Residents add 7% sales tax                   $____.___

                                        Total (U.S.)        $_______.___

Shipping:
[ ] Ship COD via UPS or not U.S. mail; add $20              $______.___
[ ] Ship outside of North America:
      Add $15  [ ] STATMATE/PLUS disks only                 $______.___
      Add $25  [ ] STATMATE/PLUS manual & disks             $______.___
      Add $15  [ ] STATMATE/XG disks and guide              $______.___

Payment:
[ ] Check enclosed.  Amount enclosed $________._____ (U.S.)

Make check     The Software Hill
payable to:    1857 Apple Tree Lane
               Mt. View, CA 94040

1. Allow three weeks for delivery.
2. Orders outside U.S.: send check drawn on U.S. bank or international money order. Amount in U.S. dollars.

------------------------------------------------------------------
Date:                             PO #:
------------------------------------------------------------------

---Customer Information---

Name    ______________________________________________
Address ___________________________________________
        ___________________________________________
        ___________________________________________
City __________________ State ______ Country __________
Phone (_____) ______ - _________

REFERENCES (Partial)

Cooper, B. E., Statistics for Experimentalists, Pergamon Press Ltd., London, England, First Edition, 1969

Draper, N. R. and Smith, H., Applied Regression Analysis, John Wiley and Sons, New York, New York, Second Edition, 1981

Duncan, A. J., Quality Control and Industrial Statistics, Richard D. Irwin Inc., Homewood, Illinois, Fourth Edition, 1974

Marquardt, Donald W., An Algorithm for Least-Squares Estimation of Nonlinear Parameters, Journal of the SIAM, vol. 11, no. 2, pp 431-441, June 1963

Ostle, Bernard, Statistics in Research, Iowa State Univ. Press, Ames, Iowa, Second (now in Seventh Edition), 1963

Siegel, Sidney, Nonparametric Statistics for the Behavioral Sciences, Chapters 6 and 8, New York, McGraw-Hill, 1956

Tukey, John W., Exploratory Data Analysis, Addison-Wesley, Reading, Massachusetts, 1977

Wichmann, B. A. and Hill, I. D., A Pseudo-random Number Generator, NPL Report, DITC, June, 1982

Winer, B. J., Statistical Principles in Experimental Designs, McGraw-Hill, New York, 1970
 Volume in drive A has no label
 Directory of A:\

FILES863 TXT       751   9-07-88   2:18p
GO       BAT        38  11-05-87   3:26p
GO       TXT       463  11-05-87   3:56p
PRINTDOC BAT       789  11-05-87   3:59p
SMPART1  DOC     44886   8-31-88   6:45a
SMPART2  DOC     42240   8-07-87   5:42p
SMPART3  DOC     25216   8-07-87   5:43p
SMPART4  DOC     23808   8-07-87   5:44p
SMPART5  DOC     35845   8-31-88   6:47a
        9 file(s)     174036 bytes
                      143360 bytes free