Technical Reference

ASSEMBLY TO HIGH-LEVEL INTERFACE

11.1 Interfacing Basic with Assembly Language

Sometimes it may be desirable to write certain subroutines in assembly language instead of BASIC because of speed, size, or other constraints. This section explains how to successfully combine BASIC programs with assembly language modules so that parameters are passed correctly between the BASIC program and the assembly language subroutine.

To understand this section fully, the reader should have a basic knowledge of the following:

  1. The BASIC interpreter
  2. The assembler
  3. The register structure of the 8086/8088
11.1.1 Calling Assembly Language Subroutines

In order to call an assembly language subroutine from an interpretive BASIC program, it is necessary to know the address of the assembly language routine. The routine must be resident in memory when BASIC is loaded, and its entry address must be known. The module containing the subroutine should also contain a short program that loads the module into memory, using the MS-DOS terminate and remain resident function (Int 27 Hex). When the program is run, it loads the subroutine into memory permanently. The program should also display the entry address of the subroutine, or store the entry address to some specific memory location that the BASIC program can PEEK in order to determine where the routine is. A possible location to store this information is an Interrupt Vector Table entry, but be very careful not to use an entry that is used by the operating system! Interrupt Vector Table entries available for use include 128 through 191 (80 - BF Hex). Since each entry is four bytes long, entry 128 is at memory address 0:200 Hex, entry 129 is at address 0:204 hex, and so on, with entry 191 at address 0:2FC Hex.

The BASIC program can then determine the address of the subroutine by PEEKing the four consecutive bytes that were saved by the assembly loader program. After the BASIC program has PEEKed these locations, it can set up the entry address of the assembly language subroutine. The following program segment shows how this is done.

	10 '
	20 '   do  a  DEF SEG  to the  segment  where  the  entry 
        25 '   address is stored
	30 '   LOCATION  =  offset address of the  entry  address 
        35 '   location
	40 '
	50 LOWOFF = PEEK(LOCATION)
	60 HIOFF  = PEEK(LOCATION+1)
	70 LOWSEG = PEEK(LOCATION+2)
	80 HISEG  = PEEK(LOCATION+3)
	90 ASM.SEG = (256*HISEG)+LOWSEG
	100 SUBROUTINE = (256*HIOFF)+LOWOFF
	110 DEF SEG = ASM.SEG

After these statements have been executed, calls to the assembly language subroutine can be performed as follows:

	150 CALL SUBROUTINE(PARAMETER1, PARAMETER2, ...)

The assembly language subroutine must follow some simple rules in order to work correctly.

  1. It must be declared FAR.
  2. Segment registers DS and ES must be restored to their entry values before returning to BASIC.
  3. The general purpose registers (AX, BX, CX, DX, SI, DI, and BP) can have any value when when returning to BASIC.
  4. The assembly language routine MUST NOT change the length of any BASIC strings.
  5. The assembly language routine must perform a RET (where n = 2 times the number of parameters) to restore the stack pointer to its proper value.
  6. Values can be returned to BASIC by passing a parameter that the result will be returned in.
11.1.2 Basic Data Types

It is necessary to understand how the various data types are represented in memory. When a subroutine is called, BASIC will pass the address of one of the following data representations.

  1. Integer - two byte two's complement number
  2. Single Precision Number - four byte binary floating point quantity. The most significant byte contains the value of the exponent minus 127. The remaining three bytes contain the mantissa. The most significant byte of the mantissa contains the sign bit, followed by the seven highest bits of the mantissa. A positive number is represented with a 0 as the sign bit, a negative number with a 1 as the sign bit. The binary point is to the left of the most signifi- cant bit of the mantissa. A 1 is always assumed to exist immediately to the left of the mantissa, although it is not represented. Thus the number is represented as

    ( 1. * 2) ^ (exponent-127)

  3. Double Precision Number - eight byte binary floating point quantity. It is represented exactly the same as a single precision number, except that the mantissa is made up of 41 bits (7 bytes less the sign bit).
  4. String - BASIC will pass a pointer to a 'string descriptor' which is a three byte data structure. The first byte of the string descriptor contains the length of the string. The second and third bytes contain the address where the actual ASCII string is located. The assembly language subroutine is allowed to modify the string, but must not change the string descriptor.
  5. Array - arrays are made up of sequential elements of the array type. For example, an integer array containing twenty elements is represented as twenty sequential integers in memory.
11.1.3 Passing Parameters

BASIC passes all subroutine parameters by reference. The offset of each parameter's address is pushed onto the stack in the same order that the parameters are listed in the procedure call. Upon entry to the subroutine, the stack will be arranged as follows:

                SP+8 -> |       etc.      |
			+-----------------+
		SP+6 -> |    Offset of    |
			|  2nd parameter  |
			+-----------------+   |
		SP+4 ->	|    Offset of    |   |
			|  1st parameter  |   | Stack grows down
			+-----------------+   |
		SP --->	| return address  |   v
			|    (4 bytes)    |
			+-----------------+

The parameters can then be referenced by using the BP register to get their address off of the stack. The following example shows how to do this.

11.1.4 Example

This example shows how to call an assembly language routine from BASIC. The assembly language routine performs modulo arithmetic on two integers, returning the remainder that results when the first integer is divided by the second. The assembly language module consists of two procedures. The first procedure loads the module into memory, and puts the entry address of the second procedure into interrupt vector table entry 128. The second procedure is called from BASIC, and performs the modulo function. The BASIC program peeks the Interrupt Vector Table to get the entry address of the modulo function, and the performs the call with some sample data.

ASSEMBLY LANGUAGE MODULE

	name	modulo

	code	segment	public 'code'
	assume	cs:code, ds:code

		org	100h	   ; necessary for .COM program

	; This procedure loads the module into memory and sets up
	; interrupt vector table entry 128.

	loader	proc	near

		IVT_seg    equ 0	; Interrupt Vector Table 
		Int128_off equ 512	;   entry 128 is at 0:512

		push	cs
		pop	ds		; DS = CS in .COM program

	; set up Interrupt vector table entry 128 to
	; point to the Modulo arithmetic function.

		mov	ax, IVT_seg
		mov	es, ax
		mov	bx, Int128_off
		mov	ax, offset modulo
		mov	es:[bx], ax
		mov	es:[bx+2], cs

	; Terminate and remain resident. Dx = last byte of program + 1.

		mov	dx, offset mod_ends
		inc	dx
		int	27h

	loader	endp

	modulo	proc	far		; must be declared far

	; This module is called from BASIC with 3 parameters.
	; It divides the first parameter by the second and 
	; returns the remainder in the third.
	
		mov	bp, sp		; BP used to get parameters
		mov	bx, [bp+8]	; BX = pointer to dividend
		mov	ax, [bx]	; AX = value of dividend
		mov	bx, [bp+6]	; BX = pointer to divisor
		mov	cx, [bx]	; CX = value of divisor
		mov	dx, 0		; DX:AX = dividend
		idiv	cx		; AX = quotient, DX = remainder
		mov	bx, [bp+4]	; BX = address of result
		mov	[bx], dx	; return result to BASIC
		ret	6

	mod_ends:
	modulo	endp
	code	ends
		end

BASIC PROGRAM

	5 ' Get address of assembly language MODULO routine from
	6 ' Interrupt Vector Table entry 128, which is located at
	7 ' memory address 0:512.
	8 '
	10 DEF SEG = 0
	20 LOWOFF = PEEK(512)
	30 HIOFF  = PEEK(513)
	40 LOWSEG = PEEK(514)
	50 HISEG  = PEEK(515)
	60 SEG = (256*HISEG)+LOWSEG
	70 MODULO = (256*HIOFF)+LOWOFF
	80 DEF SEG = SEG
	85 '
	90 ' call the MODULO routine
	95 '
	100 A% = 140
	110 B% = 11
	120 REMAINDER% = 0
	130 CALL MODULO(A%,B%,REMAINDER%)
	140 PRINT A%;"modulo";B%;"is";REMAINDER%
	150 END

This example illustrates one other important point. All parameters must be variables, and they must be initialized before calling the assembly language subroutine. After assembling and linking the assembly language module, it is necessary to convert the resulting .EXE file into a .COM file in order for the terminate and remain resident function to work correctly. An easy way to do this is with the Microsoft debugger, using the following sequence of instructions:

	debug asm_module.exe
	nasm_module.com
	w
	q

Then the assembly language module can be loaded by running the .COM program. After it has loaded, you can run your BASIC program which calls the assembly language module.

11.2 Interfacing Compiled Basic with Assembly Language

Occasionally, you may wish to write certain subroutines in assembly language instead of Compiled BASIC because of speed, size, or other constraints. This section explains how to combine compiled BASIC programs with assembly language modules so that parameters are passed correctly between the Compiled BASIC program and the assembly language subroutine.

To understand this section fully, the reader should have a basic knowledge of the following:

  1. The MS-BASIC Compiler
  2. The Microsoft assembler
  3. The Microsoft linker
  4. The register structure of the 8086/8088
11.2.1 Assembly Language Subroutines

Assembly language subroutines can be invoked from compiled BASIC using either the CALL statement or the CALLS statement. The CALL statement pushes the offset addresses of any parameters on the stack before it transfers execution to the subroutine, while the CALLS statement pushes both the segment and offset addresses of any parameters on the stack. The example later in this discussion will fully illustrate this difference.

The assembly language subroutine must follow some simple rules in order to work correctly.

  1. It must be declared FAR.
  2. It must be declared PUBLIC.
  3. Segment registers DS and ES must be restored to their entry values before returning to Compiled BASIC.
  4. The general purpose registers (AX, BX, CX, DX, SI, DI, and BP) can have any value when returning to Compiled BASIC.
  5. The assembly language routine MUST NOT change the length of any Compiled BASIC strings.
  6. The assembly language routine must perform a RET (where n = 2 times the number of parameters) to restore the stack pointer to its proper value.
  7. Values can be returned to Compiled BASIC by passing a parameter that the result will be returned in.
11.2.2 Compiled Basic Data Types

In order to manipulate data passed to an assembly language subroutine, it is necessary to understand how the various data types are represented in memory. When a subroutine is called, Compiled BASIC will pass the address of one of the following data representations.

  1. Integer - two byte two's complement number
  2. Single Precision Number - four byte binary floating point quantity. The most significant byte contains the value of the exponent minus 127. The remaining three bytes contain the mantissa. The most significant byte of the mantissa contains the sign bit, followed by the seven highest bits of the mantissa. A positive number is represented with a 0 as the sign bit, a negative number with a 1 as the sign bit. The binary point is to the left of the most significant bit of the mantissa. A 1 is always assumed to exist immediately to the left of the mantissa, although it is not represented. Thus the number is represented as

    ( 1. * 2) ^ (exponent-127)

  3. Double Precision Number - eight byte binary floating point quantity. It is represented exactly the same as a single precision number, except that the mantissa is made up of 41 bits (7 bytes less the sign bit).
  4. String - Compiled BASIC will pass a pointer to a 'string descriptor' which is a four byte data structure. The first two bytes of the string descriptor contain the length of the string. The last two bytes contain the address where the actual ASCII string is located. The assembly language subroutine is allowed to modify the string, but must not change the string descriptor.
  5. Array - arrays are made up of sequential elements of the array type. For example, an integer array containing twenty elements is represented as twenty sequential integers in memory.
11.2.3 Passing Parameters

Compiled BASIC passes all subroutine parameters by reference. In a CALL statement, the offset of each parameter's address is pushed onto the stack in the same order that the parameters are listed in the procedure call. It is important to note that all parameters to the assembly language subroutine must be variables. Upon entry to the subroutine, the stack will be arranged as follows:

                SP+8 -> |       etc.      |        
			+-----------------+
		SP+6 -> |    Offset of    |
			|  2nd parameter  |
			+-----------------+   |
		SP+4 ->	|    Offset of    |   |
			|  1st parameter  |   | Stack grows down
			+-----------------+   |
		SP --->	|  return address |   |
			|     (4 bytes)   |   v
			+-----------------+


If  a CALLS statement is used instead,  then the stack will  look 
like this when a subroutine is entered:


                SP+12 > |      etc.       |
			+-----------------+
		SP+8 -> | Full Address of |
			|  2nd parameter  |
			+-----------------+   |
		SP+4 ->	| Full Address of |   |
			|  1st parameter  |   | Stack grows down
			+-----------------+   |
		SP --->	| return address  |   |
			|    (4 bytes)    |   v
			+-----------------+

The parameters can then be referenced by using the BP register to get their address off of the stack. The following example shows how to do this.

11.2.4 Example

This example shows how to link an assembly language subroutine with a Compiled BASIC program. The assembly language routine performs modulo arithmetic on two integers, returning the remainder that results when the first integer is divided by the second. The example program is shown twice, once using a CALL statement and once using a CALLS statement.

1) Compiled BASIC program with CALL statement

	10 '
	20 ' call the MODULO routine
	30 '
	40 A% = 140
	50 B% = 11
	60 REMAINDER% = 0
	70 CALL MODULO(A%,B%,REMAINDER%)
	80 PRINT A%;"modulo";B%;"is";REMAINDER%
	90 END


   Assembly language module for use with CALL statement

name	modulo

code	segment	public 'code'
assume	cs:code, ds:code

public	modulo

modulo	proc	far

; This module is called from Compiled BASIC with 3 parameters,
; using the CALL statement. It divides the first parameter by
; the second and returns the remainder in the third.

	mov	bp, sp			; BP used to get parameters
	mov	bx, [bp+8]		; BX = pointer to dividend
	mov	ax, [bx]		; AX = value of dividend
	mov	bx, [bp+6]		; BX = pointer to divisor
	mov	cx, [bx]		; CX = value of divisor
	mov	dx, 0			; DX:AX = dividend
	idiv	cx			; AX = quotient, DX = remainder
	mov	bx, [bp+4]		; BX = address of result
	mov	[bx], dx		; return result to BASIC
	ret	6

modulo	endp
code	ends
	end


2) Compiled BASIC program with CALLS statement

	10 '
	20 ' call the MODULO routine
	30 '
	40 A% = 140
	50 B% = 11
	60 REMAINDER% = 0
	70 CALLS MODULO(A%,B%,REMAINDER%)
	80 PRINT A%;"modulo";B%;"is";REMAINDER%
	90 END


   Assembly language module for use with CALLS statement

name	modulo

code	segment	public 'code'
assume	cs:code, ds:code

public	modulo

modulo	proc	far

; This module is called from Compiled BASIC with 3 parameters, using
; the CALLS statement. It divides the first parameter by the second
; and returns the remainder in the third.

	mov	bp, sp			; BP used to get parameters
	les	bx, dword ptr [bp+12]	; ES:BX = pointer to dividend
	mov	ax, es:[bx]		; AX = value of dividend
	les	bx, dword ptr [bp+8]	; ES:BX = pointer to divisor
	mov	cx, es:[bx]		; CX = value of divisor
	mov	dx, 0			; DX:AX = dividend
	idiv	cx			; AX = quotient, DX = remainder
	les	bx, dword ptr [bp+4]	; ES:BX = address of result
	mov	es:[bx], dx		; return result to BASIC
	ret	6

modulo	endp
code	ends
	end

After compiling and assembling the various modules, use the Microsoft linker to create the executable program. The compiled BASIC object modules should be listed before the names of the assembly language object modules. After your modules have been linked, your program is ready to run.

11.3 Calling Assembly Language Subroutines from GWBASIC

Assembly language subroutines can be invoked from GWBASIC using the CALL statement. The CALL statement pushes the offset addresses of any parameters on the stack before it transfers execution to the subroutine.

The assembly language subroutine must follow some simple rules in order to work correctly.

  1. It must be declared FAR.
  2. Segment registers DS and ES must be restored to their entry values before returning to Compiled BASIC.
  3. The general purpose registers (AX, BX, CX, DX, SI, DI, and BP) can have any value when returning to GWBASIC.
  4. The assembly language routine MUST NOT change the length of any GWBASIC strings.
  5. The assembly language routine must perform a RET (where n = 2 times the number of parameters) to restore the stack pointer to its proper value.
  6. Values can be returned to GWBASIC by passing a parameter that the result will be returned in.
11.3.1 GWBASIC Data Types

In order to manipulate data passed to an assembly language subroutine, it is necessary to understand how the various data types are represented in memory. When a subroutine is called, GWBASIC will pass the address of one of the following data representations.

  1. Integer - two byte two's complement number
  2. Single Precision Number - four byte binary floating point quantity. The most significant byte contains the value of the exponent minus 127. The remaining three bytes contain the mantissa. The most significant byte of the mantissa contains the sign bit, followed by the seven highest bits of the mantissa. A positive number is represented with a 0 as the sign bit, a negative number with a 1 as the sign bit. The binary point is to the left of the most significant bit of the mantissa. A 1 is always assumed to exist immediately to the left of the mantissa, although it is not represented. Thus the number is represented as

    ( 1. * 2) ^ (exponent-127)

  3. Double Precision Number - eight byte binary floating point quantity. It is represented exactly the same as a single precision number, except that the mantissa is made up of 41 bits (7 bytes less the sign bit).
  4. String - GWBASIC will pass the offset address of a 'string descriptor' which is a three byte data structure. The first byte of the string descriptor contains the length of the string. The last two bytes contain the address where the actual ASCII string is located. The assembly language subroutine is allowed to modify the string, but it must not change the string descriptor.
  5. Array - arrays are made up of sequential elements of the array type. For example, an integer array containing twenty elements is represented as twenty sequential integers in memory.
11.3.2 Passing Parameters

GWBASIC passes all subroutine parameters by reference. In a CALL statement, the offset of each parameter's address is pushed onto the stack in the same order that the parameters are listed in the procedure call. It is important to note that all parameters to the assembly language subroutine must be variables. Upon entry to the subroutine, the stack will be arranged as follows:

                SP+8 -> |      etc.       |
			+-----------------+
		SP+6 -> |    Offset of    |
			|  2nd parameter  |
			+-----------------+   |
		SP+4 ->	|    Offset of    |   |
			|  1st parameter  |   | Stack grows down
			+-----------------+   |
		SP --->	| return address  |   |
			|    (4 bytes)    |   v
			+-----------------+

The parameters can then be referenced by using the BP register to get their address off of the stack. The following example shows how to do this.

11.3.3 Example

This example shows how to load an assembly language subroutine from a GWBASIC program. The assembly language routine performs modulo arithmetic on two integers, returning the remainder that results when the first integer is divided by the second. In this example, the assembly language module is loaded at address 1664:0 Hex, but this address will be different for different applications. The method of determining this address is explained after the example.

GWBASIC program

	10 '
	20 ' load the MODULO routine
	30 '
	40 DEF SEG = &H1664
	50 BLOAD "MODULO",0
	60 MODULO = 0
	70 '
	80 ' call the MODULO routine with some sample data
	90 '
	100 A% = 140
	110 B% = 11
	120 REMAINDER% = 0
	130 CALL MODULO(A%,B%,REMAINDER%)
	140 PRINT A%;"modulo";B%;"is";REMAINDER%
	150 END


   Assembly language module for use with CALL statement

name	modulo

code	segment	public 'code'
assume	cs:code, ds:code

modulo	proc	far

; This module is called from GWBASIC with 3 parameters,
; using the CALL statement. It divides the first parameter by
; the second and returns the remainder in the third.

	mov	bp, sp	       	; BP used to get parameters
	mov	bx, [bp+8]     	; BX = pointer to dividend
	mov	ax, [bx]       	; AX = value of dividend
	mov	bx, [bp+6]     	; BX = pointer to divisor
	mov	cx, [bx]       	; CX = value of divisor
	mov	dx, 0	       	; DX:AX = dividend
	idiv	cx	       	; AX = quotient, DX = remainder
	mov	bx, [bp+4]     	; BX = address of result
	mov	[bx], dx       	; return result to BASIC
	ret	6

modulo	endp
code	ends
	end


LOADING THE ASSEMBLY LANGUAGE MODULE

     In  order  to  call the assembly  language  module,   it  is 
necessary to know the address that  it is located at.   The BLOAD 
statement allows  you to load the module at any physical  address 
desired.  However,  to  use the BLOAD statement to load a module,  
you must first  create the disk file  containing the module  with 
the  Microsoft linker and debugger and  the BSAVE statement,   as 
follows:

     1)   After assembling your module to create the object file, 
          use  the  linker  to create the  .EXE  file.   Use  the 
          /HIGH  switch  when linking so  that the  module   will 
          load in high address memory.

     2)   Use the debugger to load the  .EXE file produced in step 1.

     3)   Display  the register values  (with the R  command)  to 
          determine  where the subroutine was loaded.  Write down 
          the  values contained in the  CS:IP register  pair  and  
          the CX register. The  CS:IP register pair  contains the 
          starting  address of the subroutine and the CX register 
          contains its length.

     4)   Load  and   execute  GWBASIC  from  DEBUG   with   this 
          sequence of commands: 
			NGWBASIC
			L
			N
			G

          Note  that  your   assembly language  module  is  still 
          loaded in high address memory.

     5)   Set the segment value in GWBASIC with a DEF SEG statement:

	   DEF SEG = 

     6)   Save the module with a BSAVE statement:

	   BSAVE "module_name", , 

     The assembly language  subroutine is now ready  to be called  
from your GWBASIC program.  The following statements are required 
in your GWBASIC program before the subroutine can be called:

		DEF SEG = 
		BLOAD "module_name", 
		SUBROUTINE = 

The subroutine can then be called with statements of the form:

		CALL SUBROUTINE(PARAMETER1, PARAMETER2, ...)
11.4 Interfacing COBOL with Assembly Language

Occasionally, you may wish to write certain subroutines in assembly language instead of Cobol because of speed, size, or other constraints. This section explains how to combine Cobol programs with assembly language modules so that parameters are passed correctly between the Cobol program and the assembly language subroutine.

To understand this section fully, the reader should have a basic knowledge of the following:

  1. The Cobol Compiler
  2. The Microsoft MACRO Assembler
  3. The Microsoft linker
  4. The register structure of the 8086/8088
11.4.1 Calling Assembly Language Subroutines

Assembly language subroutines can be invoked from Cobol using the CALL statement with the assembly language module name as a literal. Parameters can be passed to the assembly language routine with the addition of the USING clause. The CALL statement pushes the offset addresses of any parameters on the stack before it transfers execution to the subroutine. The examples later in this discussion will fully illustrate the calling procedure.

The assembly language subroutine must follow some simple rules in order to work correctly.

  1. It must be declared FAR.
  2. It must be declared PUBLIC.
  3. Segment registers DS and ES along with register BP must be restored to their entry values before returning to Cobol.
  4. The general purpose registers (AX, BX, CX, DX, SI, and DI) can have any value when returning to Cobol.
  5. The assembly language routine must perform a RET (where n = 2 times the number of parameters) to restore the stack pointer to its proper value.
  6. Values can be returned to Cobol by passing a parameter to the assembly language subroutine that the result be returned in.
11.4.2 COBOL Data Types

In order to manipulate data passed to an assembly language subroutine, it is necessary to understand how the various data types are represented in memory. When a subroutine is called, Cobol will pass the address of one of the following data representations.

	1) Computational-0
		Also known as a  binary item, uses  the  base 2
		system to  represent an  integer  in the  range
		-32768 to 32767.  It  occupies one 16-bit word,
		with  the leftmost bit  reserved for the opera-
		tional sign. 

		It should also be noted that  Cobol  represents 
		all data types, except Index, internally in re-
		verse  order. For example, if you have the fol-
		lowing Cobol declaration :
		
		77	EXAMPLE1	PIC 99 COMP-0 VALUE 50.
		
		It would be represented internally, in hex, as:

			    low byte   high byte	
		   	       00         32
			
		instead of :

			    low byte   high byte
			       32         00

	2) Computational-3
		Also  known  as  an  internal  decimal item, is 
		stored  internally is binary-coded decimal for-
		mat.  A Computational-3	data item, defined by n
		9's  in  its  PICTURE,  occupies 1/2 of (n + 2) 
		bytes of memory. All bytes except the rightmost
		contain a  pair of  digits, and  each  digit is 
		represented by the binary equivalent of a valid
		digit value from 0 to 9.  The item's low  order
		digit and the operational sign are found in the
		rightmost byte.  The compiler considers a Comp-
		utational-3 item to  have an  arithmetic  sign, 
		even if the original PICTURE  lacked an S char-
		acter.	The  operational sign, contained in the
		rightmost byte, is  hexadecimal F for  positive
		numbers and hexadecimal D for negative numbers.

	3) External Decimal
		An external data item is an item in which one 
                byte  is  employed to  represent one  numeric 
                digit.  An  unsigned  external data  item  is 
                reprepresented   internally   as  its   ASCII 
                equivalent.  A signed external data item   is 
                represented    internally   as   its    ASCII 
                equivalent,  EXCEPT   the low  order byte  on 
                negative  items,  which have  the   following 
                rules :

		   If the low 	      |	 The value 
		    order digit is :  |	  internally is :
		  --------------------+------------------- 
			  0	      |	     7D hex
			  1	      |	     4A hex
			  2	      |	     4B hex
			  3	      |	     4C hex
			  4	      |	     4D hex
			  5	      |      4E hex
			  6	      |      4F hex
			  7	      |      50 hex
			  8	      |      51 hex
			  9	      |      52 hex

		For example, if you have the following Cobol
		declaration :

		77	EXAMPLE2	PIC S999 VALUE -121.
		
		The value internally would be :

				   31 32 4A

	4) Alphanumeric
		An  alphanumeric  data  item  is  represented
		internally as its ASCII equivalent.

	5) Alphabetic
		An  alphabetic  data  item  is  represented in-
		ternally as its ASCII equivalent.

	6) Index
		An index data item is represented internally as
		a binary word.
11.4.3 Passing Parameters

Cobol passes all subroutine parameters by reference. In a CALL statement, the offset of each parameter's address is pushed onto the stack in the same order that the parameters are listed in the USING clause. Upon entry to the assembly language subroutine, the stack will be arranged as follows:

                SP+8 -> |      etc.       |
			+-----------------+
		SP+6 -> |    Offset of    |
			|  2nd parameter  |
			+-----------------+   |
		SP+4 ->	|    Offset of    |   |
			|  1st parameter  |   | Stack grows down
			+-----------------+   |
		SP --->	| return address  |   |
			|    (4 bytes)    |   v
			+-----------------+

The parameters can then be referenced by using the BP register to get their address off of the stack. The following example shows how to do this.

11.4.4 Example

This example shows how to link an assembly language subroutine with a Cobol program. The assembly language routine performs modulo arithmetic on two Computational-0 variables, returning the remainder that results when the first variable is divided by the second.

Cobol program with CALL/USING statement
	
        IDENTIFICATION DIVISION.
        PROGRAM-ID. EXAMPLE.
        ENVIRONMENT DIVISION.
        DATA DIVISION.
        WORKING-STORAGE SECTION.
        77  PARM1   PIC  99  COMP-0 VALUE 50.
        77  PARM2   PIC  99  COMP-0 VALUE 11.
        77  PARM3   PIC  99  COMP-0 VALUE 0.
        77  PAR1    PIC  99.
        77  PAR2    PIC  99.
        77  PAR3    PIC  99.
        PROCEDURE DIVISION.
        MAIN.
            CALL "MODULO" USING PARM1, PARM2, PARM3
            MOVE PARM1 TO PAR1.
            MOVE PARM2 TO PAR2.
            MOVE PARM3 TO PAR3.
            DISPLAY PAR1 "  MOD  " PAR2 "  =  " PAR3.
            STOP "Hit  to return to system"


Assembly language module for use with CALL/USING statement

name	modulo

code	segment	public 'code'
assume	cs:code, ds:code

public	modulo

; This module is called from Cobol with 3 parameters,
; using the CALL statement. It divides the first parameter by
; the second and returns the remainder in the third.
;
; Stack structure after saving BP
;
  parm1	equ	10			; pointer to dividend
  parm2	equ	 8			; pointer to divisor
  parm3 equ	 6			; pointer to return variable
; off	equ	 4			; return offset
; seg	equ	 2			; return segment
; bp	equ	 0			; save BP
modulo	proc	far

	push	bp			; Save BP
	mov	bp, sp			; BP used to get parameters
	mov	bx, [bp+parm1]		; BX = pointer to dividend
	mov	ax, [bx]		; AX = value of dividend
	mov	bx, [bp+parm2]		; BX = pointer to divisor
	mov	cx, [bx]		; CX = value of divisor
	mov	dx, 0			; DX:AX = dividend
	idiv	cx			; AX = quotient, DX = remainder
	mov	bx, [bp+parm3]		; BX = address of result
	mov	[bx], dx		; return result to COBOL
	pop	bp			; Restore BP
	ret	6

modulo	endp
code	ends
	end

After compiling and assembling the various modules, use the Microsoft linker to create the executable program. The Cobol object modules should be listed before the names of the assembly language object modules. After your modules have been linked, your program is ready to run.

11.5 Interfacing Pascal Programs with Assembly Language Subroutines

Sometimes it may be desirable to write certain procedures or functions in assembly language instead of Pascal, because of speed, size, or other constraints. This section explains how to successfully link Pascal programs with assembly language subroutines so that parameters and function return values are passed correctly between the Pascal and assembly code modules. The terms subroutine and procedure used interchangably to mean either a Pascal procedure or function, while the term function applies specifically to functions. In order to best understand this section, knowledge of the following is desirable:

  1. The Pascal compiler
  2. The assembler
  3. The linker
  4. The register structure of the 8086/8088
11.5.1 Calling External Subroutines

To call an assembly language subroutine from a Pascal program, it is necessary to declare the assembly language subroutine as an external procedure or function. The format of an external procedure declaration in Pascal is exactly like that of a standard Pascal procedure declaration with the addition of the external directive and no procedure body. See the section on directives in the 'Reference Manual for MS-Pascal' for detailed information.

The assembly language subroutine must have the far attribute in its 'proc' statement, since Pascal assumes that all external procedures are far. Pascal also requires assembly language rout- ines to have identical class names to Pascal routine class names. The acceptable class names can be found in the file ENTX6L.ASM, found on the Pascal compiler diskette. For example, code of an assembly language routine should have the class name 'CODE' and data used by an assembly language routine should have the class name 'DATA'. In addition, the name of the subroutine must be declared public in the assembler code. This must be the same name that is declared as an external procedure in Pascal. The two examples later in this chapter illustrate the relationship between the external declaration in Pascal and the public declar- ation in assembler.

The user written subroutine can modify the AX, BX, CX, DX, DI, SI, and ES registers. The SP, BP and DS registers can also be modified, but their values must be restored before returning to Pascal. The SS register should NEVER be modified. The user should also pop all parameters off of the stack by using a ret N state- ment, where N equals the number of bytes on the stack used by the parameters.

11.5.2 Passing Parameters

When a procedure is called in Pascal, either the address or the value of any parameters are passed to the procedure on the stack. The address of a parameter is pushed on the stack when the formal parameter is declared as a VAR, VARS, CONST, or CONSTS type. The value of a parameter is passed on the stack if the for- mal parameter in the procedure declaration does not have one of these types. The following example code should clarify this dis- tinction.

--------------------------- Figure 1 ----------------------------

		PROGRAM Sample (INPUT,OUTPUT);

		VAR
		  alpha, delta : INTEGER;

		PROCEDURE Gamma(VARS x:INTEGER; z:INTEGER);
		  BEGIN
  		    { body of procedure here }
		  END;

		BEGIN  { main program body }
		  alpha := 10;
		  delta := 21;
		  Gamma(alpha, delta);
		END.

------------------------ End of Figure 1 ------------------------

The declaration for procedure gamma defines gamma as having two parameters: x and y. The first parameter is passed by address (this is also known as 'passing by reference'), while the second parameter is passed by value. When gamma is actually called, the address of alpha is pushed onto the stack, while delta is passed by having its value (21 in this case) pushed onto the stack.

In the 8086, addresses can be one of two types: near or far. A near address consists of the sixteen bit offset address in the current segment, while a far address consists of a full twenty bit address made up of specified segment and offset values. In Pascal, the user has the ability to specify which form of addres- sing to use for parameters (in relation to the data segment) when they are passed by reference. In the declaration of a procedure or function, near addresses are specified by declaring formal parameters as VAR or CONST, while far addresses are specified by declaring parameters as VARS or CONSTS. In the above example, the address of alpha is passed to the procedure by pushing the seg- ment address (i.e. the value of the DS register) of alpha on the stack, followed by its offset address. The segment address of a parameter is the value contained in the DS register. If the for- mal parameter x in the procedure declaration had been declared as VAR instead of VARS, then only the offset address of alpha would have been pushed onto the stack. Note that the declared type of the variable does not affect the way that the parameter address is passed.

When a parameter is passed by value, the actual value of the variable at that time is pushed onto the stack. For variables of large size (such as large arrays, records, etc.), this causes the stack to grow quickly and potentially overflow. It is usually preferable to pass structured variables by reference to prevent this occurrence.

11.5.3 Pascal Data Types
  1. Byte - simple 1 byte unsigned value.
  2. Char - 1 byte ASCII character representation.
  3. Boolean - 1 byte value. FALSE is represented with a 0 in the low order bit (bit 0). TRUE is represented with a 1 in bit 0. Bits 1-7 should be 0.
  4. Word - Normally, a 2 byte unsigned value. However, Word subranges in the range 0..255 are represented by a one byte unsigned value.
  5. Integer - Normally, a 2 byte two's complement number. Subranges in the range -128..127 are represented using one byte only.
  6. Integer4 - 4 byte two's complement number.
  7. Real4 - 4 byte IEEE standard real format. The most sig- nificant bit is the sign bit, followed by an 8 bit expo- nent with a bias of 127. This is followed by a 23 bit mantissa. The mantissa has a 'hidden' most significant bit that is always a 1, so the mantissa is actually a 24 bit quantity that represents a number greater than or equal to 1.0 but is less than 2.0.
  8. Real8 - 8 byte IEEE format. The sign bit is followed by a 11 bit exponent with bias of 1023, followed by a 52 bit mantissa. As in Real4, the mantissa has a 'hidden' most significant bit that is always a 1.
  9. arrays and records - the internal format of arrays and records is composed of the internal forms of the compo- nents, in the same order as in the array or record dec- laration.
  10. super arrays - like arrays, super arrays are composed of the form of its declared component type. In a procedure declaration, a super array type can be defined as an address parameter. When the procedure is called, the actual parameter is substituted for the formal paramater and the size of the super array is pushed onto the stack before its address is pushed. This allows the procedure to be more general, as it can operate on arrays of dif- ferent lengths. If the formal parameter is a dimensioned super array type, then the length of the super array is not passed on the stack. It is important to note that an undimensioned super array type cannot be passed by value and cannot be a function return type.
  11. string - the string type is a predeclared super array, and as such has the same restrictions as a super array. The string itself is just a sequential array of type Char, with each element represented by its ASCII value.
  12. lstring - the lstring type is also a predeclared super array. It is exactly like the string type, except that the first byte of the array contains the length of the string. This allows lstrings to have variable lengths but limits their maximum length to 255 characters.
  13. Enumerated types - If there are 256 or fewer values in the enumerated type, 1 byte is required. If the enumer- ated type has more than 256 possible values, two bytes are required. The values are numbered from 0 to n-1, in the order declared (i.e. the value returned by the ord function on the type).
  14. Address types - the ADR type is represented by a 2 byte value containing the offset address of the value. The ADS type is a 4 byte value containing both the segment and offset addresses of the value.
11.5.4 Returned Values

In Pascal, the returned value of a function is passed back to the calling module in specific registers. For small data types the actual value is returned, while for large data types the ad- dress of the result is returned. It is necessary for user written functions to follow the Pascal conventions.

Return values declared as one of the simple types Boolean, Byte, Char, Integer2, Word, or Integer4, or as one of the address types Adr or Ads, or an Enumerated type are returned by value in specified registers. The value of single byte types (Boolean, Byte, or Char) should be returned in the AL register. The value of single word types (Adr, Integer2, or Word) should be returned in the AX register. An enumerated type is returned in either the AL or AX register, depending on whether it has more or less than 256 declared elements. The value of double word types (Ads and Integer4) should be returned in the DX and AX registers, with the most significant word in DX.

Function return values of any other type (such as Real, Rec- ord, and Super Array based types) are returned by address. This address should be returned in the AX register. When a function with one of these types is called, a temporary variable is allo- cated by Pascal. The near address of this temporary variable is pushed onto the stack just before the return address (i.e. after all the parameters have been pushed). The user function should put the value to be returned in this variable, and return the address in the AX register. The second example below shows how this mechanism works.

11.5.5 Example 1 - Sum function

This example (Figure 2) shows a user written assembly lan- guage routine that performs a sum function. The routine requires two input parameters - an integer super array of type vector and an integer value that contains the number of valid elements in the array. The function adds all of the valid elements in the array (which should be in array positions 1 to count) and returns the total of these values as an integer in the AX register.

It is important to note that the function has been declared external in the Pascal program and public in the assembler sub- routine. It should also be noted that the return statement in Pascal pops six bytes off of the stack, even though there are only two parameters. This is due to the fact that one of the for- mal parameters is an undimensioned super array type, so that when the function is called, the size of the actual super array para- meter is pushed onto the stack just before its address. The state of the stack just after the call to sum is shown here:

	High addresses:
			+-------------+
		      8	|  adr of cnt |
			+-------------+	   |
		      6	|  size of v  |    | Stack
			+-------------+	   | grows
	              4	|   adr of v  |    | down
			+-------------+    |
	                | return  adr |	   v
	      SP -->  2	|  (4 bytes)  |
			+-------------+
	Low addresses:

In the assembly language function below, the value of the BP register is pushed on the stack, and then BP is used to access the values on the stack. Note that the offsets from BP are two greater than the offsets in the above diagram, because the value of BP has been saved on the stack, increasing the stack size by two bytes.

--------------------------- Figure 2 ----------------------------

	PASCAL PROGRAM:

	PROGRAM Sumtest (input, output);

	TYPE
	  VECTOR = SUPER ARRAY [1..*] OF INTEGER;

	VAR
	  scores : VECTOR(10);
	  total  : INTEGER;
	  count  : INTEGER;

	FUNCTION Sum (cnt:INTEGER; VAR v:VECTOR) : INTEGER; EXTERNAL;
	  (* sum must be declared as an external function *)

	BEGIN

	  (* User code here sets values for count and scores array *)

	  total := Sum(count, scores);
	  WRITELN('The total is: ',total);
	END.

	ASSEMBLY LANGUAGE FUNCTION

	name sum

	; This subroutine is called from PASCAL.
	;
	; This subroutine requires two parameters: 
	; 1 - an integer containing the number of elements to sum
	; 2 - a super array of integers containing the values to sum
        ;     (in elements 1..n)

	cgroup  group code
	dgroup  group data

	assume  cs:cgroup, ds:dgroup
	code	segment	public 'code'	; segment uses class name
					;  of 'code'

	public	sum			; sum must be 'public'

	sum	proc	far		; sum must be 'far'
		push	bp
		mov	bp, sp
		mov	ax, 0		; initialize sum
		mov	dx, ax		; initialize array index
		mov	bx, [bp+6]	; bx <- adr of array
		mov	cx, [bp+10]	; cx <- length of array
	sumloop:
		mov	di, dx		; di <- index into array
		shl	di, 1		; convert to word index
		add	ax, [bx+di]	; add in next value of array
		inc	dx		; increment array index
		loop	sumloop

		pop	bp
		ret	6		; return - sum in ax
	sum	endp

	code	ends
		end
	
----------------------- End of Figure 2 -------------------------
11.5.6 Example 2: String concatenation function

This example illustrates how to return a value of a struc- tured type to a Pascal program. The Pascal program passes two strings to the subroutine, which returns the string that results from cancatenating the second string to the first.

Note that the function requires two lstrings (declared as type shortstring) as parameters, and returns a third lstring (of type longstring) to the calling program. Since the returned value is a structured type, Pascal passes the address of a temporary variable of this type on the stack immediately before calling the user function. The assembly language routine uses this temporary variable to build the concatenated string, and then returns this address to the caller in the AX register. Just after the call, the stack is structured as follows:

	High addresses:
			+-------------+
		      8	|  adr of s1  |
			+-------------+
		      6	|  adr of s2  |	   | 
			+-------------+	   | Stack 
	              4	| adr of temp |    | grows
			|  variable   |    | down
			+-------------+    |
	                | return  adr |	   v
	     SP -->   2	|  (4 bytes)  |
			+-------------+
	Low addresses:

As in the first example, the BP register is used to access parameters passed on the stack. The temporary variable passed on the stack is used to build the new string, and then the address is returned in the AX register to the Pascal program.

It is important that the programmer of the assembler func- tion understands the data structures of any types being used by his routine. In this example, the structure of the lstring type needed to be known in order for the concatenated string to be correctly built and correctly interpreted by the Pascal program when it is returned.

--------------------------- Figure 3 ----------------------------


	PROGRAM Myname (INPUT, OUTPUT);

	TYPE
	  SHORTSTRING = LSTRING(15);
	  LONGSTRING  = LSTRING(30);

	VAR
	  first_name : SHORTSTRING;
	  last_name  : SHORTSTRING;
	  full_name  : LONGSTRING;

	FUNCTION Concat(VAR s1,s2:SHORTSTRING) : LONGSTRING; EXTERNAL;
	  (* Concat must be external *)

	BEGIN
	  first_name := 'Mortimer ';
	  last_name  := 'Freeblekoff';
	  full_name  := concat(first_name, last_name);
	  writeln('My first name is ',first_name);
	  writeln('My last name is ',last_name);
	  writeln('My full name is ',full_name);
	END.


	ASSEMBLY LANGUAGE CONCATENATION ROUTINE:

	name	concat

	cgroup group code
	assume cs:cgroup

	code	segment public 'code'	; segment uses class name
					;  of 'code'

	public	concat			; concat must be 'public'

	concat	proc	far		' concat must be 'far'
		push	bp
		mov	bp, sp
		push	ds
		pop	es	  	; set up ES for string moves
		cld

		mov	di, [bp+6]	; di <- address of result string
		mov	bx, di
		inc	di	  	; advance to string field
		mov	si,[bp+10]	; si <- address of 1st string
		mov	cl, [si]  	; cl <- length of 1st string
		mov	al, cl	  	
		mov	ch, 0
		inc	si
		rep	movsb		; mov 1st string to result

		mov	si, [bp+8]	; si <- address of 2nd string
		mov	cl, [si]  	; cl <- length of 2nd string
		add	al, cl	  	; al <- length of result string
		mov	ch, 0
		inc	si
		rep	movsb	  	; add 2nd string to result
					
		mov	[bx], al  	; mov length to result string
		mov	ax, bx	  	; return address of resultant
		pop	bp	  	;   string to Pascal in AX.
		ret	6

	concat	endp
	code	ends
		end

----------------------- End of Figure 3 -------------------------
11.5.7 Linking

After running the Pascal compiler and the macro assembler, it is necessary to link the object modules produced to create the executable .EXE file. The order of the object modules given to the linker is important - The Pascal program module must be the first module in the list of objects, with the assembly objects last. The order of the assembly objects does not matter, but they must come after all Pascal modules. Thus, the proper link command for the second example above is

	A>link myname+concat

Refer to the "User's Guide for MS-DOS Utility Software" for more information on the Microsoft Linker.







Last revision 29/01/2005