All serious computer programmers use more than one computer language. Different languages have their own features that cause us to favor one language over another, depending, of course, on the task at hand. Sometimes, to complete a programming task and to produce software that performs according to a specification, the use of one particular language is mandatory. Yet other routines, necessary for the current task, are available but are written in another language.
For this and a number of other reasons, it would be desirable for the programmer to be able to merge sets of object modules (.obj) produced by different compilers into a single load module (.exe). This is possible only if the compilers (the programs that translate the source modules [.for, .c, .cpp, etc.] into object modules) write their object modules in a mutually compatible format, or in a set of formats that can all be read correctly by the linker (the program that reads the sets of object modules and produces the executable load module).
Compilers from different vendors are seldom mutually compatible.
Unfortunately, this incompatibility goes much further than just the different object file formats. For compilers to be truly compatible, subroutine calls must be made in precisely the same way. This is just one example.
This author uses Fortran, C, and C++ languages on a daily basis. When a source module is created in one language, it seems foolish to have to translate (or transliterate) it into another language, just for compatibility reasons. So, we invariably end up with sets of routines written in different languages. We want to use all of these sets of routines and we want to use them together.
Also, in using multiple languages, we do not want to change the way we write in any of the languages.
Achieving this compatibility is quite a task. For this to be possible, we must choose the compilers, linker(s), library generators, and operating system(s) carefully. Then, a set of programs, called compiler filters, can be written to rewrite the output of each compiler to achieve this mutual compatibility. This document describes what has been done to make our compilers "play together".
In order for the remainder of this document to be understood, it is assumed that the reader is totally familiar with the use of X86 assembly language. It is not the intention that this document give this background information.
The compiler filter software described in this document was designed and implemented by John H. Letcher, Professor of Computer Science at the University of Tulsa and President of Synergistic Consultants Incorporated.
On systems running Microsoft Windows NT (3.51, 4.0 and 2000), this author has chosen to use the Fortran-77 compiler from Microway. For the C and C++ languages, the Visual C/C++ 5.0 compiler has been chosen. The assembler chosen is the Microsoft Macro Assembler, MASM 6.0. The linker and library generator are supplied by Microsoft.
For a variety of reasons, it is also desirable to be able to translate Fortran source modules directly into the C programming language. For this purpose, this author uses the Fortran to C translation program Promula.Fortran (PFC). This generates a homogeneous C environment for porting software to Linux (Unix, Solaris, etc.). Yet, we may still write in Fortran, if we wish.
To achieve mutual compatibility, it was decided to instruct each compiler to output assembly language. Then, a single assembler could be used to generate the object modules. Since only one program produces the objects, there is clearly no compatibility problem with regard to object file formats. However, serious problems still remain. The problem here is that the assembly code generated by one compiler will not "play together" with the assembly code produced by another compiler.
Now, our task is to produce a set of programs that each read an assembly language source module produced by one compiler and translate the module into assembly language conforming to a fixed (common) set of specifications. Then, each module will be compatible with every other. Furthermore, the linker will not even know which compiler has produced the module.
A number of choices must be made in the design of the compiler filters. Of great importance is the choice of how subroutines are called. That is, should the stack be used to pass subroutine argument locations? All of the really fast Fortran compilers of thirty years ago (e.g., from Cray) did not use stacks. These notions were abandoned, not to make compilers easier to write (although using a stack does make them easier to write), but because the older techniques did not allow recursion. Since C flatly requires recursion (which is good), C passes subroutine arguments by pushing them onto a stack. But now we have to ask: should the arguments be pushed in order from right to left or from left to right? Also, does the called routine or the calling routine pop the arguments off the stack? Of the four possible answers to these two questions, Microsoft has used almost all of the options.
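The remark about recursion can be made concrete with a small C sketch. The names fact_static and fact_stack are hypothetical and do not come from any of the compilers discussed here. The first routine imitates a compiler that keeps a routine's argument in one fixed memory location; the second lets each call keep its own copy on the stack, as C does.

```c
/* Sketch only: imitates fixed (non-stack) argument storage to show why
   it breaks recursion. The names are illustrative assumptions. */

static int saved_n;                  /* the single fixed location for the argument */

int fact_static(int n)
{
    int sub;

    saved_n = n;                     /* "pass" the argument through fixed storage */
    if (saved_n <= 1)
        return 1;
    sub = fact_static(saved_n - 1);  /* the inner call overwrites saved_n */
    return saved_n * sub;            /* saved_n no longer holds this call's n */
}

int fact_stack(int n)                /* every call has its own n on the stack */
{
    if (n <= 1)
        return 1;
    return n * fact_stack(n - 1);
}
```

For n = 4, fact_stack returns 24, while fact_static returns 1, because every level multiplies by the value left behind by the innermost call.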
To simplify life, it was decided to adopt, as the common standard, the format and structure used by the Visual C/C++ 5.0 compiler when it writes assembly language. This saved writing one compiler filter.
The filter program, NDPPREP.EXE, translates the assembly language output of the Microway Fortran compiler into a compatible assembly language format.
The filter program, PFCPREP.EXE, translates the C language output of the Promula.Fortran System into a format compatible with Microsoft C language conventions.
Once routines are written in Fortran, C or C++, we would like to be able to call these routines directly from Visual Basic and Powerbuilder. A third filter (invoked as DLLPREP in the batch files listed later in this document) is used in producing dynamic link libraries (.dll) and source code to be included in the Visual Basic and Powerbuilder source modules. Then the Fortran or C routines may be called directly by Visual Basic and Powerbuilder programs.
Consider the Fortran source module:
      SUBROUTINE MAINSUB
      COMMON/LOOK/I,J,K
      I=5
      J=7
      CALL MYSUB(I,J,K)
      RETURN
      END
      SUBROUTINE MYSUB(I,J,K)
      K=I+J
      RETURN
      END
The assembly language module produced by the Microway Fortran compiler from the above Fortran source is:
; NDP Version 4.6.0 -- 03/18/95
; fcom -OLMA -X22 -X37 -X171 -X210 -X214 -X215 -X226 -X244 -X247 -X266 -X325
; -X334 -X335 -X357 -X358 -X382 -X474 -X592 -X682 -X683 -X899 -X908
; -X925 -X928 -X929 -X939 -X1002 -X1006 -X1010 -X1011 -X1016
; name mysub.for
.386
.387
assume cs:codeseg
assume ds:dataseg
codeseg segment para use32 public 'code'
codeseg ends
dataseg segment para use32 public 'data'
extrn __vms_fortran:dword
dataseg ends
codeseg segment use32 para public 'code'
_mainsub_ proc near
; .bf
        mov dword ptr ds:_look_,5 ; 00000000
        nop
        mov dword ptr ds:_look_+4,7 ; 0000000b
        lea eax,dword ptr ds:_look_+8 ; 00000015
        push eax ; 0000001b
        lea eax,dword ptr ds:_look_+4 ; 0000001c
        push eax ; 00000022
        push offset ds:_look_ ; 00000023
        add cl,0
        call _mysub_ ; 00000028
        add esp,12 ; 00000030
; .ef
        ret ; 00000033
_mainsub_ endp
codeseg ends
dataseg segment para use32 public 'data'
dataseg ends
codeseg segment use32 para public 'code'
_mysub_ proc near
        push ebx ; 00000034
; .bf
        mov eax,dword ptr [esp]+8 ; 00000035
        mov ecx,dword ptr [esp]+12 ; 00000039
        mov ebx,dword ptr [esp]+16 ; 0000003d
        mov eax,dword ptr [eax] ; 00000041
        add eax,dword ptr [ecx] ; 00000043
        mov dword ptr [ebx],eax ; 00000045
; .ef
        pop ebx ; 00000047
        ret ; 00000048
_mysub_ endp
codeseg ends
dataseg segment para use32 public 'data'
;_i eax local
;_j ecx local
;_k ebx local
;_i [esp]+8 local
;_j [esp]+12 local
;_k [esp]+16 local
dataseg ends
codeseg segment use32 para public 'code'
codeseg ends
dataseg segment para use32 public 'data'
public _mysub_
@12 struct 1t
x db 12 dup (?)
@12 ends
externdef _look_:@12
public _mainsub_
dataseg ends
codeseg segment use32 para public 'code'
codeseg ends
end
The filter program NDPPREP.EXE reads the above file and produces this:
.386
.387
assume cs:_TEXT
assume ds:_DATA
_TEXT segment para use32 public 'code'

_mainsub proc near

        mov dword ptr ds:_look,5
        nop
        mov dword ptr ds:_look+4,7
        lea eax,dword ptr ds:_look+8
        push eax
        lea eax,dword ptr ds:_look+4
        push eax
        push offset ds:_look
        add cl,0
        call _mysub
        add esp,12

        ret
_mainsub endp

_mysub proc near
        push ebx

        mov eax,dword ptr [esp]+8
        mov ecx,dword ptr [esp]+12
        mov ebx,dword ptr [esp]+16
        mov eax,dword ptr [eax]
        add eax,dword ptr [ecx]
        mov dword ptr [ebx],eax

        pop ebx
        ret
_mysub endp
_TEXT ends

_DATA segment para use32 public 'data'
public _mysub
@12 struct 1t
x db 12 dup (?)
@12 ends
public _mainsub

COMM _look :BYTE:12

_DATA ends
end
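Please notice all of the differences between the two assembly language source modules. The trailing underscores on the symbol names are gone (_mysub_ becomes _mysub). The segments codeseg and dataseg are renamed _TEXT and _DATA, and the repeated empty segment blocks are merged. The comments and the reference to __vms_fortran are removed, and the common block _look is declared with a COMM directive in place of externdef.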
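Two of these rewrites are simple enough to sketch in C. The routine below is not the NDPPREP.EXE source, which is not reproduced in this document; the name filter_line and its interface are assumptions made only for illustration. It drops the trailing comment from one line of assembly text and shortens names of the form _name_ to _name.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the real NDPPREP: a sketch of two rewrites
   visible in the listings (comment removal and _name_ -> _name). */
static int is_ident_char(int c)
{
    return isalnum(c) || c == '_' || c == '@';
}

void filter_line(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0)
        return;
    /* copy up to the comment character ';' or the end of the line */
    while (*in != '\0' && *in != ';' && n + 1 < outsz) {
        if (is_ident_char((unsigned char)*in)) {
            const char *start = in;
            size_t len;

            while (is_ident_char((unsigned char)*in))
                in++;
            len = (size_t)(in - start);
            /* drop the trailing underscore of an NDP-style name */
            if (len > 2 && start[0] == '_' && start[len - 1] == '_')
                len--;
            if (len > outsz - 1 - n)
                len = outsz - 1 - n;      /* never overrun the output buffer */
            memcpy(out + n, start, len);
            n += len;
        } else {
            out[n++] = *in++;
        }
    }
    while (n > 0 && isspace((unsigned char)out[n - 1]))
        n--;                              /* trim blanks left before the comment */
    out[n] = '\0';
}
```

Given the line "call _mysub_ ; 00000028", this sketch writes "call _mysub" into the output buffer. A real filter must also rename and merge the segments and rewrite the data declarations, which this sketch does not attempt.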
The assembly language file starts with the directives .386 and .387 to tell the assembler that we wish to use the full 32-bit X86 instruction set, including the floating-point coprocessor instructions. Two segments are defined, a code segment and a data segment. The code segment is not supposed to be modified during execution. A stack segment is used, but it need not be named within this module.
When a subroutine is called, the subroutine argument locations are pushed onto the stack in right-to-left order: in the listing above, the location of K is pushed first and the location of I last. Then the subroutine is entered by means of the call instruction. Immediately following the call instruction, the stack pointer ESP is adjusted to remove the subroutine argument location pointers that had been pushed onto the stack before the call. In other words, the calling program removes the argument locations from the stack.
A convention has been adopted that the value in EBX be pushed onto the stack immediately on entry and popped off just before the return, ret. This is used when the code is not terribly complicated.
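Because it is the locations of the arguments, not their values, that are pushed, a C routine sees each Fortran argument as a pointer. The following is a minimal sketch of the two example routines rewritten in C. It assumes that a Fortran INTEGER corresponds to a C int. The struct name look_block is a hypothetical name chosen for the illustration; the other names mirror the symbols in the listing.

```c
/* Sketch: C equivalents of the example routines, assuming INTEGER == int. */

struct look_block { int i, j, k; };   /* mirrors COMMON/LOOK/I,J,K */
struct look_block look;

void mysub(int *i, int *j, int *k)    /* arguments arrive as locations */
{
    *k = *i + *j;
}

void mainsub(void)
{
    look.i = 5;
    look.j = 7;
    mysub(&look.i, &look.j, &look.k); /* pass the three locations */
}
```

After a call to mainsub, look.k holds 12.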
At the time of execution of the first instruction after the push ebx instruction, the stack looks like this:
Offset  Value
0       The value in EBX
4       Return Address
8       Subroutine Argument #1 Location
12      Subroutine Argument #2 Location
16      Subroutine Argument #3 Location

If, however, the code needs the full use of the registers, normally as pointers, the push ebx is replaced with

        mov ebp,esp
        push esi
        push edi
        push ebx

Upon exit from the routine, the stack must be returned to its original state.
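The stack picture for the simple push ebx case can be checked with a toy model in C. This is only a sketch: the array slots stands in for the hardware stack, each element is one 4-byte entry, and all of the names (slots, top, peek, enter_mysub, leave_mysub) are hypothetical. The argument values passed in the usage are arbitrary stand-ins for addresses.

```c
#include <stdint.h>

enum { SLOTS = 16 };
static uint32_t slots[SLOTS];     /* toy stack, one element per 4 bytes */
static int top = SLOTS;           /* plays the role of ESP; grows downward */

static void push(uint32_t v) { slots[--top] = v; }
static uint32_t pop(void)    { return slots[top++]; }

/* value found at [esp]+byte_offset */
uint32_t peek(int byte_offset) { return slots[top + byte_offset / 4]; }

void enter_mysub(uint32_t arg1, uint32_t arg2, uint32_t arg3,
                 uint32_t ret_addr, uint32_t ebx)
{
    push(arg3);                   /* caller pushes the last argument first */
    push(arg2);
    push(arg1);
    push(ret_addr);               /* done by the call instruction */
    push(ebx);                    /* push ebx on entry */
}

int leave_mysub(void)
{
    (void)pop();                  /* pop ebx */
    (void)pop();                  /* ret removes the return address */
    top += 3;                     /* caller: add esp,12 */
    return top == SLOTS;          /* 1 if the stack is back where it started */
}
```

After enter_mysub(0x100, 0x104, 0x108, 0x28, 0xEB), peek(0) gives 0xEB, peek(4) gives 0x28, and peek(8), peek(12) and peek(16) give the three argument locations in order, matching the table. leave_mysub then returns 1.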
Listed below is a set of batch files, each to be run from the NT Command Prompt. Notice that the files are named NT or ND followed immediately by A (for assembler), F (for Fortran), C (for C) or CPP (for C++). Then we have the letters TO followed by the designation of the purpose of the batch file. That is, NTATOOBJ converts an assembly file into an object module.
NTNEWLIB.BAT (create a new Library using NT.obj and PLIB.obj)
    DEL FLIBSCI.LIB
    LIB /OUT:FLIBSCI.LIB NT.OBJ
    LIB FLIBSCI.LIB PLIB.OBJ

NTLINK.BAT (link mainline program with Library producing %.EXE)
    LINK %1.OBJ FLIBSCI.LIB LIBC.LIB /NOLOGO

NTATOOBJ.BAT (assemble %.asm to %.obj using MASM 6.11)
    ML /c %1.ASM /nologo

NTATOLIB.BAT (assemble %.asm into %.obj and place in the Library)
    ML /c %1.ASM /nologo
    LIB FLIBSCI.LIB %1.obj /nologo

NTCTOASM.BAT (compile %.c into %.asm)
    CL -c -G6 /Ox -nologo -Fa%1.ASM %1.c

NTCTOOBJ.BAT (compile %.c into %.obj using the Visual C++ Compiler)
    CL -c -G6 /Ox -nologo -Fo%1.obj %1.c

NTCTOLIB.BAT (compile %.c into %.obj and place in Library)
    CL -c -G6 /Ox -nologo -Fo%1.obj %1.c
    LIB FLIBSCI.LIB %1.obj /nologo

NTCTODLL.BAT (compile %.c into %.obj and create %.dll)
    DLLPREP %1
    CL -c -Gz -G6 /Ox -nologo -Fa%1.ASM %1.c
    ML /c /nologo %1.ASM
    LINK %1.OBJ LIBC.LIB /DEF:%1.DEF /DLL /nologo

NTFTOASM.BAT (translate %.for into %.c then compile %.c into %.asm)
    PFC %1 BO
    PFCPREP %1
    CL -c -G6 /Ox -nologo -Fa%1.ASM %1.c

NTFTOOBJ.BAT (translate %.for into %.c then compile %.c into %.obj)
    PFC %1 BO
    PFCPREP %1
    CL -c -G6 /Ox -nologo -Fo%1.obj %1.c

NTFTOLIB.BAT (translate %.for into %.c, compile into %.obj and place in the Library)
    PFC %1 BO
    PFCPREP %1
    CL -c -G6 /Ox -nologo -Fo%1.obj %1.c
    LIB FLIBSCI.LIB %1.obj /nologo

NTFTODLL.BAT (translate %.for into %.c, compile into %.obj, create %.dll)
    PFC %1 BO
    PFCPREP %1
    DLLPREP %1
    CL -c -Gz -G6 /Ox -nologo -Fa%1.ASM %1.c
    ML /c /nologo %1.ASM
    LINK %1.OBJ LIBC.LIB /DEF:%1.DEF /DLL /nologo

NDFTOASM.BAT (compile %.for into %.asm using the Microway F-77 Compiler)
    SET NDP=.
    SET BIN=.
    SET LIB=.
    SET INCLUDE=.
    MFIW -S -x928 -on %1.FOR
    NDPPREP %1

NDFTOOBJ.BAT (compile %.for into %.obj using the Microway F-77 Compiler)
    SET NDP=.
    SET BIN=.
    SET LIB=.
    SET INCLUDE=.
    MFIW -S -x928 -on %1.FOR
    NDPPREP %1
    ML /c /nologo %1.ASM

NDFTOLIB.BAT (compile %.for into %.obj and place it in the library)
    SET NDP=.
    SET BIN=.
    SET LIB=.
    SET INCLUDE=.
    MFIW -S -x928 -on %1.FOR
    NDPPREP %1
    ML /c /nologo %1.ASM
    LIB FLIBSCI.LIB %1.obj /nologo
One nasty problem remained: how to handle memory allocation. The C language allows three kinds of storage: 1) local to an individual routine, 2) global to this source module only, and 3) global storage, known to all. Local variable values have a nasty habit of going away after leaving the routine, as they are normally placed on the stack.
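The three kinds of C storage can be seen side by side in a short sketch. All of the names below are hypothetical and exist only for the illustration.

```c
/* Sketch of the three kinds of C storage; all names are illustrative. */

int shared_total;                 /* 3) global storage, known to all modules */
static int module_calls;          /* 2) global to this source module only */

int count_local(void)
{
    int n = 0;                    /* 1) local: exists only during this call */
    n++;
    return n;                     /* always 1, the value does not survive */
}

int count_module(void)
{
    module_calls++;               /* keeps its value between calls */
    shared_total = module_calls;  /* other modules can read it via extern */
    return module_calls;
}
```

Calling count_local twice returns 1 both times, since the local value is gone after each return. Calling count_module twice returns 1 and then 2.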
The Fortran language allows for 1a) local variables on the stack and 1b) local variables placed in memory. It also allows for 2 and 3) global storage, which is set up in named blocks of storage (COMMON blocks), accessible by anyone who knows and uses the name. Storage is allocated as an extern block of type char.
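From the C side, such a block can be treated as plain bytes. The sketch below assumes that a Fortran INTEGER occupies 4 bytes, so that the block LOOK holding I, J and K is 12 bytes long, as in the COMM directive shown earlier. The accessor names look_get and look_set are hypothetical. In a mixed-language build the block would be declared extern and defined elsewhere; it is defined here only so that the sketch stands alone.

```c
#include <stdint.h>
#include <string.h>

/* Sketch, assuming a 4-byte INTEGER. The block is defined here for the
   sake of a self-contained example; a C module sharing it with Fortran
   would instead declare:  extern char look[12];  */
char look[12];

int32_t look_get(int index)       /* index 0, 1, 2 selects I, J, K */
{
    int32_t v;

    memcpy(&v, look + 4 * index, sizeof v);
    return v;
}

void look_set(int index, int32_t v)
{
    memcpy(look + 4 * index, &v, sizeof v);
}
```

Using memcpy keeps the access well defined in C even though the block is declared as char. Setting entries 0 and 1 to 5 and 7 and storing their sum in entry 2 leaves 12 in the position of K.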