GNU MARST is an Algol-to-C translator. It automatically translates programs written in the algorithmic language Algol 60 into the ANSI C programming language.
The processing scheme is as follows:
                  Algol-60 source program
                             |
                             V
                      +-------------+
                      |    MARST    |
                      +-------------+
                             |
                             V
                       C source code
                             |
                             V
                      +-------------+
       algol.h ------>| C compiler  |<------ Standard headers
                      +-------------+
                             |
                             V
                        Object code
                             |
                             V
                      +-------------+
        ALGLIB ------>|   Linker    |<------ Standard libraries
                      +-------------+
                             |
                             V
                      +-------------+
    Input data ------>| Executable  |-------> Output data
                      +-------------+
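where:

algol.h
    is the header file used by the generated C code. (This header uses some standard headers: stdio.h, stdlib.h, etc.; however, no other headers are used explicitly in the generated code.) This file is a part of GNU MARST;

ALGLIB
    is the library used by the generated code (see also algol.h), supplied as libalgol.a.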
In order to install GNU MARST under GNU/Linux, the standard installation procedure should be used. For details see the file INSTALL included in the distribution.
As a result of installation, the following four components will be installed:

    marst        in usr/local/bin;
    macvt        in usr/local/bin;
    algol.h      in usr/local/include and/or usr/include;
    libalgol.a   in usr/local/lib.
In order to invoke the MARST translator the following syntax should be used:

    marst [options ...] [filename]
Options:

-d, --debug
    If this option is set, the translator emits elementary syntactic units of the source Algol program to the output C code in the form of comments. This option is useful for localizing syntax errors more precisely. For example, Algol 60 allows comments of three kinds: ordinary comments, end-end comments, and extended parameter delimiters. Therefore it is easy to make a mistake, for example, by forgetting a semicolon between the end bracket and the next statement.

-e nnn, --error-max nnn
    This option sets the maximal error allowance. The translator stops processing after the specified number of errors has been detected. The value of nnn should be in the range from 0 to 255. If this option is not specified, the default option -e 0 is used, which means that translation continues until the end of the input file.

-h, --help
    Display help information and exit(0).

-l nnn, --linewidth nnn
    This option sets the desired line width for the output C code produced by the translator. The value of nnn should be in the range from 50 to 255. If this option is not specified, the default option -l 72 is used. Note that the actual line width may turn out to be greater than nnn, because the translator is not able to break the output text at an arbitrary place. However, this happens relatively seldom.

-o filename, --output filename
    This option sets the name of the output text file to which the translator writes the output C code. If this option is not set, the translator uses the standard output by default.

-t, --notimestamp
    Suppress the time stamp. By default the translator writes the date and time of translation to the output C code as a comment.

-v, --version
    Display the translator version and exit(0).

-w, --nowarn
    Suppress warning messages. By default the translator displays warning messages, which reflect potential errors and non-standard features used in the source Algol program.
In order to translate a program written in Algol 60, it should be prepared as a plain text file, and the name of this file should be specified in the command line. If the name of the input text file is not specified, the translator uses the standard input by default.
Note that the translator reads the input file twice; therefore this file must be a regular file, not a pipe, terminal input, etc. Thus, if the standard input is used, it should be redirected from a regular file.
In a single run the translator can process only one input text file.
The following example shows how the MARST translator may be used in most cases.
First we prepare the source Algol 60 program, say, in a text file named `hello.alg':
begin outstring(1, "Hello, world!\n") end
Now we translate this program to the C programming language:
marst hello.alg -o hello.c
and get the text file named `hello.c', which we then compile and link in the usual way (remembering to link the Algol and math libraries):
gcc hello.c -lalgol -lm -o hello
Finally, we run the executable
./hello
and see what we have. That's all.
The input language of the MARST translator is a hardware representation of the reference language Algol 60 described in the following IFIP document [1]:
Modified Report on the Algorithmic Language ALGOL 60. The Computer Journal, Vol. 19, No. 4, Nov. 1976, pp. 364-79. (This document is an official IFIP standard. It is not a part of GNU MARST.)
A source Algol 60 program is coded as a plain text file using the ASCII character set.
Basic symbols should be coded as follows:
    Basic symbol        Hardware representation
    -----------------------------------------------
    a, b, ..., z        a, b, ..., z
    A, B, ..., Z        A, B, ..., Z
    0, 1, ..., 9        0, 1, ..., 9
    +                   +
    -                   -
    x                   *
    /                   /
    integer division    %
    exponentiation      ^ (or **)
    <                   <
    not greater         <=
    =                   =
    not less            >=
    >                   >
    not equal           !=
    equivalence         ==
    implication         ->
    or                  |
    and                 &
    not                 !
    ,                   ,
    .                   .
    ten (10)            # (pound sign)
    :                   :
    ;                   ;
    :=                  :=
    (                   (
    )                   )
    [                   [
    ]                   ]
    opening quote       "
    closing quote       "
    array               array
    begin               begin
    Boolean             Boolean (or boolean)
    code                code
    comment             comment
    do                  do
    else                else
    end                 end
    false               false
    for                 for
    go to               go to (or goto)
    if                  if
    integer             integer
    label               label
    own                 own
    procedure           procedure
    real                real
    step                step
    string              string
    switch              switch
    then                then
    true                true
    until               until
    value               value
    while               while
Any symbol can be surrounded by any number of white-space characters (i.e. by spaces, HT, CR, LF, FF, and VT). However, a multi-character symbol should contain no white-space characters. Moreover, a letter sequence is recognized as a keyword if and only if no letter or digit immediately precedes or follows the sequence (except the keyword `go to', which may contain zero or more spaces between `go' and `to').
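For example, applying the rule above to the sequence then:

    ... 123 then abc ...      (recognized as the keyword then)
    ... 123then abc ...       (not a keyword: a digit immediately precedes it)
    ... 123 thenabc ...       (not a keyword: a letter immediately follows it)
    ... 123 th en abc ...     (not a keyword: it contains a white-space character)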
Note that identifiers and numbers can contain white-space characters. This feature may be used when an identifier coincides with a keyword. For example, the identifier label may be coded as `la bel' or `lab el'. Note also that white-space characters are not significant (except within character strings), so `abc' and `a b c' denote the same identifier abc.
Identifiers and numbers can consist of an arbitrary number of characters, all of which (except internal white-space characters) are significant.
All letters are case sensitive (except the first "b" in the keyword Boolean). This means that `abc' and `ABC' are different identifiers, and `Then' will not be recognized as the keyword then.
Quoted character strings are coded in the C style. For example:
    outstring(1, "This\tis a string\n");
    outstring(1, "This\tis a st" "ring\n");
    outstring(1, "This\tis all one st"
       "ring\n");
Within a string (i.e. between double quotes that enclose the string body) escape sequences may be used (as `\t' and `\n' in the example above). Double quote and backslash within string should be coded as `\"' and `\\' respectively. Between parts of a string any number of white-space characters is allowed.
Apart from the coding of character strings, there are no differences between the syntax of the reference language and the syntax of the GNU MARST input language.
Note that there are some differences between the Revised Report on Algol 60 and the Modified Report on Algol 60, because the latter is a result of application of the following IFIP document to the former:
R. M. De Morgan, I. D. Hill, and B. A. Wichmann. A Supplement to the ALGOL 60 Revised Report. The Computer Journal, Vol. 19, No. 3, 1976, pp. 276-88. (This document is an official IFIP standard. It is not a part of GNU MARST.)
All input/output is performed by standard Algol 60 procedures.
The GNU MARST implementation provides up to 16 input/output channels, which have numbers 0, 1, ..., 15. Channel number 0 is always connected to stdin, so only input from this channel is allowed. Analogously, channel number 1 is always connected to stdout, so only output to this channel is allowed. Other channels can be used for both input and output. (The standard procedure fault uses the channel number <sigma>, which is not available to the programmer. This latent channel is always connected to stderr.)
Before Algol program startup all channels (except the channels number 0 and 1) are disconnected, i.e. no files are assigned to them.
If input (output) is required by the Algol program from (to) the channel number n, the following actions are taken:
In order to determine the name of the file that should be assigned to channel number n, the I/O routine checks for an environment variable named `FILE_n'. If such a variable exists, its value is used as the filename. Otherwise, its name (i.e. the character string "FILE_n") is used as the filename.
The MARST translator provides some extensions of the reference language in order to make the package more convenient for the programmer.
The feature of modular programming can be illustrated by the following example:
    First file                    Second file
    ----------------------------------------------------
    procedure one(a, b);          procedure one(a, b);
    value a, b; real a, b;        value a, b; real a, b;
    begin                         code;
       ...
    end;                          procedure two(x, y);
                                  value x, y; real x, y;
    procedure two(x, y);          code;
    value x, y; real x, y;
    begin                         begin
       ...                           <main program>
    end;                          end
The procedures one and two in the first file are called precompiled procedures. Declarations of precompiled procedures should be outside of the main program block or compound statement. The procedures one and two in the second file are called code procedures; they have the keyword code instead of a procedure body statement. Declarations of code procedures should also be outside of the main program block or compound statement.
This mechanism allows precompiled procedures to be translated independently of the main program. Moreover, precompiled procedures may be written in any other C-compatible programming language. The programmer may assume that, directly before Algol program startup, the declarations of all precompiled procedures are substituted into the file that contains the main program (the second file in the example above) in place of the declarations of the corresponding code procedures.
Each code procedure should have the same procedure heading as the corresponding precompiled procedure (however, names of parameters may be altered). Note that mismatched procedure headings cannot be detected by the MARST translator, because they are placed in different files.
The pseudo procedure inline has the following (implicit) heading:
procedure inline(str); string str;
A procedure statement that refers to the inline pseudo procedure is translated into the code, which is the string str without enclosing quotes. For example:
    Source program                  Output C code
    ------------------------------------------------
    . . .                           . . .
    a := 1;                         dsa_0->a_5 = 1;
    b := 2;                         dsa_0->b_8 = 2;
    inline("printf(\"OK\");");      printf("OK");
    c := 3;                         dsa_0->c_4 = 3;
    . . .                           . . .
The procedure statement inline may be used anywhere in the program as an ordinary Algol statement.
The pseudo procedure print is intended mainly for test printing (because the standard Algol input/output is beneath criticism). This procedure has an unspecified heading and a variable parameter list. For example:
    real a, b; integer c; Boolean d;
    array u, v[1:10], w[-5:5,-10:10];
    . . .
    print(a, b, u);
    print(c);
    . . .
    print("test shot", (a+b)*c, !d & u[1] > v[1], u, v, w);
    . . .
Each actual parameter passed to the pseudo procedure print is sent to channel number 1 (stdout) in a printable format.
MACVT is the Algol converter utility. It is an auxiliary program intended for converting Algol 60 programs from some other representation to the MARST representation. Such conversion is usually needed when existing Algol programs have to be adjusted in order to translate them with GNU MARST.
MACVT is not a translator itself. This program just reads the original code of an Algol 60 program from the input text file, converts the basic symbols to the MARST representation (see Section 5. Input Language), and writes the resulting code to the output text file. It is assumed that the output code produced by MACVT will later be translated by MARST in the usual way. Note that MACVT performs no syntax checking.
The input language understood by MACVT differs from the GNU MARST input language only in the representation of basic symbols. Note that, in this sense, the GNU MARST input language is a subset of the MACVT input language.
The representation of basic symbols implemented in MACVT is based mainly on the Algol 60 compiler, well known in the 1960s, that IBM developed first for the IBM 7090 and later for System/360. This representation may be considered an unofficial standard, because it was widely used at the time when Algol 60 was in active use.
In order to invoke the MACVT converter the following syntax should be used:

    macvt [options ...] [filename]
Options:

-c, --classic
    This option is the default unless another representation is chosen. It assumes that the input Algol 60 program is coded using the classic representation: all white-space characters are non-significant (except within quoted character strings) and keywords should be enclosed by apostrophes. For details see below.

-f, --free-coding
    If this option is set, keywords need not be enclosed by apostrophes. In this case, however, white-space characters should not be used within multi-character basic symbols. For details see below.

-h, --help
    Display help information and exit(0).

-i, --ignore-case
    If this option is set, all letters (except within comments and character strings) are converted to lower case, i.e. conversion is case-insensitive.

-m, --more-free
    This option is the same as --free-coding, but additionally keywords for arithmetic, logical, and relational operators can be coded without apostrophes. For details see below.

-o filename, --output filename
    This option sets the name of the output text file to which the converter writes the converted program. If this option is not set, the converter uses the standard output by default.

-s, --old-sc
    This option makes the converter recognize the digraph ., (point and comma) as the semicolon (including its use for terminating comment sequences).

-t, --old-ten
    This option makes the converter recognize the single apostrophe (when it is followed by +, -, or a digit) as the ten symbol.

-v, --version
    Display the converter version and exit(0).
In order to convert an Algol 60 program, it should be prepared as a plain text file, and the name of this file should be specified in the command line. If the name of the input text file is not specified, the converter uses the standard input by default.
In a single run the converter can process only one input text file.
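In the table below, one or more valid representations are given for each basic symbol. The following additional rules apply:

- Keywords may be coded without enclosing apostrophes only if free coding (--free-coding or --more-free) is used.
- Coding keywords that denote arithmetic, logical, and relational operators without apostrophes (e.g. greater instead of 'greater') is allowed only if the option --more-free is used.
- The single apostrophe is recognized as the ten symbol only if the option --old-ten is used. Note that in this case the sequence '10' is not recognized as the ten symbol.
- The digraph ., is recognized as the semicolon only if the option --old-sc is used.
- If an opening quote is coded as " (double quote), the corresponding closing quote should be coded as " (double quote). If an opening quote is coded as ` (diacritic mark), the corresponding closing quote should be coded as ' (single apostrophe).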
    Basic symbol        Extended hardware representation
    -----------------------------------------------------------
    a, b, ..., z        a, b, ..., z
    A, B, ..., Z        A, B, ..., Z
    0, 1, ..., 9        0, 1, ..., 9
    +                   +
    -                   -
    x                   *
    /                   /
    integer division    %     '/'    'div'
    exponentiation      ^     **     'power'    'pow'
    <                   <     'less'
    not greater         <=    'notgreater'
    =                   =     'equal'
    not less            >=    'notless'
    >                   >     'greater'
    not equal           !=    'notequal'
    equivalence         ==    'equiv'
    implication         ->    'impl'
    or                  |     'or'
    and                 &     'and'
    not                 !     'not'
    ,                   ,
    .                   .
    ten (10)            #     '      '10'
    :                   :     ..
    ;                   ;     .,
    :=                  :=    .=     ..=
    (                   (
    )                   )
    [                   [     (/
    ]                   ]     /)
    opening quote       "     `
    closing quote       "     '
    array               'array'
    begin               'begin'
    Boolean             'boolean'
    code                'code'
    comment             'comment'
    do                  'do'
    else                'else'
    end                 'end'
    false               'false'
    for                 'for'
    go to               'goto'
    if                  'if'
    integer             'integer'
    label               'label'
    own                 'own'
    procedure           'procedure'
    real                'real'
    step                'step'
    string              'string'
    switch              'switch'
    then                'then'
    true                'true'
    until               'until'
    value               'value'
    while               'while'
In order to illustrate what the MACVT converter does, consider the following Algol 60 procedure, which is coded using an old (classic) representation:
    'PROCEDURE'EULER(FCT,SUM,EPS,TIM).,'VALUE'EPS,TIM.,
    'INTEGER' TIM., 'REAL' 'PROCEDURE' FCT., 'REAL' SUM, EPS.,
    'COMMENT' EULER COMPUTES THE SUM OF FCT (I) FOR I FROM ZERO UP TO
    INFINITY BY MEANS OF A SUITABLY REFINED EULER TRANSFORMATION. THE
    SUMMATION IS STOPPED AS SOON AS TIM TIMES IN SUCCESSION THE ABSOLUTE
    VALUE OF THE TERMS OF THE TRANSFORMED SERIES IS FOUND TO BE LESS THAN
    EPS, HENCE ONE SHOULD PROVIDE A FUNCTION FCT WITH ONE INTEGER ARGUMENT,
    AN UPPER BOUND EPS, AND AN INTEGER TIM. THE OUTPUT IS THE SUM SUM. EULER
    IS PARTICULARLY EFFICIENT IN THE CASE OF A SLOWLY CONVERGENT OR
    DIVERGENT ALTERNATING SERIES.,
    'BEGIN''INTEGER' I,K,N,T.,'ARRAY' M(/0..15/)., 'REAL' MN, MP, DS.,
    I.=N.=T.=0.,M(/0/).=FCT(0).,SUM.=M(/0/)/2.,
    NEXTTERM..I.=I+1.,MN.=FCT(1).,
    'FOR' K.=0'STEP'1'UNTIL'N'DO'
    'BEGIN' MP.=(MN+M(/K/))/2.,M(/K/).=MN.,
    MN.=MP'END'MEANS.,
    'IF' (ABS(MN)'LESS' ABS (M(/N/))'AND'N'LESS'15)'THEN'
    'BEGIN'DS.=MN/2.,N.=N+1.,
    M(/N/).=MN'END' ACCEPT
    'ELSE' DS.=MN.,
    SUM.=SUM+DS.,
    'IF' ABS(DS)'LESS'EPS'THEN'T.=T+1'ELSE'T.=0.,
    'IF'T'LESS'TIM'THEN''GOTO'NEXTTERM
    'END'EULER;
This code can be converted to the GNU MARST input language using the following command:
macvt -i -s euler.txt -o euler.alg
The verbatim result of conversion is the following:
    procedure euler(fct,sum,eps,tim);value eps,tim;
    integer tim; real procedure fct; real sum, eps;
    comment EULER COMPUTES THE SUM OF FCT (I) FOR I FROM ZERO UP TO
    INFINITY BY MEANS OF A SUITABLY REFINED EULER TRANSFORMATION .THE
    SUMMATION IS STOPPED AS SOON AS TIM TIMES IN SUCCESSION THE ABSOLUTE
    VALUE OF THE TERMS OF THE TRANSFORMED SERIES IS FOUND TO BE LESS THAN
    EPS, HENCE ONE SHOULD PROVIDE A FUNCTION FCT WITH ONE INTEGER ARGUMENT,
    AN UPPER BOUND EPS, AND AN INTEGER TIM .THE OUTPUT IS THE SUM SUM .EULER
    IS PARTICULARLY EFFICIENT IN THE CASE OF A SLOWLY CONVERGENT OR
    DIVERGENT ALTERNATING SERIES;
    begin integer i,k,n,t;array m[0:15]; real mn, mp, ds;
    i:=n:=t:=0;m[0]:=fct(0);sum:=m[0]/2;
    nextterm:i:=i+1;mn:=fct(1);
    for k:=0 step 1 until n do
    begin mp:=(mn+m[k])/2;m[k]:=mn;
    mn:=mp end means;
    if (abs(mn)< abs (m[n])&n<15)then
    begin ds:=mn/2;n:=n+1;
    m[n]:=mn end accept
    else ds:=mn;
    sum:=sum+ds;
    if abs(ds)<eps then t:=t+1 else t:=0;
    if t<tim then go to nextterm
    end euler;
The author thanks Erik Schönfelder <schoenfr@gaertner.de> for much useful advice and especially for testing MARST with real Algol 60 programs. The author also thanks Bernhard Treutwein <Bernhard.Treutwein@Verwaltung.Uni-Muenchen.DE> for great help in preparing the MARST documentation.
[1] In order to obtain a reprint of this document in Postscript or in PDF format please contact the author.