Easy-C is a language that has similar expressive power to the popular ANSI-C language while avoiding many of the issues that make ANSI-C difficult and/or tedious to code in. Easy-C is a strongly typed, object-oriented language.
The lexical components of Easy-C are symbols, numbers, strings, punctuation, and comments. In addition, like some other more recent programming languages, Easy-C uses the white space at the beginning of each line to provide indentation information to guide compilation. The following lexical components are accepted:
- Symbols
- Symbols are a sequence of letters ('A'-'Z', 'a'-'z'), digits ('0'-'9'), and underscores ('_'). No symbol can start with a digit. Letters from the Latin-9 (ISO-8859-15) alphabet are also permitted. Letters can be upper case or lower case. Examples: a, b, a123, top_speed.
- Numbers
- Numbers start with a digit ('0'-'9'); a floating-point number may instead start with its decimal point. There are three kinds of numbers: decimal, hexadecimal, and floating-point. A decimal number consists exclusively of decimal digits ('0'-'9'). A hexadecimal number consists of '0x' followed by one or more hexadecimal digits ('0'-'9', 'A'-'F', 'a'-'f'). A floating-point number has a mandatory mantissa followed by an optional exponent. The mantissa consists of at least one decimal digit ('0'-'9') and exactly one decimal point ('.'). The decimal point may be at the beginning, middle, or end of the mantissa, but it must be present. The optional exponent is the letter 'E' (or 'e') followed by an optional sign ('-' or '+') followed by one or more decimal digits.
- Strings
- Strings specify a sequence of zero, one, or more characters (also referred to as codes). Character strings are enclosed in single quotes ("'") and must specify exactly one character. Regular strings are enclosed in double quotes ('"') and specify zero, one, or more characters. Any printing character from the Latin-9 character set (plus space) is allowed between the quotes, except for the quote character itself. Other characters are expressed inside a string with the backslash character ('\'). A backslash ('\') followed by a list of comma separated symbols and decimal numbers followed by another backslash ('\') defines a list of characters where each number/symbol corresponds to a character. The allowed symbols are:
    Symbol  Description      Code
    n       New-line         10
    r       Carriage Return  13
    t       Horizontal Tab   9
    sq      Single Quote     39
    dq      Double Quote     34
    bsl     Backslash        92
- Punctuation
- The following single character punctuation is allowed: '~', '!', '@', '%', '^', '&', '*', '(', ')', '-', '+', '=', '[', ']', '<', '>', ',', '|', and '/'. The following double character punctuation is allowed: '&&', '==', '||', '<=', '>=', '!=', ':=', and '@('. The following triple character punctuation is allowed: ':@='.
- Comments
- Comments start with a '#' and continue to the end of the line.
- Blank Lines
- Blank lines have no visible tokens on the line and are completely ignored.
- Indentation and White Space
- White space consists exclusively of the horizontal tab and space characters. At the beginning of the line, tabs space over to the next multiple of 8 spaces. All other white space on the line after the first lexical element is ignored.
- Continuation Lines
- A line is continued onto the next line if it ends in a punctuation character other than ']' or ')'. Thus,
    a := length@(a + b,    # Comment ignored
      c - d)
is the same as:
    a := length@(a + b, c - d)
The example above shows that comments are allowed at the end of a continuation line.
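The indentation and continuation rules above can be sketched in a few lines of Python (used here only for illustration; the Easy-C compiler is not written this way as far as this document says). The function names `indent_width` and `is_continued` are hypothetical, and the comment stripping is deliberately naive: it assumes no '#' appears inside a string.

```python
def indent_width(line: str, tab_stop: int = 8) -> int:
    """Return the column reached by the leading spaces and tabs of `line`."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab spaces over to the next multiple of tab_stop.
            column = (column // tab_stop + 1) * tab_stop
        else:
            break
    return column


def is_continued(line: str) -> bool:
    """Guess whether `line` continues onto the next line.

    A line continues if, once any trailing comment is removed, it ends in a
    punctuation character other than ']' or ')'.
    """
    code = line.split("#", 1)[0].rstrip()  # naive: ignores '#' inside strings
    punctuation = set("~!@%^&*(-+=[<>,|/")
    return bool(code) and code[-1] in punctuation
```

Multi-character punctuation such as ':=' or '@(' is covered because only the last character of the line is inspected.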
Below are some examples of each of the visible lexical components:
- Symbols
- a, b, abc, a1, z123, Point, stop_point
- Decimal Numbers
- 0, 1, 9, 123, 1234567
- Hexadecimal Numbers
- 0x0, 0x1, 0x9, 0xa, 0xf, 0xA, 0xF, 0xabcdef, 0xABCDEF
- Floating Point Numbers
- 0., 1., 9., 0.0, 1.0, 9.0, .0, .1, .9, 12.34, .1234, 1234., 1.2e0, 1.2e10, 1.2e-10, 1.2e+10
- Character Strings
- 'a', 'b', ' ', ':', '\n\', '\r\', '\sq\', '\bsl\', '\1\', '\123\'
- Full Strings
- "", "a", "z", "abc", "a test", "\sq\a test\sq,n\", "null-terminated\0\"
- Comments
- # A comment
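The backslash mechanism for strings is unusual enough that a small decoder may help. The sketch below is written in Python for illustration only; `decode_string_body` and `ESCAPE_CODES` are hypothetical names. It assumes that no white space is allowed around the comma separated items, and it maps each code to a character through Python's `iso8859_15` (Latin-9) codec.

```python
# Symbol-to-code table copied from the list of allowed symbols.
ESCAPE_CODES = {"n": 10, "r": 13, "t": 9, "sq": 39, "dq": 34, "bsl": 92}


def decode_string_body(body: str) -> str:
    """Decode the text found between the quotes of an Easy-C string."""
    characters = []
    index = 0
    while index < len(body):
        if body[index] != "\\":
            # Ordinary character: copy it through unchanged.
            characters.append(body[index])
            index += 1
            continue
        closing = body.find("\\", index + 1)
        if closing < 0:
            raise ValueError("escape list is missing its closing backslash")
        # Each comma separated item is a decimal code or a known symbol.
        for item in body[index + 1:closing].split(","):
            if item.isascii() and item.isdigit():
                code = int(item)
            elif item in ESCAPE_CODES:
                code = ESCAPE_CODES[item]
            else:
                raise ValueError(f"unknown escape item {item!r}")
            characters.append(bytes([code]).decode("iso8859_15"))
        index = closing + 1
    return "".join(characters)
```

For instance, the body of the full string "\sq\a test\sq,n\" decodes to a quoted `a test` followed by a new-line.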
Unlike C, Easy-C does not have reserved words such as if, return, etc.
Every variable, argument, and record field has a type. There are three kinds of types in Easy-C:
- Simple Type
- A simple type is just a symbol, which by convention starts with a capital letter (e.g. Unsigned, Integer, Vector3, etc.)
- Parameterized Type
- A parameterized type is a symbol followed by one or more comma separated types enclosed in square brackets (e.g. Array[Integer], Array[Array[Integer]], Hash_Table[Unsigned, String], etc.)
- Routine Type
- A routine type is a type that points to a routine (i.e. a function). The syntax is an open square bracket, followed by zero or more return types, followed by '<=', followed by zero or more argument types, followed by a close square bracket. For example, "[Integer <= Integer, Integer]", "[<= Unsigned, Integer]", "[Vector3 <=]", ...
Two types are equal if and only if they exactly match; otherwise they are different.
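One way to picture the three kinds of types, and the rule that two types are equal only when they match exactly, is to parse each type into a nested tuple and compare the tuples. The Python sketch below is illustrative only: `parse_type` and its helpers are hypothetical names, only ASCII letters are accepted in symbols, and a stray trailing comma in a type list is tolerated for brevity.

```python
import re

# One token: '<=', a bracket or comma, or a symbol (ASCII letters only here).
TOKEN = re.compile(r"\s*(<=|[\[\],]|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_type(text):
    """Parse a type into a nested tuple; equal types give equal tuples."""
    tokens = tokenize(text)
    tree, pos = _type(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the type")
    return tree


def _type(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of type")
    token = tokens[pos]
    if token == "[":  # routine type: [returns <= arguments]
        returns, pos = _type_list(tokens, pos + 1, "<=")
        arguments, pos = _type_list(tokens, pos + 1, "]")
        return ("routine", returns, arguments), pos + 1
    if token in ("]", ",", "<="):
        raise ValueError(f"unexpected {token!r}")
    pos += 1
    if pos < len(tokens) and tokens[pos] == "[":  # parameterized type
        parameters, pos = _type_list(tokens, pos + 1, "]")
        if not parameters:
            raise ValueError("a parameterized type needs one or more types")
        return ("parameterized", token, parameters), pos + 1
    return ("simple", token), pos


def _type_list(tokens, pos, stop):
    """Parse zero or more comma separated types up to the `stop` token."""
    items = []
    while pos < len(tokens) and tokens[pos] != stop:
        item, pos = _type(tokens, pos)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
    if pos >= len(tokens):
        raise ValueError(f"missing {stop!r}")
    return tuple(items), pos
```

With this representation, `parse_type("Array[Integer]") == parse_type("Array[Unsigned]")` is false because the parameter differs, which mirrors the exact-match rule.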
In Easy-C, arguments, variables, and record fields are strongly typed. A variable of a given type always contains a pointer to an object of the correct type. For example, it is possible to have a variable "a" of type "Apple" and a variable "o" of type "Orange". The variable "a" will only point to "Apple" objects and the variable "o" will only point to "Orange" objects. There is absolutely no mixing between "Apple" and "Orange" objects.
The Easy-C assignment operators are ":=" and ":@=". ":@=" is used to assign to a variable for the first time and ":=" is used to assign for the second and subsequent times. The assignment operator is a simple pointer replacement. If both "a1" and "a2" are variables of type "Apple", they both point to "Apple" objects. For the piece of code below:
    a1 := a2
After this statement, both "a1" and "a2" point to the exact same "Apple" object. In other words, the pointer in "a1" has been replaced by the pointer in "a2".
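Python names also hold references to objects, so pointer replacement can be imitated there as a rough analogy. In the sketch below the `Apple` class and its `weight` field are hypothetical; they are not part of Easy-C.

```python
class Apple:
    """Hypothetical record-like type with a single mutable field."""

    def __init__(self, weight):
        self.weight = weight


a1 = Apple(100)
a2 = Apple(150)

a1 = a2          # like "a1 := a2": a1 now points at the object a2 points at

a2.weight = 175  # a change made through a2 ...
assert a1.weight == 175  # ... is visible through a1, since it is one object
assert a1 is a2
```

The object that `a1` pointed to before the assignment is not copied or altered; it is simply no longer referenced by `a1`.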
In Easy-C there is a concept of mutable and immutable objects. A mutable object is like a box whose contents can be changed, whereas an immutable object is a box whose contents are frozen in place and cannot be changed. Some types in Easy-C are mutable and others are immutable. The basic types in Easy-C are Logical, Unsigned, Integer, Character, Float, and Double, and all of these types are immutable. Other types like records and arrays are mutable in that their contents can be changed. It turns out that some Easy-C strings are mutable and some are immutable.
In Easy-C, if there are variables "a", "b", and "c" of type "Unsigned", the code below:
    a := b + c
is really a short hand for:
    a := add@Unsigned(b, c)
"add@Unsigned" refers to the "add" routine associated with type "Unsigned". In Easy-C, all routines are associated with a type. "a := add@Unsigned(b, c)" means invoke the "add@Unsigned" routine with two arguments, "b" and "c", and store the result back into the variable "a". In even finer detail, the "add@Unsigned" routine does the following:
In short, there are no special language semantics associated with the basic types; they are treated like all others. This is in stark contrast to most other languages (including ANSI-C), which make special allowances for the basic types by calling them scalar types.
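The rewriting of an operator into a routine call can be sketched as a simple table lookup. The Python below is only an illustration: `desugar` is a hypothetical helper, and the routine names are taken from the operator precedence table later in this document (only the arithmetic, shift, and bitwise operators are covered, since the table does not pair each relational operator with a routine).

```python
# Binary operator to routine name, as listed in the precedence table.
BINARY_ROUTINES = {
    "*": "multiply", "/": "divide", "%": "remainder",
    "+": "add", "-": "minus",
    "<<": "left_shift", ">>": "right_shift",
    "&": "and", "^": "xor", "|": "or",
}


def desugar(target, left, operator, right, type_name):
    """Spell out `target := left operator right` as a routine invocation."""
    routine = BINARY_ROUTINES[operator]
    return f"{target} := {routine}@{type_name}({left}, {right})"


print(desugar("a", "b", "+", "c", "Unsigned"))
```

The same lookup applies to any type that supplies the routine, which is the sense in which the basic types get no special treatment.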
Most languages that have pointers also have the concept of an empty pointer. In Easy-C, the empty pointer concept is replaced with a very similar concept of pointing to a Null object. Every type has one (and only one) null object named "Null". Thus, "Null@Unsigned" is the null object for the type "Unsigned" and "Null@Vector3" is the null object for the type "Vector3".
In Easy-C all variables, arguments, and record fields are initialized to point to the null object of the appropriate type. Complex types like records have each of their individual field values initialized to point to the appropriate null object. This is in stark contrast to other languages that leave variables and records uninitialized.
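The null object idea can be imitated in Python with one shared instance per class. This is a sketch under stated assumptions: the `Apple` and `Basket` types are hypothetical, and the way Easy-C actually allocates its null objects is not described in this document.

```python
class Apple:
    """Hypothetical record type with no fields."""

    Null = None  # placeholder; replaced by the single null object below


Apple.Null = Apple()


class Basket:
    """Hypothetical record type with one field of type Apple."""

    Null = None  # placeholder; replaced by the single null object below

    def __init__(self):
        # Every field starts out pointing at the null object of its type.
        self.apple = Apple.Null


Basket.Null = Basket()

basket = Basket()
assert basket.apple is Apple.Null        # never an empty pointer
assert Basket.Null.apple is Apple.Null   # the null record has null fields too
```

Because a field always points at a real object of the right type, code can follow the pointer without first testing for an empty pointer.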
In short, 1) everything in Easy-C is a strongly typed object, 2) variables, arguments, and record fields are strongly typed, 3) variables, arguments and record fields always point to an object of the same strong type, and 4) assignment overwrites the previous object pointer value with a new object pointer.
The object model presented here is substantially simpler than most other programming languages.
Declarations always start at the beginning of a line with no preceding white space. In general, an Easy-C file is broken into four sections:
library Declaration
The syntax of the library declaration is:
    library library_name
where
- library_name
- is the name of the library to access.
The library declaration causes all of the types and routines defined in library_name to become available for use in the current Easy-C file. In general, this means that the types and routines defined in the file library_name.ezc are made available for use.
define Declaration
There are two basic flavors of the "define" declaration: 1) record/variant definition, and 2) enumeration definition.
The syntax for the enumeration define is:
    define enumeration_type_name
        enumeration
            item_1
            item_2
            ...
            item_n
        generate routine_names
where
- enumeration_type_name
- is the name of the new enumeration type,
- item_1, item_2, ..., item_n
- are the names of new constants,
- routine_names
- is a comma separated list of routine names. {Currently, this list is empty.} The generate line is optional.
The enumeration define declaration defines a new simple type named enumeration_type_name. By convention, the first letter of the new type name is capitalized. Each of the listed items is defined as a global constant item_1@enumeration_type_name, ..., item_n@enumeration_type_name.
The syntax for the record/variant define is fairly complicated. The syntax is presented first and then explained.
The syntax for the top level record/variant define declaration is:
    define record_type_name
        record_variant_clause_1
        record_variant_clause_2
        ...
        record_variant_clause_n
        generate routine_names
where
- record_type_name
- is the name of the new record/variant type. This type can be either simple or parameterized.
- record_variant_clause_1, ..., record_variant_clause_n
- are described below,
- routine_names
- is a comma separated list of routine names. Currently, only parse and traverse are supported as routine names. The generate line is optional.
The syntax for a record clause is:
    record
        field_name_1 type_1
        field_name_2 type_2
        ...
        field_name_n type_n
where
- field_name_1, ..., field_name_n
- are unique symbols that name each field in the record, and
- type_1, ..., type_n
- are the types associated with each field. Each type can be either a simple, parameterized, or routine type.
The syntax for a variant clause is:
    variant select_name select_type_name
        field_name_1 type_1
        field_name_2 type_2
        ...
        field_name_n type_n
where
- select_name
- is a symbol that is used to access the type selector,
- select_type_name
- is a simple type that is defined for the type selector,
- field_name_1, ..., field_name_n
- are unique symbols that name each field in the record, and
- type_1, ..., type_n
- are the types associated with each field. Each type can be either a simple, parameterized, or routine type.
The following example shows a simple record declaration:
    define Point3                   # Point in 3-space
        record
            x Double                # X coordinate
            y Double                # Y coordinate
            z Double                # Z coordinate
This defines a new type named Point3. This new type is a record with three fields named x, y, and z, all of type Double. This declaration also defines the routine New@Point3 and the global object Null@Point3. The following code shows this type being used:
    p1 :@= New@Point3()             # Create and return a new point
    p1.x := 1.0                     # Initialize x field
    p1.y := 2.0                     # Initialize y field
    p1.z := p1.x + p1.y             # Initialize z field
    p2 :@= Null@Point3              # Initialize p2 to a Null@Point3
The following example shows a variant declaration:
    define Number                   # General purpose number
        variant kind Number_Kind
            double Double           # Double precision float
            float Float             # Single precision float
            integer Integer         # Signed 32-bit number
            unsigned Unsigned       # Unsigned 32-bit number
This declaration defines a new type named Number that has 5 fields named kind, double, float, integer, and unsigned. It also defines a new enumeration type named Number_Kind.
global, constant, external
There are three kinds of object declarations -- global, constant, and external declarations.
The syntax of a global declaration is:
    global variable_name@type
where
- variable_name
- xxx
- type
- xxx
xxx
The syntax of the constant declaration is:
    constant name@type = expression
where
- name
- type
xxx
The syntax of the external declaration is:
    external variable_name@type
where
- variable_name
- type
xxx
routine Declaration
The routine declaration is the workhorse of Easy-C. Most Easy-C programs are dominated by routine declarations.
The syntax of the routine declaration is:
    routine routine_name@routine_type
        takes argument_name_1 argument_type_1
        ...
        takes argument_name_N argument_type_N
        takes_nothing
        returns return_type
        returns_nothing
        nested_statements
where
- routine_name
- is the routine name. By convention the routine name is in all lower case.
- routine_type
- is the routine type. It can be either a simple type or a parameterized type. If it is a parameterized type, each of the parameters must be a simple type name. By convention, the first letter of a type is capitalized.
- argument_name_1, ..., argument_name_N
- is an argument name. The argument name becomes a local variable that is accessible throughout the entire routine body.
- argument_type_1, ..., argument_type_N
- is the argument type. The argument type can be either a simple type, parameterized type, or a routine type. Nested parameter types are permitted.
- return_type
- is the routine return type. If the routine does not return a value, returns_nothing is specified instead.
- nested_statements
- are the nested statements that are executed when the routine is invoked.
A routine must have either one or more takes clauses or a single takes_nothing clause. A routine must have either a returns clause or a returns_nothing clause. By convention, a short comment occurs after the returns or returns_nothing clause that describes what the routine does.
The routine main@Easy_C is interesting in that it is the first routine in a program that is executed.
There are a few extra declarations that need additional documentation:
- include Declaration
- includes a file, but not as a library
- require Declaration
- requires a file, but not as a library
- defines_prefix Declaration
- used for extracting macro values from C header files.
The syntax of the assert statement is:
    assert logical_expression
where
- logical_expression
- is an expression that returns type Logical (i.e. true or false).
If logical_expression evaluates to false, the program execution is immediately terminated with a fatal run time error. The purpose of this statement is for the programmer to inform anybody who reads the code about code invariants.
5.2 Assign Statement
The assign statement occurs whenever the statement has a := or :@= on the line. When the compiler sees either of those two tokens, it inserts an invisible assign at the beginning of the line.
The syntax of the assign statement is:
    left_expression := right_expression
or
    variable :@= right_expression
where
- left_expression
- Left expression is constrained to be one of exactly three formats:
  - variable
  - yyy
  - expression.field_name
  - yyy
  - expression1[expression1, ...]
  - yyy
- right_expression
- xxx
- variable
- xxx
5.2 break Statement
The syntax of the break statement is:
    break number
where
- number
- is an optional break level number. This number specifies how many levels of loops to break out of. If number is not specified, it defaults to 1, which breaks out of one level of looping.
This statement is only legal when it occurs inside of a while statement. A break statement outside of a while statement generates a compiler error.
5.3 call Statement
The syntax of the call statement is:
    call expression
where
- expression
- is an expression that is evaluated for its side-effects only. Most of the time this means that expression is a single routine invocation, but other more complicated expressions are permitted. Any return value is ignored.
5.4 continue Statement
The syntax of the continue statement is:
    continue number
where
- number
- is an optional continue level number. This number specifies how many levels of loops to continue to. If number is not specified, it defaults to 1, which continues the innermost loop level.
This statement is only legal when it occurs inside of a while statement. A continue statement outside of a while statement generates a compiler error.
5.5 do_nothing Statement
The syntax of the do_nothing statement is:
    do_nothing
When executed, this statement does not do anything. Its purpose is for the programmer to inform the reader that no code is needed at the specified location. This happens fairly commonly after a case clause in a switch statement.
5.6 if Statement
The syntax of the if statement is:
    if logical_expression_1
        nested_statements_1
    else_if logical_expression_2
        nested_statements_2
    ...
    else_if logical_expression_N
        nested_statements_N
    else
        nested_statements_last
where
- logical_expression_1, ..., logical_expression_N
- are expressions that evaluate to a value of type Logical.
- nested_statements_1, ..., nested_statements_N
- are the one or more nested statements that are executed when the logical expression immediately above returns true.
- nested_statements_last
- are the one or more nested statements that are executed if none of the logical expressions return true.
Basically, each of the logical expressions is evaluated in sequence until the first expression returns true. For the first expression that evaluates to true, the nested statements immediately under the expression are executed. After the nested statements are executed, the statement is finished and code execution resumes immediately after the if statement. In the case where none of the expressions evaluates to true, the nested statements under the else clause are executed instead.
Finally, the else_if and else clauses are optional. The only required portion of an if statement is the first expression (i.e. logical_expression_1) and the nested statements immediately following it.
5.7 return Statement
The syntax of the return statement is:
    return expression
where
- expression
- is an expression that is evaluated for the routine return value.
A return statement can occur anywhere in the nested statements that make up the routine body. Upon execution, the routine immediately terminates execution and returns to the routine caller. If the routine is supposed to return a value, expression must be present and evaluate to a type that matches the returns clause in the routine header. If the routine header specifies returns_nothing, expression must not be present.
5.8 switch Statement
The syntax of the switch statement is:
    switch enumerate_expression
        all_cases_required
        case item_1a, ..., item_1n
            nested_statements_1
        ...
        case item_Ma, ..., item_Mm
            nested_statements_M
        default
            nested_statements_default
where
- enumerate_expression
- is an expression that evaluates to a type that is an enumeration type,
- item_1a, ..., item_1n
- are members of the enumeration type. There must be at least one enumeration item after the case; all others are optional.
- nested_statements_1, ..., nested_statements_M
- are the nested statements that are executed when enumerate_expression evaluates to an item that is in the preceding case item list.
- nested_statements_default
- are the nested statements that are executed if enumerate_expression does not match any case items.
The switch statement must have at least one case or one default clause. If all_cases_required is present, absolutely every possible enumeration item must be listed in one or more case clauses; in addition, all_cases_required and default make no sense in the same switch statement. If there is no default clause present, and enumerate_expression does not evaluate to a matching enumeration item in a case clause, then no nested statements are executed at all; otherwise, exactly one block of nested statements will be executed depending upon the value of enumerate_expression.
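The rules for a well-formed switch statement can be restated as a small checker. The Python sketch below is illustrative only and is not how the Easy-C compiler is known to work: `check_switch` is a hypothetical helper, `Color` is a hypothetical enumeration, and each case clause is described simply as a list of enumeration items.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical enumeration type used only for this example."""
    RED = 1
    GREEN = 2
    BLUE = 3


def check_switch(enumeration, cases, has_default=False,
                 all_cases_required=False):
    """Return a list of problems found in a described switch statement."""
    problems = []
    if not cases and not has_default:
        problems.append("switch needs at least one case or default clause")
    for items in cases:
        if not items:
            problems.append("case needs at least one enumeration item")
    if all_cases_required:
        # Every enumeration item must appear in some case clause.
        covered = {item for items in cases for item in items}
        missing = [item.name for item in enumeration if item not in covered]
        if missing:
            problems.append("missing cases: " + ", ".join(missing))
        if has_default:
            problems.append(
                "all_cases_required and default should not be combined")
    return problems


print(check_switch(Color, [[Color.RED]], all_cases_required=True))
```

Here the single `case` clause covers only `RED`, so the checker reports that `GREEN` and `BLUE` are missing.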
while Statement
The syntax of the while statement is:
    while logical_expression
        nested_statements
where
- logical_expression
- is an expression that evaluates to type Logical (i.e. it returns true or false).
- nested_statements
- are the nested statements that are executed each time logical_expression evaluates to true.
This statement will repeatedly evaluate logical_expression until the first time it returns false. Each time logical_expression returns true, nested_statements are executed.
The break, continue, and return statements can all be used to break out of a while statement loop.
The table below expresses the precedence of operators in Easy-C:
    Prec.  Operators               Assoc.  Routines
    14     t[ t]                   left
    13     ( @( ) i[ i] @ .        left    fetch_#(), store_#(), field_set(), field_get()
    12     u- u! u+ u~             right   negate(), not()
    11     * / %                   left    multiply(), divide(), remainder()
    10     + -                     left    add(), minus()
    9      << >>                   left    left_shift(), right_shift()
    8      &                       left    and()
    7      ^                       left    xor()
    6      |                       left    or()
    5      < > <= >= != = == !==   left    equal(), less_than(), greater_than(), identical()
    4      &&                      left
    3      ||                      left
    2      ,                       left
    1      := :@=                  left
The operators that are preceded by a letter need a little more discussion. The 't[' and 't]' refer to when square brackets are used for type parameters. Conversely, 'i[' and 'i]' refer to when square brackets are used as an indexing operator (e.g. Array fetch and store.) Finally, the 'u-', 'u!', 'u+', and 'u~' refer to unary operators. Unary operators are the only ones that group right to left in their associativity (e.g. ---a is the same as -(-(-a)).) All other operators have left to right associativity.
For those of you that are familiar with the operator precedence of ANSI-C, you should be warned that there are a few differences between Easy-C and ANSI-C precedence. In particular, the relational operators in Easy-C have a lower precedence than in ANSI-C. In addition, the precedence of comma and assignment are swapped.
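To see the effect of these precedence levels, the following Python sketch adds explicit parentheses to an expression using precedence climbing. It is an illustration under stated assumptions, not the Easy-C parser: `parenthesize` is a hypothetical name, only identifiers, decimal numbers, parentheses, the unary operators, and the binary operators at levels 3 through 11 are handled, and the levels are copied from the precedence table.

```python
import re

# Binary operator precedence levels copied from the table (all left to right).
BINARY_PRECEDENCE = {
    "*": 11, "/": 11, "%": 11,
    "+": 10, "-": 10,
    "<<": 9, ">>": 9,
    "&": 8, "^": 7, "|": 6,
    "<": 5, ">": 5, "<=": 5, ">=": 5, "!=": 5, "=": 5, "==": 5, "!==": 5,
    "&&": 4, "||": 3,
}
UNARY_OPERATORS = {"-", "!", "+", "~"}

# Longest operators first so that '!==' is not split into '!=' and '='.
TOKEN = re.compile(
    r"\s*(!==|<<|>>|<=|>=|!=|==|&&|\|\||[A-Za-z_]\w*|\d+|[-+*/%&^|<>=!~()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parenthesize(text):
    """Return `text` with the grouping implied by the table made explicit."""
    tokens = tokenize(text)
    result, pos = _expression(tokens, 0, 1)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


def _expression(tokens, pos, min_precedence):
    left, pos = _operand(tokens, pos)
    while (pos < len(tokens)
           and BINARY_PRECEDENCE.get(tokens[pos], 0) >= min_precedence):
        operator = tokens[pos]
        # Left to right: the right side may only hold tighter operators.
        right, pos = _expression(
            tokens, pos + 1, BINARY_PRECEDENCE[operator] + 1)
        left = f"({left} {operator} {right})"
    return left, pos


def _operand(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of expression")
    token = tokens[pos]
    if token in UNARY_OPERATORS:  # unary operators group right to left
        operand, pos = _operand(tokens, pos + 1)
        return f"({token}{operand})", pos
    if token == "(":
        inner, pos = _expression(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing ')'")
        return inner, pos + 1
    if token in BINARY_PRECEDENCE or token == ")":
        raise ValueError(f"unexpected token {token!r}")
    return token, pos + 1
```

With these levels, `parenthesize("a & b == c")` groups as `((a & b) == c)`, whereas ANSI-C reads the same text as `a & (b == c)`.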
In ANSI-C, the evaluation order for routine arguments is unspecified. In Easy-C, left to right evaluation is strictly enforced.