Easy-C Language Reference

Table of Contents

  1. Introduction
  2. Lexical Issues
  3. Object Model
  4. Declarations
  5. Statements
  6. Expressions

1. Introduction

Easy-C is a language with expressive power similar to the popular ANSI-C language that avoids many of the issues that make ANSI-C difficult or tedious to code in. Easy-C is a strongly typed, object-oriented language.

2. Lexical Issues

The lexical components of Easy-C are symbols, numbers, strings, punctuation, and comments. In addition, like some other more recent programming languages, Easy-C uses the white space at the beginning of each line to provide indentation information to guide compilation. The following lexical components are accepted:

Symbols
Symbols are a sequence of letters ('A'-'Z', 'a'-'z'), digits ('0'-'9'), and underscores ('_'). No symbol can start with a digit. Letters from the Latin-9 (ISO-8859-15) alphabet are also permitted. Letters can be upper case or lower case. Examples: a, b, a123, top_speed.
Numbers
Numbers start with a digit ('0'-'9') or, for floating-point numbers, possibly a decimal point. There are three kinds of numbers: decimal, hexadecimal, and floating-point. A decimal number consists exclusively of decimal digits ('0'-'9'). A hexadecimal number consists of '0x' followed by one or more hexadecimal digits ('0'-'9', 'A'-'F', 'a'-'f'). Floating-point numbers have a mandatory mantissa followed by an optional exponent. The mantissa consists of at least one decimal digit ('0'-'9') and exactly one decimal point ('.'). The decimal point may be at the beginning, middle, or end of the mantissa, but it must be present. The optional exponent is the letter 'E' (or 'e') followed by an optional sign ('-' or '+') followed by one or more decimal digits.
Strings
Strings specify a sequence of zero or more characters (also referred to as codes). Character strings are enclosed in single quotes ("'") and must specify exactly one character. Regular strings are enclosed in double quotes ('"') and specify zero or more characters. Any printing character from the Latin-9 character set (plus space) is allowed between the quotes, except for the quote character itself. Other characters are expressed with the backslash character ('\'): a backslash followed by a comma-separated list of symbols and decimal numbers, followed by another backslash, defines a list of characters where each number or symbol corresponds to one character. The allowed symbols are:
    Symbol  Description      Code
    n       New-line         10
    r       Carriage Return  13
    t       Horizontal Tab   9
    sq      Single Quote     39
    dq      Double Quote     34
    bsl     Backslash        92
Punctuation
The following single character punctuation is allowed: '~', '!', '@', '%', '^', '&', '*', '(', ')', '-', '+', '=', '[', ']', '<', '>', ',', '|', and '/'. The following double character punctuation is allowed: '&&', '==', '||', '<=', '>=', '!=', ':=', and '@('. The following triple character punctuation is allowed: ':@='.
Comments
Comments start with a '#' and continue to the end of the line.
Blank Lines
Blank lines have no visible tokens on the line and are completely ignored.
Indentation and White Space
White space consists exclusively of the horizontal tab and space characters. At the beginning of the line, tabs space over to the next multiple of 8 spaces. All other white space on the line after the first lexical element is ignored.
Continuation Lines
A line is continued onto the next line if it ends in a punctuation character other than ']' or ')'. Thus,
	    a := length@(a + b, # Comment ignored
	      c - d)					
is the same as:
	    a := length@(a + b, c - d)				
The example above shows that comments are allowed at the end of a continuation line.
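
The tab and continuation rules above can be sketched in a few lines. Python is used here purely as a modeling language, and the names indent_width and is_continued are hypothetical helpers, not part of Easy-C. The sketch also assumes that a '#' never appears inside a string literal, which a real lexer would have to handle.

```python
TAB_STOP = 8

# Single character punctuation listed in this section, minus ']' and ')'.
CONTINUATION_CHARACTERS = set("~!@%^&*(-+=[<>,|/")


def indent_width(line):
    """Return the indentation column of a line, expanding leading tabs."""
    column = 0
    for character in line:
        if character == " ":
            column += 1
        elif character == "\t":
            # A tab spaces over to the next multiple of TAB_STOP.
            column = (column // TAB_STOP + 1) * TAB_STOP
        else:
            break
    return column


def is_continued(line):
    """Return True if the line ends in punctuation other than ']' or ')'."""
    # Assumption: '#' always starts a comment (no '#' inside strings).
    code = line.split("#", 1)[0].rstrip(" \t")
    return code != "" and code[-1] in CONTINUATION_CHARACTERS
```

Under these assumptions, a line holding one tab followed by four spaces has an indentation of 12, and the first line of the length@( example is reported as continued because it ends in ',' once the comment is dropped.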

Below are some examples of each of the visible lexical components:

Symbols
a, b, abc, a1, z123, Point, stop_point
Decimal Numbers
0, 1, 9, 123, 1234567
Hexadecimal Numbers
0x0, 0x1, 0x9, 0xa, 0xf, 0xA, 0xF, 0xabcdef, 0xABCDEF
Floating Point Numbers
0., 1., 9., 0.0, 1.0, 9.0, .0, .1, .9, 12.34, .1234, 1234., 1.2e0, 1.2e10, 1.2e-10, 1.2e+10
Character Strings
'a', 'b', ' ', ':', '\n\', '\r\', '\sq\', '\bsl\', '\1\', '\123\'
Full Strings
"", "a", "z", "abc", "a test", "\sq\a test\sq,n\", "null-terminated\0\"
Comments
# A comment
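
The three number formats can be checked with regular expressions. The sketch below uses Python as a modeling language; classify_number is a hypothetical helper, and the patterns are one reading of the rules in this section (a leading decimal point is accepted because the examples above include .0 and .1).

```python
import re

DECIMAL = re.compile(r"[0-9]+")
HEXADECIMAL = re.compile(r"0x[0-9A-Fa-f]+")
# Mantissa: at least one digit and exactly one decimal point.
# Exponent: optional, 'E' or 'e', optional sign, one or more digits.
FLOATING_POINT = re.compile(r"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[Ee][+-]?[0-9]+)?")


def classify_number(text):
    """Return the kind of number that text spells, or None if it is not one."""
    if HEXADECIMAL.fullmatch(text):
        return "hexadecimal"
    if DECIMAL.fullmatch(text):
        return "decimal"
    if FLOATING_POINT.fullmatch(text):
        return "floating-point"
    return None
```

Note that under this reading 1e10 is not a number at all, because the decimal point in the mantissa is mandatory.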

Unlike C, Easy-C does not have reserved words such as if, return, etc.
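
The backslash escape lists described under Strings in this section can be modeled as follows. This is a sketch in Python, not the Easy-C implementation: decode_string_body is a hypothetical name, it takes the text between the quotes, and it returns the list of character codes. Treating an unknown symbol or an unterminated list as an error is an assumption.

```python
ESCAPE_SYMBOLS = {"n": 10, "r": 13, "t": 9, "sq": 39, "dq": 34, "bsl": 92}


def decode_string_body(body):
    """Decode the text between the quotes into a list of character codes."""
    codes = []
    index = 0
    while index < len(body):
        if body[index] != "\\":
            # Ordinary character: use its Latin-9 (ISO-8859-15) code.
            codes.append(body[index].encode("iso8859_15")[0])
            index += 1
            continue
        end = body.find("\\", index + 1)
        if end < 0:
            raise ValueError("escape list is missing its closing backslash")
        # Between the backslashes: comma separated symbols and decimal numbers.
        for item in body[index + 1:end].split(","):
            if item.isascii() and item.isdigit():
                codes.append(int(item))
            elif item in ESCAPE_SYMBOLS:
                codes.append(ESCAPE_SYMBOLS[item])
            else:
                raise ValueError("unknown escape item: " + repr(item))
        index = end + 1
    return codes
```

For example, the body of "\sq\a test\sq,n\" decodes to a single quote, the six characters of "a test", another single quote, and a new-line.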

3. Object Model

Every variable, argument, and record field has a type. There are three kinds of types in Easy-C:

Simple Type
A simple type is just a symbol, which by convention starts with a capital letter (e.g. Unsigned, Integer, Vector3, etc.)
Parameterized Type
A parameterized type is a symbol followed by one or more comma separated types enclosed in square brackets (e.g. Array[Integer], Array[Array[Integer]], Hash_Table[Unsigned, String], etc.)
Routine Type
A routine type is a type that points to a routine (i.e. function). The syntax is an open square bracket followed by zero or more return types, followed by '<=', followed by zero or more argument types, followed by a close square bracket. For example, "[Integer <= Integer, Integer]", "[<= Unsigned, Integer]", "[Vector3 <=]", ...
Two types are equal if and only if they exactly match; otherwise they are different.
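
One way to picture the three kinds of types and the exact-match equality rule is as small immutable values. The sketch below is in Python; the class names Simple, Parameterized, and Routine and the helper show are hypothetical and only illustrate the syntax described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Simple:
    name: str


@dataclass(frozen=True)
class Parameterized:
    name: str
    parameters: tuple  # one or more types


@dataclass(frozen=True)
class Routine:
    returns: tuple  # zero or more return types
    takes: tuple    # zero or more argument types


def show(t):
    """Render a type using the syntax described in this section."""
    if isinstance(t, Simple):
        return t.name
    if isinstance(t, Parameterized):
        return t.name + "[" + ", ".join(show(p) for p in t.parameters) + "]"
    returns = ", ".join(show(r) for r in t.returns)
    takes = ", ".join(show(a) for a in t.takes)
    return "[" + " ".join(part for part in (returns, "<=", takes) if part) + "]"
```

Because the dataclasses compare field by field, two type values are equal exactly when they match in every part, which mirrors the equality rule stated above.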

In Easy-C, arguments, variables, and record fields are strongly typed. A variable of a given type always contains a pointer to an object of the correct type. For example, it is possible to have a variable "a" of type "Apple" and a variable "o" of type "Orange". The variable "a" will only point to "Apple" objects and the variable "o" will only point to "Orange" objects. There is absolutely no mixing between "Apple" and "Orange" objects.

The Easy-C assignment operators are ":=" and ":@=". ":@=" is used to assign to a variable for the first time and ":=" is used for the second and subsequent assignments. Assignment is a simple pointer replacement. Suppose "a1" and "a2" are both variables of type "Apple", so that both point to "Apple" objects, and consider the piece of code below:

    a1 := a2					
After this statement, both "a1" and "a2" point to the exact same "Apple" object. In other words, the pointer in "a1" has been replaced by the pointer in "a2".
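
Python variables happen to behave the same way, so pointer replacement can be demonstrated directly. In this sketch the Apple class and its weight field are hypothetical stand-ins for an Easy-C record.

```python
class Apple:
    """Hypothetical stand-in for an Easy-C record type."""

    def __init__(self, weight):
        self.weight = weight


a1 = Apple(100)
a2 = Apple(150)

a1 = a2                  # like "a1 := a2": only the pointer is replaced
same_object = a1 is a2   # both names now refer to one Apple object

a2.weight = 175          # a change made through a2 ...
seen_through_a1 = a1.weight  # ... is visible through a1
```

No Apple object is copied by the assignment; the object that a1 used to point to is simply no longer referenced by a1.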

In Easy-C there is a concept of mutable and immutable objects. A mutable object is like a box whose contents can be changed, whereas an immutable object is a box whose contents are frozen in place and cannot be changed. Some types in Easy-C are mutable and others are immutable. The basic types in Easy-C are Logical, Unsigned, Integer, Character, Float, and Double, and all of these types are immutable. Other types, like records and arrays, are mutable in that their contents can be changed. Some Easy-C strings are mutable and some are immutable.

In Easy-C, if there are variables "a", "b", and "c" of type "Unsigned", the code below:

    a := b + c							
is really shorthand for:
    a := add@Unsigned(b, c)
"add@Unsigned" refers to the "add" routine associated with type "Unsigned". In Easy-C, all routines are associated with a type. "a := add@Unsigned(b, c)" means invoke the "add@Unsigned" routine with two arguments, "b" and "c", and store the result back into the variable "a". In even finer detail, the "add@Unsigned" routine does the following:
  1. the value of the object pointed to by "b" is obtained,
  2. the value of the object pointed to by "c" is obtained,
  3. the sum of those two values is computed,
  4. an object that contains the sum value is created,
  5. a pointer to the new sum object is returned.
While that seems a bit convoluted for doing simple math, it works just as well for more complicated types. For example, if "a", "b", and "c" are variables of type "Vector3" (i.e. 3-dimensional vectors), the code above still works, except that "add@Vector3" is called instead.
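
The five steps above, and the way the same "+" resolves to a different routine for each type, can be sketched as follows. Python is used as a modeling language; the ROUTINES table and the plus helper are hypothetical illustrations of how "b + c" might be mapped to add@Type, not a description of the actual compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)   # frozen: the basic types are immutable
class Unsigned:
    value: int


@dataclass(frozen=True)
class Vector3:
    x: float
    y: float
    z: float


def add_unsigned(b, c):
    b_value = b.value          # 1. obtain the value of the object b points to
    c_value = c.value          # 2. obtain the value of the object c points to
    total = b_value + c_value  # 3. compute the sum
    result = Unsigned(total)   # 4. create an object that contains the sum
    return result              # 5. return a pointer to the new object


def add_vector3(b, c):
    return Vector3(b.x + c.x, b.y + c.y, b.z + c.z)


# Hypothetical routine table keyed by (routine name, type), like "add@Unsigned".
ROUTINES = {("add", Unsigned): add_unsigned, ("add", Vector3): add_vector3}


def plus(b, c):
    """Model "b + c" as a call to the add routine of b's type."""
    return ROUTINES[("add", type(b))](b, c)
```

Neither operand is modified; each call produces a fresh object, which is what allows the basic types to be immutable and still be treated like every other type.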

In short, there are no special language semantics associated with the basic types; they are treated like all others. This is in stark contrast to most other languages (including ANSI-C), which make special allowances for the basic types by calling them scalar types.

Most languages that have pointers also have the concept of an empty pointer. In Easy-C, the empty pointer concept is replaced with a very similar concept of pointing to a Null object. Every type has one (and only one) null object named "Null". Thus, "Null@Unsigned" is the null object for the type "Unsigned" and "Null@Vector3" is the null object for the type "Vector3".

In Easy-C, all variables, arguments, and record fields are initialized to point to the null object of the appropriate type. Complex types like records have each of their individual field values initialized to point to the appropriate null object. This is in stark contrast to other languages that leave variables and records uninitialized.
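
A rough model of null objects, written in Python, is shown below. It borrows the Point3 record with three Double fields from section 4.2. The value stored inside the null Double (0.0) is an assumption, since this reference does not say what a null object contains.

```python
class Double:
    """Stand-in for the Easy-C Double type."""

    def __init__(self, value):
        self.value = value


# Exactly one null object per type. Its contents (0.0) are an assumption.
Double.Null = Double(0.0)


class Point3:
    """Stand-in for a record with three Double fields."""

    def __init__(self):
        # Every field starts out pointing at the null object of its type.
        self.x = Double.Null
        self.y = Double.Null
        self.z = Double.Null


Point3.Null = Point3()

p = Point3()  # roughly "p :@= New@Point3()"
```

In this model a freshly created record is a distinct object from the type's null object, but all of its fields point at null objects until they are assigned.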

In short, 1) everything in Easy-C is a strongly typed object, 2) variables, arguments, and record fields are strongly typed, 3) variables, arguments and record fields always point to an object of the same strong type, and 4) assignment overwrites the previous object pointer value with a new object pointer.

The object model presented here is substantially simpler than that of most other programming languages.

4. Declarations

Declarations always start at the beginning of a line with no preceding white space. In general, an Easy-C file is broken into four sections:

  1. import declarations
  2. new type definitions
  3. object definitions
  4. routine definitions
The most important import declaration is the "library" declaration, although there are a few extra import declarations used to interface to ANSI-C code. The "define" declaration is used to define a new type. The "global" and "constant" declarations are used to define new globally accessible objects. A "routine" declaration is used to define a new routine.

4.1 library Declaration

The syntax of the library declaration is:

    library library_name				
where
library_name
is the name of the library to access.
The declaration causes all of the types and routines defined in library_name to become available for use in the current Easy-C file. In general, this means that the types and routines defined in the file library_name.ezc are made available for use.

4.2 define Declaration

There are two basic flavors of the "define" declaration: 1) record/variant definition, and 2) enumeration definition.

The syntax for the enumeration define is:

    define enumeration_type_name
	enumeration
	    item_1
	    item_2
	    ...
	    item_n
	generate routine_names				
where
enumeration_type_name
is the name of the new Enumeration type,
item_1, item_2, ..., item_n
are the names of new constants,
routine_names
is a comma separated list of routine names. {Currently, this list is empty.} The generate line is optional.

The enumeration define declaration defines a new simple type named enumeration_type_name. By convention, the first letter of the new type name is capitalized. Each of the listed items is defined as a global constant item_1@enumeration_type_name, ..., item_n@enumeration_type_name.

The syntax for the record/variant define is fairly complicated. The syntax is presented first and then is subsequently explained.

The syntax for the top level record/variant define declaration is:

    define record_type_name
	record_variant_clause_1
	record_variant_clause_2
	...
	record_variant_clause_n
	generate routine_names				
where
record_type_name
is the name of the new record/variant type. This type can be either simple or parameterized.
record_variant_clause_1, ..., record_variant_clause_n
are described below,
routine_names
is a comma separated list of routine names. Currently, only parse and traverse are supported as routine names. The generate line is optional.

The syntax for a record clause is:

	record
	    field_name_1 type_1
	    field_name_2 type_2
	    ...
	    field_name_n type_n		
where
field_name_1, ..., field_name_n
are unique symbols that name each field in the record, and
type_1, ..., type_n
are the types associated with each field. Each type can be a simple, parameterized, or routine type.

The syntax for a variant clause is:

	variant select_name select_type_name
	    field_name_1 type_1
	    field_name_2 type_2
	    ...
	    field_name_n type_n		
where
select_name
is a symbol that is used to access the type selector,
select_type_name
is a simple type that is defined for the type selector,
field_name_1, ..., field_name_n
are unique symbols that name each field in the variant, and
type_1, ..., type_n
are the types associated with each field. Each type can be a simple, parameterized, or routine type.

The following example shows a simple record declaration:

    define Point3 # Point in 3-space
        record
            x Double # X coordinate
            y Double # Y coordinate
            z Double # Z coordinate
This defines a new type named Point3. This new type is a record with three fields named x, y, and z, all of type Double. This declaration also defines the routine New@Point3 and the global object Null@Point3. The following code shows this type being used:
    p1 :@= New@Point3()	# Create and return a new point
    p1.x := 1.0		# Initialize x field
    p1.y := 2.0		# Initialize y field
    p1.z := p1.x + p1.y	# Initialize z field
    p2 :@= Null@Point3	# Initialize p2 to a Null@Point3	

The following example shows a variant declaration:

    define Number               # General purpose number
        variant kind Number_Kind
            double Double       # Double precision float
            float Float         # Single precision float
            integer Integer     # Signed 32-bit number
            unsigned Unsigned   # Unsigned 32-bit number
This declaration defines a new type named Number that has 5 fields named kind, double, float, integer, and unsigned. It also defines a new enumeration type named Number_Kind.

4.3 Object Declarations (global, constant, external)

There are three kinds of object declarations -- global, constant, and external declarations.

The syntax of a global declaration is:

    global variable_name@type			
where
variable_name
xxx
type
xxx
xxx

The syntax of the constant declaration is:

    constant name@type = expression	
where
name
xxx
type
xxx
xxx

The syntax of the external declaration is:

    external variable_name@type		
where
variable_name
xxx
type
xxx
xxx

4.4 routine Declaration

The routine declaration is the workhorse of Easy-C. Most Easy-C programs are dominated by routine declarations.

The syntax of the routine declaration is:

    routine routine_name@routine_type
	takes argument_name_1 argument_type_1
	...
	takes argument_name_N argument_type_N
	takes_nothing
	returns return_type
	returns_nothing

	nested_statements				
where
routine_name
is the routine name. By convention the routine name is in all lower case.
routine_type
is the routine type. It can be either a simple type or a parameterized type. If it is a parameterized type, each of the parameters must be a simple type name. By convention, the first letter of a type is capitalized.
argument_name_1, ..., argument_name_N
are the argument names. Each argument name becomes a local variable that is accessible throughout the entire routine body.
argument_type_1, ..., argument_type_N
are the argument types. Each argument type can be a simple type, a parameterized type, or a routine type. Nested parameter types are permitted.
return_type
is the routine return type. If the routine does not return a type, returns_nothing is specified instead.
nested_statements
are the nested statements that are executed when the routine is invoked.
A routine must have either one or more takes clauses or a single takes_nothing clause. A routine must have either a returns clause or a returns_nothing clause. By convention, a short comment occurs after the returns or returns_nothing clause that describes what the routine does.

The routine main@Easy_C is interesting in that it is the first routine in a program that is executed.

4.5 Miscellaneous Declarations

There are a few extra declarations that need additional documentation:

include Declaration
includes a file, but not as a library
require Declaration
requires a file, but not as a library
defines_prefix Declaration
used for extracting macro values from C header files.

5. Statements

Each statement starts with a keyword. The assign statement is the only exception to this rule.

5.1 assert Statement

The syntax of the assert statement is:

    assert logical_expression				
where
logical_expression
is an expression that returns type Logical (i.e. true or false).
If logical_expression evaluates to false the program execution is immediately terminated with a fatal run time error. The purpose of this statement is for the programmer to inform anybody who reads the code about code invariants.

5.2 assign Statement

The assign statement occurs whenever the statement has a := or :@= on the line. When the compiler sees either of those two tokens, it inserts an invisible {assign} at the beginning of the line.

The syntax of the assign statement is:

    left_expression := right_expression	
or
    variable :@= right_expression		
where
left_expression
Left expression is constrained to be one of exactly three formats:
variable
yyy
expression.field_name
yyy
expression1[expression1, ...]
yyy
right_expression
xxx
variable
xxx

5.2 break Statement

The syntax of the break statement is:

    break number					
where
number
is an optional break level number. This number specifies how many levels of loops to break out of. If number is not specified, it defaults to 1, which breaks out of one level of looping.
This statement is only legal when it occurs inside of a while statement. A break statement outside of a while statement generates a compiler error.

5.3 call Statement

The syntax of the call statement is:

    call expression					
where
expression
is an expression that is evaluated for its side-effects only. Most of the time this means that expression is a single routine invocation, but other more complicated expressions are permitted.
Any return value is ignored.

5.4 continue Statement

The syntax of the continue statement is:

    continue number					
where
number
is an optional continue level number. This number specifies how many levels of loops to continue to. If number is not specified, it defaults to 1, which continues the innermost loop level.
This statement is only legal when it occurs inside of a while statement. A continue statement outside of a while statement generates a compiler error.

5.5 do_nothing Statement

The syntax of the do_nothing statement is:

    do_nothing							
When executed, this statement does not do anything. Its purpose is for the programmer to inform the reader that no code is needed at the specified location. This happens fairly commonly after a case clause in a switch statement.

5.6 if Statement

The syntax of the if statement is:

    if logical_expression_1
	nested_statements_1
    else_if logical_expression_2
	nested_statements_2
    ...
    else_if logical_expression_N
	nested_statements_N
    else
	nested_statements_last
where
logical_expression_1, ..., logical_expression_N
are expressions that evaluate to a value of type Logical.
nested_statements_1, ..., nested_statements_N
are the one or more nested statements that are executed when the logical expression immediately above returns true.
nested_statements_last
are the one or more nested statements that are executed if none of the logical expressions return true.
Each of the logical expressions is evaluated in sequence until the first one returns true. The nested statements immediately under that expression are then executed, after which the statement is finished and execution resumes immediately after the if statement. If none of the expressions evaluates to true, the nested statements under the else clause are executed instead. The else_if and else clauses are optional. The only required portions of an if statement are the first expression (i.e. logical_expression_1) and the nested statements immediately following it.

5.7 return Statement

The syntax of the return statement is:

    return expression					
where
expression
is an expression that is evaluated for the routine return value.
A return statement can occur anywhere in the nested statements that make up the routine body. Upon execution, the routine immediately terminates and returns to its caller. If the routine is supposed to return a value, expression must be present and evaluate to a type that matches the returns clause in the routine header. If the routine header specifies returns_nothing, expression must not be present.

5.8 switch Statement

The syntax of the switch statement is:

    switch enumerate_expression
      all_cases_required
      case item_1a, ..., item_1n
	nested_statements_1
      ...
      case item_Ma, ..., item_Mm
	nested_statements_M
      default
	nested_statements_default			
where
enumerate_expression
is an expression that evaluates to a type that is an enumeration type,
item_1a, ..., item_1n
are members of the enumeration type. There must be at least one enumeration item after the case; all others are optional.
nested_statements_1, ..., nested_statements_M
are the nested statements that are executed when enumerate_expression evaluates to one of the items listed in the preceding case clause.
nested_statements_default
are the nested statements that are executed if enumerate_expression does not match any case items.
The switch statement must have at least one case or one default clause. If all_cases_required is present, every possible enumeration item must be listed in one or more case clauses; in addition, all_cases_required and default make no sense in the same switch statement. If there is no default clause present, and enumerate_expression does not evaluate to an item listed in a case clause, then no nested statements are executed at all; otherwise, exactly one block of nested statements is executed depending upon the value of enumerate_expression.
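
The clause rules above can be expressed as a small checker. This is a sketch in Python; check_switch and the Color enumeration are hypothetical, and whether the Easy-C compiler reports each of these situations as an error is an assumption.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical enumeration used only for this example."""
    RED = 1
    GREEN = 2
    BLUE = 3


def check_switch(enumeration, case_lists, has_default=False,
                 all_cases_required=False):
    """Return a list of problems found in the clauses of a switch statement.

    case_lists holds one list of enumeration items per case clause.
    """
    problems = []
    if not case_lists and not has_default:
        problems.append("switch needs at least one case or default clause")
    if any(len(items) == 0 for items in case_lists):
        problems.append("each case needs at least one enumeration item")
    if all_cases_required and has_default:
        problems.append("all_cases_required and default should not be combined")
    if all_cases_required:
        # Every item must be listed in one or more case clauses.
        listed = {item for items in case_lists for item in items}
        for item in enumeration:
            if item not in listed:
                problems.append("missing case for " + item.name)
    return problems
```

A switch over Color with the clauses "case RED, GREEN" and "case BLUE" passes the all_cases_required check, while one with only "case RED" is reported as missing GREEN and BLUE.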

5.9 while Statement

The syntax of the while statement is:

    while logical_expression
	nested_statements				
where
logical_expression
is an expression that evaluates to type Logical (i.e. it returns true or false).
nested_statements
are the nested statements that are executed each time logical_expression evaluates to true.
This statement repeatedly evaluates logical_expression until the first time it returns false. Each time logical_expression returns true, nested_statements are executed.

The break, continue, and return statements can all be used to break out of a while loop.

6. Expressions

The table below expresses the precedence of operators in Easy-C:

    Prec.  Operators              Assoc.  Routines
    14     t[ t]                  left
    13     ( @( ) i[ i] @ .       left    fetch_#(), store_#(), field_set(), field_get()
    12     u- u! u+ u~            right   negate(), not()
    11     * / %                  left    multiply(), divide(), remainder()
    10     + -                    left    add(), minus()
    9      << >>                  left    left_shift(), right_shift()
    8      &                      left    and()
    7      ^                      left    xor()
    6      |                      left    or()
    5      < > <= >= != = == !==  left    equal(), less_than(), greater_than(), identical()
    4      &&                     left
    3      ||                     left
    2      ,                      left
    1      := :@=                 left
The operators that are preceded by a letter need a little more discussion. The 't[' and 't]' refer to square brackets used for type parameters. Conversely, 'i[' and 'i]' refer to square brackets used as an indexing operator (e.g. Array fetch and store). Finally, 'u-', 'u!', 'u+', and 'u~' refer to unary operators. Unary operators are the only ones that group right to left (e.g. ---a is the same as -(-(-a))). All other operators associate left to right.

For those of you that are familiar with the operator precedence of ANSI-C, you should be warned that there are a few differences between Easy-C and ANSI-C precedence. In particular, the relational operators in Easy-C have a lower precedence than in ANSI-C. In addition, the precedence of comma and assignment are swapped.
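
To see what the lower relational precedence means in practice, the sketch below implements a small precedence-climbing parser in Python that adds explicit parentheses to an expression. It covers only the unary operators and a subset of the binary operators from the table (levels 3 through 12); the function name parenthesize is hypothetical, and this is not how the Easy-C compiler is known to parse.

```python
import re

# Binary operator precedence, taken from the table above (subset).
BINARY_PRECEDENCE = {
    "*": 11, "/": 11, "%": 11,
    "+": 10, "-": 10,
    "<<": 9, ">>": 9,
    "&": 8, "^": 7, "|": 6,
    "<": 5, ">": 5, "<=": 5, ">=": 5, "!=": 5, "==": 5,
    "&&": 4, "||": 3,
}
UNARY_OPERATORS = {"-", "!", "+", "~"}
TOKEN = re.compile(r"\s*(<<|>>|<=|>=|==|!=|&&|\|\||\w+|[-+*/%&^|<>!~()])")


def parenthesize(text):
    """Return text with parentheses that make the grouping explicit."""
    tokens = TOKEN.findall(text)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = tokens[position]
        position += 1
        return token

    def parse_operand():
        token = advance()
        if token in UNARY_OPERATORS:
            # Unary operators group right to left.
            return "(" + token + parse_operand() + ")"
        if token == "(":
            inner = parse_binary(0)
            advance()  # consume the matching ")"
            return inner
        return token

    def parse_binary(minimum):
        left = parse_operand()
        while peek() in BINARY_PRECEDENCE and BINARY_PRECEDENCE[peek()] >= minimum:
            operator = advance()
            # Left associative: the right side may only hold tighter operators.
            right = parse_binary(BINARY_PRECEDENCE[operator] + 1)
            left = "(" + left + " " + operator + " " + right + ")"
        return left

    return parse_binary(0)
```

With this table, "a & b == c" groups as ((a & b) == c), whereas ANSI-C groups the same text as a & (b == c).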

In ANSI-C, the evaluation order for routine arguments is unspecified. In Easy-C, left to right evaluation is strictly enforced.


Copyright © 2007 by Wayne C. Gramlich. All rights reserved.