This is the language specification for µCL. It is currently a work in progress.

The µCL Language Specification

1 Introduction

This language specification follows the same overall structure as other language specifications: it discusses lexical issues first, followed by global declarations, procedure declarations, statements, and expressions.

2 Lexical Considerations

The µCL language consists of a number of declarations in a file. Each declaration starts with a declaration name followed by zero, one, or more arguments and is ended by a new-line. Each argument is separated from the next by one or more spaces and/or tabs. For example,

    port porta a bits_only read_write_static			
specifies the `port' command with 4 arguments. The first argument is `porta', the second is `a', the third is `bits_only' and the last is `read_write_static'. The declaration is terminated by a new-line. (There are no semi-colons in this language.)

Comments can be entered on a line by using a sharp character (`#') followed by any text until the end of the line.

2.1 Braces and Parentheses

While most declarations occur on a single line, the use of curly-braces and parentheses allows commands to span multiple lines. A matched pair of curly-braces or parentheses is treated as a single argument in µCL. Both curly-braces and parentheses can span multiple lines. An example should help to clarify this concept.

An `if-then-else' statement has the following form:

if expression then_body else else_body
The first argument is an expression, the second argument is the `then' clause statements (usually enclosed in curly braces), the third argument is the word `else' and the last argument is the `else' clause statements (again, usually enclosed in curly braces). An example `if-then-else' statement is shown below:
    if (a < 10) {
	a := a + 10
    } else {
	a := a - 10
    }
								
In this example, the first argument is `(a < 10)', the second argument is `{ a := a + 10 }', the third argument is `else', and the fourth argument is `{a := a - 10}'. Note that the curly-braces for the second and fourth arguments actually span multiple lines.

A word of warning about curly-braces -- they must always match up. This applies everywhere, including inside comments and character literals. Unmatched curly braces cause all sorts of strange problems with the parser. For example, the extra `{' in the comment below makes the code unparsable.

	if (a < 10) {
	    # Unmatched brace {
	    a := a + 10
	} else {
	    a := a - 10
	}							
Just remember, the curly braces can be nested, but they MUST match up.
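
The grouping rule can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not the actual µCL parser: the function name `split_arguments' is hypothetical, comment handling is left out, and braces inside comments are counted just like any others, which is exactly why the example above fails.

```python
def split_arguments(text):
    """Split one declaration into arguments (illustrative model only).

    Spaces and tabs separate arguments. A new-line ends the declaration
    unless a curly-brace or parenthesis group is still open.
    """
    closers = {"{": "}", "(": ")"}
    args, current, stack = [], "", []
    for ch in text:
        if ch in closers:
            # Opening delimiter: remember which closer we now expect.
            stack.append(closers[ch])
            current += ch
        elif stack and ch == stack[-1]:
            stack.pop()
            current += ch
        elif ch in "})":
            raise ValueError("unmatched closing " + ch)
        elif not stack and ch in " \t":
            # Top-level white space ends the current argument.
            if current:
                args.append(current)
                current = ""
        elif not stack and ch == "\n":
            break  # top-level new-line terminates the declaration
        else:
            current += ch
    if stack:
        raise ValueError("missing " + stack[-1])
    if current:
        args.append(current)
    return args
```

Under this model the `if-then-else' example splits into five arguments (the keyword `if' plus the four arguments listed above), while the version with the stray `{' in a comment is rejected.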

2.2 Symbols and Literals

In µCL, a symbol starts with a letter and is followed by zero, one, or more, letters, digits or underscores. Underscores are treated as word separators, so they may not occur at the end of a symbol; nor can two underscores occur next to one another within the middle of a symbol.
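
As a rough illustration, the symbol rule can be written as a regular expression. The Python sketch below is an assumption about how the rule could be checked, not code taken from the compiler; `is_symbol' is a hypothetical helper name.

```python
import re

# A letter, then any number of letters or digits, each optionally
# preceded by ONE underscore. This rules out a trailing underscore
# and two adjacent underscores.
SYMBOL = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*")

def is_symbol(text):
    """Return True if text is a well-formed symbol under the rule above."""
    return SYMBOL.fullmatch(text) is not None
```

With this pattern `read_write_static' is accepted, while `a_', `a__b' and `1abc' are rejected.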

Basically all constants in µCL start with a digit. The various forms are listed below:

Decimal Number (1234)
A decimal number starts with the digits 1 through 9. Decimal numbers do not start with the digit 0.
Hexadecimal Number (0xFa)
A hexadecimal number starts with "0x" or "0X" followed by one or more hexadecimal digits (0-9, A-F, and a-f).
Binary Literal (0b'1011001')
A binary literal starts with "0b" or "0B" followed by a binary number in single quotes.
Octal Literal (0o'67')
An octal literal starts with "0o" or "0O" followed by an octal number in single quotes.
Decimal Literal (0d'89')
A decimal literal starts with "0d" or "0D" followed by a decimal number in single quotes.
Hexadecimal Literal (0h'9f')
A hexadecimal literal starts with "0h" or "0H" followed by a hexadecimal number in single quotes.
Character Literal (0c'#')
A character literal starts with "0c" or "0C" followed by a printable character in single quotes. Control characters, spaces, tab, and single quote are not permitted within the single quotes.
String Literal (0s'string')
A string literal starts with "0s" or "0S" followed by a sequence of printing characters enclosed in single quotes. Control characters, tab, and single quote are not permitted within the single quotes. String literals currently only occur inside a string_constants declaration.
Random Number Literal (0r'16')
A random number literal is a sequence of one or more random numbers. The number enclosed in the single quotes is a decimal number that specifies how many random bytes are in the literal. Random number literals currently only occur inside of a string_constants declaration.
Please note, only the single quote (') character is permitted in literals. Do not use the accent grave (`) character. Thus, 0b'10101' is legal but 0b`10101' is not because the character after the "b" is an accent grave.
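
The numeric forms listed above can be decoded as in the following Python sketch. It is illustrative only: `literal_value' is a hypothetical name, string and random number literals are not handled, and the checks follow the wording of this section rather than the real lexer.

```python
import re

# Digits allowed inside the quotes for each quoted numeric form.
DIGITS = {"b": "01", "o": "01234567", "d": "0123456789",
          "h": "0123456789abcdefABCDEF"}
BASES = {"b": 2, "o": 8, "d": 10, "h": 16}
QUOTED = re.compile(r"0([bBoOdDhHcC])'([^']*)'")

def literal_value(text):
    """Return the integer value of a numeric or character literal."""
    if re.fullmatch(r"[1-9][0-9]*", text):        # decimal number
        return int(text)
    if re.fullmatch(r"0[xX][0-9A-Fa-f]+", text):  # hexadecimal number
        return int(text[2:], 16)
    match = QUOTED.fullmatch(text)                # quoted forms: 0b'..', 0c'.'
    if match is None:
        raise ValueError("not a numeric literal: " + text)
    kind, body = match.group(1).lower(), match.group(2)
    if kind == "c":
        # Exactly one printable character; space is not permitted.
        if len(body) != 1 or body == " " or not body.isprintable():
            raise ValueError("bad character literal: " + text)
        return ord(body)
    if not body or any(ch not in DIGITS[kind] for ch in body):
        raise ValueError("bad digits in literal: " + text)
    return int(body, BASES[kind])
```

A literal written with an accent grave, such as 0b`10101', does not match the quoted pattern and is rejected.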

2.3 Punctuation

The following other tokens are recognized in µCL:

+
Addition
-
Subtraction
*
Multiplication
/
Division
%
Remainder
~
Bit complement
!
Logical not
@
Bit selection
^
Exclusive OR (XOR)
&
Bit-wise AND
&&
Conditional AND
=
Equal
!=
Not equal
<
Less than
<<
Rotate Left
<=
Less than or Equal
>
Greater than
>>
Rotate Right
>=
Greater than or Equal
,
Separator
|
Bit-wise OR
||
Conditional OR
:=
Assignment

At some point in the future, the following additional assignment operators will be added:

:+=
Addition Assign
:-=
Subtraction Assign
:*=
Multiplication Assign
:/=
Division Assign
:%=
Remainder Assign
:^=
Exclusive OR (XOR) Assign
:&=
Bit-wise AND Assign
:<<=
Rotate Left Assign
:>>=
Rotate Right Assign
:|=
Bit-wise OR Assign

3 Types

The long-term plan for µCL is to support 6 types -- bit, byte, byte array, integer, float, and string. Initially, only the bit and byte types are supported.

3.1 Bit

A bit can contain the values `0' and `1'. Bits can be stored in variables and transferred via assignment statements. A bit value cannot be directly assigned to a byte value.

Some examples of bit code are shown below:

    variable a bit
    variable b bit
    variable c bit

    a := 1
    b := c
    c := a && c		# a AND c
    a := !b		# NOT b
								

3.2 Byte

A byte can contain the values from `0' through `255' inclusive. Bytes are unsigned in µCL. Bytes can be stored in variables.

Some example code with bytes is shown below:

    variable a byte
    variable b byte
    variable c byte

    a := 23
    b := a
    c := a + b << 3
								

3.3 Byte Array

A byte array is a fixed sized array of one or more bytes. The array can be indexed into with a byte index value. There is no bounds checking when accessing a byte array.

Some sample code with byte arrays is shown below:

    variable buffer[10] byte	# Ten byte buffer
    variable index byte

    buffer[0] := 0
    index := 4
    buffer[index] := 8
    buffer[index + 1] := buffer[index]
    buffer[23] := 17		# Out of bounds!
								

3.4 String

A string is basically a sequence of bytes with a length. Read-only strings (i.e. constant strings) are stored in program memory. Read-write strings are stored in a byte array.

Some example code is shown below:

    variable buffer[10] byte
    variable text string
    variable chr byte
    variable length byte
    variable max byte

    text := "A string literal"
    chr := text[3]	# chr = 't'
    length := text.size	# length = 16
    max := text.limit	# max = 16

    text := buffer
    length := buffer.size	# length = 0
    max := buffer.limit		# max = 10
    text.size := 3
    text[0] := 0c'H'
    text[1] := 0c'i'
    text[2] := 0c'!'
								

3.5 Integer

{To be written}

3.6 Float

{To be written}

4 Declarations

There are a small number of µCL declarations. The first is the `processor' declaration that specifies the processor type and configuration word options. The `global' declaration defines a global bit or byte variable. The `port' and `pin' declarations specify the configuration of the I/O pins on the PIC processor. The `register' and `bind' declarations provide access to microcontroller specific bits and registers. Finally, the `procedure' declaration specifies a µCL procedure.

4.1 The `processor' Declaration

The `processor' declaration has the following form:
processor name properties...
name
is the name of the microcontroller chip (e.g. `pic16f84') and
properties...
is a list of configuration word properties for the microcontroller. The form of a property is `property_name=property_value'.
An error results for each property needed by the microcontroller that is not set by the `processor' properties.

4.2 The `constant' Declaration

The `constant' declaration has the following form:

constant name constant_expression...
name
is the user supplied constant name, and
constant_expression...
is an expression consisting of numbers and other previously defined constants.

The `constant' declaration uses 32-bit arithmetic to do all computations. Any constants that are used as microcontroller constants must be in the range 0 through 255 inclusive.

The expressions are evaluated in sequence. No forward references to constants that are defined later on are permitted.
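
The ordering rule can be illustrated with a short Python sketch. It is a model under stated assumptions, not the compiler's evaluator: the helper names are hypothetical, Python's own `eval' stands in for the µCL expression evaluator (so only operators such as `+', `-' and `*' carry the same meaning), and the 32-bit width is not modeled.

```python
def evaluate_constants(declarations):
    """Evaluate (name, expression) pairs strictly in source order.

    Each expression may only mention constants defined earlier;
    anything else is reported as an error.
    """
    values = {}
    for name, expression in declarations:
        try:
            # No builtins, and only previously defined constants are visible.
            value = eval(expression, {"__builtins__": {}}, dict(values))
        except NameError as error:
            raise ValueError(name + ": undefined or forward reference") from error
        values[name] = value
    return values

def as_byte(name, values):
    """Range check for a constant used as a microcontroller constant."""
    value = values[name]
    if not 0 <= value <= 255:
        raise ValueError(name + " is outside 0 through 255")
    return value
```

For example, defining `cr' as 13 and `lf' as 10 before a third constant `cr + lf' works, whereas a constant that mentions a name defined further down is rejected.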

4.3 The `global' Declaration

The `global' declaration introduces a global variable that is accessible from all procedures. It has the following form:

global name [ '[' array_size ']' ] type
name
is the name of the global variable,
array_size
is an optional array size constant enclosed in square brackets, and
type
is either `bit' or `byte'.

4.4 The `port' Declaration

The `port' declaration assigns a user name to an I/O port of the PIC. The `port' declaration has the following form:

port name port_letter type access
name
is the user name given to the port.
port_letter
is the port letter -- one of A, B, C, D, or E depending upon how many ports are implemented on the specific version of the microcontroller.
type
is the port type -- one of `byte_only', `bits_only', `bits_and_byte', or `unused'. These are discussed further below.
access
specifies the port access -- one of `read_only', `write_only', `read_write_static', `read_write_auto', `read_write_manual', and `none'. These are discussed further below.

An I/O port can be used as an 8-bit byte, or a bunch of single bit quantities, or both. When type is set to `byte_only', the I/O port is only accessible as an 8-bit quantity; in this situation, name is defined as a global variable of type `byte'. When type is set to `bits_only', the I/O port is accessed via individual bits; in this situation name is not defined as a global variable. When type is set to `bits_and_byte', both bit and byte access is permitted; in this situation name is defined as a global variable. Lastly, if type is set to `unused', the port is not used as either a byte or individual bits; in this situation name is not defined as a global variable.

An I/O port can be read from and written to. When an I/O port pin is only read from it is said to be `read_only'. Conversely, when an I/O port pin is only written to, it is said to be `write_only'. Some I/O port pins are bi-directional in that sometimes they write data and sometimes they read data. These pins are called `read_write' pins.

If all of the pins of the I/O port are only read from, access is set to `read_only'. If all of the pins of the I/O port are only written to, access is set to `write_only'. If some of the pins of an I/O port are only read from and others are only written to, but none of them are bi-directional, access is set to `read_write_static'. If any pin of the I/O port is bi-directional, the entire port is marked as either `read_write_auto' or `read_write_manual'. When the port is in `read_write_auto', additional code is generated to set the bit directions whenever the port is read from or written to. If the port is in `read_write_manual', a `direction' statement is used to explicitly specify the direction of each pin of the port; no code is generated to set the bit direction automatically.

If type is `unused', access must be set to `none'.
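
The type and access rules above are simple enough to check mechanically. The Python sketch below is an illustration only; `check_port' is a hypothetical helper, and accepting lower-case port letters is an assumption based on the `port porta a ...' example earlier in this document.

```python
# For each port type: is `name' defined as a global variable?
PORT_TYPES = {
    "byte_only": True,
    "bits_only": False,
    "bits_and_byte": True,
    "unused": False,
}
PORT_ACCESS = {"read_only", "write_only", "read_write_static",
               "read_write_auto", "read_write_manual", "none"}

def check_port(name, port_letter, port_type, access):
    """Validate a `port' declaration; return True if name becomes a global."""
    if port_letter.upper() not in ("A", "B", "C", "D", "E"):
        raise ValueError("unknown port letter: " + port_letter)
    if port_type not in PORT_TYPES:
        raise ValueError("unknown port type: " + port_type)
    if access not in PORT_ACCESS:
        raise ValueError("unknown port access: " + access)
    if port_type == "unused" and access != "none":
        raise ValueError("an unused port must have access `none'")
    return PORT_TYPES[port_type]
```

Here `port porta a bits_only read_write_static' passes the check and defines no global variable, since the port is accessed only through individual pins.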

4.5 The `pin' Declaration

The `pin' declaration is tightly bound to the `port' declaration and has the following form:

pin name port_name bit_number access
name
is the name of the pin,
port_name
is the port name the pin is attached to,
bit_number
is the bit number for the pin (i.e. a number between 0 and 7 inclusive), and
access
is one of `read_only', `write_only', `read_write_auto', `read_write_manual', or `unused'.

The name argument is used as a global variable of type `bit' (except when access is `unused'.) The port_name must reference the name of a port declared in a previous `port' declaration. The bit_number specifies the bit in the I/O port that is being used (i.e. a number between 0 and 7 inclusive). Actually, bit_number can be an arbitrary constant expression.

If the pin is for input only, access is set to `read_only'. If the pin is for output only, access is set to `write_only'. If the pin is bidirectional, access is set to either `read_write_auto' or `read_write_manual'. When access is set to `read_write_auto', additional code is generated to automatically change the bit direction depending upon whether the name variable is read or written. Conversely, if access is `read_write_manual', the direction of the pin is controlled using the `direction' statement. Finally, if the pin is unused, access is set to `unused'.

When a pin variable is assigned to, the generated code is very careful not to introduce any spurious transitions on the output pin. Consider the code fragment below:

    port porta a bits_only read_write_static
    pin tx porta 0 write_only
    pin rx porta 1 read_only

    procedure test {
	arguments_none
	returns_nothing
	variable temp bit

	# Later on in some procedure ...
	
	temp := 1
	tx := 1

	# ...

	tx := temp
    }
This code will generate the following code:
	; tx := 1
	bsf porta,0
	; ...
	; tx := temp
	btfss temp,0
	bcf porta,0
	btfsc temp,0
	bsf porta,0						
This code will only cause a transition on the output pin if the value of `temp' is different from the current value of `tx'. Beyond that, there are no spurious transitions.

4.6 The `register' Declaration

The form of the `register' declaration is:

register name constant_expression...
name
is the name of the register, and
constant_expression...
is a constant expression that evaluates to the number of the register.
A `register' declaration is similar to a `port' declaration, but there is no need to specify the I/O direction of the port.

If the register number is greater than 31 (for 12-bit PICs) or greater than 127 (for 14-bit PICs), the code generator will emit the correct code to do the requisite bank switching.
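
The thresholds of 31 and 127 suggest that a register number splits into a bank number and an offset within the bank. The Python sketch below shows that arithmetic under the assumption that banks are consecutive blocks of 32 registers (12-bit parts) or 128 registers (14-bit parts); `bank_and_offset' is a hypothetical name, and the actual bank select instructions are not modeled.

```python
def bank_and_offset(register, core_bits):
    """Split a register number into (bank, offset within bank).

    core_bits is 12 or 14, selecting a 5-bit or 7-bit register
    address field respectively (assumed from the limits 31 and 127).
    """
    address_bits = {12: 5, 14: 7}[core_bits]
    mask = (1 << address_bits) - 1
    return register >> address_bits, register & mask
```

With this split, register 0xb on a 14-bit part lies in bank 0 and needs no switching, while register 0x81 is offset 1 in bank 1.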

4.7 The `register_array' Declaration

The form of the `register_array' declaration is:

register_array name first_constant_expression size_constant_expression
name
is the name of the register array,
first_constant_expression
is a constant expression that evaluates to the number of the first register in the array, and
size_constant_expression
is a constant expression that evaluates to the number of registers in the array.
A `register_array' declaration declares an array but does not allocate any storage for the array. It is assumed that the user knows what they are doing when they use this declaration.

4.8 The `bind' Declaration

The form of the `bind' declaration is:

bind new_name old_name [ `@' constant_expression... ]
new_name
is the new variable name,
old_name
is the old variable, port, or register name,
constant_expression...
is the bit selection expression, which must evaluate to a constant between 0 and 7 inclusive.
A `bind' declaration gives an alternate name for an existing global variable, register, or port. If the selector expression is specified, an individual bit is bound to new_name.

Some example code that defines the interrupt control register for a PIC16F84 is shown below:

    register intcon 0xb
    bind gie intcon @ 7
    bind eeie intcon @ 6
    bind t0ie intcon @ 5
    bind inte intcon @ 4
    bind rbie intcon @ 3
    bind t0if intcon @ 2
    bind intf intcon @ 1
    bind rbif intcon @ 0					

When a selector expression is applied to a port name, the result is similar to a `pin' declaration. However, the µCL compiler is blissfully unaware of any of the subtle issues of managing the I/O direction registers. In addition, the code can be glitchy. For example:

    port porta a bits_only read_write_static
    bind tx porta @ 0

    procedure test {
	arguments_none
	returns_nothing
	variable temp bit
	
	temp := 1
	#...
	tx := temp
    }								
will generate code that looks like:
    ; temp := 1
    bsf test__temp,0
    ; ...
    ; tx := temp
    bcf porta,0
    btfsc test__temp,0
    bsf porta,0
which will `glitch' to zero for three instruction cycles whenever the `temp' variable is a 1. In general, it is a bad idea to use the `bind' declaration on a port variable.
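
The difference between the two code sequences can be seen by stepping through them. The Python sketch below is a toy model of the four bit-oriented instructions involved, not a PIC simulator. The `pin' sequence is the four-instruction one shown in the `pin' section; the `bind' sequence is assumed to be clear, test, set, which matches the glitch described above. All names in the sketch are hypothetical.

```python
def run(program, bits):
    """Execute bsf/bcf/btfss/btfsc on one-bit cells named in `bits'.

    Returns the value of bits["tx"] before the first instruction and
    after every executed instruction.
    """
    history = [bits["tx"]]
    pc = 0
    while pc < len(program):
        opcode, name = program[pc]
        pc += 1
        if opcode == "bsf":
            bits[name] = 1
        elif opcode == "bcf":
            bits[name] = 0
        elif opcode == "btfss":      # skip next instruction if bit is set
            if bits[name] == 1:
                pc += 1
        elif opcode == "btfsc":      # skip next instruction if bit is clear
            if bits[name] == 0:
                pc += 1
        else:
            raise ValueError("unsupported opcode: " + opcode)
        history.append(bits["tx"])
    return history

def transitions(history):
    """Count how many times the recorded pin value changes."""
    return sum(a != b for a, b in zip(history, history[1:]))

PIN_ASSIGN = [("btfss", "temp"), ("bcf", "tx"), ("btfsc", "temp"), ("bsf", "tx")]
BIND_ASSIGN = [("bcf", "tx"), ("btfsc", "temp"), ("bsf", "tx")]

# tx is already 1 and temp is 1: the pin should not move at all.
steady = run(PIN_ASSIGN, {"tx": 1, "temp": 1})
glitchy = run(BIND_ASSIGN, {"tx": 1, "temp": 1})
```

In this model `steady' records no transitions, while `glitchy' drops to 0 and returns to 1, which is the spurious pulse the text warns about.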

4.9 The `string_constants' Declaration

The form of the string_constants declaration is:

string_constants strings_body
strings_body
is a list of 1 or more string constant declarations.
A string constant declaration has the following form:

string_name = string_component,...
string_name
is the name of the string, and
string_component
is either a quoted string constant (e.g. 0s'Hello') or a constant expression.

An example of the string_constants is shown below:

    ...

    constant cr 13
    constant lf 10
    string_constants {
	hello = 0s'Hello!', cr, lf
	goodbye = 0s'Goodbye!', cr, lf
    }

    procedure main {
	arguments_none
	returns_nothing

	variable char byte
	variable count byte
	variable index byte

	loop_forever {
	    # Send Hello:
	    index := 0
	    count_down count hello.size {
		call send_byte(hello[index])
		index := index + 1
	    }
	    # Send Goodbye:
	    index := 0
	    count_down count goodbye.size {
		call send_byte(goodbye[index])
		index := index + 1
	    }
	}
    }

    procedure send_byte {

    ...
								

4.10 The `origin' Declaration

The form of the origin declaration is:

origin constant_expression
where constant_expression
is the address at which to start generating code.
The origin declaration is used to place code in a different location in the microcontroller program memory.

For microcontrollers that have the concept of code banks (e.g. the PIC16F87x), the origin declaration uses the high order bits of the origin to implicitly set the code bank. It is up to the programmer to manage the placement of procedures within code banks using the origin declaration. The compiler will generate an error if the procedures within a given code bank spill over into another code bank.

The compiler is responsible for generating the appropriate additional instructions for placing calls from procedures in one code bank to procedures in a different code bank.

4.11 The `bank' Declaration

The form of the bank declaration is:

bank constant_expression
where constant_expression
is the register bank to select.
Some microcontrollers have more than one bank of general purpose registers (e.g. the PIC16F87x). The bank declaration is used to select which register bank to allocate registers from.

The programmer is responsible for ensuring that the registers within a given bank are not exhausted. The compiler will generate an error message if they are.

The compiler is responsible for generating the additional bank select instructions for accessing global variables that are defined in each register bank.

4.12 The `procedure' Declaration

The form of the procedure declaration is:

procedure name body
name
is the procedure name, and
body
is the procedure body.

The body consists of a sequence of statements enclosed in curly braces. The first statements must be either one or more `argument' statements or a single `arguments_none' statement. This is followed by either a `returns' statement or a `returns_nothing' statement. See the section on statements to find out more about these statements.

It is illegal to have two `procedure' declarations with the same name.

The procedure named `main' must be the first procedure declared in every program. It is always invoked at the beginning of microcontroller execution.

5 Procedure Declarations

Procedure declarations must occur before any executable statements in the procedure body. By convention, the `argument' and `arguments_none' statements occur before any `returns' and `returns_nothing' declarations. If present, the `recursive' and `uniform_delay' declarations occur after the return declarations.

5.1 The `argument' Statement

The `argument' statement has the following form:

argument name type
name
is the argument name, and
type
is the argument type -- currently one of `bit' or `byte'.

All `argument' statements must come as the first statements in a `procedure' body. The first `argument' statement corresponds to the first procedure argument and so forth. It is illegal to specify both an `argument' statement and an `arguments_none' statement in the same procedure.

5.2 The `arguments_none' Statement

The `arguments_none' statement has the following form:

arguments_none

This statement specifies that the procedure takes no arguments. A procedure must have either one or more `argument' statements or a single `arguments_none' statement. It is an error to specify neither. When the `arguments_none' statement is specified, it must be the first statement in the procedure body.

5.3 The `returns' Statement

The `returns' statement has the following form:

returns type...
type...
is the list of one or more types that the procedure returns.
The `returns' statement specifies the types of any returned values. Unlike most other programming languages, this programming language allows for the return of more than one value from a procedure call. The `returns' statement must immediately follow the preceding `argument' or `arguments_none' statement.

5.4 The `returns_nothing' Statement

The `returns_nothing' statement has the following form:

returns_nothing

The `returns_nothing' statement specifies that the procedure does not return any values. The `returns_nothing' statement must immediately follow any preceding `argument' or `arguments_none' statement. Either a `returns' or a `returns_nothing' statement must be specified.

5.5 The `recursive' Statement

{Recursive is not implemented yet.}

The `recursive' statement has the following form:

recursive

The `recursive' statement instructs the compiler to produce code that allows the procedure to be called recursively. In general, recursive procedures consume more memory locations than non-recursive ones. If present, the `recursive' statement must immediately follow the preceding `returns' or `returns_nothing' statement. The absence of a `recursive' statement indicates that a procedure is not recursive.

5.6 The `uniform_delay' Statement

The `uniform_delay' statement has the following form:

uniform_delay cycles
cycles
is the total number of instruction cycles that the procedure should take.

A `uniform_delay' statement instructs the compiler to produce a procedure that has a uniform execution delay. This means that conditional code generated for `if', `&&', and `||' will be padded with nop instructions to ensure that each path taken through the code takes exactly the same number of instruction cycles.
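
The padding idea can be sketched as simple arithmetic. The Python fragment below is a simplified illustration, not the compiler's algorithm: it assumes each branch has a known cycle count and ignores the cost of the test and jump instructions themselves. `pad_branches' is a hypothetical name.

```python
def pad_branches(then_cycles, else_cycles):
    """Return (then_nops, else_nops, total_cycles) for a two-way branch.

    The shorter path is padded with nop instructions so that both
    paths take the same number of cycles.
    """
    total = max(then_cycles, else_cycles)
    return total - then_cycles, total - else_cycles, total
```

For instance, a 3-cycle `then' path paired with a 7-cycle `else' path needs four nop instructions on the `then' side.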

5.7 The `variable' Statement

The `variable' statement has the following form:

variable name type
name
is the variable name, and
type
is the variable type -- currently one of `bit' or `byte'.

Each `variable' statement defines a variable that is local to the procedure body. It is not possible for one procedure to directly access the local variables of another procedure. There are no nested variable scopes in this language: a variable that occurs deep within some nested statements is accessible throughout the entire procedure.

6 Statements

Statements can only occur within the body of a `procedure' declaration, after all of the procedure declarations.

6.1 The Assemble Statement

The assemble statement has the following form:

assemble {assemble_statements}
assemble_statements
is a list of assembly statements to insert into the code.
The assembly statements have the following form:
opcode operand1 ...
For both the 12-bit and 14-bit PICs the following opcodes are supported:

    Opcode      Operands    Description
    addwf       f d         Add W and f
    andwf       f d         AND W and f
    andlw       k           AND k to W
    bcf         f b         Bit clear f
    bsf         f b         Bit set f
    btfsc       f b         Bit test f, skip if clear
    btfss       f b         Bit test f, skip if set
    call        a           Call subroutine at a
    clrf        f           Clear f
    clrwdt                  Clear watchdog timer
    clrw                    Clear W
    comf        f d         Complement f
    decf        f d         Decrement f
    decfsz      f d         Decrement f, skip if 0
    goto        a           Go to address a
    incf        f d         Increment f
    incfsz      f d         Increment f, skip if 0
    iorwf       f d         Inclusive OR W and f
    iorlw       k           Inclusive OR k to W
    movf        f d         Move f
    movwf       f d         Move W to f
    movlw       k           Move k to W
    movlw_low   k           Move low order 8-bits of k to W
    movlw_high  k           Move high order 8-bits of k to W
    nop                     No operation
    retlw       k           Return from subroutine with k in W
    rlf         f d         Rotate left through carry
    rrf         f d         Rotate right through carry
    sleep                   Enter low power mode
    subwf       f d         Subtract W from f
    swapf       f d         Swap nibbles in f
    xorwf       f d         Exclusive OR W and f
    xorlw       k           Exclusive OR k to W

The following additional instructions are available for the 14-bit PICs:

    Opcode      Operands    Description
    addlw       k           Add k to W
    addlw_low   k           Add low order 8-bits of k to W
    retfie                  Return from interrupt
    return                  Return from subroutine
    sublw       k           Subtract W from k

The following additional instructions are available for the 12-bit PICs:

    Opcode      Operands    Description
    tris        f           Move W into TRIS register f
    option                  Move W into OPTION register

Lastly, the following pseudo opcodes are available:

    Opcode      Operands    Description
    label       label_name  Define address label
    comment     {text}      Put a comment into code

For example, to set the pre-scaler in a PIC12C509, the following code could be used:

    ...
    constant option_bits 0xc0

    procedure mumble {
	arguments_none
	returns_nothing

	variable prescaler byte

	prescaler := ...

	assemble {
	    movlw option_bits
	    iorwf mumble__prescaler f
	    option
	}
    }
								

6.2 The Assignment Statement

The assignment statement has the following form:

variable, ... := expression, ...
variable, ...
is a list of variables being assigned to, and
expression, ...
is a list of expressions being evaluated prior to assignment.

The assignment statement is unique in that it is the only statement that does not start with a keyword. Instead, the parser prescans the line and, if it encounters a `:=', assumes the entire statement is an assignment statement.

The type of the variables and expressions must match up.

If variable is a bit variable defined by a `pin' declaration, the compiler is extra careful to make sure that the output does not "glitch" as a result of an assignment.

The assignment statement allows multiple variables to be assigned as a result of a single assignment statement. This is useful when there is a procedure that returns more than one value.

For multiple assignment, the compiler evaluates all of the expressions to the right before doing any variable assignments. Thus, in the following example,

a, b := b, a
results in the variables `a' and `b' being swapped.
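
The evaluate-first rule can be modeled as follows. This Python sketch is illustrative only; `assign' is a hypothetical helper, and expressions are represented as functions of the current variable values.

```python
def assign(variables, targets, expressions):
    """Model of multiple assignment: evaluate everything, then store."""
    # Step 1: evaluate every right-hand expression against the OLD values.
    values = [expression(variables) for expression in expressions]
    # Step 2: only now store the results into the target variables.
    for target, value in zip(targets, values):
        variables[target] = value

state = {"a": 1, "b": 2}
# Models: a, b := b, a
assign(state, ["a", "b"], [lambda v: v["b"], lambda v: v["a"]])
```

Because both right-hand values are captured before either store happens, `state' ends up with the two values exchanged. Storing `a' first and then reading it for `b' would instead leave both variables equal.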

6.3 The `call' Statement

The `call' statement has the following form:

call procedure(arguments...)
procedure
is the name of the procedure to be called.
arguments
is a list of zero, one, or more expressions to be evaluated and passed as arguments to the procedure.
The `call' statement is used to call procedures that have no return values (i.e. `returns_nothing').

6.4 The `count_down' Statement

The `count_down' statement has the following form:

count_down variable expression statements
variable
...
expression
...
statements
...
The `count_down' statement loops over statements expression times. variable is used as the counter. expression must be non-zero.

Many microcontrollers have some sort of decrement-and-skip instruction. The `count_down' statement generates code that uses such an instruction.
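
A plausible shape for such a loop is sketched below in Python. It assumes an 8-bit counter and a body that runs before the decrement-and-test, as a `decfsz'-style instruction would encourage; the text does not state the exact code that is generated, so treat this as a guess at why the count must be non-zero. The function name is hypothetical.

```python
def count_down(expression, body):
    """Run body `expression' times using decrement-then-test.

    Returns the number of times the body actually ran.
    """
    counter = expression & 0xFF      # 8-bit byte counter (assumed)
    iterations = 0
    while True:
        body()
        iterations += 1
        counter = (counter - 1) & 0xFF   # decrement, wrapping at 8 bits
        if counter == 0:                 # "skip if zero" leaves the loop
            break
    return iterations
```

With this shape a count of 3 runs the body three times, but a count of 0 wraps around to 255 on the first decrement and the body runs 256 times.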

6.5 The `delay' Statement

The `delay' statement has the following form:

delay cycles {statements}
where
cycles
is a constant expression that specifies the number of instruction cycles to be absorbed.
statements is a list of statements that are executed. Each statement in statements is compiled to have uniform execution time. Conditional code, like `if', `&&', and `||', is padded with nop instructions to cause the execution time to be uniform.

Each procedure called in statements must have a `uniform_delay' statement in its procedure header.

6.6 The `delay_loop' Statement

The `delay_loop' statement has the following form:

delay_loop {loop_body}
where
loop_body
is the body of statements that make up the delay loop.

The `delay_loop' statement can only occur within a procedure that has been declared `uniform_delay'. If it occurs, it must be the last statement at the top-level of the procedure. The `delay_loop' body statements are executed as many times as possible within the amount of time remaining in the procedure. Additional `nop' statements are inserted by the µCL compiler at the end to completely absorb the remaining time after the delay loop.

For example,

    procedure delay {
	arguments_none
	returns_nothing
	uniform_delay 555

	delay_loop {
	    servo_out := timer < width
	}
    }								
will execute the statement `servo_out := timer < width' approximately 55 times. Additional `nop' opcodes are added onto the end to make the total amount of time equal exactly 555 cycles.
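
The arithmetic behind this example can be sketched in Python. The figure of 10 cycles per iteration is an assumption chosen because it is consistent with "approximately 55 times"; the real cost depends on the generated code, and procedure call and return overhead is ignored here. `delay_loop_plan' is a hypothetical name.

```python
def delay_loop_plan(remaining_cycles, cycles_per_iteration):
    """Return (iterations, trailing_nops) for a delay loop.

    The body runs as many whole times as fit; the leftover cycles
    are absorbed by nop instructions after the loop.
    """
    iterations, nops = divmod(remaining_cycles, cycles_per_iteration)
    return iterations, nops

iterations, nops = delay_loop_plan(555, 10)
```

Under these assumptions the loop body runs 55 times and 5 nop instructions make up the rest, since 55 * 10 + 5 = 555.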

6.7 The `direction' Statement

The `direction' statement has the following form:

direction variable access
variable
is a variable specified by a `pin' or `port' declaration, and
access
is either `read', `write', or `off'.

The `direction' statement allows the programmer to manually specify the direction of a byte or bit on an I/O port. The `pin' and/or `port' declarations associated with variable must be declared with access `read_write_manual'.

6.8 The `if' Statement

The `if' statement has the following form:

if if_expression then_statements [ else_if else_if_expression else_if_statements ]* [ else else_statements ]
if_expression
is an expression that evaluates to a bit value,
then_statements
is a statement list that is executed if if_expression evaluates to a 1,
else_if_expression
is an optional additional test expression that evaluates to a bit value,
else_if_statements
is an optional statement list that is executed if the corresponding else_if_expression evaluated to 1 and all the previous expressions evaluated to 0, and
else_statements
is an optional statement list that is executed if all previous expressions evaluated to 0.

The minimal `if' statement consists of an expression (if_expression) and a statement list (then_statements). The then_statements list is only executed if if_expression is a bit expression that evaluates to a 1.

The minimal `if' statement can be followed by zero, one or more `else_if' clauses. An `else_if' clause consists of the keyword `else_if', followed by an expression (else_if_expression) which is followed by a statement list (else_if_statements). The else_if_statements statement list is executed only if the corresponding else_if_expression is a bit expression that returned 1 and all previous bit expressions in the `if' statement returned a 0.

Finally, the last optional clause that can be at the end of an `if' statement is the `else' clause. An `else' clause consists of the keyword `else' followed by a statement list (else_statements). The else_statements are only executed if all of the previous bit expressions evaluated to 0.

6.9 The `loop_forever' Statement

The `loop_forever' statement has the following form:

loop_forever {loop_body}
loop_body
is the body of statements that make up the loop.
The `loop_forever' statement repeatedly executes the statements in loop_body without ever exiting the loop.

6.10 The `nop' Statement

The `nop' statement has the following form:

nop count_expression
count_expression
is the number of nop (no-operation) instructions to generate.

count_expression must evaluate to a constant.

6.11 The `return' Statement

The `return' statement has the following form:

return [expression_list]
expression_list
is a list of zero, one, or more expressions that are returned.

6.12 The `switch' Statement

The `switch' statement has the following form:

switch switch_expression {switch_body}
switch_expression
is an expression that returns a byte value,
switch_body
is the switch body (discussed below).
The switch_body consists of a sequence of one or more case_clause's, followed by an optional default_clause. The switch_expression is evaluated and control is transferred to one of the clauses. Only case_clause's, a default_clause, comments and blank lines can occur inside of switch_body.

An example switch statement is shown below:

    switch (command >> 5) {
      case 0 {
        # First command:
        ...
      }
      case 1, 2 {
        # Second and third commands:
        ...
      }
      case 3 {
        # Fourth command:
        ...
      }
      default 7 {
        # All other commands are undefined:
        ...
      }
    }
								
6.12.1 The `case' Clause
The `case' clause has the following form:
case constant_expression, ... {case_body}
constant_expression
is a constant expression, and
case_body
is the body of the case clause.
The case_body is executed if one of the constant_expression values matches the value of the switch_expression in the enclosing switch statement. It is a compile time error for the same constant_expression value to show up in more than one case clause. case_body can contain any statements.

6.12.2 The `default' Clause
The `default' clause has the following form:
default default_constant {default_body}
default_constant
is a constant expression, and
default_body
is the body of the default clause.
The default_body is executed for any value of switch_expression that is less than or equal to default_constant and is not matched by any case_clause. switch_body can contain no more than one default_clause and it must appear after all case_clause's. Any value of switch_expression that exceeds default_constant has undefined behavior and will most likely crash.
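
Because a switch_expression value above default_constant is undefined, one defensive approach is to test the value before entering the switch. The sketch below assumes hypothetical `byte' variables named `command' and `result':

    if (command <= 3) {
        switch (command) {
          case 0 {
            result := 10
          }
          case 1, 2 {
            result := 20
          }
          default 3 {
            # Reached only when command is 3, the one value not named in a case clause.
            result := 0
          }
        }
    } else {
        # Values above default_constant never reach the switch.
        result := 255
    }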

6.13 The `watch_dog_reset' Statement

The `watch_dog_reset' statement has the following form:

watch_dog_reset
This statement generates the code to reset a watch dog timer in a microcontroller.

6.14 The `while' Statement

The `while' statement has the following form:

while while_expression {while_body}
while_expression
is an expression that returns a bit value,
while_body
is the body of statements that make up the loop.
As long as while_expression evaluates to 1, while_body is executed repeatedly.
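
For illustration, the fragment below shifts a byte right eight times, one bit per pass. The variable names `count' and `value' are hypothetical and both are assumed to be of type `byte':

    count := 8
    while (count != 0) {
        # Shift in a 0 at the most significant end.
        value := value >> 1
        count := count - 1
    }
The loop exits as soon as `count != 0' evaluates to 0, which happens after the eighth pass.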

7 Expressions

Expressions in µCL are modeled after expressions in ANSI-C. There are a few differences between µCL expressions and C expressions.

7.1 Differences from C Expressions

If you do not know C-expressions, you should probably skip reading this section.

The differences from C-expressions are itemized below:

7.2 What is Precedence?

This section is for reference purposes for those people who are not familiar with the concept of operator precedence.

In regular arithmetic expressions like `1 + 2 × 3 - 4 × 5 + 6', the `×' operator has a higher precedence than the `+' and `-' operators. Thus, multiplication and division are performed before addition and subtraction. This can be made more explicit by adding parentheses as follows: `1 + (2 × 3) - (4 × 5) + 6'. Most programming languages have a couple of dozen operators, each with its own precedence.

As usual, parentheses are used to change the order in which operations are executed. Thus, `(1 + 2) × (3 - 4) × (5 + 6)' causes the addition and subtraction to occur before the multiplication.

The sections below list the operators in µCL expressions from lowest precedence to highest precedence. The lowest precedence operators are performed after all higher precedence operators have been done.

7.3 Assignment Operators (`:=', `:+=', etc.)

The assignment operators are:

`:='
Straight assignment (L := R)
`:+='
Addition assignment (L := L + R)
`:-='
Subtraction assignment (L := L - R)
`:*='
Multiplication assignment (L := L * R)
`:%='
Division assignment (L := L / R)
`:&='
AND assignment (L := L & R)
`:|='
OR assignment (L := L | R)
`:^='
XOR assignment (L := L ^ R)
NOTE: Currently, only straight assignment is implemented

{Discuss operator assignment.}

µCL supports multiple assignment. In multiple assignment, a list of expressions on the right is assigned to a list of variables on the left. All of the expressions on the right are evaluated and stored in temporary variables before any of the assignments take place. The example below will swap the contents of the `a' and `b' variables:

    a, b := b, a						
which is equivalent to:
    T1 := b
    T2 := a
    a := T1
    b := T2							
where T1 and T2 are temporaries.
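
The same rule extends to longer lists. As a sketch, rotating three hypothetical `byte' variables `a', `b', and `c' could be written as:

    a, b, c := b, c, a
which, following the description above, is equivalent to:
    T1 := b
    T2 := c
    T3 := a
    a := T1
    b := T2
    c := T3
where T1, T2, and T3 are temporaries.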

Another form of multiple assignment occurs when a procedure returns more than one value. An example should help clarify this. Let us assume that the procedure `plus_minus(a, b)' returns `a+b' and `a-b'. This procedure would be written as follows:

    procedure plus_minus {
	argument a byte
	argument b byte
	returns byte byte
	return a + b, a - b
    }								
and we can invoke `plus_minus' as follows:
    a_plus_b, a_minus_b := plus_minus(a, b)			
The first value returned from `plus_minus' is `a+b' and it is assigned to the variable `a_plus_b'. The second value returned from `plus_minus' is `a-b' and it is assigned to the variable `a_minus_b'.

7.4 Comma Operator (`,')

The comma operator `,' does not perform any computation per se. All it does is cause other expressions of higher precedence to be executed in left-to-right order. It is used to separate variables and expressions in multiple assignment statements (e.g. `a, b := b, a') and it is used to separate the arguments passed to procedures.

Many C programmers have programmed in C for years without realizing that most C implementations evaluate function arguments in right-to-left order. The C language specification does not mandate either left-to-right or right-to-left evaluation order. However, the first implementations of C fairly uniformly implemented right-to-left evaluation in order to support variadic functions (i.e. functions that take a variable number of arguments) like `printf', and most subsequent C implementations have followed the same rule. The order of evaluation only matters when the expressions involve side effects. For example, in C:

    (void)printf("first:%d second:%d\n", get_byte(), get_byte()) 
does not do what most people think it does. On such an implementation, the first call to `get_byte' to be executed is the right-most one, followed by the left-most one. Thus, in the example above, the first byte that is read is actually printed as the decimal number next to `second' and vice versa. This is probably not what the C programmer had in mind.

In µCL, execution order is always left-to-right and there are no surprises like there are in C.
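
A µCL fragment in the same spirit might look as follows, assuming a hypothetical procedure `get_byte' that takes no arguments and returns a single byte, and hypothetical `byte' variables `first' and `second':

    # The left-most call is executed first.
    first, second := get_byte(), get_byte()
Because evaluation is left to right, `first' receives the first byte that is read and `second' receives the next one.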

7.5 Conditional OR (`||')

The expression `L || R' returns `1' if either `L' or `R' evaluates to `1'. It is a conditional OR because it will not execute `R' if `L' returns a `1'.

Consider the code fragment below:

    if (denominator = 0 || numerator/denominator = 0) {
	# ...
    }								
In this example, it would be erroneous to divide the numerator by zero, so the denominator is checked for zero beforehand, and the division takes place only if it is non-zero.

If it is desirable to always execute the left and the right side before computing the OR, the bitwise OR (`|') operator can be used.

7.6 Conditional AND (`&&')

The expression `L && R' returns `1' if both `L' and `R' evaluate to `1'. It is a conditional AND because it will not bother to execute `R' if `L' returns a `0'. Consider the code fragment below:

    if (denominator != 0 && numerator/denominator > 0) {
	# ...
    }								
In this example, it would be erroneous to divide the numerator by zero, so the denominator is checked for zero beforehand, and the division takes place only if it is non-zero.

If it is desirable to always execute the left and the right side before computing the AND, the bitwise AND (`&') operator can be used.

The precedence of `&&' is higher than `||'. Thus,

A && B || C && D
would be executed as
(A && B) || (C && D)
As usual, it never really hurts to add parentheses to improve readability.

7.7 Relational Operators and Bit Selection (`<', `=', `>', `<=', `!=', `>=', `@')

There is one bit selection operator and six relational operators:

`L @ N'
The N'th bit of L
`L < R'
L less than R
`L = R'
L equal to R
`L > R'
L greater than R
`L <= R'
L less than or equal to R
`L != R'
L not equal to R
`L >= R'
L greater than or equal to R

The equality operators work for operands of type `byte' and `bit', and all the other operators (`<', `>', `<=', `>=', and `@') only work for operands of type `byte'.

NOTE: bit equals and not-equals are not implemented yet. The work around is quite ugly -- `A && B || !A && !B' for equality and `A && !B || !A && B' for inequality.

It is not permissible to insert a space between the two characters of any of the two-character relational operators (`<=', `!=', and `>=').

The bit selection operator is not present in C and is unique to µCL. `A @ N' is the N'th bit of A. Both A and N must be of type `byte'. The result is of type `bit'. `A @ 0' selects the least significant bit of A. `B @ 7' selects the most significant bit of B.

NOTE: Currently, the right operand of bit selection must be a constant. A reasonable work around would be `A & 1 << N != 0' which is grouped as `(A & (1 << N)) != 0'; unfortunately, `1<<N' where N is not a constant is also unimplemented. Sigh.
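
As a sketch of bit selection with constant bit numbers, the fragment below tests the two ends of a byte. The variable names `flags', `low_count', and `high_count' are hypothetical and all are assumed to be of type `byte':

    if (flags @ 0) {
        # Bit 0, the least significant bit of flags, is 1.
        low_count := low_count + 1
    }
    if (flags @ 7) {
        # Bit 7, the most significant bit of flags, is 1.
        high_count := high_count + 1
    }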

C programmers should note that the equality operator consists of a single `=' not the double `==' used in C.

In C, the `char' type usually stands for a byte. The ANSI-C standard allows `char' to be either `signed' or `unsigned'. This ambiguity causes all sorts of grief when porting C code between C compilers. There is no such ambiguity in µCL; the `byte' type always represents non-negative numbers between 0 and 255 inclusive. Thus, in µCL, 128 is always greater than 127 whereas in C (char)128 is sometimes less than (char)127 and sometimes greater.

The precedence of the relational operators is greater than both conditional AND (`&&') and conditional OR (`||'). Thus, the following code fragment

0c'a' <= c && c <= 0c'z'
is grouped as
(0c'a' <= c) && (c <= 0c'z')

7.8 Addition and Subtraction (`+' and `-')

The addition operator is `+' and the subtraction operator is `-'. These operators are only defined for expressions of type `byte'. The order of execution is strictly left to right. Thus,

a + b - c + d
is evaluated as:
((a + b) - c) + d

Currently, µCL does not implement 16-bit or 32-bit arithmetic. You can use the following work-around:

    variable a_lo byte
    variable a_hi byte
    variable b_lo byte
    variable b_hi byte
    variable c_lo byte
    variable c_hi byte

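    # Add the high bytes first so that the carry tested below comes from the low-byte addition.
    c_hi := a_hi + b_hi
    c_lo := a_lo + b_lo
    # `status @ c' is taken here to be the carry bit of the processor status register.
    if (status @ c) {
        # Propagate the carry out of the low byte into the high byte.
        c_hi := c_hi + 1
    }
NOTE: Eventually the generated code will be pretty efficient; right now the code for arithmetic is pretty sloppy.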

The precedence of addition and subtraction is higher than the relational operators. Thus,

a + b > 20
is grouped as
(a + b) > 20
The more complicated expression
a + b > 20 && c - d <= e + f
is grouped as
((a + b) > 20) && ((c - d) <= (e + f))

7.9 Multiplication, Division, and Modulo (`*', `/', and `%')

NOTE: Currently, none of these operators are implemented.

The multiplication operator is `*', the division operator is `/', and the modulo operator is `%'. The modulo operator returns the remainder of a division. These operators are only defined for expressions of type `byte'. The order of execution is strictly left to right. Thus,

a * b / c * d
is evaluated as:
((a * b) / c) * d

The precedence of multiplication, division, and modulo is higher than addition and subtraction. Thus,

a * b + c / d
is grouped as
(a * b) + (c / d)
The more complicated expression
a * b + c > 20 && d / e - f < 30
is grouped as
(((a * b) + c) > 20) && (((d / e) - f) < 30)

7.10 Bitwise OR (`|')

The expression `A | B' computes the bitwise OR of A and B. A and B must both be the same type. This operation is defined for both type `bit' and `byte'. Execution order is strictly left to right.

NOTE: Currently, bitwise OR of type `bit' is not implemented. Usually, you can make do with the conditional OR operator (`||').

The precedence of bitwise OR is higher than multiplication, division, addition, subtraction and all relational operators. Thus,

a | 1 + b | 2 * c | 3
is grouped as
(a | 1) + ((b | 2) * (c | 3))
The expression
a | 1 >= b | 2
is grouped as
(a | 1) >= (b | 2)
In C, this expression would group as
(a | (1 >= b)) | 2
which is pretty counter-intuitive.

7.11 Bitwise AND (`&')

The expression `A & B' computes the bitwise AND of A and B. A and B must both be the same type. This operation is defined for both type `bit' and `byte'. Execution order is strictly left to right.

NOTE: Currently, bitwise AND of type `bit' is not implemented. Usually, you can make do with the conditional AND operator (`&&').

The precedence of bitwise AND is higher than bitwise OR, multiplication, division, addition, subtraction and all relational operators. Thus,

a & 1 | b & 2 | c & 4
is grouped as
((a & 1) | (b & 2)) | (c & 4)
The expression
a & 0xf = 0xb
is grouped as
(a & 0xf) = 0xb
In C, the corresponding expression `a & 0xf == 0xb' would group as
a & (0xf == 0xb)
which is pretty counter-intuitive.

7.12 Bitwise XOR (`^')

The expression `A ^ B' computes the bitwise XOR (eXclusive OR) of A and B. A and B must both be the same type. This operation is defined for both type `bit' and `byte'. Execution order is strictly left to right.

NOTE: Currently, bitwise XOR of type `bit' is not implemented.

Bitwise XOR is used to toggle bits. If you want to complement bit 2 (the third bit) in a register, try the following:

a := a ^ 4

The precedence of bitwise XOR is higher than bitwise AND, bitwise OR, multiplication, division, addition, subtraction and all relational operators. This means you can twiddle bits, mask them off, and then assemble them together without having to fight the operator precedence. For example,

a ^ 9 & 0xf | 0xa0
is grouped as
((a ^ 9) & 0xf) | 0xa0
and toggles bits 0 and 3 of a (the first and fourth bits), masks off the four high-order bits, and sets bits 5 and 7 (the sixth and eighth bits).
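
A worked example with a concrete value may help. The variable `b' and the starting value 0x35 are chosen purely for illustration:

    a := 0x35
    b := a ^ 9 & 0xf | 0xa0
    # a ^ 9 is 0x3c, 0x3c & 0xf is 0x0c, and 0x0c | 0xa0 is 0xac, so b is 0xac.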

7.13 Shift Operators (`>>' and `<<')

The shift right operator is `>>' and the shift left operator is `<<'. Order of operation is strictly left to right. `A << N' causes A to be shifted left by N bits with 0 being shifted into the least significant bits. `B >> N' causes B to be shifted right by N bits with 0 being shifted into the most significant bits. Both the left and right operands of the shift operators must be of type `byte'. Thus, `A << 1' is equivalent to efficiently multiplying by 2 and `A << 2' is equivalent to multiplying by 4. Similarly, `B >> 1' is equivalent to dividing by 2 and `B >> 2' is equivalent to dividing by 4.

NOTE: Currently, the code for shifting by a non-constant expression has not been implemented.

The precedence of the shift operators is greater than the bitwise operators. Thus,

a >> 4 | a << 4
is grouped as
(a >> 4) | (a << 4)
Incidentally, this piece of code results in a value where the nibbles of a have been exchanged.
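
To make the nibble exchange concrete, the fragment below traces one value through the expression. The variable `b' and the starting value 0x3c are chosen purely for illustration, and the trace assumes that bits shifted out of a byte are discarded:

    a := 0x3c
    b := a >> 4 | a << 4
    # a >> 4 is 0x03 and a << 4 is 0xc0, so b is 0xc3.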

7.14 Unary Operators (`-', `!', `~')

There are three unary operators:

- R
Minus R
~ R
Bitwise NOT of R
! R
Logical NOT of R
The minus (`-') and bitwise NOT (`~') operators only work on an operand of type `byte'. The logical NOT operator (`!') only works on an operand of type `bit'.

The precedence of the unary operators is higher than all the arithmetic and bit twiddling operators. Thus,

-a - b
is grouped as
(-a) - b
and
~a & b
is grouped as
(~a) & b

NOTE: Currently, there is a bug in the expression parser such that `a & ~b' is not properly parsed.

NOTE: I've often thought that logical NOT (`!') should have a precedence between `&&' and bit selection (`@'). Thus, `!A@N' would group as `!(A@N)' rather than `(!A)@N'. Similarly, I've often thought that unary minus should have a precedence right above multiplication (`*').

7.15 Array Operator (`...[...]')

The array operator looks as follows:

L [ R ]
where L is an expression that evaluates to either a string or a byte array, and R is an expression that evaluates to a byte. When the array operator occurs to the left of an assignment, a byte value is stored into the string or byte array; otherwise, a byte value is fetched.

7.16 Dot Operator (`L.size' or `L.limit')

Currently, the dot operator has only two forms:

L.size
Fetch or set the size of L where L is an expression that evaluates to either a string or byte array.
L.limit
Fetch the limit of L where L is an expression that evaluates to either a string or byte array.

7.17 Procedure Invocation (`P(...)')

Procedure invocation is the act of invoking a procedure and using its return value (or values) in an expression. `P()' invokes a procedure P with no arguments, `P(A1)' invokes procedure P with a single argument expression A1, and `P(A1, A2)' invokes procedure P with argument expressions A1 and A2.

NOTE: Internally, µCL converts `P(A1, ..., An)' to `P##(A1, ..., An)' where `##' stands for `invoked with'. You might occasionally see an error message with `##' in it.

The arguments to the procedure are evaluated in strict left to right order.

It is an error to invoke a procedure that does not return any value (i.e. `returns_nothing') in an expression.

It is legal to invoke a procedure in an expression that returns multiple values, provided the returned values are directly passed on as arguments to another procedure invocation (or to a multiple assignment). For example, assume that `swap(a, b)' returns `b, a'. Further assume that `p3' is a procedure that takes three arguments. `p3(swap(a, b), c)' is legal and equivalent to `p3(b, a, c)'.
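
As a sketch, a `swap' procedure of the kind assumed above could be written in the same style as the `plus_minus' procedure shown in the section on assignment operators. The variable names `x' and `y' are hypothetical and are assumed to be of type `byte':

    procedure swap {
        argument a byte
        argument b byte
        returns byte byte
        return b, a
    }

    # Both returned values feed a multiple assignment.
    x, y := swap(x, y)
After the assignment, `x' holds the old value of `y' and `y' holds the old value of `x'.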

The precedence of procedure invocation is higher than all other operators.


Copyright (c) 1999 by Wayne C. Gramlich. All rights reserved.