Advanced Computing Environment
Hosted by SourceForge
brix-os project page


Parsing and Rewrite Patterns

Features that need to be added to the page or ideas that haven't been thought out.
  • DEFOP`prefix should only be used for operators that could also be infix, DEFX`open-macro should be used for all other cases.
  • patterns can access private module slots, type properties and methods when declared in the same module
  • blocks should never have semicolons after them
  • extensions that are unable to determine their own termination must be explicitly terminated. this is possible but should be avoided.
    	s = sum 1 2 3 4 5 6 7;
    	s = (sum 1 2 3 4 5 6 7)
  • extensions can override the parser to handle semicolons and commas in the same list, with each having different meaning
  • single expression lists evaluate to a single value and multiple expression lists evaluate to a tuple or List object
  • the common parent for common-infix is any concrete type shared between both parameters. the common parent can even be a parameterized type if the type has the ability to produce a common parent.
    	defop `common-infix .. (first:Number, last:Number)
    	  -> (first, last):Range(common)
  • add third parameter to declare the common infix variable?

The parser reads from the tree of tokens produced by the lexer, evaluates each expression and pushes objects or code fragments to internal stacks representing expression components. Most tokens evaluate to simple objects but extensions and unprocessed types are executed to produce an object, consume additional tokens, generate code or change the compiler state. Once the token has been evaluated the parser tests for a prefix operator in the current token or a compound accessor or infix/postfix operator in the next token and continues the expression chain. The expression is terminated when the parser is unable to chain it to the next token or no more tokens exist. A semicolon or comma after the expression is consumed and the expression flagged as explicitly terminated. The last stage of the parser scans the stack for operators, orders them by precedence and evaluates their patterns.
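
The last stage described above (scan the stack for operators, order them by precedence, evaluate their patterns) can be sketched roughly as follows. This is a minimal Python illustration, not the project's implementation: the flat stack layout, the `reduce_stack` name, the precedence value for `*` and the left-to-right tie-breaking are all assumptions made for the example. Only the idea that lower precedence numbers bind first comes from the DEFOP examples on this page.

```python
# Hypothetical operator table. Lower numbers bind tighter, as in the DEFOP
# examples (infix - is 60, == is 70). The value for * is an assumption.
PRECEDENCE = {"*": 50, "+": 60, "-": 60, "==": 70}

# Each pattern is modelled as a plain Python function of two operands.
PATTERNS = {
    "*": lambda left, right: left * right,
    "+": lambda left, right: left + right,
    "-": lambda left, right: left - right,
    "==": lambda left, right: left == right,
}


def reduce_stack(stack):
    """Reduce a flat [operand, op, operand, ...] stack to a single value."""
    items = list(stack)
    while len(items) > 1:
        # Operators sit at the odd positions. Pick the tightest-binding one
        # and take the leftmost on ties (assumed left-associativity).
        best = min(range(1, len(items), 2),
                   key=lambda i: (PRECEDENCE[items[i]], i))
        left, op, right = items[best - 1:best + 2]
        # Replace "left op right" with the value produced by the pattern.
        items[best - 1:best + 2] = [PATTERNS[op](left, right)]
    return items[0]


result = reduce_stack([1, "+", 2, "*", 3])  # * is reduced before +
```

A real parser would push code fragments rather than Python values, but the ordering step would look much the same.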

The contents of the internal stack are then returned to the caller as the values produced by the expression. The caller decides how to handle the values and which types of termination to accept (some callers disallow explicit termination). Lists (parentheses) and arrays (square brackets) require explicit comma termination for all but the last expression, and each expression must produce one or more values. Blocks (curly brackets) have optional semicolon termination, and expression values are discarded. Extensions such as IF and WHILE require an implicitly terminated condition expression and allow optional semicolon termination for their body expressions.
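
The per-caller termination rules could be checked along these lines. This is only a sketch in Python: the `Termination` enum and both function names are hypothetical, and two details are assumptions rather than statements from this page, namely that an empty list is accepted and that a block rejects comma termination.

```python
from enum import Enum


class Termination(Enum):
    IMPLICIT = "implicit"    # the parser could not chain any further
    COMMA = "comma"          # explicitly terminated by a comma
    SEMICOLON = "semicolon"  # explicitly terminated by a semicolon


def list_accepts(terminations):
    """Lists and arrays: a comma after every expression except the last."""
    if not terminations:
        return True  # assumption: an empty list is valid
    *init, last = terminations
    return (all(t is Termination.COMMA for t in init)
            and last is Termination.IMPLICIT)


def block_accepts(terminations):
    """Blocks: semicolons are optional (assumption: commas are rejected)."""
    return all(t is not Termination.COMMA for t in terminations)


ok = list_accepts([Termination.COMMA, Termination.COMMA, Termination.IMPLICIT])
```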

The symbolic identifier set is reserved for operators, but operators may also be named from the alphanumeric set. Parameters are limited to one for prefix and postfix patterns and two for infix patterns. Common infix patterns declare a single type for both parameters, and the actual type of each argument must be a subtype of it. This mode also declares a variable (named common in the examples) containing the common type of both arguments. Each operator name may have one prefix, one infix (or common infix) and one postfix pattern.
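
One way to picture the common type is as the most specific type that both arguments share and that still satisfies the declared parameter type. The sketch below uses Python classes and their method resolution order as a stand-in for the language's type system, so the `common_parent` function and the `Number`, `Integer`, `Float` and `Small` classes are illustrative assumptions only.

```python
class Number:
    pass


class Integer(Number):
    pass


class Float(Number):
    pass


class Small(Integer):
    pass


def common_parent(a, b, bound):
    """Return the most specific class shared by a and b that is a subtype
    of bound, or None when the two types have nothing suitable in common."""
    # Walk a's ancestry from most to least specific.
    for cls in a.__mro__:
        if issubclass(b, cls) and issubclass(cls, bound):
            return cls
    return None


shared = common_parent(Integer, Float, Number)  # both are Numbers
```

When `common_parent` returns None, the pattern would simply not apply to that pair of arguments.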

DEFOP precedence style name (parameters) -> pattern
  • precedence -- 0 (highest) to 255 (lowest)
  • style -- `prefix, `infix, `common-infix or `postfix
  • name -- operator name (both identifier sets)
  • parameters -- parameter list
  • -> -- required syntax
  • pattern -- single expression
	defop 10 `infix . (left:Object, right:tetra.compiler.Identifier)
	  -> tetra.compiler.access_slot(left, right)
	defop 20 `prefix - (right:Number)
	  -> Number.negate(right)
	defop 60 `infix - (left:Number, right:Number)
	  -> Number.subtract(left, right)

	equality_interface = deftype () {
	  isEqual = def (a:Self, b:Self) -> Boolean;
	}
	defop 70 `common-infix == (left:equality_interface, right)
	  -> common.isEqual(left, right)
Precedence is always required but has been omitted from most code examples on this site.
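
The constraints on DEFOP (precedence range, parameter counts per style, and one pattern per style for each operator name) can be summarised in a small registry. This Python sketch is hypothetical: the `OperatorTable` class, its method names and the error messages are invented for illustration, and patterns are modelled as plain functions.

```python
# Parameter counts per style, as stated in the text above.
ARITY = {"prefix": 1, "postfix": 1, "infix": 2, "common-infix": 2}

# infix and common-infix share one slot under an operator name.
SLOT = {"prefix": "prefix", "postfix": "postfix",
        "infix": "infix", "common-infix": "infix"}


class OperatorTable:
    def __init__(self):
        self._patterns = {}  # (name, slot) -> (precedence, style, pattern)

    def defop(self, precedence, style, name, params, pattern):
        if not 0 <= precedence <= 255:
            raise ValueError("precedence must be between 0 and 255")
        if style not in ARITY:
            raise ValueError("unknown style: " + style)
        if len(params) != ARITY[style]:
            raise ValueError(style + " takes %d parameter(s)" % ARITY[style])
        key = (name, SLOT[style])
        if key in self._patterns:
            raise ValueError(name + " already has a pattern in that slot")
        self._patterns[key] = (precedence, style, pattern)

    def lookup(self, name, slot):
        """Return (precedence, style, pattern) or None if undefined."""
        return self._patterns.get((name, slot))


table = OperatorTable()
table.defop(20, "prefix", "-", ["right"], lambda right: -right)
table.defop(60, "infix", "-", ["left", "right"],
            lambda left, right: left - right)
```

With this table, a later attempt to declare a common infix pattern for `-` would be rejected because the infix slot is already taken.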

Shorthand Operators
Patterns are evaluated and may contain other operators.

	defop `common-infix += (left:addition_interface, right)
	  -> left = left + right
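
The effect of the += pattern can be pictured as a tree rewrite in which the shorthand expands into an assignment whose right-hand side still contains the + operator. The sketch below is a simplified Python illustration; the tuple-based expression tree and the `expand` function are assumptions, not the compiler's data structures.

```python
# Maps each shorthand operator to the operator used in its expansion.
SHORTHAND = {"+=": "+"}


def expand(expr):
    """Rewrite ('+=', left, right) into ('=', left, ('+', left, right)).

    Expressions are tuples of the form (operator, operand, ...); anything
    else is treated as a leaf and returned unchanged.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [expand(arg) for arg in args]  # expand nested expressions first
    if op in SHORTHAND and len(args) == 2:
        left, right = args
        return ("=", left, (SHORTHAND[op], left, right))
    return (op, *args)


expanded = expand(("+=", "x", ("*", "y", 2)))
```

The resulting + node would then be resolved through its own pattern in the usual way.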

As the parser evaluates tokens it inserts hints into the tree so the editor knows about expression boundaries and types. The editor can use this information to display type information, highlight sub-expressions or display faded semicolons on optionally terminated expressions. Lexical and parsing errors are fed back to the editor via error tokens that contain descriptive messages to correct the problem. All source code is parsed in the background to update the editor and compiled when error-free. This makes types and extensions immediately available to other parts of the source and for use inside the REPL for quick code tests.

The expression highlighting is important because it not only displays precedence grouping but also makes operator chaining mistakes visible. Some open extensions also benefit from this by showing the tokens they consume.

	b = if a > 0 -1 else 1

	// expected result
	b = if (a > 0) -1 else 1

	// actual result
	b = if (a > 0 - 1) missing value else 1
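
The mistake above follows from the chaining rule: after each operand the parser tests the next token for an infix operator, so the minus sign continues the condition instead of starting a new expression. A minimal Python sketch of that rule follows, assuming a flat token list and hypothetical `INFIX` and `PREFIX` sets (the real parser reads from a token tree).

```python
INFIX = {">", "+", "-"}  # hypothetical set of infix operator names
PREFIX = {"-"}           # hypothetical set of prefix operator names


def read_expression(tokens, start=0):
    """Return (tokens consumed by one chained expression, next index)."""
    pos = start
    consumed = []
    while True:
        # A prefix operator is only recognised at the start of an operand.
        if pos < len(tokens) and tokens[pos] in PREFIX:
            consumed.append(tokens[pos])
            pos += 1
        if pos >= len(tokens):
            raise ValueError("expected an operand")
        consumed.append(tokens[pos])  # the operand itself
        pos += 1
        # An infix operator in the next token continues the chain.
        if pos < len(tokens) and tokens[pos] in INFIX:
            consumed.append(tokens[pos])
            pos += 1
            continue
        return consumed, pos


# The tokens following "if" in the example above.
condition, rest = read_expression(["a", ">", "0", "-", "1", "else", "1"])
```

Here the condition consumes everything up to `else`, which leaves no tokens for the value that was meant to be -1.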
