01 September 2009

8. Boolean and Relational operators

Boolean and relational operators can be treated in a variety of ways. We will treat them in a way that is simple, easy and correct, but inefficient. Each operator will return the integer 1 for true and 0 for false. The language will consider 0 to be false and any other value (not just 1) to be true.

This is the absolutely easiest way of doing it, and also allows us to use these operators in freely in any expression, even for assignments: bigger = a > 0.

The first thing we need to do when adding new operators is to decide their precedence. I have decided to give boolean operators the lowest precedence (evaluated last), followed by relational operators, followed by mulops, and then addops. Because then we don't have to use parentheses when writing an expression like this:
If   a+1  >  6*b   And  5-c
If ((a+1) > (6*b)) And (5-c)

As you can see, the operators with the lowest precedence are evaluated last, because they are in the outermost implicit parentheses.
Speaking of which, we have forgotten to implement parentheses! I'll save that for the next installment.

Which operators do we need? We need And, Or, Xor, <, >, <=, >= and <>. Also, you may be thinking of Not. Not is a unary operator (takes only one argument), so it must be treated separately.

We will start with And. What it needs to do, is to look at both operands, and if both are true (different from 0), it should place 1 in eax. Else it should place 0 in eax.

It turns out ebp wasn't well suited, because we don't have separate access to the low 8-bit part of the register. I'll continue with ebp, but the code to convert a value to to a boolean can be made slightly faster if you use ecx instead. Since we do this for learning, we should not care about efficiency.

Here is the new grammar:
digit      ::= 0..9
integer ::= <digit> { <digit> }
value ::= <integer>
mulop ::= * | /
addop ::= + | -
relop ::= < | > | <= | >= | <>
boolop ::= And | Or | Xor
mulexp ::= <value> {<mulop> <value>}
addexp ::= <mulexp> {<addop> <mulexp> }
relexp ::= <addexp> { <relop> <addexp> }
boolexp ::= <relexp> {<boolop> <relexp> }
expression ::= <BoolExp>

We will use the following code in several places, so I placed it in a separate procedure:
Procedure ToBool(reg.s)
Emit("cmp " + reg + ", 0")
Emit("mov ecx, 0")
Emit("setnz cl")
Emit("mov " + reg + ", ecx")
EndProcedure
It converts an integer into another integer, which may only be 0 and 1. Setnz sets cl (the lower part of ecx) to 1 if reg-0 was NOT zero. The result is that ecx becomes 0 when reg was 0, else ecx becomes 1.

BoolExp() looks just like the other procedures:
Procedure BoolExp()
; boolexp ::= <relexp> {<boolop> <relexp> }
Protected Operation.s
RelExp()
While Look = 'a' ; start of and
Operation = GetName()
Emit("push eax")
RelExp()
Emit("pop ebp")
Select Operation
Case "and": ToBool("eax") ; Convert to 0 or 1 depending on
ToBool("ebp") ; truth value of input
Emit("and eax, ebp")
Emit("mov eax, 0")
Emit("setnz al")
Default
Expected("Operator, got " + Operation)
EndSelect
Wend
EndProcedure
First, both operands are converted from their initial value into 0 and 1 (depending on their truth value). Then we perform a bitwise and operation, which works as a boolean and because of the limited input range (values are either 0 or 1).

You should now rewrite Expression() to call BoolExp() instead of AddExp(). Then write a dummy procedure, RelExp() that only calls AddExp(). You compiler should now have support for the And operator! Try it!

Note, that the compiler is currently case sensitive. Don't try to remedy this in BoolExp() - it should be done globally in the base file. If you want to fix it now, simply put an LCase() in GetLook(), and be sure to only expect lowercase letters in Look.

When you implement Or and Xor, you should use the or and xor cpu instructions instead of the and instruction. Since all these operators share a lot of code, we can make the code generation shared, too:
Procedure BoolExp()
; boolexp ::= <relexp> { <boolop> <relexp> }
Protected Operation.s
RelExp()
While Look = 'a' Or Look = 'o' Or Look = 'x'
Operation = GetName()
Emit("push eax")
RelExp()
Emit("pop ebp")
ToBool("eax")
ToBool("ebp")
Select Operation
Case "and": Emit("and eax, ebp")
Case "or": Emit("or eax, ebp")
Case "xor": Emit("xor eax, ebp")
Default
Expected("Operator, got " + Operation)
EndSelect
Emit("mov eax, 0")
Emit("setnz al")
Wend
EndProcedure

Cool. We added two new operators simply by moving four lines and adding two lines. (Ok, so I cheated by modifying the line with While...)

To avoid confusion I will must precise that these operators associate to the left, that is:
1 and 2 or 3
is the same as
(1 and 2) or 3
or verbally, the output of (1 and 2) becomes the input of or 3. Remember that the output of (1 and 2) is a number that is either 0 (false) or 1 (true).

This is the same for all operators we have implemented, however, this order is overridden by the precedence rules. So:
1+2+3 equals
(1+2)+3, however
1+2*3 equals
1+(2*3)

All normal operators are left associative, except one: ^ (raise to the power of). The power operator is right associative. Often this is implemented a normal function instead of an operator, this is because a lot of code is required to compute the result.

We will use the same idea for creating the relational operators as we used for above: the comparison will return either 0 or 1, depending on the input. To keep the code clean, I want to put the code generation for these operators in a separate procedure, RelOp(). This is because the code for recognizing the multi-character relops can get a bit long-winded. Here is RelExp():
Procedure RelExp()
; relexp ::= <addexp> { <boolop> <addexp> }
Protected Operation.s
AddExp()
While Look = '<' Or Look = '>' Or Look = '='
Operation = Chr(Look)
GetLook()
Select Look
Case '<', '>', '='
Operation + Chr(Look)
GetLook()
EndSelect
EatWhite()
Emit("push eax")
AddExp()
Emit("pop ebp")
RelOp(Operation) ; Generate code
Wend
EndProcedure


Here is the code for RelOp(), it's a standard Select statement to generate the correct conditional:
Procedure RelOp(Op.s)
Emit("cmp ebp, eax")
Emit("mov eax, 0")
Select Op
Case "=": Emit("sete al") ; set if equal
Case "<": Emit("setl al") ; set if less
Case "<=": Emit("setle al") ; less or equal
Case ">": Emit("setg al") ; set if greater
Case ">=": Emit("setge al")
Case "<>": Emit("setne al") ; set if not equal
Default
Expected("Relational operator, got " + Op)
EndSelect
EndProcedure

Notice that the algorithm is quite similar to ToBool(), except we now compare with another number instead of 0. Setne and setnz are aliases for the same instructions, by the way.

I now realize that there is a mistake in base2.pb. Simply replace the procedure Expected() with this one:
Procedure Expected(S.s)
Error("Expected " + S)
EndProcedure
Else the error message will be nonsensical.

A minor problem with our implementation of the boolean operators is that they are so called "strict". Normally boolean operators are "short-circuit". You should look up this on wikipedia or somewhere else if you don't know what it means.

The last thing you have to do before you're finished with this part of the tutorial is to put your new code through some testing. Remember to also test invalid inputs. They should give a proper error message.

Here is the complete listing:

; boolrel.pb
IncludeFile "..\base2.pb"

Procedure ToBool(reg.s)
Emit("cmp " + reg + ", 0")
Emit("mov ecx, 0")
Emit("setnz cl")
Emit("mov " + reg + ", ecx")
EndProcedure

Procedure RelOp(Op.s)
Emit("cmp ebp, eax")
Emit("mov eax, 0")
Select Op
Case "=": Emit("sete al") ; set if equal
Case "<": Emit("setl al") ; set if less
Case "<=": Emit("setle al") ; less or equal
Case ">": Emit("setg al") ; set if greater
Case ">=": Emit("setge al")
Case "<>": Emit("setne al") ; set if not equal
Default
Expected("Relational operator, got " + Op)
EndSelect
EndProcedure

Procedure Value()
; value ::= <integer>
Emit("mov eax, " + GetInteger())
EndProcedure

Procedure MulExp()
; mulexp ::= <value> {<mulop> <value>}
Protected Operation.c
Value()
While Look = '*' Or Look = '/' Or Look = '%'
Operation = Look
GetLookWhite()
Emit("push eax")
Value()
Emit("pop ebp")
Select Operation
Case '*': Emit("imul eax, ebp")
Case '/', '%'
Emit("xchg eax, ebp")
Emit("cdq") ; expands eax to edx:eax
Emit("idiv ebp")
EndSelect
If Operation = '%'
Emit("mov eax, edx")
EndIf
Wend
EndProcedure

Procedure AddExp()
; addexp ::= <mulexp> {<addop> <mulexp> }
Protected Operation.c
MulExp()
While Look = '+' Or Look = '-'
Operation = Look
GetLookWhite()
Emit("push eax")
MulExp()
Emit("pop ebp")
Select Operation
Case '+': Emit("add eax, ebp")
Case '-': Emit("sub eax, ebp")
Emit("neg eax")
EndSelect
Wend
EndProcedure

Procedure RelExp()
; relexp ::= <addexp> { <boolop> <addexp> }
Protected Operation.s
AddExp()
While Look = '<' Or Look = '>' Or Look = '='
Operation = Chr(Look)
GetLook()
Select Look
Case '<', '>', '='
Operation + Chr(Look)
GetLook()
EndSelect
EatWhite()
Emit("push eax")
AddExp()
Emit("pop ebp")
RelOp(Operation) ; Generate code
Wend
EndProcedure

Procedure BoolExp()
; boolexp ::= <relexp> { <boolop> <relexp> }
Protected Operation.s
RelExp()
While Look = 'a' Or Look = 'o' Or Look = 'x'
Operation = GetName()
Emit("push eax")
RelExp()
Emit("pop ebp")
ToBool("eax")
ToBool("ebp")
Select Operation
Case "and": Emit("and eax, ebp")
Case "or": Emit("or eax, ebp")
Case "xor": Emit("xor eax, ebp")
Default
Expected("Operator, got " + Operation)
EndSelect
Emit("mov eax, 0")
Emit("setnz al")
Wend
EndProcedure

Procedure Expression()
; expression ::= <BoolExp>
BoolExp()

If Look <> 0
Expected("End of line")
EndIf
EndProcedure

Init()
Expression()
Input()

1 comment:

Invincible Cow said...

I fixed a bug in RelExp() that prevented whitespace after single-character relational operators.

And also a serious bug in RelOp(): all comparisons were inversed. It is now fixed (I swapped the operands of cmp).