30 August 2009

7. Subtraction and Division

These are easy. The instruction to subtract on x86 is sub. It is used like the add instruction. The only difference is that with subtraction, the order of the operands matters.
10-3 <> 3-10

The x86 instruction idiv is rather clunky. The format of a division is dividend/divisor. idiv requires that the dividend is twice as large as the division. So if the divisor is a long, then the dividend must be a quad. And also, the dividend must reside in eax. Now eax doesn't actually hold a quad, so in this case, idiv requires that one part of the quad is in edx and the other part is in eax!
The result is always placed in eax. Additionally, the remainder is placed in edx.
Also we have to be sure to get the order of the operands correct here as well, since 10/2 is not the same as 2/10.

Let's make a new grammar with the new operators:

mulop ::= * | /
addop ::= + | -
mulexp ::= <value> {<mulop> <value>}
addexp ::= <mulexp> {<addop> <mulexp> }
expression ::= <addexp>

As you can see, + and - have the same precedence, so does * and /.

Make a copy of mul.pb (call it subdiv.pb). Adding subtraction to the code is done like this. I also replaced all occurences of ecx with ebp to make the generated assembly code easier to read. Ecx looks too much like eax. Contrary to what some people believe, ebp is not some kind of special register. It is a general-purpose register just like ecx.
Procedure AddExp()
; addexp ::= <mulexp> {<addop> <mulexp> }
Protected Operation.c
MulExp()
While Look = '+' Or Look = '-'
Operation = Look
GetLookWhite()
Emit("push eax")
MulExp()
Emit("pop ebp")
Select Operation
Case '+': Emit("add eax, ebp")
Case '-': Emit("sub eax, ebp")
Emit("neg eax")
EndSelect
Wend
EndProcedure

We store the operator in a variable to preserve it across the calls to GetLookWhite() and Mulexp(). Then we negate the result after subtracting to get the correct sign. We have to do this because we are subtracting ebp from eax instead of the other way around (which is what we really want).
We could replace
sub  eax, ebp
neg eax

with
sub  ebp, eax
mov eax, ebp

but simply negating the result is faster.

Try the code now. Mix addition, subtraction and multiplication, and test the resulting code to see if it's correct.

Now for division. Here is the code:

Procedure MulExp()
; mulexp ::= <value> {<mulop> <value>}
Protected Operation.c
Value()
While Look = '*' Or Look = '/'
Operation = Look
GetLookWhite()
Emit("push eax")
Value()
Emit("pop ebp")
Select Operation
Case '*': Emit("imul eax, ebp")
Case '/': Emit("xchg eax, ebp")
Emit("cdq") ; expands eax to edx:eax
Emit("idiv ebp")
EndSelect
Wend
EndProcedure


With division, we also get the operands in the reverse order. However, we can't just negate the result. While -(10-2) equals 2-10, -(10/2) does not equal 2/10. So we have to exchange the operands before doing the division. We do that with the xchg instruction.
I should mention here that you must never use the xchg instruction with a memory operand (unless you are coding a mutex). This is because xchg with memory "asserts a bus lock". This just means it's very slow. So multiple mov instructions are faster. But with registers only, it's fine.
The cdq (convert dword to quad) instruction makes a quad out of the value stored in eax. Or technically, it sign extends eax to edx:eax. Which becomes the input of the idiv instruction.
If you ever find yourself doing idiv edx, you're doing it wrong. Because edx:eax is already the input, so you'll get a nonsense result.
Finally, idiv ebp divides edx:eax by ebp and leaves the result in eax. Also, it leaves the remainder of the division in edx. (Example: the remainder of 3/2 is 1. And of 5/2, also 1.)

The operator for getting the remainder is called modulo, and is usually % or mod. PB uses %, so we'll do that, too.

Here is the full code, including addition, subtraction, multiplication, division and modulo:
; subdiv.pb
IncludeFile "..\base2.pb"

Procedure Value()
; value ::= <integer>
Emit("mov eax, " + GetInteger())
EndProcedure

Procedure MulExp()
; mulexp ::= <value> {<mulop> <value>}
Protected Operation.c
Value()
While Look = '*' Or Look = '/' Or Look = '%'
Operation = Look
GetLookWhite()
Emit("push eax")
Value()
Emit("pop ebp")
Select Operation
Case '*': Emit("imul eax, ebp")
Case '/', '%'
Emit("xchg eax, ebp")
Emit("cdq") ; expands eax to edx:eax
Emit("idiv ebp")
EndSelect
If Operation = '%'
Emit("mov eax, edx")
EndIf
Wend
EndProcedure

Procedure AddExp()
; addexp ::= <mulexp> {<addop> <mulexp> }
Protected Operation.c
MulExp()
While Look = '+' Or Look = '-'
Operation = Look
GetLookWhite()
Emit("push eax")
MulExp()
Emit("pop ebp")
Select Operation
Case '+': Emit("add eax, ebp")
Case '-': Emit("sub eax, ebp")
Emit("neg eax")
EndSelect
Wend
EndProcedure

Procedure Expression()
; expression ::= <AddExp>
AddExp()

If Look <> 0
Expected("End of line")
EndIf
EndProcedure

Init()
Expression()
Input()

6. Multiplication

With the ease with which we eased us through addition, one could be lead to believe that multiplication was equally easy. After all, multiplication is almost the same as addition, a number on each side of an operator (4*5) and a cpu instruction to calculate the result. But not so fast. The difficulty arises when multiplication and addition are combined.
Compute 5+5*3. No, it's not 30, it's 20. That's the difficult part: multiplication must be done before addition. A general term for this is operator precedence. Multiplication has a higher precedence than addition, so it must be computed first. Thus, 5+5*3 is equal to 5+(5*3) rather than (5+5)*3.

Let's try to write a BNF for this, so that we have some guidance when implementing it. Here is our old BNF (for addition only)
addexp     ::= <value> {+ <value>}
expression ::= <addexp>

Here it is with support for multiplication:
mulexp     ::= <value> {* <value>}
addexp ::= <mulexp> {+ <mulexp> }
expression ::= <addexp>

This may be seem strange. Why does it work? Let's look at 1+5*3. First expression calls addexp. Then addexp calls mulexp. And mulexp calls value, which loads 1 into eax. Control is returned to mulexp, which sees that Look is not *. Control is returned to addexp, which sees the + sign and pushes eax.
Everything is working normally so far (except that we did an unnecessary roundtrip through mulexp). Now comes the clever part. We are still in addexp, and 1 is pushed on the stack. Instead of calling value directly to get the next number, addexp calls mulexp, which in turn calls value to get the 5 into eax.
Now mulexp sees a *, so it pushes the 5 onto the stack, calls value again to get 3 in eax, and multiplies eax with the 5. So we now have the result of 3*5 in eax and 1 on the stack. (mulexp popped the 5 when multiplying.)
Finally we return addexp again, which adds the 1 from the stack to eax (15), to get the final result, 16.

The bottom line is: we multiplied before adding. You should read the description above again while looking at the actual code. If you need to, use the debugger to single-step through the code with the description above as an assistance to see what's going on.

The actual code is straightforward. AddExp() is modified to call MulExp() instead of Value().
MulExp() is written like the old AddExp(), but checks for the multiplication operator (*) instead of the addition operator (+) and, of course, issues an imul cpu instruction instead of an add instruction.

I recommend that you look here: http://flatassembler.net/docs.php?article=manual to get a description of all x86 assembly instructions. You can also download the same file as a pdf file.

In case you were wondering why we are using imul instead of just mul, it's because imul is for signed multiplication while mul is for unsigned for multiplication.
; mul.pb
IncludeFile "..\base2.pb"

Procedure Value()
; value ::= <integer>
Emit("mov eax, " + GetInteger())
EndProcedure

Procedure MulExp()
; mulexp ::= <value> {* <value>}
Value()
While Look = '*'
GetLookWhite()
Emit("push eax")
Value()
Emit("pop ecx")
Emit("imul eax, ecx")
Wend
EndProcedure

Procedure AddExp()
; addexp ::= <mulexp> {+ <mulexp> }
MulExp()
While Look = '+'
GetLookWhite()
Emit("push eax")
MulExp()
Emit("pop ecx")
Emit("add eax, ecx")
Wend
EndProcedure

Procedure Expression()
; expression ::= <AddExp>
AddExp()

If Look <> 0
Expected("End of line")
EndIf
EndProcedure

Init()
Expression()
Input()

Try to run it. Test it with some expressions and see if the generated code is correct (see the end of the tutorial "addition" for a how-to). Run it with single-stepping (see the PB help file about the debugger) to understand what's really going on in there.

If the program above works, and you understand why/how it works, then you have just completed the hardest part of writing a compiler.;
Once you grasp BNF and parsing you can do almost anything you want to (except generate efficient code, which is a problem all by itself). Mind you, there's a lot of work left, but when you understand this program you have passed a hurdle that enables you to learn the more advanced stuff. It's like creating a new space in your mind. Sort of being born. It doesn't automatically make you a great programmer, but it's the biggest hurdle.

Oups. I likened mastering multiplication to being born. What an unsuitable pun...

I repeat: You have just completed the hardest part of writing a compiler.

5. Base 2

If you think I'm talking about baseball here, you're way off base. (That was a baseball expression. Not the kind of expressions we are working with.)
What we need is a new base.pb with more functionality. Make a copy of base.pb and name it base2.pb.

We want to add some new stuff. Here is a list:
- Move IsDigit() and GetDigit() into base2.pb
- Add support for integers consisting of multiple digits (like 128)
- Add support for names (IsAlpha(), IsAlphaNumeric() and GetName())
- Add support for whitespace

You should try to write these procedures yourself. But not without some guidance. Here's what you do:

- Copy IsDigit() from add2.pb into base2.pb

- Write a new procedure IsWhite(). BNF for whitespace:
white ::= #SPACE | #TAB
Return 1 if yes, 0 else.

- Write a procedure EatWhite() which calls GetLook() while Look is whitespace.

- Write GetLookWhite() which calls GetLook() and then EatWhite().

- Write a new procedure: GetInteger() to replace GetDigit(). Here is the BNF grammar for what input it should accept:
integer ::= <digit> {<digit>}
The integer should be returned as a string. If you're into error handling, give an error if the number is outside the range of a 32-bit signed integer. Be sure to give an error if Look is not a digit when the procedure is called (because this is REQUIRED, see the BNF).
Since we want support for whitespace after numbers (so you can type 1+ 2 instead of 1+2) call EatWhite() at the end.

- Write a new procedure IsAlpha(C.c) that returns 1 if the parameter is an alphabetic character. Here is the BNF:
alpha ::= A..Z | _ | $
As you can see, I included underline and $. You may or may not want to do that.

- Write a new procedure IsAlphaNumeric(). Use IsAlpha() and IsDigit() to prevent doing the same work twice.

- Write a new procedure GetName() that returns a name as a string. If you ever heard the word "identifier", this is the same, only a much longer word. Grammar:
name ::= <alpha> {<digit> | <alpha>}
Notice that a name must start with an alpha (defined above), but subsequent characters may be alphanumeric.
Call EatWhite() at the end to allow for program whitespace.

Test all your procedures to make sure they do what you expect.

Here's my new base2.pb. You should compare your procedures with mine to be sure they work the same:
Global Look.c
Global Instream.s

Procedure GetLook()
Look = Asc(Instream)
Instream = Mid(Instream, 2)
EndProcedure

Procedure Error(E.s)
PrintN("Error: " + E)
Input()
End
EndProcedure

Procedure Expected(S.s)
Error(S + " expected.")
EndProcedure

Procedure Emit(S.s)
PrintN(#TAB$ + S)
EndProcedure

Procedure Init()
OpenConsole()
Instream = Input()
GetLook()
EndProcedure

;- Additions:

Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
EndIf
EndProcedure

Procedure IsAlpha(C.c)
Select C
Case 'a' To 'z', 'A' To 'Z', '_', '$'
ProcedureReturn 1
EndSelect
EndProcedure

Procedure IsAlphaNumeric(C.c)
If IsAlpha(C) Or IsDigit(C)
ProcedureReturn 1
EndIf
EndProcedure

Procedure IsWhite(C.c)
If C = ' ' Or C = #TAB
ProcedureReturn 1
EndIf
EndProcedure

Procedure EatWhite()
While IsWhite(Look)
GetLook()
Wend
EndProcedure

Procedure.s GetInteger()
Protected Name.s
While IsDigit(Look)
Name + Chr(Look)
GetLook()
Wend
If Not Name
Expected("Integer")
EndIf
EatWhite()
ProcedureReturn Name
EndProcedure

Procedure.s GetName()
Protected Name.s
While IsAlphaNumeric(Look)
Name + Chr(Look)
GetLook()
Wend
If Not Name
Expected("Name")
EndIf
EatWhite()
ProcedureReturn Name
EndProcedure

Procedure GetLookWhite()
GetLook()
EatWhite()
EndProcedure

With so much new features it would be sad not to use them right away. So make a copy of add2.pb (call it add2_base2.pb).
You only need to make three changes to it. The first is to include base2.pb instead of base.pb. Then change GetDigit() into GetInteger() and GetLook() in AddExp() into GetLookWhite().
The result: we now handle any integers, including arbitary whitespace. Try, for example,
"128 +    56+8"


Here it is:
; add2_base2.pb
IncludeFile "..\base2.pb" ; Changed from base.pb
Init()

Procedure Value()
; value ::= <integer>
Emit("mov eax, " + GetInteger()) ; Changed from GetDigit()
EndProcedure

Procedure AddExp()
; addexp ::= <value> {+ <value>}
Value()
While Look = '+'
GetLookWhite() ; Changed from GetLook()
Emit("push eax")
Value()
Emit("pop ecx")
Emit("add eax, ecx")
Wend
EndProcedure

Procedure Expression()
; expression ::= <addexp>
AddExp()

If Look <> 0
Expected("End of line")
EndIf
EndProcedure

; Parse an expression
Expression()
Input()

4. Addition

We can now have an expression that consists of a single number. If you don't remember, our definition of an expression was

expression ::= <digit>

The next step is to add a + operator. We may write a BNF like this:
digit  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 
expression ::= <digit> + <digit>

Unfortunately, this means that an expression can't be just one digit any more. It must always be a digit plus another digit. So let's use this BNF instead, it allows a digit and optionally, a plus sign and another digit. The [] symbol means whatever is inside the brackets is optional.
digit  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 
expression ::= <digit> [+ <digit>]


Let's make a copy of the file from the previous chapter (name it add.pb) and extend it to accept addition. To do that, all we need to change is the procedure Expression(), since we only changed that rule in our grammar.
Procedure Expression()
; Here's the BNF to look at while writing the procedure:
; expression ::= <digit> [+ <digit>]
Emit("mov eax, " + GetDigit())

If Look = '+' ; remember, the + is optional
GetLook() ; Throw away the + and get the next character
Emit("add eax, " + GetDigit())
EndIf

If Look <> 0 ; The grammar does not allow anything after the digit
Expected("End of line") ; so if there is anything, put an error
EndIf
EndProcedure

Try it. This program should accept inputs simiar to this:
1
7+3
And refuse all other inputs, like this:
7+3+5
Var+4

You probably don't want to be limited to expression with just two terms, so let's find a way to extend our parser to accept any number of additions in the same expression.
expression ::= <digit> {+ <digit>}
The { and } means whatever is inside is optional and could be repeated any number of times. Thus, legal expressions will now be:
1
1+2
1+2+3+4+5+6+7+8+1+2+6+7
This is exactly what we want.

Before we go overboard and implement this, I'd like to make a few changes to make the program more modular:
value  ::= <digit>
addexp ::= <value> {+ <value>}
expression ::= <addexp>
Later on, when we need other values than digits, like variables, we don't need to change addexp or expression, only value.

Here is the program:
; add2.pb
IncludeFile "..\base.pb"
Init()

Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
EndIf
EndProcedure

Procedure.s GetDigit()
If IsDigit(Look)
Temp.s = Chr(Look)
GetLook()
ProcedureReturn Temp
Else
Expected("Digit")
EndIf
EndProcedure

Procedure Value()
; value ::= <digit>
Emit("mov eax, " + GetDigit())
EndProcedure

Procedure AddExp()
; addexp ::= <value> {+ <value>}
Value()
While Look = '+'
GetLook()
Emit("push eax")
Value()
Emit("pop ecx")
Emit("add eax, ecx")
Wend
EndProcedure


Procedure Expression()
; expression ::= <addexp>
AddExp()

If Look <> 0
Expected("End of line")
EndIf
EndProcedure

; Parse an expression
Expression()
Input()

As you can see, Value() always moves the value into eax. Then AddExpr() needs to save that value on the stack so that it doesn't get overwritten by the next call to Value(). So our generated code for the expression "1+2" just got worse by a huge amount. Previously just two instructions, now it's five!
You just have to get used to this. It is possible to generate better code, but it is harder. For beginners, this will do. Remember, it's probably way faster than a normal interpreter would be.
Try the program with a few expressions and see if it generates valid code. That is the most important of all. If the generated code doesn’t work correctly, it doesn't matter it it's really, really fast.

To check that a generated code produces the correct result, you can copy and paste it into purebasic, enable inline assembly, and run it with some code to display the result. Here is an example:

; Show the correct result
; (hopefully PB gets it right!)
Debug 1+2+3+4

; Generated code:
MOV eax, 1
PUSH eax
MOV eax, 2
POP ecx
ADD eax, ecx
PUSH eax
MOV eax, 3
POP ecx
ADD eax, ecx
PUSH eax
MOV eax, 4
POP ecx
ADD eax, ecx

; Get result
Show.l
MOV [v_Show], eax
Debug Show


If you have in front of you a program that produces valid x86 code from the entered expressions: congratulations! If not, then be ashamed, you brickhead! (Or rather, I didn't explain everything properly. Please post a comment to get help.)

29 August 2009

3. Expressions: a digit

Most programming languages have a concept of expressions. For our purpose, an expression is something which can be assigned to a variable.
Some example expressions:

(R*Alpha)/255) + ((R2*(255-Alpha)) / 255
0
SingleVariable
5+IWantCake
17*85 - 5/2


To write a compiler for a procedural language you need to be able to compile an expression. Since they are so important, we will start by dealing with them.

The first thing we need to do is to extract "meaning" from the input characters. The input characters in basetest.pb was all treated in the same way (printed to output). When compiling an expression, this is not so. The different characters have different meaning. For example, + means to add something, while 1, 45 and 153.56 are numbers.

"Making sense of" or "extracting meaning out of" a long list of characters is called parsing. We want to parse an expression. So how do we do that? We start with the simplest case possible and build to it.

The simplest expression is a single digit. To recognize a single digit in the input stream of characters we should have a procedure to check if a character is a digit or if it is something else entirely. One could write it like this:
Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
Else
ProcedureReturn 0
EndIf
EndProcedure


If C is a digit, it returns 1, else 0.

Or less verbosely, like this:
Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
EndIf
EndProcedure

(PureBasic procedures returns 0 when they reach the end.)

Or even like this:
Procedure IsDigit(C.c)
Select C
Case '0' To '9'
ProcedureReturn 1
EndSelect
EndProcedure

The important part is: does it work? We must always check that!
Debug "Not digits:"
Debug IsDigit('a')
Debug IsDigit('b')
Debug IsDigit('C')
Debug IsDigit('?')
Debug IsDigit('!')
Debug "Digits:"
Debug IsDigit('0')
Debug IsDigit('4')
Debug IsDigit('9')


Recognizing things like numbers without making sense of them (we know it's a number, but we don't know if it's part of an assignment or part of an array declaration) is called lexing or scanning, depending on who you ask.

So, there was three different procedures for recognizing a digit, yet all of them did the same. With another programming language, like C, the procedures would look even more different. So is there a simple way of specifying what we want (a digit) in an implementation-independent way? Yes: BNF, or Backus-Naur Form.

Instead of defining what is a digit by writing a procedure to recognize it, we can use a statement in BNF to say what is a digit:

digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9


This reads "digit has the form 0 or 1 or 2…" and so on. As you can see, this is shorter and clearer than either of the procedures. That is why we use BNF: it allows us to precisely specify what we want in a concise and implementation-independent manner. No matter which of the above procedures you use, it will use the definition that a digit is a character between 1 and 9.

Now that we have a way to recognize digits, let's write a program that requires that the first character of the input is a digit.

; new folder\readnum1.pb
IncludeFile "..\base.pb"
Init()

Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
EndIf
EndProcedure

Procedure Digit()
If IsDigit(Look)
Emit("found digit!")
Else
Expected("digit")
EndIf
EndProcedure

Digit()
Input() ; Wait for enter so we can see the output


Try it with different input and see if it reacts correctly. If not, then go back and check what is wrong.

It's nice, but it only tells us we got a digit, not which one! Not very useful, actually. Let's write a procedure that actually gives us the digit.

Procedure.s GetDigit()
If IsDigit(Look)
Temp.s = Chr(Look)
GetLook()
ProcedureReturn Temp
Else
Expected("Digit")
EndIf
EndProcedure


As you can see, we have error handling here. Catching errors early on will simplify everything.

Remember why we wanted to parse a digit? It was because we really wanted to parse an expression. Let's write the grammar (in BNF) for an expression which can only consist of one thing: a single digit.

digit  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 
expression ::= <digit>


is inside clips to signify that we reference the rule digit (0 | 1 | 2 … 8 | 9) rather than wanting the actual characters "digit". An important side effect of this way of defining an expression is that it can never empty, nor can it be multiple digits. It can only be exactly one digit.

Now we are ready to write a program which implements the grammar from above:

IncludeFile "..\base.pb"
Init()

Procedure IsDigit(C.c)
If C >= '0' And C <= '9'
ProcedureReturn 1
EndIf
EndProcedure

Procedure.s GetDigit()
If IsDigit(Look)
Temp.s = Chr(Look)
GetLook()
ProcedureReturn Temp
Else
Expected("Digit")
EndIf
EndProcedure

Procedure Expression()
Emit("mov eax, " + GetDigit())
If Look <> 0 ; The grammar does not allow anything after the digit
Expected("End of line") ; so if there is anything, put an error
EndIf
EndProcedure

; Parse an expression
Expression()
Input()


Try it with various inputs and see how it:
- Refuses the empty input
- Refuses any letter or special symbol as input
- Gives an error when the input contains more than one digit
- Accepts the input without error ONLY when there is EXACTLY one digit
This is exactly as specified in the BNF grammar. I'll repeat it here so you can see:

digit  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 
expression ::= <digit>


Also, this programs generates actual, valid assembly code (for 32-bit x86). That is just so wow!

2. Testing base.pb

Let's see what base.pb can do for us. Create a new folder ("testbase") and make a new purebasic source file in it. Then write this into the .pb file:

IncludeFile "..\base.pb" ; Include the base file from the parent folder

Init() ; Init input by reading a line from the console
Global Buffer.s ; A temporary string used by this example

While Look ; Look is the current character we are working with
; Run this loop until Look becomes 0
Buffer + Chr(Look) ; Put Look (converted to a string) at the end of our string
GetLook() ; Discard the current value in Look and read in a new character
Wend

; Now our Buffer will be filled with all the characters read into Look

Emit(Buffer) ; Let's show it just to be sure

Input() ; Wait for enter so we can the output


When you run it, you should see a console window. When you type in some text and hit enter, what you should typed should come back at you, only indented. Read this program (and if necessary, base.pb) thoroughly to understand what's going on.

1. Base

Some people wanted to write a compiler in PureBasic, but didn't know how to. So hence I started this tutorial to help people get started. Once you grasp the basics you can expand on your own.

This tutorial is in many ways similar to "Let's write a compiler" by Jack Crenshaw, because that's the tutorial I read to learn this. Now come sue me.

A compiler reads input text (called the source language) and turns that into assembly language (called the target language). Since we need to experiment a lot to understand what's going it would be wise to have a set of common input and output routines. I already made some for you. Unlike a real compiler, which will read from a file and write to another file, these procedures read from the console and write to the console. This makes them suitable for experimenting. Later we will see that retargeting the procedures to read and write files will be simple.

Save this as base.pb. Then see if it compiles and runs (it should do nothing).
Global Look.c
Global Instream.s

Procedure GetLook()
Look = Asc(Instream)
Instream = Mid(Instream, 2)
EndProcedure

Procedure Error(E.s)
PrintN("Error: " + E)
Input()
End
EndProcedure

Procedure Expected(S.s)
Error(S + " expected.")
EndProcedure

Procedure Emit(S.s)
PrintN(#TAB$ + S)
EndProcedure

Procedure Init()
Instream = Input()
GetLook()
EndProcedure