31 January 2004

A tiny compiler, part 5

I'm still working on my prototype compiler. I've created a keyboard loop, a routine to add bytes to the code space, and a framework for passing words to the compiler routines. Now I'm writing and testing another routine the compiler will need.

The "compiler routines" will be compile, define, execute, and hex. (I already added these as do-nothing routines; I'll flesh these out later.) The hex routine expects that the (magenta) word that the user has just typed is a number in hexadecimal format, so hex needs code to convert the two characters into a byte value. This is what hex_to_byte does.

Code

As usual, I've prepared boot-sector source code that you can try out.

hex_to_byte

The hex_to_byte routine has to take the two characters in newword and convert them into a single byte in AL. The routine caches the two characters in BX and starts to work on the first (left) character. (Remember that newword stores the new word backwards, so the first character is in the high byte, that is, in BH.)

    hex_to_byte:
        mov bx, [newword]
        mov al, bh

The routine passes AL to a private subroutine, to convert the hexadecimal numeral into a binary value between 0 and 15. The character being converted was the first of the two, representing the high four bits of the byte value, so the resulting binary value is shifted into the high four bits of AL, and the result is stashed in AH for use later.

        call .1
        shl al, 4
        mov ah, al

The routine then retrieves the second of the two characters (representing the low four bits of the byte value), converts the character, and combines the result with the high four bits stashed in AH. The byte value is now in AL, so the job is finished.

        mov al, bl
        call .1
        or al, ah
        ret

Here is that private subroutine that hex_to_byte uses. It expects a hexadecimal character in AL. If the character is a numeral, then I simply subtract from it the value for the numeral zero and return.

    .1:
        cmp al, '9'
        ja .2
        sub al, '0'
        ret

If the character is a letter, then I mask out bit 5, which has no effect on an uppercase letter but will convert a lowercase letter into an uppercase one.

    .2:
        and al, 0x5F

Then I can convert the letter. 'A' becomes 0x0A, 'B' becomes 0x0B, and so on.

        sub al, ('A'-10)
        ret

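
As a cross-check on the logic, here is a minimal sketch of the same conversion in Python rather than assembly. The names hex_digit_value and newword_from_typed are made up for this illustration; only hex_to_byte and newword come from the routine above. Like the assembly, the sketch assumes both characters are valid hexadecimal digits and does no validation.

```python
def hex_digit_value(ch: int) -> int:
    """Model of the private subroutine (.1 / .2): ASCII code -> 0..15.

    Hypothetical name. No validation, just like the assembly:
    a non-hex character produces a meaningless value.
    """
    if ch <= ord('9'):                     # cmp al, '9' / ja .2
        return (ch - ord('0')) & 0xFF      # sub al, '0'
    ch &= 0x5F                             # and al, 0x5F (clear bit 5)
    return (ch - (ord('A') - 10)) & 0xFF   # sub al, ('A'-10)


def hex_to_byte(newword: int) -> int:
    """Model of hex_to_byte: newword is the 16-bit value loaded into BX."""
    bh = (newword >> 8) & 0xFF             # first typed character
    bl = newword & 0xFF                    # second typed character
    ah = (hex_digit_value(bh) << 4) & 0xFF  # shl al, 4 / mov ah, al
    return hex_digit_value(bl) | ah        # or al, ah


def newword_from_typed(text: str) -> int:
    """Hypothetical helper: pack two typed characters the way newword
    holds them, with the first character in the high byte."""
    first, second = text.encode("ascii")
    return (first << 8) | second


if __name__ == "__main__":
    # Typing "1a" should yield the byte 0x1A.
    print(hex(hex_to_byte(newword_from_typed("1a"))))
    # A word assembled as db 'a1' is read as a little-endian 16-bit value,
    # which puts '1' in the high byte, so it is the same input.
    print(hex(hex_to_byte(int.from_bytes(b"a1", "little"))))
```

Reading the two bytes with int.from_bytes(..., "little") mirrors what mov bx, [newword] does on x86, which is why a word written as 'a1' in memory behaves as if the user had typed 1a.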
Test code

The test code comes with a list of sixteen newwords, which are copied one at a time into newword. The hex_to_byte routine converts each newword into a byte value, which is stored into a sixteen-byte buffer at address 0x8000. After all sixteen bytes are stored, dump_16 displays the contents of the buffer.

        mov bx, 15
    .1:
        push bx
        add bx, bx
        mov ax, [newwords+bx]
        mov [newword], ax
        call hex_to_byte
        pop bx
        mov [0x8000+bx], al
        dec bx
        jns .1
        mov bx, 0x8000
        call dump_16
        jmp short $

Results

The newwords look like this:

    newwords:
        db '01','23','45','67','89','AB','CD','EF'
        db '65','74','83','92','a1','b0','cf','de'

Each of these newwords is stored with bytes reversed, as they would be if they had been typed in (that is, '01' is what newword would be if you typed in 10, and so on), so the buffer into which the generated byte values are stored looks like this:

    0000:8000: 10 32 54 76 98 BA DC FE 56 47 38 29 1A 0B FC ED | >2Tv....VG8)....

Still to come

I still have to do the following:
One last thing: I implied in the beginning that I was going to provide a way for this compiler to store and use precode, but I don't think I'll do that. I'll start work on the real compiler soon, the 32-bit version, and it will have a different design.
It's the end of the month, and I wanted to add one more entry for January 2004. Check the index for other entries.