var x = 10
var y = 20
func add(x, y)
{
    return x + y
}
// this is a comment
write(add(x, y))
Let's begin!
Defining The Tokens
Before we can create our lexer, we must first define every valid token in the language. This is easy to do with a C enum:
#ifndef TUT_TOKEN_H
#define TUT_TOKEN_H

typedef enum
{
    TUT_TOK_NUMBER,
    TUT_TOK_STRING,
    TUT_TOK_IDENT,
    TUT_TOK_VAR,
    TUT_TOK_FUNC,
    TUT_TOK_RETURN,
    TUT_TOK_IF,
    TUT_TOK_ELSE,
    TUT_TOK_WHILE,
    TUT_TOK_OPENPAREN,
    TUT_TOK_CLOSEPAREN,
    TUT_TOK_OPENSQUARE,
    TUT_TOK_CLOSESQUARE,
    TUT_TOK_OPENCURLY,
    TUT_TOK_CLOSECURLY,
    TUT_TOK_COMMA,
    TUT_TOK_PLUS,
    TUT_TOK_MINUS,
    TUT_TOK_MUL,
    TUT_TOK_DIV,
    TUT_TOK_ASSIGN,
    TUT_TOK_EOF,
    TUT_TOK_ERROR,
    TUT_TOK_COUNT
} TutToken;

const char* Tut_TokenRepr(TutToken token);

#endif
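As you can see, every kind of token the lexer can produce is represented by one of these values. TUT_TOK_EOF marks the end of the input and TUT_TOK_ERROR marks input the lexer could not make sense of. TUT_TOK_COUNT is not a token at all; because it comes last, its value is simply the number of token types.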
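The header declares Tut_TokenRepr but the post does not show its body. As a minimal sketch, assuming the function is meant to return a short human-readable name for each token (the strings below are my own choice, not taken from the repository), a lookup table indexed by the enum works well. The enum is repeated so the snippet compiles on its own, and the table uses C99 designated initializers.

```c
#include <stddef.h>

/* Same enum as in the header, repeated so this sketch is self-contained. */
typedef enum
{
    TUT_TOK_NUMBER, TUT_TOK_STRING, TUT_TOK_IDENT,
    TUT_TOK_VAR, TUT_TOK_FUNC, TUT_TOK_RETURN,
    TUT_TOK_IF, TUT_TOK_ELSE, TUT_TOK_WHILE,
    TUT_TOK_OPENPAREN, TUT_TOK_CLOSEPAREN,
    TUT_TOK_OPENSQUARE, TUT_TOK_CLOSESQUARE,
    TUT_TOK_OPENCURLY, TUT_TOK_CLOSECURLY,
    TUT_TOK_COMMA, TUT_TOK_PLUS, TUT_TOK_MINUS,
    TUT_TOK_MUL, TUT_TOK_DIV, TUT_TOK_ASSIGN,
    TUT_TOK_EOF, TUT_TOK_ERROR,
    TUT_TOK_COUNT
} TutToken;

/* Assumed behavior: map each token to a printable name.
   The exact strings are illustrative, not the author's. */
const char* Tut_TokenRepr(TutToken token)
{
    static const char* const names[TUT_TOK_COUNT] = {
        [TUT_TOK_NUMBER]      = "number",
        [TUT_TOK_STRING]      = "string",
        [TUT_TOK_IDENT]       = "identifier",
        [TUT_TOK_VAR]         = "var",
        [TUT_TOK_FUNC]        = "func",
        [TUT_TOK_RETURN]      = "return",
        [TUT_TOK_IF]          = "if",
        [TUT_TOK_ELSE]        = "else",
        [TUT_TOK_WHILE]       = "while",
        [TUT_TOK_OPENPAREN]   = "(",
        [TUT_TOK_CLOSEPAREN]  = ")",
        [TUT_TOK_OPENSQUARE]  = "[",
        [TUT_TOK_CLOSESQUARE] = "]",
        [TUT_TOK_OPENCURLY]   = "{",
        [TUT_TOK_CLOSECURLY]  = "}",
        [TUT_TOK_COMMA]       = ",",
        [TUT_TOK_PLUS]        = "+",
        [TUT_TOK_MINUS]       = "-",
        [TUT_TOK_MUL]         = "*",
        [TUT_TOK_DIV]         = "/",
        [TUT_TOK_ASSIGN]      = "=",
        [TUT_TOK_EOF]         = "end of file",
        [TUT_TOK_ERROR]       = "error"
    };

    /* TUT_TOK_COUNT doubles as the table size, so anything at or
       past it (or negative) is not a real token. */
    if ((int)token < 0 || (int)token >= TUT_TOK_COUNT || names[token] == NULL)
        return "unknown";
    return names[token];
}
```

Sizing the table with TUT_TOK_COUNT means that adding a new token to the enum without naming it leaves a NULL slot, which this sketch reports as "unknown" instead of crashing.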
The Lexer
The job of the lexer is to take a string containing "Tut" code and turn it into "TutToken" values, one at a time, on demand. This is what the header file of the lexer looks like.
// TODO: Finish blog post
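The lexer header itself is not shown in this post. To make the "on demand" idea concrete, here is a hypothetical sketch of such a lexer; TutLexer, Tut_InitLexer, Tut_GetToken and TUT_MAX_LEXEME are names I made up for illustration and may not match the repository. Each call to Tut_GetToken skips whitespace and // comments, reads exactly one token, and stores its text in the lexeme buffer.

```c
#include <ctype.h>
#include <string.h>

/* Same enum as in the token header, repeated so this compiles on its own. */
typedef enum
{
    TUT_TOK_NUMBER, TUT_TOK_STRING, TUT_TOK_IDENT, TUT_TOK_VAR, TUT_TOK_FUNC,
    TUT_TOK_RETURN, TUT_TOK_IF, TUT_TOK_ELSE, TUT_TOK_WHILE,
    TUT_TOK_OPENPAREN, TUT_TOK_CLOSEPAREN, TUT_TOK_OPENSQUARE,
    TUT_TOK_CLOSESQUARE, TUT_TOK_OPENCURLY, TUT_TOK_CLOSECURLY,
    TUT_TOK_COMMA, TUT_TOK_PLUS, TUT_TOK_MINUS, TUT_TOK_MUL, TUT_TOK_DIV,
    TUT_TOK_ASSIGN, TUT_TOK_EOF, TUT_TOK_ERROR, TUT_TOK_COUNT
} TutToken;

#define TUT_MAX_LEXEME 256 /* hypothetical limit on token text length */

typedef struct
{
    const char* pos;             /* next unread character of the source */
    char lexeme[TUT_MAX_LEXEME]; /* text of the most recently read token */
} TutLexer;

void Tut_InitLexer(TutLexer* lexer, const char* source)
{
    lexer->pos = source;
    lexer->lexeme[0] = '\0';
}

/* Copy [start, end) into the lexeme buffer, truncating if it is too long. */
static void SetLexeme(TutLexer* lexer, const char* start, const char* end)
{
    size_t len = (size_t)(end - start);
    if (len >= TUT_MAX_LEXEME) len = TUT_MAX_LEXEME - 1;
    memcpy(lexer->lexeme, start, len);
    lexer->lexeme[len] = '\0';
}

/* Read and return the next token; returns TUT_TOK_EOF at end of input. */
TutToken Tut_GetToken(TutLexer* lexer)
{
    static const struct { const char* text; TutToken token; } keywords[] = {
        { "var", TUT_TOK_VAR }, { "func", TUT_TOK_FUNC },
        { "return", TUT_TOK_RETURN }, { "if", TUT_TOK_IF },
        { "else", TUT_TOK_ELSE }, { "while", TUT_TOK_WHILE }
    };
    const char* start;
    size_t i;

    for (;;) { /* skip whitespace and "//" comments */
        while (isspace((unsigned char)*lexer->pos)) ++lexer->pos;
        if (lexer->pos[0] != '/' || lexer->pos[1] != '/') break;
        while (*lexer->pos && *lexer->pos != '\n') ++lexer->pos;
    }

    start = lexer->pos;
    if (*start == '\0') { lexer->lexeme[0] = '\0'; return TUT_TOK_EOF; }

    if (isalpha((unsigned char)*start) || *start == '_') { /* keyword or identifier */
        while (isalnum((unsigned char)*lexer->pos) || *lexer->pos == '_') ++lexer->pos;
        SetLexeme(lexer, start, lexer->pos);
        for (i = 0; i < sizeof keywords / sizeof keywords[0]; ++i)
            if (strcmp(lexer->lexeme, keywords[i].text) == 0) return keywords[i].token;
        return TUT_TOK_IDENT;
    }

    if (isdigit((unsigned char)*start)) { /* digits with an optional fraction */
        while (isdigit((unsigned char)*lexer->pos)) ++lexer->pos;
        if (lexer->pos[0] == '.' && isdigit((unsigned char)lexer->pos[1])) {
            ++lexer->pos;
            while (isdigit((unsigned char)*lexer->pos)) ++lexer->pos;
        }
        SetLexeme(lexer, start, lexer->pos);
        return TUT_TOK_NUMBER;
    }

    if (*start == '"') { /* string literal, no escape sequences in this sketch */
        ++lexer->pos;
        while (*lexer->pos && *lexer->pos != '"') ++lexer->pos;
        if (*lexer->pos != '"') return TUT_TOK_ERROR; /* unterminated string */
        SetLexeme(lexer, start + 1, lexer->pos);
        ++lexer->pos;
        return TUT_TOK_STRING;
    }

    ++lexer->pos; /* everything left is a single-character token */
    SetLexeme(lexer, start, lexer->pos);
    switch (*start) {
        case '(': return TUT_TOK_OPENPAREN;   case ')': return TUT_TOK_CLOSEPAREN;
        case '[': return TUT_TOK_OPENSQUARE;  case ']': return TUT_TOK_CLOSESQUARE;
        case '{': return TUT_TOK_OPENCURLY;   case '}': return TUT_TOK_CLOSECURLY;
        case ',': return TUT_TOK_COMMA;       case '=': return TUT_TOK_ASSIGN;
        case '+': return TUT_TOK_PLUS;        case '-': return TUT_TOK_MINUS;
        case '*': return TUT_TOK_MUL;         case '/': return TUT_TOK_DIV;
        default:  return TUT_TOK_ERROR;
    }
}
```

To use it, call Tut_InitLexer once with the source string and then call Tut_GetToken in a loop until it returns TUT_TOK_EOF. Because the lexer only keeps a pointer into the source and a single lexeme buffer, nothing is tokenized ahead of time, which is what lets a parser pull tokens as it needs them.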
Here is the repository thus far.