Static or Dynamic
It’s been quite some time since my last post because I’ve been considering making the language statically typed. Most of the languages I’ve made are dynamically typed (except for my most recent one) and they’re perfectly fine. The problem is, it’s particularly difficult optimizing such languages for real-world use without greatly increasing the size or complexity of the implementation. Ultimately, most dynamically typed languages are still strongly typed (save a few: https://www.destroyallsoftware.com/talks/wat), and errors which would be caught at compile time are caught at runtime. This runtime overhead is easily avoided by moving type-checking over to the compiler. The language runtime itself can still be “dynamic” but type errors can be caught at compile-time (this is what python does with type hints).
One of the most common arguments against static typing is that it adds a lot of superfluous noise to the source code:
BitBuffer* bitBuffer = new BitBuffer; // lots of typenames
And while that may be true, much of this noise can be removed by inferring types, making the code practically indistinguishable from its dynamically typed equivalent:
auto bitBuffer = new BitBuffer; // not very different from 'var bitBuffer = new BitBuffer;'
Do note that no type information is lost in the latter example.
Concerning performance, bytecode produced for a dynamically typed language will likely be much slower than for a statically typed one. For example:
def add(x, y):
return x + y
would produce something like (hypothetical machine code):
.add
type_of_x = typeof x ; types extracted at runtime
type_of_y = typeof y
beq type_of_x, int, add_x_int ; if the type of x is 'int' goto ...
beq type_of_x, float, add_x_float ; ...
print "Cannot add values"
error
.add_x_int
beq type_of_y, int, add_x_y_int
beq type_of_y, float, add_x_int_y_float
.add_x_float
beq type_of_y, int, add_x_float_y_int
beq type_of_y, float, add_x_float_y_float
.add_x_y_int
result = addi x, y
ret
.add_x_int_y_float
.add_x_float_y_int
print "attempted to add int and float"
error
.add_x_float_y_float
result = addf x, y
ret
Whereas a statically typed program like the following:
int add(int x, int y)
{
return x + y;
}
Would produce something like:
.add
result = addi x, y
ret
I’d say that’s a win.
Memory Management
Okay so languages with dynamic runtimes pretty much have to have a means of automatically managing memory because they simply do not know the size of any values at compile-time. If they didn’t, you’d have to write shit like:
def shit():
x = 10
free(x) # free the dynamically allocated integer value
Or at least:
def shit():
x = 10
release(x) # manual ref counting
Keeping track of dynamically allocated values is hard enough. Having to keep track of values allocated on the “stack” would only exacerbate the problem. So, most languages have garbage collectors or automatic reference counting + a garbage collector for reference cycles (like python).
New Syntax
Taking all that I’ve discussed into consideration, I’ve revised the syntax of “Tut” (the programming language we’re making) to the following:
var pos : vec2; // do note that the type is declared after its use
struct vec2
{
x : int;
y : int;
};
func make_vec2(x : int, y : int) : vec2
{
var result : vec2;
result.x = x;
result.y = y;
return result;
}
// functions which make use of strings should take cstr's as parameters
func puts(str : cstr) : void
{
printf("%s\n", str);
}
func main() : int
{
pos = make_vec2(10, 10);
printf("(%i, %i)\n", pos.x, pos.y);
var name : cstr = "Andy"; // "cstr" is an immutable reference to a string
var start_of_name : str = substr(name, 0, 2); // "str" is a mutable heap-allocated string
puts(name);
puts(startOfName); // "str"s are implicitly casted to cstr
free_str(start_of_name); // substr allocates memory so it has to be freed
return 0;
}
Yes, this is basically C with postfix types, but it has built-in strings and, more importantly, it’s ours to change as we see fit!
Well, I'll get to work on an implementation ASAP.
No comments:
Post a Comment