diff --git a/site/404.html b/site/404.html index 7e3dbe63e..d3851f45f 100644 --- a/site/404.html +++ b/site/404.html @@ -1,5 +1,5 @@ - +
A simple example is constant folding: if some expression always evaluates to the exact same value, we can do the evaluation at compile time and replace the code for the expression with its result. If the user typed in this:
-pennyArea = 3.14159 * (0.75 / 2) * (0.75 / 2); +pennyArea = 3.14159 * (0.75 / 2) * (0.75 / 2);we could do all of that arithmetic in the compiler and change the code to:
-pennyArea = 0.4417860938; +pennyArea = 0.4417860938;Optimization is a huge part of the programming language business. Many language hackers spend their entire careers here, squeezing every drop of performance diff --git a/site/a-tree-walk-interpreter.html b/site/a-tree-walk-interpreter.html index b8b8c19a8..03f51395b 100644 --- a/site/a-tree-walk-interpreter.html +++ b/site/a-tree-walk-interpreter.html @@ -1,5 +1,5 @@ - +
A Tree-Walk Interpreter · Crafting Interpreters diff --git a/site/a-virtual-machine.html b/site/a-virtual-machine.html index 8377b5aa6..4e6caffee 100644 --- a/site/a-virtual-machine.html +++ b/site/a-virtual-machine.html @@ -1,5 +1,5 @@ - +A Virtual Machine · Crafting Interpreters @@ -112,9 +112,9 @@—literally a Chunk—and it runs it. The code and data structures for the VM reside in a new module. -
vm.h
+vm.h-
create new file#ifndef clox_vm_h +#ifndef clox_vm_h #define clox_vm_h #include "chunk.h" @@ -135,9 +135,9 @@
vm.c
+vm.c-
create new file#include "common.h" +#include "common.h" #include "vm.h" VM vm; @@ -172,7 +172,7 @@
int main(int argc, const char* argv[]) { +int main(int argc, const char* argv[]) {main.c
in main()initVM(); @@ -183,7 +183,7 @@
disassembleChunk(&chunk, "test chunk"); +disassembleChunk(&chunk, "test chunk");main.c
in main()freeVM(); @@ -192,7 +192,7 @@
One last ceremonial obligation:
-#include "debug.h" +#include "debug.h"main.c#include "vm.h"@@ -206,7 +206,7 @@15 . 1 . 1Executing instructions
The VM springs into action when we command it to interpret a chunk of bytecode.
-disassembleChunk(&chunk, "test chunk"); +disassembleChunk(&chunk, "test chunk");main.c
in main()interpret(&chunk); @@ -215,7 +215,7 @@
This function is the main entry point into the VM. It’s declared like so:
-void freeVM(); +void freeVM();vm.h
add after freeVM()InterpretResult interpret(Chunk* chunk); @@ -226,7 +226,7 @@
The VM runs the chunk and then responds with a value from this enum:
-} VM; +} VM;vm.h@@ -245,9 +245,9 @@
add after struct VM15̴ errors and a VM that detects runtime errors, the interpreter will use this to know how to set the exit code of the process.
We’re inching towards some actual implementation.
-vm.c
+vm.c-
add after freeVM()InterpretResult interpret(Chunk* chunk) { +InterpretResult interpret(Chunk* chunk) { vm.chunk = chunk; vm.ip = vm.chunk->code; return run(); @@ -266,7 +266,7 @@15̴ interpreter, we would store
ip in a local variable. It gets modified so often during execution that we want the C compiler to keep it in a register. -typedef struct { +typedef struct { Chunk* chunk;vm.h@@ -291,9 +291,9 @@
in struct VM to be executed. This will be true during the entire time the VM is running: the IP always points to the next instruction, not the one currently being handled.
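To make the "IP points to the next instruction" rule concrete, here is a minimal sketch that stands apart from the VM itself. The opcode names and `sketchRun()` are hypothetical, invented only for this illustration; the one thing it shares with the chapter is the `*ip++` idiom, which reads the current byte and advances in a single step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch only, not the book's OpCode enum. */
enum { SKETCH_OP_NOP = 0, SKETCH_OP_HALT = 1 };

/* Walks the bytes until SKETCH_OP_HALT and returns how many were consumed. */
static size_t sketchRun(const uint8_t* code) {
  assert(code != NULL);
  const uint8_t* ip = code;      /* always points at the next byte to read */
  for (;;) {
    uint8_t instruction = *ip++; /* read the opcode, then step past it */
    /* While this instruction is handled, ip already points at the next one. */
    if (instruction == SKETCH_OP_HALT) break;
  }
  return (size_t)(ip - code);
}
```

Because the pointer has moved before the instruction is handled, any operand bytes can be fetched with the same read-and-advance step.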
The real fun happens in
-run().vm.c
+vm.c-
add after freeVM()static InterpretResult run() { +static InterpretResult run() { #define READ_BYTE() (*vm.ip++) for (;;) { @@ -350,7 +350,7 @@15̴ return from the current Lox function, but we don’t have functions yet, so we’ll repurpose it temporarily to end the execution.
Let’s go ahead and support our one other instruction.
-switch (instruction = READ_BYTE()) { +switch (instruction = READ_BYTE()) {vm.c
in run()case OP_CONSTANT: { @@ -366,7 +366,7 @@15̴
We don’t have enough machinery in place yet to do anything useful with a constant. For now, we’ll just print it out so we interpreter hackers can see what’s going on inside our VM. That call to
-printf() necessitates an include. vm.c
+vm.c
add to top of file#include <stdio.h> @@ -375,7 +375,7 @@
We also have a new macro to define.
-#define READ_BYTE() (*vm.ip++) +#define READ_BYTE() (*vm.ip++)vm.c
in run()#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()]) @@ -393,7 +393,7 @@15̴
run(). To make that scoping more explicit, the macro definitions themselves are confined to that function. We define them at the beginning and—because we care—undefine them at the end. -#undef READ_BYTE +#undef READ_BYTEvm.c
in run()#undef READ_CONSTANT @@ -416,7 +416,7 @@15 . VM like we did with chunks themselves. In fact, we’ll even reuse the same code. We don’t want this logging enabled all the time—it’s just for us VM hackers, not Lox users—so first we create a flag to hide it behind. -
#include <stdint.h> +#include <stdint.h>common.h@@ -430,7 +430,7 @@15 .
When this flag is defined, the VM disassembles and prints each instruction right before executing it. Where our previous disassembler walked an entire chunk once, statically, this disassembles instructions dynamically, on the fly.
-for (;;) { +for (;;) {vm.c
in run()#ifdef DEBUG_TRACE_EXECUTION @@ -448,7 +448,7 @@15 . bytecode. Then we disassemble the instruction that begins at that byte.
As ever, we need to bring in the declaration of the function before we can call it.
-#include "common.h" +#include "common.h"vm.c#include "debug.h"#include "vm.h" @@ -465,7 +465,7 @@1
In addition to imperative side effects, Lox has expressions that produce, modify, and consume values. Thus, our compiled bytecode needs a way to shuttle values around between the different instructions that need them. For example:
-print 3 - 2; +print 3 - 2;We obviously need instructions for the constants 3 and 2, the
1 and “subtrahend” might be some sort of underground Paleolithic monument.
To put a finer point on it, look at this thing right here:
-fun echo(n) { +fun echo(n) { print n; return n; } @@ -508,7 +508,7 @@1 statement, with numbers marking the order that the nodes are evaluated." />
Given left-to-right evaluation, and the way the expressions are nested, any correct Lox implementation must print these numbers in this order:
-1 // from echo(1) +1 // from echo(1) 2 // from echo(2) 3 // from echo(1 + 2) 4 // from echo(4) @@ -593,7 +593,7 @@15 . 2 generate much faster native code on the fly.
Alrighty, it’s codin’ time! Here’s the stack:
-typedef struct { +typedef struct { Chunk* chunk; uint8_t* ip;vm.h
@@ -637,7 +637,7 @@15 . 2
I remember it like this:
-stackTop points to where the next value to be pushed will go. The maximum number of values we can store on the stack (for now, at least) is: #include "chunk.h" +#include "chunk.h"vm.h@@ -652,7 +652,7 @@15 . 2 instructions to push too many values and run out of stack space, the classic “stack overflow”. We could grow the stack dynamically as needed, but for now we’ll keep it simple. Since VM uses Value, we need to include its declaration. -
#include "chunk.h" +#include "chunk.h"vm.h#include "value.h"@@ -662,7 +662,7 @@15 . 2
Now that VM has some interesting state, we get to initialize it.
-void initVM() { +void initVM() {vm.c
in initVM()resetStack(); @@ -671,9 +671,9 @@15 . 2
That uses this helper function:
-vm.c
+@@ -685,7 +685,7 @@vm.c-
add after variable vmstatic void resetStack() { +static void resetStack() { vm.stackTop = vm.stack; }15 . 2 them. The only initialization we need is to set
stackTop to point to the beginning of the array to indicate that the stack is empty. The stack protocol supports two operations:
-InterpretResult interpret(Chunk* chunk); +InterpretResult interpret(Chunk* chunk);vm.h
add after interpret()void push(Value value); @@ -698,9 +698,9 @@15 . 2
You can push a new value onto the top of the stack, and you can pop the most recently pushed value back off. Here’s the first function:
-vm.c
+vm.c-
add after freeVM()void push(Value value) { +void push(Value value) { *vm.stackTop = value; vm.stackTop++; } @@ -714,9 +714,9 @@15 . 2 itself to point to the next unused slot in the array now that the previous slot is occupied.
Popping is the mirror image.
-vm.c
+vm.c-
add after push()Value pop() { +Value pop() { vm.stackTop--; return *vm.stackTop; } @@ -734,7 +734,7 @@15 . 2 make our lives as VM hackers easier if we had some visibility into the stack.
To that end, whenever we’re tracing execution, we’ll also show the current contents of the stack before we interpret each instruction.
-#ifdef DEBUG_TRACE_EXECUTION +#ifdef DEBUG_TRACE_EXECUTIONvm.c
in run()printf(" "); @@ -753,7 +753,7 @@15 . 2 instruction on the stack. The output is pretty verbose, but it’s useful when we’re surgically extracting a nasty bug from the bowels of the interpreter.
Stack in hand, let’s revisit our two instructions. First up:
-case OP_CONSTANT: { +case OP_CONSTANT: { Value constant = READ_CONSTANT();vm.c
in run()
@@ -766,7 +766,7 @@15 . 2
In the last chapter, I was hand-wavy about how the
-OP_CONSTANT instruction “loads” a constant. Now that we have a stack, you know what it means to actually produce a value: it gets pushed onto the stack. case OP_RETURN: { +case OP_RETURN: {vm.c
in run()printValue(pop()); @@ -786,13 +786,13 @@15& with only the two rudimentary instructions we have so far. So let’s teach our interpreter to do arithmetic.
We’ll start with the simplest arithmetic operation, unary negation.
-var a = 1.2; +var a = 1.2; print -a; // -1.2.The prefix
--operator takes one operand, the value to negate. It produces a single result. We aren’t fussing with a parser yet, but we can add the bytecode instruction that the above syntax will compile to.OP_CONSTANT, +OP_CONSTANT,chunk.h
in enum OpCodeOP_NEGATE, @@ -801,7 +801,7 @@
We execute it like so:
-} +}vm.c
in run()case OP_NEGATE: push(-pop()); break; @@ -812,7 +812,7 @@15&
The instruction needs a value to operate on, which it gets by popping from the stack. It negates that, then pushes the result back on for later instructions to use. Doesn’t get much easier than that. We can disassemble it too.
-case OP_CONSTANT: +case OP_CONSTANT: return constantInstruction("OP_CONSTANT", chunk, offset);debug.c@@ -823,7 +823,7 @@
in disassembleInstruction()
And we can try it out in our test chunk.
-writeChunk(&chunk, constant, 123); +writeChunk(&chunk, constant, 123);main.c
in main()writeChunk(&chunk, OP_NEGATE, 123); @@ -836,7 +836,7 @@15&
After loading the constant, but before returning, we execute the negate instruction. That replaces the constant on the stack with its negation. Then the return instruction prints that out:
--1.2 +-1.2 Magical!
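The constant, negate, return sequence can be traced by hand with a tiny standalone stack. This is a sketch under assumptions, not the VM's code: `SketchValue` is assumed to be a `double`, the capacity of 256 is picked arbitrarily, and every `sketch`-prefixed name is hypothetical. The `assert()` bounds checks are an addition for the sketch; the `push()` and `pop()` shown in the diff do no checking.

```c
#include <assert.h>

#define SKETCH_STACK_MAX 256 /* assumed capacity, for illustration only */

typedef double SketchValue;  /* assumption: values are plain doubles */

typedef struct {
  SketchValue slots[SKETCH_STACK_MAX];
  SketchValue* top; /* points just past the most recently pushed value */
} SketchStack;

static void sketchReset(SketchStack* s) { s->top = s->slots; }

static void sketchPush(SketchStack* s, SketchValue v) {
  assert(s->top < s->slots + SKETCH_STACK_MAX); /* guard against overflow */
  *s->top = v;
  s->top++;
}

static SketchValue sketchPop(SketchStack* s) {
  assert(s->top > s->slots); /* guard against popping an empty stack */
  s->top--;
  return *s->top;
}

/* Mirrors the test chunk: load a constant, negate it, hand back the result. */
static SketchValue sketchNegateConstant(SketchValue constant) {
  SketchStack stack;
  sketchReset(&stack);
  sketchPush(&stack, constant);           /* what OP_CONSTANT does */
  sketchPush(&stack, -sketchPop(&stack)); /* what OP_NEGATE does   */
  return sketchPop(&stack);               /* OP_RETURN pops the value */
}
```

Passing the stack as a parameter, rather than using one global VM as the chapter does, is just a convenience that keeps the sketch easy to test in isolation.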
15 . 3 . 1 Binary operators
@@ -849,7 +849,7 @@15 . 3&
Lox has some other binary operators—comparison and equality—but those don’t produce numbers as a result, so we aren’t ready for them yet.
-OP_CONSTANT, +OP_CONSTANT,chunk.h
in enum OpCodeOP_ADD, @@ -861,7 +861,7 @@15 . 3&
chunk.h, in enum OpCodeBack in the bytecode loop, they are executed like this:
-} +}vm.c
in run()case OP_ADD: BINARY_OP(+); break; @@ -877,7 +877,7 @@15 . 3& arithmetic expression is some boilerplate code to pull values off the stack and push the result. When we later add dynamic typing, that boilerplate will grow. To avoid repeating that code four times, I wrapped it up in a macro. -
#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()]) +#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()])vm.c
in run()#define BINARY_OP(op) \ @@ -906,21 +906,21 @@15 . 3& probably looks really weird. This macro needs to expand to a series of statements. To be careful macro authors, we want to ensure those statements all end up in the same scope when the macro is expanded. Imagine if you defined: -
#define WAKE_UP() makeCoffee(); drinkCoffee(); +#define WAKE_UP() makeCoffee(); drinkCoffee();And then used it like:
-if (morning) WAKE_UP(); +if (morning) WAKE_UP();The intent is to execute both statements of the macro body only if
-morning is true. But it expands to: if (morning) makeCoffee(); drinkCoffee();; +if (morning) makeCoffee(); drinkCoffee();; Oops. The
-if attaches only to the first statement. You might think you could fix this using a block. #define WAKE_UP() { makeCoffee(); drinkCoffee(); } +#define WAKE_UP() { makeCoffee(); drinkCoffee(); } That’s better, but you still risk:
-if (morning) +