DOES> overwrites that address so that executing the word, instead of doing the default thing, now runs some different code, namely the code that you supply after the DOES>.
This is something of a kludge because the usual implementation stores something (the default semantics) in that cell when you run CREATE, then later overwrites it when you run DOES>. Since lots of Forth targets today are microcontrollers whose code storage is in flash memory, overwriting individual already-written cells in code space is not nice.
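Concretely (a sketch of the usual RAM-based layout; details vary):

\ CREATE FOO    ->  FOO: [ code field: default "push data address" ] [ data ... ]
\ DOES> later   ->  FOO: [ code field: overwritten to run the DOES> code ] [ data ... ]
\
\ Two writes to the same cell: harmless in RAM, painful in flash, which
\ typically cannot rewrite an already-programmed cell without a page erase.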
Early Forths had <BUILDS ... DOES> instead of CREATE ... DOES>. You can see how the angle brackets originally looked symmetrical, but after things changed, the bracket only appeared on DOES>, and that may be part of why people find it confusing.
<BUILDS didn't install any default action into the newly created word. It left it uninitialized, to be filled in when DOES> came along. CREATE ... DOES> was sort of an optimization, since CREATE already existed anyway, making <BUILDS unnecessary. So they got rid of <BUILDS during standardization, back in the minicomputer era when this stuff always generated code in RAM (or maybe magnetic core) rather than flash. That optimization has in turn bitten some implementers in the butt, so <BUILDS DOES> has come back into usage in some MCU implementations, like FlashForth. FlashForth is pretty nice, by the way.
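For reference, a defining word in a <BUILDS-style system reads the same apart from the opener (a sketch; `con` is an arbitrary name, chosen to avoid the built-in CONSTANT):

: con ( x "name" -- )
  <builds ,    \ <builds lays down the header but leaves the action cell unwritten
  does> @ ;    \ does> fills it in exactly once -- no overwrite of written flash
42 con answer
answer .       \ prints 42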
Well I didn't mean to type that much, but I hope it helps.
Imagine typing

CREATE FOO 42 , DOES> @ ;
into an interpreter to create the constant. Then if placed inside a definition, there would be two semicolons: : CONSTANT CREATE , DOES> @ ; ;
It's an extra nesting which makes it clear you have a definition that makes a definition. You could even put words between the semicolons, which would just become part of the definition of CONSTANT.

It feels as if this DOES> thing is a kludge that activates within definitions and kind of "hijacks" the rest of their instructions. Without DOES>, the material after it would be part of the definition of CONSTANT and not part of the definition of the word produced by CREATE. The switcheroo feels hacky.
It’s nice to see deep dives into Forth internals hitting the front page. Great article.
I hate DOES>. I was implementing it well after 1am last night and I hate it. I have this feeling that as something gets harder to implement, it means it's not right; but I know DOES> is right, so it's me, I just couldn't implement it well. It was super frustrating. But now I feel better :)
I am new to Forth, but it feels like `create does>` has to be replaced with some new construct. I just want word code to operate on its data. I need to gain more experience to find out; for now `create does>` will do.
https://github.com/dan4thewin/FreeForth2/blob/master/ff.asm
which uses a double loop to look up first macro words, and then immediate words
https://github.com/ablevm/able-forth/blob/current/forth.scr
Ableforth implements a defer/expand operation \ to effectively quote words. The basic loop is then simple: parse a text word, look it up, and execute it.
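That basic loop, sketched in standard-ish Forth (not Ableforth's actual code; number conversion is elided behind a comment):

: interpret-loop ( -- )
  begin
    bl word dup c@                 \ parse the next blank-delimited word
  while
    find if execute                \ found: run it
    else ( try number... ) drop then
  repeat
  drop ;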
Both make use of macros (code generators) to implement deferred behaviour, as well as code inlining. Ultimately all these operations implement defer by manipulating the execution flow, something that algebraic effects also do.
I have a feeling that algebraic effects can be used in a Forth to implement DOES>.
: index-array ( i a -- addr ) swap cells + ;
create a1 10 cells allot
create a2 20 cells allot
: array1 ( i -- addr ) a1 index-array ;
: array2 ( i -- addr ) a2 index-array ;
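For comparison, the same arrays factored through CREATE ... DOES> might look like this (a sketch; `array` is a hypothetical defining word):

: array ( n "name" -- ) create cells allot does> ( i a -- addr ) swap cells + ;
10 array array1
20 array array2
7 array1 @ .   \ fetch cell 7 of the first array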
I used to use the "implementation-dependent" trick of popping the return address (in e.g. index-array) to get the data. Less verbose, a bit more efficient. But my implementation doesn't permit it anymore.

Recently I've found that implementing a "self/this" pseudo-value and pseudo-method calls is much more useful. The relation between this and "create does>" is that the latter can be seen as a poor man's closure, or a poor man's object [1].
[1] https://stackoverflow.com/questions/2497801/closures-are-poo...
: foo 1+ does> . ;
42 foo bar
bar \ prints 43
If you want, you can add the "traditional" indirection in the initialization part, for a similar effect.
So, not quite the same, but almost, and I think it echoes the intuition you have, which is also mine.
With Forth's stack-based nature, how is it possible to ever have performant data structures? If I only ever do things by pushing and popping from the stack, then I would think that data structures are inherently limited to linear access times. But there are libraries implementing arrays and so on. I don't understand how this is possible in a performant way, or how those structures are made so that they can be accessed efficiently. Or perhaps Forth is really seriously lacking performant data structures? But that seems crazy unlikely.
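(In practice the stack only shuttles addresses around; the data itself lives in ordinary memory, so access is constant-time address arithmetic. A minimal sketch, with hypothetical names:)

create buf 100 cells allot          \ the data lives in ordinary memory
: buf! ( x i -- ) cells buf + ! ;   \ store x at index i: multiply, add, store
: buf@ ( i -- x ) cells buf + @ ;   \ constant-time fetch by index
42 7 buf!
7 buf@ .                            \ prints 42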
I still don't know what DOES> really does... ;-)
Typical usage is for the “code that will run immediately” to store some data, and for the “code that gets compiled to be run later” to use that data.

Perhaps the simplest example is CONSTANT, which can be defined like this:
: CONSTANT ( w "name" -- )
  CREATE ,
  DOES> ( -- w )
  @ ;
Here, the “code that will run immediately” is CREATE ,
which (a) reads a name from the command line and creates a word with that name, and then (b) takes the top of the stack and stores it directly after the word’s definition.

The “code that gets compiled to be run later” is
@
which fetches the formerly stored value (taking the address of the formerly created word from the stack).

DOES> has to do some shenanigans to make that work, but that’s an implementation detail, and will be dependent on the particular FORTH being used.
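For concreteness, using it (ANSWER being an arbitrary name):

42 CONSTANT ANSWER
ANSWER .   \ prints 42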
: NAME alpha beta ... psi omega ;
what happens is that at compile time, a dictionary entry NAME is created, and then the alpha ... omega words are compiled to be run later.

When DOES> is introduced:
: NAME alpha beta ... DOES> ... psi omega ;
all of the above still holds. We still have a dictionary entry NAME, which denotes all of the words up to the semicolon, including DOES>.

Then, when we execute NAME in a compilation context, because the word sequence contains DOES>, everything to the left of DOES> is specially treated: it is executed immediately in the compilation context and is removed. But that's not all; DOES> doesn't just execute everything to the left and disappear; it leaves something behind: some word which is then combined with the material to the right of DOES> to form the run-time sequence.
In your example, when we run CONSTANT, the part to the left of DOES> fetches a name from the input stream, and creates a word, and then makes the value on the stack the definition.
the accumulation of to-be-run-later words is interrupted, and everything before DOES> is done now, at definition time, and removed from the definition.
The CREATE material, when executed, leaves behind a reference to the word denoting the constant. Then DOES> creates a definition for that word, using the remaining material.
Is that more or less it?
Typically (likely always, as sharing code is the main reason DOES> was invented), it compiles that code once and magically makes “that word” ‘jump’ to that code. That way, when, for example, you use the definition with DOES> multiple times, you only compile “the remaining material” once.
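For example (FOO and BAR being arbitrary names):

42 CONSTANT FOO
17 CONSTANT BAR
\ FOO and BAR each get their own data cell, but both jump to the
\ single compiled copy of the  @  code inside CONSTANT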
> Then, when we execute NAME in a compilation context, because the word sequence contains DOES>, everything to the left of DOES> is specially treated
That would be too magical. If you want a word to be executed when compiling code, you make it IMMEDIATE. For the CONSTANT example I gave, that’s not done, as it is executed in interpretation context, and then creates a word, compiles the number from the top of the stack, and then hooks up the word just created to the code compiled earlier.
But the code after DOES> is repeatedly referenced in new definitions that are the result of executing the word which contains DOES>, like the CONSTANT example.
The original : CONSTANT CREATE , DOES> @ ; could be entirely compiled so that it contains a compiled sequence for the @ part. When DOES> is executed, it patches a pointer to that part into the word produced by CREATE, and then somehow skips the execution of that part.
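Sketched for an indirect-threaded Forth (one plausible layout; details differ per implementation):

\ : CONSTANT CREATE , DOES> @ ;   compiles to roughly:
\
\   CONSTANT:  CREATE  ,  (patch&exit)  @  EXIT
\                             |         ^
\                             +---------+  stores this address into the
\                                          code field of the CREATEd word
\
\ (patch&exit) also returns from CONSTANT, so the trailing  @ EXIT  is
\ skipped at definition time and only runs via the newly created word.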
How do you manage the storage? There has to be some refcounting or garbage collection. What if four words point to the same instruction sequence, and we FORGET three of them?
Ah, but FORGET works in a LIFO discipline; you can't just forget arbitrary entries. If B was defined using parts of A, then B is newer. You cannot forget A without forgetting B first. I think.
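For example:

create A 1 cells allot
: B A @ ;     \ B is newer than A and uses it
forget A      \ trims the dictionary back past A, so B disappears too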
: CREATE WORD CREATEHEAD DODOES , 0 , ;   \ lay down the DODOES codeword plus a placeholder cell for the does-pointer
: DOES> IMMEDIATE   \ compiles, into the defining word: LIT <addr past EXIT> LATEST @ >DFA ! EXIT
  ['] LIT , HERE @ 6 CELLS + , ['] LATEST , ['] @ , ['] >DFA , ['] ! , ['] EXIT , ;
Reading colorForth code, and especially the commentary (https://github.com/Howerd/colorForth), it seems that it refines the concept of staging into colours (does> might correspond to cyan?).

Hopefully someone more knowledgeable will chime in here!
Surprised so few public Forths implement it.
: COUNTER ( n "name" -- )
  CREATE ,                   \ store the initial count
  DOES> DUP 1 SWAP +! @ ;    \ increment the stored count, then fetch it
0 COUNTER PK
PK . \ => 1
PK . \ => 2
A semi-equivalent in JavaScript is:

const counter = init => {
  let x = init;
  return () => { x += 1; return x; };
};
const pk = counter(0);
console.log(pk()); // => 1
console.log(pk()); // => 2
What’s funny is that I used to know how it works; now any time I come across these kinds of articles I get more and more confused and further away from understanding. It’s like reading those convoluted explanations of what a monad is.
It does this by doing something now, and something later (you could read create does> as now later>).
So for
: CONSTANT CREATE , DOES> @ ;
This makes the defining word CONSTANT, which, when run (now), creates a word named by the next token in the input.

So 42 CONSTANT myvar will build the word myvar with 42 stored in it. myvar, when run (later), will get its value and push it to the data stack.
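Reading that with the now/later markers:

42 CONSTANT myvar   \ now: CREATE builds myvar, then , stores 42 after it
myvar .             \ later: the DOES> code runs @ on that cell and prints 42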