#LyX 1.3 created this file. For more info see http://www.lyx.org/ \lyxformat 221 \textclass article \language english \inputencoding auto \fontscheme default \graphics default \paperfontsize default \spacing single \papersize Default \paperpackage a4 \use_geometry 0 \use_amsmath 0 \use_natbib 0 \use_numerical_citations 0 \paperorientation portrait \secnumdepth 3 \tocdepth 2 \paragraph_separation skip \defskip medskip \quotes_language english \quotes_times 2 \papercolumns 1 \papersides 1 \paperpagestyle headings \layout Title Factor Developer's Guide \layout Author Slava Pestov \layout Standard \begin_inset LatexCommand \tableofcontents{} \end_inset \layout Section* \pagebreak_top Introduction \layout Standard Factor is an imperative programming language with functional and object-oriented influences. Its primary goal is to be used for web-based server-side applications. Factor is interpreted by a virtual machine that provides garbage collection and prohibits pointer arithmetic. \begin_inset Foot collapsed false \layout Standard Two releases of Factor are available -- a virtual machine written in C, and an interpreter written in Java that runs on the Java virtual machine. This guide targets the C version of Factor. \end_inset \layout Standard Factor borrows heavily from Forth, Joy and Lisp. From Forth it inherits a flexible syntax defined in terms of \begin_inset Quotes eld \end_inset parsing words \begin_inset Quotes erd \end_inset and an execution model based on a data stack and call stack. From Joy and Lisp it inherits a virtual machine prohibiting direct pointer arithmetic, and the use of \begin_inset Quotes eld \end_inset cons cells \begin_inset Quotes erd \end_inset to represent code and data structure. \layout Section Fundamentals \layout Standard A "word" is the main unit of program organization in Factor -- it corresponds to a "function", "procedure" or "method" in other languages. 
\layout Standard When code examples are given, the input is in a roman font, and any output from the interpreter is in italics: \layout LyX-Code \begin_inset Quotes eld \end_inset Hello, world! \begin_inset Quotes erd \end_inset print \layout LyX-Code \emph on Hello, world! \layout Subsection The stack \layout Standard The stack is used to exchange data between words. When a number is executed, it is pushed on the stack. When a word is executed, it receives input parameters by removing successive elements from the top of the stack. Results are then pushed back to the top of the stack. \layout Standard The word \family typewriter .s \family default prints the contents of the stack, leaving the contents of the stack unaffected. The top of the stack is the rightmost element in the printout: \layout LyX-Code 2 3 .s \layout LyX-Code \emph on { 2 3 } \layout Standard The word \family typewriter . \family default removes the object at the top of the stack, and prints it: \layout LyX-Code 1 2 3 . . . \layout LyX-Code \emph on 3 \layout LyX-Code \emph on 2 \layout LyX-Code \emph on 1 \layout Standard The usual arithmetic operators \family typewriter + - * / \family default all take two parameters from the stack, and push one result back. Where the order of operands matters ( \family typewriter - \family default and \family typewriter / \family default ), the operands are taken from the stack in the natural order. For example: \layout LyX-Code 10 17 + . \layout LyX-Code \emph on 27 \layout LyX-Code 111 234 - . \layout LyX-Code \emph on -123 \layout LyX-Code 333 3 / . \layout LyX-Code \emph on 111 \layout Standard This type of arithmetic is called \emph on postfix \emph default , because the operator follows the operands. Contrast this with \emph on infix \emph default notation used in many other languages, so-called because the operator is in-between the two operands. 
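\layout Standard As a rough illustration of how a postfix expression is evaluated step by step, the stack can be printed with \family typewriter .s \family default after each operation. This sketch assumes the stack is empty to begin with, and uses only words introduced above:
\layout LyX-Code 10 17 .s
\layout LyX-Code \emph on { 10 17 }
\layout LyX-Code + .s
\layout LyX-Code \emph on { 27 }
\layout LyX-Code 3 * .s
\layout LyX-Code \emph on { 81 }
\layout Standard Each operator replaces its two operands with a single result, so the result of one operation is immediately available as an operand of the next.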
\layout Standard More complicated infix expressions can be translated into postfix by translating the inner-most parts first. Grouping parentheses are never necessary: \layout LyX-Code ! Postfix equivalent of (2 + 3) * 6 \layout LyX-Code 2 3 + 6 * . \layout LyX-Code \emph on 30 \layout LyX-Code ! Postfix equivalent of 2 + (3 * 6) \layout LyX-Code 2 3 6 * + . \layout LyX-Code \emph on 20 \layout Subsection Factoring \layout Standard New words can be defined in terms of existing words using the \emph on colon definition \emph default syntax: \layout LyX-Code : \emph on name \emph default ( \emph on inputs \emph default -- \emph on outputs \emph default ) \layout LyX-Code #! \emph on Description \layout LyX-Code \emph on factors ... \emph default ; \layout Standard When the new word is executed, each one of its factors gets executed, in turn. The comment delimited by \family typewriter ( \family default and \family typewriter ) \family default is called a stack effect comment and is described later. Both the stack effect comment and the documentation comment starting with \family typewriter #! \family default are optional, and can be placed anywhere in the source code, not just in colon definitions. \layout Standard Note that in a source file, a word definition can span multiple lines. However, the interactive interpreter expects each line of input to be \begin_inset Quotes eld \end_inset complete \begin_inset Quotes erd \end_inset , so interactively, colon definitions must be entered all on one line. \layout Standard For example, let's assume we are designing some software for an aircraft navigation system. Let's assume that internally, all lengths are stored in meters, and all times are stored in seconds. We can define words for converting from kilometers to meters, and hours and minutes to seconds: \layout LyX-Code : kilometers 1000 * ; \layout LyX-Code : minutes 60 * ; \layout LyX-Code : hours 60 * 60 * ; \layout LyX-Code 2 kilometers . 
\layout LyX-Code \emph on 2000 \layout LyX-Code 10 minutes . \layout LyX-Code \emph on 600 \layout LyX-Code 2 hours . \layout LyX-Code \emph on 7200 \layout Standard Now, suppose we need a word that takes the flight time, the aircraft velocity, and the tailwind velocity, and returns the distance travelled. If the parameters are given on the stack in that order, all we do is add the top two elements (aircraft velocity, tailwind velocity) and multiply the sum by the element underneath (flight time). So the definition looks like this, this time with a stack effect comment since it's slightly less obvious what the operands are: \layout LyX-Code : distance ( time aircraft tailwind -- distance ) + * ; \layout LyX-Code 2 900 36 distance . \layout LyX-Code \emph on 1872 \layout Standard Note that we are not using any units here. We could, if we defined some words for velocity units first. The only non-trivial thing here is the implementation of \family typewriter km/hour \family default -- we convert kilometers to meters, and then divide by the number of seconds in one hour, so that the velocity is stored in meters per second: \layout LyX-Code : km/hour kilometers 1 hours / ; \layout LyX-Code 2 hours 900 km/hour 36 km/hour distance . \layout LyX-Code \emph on 1872000 \layout Subsection Stack effects \layout Standard A stack effect comment contains a description of inputs to the left of \family typewriter -- \family default , and a description of outputs to the right. As always, the top of the stack is on the right side. Let's try writing a word to compute the cube of a number. \begin_inset Foot collapsed false \layout Standard I'd use the somewhat simpler example of a word that squares a number, but such a word already exists in the standard library. It's in the \family typewriter arithmetic \family default vocabulary, named \family typewriter sq \family default . 
\end_inset \layout Standard Three numbers on the stack can be multiplied together using \family typewriter * * \family default : \layout LyX-Code 2 4 8 * * . \layout LyX-Code \emph on 64 \layout Standard However, the stack effect of \family typewriter * * \family default is \family typewriter ( a b c -- a*b*c ) \family default . We would like to write a word that takes \emph on one \emph default input only. To achieve this, we need to be able to duplicate the top stack element twice. As it happens, there is a word \family typewriter dup ( x -- x x ) \family default for precisely this purpose. Now, we are able to define the \family typewriter cube \family default word: \layout LyX-Code : cube dup dup * * ; \layout LyX-Code 10 cube . \layout LyX-Code \emph on 1000 \layout LyX-Code -2 cube . \layout LyX-Code \emph on -8 \layout Standard It is quite often the case that we want to compose two factors in a colon definition, but their stack effects don't \begin_inset Quotes eld \end_inset match up \begin_inset Quotes erd \end_inset . \layout Standard There is a set of \emph on shuffle words \emph default for solving precisely this problem. These words are so-called because they simply rearrange stack elements in some fashion, without modifying them in any way. Let's take a look at the most frequently-used shuffle words: \layout Standard \family typewriter drop ( x -- ) \family default Discard the top stack element. Used when a return value is not needed. \layout Standard \family typewriter dup ( x -- x x ) \family default Duplicate the top stack element. Used when a value is needed more than once. \layout Standard \family typewriter swap ( x y -- y x ) \family default Swap top two stack elements. Used when a word expects parameters in a different order. \layout Standard \family typewriter rot ( x y z -- y z x ) \family default Rotate top three stack elements to the left. 
\layout Standard \family typewriter -rot ( x y z -- z x y ) \family default Rotate top three stack elements to the right. \layout Standard \family typewriter over ( x y -- x y x ) \family default Bring the second stack element \begin_inset Quotes eld \end_inset over \begin_inset Quotes erd \end_inset the top element. \layout Standard \family typewriter nip ( x y -- y ) \family default Remove the second stack element. \layout Standard \family typewriter tuck ( x y -- y x y ) \family default Tuck the top stack element under the second stack element. \layout Standard You can try all these words out -- push some numbers on the stack, execute a word, and look at how the stack contents changed using \family typewriter .s \family default . Compare the stack contents with the stack effects above. \layout Standard Note the order of the shuffle word descriptions above. The ones at the top are used most often because they are easy to understand. The more complex ones such as rot should be avoided where possible, because they make the flow of data in a word definition harder to understand. \layout Standard If you find yourself using too many shuffle words, or you're writing a stack effect comment in the middle of a colon definition, it is a good sign that the word should probably be factored into two or more words. Effective factoring is like riding a bicycle -- it is hard at first, but then you \begin_inset Quotes eld \end_inset get it \begin_inset Quotes erd \end_inset , and writing small, clear and reusable word definitions becomes second-nature. \layout Subsection Combinators \layout Standard A quotation is a list of objects that can be executed. Words that operate on quotations are called \emph on combinators \emph default . Quotations are input using the following syntax: \layout LyX-Code [ 2 3 + . ] \layout Standard When input, a quotation is not executed immediately -- rather, it becomes one object on the stack. 
Try evaluating the following: \layout LyX-Code [ 1 2 3 + * ] .s \layout LyX-Code \emph on { [ 1 2 3 + * ] } \layout LyX-Code call .s \layout LyX-Code \emph on { 5 } \layout Standard \family typewriter call \family default \family typewriter ( quot -- ) \family default executes the quotation at the top of the stack. Using \family typewriter call \family default with a literal quotation is useless; writing out the elements of the quotation has the same effect. However, the \family typewriter call \family default combinator is a building block of more powerful combinators, since quotations can be passed around arbitrarily and even modified before being called. \layout Standard \family typewriter ifte \family default \family typewriter ( cond true false -- ) \family default executes either the \family typewriter true \family default or \family typewriter false \family default quotations, depending on the boolean value of \family typewriter cond \family default . In Factor, there is no real boolean data type -- instead, a special object \family typewriter f \family default is the only object with a \begin_inset Quotes eld \end_inset false \begin_inset Quotes erd \end_inset boolean value. Every other object is a boolean \begin_inset Quotes eld \end_inset true \begin_inset Quotes erd \end_inset . The special object \family typewriter t \family default is the \begin_inset Quotes eld \end_inset canonical \begin_inset Quotes erd \end_inset truth value. \layout Standard Here is an example of \family typewriter ifte \family default usage: \layout LyX-Code 1 2 < [ \begin_inset Quotes eld \end_inset 1 is less than 2. \begin_inset Quotes erd \end_inset print ] [ \begin_inset Quotes eld \end_inset bug! \begin_inset Quotes erd \end_inset print ] ifte \layout Standard Compare the order of operands here, and the order of arguments in the stack effect of \family typewriter ifte \family default . 
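\layout Standard The truth rule described above can be checked directly at the interpreter. The following is only a small sketch built from words already shown; it assumes that an integer such as 0 is an ordinary object and therefore counts as true:
\layout LyX-Code f [ \begin_inset Quotes eld \end_inset treated as true \begin_inset Quotes erd \end_inset print ] [ \begin_inset Quotes eld \end_inset treated as false \begin_inset Quotes erd \end_inset print ] ifte
\layout LyX-Code \emph on treated as false
\layout LyX-Code t [ \begin_inset Quotes eld \end_inset treated as true \begin_inset Quotes erd \end_inset print ] [ \begin_inset Quotes eld \end_inset treated as false \begin_inset Quotes erd \end_inset print ] ifte
\layout LyX-Code \emph on treated as true
\layout LyX-Code 0 [ \begin_inset Quotes eld \end_inset treated as true \begin_inset Quotes erd \end_inset print ] [ \begin_inset Quotes eld \end_inset treated as false \begin_inset Quotes erd \end_inset print ] ifte
\layout LyX-Code \emph on treated as true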
\layout Standard Note that the stack effects of the two \family typewriter ifte \family default branches should be the same. If they differ, the word becomes harder to document and debug. \layout Standard \family typewriter times ( num quot -- ) \family default executes a quotation a number of times. It is good style to have the quotation always consume as many values from the stack as it produces. This ensures the stack effect of the entire \family typewriter times \family default expression stays constant regardless of the number of iterations. \layout Standard More combinators will be introduced later. \layout Subsection Vocabularies \layout Standard The dictionary of words is not a flat list -- rather, it is separated into a number of \emph on vocabularies \emph default . Each vocabulary is a named list of words that have something in common -- for example, the \begin_inset Quotes eld \end_inset lists \begin_inset Quotes erd \end_inset vocabulary contains words for working with linked lists. \layout Standard When a word is read by the parser, the \emph on vocabulary search path \emph default determines which vocabularies to search. In the interactive interpreter, the default search path contains a large number of vocabularies. Contrast this to the situation when a file is being parsed -- the search path has a minimal set of vocabularies containing basic parsing words. \begin_inset Foot collapsed false \layout Standard The rationale here is that the interactive interpreter should have a large number of words available by default, for convenience, whereas source files should specify their external dependencies explicitly. \end_inset \layout Standard New vocabularies are added to the search path using the \family typewriter USE: \family default parsing word. For example: \layout LyX-Code \begin_inset Quotes eld \end_inset /home/slava/.factor-rc \begin_inset Quotes erd \end_inset exists? . \layout LyX-Code \emph on ERROR: :1: Undefined: exists? 
\layout LyX-Code USE: streams \layout LyX-Code \begin_inset Quotes eld \end_inset /home/slava/.factor-rc \begin_inset Quotes erd \end_inset exists? . \layout LyX-Code \emph on t \layout Standard How do you know which vocabulary contains a word? Vocabularies can either be listed, or an \begin_inset Quotes eld \end_inset apropos \begin_inset Quotes erd \end_inset search can be performed: \layout LyX-Code "init" words. \layout LyX-Code \emph on [ ?run-file boot cli-arg cli-param init-environment \layout LyX-Code \emph on init-gc init-interpreter init-scratchpad init-search-path \layout LyX-Code \emph on init-stdio init-toplevel parse-command-line parse-switches \layout LyX-Code \emph on run-files run-user-init stdin stdout ] \layout LyX-Code \layout LyX-Code "map" apropos. \layout LyX-Code \emph on IN: lists \layout LyX-Code \emph on map \layout LyX-Code \emph on IN: strings \layout LyX-Code \emph on str-map \layout LyX-Code \emph on IN: vectors \layout LyX-Code \emph on (vector-map) \layout LyX-Code \emph on (vector-map-step) \layout LyX-Code \emph on vector-map \layout Standard New words are defined in the \emph on input vocabulary \emph default . The input vocabulary can be changed at the interactive prompt, or in a source file, using the \family typewriter IN: \family default parsing word. For example: \layout LyX-Code IN: music-database \layout LyX-Code : random-playlist ... ; \layout Standard It is a convention (although it is not enforced by the parser) that the \family typewriter IN: \family default directive is the first statement in a source file, and all \family typewriter USE: \family default directives follow, before any other definitions. \layout Section PRACTICAL: Numbers game \layout Standard In this section, basic input/output and flow control are introduced. We construct a program that repeatedly prompts the user to guess a number -- they are informed if their guess is correct, too low, or too high. The game ends on a correct guess. 
\layout LyX-Code numbers-game \layout LyX-Code \emph on I'm thinking of a number between 0 and 100. \layout LyX-Code \emph on Enter your guess: \emph default 25 \layout LyX-Code \emph on Too low \layout LyX-Code \emph on Enter your guess: \emph default 38 \layout LyX-Code \emph on Too high \layout LyX-Code \emph on Enter your guess: \emph default 31 \layout LyX-Code \emph on Correct - you win! \layout Subsection Development methodology \layout Standard A typical Factor development session involves a text editor \begin_inset Foot collapsed false \layout Standard Try jEdit, which has Factor syntax highlighting out of the box. \end_inset and a Factor interpreter running side by side. Instead of the edit/compile/run cycle, the development process becomes an \begin_inset Quotes eld \end_inset edit cycle \begin_inset Quotes erd \end_inset -- you make some changes to the source file and reload it in the interpreter using a command like this: \layout LyX-Code \begin_inset Quotes eld \end_inset numbers-game.factor \begin_inset Quotes erd \end_inset run-file \layout Standard Then the changes can be tested, either by hand, or using a test harness. There is no need to compile anything, or to lose interpreter state by restarting. Additionally, words with \begin_inset Quotes eld \end_inset throw-away \begin_inset Quotes erd \end_inset definitions that you do not intend to keep can also be entered directly at the interpreter prompt. \layout Standard Each word should do one useful task. New words can be defined in terms of existing, already-tested words. You design a set of reusable words that model the problem domain. Then, the problem is solved in terms of a \emph on domain-specific vocabulary \emph default . This is called \emph on bottom-up design. \layout Subsection Getting started \layout Standard Start a text editor and create a file named \family typewriter numbers-game.factor \family default . \layout Standard At the top of the file, write a comment. 
Comments are a feature that can be found in almost any programming language; in Factor, they are implemented as parsing words. An example of commenting follows: \layout LyX-Code ! The word ! discards input until the end of the line \layout LyX-Code ( The word ( discards input until the next ) \layout Standard It is always a good idea to comment your code. Try to write simple code that does not need detailed comments to explain it; similarly, avoid redundant comments. These two principles are hard to quantify in a concrete way, and will become clearer as your skills with Factor increase. \layout Standard We will be defining new words in the numbers-game vocabulary; add an \family typewriter IN: \family default statement at the top of the source file: \layout LyX-Code IN: numbers-game \layout Standard Also, in order to be able to test the words, issue a \family typewriter USE: \family default statement in the interactive interpreter: \layout LyX-Code USE: numbers-game \layout Standard This section will develop the numbers game in an incremental fashion. After each addition, issue a command like the following to load the source file into the Factor interpreter: \layout LyX-Code \begin_inset Quotes eld \end_inset numbers-game.factor \begin_inset Quotes erd \end_inset run-file \layout Subsection Reading a number from the keyboard \layout Standard A fundamental operation required for the numbers game is to be able to read a number from the keyboard. The \family typewriter read \family default word \family typewriter ( -- str ) \family default reads a line of input and pushes it on the stack as a string. The \family typewriter parse-number \family default word \family typewriter ( str -- n ) \family default takes a string from the stack, and parses it, pushing an integer. 
These two words can be combined into a single colon definition: \layout LyX-Code : read-number ( -- n ) read parse-number ; \layout Standard You should add this definition to the source file, and try loading the file into the interpreter. As you will soon see, this raises an error! The problem is that the two words \family typewriter read \family default and \family typewriter parse-number \family default are not part of the default, minimal vocabulary search path used when reading files. The solution is to use \family typewriter apropos. \family default to find out which vocabularies contain those words, and add the appropriate USE: statements to the source file: \layout LyX-Code USE: parser \layout LyX-Code USE: stdio \layout Standard After adding the above two statements, the file should now parse, and testing should confirm that the read-number word works correctly. \begin_inset Foot collapsed false \layout Standard There is the possibility of an invalid number being entered at the keyboard. In this case, \family typewriter parse-number \family default returns \family typewriter f \family default , the boolean false value. For the sake of simplicity, we ignore this case in the numbers game example. However, proper error handling is an essential part of any large program and is covered later. \end_inset \layout Subsection Printing some messages \layout Standard Now we need to make some words for printing various messages. They are given here without further ado: \layout LyX-Code : guess-banner \layout LyX-Code \begin_inset Quotes eld \end_inset I'm thinking of a number between 0 and 100. 
\begin_inset Quotes erd \end_inset print ; \layout LyX-Code : guess-prompt \begin_inset Quotes eld \end_inset Enter your guess: \begin_inset Quotes erd \end_inset write ; \layout LyX-Code : too-high \begin_inset Quotes eld \end_inset Too high \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : too-low \begin_inset Quotes eld \end_inset Too low \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : correct \begin_inset Quotes eld \end_inset Correct - you win! \begin_inset Quotes erd \end_inset print ; \layout Standard Note that in the above, stack effect comments are omitted, since they are obvious from context. You should ensure the words work correctly after loading the source file into the interpreter. \layout Subsection Taking action based on a guess \layout Standard The next logical step is to write a word \family typewriter judge-guess \family default that takes the actual number to be guessed along with the user's guess, and prints one of the messages \family typewriter too-high \family default , \family typewriter too-low \family default , or \family typewriter correct \family default . This word will also push a boolean flag, indicating if the game should continue or not -- in the case of a correct guess, the game does not continue. \layout Standard This description of judge-guess is a mouthful -- and it suggests that it may be best to split it into two words. The first word we write handles the more specific case of an \emph on inexact \emph default guess -- it prints either \family typewriter too-low \family default or \family typewriter too-high \family default . \layout LyX-Code : inexact-guess ( actual guess -- ) \layout LyX-Code < [ too-high ] [ too-low ] ifte ; \layout Standard Note that the word gives incorrect output if the two parameters are equal. However, it will never be called this way. \layout Standard With this out of the way, the implementation of judge-guess is an easy task to tackle. 
Using the words \family typewriter inexact-guess \family default , \family typewriter = \family default , \family typewriter 2dup \family default and \family typewriter 2drop \family default , we can write: \layout LyX-Code : judge-guess ( actual guess -- ? ) \layout LyX-Code 2dup = [ \layout LyX-Code 2drop correct f \layout LyX-Code ] [ \layout LyX-Code inexact-guess t \layout LyX-Code ] ifte ; \layout Standard Note the use of \family typewriter 2dup ( x y -- x y x y ) \family default . Since \family typewriter = \family default consumes both its parameters, we must make copies of them so that the originals remain on the stack for \family typewriter inexact-guess \family default . In the branch for a correct guess the originals are no longer needed, so \family typewriter 2drop ( x y -- ) \family default discards them; this keeps the stack effects of the two branches the same. Try the following at the interpreter to see what's going on: \layout LyX-Code clear 1 2 2dup = .s \layout LyX-Code \emph on { 1 2 f } \layout LyX-Code clear 4 4 2dup = .s \layout LyX-Code \emph on { 4 4 t } \layout Standard Test \family typewriter judge-guess \family default with a few inputs: \layout LyX-Code 1 10 judge-guess . \layout LyX-Code \emph on Too high \layout LyX-Code \emph on t \layout LyX-Code 89 43 judge-guess . \layout LyX-Code \emph on Too low \layout LyX-Code \emph on t \layout LyX-Code 64 64 judge-guess . \layout LyX-Code \emph on Correct - you win! \layout LyX-Code \emph on f \layout Subsection Generating random numbers \layout Standard The \family typewriter random-int \family default word \family typewriter ( min max -- n ) \family default pushes a random number in a specified range. The range is inclusive, so both the minimum and maximum values are candidate random numbers. Use \family typewriter apropos. \family default to determine that this word is in the \family typewriter random \family default vocabulary. 
For the purposes of this game, the number to guess lies in the range of 0 to 100, so we can define a word that generates a random number in that range: \layout LyX-Code : number-to-guess ( -- n ) 0 100 random-int ; \layout Standard Add the word definition to the source file, along with the appropriate \family typewriter USE: \family default statement. Load the source file in the interpreter, and confirm that the word functions correctly, and that its stack effect comment is accurate. \layout Subsection The game loop \layout Standard The game loop consists of repeated calls to \family typewriter guess-prompt \family default , \family typewriter read-number \family default and \family typewriter judge-guess \family default . If \family typewriter judge-guess \family default pushes \family typewriter f \family default , the loop stops, otherwise it continues. This is realized with a recursive implementation: \layout LyX-Code : numbers-game-loop ( actual -- ) \layout LyX-Code dup guess-prompt read-number judge-guess [ \layout LyX-Code numbers-game-loop \layout LyX-Code ] [ \layout LyX-Code drop \layout LyX-Code ] ifte ; \layout Standard In Factor, tail-recursive words consume a bounded amount of call stack space. This means you are free to pick recursion or iteration based on their own merits when solving a problem. In many other languages, the usefulness of recursion is severely limited by the lack of tail-recursive call optimization. \layout Subsection Finishing off \layout Standard The last task is to combine everything into the main \family typewriter numbers-game \family default word, which prints the banner, picks a number, and enters the game loop. This is easier than it seems: \layout LyX-Code : numbers-game guess-banner number-to-guess numbers-game-loop ; \layout Standard Try it out! Simply invoke the numbers-game word in the interpreter. It should work flawlessly, assuming you tested each component of this design incrementally! \layout Subsection The complete program \layout LyX-Code ! 
Numbers game example \newline \layout LyX-Code IN: numbers-game \layout LyX-Code USE: parser \layout LyX-Code USE: random \layout LyX-Code USE: stdio \newline \newline : read-number ( -- n ) read parse-number ; \newline \newline : guess-banner \layout LyX-Code \begin_inset Quotes eld \end_inset I'm thinking of a number between 0 and 100. \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : guess-prompt \begin_inset Quotes eld \end_inset Enter your guess: \begin_inset Quotes erd \end_inset write ; \layout LyX-Code : too-high \begin_inset Quotes eld \end_inset Too high \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : too-low \begin_inset Quotes eld \end_inset Too low \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : correct \begin_inset Quotes eld \end_inset Correct - you win! \begin_inset Quotes erd \end_inset print ; \newline \newline : inexact-guess ( actual guess -- ) \layout LyX-Code < [ too-high ] [ too-low ] ifte ; \newline \newline : judge-guess ( actual guess -- ? ) \layout LyX-Code 2dup = [ \layout LyX-Code 2drop correct f \layout LyX-Code ] [ \layout LyX-Code inexact-guess t \layout LyX-Code ] ifte ; \newline \newline : number-to-guess ( -- n ) 0 100 random-int ; \newline \newline : numbers-game-loop ( actual -- ) \layout LyX-Code dup guess-prompt read-number judge-guess [ \layout LyX-Code numbers-game-loop \layout LyX-Code ] [ \layout LyX-Code drop \layout LyX-Code ] ifte ; \newline \newline : numbers-game guess-banner number-to-guess numbers-game-loop ; \layout LyX-Code \layout Section Lists \layout Standard A list is composed of a set of pairs; each pair holds a list element, and a reference to the next pair. Lists have the following literal syntax: \layout LyX-Code [ \begin_inset Quotes eld \end_inset CEO \begin_inset Quotes erd \end_inset 5 \begin_inset Quotes eld \end_inset CFO \begin_inset Quotes erd \end_inset -4 f ] \layout Standard Before we continue, it is important to understand the role of data types in Factor. 
Let's make a distinction between two categories of data types: \layout Itemize Representational type -- this refers to the form of the data in the interpreter. Representational types include integers, strings, and vectors. Representational types are checked at run time -- attempting to multiply two strings, for example, will yield an error. \layout Itemize Intentional type -- this refers to the meaning of the data within the problem domain. This could be a length measured in inches, or a string naming a file, or a list of objects in a room in a game. It is up to the programmer to check intentional types -- Factor won't prevent you from adding two integers representing a distance and a time, even though the result is meaningless. \layout Subsection Cons cells \layout Standard It may surprise you that in Factor, \emph on lists are intentional types \emph default . This means that they are not an inherent feature of the interpreter; rather, they are built from a simpler data type, the \emph on cons cell \emph default . \layout Standard A cons cell is an object that holds a reference to two other objects. The order of the two objects matters -- the first is called the \emph on car \emph default , the second is called the \emph on cdr \emph default . \layout Standard All words relating to cons cells and lists are found in the \family typewriter lists \family default vocabulary. The words \family typewriter cons \family default , \family typewriter car \family default and \family typewriter cdr \family default \begin_inset Foot collapsed false \layout Standard These infamous names originate from the Lisp language. Originally, \begin_inset Quotes eld \end_inset Lisp \begin_inset Quotes erd \end_inset stood for \begin_inset Quotes eld \end_inset List Processing \begin_inset Quotes erd \end_inset . \end_inset construct and deconstruct cons cells: \layout LyX-Code 1 2 cons . \layout LyX-Code \emph on [ 1 | 2 ] \layout LyX-Code 3 4 cons car . 
\layout LyX-Code \emph on 3 \layout LyX-Code 5 6 cons cdr . \layout LyX-Code \emph on 6 \layout Standard The output of the first expression suggests a literal syntax for cons cells: \layout LyX-Code [ 10 | 20 ] cdr . \layout LyX-Code \emph on 20 \layout LyX-Code [ \begin_inset Quotes eld \end_inset first \begin_inset Quotes erd \end_inset | [ \begin_inset Quotes eld \end_inset second \begin_inset Quotes erd \end_inset | f ] ] car . \layout LyX-Code \emph on \begin_inset Quotes eld \end_inset first \begin_inset Quotes erd \end_inset \layout LyX-Code [ \begin_inset Quotes eld \end_inset first \begin_inset Quotes erd \end_inset | [ \begin_inset Quotes eld \end_inset second \begin_inset Quotes erd \end_inset | f ] ] cdr car . \layout LyX-Code \emph on \begin_inset Quotes eld \end_inset second \begin_inset Quotes erd \end_inset \layout Standard The last two examples make it clear how nested cons cells represent a list. Since this \begin_inset Quotes eld \end_inset nested cons cell \begin_inset Quotes erd \end_inset syntax is extremely cumbersome, the parser provides an easier way: \layout LyX-Code [ 1 2 3 4 ] cdr cdr car . \layout LyX-Code \emph on 3 \layout Standard A \emph on generalized list \emph default is a set of cons cells linked by their cdr. A \emph on proper list \emph default , or just list, is a generalized list whose last cons cell has a cdr equal to \family typewriter f \family default . Also, the object \family typewriter f \family default is a proper list, and in fact it is equivalent to the empty list \family typewriter [ ] \family default . An \emph on improper list \emph default is a generalized list that is not a proper list. \layout Standard The \family typewriter list? \family default word tests if the object at the top of the stack is a proper list: \layout LyX-Code \begin_inset Quotes eld \end_inset hello \begin_inset Quotes erd \end_inset list? .
\layout LyX-Code \emph on f \layout LyX-Code [ \begin_inset Quotes eld \end_inset first \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset second \begin_inset Quotes erd \end_inset | \begin_inset Quotes eld \end_inset third \begin_inset Quotes erd \end_inset ] list? . \layout LyX-Code \emph on f \layout LyX-Code [ \begin_inset Quotes eld \end_inset first \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset second \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset third \begin_inset Quotes erd \end_inset ] list? . \layout LyX-Code \emph on t \layout Subsection Working with lists \layout Standard Unless otherwise documented, list manipulation words expect proper lists as arguments. Given an improper list, they will either raise an error, or disregard the hanging cdr at the end of the list. \layout Standard Also unless otherwise documented, list manipulation words return newly-created lists only. The original parameters are not modified. This may seem inefficient; however, the absence of side effects makes code much easier to test and debug. \begin_inset Foot collapsed false \layout Standard Side effect-free code is the fundamental idea underlying functional programming languages. While Factor allows side effects and is not a functional programming language, for a lot of problems, coding in a functional style gives the most maintainable and readable results. \end_inset Where performance is important, a set of \begin_inset Quotes eld \end_inset destructive \begin_inset Quotes erd \end_inset words is provided. They are documented later in this section.
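\layout Standard As a small illustration of this point, the following sketch conses a new element onto a list while keeping a second reference to the original. It uses only \family typewriter dup \family default , \family typewriter swap \family default and \family typewriter cons \family default ; the printout is what the cons cell semantics described earlier imply, not a recorded interpreter session:
\layout LyX-Code [ 1 2 3 ] dup 0 swap cons .s
\layout LyX-Code \emph on { [ 1 2 3 ] [ 0 1 2 3 ] }
\layout Standard The new list shares the cons cells of the original, yet the original still prints as the same three-element list.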
\layout Standard \family typewriter nth ( index list -- obj ) \family default Look up an element specified by a zero-based index, by successively iterating down the cdr of the list: \layout LyX-Code 1 [ \begin_inset Quotes eld \end_inset Hamster \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset Bagpipe \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset Beam \begin_inset Quotes erd \end_inset ] nth . \layout LyX-Code \emph on \begin_inset Quotes eld \end_inset Bagpipe \begin_inset Quotes erd \end_inset \layout Standard This word takes time proportional to the list index. If you need constant time lookups, use a vector instead. \layout Standard \family typewriter length ( list -- n ) \family default Iterate down the cdr of the list until it reaches \family typewriter f \family default , counting the number of elements in the list: \layout LyX-Code [ [ 1 2 ] [ 3 4 ] 5 ] length . \layout LyX-Code \emph on 3 \layout LyX-Code [ [ [ \begin_inset Quotes eld \end_inset Hey \begin_inset Quotes erd \end_inset ] ] 5 ] length . \layout LyX-Code \emph on 2 \layout Standard \family typewriter unit ( obj -- list ) \family default Make a list of one element: \layout LyX-Code \begin_inset Quotes eld \end_inset Unit 18 \begin_inset Quotes erd \end_inset unit . \layout LyX-Code \emph on [ \begin_inset Quotes eld \end_inset Unit 18 \begin_inset Quotes erd \end_inset ] \layout Standard \family typewriter append ( list list -- list ) \family default Append the two lists at the top of the stack: \layout LyX-Code [ 1 2 3 ] [ 4 5 6 ] append . \layout LyX-Code \emph on [ 1 2 3 4 5 6 ] \layout LyX-Code [ 1 2 3 ] dup [ 4 5 6 ] append .s \layout LyX-Code \emph on { [ 1 2 3 ] [ 1 2 3 4 5 6 ] } \layout Standard The first list is copied, and the cdr of its last cons cell is set to the second list. The second example above shows that the original parameter was not modified.
Interestingly, if the second parameter is not a proper list, \family typewriter append \family default returns an improper list: \layout LyX-Code [ 1 2 3 ] 4 append . \layout LyX-Code \emph on [ 1 2 3 | 4 ] \layout Standard \family typewriter add ( list obj -- list ) \family default Create a new list consisting of the original list, and a new element added at the end: \layout LyX-Code [ 1 2 3 ] 4 add . \layout LyX-Code \emph on [ 1 2 3 4 ] \layout LyX-Code 1 [ 2 3 4 ] cons . \layout LyX-Code \emph on [ 1 2 3 4 ] \layout Standard While \family typewriter cons \family default and \family typewriter add \family default appear to have similar effects, they are quite different -- \family typewriter cons \family default is a very cheap operation, while \family typewriter add \family default has to copy the entire list first! If you need adding at the end to take constant time, use a vector. \layout Standard \family typewriter reverse ( list -- list ) \family default Push a new list which has the same elements as the original one, but in reverse order: \layout LyX-Code [ 4 3 2 1 ] reverse . \layout LyX-Code \emph on [ 1 2 3 4 ] \layout Standard \family typewriter contains ( obj list -- list ) \family default Look for an occurrence of an object in a list. The remainder of the list starting from the first occurrence is returned. If the object does not occur in the list, \family typewriter f \family default is returned: \layout LyX-Code : lived-in? ( country -- ? ) \layout LyX-Code [ \begin_inset Quotes eld \end_inset Canada \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset New Zealand \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset Australia \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset Russia \begin_inset Quotes erd \end_inset ] contains ; \layout LyX-Code \begin_inset Quotes eld \end_inset Australia \begin_inset Quotes erd \end_inset lived-in? .
\layout LyX-Code \emph on [ \begin_inset Quotes eld \end_inset Australia \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset Russia \begin_inset Quotes erd \end_inset ] \layout LyX-Code \begin_inset Quotes eld \end_inset Pakistan \begin_inset Quotes erd \end_inset lived-in? . \layout LyX-Code \emph on f \layout Standard For now, assume \begin_inset Quotes eld \end_inset occurs \begin_inset Quotes erd \end_inset means \begin_inset Quotes eld \end_inset contains an object that looks like \begin_inset Quotes erd \end_inset . The issue of object equality is covered in the next chapter. \layout Standard \family typewriter remove ( obj list -- list ) \family default Push a new list, with all occurrences of the object removed. All other elements are in the same order: \layout LyX-Code : australia- \begin_inset Quotes eld \end_inset Australia \begin_inset Quotes erd \end_inset swap remove ; \layout LyX-Code [ "Canada" "New Zealand" "Australia" "Russia" ] australia- . \layout LyX-Code \emph on [ "Canada" "New Zealand" "Russia" ] \layout Standard \family typewriter remove-nth ( index list -- list ) \family default Push a new list, with the element at the given index removed. The index is assumed to be zero-based, as with \family typewriter nth \family default : \layout LyX-Code 1 [ "Canada" "New Zealand" "Australia" "Russia" ] remove-nth . \layout LyX-Code \emph on [ "Canada" "Australia" "Russia" ] \layout Standard XXX: unique, set-nth -- talk about lists as stacks \layout Subsection Association lists \layout Standard An \emph on association list \emph default is one where every element is a cons. The car of each cons is a name, the cdr is a value.
The literal notation is suggestive: \layout LyX-Code [ \layout LyX-Code [ \begin_inset Quotes eld \end_inset Jill \begin_inset Quotes erd \end_inset | \begin_inset Quotes eld \end_inset CEO \begin_inset Quotes erd \end_inset ] \layout LyX-Code [ \begin_inset Quotes eld \end_inset Jeff \begin_inset Quotes erd \end_inset | \begin_inset Quotes eld \end_inset manager \begin_inset Quotes erd \end_inset ] \layout LyX-Code [ \begin_inset Quotes eld \end_inset James \begin_inset Quotes erd \end_inset | \begin_inset Quotes eld \end_inset lowly web designer \begin_inset Quotes erd \end_inset ] \layout LyX-Code ] \layout Standard \family typewriter assoc? ( obj -- ? ) \family default returns \family typewriter t \family default if the object is a list whose every element is a cons; otherwise it returns \family typewriter f \family default . \layout Standard \family typewriter assoc ( name alist -- value ) \family default looks for a pair with this name in the list, and pushes the cdr of the pair. Pushes \family typewriter f \family default if no pair with this name is present. Note that \family typewriter assoc \family default cannot differentiate between a name that is not present at all and a name with a value of \family typewriter f \family default . \layout Standard \family typewriter assoc* ( name alist -- [ name | value ] ) \family default looks for a pair with this name, and pushes the pair itself. Unlike \family typewriter assoc \family default , \family typewriter assoc* \family default returns different values in the cases of a value set to \family typewriter f \family default , or an undefined value. \layout Standard \family typewriter set-assoc ( value name alist -- alist ) \family default removes any existing occurrence of a name from the list, and adds a new pair. This creates a new list; the original is unaffected. \layout Standard \family typewriter acons ( value name alist -- alist ) \family default is slightly faster than \family typewriter set-assoc \family default since it simply conses a new pair onto the list.
However, if used repeatedly, the list will grow to contain a lot of \begin_inset Quotes eld \end_inset shadowed \begin_inset Quotes erd \end_inset pairs. \layout Standard Searching an association list incurs a linear time cost, so they should only be used for small mappings -- a typical use is a mapping of half a dozen entries or so, specified literally in source. Hashtables can achieve better performance with larger mappings. \layout Subsection List combinators \layout Standard In a traditional language such as C, every iteration over a collection must be written out as a loop, with setting up and updating of indices, etc. Factor, on the other hand, relies on combinators and quotations to avoid duplicating these loop \begin_inset Quotes eld \end_inset design patterns \begin_inset Quotes erd \end_inset throughout the code. \layout Standard The simplest case is iterating through each element of a list, and printing it or otherwise consuming it from the stack. \layout Standard \family typewriter each ( list quot -- ) \family default pushes each element of the list in turn, and executes the quotation. The list and quotation are not on the stack when the quotation is executed. This allows a powerful idiom where the quotation makes a copy of a value on the stack, and consumes it along with the list element. In fact, this idiom works with all well-designed combinators. \begin_inset Foot collapsed false \layout Standard Later, you will learn how to apply it when designing your own combinators. \end_inset \layout Standard The previously-mentioned \family typewriter reverse \family default word is implemented using \family typewriter each \family default : \layout LyX-Code : reverse [ ] swap [ swons ] each ; \layout Standard To understand how it works, consider that each element of the original list is consed onto the beginning of a new list, in turn. So the last element of the original list ends up at the beginning of the new list.
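\layout Standard The following sketch traces that process for a three-element list. It assumes that \family typewriter swons \family default behaves like \family typewriter swap cons \family default , which the text above does not state; the word name \family typewriter my-reverse \family default is hypothetical and is written with \family typewriter swap cons \family default so that it relies only on words already introduced. Lines beginning with \family typewriter ! \family default are comments:
\layout LyX-Code : my-reverse ( list -- list ) [ ] swap [ swap cons ] each ;
\layout LyX-Code ! Stack before each runs: [ ] [ 1 2 3 ] quotation
\layout LyX-Code ! each pushes 1, swap cons leaves: [ 1 ]
\layout LyX-Code ! each pushes 2, swap cons leaves: [ 2 1 ]
\layout LyX-Code ! each pushes 3, swap cons leaves: [ 3 2 1 ]
\layout LyX-Code [ 1 2 3 ] my-reverse .
\layout LyX-Code \emph on [ 3 2 1 ]
\layout Standard The partial result stays on the stack underneath each element as it is pushed, which is why the quotation needs no variables.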
\layout Standard \family typewriter inject ( list quot -- list ) \family default is similar to \family typewriter each \family default , except the return values of the quotation are collected into the new list. The quotation must consume the list element and leave exactly one value in its place, otherwise the combinator will not function properly; that is, the quotation must have stack effect \family typewriter ( obj -- obj ) \family default . \layout Standard For example, suppose we have a list where each element stores the quantity of some nutrient in 100 grams of food; we would like to find out the total nutrients contained in 300 grams: \layout LyX-Code : multiply-each ( n list -- list ) \layout LyX-Code [ dupd * ] inject nip ; \layout LyX-Code 3 [ 50 450 101 ] multiply-each . \layout LyX-Code \emph on [ 150 1350 303 ] \layout Standard Note the use of \family typewriter nip \family default to discard the original parameter \family typewriter n \family default . \layout Standard In case there is no appropriate combinator, recursion can be used. Factor performs tail call optimization, so a word where the recursive call is the last thing done will not use an arbitrary amount of stack space. \layout Standard \family typewriter subset ( list quot -- list ) \family default produces a new list containing some of the elements of the original list. Which elements to collect is determined by the quotation -- the quotation is called with each list element on the stack in turn, and those elements for which the quotation does not return \family typewriter f \family default are added to the new list. The quotation must have stack effect \family typewriter ( obj -- ? ) \family default . \layout Standard For example, let's construct a list of all numbers between 0 and 99 such that the sum of their digits is less than 10: \layout LyX-Code : sum-of-digits ( n -- n ) 10 /mod + ; \layout LyX-Code 100 count [ sum-of-digits 10 < ] subset .
\layout LyX-Code \emph on [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 20 21 \layout LyX-Code \emph on 22 23 24 25 26 27 30 31 32 33 34 35 36 40 41 42 43 44 \layout LyX-Code \emph on 45 50 51 52 53 54 60 61 62 63 70 71 72 80 81 90 ] \layout Standard \family typewriter all? ( list quot -- ? ) \family default returns \family typewriter t \family default if the quotation returns \family typewriter t \family default for all elements of the list, otherwise it returns \family typewriter f \family default . In other words, if \family typewriter all? \family default returns \family typewriter t \family default , then \family typewriter subset \family default applied to the same list and quotation would return the entire list. \begin_inset Foot collapsed false \layout Standard Barring any side effects which modify the execution of the quotation. It is best to avoid side effects when using list combinators. \end_inset \layout Standard For example, the implementation of \family typewriter assoc? \family default uses \family typewriter all? \family default : \layout LyX-Code : assoc? ( list -- ? ) \layout LyX-Code dup list? [ [ cons? ] all? ] [ drop f ] ifte ; \layout Subsection List constructors \layout Standard The list construction words minimize stack noise with a clever trick. They store a partial list in a variable, thus reducing the number of stack elements that have to be juggled. \layout Standard The word \family typewriter [, ( -- ) \family default begins list construction. \layout Standard The word \family typewriter , ( obj -- ) \family default appends an object to the partial list. \layout Standard The word \family typewriter ,] ( -- list ) \family default pushes the complete list. \layout Standard While variables haven't been described yet, keep in mind that a new scope is created between \family typewriter [, \family default and \family typewriter ,] \family default . 
This means that list constructions can be nested, as long as in the end, the number of \family typewriter [, \family default and \family typewriter ,] \family default balances out. There is no requirement that \family typewriter [, \family default and \family typewriter ,] \family default appear in the same word; however, debugging becomes prohibitively difficult when a list construction begins in one word and ends in another. \layout Standard Here is an example of list construction using this technique: \layout LyX-Code [, 1 10 [ 2 * dup , ] times drop ,] . \layout LyX-Code \emph on [ 2 4 8 16 32 64 128 256 512 1024 ] \layout LyX-Code \layout Subsection Destructively modifying lists \layout Standard All of the list manipulation words discussed so far return newly-allocated lists. Destructive list manipulation words, on the other hand, reuse the cons cells of their input lists, and hence avoid memory allocation. \layout Standard Only ever destructively change lists you do not intend to use again. You should not rely on the side effects -- they are unpredictable. It is wrong to think that destructive words \begin_inset Quotes eld \end_inset modify \begin_inset Quotes erd \end_inset the original list -- rather, think of them as returning a new list, just like the normal versions of the words, with the added caveat that the original list must not be used again. \layout Standard \family typewriter nreverse ( list -- list ) \family default reverses a list without consing. In the following example, the return value reuses the cons cells of the original list, and the original list has been ruined by unpredictable side effects: \layout LyX-Code [ 1 2 3 4 ] dup nreverse .s \layout LyX-Code \emph on { [ 4 ] [ 4 3 2 1 ] } \layout Standard Compare the second stack element (which is what remains of the original list) and the top stack element (the list returned by \family typewriter nreverse \family default ).
\layout Standard The \family typewriter nreverse \family default word is the most frequently used destructive list manipulator. The usual idiom is a loop where a value is consed onto the beginning of a list in each iteration, then the list is reversed at the end. Since the original list is never used again, \family typewriter nreverse \family default can safely be used here. XXX - example \layout Standard \family typewriter nappend ( list list -- list ) \family default sets the cdr of the last cons cell in the first list to the second list, unless the first list is \family typewriter f \family default , in which case it simply returns the second list. Again, the side effects on the first list are unpredictable -- if it is \family typewriter f \family default , it is unchanged, otherwise, it is equal to the return value: \layout LyX-Code [ 1 2 ] [ 3 4 ] nappend . \layout LyX-Code \emph on [ 1 2 3 4 ] \layout Standard Note that in the above examples, we pass literal lists to \family typewriter nreverse \family default and \family typewriter nappend \family default . This is actually a very bad idea, since the same literal list may be used more than once! For example, let's make a colon definition: \layout LyX-Code : very-bad-idea [ 1 2 3 4 ] nreverse ; \layout LyX-Code very-bad-idea . \layout LyX-Code \emph on [ 4 3 2 1 ] \layout LyX-Code very-bad-idea . \layout LyX-Code \emph on [ 4 ] \layout LyX-Code \begin_inset Quotes eld \end_inset very-bad-idea \begin_inset Quotes erd \end_inset see \layout LyX-Code \emph on : very-bad-idea \layout LyX-Code \emph on [ 4 ] nreverse ; \layout Standard As you can see, the word definition itself was ruined! \layout Standard Sometimes it is desirable to make a copy of a list, so that the copy may be safely side-effected later. \layout Standard \family typewriter clone-list ( list -- list ) \family default pushes a new list containing the exact same elements as the original. The elements themselves are not copied.
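\layout Standard A minimal sketch of how \family typewriter clone-list \family default can guard against the problem shown above. The word name \family typewriter safer-idea \family default is hypothetical, and the printouts are what the descriptions of \family typewriter clone-list \family default and \family typewriter nreverse \family default imply, not a recorded interpreter session:
\layout LyX-Code : safer-idea [ 1 2 3 4 ] clone-list nreverse ;
\layout LyX-Code safer-idea .
\layout LyX-Code \emph on [ 4 3 2 1 ]
\layout LyX-Code safer-idea .
\layout LyX-Code \emph on [ 4 3 2 1 ]
\layout Standard Because \family typewriter nreverse \family default only ever sees the fresh copy, the literal embedded in the definition keeps its original cons cells. The price is one allocation per cons cell on every call.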
\layout Standard If you want to write your own destructive list manipulation words, you can use \family typewriter set-car ( value cons -- ) \family default and \family typewriter set-cdr ( value cons -- ) \family default to modify individual cons cells. Some words that are not destructive on their inputs nonetheless create intermediate lists which are operated on using these words. One example is \family typewriter clone-list \family default itself. \layout Section Vectors \layout Standard A vector is a contiguous chunk of cells which hold references to arbitrary objects. Vectors have the following literal syntax: \layout LyX-Code { f f f t t f t t -6 \begin_inset Quotes eld \end_inset Hey \begin_inset Quotes erd \end_inset } \layout Standard Use of vector literals in source code is discouraged, since vector manipulation relies on side effects rather than return values, and it is very easy to mess up a literal embedded in a word definition. \layout Subsection Vectors versus lists \layout Standard Vectors are applicable for a different class of problems than lists. 
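Compare the relative performance of common operations on vectors and lists: \layout Standard \begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features>
<column alignment="left" valignment="top" leftline="true" width="0">
<column alignment="center" valignment="top" leftline="true" width="0">
<column alignment="center" valignment="top" leftline="true" rightline="true" width="0">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard Lists \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text \layout Standard Vectors \end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard Random access of an index \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard linear time \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text \layout Standard constant time \end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard Add new element at start \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard constant time \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text \layout Standard linear time \end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard Add new element at end \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text \layout Standard linear time \end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text \layout Standard constant time \end_inset
</cell>
</row>
</lyxtabular>
\end_inset \layout Standard When using vectors, you need to pass around a vector and an index -- when working with lists, often only a list head is passed around. For this reason, if you need a sequence for iteration only, a list is a better choice because the list vocabulary contains a rich collection of recursive words. \layout Standard On the other hand, when you need to maintain your own \begin_inset Quotes eld \end_inset stack \begin_inset Quotes erd \end_inset -like collection, a vector is the obvious choice, since most pushes and pops can then avoid allocating memory. \layout Standard Vectors and lists can be converted back and forth using the \family typewriter vector>list \family default word \family typewriter ( vector -- list ) \family default and the \family typewriter list>vector \family default word \family typewriter ( list -- vector ) \family default . \layout Subsection Vector manipulation \layout Standard \family typewriter <vector> ( capacity -- vector ) \family default pushes a zero-length vector. Storing more elements than the initial capacity grows the vector.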
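\layout Standard A minimal sketch of a vector used as a stack-like collection. It assumes that the constructor described above is named \family typewriter <vector> \family default , and it keeps the vector in a variable with the hypothetical name \family typewriter v \family default using \family typewriter set \family default and \family typewriter get \family default . The words \family typewriter vector-push \family default , \family typewriter vector-pop \family default and \family typewriter vector-length \family default are described later in this section, and the printouts follow from those descriptions rather than from a recorded session:
\layout LyX-Code 10 <vector> "v" set
\layout LyX-Code "a" "v" get vector-push
\layout LyX-Code "b" "v" get vector-push
\layout LyX-Code "v" get vector-length .
\layout LyX-Code \emph on 2
\layout LyX-Code "v" get vector-pop .
\layout LyX-Code \emph on "b"
\layout LyX-Code "v" get vector-length .
\layout LyX-Code \emph on 1
\layout Standard The capacity of 10 is only a starting size; the length reported is the number of elements actually stored.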
\layout Standard \family typewriter vector-nth ( index vector -- obj ) \family default pushes the object stored at a zero-based index of a vector: \layout LyX-Code 0 { \begin_inset Quotes eld \end_inset zero \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset one \begin_inset Quotes erd \end_inset } vector-nth . \layout LyX-Code \emph on \begin_inset Quotes eld \end_inset zero \begin_inset Quotes erd \end_inset \layout LyX-Code 2 { 1 2 } vector-nth . \layout LyX-Code \emph on ERROR: Out of bounds \layout Standard \family typewriter set-vector-nth ( obj index vector -- ) \family default stores a value into a vector: \begin_inset Foot collapsed false \layout Standard The words \family typewriter get \family default and \family typewriter set \family default used in this example will be formally introduced later. \end_inset \layout LyX-Code { \begin_inset Quotes eld \end_inset math \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset CS \begin_inset Quotes erd \end_inset } \begin_inset Quotes eld \end_inset v \begin_inset Quotes erd \end_inset set \layout LyX-Code \begin_inset Quotes eld \end_inset philosophy \begin_inset Quotes erd \end_inset 1 \begin_inset Quotes eld \end_inset v \begin_inset Quotes erd \end_inset get set-vector-nth \layout LyX-Code \begin_inset Quotes eld \end_inset v \begin_inset Quotes erd \end_inset get . \layout LyX-Code \emph on { \begin_inset Quotes eld \end_inset math \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset philosophy \begin_inset Quotes erd \end_inset } \layout LyX-Code \begin_inset Quotes eld \end_inset CS \begin_inset Quotes erd \end_inset 4 \begin_inset Quotes eld \end_inset v \begin_inset Quotes erd \end_inset get set-vector-nth \layout LyX-Code \begin_inset Quotes eld \end_inset v \begin_inset Quotes erd \end_inset get .
\layout LyX-Code \emph on { \begin_inset Quotes eld \end_inset math \begin_inset Quotes erd \end_inset \begin_inset Quotes eld \end_inset philosophy \begin_inset Quotes erd \end_inset f f \begin_inset Quotes eld \end_inset CS \begin_inset Quotes erd \end_inset } \layout Standard \family typewriter vector-length ( vector -- length ) \family default pushes the number of elements in a vector. As the previous two examples demonstrate, attempting to fetch beyond the end of the vector will raise an error, while storing beyond the end will grow the vector as necessary. \layout Standard \family typewriter set-vector-length ( length vector -- ) \family default resizes a vector. If the new length is larger than the current length, the vector grows if necessary, and the new cells are filled with \family typewriter f \family default . \layout Standard \family typewriter vector-push ( obj vector -- ) \family default adds an object at the end of the vector. This increments the vector's length by one. \layout Standard \family typewriter vector-pop ( vector -- obj ) \family default removes the object at the end of the vector and pushes it. This decrements the vector's length by one. \layout Subsection Vector combinators \layout Standard vector-each, vector-map \layout Section Strings \layout Subsection Strings are character vectors \layout Standard str-nth, str-length, substring, ... 
\layout Subsection String buffers are mutable \layout Standard <sbuf>, sbuf-append, sbuf>str \layout Standard Otherwise like a vector: \layout Standard sbuf-nth, set-sbuf-nth, sbuf-length, set-sbuf-length \layout Subsection String constructors \layout Standard <% % %> \layout Subsection Printing and reading strings \layout Standard print, write, read, turning a string into a number \layout Section PRACTICAL: Contractor timesheet \layout Standard TODO operations: \layout Standard - print a time difference as hours:minutes \layout Standard - begin work \layout Standard - end work & annotate \layout Standard - print an invoice, takes hourly rate as a parameter. do simple formatted output, using 'spaces' and 'pad-string'. \layout Standard use a vector to store [ annotation | time ] pairs, pass the vector in \layout Section Organization \layout Subsection Hashtables \layout Subsection Namespaces \layout Subsection The name stack \layout Subsection The inspector \layout Section PRACTICAL: Music player \layout Section Deeper in the beast \layout Standard Text -> objects - parser, objects -> text - unparser for atoms, prettyprinter for collections. \layout Standard What really is a word -- primitive, parameter, property list. \layout Standard Call stack how it works and >r/r> \layout Subsection Parsing words \layout Standard Let's take a closer look at Factor syntax. Consider a simple expression, and the result of evaluating it in the interactive interpreter: \layout LyX-Code 2 3 + . \layout LyX-Code \emph on 5 \layout Standard The interactive interpreter is basically an infinite loop. It reads a line of input from the terminal, parses this line to produce a \emph on quotation \emph default , and executes the quotation. \layout Standard In the parse step, the input text is tokenized into a sequence of white space-separated tokens. First, the interpreter checks if there is an existing word named by the token.
If there is no such word, the interpreter instead treats the token as a number. \begin_inset Foot collapsed false \layout Standard Of course, Factor supports a full range of data types, including strings, lists and vectors. Their source representations are still built from numbers and words, however. \end_inset \layout Standard Once the expression has been entirely parsed, the interactive interpreter executes it. \layout Standard This parse time/run time distinction is important, because words fall into two categories: \begin_inset Quotes eld \end_inset parsing words \begin_inset Quotes erd \end_inset and \begin_inset Quotes eld \end_inset running words \begin_inset Quotes erd \end_inset . \layout Standard The parser constructs a parse tree from the input text. When the parser encounters a token representing a number or an ordinary word, the token is simply appended to the current parse tree node. A parsing word, on the other hand, is executed immediately after being tokenized. Since it executes in the context of the parser, it has access to the raw input text, the entire parse tree, and other parser structures. \layout Standard Parsing words are also defined using colon definitions, except we add \family typewriter parsing \family default after the terminating \family typewriter ; \family default . Here are two sets of definitions for the words \family typewriter foo \family default and \family typewriter bar \family default . They are identical, except that in the second one \family typewriter foo \family default is defined as a parsing word: \layout LyX-Code ! Let's define 'foo' as a running word. \layout LyX-Code : foo \begin_inset Quotes eld \end_inset 1) foo executed. \begin_inset Quotes erd \end_inset print ; \layout LyX-Code : bar foo \begin_inset Quotes eld \end_inset 2) bar executed.
\begin_inset Quotes erd \end_inset print ; \layout LyX-Code bar \layout LyX-Code \emph on 1) foo executed. \layout LyX-Code \emph on 2) bar executed. \layout LyX-Code bar \layout LyX-Code \emph on 1) foo executed. \layout LyX-Code \emph on 2) bar executed. \layout LyX-Code \layout LyX-Code ! Now let's define 'foo' as a parsing word. \layout LyX-Code : foo \begin_inset Quotes eld \end_inset 1) foo executed. \begin_inset Quotes erd \end_inset print ; parsing \layout LyX-Code : bar foo \begin_inset Quotes eld \end_inset 2) bar executed. \begin_inset Quotes erd \end_inset print ; \layout LyX-Code \emph on 1) foo executed. \layout LyX-Code bar \layout LyX-Code \emph on 2) bar executed. \layout LyX-Code bar \layout LyX-Code \emph on 2) bar executed. \layout Standard In fact, the word \family typewriter \begin_inset Quotes eld \end_inset \family default that denotes a string literal is a parsing word -- it reads characters from the input text until the next occurrence of \family typewriter \begin_inset Quotes eld \end_inset \family default , and appends this string to the current node of the parse tree. Note that strings and words are different types of objects. Strings are covered in great detail later. \layout Section PRACTICAL: Infix syntax \layout Section Continuations \layout Standard Generators, co-routines, multitasking, exception handling \layout Section HTTP Server \layout Section PRACTICAL: Some web app \the_end