diff --git a/doc/devel-guide.tex b/doc/devel-guide.tex deleted file mode 100644 index 47aacec962..0000000000 --- a/doc/devel-guide.tex +++ /dev/null @@ -1,2859 +0,0 @@ -% :indentSize=4:tabSize=4:noTabs=true:mode=tex:wrap=soft: - -\documentclass[english]{book} -\makeindex -\usepackage[T1]{fontenc} -\usepackage[latin1]{inputenc} -\usepackage{alltt} -\usepackage{tabularx} -\usepackage{times} -\pagestyle{headings} -\setcounter{tocdepth}{1} -\setlength\parskip{\medskipamount} -\setlength\parindent{0pt} - -\raggedbottom -\raggedright - -\newcommand{\ttbackslash}{\char'134} - -\newcommand{\sidebar}[1]{{\fbox{\fbox{\parbox{10cm}{\begin{minipage}[b]{10cm} -{\LARGE For wizards} - -#1 -\end{minipage}}}}}} - -\newcommand{\chapkeywords}[1]{%{\parbox{10cm}{\begin{minipage}[b]{10cm} -\begin{quote} -\emph{Key words:} \texttt{#1} -\end{quote} -%\end{minipage}}}} -} -\makeatletter - -\newcommand{\wordtable}[1]{{ -\begin{tabularx}{12cm}{|l l X|} -#1 -\hline -\end{tabularx}}} - -\newcommand{\tabvocab}[1]{ -\hline -\multicolumn{3}{|l|}{ -\rule[-2mm]{0mm}{6mm} -\texttt{#1} vocabulary:} -\\ -\hline -} - -\usepackage{babel} -\makeatother -\begin{document} - -\title{Factor Developer's Guide} - - -\author{Slava Pestov} - -\maketitle -\tableofcontents{} - -\chapter*{Introduction} - -Factor is a programming language with functional and object-oriented -influences. Factor borrows heavily from Forth, Joy and Lisp. Programmers familiar with these languages will recognize many similarities with Factor. - -Factor is \emph{interactive}. This means it is possible to run a Factor interpreter that reads from the keyboard, and immediately executes expressions as they are entered. This allows words to be defined and tested one at a time. - -Factor is \emph{dynamic}. This means that all objects in the language are fully reflective at run time, and that new definitions can be entered without restarting the interpreter. 
Factor code can be used interchangeably as data, meaning that sophisticated language extensions can be realized as libraries of words. - -Factor is \emph{safe}. This means all code executes in a virtual machine that provides -garbage collection and prohibits direct pointer arithmetic. There is no way to get a dangling reference by deallocating a live object, and it is not possible to corrupt memory by writing past the bounds of an array. - -When examples of interpreter interactions are given in this guide, the input is in a roman font, and any -output from the interpreter is in boldface: - -\begin{alltt} -\textbf{ok} "Hello, world!" print -\textbf{Hello, world!} -\end{alltt} - -\chapter{First steps} - -This chapter will cover the basics of interactive development using the listener. Short code examples will be presented, along with an introduction to programming with the stack, a guide to using source files, and some hints for exploring the library. - -\section{The listener} - -\chapkeywords{print write read-line} -\index{\texttt{print}} -\index{\texttt{write}} -\index{\texttt{read-line}} - -Factor is an \emph{image-based environment}. When you compiled Factor, you also generated a file named \texttt{factor.image}. You will learn more about images later, but for now it suffices to understand that to start Factor, you must pass the image file name on the command line: - -\begin{alltt} -./f factor.image -\textbf{Loading factor.image... relocating... done -This is native Factor 0.69 -Copyright (C) 2003, 2004 Slava Pestov -Copyright (C) 2004 Chris Double -Type ``exit'' to exit, ``help'' for help. -65536 KB total, 62806 KB free -ok} -\end{alltt} - -An \texttt{\textbf{ok}} prompt is printed after the initial banner, indicating the listener is ready to execute Factor expressions. You can try the classical first program: - -\begin{alltt} -\textbf{ok} "Hello, world." 
print -\textbf{Hello, world.} -\end{alltt} - -One thing all programmers have in common is they make large numbers of mistakes. There are two types of errors: syntax errors and logic errors. A syntax error indicates a simple typo or misplaced character. -A logic error is not associated with a piece of source code, but rather with an execution state; we will learn how to debug logic errors later. When a syntax error occurs, the listener indicates the point in the source code where the confusion arose: - -\begin{alltt} -\textbf{ok} "Hello, world" pr,int -\textbf{:1: Not a number -"Hello, world" pr,int - ^ -:s :r :n :c show stacks at time of error.} -\end{alltt} - -Factor source is composed of whitespace-separated words. Factor is case-sensitive. Each one of the following is exactly one word, and all four are distinct: - -\begin{verbatim} -2dup -2DUP -[ -@$*(%# -\end{verbatim} - -A frequent beginner's error is to leave out whitespace between words. When you are entering the code examples in the following sections, keep in mind that you cannot omit arbitrary whitespace: - -\begin{alltt} -:greet "Greetings, " write print; \emph{! incorrect} -: greet "Greetings, " write print ; \emph{! correct} -\end{alltt} - -\section{Colon definitions} - -\chapkeywords{:~; see} -\index{\texttt{:}} -\index{\texttt{;}} -\index{\texttt{see}} - -Factor words are similar to functions and procedures in other languages. Words are defined using \emph{colon definition} syntax. Some words, like \texttt{print}, \texttt{write} and \texttt{read-line}, along with dozens of others we will see, are part of Factor. Other words will be created by you. - -When you create a new word, you are associating a name with a particular sequence of \emph{already-existing} words. Enter the following colon definition in the listener: - -\begin{alltt} -\textbf{ok} : ask-name "What is your name? " write read-line ; -\end{alltt} - -What did we do above? 
We created a new word named \texttt{ask-name}, and associated with it the definition \texttt{"What is your name? " write read-line}. Now, let's type in two more colon definitions. The first one prints a personalized greeting. The second colon definition puts the first two together into a complete program. - -\begin{alltt} -\textbf{ok} : greet "Greetings, " write print ; -\textbf{ok} : friend ask-name greet ; -\end{alltt} - -Now that the three words \texttt{ask-name}, \texttt{greet}, and \texttt{friend} have been defined, simply typing \texttt{friend} in the listener will run our example program: - -\begin{alltt} -\textbf{ok} friend -\textbf{What is your name? }Lambda the Ultimate -\textbf{Greetings, Lambda the Ultimate} -\end{alltt} - -Notice that the \texttt{ask-name} word passes a piece of data to the \texttt{greet} word. We will worry about the exact details later -- for now, our focus is the colon definition syntax itself. - -You can look at the definition of any word, including library words, using \texttt{see}: - -\begin{alltt} -\textbf{ok} \ttbackslash greet see -\textbf{IN: scratchpad -: greet - "Greetings, " write print ;} -\end{alltt} - -The \texttt{see} word shows a reconstruction of the source code, not the original source code. So in particular, formatting and some comments are lost. - -\sidebar{ -Factor is written in itself. Indeed, most words in the Factor library are colon definitions, defined in terms of other words. Out of more than two thousand words in the Factor library, fewer than two hundred are \emph{primitive}, or built into the runtime. - -If you exit Factor by typing \texttt{exit}, any colon definitions entered at the listener will be lost. However, you can save a new image first, and use this image next time you start Factor in order to have access to your definitions. 
- -\begin{alltt} -\textbf{ok} "work.image" save-image\\ -\textbf{ok} exit -\end{alltt} - -The \texttt{factor.image} file you start with initially is just an image that was saved after the standard library, and nothing else, was loaded. -} - -\section{The stack} - -\chapkeywords{.s .~clear} -\index{\texttt{.s}} -\index{\texttt{.}} -\index{\texttt{clear}} - -In order to be able to write more sophisticated programs, you will need to master usage of the \emph{stack}. Input parameters -- for example, numbers, or strings such as \texttt{"Hello "}, are pushed onto the top of the stack when typed. The most recently pushed value is said to be \emph{at the top} of the stack. When a word is executed, it takes -input parameters from the top of the -stack. Results are then pushed back on the stack. By convention, words remove input parameters from the stack. - -Recall our \texttt{friend} definition from the previous section. In this definition, the \texttt{ask-name} word passes a piece of data to the \texttt{greet} word: - -\begin{alltt} -: friend ask-name greet ; -\end{alltt} - -The first thing done by \texttt{friend} is calling \texttt{ask-name}, which was defined as follows: - -\begin{alltt} -: ask-name "What is your name? " write read-line ; -\end{alltt} - -Read this definition from left to right, and visualize the data flow. First, the string \texttt{"What is your name?~"} is pushed on the stack. The \texttt{write} word is called; it removes the string from the stack and writes it, without returning any values. Next, the \texttt{read-line} word is called. It waits for a line of input from the user, then pushes the entered string on the stack. - -After \texttt{ask-name}, the \texttt{friend} word calls \texttt{greet}, which was defined as follows: - -\begin{alltt} -: greet "Greetings, " write print ; -\end{alltt} - -This word pushes the string \texttt{"Greetings, "} and calls \texttt{write}, which writes this string. Next, \texttt{print} is called. 
Recall that the \texttt{read-line} call inside \texttt{ask-name} left the user's input on the stack; well, it is still there, and \texttt{print} prints it. In case you haven't already guessed, the difference between \texttt{write} and \texttt{print} is that the latter outputs a terminating new line. - -How did we know that \texttt{write} and \texttt{print} take one value from the stack each, or that \texttt{read-line} leaves one value on the stack? The answer is that you don't always know; however, you can use \texttt{see} to look up the \emph{stack effect comment} of any library word: - -\begin{alltt} -\textbf{ok} \ttbackslash print see -\textbf{IN: stdio -: print ( string -{}- ) - "stdio" get fprint ;} -\end{alltt} - -You can see that the stack effect of \texttt{print} is \texttt{( string -{}- )}. This is a mnemonic indicating that this word pops a string from the stack, and pushes no values back on the stack. As you can verify using \texttt{see}, the stack effect of \texttt{read-line} is \texttt{( -{}- string )}. - -All words you write should have a stack effect comment. So our \texttt{friend} example should have been written as follows: - -\begin{verbatim} -: ask-name ( -- name ) "What is your name? " write read-line ; -: greet ( name -- ) "Greetings, " write print ; -: friend ( -- ) ask-name greet ; -\end{verbatim} - -The contents of the stack can be printed using the \texttt{.s} word: - -\begin{alltt} -\textbf{ok} 1 2 3 .s -\textbf{1 -2 -3} -\end{alltt} - -The \texttt{.} (full-stop) word removes the top stack element, and prints it. Unlike \texttt{print}, which will only print a string, \texttt{.} can print any object. - -\begin{alltt} -\textbf{ok} "Hi" . -\textbf{"Hi"} -\end{alltt} - -You might notice that \texttt{.} surrounds strings with quotes, while \texttt{print} prints the string without any kind of decoration. This is because \texttt{.} is a special word that outputs objects in a form \emph{that can be parsed back in}. 
This is a fundamental feature of Factor -- data is code, and code is data. We will learn some very deep consequences of this throughout this guide. - -If the stack is empty, calling \texttt{.} will raise an error. This is in general true for any word called with insufficient parameters, or parameters of the wrong type. - -The \texttt{clear} word removes all elements from the stack. - -\sidebar{ -In Factor, the stack takes the place of local variables found in other languages. Factor still supports variables, but they are usually used for different purposes. Like most other languages, Factor heap-allocates objects, and passes object references by value. However, we need not worry about this until mutable state is introduced. -} - -\section*{Review} - -Let's review the words we've seen until now, and their stack effects. The ``vocab'' column will be described later, and only comes into play when we start working with source files. Basically, Factor's words are partitioned into vocabularies rather than being in one flat list. 
- -\wordtable{ -\tabvocab{syntax} -\texttt{:~\emph{word}}& -\texttt{( -{}- )}& -Begin a word definition named \texttt{\emph{word}}.\\ -\texttt{;}& -\texttt{( -{}- )}& -End a word definition.\\ -\texttt{\ttbackslash\ \emph{word}}& -\texttt{( -{}- )}& -Push \texttt{\emph{word}} on the stack, instead of executing it.\\ -\tabvocab{stdio} -\texttt{print}& -\texttt{( string -{}- )}& -Write a string to the console, with a new line.\\ -\texttt{write}& -\texttt{( string -{}- )}& -Write a string to the console, without a new line.\\ -\texttt{read-line}& -\texttt{( -{}- string )}& -Read a line of input from the console.\\ -\tabvocab{prettyprint} -\texttt{.s}& -\texttt{( -{}- )}& -Print stack contents.\\ -\texttt{.}& -\texttt{( value -{}- )}& -Print value at top of stack in parsable form.\\ -\tabvocab{stack} -\texttt{clear}& -\texttt{( ...~-{}- )}& -Clear stack contents.\\ -} - -From now on, each section will end with a summary of words and stack effects described in that section. - -\section{Arithmetic} - -\chapkeywords{+ - {*} / neg pred succ} -\index{\texttt{+}} -\index{\texttt{-}} -\index{\texttt{*}} -\index{\texttt{/}} -\index{\texttt{neg}} -\index{\texttt{pred}} -\index{\texttt{succ}} - -The usual arithmetic operators \texttt{+ - {*} /} all take two parameters -from the stack, and push one result back. Where the order of operands -matters (\texttt{-} and \texttt{/}), the operands are taken in the natural order. For example: - -\begin{alltt} -\textbf{ok} 10 17 + . -\textbf{27} -\textbf{ok} 111 234 - . -\textbf{-123} -\textbf{ok} 333 3 / . -\textbf{111} -\end{alltt} - -The \texttt{neg} word negates the number at the top of the stack (that is, multiplies it by -1). - -This type of arithmetic is called \emph{postfix}, because the operator -follows the operands. Contrast this with \emph{infix} notation used -in many other languages, so-called because the operator is in-between -the two operands. 
- -More complicated infix expressions can be translated into postfix -by translating the inner-most parts first. Grouping parentheses are -never necessary. - -Here is the postfix translation of $(2 + 3) \times 6$: - -\begin{alltt} -\textbf{ok} 2 3 + 6 {*} . -\textbf{30} -\end{alltt} - -Here is the postfix translation of $2 + (3 \times 6)$: - -\begin{alltt} -\textbf{ok} 2 3 6 {*} + . -\textbf{20} -\end{alltt} - -As a simple example demonstrating postfix arithmetic, consider a word, presumably for an aircraft navigation system, that takes the flight time, the aircraft -velocity, and the tailwind velocity, and returns the distance travelled. -If the parameters are given on the stack in that order, all we do -is add the top two elements (aircraft velocity, tailwind velocity) -and multiply the sum by the element underneath (flight time). So the definition -looks like this: - -\begin{alltt} -\textbf{ok} : distance ( time aircraft tailwind -{}- distance ) + {*} ; -\textbf{ok} 2 900 36 distance . -\textbf{1872} -\end{alltt} - -Note that we are not using any distance or time units here. To extend this example to work with units, we make the assumption that for the purposes of computation, all distances are -in meters, and all time intervals are in seconds. Then, we can define words -for converting from kilometers to meters, and hours and minutes to -seconds: - -\begin{alltt} -\textbf{ok} : kilometers 1000 {*} ; -\textbf{ok} : minutes 60 {*} ; -\textbf{ok} : hours 60 {*} 60 {*} ; -2 kilometers . -\emph{2000} -10 minutes . -\emph{600} -2 hours . -\emph{7200} -\end{alltt} - -The implementation of \texttt{km/hour} is a bit more complex. We would like it to convert kilometers per hour to meters per second. To get the desired result, we first have to convert the kilometers to meters, then divide this by the number of seconds in one hour. - -\begin{alltt} -\textbf{ok} : km/hour kilometers 1 hours / ; -\textbf{ok} 2 hours 900 km/hour 36 km/hour distance . 
-\textbf{1872000} -\end{alltt} - -\section*{Review} - -\wordtable{ -\tabvocab{math} -\texttt{+}& -\texttt{( x y -{}- z )}& -Add \texttt{x} and \texttt{y}.\\ -\texttt{-}& -\texttt{( x y -{}- z )}& -Subtract \texttt{y} from \texttt{x}.\\ -\texttt{*}& -\texttt{( x y -{}- z )}& -Multiply \texttt{x} and \texttt{y}.\\ -\texttt{/}& -\texttt{( x y -{}- z )}& -Divide \texttt{x} by \texttt{y}.\\ -\texttt{pred}& -\texttt{( x -{}- x-1 )}& -Push the predecessor of \texttt{x}.\\ -\texttt{succ}& -\texttt{( x -{}- x+1 )}& -Push the successor of \texttt{x}.\\ -\texttt{neg}& -\texttt{( x -{}- -x )}& -Negate \texttt{x}.\\ -} - -\section{Shuffle words} - -\chapkeywords{drop dup swap over nip dupd rot -rot tuck pick 2drop 2dup} -\index{\texttt{drop}} -\index{\texttt{dup}} -\index{\texttt{swap}} -\index{\texttt{over}} -\index{\texttt{nip}} -\index{\texttt{dupd}} -\index{\texttt{rot}} -\index{\texttt{-rot}} -\index{\texttt{tuck}} -\index{\texttt{pick}} -\index{\texttt{2drop}} -\index{\texttt{2dup}} - -Let's try writing a word to compute the cube of a number. -Three numbers on the stack can be multiplied together using \texttt{{*} -{*}}: - -\begin{alltt} -2 4 8 {*} {*} . -\emph{64} -\end{alltt} - -However, the stack effect of \texttt{{*} {*}} is \texttt{( a b c -{}- -a{*}b{*}c )}. We would like to write a word that takes \emph{one} input -only, and raises it to the third power. To achieve this, we need to make two copies of the top stack element, and then execute \texttt{{*} {*}}. As it happens, there is a word \texttt{dup ( x -{}- -x x )} for precisely this purpose. Now, we are able to define the -\texttt{cube} word: - -\begin{alltt} -: cube dup dup {*} {*} ; -10 cube . -\emph{1000} --2 cube . -\emph{-8} -\end{alltt} - -The \texttt{dup} word is just one of several so-called \emph{shuffle words}. Shuffle words are used to solve the problem of composing two words whose stack effects don't quite {}``match up''. 
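-
-To see how \texttt{dup} lines the stack up for \texttt{{*} {*}}, it can help to trace \texttt{cube} one word at a time. The following table is a sketch worked out by hand from the stack effects given above, not listener output; $x$ stands for whatever number is on the stack when \texttt{cube} is called:
-
-\begin{tabular}{|l|l|}
-\hline
-Word:&Stack after:\\
-\hline
-Initial:&$x$\\
-\hline
-\texttt{dup}&$x$ $x$\\
-\hline
-\texttt{dup}&$x$ $x$ $x$\\
-\hline
-\texttt{*}&$x$ $x^2$\\
-\hline
-\texttt{*}&$x^3$\\
-\hline
-\end{tabular}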
- -Let's take a look at the four most-frequently used shuffle words: - -\wordtable{ -\tabvocab{stack} -\texttt{drop}& -\texttt{( x -{}- )}& -Discard the top stack element. Used when -a word returns a value that is not needed.\\ -\texttt{dup}& -\texttt{( x -{}- x x )}& -Duplicate the top stack element. Used -when a value is required as input for more than one word.\\ -\texttt{swap}& -\texttt{( x y -{}- y x )}& -Swap top two stack elements. Used when -a word expects parameters in a different order.\\ -\texttt{over}& -\texttt{( x y -{}- x y x )}& -Bring the second stack element {}``over'' -the top element.\\} - -The remaining shuffle words are not used nearly as often, but are nonetheless handy in certain situations: - -\wordtable{ -\tabvocab{stack} -\texttt{nip}& -\texttt{( x y -{}- y )}& -Remove the second stack element.\\ -\texttt{dupd}& -\texttt{( x y -{}- x x y )}& -Duplicate the second stack element.\\ -\texttt{rot}& -\texttt{( x y z -{}- y z x )}& -Rotate top three stack elements -to the left.\\ -\texttt{-rot}& -\texttt{( x y z -{}- z x y )}& -Rotate top three stack elements -to the right.\\ -\texttt{tuck}& -\texttt{( x y -{}- y x y )}& -``Tuck'' the top stack element under -the second stack element.\\ -\texttt{pick}& -\texttt{( x y z -{}- x y z x )}& -``Pick'' the third stack element.\\ -\texttt{2drop}& -\texttt{( x y -{}- )}& -Discard the top two stack elements.\\ -\texttt{2dup}& -\texttt{( x y -{}- x y x y )}& -Duplicate the top two stack elements. A frequent use for this word is when two values have to be compared using something like \texttt{=} or \texttt{<} before being passed to another word.\\ -} - -You should try all these words out and become familiar with them. Push some numbers on the stack, -execute a shuffle word, and look at how the stack contents were changed using -\texttt{.s}. Compare the stack contents with the stack effects above. - -Try to avoid the complex shuffle words such as \texttt{rot} and \texttt{2dup} as much as possible. 
They make data flow harder to understand. If you find yourself using too many shuffle words, or you're writing -a stack effect comment in the middle of a colon definition to keep track of stack contents, it is -a good sign that the word should probably be factored into two or -more smaller words. - -A good rule of thumb is that each word should take at most a couple of sentences to describe; if it is so complex that you have to write more than that in a documentation comment, the word should be split up. - -Effective factoring is like riding a bicycle -- once you ``get it'', it becomes second nature. - -\section{Working with the stack} - -\chapkeywords{sq sqrt} -\index{\texttt{sq}} -\index{\texttt{sqrt}} - -In this section, we will work through the construction of a word for solving quadratic equations, that is, finding values of $x$ that satisfy $ax^2+bx+c=0$, where $a$, $b$ and $c$ are given to us. If you don't like math, take comfort in the fact that this is the last mathematical example for a while! - -First, note that \texttt{sq} multiplies the top of the stack by itself, and \texttt{sqrt} takes the square root of a number: - -\begin{alltt} -\textbf{ok} 5 sq . -\textbf{25} -\textbf{ok} 2 sqrt . -\textbf{1.414213562373095} -\end{alltt} - -The mathematical formula that gives a value of $x$ for known $a$, $b$ and $c$ might be familiar to you: - -$$x=\frac{-b}{2a}\pm\sqrt{\frac{b^2-4ac}{4a^2}}$$ - -We will compute the term to the left of the $\pm$ sign first. 
The word to compute it will be named \texttt{quadratic-e}, and it will take the values $a$ and $b$ on the stack: - -\begin{verbatim} -: quadratic-e ( b a -- -b/2a ) - 2 * / neg ; -\end{verbatim} - -Now, let's code the term to the right of the $\pm$ sign: - -\begin{verbatim} -: quadratic-d ( a b c -- d ) - pick 4 * * swap sq swap - swap sq 4 * / sqrt ; -\end{verbatim} - -To understand how \texttt{quadratic-d} works, consider the stack after each step: - -\begin{tabular}{|l|l|} -\hline -Word:&Stack after:\\ -\hline -Initial:&$a$ $b$ $c$\\ -\hline -\texttt{pick}&$a$ $b$ $c$ $a$\\ -\hline -\texttt{4}&$a$ $b$ $c$ $a$ $4$\\ -\hline -\texttt{*}&$a$ $b$ $c$ $4a$\\ -\hline -\texttt{*}&$a$ $b$ $4ac$\\ -\hline -\texttt{swap}&$a$ $4ac$ $b$\\ -\hline -\texttt{sq}&$a$ $4ac$ $b^2$\\ -\hline -\texttt{swap}&$a$ $b^2$ $4ac$\\ -\hline -\texttt{-}&$a$ $b^2-4ac$\\ -\hline -\texttt{swap}&$b^2-4ac$ $a$\\ -\hline -\texttt{sq}&$b^2-4ac$ $a^2$\\ -\hline -\texttt{4}&$b^2-4ac$ $a^2$ $4$\\ -\hline -\texttt{*}&$b^2-4ac$ $4a^2$\\ -\hline -\texttt{/}&$\frac{b^2-4ac}{4a^2}$\\ -\hline -\texttt{sqrt}&$\sqrt{\frac{b^2-4ac}{4a^2}}$\\ -\hline -\end{tabular} - -Now, we need a word that takes the values computed by \texttt{quadratic-e} and \texttt{quadratic-d}, and returns two values, one being the sum, the other being the difference. This is the $\pm$ part of the formula: - -\begin{verbatim} -: quadratic-roots ( e d -- alpha beta ) - 2dup + -rot - ; -\end{verbatim} - -You should be able to work through the stack flow of the above word in your head. Test it with a few inputs. 
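-
-If you would like to check your answer, the stack flow can be tabulated in the same way as for \texttt{quadratic-d}. This trace is a hand-worked sketch based on the stack effects of \texttt{2dup} and \texttt{-rot} listed in the previous section, with $e$ and $d$ standing for the two inputs:
-
-\begin{tabular}{|l|l|}
-\hline
-Word:&Stack after:\\
-\hline
-Initial:&$e$ $d$\\
-\hline
-\texttt{2dup}&$e$ $d$ $e$ $d$\\
-\hline
-\texttt{+}&$e$ $d$ $e+d$\\
-\hline
-\texttt{-rot}&$e+d$ $e$ $d$\\
-\hline
-\texttt{-}&$e+d$ $e-d$\\
-\hline
-\end{tabular}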
- -Finally, we can put these three words together into a complete program: - -\begin{verbatim} -: quadratic ( a b c -- alpha beta ) - 3dup quadratic-d - nip swap rot quadratic-e - swap quadratic-roots ; -\end{verbatim} - -Again, let's look at the stack after each step of the execution of \texttt{quadratic}: - -\begin{tabular}{|l|l|} -\hline -Word:&Stack after:\\ -\hline -Initial:&$a$ $b$ $c$\\ -\hline -\texttt{3dup}&$a$ $b$ $c$ $a$ $b$ $c$\\ -\hline -\texttt{quadratic-d}&$a$ $b$ $c$ $\sqrt{\frac{b^2-4ac}{4a^2}}$\\ -\hline -\texttt{nip}&$a$ $b$ $\sqrt{\frac{b^2-4ac}{4a^2}}$\\ -\hline -\texttt{swap}&$a$ $\sqrt{\frac{b^2-4ac}{4a^2}}$ $b$ \\ -\hline -\texttt{rot}&$\sqrt{\frac{b^2-4ac}{4a^2}}$ $b$ $a$ \\ -\hline -\texttt{quadratic-e}&$\sqrt{\frac{b^2-4ac}{4a^2}}$ $\frac{-b}{2a}$\\ -\hline -\texttt{swap}&$\frac{-b}{2a}$ $\sqrt{\frac{b^2-4ac}{4a^2}}$\\ -\hline -\texttt{quadratic-roots}&$\frac{-b}{2a}+\sqrt{\frac{b^2-4ac}{4a^2}}$ $\frac{-b}{2a}-\sqrt{\frac{b^2-4ac}{4a^2}}$\\ -\hline -\end{tabular} - -You can test \texttt{quadratic} with a handful of inputs: - -\begin{alltt} -\textbf{ok} 1 2 1 quadratic . . -\textbf{-1.0} -\textbf{-1.0} -\textbf{ok} 1 -5 4 quadratic . . -\textbf{1.0} -\textbf{4.0} -\textbf{ok} 1 0 1 quadratic . . -\textbf{#\{ 0 -1.0 \} -#\{ 0 1.0 \}} -\end{alltt} - -The last example shows that Factor can handle complex numbers perfectly well. We will have more to say about complex numbers later. - -\section*{Review} - -\wordtable{ -\tabvocab{math} -\texttt{sq}& -\texttt{( x -{}- x*x )}& -Square of a number.\\ -\texttt{sqrt}& -\texttt{( x -{}- sqrt[x] )}& -Square root of a number.\\ -} - -\section{Source files} - -\chapkeywords{run-file apropos.~USE: IN:} -\index{\texttt{run-file}} -\index{\texttt{apropos.}} -\index{\texttt{IN:}} -\index{\texttt{USE:}} - -Entering colon definitions at the listener is very convenient for quick testing, but for serious work you should save your work in source files with the \texttt{.factor} filename extension. 
Any text editor will do, but if you use jEdit\footnote{\texttt{http://www.jedit.org}}, you can take advantage of the powerful integration features found in the Factor plugin. Consult the plugin documentation for details. - -Let's put our program for solving quadratic equations in a source file. Create a file named \texttt{quadratic.factor} in your favorite editor, and add the following content: - -\begin{verbatim} -: quadratic-e ( b a -- -b/2a ) - 2 * / neg ; - -: quadratic-d ( a b c -- d ) - pick 4 * * swap sq swap - swap sq 4 * / sqrt ; - -: quadratic-roots ( e d -- alpha beta ) - 2dup + -rot - ; - -: quadratic ( a b c -- alpha beta ) - 3dup quadratic-d - nip swap rot quadratic-e - swap quadratic-roots ; -\end{verbatim} - -Now, load the source file in the Factor interpreter using the \texttt{run-file} word: - -\begin{alltt} -\textbf{ok} "quadratic.factor" run-file -\textbf{/home/slava/quadratic.factor:2: Not a number - 2 * / neg ; - ^ -:s :r :n :c show stacks at time of error. -:get ( var -- value ) inspects the error namestack.} -\end{alltt} - -Oops! What happened? It looks like it is not recognizing the \texttt{*} word, which works fine in the listener! The problem is that while most words in the library are available for use at the listener, source files must explicitly declare which \emph{vocabularies} they make use of. To find out which vocabulary holds the \texttt{*} word, use \texttt{apropos.}: - -\begin{alltt} -\textbf{ok} "*" apropos. -\emph{...} -\textbf{IN: math -*} -\emph{...} -\end{alltt} - -The \texttt{apropos.}~word searches for words whose name contains a given string. As you can see, there are a number of words whose name contains \texttt{*}, but the one we are looking for is \texttt{*} itself, in the \texttt{math} vocabulary. 
To make use of the \texttt{math} vocabulary, simply add the following \emph{vocabulary use declaration} at the beginning of the \texttt{quadratic.factor} source file: - -\begin{verbatim} -USE: math -\end{verbatim} - -Now, try loading the file again. This time, an error will be displayed because the \texttt{pick} word cannot be found. Use \texttt{apropos.} to confirm that \texttt{pick} is in the \texttt{stack} vocabulary, and add the appropriate declaration at the start of the source file. Then, the source file should load without any errors. - -By default, words you define go in the \texttt{scratchpad} vocabulary. To change this, add a declaration to the start of the source file: - -\texttt{IN: quadratic} - -Now, to use the words defined within, you must issue the following command in the listener first: - -\begin{alltt} -\textbf{ok} USE: quadratic -\end{alltt} - -\sidebar{If you are using jEdit, you can use the \textbf{Plugins}>\textbf{Factor}>\textbf{Use word at caret} command to insert a \texttt{USE:} declaration for the word at the caret.} - -\section*{Review} - -\wordtable{ -\tabvocab{parser} -\texttt{run-file}& -\texttt{( string -{}- )}& -Load a source file with the given name.\\ -\tabvocab{syntax} -\texttt{USE: \emph{vocab}}& -\texttt{( -{}- )}& -Add a vocabulary to the search path.\\ -\texttt{IN: \emph{vocab}}& -\texttt{( -{}- )}& -Set vocabulary for new word definitions.\\ -\tabvocab{words} -\texttt{apropos.}& -\texttt{( string -{}- )}& -List all words whose name contains a given string, and the vocabularies they are found in.\\ -} - -\section{Exploring the library} - -\chapkeywords{apropos.~see ~vocabs.~words.~httpd} - -We already saw two ways to explore the Factor library in previous sections: the \texttt{see} word, which shows a word definition, and \texttt{apropos.}, which helps locate a word and its vocabulary if we know part of its name. 
- -Entering \texttt{vocabs.}~in the listener produces a list of all existing vocabularies: - -\begin{alltt} -\textbf{ok} vocabs. -\textbf{[ "alien" "ansi" "combinators" "compiler" "continuations" -"errors" "file-responder" "files" "format" "hashtables" -"html" "httpd" "httpd-responder" "image" "inference" "init" -"inspect-responder" "inspector" "interpreter" "io-internals" -"jedit" "kernel" "listener" "lists" "logging" "logic" "math" -"namespaces" "parser" "presentation" "prettyprint" -"processes" "profiler" "quit-responder" "random" "real-math" -"resource-responder" "scratchpad" "sdl" "sdl-event" -"sdl-gfx" "sdl-keysym" "sdl-video" "stack" "stdio" "streams" -"strings" "syntax" "telnetd" "test" "test-responder" -"threads" "unparser" "url-encoding" "vectors" "words" ] -} -\end{alltt} - -As you can see, there are a lot of vocabularies! Now, you can use \texttt{words.}~to list the words inside a given vocabulary: - -\begin{alltt} -\textbf{ok} "lists" words. -\textbf{[ (cons-hashcode) (count) (each) (partition) (top) , 2car -2cdr 2cons 2list 2swons 3list =-or-contains? acons acons@ -all=? all? append assoc assoc* assoc-apply assoc? car cdr -cons cons-hashcode cons= cons? cons@ contains? count each -last last* length list>vector list? make-list make-rlist map -maximize nth num-sort partition partition-add partition-step -prune remove remove-assoc remove@ reverse set-assoc sort -stack>list subset swons tail top tree-contains? uncons -uncons@ unique unique, unique@ unit unswons unzip -vector>list ]} -\end{alltt} - -Any word definition can then be shown using \texttt{see}, but you might have to \texttt{USE:} the vocabulary first. - -\begin{alltt} -\textbf{ok} USE: presentation -\textbf{ok} \ttbackslash set-style see -\textbf{IN: presentation -: set-style ( style name -- ) - "styles" get set-hash ;} -\end{alltt} - -A more sophisticated way to browse the library is using the integrated HTTP server. 
You can start the HTTP server using the following pair of commands: - -\begin{alltt} -\textbf{ok} USE: httpd -\textbf{ok} 8888 httpd -\end{alltt} - -Then, point your browser to the following URL, and start browsing: - -\begin{quote} -\texttt{http://localhost:8888/responder/inspect/vocabularies} -\end{quote} - -To stop the HTTP server, point your browser to - -\begin{quote} -\texttt{http://localhost:8888/responder/quit}. -\end{quote} - -You can even start the HTTP server in a separate thread, using the following commands: - -\begin{alltt} -\textbf{ok} USE: httpd -\textbf{ok} USE: threads -\textbf{ok} [ 8888 httpd ] in-thread -\end{alltt} - -This way, you can browse code and play in the listener at the same time. - -\section*{Review} - -\wordtable{ -\tabvocab{words} -\texttt{apropos.}& -\texttt{( string -{}- )}& -Search for words whose name contains a string.\\ -\texttt{vocabs.}& -\texttt{( -{}- )}& -List all vocabularies.\\ -\texttt{words.}& -\texttt{( string -{}- )}& -List all words in a given vocabulary.\\ -\tabvocab{httpd} -\texttt{httpd}& -\texttt{( port -{}- )}& -Start an HTTP server on the given port.\\ -} - -\chapter{Working with data} - -This chapter will introduce the fundamental data type used in Factor -- the list. -This will lead us to a discussion of debugging, conditional execution, recursion, and higher-order words, as well as a peek inside the interpreter, and an introduction to unit testing. If you have used another programming language, you will no doubt find using lists for all data limiting; other data types, such as vectors and hashtables, are introduced in later sections. However, a good understanding of lists is fundamental since they are used more than any other data type. - -\section{Lists and cons cells} - -\chapkeywords{cons car cdr uncons unit} -\index{\texttt{cons}} -\index{\texttt{car}} -\index{\texttt{cdr}} -\index{\texttt{uncons}} -\index{\texttt{unit}} - -A \emph{cons cell} is an ordered pair of values. 
The first value is called the \emph{car}, -the second is called the \emph{cdr}. - -The \texttt{cons} word takes two values from the stack, and pushes a cons cell containing both values: - -\begin{alltt} -\textbf{ok} "fish" "chips" cons . -\textbf{{[} "fish" | "chips" {]}} -\end{alltt} - -Recall that \texttt{.}~prints objects in a form that can be parsed back in. This suggests that there is a literal syntax for cons cells. The difference between the literal syntax and calling \texttt{cons} is two-fold; \texttt{cons} can be used to make a cons cell whose components are computed values, and not literals, and \texttt{cons} allocates a new cons cell each time it is called, whereas a literal is allocated once. - -The \texttt{car} and \texttt{cdr} words push the constituents of a cons cell at the top of the stack. - -\begin{alltt} -\textbf{ok} 5 "blind mice" cons car . -\textbf{5} -\textbf{ok} "peanut butter" "jelly" cons cdr . -\textbf{"jelly"} -\end{alltt} - -The \texttt{uncons} word pushes both the car and cdr of the cons cell at once: - -\begin{alltt} -\textbf{ok} {[} "potatoes" | "gravy" {]} uncons .s -\textbf{\{ "potatoes" "gravy" \}} -\end{alltt} - -If you think back for a second to the previous section on shuffle words, you might be able to guess how \texttt{uncons} is implemented in terms of \texttt{car} and \texttt{cdr}: - -\begin{alltt} -: uncons ( {[} car | cdr {]} -- car cdr ) dup car swap cdr ; -\end{alltt} - -The cons cell is duplicated, its car is taken, the car is swapped with the cons cell, putting the cons cell at the top of the stack, and finally, the cdr of the cons cell is taken. - -Lists of values are represented with nested cons cells. The car is the first element of the list; the cdr is the rest of the list. The following example demonstrates this, along with the literal syntax used to print lists: - -\begin{alltt} -\textbf{ok} {[} 1 2 3 4 {]} car . -\textbf{1} -\textbf{ok} {[} 1 2 3 4 {]} cdr . 
-\textbf{{[} 2 3 4 {]}} -\textbf{ok} {[} 1 2 3 4 {]} cdr cdr . -\textbf{{[} 3 4 {]}} -\end{alltt} - -Note that taking the cdr of a list of three elements gives a list of two elements; taking the cdr of a list of two elements returns a list of one element. So what is the cdr of a list of one element? It is the special value \texttt{f}: - -\begin{alltt} -\textbf{ok} {[} "snowpeas" {]} cdr . -\textbf{f} -\end{alltt} - -The special value \texttt{f} represents the empty list. Hence you can see how cons cells and \texttt{f} allow lists of values to be constructed. If you are used to other languages, this might seem counter-intuitive at first, however the utility of this construction will come into play when we consider recursive words later in this chapter. - -One thing worth mentioning is the concept of an \emph{improper list}. An improper list is one where the cdr of the last cons cell is not \texttt{f}. Improper lists are input with the following syntax: - -\begin{verbatim} -[ 1 2 3 | 4 ] -\end{verbatim} - -Improper lists are rarely used, but it helps to be aware of their existence. Lists that are not improper lists are sometimes called \emph{proper lists}. 
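-
-A short listener sketch of the difference, assuming improper lists are printed in the same \texttt{|} notation used to enter them (as the cons cell examples earlier in this section suggest). Taking successive cdrs of the improper list above ends at \texttt{4}, not at \texttt{f}:
-
-\begin{alltt}
-\textbf{ok} {[} 1 2 3 | 4 {]} cdr cdr .
-\textbf{{[} 3 | 4 {]}}
-\textbf{ok} {[} 1 2 3 | 4 {]} cdr cdr cdr .
-\textbf{4}
-\end{alltt}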
- -Let's review the words we've seen in this section: - -\wordtable{ -\tabvocab{syntax} -\texttt{{[}}& -\texttt{( -{}- )}& -Begin a literal list.\\ -\texttt{{]}}& -\texttt{( -{}- )}& -End a literal list.\\ -\tabvocab{lists} -\texttt{cons}& -\texttt{( car cdr -{}- {[} car | cdr {]} )}& -Construct a cons cell.\\ -\texttt{car}& -\texttt{( {[} car | cdr {]} -{}- car )}& -Push the first element of a list.\\ -\texttt{cdr}& -\texttt{( {[} car | cdr {]} -{}- cdr )}& -Push the rest of the list without the first element.\\ -\texttt{uncons}& -\texttt{( {[} car | cdr {]} -{}- car cdr )}& -Push both components of a cons cell.\\} - -\section{Operations on lists} - -\chapkeywords{unit append reverse contains?} -\index{\texttt{unit}} -\index{\texttt{append}} -\index{\texttt{reverse}} -\index{\texttt{contains?}} - -The \texttt{lists} vocabulary defines a large number of words for working with lists. The most important ones will be covered in this section. Later sections will cover other words, usually with a discussion of their implementation. - -The \texttt{unit} word makes a list of one element. It does this by consing the top of the stack with the empty list \texttt{f}: - -\begin{alltt} -: unit ( a -- {[} a {]} ) f cons ; -\end{alltt} - -The \texttt{append} word appends two lists at the top of the stack: - -\begin{alltt} -\textbf{ok} {[} "bacon" "eggs" {]} {[} "pancakes" "syrup" {]} append . -\textbf{{[} "bacon" "eggs" "pancakes" "syrup" {]}} -\end{alltt} - -Interestingly, only the first parameter has to be a list. If the second parameter -is an improper list, or not a list at all, \texttt{append} returns an improper list: - -\begin{alltt} -\textbf{ok} {[} "molasses" "treacle" {]} "caramel" append . -\textbf{{[} "molasses" "treacle" | "caramel" {]}} -\end{alltt} - -The \texttt{reverse} word creates a new list whose elements are in reverse order of the given list: - -\begin{alltt} -\textbf{ok} {[} "steak" "chips" "gravy" {]} reverse .
-\textbf{{[} "gravy" "chips" "steak" {]}} -\end{alltt} - -The \texttt{contains?}~word checks if a list contains a given element. The element comes first on the stack, underneath the list: - -\begin{alltt} -\textbf{ok} : desert? {[} "ice cream" "cake" "cocoa" {]} contains? ; -\textbf{ok} "stew" desert? . -\textbf{f} -\textbf{ok} "cake" desert? . -\textbf{{[} "cake" "cocoa" {]}} -\end{alltt} - -What exactly is going on in the mouth-watering example above? The \texttt{contains?}~word returns \texttt{f} if the element does not occur in the list, and returns \emph{the remainder of the list} if the element occurs in the list. The significance of this will become apparent in the next section -- \texttt{f} represents boolean falsity in a conditional statement. - -Note that we glossed over the concept of object equality here -- you might wonder how \texttt{contains?}~decides if the element ``occurs'' in the list or not. It turns out it uses the \texttt{=} word, which checks if two objects have the same structure. Another way to check for equality is using \texttt{eq?}, which checks if two references refer to the same object. Both of these words will be covered in detail later. - -Let's review the words we saw in this section: - -\wordtable{ -\tabvocab{lists} -\texttt{unit}& -\texttt{( a -{}- {[} a {]} )}& -Make a list of one element.\\ -\texttt{append}& -\texttt{( list list -{}- list )}& -Append two lists.\\ -\texttt{reverse}& -\texttt{( list -{}- list )}& -Reverse a list.\\ -\texttt{contains?}& -\texttt{( value list -{}- ?~)}& -Determine if a list contains a value.\\} - -\section{Conditional execution} - -\chapkeywords{f t ifte when unless unique abs infer} -\index{\texttt{f}} -\index{\texttt{t}} -\index{\texttt{ifte}} -\index{\texttt{when}} -\index{\texttt{unless}} -\index{\texttt{unique}} -\index{\texttt{abs}} -\index{\texttt{infer}} - -Until now, all code examples we've considered have been linear in nature; each word is executed in turn, left to right.
To perform useful computations, we need the ability to execute different code depending on circumstances. - -The simplest style of a conditional form in Factor is the following: - -\begin{alltt} -\emph{condition} {[} - \emph{to execute if true} -{] [} - \emph{to execute if false} -{]} ifte -\end{alltt} - -The \texttt{ifte} word is a \emph{combinator}, because it executes lists representing code, or \emph{quotations}, given on the stack. - -The condition should be some piece of code that leaves a truth value on the stack. What is a truth value? In Factor, there is no special boolean data type; instead, the special value \texttt{f} we've already seen to represent empty lists also represents falsity. Every other object represents boolean truth. In cases where a truth value must be explicitly produced, the value \texttt{t} can be used. The \texttt{ifte} word removes the condition from the stack, and executes one of the two branches depending on the truth value of the condition. - -A good first example to look at is the \texttt{unique} word. This word conses a value onto a list as long as the value does not already occur in the list. Otherwise, the original list is returned. You can guess that this word somehow involves \texttt{contains?}~and \texttt{cons}. Check out its implementation using \texttt{see} and your intuition will be confirmed: - -\begin{alltt} -: unique ( elem list -- list ) - 2dup contains? [ nip ] [ cons ] ifte ; -\end{alltt} - -The first thing the word does is duplicate the two values given as input, since \texttt{contains?}~consumes its inputs. If the value does occur in the list, \texttt{contains?}~returns the remainder of the list starting from the first occurrence; in other words, a truth value. This calls \texttt{nip}, which removes the value from the stack, leaving the original list at the top of the stack.
On the other hand, if the value does not occur in the list, \texttt{contains?}~returns \texttt{f}, which causes the other branch of the conditional to execute. This branch calls \texttt{cons}, which adds the value at the beginning of the list. - -Another frequently-used combinator is \texttt{when}. This combinator is a variation of \texttt{ifte}, except only one quotation is given. If the condition is true, the quotation is executed. Nothing is done if the condition is false. In fact \texttt{when} is implemented in terms of \texttt{ifte}. Since \texttt{when} is called with a quotation on the stack, it suffices to push an empty list, and call \texttt{ifte} -- the given quotation is the true branch, and the empty quotation is the false branch: - -\begin{verbatim} -: when ( ? quot -- ) - [ ] ifte ; -\end{verbatim} - -An example of a word that uses both \texttt{ifte} and \texttt{when} is the \texttt{abs} word, which computes the absolute value of a number: - -\begin{verbatim} -: abs ( z -- abs ) - dup complex? [ - >rect mag2 - ] [ - dup 0 < [ neg ] when - ] ifte ; -\end{verbatim} - -If the given number is a complex number, its distance from the origin is computed\footnote{Don't worry about the \texttt{>rect} and \texttt{mag2} words at this stage; they will be described later. If you are curious, use \texttt{see} to look at their definitions and read the documentation comments.}. Otherwise, if the parameter is a real number below zero, it is negated. If it is a real number greater than or equal to zero, it is not modified. - -The dual of the \texttt{when} combinator is the \texttt{unless} combinator. It takes a quotation to execute if the condition is false; otherwise nothing is done. In both cases, the condition is popped off the stack. - -The implementation is similar to that of \texttt{when}, but this time, we must swap the two quotations given to \texttt{ifte}, so that the true branch is an empty list, and the false branch is the user's quotation: - -\begin{verbatim} -: unless ( ?
quot -- ) - [ ] swap ifte ; -\end{verbatim} - -A very simple example of \texttt{unless} usage is the \texttt{assert} word: - -\begin{verbatim} -: assert ( t -- ) - [ "Assertion failed!" throw ] unless ; -\end{verbatim} - -This word is used for unit testing -- it raises an error and stops execution if the top of the stack is false. - -\subsection{Stack effects of conditionals} - -It is good style to ensure that both branches of conditionals you write have the same stack effect. This makes words easier to debug. Since it is easy to make a stack flow mistake when working with conditionals, Factor includes a \emph{stack effect inference} tool. It can be used as follows: - -\begin{alltt} -\textbf{ok} [ 2dup contains? [ nip ] [ cons ] ifte ] infer . -\textbf{[ 2 | 1 ]} -\end{alltt} - -The output indicates that the code snippet\footnote{The proper term for a code snippet is a ``quotation''. You will learn about quotations later.} takes two values from the stack, and leaves one. Now let's see what happens if we forget about the \texttt{nip} and try to infer the stack effect: - -\begin{alltt} -\textbf{ok} [ 2dup contains? [ ] [ cons ] ifte ] infer . -\textbf{ERROR: Unbalanced ifte branches -:s :r :n :c show stacks at time of error.} -\end{alltt} - -Now let's look at the stack effect of the \texttt{abs} word. First, verify that each branch has the same stack effect: - -\begin{alltt} -\textbf{ok} [ >rect mag2 ] infer . -\textbf{[ 1 | 1 ]} -\textbf{ok} [ dup 0 < [ neg ] when ] infer . -\textbf{[ 1 | 1 ]} -\end{alltt} - -Since the branches are balanced, the stack effect of the entire conditional expression can be computed: - -\begin{alltt} -\textbf{ok} [ abs ] infer .
-\textbf{[ 1 | 1 ]} -\end{alltt} - -\subsection*{Review} - -\wordtable{ -\tabvocab{syntax} -\texttt{f}& -\texttt{( -{}- f )}& -Empty list, and boolean falsity.\\ -\texttt{t}& -\texttt{( -{}- t )}& -Canonical truth value.\\ -\tabvocab{combinators} -\texttt{ifte}& -\texttt{( ?~true false -{}- )}& -Execute either \texttt{true} or \texttt{false} depending on the boolean value of the conditional.\\ -\texttt{when}& -\texttt{( ?~quot -{}- )}& -Execute quotation if the condition is true, otherwise do nothing but pop the condition and quotation off the stack.\\ -\texttt{unless}& -\texttt{( ?~quot -{}- )}& -Execute quotation if the condition is false, otherwise do nothing but pop the condition and quotation off the stack.\\ -\tabvocab{lists} -\texttt{unique}& -\texttt{( elem list -{}- list )}& -Prepend an element to a list if it does not occur in the -list.\\ -\tabvocab{math} -\texttt{abs}& -\texttt{( z -- abs )}& -Compute the complex absolute value.\\ -\tabvocab{test} -\texttt{assert}& -\texttt{( t -- )}& -Raise an error if the top of the stack is \texttt{f}.\\ -\tabvocab{inference} -\texttt{infer}& -\texttt{( quot -{}- {[} in | out {]} )}& -Infer stack effect of code, if possible.\\} - -\section{Recursion} - -\chapkeywords{ifte when cons?~last last* list?} -\index{\texttt{ifte}} -\index{\texttt{when}} -\index{\texttt{cons?}} -\index{\texttt{last}} -\index{\texttt{last*}} -\index{\texttt{list?}} - -The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional.
The general form of a recursive word looks as follows: - -\begin{alltt} -: recursive - \emph{condition} {[} - \emph{recursive case} - {] [} - \emph{base case} - {]} ifte ; -\end{alltt} - -Use \texttt{see} to take a look at the \texttt{last} word; it pushes the last element of a list: - -\begin{alltt} -: last ( list -- last ) - last* car ; -\end{alltt} - -As you can see, it makes a call to an auxiliary word \texttt{last*}, and takes the car of the return value. As you can guess by looking at the output of \texttt{see}, \texttt{last*} pushes the last cons cell in a list: - -\begin{verbatim} -: last* ( list -- last ) - dup cdr cons? [ cdr last* ] when ; -\end{verbatim} - -So if the top of stack is a cons cell whose cdr is not a cons cell, the cons cell remains on the stack -- it gets duplicated, its cdr is taken, the \texttt{cons?} predicate tests if it is a cons cell, then \texttt{when} consumes the condition, and takes the empty ``false'' branch. This is the \emph{base case} -- the last cons cell of a one-element list is the list itself. - -If the cdr of the list at the top of the stack is another cons cell, then something magical happens -- \texttt{last*} calls itself again, but this time, with the cdr of the list. The recursive call, in turn, checks if the end of the list has been reached; if not, it makes another recursive call, and so on. - -Let's test the word: - -\begin{alltt} -\textbf{ok} {[} "salad" "pizza" "pasta" "pancakes" {]} last* . -\textbf{{[} "pancakes" {]}} -\textbf{ok} {[} "salad" "pizza" "pasta" | "pancakes" {]} last* . -\textbf{{[} "pasta" | "pancakes" {]}} -\textbf{ok} {[} "nachos" "tacos" "burritos" {]} last . -\textbf{"burritos"} -\end{alltt} - -A naive programmer might think that recursion is an infinite loop.
Two things ensure that a recursion terminates: the existence of a base case, or in other words, a branch of a conditional that does not contain a recursive call; and an \emph{inductive step} in the recursive case, that reduces one of the arguments in some manner, thus ensuring that the computation proceeds one step closer to the base case. - -An inductive step usually consists of taking the cdr of a list (the conditional could check for \texttt{f} or a cons cell whose cdr is not a list, as above), or decrementing a counter (the conditional could compare the counter with zero). Of course, many variations are possible; one could instead increment a counter, pass around its maximum value, and compare the current value with the maximum on each iteration. - -Let's consider a more complicated example. Use \texttt{see} to bring up the definition of the \texttt{list?} word. This word checks that the value on the stack is a proper list, that is, it is either \texttt{f}, or a cons cell whose cdr is a proper list: - -\begin{verbatim} -: list? ( list -- ? ) - dup [ - dup cons? [ cdr list? ] [ drop f ] ifte - ] [ - drop t - ] ifte ; -\end{verbatim} - -This example has two nested conditionals, and in effect, two base cases. The first base case arises when the value is \texttt{f}; in this case, it is dropped from the stack, and \texttt{t} is returned, since \texttt{f} is a proper list of no elements. The second base case arises when a value that is not a cons is given. This is clearly not a proper list. - -The inductive step takes the cdr of the list. So, the bird's-eye view of the word is that it takes the cdr of the list on the stack, until it either reaches \texttt{f}, in which case the list is a proper list, and \texttt{t} is returned; or until it reaches something else, in which case the list is an improper list, and \texttt{f} is returned. - -Let's test the word: - -\begin{alltt} -\textbf{ok} {[} "tofu" "beans" "rice" {]} list? .
-\textbf{t} -\textbf{ok} {[} "sprouts" "carrots" | "lentils" {]} list? . -\textbf{f} -\textbf{ok} f list? . -\textbf{t} -\end{alltt} - -\subsection{Stack effects of recursive words} - -There are a few things worth noting about the stack flow inside a recursive word. The condition must take care to preserve any input parameters needed for the base case and recursive case. The base case must consume all inputs, and leave the final return values on the stack. The recursive case should somehow reduce one of the parameters. This could mean incrementing or decrementing an integer, taking the \texttt{cdr} of a list, and so on. Parameters must eventually reduce to a state where the condition returns \texttt{f}, to avoid an infinite recursion. - -The recursive case should also be coded such that the stack effect of the total definition is the same regardless of how many iterations are performed; words that consume or produce different numbers of parameters depending on circumstances are very hard to debug. - -Let's review the words we saw in this section: - -\wordtable{ -\tabvocab{combinators} -\texttt{when}& -\texttt{( ?~quot -{}- )}& -Execute quotation if the condition is true, otherwise do nothing but pop the condition and quotation off the stack.\\ -\tabvocab{lists} -\texttt{cons?}& -\texttt{( value -{}- ?~)}& -Test if a value is a cons cell.\\ -\texttt{last}& -\texttt{( list -{}- value )}& -Last element of a list.\\ -\texttt{last*}& -\texttt{( list -{}- cons )}& -Last cons cell of a list.\\ -\texttt{list?}& -\texttt{( value -{}- ?~)}& -Test if a value is a proper list.\\ -} - -\section{Debugging} - -\section{The interpreter} - -\chapkeywords{acons >r r>} -\index{\texttt{acons}} -\index{\texttt{>r}} -\index{\texttt{r>}} - -So far, we have seen what we called ``the stack'' store intermediate values between computations. In fact Factor maintains a number of other stacks, and the formal name for the stack we've been dealing with so far is the \emph{data stack}.
- -Another fundamental stack is the \emph{return stack}. It is used to save interpreter state for nested calls. - -You already know that code quotations are just lists. At a low level, each colon definition is also just a quotation. The interpreter consists of a loop that iterates over a quotation, pushing each literal, and executing each word. If the word is a colon definition, the interpreter saves its state on the return stack, executes the definition of that word, then restores the execution state from the return stack and continues. - -The return stack also serves a dual purpose as a temporary storage area. Sometimes, juggling values on the data stack becomes awkward, and in that case \texttt{>r} and \texttt{r>} can be used to move a value from the data stack to the return stack, and vice versa, respectively. - -The words \texttt{>r} and \texttt{r>} ``hide'' the top of the stack between their occurrences. Try the following in the listener: - -\begin{alltt} -\textbf{ok} 1 2 3 .s -\textbf{3 -2 -1} -\textbf{ok} >r .s r> -\textbf{2 -1} -\textbf{ok} .s -\textbf{3 -2 -1} -\end{alltt} - -A simple example can be found in the definition of the \texttt{acons} word: - -\begin{alltt} -\textbf{ok} \ttbackslash acons see -\textbf{: acons ( value key alist -- alist ) - >r swons r> cons ;} -\end{alltt} - -When the word is called, \texttt{swons} is applied to \texttt{value} and \texttt{key} creating a cons cell whose car is \texttt{key} and whose cdr is \texttt{value}, then \texttt{cons} is applied to this new cons cell, and \texttt{alist}. So this word adds the \texttt{key}/\texttt{value} pair to the beginning of the \texttt{alist}. - -Note that usages of \texttt{>r} and \texttt{r>} must be balanced within a single quotation or word definition.
The following examples illustrate the point: - -\begin{verbatim} -: the-good >r 2 + r> * ; -: the-bad >r 2 + ; -: the-ugly r> ; -\end{verbatim} - -Basically, the rule is you must leave the return stack in the same state as you found it so that when the current quotation finishes executing, the interpreter can return to the calling word. - -One exception is that when \texttt{ifte} occurs as the last word in a definition, values may be pushed on the return stack before the condition value is computed, as long as both branches of the \texttt{ifte} pop the values off the return stack before returning. - -Let's review the words we saw in this section: - -\wordtable{ -\tabvocab{lists} -\texttt{acons}& -\texttt{( value key alist -{}- alist )}& -Add a key/value pair to the association list.\\ -\tabvocab{stack} -\texttt{>r}& -\texttt{( obj -{}- r:obj )}& -Move value to return stack.\\ -\texttt{r>}& -\texttt{( r:obj -{}- obj )}& -Move value from return stack.\\ -} - -\section{Association lists} - -An \emph{association list} is a list where every element is a cons. The -car of each cons is a name, the cdr is a value. The literal notation -is suggestive: - -\begin{alltt} -{[} - {[} "Jill" | "CEO" {]} - {[} "Jeff" | "manager" {]} - {[} "James" | "lowly web designer" {]} -{]} -\end{alltt} - -\texttt{assoc? ( obj -{}- ?~)} returns \texttt{t} if the object is -a list whose every element is a cons; otherwise it returns \texttt{f}. - -\texttt{assoc ( key alist -{}- value )} looks for a pair with this -key in the list, and pushes the cdr of the pair. It pushes \texttt{f} if no pair -with this key is present. Note that \texttt{assoc} cannot differentiate between -a key that is not present at all and a key with a value of \texttt{f}. - -\texttt{assoc{*} ( key alist -{}- {[} key | value {]} )} looks for -a pair with this key, and pushes the pair itself. Unlike \texttt{assoc}, -\texttt{assoc{*}} returns different values in the cases of a value -set to \texttt{f}, or an undefined value.
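-
-The distinction can be sketched in the listener using only words introduced so far. This is an illustration under two assumptions not stated above: that \texttt{assoc*} pushes \texttt{f} when no pair matches, and that a pair whose cdr is \texttt{f} prints as a one-element list (since \texttt{f} is the empty list). The association list here is built with \texttt{"Jeff" f cons unit}, giving a single pair whose value is \texttt{f}:
-
-\begin{alltt}
-\textbf{ok} "Jeff" "Jeff" f cons unit assoc .
-\textbf{f}
-\textbf{ok} "Jeff" "Jeff" f cons unit assoc* .
-\textbf{{[} "Jeff" {]}}
-\textbf{ok} "Jane" "Jeff" f cons unit assoc* .
-\textbf{f}
-\end{alltt}
-
-With \texttt{assoc}, the first result looks the same as a missing key would; with \texttt{assoc*}, the present key yields a cons cell (a truth value) and only the missing key yields \texttt{f}.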
- -\texttt{set-assoc ( value key alist -{}- alist )} removes any existing -occurrence of a key from the list, and adds a new pair. This creates -a new list, the original is unaffected. - -\texttt{acons ( value key alist -{}- alist )} is slightly faster -than \texttt{set-assoc} since it simply conses a new pair onto the -list. However, if used repeatedly, the list will grow to contain a -lot of {}``shadowed'' pairs. - -The following pair of word definitions from the \texttt{html} vocabulary demonstrates the usage of association lists. It implements a mapping of special characters to their HTML entity names. Note the usage of \texttt{?}~to return the original character if the association lookup yields \texttt{f}: - -\begin{alltt} -: html-entities ( -- alist ) - {[} - {[} CHAR: < | "\&lt;" {]} - {[} CHAR: > | "\&gt;" {]} - {[} CHAR: \& | "\&amp;" {]} - {[} CHAR: {'} | "\&apos;" {]} - {[} CHAR: {"} | "\&quot;" {]} - {]} ; - -: char>entity ( ch -- str ) - dup >r html-entities assoc dup r> ? ; -\end{alltt} - -Searching association lists incurs a linear time cost, so they should -only be used for small mappings -- a typical use is a mapping of half -a dozen entries or so, specified literally in source. Hashtables offer -better performance with larger mappings. - -\section{Recursive combinators} - -\chapter{Practical: a numbers game} - -In this chapter, basic input/output and flow control are introduced. -We construct a program that repeatedly prompts the user to guess a -number -- they are informed if their guess is correct, too low, or -too high. The game ends on a correct guess. - -\begin{alltt} -numbers-game -\emph{I'm thinking of a number between 0 and 100.} -\emph{Enter your guess:} 25 -\emph{Too low} -\emph{Enter your guess:} 38 -\emph{Too high} -\emph{Enter your guess:} 31 -\emph{Correct - you win!} -\end{alltt} - -\section{Getting started} - -Start a text editor and create a file named \texttt{numbers-game.factor}. - -Write a short comment at the top of the file.
Two examples of commenting style supported by Factor: - -\begin{alltt} -! Numbers game. -( The great numbers game ) -\end{alltt} - -It is always a good idea to comment your code. Try to write simple -code that does not need detailed comments to describe; similarly, -avoid redundant comments. These two principles are hard to quantify -in a concrete way, and will become more clear as your skills with -Factor increase. - -We will be defining new words in the \texttt{numbers-game} vocabulary; add -an \texttt{IN:} statement at the top of the source file: - -\begin{alltt} -IN: numbers-game -\end{alltt} -Also in order to be able to test the words, issue a \texttt{USE:} -statement in the interactive interpreter: - -\begin{alltt} -USE: numbers-game -\end{alltt} -This section will develop the numbers game in an incremental fashion. -After each addition, issue a command like the following to load the -source file into the Factor interpreter: - -\begin{alltt} -"numbers-game.factor" run-file -\end{alltt} - -\section{Reading a number from the keyboard} - -A fundamental operation required for the numbers game is to be able -to read a number from the keyboard. The \texttt{read} word \texttt{( --{}- str )} reads a line of input and pushes it on the stack. -The \texttt{parse-number} word \texttt{( str -{}- n )} turns a decimal -string representation of an integer into the integer itself. These -two words can be combined into a single colon definition: - -\begin{alltt} -: read-number ( -{}- n ) read parse-number ; -\end{alltt} -You should add this definition to the source file, and try loading -the file into the interpreter. As you will soon see, this raises an -error! The problem is that the two words \texttt{read} and \texttt{parse-number} -are not part of the default, minimal, vocabulary search path used -when reading files. 
The solution is to use \texttt{apropos.} to find -out which vocabularies contain those words, and add the appropriate -\texttt{USE:} statements to the source file: - -\begin{alltt} -USE: parser -USE: stdio -\end{alltt} -After adding the above two statements, the file should now parse, -and testing should confirm that the \texttt{read-number} word works correctly.% -\footnote{There is the possibility of an invalid number being entered at the -keyboard. In this case, \texttt{parse-number} returns \texttt{f}, -the boolean false value. For the sake of simplicity, we ignore this -case in the numbers game example. However, proper error handling is -an essential part of any large program and is covered later.% -} - - -\section{Printing some messages} - -Now we need to make some words for printing various messages. They -are given here without further ado: - -\begin{alltt} -: guess-banner - "I'm thinking of a number between 0 and 100." print ; -: guess-prompt "Enter your guess: " write ; -: too-high "Too high" print ; -: too-low "Too low" print ; -: correct "Correct - you win!" print ; -\end{alltt} -Note that in the above, stack effect comments are omitted, since they -are obvious from context. You should ensure the words work correctly -after loading the source file into the interpreter. - - -\section{Taking action based on a guess} - -The next logical step is to write a word \texttt{judge-guess} that -takes the user's guess along with the actual number to be guessed, -and prints one of the messages \texttt{too-high}, \texttt{too-low}, -or \texttt{correct}. This word will also push a boolean flag, indicating -if the game should continue or not -- in the case of a correct guess, -the game does not continue. - -This description of judge-guess is a mouthful -- and it suggests that -it may be best to split it into two words. The first word we write -handles the more specific case of an \emph{inexact} guess -- so it -prints either \texttt{too-low} or \texttt{too-high}. 
- -\begin{alltt} -: inexact-guess ( actual guess -{}- ) - < {[} too-high {]} {[} too-low {]} ifte ; -\end{alltt} -Note that the word gives incorrect output if the two parameters are -equal. However, it will never be called this way. - -With this out of the way, the implementation of \texttt{judge-guess} is an -easy task to tackle. Using the words \texttt{inexact-guess}, \texttt{2dup}, \texttt{2drop} and \texttt{=}, we can write: - -\begin{alltt} -: judge-guess ( actual guess -{}- ? ) - 2dup = {[} - 2drop correct f - {]} {[} - inexact-guess t - {]} ifte ; -\end{alltt} - -The word \texttt{=} is found in the \texttt{kernel} vocabulary, and the words \texttt{2dup} and \texttt{2drop} are found in the \texttt{stack} vocabulary. Since \texttt{=} -consumes both its inputs, we must first duplicate the \texttt{actual} and \texttt{guess} parameters using \texttt{2dup}. The word \texttt{correct} does not need to do anything with these two numbers, so they are popped off the stack using \texttt{2drop}. Try evaluating the following -in the interpreter to see what's going on: - -\begin{alltt} -clear 1 2 2dup = .s -\emph{\{ 1 2 f \}} -clear 4 4 2dup = .s -\emph{\{ 4 4 t \}} -\end{alltt} - -Test \texttt{judge-guess} with a few inputs. Remember that the actual number comes first and the guess second, so a guess of 10 against an actual value of 1 is too high: - -\begin{alltt} -1 10 judge-guess . -\emph{Too high} -\emph{t} -89 43 judge-guess . -\emph{Too low} -\emph{t} -64 64 judge-guess . -\emph{Correct - you win!} -\emph{f} -\end{alltt} - -\section{Generating random numbers} - -The \texttt{random-int} word \texttt{( min max -{}- n )} pushes a -random number in a specified range. The range is inclusive, so both -the minimum and maximum values are candidate random numbers. Use -\texttt{apropos.} to determine that this word is in the \texttt{random} -vocabulary.
For the purposes of this game, random numbers will be -in the range of 0 to 100, so we can define a word that generates a -random number in the range of 0 to 100: - -\begin{alltt} -: number-to-guess ( -{}- n ) 0 100 random-int ; -\end{alltt} -Add the word definition to the source file, along with the appropriate -\texttt{USE:} statement. Load the source file in the interpreter, -and confirm that the word functions correctly, and that its stack -effect comment is accurate. - - -\section{The game loop} - -The game loop consists of repeated calls to \texttt{guess-prompt}, -\texttt{read-number} and \texttt{judge-guess}. If \texttt{judge-guess} -returns \texttt{f}, the loop stops, otherwise it continues. This is -realized with a recursive implementation: - -\begin{alltt} -: numbers-game-loop ( actual -{}- ) - dup guess-prompt read-number judge-guess {[} - numbers-game-loop - {]} {[} - drop - {]} ifte ; -\end{alltt} -In Factor, tail-recursive words consume a bounded amount of call stack -space. This means you are free to pick recursion or iteration based -on their own merits when solving a problem. In many other languages, -the usefulness of recursion is severely limited by the lack of tail-recursive -call optimization. - - -\section{Finishing off} - -The last task is to combine everything into the main \texttt{numbers-game} -word. This is easier than it seems: - -\begin{alltt} -: numbers-game number-to-guess numbers-game-loop ; -\end{alltt} -Try it out! Simply invoke the \texttt{numbers-game} word in the interpreter. -It should work flawlessly, assuming you tested each component of this -design incrementally! - - -\section{The complete program} - -\begin{verbatim} -! Numbers game example - -IN: numbers-game -USE: kernel -USE: math -USE: parser -USE: random -USE: stdio -USE: stack - -: read-number ( -- n ) read parse-number ; - -: guess-banner - "I'm thinking of a number between 0 and 100." 
print ; -: guess-prompt "Enter your guess: " write ; -: too-high "Too high" print ; -: too-low "Too low" print ; -: correct "Correct - you win!" print ; - -: inexact-guess ( actual guess -- ) - < [ too-high ] [ too-low ] ifte ; - -: judge-guess ( actual guess -- ? ) - 2dup = [ - 2drop correct f - ] [ - inexact-guess t - ] ifte ; - -: number-to-guess ( -- n ) 0 100 random-int ; - -: numbers-game-loop ( actual -- ) - dup guess-prompt read-number judge-guess [ - numbers-game-loop - ] [ - drop - ] ifte ; - -: numbers-game number-to-guess numbers-game-loop ; -\end{verbatim} - -\chapter{All about numbers} - -A brief introduction to arithmetic in Factor was given in the first chapter. Most of the time, the simple features outlined there suffice, and if math is not your thing, you can skim (or skip!) this chapter. For the true mathematician, Factor's numerical capability goes far beyond simple arithmetic. - -Factor's numbers more closely model the mathematical concept of a number than other languages. Where possible, exact answers are given -- for example, adding or multiplying two integers never results in overflow, and dividing two integers yields a fraction rather than a truncated result. Complex numbers are supported, allowing many functions to be computed with parameters that would raise errors or return ``not a number'' in other languages. - -\section{Integers} - -\chapkeywords{integer?~BIN: OCT: HEX: .b .o .h} - -The simplest type of number is the integer. Integers come in two varieties -- \emph{fixnums} and \emph{bignums}. As their names suggest, a fixnum is a fixed-width quantity\footnote{Fixnums range in size from $-2^{w-3}-1$ to $2^{w-3}$, where $w$ is the word size of your processor (for example, 32 bits). Because fixnums automatically grow to bignums, usually you do not have to worry about details like this.}, and is a bit quicker to manipulate than an arbitrary-precision bignum. 
- -The predicate word \texttt{integer?}~tests if the top of the stack is an integer. If this returns true, then exactly one of \texttt{fixnum?}~or \texttt{bignum?}~would return true for that object. Usually, your code does not have to worry if it is dealing with fixnums or bignums. - -Unlike some languages where the programmer has to declare storage size explicitly and worry about overflow, integer operations automatically return bignums if the result would be too big to fit in a fixnum. Here is an example where multiplying two fixnums returns a bignum: - -\begin{alltt} -\textbf{ok} 134217728 fixnum? . -\textbf{t} -\textbf{ok} 128 fixnum? . -\textbf{t} -\textbf{ok} 134217728 128 * . -\textbf{17179869184} -\textbf{ok} 134217728 128 * bignum? . -\textbf{t} -\end{alltt} - -Integers can be entered using a different base. By default, all number entry is in base 10; however, this can be changed by prefixing integer literals with one of the parsing words \texttt{BIN:}, \texttt{OCT:}, or \texttt{HEX:}. For example: - -\begin{alltt} -\textbf{ok} BIN: 1110 BIN: 1 + . -\textbf{15} -\textbf{ok} HEX: deadbeef 2 * . -\textbf{7471857118} -\end{alltt} - -The word \texttt{.} prints numbers in decimal, regardless of how they were input. A set of words in the \texttt{prettyprint} vocabulary is provided for printing integers using another base. - -\begin{alltt} -\textbf{ok} 1234 .h -\textbf{4d2} -\textbf{ok} 1234 .o -\textbf{2322} -\textbf{ok} 1234 .b -\textbf{10011010010} -\end{alltt} - -\section{Rational numbers} - -\chapkeywords{rational?~numerator denominator} - -If we add, subtract or multiply any two integers, the result is always an integer. However, this is not the case with division. When dividing a numerator by a denominator where the numerator is not an integer multiple of the denominator, a ratio is returned instead. - -\begin{alltt} -1210 11 / . -\emph{110} -100 330 / .
-\emph{10/33} -\end{alltt} - -Ratios are printed and can be input literally in the form of the second example. Ratios are always reduced to lowest terms by factoring out the greatest common divisor of the numerator and denominator. A ratio with a denominator of 1 becomes an integer. Trying to create a ratio with a denominator of 0 raises an error. - -The predicate word \texttt{ratio?}~tests if the top of the stack is a ratio. The predicate word \texttt{rational?}~returns true if and only if one of \texttt{integer?}~or \texttt{ratio?}~would return true for that object. So in Factor terms, a ``ratio'' is a rational number whose denominator is not equal to 1. - -Ratios behave just like any other number -- all numerical operations work as expected, and in fact they use the formulas for adding, subtracting and multiplying fractions that you learned in high school. - -\begin{alltt} -\textbf{ok} 1/2 1/3 + . -\textbf{5/6} -\textbf{ok} 100 6 / 3 * . -\textbf{50} -\end{alltt} - -Ratios can be deconstructed into their numerator and denominator components using the \texttt{numerator} and \texttt{denominator} words. The numerator and denominator are both integers, and furthermore the denominator is always positive. When applied to integers, the numerator is the integer itself, and the denominator is 1. - -\begin{alltt} -\textbf{ok} 75/33 numerator . -\textbf{25} -\textbf{ok} 75/33 denominator . -\textbf{11} -\textbf{ok} 12 numerator . -\textbf{12} -\end{alltt} - -\section{Floating point numbers} - -\chapkeywords{float?~>float /f} - -Rational numbers represent \emph{exact} quantities. On the other hand, a floating point number is an \emph{approximation}. While rationals can grow to any required precision, floating point numbers are fixed-width, and manipulating them is usually faster than manipulating ratios or bignums (but slower than manipulating fixnums). 
Floating point numbers are often used to approximate irrational numbers, which have no exact representation as a ratio of two integers. Floating point literals are input with a decimal point. - -\begin{alltt} -\textbf{ok} 1.23 1.5 + . -\textbf{2.73} -\end{alltt} - -The predicate word \texttt{float?}~tests if the top of the stack is a floating point number. The predicate word \texttt{real?}~returns true if and only if one of \texttt{rational?}~or \texttt{float?}~would return true for that object. - -Floating point numbers are \emph{contagious} -- introducing a floating point number in a computation ensures the result is also floating point. - -\begin{alltt} -\textbf{ok} 5/4 1/2 + . -\textbf{7/4} -\textbf{ok} 5/4 0.5 + . -\textbf{1.75} -\end{alltt} - -Apart from contagion, there are two ways of obtaining a floating point result from a computation: the word \texttt{>float ( n -{}- f )} converts a rational number into its floating point approximation, and the word \texttt{/f ( x y -{}- x/y )} returns the floating point approximation of a quotient of two numbers. - -\begin{alltt} -\textbf{ok} 7 4 / >float . -\textbf{1.75} -\textbf{ok} 7 4 /f . -\textbf{1.75} -\end{alltt} - -Indeed, the word \texttt{/f} could be defined as follows: - -\begin{alltt} -: /f / >float ; -\end{alltt} - -However, the actual definition is slightly more efficient, since it computes the floating point result directly. - -\section{Complex numbers} - -\chapkeywords{i -i \#\{ complex?~real imaginary >rect rect> arg abs >polar polar>} - -Complex numbers arise as solutions to quadratic equations whose graph does not intersect the x axis. For example, the equation $x^2 + 1 = 0$ has no solution for real $x$, because there is no real number that is a square root of $-1$. However, in the field of complex numbers, this equation has a well-known solution: - -\begin{alltt} -\textbf{ok} -1 sqrt .
-\textbf{\#\{ 0 1 \}} -\end{alltt} - -The literal syntax for a complex number is \texttt{\#\{ re im \}}, where \texttt{re} is the real part and \texttt{im} is the imaginary part. For example, the literal \texttt{\#\{ 1/2 1/3 \}} corresponds to the complex number $\frac{1}{2} + \frac{1}{3}i$. - -The words \texttt{i} and \texttt{-i} push the literals \texttt{\#\{ 0 1 \}} and \texttt{\#\{ 0 -1 \}}, respectively. - -The predicate word \texttt{complex?} tests if the top of the stack is a complex number. Note that unlike in mathematics, where all real numbers are also complex numbers, Factor only considers a number to be a complex number if its imaginary part is non-zero. - -Complex numbers can be deconstructed into their real and imaginary components using the \texttt{real} and \texttt{imaginary} words. Both components can be pushed at once using the word \texttt{>rect ( z -{}- re im )}. - -\begin{alltt} -\textbf{ok} -1 sqrt real . -\textbf{0} -\textbf{ok} -1 sqrt imaginary . -\textbf{1} -\textbf{ok} -1 sqrt sqrt >rect .s -\textbf{\{ 0.7071067811865476 0.7071067811865475 \}} -\end{alltt} - -A complex number can be constructed from a real and imaginary component on the stack using the word \texttt{rect> ( re im -{}- z )}. - -\begin{alltt} -\textbf{ok} 1/3 5 rect> . -\textbf{\#\{ 1/3 5 \}} -\end{alltt} - -Complex numbers are stored in \emph{rectangular form} as a real/imaginary component pair (this is where the names \texttt{>rect} and \texttt{rect>} come from). An alternative complex number representation is \emph{polar form}, consisting of an absolute value and argument. The absolute value and argument can be computed using the words \texttt{abs} and \texttt{arg}, and both can be pushed at once using \texttt{>polar ( z -{}- abs arg )}. - -\begin{alltt} -\textbf{ok} 5.3 abs . -\textbf{5.3} -\textbf{ok} i arg .
-\textbf{1.570796326794897} -\textbf{ok} \#\{ 4 5 \} >polar .s -\textbf{\{ 6.403124237432849 0.8960553845713439 \}} -\end{alltt} - -A new complex number can be created from an absolute value and argument using \texttt{polar> ( abs arg -{}- z )}. - -\begin{alltt} -\textbf{ok} 1 pi polar> . -\textbf{\#\{ -1.0 1.224606353822377e-16 \}} -\end{alltt} - -\section{Transcendental functions} - -\chapkeywords{\^{} exp log sin cos tan asin acos atan sec cosec cot asec acosec acot sinh cosh tanh asinh acosh atanh sech cosech coth asech acosech acoth} - -The \texttt{math} vocabulary provides a rich library of mathematical functions that covers exponentiation, logarithms, trigonometry, and hyperbolic functions. All functions accept complex number arguments and return complex results where appropriate. These functions all return floating point values, or complex numbers whose real and imaginary components are floating point values. - -\texttt{\^{} ( x y -{}- x\^{}y )} raises \texttt{x} to the power of \texttt{y}. In the cases of \texttt{y} being equal to $1/2$, $-1$, or 2, respectively, the words \texttt{sqrt}, \texttt{recip} and \texttt{sq} can be used instead. - -\begin{alltt} -\textbf{ok} 2 4 \^ . -\textbf{16.0} -\textbf{ok} i i \^ . -\textbf{0.2078795763507619} -\end{alltt} - -All remaining functions have the stack effect \texttt{( x -{}- y )}, which is not repeated below for brevity. - -\texttt{exp} raises the number $e$ to a specified power. The number $e$ can be pushed on the stack with the \texttt{e} word, so \texttt{exp} could have been defined as follows: - -\begin{alltt} -: exp ( x -- e^x ) e swap \^ ; -\end{alltt} - -However, it is actually defined otherwise, for efficiency.\footnote{In fact, the word \texttt{\^{}} is actually defined in terms of \texttt{exp}, to correctly handle complex number arguments.} - -\texttt{log} computes the natural (base $e$) logarithm. This is the inverse of the \texttt{exp} function. - -\begin{alltt} -\textbf{ok} -1 log .
-\textbf{\#\{ 0.0 3.141592653589793 \}} -\textbf{ok} e log . -\textbf{1.0} -\end{alltt} - -\texttt{sin}, \texttt{cos} and \texttt{tan} are the familiar trigonometric functions, and \texttt{asin}, \texttt{acos} and \texttt{atan} are their inverses. - -The reciprocals of the sine, cosine and tangent are defined as \texttt{sec}, \texttt{cosec} and \texttt{cot}, respectively. Their inverses are \texttt{asec}, \texttt{acosec} and \texttt{acot}. - -\texttt{sinh}, \texttt{cosh} and \texttt{tanh} are the hyperbolic functions, and \texttt{asinh}, \texttt{acosh} and \texttt{atanh} are their inverses. - -Similarly, the reciprocals of the hyperbolic functions are defined as \texttt{sech}, \texttt{cosech} and \texttt{coth}, respectively. Their inverses are \texttt{asech}, \texttt{acosech} and \texttt{acoth}. - -\section{Modular arithmetic} - -\chapkeywords{/i mod /mod gcd} - -In addition to the standard division operator \texttt{/}, there are a few related functions that are useful when working with integers. - -\texttt{/i ( x y -{}- x/y )} performs a truncating integer division. It could have been defined as follows: - -\begin{alltt} -: /i / >integer ; -\end{alltt} - -However, the actual definition is a bit more efficient than that. - -\texttt{mod ( x y -{}- x\%y )} computes the remainder of dividing \texttt{x} by \texttt{y}. If the result is 0, then \texttt{x} is a multiple of \texttt{y}. - -\texttt{/mod ( x y -{}- x/y x\%y )} pushes both the quotient and remainder. - -\begin{alltt} -\textbf{ok} 100 3 mod . -\textbf{1} -\textbf{ok} -546 34 mod . -\textbf{-2} -\end{alltt} - -\texttt{gcd ( x y -{}- z )} pushes the greatest common divisor of two integers; that is, the largest number that both integers could be divided by and still yield integers as results. This word is used behind the scenes to reduce rational numbers to lowest terms when doing ratio arithmetic. 
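-
-The second \texttt{mod} example above can be checked by hand. The following is only a sketch of the arithmetic, assuming (as that example's output suggests) that the quotient is rounded toward zero and the remainder takes the sign of \texttt{x}:
-
-\[ -546 = (-16) \times 34 + (-2) \]
-
-Under that assumption, \texttt{-546 34 /mod} would be expected to leave $-16$ and $-2$ on the stack. In the same spirit, reducing the ratio $75/33$ from the section on rational numbers takes one greatest common divisor computation. Euclid's algorithm is used here purely for illustration; this guide does not say which algorithm \texttt{gcd} actually uses:
-
-\begin{eqnarray*}
-75 & = & 2 \times 33 + 9 \\
-33 & = & 3 \times 9 + 6 \\
-9 & = & 1 \times 6 + 3 \\
-6 & = & 2 \times 3 + 0
-\end{eqnarray*}
-
-The last non-zero remainder is 3, so $75/33 = (75 \div 3)/(33 \div 3) = 25/11$, which agrees with the \texttt{numerator} and \texttt{denominator} results shown earlier.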
- -\section{Bitwise operations} - -\chapkeywords{bitand bitor bitxor bitnot shift} - -There are two ways of looking at an integer -- as a mathematical entity, or as a string of bits. The latter representation facilitates the so-called \emph{bitwise operations}. - -\texttt{bitand ( x y -{}- x\&y )} returns a new integer where each bit is set if and only if the corresponding bit is set in both $x$ and $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-and with a mask switches off all flags that are not explicitly set in the mask. - -\begin{alltt} -BIN: 101 BIN: 10 bitand .b -\emph{0} -BIN: 110 BIN: 10 bitand .b -\emph{10} -\end{alltt} - -\texttt{bitor ( x y -{}- x|y )} returns a new integer where each bit is set if and only if the corresponding bit is set in at least one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-or with a mask switches on all flags that are set in the mask. - -\begin{alltt} -BIN: 101 BIN: 10 bitor .b -\emph{111} -BIN: 110 BIN: 10 bitor .b -\emph{110} -\end{alltt} - -\texttt{bitxor ( x y -{}- x\^{}y )} returns a new integer where each bit is set if and only if the corresponding bit is set in exactly one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-xor with a mask toggles all flags that are set in the mask. - -\begin{alltt} -BIN: 101 BIN: 10 bitxor .b -\emph{111} -BIN: 110 BIN: 10 bitxor .b -\emph{100} -\end{alltt} - -\texttt{bitnot ( x -{}- y )} returns the bitwise complement of the input; that is, each bit in the input number is flipped. This is actually equivalent to negating a number and then subtracting one. So indeed, \texttt{bitnot} could have been defined thus: - -\begin{alltt} -: bitnot neg pred ; -\end{alltt} - -\texttt{shift ( x n -{}- y )} returns a new integer consisting of the bits of the first integer, shifted to the left by $n$ positions.
If $n$ is negative, the bits are shifted to the right instead, and bits that ``fall off'' are discarded. - -\begin{alltt} -BIN: 101 5 shift .b -\emph{10100000} -BIN: 11111 -2 shift .b -\emph{111} -\end{alltt} - -The attentive reader will notice that shifting to the left is equivalent to multiplying by a power of two, and shifting to the right is equivalent to performing a truncating division by a power of two. - -\chapter{Working with state} - -\section{Building lists and strings} - -\chapkeywords{make-string make-list ,} -\index{\texttt{make-string}} -\index{\texttt{make-list}} -\index{\texttt{make-,}} - -\section{Hashtables} - -A hashtable, much like an association list, stores key/value pairs, and offers lookup by key. However, whereas an association list must be searched linearly to locate keys, a hashtable uses a more sophisticated method. Key/value pairs are sorted into \emph{buckets} using a \emph{hash function}. If two objects are equal, then they must have the same hash code; but not necessarily vice versa. To look up the value associated with a key, only the bucket corresponding to the key has to be searched. A hashtable is simply a vector of buckets, where each bucket is an association list. - -\texttt{<hashtable> ( capacity -{}- hash )} creates a new hashtable with the specified number of buckets. A hashtable with one bucket is basically an association list. Right now, a ``large enough'' capacity must be specified, and performance degrades if there are too many key/value pairs per bucket. In a future implementation, hashtables will grow as needed as the number of key/value pairs increases. - -\texttt{hash ( key hash -{}- value )} looks up the value associated with a key in the hashtable. It pushes \texttt{f} if no pair with this key is present. Note that \texttt{hash} cannot differentiate between a key that is not present at all and a key with a value of \texttt{f}.
- -\texttt{hash* ( key hash -{}- {[} key | value {]} )} looks for -a pair with this key, and pushes the pair itself. Unlike \texttt{hash}, -\texttt{hash{*}} returns different values in the cases of a value -set to \texttt{f}, or an undefined value. - -\texttt{set-hash ( value key hash -{}- )} stores a key/value pair in a hashtable. - -Hashtables can be converted to association lists and vice versa using -the \texttt{hash>alist} and \texttt{alist>hash} words. The list of keys and -list of values can be extracted using the \texttt{hash-keys} and \texttt{hash-values} words. - -examples - -\section{Variables} - -Notice that until now, all the code except a handful of examples has only used the stack for storage. You can also use variables to store temporary data, much like in other languages; however, their use is not so prevalent. This is not a coincidence -- Factor was designed this way, and mastery of the stack is essential. Using variables where the stack is more appropriate leads to ugly code that is hard to reuse. - -Variables are typically used for longer-term storage of data, and compound data structures, realized as nested namespaces of variables. This concept should be instantly familiar to anybody who's used an object-oriented programming language. Variables should only be used for intermediate results if keeping everything on the stack would result in awkward stack flow. - -The words \texttt{get ( name -{}- value )} and \texttt{set ( value name -{}- )} retrieve and store variable values, respectively. Variable names are strings, and they do not have to be declared before use. For example: - -\begin{alltt} -5 "x" set -"x" get . -\emph{5} -\end{alltt} - -\section{Namespaces} - -Only having one list of variable name/value bindings would make the language terribly inflexible. Instead, a variable has any number of potential values, one per namespace. There is a notion of a ``current namespace''; the \texttt{set} word always stores variables in the current namespace.
On the other hand, \texttt{get} traverses up the stack of namespace bindings until it finds a variable with the specified name. - -\texttt{bind ( namespace quot -{}- )} executes a quotation in the dynamic scope of a namespace. For example, the following sets the value of \texttt{x} to 5 in the global namespace, regardless of the current namespace at the time the word was called. - -\begin{alltt} -: global-example ( -- ) - global {[} 5 "x" set {]} bind ; -\end{alltt} - -\texttt{<namespace> ( -{}- namespace )} creates a new namespace object. Actually, a namespace is just a hashtable, with a default capacity. - -\texttt{with-scope ( quot -{}- )} combines \texttt{<namespace>} with \texttt{bind} by executing a quotation in a new namespace. - -get example - -describe - -\section{The name stack} - -The \texttt{bind} combinator creates dynamic scope by pushing and popping namespaces on the so-called \emph{name stack}. Its definition is simpler than one would expect: - -\begin{alltt} -: bind ( namespace quot -- ) - swap >n call n> drop ; -\end{alltt} - -The words \texttt{>n} and \texttt{n>} push and pop the name stack, respectively. Observe the stack flow in the definition of \texttt{bind}; the namespace goes on the name stack, the quotation is called, and the namespace is popped and discarded. - -The name stack is really just a vector. The words \texttt{>n} and \texttt{n>} are implemented as follows: - -\begin{alltt} -: >n ( namespace -- n:namespace ) namestack* vector-push ; -: n> ( n:namespace -- namespace ) namestack* vector-pop ; -\end{alltt} - -\section{\label{sub:List-constructors}List construction} - -The \texttt{make-list} word provides an alternative way to build a list. Instead of passing a partial list around on the stack, it is kept in a variable. This reduces the number -of stack elements that have to be juggled. - -The word \texttt{make-list ( quot -{}- list )} executes a quotation in a new dynamic scope.
Calls to \texttt{, ( obj -{}- )} in the quotation append objects to the partial -list. When the quotation returns, \texttt{make-list} pushes the complete list. - -The fact that a new -scope is created inside \texttt{make-list} is very important. -This means -that list constructions can be nested. - -Here is an example of list construction using this technique: - -\begin{alltt} -[ 1 10 {[} 2 {*} dup , {]} times drop ] make-list . -\emph{{[} 2 4 8 16 32 64 128 256 512 1024 {]}} -\end{alltt} - -\section{String construction} - -The \texttt{make-string} word is similar to \texttt{make-list}, except inside the quotation, only strings and integers may be passed to the \texttt{,} word, and when the quotation finishes executing, everything is concatenated into a single string. - -Compare the following two examples -- both define a word that concatenates together all elements of a list of strings. The first one uses a string buffer stored on the stack; the second uses string construction words: - -\begin{alltt} -: list>string ( list -- str ) - 100 <sbuf> swap {[} over sbuf-append {]} each sbuf>str ; - -: list>string ( list -- str ) - [ [ , ] each ] make-string ; -\end{alltt} - -\chapter{Practical: a contractor timesheet} - -For the second practical example, we will code a small program that tracks how long you spend working on tasks. It will provide two primary functions, one for adding a new task and measuring how long you spend working on it, and another to print out the timesheet. A typical interaction looks like this: - -\begin{alltt} -timesheet-app -\emph{ -(E)xit -(A)dd entry -(P)rint timesheet - -Enter a letter between ( ) to execute that action.} -a -\emph{Start work on the task now. Press ENTER when done.
- -Please enter a description:} -Writing a kick-ass web app -\emph{ -(E)xit -(A)dd entry -(P)rint timesheet - -Enter a letter between ( ) to execute that action.} -p -\emph{TIMESHEET: -Working on the Factor HTTP server 0:25 -Writing a kick-ass web app 1:03 - -(E)xit -(A)dd entry -(P)rint timesheet - -Enter a letter between ( ) to execute that action.} -e -\end{alltt} - -Once you have finished working your way through this tutorial, you might want to try extending the program -- for example, it could print the total hours, prompt for an hourly rate, then print the amount of money that should be billed. - -\section{Measuring a duration of time} - -When you begin working on a new task, you tell the timesheet you want -to add a new entry. It then measures the elapsed time until you specify -the task is done, and prompts for a task description. - -The first word we will write is \texttt{measure-duration}. We measure -the time duration by using the \texttt{millis} word \texttt{( -{}- -m )} to take the time before and after a call to \texttt{read}. The -\texttt{millis} word pushes the number of milliseconds since a certain -epoch -- the epoch does not matter here since we are only interested -in the difference between two times. - -A first attempt at \texttt{measure-duration} might look like this: - -\begin{alltt} -: measure-duration millis read drop millis - ; -measure-duration . -\end{alltt} - -This word definition has the right general idea; however, the result -is negative, because the end time is subtracted from the start time. -Also, we would like to measure durations in minutes, -not milliseconds: - -\begin{alltt} -: measure-duration ( -{}- duration ) - millis - read drop - millis swap - 1000 /i 60 /i ; -\end{alltt} - -Note that the \texttt{/i} word \texttt{( x y -{}- x/y )}, from the -\texttt{math} vocabulary, performs truncating division. This -makes sense, since we are not interested in fractional parts of a -minute here.
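-
-As a quick check of the conversion, suppose the two \texttt{millis} readings are 1530000 milliseconds apart. This figure is hypothetical and chosen only for illustration. The two truncating divisions in \texttt{measure-duration} then work out as follows:
-
-\[ \lfloor 1530000 / 1000 \rfloor = 1530 \quad \mbox{(seconds)}, \qquad \lfloor 1530 / 60 \rfloor = \lfloor 25.5 \rfloor = 25 \quad \mbox{(minutes)} \]
-
-The leftover half minute is simply dropped, so a task that took 25 minutes and 30 seconds would be recorded as 25 minutes.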
- -\section{Adding a timesheet entry} - -Now that we can measure a time duration at the keyboard, let's write -the \texttt{add-entry-prompt} word. This word does exactly what one -would expect -- it prompts for the time duration and description, -and leaves those two values on the stack: - -\begin{alltt} -: add-entry-prompt ( -{}- duration description ) - "Start work on the task now. Press ENTER when done." print - measure-duration - "Please enter a description:" print - read ; -\end{alltt} - -You should interactively test this word. Measure off a minute or two, -press ENTER, enter a description, and press ENTER again. The stack -should now contain two values, in the same order as the stack effect -comment. - -Now, almost all the ingredients are in place. The final \texttt{add-entry} -word calls \texttt{add-entry-prompt}, then pushes the new entry on the end -of the timesheet vector: - -\begin{alltt} -: add-entry ( timesheet -{}- ) - add-entry-prompt cons swap vector-push ; -\end{alltt} - -Recall that timesheet entries are cons cells where the car is the -duration and the cdr is the description, hence the call to \texttt{cons}. -Note that this word side-effects the timesheet vector. You can test -it interactively like so: - -\begin{alltt} -10 <vector> dup add-entry -\emph{Start work on the task now. Press ENTER when done.} -\emph{Please enter a description:} -\emph{Studying Factor} -. -\emph{\{ {[} 2 | "Studying Factor" {]} \}} -\end{alltt} - -\section{Printing the timesheet} - -The hard part of printing the timesheet is turning the duration in -minutes into a nice hours/minutes string, like {}``2:15''. We would -like to make a word like the following: - -\begin{alltt} -135 hh:mm . -\emph{2:15} -\end{alltt} - -First, we can make a pair of words \texttt{hh} and \texttt{mm} to extract the hours -and minutes, respectively.
This can be achieved using truncating division, -and the modulo operator -- also, since we would like strings to be -returned, the \texttt{unparse} word \texttt{( obj -{}- str )} from -the \texttt{unparser} vocabulary is called to turn the integers into -strings: - -\begin{alltt} -: hh ( duration -{}- str ) 60 /i unparse ; -: mm ( duration -{}- str ) 60 mod unparse ; -\end{alltt} - -The \texttt{hh:mm} word can then be written, concatenating the return -values of \texttt{hh} and \texttt{mm} into a single string using string -construction: - -\begin{alltt} -: hh:mm ( duration -{}- str ) [ dup hh , ":" , mm , ] make-string ; -\end{alltt} -However, so far, these three definitions do not produce ideal output. -Try a few examples: - -\begin{alltt} -120 hh:mm . -\emph{2:0} -130 hh:mm . -\emph{2:10} -\end{alltt} -Obviously, we would like the minutes to always be two digits. Luckily, -there is a \texttt{digits} word \texttt{( str n -{}- str )} in the -\texttt{format} vocabulary that adds enough zeros on the left of the -string to give it the specified length. Try it out: - -\begin{alltt} -"23" 2 digits . -\emph{"23"} -"7" 2 digits . -\emph{"07"} -\end{alltt} -We can now change the definition of \texttt{mm} accordingly: - -\begin{alltt} -: mm ( duration -{}- str ) 60 mod unparse 2 digits ; -\end{alltt} -Now that time duration output is done, a first attempt at a definition -of \texttt{print-timesheet} looks like this: - -\begin{alltt} -: print-timesheet ( timesheet -{}- ) - {[} uncons write ": " write hh:mm print {]} vector-each ; -\end{alltt} -This works, but produces ugly output: - -\begin{alltt} -\{ {[} 30 | "Studying Factor" {]} {[} 65 | "Paperwork" {]} \} -print-timesheet -\emph{Studying Factor: 0:30} -\emph{Paperwork: 1:05} -\end{alltt} - -It would be much nicer if the time durations lined up in the same -column.
First, let's factor out the body of the \texttt{vector-each} -loop into a new \texttt{print-entry} word before it gets too long: - -\begin{alltt} -: print-entry ( duration description -{}- ) - write ": " write hh:mm print ; - -: print-timesheet ( timesheet -{}- ) - {[} uncons print-entry {]} vector-each ; -\end{alltt} - -We can now make \texttt{print-entry} line up columns using the \texttt{pad-string} -word \texttt{( str n -{}- str )}. - -\begin{alltt} -: print-entry ( duration description -{}- ) - dup - write - 60 swap pad-string write - hh:mm print ; -\end{alltt} - -In the above definition, we first print the description, then enough -blanks to move the cursor to column 60. So the description text is -left-justified. If we had interchanged the order of the second and -third lines in the definition, the description text would be right-justified. - -Try out \texttt{print-timesheet} again, and marvel at the aligned -columns: - -\begin{alltt} -\{ {[} 30 | "Studying Factor" {]} {[} 65 | "Paperwork" {]} \} -print-timesheet -\emph{Studying Factor 0:30} -\emph{Paperwork 1:05} -\end{alltt} - -\section{The main menu} - -Finally, we will code a main menu that looks like this: - -\begin{alltt} - -(E)xit -(A)dd entry -(P)rint timesheet - -Enter a letter between ( ) to execute that action. -\end{alltt} - -We will represent the menu as an association list. Recall that an association list is a list of pairs, where the car of each pair is a key, and the cdr is a value. Our keys will literally be keyboard keys (``e'', ``a'' and ``p''), and the values will themselves be pairs consisting of a menu item label and a quotation. - -The first word we will code is \texttt{print-menu}. It takes an association list, and prints the label of each menu item, that is, the car of each pair's value. Note that \texttt{terpri} simply prints a blank line: - -\begin{alltt} -: print-menu ( menu -{}- ) - terpri {[} cdr car print {]} each terpri - "Enter a letter between ( ) to execute that action."
print ; -\end{alltt} - -You can test \texttt{print-menu} with a short association list: - -\begin{alltt} -{[} {[} "x" "(X)yzzy" 2 2 + . {]} {[} "f" "(F)oo" -1 sqrt . {]} {]} print-menu -\emph{ -(X)yzzy -(F)oo - -Enter a letter between ( ) to execute that action.} -\end{alltt} - -The next step is to write a \texttt{menu-prompt} word that takes the same association list, reads a line of input from the keyboard, and executes the quotation associated with that line. Recall that the \texttt{assoc} word returns \texttt{f} if the specified key could not be found in the association list. The definition below makes use of a conditional to signal an error in that case: - -\begin{alltt} -: menu-prompt ( menu -{}- ) - read swap assoc dup {[} - cdr call - {]} {[} - "Invalid input: " swap unparse cat2 throw - {]} ifte ; -\end{alltt} - -Try applying the new \texttt{menu-prompt} word to the association list we used to test \texttt{print-menu}. You should verify that entering \texttt{x} causes the quotation \texttt{{[} 2 2 + . {]}} to be executed: - -\begin{alltt} -{[} {[} "x" "(X)yzzy" 2 2 + . {]} {[} "f" "(F)oo" -1 sqrt . {]} {]} menu-prompt -x -\emph{4} -\end{alltt} - -Finally, we want a \texttt{menu} word that first prints a menu, then prompts for and acts on input: - -\begin{alltt} -: menu ( menu -{}- ) - dup print-menu menu-prompt ; -\end{alltt} - -Considering the stack effects of \texttt{print-menu} and \texttt{menu-prompt}, it should be obvious why the \texttt{dup} is needed. - -\section{Finishing off} - -We now need a \texttt{main-menu} word.
It takes the timesheet vector from the stack, and recursively calls itself until the user requests that the timesheet application exits: - -\begin{alltt} -: main-menu ( timesheet -{}- ) - {[} - {[} "e" "(E)xit" drop {]} - {[} "a" "(A)dd entry" dup add-entry main-menu {]} - {[} "p" "(P)rint timesheet" dup print-timesheet main-menu {]} - {]} menu ; -\end{alltt} - -Note that unless the first option is selected, the timesheet vector is eventually passed into the recursive \texttt{main-menu} call. - -All that remains now is the ``main word'' that runs the program with an empty timesheet vector. Note that the initial capacity of the vector is 10 elements, however this is not a limit -- adding more than 10 elements will grow the vector: - -\begin{alltt} -: timesheet-app ( -{}- ) - 10 <vector> main-menu ; -\end{alltt} - -\section{The complete program} - -\begin{verbatim} -! Contractor timesheet example - -IN: timesheet -USE: combinators -USE: errors -USE: format -USE: kernel -USE: lists -USE: math -USE: parser -USE: stack -USE: stdio -USE: strings -USE: unparser -USE: vectors - -! Adding a new entry to the time sheet. - -: measure-duration ( -- duration ) - millis - read drop - millis swap - 1000 /i 60 /i ; - -: add-entry-prompt ( -- duration description ) - "Start work on the task now. Press ENTER when done." print - measure-duration - "Please enter a description:" print - read ; - -: add-entry ( timesheet -- ) - add-entry-prompt cons swap vector-push ; - -! Printing the timesheet. - -: hh ( duration -- str ) 60 /i ; -: mm ( duration -- str ) 60 mod unparse 2 digits ; -: hh:mm ( millis -- str ) [ dup hh , ":" , mm , ] make-string ; - -: print-entry ( duration description -- ) - dup write - 60 swap pad-string write - hh:mm print ; - -: print-timesheet ( timesheet -- ) - "TIMESHEET:" print - [ uncons print-entry ] vector-each ; - -! Displaying a menu - -: print-menu ( menu -- ) - terpri [ cdr car print ] each terpri - "Enter a letter between ( ) to execute that action."
print ; - -: menu-prompt ( menu -- ) - read swap assoc dup [ - cdr call - ] [ - "Invalid input: " swap unparse cat2 throw - ] ifte ; - -: menu ( menu -- ) - dup print-menu menu-prompt ; - -! Main menu - -: main-menu ( timesheet -- ) - [ - [ "e" "(E)xit" drop ] - [ "a" "(A)dd entry" dup add-entry main-menu ] - [ "p" "(P)rint timesheet" dup print-timesheet main-menu ] - ] menu ; - -: timesheet-app ( -- ) - 10 <vector> main-menu ; -\end{verbatim} - -\chapter{Working with classes} - -\section{What is object oriented programming?} - -Object oriented programming is a commonly-used term, however many people -define it differently. Most will agree it consists of three key ideas: - -\begin{itemize} -\item Objects are small pieces of state with the required identity and -equality semantics, along with runtime information -allowing the object to reflect on itself. - -\item Objects are organized in some manner, allowing one to express -that a given set of objects features common behavior or shape. Factor organizes -objects into classes and types, however its definition of these terms is -slightly different from convention. - -\item Behavior can be defined on objects, and dispatched in a polymorphic way, -where invoking a generic operation on an object takes action most -appropriate to that object. -\end{itemize} - -The separation into three parts is reflected in the design of the Factor -object system. - -The following terminology is used in this guide: - -\begin{itemize} -\item \emph{Class} -- a class is a set of objects given by a predicate -that distinguishes elements of the class from other objects, along with -some associated meta-information. - -\item \emph{Type} -- a type is a concrete representation of an object -in runtime memory. There is only a fixed number of built-in types, such as -integers, strings, and arrays. Each object has a unique type it belongs to, -whereas it may be a member of an arbitrary number of classes.
- -\end{itemize} - -In many languages, a class refers to a specific object organization, -typically a specification form for named slots that objects in the class -shall have. In Factor, the \texttt{tuple} metaclass allows one to create -such conventional objects. However, we will look at generic words -and built-in classes first. - -\section{Generic words and methods} - -To use the generic word system, you must put the following near the -beginning of your source file: - -\begin{verbatim} -USE: generic -\end{verbatim} - -The motivation for generic words is that sometimes, you want to write a word that has -differing behavior depending on the class of its argument. For example, -in a game, a \texttt{draw} word could take different action if given a ship, a -weapon, a planet, etc. Writing one large \texttt{draw} word that contains type case logic results in -unnecessary coupling -- adding support for a new type of graphical -object would require modifying the original definition of \texttt{draw}, for -example. - -A generic word is a word whose behavior depends on the class of the -object at the top of the stack, however this behavior is defined in a -decentralized manner. - -A new generic word is defined using the following syntax: - -\begin{verbatim} -GENERIC: draw ( actor -- ) -#! Draw the actor. -\end{verbatim} - -A stack effect comment, as shown above, is not required but recommended. - -A generic word just defined like that will simply raise an error if -invoked. Specific behavior is defined using methods. - -A method associates behavior with a generic word. Methods are defined by -writing \texttt{M:}, followed by a class name, followed by the name of a -previously-defined generic word. - -One of the main benefits of generic words is that each method definition -can potentially occur in a different source file. Generic word -definitions also hide conditionals. 
- -Here are two methods for the generic \texttt{draw} word: - -\begin{verbatim} -M: ship draw ( actor -- ) - [ - surface get screen-xy radius get color get - filledCircleColor - ] bind ; - -M: plasma draw ( actor -- ) - [ - surface get screen-xy dup len get + color get - vlineColor - ] bind ; -\end{verbatim} - -Here, \texttt{ship} and \texttt{plasma} are user-defined classes. - -Every object is a member of the \texttt{object} class. If you provide a method specializing -on the \texttt{object} class for some generic word, the method will be -invoked when no other more specific method exists. For example: - -\begin{verbatim} -GENERIC: describe -M: number describe "The number " write . ; -M: object describe "I don't know anything about " write . ; -\end{verbatim} - -\section{Classes} - -Recall that in Factor, a class is just a predicate that categorizes objects as -being a member of the class or not. To be useful, it must be consistent --- for a given object, it must always return the same truth value. - -Classes are not always subsets or supersets of types and new classes can be defined by the user. Classes can be quite arbitrary: - -\begin{itemize} -\item Cons cells where both elements are integers - -\item Floating point numbers between -1 and 1 - -\item Hash tables holding a certain key - -\item Any object that occurs as a member of a certain global variable -holding a list. - -\item \... and so on. -\end{itemize} - -The building blocks of classes are the various built-in types, and -user-defined tuples. Tuples are covered later in this chapter. -The built-in types each get their own class whose members are precisely -the objects having that type.
The following built-in classes are -defined: - -\begin{itemize} -\item \texttt{alien} -\item \texttt{array} -\item \texttt{bignum} -\item \texttt{complex} -\item \texttt{cons} -\item \texttt{dll} -\item \texttt{f} -\item \texttt{fixnum} -\item \texttt{float} -\item \texttt{port} -\item \texttt{ratio} -\item \texttt{sbuf} -\item \texttt{string} -\item \texttt{t} -\item \texttt{tuple} -\item \texttt{vector} -\item \texttt{word} -\end{itemize} - -Each builtin class has a corresponding membership test predicate, named -after the builtin class suffixed by \texttt{?}. For example, \texttt{cons?}, \texttt{word?}, etc. Automatically-defined predicates are a common theme, and -in fact \emph{every} class has a corresponding predicate word, -with the following -exceptions: - -\begin{itemize} -\item \texttt{object} -- there is no need for a predicate word, since -every object is an instance of this class. -\item \texttt{f} -- the only instance of this class is the singleton -\texttt{f} signifying falsity, missing value, and empty list, and the predicate testing for this is the built-in library word \texttt{not}. -\item \texttt{t} -- the only instance of this class is the canonical truth value -\texttt{t}. You can write \texttt{t =} to test for this object, however usually -any object distinct from \texttt{f} is taken as a truth value, and \texttt{t} is not tested for directly. -\end{itemize} - -\section{Metaclasses} - -So far, we have only seen predefined classes corresponding to built-in -types. More complicated classes are defined in terms of metaclasses. -This section will describe how to define new classes belonging to -predefined metaclasses. - -Just as shared object traits motivate the existence of classes, -common behavior shared between classes themselves motivates metaclasses.
-For example, classes corresponding to built-in types, such as \texttt{fixnum} -and \texttt{string}, are instances of -the \texttt{builtin} metaclass, whereas a user-defined class is not an -instance of \texttt{builtin}. - -\subsection{The \texttt{union} metaclass} - -The \texttt{union} metaclass allows new classes to be -defined as aggregates of existing classes. - -For example, the Factor library defines some unions over numeric types: - -\begin{verbatim} -UNION: integer fixnum bignum ; -UNION: rational integer ratio ; -UNION: real rational float ; -UNION: number real complex ; -\end{verbatim} - -Now, the absolute value function can be defined in an efficient manner -for real numbers, and in a more general fashion for complex numbers: - -\begin{verbatim} -GENERIC: abs ( z -- |z| ) -M: real abs dup 0 < [ neg ] when ; -M: complex abs >rect mag2 ; -\end{verbatim} - -New unions can be defined as in the numerical classes example: -you write \texttt{UNION:} followed by the name of the union, -followed by its members. The list of members is terminated with a -semi-colon. - -A predicate named after the union followed by '?' is -automatically-defined. For example, the following definition of 'real?' -was automatically created: - -\begin{verbatim} -: real? - dup rational? [ - drop t - ] [ - dup float? [ - drop t - ] [ - drop f - ] ifte - ] ifte ; -\end{verbatim} - -\subsection{The \texttt{complement} metaclass} - -The \texttt{complement} metaclass allows you to define a class whose members -are exactly those not in another class. For example, the class of all -truth values is defined in \texttt{library/kernel.factor} by: - -\begin{verbatim} -COMPLEMENT: general-t f -\end{verbatim} - -\subsection{The \texttt{predicate} metaclass} - -The predicate metaclass contains classes whose membership test is an -arbitrary expression. To speed up dispatch, each predicate must be -defined as a subclass of some other class. 
That way predicates -subclassing from disjoint builtin classes do not need to be -exhaustively tested. - -The source file \texttt{library/strings.factor} defines some subclasses of \texttt{integer} -classifying ASCII characters: - -\begin{verbatim} -PREDICATE: integer blank " \t\n\r" str-contains? ; -PREDICATE: integer letter CHAR: a CHAR: z between? ; -PREDICATE: integer LETTER CHAR: A CHAR: Z between? ; -PREDICATE: integer digit CHAR: 0 CHAR: 9 between? ; -PREDICATE: integer printable CHAR: \s CHAR: ~ between? ; -\end{verbatim} - -Each predicate defines a corresponding predicate word whose name is -suffixed with '?'; for example, a 'digit?' word is automatically -defined: - -\begin{verbatim} -: digit? - dup integer? [ - CHAR: 0 CHAR: 9 between? - ] [ - drop f - ] ifte ; -\end{verbatim} - -For obvious reasons, the predicate definition must consume and produce -exactly one value on the stack. - -\section{Tuples} - -Tuples are user-defined classes whose objects consist of named slots. - -New tuple classes are defined with the following syntax: - -\begin{verbatim} -TUPLE: point x y z ; -\end{verbatim} - -This defines a new class named \texttt{point}, along with the -following set of words: - -\begin{verbatim} -<point> point? -point-x set-point-x -point-y set-point-y -point-z set-point-z -\end{verbatim} - -The word \texttt{<point>} takes the slot values from the stack and -produces a new \texttt{point}: - -\begin{alltt} -\textbf{ok} 1 2 3 <point> . -\textbf{<< point 1 2 3 >>} -\end{alltt} - -As you can guess from the above, there is a literal syntax for tuples, -and the \texttt{point?}~word tests if the top of the stack is an object -belonging to that class: - -\begin{alltt} -\textbf{ok} << point 1 2 3 >> point? . -\textbf{t} -\end{alltt} - -The general form of the literal syntax is as follows: - -\begin{alltt} -<< \emph{class} \emph{slots} \... >> -\end{alltt} - -The syntax consists of the tuple class name followed by the -values of all slots.
An error is raised if insufficient or extraneous slot values are specified. - -As usual, the distinction between literal syntax and explicit calls is the -time the tuple is created; literals are created at parse time, whereas -an explicit constructor call creates a new object each time the code -runs. - -Slots are read and written using the various automatically-defined words with names of the -form \texttt{\emph{class}-\emph{slot}} and \texttt{set-\emph{class}-\emph{slot}}. - -\subsection{Constructors} - -A tuple constructor is named after the tuple class surrounded by angle -brackets (\texttt{<} and \texttt{>}). A default constructor is provided -that reads slot values from the stack, however a custom constructor can -be defined using the \texttt{C:} parsing word. - -\subsection{Delegation} - -If a tuple defines a slot named \texttt{delegate}, any generic words called on -the tuple that are not defined for the tuple's class will be passed on -to the delegate. - -This idiom is used in the I/O code for wrapper streams. For example, the -\texttt{ansi-stream} class delegates all generic words to its underlying stream, -except for \texttt{fwrite-attr}, which outputs the necessary terminal escape -codes. Another example is \texttt{stdio-stream}, which performs all I/O on its -underlying stream, except it flushes after every new line (which would -be undesirable for say, a file). - -Delegation is used instead of inheritance in Factor, but it is not a -substitute; in particular, the semantics differ in that a delegated -method call receives the delegate on the stack, not the original object.
- -\input{new-guide.ind} - -\end{document} diff --git a/doc/handbook.tex b/doc/handbook.tex new file mode 100644 index 0000000000..e428ed01b3 --- /dev/null +++ b/doc/handbook.tex @@ -0,0 +1,2465 @@ +% :indentSize=4:tabSize=4:noTabs=true:mode=tex:wrap=soft: + +\documentclass{report} + +\usepackage[plainpages=false,colorlinks]{hyperref} +\usepackage[style=list,toc]{glossary} +\usepackage{alltt} +\usepackage{times} +\usepackage{longtable} + +\setcounter{tocdepth}{3} +\setcounter{secnumdepth}{3} + +\setlength\parskip{\medskipamount} +\setlength\parindent{0pt} + +\newcommand{\bs}{\char'134} +\newcommand{\dq}{"} +\newcommand{\tto}{\symbol{123}} +\newcommand{\ttc}{\symbol{125}} + +\newcommand{\parsingword}[3]{\index{#1} +\emph{Parsing word:} \texttt{#2} \`\texttt{#3} vocabulary} + +\newcommand{\ordinaryword}[3]{\index{#1} +\emph{Word:} \texttt{#2} \`\texttt{#3} vocabulary} + +\newcommand{\symbolword}[2]{\index{#1} +\emph{Symbol:} \texttt{#1} \`\texttt{#2} vocabulary} + +\newcommand{\classword}[2]{\index{#1} +\emph{Class:} \texttt{#1} \`\texttt{#2} vocabulary} + +\newcommand{\genericword}[3]{\index{#1} +\emph{Generic word:} \texttt{#2} \`\texttt{#3} vocabulary} + +\newcommand{\wordtable}[1]{ + +\hrulefill +\begin{tabbing}#1\\ \end{tabbing} +} + +\makeatletter + +\makeatother + +\makeglossary +\makeindex + +\begin{document} + +\title{Factor Developer's Handbook} + +\author{Slava Pestov} + +\maketitle +\tableofcontents{} + +\chapter*{Introduction} + +What follows is a detailed guide to the Factor language and development environment. It is not a tutorial or introductory guide, nor does it cover some background material that you are expected to understand, such as object-oriented programming, higher-order functions, continuations, or general issues of algorithm and program design. + +\chapter{The language} + +Factor is a programming language combining a postfix syntax with a functional and object-oriented +flavor, building on ideas from Forth, Joy and Lisp.
+ +Factor is \emph{dynamic}. This means that all objects in the language are fully reflective at run time, and that new definitions can be entered without restarting the runtime. Factor code can be used interchangeably as data, meaning that sophisticated language extensions can be realized as libraries of words. + +Factor is \emph{safe}. This means all code executes in an object-oriented runtime that provides +garbage collection and prohibits direct pointer arithmetic. There is no way to get a dangling reference by deallocating a live object, and it is not possible to corrupt memory by overwriting the bounds of an array. + +\section{Conventions} + +When examples of interpreter interactions are given in this guide, the input is in a roman font, and any +output from the interpreter is in boldface: +\begin{alltt} +\textbf{ok} "Hello, world!" print +\textbf{Hello, world!} +\end{alltt} +Parsing words, defined in \ref{parser}, are presented with the following notation. +\wordtable{ +\parsingword{word}{word syntax...}{foo} +} +The parsing word's name is followed by the syntax, with meta-syntactic variables set in an italic font. For example: +\wordtable{ +\parsingword{colon}{:~\emph{name} \emph{definition} ;}{syntax} +} +Ordinary words are presented in the following notation. +\wordtable{ +\ordinaryword{word}{word ( \emph{inputs} -- \emph{outputs} )}{foo} +} +A compound definition in the library, or primitive in the runtime. +\wordtable{ +\symbolword{word}{word}{foo} +} +A symbol definition. +\wordtable{ +\genericword{word}{word ( \emph{inputs} -- \emph{outputs} )}{foo} +} +A generic word definition. +\wordtable{ +\classword{word}{foo} +} +A class that generic word methods can specialize on. + +\subsection{Stack effects} + +Within a stack effect comment, the top of the stack is the rightmost entry in both the +list of inputs and outputs, so \texttt{( x y -- x-y )} indicates that the top stack element will be subtracted from the element underneath.
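+
+As a small illustration of reading such a comment (a listener sketch, assuming the subtraction word \texttt{-} has the stack effect \texttt{( x y -- x-y )} and is available in the listener by default), the operand pushed last is the one that is subtracted:
+\begin{alltt}
+\textbf{ok} 10 3 - .
+\textbf{7}
+\end{alltt}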
+ +The following abbreviations have conventional meaning in stack effect comments: + +\begin{description} +\item[\texttt{[ x y z ]}] a list with elements whose types are hinted at by \texttt{x}, \texttt{y}, \texttt{z} +\item[\texttt{[[ x y ]]}] a cons cell where the type of the car is hinted at by \texttt{x}, and the type of the cdr is hinted at by \texttt{y} +\item[\texttt{elt}] an arbitrary object that happens to be an element of a collection +\item[\texttt{i}] a loop counter or index +\item[\texttt{j}] a loop counter or index +\item[\texttt{n}] a number +\item[\texttt{obj}] an arbitrary object +\item[\texttt{quot}] a quotation +\item[\texttt{seq}] a sequence +\item[\texttt{str}] a string +\item[\texttt{?}] a boolean +\item[\texttt{foo/bar}] either \texttt{foo} or \texttt{bar}. For example, \texttt{str/f} means either a string, or \texttt{f} +\end{description} + +If the stack effect identifies quotations, the stack effect of each quotation may be given after suffixing \texttt{|} to the whole string. For example, the following denotes a word that takes a list and a quotation and produces a new list, calling the quotation with elements of the list. +\begin{verbatim} +( list quot -- list | quot: elt -- elt ) +\end{verbatim} + +\subsection{Naming conventions} + +The following naming conventions are used in the Factor library. + +\begin{description} +\item[\texttt{FOO:}] a parsing word that reads ahead from the input string +\item[\texttt{FOO}] a parsing word that does not read ahead, but rather takes a fixed action at parse time +\item[\texttt{FOO"}] a parsing word that reads characters from the input string until the next occurrence of \texttt{"} +\item[\texttt{foo?}] a predicate returning a boolean or generalized boolean value +\item[\texttt{foo.}] a word whose primary action is to print something, rather than to return a value.
The basic case is the \texttt{.}~word, which prints the object at the top of the stack +\item[\texttt{foo*}] a variation of the \texttt{foo} word that takes more parameters +\item[\texttt{(foo)}] a word that is only useful for the implementation of \texttt{foo} +\item[\texttt{>to}] converts the object at the top of the stack to the \texttt{to} class +\item[\texttt{from>}] converts an instance of the \texttt{from} class into some canonical form +\item[\texttt{from>to}] converts an instance of the \texttt{from} class to the \texttt{to} class +\item[\texttt{>s}] move top of data stack to the \texttt{s} stack, where \texttt{s} is either \texttt{r} (call stack), \texttt{n} (name stack), or \texttt{c} (catch stack) +\item[\texttt{s>}] move top of \texttt{s} stack to the data stack, where \texttt{s} is as above +\item[\texttt{<class>}] create a new instance of \texttt{class} +\item[\texttt{nfoo}] destructive version of \texttt{foo}, that modifies one of its inputs rather than returning a new value. This convention is used by sequence words +\item[\texttt{2foo}] like \texttt{foo} but takes two operands +\item[\texttt{3foo}] like \texttt{foo} but takes three operands +\item[\texttt{foo-with}] a form of the \texttt{foo} combinator that takes an extra object, and passes this object on each iteration of the quotation; for example, \texttt{each-with} and \texttt{map-with} +\item[\texttt{with-foo}] executes a quotation in a namespace where \texttt{foo} is configured in a special manner; for example, \texttt{with-stream} +\item[\texttt{make-foo}] executes a quotation in a namespace where a sequence of type \texttt{foo} is being constructed; for example, \texttt{make-string} +\end{description} + +\section{Syntax} +\newcommand{\parseglos}{\glossary{name=parser, +description={a set of words in the \texttt{parser} vocabulary, primarily \texttt{parse}, \texttt{eval}, \texttt{parse-file} and \texttt{run-file}, that creates objects from their printed representations, and adds word definitions
to the dictionary}}} +\parseglos +In Factor, an \emph{object} is a piece of data that can be identified. Code is data, so Factor syntax is actually a syntax for describing objects, of which code is a special case. +The Factor parser performs two kinds of tasks -- it creates objects from their \emph{printed representations}, and it adds \emph{word definitions} to the dictionary. The latter is discussed in \ref{words}. + +\subsection{\label{parser}Parser algorithm} + +\glossary{name=token, +description={a whitespace-delimited piece of text, the primary unit of Factor syntax}} +\glossary{name=whitespace, +description={a space (ASCII 32), newline (ASCII 10) or carriage-return (ASCII 13)}} + +At the most abstract level, +Factor syntax consists of whitespace-separated tokens. The parser tokenizes the input on whitespace boundaries, where whitespace is defined as a sequence of one or more space, tab, newline or carriage-return characters. The parser is case-sensitive, so +the following three expressions tokenize differently: +\begin{verbatim} +2X+ +2 X + +2 x + +\end{verbatim} +As the parser reads tokens it makes a distinction between numbers, ordinary words, and +parsing words. Tokens are appended to the parse tree, the top level of which is a list +returned by the original parser invocation. Nested levels of the parse tree are created +by parsing words. + +Here is the parser algorithm in more detail -- some of the concepts therein will be defined shortly: + +\begin{itemize} +\item If the current character is a double-quote (\texttt{"}), the \texttt{"} parsing word is executed, causing a string to be read. +\item Otherwise, the next token is taken from the input. The parser searches for a word named by the token in the currently used set of vocabularies. If the word is found, one of the following two actions is taken: +\begin{itemize} +\item If the word is an ordinary word, it is appended to the parse tree. +\item If the word is a parsing word, it is executed.
+\end{itemize} +Otherwise if the token does not represent a known word, the parser attempts to parse it as a number. If the token is a number, the number object is added to the parse tree. Otherwise, an error is raised and parsing halts. +\end{itemize} + +\glossary{name=string mode, +description={a parser mode where token strings are added to the parse tree; the parser will not look up tokens in the dictionary. Activated by switching on the \texttt{string-mode} variable}} + +There is one exception to the above process; the parser might be placed in \emph{string mode}, in which case it simply reads tokens and appends them to the parse tree as strings. String mode is activated and deactivated by certain parsing words wishing to read input in an unstructured but tokenized manner -- see \ref{string-mode}. + +\glossary{name=parsing word, +description={a word that is run at parse time. Parsing words can be defined by suffixing the compound definition with \texttt{parsing}. Parsing words have the \texttt{\dq{}parsing\dq{}} word property set to true, and respond with true to the \texttt{parsing?}~word}} + +Parsing words play a key role in parsing; while ordinary words and numbers are simply +added to the parse tree, parsing words execute in the context of the parser, and can +do their own parsing and create nested data structures in the parse tree. Parsing words +are also able to define new words. + +While parsing words supporting arbitrary syntax can be defined, the default set is found +in the \texttt{syntax} vocabulary and provides the basis for all further syntactic +interaction with Factor. + +\subsection{\label{vocabsearch}Vocabulary search} + +\newcommand{\wordglos}{\glossary{ +name=word, +description={an object holding a code definition and set of properties. 
Words are organized into vocabularies, and are uniquely identified by name within a vocabulary.}}} +\wordglos +\newcommand{\vocabglos}{\glossary{ +name=vocabulary, +description={a collection of words, uniquely identified by name. The hashtable of vocabularies is stored in the \texttt{vocabularies} global variable, and the \texttt{USE:}~and \texttt{USING:}~parsing words add vocabularies to the parser's search path}}} +\vocabglos + +A \emph{word} associates a code definition with its name. Words are organized into \emph{vocabularies}. Vocabularies are organized into vocabularies. Words are discussed in depth in \ref{words}. + +When the parser reads a token, it attempts to look up a word named by that token. The +lookup is performed in the parser's current vocabulary set. By default, this set includes +two vocabularies: +\begin{verbatim} +syntax +scratchpad +\end{verbatim} +The \texttt{syntax} vocabulary consists of a set of parsing words for reading Factor data +and defining new words. The \texttt{scratchpad} vocabulary is the default vocabulary for new +word definitions. +\wordtable{ +\parsingword{USE:}{USE: \emph{vocabulary}}{syntax} +} +\newcommand{\useglos}{\glossary{ +name=search path, +description={the list of vocabularies that the parser looks up tokens in. You can add to this list with the \texttt{USE:} and \texttt{USING:} parsing words}}} +\useglos + +The \texttt{USE:} parsing word adds a new vocabulary at the front of the search path. Subsequent word lookups by the parser will search this vocabulary first. +\begin{alltt} +USE: lists +\end{alltt} +\wordtable{ +\parsingword{USING:}{USING: \emph{vocabularies} ;}{syntax} +} +Consecutive \texttt{USE:} declarations can be merged into a single \texttt{USING:} declaration. +\begin{alltt} +USING: lists strings vectors ; +\end{alltt} + +Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. 
For a way around this, see \ref{deferred}. + +\subsection{Numbers} + +\newcommand{\numberglos}{\glossary{ +name=number, +description={an instance of the \texttt{number} class}}} +\numberglos + +If a vocabulary lookup of a token fails, the parser attempts to parse it as a number. + +\subsubsection{Integers} + +\newcommand{\integerglos}{\glossary{ +name=integer, +description={an instance of the \texttt{integer} class, which is a disjoint union of the \texttt{fixnum} and \texttt{bignum} classes}}} +\integerglos + +\newcommand{\fixnumglos}{\glossary{ +name=fixnum, +description={an instance of the \texttt{fixnum} class, representing a fixed precision integer. On 32-bit systems, an element of the interval $[-2^{29},2^{29})$, and on 64-bit systems, the interval $[-2^{61},2^{61})$}}} +\fixnumglos + +\newcommand{\bignumglos}{\glossary{ +name=bignum, +description={an instance of the \texttt{bignum} class, representing an arbitrary-precision integer whose value is bounded by available object memory}}} +\bignumglos + +The printed representation of an integer consists of a sequence of digits, optionally prefixed by a sign. +\begin{alltt} +123456 +-10 +2432902008176640000 +\end{alltt} +Integers are entered in base 10 unless prefixed with a base change parsing word. +\wordtable{ +\parsingword{BIN:}{BIN: \emph{integer}}{syntax}\\ +\parsingword{OCT:}{OCT: \emph{integer}}{syntax}\\ +\parsingword{HEX:}{HEX: \emph{integer}}{syntax} +} +\begin{alltt} +\textbf{ok} BIN: 1110 BIN: 1 + . +\textbf{15} +\textbf{ok} HEX: deadbeef 2 * . +\textbf{7471857118} +\end{alltt} + +\subsubsection{Ratios} + +\newcommand{\ratioglos}{\glossary{ +name=ratio, +description={an instance of the \texttt{ratio} class, representing an exact ratio of two integers}}} +\ratioglos + +The printed representation of a ratio is a pair of integers separated by a slash (\texttt{/}). +No intermediate whitespace is permitted.
Either integer may be signed, however the ratio will be normalized into a form where the denominator is positive and the greatest common divisor +of the two terms is 1. +\begin{alltt} +75/33 +1/10 +-5/-6 +\end{alltt} + +\subsubsection{Floats} + +\newcommand{\floatglos}{\glossary{ +name=float, +description={an instance of the \texttt{float} class, representing an IEEE 754 double-precision floating point number}}} +\floatglos + +Floating point numbers contain an optional decimal part, an optional exponent, with +an optional sign prefix on either the significand or exponent. +\begin{alltt} +10.5 +-3.1456 +7e13 +1e-5 +\end{alltt} + +\subsubsection{Complex numbers} + +\newcommand{\complexglos}{\glossary{ +name=complex, +description={an instance of the \texttt{complex} class, representing a complex number with real and imaginary components, where both components are real numbers}}} +\complexglos +\wordtable{ +\parsingword{hash-curly}{\#\{ \emph{real} \emph{imaginary} \}\#}{syntax} +} +A complex number +is given by two components, a ``real'' part and an ``imaginary'' part. The components +must either be integers, ratios or floats. +\begin{verbatim} +#{ 1/2 1/3 }# ! the complex number 1/2+1/3i +#{ 0 1 }# ! the imaginary unit +\end{verbatim} + +\subsection{Literals} + +Many different types of objects can be constructed at parse time via literal syntax. Numbers are a special case since support for reading them is built into the parser. All other literals are constructed via parsing words. + +If a quotation contains a literal object, the same literal object instance is used each time the quotation executes; that is, literals are ``live''. + +\subsubsection{\label{boolean}Booleans} + +\newcommand{\boolglos}{ +\glossary{ +name=boolean, +description={an instance of the \texttt{boolean} class, either \texttt{f} or \texttt{t}. See generalized boolean}} +\glossary{ +name=generalized boolean, +description={an object used as a truth value.
The \texttt{f} object is false and anything else is true. See boolean}} +\glossary{ +name=t, +description={the canonical truth value. The \texttt{t} class, whose sole instance is the \texttt{t} object. Note that the \texttt{t} class is not equal to the \texttt{t} object}} +\glossary{ +name=f, +description={the canonical false value; anything else is true. The \texttt{f} class, whose sole instance is the \texttt{f} object. Note that the \texttt{f} class is not equal to the \texttt{f} object}} +} +\boolglos +Any Factor object may be used as a truth value in a conditional expression. The \texttt{f} object is false and anything else is true. The \texttt{f} object is also used to represent the empty list, as well as the concept of a missing value. The canonical truth value is the \texttt{t} object. +\wordtable{ +\parsingword{f}{f}{syntax}\\ +\parsingword{t}{t}{syntax} +} +Adds the \texttt{f} and \texttt{t} objects to the parse tree. + +Note that the \texttt{f} parsing word and class is not the same as the \texttt{f} object. The former can be obtained by writing \texttt{\bs~f} inside a quotation, or \texttt{POSTPONE: f} inside a list that will not be evaluated. +\begin{alltt} +\textbf{ok} f \bs f = . +\textbf{f} +\end{alltt} +An analogous distinction holds for the \texttt{t} class and object. + +\subsubsection{\label{syntax:char}Characters} + +\newcommand{\charglos}{\glossary{ +name=character, +description={an integer whose value denotes a Unicode code point}}} +\charglos +Factor has no distinct character type, however Unicode character value integers can be +read by specifying a literal character, or an escaped representation thereof. +\wordtable{ +\parsingword{CHAR:}{CHAR: \emph{token}}{syntax} +} +Adds the Unicode code point of the character represented by \emph{token} to the parse tree. + +\newcommand{\escapeglos}{\glossary{ +name=escape, +description={a sequence allowing a non-literal character to be inserted in a string. 
For a list of escapes, see \ref{escape}}}} +\escapeglos +If the token is a single-character string other than whitespace or backslash, the character is taken to be this token. If the token begins with a backslash, it denotes one of the following escape codes. +\begin{table}[h] +\caption{Special character escape codes} +\label{escape} +\begin{tabular}{l|l} +Escape code&Character\\ +\hline +\texttt{\bs{}\bs}&Backslash (\texttt{\bs})\\ +\texttt{\bs{}s}&Space\\ +\texttt{\bs{}t}&Tab\\ +\texttt{\bs{}n}&Newline\\ +\texttt{\bs{}r}&Carriage return\\ +\texttt{\bs{}0}&Null byte (ASCII 0)\\ +\texttt{\bs{}e}&Escape (ASCII 27)\\ +\texttt{\bs{}"}&Double quote (\texttt{"})\\ +\end{tabular} +\end{table} +Examples: +\begin{alltt} +\textbf{ok} CHAR: a . +\textbf{97} +\textbf{ok} CHAR: \bs{}0 . +\textbf{0} +\textbf{ok} CHAR: \bs{}n . +\textbf{10} +\end{alltt} +A Unicode character can be specified by its code number by writing \texttt{\bs{}u} followed by a four-digit hexadecimal number. That is, the following two expressions are equivalent: +\begin{alltt} +CHAR: \bs{}u0078 +120 +\end{alltt} +While not useful for single characters, this syntax is also permitted inside strings. + +\subsubsection{Strings} + +\newcommand{\stringglos}{\glossary{ +name=string, +description={an instance of the \texttt{string} class, representing an immutable sequence of characters}}} +\stringglos +\wordtable{ +\parsingword{"}{"\emph{string}"}{syntax} +} +Reads from the input string until the next occurrence of +\texttt{"}, and appends the resulting string to the parse tree. String literals cannot span multiple lines. +Strings containing +the \texttt{"} character and various other special characters can be read by +inserting escape sequences as described in \ref{syntax:char}.
+\begin{alltt} +\textbf{ok} "Hello world" print +\textbf{Hello world} +\end{alltt} + +\subsubsection{Lists} +\newcommand{\listglos}{\glossary{ +name=list, +description={an instance of the \texttt{list} class, storing a sequence of elements as a chain of zero or more conses, where the car of each cons is an element, and the cdr is either \texttt{f} or another list}} +\glossary{name=proper list, description=see list}} +\listglos +\wordtable{ +\parsingword{openbracket}{[}{syntax}\\ +\parsingword{closebracket}{]}{syntax} +} +Parses a list, whose elements are read between \texttt{[} and \texttt{]} and can include other lists. +\begin{verbatim} +[ + "404" "responder" set + [ drop no-such-responder ] "get" set +] +\end{verbatim} +\newcommand{\consglos}{\glossary{ +name=cons, +description={an instance of the \texttt{cons} class, storing an ordered pair of objects referred to as the car and the cdr}}} +\consglos +\wordtable{ +\parsingword{conssyntax}{[[ \emph{car} \emph{cdr} ]]}{syntax} +} +Parses two components making up a cons cell. Note that the lists parsed with \texttt{[} and \texttt{]} are just a special case of \texttt{[[} and \texttt{]]}. The following two lines are equivalent. +\begin{alltt} +[ 1 2 3 ] +[[ 1 [[ 2 [[ 3 f ]] ]] ]] +\end{alltt} +The empty list is denoted by \texttt{f}, along with boolean falsity, and the concept of a missing value. The expression \texttt{[ ]} parses to the same object as \texttt{f}. + +\subsubsection{Words} + +While words parse as themselves, a word occurring inside a quotation is executed when the quotation is called. Sometimes it is desirable to have a word be pushed on the data stack during the execution of a quotation, usually for reflective access to the word's slots. +\wordtable{ +\parsingword{bs}{\bs~\emph{word}}{syntax} +} +Reads the next word from the input string and appends some \emph{code} to the parse tree that pushes the word on the stack when the code is called. 
The following two lines are equivalent: +\begin{verbatim} +\ length +[ length ] car +\end{verbatim} +\wordtable{ +\parsingword{POSTPONE:}{POSTPONE: \emph{word}}{syntax} +} +Reads the next word from the input string and appends the word to the parse tree, even if it is a parsing word. For an ordinary word \texttt{foo}, \texttt{POSTPONE: foo} and \texttt{foo} are equivalent; however, if \texttt{foo} is a parsing word, the latter will execute it at parse time, while the former will execute it at runtime. Usually used inside parsing words that wish to delegate some action to a further parsing word. +\begin{alltt} +\textbf{ok} : parsing1 + "Parsing 1" print 2 swons ; parsing +\textbf{ok} : parsing2 + "Parsing 2" print POSTPONE: parsing1 ; parsing +\textbf{ok} [ 1 parsing1 3 ] . +\textbf{Parsing 1} +\textbf{[ 1 2 3 ]} +\textbf{ok} [ 0 parsing2 2 4 ] . +\textbf{Parsing 2} +\textbf{Parsing 1} +\textbf{[ 0 2 2 4 ]} +\end{alltt} + +\subsubsection{Mutable literals} + +\newcommand{\mutableglos}{\glossary{ +name=mutable object, +description={an object whose slot values can be modified, or by transitivity, an object that refers to a mutable object via one of its slots}}} +\mutableglos + +Using mutable object literals in word definitions requires care, since if those objects +are mutated, the actual word definition will be changed, which is in most cases not what you would expect. Strings and lists are immutable; string buffers, vectors, hashtables and tuples are mutable. + +\subsubsection{String buffers} + +\newcommand{\sbufglos}{\glossary{ +name=string buffer, +description={an instance of the \texttt{sbuf} class, representing a mutable and growable sequence of characters}} +\glossary{name=sbuf, description=see string buffer}} +\sbufglos +\wordtable{ +\parsingword{SBUF}{SBUF" \emph{text}"}{syntax} +} +Reads from the input string until the next occurrence of +\texttt{"}, converts the string to a string buffer, and appends it to the parse tree.
+As with strings, the escape codes described in \ref{syntax:char} are permitted. +\begin{alltt} +\textbf{ok} SBUF" Hello world" sbuf>string print +\textbf{Hello world} +\end{alltt} + +\subsubsection{Vectors} +\newcommand{\vectorglos}{\glossary{ +name=vector, +description={an instance of the \texttt{vector} class, storing a mutable and growable sequence of elements in contiguous memory cells}}} +\vectorglos +\wordtable{ +\parsingword{opencurly}{\{}{syntax}\\ +\parsingword{closecurly}{\}}{syntax} +} +Parses a vector, whose elements are read between \texttt{\{} and \texttt{\}}. +\begin{verbatim} +{ 3 "blind" "mice" } +\end{verbatim} + +\subsubsection{Hashtables} +\newcommand{\hashglos}{\glossary{ +name=hashtable, +description={an instance of the \texttt{hashtable} class, providing a mutable mapping of keys to values}}} +\hashglos +\wordtable{ +\parsingword{openccurly}{\{\{}{syntax}\\ +\parsingword{closeccurly}{\}\}}{syntax} +} +Parses a hashtable. Elements between \texttt{\{\{} and \texttt{\}\}} must be cons cells, where the car is the key and the cdr is a value. +\begin{verbatim} +{{ + [[ "red" [ 255 0 0 ] ]] + [[ "green" [ 0 255 0 ] ]] + [[ "blue" [ 0 0 255 ] ]] +}} +\end{verbatim} + +\subsubsection{Tuples} +\newcommand{\tupleglos}{\glossary{ +name=tuple, +description={an instance of a user-defined class whose metaclass is the \texttt{tuple} metaclass, storing a fixed set of elements in named slots, with optional delegation method dispatch semantics}}} +\tupleglos +\wordtable{ +\parsingword{<<}{<<}{syntax}\\ +\parsingword{>>}{>>}{syntax} +} +Parses a tuple. The tuple's class must follow \texttt{<<}. The element after that is always the tuple's delegate. Further elements until \texttt{>>} are specified according to the tuple's slot definition, and an error is raised if an incorrect number of elements is given. 
 +\begin{verbatim} +<< color f 255 0 0 >> +\end{verbatim} + +\subsection{\label{comments}Comments} + +\wordtable{ +\parsingword{!}{!~\emph{remainder of line}}{syntax} +} +The remainder of the input line is ignored if an exclamation mark (\texttt{!}) is read. +\begin{alltt} +! Note that the sequence union does not include lists, +! or user defined tuples that respond to the sequence +! protocol. +\end{alltt} +\wordtable{ +\parsingword{hash!}{\#!~\emph{remainder of line}}{syntax} +} +\newcommand{\doccommentglos}{\glossary{ +name=documentation comment, +description={a comment describing the usage of a word. Delimited by the \texttt{\#"!} parsing word, they appear at the start of a word definition and are stored in the \texttt{""documentation""} word property}}} +\doccommentglos +Comments that begin with \texttt{\#!} are called \emph{documentation comments}. +A documentation comment has no effect on the generated parse tree, but if it is the first thing inside a word definition, the comment text is appended to the string stored in the word's \texttt{"documentation"} property. Word properties are described in \ref{word-props}. +\wordtable{ +\parsingword{(}{( \emph{stack effect} )}{syntax} +} +\glossary{ +name=stack effect, +description={A string of the form \texttt{( \emph{inputs} -- \emph{outputs} )}, where the inputs and outputs are a whitespace-separated list of names or types. The top of the stack is the right-most token on both sides.}} +\newcommand{\stackcommentglos}{\glossary{ +name=stack effect comment, +description={a comment describing the inputs and outputs of a word. Delimited by \texttt{(} and \texttt{)}, they appear at the start of a word definition and are stored in the \texttt{""stack-effect""} word property}}} +\stackcommentglos +Comments delimited by \texttt{(} and \texttt{)} are called \emph{stack effect comments}.
By convention they are placed at the beginning of a word definition to document the word's inputs and outputs: +\begin{verbatim} +: push ( element sequence -- ) + #! Push a value on the end of a sequence. + dup length swap set-nth ; +\end{verbatim} +A stack effect comment has no effect on the generated parse tree, but if it is the first thing inside a word definition, the word's \texttt{"stack-effect"} property is set to the comment text. Word properties are described in \ref{word-props}. + +\section{Data and control flow} + +\subsection{Shuffle words} + +\newcommand{\dsglos}{\glossary{ +name=stack, +description=see data stack} +\glossary{ +name=data stack, +description={the primary means of passing values between words}}} +\dsglos +Shuffle words are placed between other words to rearrange items on the stack +into the order that the next word in the quotation expects. Their behavior can be understood entirely in terms of their stack effects. +\wordtable{ +\ordinaryword{drop}{drop ( x -- )}{kernel}\\ +\ordinaryword{2drop}{2drop ( x y -- )}{kernel}\\ +\ordinaryword{3drop}{3drop ( x y z -- )}{kernel}\\ +\ordinaryword{nip}{nip ( x y -- y )}{kernel}\\ +\ordinaryword{2nip}{2nip ( x y z -- z )}{kernel}\\ +\ordinaryword{dup}{dup ( x -- x x )}{kernel}\\ +\ordinaryword{2dup}{2dup ( x y -- x y x y )}{kernel}\\ +\ordinaryword{3dup}{3dup ( x y z -- x y z x y z )}{kernel}\\ +\ordinaryword{dupd}{dupd ( x y -- x x y )}{kernel}\\ +\ordinaryword{over}{over ( x y -- x y x )}{kernel}\\ +\ordinaryword{pick}{pick ( x y z -- x y z x )}{kernel}\\ +\ordinaryword{tuck}{tuck ( x y -- y x y )}{kernel}\\ +\ordinaryword{swap}{swap ( x y -- y x )}{kernel}\\ +\ordinaryword{2swap}{2swap ( x y z t -- z t x y )}{kernel}\\ +\ordinaryword{swapd}{swapd ( x y z -- y x z )}{kernel}\\ +\ordinaryword{rot}{rot ( x y z -- y z x )}{kernel}\\ +\ordinaryword{-rot}{-rot ( x y z -- z x y )}{kernel} +} +Try to avoid the complex shuffle words such as \texttt{rot} and \texttt{2dup} as much as possible, for they make
data flow harder to understand. If you find yourself using too many shuffle words, or you're writing +a stack effect comment in the middle of a compound definition to keep track of stack contents, it is +a good sign that the word should probably be factored into two or +more smaller words. + +\subsection{\label{quotations}Quotations} + +\newcommand{\csglos}{\glossary{ +name=return stack, +description=see call stack} +\glossary{ +name=call stack, +description={holds quotations waiting to be called. When a quotation is called with \texttt{call}, or when a compound word is executed, the previous call frame is pushed on the call stack, and the new quotation becomes the current call frame}}} +\csglos +\newcommand{\cfglos}{\glossary{ +name=call frame, +description=the currently executing quotation}} +\cfglos +\glossary{ +name=interpreter, +description=executes quotations by iterating them and recursing into nested definitions. see compiler} + +The Factor interpreter executes quotations. Quotations are lists, and since lists can contain any Factor object, they can contain words. It is words that give quotations their operational behavior, as you can see in the following description of the interpreter algorithm. + +\begin{itemize} +\item If the call frame is \texttt{f}, the call stack is popped and becomes the new call frame. +\item If the car of the call frame is a word, the word is executed: +\begin{itemize} +\item If the word is a symbol, it is pushed on the data stack. See \ref{symbols}. +\item If the word is a compound definition, the current call frame is pushed on the call stack, and the new call frame becomes the word definition. See \ref{colondefs}. +\item If the word is compiled or primitive, the interpreter jumps to a machine code definition. See \ref{primitives}. +\item If the word is undefined, an error is raised. See \ref{deferred}. +\end{itemize} +\item Otherwise, the car of the call frame is pushed on the data stack. 
 +\item The call frame is set to the cdr, and the loop continues. +\end{itemize} + +The interpreter can be invoked reflectively with the following pair of words. +\wordtable{ +\ordinaryword{call}{call ( quot -- )}{kernel} +} +Pushes the current call frame on the call stack, and sets the call frame to the given quotation. Conceptually, this calls the quotation as if its definition were substituted at the location of the \texttt{call}. +\begin{alltt} +\textbf{ok} [ 2 2 + 3 * ] call . +\textbf{12} +\end{alltt} +\wordtable{ +\ordinaryword{execute}{execute ( word -- )}{kernel} +} +Executes a word, taking action based on the word definition, as above. +\begin{alltt} +\textbf{ok} : hello "Hello world" print ; +\textbf{ok} : twice dup execute execute ; +\textbf{ok} \bs hello twice +\textbf{Hello world} +\textbf{Hello world} +\end{alltt} + +\subsubsection{Tail call optimization} + +\newcommand{\tailglos}{\glossary{ +name=tail call, +description=the last call in a quotation} +\glossary{ +name=tail call optimization, +description=the elimination of call stack pushes when making a tail call}} + +When a call is made to a quotation from the last word in the call frame, there is no +purpose in pushing the empty call frame on the call stack. Therefore the last call in a quotation does not grow the call stack, and tail recursion executes in bounded space. + +\subsubsection{Call stack manipulation} + +Because of the way the interpreter is described in \ref{quotations}, the top of the call stack is not accessed during the execution of a quotation; it is only popped when the interpreter reaches the end of the quotation. In effect, the call stack can be used as a temporary storage area, as long as pushes and pops are balanced out within a single quotation. +\wordtable{ +\ordinaryword{>r}{>r ( x -- r:x )}{kernel} +} +Moves the top of the data stack to the call stack. +\wordtable{ +\ordinaryword{r>}{r> ( r:x -- x )}{kernel} +} +Moves the top of the call stack to the data stack.
+ +The top of the data stack is ``hidden'' between \texttt{>r} and \texttt{r>}. +\begin{alltt} +\textbf{ok} 1 2 3 >r .s r> +\textbf{2 +1} +\end{alltt} +It is very important to balance usages of \texttt{>r} and \texttt{r>} within a single quotation or word definition. +\begin{verbatim} +: the-good >r 2 + r> * ; ! Okay +: the-bad >r 2 + ; ! Runtime error +: the-ugly r> ; ! Runtime error +\end{verbatim} +Basically, the rule is you must leave the call stack in the same state as you found it, so that when the current quotation finishes executing, the interpreter can return to the caller. + +One exception is that when \texttt{ifte} occurs as the last word in a definition, values may be pushed on the call stack before the condition value is computed, as long as both branches of the \texttt{ifte} pop the values off the call stack before returning. +\begin{verbatim} +: foo ( m ? n -- m+n/n ) + >r [ r> + ] [ drop r> ] ifte ; ! Okay +\end{verbatim} + +\subsubsection{Quotation variants} + +There are three words that combine shuffle words with \texttt{call}. They are useful in the implementation of higher-order words taking quotations as inputs. +\wordtable{ +\ordinaryword{slip}{slip ( quot x -- x | quot: -- )}{kernel} +} +Call a quotation, while hiding the top of the stack. The implementation is as you would expect. +\begin{verbatim} +: slip ( quot x -- x | quot: -- ) + >r call r> ; inline +\end{verbatim} +\wordtable{ +\ordinaryword{keep}{keep ( x quot -- x | quot:~x -- )}{kernel} +} +Call a quotation with a value on the stack, restoring the value when the quotation returns. +\begin{verbatim} +: keep ( x quot -- x | quot: x -- ) + over >r call r> ; inline +\end{verbatim} +\wordtable{ +\ordinaryword{2keep}{2keep ( x y q -- x y | q:~x y -- )}{kernel} +} +Call a quotation with a pair of values on the stack, restoring the values when the quotation returns. 
+\begin{verbatim} +: 2keep ( x y quot -- x y | quot: x y -- ) + over >r pick >r call r> r> ; inline +\end{verbatim} + +\subsection{Conditionals} + +The simplest style of a conditional form is the \texttt{ifte} word. +\wordtable{ +\ordinaryword{ifte}{ifte ( cond true false -- )}{kernel} +} +The \texttt{cond} is a generalized boolean. If it is \texttt{f}, the \texttt{false} quotation is called, and if \texttt{cond} is any other value, the \texttt{true} quotation is called. The condition flag is removed from the stack before either quotation executes. + +Note that in general, both branches should have the same stack effect. Not only is this good style that makes the word easier to understand, but also unbalanced conditionals cannot be compiled. +\wordtable{ +\ordinaryword{when}{when ( cond true -- | true:~-- )}{kernel}\\ +\ordinaryword{unless}{unless ( cond false -- | false:~-- )}{kernel} +} +This pair are minor variations on \texttt{ifte} where only one branch is specified. The other is implicitly \texttt{[ ]}. They are implemented in the trivial way: +\begin{verbatim} +: when [ ] ifte ; inline +: unless [ ] swap ifte ; inline +\end{verbatim} +The \texttt{ifte} word removes the condition flag from the stack before calling either quotation. Sometimes this is not desirable, if the condition flag is serving a dual purpose as a value to be consumed by the \texttt{true} quotation. The \texttt{ifte*} word exists for this purpose. +\wordtable{ +\ordinaryword{ifte*}{ifte*~( cond true false -- )}{kernel}\\ +\texttt{true:~cond --}\\ +\texttt{false:~--} +} +If the condition is true, it is retained on the stack before the \texttt{true} quotation is called. Otherwise, the condition is removed from the stack and the \texttt{false} quotation is called. 
The following two lines are equivalent: +\begin{verbatim} +X [ Y ] [ Z ] ifte* +X dup [ Y ] [ drop Z ] ifte +\end{verbatim} +\wordtable{ +\ordinaryword{when*}{when*~( cond true -- | true:~cond -- )}{kernel}\\ +\ordinaryword{unless*}{unless*~( cond false -- | false:~-- )}{kernel} +} +These are variations of \texttt{ifte*} where one of the quotations is \texttt{[ ]}. + +There is one final conditional form that is used to implement the ``default value'' idiom. +\wordtable{ +\ordinaryword{?ifte}{?ifte ( default cond true false -- )}{kernel}\\ +\texttt{true:~cond --}\\ +\texttt{false:~default --} +} +If the condition is \texttt{f}, the \texttt{false} quotation is called with the \texttt{default} value on the stack. Otherwise, the \texttt{true} quotation is called with the condition on the stack. The following two lines are equivalent: +\begin{verbatim} +X [ Y ] [ Z ] ?ifte +X dup [ nip Y ] [ drop Z ] ifte +\end{verbatim} + +\subsubsection{Boolean logic} + +The \texttt{?}~word chooses between two values, rather than two quotations. +\wordtable{ +\ordinaryword{?}{?~( cond true false -- true/false )}{kernel} +} +It is implemented in the obvious way. +\begin{verbatim} +: ? ( cond t f -- t/f ) + rot [ drop ] [ nip ] ifte ; inline +\end{verbatim} +Several words use \texttt{?}~to implement typical boolean algebraic operations. +\wordtable{ +\ordinaryword{>boolean}{>boolean ( obj -- t/f )}{kernel} +} +Convert a generalized boolean into a boolean. That is, \texttt{f} retains its value, whereas anything else becomes \texttt{t}. +\wordtable{ +\ordinaryword{not}{not ( ?~-- ?~)}{kernel} +} +Given \texttt{f}, outputs \texttt{t}, and on any other input, outputs \texttt{f}. +\wordtable{ +\ordinaryword{and}{and ( ?~?~-- ?~)}{kernel} +} +Outputs \texttt{t} if both of the inputs are true. +\wordtable{ +\ordinaryword{or}{or ( ?~?~-- ?~)}{kernel} +} +Outputs \texttt{t} if at least one of the inputs is true. 
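+
+As an illustration of the words described so far, the following listener session is a small sketch that uses only literals and words documented above. The inputs are chosen so that the printed values follow directly from the description of each word.
+\begin{alltt}
+\textbf{ok} t "yes" "no" ? print
+\textbf{yes}
+\textbf{ok} 3 >boolean .
+\textbf{t}
+\textbf{ok} f not .
+\textbf{t}
+\textbf{ok} t f and .
+\textbf{f}
+\textbf{ok} f t or .
+\textbf{t}
+\end{alltt}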
+\wordtable{ +\ordinaryword{xor}{xor ( ?~?~-- ?~)}{kernel} +} +Outputs \texttt{t} if exactly one of the inputs is true. + +An alternative set of logical operations operate on individual bits of integers bitwise, rather than generalized boolean truth values. They are documented in \ref{bitwise}. + +\subsection{Continuations} + +\newcommand{\contglos}{ +\glossary{name=continuation, +description=an object representing the future of the computation}} +\contglos +At any point in the execution of a Factor program, the \emph{current continuation} represents the future of the computation. This object can be captured with the \texttt{callcc0} and \texttt{callcc1} words. +\wordtable{ +\ordinaryword{callcc0}{callcc0 ( quot -- )}{kernel}\\ +\texttt{quot:~cont --}\\ +\texttt{cont:~--}\\ +\ordinaryword{callcc1}{callcc1 ( quot -- )}{kernel}\\ +\texttt{quot:~cont --}\\ +\texttt{cont:~obj --} +} +Calling one of these words calls the given quotation with the continuation on the stack. The continuation is itself a quotation, and calling it \emph{continues execution} at the point after the call to \texttt{callcc0} and \texttt{callcc1}. Essentially, a continuation is a snapshot of the four stacks that can be restored at a later time. + +The difference between \texttt{callcc0} and \texttt{callcc1} lies in the continuation object. When \texttt{callcc1} is used, calling the continuation takes one value from the top of the data stack, and places it back on the \emph{restored} data stack. This allows idioms such as exception handling, co-routines and generators to be implemented via continuations. + +\subsubsection{\label{exceptions}Handling exceptional situations} +\glossary{name=exception, +description=an object representing an exceptional situation that has been detected} + +Support for handling exceptional situations such as bad user input, implementation bugs, and input/output errors is provided by a pair of words, \texttt{throw} and \texttt{catch}. 
 +\wordtable{ +\ordinaryword{throw}{throw ( exception -- )}{errors} +} +Raises an exception. Execution does not continue at the point after the \texttt{throw} call. Rather, the innermost catch block is invoked, and execution continues at that point. Passing \texttt{f} as an exception will cause \texttt{throw} to do nothing. +\wordtable{ +\ordinaryword{catch}{catch ( try handler -- )}{errors}\\ +\texttt{handler:~exception/f -- } +} +An exception handler is established, and the \texttt{try} quotation is called. + +If the \texttt{try} quotation throws an error and no nested \texttt{catch} is established, the following sequence of events takes place: +\begin{itemize} +\item the stacks are restored to their state prior to the \texttt{catch} call, +\item the exception is pushed on the data stack, +\item the \texttt{handler} quotation is called. +\end{itemize} +If the \texttt{try} quotation completes successfully, the stacks are \emph{not} restored. The \texttt{f} object is pushed, and the \texttt{handler} quotation is called. + +A common idiom is that the \texttt{catch} block cleans up from the error in some fashion, then passes it on to the next-innermost catch block. The following word is used for this purpose. +\wordtable{ +\ordinaryword{rethrow}{rethrow ( exception -- )}{errors} +} +Raises an exception, without saving the current stacks for post-mortem inspection. This is done so that inspecting the error stacks sheds light on the original cause of the exception, rather than the point where it was rethrown. + +Here is a simple example: a word that attempts to convert a string representing a hexadecimal number into an integer and, instead of halting execution when the string is not valid, simply outputs \texttt{f}. +\begin{verbatim} +: catch-hex> ( str -- n/f ) + [ hex> ] [ [ drop f ] when ] catch ; +\end{verbatim} +Exception handling is implemented using a \emph{catch stack}.
The \texttt{catch} word pushes the current continuation on the catch stack, and \texttt{throw} calls the continuation at the top of the catch stack with the raised exception. +\glossary{name=catch stack, +description={a stack of exception handler continuations, pushed and popped by \texttt{catch}}} + +\subsubsection{Multitasking} + +Factor implements co-operative multitasking, where the thread of control switches between tasks at explicit calls to \texttt{yield}, as well as when blocking I/O is performed. Multitasking is implemented via continuations. +\wordtable{ +\ordinaryword{in-thread}{in-thread ( quot -- )}{threads} +} +Calls \texttt{quot} in a co-operative thread. The new thread begins executing immediately, and the current thread resumes when the quotation yields, either from blocking +I/O or an explicit call to \texttt{yield}. This is implemented by adding the current continuation to the run queue, then calling \texttt{quot}, and finally executing \texttt{stop} after \texttt{quot} returns. +\wordtable{ +\ordinaryword{yield}{yield ( -- )}{threads} +} +Add the current continuation to the end of the run queue, and call the continuation at the front of the run queue. +\wordtable{ +\ordinaryword{stop}{stop ( -- )}{threads} +} +Call the continuation at the front of run queue, without saving the current continuation. In effect, this stops the current thread. + +\subsubsection{Interpreter state} + +The current state of the interpreter is determined by the contents of the four stacks. A set of words for getting and setting stack contents are the primitive building blocks for continuations, and in turn abstractions such as exception handling and multitasking. +\wordtable{ +\ordinaryword{datastack}{datastack ( -- vector )}{kernel}\\ +\ordinaryword{set-datastack}{set-datastack ( vector -- )}{kernel} +} +Save and restore the data stack contents. 
As an example, here is a word that executes a quotation and restores the data stack to its previous state; +\begin{verbatim} +: keep-datastack + ( quot -- ) datastack slip set-datastack drop ; +\end{verbatim} +Note that the \texttt{drop} call is made to remove the original quotation from the stack. +\wordtable{ +\ordinaryword{callstack}{callstack ( -- vector )}{kernel}\\ +\ordinaryword{set-callstack}{set-callstack ( vector -- )}{kernel} +} +Save and restore the call stack contents. The call stack does not include the currently executing quotation that made the call to \texttt{callstack}, since the current quotation is held in the call frame -- \ref{quotations} has details. Similarly, calling \texttt{set-callstack} will continue executing the current quotation until it returns, at which point control transfers to the quotation at the top of the new call stack. +\wordtable{ +\ordinaryword{namestack}{namestack ( -- list )}{namespaces}\\ +\ordinaryword{set-namestack}{set-namestack ( list -- )}{namespaces} +} +Save and restore the name stack, used for dynamic variable bindings. See \ref{namespaces}. +\wordtable{ +\ordinaryword{catchstack}{catchstack ( -- list )}{errors}\\ +\ordinaryword{set-catchstack}{set-catchstack ( list -- )}{errors} +} +Save and restore the catch stack, used for exception handling. See \ref{exceptions}. + +\section{\label{words}Words} + +\wordglos +\vocabglos +\glossary{name=defining word, +description=a word that adds definitions to the dictionary} +\glossary{name=dictionary, +description=the collection of vocabularies making up the code in the Factor image} +Words are the fundamental unit of code in Factor, analogous to functions or procedures in other languages. Words are also objects, and this concept forms the basis for Factor's meta-programming facilities. 
Words hold two distinct pieces of information: +\begin{itemize} +\item The word definition that determines what action is taken when the word is executed, +\item A set of word properties, including the name of the word, the vocabulary it belongs to, documentation strings, and other meta-data. +\end{itemize} +\wordtable{ +\ordinaryword{word?}{word?~( object -- ?~)}{words} +} +Tests if the \texttt{object} is a word. +\wordtable{ +\classword{word}{words} +} +The class of words. + +\subsection{Vocabularies} +\wordtable{ +\symbolword{vocabularies}{words} +} +Words are organized into named vocabularies, stored in the global \texttt{vocabularies} variable. +\wordtable{ +\parsingword{IN:}{IN:~\emph{vocabulary}}{syntax} +} +Sets the current vocabulary for new word definitions, and adds the vocabulary to the search path (\ref{vocabsearch}). + +Parsing words add definitions to the current vocabulary. When a source file is being parsed, the current vocabulary is initially set to \texttt{scratchpad}. + +\subsubsection{Searching for words} + +Words whose names are known at parse time -- that is, most words making up your program -- can be referenced by stating their name. However, the parser itself, and sometimes code you write, will need to look up words dynamically. +\wordtable{ +\ordinaryword{search}{search ( name vocabs -- word )}{words} +} +The \texttt{vocabs} parameter is a list of vocabulary names. If a word with the given name is found, it is pushed on the stack; otherwise, \texttt{f} is pushed. + +\subsubsection{Creating words} + +\wordtable{ +\ordinaryword{create}{create ( name vocabulary -- word )}{words} +} +Creates a new word \texttt{name} in \texttt{vocabulary}. If the vocabulary already contains a word with this name, the existing word is returned. +\wordtable{ +\ordinaryword{create-in}{create-in ( name -- word )}{words} +} +Creates a new word \texttt{name} in the current vocabulary.
Should only be called from parsing words (\ref{parsing-words}), and in fact is defined as: +\begin{verbatim} +: create-in ( name -- word ) "in" get create ; +\end{verbatim} + +\subsection{Word definition} + +There are two ways to create a word definition: +\begin{itemize} +\item Using parsing words at parse time, +\item Using defining words at run-time. This is a more dynamic feature that can be used to implement code generation and such, and in fact parse-time defining words are implemented in terms of run-time defining words. +\end{itemize} + +\subsubsection{\label{colondefs}Compound definitions} + +\newcommand{\colonglos}{\glossary{ +name=compound definition, +description=a word defined to execute a quotation consisting of existing words} +\glossary{ +name=colon definition, +description=see compound definition}} +\colonglos +A compound definition associates a word name with a quotation that is called when the word is executed. +\wordtable{ +\parsingword{:}{:~\emph{name} \emph{definition} ;}{syntax} +} +A word \texttt{name} is created in the current vocabulary, and is associated with \texttt{definition}. +\begin{verbatim} +: ask-name ( -- name ) + "What is your name? " write read-line ; +: greet ( name -- ) + "Greetings, " write print ; +: friend ( -- ) + ask-name greet ; +\end{verbatim} +By convention, the word name should be followed by a stack effect comment, and for more complex definitions, a documentation comment; see \ref{comments}. +\wordtable{ +\ordinaryword{define-compound}{define-compound ( word quotation -- )}{words} +} +Defines \texttt{word} to call the \texttt{quotation} when executed. +\wordtable{ +\ordinaryword{compound?}{compound?~( object -- ?~)}{words} +} +Tests if the \texttt{object} is a compound word definition. +\wordtable{ +\classword{compound}{words} +} +The class that all compound words are an instance of. 
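The run-time route can be sketched using only the words described above. This is a minimal illustration, not library code: the word name \texttt{greet-twice} is hypothetical, \texttt{greet} is the example word defined earlier in this section, and the sketch assumes that the \texttt{scratchpad} vocabulary is in the search path.
\begin{verbatim}
! Create (or look up) the word, then give it a compound definition.
"greet-twice" "scratchpad" create
[ dup greet greet ] define-compound
\end{verbatim}
Under those assumptions, this should have the same effect as entering \texttt{:~greet-twice dup greet greet ;} while \texttt{scratchpad} is the current vocabulary.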
+ +\subsubsection{\label{symbols}Symbols} + +\newcommand{\symbolglos}{\glossary{ +name=symbol, +description={a word defined to push itself on the stack when executed, created by the \texttt{SYMBOL:}~parsing word}}} +\symbolglos +\wordtable{ +\parsingword{SYMBOL:}{SYMBOL:~\emph{name}}{syntax} +} +A word \texttt{name} is created in the current vocabulary that pushes itself on the stack when executed. Symbols are used to identify variables (\ref{namespaces}) as well as for storing arbitrary data in their word properties (\ref{word-props}). +\wordtable{ +\ordinaryword{define-symbol}{define-symbol ( word -- )}{words} +} +Defines \texttt{word} to push itself on the data stack when executed. +\wordtable{ +\ordinaryword{symbol?}{symbol?~( object -- ?~)}{words} +} +Tests if the \texttt{object} is a symbol. +\wordtable{ +\classword{symbol}{words} +} +The class that all symbols are an instance of. + +\subsubsection{\label{primitives}Primitives} +\newcommand{\primglos}{\glossary{ +name=primitive, +description=a word implemented as native code in the Factor runtime}} +\primglos + +Executing a primitive invokes native code in the Factor runtime. Primitives cannot be defined through Factor code. Compiled definitions behave similarly to primitives in that the interpreter jumps to native code upon encountering them. +\wordtable{ +\ordinaryword{primitive?}{primitive?~( object -- ?~)}{words} +} +Tests if the \texttt{object} is a primitive. +\wordtable{ +\classword{primitive}{words} +} +The class that all primitives are an instance of. + +\subsubsection{\label{deferred}Deferred words and mutual recursion} + +\glossary{ +name=deferred word, +description={a word without a definition, created by the \texttt{DEFER:}~parsing word}} +Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion.
Mutually-recursive pairs of words can be implemented by \emph{deferring} one of the words in the pair so that the second word in the pair can parse, then by replacing the deferred definition with a real one. +A demonstration of the idiom: +\begin{verbatim} +DEFER: foe +: fie ... foe ... ; +: foe ... fie ... ; +\end{verbatim} +\wordtable{ +\parsingword{DEFER:}{DEFER:~\emph{name}}{syntax} +} +Create a word \texttt{name} in the current vocabulary that simply raises an error when executed. Usually, the word will be replaced with a real definition later. +\wordtable{ +\ordinaryword{undefined?}{undefined?~( object -- ?~)}{words} +} +Tests if the \texttt{object} is an undefined (deferred) word. +\wordtable{ +\classword{undefined}{words} +} +The class that all undefined words are an instance of. + +\subsubsection{Undefining words} + +\wordtable{ +\parsingword{FORGET:}{FORGET:~\emph{name}}{syntax} +} +Removes the word \texttt{name} from its vocabulary. Existing definitions that reference the word will continue to work, but newly-parsed occurrences of the word will not locate the forgotten definition. No exception is thrown if no such word exists. +\wordtable{ +\ordinaryword{forget}{forget ( word -- )}{words} +} +Removes the word from its vocabulary. The parsing word \texttt{FORGET:} is implemented using this word. + +\subsection{\label{word-props}Word properties} + +\glossary{name=word property, +description={a name/value pair stored in a word's properties}} +\glossary{name=word properties, +description={a hashtable associated with each word storing various sundry properties}} + +Each word has an associated hashtable of properties. A common idiom in the Factor library is to use symbols for their properties. Conventionally, the property names are strings, but nothing requires that this be so. 
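As a small sketch of this idiom, the following attaches a property to a symbol and reads it back, using the \texttt{set-word-prop} and \texttt{word-prop} words described next. The symbol \texttt{temperature} and the property name \texttt{"unit"} are made up for illustration.
\begin{verbatim}
SYMBOL: temperature
! set-word-prop ( value word name -- )
"celsius" temperature "unit" set-word-prop
! word-prop ( word name -- value )
temperature "unit" word-prop print
\end{verbatim}
Because executing a symbol pushes the symbol itself, \texttt{temperature} supplies the \texttt{word} argument directly; the last line should print the stored string.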
+ +\wordtable{ +\ordinaryword{word-prop}{word-prop ( word name -- value )}{words}\\ +\ordinaryword{set-word-prop}{set-word-prop ( value word name -- )}{words} +} +Retrieve and store word properties. Note that the stack effect is designed so that it is most convenient when \texttt{name} is a literal that is pushed on the stack right before executing these words. This is usually the case. + +\wordtable{ +\ordinaryword{word-name}{word-name ( word -- name )}{words}\\ +\ordinaryword{word-vocabulary}{word-vocabulary ( word -- vocabulary )}{words} +} +Retrieve the name of a word, and the name of the vocabulary it is stored in. The definitions are trivial: +\begin{verbatim} +: word-name "name" word-prop ; +: word-vocabulary "vocabulary" word-prop ; +\end{verbatim} + +\wordtable{ +\ordinaryword{word-sort}{word-sort ( list -- list )}{words} +} +Sort a list of words by name. + +\wordtable{ +\ordinaryword{word-props}{word-props ( word -- hashtable )}{words}\\ +\ordinaryword{set-word-props}{set-word-props ( hashtable word -- )}{words} +} +Retrieve and store the entire set of word properties. + +\subsection{Low-level details} + +The actual behavior of a word when executed is determined by the values of two slots: +\begin{itemize} +\item The primitive number +\item The primitive parameter +\end{itemize} +The primitive number is an index into an array of native functions in the Factor runtime. +Some frequently-occurring primitive numbers: +\begin{description} +\item[0] deferred word, +\item[1] compound definition -- executes the quotation stored in the parameter slot, +\item[2] symbol -- pushes the value of the parameter slot, +\item[3 onwards] the actual set of primitives, of which there are around 170. +\end{description} +The words outlined in this section should not be used in ordinary code.
+\wordtable{ +\ordinaryword{word-primitive}{word-primitive ( word -- n )}{words}\\ +\ordinaryword{set-word-primitive}{set-word-primitive ( n word -- )}{words} +} +Retrieves and stores a word's primitive number. + +\wordtable{ +\ordinaryword{word-def}{word-def ( word -- object )}{words}\\ +\ordinaryword{set-word-def}{set-word-def ( object word -- )}{words} +} +Retrieves and stores a word's primitive parameter. This parameter is only used if the primitive number is 1 (compound definitions) or 2 (symbols). Note that to define a compound definition or symbol, you must use \texttt{define-compound} or \texttt{define-symbol}, since the low-level words in this section do not update the cross-referencing of word dependencies. + +\wordtable{ +\ordinaryword{word-xt}{word-xt ( word -- n )}{words}\\ +\ordinaryword{set-word-xt}{set-word-xt ( n word -- )}{words} +} +Retrieves and stores a word's \emph{execution token}. + +This is an even lower-level facility for working with the address containing native code to be invoked when the word is executed. The compiler sets the execution token to a location in memory containing generated code. + +\wordtable{ +\ordinaryword{update-xt}{update-xt ( word -- )}{words} +} +Updates a word's execution token according to its primitive number. When called with a compiled word, has the effect of decompiling the word. Calling \texttt{set-word-primitive} automatically updates this address. + +\wordtable{ +\ordinaryword{recrossref}{recrossref ( word -- )}{words} +} +Updates the cross-referencing database, which you will probably need to do if you mess around with any of the words in this section -- assuming Factor does not crash first, that is. + +\section{Objects} + +\glossary{name=object, +description=a datum that can be identified} + +\subsection{Identity and equality} + +\subsection{Generic words and methods} + +\glossary{name=generic word, +description={a word defined using the \texttt{GENERIC:}~parsing word.
The behavior of generic words depends on the class of the object at the top of the stack. A generic word is composed of methods, where each method is specialized on a class}} +\glossary{name=method, +description={gives a generic word behavior when the top of the stack is an instance of a specific class}} +Sometimes you want a word's behavior to depend on the class of the object at the top of the stack, however implementing the word as a set of nested conditional tests is undesirable since it leads to unnecessary coupling -- adding support for a new class requires modifying the original definition of the word. + +A generic word is a word whose behavior depends on the class of the +object at the top of the stack, however this behavior is defined in a +decentralized manner. + +\wordtable{ +\parsingword{GENERIC:}{GENERIC: \emph{word}}{syntax} +} +Defines a new generic word. Initially, it contains no methods, and thus will raise an error when called. + +\wordtable{ +\parsingword{M:}{M: \emph{class} \emph{word} \emph{definition} ;}{syntax} +} +Defines a method, that is, a behavior for the generic \texttt{word} specialized on instances of \texttt{class}. Each method definition +can potentially occur in a different source file. + +\subsubsection{\label{method-order}Method ordering} + +While not all classes are \emph{comparable}, meaning there is no canonical linear ordering for classes, the methods of a generic word are ordered, and you can inspect this order using the \texttt{order} word: +\begin{alltt} +\textbf{ok} \bs = order . +\textbf{[ object number sequence string cons sbuf alien tuple +hashtable POSTPONE: f ]} +\end{alltt} +Least-specific methods come first in the list. + +\subsection{Classes} +\glossary{name=class, +description=a set of objects defined in a formal manner. Methods specialize generic words on classes} +\glossary{name=metaclass, +description={a set of classes sharing common traits. 
Examples include \texttt{builtin}, \texttt{union}, and \texttt{tuple}}} + +\wordtable{ +\classword{object}{generic} +} +Every object is a member of the \texttt{object} class. If you provide a method specializing +on the \texttt{object} class for some generic word, the method will be +invoked when no more specific method exists. For example: +\begin{verbatim} +GENERIC: describe +M: number describe + "The number " write . ; +M: object describe + "I don't know anything about " write . ; +\end{verbatim} +Each class has a membership predicate named +after the class with a \texttt{?}~suffix, with the following exceptions: +\begin{description} +\item[object] there is no need for a predicate word, since +every object is an instance of this class. +\item[f] the only instance of this class is the singleton +\texttt{f} signifying falsity, missing value, and empty list, and the predicate testing for this is the built-in library word \texttt{not}. +\item[t] the only instance of this class is the canonical truth value +\texttt{t}. You can write \texttt{t =} to test for this object, however usually +any object distinct from \texttt{f} is taken as a truth value, and \texttt{t} is not tested for directly. +\end{description} + +\subsubsection{Built-in classes} +\glossary{name=type, +description={an object invariant that describes its shape. An object's type is constant for the lifetime of the object, and there is only a fixed number of types built-in to the run-time. See class}} +\glossary{name=built-in class, +description=see type} +Every object is an instance of exactly one type, and the type is constant for the lifetime of the object.
There is only a fixed number of types built-in to the run-time, and corresponding to each type is a \emph{built-in class}: +\begin{verbatim} +alien +array +bignum +byte-array +complex +cons +dll +f +fixnum +float +ratio +sbuf +string +t +tuple +vector +word +\end{verbatim} +\wordtable{ +\ordinaryword{type}{type ( object -- n )}{kernel} +} +Outputs the type number of a given object. Most often, the \texttt{class} word is more useful. +\wordtable{ +\ordinaryword{class}{class ( object -- class )}{kernel} +} +Outputs the canonical class of a given object. While an object may be an instance of more than one class, the canonical class is either the built-in class, or if the object is a tuple, the tuple class. Examples: +\begin{alltt} +\textbf{ok} 1.0 class . +\textbf{float} +\textbf{ok} TUPLE: point x y z ; +\textbf{ok} << point f 1 2 3 >> class . +\textbf{point} +\end{alltt} + +\subsubsection{Unions} +\glossary{name=union, +description={a class whose set of instances is the union of the set of instances of a list of member classes}} +An object is an instance of a union class if it is an instance of one of its members. Union classes are used to associate the same method with several different classes, as well as to conveniently define predicates. +\wordtable{ +\parsingword{UNION:}{UNION: \emph{name} \emph{members} ;}{syntax} +} +Defines a union class. 
For example, the Factor library defines some unions over numeric types: +\begin{verbatim} +UNION: integer fixnum bignum ; +UNION: rational integer ratio ; +UNION: real rational float ; +UNION: number real complex ; +\end{verbatim} +Now, the absolute value function can be defined in an efficient manner +for real numbers, and in a more general fashion for complex numbers: +\begin{verbatim} +GENERIC: abs ( z -- |z| ) +M: real abs dup 0 < [ neg ] when ; +M: complex abs >rect mag2 ; +\end{verbatim} + +\subsubsection{Complements} +\glossary{name=complement, +description={a class whose set of instances is the set of objects that are not instances of a specific class}} + +An object is an instance of a complement class if it is not an instance of the complement's parameter. +\wordtable{ +\parsingword{COMPLEMENT:}{COMPLEMENT: \emph{name} \emph{parameter}}{syntax} +} +Defines a complement class. For example, the class of all values denoting ``true'' is defined as follows: +\begin{verbatim} +COMPLEMENT: general-t f +\end{verbatim} + +\subsubsection{Predicates} +\glossary{name=predicate, +description={a word with stack effect \texttt{( object -- ?~)}, or alternatively, a class whose instances are the instances of a superclass that satisfy an arbitrary predicate}} +An object is an instance of a predicate class if it is an instance of the predicate's parent class, and if it satisfies the predicate definition. + +To speed up dispatch, each predicate must be +defined as a subclass of some other class. That way predicates +subclassing from disjoint builtin classes do not need to be +exhaustively tested. +\wordtable{ +\parsingword{PREDICATE:}{PREDICATE: \emph{parent} \emph{name} \emph{predicate} ;}{syntax} +} +Defines a predicate class deriving from \texttt{parent} whose instances are the instances of \texttt{parent} that satisfy the \texttt{predicate} quotation. The predicate quotation must have stack effect \texttt{( object -- ?~)}.
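A minimal sketch of the syntax, using the hypothetical names \texttt{even-integer} and \texttt{parity}, neither of which is part of the library:
\begin{verbatim}
! Instances are the integers that leave no remainder
! when divided by 2.
PREDICATE: integer even-integer 2 mod 0 = ;

GENERIC: parity
M: integer parity drop "odd" print ;
M: even-integer parity drop "even" print ;
\end{verbatim}
Like any class, \texttt{even-integer} gets a membership predicate, here \texttt{even-integer?}. Since \texttt{even-integer} is a subclass of \texttt{integer}, its method is the more specific of the two and should be the one chosen for even integers, with all other integers falling through to the \texttt{integer} method.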
+ +For example, the library defines some subclasses of \texttt{integer} +classifying ASCII characters: +\begin{verbatim} +PREDICATE: integer blank " \t\n\r" str-contains? ; +PREDICATE: integer letter CHAR: a CHAR: z between? ; +PREDICATE: integer LETTER CHAR: A CHAR: Z between? ; +PREDICATE: integer digit CHAR: 0 CHAR: 9 between? ; +PREDICATE: integer printable CHAR: \s CHAR: ~ between? ; +\end{verbatim} + +\subsubsection{Operations on classes} +\wordtable{ +\ordinaryword{class-and}{class-and ( class class -- class )}{kernel}\\ +\ordinaryword{class-or}{class-or ( class class -- class )}{kernel} +} +Intersection and union of classes. Note that the returned class might not be the exact desired class; for example, \texttt{object} is output if no suitable class definition could be found at all. +\wordtable{ +\ordinaryword{class<}{class< ( class class -- ?~)}{kernel} +} +Classes are partially ordered. This ordering determines the method ordering of a generic word (\ref{method-order}). + +\subsection{Tuples} +\tupleglos + +Tuples are user-defined classes composed of named slots. All tuples have the same type, however distinct classes of tuples are defined. +\wordtable{ +\parsingword{TUPLE:}{TUPLE: \emph{name} \emph{slots} ;}{syntax} +} +Defines a new tuple class, along with a predicate \texttt{name?}~and default constructor \texttt{<name>}. + +Slots are read and written using various automatically-defined words with names of the +form \texttt{\emph{class}-\emph{slot}} and \texttt{set-\emph{class}-\emph{slot}}. + +Here is an example: +\begin{verbatim} +TUPLE: point x y z ; +\end{verbatim} +This defines a new class named \texttt{point}, along with the +following set of words: +\begin{verbatim} +<point> point? +point-x set-point-x +point-y set-point-y +point-z set-point-z +\end{verbatim} +The word \texttt{<point>} takes the slot values from the stack and +produces a new \texttt{point}: +\begin{alltt} +\textbf{ok} 1 2 3 <point> .
+\textbf{<< point 1 2 3 >>} +\end{alltt} + +\subsubsection{Constructors} + +A tuple constructor is named after the tuple class surrounded by angle +brackets (\texttt{<} and \texttt{>}). A default constructor is provided +that reads slot values from the stack, however a custom constructor can +be defined using the \texttt{C:} parsing word. +\wordtable{ +\parsingword{C:}{C: \emph{class} \emph{definition} ;}{syntax} +} +Define a \texttt{<class>} word that creates a tuple instance of the \texttt{class}, then applies the \texttt{definition} to this new tuple. The \texttt{definition} quotation must have stack effect \texttt{( tuple -- tuple )}. + +\subsubsection{Delegation} + +\glossary{name=delegate, +description={a fa\c{c}ade object's delegate receives unhandled methods that are called on the fa\c{c}ade}} +\glossary{name={fa\c{c}ade}, +description=an object with a delegate} + +Each tuple can have an optional delegate tuple. Generic words called on +the tuple that do not have a method for the tuple's class will be passed on +to the delegate. Note that delegation to objects that are not tuples is not fully supported at this stage and might not work as you might expect. +\wordtable{ +\ordinaryword{delegate}{delegate ( object -- object )}{syntax} +} +Returns an object's delegate, or \texttt{f} if no delegate is set. Note that in this case, undefined methods will not be passed to \texttt{f}; rather an error is raised immediately. +\wordtable{ +\ordinaryword{set-delegate}{set-delegate ( object tuple -- )}{syntax} +} +Sets a tuple's delegate. + +Factor uses delegation instead of inheritance, but it is not a direct +substitute; in particular, the semantics differ in that a delegated +method call receives the delegate on the stack, not the original object. + +\section{Numbers} + +\numberglos + +Factor's numbers more closely model the mathematical concept of a number than those of many other languages.
Where possible, exact answers are given -- for example, adding or multiplying two integers never results in overflow, and dividing two integers yields a fraction rather than a truncated result. Complex numbers are supported, allowing many functions to be computed with parameters that would raise errors or return ``not a number'' in other languages. + +\subsection{Integers} + +\integerglos + +The simplest type of number is the integer. Integers come in two varieties -- \emph{fixnums} and \emph{bignums}. As their names suggest, a fixnum is a fixed-width quantity\footnote{Fixnums range in size from $-2^{w-3}-1$ to $2^{w-3}$, where $w$ is the word size of your processor (for example, 32 bits). Because fixnums automatically grow to bignums, usually you do not have to worry about details like this.}, and is a bit quicker to manipulate than an arbitrary-precision bignum. + +The predicate word \texttt{integer?}~tests if the top of the stack is an integer. If this returns true, then exactly one of \texttt{fixnum?}~or \texttt{bignum?}~would return true for that object. Usually, your code does not have to worry if it is dealing with fixnums or bignums. + +Unlike some languages where the programmer has to declare storage size explicitly and worry about overflow, integer operations automatically return bignums if the result would be too big to fit in a fixnum. Here is an example where multiplying two fixnums returns a bignum: + +\begin{alltt} +\textbf{ok} 134217728 fixnum? . +\textbf{t} +\textbf{ok} 128 fixnum? . +\textbf{t} +\textbf{ok} 134217728 128 * . +\textbf{17179869184} +\textbf{ok} 134217728 128 * bignum? . +\textbf{t} +\end{alltt} + +Integers can be entered using a different base. By default, all number entry is in base 10, however this can be changed by prefixing integer literals with one of the parsing words \texttt{BIN:}, \texttt{OCT:}, or \texttt{HEX:}. For example: + +\begin{alltt} +\textbf{ok} BIN: 1110 BIN: 1 + . +\textbf{15} +\textbf{ok} HEX: deadbeef 2 * . 
+\textbf{7471857118} +\end{alltt} + +The word \texttt{.} prints numbers in decimal, regardless of how they were input. A set of words in the \texttt{prettyprint} vocabulary is provided for printing integers using another base. + +\begin{alltt} +\textbf{ok} 1234 .h +\textbf{4d2} +\textbf{ok} 1234 .o +\textbf{2322} +\textbf{ok} 1234 .b +\textbf{10011010010} +\end{alltt} + +\subsection{Rational numbers} + +\newcommand{\rationalglos}{\glossary{ +name=rational, +description={an instance of the \texttt{rational} class, which is a disjoint union of the +\texttt{integer} and \texttt{ratio} classes}}} +\rationalglos +\ratioglos + +If we add, subtract or multiply any two integers, the result is always an integer. However, this is not the case with division. When dividing a numerator by a denominator where the numerator is not an integer multiple of the denominator, a ratio is returned instead. + +\begin{alltt} +\textbf{ok} 1210 11 / . +\textbf{110} +\textbf{ok} 100 330 / . +\textbf{10/33} +\end{alltt} + +Ratios are printed and can be input literally in the form of the second example. Ratios are always reduced to lowest terms by factoring out the greatest common divisor of the numerator and denominator. A ratio with a denominator of 1 becomes an integer. Trying to create a ratio with a denominator of 0 raises an error. + +The predicate word \texttt{ratio?}~tests if the top of the stack is a ratio. The predicate word \texttt{rational?}~returns true if and only if one of \texttt{integer?}~or \texttt{ratio?}~would return true for that object. So in Factor terms, a ``ratio'' is a rational number whose denominator is not equal to 1. + +Ratios behave just like any other number -- all numerical operations work as expected, and in fact they use the formulas for adding, subtracting and multiplying fractions that you learned in high school. + +\begin{alltt} +\textbf{ok} 1/2 1/3 + . +\textbf{5/6} +\textbf{ok} 100 6 / 3 * .
+\textbf{50} +\end{alltt} + +Ratios can be deconstructed into their numerator and denominator components using the \texttt{numerator} and \texttt{denominator} words. The numerator and denominator are both integers, and furthermore the denominator is always positive. When applied to integers, the numerator is the integer itself, and the denominator is 1. + +\begin{alltt} +\textbf{ok} 75/33 numerator . +\textbf{25} +\textbf{ok} 75/33 denominator . +\textbf{11} +\textbf{ok} 12 numerator . +\textbf{12} +\end{alltt} + +\subsection{Floating point numbers} + +\newcommand{\realglos}{\glossary{ +name=real, +description={an instance of the \texttt{real} class, which is a disjoint union of the +\texttt{rational} and \texttt{float} classes}}} +\realglos +\floatglos + +Rational numbers represent \emph{exact} quantities. On the other hand, a floating point number is an \emph{approximation}. While rationals can grow to any required precision, floating point numbers are fixed-width, and manipulating them is usually faster than manipulating ratios or bignums (but slower than manipulating fixnums). Floating point literals are often used to represent irrational numbers, which have no exact representation as a ratio of two integers. Floating point literals are input with a decimal point. + +\begin{alltt} +\textbf{ok} 1.23 1.5 + . +\textbf{2.73} +\end{alltt} + +The predicate word \texttt{float?}~tests if the top of the stack is a floating point number. The predicate word \texttt{real?}~returns true if and only if one of \texttt{rational?}~or \texttt{float?}~would return true for that object. + +Floating point numbers are \emph{contagious} -- introducing a floating point number in a computation ensures the result is also floating point. + +\begin{alltt} +\textbf{ok} 5/4 1/2 + . +\textbf{7/4} +\textbf{ok} 5/4 0.5 + .
+\textbf{1.75} +\end{alltt} + +Apart from contagion, there are two ways of obtaining a floating point result from a computation; the word \texttt{>float ( n -{}- f )} converts a rational number into its floating point approximation, and the word \texttt{/f ( x y -{}- x/y )} returns the floating point approximation of a quotient of two numbers. + +\begin{alltt} +\textbf{ok} 7 4 / >float . +\textbf{1.75} +\textbf{ok} 7 4 /f . +\textbf{1.75} +\end{alltt} + +Indeed, the word \texttt{/f} could be defined as follows: + +\begin{alltt} +: /f / >float ; +\end{alltt} + +However, the actual definition is slightly more efficient, since it computes the floating point result directly. + +\subsection{Complex numbers} + +Complex numbers arise as solutions to quadratic equations whose graph does not intersect the x axis. For example, the equation $x^2 + 1 = 0$ has no solution for real $x$, because there is no real number that is a square root of $-1$. However, in the field of complex numbers, this equation has a well-known solution: + +\begin{alltt} +\textbf{ok} -1 sqrt . +\textbf{\#\{ 0 1 \}} +\end{alltt} + +The literal syntax for a complex number is \texttt{\#\{ re im \}}, where \texttt{re} is the real part and \texttt{im} is the imaginary part. For example, the literal \texttt{\#\{ 1/2 1/3 \}} corresponds to the complex number $\frac{1}{2} + \frac{1}{3}i$. + +The words \texttt{i} and \texttt{-i} push the literals \texttt{\#\{ 0 1 \}} and \texttt{\#\{ 0 -1 \}}, respectively. + +The predicate word \texttt{complex?} tests if the top of the stack is a complex number. Note that unlike math, where all real numbers are also complex numbers, Factor only considers a number to be a complex number if its imaginary part is non-zero. + +Complex numbers can be deconstructed into their real and imaginary components using the \texttt{real} and \texttt{imaginary} words. Both components can be pushed at once using the word \texttt{>rect ( z -{}- re im )}. + +\begin{alltt} +\textbf{ok} -1 sqrt real .
+\textbf{0} +\textbf{ok} -1 sqrt imaginary . +\textbf{1} +\textbf{ok} -1 sqrt sqrt >rect .s +\textbf{\{ 0.7071067811865476 0.7071067811865475 \}} +\end{alltt} + +A complex number can be constructed from a real and imaginary component on the stack using the word \texttt{rect> ( re im -{}- z )}. + +\begin{alltt} +\textbf{ok} 1/3 5 rect> . +\textbf{\#\{ 1/3 5 \}} +\end{alltt} + +Complex numbers are stored in \emph{rectangular form} as a real/imaginary component pair (this is where the names \texttt{>rect} and \texttt{rect>} come from). An alternative complex number representation is \emph{polar form}, consisting of an absolute value and argument. The absolute value and argument can be computed using the words \texttt{abs} and \texttt{arg}, and both can be pushed at once using \texttt{>polar ( z -{}- abs arg )}. + +\begin{alltt} +\textbf{ok} 5.3 abs . +\textbf{5.3} +\textbf{ok} i arg . +\textbf{1.570796326794897} +\textbf{ok} \#\{ 4 5 \} >polar .s +\textbf{\{ 6.403124237432849 0.8960553845713439 \}} +\end{alltt} + +A new complex number can be created from an absolute value and argument using \texttt{polar> ( abs arg -{}- z )}. + +\begin{alltt} +\textbf{ok} 1 pi polar> . +\textbf{\#\{ -1.0 1.224606353822377e-16 \}} +\end{alltt} + +\subsection{Transcendental functions} + +The \texttt{math} vocabulary provides a rich library of mathematical functions that covers exponentiation, logarithms, trigonometry, and hyperbolic functions. All functions accept complex arguments and return complex results where appropriate. These functions all return floating point values, or complex numbers whose real and imaginary components are floating point values. + +\texttt{\^{} ( x y -{}- x\^{}y )} raises \texttt{x} to the power of \texttt{y}. In the cases of \texttt{y} being equal to $1/2$, $-1$, or 2, respectively, the words \texttt{sqrt}, \texttt{recip} and \texttt{sq} can be used instead. + +\begin{alltt} +\textbf{ok} 2 4 \^ . +\textbf{16.0} +\textbf{ok} i i \^ .
+\textbf{0.2078795763507619} +\end{alltt} + +All remaining functions have the stack effect \texttt{( x -{}- y )}, which is not repeated below for brevity. + +\texttt{exp} raises the number $e$ to a specified power. The number $e$ can be pushed on the stack with the \texttt{e} word, so \texttt{exp} could have been defined as follows: + +\begin{alltt} +: exp ( x -- e^x ) e swap \^ ; +\end{alltt} + +However, it is actually defined otherwise, for efficiency.\footnote{In fact, the word \texttt{\^{}} is actually defined in terms of \texttt{exp}, to correctly handle complex number arguments.} + +\texttt{log} computes the natural (base $e$) logarithm. This is the inverse of the \texttt{exp} function. + +\begin{alltt} +\textbf{ok} -1 log . +\textbf{\#\{ 0.0 3.141592653589793 \}} +\textbf{ok} e log . +\textbf{1.0} +\end{alltt} + +\texttt{sin}, \texttt{cos} and \texttt{tan} are the familiar trigonometric functions, and \texttt{asin}, \texttt{acos} and \texttt{atan} are their inverses. + +The reciprocals of the sine, cosine and tangent are defined as \texttt{cosec}, \texttt{sec} and \texttt{cot}, respectively. Their inverses are \texttt{acosec}, \texttt{asec} and \texttt{acot}. + +\texttt{sinh}, \texttt{cosh} and \texttt{tanh} are the hyperbolic functions, and \texttt{asinh}, \texttt{acosh} and \texttt{atanh} are their inverses. + +Similarly, the reciprocals of the hyperbolic sine, cosine and tangent are defined as \texttt{cosech}, \texttt{sech} and \texttt{coth}, respectively. Their inverses are \texttt{acosech}, \texttt{asech} and \texttt{acoth}. + +\subsection{Modular arithmetic} + +In addition to the standard division operator \texttt{/}, there are a few related functions that are useful when working with integers. + +\texttt{/i ( x y -{}- x/y )} performs a truncating integer division. It could have been defined as follows: + +\begin{alltt} +: /i / >integer ; +\end{alltt} + +However, the actual definition is a bit more efficient than that.
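
To make the difference from \texttt{/} concrete, here is a short listener session. The outputs follow from the descriptions above (an exact ratio from \texttt{/}, a truncated integer from \texttt{/i}) rather than from a recorded run.

\begin{alltt}
\textbf{ok} 100 3 / .
\textbf{100/3}
\textbf{ok} 100 3 /i .
\textbf{33}
\textbf{ok} 100 3 / >integer .
\textbf{33}
\end{alltt}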
+ +\texttt{mod ( x y -{}- x\%y )} computes the remainder of dividing \texttt{x} by \texttt{y}. If the result is 0, then \texttt{x} is a multiple of \texttt{y}. + +\texttt{/mod ( x y -{}- x/y x\%y )} pushes both the quotient and remainder. + +\begin{alltt} +\textbf{ok} 100 3 mod . +\textbf{1} +\textbf{ok} -546 34 mod . +\textbf{-2} +\end{alltt} + +\texttt{gcd ( x y -{}- z )} pushes the greatest common divisor of two integers; that is, the largest number that both integers could be divided by and still yield integers as results. This word is used behind the scenes to reduce rational numbers to lowest terms when doing ratio arithmetic. + +\subsection{Bitwise operations} + +There are two ways of looking at an integer -- as a mathematical entity, or as a string of bits. The latter representation facilitates the so-called \emph{bitwise operations}. + +\texttt{bitand ( x y -{}- x\&y )} returns a new integer where each bit is set if and only if the corresponding bit is set in both $x$ and $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-and with a mask switches off all flags that are not explicitly set in the mask. + +\begin{alltt} +\textbf{ok} BIN: 101 BIN: 10 bitand .b +\textbf{0} +\textbf{ok} BIN: 110 BIN: 10 bitand .b +\textbf{10} +\end{alltt} + +\texttt{bitor ( x y -{}- x|y )} returns a new integer where each bit is set if and only if the corresponding bit is set in at least one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-or with a mask switches on all flags that are set in the mask. + +\begin{alltt} +\textbf{ok} BIN: 101 BIN: 10 bitor .b +\textbf{111} +\textbf{ok} BIN: 110 BIN: 10 bitor .b +\textbf{110} +\end{alltt} + +\texttt{bitxor ( x y -{}- x\^{}y )} returns a new integer where each bit is set if and only if the corresponding bit is set in exactly one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-xor with a mask toggles all flags that are set in the mask.
+ +\begin{alltt} +BIN: 101 BIN: 10 bitxor .b +\emph{111} +BIN: 110 BIN: 10 bitxor .b +\emph{100} +\end{alltt} + +\texttt{bitnot ( x -{}- y )} returns the bitwise complement of the input; that is, each bit in the input number is flipped. This is actually equivalent to negating a number, and subtracting one. So indeed, \texttt{bitnot} could have been defined thus: + +\begin{alltt} +: bitnot neg pred ; +\end{alltt} + +\texttt{shift ( x n -{}- y )} returns a new integer consisting of the bits of the first integer, shifted to the left by $n$ positions. If $n$ is negative, the bits are shifted to the right instead, and bits that ``fall off'' are discarded. + +\begin{alltt} +BIN: 101 5 shift .b +\emph{10100000} +BIN: 11111 -2 shift .b +\emph{111} +\end{alltt} + +The attentive reader will notice that shifting to the left is equivalent to multiplying by a power of two, and, for non-negative integers, shifting to the right is equivalent to performing a truncating division by a power of two. + +\chapter{The development environment} + +Factor supports interactive development in a live environment. Instead of working with +static executable files and restarting your application after each change, you can +incrementally make changes to your application and test them immediately. If you +notice an undesirable behavior, Factor's powerful reflection features will aid in +pinpointing the error. + +If you are used to a statically typed language, you might find Factor's tendency to only fail at runtime hard to work with at first. However, the interactive development tools outlined in this document allow a much quicker turn-around time for testing changes. Also, write unit tests -- unit testing is a great way to ensure that old bugs do not re-appear once they've been fixed. + +\section{System organization} + +\subsection{The listener} + +Factor is an \emph{image-based environment}. When you compiled Factor, you also generated a file named \texttt{factor.image}.
I will have more to say about images later, but for now it suffices to understand that to start Factor, you must pass the image file name on the command line: +\begin{alltt} +./f factor.image +\textbf{Loading factor.image... relocating... done +Factor 0.73 :: http://factor.sourceforge.net :: unix/x86 +(C) 2003, 2005 Slava Pestov, Chris Double, +Mackenzie Straight +ok} +\end{alltt} +An \texttt{\textbf{ok}} prompt is printed after the initial banner, indicating the listener is ready to execute Factor phrases. The listener is a piece of Factor code, like any other; however, it helps to think of it as the primary interface to the Factor system. The listener reads Factor code and executes it. You can try the classical first program: + +\begin{alltt} +\textbf{ok} "Hello, world." print +\textbf{Hello, world.} +\end{alltt} + + +Multi-line phrases are supported; if there are unclosed brackets, the listener outputs \texttt{...} instead of the \texttt{ok} prompt, and the entire phrase is executed once all brackets are closed: + +\begin{alltt} +\textbf{ok} [ 1 2 3 ] [ +\textbf{...} . +\textbf{...} ] each +\textbf{1 +2 +3} +\end{alltt} + +The listener knows when to print a continuation prompt by looking at the height of the parser +stack. Parsing words such as \texttt{[} and \texttt{:} leave elements on the parser +stack; these elements are popped by \texttt{]} and \texttt{;}. + +\subsection{Source files} + +While it is possible to do all development in the listener and save your work in images, it is far more convenient to work with source files, at least until an in-image structure editor is developed. + +By convention, Factor source files are saved with the \texttt{.factor} filename extension. They can be loaded into the image as follows: + +\begin{alltt} +\textbf{ok} "examples/numbers-game.factor" run-file +\end{alltt} + +In Factor, loading a source file replaces any existing definitions\footnote{But see \ref{compiler}; this is not true of compiled code.}.
Each word definition remembers what source file it was loaded from (if any). To reload the source file associated with a definition, use the \texttt{reload} word: + +\begin{alltt} +\textbf{ok} \bs draw reload +\end{alltt} + +Word definitions also retain the line number where they are located in their original source file. This allows you to open a word definition in jEdit\footnote{\texttt{http://www.jedit.org}} for editing using the +\texttt{jedit} word: + +\begin{alltt} +\textbf{ok} \bs compile jedit +\end{alltt} + +This word requires that a jEdit instance is already running. + +For the \texttt{jedit} word to work with words in the Factor library, you must set the \texttt{"resource-path"} variable to the location of the Factor source tree. One way to do this is to add a phrase like the following to your \texttt{.factor-rc}: + +\begin{verbatim} +"/home/slava/Factor/" "resource-path" set +\end{verbatim} + +On startup, Factor reads the \texttt{.factor-rc} file from your home directory. You can put +any quick definitions you want available at the listener there. To avoid loading this +file, pass the \texttt{-no-user-init} command line switch. Another way to have a set of definitions available at all times is to save a custom image, as described in the next section. + +\subsection{Images} + +The \texttt{factor.image} file is basically a dump of all objects in the heap. A new image can be saved as follows: + +\begin{alltt} +\textbf{ok} "work.image" save-image +\textbf{Saving work.image...} +\end{alltt} + +When you save an image before exiting Factor, then start Factor again, everything will be almost as you left it. Try the following: + +\begin{alltt} +./f factor.image +\textbf{ok} "Learn Factor" "reminder" set +\textbf{ok} "factor.image" save-image bye +\textbf{Saving factor.image...} +\end{alltt} + +Factor will save the image and exit. Now start it again and see that the reminder is still there: + +\begin{alltt} +./f factor.image +\textbf{ok} "reminder" get . 
+\textbf{"Learn Factor"} +\end{alltt} + +This is what is meant by the image being an \emph{infinite session}. When you shut down and restart Factor, what happens is much closer to a laptop's ``suspend'' mode than to a desktop computer being fully shut down. + +\subsection{Looking at objects} + +Probably the most important debugging tool of them all is the \texttt{.} word. It prints the object at the top of the stack in a form that can be parsed by the Factor parser. A related word is \texttt{prettyprint}. It is identical to \texttt{.} except the output is more verbose; lists, vectors and hashtables are broken up into multiple lines and indented. + +\begin{alltt} +\textbf{ok} [ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ] . +[ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ] +\end{alltt} + +Most objects print in a parsable form, but not all. One exception to this rule is objects with external state, such as I/O ports or aliens (pointers to native structures). Also, objects with circular or very deeply nested structure will not print in a fully parsable form, since the prettyprinter has a limit on maximum nesting. Here is an example -- a vector is created that holds a list whose first element is the vector itself: + +\begin{alltt} +\textbf{ok} \tto \ttc [ unit 0 ] keep [ set-vector-nth ] keep . +\tto [ ... ] \ttc +\end{alltt} + +The prettyprinted form of a vector or list with many elements is not always readable. The \texttt{[.]} and \texttt{\tto.\ttc} words output a list or a vector, respectively, with each element on its own line. In fact, the stack printing words are defined in terms of \texttt{[.]} and \texttt{\tto.\ttc}: + +\begin{verbatim} +: .s datastack {.} ; +: .r callstack {.} ; +: .n namestack [.] ; +: .c catchstack [.] ; +\end{verbatim} + +Before we move on, one final set of output words is used to output integers in +different numeric bases. The \texttt{.b} word prints an integer in binary, \texttt{.o} in octal, and \texttt{.h} in hexadecimal.
+ +\begin{alltt} +\textbf{ok} 31337 .b +\textbf{111101001101001} +\textbf{ok} 31337 .o +\textbf{75151} +\textbf{ok} 31337 .h +\textbf{7a69} +\end{alltt} + +\section{Word tools} + +\subsection{Exploring vocabularies} + +Factor organizes code in a two-tier structure of vocabularies and words. A word is the smallest unit of code; it corresponds to a function or method in other languages. Vocabularies group related words together for easy browsing and tracking of source dependencies. + +Entering \texttt{vocabs .}~in the listener produces a list of all existing vocabularies: + +\begin{alltt} +\textbf{ok} vocabs . +\textbf{[ "alien" "ansi" "assembler" "browser-responder" +"command-line" "compiler" "cont-responder" "errors" +"file-responder" "files" "gadgets" "generic" +"hashtables" "html" "httpd" "httpd-responder" "image" +"inference" "interpreter" "io-internals" "jedit" +"kernel" "kernel-internals" "line-editor" "listener" +"lists" "logging" "math" "math-internals" "memory" +"namespaces" "parser" "prettyprint" "profiler" +"quit-responder" "random" "resource-responder" +"scratchpad" "sdl" "shells" "stdio" "streams" +"strings" "syntax" "telnetd" "test" "test-responder" +"threads" "unparser" "url-encoding" "vectors" "words" ]} +\end{alltt} + +As you can see, there are a lot of vocabularies! Now, you can use \texttt{words .}~to list the words inside a given vocabulary: + +\begin{alltt} +\textbf{ok} "namespaces" words . +\textbf{[ (get) , >n append, bind change cons@ +dec extend get global inc init-namespaces list-buffer +literal, make-list make-rlist make-rstring make-string +make-vector n> namespace namestack nest off on put set +set-global set-namestack unique, unique@ with-scope ]} +\end{alltt} + +You can look at the definition of any word, including library words, using \texttt{see}. Keep in mind you might have to \texttt{USE:} the vocabulary first. 
+ +\begin{alltt} +\textbf{ok} USE: httpd +\textbf{ok} \bs httpd-connection see +\textbf{IN: httpd : httpd-connection ( socket -- ) + "http-server" get accept [ + httpd-client + ] in-thread drop ;} +\end{alltt} + +The \texttt{see} word shows a reconstruction of the source code, not the original source code. So in particular, formatting and some comments are lost. + +\subsection{Cross-referencing words} + +The \texttt{apropos.} word is handy when searching for related words. It lists all words +whose names contain a given string. The \texttt{apropos.} word is also useful when you know the exact name of a word, but are unsure what vocabulary it is in. For example, if you're looking for ways to iterate over various collections, you can do an apropos search for \texttt{map}: + +\begin{alltt} +\textbf{ok} "map" apropos. +\textbf{IN: inference +type-value-map +IN: lists +map +map-with +IN: sdl +set-surface-map +surface-map +IN: strings +string-map +IN: vectors +vector-map} +\end{alltt} + +From the above output, you can see that \texttt{map} is for lists, \texttt{string-map} is for strings, and \texttt{vector-map} is for vectors. + +The \texttt{usage} word finds all words that refer to a given word and pushes a list on the stack. This word is helpful in two situations; the first is for learning -- a good way to learn a word is to see it used in context. The second is during refactoring -- if you change a word's stack effect, you must also update all words that call it. Usually you print the +return value of \texttt{usage} using \texttt{.}: + +\begin{alltt} +\textbf{ok} \bs string-map usage . +\textbf{schars>entities +filter-null +url-encode} +\end{alltt} + +Another useful word is \texttt{usages}. Unlike \texttt{usage}, it finds all usages, even +indirect ones -- so if a word refers to another word that refers to the given word, +both words will be in the output list. + +\subsection{Exploring classes} + +Factor supports object-oriented programming via generic words. 
Generic words are called +like ordinary words, however they can have multiple definitions, one per class, and +these definitions do not have to appear in the same source file. Such a definition is +termed a \emph{method}, and the method is said to \emph{specialize} on a certain +class. A class in the most +general sense is just a set of objects. You can output a list of classes in the system +with \texttt{classes .}: + +\begin{alltt} +\textbf{ok} classes. +\textbf{[ alien alien-error byte-array displaced-alien +dll ansi-stream disp-only displaced indirect operand +register absolute absolute-16/16 relative relative-bitfld +item kernel-error no-method border checkbox dialog editor +ellipse etched-rect frame gadget hand hollow-ellipse +hollow-rect label line menu pane pile plain-ellipse +plain-rect rectangle roll-rect scroller shelf slider +stack tile viewport world 2generic arrayed builtin +complement generic null object predicate tuple +tuple-class union hashtable html-stream class-tie +computed inference-error inference-warning literal +literal-tie value buffer port jedit-stream boolean +general-t array cons general-list list bignum complex +fixnum float integer number ratio rational real +parse-error potential-float potential-ratio +button-down-event button-up-event joy-axis-event +joy-ball-event joy-button-down-event joy-button-up-event +joy-hat-event key-down-event key-up-event motion-event +quit-event resize-event user-event sequence stdio-stream +client-stream fd-stream null-stream server string-output +wrapper-stream LETTER blank digit letter printable sbuf +string text POSTPONE: f POSTPONE: t vector compound +primitive symbol undefined word ]} +\end{alltt} + +If you \texttt{see} a generic word, all methods defined on the generic word are shown. +Alternatively, you can use \texttt{methods.} to print all methods specializing on a +given class: + +\begin{alltt} +\textbf{ok} \bs list methods. 
+\textbf{PREDICATE: general-list list + dup [ + last* cdr + ] when not ; +IN: gadgets +M: list custom-sheet + [ + length count + ] keep zip alist>sheet "Elements:" ; +IN: prettyprint +M: list prettyprint* + [ + [ + POSTPONE: [ + ] car swap [ + POSTPONE: ] + ] car prettyprint-sequence + ] check-recursion ;} +\end{alltt} + +\subsection{Browsing via the HTTP server} + + +A more sophisticated way to browse the library is using the integrated HTTP server. You can start the HTTP server using the following pair of commands: + +\begin{alltt} +\textbf{ok} USE: httpd +\textbf{ok} 8888 httpd +\end{alltt} + +Then, point your browser to the following URL, and start browsing: + +\begin{quote} +\texttt{http://localhost:8888/responder/inspect/vocabularies} +\end{quote} + +To stop the HTTP server, point your browser to + +\begin{quote} +\texttt{http://localhost:8888/responder/quit}. +\end{quote} + +You can even start the HTTP server in a separate thread, and look at code in your web browser while continuing to play in the listener: + +\begin{alltt} +\textbf{ok} USE: httpd +\textbf{ok} USE: threads +\textbf{ok} [ 8888 httpd ] in-thread +\end{alltt} + +\section{Dealing with runtime errors} + +\subsection{Looking at stacks} + +To see the contents of the data stack, use the \texttt{.s} word. Similarly, the other stacks can be shown with \texttt{.r} (return stack), \texttt{.n} (name stack), and \texttt{.c} (catch stack). Each stack is printed with each element on its own line; the top of the stack is the first element printed. + +\subsection{The debugger} + +If the execution of a phrase in the listener causes an error to be thrown, the error +is printed and the stacks at the time of the error are saved. If you've spent any +time with Factor at all, you are probably familiar with this type of message: + +\begin{alltt} +\textbf{ok} [ 1 2 3 ] 4 append reverse +\textbf{The generic word car does not have a suitable method for 4 +:s :r :n :c show stacks at time of error.
+:get ( var -- value ) inspects the error namestack.} +\end{alltt} + +The words \texttt{:s}, \texttt{:r}, \texttt{:n} and \texttt{:c} behave like their counterparts that are prefixed with \texttt{.}, except they show the stacks as they were when the error was thrown. + +The return stack warrants some special attention. To develop successfully in Factor, you will need to learn to understand how it works. Let's look at the first few lines of the return stack at the time of the above error: + +\begin{verbatim} +[ swap cdr ] +uncons +[ r> tuck 2slip ] +(each) +[ swons ] +[ each ] +each +\end{verbatim} + +You can see the sequence of calls leading up to the error was \texttt{each} calling \texttt{(each)} calling \texttt{uncons}. The error tells us that the \texttt{car} word is the one that failed. Now, you can stare at the stack dump, and notice that if the call to \texttt{car} was successful and execution returned to \texttt{(each)}, the quotation \texttt{[ r> tuck 2slip ]} would resume executing. The first word there, \texttt{r>}, would take the quotation \texttt{[ swons ]} and put it back on the data stack. After \texttt{(each)} returned, it would then continue executing the quotation \texttt{[ each ]}. So what is going on here is a recursive loop, \texttt{[ swons ] each}. If you look at the definition of \texttt{reverse}, you will see that this is exactly what is being done: + +\begin{verbatim} +: reverse ( list -- list ) [ ] swap [ swons ] each ; +\end{verbatim} + +So a list is being reversed, but at some stage, the \texttt{car} is taken of something that is not a list. Now, you can look at the data stack with \texttt{:s}: + +\begin{verbatim} +<< no-method [ ] 4 car >> +car +4 +4 +[ 3 2 1 ] +\end{verbatim} + +So now, the mystery has been solved: as \texttt{reverse} iterates down the input value, it hits a cons cell whose \texttt{cdr} is not a list.
Indeed, if you look at the value we are passing to \texttt{reverse}, you will see why: + +\begin{alltt} +\textbf{ok} [ 1 2 3 ] 4 append . +[[ 1 [[ 2 [[ 3 4 ]] ]] ]] +\end{alltt} + +In the future, the debugger will be linked with the walker, documented below. Right now, the walker is a separate tool. Another caveat is that in compiled code, the return stack is not reconstructed if there is an error. Until this is fixed, you should only compile code once it is debugged. For more potential compiler pitfalls, see \ref{compiler}. + +\subsection{The walker} + +The walker lets you step through the execution of a quotation. When a compound definition is reached, you can either keep walking inside the definition, or execute it in one step. The stacks can be inspected at each stage. + +There are two ways to use the walker. First of all, you can call the \texttt{walk} word explicitly, giving it a quotation: + +\begin{alltt} +\textbf{ok} [ [ 10 [ dup , ] repeat ] make-list ] walk +\textbf{\&s \&r \&n \&c show stepper stacks. +\&get ( var -- value ) inspects the stepper namestack. +step -- single step over +into -- single step into +continue -- continue execution +bye -- exit single-stepper +[ [ 10 [ dup , ] repeat ] make-list ] +walk} +\end{alltt} + +As you can see, the walker prints a brief help message, then the currently executing quotation. It changes the listener prompt from \texttt{ok} to \texttt{walk}, to remind you that there is a suspended continuation. + +The first element of the quotation shown is the next object to be evaluated. If it is a literal, both \texttt{step} and \texttt{into} have the effect of pushing it on the walker data stack. If it is a compound definition, then \texttt{into} will recurse the walker into the compound definition; otherwise, the word executes in one step. + +The \texttt{\&r} word shows the walker return stack, which is laid out just like the primary interpreter's return stack.
In fact, a good way to understand how Factor's return stack works is to play with the walker. + +Note that the walker does not automatically stop when the quotation originally given finishes executing; it just keeps on walking up the return stack, and even lets you step through the listener's code. You can invoke \texttt{continue} or \texttt{exit} to terminate the walker. + +While the walker can be invoked explicitly using the \texttt{walk} word, sometimes it is more convenient to \emph{annotate} a word such that the walker is invoked automatically when the word is called. This can be done using the \texttt{break} word: + +\begin{alltt} +\textbf{ok} \bs layout* break +\end{alltt} + +Now, when some piece of code calls \texttt{layout*}, the walker will open, and you will be able to step through execution and see exactly what's going on. An important point to keep in mind is that when the walker is invoked in this manner, \texttt{exit} will not have the desired effect; execution will continue, but the data stack will be inconsistent, and an error will most likely be raised a short time later. Always use \texttt{continue} to resume execution after a break. + +The walker is very handy, but sometimes you just want to see if a word is being called at all and when, and you don't care to single-step it. In that case, you can use the \texttt{watch} word: + +\begin{alltt} +\textbf{ok} \bs draw-shape watch +\end{alltt} + +Now when \texttt{draw-shape} is called, a message will be printed to that effect. + +You can undo the effect of \texttt{break} or \texttt{watch} by reloading the original source file containing the word definition in question: + +\begin{alltt} +\textbf{ok} \bs layout* reload +\textbf{ok} \bs draw-shape reload +\end{alltt} + +\subsection{Dealing with hangs} + +If you accidentally start an infinite loop, you can send the Factor runtime a \texttt{QUIT} signal. On Unix, this is done by pressing \texttt{Control-\bs} in the controlling terminal.
This will cause the runtime to dump the data and return stacks in a semi-readable form. Note that this will help you find the root cause of the hang, but it will not let you interrupt the infinite loop. + + +\section{Defensive coding} + +\subsection{Unit testing} + +Unit tests are very easy to write. They are usually placed in source files. A unit test can be executed with the \texttt{unit-test} word in the \texttt{test} vocabulary. This word takes a list and a quotation; the quotation is executed, and the resulting data stack is compared against the list. If they are not equal, the unit test has failed. Here is an example of a unit test: + +\begin{verbatim} +[ "Hello, crazy world" ] [ + "editor" get [ 0 caret set ] bind + ", crazy" 5 "editor" get [ line-insert ] bind + "editor" get [ line-text get ] bind +] unit-test +\end{verbatim} + +To have a unit test assert that a piece of code does not execute successfully, but rather throws an exception, use the \texttt{unit-test-fails} word. It takes only one quotation; if the quotation does \emph{not} throw an exception, the unit test has failed. + +\begin{verbatim} +[ -3 { } vector-nth ] unit-test-fails +\end{verbatim} + +Unit testing is a good habit to get into. Sometimes, writing tests first, before any code, can speed the development process too; by running your unit test script, you can gauge progress. + +\subsection{Stack effect inference} + +While most programming errors in Factor are only caught at runtime, the stack effect checker can be useful for checking correctness of code before it is run. It can also help narrow down problems with stack shuffling. The stack checker is used by passing a quotation to the \texttt{infer} word. It uses a sophisticated algorithm to infer stack effects of recursive words, combinators, and other tricky constructions; however, it cannot infer the stack effect of all words. In particular, anything using continuations, such as \texttt{catch} and I/O, will stump the stack checker.
Despite this fault, it is still a useful tool. + +\begin{alltt} +\textbf{ok} [ pile-fill * >fixnum over pref-size dup y +\texttt{...} [ + ] change ] infer . +\textbf{[ [ tuple number tuple ] [ tuple fixnum object number ] ]} +\end{alltt} + +The stack checker will report an error if it cannot infer the stack effect of a quotation. The ``recursive state'' dump is similar to a return stack, but it is not a real return stack, since only a code walk is taking place, not full evaluation. Understanding recursive state dumps is an art, much like understanding return stacks. + +\begin{alltt} +\textbf{ok} [ 100 [ f f cons ] repeat ] infer . +\textbf{! Inference error: Unbalanced branches +! Recursive state: +! [ (repeat) G:54044 pick pick >= [ 3drop ] + [ [ swap >r call 1 + r> ] keep (repeat) ] ifte ] +! [ repeat G:54042 0 -rot (repeat) ] +:s :r :n :c show stacks at time of error. +:get ( var -- value ) inspects the error namestack.} +\end{alltt} + +One reason stack inference might fail is if the quotation contains unbalanced branches, as above. For the inference to work, both branches of a conditional must exit with the same stack height. + +Inference also fails if your code calls quotations that are not statically known. This can happen if the word in question uses continuations, or if it pulls a quotation from a variable and calls it. This can also happen if you wrote your own combinator, but forgot to mark it as \texttt{inline}. For example, the following will fail: + +\begin{alltt} +\textbf{ok} : dip swap >r call r> ; +\textbf{ok} [ [ + ] dip * ] infer . +! Inference error: A literal value was expected where a +computed value was found: \# +... +\end{alltt} + +However, defining \texttt{dip} to be inlined will work: + +\begin{alltt} +\textbf{ok} : dip swap >r call r> ; inline +\textbf{ok} [ [ + ] dip * ] infer .
+\textbf{[ [ number number number ] [ number ] ]} +\end{alltt} + +You can combine unit testing with stack effect inference by writing unit tests that check stack effects of words. In fact, this can be automated with the \texttt{infer>test.} word; it takes a quotation on the stack, and prints a code snippet that tests the stack effect of the quotation: + +\begin{alltt} +\textbf{ok} [ draw-shape ] infer>test. +\textbf{[ [ [ object ] [ ] ] ] +[ [ draw-shape ] infer ] +unit-test} +\end{alltt} + +You can then copy and paste this snippet into a test script, and run the test script after +making changes to the word to ensure its stack effect signature has not changed. + +\section{Optimization} + +While both the Factor interpreter and compiler are relatively slow at this stage, there +are still ways you can make your Factor code go faster. The key is to find bottlenecks, +and optimize them. + +\subsection{Timing code} + +The \texttt{time} word reports the time taken to execute a quotation, in milliseconds. The portion of time spent in garbage collection is also shown: + +\begin{alltt} +\textbf{ok} [ 1000000 [ f f cons drop ] repeat ] time +\textbf{515 milliseconds run time +11 milliseconds GC time} +\end{alltt} + +\subsection{Exploring memory usage} + +Factor supports heap introspection. You can find all objects in the heap that match a certain predicate using the \texttt{instances} word. For example, if you suspect a resource leak, you can find all I/O ports as follows: + +\begin{alltt} +\textbf{ok} USE: io-internals +\textbf{ok} [ port? ] instances . +\textbf{[ \# \# ]} +\end{alltt} + +The \texttt{references} word finds all objects that refer to a given object: + +\begin{alltt} +\textbf{ok} [ float? ] instances car references . +\textbf{[ \# [ -1.0 0.0 / ] ]} +\end{alltt} + +You can print a memory usage summary with \texttt{room.}: + +\begin{alltt} +\textbf{ok} room. 
+\textbf{Data space: 16384 KB total 2530 KB used 13853 KB free +Code space: 16384 KB total 490 KB used 15893 KB free} +\end{alltt} + +Finally, you can print a detailed memory allocation breakdown by type with \texttt{heap-stats.}: + +\begin{alltt} +\textbf{ok} heap-stats. +\textbf{bignum: 312 bytes, 17 instances +cons: 850376 bytes, 106297 instances +float: 112 bytes, 7 instances +t: 8 bytes, 1 instances +array: 202064 bytes, 3756 instances +hashtable: 54912 bytes, 3432 instances +vector: 5184 bytes, 324 instances +string: 391024 bytes, 7056 instances +sbuf: 64 bytes, 4 instances +port: 112 bytes, 2 instances +word: 96960 bytes, 3030 instances +tuple: 688 bytes, 22 instances} +\end{alltt} + +\subsection{The profiler} + +Factor provides a statistical sampling profiler for narrowing down memory and processor bottlenecks. +The profiler is only supported on Unix platforms. On FreeBSD 4.x, the Factor runtime must +be compiled without the \texttt{-pthread} switch, since FreeBSD 4.x userspace threading makes +use of a signal that conflicts with the signal used for profiling. + +The \texttt{allot-profile} word executes a quotation with the memory profiler enabled, then prints a list of all words that allocated memory, along with the bytes allocated. Note that during particularly long executions, or executions where a lot of memory is allocated, these counters may overrun. + +\begin{alltt} +\textbf{ok} [ "boot.image.le32" make-image ] allot-profile +\emph{...
many lines omitted ...} +\textbf{[[ write-little-endian-32 673952 ]] +[[ wait-to-read-line 788640 ]] +[[ blocking-read-line 821264 ]] +[[ vocabularies 822624 ]] +[[ parse-resource 823376 ]] +[[ next-line 1116440 ]] +[[ vector-map 1326504 ]] +[[ fixup-words 1326520 ]] +[[ vector-each 1768640 ]] +[[ (parse) 2434208 ]] +[[ classes 2517920 ]] +[[ when* 2939088 ]] +[[ while 3614408 ]] +[[ (parse-stream) 3634552 ]] +[[ make-list 3862000 ]] +[[ object 4143784 ]] +[[ each 4712080 ]] +[[ run-resource 5036904 ]] +[[ (callcc) 5183400 ]] +[[ catch 5188976 ]] +[[ 2slip 8631736 ]] +[[ end 202896600 ]] +[[ make-image 208611888 ]] +[[ with-scope 437823992 ]]} +\end{alltt} + +The \texttt{call-profile} word executes a quotation with the CPU profiler enabled, then prints a list of all words that were found on the return stack, along with the number of times they were seen there. This gives a rough idea of what words are taking up the majority of execution time. + +\begin{alltt} +\textbf{ok} [ "boot.image.le32" make-image ] call-profile +\emph{... many lines omitted ...} +\textbf{[[ stream-write 7 ]] +[[ wait-to-write 7 ]] +[[ vector-map 11 ]] +[[ fixup-words 11 ]] +[[ when* 12 ]] +[[ write 16 ]] +[[ write-word 17 ]] +[[ parse-loop 22 ]] +[[ make-list 24 ]] +[[ (parse) 29 ]] +[[ blocking-write 32 ]] +[[ while 35 ]] +[[ (parse-stream) 36 ]] +[[ dispatch 47 ]] +[[ run-resource 50 ]] +[[ write-little-endian-32 76 ]] +[[ (callcc) 173 ]] +[[ catch 174 ]] +[[ each 175 ]] +[[ 2slip 199 ]] +[[ end 747 ]] +[[ make-image 785 ]] +[[ with-scope 1848 ]]} +\end{alltt} + +Normally, the memory and CPU profilers run every millisecond, and increment counters for all words on the return stack. The \texttt{only-top} variable can be switched on, in which case only the counter for the word at the top of the return stack is incremented. This gives a more localized picture of CPU and memory usage. 
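The difference between the default counting mode and \texttt{only-top} can be illustrated with a small Python sketch. This is an analogy, not the Factor runtime's implementation: the function \texttt{tally}, the sample data, and the choice to count a word at most once per sample are all assumptions made for illustration.

```python
from collections import Counter


def tally(samples, only_top=False):
    """Count how often each word is seen in sampled return stacks.

    Each sample is a list of word names with the top of the stack last.
    Assumption: a word appearing twice in one sample is counted once.
    """
    counts = Counter()
    for stack in samples:
        if not stack:
            continue  # nothing was executing when this sample was taken
        seen = {stack[-1]} if only_top else set(stack)
        counts.update(seen)
    return counts


if __name__ == "__main__":
    # Hypothetical samples, one per profiler tick.
    samples = [
        ["make-image", "each", "2slip"],
        ["make-image", "each"],
        ["make-image", "write"],
    ]
    print(tally(samples))                 # callers accumulate counts too
    print(tally(samples, only_top=True))  # only the innermost word is charged
```

In the default mode, outer words such as the hypothetical \texttt{make-image} above accumulate a count for every sample taken beneath them, which is why callers dominate the listings; with only the top word counted, time is attributed to the word actually executing.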
+ +\subsection{\label{compiler}The compiler} + +The compiler can provide a substantial speed boost for words whose stack effect can be inferred. Words without a known stack effect cannot be compiled, and must be run in the interpreter. The compiler generates native code, and so far, x86 and PowerPC backends have been developed. + +To compile a single word, call \texttt{compile}: + +\begin{alltt} +\textbf{ok} \bs pref-size compile +\textbf{Compiling pref-size} +\end{alltt} + +During bootstrap, all words in the library with a known stack effect are compiled. You can +circumvent this, for whatever reason, by passing the \texttt{-no-compile} switch during +bootstrap: + +\begin{alltt} +\textbf{bash\$} ./f boot.image.le32 -no-compile +\end{alltt} + +The compiler has two limitations you must be aware of. First, if an exception is thrown in compiled code, the return stack will be incomplete, since compiled words do not push themselves there. Second, compiled code cannot be profiled. These limitations will be resolved in a future release. + +The compiler consists of multiple stages -- first, a dataflow graph is inferred, then various optimizations are done on this graph, then it is transformed into a linear representation, further optimizations are done, and finally, machine code is generated from the linear representation. To perform everything except for the machine code generation, use the \texttt{precompile} word. This will dump the optimized linear IR instead of generating code, which can be useful sometimes. 
+ +\begin{alltt} +\textbf{ok} \bs append precompile +\textbf{[ \#prologue ] +[ over ] +[[ \#jump-t-label G:54091 ]] +[ swap ] +[ drop ] +[ \#return ] +[[ \#label G:54091 ]] +[ >r ] +[[ \#call uncons ]] +[ r> ] +[[ \#call append ]] +[[ \#jump cons ]]} +\end{alltt} + +\printglossary + +\input{spec.ind} + +\end{document} diff --git a/doc/naming.txt b/doc/naming.txt deleted file mode 100644 index 7ca01e98c7..0000000000 --- a/doc/naming.txt +++ /dev/null @@ -1,78 +0,0 @@ -FACTOR CODING CONVENTIONS. - -=== Naming words - -foo. - perform "foo", but instead of pushing the result on the - stack, print it in a human-readable form suitable for - interactive use. - - Eg: words. vocabs. - -.X - four words to print the contents of the stacks: - .s - data stack - .r - call stack - .n - name stack - .c - catch stack - -foo* - a variation of "foo" that takes more parameters. - - Eg: index-of* parse* random-element* - - - a lower-level word used in the implementation of "foo". - - Eg: compile* prettyprint* - - - a word that is a variation on "foo", but is more specialized - and less frequently used. - - Eg: last* get* - -(foo) - a word that is only useful in the implementation of "foo". - - Eg: (vector=) (split) - ->to - convert object to type "to". - - Eg: >str >lower >upper >fixnum >realnum - - - move top of data stack "to" stack. - - Eg: >r >n >c - -from> - convert object from type "from". - - Eg: dec> oct> hex> - - - move top of "from" stack to data stack. - - Eg: r> n> c> - -one>two - convert object of type "one" to "two". - - Eg: stream>str stack>list worddef>list - - - transfer values between stacks. - - Eg: >r r> 2>r 2r> >n - - - create an object of "type". - - Eg: - -foo@ - get the value of a variable at the top of the stack; - operate on the value with "foo"; store the value back in the - variable. - - Eg: +@ *@ -@ /@ cons@ append@ - -foo-iter - a tail-recursive word used in the implementation of "foo".
- - Eg: nreverse-iter partition-iter - -nfoo - on lists, a destructive (non-consing) version of "foo". - - Eg: nappend nreverse - -2foo - like foo but with two operands taken from stack. - - Eg: 2drop 2dup 2each diff --git a/doc/tools.tex b/doc/tools.tex deleted file mode 100644 index 5411f15c8d..0000000000 --- a/doc/tools.tex +++ /dev/null @@ -1,704 +0,0 @@ -\documentclass{article} -\usepackage{times} -\usepackage{tabularx} -\usepackage{alltt} - -\newcommand{\ttbs}{\symbol{92}} -\newcommand{\tto}{\symbol{123}} -\newcommand{\ttc}{\symbol{125}} -\begin{document} - -\title{The Factor Development Environment} -\author{Slava Pestov} - -\maketitle - -\tableofcontents - -\section{Introduction} - -This article covers the interactive development environment for the Factor programming language, whose homepage is located at \texttt{http://factor.sf.net}. - -Factor supports interactive development in a live environment. Instead of working with -static executable files and restarting your application after each change, you can -incrementally make changes to your application and test them immediately. If you -notice an undesirable behavior, Factor's powerful reflection features will aid in -pinpointing the error. - -If you are used to a statically typed language, you might find Factor's tendency to only fail at runtime hard to work with at first. However, the interactive development tools outlined in this document allow a much quicker turn-around time for testing changes. Also, write unit tests -- unit testing is a great way to ensure that old bugs do not re-appear once they've been fixed. - -\section{System organization} - -\subsection{The listener} - -Factor is an \emph{image-based environment}. When you compiled Factor, you also generated a file named \texttt{factor.image}. 
I will have more to say about images later, but for now it suffices to understand that to start Factor, you must pass the image file name on the command line: - -\begin{alltt} -./f factor.image -\textbf{Loading factor.image... relocating... done -Factor 0.73 :: http://factor.sourceforge.net :: unix/x86 -(C) 2003, 2005 Slava Pestov, Chris Double, -Mackenzie Straight -ok} -\end{alltt} - - -An \texttt{\textbf{ok}} prompt is printed after the initial banner, indicating the listener is ready to execute Factor phrases. The listener is a piece of Factor code, like any other; however, it helps to think of it as the primary interface to the Factor system. The listener reads Factor code and executes it. You can try the classical first program: - -\begin{alltt} -\textbf{ok} "Hello, world." print -\textbf{Hello, world.} -\end{alltt} - - -Multi-line phrases are supported; if there are unclosed brackets, the listener outputs \texttt{...} instead of the \texttt{ok} prompt, and the entire phrase is executed once all brackets are closed: - -\begin{alltt} -\textbf{ok} [ 1 2 3 ] [ -\textbf{...} . -\textbf{...} ] each -\textbf{1 -2 -3} -\end{alltt} - -The listener knows when to print a continuation prompt by looking at the height of the -stack. Parsing words such as \texttt{[} and \texttt{:} leave elements on the parser -stack; these elements are popped by \texttt{]} and \texttt{;}. - -\subsection{Source files} - -While it is possible to do all development in the listener and save your work in images, it is far more convenient to work with source files, at least until an in-image structure editor is developed. - -By convention, Factor source files are saved with the \texttt{.factor} filename extension. They can be loaded into the image as follows: - -\begin{alltt} -\textbf{ok} "examples/numbers-game.factor" run-file -\end{alltt} - -In Factor, loading a source file replaces any existing definitions\footnote{But see \ref{compiler} for this is not true of compiled code.}. 
Each word definition remembers what source file it was loaded from (if any). To reload the source file associated with a definition, use the \texttt{reload} word: - -\begin{alltt} -\textbf{ok} \ttbs draw reload -\end{alltt} - -Word definitions also retain the line number where they are located in their original source file. This allows you to open a word definition in jEdit\footnote{\texttt{http://www.jedit.org}} for editing using the -\texttt{jedit} word: - -\begin{alltt} -\textbf{ok} \ttbs compile jedit -\end{alltt} - -This word requires that a jEdit instance is already running. - -For the \texttt{jedit} word to work with words in the Factor library, you must set the \texttt{"resource-path"} variable to the location of the Factor source tree. One way to do this is to add a phrase like the following to your \texttt{.factor-rc}: - -\begin{verbatim} -"/home/slava/Factor/" "resource-path" set -\end{verbatim} - -On startup, Factor reads the \texttt{.factor-rc} file from your home directory. You can put -any quick definitions you want available at the listener there. To avoid loading this -file, pass the \texttt{-no-user-init} command line switch. Another way to have a set of definitions available at all times is to save a custom image, as described in the next section. - -\subsection{Images} - -The \texttt{factor.image} file is basically a dump of all objects in the heap. A new image can be saved as follows: - -\begin{alltt} -\textbf{ok} "work.image" save-image -\textbf{Saving work.image...} -\end{alltt} - -When you save an image before exiting Factor, then start Factor again, everything will be almost as you left it. Try the following: - -\begin{alltt} -./f factor.image -\textbf{ok} "Learn Factor" "reminder" set -\textbf{ok} "factor.image" save-image bye -\textbf{Saving factor.image...} -\end{alltt} - -Factor will save the image and exit. Now start it again and see that the reminder is still there: - -\begin{alltt} -./f factor.image -\textbf{ok} "reminder" get . 
-\textbf{"Learn Factor"} -\end{alltt} - -This is what is meant by the image being an \emph{infinite session}. When you shut down and restart Factor, what happens is much closer to a laptop's ``suspend'' mode than to a desktop computer being fully shut down. - -\subsection{Looking at objects} - -Probably the most important debugging tool of them all is the \texttt{.} word. It prints the object at the top of the stack in a form that can be parsed by the Factor parser. A related word is \texttt{prettyprint}. It is identical to \texttt{.} except the output is more verbose; lists, vectors and hashtables are broken up into multiple lines and indented. - -\begin{alltt} -\textbf{ok} [ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ] . -[ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ] -\end{alltt} - -Most objects print in a parsable form, but not all. One exception to this rule is objects with external state, such as I/O ports or aliens (pointers to native structures). Also, objects with circular or very deeply nested structure will not print in a fully parsable form, since the prettyprinter has a limit on maximum nesting. Here is an example -- a vector is created that holds a list whose first element is the vector itself: - -\begin{alltt} -\textbf{ok} \tto \ttc [ unit 0 ] keep [ set-vector-nth ] keep . -\tto [ ... ] \ttc -\end{alltt} - -The prettyprinted form of a vector or list with many elements is not always readable. The \texttt{[.]} and \texttt{\tto.\ttc} words output a list or a vector, respectively, with each element on its own line. In fact, the stack printing words are defined in terms of \texttt{[.]} and \texttt{\tto.\ttc}: - -\begin{verbatim} -: .s datastack {.} ; -: .r callstack {.} ; -: .n namestack [.] ; -: .c catchstack [.] ; -\end{verbatim} - -Before we move on, one final set of output words is used to output integers in -different numeric bases. The \texttt{.b} word prints an integer in binary, \texttt{.o} in octal, and \texttt{.h} in hexadecimal.
- -\begin{alltt} -\textbf{ok} 31337 .b -\textbf{111101001101001} -\textbf{ok} 31337 .o -\textbf{75151} -\textbf{ok} 31337 .h -\textbf{7a69} -\end{alltt} - -\section{Word tools} - -\subsection{Exploring vocabularies} - -Factor organizes code in a two-tier structure of vocabularies and words. A word is the smallest unit of code; it corresponds to a function or method in other languages. Vocabularies group related words together for easy browsing and tracking of source dependencies. - -Entering \texttt{vocabs .}~in the listener produces a list of all existing vocabularies: - -\begin{alltt} -\textbf{ok} vocabs . -\textbf{[ "alien" "ansi" "assembler" "browser-responder" -"command-line" "compiler" "cont-responder" "errors" -"file-responder" "files" "gadgets" "generic" -"hashtables" "html" "httpd" "httpd-responder" "image" -"inference" "interpreter" "io-internals" "jedit" -"kernel" "kernel-internals" "line-editor" "listener" -"lists" "logging" "math" "math-internals" "memory" -"namespaces" "parser" "prettyprint" "profiler" -"quit-responder" "random" "resource-responder" -"scratchpad" "sdl" "shells" "stdio" "streams" -"strings" "syntax" "telnetd" "test" "test-responder" -"threads" "unparser" "url-encoding" "vectors" "words" ]} -\end{alltt} - -As you can see, there are a lot of vocabularies! Now, you can use \texttt{words .}~to list the words inside a given vocabulary: - -\begin{alltt} -\textbf{ok} "namespaces" words . -\textbf{[ (get) , >n append, bind change cons@ -dec extend get global inc init-namespaces list-buffer -literal, make-list make-rlist make-rstring make-string -make-vector n> namespace namestack nest off on put set -set-global set-namestack unique, unique@ with-scope ]} -\end{alltt} - -You can look at the definition of any word, including library words, using \texttt{see}. Keep in mind you might have to \texttt{USE:} the vocabulary first. 
- -\begin{alltt} -\textbf{ok} USE: httpd -\textbf{ok} \ttbs httpd-connection see -\textbf{IN: httpd : httpd-connection ( socket -- ) - "http-server" get accept [ - httpd-client - ] in-thread drop ;} -\end{alltt} - -The \texttt{see} word shows a reconstruction of the source code, not the original source code. So in particular, formatting and some comments are lost. - -\subsection{Cross-referencing words} - -The \texttt{apropos.} word is handy when searching for related words. It lists all words -whose names contain a given string. The \texttt{apropos.} word is also useful when you know the exact name of a word, but are unsure what vocabulary it is in. For example, if you're looking for ways to iterate over various collections, you can do an apropos search for \texttt{map}: - -\begin{alltt} -\textbf{ok} "map" apropos. -\textbf{IN: inference -type-value-map -IN: lists -map -map-with -IN: sdl -set-surface-map -surface-map -IN: strings -string-map -IN: vectors -vector-map} -\end{alltt} - -From the above output, you can see that \texttt{map} is for lists, \texttt{string-map} is for strings, and \texttt{vector-map} is for vectors. - -The \texttt{usage} word finds all words that refer to a given word and pushes a list on the stack. This word is helpful in two situations; the first is for learning -- a good way to learn a word is to see it used in context. The second is during refactoring -- if you change a word's stack effect, you must also update all words that call it. Usually you print the -return value of \texttt{usage} using \texttt{.}: - -\begin{alltt} -\textbf{ok} \ttbs string-map usage . -\textbf{schars>entities -filter-null -url-encode} -\end{alltt} - -Another useful word is \texttt{usages}. Unlike \texttt{usage}, it finds all usages, even -indirect ones -- so if a word refers to another word that refers to the given word, -both words will be in the output list. - -\subsection{Exploring classes} - -Factor supports object-oriented programming via generic words. 
Generic words are called -like ordinary words, however they can have multiple definitions, one per class, and -these definitions do not have to appear in the same source file. Such a definition is -termed a \emph{method}, and the method is said to \emph{specialize} on a certain -class. A class in the most -general sense is just a set of objects. You can output a list of classes in the system -with \texttt{classes .}: - -\begin{alltt} -\textbf{ok} classes. -\textbf{[ alien alien-error byte-array displaced-alien -dll ansi-stream disp-only displaced indirect operand -register absolute absolute-16/16 relative relative-bitfld -item kernel-error no-method border checkbox dialog editor -ellipse etched-rect frame gadget hand hollow-ellipse -hollow-rect label line menu pane pile plain-ellipse -plain-rect rectangle roll-rect scroller shelf slider -stack tile viewport world 2generic arrayed builtin -complement generic null object predicate tuple -tuple-class union hashtable html-stream class-tie -computed inference-error inference-warning literal -literal-tie value buffer port jedit-stream boolean -general-t array cons general-list list bignum complex -fixnum float integer number ratio rational real -parse-error potential-float potential-ratio -button-down-event button-up-event joy-axis-event -joy-ball-event joy-button-down-event joy-button-up-event -joy-hat-event key-down-event key-up-event motion-event -quit-event resize-event user-event sequence stdio-stream -client-stream fd-stream null-stream server string-output -wrapper-stream LETTER blank digit letter printable sbuf -string text POSTPONE: f POSTPONE: t vector compound -primitive symbol undefined word ]} -\end{alltt} - -If you \texttt{see} a generic word, all methods defined on the generic word are shown. -Alternatively, you can use \texttt{methods.} to print all methods specializing on a -given class: - -\begin{alltt} -\textbf{ok} \ttbs list methods. 
-\textbf{PREDICATE: general-list list - dup [ - last* cdr - ] when not ; -IN: gadgets -M: list custom-sheet - [ - length count - ] keep zip alist>sheet "Elements:" ; -IN: prettyprint -M: list prettyprint* - [ - [ - POSTPONE: [ - ] car swap [ - POSTPONE: ] - ] car prettyprint-sequence - ] check-recursion ;} -\end{alltt} - -\subsection{Browsing via the HTTP server} - - -A more sophisticated way to browse the library is using the integrated HTTP server. You can start the HTTP server using the following pair of commands: - -\begin{alltt} -\textbf{ok} USE: httpd -\textbf{ok} 8888 httpd -\end{alltt} - -Then, point your browser to the following URL, and start browsing: - -\begin{quote} -\texttt{http://localhost:8888/responder/inspect/vocabularies} -\end{quote} - -To stop the HTTP server, point your browser to - -\begin{quote} -\texttt{http://localhost:8888/responder/quit}. -\end{quote} - -You can even start the HTTP server in a separate thread, and look at code in your web browser while continuing to play in the listener: - -\begin{alltt} -\textbf{ok} USE: httpd -\textbf{ok} USE: threads -\textbf{ok} [ 8888 httpd ] in-thread -\end{alltt} - -\section{Dealing with runtime errors} - -\subsection{Looking at stacks} - -To see the contents of the data stack, use the \texttt{.s} word. Similarly, the other stacks can be shown with \texttt{.r} (return stack), \texttt{.n} (name stack), and \texttt{.c} (catch stack). Each stack is printed with each element on its own line; the top of the stack is the first element printed. - -\subsection{The debugger} - -If the execution of a phrase in the listener causes an error to be thrown, the error -is printed and the stacks at the time of the error are saved. If you've spent any -time with Factor at all, you are probably familiar with this type of message: - -\begin{alltt} -\textbf{ok} [ 1 2 3 ] 4 append reverse -\textbf{The generic word car does not have a suitable method for 4 -:s :r :n :c show stacks at time of error.
-:get ( var -- value ) inspects the error namestack.} -\end{alltt} - -The words \texttt{:s}, \texttt{:r}, \texttt{:n} and \texttt{:c} behave like their counterparts that are prefixed with \texttt{.}, except they show the stacks as they were when the error was thrown. - -The return stack warrants some special attention. To successfully develop Factor, you will need to learn to understand how it works. Let's look at the first few lines of the return stack at the time of the above error: - -\begin{verbatim} -[ swap cdr ] -uncons -[ r> tuck 2slip ] -(each) -[ swons ] -[ each ] -each -\end{verbatim} - -You can see the sequence of calls leading up to the error was \texttt{each} calling \texttt{(each)} calling \texttt{uncons}. The error tells us that the \texttt{car} word is the one that failed. Now, you can stare at the stack dump, and notice that if the call to \texttt{car} was successful and execution returned to \texttt{(each)}, the quotation \texttt{[ r> tuck 2slip ]} would resume executing. The first word there, \texttt{r>}, would take the quotation \texttt{[ swons ]} and put it back on the data stack. After \texttt{(each)} returned, it would then continue executing the quotation \texttt{[ each ]}. So what is going on here is a recursive loop, \texttt{[ swons ] each}. If you look at the definition of \texttt{reverse}, you will see that this is exactly what is being done: - -\begin{verbatim} -: reverse ( list -- list ) [ ] swap [ swons ] each ; -\end{verbatim} - -So a list is being reversed, but at some stage, the \texttt{car} is taken of something that is not a list. Now, you can look at the data stack with \texttt{:s}: - -\begin{verbatim} -<< no-method [ ] 4 car >> -car -4 -4 -[ 3 2 1 ] -\end{verbatim} - -So now, the mystery has been solved: as \texttt{reverse} iterates down the input value, it hits a cons cell whose \texttt{cdr} is not a list.
Indeed, if you look at the value we are passing to \texttt{reverse}, you will see why: - -\begin{alltt} -\textbf{ok} [ 1 2 3 ] 4 append . -[[ 1 [[ 2 [[ 3 4 ]] ]] ]] -\end{alltt} - -In the future, the debugger will be linked with the walker, documented below. Right now, the walker is a separate tool. Another caveat is that in compiled code, the return stack is not reconstructed if there is an error. Until this is fixed, you should only compile code once it is debugged. For more potential compiler pitfalls, see \ref{compiler}. - -\subsection{The walker} - -The walker lets you step through the execution of a quotation. When a colon definition is reached, you can either keep walking inside the definition, or execute it in one step. The stacks can be inspected at each stage. - -There are two ways to use the walker. First of all, you can call the \texttt{walk} word explicitly, giving it a quotation: - -\begin{alltt} -\textbf{ok} [ [ 10 [ dup , ] repeat ] make-list ] walk -\textbf{\&s \&r \&n \&c show stepper stacks. -\&get ( var -- value ) inspects the stepper namestack. -step -- single step over -into -- single step into -continue -- continue execution -bye -- exit single-stepper -[ [ 10 [ dup , ] repeat ] make-list ] -walk} -\end{alltt} - -As you can see, the walker prints a brief help message, then the currently executing quotation. It changes the listener prompt from \texttt{ok} to \texttt{walk}, to remind you that there is a suspended continuation. - -The first element of the quotation shown is the next object to be evaluated. If it is a literal, both \texttt{step} and \texttt{into} have the effect of pushing it on the walker data stack. If it is a colon definition, then \texttt{into} will recurse the walker into the colon definition; otherwise, the word executes in one step. - -The \texttt{\&r} word shows the walker return stack, which is laid out just like the primary interpreter's return stack.
In fact, a good way to understand how Factor's return stack works is to play with the walker. - -Note that the walker does not automatically stop when the quotation originally given finishes executing; it just keeps on walking up the return stack, and even lets you step through the listener's code. You can invoke \texttt{continue} or \texttt{exit} to terminate the walker. - -While the walker can be invoked explicitly using the \texttt{walk} word, sometimes it is more convenient to \emph{annotate} a word such that the walker is invoked automatically when the word is called. This can be done using the \texttt{break} word: - -\begin{alltt} -\textbf{ok} \ttbs layout* break -\end{alltt} - -Now, when some piece of code calls \texttt{layout*}, the walker will open, and you will be able to step through execution and see exactly what's going on. An important point to keep in mind is that when the walker is invoked in this manner, \texttt{exit} will not have the desired effect; execution will continue, but the data stack will be inconsistent, and an error will most likely be raised a short time later. Always use \texttt{continue} to resume execution after a break. - -The walker is very handy, but sometimes you just want to see if a word is being called at all and when, and you don't care to single-step it. In that case, you can use the \texttt{watch} word: - -\begin{alltt} -\textbf{ok} \ttbs draw-shape watch -\end{alltt} - -Now when \texttt{draw-shape} is called, a message will be printed to that effect. - -You can undo the effect of \texttt{break} or \texttt{watch} by reloading the original source file containing the word definition in question: - -\begin{alltt} -\textbf{ok} \ttbs layout* reload -\textbf{ok} \ttbs draw-shape reload -\end{alltt} - -\subsection{Dealing with hangs} - -If you accidentally start an infinite loop, you can send the Factor runtime a \texttt{QUIT} signal. On Unix, this is done by pressing \texttt{Control-\ttbs} in the controlling terminal.
This will cause the runtime to dump the data and return stacks in a semi-readable form. Note that this will help you find the root cause of the hang, but it will not let you interrupt the infinite loop. - - -\section{Defensive coding} - -\subsection{Unit testing} - -Unit tests are very easy to write. They are usually placed in source files. A unit test can be executed with the \texttt{unit-test} word in the \texttt{test} vocabulary. This word takes a list and a quotation; the quotation is executed, and the resulting data stack is compared against the list. If they do not equal, the unit test has failed. Here is an example of a unit test: - -\begin{verbatim} -[ "Hello, crazy world" ] [ - "editor" get [ 0 caret set ] bind - ", crazy" 5 "editor" get [ line-insert ] bind - "editor" get [ line-text get ] bind -] unit-test -\end{verbatim} - -To have a unit test assert that a piece of code does not execute successfully, but rather throws an exception, use the \texttt{unit-test-fails} word. It takes only one quotation; if the quotation does \emph{not} throw an exception, the unit test has failed. - -\begin{verbatim} -[ -3 { } vector-nth ] unit-test-fails -\end{verbatim} - -Unit testing is a good habit to get into. Sometimes, writing tests first, before any code, can speed the development process too; by running your unit test script, you can gauge progress. - -\subsection{Stack effect inference} - -While most programming errors in Factor are only caught at runtime, the stack effect checker can be useful for checking correctness of code before it is run. It can also help narrow down problems with stack shuffling. The stack checker is used by passing a quotation to the \texttt{infer} word. It uses a sophisticated algorithm to infer stack effects of recursive words, combinators, and other tricky constructions, however, it cannot infer the stack effect of all words. In particular, anything using continuations, such as \texttt{catch} and I/O, will stump the stack checker. 
Despite this fault, it is still a useful tool. - -\begin{alltt} -\textbf{ok} [ pile-fill * >fixnum over pref-size dup y -\texttt{...} [ + ] change ] infer . -\textbf{[ [ tuple number tuple ] [ tuple fixnum object number ] ]} -\end{alltt} - -The stack checker will report an error if it cannot infer the stack effect of a quotation. The ``recursive state'' dump is similar to a return stack, but it is not a real return stack, since only a code walk is taking place, not full evaluation. Understanding recursive state dumps is an art, much like understanding return stacks. - -\begin{alltt} -\textbf{ok} [ 100 [ f f cons ] repeat ] infer . -\textbf{! Inference error: Unbalanced branches -! Recursive state: -! [ (repeat) G:54044 pick pick >= [ 3drop ] - [ [ swap >r call 1 + r> ] keep (repeat) ] ifte ] -! [ repeat G:54042 0 -rot (repeat) ] -:s :r :n :c show stacks at time of error. -:get ( var -- value ) inspects the error namestack.} -\end{alltt} - -One reason stack inference might fail is if the quotation contains unbalanced branches, as above. For the inference to work, both branches of a conditional must exit with the same stack height. - -Another situation when it fails is if your code calls quotations that are not statically known. This can happen if the word in question uses continuations, or if it pulls a quotation from a variable and calls it. This can also happen if you wrote your own combinator, but forgot to mark it as \texttt{inline}. For example, the following will fail: - -\begin{alltt} -\textbf{ok} : dip swap >r call r> ; -\textbf{ok} [ [ + ] dip * ] infer . -! Inference error: A literal value was expected where a -computed value was found: \# -... -\end{alltt} - -However, defining \texttt{dip} to be inlined will work: - -\begin{alltt} -\textbf{ok} : dip swap >r call r> ; inline -\textbf{ok} [ [ + ] dip * ] infer .
-\textbf{[ [ number number number ] [ number ] ]} -\end{alltt} - -You can combine unit testing with stack effect inference by writing unit tests that check stack effects of words. In fact, this can be automated with the \texttt{infer>test.} word; it takes a quotation on the stack, and prints a code snippet that tests the stack effect of the quotation: - -\begin{alltt} -\textbf{ok} [ draw-shape ] infer>test. -\textbf{[ [ [ object ] [ ] ] ] -[ [ draw-shape ] infer ] -unit-test} -\end{alltt} - -You can then copy and paste this snippet into a test script, and run the test script after -making changes to the word to ensure its stack effect signature has not changed. - -\section{Optimization} - -While both the Factor interpreter and compiler are relatively slow at this stage, there -are still ways you can make your Factor code go faster. The key is to find bottlenecks, -and optimize them. - -\subsection{Timing code} - -The \texttt{time} word reports the time taken to execute a quotation, in milliseconds. The portion of time spent in garbage collection is also shown: - -\begin{alltt} -\textbf{ok} [ 1000000 [ f f cons drop ] repeat ] time -\textbf{515 milliseconds run time -11 milliseconds GC time} -\end{alltt} - -\subsection{Exploring memory usage} - -Factor supports heap introspection. You can find all objects in the heap that match a certain predicate using the \texttt{instances} word. For example, if you suspect a resource leak, you can find all I/O ports as follows: - -\begin{alltt} -\textbf{ok} USE: io-internals -\textbf{ok} [ port? ] instances . -\textbf{[ \# \# ]} -\end{alltt} - -The \texttt{references} word finds all objects that refer to a given object: - -\begin{alltt} -\textbf{ok} [ float? ] instances car references . -\textbf{[ \# [ -1.0 0.0 / ] ]} -\end{alltt} - -You can print a memory usage summary with \texttt{room.}: - -\begin{alltt} -\textbf{ok} room. 
-\textbf{Data space: 16384 KB total 2530 KB used 13853 KB free -Code space: 16384 KB total 490 KB used 15893 KB free} -\end{alltt} - -And finally, a detailed memory allocation breakdown by type with \texttt{heap-stats.}: - -\begin{alltt} -\textbf{ok} heap-stats. -\textbf{bignum: 312 bytes, 17 instances -cons: 850376 bytes, 106297 instances -float: 112 bytes, 7 instances -t: 8 bytes, 1 instances -array: 202064 bytes, 3756 instances -hashtable: 54912 bytes, 3432 instances -vector: 5184 bytes, 324 instances -string: 391024 bytes, 7056 instances -sbuf: 64 bytes, 4 instances -port: 112 bytes, 2 instances -word: 96960 bytes, 3030 instances -tuple: 688 bytes, 22 instances} -\end{alltt} - -\subsection{The profiler} - -Factor provides a statistical sampling profiler for narrowing down memory and processor bottlenecks. -The profiler is only supported on Unix platforms. On FreeBSD 4.x, the Factor runtime must -be compiled without the \texttt{-pthread} switch, since FreeBSD 4.x userspace threading makes -use of a signal that conflicts with the signal used for profiling. - -The \texttt{allot-profile} word executes a quotation with the memory profiler enabled, then prints a list of all words that allocated memory, along with the bytes allocated. Note that during particularly long executions, or executions where a lot of memory is allocated, these counters may overrun. - -\begin{alltt} -\textbf{ok} [ "boot.image.le32" make-image ] allot-profile -\emph{...
many lines omitted ...} -\textbf{[[ write-little-endian-32 673952 ]] -[[ wait-to-read-line 788640 ]] -[[ blocking-read-line 821264 ]] -[[ vocabularies 822624 ]] -[[ parse-resource 823376 ]] -[[ next-line 1116440 ]] -[[ vector-map 1326504 ]] -[[ fixup-words 1326520 ]] -[[ vector-each 1768640 ]] -[[ (parse) 2434208 ]] -[[ classes 2517920 ]] -[[ when* 2939088 ]] -[[ while 3614408 ]] -[[ (parse-stream) 3634552 ]] -[[ make-list 3862000 ]] -[[ object 4143784 ]] -[[ each 4712080 ]] -[[ run-resource 5036904 ]] -[[ (callcc) 5183400 ]] -[[ catch 5188976 ]] -[[ 2slip 8631736 ]] -[[ end 202896600 ]] -[[ make-image 208611888 ]] -[[ with-scope 437823992 ]]} -\end{alltt} - -The \texttt{call-profile} word executes a quotation with the CPU profiler enabled, then prints a list of all words that were found on the return stack, along with the number of times they were seen there. This gives a rough idea of what words are taking up the majority of execution time. - -\begin{alltt} -\textbf{ok} [ "boot.image.le32" make-image ] call-profile -\emph{... many lines omitted ...} -\textbf{[[ stream-write 7 ]] -[[ wait-to-write 7 ]] -[[ vector-map 11 ]] -[[ fixup-words 11 ]] -[[ when* 12 ]] -[[ write 16 ]] -[[ write-word 17 ]] -[[ parse-loop 22 ]] -[[ make-list 24 ]] -[[ (parse) 29 ]] -[[ blocking-write 32 ]] -[[ while 35 ]] -[[ (parse-stream) 36 ]] -[[ dispatch 47 ]] -[[ run-resource 50 ]] -[[ write-little-endian-32 76 ]] -[[ (callcc) 173 ]] -[[ catch 174 ]] -[[ each 175 ]] -[[ 2slip 199 ]] -[[ end 747 ]] -[[ make-image 785 ]] -[[ with-scope 1848 ]]} -\end{alltt} - -Normally, the memory and CPU profilers run every millisecond, and increment counters for all words on the return stack. The \texttt{only-top} variable can be switched on, in which case only the counter for the word at the top of the return stack is incremented. This gives a more localized picture of CPU and memory usage. 
- -\subsection{\label{compiler}The compiler} - -The compiler can provide a substantial speed boost for words whose stack effect can be inferred. Words without a known stack effect cannot be compiled, and must be run in the interpreter. The compiler generates native code, and so far, x86 and PowerPC backends have been developed. - -To compile a single word, call \texttt{compile}: - -\begin{alltt} -\textbf{ok} \ttbs pref-size compile -\textbf{Compiling pref-size} -\end{alltt} - -During bootstrap, all words in the library with a known stack effect are compiled. You can -circumvent this, for whatever reason, by passing the \texttt{-no-compile} switch during -bootstrap: - -\begin{alltt} -\textbf{bash\$} ./f boot.image.le32 -no-compile -\end{alltt} - -The compiler has two limitations you must be aware of. First, if an exception is thrown in compiled code, the return stack will be incomplete, since compiled words do not push themselves there. Second, compiled code cannot be profiled. These limitations will be resolved in a future release. - -The compiler consists of multiple stages -- first, a dataflow graph is inferred, then various optimizations are done on this graph, then it is transformed into a linear representation, further optimizations are done, and finally, machine code is generated from the linear representation. To perform everything except for the machine code generation, use the \texttt{precompile} word. This will dump the optimized linear IR instead of generating code, which can be useful sometimes. 
- -\begin{alltt} -\textbf{ok} \ttbs append precompile -\textbf{[ \#prologue ] -[ over ] -[[ \#jump-t-label G:54091 ]] -[ swap ] -[ drop ] -[ \#return ] -[[ \#label G:54091 ]] -[ >r ] -[[ \#call uncons ]] -[ r> ] -[[ \#call append ]] -[[ \#jump cons ]]} -\end{alltt} - -\end{document} diff --git a/library/collections/sequences-epilogue.factor b/library/collections/sequences-epilogue.factor index 7c8d7af00a..5986387c67 100644 --- a/library/collections/sequences-epilogue.factor +++ b/library/collections/sequences-epilogue.factor @@ -52,7 +52,7 @@ M: cons (tree-each) [ car (tree-each) ] 2keep cdr (tree-each) ; M: f (tree-each) swap call ; -M: sequence (tree-each) [ swap call ] seq-each-with ; +M: sequence (tree-each) [ (tree-each) ] seq-each-with ; : tree-each swap (tree-each) ; inline diff --git a/library/generic/generic.factor b/library/generic/generic.factor index 50f979a4ff..f0ce3cb368 100644 --- a/library/generic/generic.factor +++ b/library/generic/generic.factor @@ -44,6 +44,9 @@ namespaces parser strings words vectors math math-internals ; : methods ( generic -- alist ) "methods" word-prop hash>alist [ 2car class< ] sort ; +: order ( generic -- list ) + "methods" word-prop hash-keys [ class< ] sort ; + : add-method ( generic vtable definition class -- ) #! Add the method entry to the vtable. Unlike define-method, #! this is called at vtable build time, and in the sorted