factor/doc/devel-guide.tex

2891 lines
109 KiB
TeX

% :indentSize=4:tabSize=4:noTabs=true:mode=tex:wrap=soft:
\documentclass[english]{book}
\makeindex
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage{alltt}
\usepackage{tabularx}
\usepackage{times}
\pagestyle{headings}
\setcounter{tocdepth}{1}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}
\raggedbottom
\raggedright
\newcommand{\ttbackslash}{\char'134}
\newcommand{\sidebar}[1]{{\fbox{\fbox{\parbox{10cm}{\begin{minipage}[b]{10cm}
{\LARGE For wizards}
#1
\end{minipage}}}}}}
\newcommand{\chapkeywords}[1]{%{\parbox{10cm}{\begin{minipage}[b]{10cm}
\begin{quote}
\emph{Key words:} \texttt{#1}
\end{quote}
%\end{minipage}}}}
}
\makeatletter
\newcommand{\wordtable}[1]{{
\begin{tabularx}{12cm}{|l l X|}
#1
\hline
\end{tabularx}}}
\newcommand{\tabvocab}[1]{
\hline
\multicolumn{3}{|l|}{
\rule[-2mm]{0mm}{6mm}
\texttt{#1} vocabulary:}
\\
\hline
}
\usepackage{babel}
\makeatother
\begin{document}
\title{Factor Developer's Guide}
\author{Slava Pestov}
\maketitle
\tableofcontents{}
\chapter*{Introduction}
Factor is a programming language with functional and object-oriented
influences. Factor borrows heavily from Forth, Joy and Lisp. Programmers familiar with these languages will recognize many similarities with Factor.
Factor is \emph{interactive}. This means it is possible to run a Factor interpreter that reads from the keyboard, and immediately executes expressions as they are entered. This allows words to be defined and tested one at a time.
Factor is \emph{dynamic}. This means that all objects in the language are fully reflective at run time, and that new definitions can be entered without restarting the interpreter. Factor code can be used interchangably as data, meaning that sophisticated language extensions can be realized as libraries of words.
Factor is \emph{safe}. This means all code executes in a virtual machine that provides
garbage collection and prohibits direct pointer arithmetic. There is no way to get a dangling reference by deallocating a live object, and it is not possible to corrupt memory by overwriting the bounds of an array.
When examples of interpreter interactions are given in this guide, the input is in a roman font, and any
output from the interpreter is in boldface:
\begin{alltt}
\textbf{ok} "Hello, world!" print
\textbf{Hello, world!}
\end{alltt}
\chapter{First steps}
This chapter will cover the basics of interactive development using the listener. Short code examples will be presented, along with an introduction to programming with the stack, a guide to using source files, and some hints for exploring the library.
\section{The listener}
\chapkeywords{print write read}
\index{\texttt{print}}
\index{\texttt{write}}
\index{\texttt{read}}
Factor is an \emph{image-based environment}. When you compiled Factor, you also generated a file named \texttt{factor.image}. You will learn more about images later, but for now it suffices to understand that to start Factor, you must pass the image file name on the command line:
\begin{alltt}
./f factor.image
\textbf{Loading factor.image... relocating... done
This is native Factor 0.69
Copyright (C) 2003, 2004 Slava Pestov
Copyright (C) 2004 Chris Double
Type ``exit'' to exit, ``help'' for help.
65536 KB total, 62806 KB free
ok}
\end{alltt}
An \texttt{\textbf{ok}} prompt is printed after the initial banner, indicating the listener is ready to execute Factor expressions. You can try the classical first program:
\begin{alltt}
\textbf{ok} "Hello, world." print
\textbf{Hello, world.}
\end{alltt}
One thing all programmers have in common is they make large numbers of mistakes. There are two types of errors; syntax errors, and logic errors. Syntax errors indicate a simple typo or misplaced character.
A logic error is not associated with a piece of source code, but rather an execution state. We will learn how to debug them later. The listener will indicate the point in the source code where the confusion occurs:
\begin{alltt}
\textbf{ok} "Hello, world" pr,int
\textbf{<interactive>:1: Not a number
"Hello, world" pr,int
^
:s :r :n :c show stacks at time of error.}
\end{alltt}
Factor source is composed from whitespace-separated words. Factor is case-sensitive. Each one of the following is exactly one word, and all four are distinct:
\begin{verbatim}
2dup
2DUP
[
@$*(%#
\end{verbatim}
A frequent beginner's error is to leave out whitespace between words. When you are entering the code examples in the following sections, keep in mind that you cannot omit arbitrary whitespace:
\begin{alltt}
:greet "Greetings, " write print; \emph{! incorrect}
: greet "Greetings, " write print ; \emph{! correct}
\end{alltt}
\section{Colon definitions}
\chapkeywords{:~; see}
\index{\texttt{:}}
\index{\texttt{;}}
\index{\texttt{see}}
Factor words are similar to functions and procedures in other languages. Words are defined using \emph{colon definitino} syntax. Some words, like \texttt{print}, \texttt{write} and \texttt{read}, along with dozens of others we will see, are part of Factor. Other words will be created by you.
When you create a new word, you are associating a name with a particular sequence of \emph{already-existing} words. Enter the following colon definition in the listener:
\begin{alltt}
\textbf{ok} : ask-name "What is your name? " write read ;
\end{alltt}
What did we do above? We created a new word named \texttt{ask-name}, and associated with it the definition \texttt{"What is your name? " write read}. Now, lets type in two more colon definitions. The first one prints a personalized greeting. The second colon definition puts the first two together into a complete program.
\begin{alltt}
\textbf{ok} : greet "Greetings, " write print ;
\textbf{ok} : friend ask-name greet ;
\end{alltt}
Now that the three words \texttt{ask-name}, \texttt{greet}, and \texttt{friend} have been defined, simply typing \texttt{friend} in the listener will run our example program:
\begin{alltt}
\textbf{ok} friend
\textbf{What is your name? }Lambda the Ultimate
\textbf{Greetings, Lambda the Ultimate}
\end{alltt}
Notice that the \texttt{ask-name} word passes a piece of data to the \texttt{greet} word. We will worry about the exact details later -- for now, our focus is the colon definition syntax itself.
You can look at the definition of any word, including library words, using \texttt{see}:
\begin{alltt}
\textbf{ok} \ttbackslash greet see
\textbf{IN: scratchpad
: greet
"Greetings, " write print ;}
\end{alltt}
The \texttt{see} word shows a reconstruction of the source code, not the original source code. So in particular, formatting and some comments are lost.
\sidebar{
Factor is written in itself. Indeed, most words in the Factor library are colon definitions, defined in terms of other words. Out of more than two thousand words in the Factor library, less than two hundred are \emph{primitive}, or built-in to the runtime.
If you exit Factor by typing \texttt{exit}, any colon definitions entered at the listener will be lost. However, you can save a new image first, and use this image next time you start Factor in order to have access to your definitions.
\begin{alltt}
\textbf{ok} "work.image" save-image\\
\textbf{ok} exit
\end{alltt}
The \texttt{factor.image} file you start with initially is just an image that was saved after the standard library, and nothing else, was loaded.
}
\section{The stack}
\chapkeywords{.s .~clear}
\index{\texttt{.s}}
\index{\texttt{.}}
\index{\texttt{clear}}
In order to be able to write more sophisticated programs, you will need to master usage of the \emph{stack}. Input parameters -- for example, numbers, or strings such as \texttt{"Hello "}, are pushed onto the top of the stack when typed. The most recently pushed value is said to be \emph{at the top} of the stack. When a word is executed, it takes
input parameters from the top of the
stack. Results are then pushed back on the stack. By convention, words remove input parameters from the stack.
Recall our \texttt{friend} definition from the previous section. In this definition, the \texttt{ask-name} word passes a piece of data to the \texttt{greet} word:
\begin{alltt}
: friend ask-name greet ;
\end{alltt}
The first thing done by \texttt{friend} is calling \texttt{ask-name}, which was defined as follows:
\begin{alltt}
: ask-name "What is your name? " write read ;
\end{alltt}
Read this definition from left to right, and visualize the data flow. First, the string \texttt{"What is your name?~"} is pushed on the stack. The \texttt{write} word is called; it removes the string from the stack and writes it, without returning any values. Next, the \texttt{read} word is called. It waits for a line of input from the user, then pushes the entered string on the stack.
After \texttt{ask-name}, the \texttt{friend} word calls \texttt{greet}, which was defined as follows:
\begin{alltt}
: greet "Greetings, " write print ;
\end{alltt}
This word pushes the string \texttt{"Greetings, "} and calls \texttt{write}, which writes this string. Next, \texttt{print} is called. Recall that the \texttt{read} call inside \texttt{ask-name} left the user's input on the stack; well, it is still there, and \texttt{print} prints it. In case you haven't already guessed, the difference between \texttt{write} and \texttt{print} is that the latter outputs a terminating new line.
How did we know that \texttt{write} and \texttt{print} take one value from the stack each, or that \texttt{read} leaves one value on the stack? The answer is, you don't always know, however, you can use \texttt{see} to look up the \emph{stack effect comment} of any library word:
\begin{alltt}
\textbf{ok} \ttbackslash print see
\textbf{IN: stdio
: print ( string -{}- )
"stdio" get fprint ;}
\end{alltt}
You can see that the stack effect of \texttt{print} is \texttt{( string -{}- )}. This is a mnemonic indicating that this word pops a string from the stack, and pushes no values back on the stack. As you can verify using \texttt{see}, the stack effect of \texttt{read} is \texttt{( -{}- string )}.
All words you write should have a stack effect. So our \texttt{friend} example should have been written as follows:
\begin{verbatim}
: ask-name ( -- name ) "What is your name? " write read ;
: greet ( name -- ) "Greetings, " write print ;
: friend ( -- ) ask-name greet ;
\end{verbatim}
The contents of the stack can be printed using the \texttt{.s} word:
\begin{alltt}
\textbf{ok} 1 2 3 .s
\textbf{1
2
3}
\end{alltt}
The \texttt{.} (full-stop) word removes the top stack element, and prints it. Unlike \texttt{print}, which will only print a string, \texttt{.} can print any object.
\begin{alltt}
\textbf{ok} "Hi" .
\textbf{"Hi"}
\end{alltt}
You might notice that \texttt{.} surrounds strings with quotes, while \texttt{print} prints the string without any kind of decoration. This is because \texttt{.} is a special word that outputs objects in a form \emph{that can be parsed back in}. This is a fundamental feature of Factor -- data is code, and code is data. We will learn some very deep consequences of this throughout this guide.
If the stack is empty, calling \texttt{.} will raise an error. This is in general true for any word called with insufficient parameters, or parameters of the wrong type.
The \texttt{clear} word removes all elements from the stack.
\sidebar{
In Factor, the stack takes the place of local variables found in other languages. Factor still supports variables, but they are usually used for different purposes. Like most other languages, Factor heap-allocates objects, and passes object references by value. However, we don't worry about this until mutable state is introduced.
}
\section*{Review}
Lets review the words we've seen until now, and their stack effects. The ``vocab'' column will be described later, and only comes into play when we start working with source files. Basically, Factor's words are partitioned into vocabularies rather than being in one flat list.
\wordtable{
\tabvocab{syntax}
\texttt{:~\emph{word}}&
\texttt{( -{}- )}&
Begin a word definion named \texttt{\emph{word}}.\\
\texttt{;}&
\texttt{( -{}- )}&
End a word definion.\\
\texttt{\ttbackslash\ \emph{word}}&
\texttt{( -{}- )}&
Push \texttt{\emph{word}} on the stack, instead of executing it.\\
\tabvocab{stdio}
\texttt{print}&
\texttt{( string -{}- )}&
Write a string to the console, with a new line.\\
\texttt{write}&
\texttt{( string -{}- )}&
Write a string to the console, without a new line.\\
\texttt{read}&
\texttt{( -{}- string )}&
Read a line of input from the console.\\
\tabvocab{prettyprint}
\texttt{.s}&
\texttt{( -{}- )}&
Print stack contents.\\
\texttt{.}&
\texttt{( value -{}- )}&
Print value at top of stack in parsable form.\\
\tabvocab{stack}
\texttt{clear}&
\texttt{( ...~-{}- )}&
Clear stack contents.\\
}
From now on, each section will end with a summary of words and stack effects described in that section.
\section{Arithmetic}
\chapkeywords{+ - {*} / neg pred succ}
\index{\texttt{+}}
\index{\texttt{-}}
\index{\texttt{*}}
\index{\texttt{/}}
\index{\texttt{neg}}
\index{\texttt{pred}}
\index{\texttt{succ}}
The usual arithmetic operators \texttt{+ - {*} /} all take two parameters
from the stack, and push one result back. Where the order of operands
matters (\texttt{-} and \texttt{/}), the operands are taken in the natural order. For example:
\begin{alltt}
\textbf{ok} 10 17 + .
\textbf{27}
\textbf{ok} 111 234 - .
\textbf{-123}
\textbf{ok} 333 3 / .
\textbf{111}
\end{alltt}
The \texttt{neg} word negates the number at the top of the stack (that is, multiplies it by -1).
This type of arithmetic is called \emph{postfix}, because the operator
follows the operands. Contrast this with \emph{infix} notation used
in many other languages, so-called because the operator is in-between
the two operands.
More complicated infix expressions can be translated into postfix
by translating the inner-most parts first. Grouping parentheses are
never necessary.
Here is the postfix translation of $(2 + 3) \times 6$:
\begin{alltt}
\textbf{ok} 2 3 + 6 {*}
\textbf{30}
\end{alltt}
Here is the postfix translation of $2 + (3 \times 6)$:
\begin{alltt}
\textbf{ok} 2 3 6 {*} +
\textbf{20}
\end{alltt}
As a simple example demonstrating postfix arithmetic, consider a word, presumably for an aircraft navigation system, that takes the flight time, the aircraft
velocity, and the tailwind velocity, and returns the distance travelled.
If the parameters are given on the stack in that order, all we do
is add the top two elements (aircraft velocity, tailwind velocity)
and multiply it by the element underneath (flight time). So the definition
looks like this:
\begin{alltt}
\textbf{ok} : distance ( time aircraft tailwind -{}- distance ) + {*} ;
\textbf{ok} 2 900 36 distance .
\textbf{1872}
\end{alltt}
Note that we are not using any distance or time units here. To extend this example to work with units, we make the assumption that for the purposes of computation, all distances are
in meters, and all time intervals are in seconds. Then, we can define words
for converting from kilometers to meters, and hours and minutes to
seconds:
\begin{alltt}
\textbf{ok} : kilometers 1000 {*} ;
\textbf{ok} : minutes 60 {*} ;
\textbf{ok} : hours 60 {*} 60 {*} ;
2 kilometers .
\emph{2000}
10 minutes .
\emph{600}
2 hours .
\emph{7200}
\end{alltt}
The implementation of \texttt{km/hour} is a bit more complex. We would like it to convert kilometers per hour to meters per second. To get the desired result, we first have to convert to kilometers per second, then divide this by the number of seconds in one hour.
\begin{alltt}
\textbf{ok} : km/hour kilometers 1 hours / ;
\textbf{ok} 2 hours 900 km/hour 36 km/hour distance .
\textbf{1872000}
\end{alltt}
\section*{Review}
\wordtable{
\tabvocab{math}
\texttt{+}&
\texttt{( x y -{}- z )}&
Add \texttt{x} and \texttt{y}.\\
\texttt{-}&
\texttt{( x y -{}- z )}&
Subtract \texttt{y} from \texttt{x}.\\
\texttt{*}&
\texttt{( x y -{}- z )}&
Multiply \texttt{x} and \texttt{y}.\\
\texttt{/}&
\texttt{( x y -{}- z )}&
Divide \texttt{x} by \texttt{y}.\\
\texttt{pred}&
\texttt{( x -{}- x-1 )}&
Push the predecessor of \texttt{x}.\\
\texttt{succ}&
\texttt{( x -{}- x+1 )}&
Push the successor of \texttt{x}.\\
\texttt{neg}&
\texttt{( x -{}- -1 )}&
Negate \texttt{x}.\\
}
\section{Shuffle words}
\chapkeywords{drop dup swap over nip dupd rot -rot tuck pick 2drop 2dup}
\index{\texttt{drop}}
\index{\texttt{dup}}
\index{\texttt{swap}}
\index{\texttt{over}}
\index{\texttt{nip}}
\index{\texttt{dupd}}
\index{\texttt{rot}}
\index{\texttt{-rot}}
\index{\texttt{tuck}}
\index{\texttt{pick}}
\index{\texttt{2drop}}
\index{\texttt{2dup}}
Lets try writing a word to compute the cube of a number.
Three numbers on the stack can be multiplied together using \texttt{{*}
{*}}:
\begin{alltt}
2 4 8 {*} {*} .
\emph{64}
\end{alltt}
However, the stack effect of \texttt{{*} {*}} is \texttt{( a b c -{}-
a{*}b{*}c )}. We would like to write a word that takes \emph{one} input
only, and multiplies it by itself three times. To achieve this, we need to make two copies of the top stack element, and then execute \texttt{{*} {*}}. As it happens, there is a word \texttt{dup ( x -{}-
x x )} for precisely this purpose. Now, we are able to define the
\texttt{cube} word:
\begin{alltt}
: cube dup dup {*} {*} ;
10 cube .
\emph{1000}
-2 cube .
\emph{-8}
\end{alltt}
The \texttt{dup} word is just one of the several so-called \emph{shuffle words}. Shuffle words are used to solve the problem of composing two words whose stack effects don't quite {}``match up''.
Lets take a look at the four most-frequently used shuffle words:
\wordtable{
\tabvocab{stack}
\texttt{drop}&
\texttt{( x -{}- )}&
Discard the top stack element. Used when
a word returns a value that is not needed.\\
\texttt{dup}&
\texttt{( x -{}- x x )}&
Duplicate the top stack element. Used
when a value is required as input for more than one word.\\
\texttt{swap}&
\texttt{( x y -{}- y x )}&
Swap top two stack elements. Used when
a word expects parameters in a different order.\\
\texttt{over}&
\texttt{( x y -{}- x y x )}&
Bring the second stack element {}``over''
the top element.\\}
The remaining shuffle words are not used nearly as often, but are nonetheless handy in certian situations:
\wordtable{
\tabvocab{stack}
\texttt{nip}&
\texttt{( x y -{}- y )}&
Remove the second stack element.\\
\texttt{dupd}&
\texttt{( x y -{}- x x y )}&
Duplicate the second stack element.\\
\texttt{rot}&
\texttt{( x y z -{}- y z x )}&
Rotate top three stack elements
to the left.\\
\texttt{-rot}&
\texttt{( x y z -{}- z x y )}&
Rotate top three stack elements
to the right.\\
\texttt{tuck}&
\texttt{( x y -{}- y x y )}&
``Tuck'' the top stack element under
the second stack element.\\
\texttt{pick}&
\texttt{( x y z -{}- x y z x )}&
``Pick'' the third stack element.\\
\texttt{2drop}&
\texttt{( x y -{}- )}&
Discard the top two stack elements.\\
\texttt{2dup}&
\texttt{( x y -{}- x y x y )}&
Duplicate the top two stack elements. A frequent use for this word is when two values have to be compared using something like \texttt{=} or \texttt{<} before being passed to another word.\\
}
You should try all these words out and become familiar with them. Push some numbers on the stack,
execute a shuffle word, and look at how the stack contents was changed using
\texttt{.s}. Compare the stack contents with the stack effects above.
Try to avoid the complex shuffle words such as \texttt{rot} and \texttt{2dup} as much as possible. They make data flow harder to understand. If you find yourself using too many shuffle words, or you're writing
a stack effect comment in the middle of a colon definition to keep track of stack contents, it is
a good sign that the word should probably be factored into two or
more smaller words.
A good rule of thumb is that each word should take at most a couple of sentences to describe; if it is so complex that you have to write more than that in a documentation comment, the word should be split up.
Effective factoring is like riding a bicycle -- once you ``get it'', it becomes second nature.
\section{Working with the stack}
\chapkeywords{sq sqrt}
\index{\texttt{sq}}
\index{\texttt{sqrt}}
In this section, we will work through the construction of a word for solving quadratic equations, that is, finding values of $x$ that satisfy $ax^2+bx+c=0$, where $a$, $b$ and $c$ are given to us. If you don't like math, take comfort in the fact this is the last mathematical example for a while!
First, note that \texttt{sq} multiplies the top of the stack by itself, and \texttt{sqrt} takes the square root of a number:
\begin{alltt}
\textbf{ok} 5 sq .
\textbf{25}
\textbf{ok} 2 sqrt .
\textbf{1.414213562373095}
\end{alltt}
The mathematical formula that gives a value of $x$ for known $a$, $b$ and $c$ might be familiar to you:
$$x=\frac{-b}{2a}\pm\sqrt{\frac{b^2-4ac}{4a^2}}$$
We will compute the left-hand side first. The word to compute it will be named \texttt{quadratic-e}, and it will take the values $a$ and $b$ on the stack:
\begin{verbatim}
: quadratic-e ( b a -- -b/2a )
2 * / neg ;
\end{verbatim}
Now, lets code the right hand side of the equation:
\begin{verbatim}
: quadratic-d ( a b c -- d )
pick 4 * * swap sq swap - swap sq 4 * / sqrt ;
\end{verbatim}
To understand how \texttt{quadratic-d} works, consider the stack after each step:
\begin{tabular}{|l|l|}
\hline
Word:&Stack after:\\
\hline
Initial:&$a$ $b$ $c$\\
\hline
\texttt{pick}&$a$ $b$ $c$ $a$\\
\hline
\texttt{4}&$a$ $b$ $c$ $a$ $4$\\
\hline
\texttt{*}&$a$ $b$ $c$ $4a$\\
\hline
\texttt{*}&$a$ $b$ $4ac$\\
\hline
\texttt{swap}&$a$ $4ac$ $b$\\
\hline
\texttt{sq}&$a$ $4ac$ $b^2$\\
\hline
\texttt{swap}&$a$ $b^2$ $4ac$\\
\hline
\texttt{-}&$a$ $b^2-4ac$\\
\hline
\texttt{swap}&$b^2-4ac$ $a$\\
\hline
\texttt{sq}&$b^2-4ac$ $a^2$\\
\hline
\texttt{4}&$b^2-4ac$ $a^2$ $4$\\
\hline
\texttt{*}&$b^2-4ac$ $4a^2$\\
\hline
\texttt{/}&$\frac{b^2-4ac}{4a^2}$\\
\hline
\texttt{sqrt}&$\sqrt{\frac{b^2-4ac}{4a^2}}$\\
\hline
\end{tabular}
Now, we need a word that takes the values computed by \texttt{quadratic-e} and \texttt{quadratic-d}, and returns two values, one being the sum, the other being the difference. This is the $\pm$ part of the formula:
\begin{verbatim}
: quadratic-roots ( e d -- alpha beta )
2dup + -rot - ;
\end{verbatim}
You should be able to work through the stack flow of the above word in your head. Test it with a few inputs.
Finally, we can put these three words together into a complete program:
\begin{verbatim}
: quadratic ( a b c -- alpha beta )
3dup quadratic-d
nip swap rot quadratic-e
swap quadratic-roots ;
\end{verbatim}
Again, lets look at the stack after each step of the execution of \texttt{quadratic}:
\begin{tabular}{|l|l|}
\hline
Word:&Stack after:\\
\hline
Initial:&$a$ $b$ $c$\\
\hline
\texttt{3dup}&$a$ $b$ $c$ $a$ $b$ $c$\\
\hline
\texttt{quadratic-d}&$a$ $b$ $c$ $\sqrt{\frac{b^2-4ac}{4a^2}}$\\
\hline
\texttt{nip}&$a$ $b$ $\sqrt{\frac{b^2-4ac}{4a^2}}$\\
\hline
\texttt{swap}&$a$ $\sqrt{\frac{b^2-4ac}{4a^2}}$ $b$ \\
\hline
\texttt{rot}&$\sqrt{\frac{b^2-4ac}{4a^2}}$ $b$ $a$ \\
\hline
\texttt{quadratic-e}&$\sqrt{\frac{b^2-4ac}{4a^2}}$ $\frac{-b}{2a}$\\
\hline
\texttt{quadratic-roots}&$\frac{-b}{2a}+\sqrt{\frac{b^2-4ac}{4a^2}}$ $\frac{-b}{2a}-\sqrt{\frac{b^2-4ac}{4a^2}}$\\
\hline
\end{tabular}
You can test \texttt{quadratic} with a handful of inputs:
\begin{alltt}
\textbf{ok} 1 2 1 quadratic . .
\textbf{-1.0}
\textbf{-1.0}
\textbf{ok} 1 -5 4 quadratic . .
\textbf{1.0}
\textbf{4.0}
\textbf{ok} 1 0 1 quadratic . .
\textbf{#\{ 0 -1.0 \}
#\{ 0 1.0 \}}
\end{alltt}
The last example shows that Factor can handle complex numbers perfectly well. We will have more to say about complex numbers later.
\section*{Review}
\wordtable{
\tabvocab{math}
\texttt{sq}&
\texttt{( x -{}- x*x )}&
Square of a number.\\
\texttt{sqrt}&
\texttt{( x -{}- sqrt[x] )}&
Square root of a number.\\
}
\section{Source files}
\chapkeywords{run-file apropos.~USE: IN:}
\index{\texttt{run-file}}
\index{\texttt{apropos.}}
\index{\texttt{IN:}}
\index{\texttt{USE:}}
Entering colon definitions at the listener is very convenient for quick testing, but for serious work you should save your work in source files with the \texttt{.factor} filename extension. Any text editor will do, but if you use jEdit\footnote{\texttt{http://www.jedit.org}}, you can take advantage of the powerful integration features found in the Factor plugin. Consult the plugin documentation for details.
Lets put our program for solving quadratic equations in a source file. Create a file named \texttt{quadratic.factor} in your favorite editor, and add the following content:
\begin{verbatim}
IN: quadratic
USE: math
USE: kernel
: quadratic-e ( b a -- -b/2a )
2 * / neg ;
: quadratic-d ( a b c -- d )
pick 4 * * swap sq swap - swap sq 4 * / sqrt ;
: quadratic-roots ( d e -- alpha beta )
2dup + -rot - ;
: quadratic ( a b c -- alpha beta )
3dup quadratic-d
nip swap rot quadratic-e
swap quadratic-roots ;
\end{verbatim}
Now, load the source file in the Factor interpreter using the \texttt{run-file} word:
\begin{alltt}
\textbf{ok} "quadratic.factor" run-file
\textbf{/home/slava/quadratic.factor:2: Not a number
2 * / neg ;
^
:s :r :n :c show stacks at time of error.
:get ( var -- value ) inspects the error namestack.}
\end{alltt}
Oops! What happened? It looks like it is not recognizing the \texttt{*} word, which works fine in the listener! The problem is that while most words in the library are available for use at the listener, source files must explicitly declare which \texttt{vocabularies} they make use of. To find out which vocabulary holds the \texttt{*} word, use \texttt{apropos.}:
\begin{alltt}
\textbf{ok} "*" apropos.
\emph{...}
\textbf{IN: math
*}
\emph{...}
\end{alltt}
The \texttt{apropos.}~word searches for words whose name contains a given string. As you can see, there are a number of words whose name contains \texttt{*}, but the one we are looking for is \texttt{*} itself, in the \texttt{math} vocabulary. To make use of the \texttt{math} vocabulary, simply add the following \texttt{vocabulary use declaration} at the beginning of the \texttt{quadratic.factor} source file:
\begin{verbatim}
USE: math
\end{verbatim}
Now, try loading the file again. This time, an error will be displayed because the \texttt{pick} word cannot be found. Use \texttt{apropos.} to confirm that \texttt{pick} is in the \texttt{stack} vocabulary, and add the appropriate declaration at the start of the source file. Then, the source file should load without any errors.
By default, words you define go in the \texttt{scratchpad} vocabulary. To change this, add a declaration to the start of the source file:
\texttt{IN: quadratic}
Now, to use the words defined within, you must issue the following command in the listener first:
\begin{alltt}
\textbf{ok} USE: quadratic
\end{alltt}
\sidebar{If you are using jEdit, you can use the \textbf{Plugins}>\textbf{Factor}>\textbf{Use word at caret} command to insert a \texttt{USE:} declaration for the word at the caret.}
\section*{Review}
\wordtable{
\tabvocab{parser}
\texttt{run-file}&
\texttt{( string -{}- )}&
Load a source file with the given name.\\
\tabvocab{syntax}
\texttt{USE: \emph{vocab}}&
\texttt{( -{}- )}&
Add a vocabulary to the search path.\\
\texttt{IN: \emph{vocab}}&
\texttt{( -{}- )}&
Set vocabulary for new word definitions.\\
\tabvocab{words}
\texttt{apropos.}&
\texttt{( string -{}- )}&
List all words whose name contains a given string, and the vocabularies they are found in.\\
}
\section{Exploring the library}
\chapkeywords{apropos.~see ~vocabs.~words.~httpd}
We already saw two ways to explore the Factor library in previous sections: the \texttt{see} word, which shows a word definition, and \texttt{apropos.}~ which helps locate a word and its vocabulary if we know part of its name.
Entering \texttt{vocabs.}~in the listener produces a list of all existing vocabularies:
\begin{alltt}
\textbf{ok} vocabs.
\textbf{[ "alien" "ansi" "combinators" "compiler" "continuations"
"errors" "file-responder" "files" "format" "hashtables"
"html" "httpd" "httpd-responder" "image" "inference" "init"
"inspect-responder" "inspector" "interpreter" "io-internals"
"jedit" "kernel" "listener" "lists" "logging" "logic" "math"
"namespaces" "parser" "presentation" "prettyprint"
"processes" "profiler" "quit-responder" "random" "real-math"
"resource-responder" "scratchpad" "sdl" "sdl-event"
"sdl-gfx" "sdl-keysym" "sdl-video" "stack" "stdio" "streams"
"strings" "syntax" "telnetd" "test" "test-responder"
"threads" "unparser" "url-encoding" "vectors" "words" ]
}
\end{alltt}
As you can see, there are a lot of vocabularies! Now, you can use \texttt{words.}~to list the words inside a given vocabulary:
\begin{alltt}
\textbf{ok} "lists" words.
\textbf{[ (cons-hashcode) (count) (each) (partition) (top) , 2car
2cdr 2cons 2list 2swons 3list =-or-contains? acons acons@
all=? all? append assoc assoc* assoc-apply assoc? car cdr
cons cons-hashcode cons= cons? cons@ contains? count each
last last* length list>vector list? make-list make-rlist map
maximize nth num-sort partition partition-add partition-step
prune remove remove-assoc remove@ reverse set-assoc sort
stack>list subset swons tail top tree-contains? uncons
uncons@ unique unique, unique@ unit unswons unzip
vector>list ]}
\end{alltt}
Any word definition can then be shown using \texttt{see}, but you might have to \texttt{USE:} the vocabulary first.
\begin{alltt}
\textbf{ok} USE: presentation
\textbf{ok} \ttbackslash set-style see
\textbf{IN: presentation
: set-style ( style name -- )
"styles" get set-hash ;}
\end{alltt}
A more sophisticated way to browse the library is using the integrated HTTP server. You can start the HTTP server using the following pair of commands:
\begin{alltt}
\textbf{ok} USE: httpd
\textbf{ok} 8888 httpd
\end{alltt}
Then, point your browser to the following URL, and start browsing:
\begin{quote}
\texttt{http://localhost:8888/responder/inspect/vocabularies}
\end{quote}
To stop the HTTP server, point your browser to
\begin{quote}
\texttt{http://localhost:8888/responder/quit}.
\end{quote}
You can even start the HTTP in a separate thread, using the following commands:
\begin{alltt}
\textbf{ok} USE: httpd
\textbf{ok} USE: threads
\textbf{ok} [ 8888 httpd ] in-thread
\end{alltt}
This way, you can browse code and play in the listener at the same time.
\section*{Review}
\wordtable{
\tabvocab{words}
\texttt{apropos.}&
\texttt{( string -{}- )}&
Search for words whose name contains a string.\\
\texttt{vocabs.}&
\texttt{( -{}- )}&
List all vocabularies.\\
\texttt{words.}&
\texttt{( string -{}- )}&
List all words in a given vocabulary.\\
\tabvocab{httpd}
\texttt{httpd}&
\texttt{( port -{}- )}&
Start an HTTP server on the given port.\\
}
\chapter{Working with data}
This chapter will introduce the fundamental data type used in Factor -- the list.
This will lead is to a discussion of debugging, conditional execution, recursion, and higher-order words, as well as a peek inside the interpreter, and an introduction to unit testing. If you have used another programming language, you will no doubt find using lists for all data limiting; other data types, such as vectors and hashtables, are introduced in later sections. However, a good understanding of lists is fundamental since they are used more than any other data type.
\section{Lists and cons cells}
\chapkeywords{cons car cdr uncons unit}
\index{\texttt{cons}}
\index{\texttt{car}}
\index{\texttt{cdr}}
\index{\texttt{uncons}}
\index{\texttt{unit}}
A \emph{cons cell} is an ordered pair of values. The first value is called the \emph{car},
the second is called the \emph{cdr}.
The \texttt{cons} word takes two values from the stack, and pushes a cons cell containing both values:
\begin{alltt}
\textbf{ok} "fish" "chips" cons .
\textbf{{[} "fish" | "chips" {]}}
\end{alltt}
Recall that \texttt{.}~prints objects in a form that can be parsed back in. This suggests that there is a literal syntax for cons cells. The difference between the literal syntax and calling \texttt{cons} is two-fold; \texttt{cons} can be used to make a cons cell whose components are computed values, and not literals, and \texttt{cons} allocates a new cons cell each time it is called, whereas a literal is allocated once.
The \texttt{car} and \texttt{cdr} words push the constituents of a cons cell at the top of the stack.
\begin{alltt}
\textbf{ok} 5 "blind mice" cons car .
\textbf{5}
\textbf{ok} "peanut butter" "jelly" cons cdr .
\textbf{"jelly"}
\end{alltt}
The \texttt{uncons} word pushes both the car and cdr of the cons cell at once:
\begin{alltt}
\textbf{ok} {[} "potatoes" | "gravy" {]} uncons .s
\textbf{\{ "potatoes" "gravy" \}}
\end{alltt}
If you think back for a second to the previous section on shuffle words, you might be able to guess how \texttt{uncons} is implemented in terms of \texttt{car} and \texttt{cdr}:
\begin{alltt}
: uncons ( {[} car | cdr {]} -- car cdr ) dup car swap cdr ;
\end{alltt}
The cons cell is duplicated, its car is taken, the car is swapped with the cons cell, putting the cons cell at the top of the stack, and finally, the cdr of the cons cell is taken.
Lists of values are represented with nested cons cells. The car is the first element of the list; the cdr is the rest of the list. The following example demonstrates this, along with the literal syntax used to print lists:
\begin{alltt}
\textbf{ok} {[} 1 2 3 4 {]} car .
\textbf{1}
\textbf{ok} {[} 1 2 3 4 {]} cdr .
\textbf{{[} 2 3 4 {]}}
\textbf{ok} {[} 1 2 3 4 {]} cdr cdr .
\textbf{{[} 3 4 {]}}
\end{alltt}
Note that taking the cdr of a list of three elements gives a list of two elements; taking the cdr of a list of two elements returns a list of one element. So what is the cdr of a list of one element? It is the special value \texttt{f}:
\begin{alltt}
\textbf{ok} {[} "snowpeas" {]} cdr .
\textbf{f}
\end{alltt}
The special value \texttt{f} represents the empty list. Hence you can see how cons cells and \texttt{f} allow lists of values to be constructed. If you are used to other languages, this might seem counter-intuitive at first, however the utility of this construction will come into play when we consider recursive words later in this chapter.
One thing worth mentioning is the concept of an \emph{improper list}. An improper list is one where the cdr of the last cons cell is not \texttt{f}. Improper lists are input with the following syntax:
\begin{verbatim}
[ 1 2 3 | 4 ]
\end{verbatim}
Improper lists are rarely used, but it helps to be aware of their existence. Lists that are not improper lists are sometimes called \emph{proper lists}.
Lets review the words we've seen in this section:
\wordtable{
\tabvocab{syntax}
\texttt{{[}}&
\texttt{( -{}- )}&
Begin a literal list.\\
\texttt{{]}}&
\texttt{( -{}- )}&
End a literal list.\\
\tabvocab{lists}
\texttt{cons}&
\texttt{( car cdr -{}- {[} car | cdr {]} )}&
Construct a cons cell.\\
\texttt{car}&
\texttt{( {[} car | cdr {]} -{}- car )}&
Push the first element of a list.\\
\texttt{cdr}&
\texttt{( {[} car | cdr {]} -{}- cdr )}&
Push the rest of the list without the first element.\\
\texttt{uncons}&
\texttt{( {[} car | cdr {]} -{}- car cdr )}&
Push both components of a cons cell.\\}
\section{Operations on lists}
\chapkeywords{unit append reverse contains?}
\index{\texttt{unit}}
\index{\texttt{append}}
\index{\texttt{reverse}}
\index{\texttt{contains?}}
The \texttt{lists} vocabulary defines a large number of words for working with lists. The most important ones will be covered in this section. Later sections will cover other words, usually with a discussion of their implementation.
The \texttt{unit} word makes a list of one element. It does this by consing the top of the stack with the empty list \texttt{f}:
\begin{alltt}
: unit ( a -- {[} a {]} ) f cons ;
\end{alltt}
The \texttt{append} word appends two lists at the top of the stack:
\begin{alltt}
\textbf{ok} {[} "bacon" "eggs" {]} {[} "pancakes" "syrup" {]} append .
\textbf{{[} "bacon" "eggs" "pancakes" "syrup" {]}}
\end{alltt}
Interestingly, only the first parameter has to be a list. if the second parameter
is an improper list, or just a list at all, \texttt{append} returns an improper list:
\begin{alltt}
\textbf{ok} {[} "molasses" "treacle" {]} "caramel" append .
\textbf{{[} "molasses" "treacle" | "caramel" {]}}
\end{alltt}
The \texttt{reverse} word creates a new list whose elements are in reverse order of the given list:
\begin{alltt}
\textbf{ok} {[} "steak" "chips" "gravy" {]} reverse .
\textbf{{[} "gravy" "chips" "steak" {]}}
\end{alltt}
The \texttt{contains?}~word checks if a list contains a given element. The element comes first on the stack, underneath the list:
\begin{alltt}
\textbf{ok} : desert? {[} "ice cream" "cake" "cocoa" {]} contains? ;
\textbf{ok} "stew" desert? .
\textbf{f}
\textbf{ok} "cake" desert? .
\textbf{{[} "cake" "cocoa" {]}}
\end{alltt}
What exactly is going on in the mouth-watering example above? The \texttt{contains?}~word returns \texttt{f} if the element does not occur in the list, and returns \emph{the remainder of the list} if the element occurs in the list. The significance of this will become apparent in the next section -- \texttt{f} represents boolean falsity in a conditional statement.
Note that we glossed the concept of object equality here -- you might wonder how \texttt{contains?}~decides if the element ``occurs'' in the list or not. It turns out it uses the \texttt{=} word, which checks if two objects have the same structure. Another way to check for equality is using \texttt{eq?}, which checks if two references refer to the same object. Both of these words will be covered in detail later.
Lets review the words we saw in this section:
\wordtable{
\tabvocab{lists}
\texttt{unit}&
\texttt{( a -{}- {[} a {]} )}&
Make a list of one element.\\
\texttt{append}&
\texttt{( list list -{}- list )}&
Append two lists.\\
\texttt{reverse}&
\texttt{( list -{}- list )}&
Reverse a list.\\
\texttt{contains?}&
\texttt{( value list -{}- ?~)}&
Determine if a list contains a value.\\}
\section{Conditional execution}
\chapkeywords{f t ifte when unless unique abs infer}
\index{\texttt{f}}
\index{\texttt{t}}
\index{\texttt{ifte}}
\index{\texttt{when}}
\index{\texttt{unless}}
\index{\texttt{unique?}}
\index{\texttt{abs}}
\index{\texttt{ifer}}
Until now, all code examples we've considered have been linear in nature; each word is executed in turn, left to right. To perform useful computations, we need the ability the execute different code depending on circumstances.
The simplest style of a conditional form in Factor is the following:
\begin{alltt}
\emph{condition} {[}
\emph{to execute if true}
{] [}
\emph{to execute if false}
{]} ifte
\end{alltt}
The \texttt{ifte} word is a \emph{combinator}, because it executes lists representing code, or \emph{quotations}, given on the stack.
The condition should be some piece of code that leaves a truth value on the stack. What is a truth value? In Factor, there is no special boolean data type
-- instead, the special value \texttt{f} we've already seen to represent empty lists also represents falsity. Every other object represents boolean truth. In cases where a truth value must be explicitly produced, the value \texttt{t} can be used. The \texttt{ifte} word removes the condition from the stack, and executes one of the two branches depending on the truth value of the condition.
A good first example to look at is the \texttt{unique} word. This word conses a value onto a list as long as the value does not already occur in the list. Otherwise, the original list is returned. You can guess that this word somehow involves \texttt{contains?}~and \texttt{cons}. Check out its implementation using \texttt{see} and your intuition will be confirmed:
\begin{alltt}
: unique ( elem list -- list )
2dup contains? [ nip ] [ cons ] ifte ;
\end{alltt}
The first the word does is duplicate the two values given as input, since \texttt{contains?}~consumes its inputs. If the value does occur in the list, \texttt{contains?}~returns the remainder of the list starting from the first occurrence; in other words, a truth value. This calls \texttt{nip}, which removes the value from the stack, leaving the original list at the top of the stack. On the other hand, if the value does not occur in the list, \texttt{contains?}~returns \texttt{f}, which causes the other branch of the conditional to execute. This branch calls \texttt{cons}, which adds the value at the beginning of the list.
Another frequently-used combinator \texttt{when}. This combinator is a variation of \texttt{ifte}, except only one quotation is given. If the condition is true, the quotation is executed. Nothing is done if the condition is false. In fact \texttt{when} is implemented in terms of \texttt{ifte}. Since \texttt{when} is called with a quotation on the stack, it suffices to push an empty list, and call \texttt{ifte} -- the given quotation is the true branch, and the empty quotation is the false branch:
\begin{verbatim}
: when ( ? quot -- )
[ ] ifte ;
\end{verbatim}
An example of a word that uses both \texttt{ifte} and \texttt{when} is the \texttt{abs} word, which computes the absolute value of a number:
\begin{verbatim}
: abs ( z -- abs )
dup complex? [
>rect mag2
] [
dup 0 < [ neg ] when
] ifte ;
\end{verbatim}
If the given number is a complex number, its distance from the origin is computed\footnote{Don't worry about the \texttt{>rect} and \texttt{mag2} words at this stage; they will be described later. If you are curious, use \texttt{see} to look at their definitions and read the documentation comments.}. Otherwise, if the parameter is a real number below zero, it is negated. If it is a real number greater than zero, it is not modified.
The dual of the \texttt{when} combinator is the \texttt{unless} combinator. It takes a quotation to execute if the condition is false; otherwise nothing is done. In both cases, the condition is popped off the stack.
The implementation is similar to that of \texttt{when}, but this time, we must swap the two quotations given to \texttt{ifte}, so that the true branch is an empty list, and the false branch is the user's quotation:
\begin{verbatim}
: unless ( ? quot -- )
[ ] swap ifte ;
\end{verbatim}
A very simple example of \texttt{unless} usage is the \texttt{assert} word:
\begin{verbatim}
: assert ( t -- )
[ "Assertion failed!" throw ] unless ;
\end{verbatim}
This word is used for unit testing -- it raises an error and stops execution if the top of the stack is false.
\subsection{Stack effects of conditionals}
It is good style to ensure that both branches of conditionals you write have the same stack effect. This makes words easier to debug. Since it is easy to make a stack flow mistake when working with conditionals, Factor includes a \emph{stack effect inference} tool. It can be used as follows:
\begin{alltt}
\textbf{ok} [ 2dup contains? [ nip ] [ cons ] ifte ] infer .
\textbf{[ 2 | 1 ]}
\end{alltt}
The output indicates that the code snippet\footnote{The proper term for a code snippet is a ``quotation''. You will learn about quotations later.} takes two values from the stack, and leaves one. Now lets see what happends if we forget about the \texttt{nip} and try to infer the stack effect:
\begin{alltt}
\textbf{ok} [ 2dup contains? [ ] [ cons ] ifte ] infer .
\textbf{ERROR: Unbalanced ifte branches
:s :r :n :c show stacks at time of error.}
\end{alltt}
Now lets look at the stack effect of the \texttt{abs} word. First, verify that each branch has the same stack effect:
\begin{alltt}
\textbf{ok} [ >rect mag2 ] infer .
\textbf{[ 1 | 1 ]}
\textbf{ok} [ dup 0 < [ neg ] when ] infer .
\textbf{[ 1 | 1 ]}
\end{alltt}
Since the branches are balanced, the stack effect of the entire conditional expression can be computed:
\begin{alltt}
\textbf{ok} [ abs ] infer .
\textbf{[ 1 | 1 ]}
\end{alltt}
\subsection*{Review}
\wordtable{
\tabvocab{syntax}
\texttt{f}&
\texttt{( -{}- f )}&
Empty list, and boolean falsity.\\
\texttt{t}&
\texttt{( -{}- f )}&
Canonical truth value.\\
\tabvocab{combinators}
\texttt{ifte}&
\texttt{( ?~true false -{}- )}&
Execute either \texttt{true} or \texttt{false} depending on the boolean value of the conditional.\\
\texttt{when}&
\texttt{( ?~quot -{}- )}&
Execute quotation if the condition is true, otherwise do nothing but pop the condition and quotation off the stack.\\
\texttt{unless}&
\texttt{( ?~quot -{}- )}&
Execute quotation if the condition is false, otherwise do nothing but pop the condition and quotation off the stack.\\
\tabvocab{lists}
\texttt{unique}&
\texttt{( elem list -{}- )}&
Prepend an element to a list if it does not occur in the
list.\\
\tabvocab{math}
\texttt{abs}&
\texttt{( z -- abs )}&
Compute the complex absolute value.\\
\tabvocab{test}
\texttt{assert}&
\texttt{( t -- )}&
Raise an error if the top of the stack is \texttt{f}.\\
\tabvocab{inference}
\texttt{infer}&
\texttt{( quot -{}- {[} in | out {]} )}&
Infer stack effect of code, if possible.\\}
\section{Recursion}
\chapkeywords{ifte when cons?~last last* list?}
\index{\texttt{ifte}}
\index{\texttt{when}}
\index{\texttt{cons?}}
\index{\texttt{last}}
\index{\texttt{last*}}
\index{\texttt{list?}}
The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional. The general form of a recursive word looks as follows:
\begin{alltt}
: recursive
\emph{condition} {[}
\emph{recursive case}
{] [}
\emph{base case}
{]} ifte ;
\end{alltt}
Use \texttt{see} to take a look at the \texttt{last} word; it pushes the last element of a list:
\begin{alltt}
: last ( list -- last )
last* car ;
\end{alltt}
As you can see, it makes a call to an auxilliary word \texttt{last*}, and takes the car of the return value. As you can guess by looking at the output of \texttt{see}, \texttt{last*} pushes the last cons cell in a list:
\begin{verbatim}
: last* ( list -- last )
dup cdr cons? [ cdr last* ] when ;
\end{verbatim}
So if the top of stack is a cons cell whose cdr is not a cons cell, the cons cell remains on the stack -- it gets duplicated, its cdr is taken, the \texttt{cons?} predicate tests if it is a cons cell, then \texttt{when} consumes the condition, and takes the empty ``false'' branch. This is the \emph{base case} -- the last cons cell of a one-element list is the list itself.
If the cdr of the list at the top of the stack is another cons cell, then something magical happends -- \texttt{last*} calls itself again, but this time, with the cdr of the list. The recursive call, in turn, checks of the end of the list has been reached; if not, it makes another recursive call, and so on.
Lets test the word:
\begin{alltt}
\textbf{ok} {[} "salad" "pizza" "pasta" "pancakes" {]} last* .
\textbf{{[} "pancakes" {]}}
\textbf{ok} {[} "salad" "pizza" "pasta" | "pancakes" {]} last* .
\textbf{{[} "pasta" | "pancakes" {]}}
\textbf{ok} {[} "nachos" "tacos" "burritos" {]} last .
\textbf{"burritos"}
\end{alltt}
A naive programmer might think that recursion is an infinite loop. Two things ensure that a recursion terminates: the existence of a base case, or in other words, a branch of a conditional that does not contain a recursive call; and an \emph{inductive step} in the recursive case, that reduces one of the arguments in some manner, thus ensuring that the computation proceeds one step closer to the base case.
An inductive step usually consists of taking the cdr of a list (the conditional could check for \texttt{f} or a cons cell whose cdr is not a list, as above), or decrementing a counter (the conditional could compare the counter with zero). Of course, many variations are possible; one could instead increment a counter, pass around its maximum value, and compare the current value with the maximum on each iteration.
Lets consider a more complicated example. Use \texttt{see} to bring up the definition of the \texttt{list?} word. This word checks that the value on the stack is a proper list, that is, it is either \texttt{f}, or a cons cell whose cdr is a proper list:
\begin{verbatim}
: list? ( list -- ? )
dup [
dup cons? [ cdr list? ] [ drop f ] ifte
] [
drop t
] ifte ;
\end{verbatim}
This example has two nested conditionals, and in effect, two base cases. The first base case arises when the value is \texttt{f}; in this case, it is dropped from the stack, and \texttt{t} is returned, since \texttt{f} is a proper list of no elements. The second base case arises when a value that is not a cons is given. This is clearly not a proper list.
The inductive step takes the cdr of the list. So, the birds-eye view of the word is that it takes the cdr of the list on the stack, until it either reaches \texttt{f}, in which case the list is a proper list, and \texttt{t} is returned; or until it reaches something else, in which case the list is an improper list, and \texttt{f} is returned.
Lets test the word:
\begin{alltt}
\textbf{ok} {[} "tofu" "beans" "rice" {]} list? .
\textbf{t}
\textbf{ok} {[} "sprouts" "carrots" | "lentils" {]} list? .
\textbf{f}
\textbf{ok} f list? .
\textbf{t}
\end{alltt}
\subsection{Stack effects of recursive words}
There are a few things worth noting about the stack flow inside a recursive word. The condition must take care to preserve any input parameters needed for the base case and recursive case. The base case must consume all inputs, and leave the final return values on the stack. The recursive case should somehow reduce one of the parameters. This could mean incrementing or decrementing an integer, taking the \texttt{cdr} of a list, and so on. Parameters must eventually reduce to a state where the condition returns \texttt{f}, to avoid an infinite recursion.
The recursive case should also be coded such that the stack effect of the total definition is the same regardless of how many iterations are preformed; words that consume or produce different numbers of paramters depending on circumstances are very hard to debug.
Lets review the words we saw in this chapter:
\wordtable{
\tabvocab{combinators}
\texttt{when}&
\texttt{( ?~quot -{}- )}&
Execute quotation if the condition is true, otherwise do nothing but pop the condition and quotation off the stack.\\
\tabvocab{lists}
\texttt{cons?}&
\texttt{( value -{}- ?~)}&
Test if a value is a cons cell.\\
\texttt{last}&
\texttt{( list -{}- value )}&
Last element of a list.\\
\texttt{last*}&
\texttt{( list -{}- cons )}&
Last cons cell of a list.\\
\texttt{list?}&
\texttt{( value -{}- ?~)}&
Test if a value is a proper list.\\
}
\section{Debugging}
\section{The interpreter}
\chapkeywords{acons >r r>}
\index{\texttt{acons}}
\index{\texttt{>r}}
\index{\texttt{r>}}
So far, we have seen what we called ``the stack'' store intermediate values between computations. In fact Factor maintains a number of other stacks, and the formal name for the stack we've been dealing with so far is the \emph{data stack}.
Another fundamental stack is the \emph{return stack}. It is used to save interpreter state for nested calls.
You already know that code quotations are just lists. At a low level, each colon definition is also just a quotation. The interpreter consists of a loop that iterates a quotation, pushing each literal, and executing each word. If the word is a colon definition, the interpreter saves its state on the return stack, executes the definition of that word, then restores the execution state from the return stack and continues.
The return stack also serves a dual purpose as a temporary storage area. Sometimes, juggling values on the data stack becomes ackward, and in that case \texttt{>r} and \texttt{r>} can be used to move a value from the data stack to the return stack, and vice versa, respectively.
The words \texttt{>r} and \texttt{r>} ``hide'' the top of the stack between their occurrences. Try the following in the listener:
\begin{alltt}
\textbf{ok} 1 2 3 .s
\textbf{3
2
1}
\textbf{ok} >r .s r>
\textbf{2
1}
\textbf{ok} 1 2 3 .s
\textbf{3
2
1}
\end{alltt}
A simple example can be found in the definition of the \texttt{acons} word:
\begin{alltt}
\textbf{ok} \ttbackslash acons see
\textbf{: acons ( value key alist -- alist )
>r swons r> cons ;}
\end{alltt}
When the word is called, \texttt{swons} is applied to \texttt{value} and \texttt{key} creating a cons cell whose car is \texttt{key} and whose cdr is \texttt{value}, then \texttt{cons} is applied to this new cons cell, and \texttt{alist}. So this word adds the \texttt{key}/\texttt{value} pair to the beginning of the \texttt{alist}.
Note that usages of \texttt{>r} and \texttt{r>} must be balanced within a single quotation or word definition. The following examples illustrate the point:
\begin{verbatim}
: the-good >r 2 + r> * ;
: the-bad >r 2 + ;
: the-ugly r> ;
\end{verbatim}
Basically, the rule is you must leave the return stack in the same state as you found it so that when the current quotation finishes executing, the interpreter can return to the calling word.
One exception is that when \texttt{ifte} occurs as the last word in a definition, values may be pushed on the return stack before the condition value is computed, as long as both branches of the \texttt{ifte} pop the values off the return stack before returning.
Lets review the words we saw in this chapter:
\wordtable{
\tabvocab{lists}
\texttt{acons}&
\texttt{( value key alist -{}- alist )}&
Add a key/value pair to the association list.\\
\tabvocab{stack}
\texttt{>r}&
\texttt{( obj -{}- r:obj )}&
Move value to return stack..\\
\texttt{r>}&
\texttt{( r:obj -{}- obj )}&
Move value from return stack..\\
}
\section{Association lists}
An \emph{association list} is a list where every element is a cons. The
car of each cons is a name, the cdr is a value. The literal notation
is suggestive:
\begin{alltt}
{[}
{[} "Jill" | "CEO" {]}
{[} "Jeff" | "manager" {]}
{[} "James" | "lowly web designer" {]}
{]}
\end{alltt}
\texttt{assoc? ( obj -{}- ?~)} returns \texttt{t} if the object is
a list whose every element is a cons; otherwise it returns \texttt{f}.
\texttt{assoc ( key alist -{}- value )} looks for a pair with this
key in the list, and pushes the cdr of the pair. Pushes f if no pair
with this key is present. Note that \texttt{assoc} cannot differentiate between
a key that is not present at all, or a key with a value of \texttt{f}.
\texttt{assoc{*} ( key alist -{}- {[} key | value {]} )} looks for
a pair with this key, and pushes the pair itself. Unlike \texttt{assoc},
\texttt{assoc{*}} returns different values in the cases of a value
set to \texttt{f}, or an undefined value.
\texttt{set-assoc ( value key alist -{}- alist )} removes any existing
occurrence of a key from the list, and adds a new pair. This creates
a new list, the original is unaffected.
\texttt{acons ( value key alist -{}- alist )} is slightly faster
than \texttt{set-assoc} since it simply conses a new pair onto the
list. However, if used repeatedly, the list will grow to contain a
lot of {}``shadowed'' pairs.
The following pair of word definitions from the \texttt{html} vocabulary demonstrates the usage of association lists. It implements a mapping of special characters to their HTML entity names. Note the usage of \texttt{?}~to return the original character if the association lookup yields \texttt{f}:
\begin{alltt}
: html-entities ( -- alist )
{[}
{[} CHAR: < | "\&lt;" {]}
{[} CHAR: > | "\&gt;" {]}
{[} CHAR: \& | "\&amp;" {]}
{[} CHAR: {'} | "\&apos;" {]}
{[} CHAR: {"} | "\&quot;" {]}
{]} ;
: char>entity ( ch -- str )
dup >r html-entities assoc dup r> ? ;
\end{alltt}
Searching association lists incurs a linear time cost, so they should
only be used for small mappings -- a typical use is a mapping of half
a dozen entries or so, specified literally in source. Hashtables offer
better performance with larger mappings.
\section{Recursive combinators}
\chapter{Practical: a numbers game}
In this section, basic input/output and flow control is introduced.
We construct a program that repeatedly prompts the user to guess a
number -- they are informed if their guess is correct, too low, or
too high. The game ends on a correct guess.
\begin{alltt}
numbers-game
\emph{I'm thinking of a number between 0 and 100.}
\emph{Enter your guess:} 25
\emph{Too low}
\emph{Enter your guess:} 38
\emph{Too high}
\emph{Enter your guess:} 31
\emph{Correct - you win!}
\end{alltt}
\section{Getting started}
Start a text editor and create a file named \texttt{numbers-game.factor}.
Write a short comment at the top of the file. Two examples of commenting style supported by Factor:
\begin{alltt}
! Numbers game.
( The great numbers game )
\end{alltt}
It is always a good idea to comment your code. Try to write simple
code that does not need detailed comments to describe; similarly,
avoid redundant comments. These two principles are hard to quantify
in a concrete way, and will become more clear as your skills with
Factor increase.
We will be defining new words in the \texttt{numbers-game} vocabulary; add
an \texttt{IN:} statement at the top of the source file:
\begin{alltt}
IN: numbers-game
\end{alltt}
Also in order to be able to test the words, issue a \texttt{USE:}
statement in the interactive interpreter:
\begin{alltt}
USE: numbers-game
\end{alltt}
This section will develop the numbers game in an incremental fashion.
After each addition, issue a command like the following to load the
source file into the Factor interpreter:
\begin{alltt}
"numbers-game.factor" run-file
\end{alltt}
\section{Reading a number from the keyboard}
A fundamental operation required for the numbers game is to be able
to read a number from the keyboard. The \texttt{read} word \texttt{(
-{}- str )} reads a line of input and pushes it on the stack.
The \texttt{parse-number} word \texttt{( str -{}- n )} turns a decimal
string representation of an integer into the integer itself. These
two words can be combined into a single colon definition:
\begin{alltt}
: read-number ( -{}- n ) read parse-number ;
\end{alltt}
You should add this definition to the source file, and try loading
the file into the interpreter. As you will soon see, this raises an
error! The problem is that the two words \texttt{read} and \texttt{parse-number}
are not part of the default, minimal, vocabulary search path used
when reading files. The solution is to use \texttt{apropos.} to find
out which vocabularies contain those words, and add the appropriate
\texttt{USE:} statements to the source file:
\begin{alltt}
USE: parser
USE: stdio
\end{alltt}
After adding the above two statements, the file should now parse,
and testing should confirm that the \texttt{read-number} word works correctly.%
\footnote{There is the possibility of an invalid number being entered at the
keyboard. In this case, \texttt{parse-number} returns \texttt{f},
the boolean false value. For the sake of simplicity, we ignore this
case in the numbers game example. However, proper error handling is
an essential part of any large program and is covered later.%
}
\section{Printing some messages}
Now we need to make some words for printing various messages. They
are given here without further ado:
\begin{alltt}
: guess-banner
"I'm thinking of a number between 0 and 100." print ;
: guess-prompt "Enter your guess: " write ;
: too-high "Too high" print ;
: too-low "Too low" print ;
: correct "Correct - you win!" print ;
\end{alltt}
Note that in the above, stack effect comments are omitted, since they
are obvious from context. You should ensure the words work correctly
after loading the source file into the interpreter.
\section{Taking action based on a guess}
The next logical step is to write a word \texttt{judge-guess} that
takes the user's guess along with the actual number to be guessed,
and prints one of the messages \texttt{too-high}, \texttt{too-low},
or \texttt{correct}. This word will also push a boolean flag, indicating
if the game should continue or not -- in the case of a correct guess,
the game does not continue.
This description of judge-guess is a mouthful -- and it suggests that
it may be best to split it into two words. The first word we write
handles the more specific case of an \emph{inexact} guess -- so it
prints either \texttt{too-low} or \texttt{too-high}.
\begin{alltt}
: inexact-guess ( actual guess -{}- )
< {[} too-high {]} {[} too-low {]} ifte ;
\end{alltt}
Note that the word gives incorrect output if the two parameters are
equal. However, it will never be called this way.
With this out of the way, the implementation of judge-guess is an
easy task to tackle. Using the words \texttt{inexact-guess}, \texttt{2dup}, \texttt{2drop} and \texttt{=}, we can write:
\begin{alltt}
: judge-guess ( actual guess -{}- ? )
2dup = {[}
2drop correct f
{]} {[}
inexact-guess t
{]} ifte ;
\end{alltt}
The word \texttt{=} is found in the \texttt{kernel} vocabulary, and the words \texttt{2dup} and \texttt{2drop} are found in the \texttt{stack} vocabulary. Since \texttt{=}
consumes both its inputs, we must first duplicate the \texttt{actual} and \texttt{guess} parameters using \texttt{2dup}. The word \texttt{correct} does not need to do anything with these two numbers, so they are popped off the stack using \texttt{2drop}. Try evaluating the following
in the interpreter to see what's going on:
\begin{alltt}
clear 1 2 2dup = .s
\emph{\{ 1 2 f \}}
clear 4 4 2dup = .s
\emph{\{ 4 4 t \}}
\end{alltt}
Test \texttt{judge-guess} with a few inputs:
\begin{alltt}
1 10 judge-guess .
\emph{Too low}
\emph{t}
89 43 judge-guess .
\emph{Too high}
\emph{t}
64 64 judge-guess .
\emph{Correct}
\emph{f}
\end{alltt}
\section{Generating random numbers}
The \texttt{random-int} word \texttt{( min max -{}- n )} pushes a
random number in a specified range. The range is inclusive, so both
the minimum and maximum indexes are candidate random numbers. Use
\texttt{apropos.} to determine that this word is in the \texttt{random}
vocabulary. For the purposes of this game, random numbers will be
in the range of 0 to 100, so we can define a word that generates a
random number in the range of 0 to 100:
\begin{alltt}
: number-to-guess ( -{}- n ) 0 100 random-int ;
\end{alltt}
Add the word definition to the source file, along with the appropriate
\texttt{USE:} statement. Load the source file in the interpreter,
and confirm that the word functions correctly, and that its stack
effect comment is accurate.
\section{The game loop}
The game loop consists of repeated calls to \texttt{guess-prompt},
\texttt{read-number} and \texttt{judge-guess}. If \texttt{judge-guess}
returns \texttt{f}, the loop stops, otherwise it continues. This is
realized with a recursive implementation:
\begin{alltt}
: numbers-game-loop ( actual -{}- )
dup guess-prompt read-number judge-guess {[}
numbers-game-loop
{]} {[}
drop
{]} ifte ;
\end{alltt}
In Factor, tail-recursive words consume a bounded amount of call stack
space. This means you are free to pick recursion or iteration based
on their own merits when solving a problem. In many other languages,
the usefulness of recursion is severely limited by the lack of tail-recursive
call optimization.
\section{Finishing off}
The last task is to combine everything into the main \texttt{numbers-game}
word. This is easier than it seems:
\begin{alltt}
: numbers-game number-to-guess numbers-game-loop ;
\end{alltt}
Try it out! Simply invoke the \texttt{numbers-game} word in the interpreter.
It should work flawlessly, assuming you tested each component of this
design incrementally!
\section{The complete program}
\begin{verbatim}
! Numbers game example
IN: numbers-game
USE: kernel
USE: math
USE: parser
USE: random
USE: stdio
USE: stack
: read-number ( -- n ) read parse-number ;
: guess-banner
"I'm thinking of a number between 0 and 100." print ;
: guess-prompt "Enter your guess: " write ;
: too-high "Too high" print ;
: too-low "Too low" print ;
: correct "Correct - you win!" print ;
: inexact-guess ( actual guess -- )
< [ too-high ] [ too-low ] ifte ;
: judge-guess ( actual guess -- ? )
2dup = [
2drop correct f
] [
inexact-guess t
] ifte ;
: number-to-guess ( -- n ) 0 100 random-int ;
: numbers-game-loop ( actual -- )
dup guess-prompt read-number judge-guess [
numbers-game-loop
] [
drop
] ifte ;
: numbers-game number-to-guess numbers-game-loop ;
\end{verbatim}
\chapter{All about numbers}
A brief introduction to arithmetic in Factor was given in the first chapter. Most of the time, the simple features outlined there suffice, and if math is not your thing, you can skim (or skip!) this chapter. For the true mathematician, Factor's numerical capability goes far beyond simple arithmetic.
Factor's numbers more closely model the mathematical concept of a number than other languages. Where possible, exact answers are given -- for example, adding or multiplying two integers never results in overflow, and dividing two integers yields a fraction rather than a truncated result. Complex numbers are supported, allowing many functions to be computed with parameters that would raise errors or return ``not a number'' in other languages.
\section{Integers}
\chapkeywords{integer?~BIN: OCT: HEX: .b .o .h}
The simplest type of number is the integer. Integers come in two varieties -- \emph{fixnums} and \emph{bignums}. As their names suggest, a fixnum is a fixed-width quantity\footnote{Fixnums range in size from $-2^{w-3}-1$ to $2^{w-3}$, where $w$ is the word size of your processor (for example, 32 bits). Because fixnums automatically grow to bignums, usually you do not have to worry about details like this.}, and is a bit quicker to manipulate than an arbitrary-precision bignum.
The predicate word \texttt{integer?}~tests if the top of the stack is an integer. If this returns true, then exactly one of \texttt{fixnum?}~or \texttt{bignum?}~would return true for that object. Usually, your code does not have to worry if it is dealing with fixnums or bignums.
Unlike some languages where the programmer has to declare storage size explicitly and worry about overflow, integer operations automatically return bignums if the result would be too big to fit in a fixnum. Here is an example where multiplying two fixnums returns a bignum:
\begin{alltt}
\textbf{ok} 134217728 fixnum? .
\textbf{t}
\textbf{ok} 128 fixnum? .
\textbf{t}
\textbf{ok} 134217728 128 * .
\textbf{17179869184}
\textbf{ok} 134217728 128 * bignum? .
\textbf{t}
\end{alltt}
Integers can be entered using a different base. By default, all number entry is in base 10, however this can be changed by prefixing integer literals with one of the parsing words \texttt{BIN:}, \texttt{OCT:}, or \texttt{HEX:}. For example:
\begin{alltt}
\textbf{ok} BIN: 1110 BIN: 1 + .
\textbf{15}
\textbf{ok} HEX: deadbeef 2 * .
\textbf{7471857118}
\end{alltt}
The word \texttt{.} prints numbers in decimal, regardless of how they were input. A set of words in the \texttt{prettyprint} vocabulary is provided for print integers using another base.
\begin{alltt}
\textbf{ok} 1234 .h
\textbf{4d2}
\textbf{ok} 1234 .o
\textbf{2232}
\textbf{ok} 1234 .b
\textbf{10011010010}
\end{alltt}
\section{Rational numbers}
\chapkeywords{rational?~numerator denominator}
If we add, subtract or multiply any two integers, the result is always an integer. However, this is not the case with division. When dividing a numerator by a denominator where the numerator is not a integer multiple of the denominator, a ratio is returned instead.
\begin{alltt}
1210 11 / .
\emph{110}
100 330 / .
\emph{10/33}
\end{alltt}
Ratios are printed and can be input literally in the form of the second example. Ratios are always reduced to lowest terms by factoring out the greatest common divisor of the numerator and denominator. A ratio with a denominator of 1 becomes an integer. Trying to create a ratio with a denominator of 0 raises an error.
The predicate word \texttt{ratio?}~tests if the top of the stack is a ratio. The predicate word \texttt{rational?}~returns true if and only if one of \texttt{integer?}~or \texttt{ratio?}~would return true for that object. So in Factor terms, a ``ratio'' is a rational number whose denominator is not equal to 1.
Ratios behave just like any other number -- all numerical operations work as expected, and in fact they use the formulas for adding, subtracting and multiplying fractions that you learned in high school.
\begin{alltt}
\textbf{ok} 1/2 1/3 + .
\textbf{5/6}
\textbf{ok} 100 6 / 3 * .
\textbf{50}
\end{alltt}
Ratios can be deconstructed into their numerator and denominator components using the \texttt{numerator} and \texttt{denominator} words. The numerator and denominator are both integers, and furthermore the denominator is always positive. When applied to integers, the numerator is the integer itself, and the denominator is 1.
\begin{alltt}
\textbf{ok} 75/33 numerator .
\textbf{25}
\textbf{ok} 75/33 denominator .
\textbf{11}
\textbf{ok} 12 numerator .
\textbf{12}
\end{alltt}
\section{Floating point numbers}
\chapkeywords{float?~>float /f}
Rational numbers represent \emph{exact} quantities. On the other hand, a floating point number is an \emph{approximation}. While rationals can grow to any required precision, floating point numbers are fixed-width, and manipulating them is usually faster than manipulating ratios or bignums (but slower than manipulating fixnums). Floating point literals are often used to represent irrational numbers, which have no exact representation as a ratio of two integers. Floating point literals are input with a decimal point.
\begin{alltt}
\textbf{ok} 1.23 1.5 + .
\textbf{1.73}
\end{alltt}
The predicate word \texttt{float?}~tests if the top of the stack is a floating point number. The predicate word \texttt{real?}~returns true if and only if one of \texttt{rational?}~or \texttt{float?}~would return true for that object.
Floating point numbers are \emph{contagious} -- introducing a floating point number in a computation ensures the result is also floating point.
\begin{alltt}
\textbf{ok} 5/4 1/2 + .
\textbf{7/4}
\textbf{ok} 5/4 0.5 + .
\textbf{1.75}
\end{alltt}
Apart from contaigion, there are two ways of obtaining a floating point result from a computation; the word \texttt{>float ( n -{}- f )} converts a rational number into its floating point approximation, and the word \texttt{/f ( x y -{}- x/y )} returns the floating point approximation of a quotient of two numbers.
\begin{alltt}
\textbf{ok} 7 4 / >float .
\textbf{1.75}
\textbf{ok} 7 4 /f .
\textbf{1.75}
\end{alltt}
Indeed, the word \texttt{/f} could be defined as follows:
\begin{alltt}
: /f / >float ;
\end{alltt}
However, the actual definition is slightly more efficient, since it computes the floating point result directly.
\section{Complex numbers}
\chapkeywords{i -i \#\{ complex?~real imaginary >rect rect> arg abs >polar polar>}
Complex numbers arise as solutions to quadratic equations whose graph does not intersect the x axis. For example, the equation $x^2 + 1 = 0$ has no solution for real $x$, because there is no real number that is a square root of -1. However, in the field of complex numbers, this equation has a well-known solution:
\begin{alltt}
\textbf{ok} -1 sqrt .
\textbf{\#\{ 0 1 \}}
\end{alltt}
The literal syntax for a complex number is \texttt{\#\{ re im \}}, where \texttt{re} is the real part and \texttt{im} is the imaginary part. For example, the literal \texttt{\#\{ 1/2 1/3 \}} corresponds to the complex number $1/2 + 1/3i$.
The words \texttt{i} an \texttt{-i} push the literals \texttt{\#\{ 0 1 \}} and \texttt{\#\{ 0 -1 \}}, respectively.
The predicate word \texttt{complex?} tests if the top of the stack is a complex number. Note that unlike math, where all real numbers are also complex numbers, Factor only considers a number to be a complex number if its imaginary part is non-zero.
Complex numbers can be deconstructed into their real and imaginary components using the \texttt{real} and \texttt{imaginary} words. Both components can be pushed at once using the word \texttt{>rect ( z -{}- re im )}.
\begin{alltt}
\textbf{ok} -1 sqrt real .
\textbf{0}
\textbf{ok} -1 sqrt imaginary .
\textbf{1}
\textbf{ok} -1 sqrt sqrt >rect .s
\textbf{\{ 0.7071067811865476 0.7071067811865475 \}}
\end{alltt}
A complex number can be constructed from a real and imaginary component on the stack using the word \texttt{rect> ( re im -{}- z )}.
\begin{alltt}
\textbf{ok} 1/3 5 rect> .
\textbf{\#\{ 1/3 5 \}}
\end{alltt}
Complex numbers are stored in \emph{rectangular form} as a real/imaginary component pair (this is where the names \texttt{>rect} and \texttt{rect>} come from). An alternative complex number representation is \emph{polar form}, consisting of an absolute value and argument. The absolute value and argument can be computed using the words \texttt{abs} and \texttt{arg}, and both can be pushed at once using \texttt{>polar ( z -{}- abs arg )}.
\begin{alltt}
\textbf{ok} 5.3 abs .
\textbf{5.3}
\textbf{ok} i arg .
\textbf{1.570796326794897}
\textbf{ok} \#\{ 4 5 \} >polar .s
\textbf{\{ 6.403124237432849 0.8960553845713439 \}}
\end{alltt}
A new complex number can be created from an absolute value and argument using \texttt{polar> ( abs arg -{}- z )}.
\begin{alltt}
\textbf{ok} 1 pi polar> .
\textbf{\#\{ -1.0 1.224606353822377e-16 \}}
\end{alltt}
\section{Transcedential functions}
\chapkeywords{\^{} exp log sin cos tan asin acos atan sec cosec cot asec acosec acot sinh cosh tanh asinh acosh atanh sech cosech coth asech acosech acoth}
The \texttt{math} vocabulary provides a rich library of mathematical functions that covers exponentiation, logarithms, trigonometry, and hyperbolic functions. All functions accept and return complex number arguments where appropriate. These functions all return floating point values, or complex numbers whose real and imaginary components are floating point values.
\texttt{\^{} ( x y -- x\^{}y )} raises \texttt{x} to the power of \texttt{y}. In the cases of \texttt{y} being equal to $1/2$, -1, or 2, respectively, the words \texttt{sqrt}, \texttt{recip} and \texttt{sq} can be used instead.
\begin{alltt}
\textbf{ok} 2 4 \^ .
\textbf{16.0}
\textbf{ok} i i \^ .
\textbf{0.2078795763507619}
\end{alltt}
All remaining functions have a stack effect \texttt{( x -{}- y )}, it won't be repeated for brevity.
\texttt{exp} raises the number $e$ to a specified power. The number $e$ can be pushed on the stack with the \texttt{e} word, so \texttt{exp} could have been defined as follows:
\begin{alltt}
: exp ( x -- e^x ) e swap \^ ;
\end{alltt}
However, it is actually defined otherwise, for efficiency.\footnote{In fact, the word \texttt{\^{}} is actually defined in terms of \texttt{exp}, to correctly handle complex number arguments.}
\texttt{log} computes the natural (base $e$) logarithm. This is the inverse of the \texttt{exp} function.
\begin{alltt}
\textbf{ok} -1 log .
\textbf{\#\{ 0.0 3.141592653589793 \}}
\textbf{ok} e log .
\textbf{1.0}
\end{alltt}
\texttt{sin}, \texttt{cos} and \texttt{tan} are the familiar trigonometric functions, and \texttt{asin}, \texttt{acos} and \texttt{atan} are their inverses.
The reciprocals of the sine, cosine and tangent are defined as \texttt{sec}, \texttt{cosec} and \texttt{cot}, respectively. Their inverses are \texttt{asec}, \texttt{acosec} and \texttt{acot}.
\texttt{sinh}, \texttt{cosh} and \texttt{tanh} are the hyperbolic functions, and \texttt{asinh}, \texttt{acosh} and \texttt{atanh} are their inverses.
Similarly, the reciprocals of the hyperbolic functions are defined as \texttt{sech}, \texttt{cosech} and \texttt{coth}, respectively. Their inverses are \texttt{asech}, \texttt{acosech} and \texttt{acoth}.
\section{Modular arithmetic}
\chapkeywords{/i mod /mod gcd}
In addition to the standard division operator \texttt{/}, there are a few related functions that are useful when working with integers.
\texttt{/i ( x y -{}- x/y )} performs a truncating integer division. It could have been defined as follows:
\begin{alltt}
: /i / >integer ;
\end{alltt}
However, the actual definition is a bit more efficient than that.
\texttt{mod ( x y -{}- x\%y )} computes the remainder of dividing \texttt{x} by \texttt{y}. If the result is 0, then \texttt{x} is a multiple of \texttt{y}.
\texttt{/mod ( x y -{}- x/y x\%y )} pushes both the quotient and remainder.
\begin{alltt}
\textbf{ok} 100 3 mod .
\textbf{1}
\textbf{ok} -546 34 mod .
\textbf{-2}
\end{alltt}
\texttt{gcd ( x y -{}- z )} pushes the greatest common divisor of two integers; that is, the largest number that both integers could be divided by and still yield integers as results. This word is used behind the scenes to reduce rational numbers to lowest terms when doing ratio arithmetic.
\section{Bitwise operations}
\chapkeywords{bitand bitor bitxor bitnot shift}
There are two ways of looking at an integer -- as a mathematical entity, or as a string of bits. The latter representation faciliates the so-called \emph{bitwise operations}.
\texttt{bitand ( x y -{}- x\&y )} returns a new integer where each bit is set if and only if the corresponding bit is set in both $x$ and $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-and with a mask switches off all flags that are not explicitly set in the mask.
\begin{alltt}
BIN: 101 BIN: 10 bitand .b
\emph{0}
BIN: 110 BIN: 10 bitand .b
\emph{10}
\end{alltt}
\texttt{bitor ( x y -{}- x|y )} returns a new integer where each bit is set if and only if the corresponding bit is set in at least one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-or with a mask switches on all flags that are set in the mask.
\begin{alltt}
BIN: 101 BIN: 10 bitor .b
\emph{111}
BIN: 110 BIN: 10 bitor .b
\emph{110}
\end{alltt}
\texttt{bitxor ( x y -{}- x\^{}y )} returns a new integer where each bit is set if and only if the corresponding bit is set in exactly one of $x$ or $y$. If you're considering an integer as a sequence of bit flags, taking the bitwise-xor with a mask toggles on all flags that are set in the mask.
\begin{alltt}
BIN: 101 BIN: 10 bitxor .b
\emph{111}
BIN: 110 BIN: 10 bitxor .b
\emph{100}
\end{alltt}
\texttt{bitnot ( x -{}- y )} returns the bitwise complement of the input; that is, each bit in the input number is flipped. This is actually equivalent to negating a number, and subtracting one. So indeed, \texttt{bitnot} could have been defined as thus:
\begin{alltt}
: bitnot neg pred ;
\end{alltt}
\texttt{shift ( x n -{}- y )} returns a new integer consisting of the bits of the first integer, shifted to the left by $n$ positions. If $n$ is negative, the bits are shifted to the right instead, and bits that ``fall off'' are discarded.
\begin{alltt}
BIN: 101 5 shift .b
\emph{10100000}
BIN: 11111 -2 shift .b
\emph{111}
\end{alltt}
The attentive reader will notice that shifting to the left is equivalent to multiplying by a power of two, and shifting to the right is equivalent to performing a truncating division by a power of two.
\chapter{Working with state}
\section{Building lists and strings}
\chapkeywords{make-string make-list ,}
\index{\texttt{make-string}}
\index{\texttt{make-list}}
\index{\texttt{make-,}}
\section{Hashtables}
A hashtable, much like an association list, stores key/value pairs, and offers lookup by key. However, whereas an association list must be searched linearly to locate keys, a hashtable uses a more sophisticated method. Key/value pairs are sorted into \emph{buckets} using a \emph{hash function}. If two objects are equal, then they must have the same hash code; but not necessarily vice versa. To look up the value associated with a key, only the bucket corresponding to the key has to be searched. A hashtable is simply a vector of buckets, where each bucket is an association list.
\texttt{<hashtable> ( capacity -{}- hash )} creates a new hashtable with the specified number of buckets. A hashtable with one bucket is basically an association list. Right now, a ``large enough'' capacity must be specified, and performance degrades if there are too many key/value pairs per bucket. In a future implementation, hashtables will grow as needed as the number of key/value pairs increases.
\texttt{hash ( key hash -{}- value )} looks up the value associated with a key in the hashtable. Pushes \texttt{f} if no pair with this key is present. Note that \texttt{hash} cannot differentiate between a key that is not present at all, or a key with a value of \texttt{f}.
\texttt{hash* ( key hash -{}- {[} key | value {]} )} looks for
a pair with this key, and pushes the pair itself. Unlike \texttt{hash},
\texttt{hash{*}} returns different values in the cases of a value
set to \texttt{f}, or an undefined value.
\texttt{set-hash ( value key hash -{}- )} stores a key/value pair in a hashtable.
Hashtables can be converted to association lists and vice versa using
the \texttt{hash>alist} and \texttt{alist>hash} words. The list of keys and
list of values can be extracted using the \texttt{hash-keys} and \texttt{hash-values} words.
examples
\section{Variables}
Notice that until now, all the code except a handful of examples has only used the stack for storage. You can also use variables to store temporary data, much like in other languages, however their use is not so prevalent. This is not a coincidence -- Fator was designed this way, and mastery of the stack is essential. Using variables where the stack is more appropriate leads to ugly, unreusable code.
Variables are typically used for longer-term storage of data, and compound data structures, realized as nested namespaces of variables. This concept should be instantly familiar to anybody who's used an object-oriented programming language. Variables should only be used for intermediate results if keeping everything on the stack would result in ackward stack flow.
The words \texttt{get ( name -{}- value )} and \texttt{set ( value name -{}- )} retreive and store variable values, respectively. Variable names are strings, and they do not have to be declared before use. For example:
\begin{alltt}
5 "x" set
"x" get .
\emph{5}
\end{alltt}
\section{Namespaces}
Only having one list of variable name/value bindings would make the language terribly inflexible. Instead, a variable has any number of potential values, one per namespace. There is a notion of a ``current namespace''; the \texttt{set} word always stores variables in the current namespace. On the other hand, \texttt{get} traverses up the stack of namespace bindings until it finds a variable with the specified name.
\texttt{bind ( namespace quot -{}- )} executes a quotation in the dynamic scope of a namespace. For example, the following sets the value of \texttt{x} to 5 in the global namespace, regardless of the current namespace at the time the word was called.
\begin{alltt}
: global-example ( -- )
global {[} 5 "x" set {]} bind ;
\end{alltt}
\texttt{<namespace> ( -{}- namespace )} creates a new namespace object. Actually, a namespace is just a hashtable, with a default capacity.
\texttt{with-scope ( quot -{}- )} combines \texttt{<namespace>} with \texttt{bind} by executing a quotation in a new namespace.
get example
describe
\section{The name stack}
The \texttt{bind} combinator creates dynamic scope by pushing and popping namespaces on the so-called \emph{name stack}. Its definition is simpler than one would expect:
\begin{alltt}
: bind ( namespace quot -- )
swap >n call n> drop ;
\end{alltt}
The words \texttt{>n} and \texttt{n>} push and pop the name stack, respectively. Observe the stack flow in the definition of \texttt{bind}; the namespace goes on the name stack, the quotation is called, and the name space is popped and discarded.
The name stack is really just a vector. The words \texttt{>n} and \texttt{n>} are implemented as follows:
\begin{alltt}
: >n ( namespace -- n:namespace ) namestack* vector-push ;
: n> ( n:namespace -- namespace ) namestack* vector-pop ;
\end{alltt}
\section{\label{sub:List-constructors}List constructors}
The list construction words provide an alternative way to build up a list. Instead of passing a partial list around on the stack as it is built, they store the partial list in a variable. This reduces the number
of stack elements that have to be juggled.
The word \texttt{{[}, ( -{}- )} begins list construction. This also pushes a new namespace on the name stack, so any variable values that are set between calls to \texttt{[,} and \texttt{,]} will be lost.
The word \texttt{, ( obj -{}- )} appends an object to the partial
list.
The word \texttt{,{]} ( -{}- list )} pushes the complete list, and pops the corresponding namespace from the name stack.
The fact that a new
scope is created between \texttt{{[},} and \texttt{,{]}} is very important.
This means
that list constructions can be nested. There is no
requirement that \texttt{{[},} and \texttt{,{]}} appear in the same
word, however, debugging becomes prohibitively difficult when a list
construction begins in one word and ends with another.
Here is an example of list construction using this technique:
\begin{alltt}
{[}, 1 10 {[} 2 {*} dup , {]} times drop ,{]} .
\emph{{[} 2 4 8 16 32 64 128 256 512 1024 {]}}
\end{alltt}
\section{String constructors}
The string construction words provide an alternative way to build up a string. Instead of passing a string buffer around on the stack, they store the string buffer in a variable. This reduces the number
of stack elements that have to be juggled.
The word \texttt{<\% ( -{}- )} begins string construction. The word
definition creates a string buffer. Instead of leaving the string
buffer on the stack, the word creates and pushes a scope on the name
stack.
The word \texttt{\% ( str/ch -{}- )} appends a string or a character
to the partial list. The word definition calls \texttt{sbuf-append}
on a string buffer located by searching the name stack.
The word \texttt{\%> ( -{}- str )} pushes the complete list. The word
definition pops the name stack and calls \texttt{sbuf>str} on the
appropriate string buffer.
Compare the following two examples -- both define a word that concatenates together all elements of a list of strings. The first one uses a string buffer stored on the stack, the second uses string construction words:
\begin{alltt}
: cat ( list -- str )
100 <sbuf> swap {[} over sbuf-append {]} each sbuf>str ;
: cat ( list -- str )
<\% {[} \% {]} each \%> ;
\end{alltt}
The scope created by \texttt{<\%} and \texttt{\%>} is \emph{dynamic}; that is, all code executed between two words is part of the scope. This allows the call to \texttt{\%} to occur in a nested word. For example, here is a pair of definitions that turn an association list of strings into a string of the form \texttt{key1=value1 key2=value2 ...}:
\begin{alltt}
: pair\% ( pair -{}- )
unswons \% "=" \% \% ;
: assoc>string ( alist -{}- )
<\% [ pair\% " " \% ] each \%> ;
\end{alltt}
\chapter{Practical: a contractor timesheet}
For the second practical example, we will code a small program that tracks how long you spend working on tasks. It will provide two primary functions, one for adding a new task and measuring how long you spend working on it, and another to print out the timesheet. A typical interaction looks like this:
\begin{alltt}
timesheet-app
\emph{
(E)xit
(A)dd entry
(P)rint timesheet
Enter a letter between ( ) to execute that action.}
a
\emph{Start work on the task now. Press ENTER when done.
Please enter a description:}
Working on the Factor HTTP server
\emph{(E)xit
(A)dd entry
(P)rint timesheet
Enter a letter between ( ) to execute that action.}
a
\emph{Start work on the task now. Press ENTER when done.
Please enter a description:}
Writing a kick-ass web app
\emph{
(E)xit
(A)dd entry
(P)rint timesheet
Enter a letter between ( ) to execute that action.}
p
\emph{TIMESHEET:
Working on the Factor HTTP server 0:25
Writing a kick-ass web app 1:03
(E)xit
(A)dd entry
(P)rint timesheet
Enter a letter between ( ) to execute that action.}
x
\end{alltt}
Once you have finished working your way through this tutorial, you might want to try extending the program -- for example, it could print the total hours, prompt for an hourly rate, then print the amount of money that should be billed.
\section{Measuring a duration of time}
When you begin working on a new task, you tell the timesheet you want
to add a new entry. It then measures the elapsed time until you specify
the task is done, and prompts for a task description.
The first word we will write is \texttt{measure-duration}. We measure
the time duration by using the \texttt{millis} word \texttt{( -{}-
m )} to take the time before and after a call to \texttt{read}. The
\texttt{millis} word pushes the number of milliseconds since a certain
epoch -- the epoch does not matter here since we are only interested
in the difference between two times.
A first attempt at \texttt{measure-duration} might look like this:
\begin{alltt}
: measure-duration millis read drop millis - ;
measure-duration .
\end{alltt}
This word definition has the right general idea, however, the result
is negative. Also, we would like to measure durations in minutes,
not milliseconds:
\begin{alltt}
: measure-duration ( -{}- duration )
millis
read drop
millis swap - 1000 /i 60 /i ;
\end{alltt}
Note that the \texttt{/i} word \texttt{( x y -{}- x/y )}, from the
\texttt{math} vocabulary, performs truncating division. This
makes sense, since we are not interested in fractional parts of a
minute here.
\section{Adding a timesheet entry}
Now that we can measure a time duration at the keyboard, lets write
the \texttt{add-entry-prompt} word. This word does exactly what one
would expect -- it prompts for the time duration and description,
and leaves those two values on the stack:
\begin{alltt}
: add-entry-prompt ( -{}- duration description )
"Start work on the task now. Press ENTER when done." print
measure-duration
"Please enter a description:" print
read ;
\end{alltt}
You should interactively test this word. Measure off a minute or two,
press ENTER, enter a description, and press ENTER again. The stack
should now contain two values, in the same order as the stack effect
comment.
Now, almost all the ingredients are in place. The final add-entry
word calls add-entry-prompt, then pushes the new entry on the end
of the timesheet vector:
\begin{alltt}
: add-entry ( timesheet -{}- )
add-entry-prompt cons swap vector-push ;
\end{alltt}
Recall that timesheet entries are cons cells where the car is the
duration and the cdr is the description, hence the call to \texttt{cons}.
Note that this word side-effects the timesheet vector. You can test
it interactively like so:
\begin{alltt}
10 <vector> dup add-entry
\emph{Start work on the task now. Press ENTER when done.}
\emph{Please enter a description:}
\emph{Studying Factor}
.
\emph{\{ {[} 2 | "Studying Factor" {]} \}}
\end{alltt}
\section{Printing the timesheet}
The hard part of printing the timesheet is turning the duration in
minutes into a nice hours/minutes string, like {}``01:15''. We would
like to make a word like the following:
\begin{alltt}
135 hh:mm .
\emph{01:15}
\end{alltt}
First, we can make a pair of words \texttt{hh} and \texttt{mm} to extract the hours
and minutes, respectively. This can be achieved using truncating division,
and the modulo operator -- also, since we would like strings to be
returned, the \texttt{unparse} word \texttt{( obj -{}- str )} from
the \texttt{unparser} vocabulary is called to turn the integers into
strings:
\begin{alltt}
: hh ( duration -{}- str ) 60 /i unparse ;
: mm ( duration -{}- str ) 60 mod unparse ;
\end{alltt}
The \texttt{hh:mm} word can then be written, concatenating the return
values of \texttt{hh} and \texttt{mm} into a single string using string
construction:
\begin{alltt}
: hh:mm ( millis -{}- str ) <\% dup hh \% ":" \% mm \% \%> ;
\end{alltt}
However, so far, these three definitions do not produce ideal output.
Try a few examples:
\begin{alltt}
120 hh:mm .
2:0
130 hh:mm .
2:10
\end{alltt}
Obviously, we would like the minutes to always be two digits. Luckily,
there is a \texttt{digits} word \texttt{( str n -{}- str )} in the
\texttt{format} vocabulary that adds enough zeros on the left of the
string to give it the specified length. Try it out:
\begin{alltt}
"23" 2 digits .
\emph{"23"}
"7"2 digits .
\emph{"07"}
\end{alltt}
We can now change the definition of \texttt{mm} accordingly:
\begin{alltt}
: mm ( duration -{}- str ) 60 mod unparse 2 digits ;
\end{alltt}
Now that time duration output is done, a first attempt at a definition
of \texttt{print-timesheet} looks like this:
\begin{alltt}
: print-timesheet ( timesheet -{}- )
{[} uncons write ": " write hh:mm print {]} vector-each ;
\end{alltt}
This works, but produces ugly output:
\begin{alltt}
\{ {[} 30 | "Studying Factor" {]} {[} 65 | "Paperwork" {]} \}
print-timesheet
\emph{Studying Factor: 0:30}
\emph{Paperwork: 1:05}
\end{alltt}
It would be much nicer if the time durations lined up in the same
column. First, lets factor out the body of the \texttt{vector-each}
loop into a new \texttt{print-entry} word before it gets too long:
\begin{alltt}
: print-entry ( duration description -{}- )
write ": " write hh:mm print ;
: print-timesheet ( timesheet -{}- )
{[} uncons print-entry {]} vector-each ;
\end{alltt}
We can now make \texttt{print-entry} line up columns using the \texttt{pad-string}
word \texttt{( str n -{}- str )}.
\begin{alltt}
: print-entry ( duration description -{}- )
dup
write
50 swap pad-string write
hh:mm print ;
\end{alltt}
In the above definition, we first print the description, then enough
blanks to move the cursor to column 60. So the description text is
left-justified. If we had interchanged the order of the second and
third line in the definition, the description text would be right-justified.
Try out \texttt{print-timesheet} again, and marvel at the aligned
columns:
\begin{alltt}
\{ {[} 30 | "Studying Factor" {]} {[} 65 | "Paperwork" {]} \}
print-timesheet
\emph{Studying Factor 0:30}
\emph{Paperwork 1:05}
\end{alltt}
\section{The main menu}
Finally, we will code a main menu that looks like this:
\begin{alltt}
(E)xit
(A)dd entry
(P)rint timesheet
Enter a letter between ( ) to execute that action.
\end{alltt}
We will represent the menu as an association list. Recall that an association list is a list of pairs, where the car of each pair is a key, and the cdr is a value. Our keys will literally be keyboard keys (``e'', ``a'' and ``p''), and the values will themselves be pairs consisting of a menu item label and a quotation.
The first word we will code is \texttt{print-menu}. It takes an association list, and prints the second element of each pair's value. Note that \texttt{terpri} simply prints a blank line:
\begin{alltt}
: print-menu ( menu -{}- )
terpri {[} cdr car print {]} each terpri
"Enter a letter between ( ) to execute that action." print ;
\end{alltt}
You can test \texttt{print-menu} with a short association list:
\begin{alltt}
{[} {[} "x" "(X)yzzy" 2 2 + . {]} {[} "f" "(F)oo" -1 sqrt . {]} {]} print-menu
\emph{
Xyzzy
Foo
Enter a letter between ( ) to execute that action.}
\end{alltt}
The next step is to write a \texttt{menu-prompt} word that takes the same association list, reads a line of input from the keyboard, and executes the quotation associated with that line. Recall that the \texttt{assoc} word returns \texttt{f} if the specified key could not be found in the association list. The below definition makes use of a conditional to signal an error in that case:
\begin{alltt}
: menu-prompt ( menu -{}- )
read swap assoc dup {[}
cdr call
{]} {[}
"Invalid input: " swap unparse cat2 throw
{]} ifte ;
\end{alltt}
Try applying the new \texttt{menu-prompt} word to the association list we used to test \texttt{print-menu}. You should verify that entering \texttt{x} causes the quotation \texttt{{[} 2 2 + . {]}} to be executed:
\begin{alltt}
{[} {[} "x" "(X)yzzy" 2 2 + . {]} {[} "f" "(F)oo" -1 sqrt . {]} {]} menu-prompt
x
\emph{4}
\end{alltt}
Finally, we want a \texttt{menu} word that first prints a menu, then prompts for and acts on input:
\begin{alltt}
: menu ( menu -{}- )
dup print-menu menu-prompt ;
\end{alltt}
Considering the stack effects of \texttt{print-menu} and \texttt{menu-prompt}, it should be obvious why the \texttt{dup} is needed.
\section{Finishing off}
We now need a \texttt{main-menu} word. It takes the timesheet vector from the stack, and recursively calls itself until the user requests that the timesheet application exits:
\begin{alltt}
: main-menu ( timesheet -{}- )
{[}
{[} "e" "(E)xit" drop {]}
{[} "a" "(A)dd entry" dup add-entry main-menu {]}
{[} "p" "(P)rint timesheet" dup print-timesheet main-menu {]}
{]} menu ;
\end{alltt}
Note that unless the first option is selected, the timesheet vector is eventually passed into the recursive \texttt{main-menu} call.
All that remains now is the ``main word'' that runs the program with an empty timesheet vector. Note that the initial capacity of the vector is 10 elements, however this is not a limit -- adding more than 10 elements will grow the vector:
\begin{alltt}
: timesheet-app ( -{}- )
10 <vector> main-menu ;
\end{alltt}
\section{The complete program}
\begin{verbatim}
! Contractor timesheet example
IN: timesheet
USE: combinators
USE: errors
USE: format
USE: kernel
USE: lists
USE: math
USE: parser
USE: stack
USE: stdio
USE: strings
USE: unparser
USE: vectors
! Adding a new entry to the time sheet.
: measure-duration ( -- duration )
millis
read drop
millis swap - 1000 /i 60 /i ;
: add-entry-prompt ( -- duration description )
"Start work on the task now. Press ENTER when done." print
measure-duration
"Please enter a description:" print
read ;
: add-entry ( timesheet -- )
add-entry-prompt cons swap vector-push ;
! Printing the timesheet.
: hh ( duration -- str ) 60 /i ;
: mm ( duration -- str ) 60 mod unparse 2 digits ;
: hh:mm ( millis -- str ) <% dup hh % ":" % mm % %> ;
: print-entry ( duration description -- )
dup write
60 swap pad-string write
hh:mm print ;
: print-timesheet ( timesheet -- )
"TIMESHEET:" print
[ uncons print-entry ] vector-each ;
! Displaying a menu
: print-menu ( menu -- )
terpri [ cdr car print ] each terpri
"Enter a letter between ( ) to execute that action." print ;
: menu-prompt ( menu -- )
read swap assoc dup [
cdr call
] [
"Invalid input: " swap unparse cat2 throw
] ifte ;
: menu ( menu -- )
dup print-menu menu-prompt ;
! Main menu
: main-menu ( timesheet -- )
[
[ "e" "(E)xit" drop ]
[ "a" "(A)dd entry" dup add-entry main-menu ]
[ "p" "(P)rint timesheet" dup print-timesheet main-menu ]
] menu ;
: timesheet-app ( -- )
10 <vector> main-menu ;
\end{verbatim}
\chapter{Working with classes}
\section{What is object oriented programming?}
Object oriented programming is a commonly-used term, however many people
define it differently. Most will agree it consists of three key ideas:
\begin{itemize}
\item Objects are small pieces of state with the required identity and
equality semantics, along with runtime information
allowing the object to reflect on itself.
\item Objects are organized in some manner, allowing one to express
that a given set of objects features common behavior or shape. Factor organizes
objects into classes and types, however its definition of these terms is
slightly different from convention.
\item Behavior can be defined on objects, and dispatched in a polymorphic way,
where invoking a generic operation on an object takes action most
appropriate to that object.
\end{itemize}
The separation into three parts is reflected in the design of the Factor
object system.
The following terminology is used in this guide:
\begin{itemize}
\item \emph{Class} -- a class is a set of objects given by a predicate
that distinglishes elements of the class from other objects, along with
some associated meta-information.
\item \emph{Type} -- a type is a concrete representation of an object
in runtime memory. There is only a fixed number of built-in types, such as
integers, strings, and arrays. Each object has a unique type it belongs to,
whereas it may be a member of an arbitrary number of classes.
\end{itemize}
In many languages, a class refers to a specific object organization,
typically a specification form for named slots that objects in the class
shall have. In Factor, the \texttt{tuple} metaclass allows one to create
such conventional objects. However, we will look at generic words
and built-in classes first.
\section{Generic words and methods}
To use the generic word system, you must put the following near the
beginning of your source file:
\begin{verbatim}
USE: generic
\end{verbatim}
The motivation for generic words is that sometimes, you want to write a word that has
differing behavior depending on the class of its argument. For example,
in a game, a \texttt{draw} word could take different action if given a ship, a
weapon, a planet, etc. Writing one large \texttt{draw} word that contains type case logic results in
unnecessary coupling -- adding support for a new type of graphical
object would require modifying the original definition of \texttt{draw}, for
example.
A generic word is a word whose behavior depends on the class of the
object at the top of the stack, however this behavior is defined in a
decentralized manner.
A new generic word is defined using the following syntax:
\begin{verbatim}
GENERIC: draw ( actor -- )
#! Draw the actor.
\end{verbatim}
A stack effect comment, as shown above, is not required but recommended.
A generic word just defined like that will simply raise an error if
invoked. Specific behavior is defined using methods.
A method associates behavior with a generic word. Methods are defined by
writing \texttt{M:}, followed by a class name, followed by the name of a
previously-defined generic word.
One of the main benefits of generic words is that each method definition
can potentially occur in a different source file. Generic word
definitions also hide conditionals.
Here are two methods for the generic \texttt{draw} word:
\begin{verbatim}
M: ship draw ( actor -- )
[
surface get screen-xy radius get color get
filledCircleColor
] bind ;
M: plasma draw ( actor -- )
[
surface get screen-xy dup len get + color get
vlineColor
] bind ;
\end{verbatim}
Here, \texttt{ship} and \texttt{class} are user-defined classes.
Every object is a member of the \texttt{object} class. If you provide a method specializing
on the \texttt{object} class for some generic word, the method will be
invoked when no other more specific method exists. For example:
\begin{verbatim}
GENERIC: describe
M: number describe "The number " write . ;
M: object describe "I don't know anything about " write . ;
\end{verbatim}
\section{Classes}
Recall that in Factor, a class is just a predicate that categorizes objects as
being a member of the class or not. To be useful, it must be consistent
-- for a given object, it must always return the same truth value.
Classes are not always subsets or supersets of types and new classes can be defined by the user. Classes can be quite arbitrary:
\begin{itemize}
\item Cons cells where both elements are integers
\item Floating point numbers between -1 and 1
\item Hash tables holding a certain key
\item Any object that occurs as a member of a certain global variable
holding a list.
\item \... and so on.
\end{itemize}
The building blocks of classes are the various built-in types, and
user-defined tupes. Tuples are covered later in this chapter.
The built-in types each get their own class whose members are precisely
the objects having that type. The following built-in classes are
defined:
\begin{itemize}
\item \texttt{alien}
\item \texttt{array}
\item \texttt{bignum}
\item \texttt{complex}
\item \texttt{cons}
\item \texttt{dll}
\item \texttt{f}
\item \texttt{fixnum}
\item \texttt{float}
\item \texttt{port}
\item \texttt{ratio}
\item \texttt{sbuf}
\item \texttt{string}
\item \texttt{t}
\item \texttt{tuple}
\item \texttt{vector}
\item \texttt{word}
\end{itemize}
Each builtin class has a corresponding membership test predicate, named
after the builtin class suffixed by \texttt{?}. For example, \texttt{cons?}, \texttt{word?}, etc. Automatically-defined predicates is a common theme, and
in fact \emph{every} class has a corresponding predicate word,
with the following
exceptions:
\begin{itemize}
\item \texttt{object} -- there is no need for a predicate word, since
every object is an instance of this class.
\item \texttt{f} -- the only instance of this class is the sigleton
\texttt{f} signifying falsity, missing value, and empty list, and the predicate testing for this is the built-in library word \texttt{not}.
\item \texttt{t} -- the only instance of this class is the canonical truth value
\texttt{t}. You can write \texttt{t =} to test for this object, however usually
any object distinct from \texttt{f} is taken as a truth value, and \texttt{t} is not tested for directly.
\end{itemize}
\section{Metaclasses}
So far, we have only seen predefined classes corresponding to built-in
types. More complicated classes are defined in terms of metaclasses.
This section will describe how to define new classes belonging to
predefined metaclasses.
Just like shared object object traits motivates the existence of classes,
common behavior shared between classes themselves motivates metaclasses.
For example, classes corresponding to built-in types, such as \texttt{fixnum}
and \texttt{string}, are instances of
the \texttt{builtin} metaclass, whereas a user-defined class is not an
instance of \texttt{builtin}.
\subsection{The \texttt{union} metaclass}
The \texttt{union} metaclass allows new classes to be
defined as aggregates of existing classes.
For example, the Factor library defines some unions over numeric types:
\begin{verbatim}
UNION: integer fixnum bignum ;
UNION: rational integer ratio ;
UNION: real rational float ;
UNION: number real complex ;
\end{verbatim}
Now, the absolute value function can be defined in an efficient manner
for real numbers, and in a more general fashion for complex numbers:
\begin{verbatim}
GENERIC: abs ( z -- |z| )
M: real abs dup 0 < [ neg ] when ;
M: complex abs >rect mag2 ;
\end{verbatim}
New unions can be defined as in the numerical classes example:
you write \texttt{UNION:} followed by the name of the union,
followed by its members. The list of members is terminated with a
semi-colon.
A predicate named after the union followed by '?' is
automatically-defined. For example, the following definition of 'real?'
was automatically created:
\begin{verbatim}
: real?
dup rational? [
drop t
] [
dup float? [
drop t
] [
drop f
] ifte
] ifte ;
\end{verbatim}
\subsection{The \texttt{complement} metaclass}
The \texttt{complement} metaclass allows you to define a class whose members
are exactly those not in another class. For example, the class of all
truth values is defined in \texttt{library/kernel.factor} by:
\begin{verbatim}
COMPLEMENT: general-t f
\end{verbatim}
\subsection{The \texttt{predicate} metaclass}
The predicate metaclass contains classes whose membership test is an
arbitrary expression. To speed up dispatch, each predicate must be
defined as a subclass of some other class. That way predicates
subclassing from disjoint builtin classes do not need to be
exhaustively tested.
The source file \texttt{library/strings.factor} defines some subclasses of \texttt{integer}
classifying ASCII characters:
\begin{verbatim}
PREDICATE: integer blank " \t\n\r" str-contains? ;
PREDICATE: integer letter CHAR: a CHAR: z between? ;
PREDICATE: integer LETTER CHAR: A CHAR: Z between? ;
PREDICATE: integer digit CHAR: 0 CHAR: 9 between? ;
PREDICATE: integer printable CHAR: \s CHAR: ~ between? ;
\end{verbatim}
Each predicate defines a corresponding predicate word whose name is
suffixed with '?'; for example, a 'digit?' word is automatically
defined:
\begin{verbatim}
: digit?
dup integer? [
CHAR: 0 CHAR: 9 between?
] [
drop f
] ifte ;
\end{verbatim}
For obvious reasons, the predicate definition must consume and produce
exactly one value on the stack.
\section{Tuples}
Tuples are user-defined classes whose objects consist of named slots.
New tuple classes are defined with the following syntax:
\begin{verbatim}
TUPLE: point x y z ;
\end{verbatim}
This defines a new class named \texttt{point}, along with the
following set of words:
\begin{verbatim}
<point> point?
point-x set-point-x
point-y set-point-y
point-z set-point-z
\end{verbatim}
The word \texttt{<point>} takes the slot values from the stack and
produces a new \texttt{point}:
\begin{alltt}
\textbf{ok} 1 2 3 <point> .
\textbf{<< point 1 2 3 >>}
\end{alltt}
As you can guess from the above, there is a literal syntax for tuples,
and the \texttt{point?}~word tests if the top of the stack is an object
belonging to that class:
\begin{alltt}
\textbf{ok} << point 1 2 3 >> point? .
\textbf{t}
\end{alltt}
The general form of the literal syntax is as follows:
\begin{alltt}
<< \emph{class} \emph{slots} \... >>
\end{alltt}
The syntax consists of the tuple class name followed by the
values of all slots. An error is raised if insufficient or extraneous slot values are specified.
As usual, the distinction between literal syntax and explicit calls is the
time the tuple is created; literals are created at parse time, whereas
explicit constructor calls creates a new object each time the code
runs.
Slots are read and written using the various automatically-defined words with names of the
form \texttt{\emph{class}-\emph{slot}} and \texttt{set-\emph{class}-\emph{slot}}.
\subsection{Constructors}
A tuple constructor is named after the tuple class surrounded in angle
brackets (\texttt{<} and \texttt{>}). A default constructor is provided
that reads slot values from the stack, however a custom constructor can
be defined using the \texttt{C:} parsing word.
\subsection{Delegation}
If a tuple defines a slot named \texttt{delegate}, any generic words called on
the tuple that are not defined for the tuple's class will be passed on
to the delegate.
This idiom is used in the I/O code for wrapper streams. For example, the
\texttt{ansi-stream} class delegates all generic words to its underlying stream,
except for \texttt{fwrite-attr}, which outputs the necessary terminal escape
codes. Another example is \texttt{stdio-stream}, which performs all I/O on its
underlying stream, except it flushes after every new line (which would
be undesirable for say, a file).
Delegation is used instead of inheritance in Factor, but it is not a
substitute; in particular, the semantics differ in that a delegated
method call receives the delegate on the stack, not the original object.
\input{new-guide.ind}
\end{document}