% :indentSize=4:tabSize=4:noTabs=true:mode=tex:wrap=soft:
\documentclass{book}
\usepackage[plainpages=false,colorlinks]{hyperref}
\usepackage[style=list,toc]{glossary}
\usepackage{alltt}
\usepackage{times}
\usepackage{tabularx}
\usepackage{epsfig}
\usepackage{amssymb}
\usepackage{epstopdf}
%\usepackage{fancyref}
\pagestyle{headings}
\setcounter{tocdepth}{3}
\setcounter{secnumdepth}{3}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}
\newcommand{\bs}{\char'134}
\newcommand{\dq}{\char'42}
\newcommand{\tto}{\symbol{123}}
\newcommand{\ttc}{\symbol{125}}
\newcommand{\pound}{\char'43}
\newcommand{\hhat}{\symbol{94}}
\newcommand{\ttindex}[1]{\texttt{#1}\index{\texttt{#1}}}
\newcommand{\vocabulary}[1]{\emph{Vocabulary:} \texttt{#1}&\\}
\newcommand{\parsingword}[2]{\index{\texttt{#1}}\emph{Parsing word:} \texttt{#2}&\\}
\newcommand{\ordinaryword}[2]{\index{\texttt{#1}}\emph{Word:} \texttt{#2}&\\}
\newcommand{\symbolword}[1]{\index{\texttt{#1}}\emph{Symbol:} \texttt{#1}&\\}
\newcommand{\classword}[1]{\index{\texttt{#1}}\emph{Class:} \texttt{#1}&\\}
\newcommand{\genericword}[2]{\index{\texttt{#1}}\emph{Generic word:} \texttt{#2}&\\}
\newcommand{\predword}[1]{\ordinaryword{#1}{#1~( object -- ?~)}}
\setlength{\tabcolsep}{1mm}
\newcommand{\wordtable}[1]{
%HEVEA\renewcommand{\index}[1]{}
%HEVEA\renewcommand{\glossary}[1]{}
\begin{tabularx}{12cm}{lX}
\hline
#1
\hline
\end{tabularx}
}
\makeatletter
\makeatother
\makeglossary
\makeindex
\begin{document}
\title{Factor Developer's Handbook}
\author{Slava Pestov}
\maketitle
\tableofcontents{}
\chapter*{Foreword}
This handbook documents release 0.77 of the Factor programming language.
Note that this handbook is not a tutorial or introductory guide, nor does it cover some background material that you are expected to understand, such as object-oriented programming, higher-order functions, continuations, or general algorithm and program design.
The Factor homepage can be found at \verb|http://factor.sourceforge.net|.
\part{Language reference}
\chapter{Conventions}
When examples of interpreter interactions are given in this guide, the input is in a roman font, and any
output from the interpreter is in boldface:
\begin{alltt}
"Hello, world!" print
\textbf{Hello, world!}
\end{alltt}
\section{Word definitions}
Parsing words, defined in \ref{parser}, are presented with the following notation.
\wordtable{
\vocabulary{foo}
\parsingword{word}{word syntax...}
}
The parsing word's name is followed by the syntax, with meta-syntactic variables set in an italic font. For example:
\wordtable{
\vocabulary{syntax}
\parsingword{colon}{:~\emph{name} \emph{definition} ;}
}
Ordinary words are presented in the following notation.
\wordtable{
\vocabulary{foo}
\ordinaryword{word}{word ( \emph{inputs} -- \emph{outputs} )}
}
A compound definition in the library, or primitive in the runtime.
\wordtable{
\vocabulary{foo}
\symbolword{word}
}
A symbol definition.
\wordtable{
\vocabulary{foo}
\genericword{word}{word ( \emph{inputs} -- \emph{outputs} )}
}
A generic word definition.
\wordtable{
\vocabulary{foo}
\classword{word}
}
A class that generic word methods can specialize on.
\section{Stack effects}
Within a stack effect comment, the top of the stack is the rightmost entry in both the
list of inputs and outputs, so \texttt{( x y -- x-y )} indicates that the top stack element will be subtracted from the element underneath.
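
As a small illustration, assuming the subtraction word \texttt{-} has the stack effect \texttt{( x y -- x-y )} shown above, the following listener interaction subtracts the top element (3) from the element underneath (10):
\begin{alltt}
10 3 - .
\textbf{7}
\end{alltt}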
The following abbreviations have conventional meaning in stack effect comments:
\begin{description}
\item[\texttt{[ x y z ]}] a list with elements whose types are hinted at by \texttt{x}, \texttt{y}, \texttt{z}
\item[\texttt{[[ x y ]]}] a cons cell where the type of the car is hinted at by \texttt{x}, and the type of the cdr is hinted at by \texttt{y}
\item[\texttt{ch}] an integer representing a Unicode character
\item[\texttt{elt}] an arbitrary object that happens to be an element of a collection
\item[\texttt{i}] a loop counter or index
\item[\texttt{j}] a loop counter or index
\item[\texttt{n}] a number
\item[\texttt{obj}] an arbitrary object
\item[\texttt{quot}] a quotation
\item[\texttt{seq}] a sequence
\item[\texttt{str}] a string
\item[\texttt{path}] a string representing a file's path name
\item[\texttt{?}] a boolean
\item[\texttt{foo/bar}] either \texttt{foo} or \texttt{bar}. For example, \texttt{str/f} means either a string, or \texttt{f}
\end{description}
If the stack effect identifies quotations, the stack effect of each quotation may be given after suffixing \texttt{|} to the whole string. For example, the following denotes a word that takes a list and a quotation and produces a new list, calling the quotation with elements of the list.
\begin{verbatim}
( list quot -- list | quot: elt -- elt )
\end{verbatim}
\section{Naming conventions}
The following naming conventions are used in the Factor library.
\begin{description}
\item[\texttt{<class>}] create a new instance of \texttt{class}
\item[\texttt{FOO:}] a parsing word that reads ahead from the input string
\item[\texttt{FOO}] a parsing word that does not read ahead, but rather takes a fixed action at parse time
\item[\texttt{FOO"}] a parsing word that reads characters from the input string until the next occurrence of \texttt{"}
\item[\texttt{foo?}] a predicate returning a boolean or generalized boolean value
\item[\texttt{foo.}] a word whose primary action is to print something, rather than to return a value. The basic case is the \texttt{.}~word, which prints the object at the top of the stack
\item[\texttt{foo*}] a variation of the \texttt{foo} word that takes more parameters
\item[\texttt{(foo)}] a word that is only useful for the implementation of \texttt{foo}
\item[\texttt{2foo}] like \texttt{foo} but takes or returns two operands
\item[\texttt{3foo}] like \texttt{foo} but takes or returns three operands
\item[\texttt{foo-with}] a form of the \texttt{foo} combinator that takes an extra object, and passes this object on each iteration of the quotation; for example, \texttt{each-with} and \texttt{map-with}
\item[\texttt{from>}] converts an instance of the \texttt{from} class into some canonical form
\item[\texttt{from>to}] converts an instance of the \texttt{from} class to the \texttt{to} class
\item[\texttt{>s}] move top of data stack to the \texttt{s} stack, where \texttt{s} is either \texttt{r} (call stack), \texttt{n} (name stack), or \texttt{c} (catch stack). Sometimes, libraries will define their own words following this naming convention, to implement user-defined stacks, typically stored in variables
\item[\texttt{s>}] move top of \texttt{s} stack to the data stack, where \texttt{s} is as above
\item[\texttt{style}] an association list holding text formatting information; possible keys are described in \ref{styles}
\item[\texttt{>to}] converts the object at the top of the stack to the \texttt{to} class
\item[\texttt{with-foo}] executes a quotation in a namespace where \texttt{foo} is configured in a special manner; for example, \texttt{with-stream}
\end{description}
\section{Mathematics}
This guide uses the standard mathematical notation to denote intervals.
\begin{tabular}{l|l}
Notation&Meaning\\
\hline
$(a,b)$&All numbers from $a$ to $b$, excluding $a$ and $b$\\
$[a,b)$&All numbers from $a$ to $b$, including $a$ and excluding $b$\\
$(a,b]$&All numbers from $a$ to $b$, excluding $a$ and including $b$\\
$[a,b]$&All numbers from $a$ to $b$, including $a$ and $b$
\end{tabular}
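
As a worked example, restricting attention to the integers (an assumption made only for this illustration), with $a=0$ and $b=3$ the four notations denote:
\[
(0,3)=\{1,2\},\quad [0,3)=\{0,1,2\},\quad (0,3]=\{1,2,3\},\quad [0,3]=\{0,1,2,3\}
\]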
\chapter{Syntax}\label{syntax}
\newcommand{\parseglos}{\glossary{name=parser,
description={a set of words in the \texttt{parser} vocabulary, primarily \texttt{parse}, \texttt{eval}, \texttt{parse-file} and \texttt{run-file}, that creates objects from their printed representations, and adds word definitions to the dictionary}}}
\parseglos
In Factor, an \emph{object} is a piece of data that can be identified. Code is data, so Factor syntax is actually a syntax for describing objects, of which code is a special case. Factor syntax is read by the parser. The parser performs two kinds of tasks -- it creates objects from their \emph{printed representations}, and it adds \emph{word definitions} to the dictionary (\ref{words}). The parser can be extended (\ref{parser}).
\section{Parser algorithm}\label{parser}
\parseglos
\glossary{name=token,
description={a whitespace-delimited piece of text, the primary unit of Factor syntax}}
\glossary{name=whitespace,
description={a space (ASCII 32), newline (ASCII 10) or carriage-return (ASCII 13)}}
\begin{figure}
\caption{Parser algorithm}
\begin{center}
\scalebox{0.40}{
%BEGIN IMAGE
\epsfbox{parser.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
At the most abstract level,
Factor syntax consists of whitespace-separated tokens. The parser tokenizes the input on whitespace boundaries, where whitespace is defined as a sequence of one or more space, tab, newline or carriage-return characters. The parser is case-sensitive, so
the following three expressions tokenize differently:
\begin{verbatim}
2X+
2 X +
2 x +
\end{verbatim}
As the parser reads tokens it makes a distinction between numbers, ordinary words, and
parsing words. Tokens are appended to the parse tree, the top level of which is a list
returned by the original parser invocation. Nested levels of the parse tree are created
by parsing words.
The parser iterates through the input text, checking each character in turn. Here is the parser algorithm in more detail -- some of the concepts therein will be defined shortly:
\begin{itemize}
\item If the current character is a double-quote (\texttt{"}), the \texttt{"} parsing word is executed, causing a string to be read.
\item Otherwise, the next token is taken from the input. The parser searches for a word named by the token in the currently used set of vocabularies. If the word is found, one of the following two actions is taken:
\begin{itemize}
\item If the word is an ordinary word, it is appended to the parse tree.
\item If the word is a parsing word, it is executed.
\end{itemize}
Otherwise, if the token does not represent a known word, the parser attempts to parse it as a number. If the token is a number, the number object is added to the parse tree. Otherwise, an error is raised and parsing halts.
\end{itemize}
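
As an illustrative walk-through of the steps above (a sketch, assuming that \texttt{+} and \texttt{.} are ordinary words found in the search path and that no words named \texttt{2} or \texttt{3} exist), consider the input below. The tokens \texttt{2} and \texttt{3} fail the vocabulary lookup and are parsed as numbers, while \texttt{+} and \texttt{.} are appended to the parse tree as words; the output appears only once the resulting parse tree is evaluated.
\begin{alltt}
2 3 + .
\textbf{5}
\end{alltt}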
\newcommand{\stringmodeglos}{
\glossary{name=string mode,
description={a parser mode where tokens are added to the parse tree as strings, without being looked up in the dictionary or converted into numbers first. Activated by switching on the \texttt{string-mode} variable}}}
\stringmodeglos
There is one exception to the above process; the parser might be placed in \emph{string mode}, in which case it simply reads tokens and appends them to the parse tree as strings. String mode is activated and deactivated by certain parsing words wishing to read input in an unstructured but tokenized manner -- see \ref{string-mode}.
\newcommand{\parsingwordglos}{
\glossary{name=parsing word,
description={a word that is run at parse time. Parsing words can be defined by suffixing the compound definition with \texttt{parsing}. Parsing words have the \texttt{\dq{}parsing\dq{}} word property set to true, and respond with true to the \texttt{parsing?}~word}}}
\parsingwordglos
Parsing words play a key role in parsing; while ordinary words and numbers are simply
added to the parse tree, parsing words execute in the context of the parser, and can
do their own parsing and create nested data structures in the parse tree. Parsing words
are also able to define new words.
While parsing words supporting arbitrary syntax can be defined, the default set is found
in the \texttt{syntax} vocabulary and provides the basis for all further syntactic
interaction with Factor.
\section{Vocabulary search}\label{vocabsearch}
\newcommand{\wordglos}{\glossary{
name=word,
description={an object holding a code definition and set of properties. Words are organized into vocabularies, and are uniquely identified by name within a vocabulary.}}}
\wordglos
\newcommand{\vocabglos}{\glossary{
name=vocabulary,
description={a collection of words, uniquely identified by name. The hashtable of vocabularies is stored in the \texttt{vocabularies} global variable, and the \texttt{USE:}~and \texttt{USING:}~parsing words add vocabularies to the parser's search path}}}
\vocabglos
A \emph{word} is a code definition identified by a name. Words are sorted into \emph{vocabularies}. Words are discussed in depth in \ref{words}.
When the parser reads a token, it attempts to look up a word named by that token. The
lookup is performed by searching each vocabulary in the search path, in order.
Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. For a way around this, see \ref{deferred}.
For a source file the vocabulary search path starts off with two vocabularies:
\begin{verbatim}
syntax
scratchpad
\end{verbatim}
The \texttt{syntax} vocabulary consists of a set of parsing words for reading Factor data
and defining new words. The \texttt{scratchpad} vocabulary is the default vocabulary for new
word definitions.
At the interactive listener, the default search path contains many more vocabularies. Details on the default search path and parser invocation are found in \ref{parsing-quotations}.
\wordtable{
\vocabulary{syntax}
\parsingword{USE:}{USE: \emph{vocabulary}}
}
\newcommand{\useglos}{\glossary{
name=search path,
description={the list of vocabularies that the parser looks up tokens in. You can add to this list with the \texttt{USE:} and \texttt{USING:} parsing words}}}
\useglos
Adds a new vocabulary at the front of the search path. Subsequent word lookups by the parser will search this vocabulary first.
\begin{alltt}
USE: lists
\end{alltt}
\wordtable{
\vocabulary{syntax}
\parsingword{USING:}{USING: \emph{vocabularies} ;}
}
Consecutive \texttt{USE:} declarations can be merged into a single \texttt{USING:} declaration.
\begin{alltt}
USING: lists strings vectors ;
\end{alltt}
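
For comparison, and assuming these three vocabularies exist, the \texttt{USING:} declaration above brings the same vocabularies into the search path as the separate declarations sketched below. The relative search order among the vocabularies named in a single \texttt{USING:} is not stated here, so this sketch should not be read as a claim about it.
\begin{alltt}
USE: lists
USE: strings
USE: vectors
\end{alltt}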
\wordtable{
\vocabulary{syntax}
\parsingword{IN:}{IN:~\emph{vocabulary}}
}
Sets the current vocabulary for new word definitions, and adds the vocabulary at the front of the search path (\ref{vocabsearch}).
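
A minimal sketch of \texttt{IN:} followed by a definition is given below. The word name \texttt{double-it} is hypothetical and chosen only for this illustration; the definition uses the colon syntax and a stack effect comment as described in the conventions chapter.
\begin{alltt}
IN: scratchpad
: double-it ( n -- n ) 2 * ;
5 double-it .
\textbf{10}
\end{alltt}
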
Here is an example demonstrating the vocabulary search path. If you can understand this example, then you have grasped vocabularies.
\begin{verbatim}
IN: foe
USE: sequences
: append
#! Prints a message, then calls sequences::append.
"foe::append calls sequences::append" print append ;
IN: fee
: append
#! Loops, calling fee::append.
"fee::append calls fee::append" print append ;
USE: foe
: append
#! Redefining fee::append to call foe::append.
"fee::append calls foe::append" print append ;
"1234" "5678" append print
\end{verbatim}
When placed in a source file and run, the above code produces the following output:
\begin{verbatim}
fee::append calls foe::append
foe::append calls sequences::append
12345678
\end{verbatim}
\section{Numbers}
\newcommand{\numberglos}{\glossary{
name=number,
description={an instance of the \texttt{number} class}}}
\numberglos
If a vocabulary lookup of a token fails, the parser attempts to parse it as a number.
\subsection{Integers}\label{integer-literals}
\newcommand{\integerglos}{\glossary{
name=integer,
description={an instance of the \texttt{integer} class, which is a disjoint union of the \texttt{fixnum} and \texttt{bignum} classes}}}
\integerglos
\newcommand{\fixnumglos}{\glossary{
name=fixnum,
description={an instance of the \texttt{fixnum} class, representing a fixed precision integer. On 32-bit systems, an element of the interval $(-2^{29},2^{29}]$, and on 64-bit systems, the interval $(-2^{61},2^{61}]$}}}
\fixnumglos
\newcommand{\bignumglos}{\glossary{
name=bignum,
description={an instance of the \texttt{bignum} class, representing an arbitrary-precision integer whose value is bounded by available object memory}}}
\bignumglos
The printed representation of an integer consists of a sequence of digits, optionally prefixed by a sign.
\begin{alltt}
123456
-10
2432902008176640000
\end{alltt}
Integers are entered in base 10 unless prefixed with a base change parsing word.
\wordtable{
\vocabulary{syntax}
\parsingword{BIN:}{BIN: \emph{integer}}
\parsingword{OCT:}{OCT: \emph{integer}}
\parsingword{HEX:}{HEX: \emph{integer}}
}
\begin{alltt}
BIN: 1110 BIN: 1 + .
\textbf{15}
HEX: deadbeef 2 * .
\textbf{7471857118}
\end{alltt}
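
A few further illustrations of the base change parsing words follow, as a sketch that assumes \texttt{OCT:} behaves analogously to the \texttt{BIN:} and \texttt{HEX:} examples above. Since the prefix only affects how the literal is read, the same integer entered in two bases compares equal.
\begin{alltt}
OCT: 17 .
\textbf{15}
HEX: ff .
\textbf{255}
BIN: 1010 HEX: a = .
\textbf{t}
\end{alltt}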
More information on integers can be found in \ref{integers}.
\subsection{Ratios}\label{ratio-literals}
\newcommand{\ratioglos}{\glossary{
name=ratio,
description={an instance of the \texttt{ratio} class, representing an exact ratio of two integers}}}
\ratioglos
The printed representation of a ratio is a pair of integers separated by a slash (\texttt{/}).
No intermediate whitespace is permitted. Either integer may be signed; however, the ratio will be normalized into a form where the denominator is positive and the greatest common divisor
of the two terms is 1.
\begin{alltt}
75/33
1/10
-5/-6
\end{alltt}
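
As a sketch of the normalization described above, and assuming the \texttt{.}~word prints ratios in the same slash notation, two of the literals above would be printed as follows:
\begin{alltt}
75/33 .
\textbf{25/11}
-5/-6 .
\textbf{5/6}
\end{alltt}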
More information on ratios can be found in \ref{ratios}.
\subsection{Floats}\label{float-literals}
\newcommand{\floatglos}{\glossary{
name=float,
description={an instance of the \texttt{float} class, representing an IEEE 754 double-precision floating point number}}}
\floatglos
A floating point literal consists of a significand with an optional decimal part, followed by
an optional exponent. Both the significand and the exponent may carry a sign prefix.
\begin{alltt}
10.5
-3.1456
7e13
1e-5
\end{alltt}
More information on floats can be found in \ref{floats}.
\subsection{Complex numbers}\label{complex-literals}
\newcommand{\complexglos}{\glossary{
name=complex,
description={an instance of the \texttt{complex} class, representing a complex number with real and imaginary components, where both components are real numbers}}}
\complexglos
\wordtable{
\vocabulary{syntax}
\parsingword{\pound\tto}{\#\tto{} \emph{real} \emph{imaginary} \ttc\#}
}
A complex number
is given by two components, a ``real'' part and an ``imaginary'' part. Each component
must be an integer, ratio or float.
\begin{verbatim}
#{ 1/2 1/3 }# ! the complex number 1/2+1/3i
#{ 0 1 }# ! the imaginary unit
\end{verbatim}
More information on complex numbers can be found in \ref{complex-numbers}.
\section{Literals}
Many different types of objects can be constructed at parse time via literal syntax. Numbers are a special case since support for reading them is built into the parser. All other literals are constructed via parsing words.
If a quotation contains a literal object, the same literal object instance is used each time the quotation executes; that is, literals are ``live''.
\subsection{Booleans}\label{boolean}
\newcommand{\boolglos}{
\glossary{
name=boolean,
description={an instance of the \texttt{boolean} class, either \texttt{f} or \texttt{t}. See generalized boolean}}
\glossary{
name=generalized boolean,
description={an object used as a truth value. The \texttt{f} object is false and anything else is true. See boolean}}
\glossary{
name=t,
description={the canonical truth value. The \texttt{t} class, whose sole instance is the \texttt{t} object. Note that the \texttt{t} class is not equal to the \texttt{t} object}}
\glossary{
name=f,
description={the canonical false value; anything else is true. The \texttt{f} class, whose sole instance is the \texttt{f} object. Note that the \texttt{f} class is not equal to the \texttt{f} object}}
}
\boolglos
Any Factor object may be used as a truth value in a conditional expression. The \texttt{f} object is false and anything else is true. The \texttt{f} object is also used to represent the empty list, as well as the concept of a missing value. The canonical truth value is the \texttt{t} object.
\wordtable{
\vocabulary{syntax}
\parsingword{f}{f}
\parsingword{t}{t}
}
Adds the \texttt{f} and \texttt{t} objects to the parse tree.
Note that the \texttt{f} parsing word and class are not the same as the \texttt{f} object. The former can be obtained by writing \texttt{\bs~f} inside a quotation, or \texttt{POSTPONE: f} inside a list that will not be evaluated.
\begin{alltt}
f \bs f = .
\textbf{f}
\end{alltt}
An analogous distinction holds for the \texttt{t} class and object.
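
By analogy with the example above, and assuming \texttt{\bs~t} yields the \texttt{t} word in the same way, comparing the \texttt{t} object with the \texttt{t} word would be expected to give false:
\begin{alltt}
t \bs t = .
\textbf{f}
\end{alltt}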
\subsection{Characters}\label{syntax:char}
\newcommand{\charglos}{\glossary{
name=character,
description={an integer whose value denotes a Unicode code point. Character values are limited to the range from $0$ to $2^{16}-1$ inclusive, however in a later release this can be upgraded to the full 21-bit Unicode space without requiring any changes to user code}}}
\charglos
Factor has no distinct character type; however, Unicode character value integers can be
read by specifying a literal character, or an escaped representation thereof.
\wordtable{
\vocabulary{syntax}
\parsingword{CHAR:}{CHAR: \emph{token}}
}
Adds the Unicode code point of the character represented by \emph{token} to the parse tree.
\newcommand{\escapeglos}{\glossary{
name=escape,
description={a sequence allowing a non-literal character to be inserted in a string. For a list of escapes, see \ref{escape}}}}
\escapeglos
If the token is a single-character string other than whitespace or backslash, the character is taken to be this token. If the token begins with a backslash, it denotes one of the following escape codes.
\begin{table}
\caption{Special character escape codes}
\label{escape}
\begin{tabular}{l|l}
Escape code&Character\\
\hline
\texttt{\bs{}\bs}&Backslash (\texttt{\bs})\\
\texttt{\bs{}s}&Space\\
\texttt{\bs{}t}&Tab\\
\texttt{\bs{}n}&Newline\\
\texttt{\bs{}r}&Carriage return\\
\texttt{\bs{}0}&Null byte (ASCII 0)\\
\texttt{\bs{}e}&Escape (ASCII 27)\\
\texttt{\bs{}"}&Double quote (\texttt{"})\\
\end{tabular}
\end{table}
Examples:
\begin{alltt}
CHAR: a .
\textbf{97}
CHAR: \bs{}0 .
\textbf{0}
CHAR: \bs{}n .
\textbf{10}
\end{alltt}
A Unicode character can be specified by its code point by writing \texttt{\bs{}u} followed by a four-digit hexadecimal number. That is, the following two expressions are equivalent:
\begin{alltt}
CHAR: \bs{}u0078
HEX: 78
\end{alltt}
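
Because the four digits are hexadecimal, the printed value is the decimal equivalent of the code point. As a sketch, assuming \texttt{.}~prints integers in base 10 as in the earlier examples:
\begin{alltt}
CHAR: \bs{}u0041 .
\textbf{65}
CHAR: \bs{}u0078 .
\textbf{120}
\end{alltt}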
While not useful for single characters, this syntax is also permitted inside strings.
\subsection{Strings}\label{string-literals}
\newcommand{\stringglos}{\glossary{
name=string,
description={an instance of the \texttt{string} class, representing an immutable sequence of characters}}}
\stringglos
\wordtable{
\vocabulary{syntax}
\parsingword{"}{"\emph{string}"}
}
Reads from the input string until the next occurrence of
\texttt{"}, and appends the resulting string to the parse tree. String literals cannot span multiple lines.
Strings containing
the \texttt{"} character and various other special characters can be read by
inserting escape sequences as described in \ref{syntax:char}.
\begin{alltt}
"Hello world" print
\textbf{Hello world}
\end{alltt}
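
As a sketch of escape sequences inside a string, assuming they behave as the escape code table in \ref{syntax:char} describes, a double quote can be embedded as follows:
\begin{alltt}
"She said \bs{}"hi\bs{}"" print
\textbf{She said "hi"}
\end{alltt}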
Strings are documented in \ref{strings}.
\subsection{Lists}\label{listsyntax}
\newcommand{\listglos}{\glossary{
name=list,
description={an instance of the \texttt{list} class, storing a sequence of elements as a chain of zero or more conses, where the car of each cons is an element, and the cdr is either \texttt{f} or another list}}
\glossary{name=proper list, description=see list}
}
\listglos
\wordtable{
\vocabulary{syntax}
\parsingword{[}{[}
\parsingword{]}{]}
}
Parses a list, whose elements are read between \texttt{[} and \texttt{]} and can include other lists.
\begin{verbatim}
[
"404" "responder" set
[ drop no-such-responder ] "get" set
]
\end{verbatim}
\newcommand{\consglos}{\glossary{
name=cons,
description={an instance of the \texttt{cons} class, storing an ordered pair of objects referred to as the car and the cdr}}}
\consglos
\wordtable{
\vocabulary{syntax}
\parsingword{[[}{[[ \emph{car} \emph{cdr} ]]}
}
Parses two components making up a cons cell. Note that the lists parsed with \texttt{[} and \texttt{]} are just a special case of \texttt{[[} and \texttt{]]}. The following two lines are equivalent.
\begin{alltt}
[ 1 2 3 ]
[[ 1 [[ 2 [[ 3 f ]] ]] ]]
\end{alltt}
The empty list is denoted by \texttt{f}, along with boolean falsity, and the concept of a missing value. The expression \texttt{[ ]} parses to the same object as \texttt{f}.
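
Since \texttt{[ ]} and \texttt{f} parse to the same object, comparing them would be expected to give true. A minimal sketch, using the \texttt{=} word as in \ref{boolean}:
\begin{alltt}
[ ] f = .
\textbf{t}
\end{alltt}
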
Lists are documented in \ref{lists}.
\subsection{Words}
\newcommand{\wrapglos}{
\glossary{
name=wrapper,
description={an instance of the \texttt{wrapper} class, holding a reference to a single object. When the evaluator encounters a wrapper, it pushes the wrapped object on the data stack. Wrappers are used to push words literally on the data stack}}}
\wrapglos
While words parse as themselves, a word occurring inside a quotation is executed when the quotation is called. Sometimes it is desirable to have a word be pushed on the data stack during the execution of a quotation. The canonical use-case for this is passing the word to the \verb|execute| word (\ref{quotations}), or alternatively, reflectively accessing word properties (\ref{word-props}).
\wordtable{
\vocabulary{syntax}
\parsingword{\bs}{\bs~\emph{word}}
}
Reads the next word from the input string and appends a \emph{wrapper} holding the word to the parse tree. When the evaluator encounters a wrapper, it pushes the wrapped object literally on the data stack.
Wrappers and the implementation of the \verb|\| word are discussed in detail in \ref{reading-ahead}.
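
As a sketch of the use case mentioned above, and assuming \texttt{execute} takes a word from the top of the data stack and executes it, the following pushes the \texttt{+} word literally and then executes it on the two numbers underneath:
\begin{alltt}
2 3 \bs + execute .
\textbf{5}
\end{alltt}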
\wordtable{
\vocabulary{syntax}
\parsingword{POSTPONE:}{POSTPONE: \emph{word}}
}
Reads the next word from the input string and appends the word to the parse tree, even if it is a parsing word. For an ordinary word \texttt{foo}, \texttt{POSTPONE: foo} and \texttt{foo} are equivalent; however, if \texttt{foo} is a parsing word, the latter will execute it at parse time, while the former will execute it at runtime. Usually used inside parsing words that wish to delegate some action to a further parsing word.
\begin{alltt}
: parsing1
"Parsing 1" print 2 swons ; parsing
: parsing2
"Parsing 2" print POSTPONE: parsing1 ; parsing
[ 1 parsing1 3 ] .
\textbf{Parsing 1}
\textbf{[ 1 2 3 ]}
[ 0 parsing2 2 4 ] .
\textbf{Parsing 2}
\textbf{Parsing 1}
\textbf{[ 0 2 2 4 ]}
\end{alltt}
Words are documented in \ref{words}.
Parsing words are documented in \ref{parsing-words}.
\subsection{Mutable literals}
\newcommand{\mutableglos}{\glossary{name=mutable object,
description=an object whose slot values can be changed}
\glossary{name=immutable object,
description=an object whose slot values cannot be changed}}
\mutableglos
Using mutable object literals in word definitions requires care, since if those objects
are mutated, the actual word definition will be changed, which is in most cases not what you would expect. Strings and lists are immutable; string buffers, vectors, hashtables and tuples are mutable.
\subsection{String buffers}\label{sbuf-literals}
\newcommand{\sbufglos}{\glossary{
name=string buffer,
description={an instance of the \texttt{sbuf} class, representing a mutable and growable sequence of characters}}
\glossary{name=sbuf, description=see string buffer}}
\sbufglos
\wordtable{
\vocabulary{syntax}
\parsingword{SBUF"}{SBUF" \emph{text}"}
}
Reads from the input string until the next occurrence of
\texttt{"}, converts the string to a string buffer, and appends it to the parse tree.
As with strings, the escape codes described in \ref{syntax:char} are permitted.
\begin{alltt}
SBUF" Hello world" >string print
\textbf{Hello world}
\end{alltt}
String buffers are documented in \ref{string-buffers}.
\subsection{Vectors}\label{vector-literals}
\newcommand{\vectorglos}{\glossary{
name=vector,
description={an instance of the \texttt{vector} class, storing a mutable and growable sequence of elements in a contiguous range of memory}}}
\vectorglos
\wordtable{
\vocabulary{syntax}
\parsingword{\tto}{\tto}
\parsingword{\ttc}{\ttc}
}
Parses a vector, whose elements are read between \texttt{\tto} and \texttt{\ttc}.
\begin{verbatim}
{ 3 "blind" "mice" }
\end{verbatim}
Vectors are documented in \ref{vectors}.
\subsection{Hashtables}
\newcommand{\hashglos}{\glossary{
name=hashtable,
description={an instance of the \texttt{hashtable} class, providing a mutable mapping of keys to values}}}
\hashglos
\wordtable{
\vocabulary{syntax}
\parsingword{\tto\tto}{\tto\tto}
\parsingword{\ttc\ttc}{\ttc\ttc}
}
Parses a hashtable. Elements between \texttt{\tto\tto} and \texttt{\ttc\ttc} must be cons cells, where the car is the key and the cdr is the value.
\begin{verbatim}
{{
[[ "red" [ 255 0 0 ] ]]
[[ "green" [ 0 255 0 ] ]]
[[ "blue" [ 0 0 255 ] ]]
}}
\end{verbatim}
Hashtables are documented in \ref{hashtables}.
\subsection{Tuples}
\newcommand{\tupleglos}{\glossary{
name=tuple,
description={an instance of a user-defined class whose metaclass is the \texttt{tuple} metaclass, storing a fixed set of elements in named slots, with optional delegation method dispatch semantics}}}
\tupleglos
\wordtable{
\vocabulary{syntax}
\parsingword{<<}{<<}
\parsingword{>>}{>>}
}
Parses a tuple. The tuple's class must follow \texttt{<<}. The element after that is always the tuple's delegate. Further elements until \texttt{>>} are specified according to the tuple's slot definition. If an insufficient number of elements is given, the remaining slots of the tuple are set to \verb|f|. Listing too many elements raises a parse error.
\begin{verbatim}
<< color f 255 0 0 >>
\end{verbatim}
Tuples are documented in \ref{tuples}.
\section{Comments}\label{comments}
\wordtable{
\vocabulary{syntax}
\parsingword{!}{!~\emph{remainder of line}}
}
The remainder of the input line is ignored if an exclamation mark (\texttt{!}) is read.
\begin{alltt}
! Note that the sequence union does not include lists,
! or user defined tuples that respond to the sequence
! protocol.
\end{alltt}
\wordtable{
\vocabulary{syntax}
\parsingword{hash!}{\#!~\emph{remainder of line}}
}
\newcommand{\doccommentglos}{\glossary{
name=documentation comment,
description={a comment describing the usage of a word. Delimited by the \texttt{\#"!} parsing word, they appear at the start of a word definition and are stored in the \texttt{""documentation""} word property}}}
\doccommentglos
Comments that begin with \texttt{\#!} are called \emph{documentation comments}.
A documentation comment has no effect on the generated parse tree, but if it is the first thing inside a word definition, the comment text is appended to the string stored in the word's \texttt{"documentation"} property.
\wordtable{
\vocabulary{syntax}
\parsingword{(}{( \emph{stack effect} )}
}
\glossary{
name=stack effect,
description={A string of the form \texttt{( \emph{inputs} -- \emph{outputs} )}, where the inputs and outputs are a whitespace-separated list of names or types. The top of the stack is the right-most token on both sides.}}
\newcommand{\stackcommentglos}{\glossary{
name=stack effect comment,
description={a comment describing the inputs and outputs of a word. Delimited by \texttt{(} and \texttt{)}, they appear at the start of a word definition and are stored in the \texttt{""stack-effect""} word property}}}
\stackcommentglos
Comments delimited by \texttt{(} and \texttt{)} are called \emph{stack effect comments}. By convention they are placed at the beginning of a word definition to document the word's inputs and outputs:
\begin{verbatim}
: push ( element sequence -- )
#! Push a value on the end of a sequence.
dup length swap set-nth ;
\end{verbatim}
A stack effect comment has no effect on the generated parse tree, but if it is the first thing inside a word definition, the word's \texttt{"stack-effect"} property is set to the comment text.
Word properties are described in \ref{word-props}.
\chapter{Data and control flow}
\section{Shuffle words}
\newcommand{\dsglos}{\glossary{
name=stack,
description=see data stack}
\glossary{
name=data stack,
description={the primary means of passing values between words}}}
\dsglos
Shuffle words rearrange items on the stack so that they are in the order expected by the next word in the quotation. Their behavior can be understood entirely in terms of their stack effects, which are given in Table~\ref{shuffles}.
\begin{table}
\caption{\label{shuffles}Shuffle words}
\wordtable{
\vocabulary{kernel}
\ordinaryword{drop}{drop ( x -- )}
\ordinaryword{2drop}{2drop ( x y -- )}
\ordinaryword{3drop}{3drop ( x y z -- )}
\ordinaryword{nip}{nip ( x y -- y )}
\ordinaryword{2nip}{2nip ( x y z -- z )}
\ordinaryword{dup}{dup ( x -- x x )}
\ordinaryword{2dup}{2dup ( x y -- x y x y )}
\ordinaryword{3dup}{3dup ( x y z -- x y z x y z )}
\ordinaryword{dupd}{dupd ( x y -- x x y )}
\ordinaryword{over}{over ( x y -- x y x )}
\ordinaryword{pick}{pick ( x y z -- x y z x )}
\ordinaryword{tuck}{tuck ( x y -- y x y )}
\ordinaryword{swap}{swap ( x y -- y x )}
\ordinaryword{2swap}{2swap ( x y z t -- z t x y )}
\ordinaryword{swapd}{swapd ( x y z -- y x z )}
\ordinaryword{rot}{rot ( x y z -- y z x )}
\ordinaryword{-rot}{-rot ( x y z -- z x y )}
}
\end{table}
Try to avoid the complex shuffle words such as \texttt{rot} and \texttt{2dup} as much as possible, for they make data flow harder to understand. If you find yourself using too many shuffle words, or you're writing
a stack effect comment in the middle of a compound definition to keep track of stack contents, it is
a good sign that the word should probably be factored into two or
more smaller words.
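As a quick illustration of the stack effects in the table, the following lines show the stack contents left by a few shuffle words, with the top of the stack on the right. Each line is assumed to start with an empty stack, the values are arbitrary, and the trailing comments are annotations rather than printed output.
\begin{verbatim}
1 2 swap      ! 2 1
1 2 over      ! 1 2 1
1 2 tuck      ! 2 1 2
1 2 3 rot     ! 2 3 1
1 2 3 -rot    ! 3 1 2
1 2 3 swapd   ! 2 1 3
1 2 3 pick    ! 1 2 3 1
\end{verbatim}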
\section{Quotations}\label{quotations}
\newcommand{\csglos}{\glossary{
name=return stack,
description=see call stack}
\glossary{
name=call stack,
description={holds quotations waiting to be called. When a quotation is called with \texttt{call}, or when a compound word is executed, the previous call frame is pushed on the call stack, and the new quotation becomes the current call frame}}}
\csglos
\newcommand{\cfglos}{\glossary{
name=call frame,
description=the currently executing quotation}}
\cfglos
\glossary{
name=evaluator,
description={a process by which code is evaluated, taking quotations as input. Two possibilities are the interpreter, which evaluates a quotation directly, and the compiler, which transforms quotations into machine code which evaluates the quotation when invoked}}
\glossary{
name=interpreter,
description=executes quotations by iterating them and recursing into nested definitions. see compiler}
\glossary{
name=quotation,
description=a list containing Factor code to be executed}
A Factor evaluator executes quotations. Quotations are lists, and since lists can contain any Factor object, they can contain words. It is words that give quotations their operational behavior, as you can see in the following description of the evaluator algorithm.
The Factor interpreter performs the steps below literally. The compiler generates machine code that performs the same steps more efficiently than the interpreter (\ref{compiler}).
\begin{itemize}
\item If the call frame is \texttt{f}, the top of the call stack is popped and becomes the new call frame.
\item If the car of the call frame is a word, the word is executed:
\begin{itemize}
\item If the word is a symbol, it is pushed on the data stack. See \ref{symbols}.
\item If the word is a compound definition, the current call frame is pushed on the call stack, and the new call frame becomes the word definition. See \ref{colondefs}.
\item If the word is compiled or primitive, the interpreter jumps to a machine code definition. See \ref{primitives}.
\item If the word is undefined, an error is raised. See \ref{deferred}.
\end{itemize}
\item If the car of the call frame is a wrapper, the wrapped object is pushed on the data stack.
\item Otherwise, the car of the call frame is pushed on the data stack.
\item The call frame is set to the cdr, and the loop continues.
\end{itemize}
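As an illustration, the steps above can be traced by hand for the quotation \texttt{[ 2 2 + 3 * ]}. The following sketch shows the call frame and the data stack before each iteration of the loop, assuming that \texttt{+} and \texttt{*} are compiled or primitive words. It is a hand-written trace, not output produced by Factor.
\begin{verbatim}
! call frame        data stack (top on the right)
! [ 2 2 + 3 * ]
! [ 2 + 3 * ]       2
! [ + 3 * ]         2 2
! [ 3 * ]           4
! [ * ]             4 3
! f                 12
\end{verbatim}
Once the call frame is \texttt{f}, the call stack is popped and evaluation resumes in the caller.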
\begin{figure}
\caption{Evaluator semantics}
\begin{center}
\scalebox{0.45}{
%BEGIN IMAGE
\epsfbox{interpreter.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
\glossary{name=combinator,
description=a word taking quotations or other words as input}
The following pair of words invokes the interpreter reflectively.
\wordtable{
\vocabulary{kernel}
\ordinaryword{call}{call ( quot -- )}
}
Pushes the current call frame on the call stack, and sets the call frame to the given quotation. Conceptually, this calls the quotation as if its definition were substituted at the location of the \texttt{call}.
\begin{alltt}
[ 2 2 + 3 * ] call .
\textbf{12}
\end{alltt}
\wordtable{
\vocabulary{words}
\ordinaryword{execute}{execute ( word -- )}
}
Executes a word, taking action based on the kind of definition the word has, as described above.
\begin{alltt}
: hello "Hello world" print ;
: twice dup execute execute ;
\bs hello twice
\textbf{Hello world}
\textbf{Hello world}
\end{alltt}
These words are used to implement \emph{combinators}, which are words that take code from the stack. Combinator definitions must be followed by the \texttt{inline} word to mark them as inline in order to compile; for example:
\begin{verbatim}
: keep ( x quot -- x | quot: x -- )
over >r call r> ; inline
\end{verbatim}
Word inlining is documented in \ref{declarations}.
\subsection{Tail call optimization}
\newcommand{\tailglos}{\glossary{
name=tail call,
description=the last call in a quotation}
\glossary{
name=tail call optimization,
description=the elimination of call stack pushes when making a tail call}}
When a call is made to a quotation from the last word in the call frame, there is no
purpose in pushing the empty call frame on the call stack. Therefore the last call in a quotation does not grow the call stack, and tail recursion executes in bounded space.
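For example, in the following sketch the recursive call is the last word of its quotation, and the \texttt{ifte} is the last word of the definition, so by the rule above the recursion should not grow the call stack. The word name \texttt{countdown} is hypothetical, and the input is assumed to be a non-negative integer.
\begin{verbatim}
: countdown ( n -- )
    #! Print n, n-1, ..., 1, one number per line.
    dup 0 = [ drop ] [ dup . 1 - countdown ] ifte ;

3 countdown    ! prints 3, 2 and 1 on separate lines
\end{verbatim}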
\subsection{Call stack manipulation}
The definition of evaluator semantics in \ref{quotations} stipulates that the top of the call stack is not accessed during the execution of a quotation; the call stack is only popped when the end of the quotation is reached. In effect, the call stack can be used as a temporary storage area, as long as pushes and pops are balanced out within a single quotation.
\wordtable{
\vocabulary{kernel}
\ordinaryword{>r}{>r ( x -- r:x )}
}
Moves the top of the data stack to the call stack.
\wordtable{
\vocabulary{kernel}
\ordinaryword{r>}{r> ( r:x -- x )}
}
Moves the top of the call stack to the data stack.
The top of the data stack is ``hidden'' between \texttt{>r} and \texttt{r>}.
\begin{alltt}
1 2 3 >r .s r>
\textbf{2
1}
\end{alltt}
It is very important to balance usages of \texttt{>r} and \texttt{r>} within a single quotation or word definition.
\begin{verbatim}
: the-good >r 2 + r> * ; ! Okay
: the-bad >r 2 + ; ! Runtime error
: the-ugly r> ; ! Runtime error
\end{verbatim}
The rule is that you must leave the call stack in the same state as you found it, so that when the current quotation finishes executing, the interpreter can return to the caller.
One exception is that when \texttt{ifte} occurs as the last word in a definition, values may be pushed on the call stack before the condition value is computed, as long as both branches of the \texttt{ifte} pop the values off the call stack before returning.
\begin{verbatim}
: foo ( m ? n -- m+n/n )
>r [ r> + ] [ drop r> ] ifte ; ! Okay
\end{verbatim}
\subsection{Quotation variants}
There are some words that combine shuffle words with \texttt{call}. They are useful in the implementation of higher-order words taking quotations as inputs.
\wordtable{
\vocabulary{kernel}
\ordinaryword{slip}{slip ( quot x -- x | quot:~-- )}
}
Call a quotation, while hiding the top of the stack. The implementation is as you would expect.
\begin{verbatim}
: slip ( quot x -- x | quot: -- )
>r call r> ; inline
\end{verbatim}
\wordtable{
\vocabulary{kernel}
\ordinaryword{2slip}{2slip ( quot x y -- x y | quot:~-- )}
}
Call a quotation, while hiding the top two stack elements.
\begin{verbatim}
: 2slip ( quot x y -- x y | quot: -- )
>r >r call r> r> ; inline
\end{verbatim}
\wordtable{
\vocabulary{kernel}
\ordinaryword{keep}{keep ( x quot -- x | quot:~x -- )}
}
Call a quotation with a value on the stack, restoring the value when the quotation returns.
\begin{verbatim}
: keep ( x quot -- x | quot: x -- )
over >r call r> ; inline
\end{verbatim}
\wordtable{
\vocabulary{kernel}
\ordinaryword{2keep}{2keep ( x y q -- x y | q:~x y -- )}
}
Call a quotation with a pair of values on the stack, restoring the values when the quotation returns.
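No implementation is listed for this word. One way it could be written, following the pattern of \texttt{keep}, is sketched below under the hypothetical name \texttt{my-2keep}; this is an illustration and not necessarily the definition found in the \texttt{kernel} vocabulary.
\begin{verbatim}
: my-2keep ( x y q -- x y | q: x y -- )
    over >r pick >r call r> r> ; inline
\end{verbatim}
Here \texttt{over >r} saves \texttt{y} and \texttt{pick >r} saves \texttt{x} on the call stack, so that after the quotation returns, the two \texttt{r>} calls restore them in their original order.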
\wordtable{
\vocabulary{kernel}
\ordinaryword{3keep}{3keep ( x y z q -- x y z | q:~x y z -- )}
}
Call a quotation with three values on the stack, restoring the values when the quotation returns.
\section{Conditionals}
The simplest style of a conditional form is the \texttt{ifte} word.
\wordtable{
\vocabulary{kernel}
\ordinaryword{ifte}{ifte ( cond true false -- )}
}
The \texttt{cond} is a generalized boolean. If it is \texttt{f}, the \texttt{false} quotation is called, and if \texttt{cond} is any other value, the \texttt{true} quotation is called. The condition flag is removed from the stack before either quotation executes.
Note that in general, both branches should have the same stack effect. Not only is this good style that makes the word easier to understand, but also unbalanced conditionals cannot be compiled (\ref{compiler}).
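A minimal sketch of a balanced conditional follows. The word name \texttt{describe-sign} is hypothetical; both branches take no inputs and leave no outputs, so they have the same stack effect.
\begin{verbatim}
: describe-sign ( n -- )
    0 < [ "negative" print ] [ "non-negative" print ] ifte ;

-5 describe-sign    ! prints negative
7 describe-sign     ! prints non-negative
\end{verbatim}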
\wordtable{
\vocabulary{kernel}
\ordinaryword{when}{when ( cond true -- | true:~-- )}
\ordinaryword{unless}{unless ( cond false -- | false:~-- )}
}
These two words are minor variations on \texttt{ifte} where only one branch is specified; the other is implicitly \texttt{[ ]}. They are implemented in the trivial way:
\begin{verbatim}
: when [ ] ifte ; inline
: unless [ ] swap ifte ; inline
\end{verbatim}
The \texttt{ifte} word removes the condition flag from the stack before calling either quotation. Sometimes this is not desirable, if the condition flag is serving a dual purpose as a value to be consumed by the \texttt{true} quotation. The \texttt{ifte*} word exists for this purpose.
\wordtable{
\vocabulary{kernel}
\ordinaryword{ifte*}{ifte*~( cond true false -- )}
\texttt{true:~cond --}\\
\texttt{false:~--}\\
}
If the condition is true, it is retained on the stack before the \texttt{true} quotation is called. Otherwise, the condition is removed from the stack and the \texttt{false} quotation is called. The following two lines are equivalent:
\begin{verbatim}
X [ Y ] [ Z ] ifte*
X dup [ Y ] [ drop Z ] ifte
\end{verbatim}
\wordtable{
\vocabulary{kernel}
\ordinaryword{when*}{when*~( cond true -- | true:~cond -- )}
\ordinaryword{unless*}{unless*~( cond false -- | false:~-- )}
}
These are variations of \texttt{ifte*} where one of the quotations is \texttt{[ ]}.
The following two lines are equivalent:
\begin{verbatim}
X [ Y ] when*
X dup [ Y ] [ drop ] ifte
\end{verbatim}
The following two lines are equivalent:
\begin{verbatim}
X [ Y ] unless*
X dup [ ] [ drop Y ] ifte
\end{verbatim}
There is one final conditional form that is used to implement the ``default value'' idiom.
\wordtable{
\vocabulary{kernel}
\ordinaryword{?ifte}{?ifte ( default cond true false -- )}
\texttt{true:~cond --}\\
\texttt{false:~default --}\\
}
If the condition is \texttt{f}, the \texttt{false} quotation is called with the \texttt{default} value on the stack. Otherwise, the \texttt{true} quotation is called with the condition on the stack. The following two lines are equivalent:
\begin{verbatim}
D X [ Y ] [ Z ] ?ifte
D X dup [ nip Y ] [ drop Z ] ifte
\end{verbatim}
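A minimal sketch of the ``default value'' idiom, derived from the equivalence above: with two empty quotations, \texttt{?ifte} leaves the condition on the stack if it is not \texttt{f}, and the default otherwise. The strings are arbitrary, and the comments describe the resulting stack rather than printed output.
\begin{verbatim}
"anonymous" "alice" [ ] [ ] ?ifte    ! leaves "alice"
"anonymous" f       [ ] [ ] ?ifte    ! leaves "anonymous"
\end{verbatim}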
\subsection{Boolean logic}
The \texttt{?}~word chooses between two values, rather than two quotations.
\wordtable{
\vocabulary{kernel}
\ordinaryword{?}{?~( cond true false -- true/false )}
}
It is implemented in the obvious way.
\begin{verbatim}
: ? ( cond t f -- t/f )
rot [ drop ] [ nip ] ifte ; inline
\end{verbatim}
Several words use \texttt{?}~to implement typical boolean algebraic operations.
\wordtable{
\vocabulary{kernel}
\ordinaryword{>boolean}{>boolean ( obj -- t/f )}
}
Convert a generalized boolean into a boolean. That is, \texttt{f} retains its value, whereas anything else becomes \texttt{t}.
\wordtable{
\vocabulary{kernel}
\ordinaryword{not}{not ( ?~-- ?~)}
}
Given \texttt{f}, outputs \texttt{t}, and on any other input, outputs \texttt{f}.
\wordtable{
\vocabulary{kernel}
\ordinaryword{and}{and ( ?~?~-- ?~)}
}
Outputs \texttt{t} if both of the inputs are true.
\wordtable{
\vocabulary{kernel}
\ordinaryword{or}{or ( ?~?~-- ?~)}
}
Outputs \texttt{t} if at least one of the inputs is true.
An alternative set of logical operations works bitwise on the individual bits of integers, rather than on generalized boolean truth values. They are documented in \ref{bitwise}.
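The boolean words of this section can be tried on a few arbitrary values, as sketched below. The comments describe the value left on the stack according to the descriptions above; they are not printed output.
\begin{verbatim}
t "yes" "no" ?    ! leaves "yes"
f "yes" "no" ?    ! leaves "no"
5 >boolean        ! leaves t
f >boolean        ! leaves f
f not             ! leaves t
5 not             ! leaves f
t f and           ! leaves f
t f or            ! leaves t
\end{verbatim}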
\section{Continuations}
\newcommand{\contglos}{
\glossary{name=continuation,
description=an object representing the future of the computation}}
\contglos
At any point in the execution of a Factor program, the \emph{current continuation} represents the future of the computation. This object can be captured with the \texttt{callcc0} and \texttt{callcc1} words.
\wordtable{
\vocabulary{kernel}
\ordinaryword{callcc0}{callcc0 ( quot -- )}
\texttt{quot:~cont --}\\
\texttt{cont:~--}\\
\ordinaryword{callcc1}{callcc1 ( quot -- )}
\texttt{quot:~cont --}\\
\texttt{cont:~obj --}\\
}
Calling one of these words calls the given quotation with the continuation on the stack. The continuation is itself a quotation, and calling it \emph{continues execution} at the point after the call to \texttt{callcc0} or \texttt{callcc1}. Essentially, a continuation is a snapshot of the four stacks that can be restored at a later time.
The difference between \texttt{callcc0} and \texttt{callcc1} lies in the continuation object. When \texttt{callcc1} is used, calling the continuation takes one value from the top of the data stack, and places it back on the \emph{restored} data stack. This allows idioms such as exception handling, co-routines and generators to be implemented via continuations.
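A minimal sketch of \texttt{callcc1}, based only on the description above: the quotation receives the continuation, places a value underneath it, and calls the continuation. Execution should then resume after \texttt{callcc1} with that value on the restored data stack, and the rest of the quotation is abandoned.
\begin{verbatim}
[ 5 swap call "not reached" print ] callcc1
! leaves 5 on the data stack; the string is never printed
\end{verbatim}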
\subsection{Handling exceptional situations}\label{exceptions}
\glossary{name=exception,
description=an object representing an exceptional situation that has been detected}
Support for handling exceptional situations such as bad user input, implementation bugs, and input/output errors is provided by a pair of words, \texttt{throw} and \texttt{catch}.
\wordtable{
\vocabulary{errors}
\ordinaryword{throw}{throw ( exception -- )}
}
Raises an exception. Execution does not continue at the point after the \texttt{throw} call. Rather, the innermost catch block is invoked, and execution continues at that point. Passing \texttt{f} as an exception will cause \texttt{throw} to do nothing.
\wordtable{
\vocabulary{errors}
\ordinaryword{catch}{catch ( try handler -- )}
\texttt{handler:~exception/f -- }\\
}
An exception handler is established, and the \texttt{try} quotation is called.
If the \texttt{try} quotation throws an error and no nested \texttt{catch} is established, the following sequence of events takes place:
\begin{itemize}
\item the stacks are restored to their state prior to the \texttt{catch} call,
\item the exception is pushed on the data stack,
\item the \texttt{handler} quotation is called.
\end{itemize}
If the \texttt{try} quotation completes successfully, the stacks are \emph{not} restored. The \texttt{f} object is pushed, and the \texttt{handler} quotation is called.
A common idiom is that the \texttt{catch} block cleans up from the error in some fashion, then passes it on to the next-innermost catch block. The following word is used for this purpose.
\wordtable{
\vocabulary{errors}
\ordinaryword{rethrow}{rethrow ( exception -- )}
}
Raises an exception, without saving the current stacks for post-mortem inspection. This is done so that inspecting the error stacks sheds light on the original cause of the exception, rather than the point where it was rethrown.
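The clean-up idiom can be sketched as follows. The word name \texttt{cleanup-on-error} and the message are hypothetical. The handler uses \texttt{when*} so that the clean-up code runs and the exception is passed on only if an exception was actually thrown; on success the \texttt{f} pushed by \texttt{catch} is simply discarded.
\begin{verbatim}
: cleanup-on-error ( try -- )
    [ [ "Cleaning up" print rethrow ] when* ] catch ;
\end{verbatim}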
Here is a simple example of a word that attempts to convert a string representing a hexadecimal number into an integer. Instead of halting execution when the string is not valid, it simply outputs \texttt{f}.
\begin{verbatim}
: catch-hex> ( str -- n/f )
[ hex> ] [ [ drop f ] when ] catch ;
\end{verbatim}
Exception handling is implemented using a \emph{catch stack}. The \texttt{catch} word pushes the current continuation on the catch stack, and \texttt{throw} calls the continuation at the top of the catch stack with the raised exception.
\glossary{name=catch stack,
description={a stack of exception handler continuations, pushed and popped by \texttt{catch}}}
\begin{figure}
\caption{Exception handling example}
The following diagram illustrates the nesting of exception handlers on the catch stack immediately before the call to \texttt{throw} in \texttt{foe}.
\begin{verbatim}
: foe
[
"Fatal error -- hard disk on fire!" throw
] [
"foe's catch block" print rethrow
] catch ;
: fie [ foe ] [ "fie's catch block" print rethrow ] catch ;
: flap [ fie ] [ [ "Exception: " write . ] when* ] catch ;
\end{verbatim}
\begin{center}
\scalebox{0.5}{
%BEGIN IMAGE
\epsfbox{catchstack.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
\subsection{Multitasking}\label{threads}
Factor implements co-operative multitasking, where the thread of control switches between tasks at explicit calls to \texttt{yield}, as well as when blocking I/O is performed. Multitasking is implemented via continuations.
\wordtable{
\vocabulary{threads}
\ordinaryword{in-thread}{in-thread ( quot -- )}
}
Calls \texttt{quot} in a co-operative thread. The new thread begins executing immediately, and the current thread resumes when the quotation yields, either from blocking
I/O or an explicit call to \texttt{yield}. This is implemented by adding the current continuation to the run queue, then calling \texttt{quot}, and finally executing \texttt{stop} after \texttt{quot} returns.
\wordtable{
\vocabulary{threads}
\ordinaryword{yield}{yield ( -- )}
}
Add the current continuation to the end of the run queue, and call the continuation at the front of the run queue.
\wordtable{
\vocabulary{threads}
\ordinaryword{sleep}{sleep ( ms -- )}
}
Pauses the current thread for \verb|ms| milliseconds. Other threads and I/O operations may execute in the meantime. The multitasker guarantees that the thread will not be woken up before \verb|ms| milliseconds have passed; however, it does not guarantee that the thread will not be woken up late. Indeed, since multitasking is co-operative, a non-yielding thread can delay other sleeping threads indefinitely.
\wordtable{
\vocabulary{threads}
\ordinaryword{stop}{stop ( -- )}
}
Call the continuation at the front of the run queue, without saving the current continuation. In effect, this stops the current thread.
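The following sketch shows two tasks handing control to each other with \texttt{yield}. The word name \texttt{chatter} and the messages are hypothetical. Going by the descriptions above, the new thread should print its first message, the main code should then resume and print its own, and the final \texttt{yield} should let the thread finish; since \texttt{print} performs I/O, which may itself cause a task switch, the exact interleaving may differ in practice.
\begin{verbatim}
: chatter ( -- )
    "thread: one" print yield "thread: two" print ;

[ chatter ] in-thread
"main: resumed" print
yield
\end{verbatim}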
\subsection{Interpreter state}
The current state of the interpreter is determined by the contents of the four stacks. The words for getting and setting stack contents are the primitive building blocks for continuations, and in turn for abstractions such as exception handling and multitasking.
\wordtable{
\vocabulary{kernel}
\ordinaryword{datastack}{datastack ( -- vector )}
\ordinaryword{set-datastack}{set-datastack ( vector -- )}
}
Save and restore the data stack contents. As an example, here is a word that executes a quotation and restores the data stack to its previous state:
\begin{verbatim}
: keep-datastack ( quot -- )
datastack slip set-datastack drop ;
\end{verbatim}
Note that the \texttt{drop} call is made to remove the original quotation from the stack.
\wordtable{
\vocabulary{kernel}
\ordinaryword{callstack}{callstack ( -- vector )}
\ordinaryword{set-callstack}{set-callstack ( vector -- )}
}
Save and restore the call stack contents. The call stack does not include the currently executing quotation that made the call to \texttt{callstack}, since the current quotation is held in the call frame -- \ref{quotations} has details. Similarly, calling \texttt{set-callstack} will continue executing the current quotation until it returns, at which point control transfers to the quotation at the top of the new call stack.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{namestack}{namestack ( -- list )}
\ordinaryword{set-namestack}{set-namestack ( list -- )}
}
Save and restore the name stack, used for dynamic variable bindings. See \ref{namespaces}.
\wordtable{
\vocabulary{errors}
\ordinaryword{catchstack}{catchstack ( -- list )}
\ordinaryword{set-catchstack}{set-catchstack ( list -- )}
}
Save and restore the catch stack, used for exception handling. See \ref{exceptions}.
\chapter{Words}\label{words}
\wordglos
\vocabglos
\newcommand{\definingwordglos}{\glossary{name=defining word,
description=a word that adds definitions to the dictionary}}
\glossary{name=dictionary,
description=the collection of vocabularies making up the code in the Factor image}
\wordtable{
\vocabulary{words}
\classword{word}
}
Words are the fundamental unit of code in Factor, analogous to functions or procedures in other languages. Words are also objects, and this concept forms the basis for Factor's meta-programming facilities. A word consists of several parts:
\begin{itemize}
\item a word name,
\item a vocabulary name,
\item a definition, specifying the behavior of the word when executed,
\item a set of word properties, including documentation strings and other meta-data.
\end{itemize}
\wordtable{
\vocabulary{words}
\ordinaryword{word?}{word?~( object -- ?~)}
}
Tests if the \texttt{object} is a word.
\wordtable{
\vocabulary{words}
\ordinaryword{word-name}{word-name ( word -- string )}
\ordinaryword{word-vocabulary}{word-vocabulary ( word -- string )}
}
A pair of words for obtaining a word's name and vocabulary.
\wordtable{
\vocabulary{words}
\ordinaryword{word-sort}{word-sort ( list -- list )}
}
Sort a list of words by name.
\section{Vocabularies}
2005-04-24 20:57:37 -04:00
\wordtable{
\vocabulary{words}
\symbolword{vocabularies}
2005-04-24 20:57:37 -04:00
}
\glossary{name=interned word,
description={a word that is a member of the vocabulary named by its vocabulary slot. Interned words are created by calls to \verb|create|}}
Words are organized into named vocabularies, stored in the global \texttt{vocabularies} variable (\ref{namespaces}). A word is said to be \emph{interned} if it is a member of the vocabulary named by its vocabulary slot. Otherwise, the word is \emph{uninterned}.
Parsing words add definitions to the current vocabulary. When a source file is being parsed, the current vocabulary is initially set to \texttt{scratchpad}. The current vocabulary may be changed with the \verb|IN:| parsing word (\ref{vocabsearch}).
\subsection{Searching for words}
Words whose names are known at parse time -- that is, most words making up your program -- can be referenced by stating their name. However, the parser itself, and sometimes code you write, will need to look up words dynamically.
\wordtable{
\vocabulary{words}
\ordinaryword{lookup}{lookup ( name vocabulary -- word/f )}
}
Searches for a word named \verb|name| in the vocabulary named \verb|vocabulary|. If no such word exists, pushes \texttt{f}.
\wordtable{
\vocabulary{words}
\ordinaryword{search}{search ( name vocabs -- word/f )}
}
The \texttt{vocabs} parameter is a sequence of vocabulary names. If a word with the given name is found, it is pushed on the stack; otherwise, \texttt{f} is pushed.
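A short sketch of dynamic lookup follows. It assumes that \texttt{reverse} is defined in the \texttt{sequences} vocabulary, and that no word named \texttt{no-such-word} exists there; the list of vocabularies passed to \texttt{search} is an arbitrary example. The comments describe the value left on the stack.
\begin{verbatim}
"reverse" "sequences" lookup         ! leaves the word reverse
"no-such-word" "sequences" lookup    ! leaves f
"reverse" { "kernel" "sequences" } search
\end{verbatim}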
\subsection{Creating words}\label{creating-words}
\wordtable{
\vocabulary{words}
\ordinaryword{create}{create ( name vocabulary -- word )}
}
Creates a new word \texttt{name} in \texttt{vocabulary}. If the vocabulary already contains a word with this name, the existing word is returned.
\wordtable{
\vocabulary{words}
\ordinaryword{create-in}{create-in ( name -- word )}
}
Creates a new word \texttt{name} in the current vocabulary. This word is intended to be called from parsing words (\ref{parsing-words}).
\newcommand{\uninternedglos}{
\glossary{name=uninterned word,
description={a word whose vocabulary slot is either set to \texttt{f}, or that does not belong to the vocabulary named by its vocabulary slot. Uninterned words are created by calls to \texttt{gensym} and \texttt{<word>}, and interned words can become uninterned via calls to \texttt{forget}}}}
\uninternedglos
\wordtable{
\vocabulary{words}
\ordinaryword{gensym}{gensym ( -- word )}
}
Creates an uninterned word that is not equal to any other word in the system, either stored in a vocabulary, or resulting from prior or future calls to \verb|gensym|. Gensyms have an automatically-generated name based on a prefix and an incrementing counter, for debugging:
\begin{alltt}
gensym .
\textbf{G:260561}
gensym .
\textbf{G:260562}
\end{alltt}
Gensyms are often used as placeholders and representatives that must be unique. For example, the compiler uses gensyms internally to label sections of assembly code.
\wordtable{
\vocabulary{words}
\ordinaryword{<word>}{<word> ( name vocabulary -- word )}
}
Creates an uninterned word whose name and vocabulary slots have the given values; however, the word is not actually entered into this vocabulary. This word is used to implement \verb|create| and \verb|gensym|, and it is not usually used directly, since it can give confusing results:
\begin{alltt}
"reverse" "sequences" <word> dup .
\textbf{reverse}
"reverse" "sequences" lookup dup .
\textbf{reverse}
eq? .
\textbf{f}
\end{alltt}
\section{Word definition}
There are two ways to create a word definition:
\begin{itemize}
\item Using parsing words at parse time,
\item Using defining words at run-time. This is a more dynamic feature that can be used to implement code generation and such, and in fact parse-time defining words are implemented in terms of run-time defining words.
\end{itemize}
\subsection{Compound definitions}\label{colondefs}
\newcommand{\colonglos}{\glossary{
name=compound definition,
description=a word defined to execute a quotation consisting of existing words}
\glossary{
name=colon definition,
description=see compound definition}}
\colonglos
A compound definition associates a word name with a quotation that is called when the word is executed.
\wordtable{
\vocabulary{syntax}
\parsingword{:}{:~\emph{name} \emph{definition} ;}
}
A word \texttt{name} is created in the current vocabulary, and is associated with \texttt{definition}.
\begin{verbatim}
: ask-name ( -- name )
"What is your name? " write readln ;
: greet ( name -- )
"Greetings, " write print ;
: friend ( -- )
ask-name greet ;
\end{verbatim}
By convention, the word name should be followed by a stack effect comment, and for more complex definitions, a documentation comment; see \ref{comments}.
\wordtable{
\vocabulary{words}
\ordinaryword{define-compound}{define-compound ( word quotation -- )}
}
Defines \texttt{word} to call the \texttt{quotation} when executed.
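The following sketch shows the run-time route. It assumes a listener whose search path includes the \texttt{scratchpad} vocabulary, and the word name \texttt{my-double} is hypothetical. The intended effect is the same as parsing \texttt{:~my-double 2 * ;}:
\begin{alltt}
"my-double" "scratchpad" create [ 2 * ] define-compound
5 my-double .
\textbf{10}
\end{alltt}
Note the argument order: the word is pushed first and the quotation second, matching the stack effect \texttt{( word quotation -- )}.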
\wordtable{
\vocabulary{words}
\ordinaryword{compound?}{compound?~( object -- ?~)}
}
Tests if the \texttt{object} is a compound word definition.
\wordtable{
\vocabulary{words}
\classword{compound}
}
The class that all compound words are an instance of.
\subsection{Symbols}\label{symbols}
\newcommand{\symbolglos}{\glossary{
name=symbol,
description={a word defined to push itself on the stack when executed, created by the \texttt{SYMBOL:}~parsing word}}}
\symbolglos
\wordtable{
\vocabulary{syntax}
\parsingword{SYMBOL:}{SYMBOL:~\emph{name}}
}
A word \texttt{name} is created in the current vocabulary that pushes itself on the stack when executed. Symbols are used to identify variables (\ref{namespaces}) as well as to store miscellaneous data in their word properties (\ref{word-props}).
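For instance, in the sketch below the hypothetical symbol \texttt{my-symbol} pushes itself, so printing the top of the stack shows the word's own name:
\begin{alltt}
SYMBOL: my-symbol
my-symbol .
\textbf{my-symbol}
my-symbol symbol? .
\textbf{t}
\end{alltt}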
\wordtable{
\vocabulary{words}
\ordinaryword{define-symbol}{define-symbol ( word -- )}
}
Defines \texttt{word} to push itself on the data stack when executed.
\wordtable{
\vocabulary{words}
\ordinaryword{symbol?}{symbol?~( object -- ?~)}
}
Tests if the \texttt{object} is a symbol.
\wordtable{
\vocabulary{words}
\classword{symbol}
}
The class that all symbols are an instance of.
\subsection{Primitives}\label{primitives}
\newcommand{\primglos}{\glossary{
name=primitive,
description=a word implemented as native code in the Factor runtime}}
\primglos
Executing a primitive invokes native code in the Factor runtime. Primitives cannot be defined through Factor code. Compiled definitions behave similarly to primitives in that the interpreter jumps to native code upon encountering them.
\wordtable{
\vocabulary{words}
\ordinaryword{primitive?}{primitive?~( object -- ?~)}
}
Tests if the \texttt{object} is a primitive.
\wordtable{
\vocabulary{words}
\classword{primitive}
}
The class that all primitives are an instance of.
\subsection{Deferred words and mutual recursion}\label{deferred}
\glossary{
name=deferred word,
description={a word without a definition, created by the \texttt{DEFER:}~parsing word}}
Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. Mutually-recursive pairs of words can be implemented by \emph{deferring} one of the words in the pair so that the second word in the pair can parse, then by replacing the deferred definition with a real one.
A demonstration of the idiom:
\begin{verbatim}
DEFER: foe
: fie ... foe ... ;
: foe ... fie ... ;
\end{verbatim}
\wordtable{
\vocabulary{syntax}
\parsingword{DEFER:}{DEFER:~\emph{name}}
}
Creates a word \texttt{name} in the current vocabulary that simply raises an error when executed. Usually, the word will be replaced with a real definition later.
\wordtable{
\vocabulary{words}
\ordinaryword{undefined?}{undefined?~( object -- ?~)}
}
Tests if the \texttt{object} is an undefined (deferred) word.
\wordtable{
\vocabulary{words}
\classword{undefined}
}
The class that all undefined words are an instance of.
\subsection{Undefining words}
\wordtable{
\vocabulary{syntax}
\parsingword{FORGET:}{FORGET:~\emph{name}}
}
Removes the word \texttt{name} from its vocabulary. Existing definitions that reference the word will continue to work, but newly-parsed occurrences of the word will not locate the forgotten definition. No exception is thrown if no such word exists.
\uninternedglos
\wordtable{
\vocabulary{words}
\ordinaryword{forget}{forget ( word -- )}
}
Removes the word from its vocabulary. The word becomes uninterned. The parsing word \texttt{FORGET:} is implemented using this word.
\wordtable{
\vocabulary{words}
\ordinaryword{interned?}{interned?~( word -- ?~)}
}
Tests if the word is interned. If the word's vocabulary slot is \verb|f|, immediately outputs \verb|f|; otherwise, tests if the word with the same name in that vocabulary is actually the given word.
\begin{alltt}
"interning" "scratchpad" create
dup interned? .
\textbf{t}
dup forget
interned? .
\textbf{f}
\end{alltt}
\subsection{Declarations}\label{declarations}
A compound or generic word (\ref{generic}) can be given special behavior with one of the below parsing words. They all act on the most recently-defined word by setting to \verb|t| a word property keyed by the string naming the declaration word.
The first declaration specifies the time when a word runs. It affects both interpreted and compiled definitions.
\wordtable{
\vocabulary{syntax}
\parsingword{parsing}{parsing}
}
Parsing words run at parse time. See \ref{parsing-words}.
The remaining declarations only affect compiled definitions. They do not change evaluation semantics of a word, but instead declare that the word follows a certain contract, and thus may be compiled differently.
If a generic word is defined as \verb|flushable| or \verb|foldable|, all methods must satisfy the contract; otherwise, unpredictable behavior will occur.
\glossary{name=inline word,
description={calls to inline words are replaced with the inline word's body by the compiler. Inline words are declared via the \verb|inline| parsing word}}
\wordtable{
\vocabulary{syntax}
\parsingword{inline}{inline}
}
The compiler copies the definitions of inline words directly into the word being compiled. Combinators must be inlined in order to compile. For any other word, inlining is merely an optimization; see \ref{compiler}. Inlining does not affect the execution of the word in the interpreter.
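As described at the start of this section, a declaration is recorded as a word property on the most recently defined word. A minimal sketch, using a hypothetical word \texttt{square}, of declaring a word inline and then checking the property:
\begin{alltt}
: square ( x -- y ) dup * ; inline
\bs square "inline" word-prop .
\textbf{t}
\end{alltt}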
\glossary{name=flushable word,
description={calls to flushable words may be removed from compiled code if their outputs are subsequently discarded by calls to \verb|drop|. Flushable words are declared via the \verb|flushable| parsing word}}
\wordtable{
\vocabulary{syntax}
\parsingword{flushable}{flushable}
}
Calls to flushable words may be removed from compiled code if their outputs are subsequently discarded by calls to \verb|drop|. Flushable words must be side-effect-free; that is, their outputs must solely depend on inputs, and they must not modify their inputs, or any other object visible outside the dynamic extent of the flushable word. Note that if a word with no outputs is declared flushable, calls to it are \emph{never} compiled in.
\glossary{name=foldable word,
description={calls to foldable words may be evaluated at compile time if all inputs are literal. Foldable words are declared via the \verb|foldable| parsing word}}
\wordtable{
\vocabulary{syntax}
\parsingword{foldable}{foldable}
}
Foldable words may be evaluated at compile time if all inputs are literal. Foldable words must satisfy a very strong contract:
\begin{itemize}
\item foldable words must satisfy the contract of flushable words,
\item foldable words must halt,\footnote{Of course, this cannot be guaranteed in the general case, but for example, a word computing a series until it converges should not be foldable, since compilation will not halt in the event the series does not converge.}
\item inputs and outputs of foldable words must be immutable objects.
\end{itemize}
The last restriction ensures that words like \verb|clone| do not satisfy the foldable word contract. Indeed, \verb|clone| is flushable, however it may output a mutable object if its input is mutable, and so it is undesirable to evaluate it at compile-time, since the following two definitions have differing semantics:
\begin{verbatim}
: foe { } ;
: foe { } clone ;
\end{verbatim}
Most mathematical operations are foldable. For example, \verb|2 2 +| is compiled to a literal \verb|4|, because \verb|+| is foldable.
\section{Word properties}\label{word-props}
\glossary{name=word property,
description={a name/value pair stored in a word's properties}}
\glossary{name=word properties,
description={a hashtable associated with each word, storing miscellaneous properties}}
Each word has an associated hashtable of properties. Conventionally, the property names are strings, but nothing requires that this be so.
\wordtable{
\vocabulary{words}
\ordinaryword{word-prop}{word-prop ( word name -- value )}
\ordinaryword{set-word-prop}{set-word-prop ( value word name -- )}
}
Retrieve and store word properties. Note that the stack effect is designed so that it is most convenient when \texttt{name} is a literal that is pushed on the stack right before executing these words. This is usually the case.
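A short sketch of storing and retrieving a property, using a hypothetical symbol \texttt{settings} and a hypothetical property name \texttt{"answer"}. Because a symbol pushes itself when executed, it can stand in for the \texttt{word} argument directly, and the literal property name is pushed last:
\begin{alltt}
SYMBOL: settings
42 settings "answer" set-word-prop
settings "answer" word-prop .
\textbf{42}
\end{alltt}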
The following properties are commonly set:
\begin{description}
\item[\texttt{"parsing"}, \texttt{"inline"}, \texttt{"flushable"}, \texttt{"foldable"}] declarations (see \ref{declarations})
\item[\texttt{"methods"}] only defined on generic words; a hashtable mapping classes to quotations (see \ref{generic})
\item[\texttt{"combination"}] only defined on generic words; see \ref{combinations}
\item[\texttt{"file"}] The source file storing the word definition
\item[\texttt{"line"}] The line number in the source file storing the word definition
\item[\texttt{"col"}] The column number in the source file storing the word definition
\end{description}
\wordtable{
\vocabulary{words}
\ordinaryword{word-props}{word-props ( word -- hashtable )}
\ordinaryword{set-word-props}{set-word-props ( hashtable word -- )}
}
Retrieve and store the entire set of word properties.
\section{Low-level details}
The actual behavior of a word when executed is determined by the values of two slots:
\begin{itemize}
\item The primitive number
\item The primitive parameter
\end{itemize}
The primitive number is an index into an array of native functions in the Factor runtime.
Some frequently-occurring primitive numbers:
\begin{description}
\item[0] deferred word,
\item[1] compound definition -- executes the quotation stored in the parameter slot,
\item[2] symbol -- pushes the value of the parameter slot,
\item[3 onwards] the actual set of primitives, of which there are around 170.
\end{description}
The words outlined in this section should not be used in ordinary code.
\wordtable{
\vocabulary{words}
\ordinaryword{word-primitive}{word-primitive ( word -- n )}
\ordinaryword{set-word-primitive}{set-word-primitive ( n word -- )}
}
Retrieves and stores a word's primitive number. Note that changing the primitive number does not update the execution token, and the word will still call the old definition until a subsequent call to \verb|update-xt|.
\wordtable{
\vocabulary{words}
\ordinaryword{word-def}{word-def ( word -- object )}
\ordinaryword{set-word-def}{set-word-def ( object word -- )}
}
Retrieves and stores a word's primitive parameter. This parameter is only used if the primitive number is 1 (compound definitions) or 2 (symbols). Note that to define a compound definition or symbol, you must use \texttt{define-compound} or \texttt{define-symbol}, since \texttt{set-word-def} does not update the cross-referencing of word dependencies.
\wordtable{
\vocabulary{words}
\ordinaryword{word-xt}{word-xt ( word -- n )}
\ordinaryword{set-word-xt}{set-word-xt ( n word -- )}
}
Retrieves and stores a word's \emph{execution token}.
This is an even lower-level facility for working with the address containing native code to be invoked when the word is executed. The compiler sets the execution token to a location in memory containing generated code.
\wordtable{
\vocabulary{words}
\ordinaryword{update-xt}{update-xt ( word -- )}
}
Updates a word's execution token according to its primitive number. When called with a compiled word, has the effect of decompiling the word.
\wordtable{
\vocabulary{words}
\ordinaryword{recrossref}{recrossref ( word -- )}
}
Updates the cross-referencing database, which you will probably need to do if you mess around with any of the words in this section -- assuming Factor does not crash first, that is.
\chapter{Objects}
\glossary{name=object,
description=a datum that can be identified}
\mutableglos
Everything in Factor is an object, where an object is a collection of slots. Each object has a unique identity, and references to objects are passed by value on the stack. It is possible to have two references to the same object, and if the object is mutated through one reference, the changes will be visible through the other reference. Not all objects are mutable; the documentation for each class details if its instances are mutable or not.
\section{Identity and equality}\label{equality}
\glossary{name=equal,
description={two objects are equal if they have the same class and if their slots are equal, or alternatively, if both are numbers that denote the same value}}
There are two distinct notions of ``sameness'' when it comes to objects. You can test if two references point to the same object, or you can test if two objects are equal in some sense, usually by being instances of the same class, and having equal slot values. Both notions of equality are equivalence relations in the mathematical sense; that is, they obey the following axioms:
\begin{itemize}
\item They are reflexive: $x\sim x$
\item They are symmetric: $x\sim y$ if and only if $y\sim x$
\item They are transitive: if $x\sim y$ and $y\sim z$, then $x\sim z$
\end{itemize}
\wordtable{
\vocabulary{kernel}
\ordinaryword{eq?}{eq?~( object object -- ?~)}
}
Output \texttt{t} if two references point to the same object, and \texttt{f} otherwise.
\wordtable{
\vocabulary{kernel}
\genericword{=}{= ( object object -- ?~)}
}
Output \texttt{t} if two objects are equal, and \texttt{f} otherwise. The precise meaning of equality depends on the object's class; however, two objects are usually equal if their slot values are equal. If two objects are equal, they have the same printed representation, although the converse is not always true. In particular:
\begin{itemize}
\item If no more specific method is defined, \texttt{=} calls \texttt{eq?}.
\item Two numbers are equal if they have the same numerical value after being upgraded to the highest type of the two (\ref{number-protocol}).
\item Two lists, vectors, strings, string buffers or arrays are equal if they have the same length and equal elements.
\item Two hashtables are equal if they hold the same set of key/value pairs.
\item Two tuples are equal if they are of the same class and their slots are equal.
\item Two words are equal if they are the same object.
\end{itemize}
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\wordtable{
\vocabulary{kernel}
\genericword{clone}{clone ( object -- object )}
}
Make a fresh object that is equal to the given object. This is not guaranteed to actually copy the object; it does nothing with immutable objects, and does not copy words either. However, sequences and tuples can be cloned to obtain a new shallow copy of the original.
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\section{Generic words and methods}\label{generic}
\glossary{name=generic word,
description={a word defined using the \texttt{GENERIC:}~parsing word. The behavior of generic words depends on the class of the object at the top of the stack. A generic word is composed of methods, where each method is specialized on a class}}
\glossary{name=method,
description={gives a generic word behavior when the top of the stack is an instance of a specific class}}
Sometimes you want a word's behavior to depend on the class of the object at the top of the stack. Implementing such a word as a set of nested conditional tests is undesirable, since it leads to unnecessary coupling: adding support for a new class requires modifying the original definition of the word.
A generic word is a word whose behavior depends on the class of the
object at the top of the stack, however this behavior is defined in a
decentralized manner.
\wordtable{
\vocabulary{syntax}
\parsingword{GENERIC:}{GENERIC: \emph{word}}
}
Defines a new generic word. Initially, it contains no methods, and thus will raise an error when called.
\wordtable{
\vocabulary{syntax}
\parsingword{M:}{M: \emph{class} \emph{word} \emph{definition} ;}
}
Defines a method, that is, a behavior for the generic \texttt{word} specialized on instances of \texttt{class}. Each method definition
can potentially occur in a different source file.
\subsection{Method ordering}\label{method-order}
If two classes have a non-empty intersection, there is no guarantee that one is a subclass of the other. This means there is no canonical linear ordering of classes. The methods of a generic word are linearly ordered, though, and you can inspect this order using the \texttt{order} word.
Suppose you have the following definitions:
\begin{verbatim}
GENERIC: foo
M: integer foo 1 + ;
M: number foo 1 - ;
M: object foo dup 2list ;
\end{verbatim}
Since the \texttt{integer} class is strictly smaller than the \texttt{number} class, which in turn is strictly smaller than the \texttt{object} class, the ordering of methods is not surprising in this case:
\begin{alltt}
\bs foo order .
\textbf{[ object number integer ]}
\end{alltt}
However, suppose we had the following set of definitions:
\begin{verbatim}
GENERIC: describe
M: general-t describe drop "a true value" print ;
M: general-list describe drop "a list" print ;
M: object describe drop "an object" print ;
\end{verbatim}
Neither \texttt{general-t} nor \texttt{general-list} contains the other, and their intersection is the non-empty \texttt{cons} class. So the generic word system will place \texttt{object} first in the method order; however, either \texttt{general-t} or \texttt{general-list} may come next, and the choice is effectively arbitrary, since it depends on hashing:
\begin{alltt}
\bs describe order .
\textbf{[ object general-list general-t ]}
\end{alltt}
Therefore, the outcome of calling \texttt{describe} with a cons cell is undefined.
\section{Classes}
\glossary{name=class,
description=a set of objects defined in a formal manner. Methods specialize generic words on classes}
\glossary{name=metaclass,
description={a set of classes sharing common traits. Examples include \texttt{builtin}, \texttt{union}, and \texttt{tuple}}}
\wordtable{
\vocabulary{generic}
\classword{object}
}
Every object is a member of the \texttt{object} class. If you provide a method specializing
on the \texttt{object} class for some generic word, the method will be
invoked when no more specific method exists. For example:
\begin{verbatim}
GENERIC: describe
M: number describe
"The number " write . ;
M: object describe
"I don't know anything about " write . ;
\end{verbatim}
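With the two methods above loaded, a sketch of the expected dispatch is as follows. A number selects the more specific method, and any other object falls through to the \texttt{object} method:
\begin{alltt}
3 describe
\textbf{The number 3}
t describe
\textbf{I don't know anything about t}
\end{alltt}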
Each class has a membership predicate named
after the class with a \texttt{?}~suffix, with the following two exceptions:
\begin{description}
\item[object] there is no need for a predicate word, since
every object is an instance of this class.
\item[f] the only instance of this class is the singleton
\texttt{f} signifying falsity, missing value, and empty list, and the predicate testing for this is the built-in library word \texttt{not}.
\end{description}
\subsection{Built-in classes}
\glossary{name=type,
description={an object invariant that describes its shape. An object's type is constant for the lifetime of the object, and there is only a fixed number of types built-in to the run-time. See class}}
\glossary{name=built-in class,
description=see type}
Every object is an instance of exactly one type, and the type is constant for the lifetime of the object. There is only a fixed number of types built-in to the run-time, and corresponding to each type is a \emph{built-in class}:
\begin{verbatim}
alien
array
bignum
byte-array
complex
cons
displaced-alien
dll
f
fixnum
float
ratio
sbuf
string
t
tuple
vector
word
wrapper
\end{verbatim}
\wordtable{
\vocabulary{kernel}
\ordinaryword{type}{type ( object -- n )}
}
Outputs the type number of a given object. Most often, the \texttt{class} word is more useful.
\wordtable{
\vocabulary{kernel}
\ordinaryword{class}{class ( object -- class )}
}
Outputs the canonical class of a given object. While an object may be an instance of more than one class, the canonical class is either the built-in class, or if the object is a tuple, the tuple class. Examples:
\begin{alltt}
1.0 class .
\textbf{float}
TUPLE: point x y z ;
<< point f 1 2 3 >> class .
\textbf{point}
\end{alltt}
\subsection{Unions}
\glossary{name=union,
description={a class whose set of instances is the union of the set of instances of a list of member classes}}
An object is an instance of a union class if it is an instance of one of its members. Union classes are used to associate the same method with several different classes, as well as to conveniently define predicates.
\wordtable{
\vocabulary{syntax}
\parsingword{UNION:}{UNION: \emph{name} \emph{members} ;}
}
Defines a union class. For example, the Factor library defines some unions over numeric types:
\begin{verbatim}
UNION: integer fixnum bignum ;
UNION: rational integer ratio ;
UNION: real rational float ;
UNION: number real complex ;
\end{verbatim}
Now, the absolute value function can be defined in an efficient manner
for real numbers, and in a more general fashion for complex numbers:
\begin{verbatim}
GENERIC: abs ( z -- |z| )
M: real abs dup 0 < [ neg ] when ;
M: complex abs absq fsqrt ;
\end{verbatim}
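A user-defined union works the same way. The sketch below assumes a hypothetical class name \texttt{text} and relies on the membership predicate that is defined for every class:
\begin{alltt}
UNION: text string sbuf ;
"hello" text? .
\textbf{t}
3 text? .
\textbf{f}
\end{alltt}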
\subsection{Complements}
\glossary{name=complement,
description={a class whose set of instances is the set of objects that are not instances of a specific class}}
An object is an instance of a complement if it is not an instance of the complement's parameter.
\wordtable{
\vocabulary{syntax}
\parsingword{COMPLEMENT:}{COMPLEMENT: \emph{name} \emph{parameter}}
}
Defines a complement class. For example, the class of all values denoting ``true'' is defined as follows:
\begin{verbatim}
COMPLEMENT: general-t f
\end{verbatim}
\subsection{Predicates}
\glossary{name=predicate,
description={a word with stack effect \texttt{( object -- ?~)}, or alternatively, a class whose instances are the instances of a superclass that satisfy an arbitrary predicate}}
An object is an instance of a predicate class if it is an instance of the predicate's parent class, and if it satisfies the predicate definition.
Each predicate must be
defined as a subclass of some other class. This ensures that predicates inheriting from disjoint classes do not need to be
exhaustively tested during method dispatch.
\wordtable{
\vocabulary{syntax}
\parsingword{PREDICATE:}{PREDICATE: \emph{parent} \emph{name} \emph{predicate} ;}
}
Defines a predicate class deriving from \texttt{parent} whose instances are the instances of \texttt{parent} that satisfy the \texttt{predicate} quotation. The predicate quotation must have stack effect \texttt{( object -- ?~)}.
For example, the \texttt{strings} vocabulary contains subclasses of \texttt{integer}
classifying various ASCII characters:
\begin{verbatim}
PREDICATE: integer blank " \t\n\r" member? ;
PREDICATE: integer letter CHAR: a CHAR: z between? ;
PREDICATE: integer LETTER CHAR: A CHAR: Z between? ;
PREDICATE: integer digit CHAR: 0 CHAR: 9 between? ;
PREDICATE: integer printable CHAR: \s CHAR: ~ between? ;
\end{verbatim}
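A sketch of defining and testing a predicate class; the class name \texttt{positive-integer} is hypothetical. An object must first be an instance of the parent class \texttt{integer}, so a positive float is rejected before the predicate quotation is consulted:
\begin{alltt}
PREDICATE: integer positive-integer 0 > ;
5 positive-integer? .
\textbf{t}
-5 positive-integer? .
\textbf{f}
1.5 positive-integer? .
\textbf{f}
\end{alltt}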
\subsection{Operations on classes}
\wordtable{
\vocabulary{kernel}
\ordinaryword{class<}{class< ( class1 class2 -- ?~)}
}
Tests if all instances of \verb|class1| are also instances of \verb|class2|. This is a partial order with top and bottom in the mathematical sense; that is, it obeys the following axioms:
\begin{itemize}
\item It is reflexive: $X\subset X$
\item It is transitive: if $X\subset Y$ and $Y\subset Z$, then $X\subset Z$
\item There is a bottom element: for all classes $X$, $\texttt{null}\subset X$
\item There is a top element: for all classes $X$, $X\subset\texttt{object}$
\end{itemize}
This ordering determines the method ordering of a generic word (\ref{method-order}).
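For example, given the library definition of \texttt{integer} as the union of \texttt{fixnum} and \texttt{bignum}, the following session is a sketch of what one would expect. It assumes that a class is represented by the word naming it, so each class word is quoted with \texttt{\bs} to push the class itself on the stack:
\begin{alltt}
\bs fixnum \bs integer class< .
\textbf{t}
\bs integer \bs fixnum class< .
\textbf{f}
\bs fixnum \bs fixnum class< .
\textbf{t}
\end{alltt}
The last line illustrates reflexivity: every class is contained in itself.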
\wordtable{
\vocabulary{kernel}
\ordinaryword{class-and}{class-and ( class class -- class )}
\ordinaryword{class-or}{class-or ( class class -- class )}
}
Intersection and union of classes. Note that the returned class might not be the exact desired class; for example, \texttt{object} is output if no suitable class definition could be found at all. However, the following axioms are satisfied:
\begin{itemize}
\item If $X\subset Y$, then $X\cup Y=Y$
\item If $X\subset Y$, then $X\cap Y=X$
\end{itemize}
\section{Tuples}\label{tuples}
\tupleglos
Tuples are user-defined classes composed of named slots. All tuples have the same type, however distinct classes of tuples are defined.
\wordtable{
\vocabulary{syntax}
\parsingword{TUPLE:}{TUPLE: \emph{name} \emph{slots} ;}
}
Defines a new tuple class with membership predicate \texttt{name?}~and constructor \texttt{<name>}.
The constructor takes slots in left-to-right order from the stack. After construction, slots are read and written using various automatically-defined words with names of the
form \texttt{\emph{class}-\emph{slot}} and \texttt{set-\emph{class}-\emph{slot}}.
Here is an example:
\begin{verbatim}
TUPLE: point x y z ;
\end{verbatim}
This defines a new class named \texttt{point}, along with the
following set of words:
\begin{verbatim}
<point> point?
point-x set-point-x
point-y set-point-y
point-z set-point-z
\end{verbatim}
The word \texttt{<point>} takes the slot values from the stack and
produces a new \texttt{point}:
\begin{alltt}
1 2 3 <point> .
\textbf{<< point f 1 2 3 >>}
\end{alltt}
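Continuing with the \texttt{point} class, this sketch exercises the membership predicate and one of the slot readers. Since the constructor takes slots in left-to-right order, the second value pushed ends up in the \texttt{y} slot:
\begin{alltt}
1 2 3 <point> point-y .
\textbf{2}
1 2 3 <point> point? .
\textbf{t}
\end{alltt}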
\subsection{Constructors}
Constructors are named after the tuple class surrounded in angle
brackets (\texttt{<}~and~\texttt{>}). A default constructor is provided
that reads slot values from the stack, however a custom constructor can
be defined using the \texttt{C:} parsing word.
\wordtable{
\vocabulary{syntax}
\parsingword{C:}{C: \emph{class} \emph{definition} ;}
}
Define a \texttt{<class>} word that creates a tuple instance of the \texttt{class}, then applies the \texttt{definition} to this new tuple. The \texttt{definition} quotation must have stack effect \texttt{( tuple -- tuple )}.
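A minimal sketch of a custom constructor, using a hypothetical \texttt{counter} class. It assumes that the slot writer has stack effect \texttt{( value tuple -- )}, by analogy with the other setter words in this handbook, and that a constructor defined this way takes no slot values from the stack:
\begin{alltt}
TUPLE: counter value ;
C: counter 0 over set-counter-value ;
<counter> counter-value .
\textbf{0}
\end{alltt}
The definition receives the new tuple, stores \texttt{0} in its \texttt{value} slot, and leaves the tuple on the stack, as the \texttt{( tuple -- tuple )} stack effect requires.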
\subsection{Delegation}
\glossary{name=delegate,
description={a fa\c{c}ade object's delegate receives unhandled methods that are called on the fa\c{c}ade}}
\glossary{name={fa\c{c}ade},
description=an object with a delegate}
Each tuple can have an optional delegate tuple. Generic words called on
the tuple that do not have a method for the tuple's class will be passed on
to the delegate. Note that delegation to objects that are not tuples is not fully supported at this stage and might not work as you might expect.
Factor uses delegation instead of inheritance, but it is not a direct
substitute; in particular, the semantics differ in that a delegated
method call receives the delegate on the stack, not the original object.
\wordtable{
\vocabulary{generic}
\ordinaryword{delegate}{delegate ( object -- object )}
}
Returns an object's delegate, or \texttt{f} if no delegate is set. A direct consequence of this behavior is that an object may not have a delegate of \texttt{f}.
\wordtable{
\vocabulary{generic}
\ordinaryword{set-delegate}{set-delegate ( object tuple -- )}
}
Sets a tuple's delegate.
Class membership test predicates only test if an object is a direct instance of that class. Sometimes, you need to check if an object \emph{or its delegate} is an instance of a class. This can be done with the \verb|is?| combinator.
\wordtable{
\vocabulary{generic}
\ordinaryword{is?}{is?~( object quot -- ?~)}
\texttt{quot:~obj -- ?}\\
}
Tests if the quotation outputs a true value when applied to the object or some object that it delegates to.
Note that the \verb|standard-combination| method combination does not respect delegation unless the picker quotation is given as \verb|[ dup ]|. The \verb|math-combination| does not respect delegation at all (see \ref{combinations}).
\subsection{Method combination}\label{combinations}
Method combination adds a degree of flexibility to the generic word system, where a particular form of higher-order programming can be used to customize two aspects of generic word behavior:
\begin{itemize}
\item which stack item(s) the generic word dispatches upon,
\item which methods out of the set of applicable methods are called
\end{itemize}
The \verb|GENERIC:| parsing word creates a generic word using the \emph{standard method combination}. The \verb|G:| parsing word allows a custom method combination to be specified.
\wordtable{
\vocabulary{syntax}
\parsingword{G:}{G: \emph{generic} \emph{combination ...} ;}
}
Defines a generic word using the long form, in which the method combination is given explicitly.
A method combination is a quotation that is given the generic word on the stack, and outputs a quotation \emph{that becomes the definition of the word}. This is a very profound and abstract concept, and the examples in the remainder of the section will make it easier to grasp. The method combination quotation is called each time the generic word has to be updated (for example, when a method is added), and must not have any side effects.
\subsubsection{Standard method combination}
The following two lines are equivalent:
\begin{verbatim}
GENERIC: foo
G: foo simple-combination ;
\end{verbatim}
\wordtable{
\vocabulary{generic}
\ordinaryword{simple-combination}{simple-combination~( word -- quot )}
}
Perform simple method combination:
\begin{itemize}
\item the word dispatches on the top stack item,
\item only the method with most specific class is invoked,
\item if no suitable method is found, the generic word is called on the object's delegate
\end{itemize}
The next level of generality is the standard combination, which also invokes only the most specific method, but dispatches on an arbitrary stack element.
\wordtable{
\vocabulary{generic}
\ordinaryword{standard-combination}{standard-combination~( word picker -- quot )}
}
The \verb|picker| quotation must produce exactly one value on the stack. The picker is spliced into the returned quotation at appropriate points, making the generic word dispatch on the stack item produced by the picker. The simple combination is defined in terms of the standard combination as follows:
\begin{verbatim}
: simple-combination [ dup ] standard-combination ;
\end{verbatim}
Here is an example of a generic word with a non-simple picker.
\begin{verbatim}
G: sbuf-append [ over ] standard-combination ;
M: string sbuf-append swap nappend ;
M: integer sbuf-append push ;
\end{verbatim}
It may be used as follows:
\begin{alltt}
SBUF" " clone "my-sbuf" set
"hello" "my-sbuf" get sbuf-append
CHAR: \bs{}s "my-sbuf" get sbuf-append
"world" "my-sbuf" get sbuf-append
"my-sbuf" get .
\textbf{SBUF" hello world"}
\end{alltt}
\subsubsection{Math method combination}
\newcommand{\numupgradeglos}{
\glossary{
name=numerical upgrading,
description={the stipulation that if one of the inputs to an arithmetic word is a \texttt{bignum} and the other is a \texttt{fixnum}, the latter is first coerced to a \texttt{bignum}, and if one of the inputs is a \texttt{float}, the other is coerced to a \texttt{float}}}}
\numupgradeglos
\wordtable{
\vocabulary{generic}
\ordinaryword{math-combination}{math-combination~( word -- quot )}
}
The math method combination is used for binary operators such as \verb|+|, \verb|*|, and so on.
A method can only be added to a generic word using the math combination if the method specializes on one of the below classes, or a union defined over one or more of the below classes:
\begin{verbatim}
fixnum
bignum
ratio
float
complex
object
\end{verbatim}
The math combination performs numerical upgrading as described in \ref{number-protocol}.
\subsubsection{Custom method combinations}
Development of custom method combination requires a good understanding of higher-order programming (code that writes code) and Factor internals. Custom method combination has not been fully explored at this stage of Factor development, and this section can only give a brief sketch of what is involved.
\wordtable{
\vocabulary{generic}
\ordinaryword{methods}{methods~( word -- alist )}
}
Outputs an association list mapping classes to method definition quotations. The association list is sorted with the least-specific method first. The task of the method combination is to transform this association list into an executable quotation.
\part{Library reference}
\chapter{Sequences}
\glossary{name=sequence,
description=an object storing a linearly-ordered set of elements}
A sequence is a linearly-ordered collection of objects. A set of built-in sequence types is provided by the library.
\begin{tabular}[t]{l|c|c|c|c|c|l}
\multicolumn{4}{l|}{}&\multicolumn{2}{c|}{Adding elements}&\multicolumn{1}{l}{}\\
\hline
Class&Mutable&Growable&Lookup&at start&at end&Primary purpose\\
\hline
%\texttt{array}&$\surd$&&$O(1)$&&&Low-level and unsafe\\
\texttt{list}&&&$O(n)$&$O(1)$&$O(n)$&Functional manipulation\\
\texttt{vector}&$\surd$&$\surd$&$O(1)$&$O(n)$&$O(1)$&Imperative aggregation\\
\texttt{sbuf}&$\surd$&$\surd$&$O(1)$&$O(n)$&$O(1)$&Character accumulation\\
\texttt{string}&&&$O(1)$&&&Immutable text strings
\end{tabular}
A handful of ``virtual'' sequences are provided by the library. These sequences are not backed by actual storage, but instead either compute their values, or take them from an underlying sequence. Virtual sequences include:
\begin{verbatim}
repeated
range
reversed
slice
\end{verbatim}
User-defined classes can also implement the sequence protocol and gain the ability to reuse many of the words in this section.
Finally, integers implement the sequence protocol, allowing counted loops to fall out as a trivial case of sequence iteration (\ref{counted-loops}).
\glossary{name=virtual sequence,
description={a sequence that is not backed by actual storage, but instead either computes its values, or takes them from an underlying sequence}}
\section{Sequence protocol}
The following set of generic words constitutes the sequence protocol. The mutating words are not supported by all sequences; in particular, lists and strings are immutable.
\glossary{name=resizable sequence,
description={a sequence implementing the \texttt{set-length} generic word. For example, vectors and string buffers}}
\glossary{name=mutable sequence,
description={a sequence implementing the \texttt{set-nth} generic word. For example, vectors and string buffers}}
An object that is an instance of a class implementing these generic words can be thought of as a sequence, and given to the words in the following sections.
\wordtable{
\vocabulary{sequences}
\genericword{length}{length ( seq -- n )}
}
Outputs the length of the sequence. All sequences support this operation.
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\wordtable{
\vocabulary{sequences}
\genericword{set-length}{set-length ( n seq -- )}
}
Resizes the sequence. Not all sequences can be resized.
\wordtable{
\vocabulary{sequences}
\genericword{nth}{nth ( n seq -- elt )}
}
Outputs the $n$th element of the sequence. Elements are numbered starting from 0, so the last element has an index one less than the length of the sequence. An exception should be thrown if an out-of-bounds index is accessed. All sequences support this operation; however, with lists it runs in $O(n)$ time.
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\wordtable{
\vocabulary{sequences}
\genericword{set-nth}{set-nth ( elt n seq -- )}
}
Sets the $n$th element of the sequence. Storing beyond the end of a resizable sequence such as a vector or string buffer grows the sequence. Storing to a negative index is always an error.
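As a small illustration of \verb|nth| and \verb|set-nth|, here is a sketch of a listener session. The variable name \texttt{"v"} is arbitrary, and the session assumes that \verb|clone| yields a fresh mutable copy of a vector literal, as it does for the string buffer in the earlier \verb|sbuf-append| example:
\begin{alltt}
\tto 10 20 30 \ttc clone "v" set
1 "v" get nth .
\textbf{20}
99 1 "v" get set-nth
"v" get .
\textbf{\tto 10 99 30 \ttc}
\end{alltt}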
\wordtable{
\vocabulary{sequences}
\genericword{like}{like ( seq template -- seq )}
}
Outputs a sequence with the same elements as the input sequence, but ``like'' the template sequence, meaning it either has the same class as the template sequence, or if the template sequence is a virtual sequence, the same class as the template sequence's underlying sequence. The default implementation does nothing.
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\wordtable{
\vocabulary{sequences}
\genericword{thaw}{thaw ( seq -- seq )}
}
Outputs a sequence with the same elements as the input sequence, but mutable. The default implementation converts the sequence into a vector.
This generic word is flushable, so user-defined methods must satisfy the flushable contract (see \ref{declarations}).
\section{Sequence operations}
\subsection{Comparison}
\wordtable{
\vocabulary{sequences}
\ordinaryword{sequence=}{sequence= ( s1 s2 -- ?~)}
}
Tests if the two sequences have the same length and elements. This is weaker than \texttt{=}, since it does not ensure that the sequences are instances of the same class.
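The difference between the two words can be seen by comparing a vector with a list holding the same elements. This is a sketch whose outputs follow from the description above:
\begin{alltt}
\tto 1 2 3 \ttc [ 1 2 3 ] sequence= .
\textbf{t}
\tto 1 2 3 \ttc [ 1 2 3 ] = .
\textbf{f}
\end{alltt}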
\wordtable{
\vocabulary{sequences}
\ordinaryword{lexi}{lexi~( s1 s2 -- n )}
}
Compares two sequences of integers lexicographically (dictionary order). The output value is one of the following:
\begin{description}
\item[Positive] indicating that \texttt{s1} follows \texttt{s2}
\item[Zero] indicating that \texttt{s1} is equal to \texttt{s2}
\item[Negative] indicating that \texttt{s1} precedes \texttt{s2}
\end{description}
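Since only the sign of the result is specified for unequal sequences, the sketch below tests the sign rather than printing the value. It relies on strings being sequences of integers (character codes):
\begin{alltt}
"apple" "banana" lexi 0 < .
\textbf{t}
"pear" "pear" lexi .
\textbf{0}
"b" "a" lexi 0 > .
\textbf{t}
\end{alltt}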
\subsection{Iteration}\label{iteration}
Standard iteration patterns are abstracted away in these words.
\wordtable{
\vocabulary{sequences}
\genericword{each}{each ( seq quot -- )}
\texttt{quot:~element --}\\
}
Applies the quotation to each element of the sequence.
\wordtable{
\vocabulary{sequences}
\ordinaryword{reduce}{reduce ( seq ident quot -- result )}
\texttt{quot:~previous element -- next}\\
}
Combines successive elements of the sequence using a binary operation. The first input value at each iteration except the first one is the result of the previous iteration. The first input value at the first iteration is \verb|ident|. For example, the \verb|sum| word adds a sequence of numbers together, and is defined as follows:
\begin{alltt}
: sum ( seq -- n ) 0 [ + ] reduce ;
\end{alltt}
The \verb|reduce| word has a very simple implementation:
\begin{verbatim}
: reduce ( seq ident quot -- result ) swapd each ; inline
\end{verbatim}
So indeed, it is an expression of an idiom rather than an algorithm. Various words that combine the \verb|reduce| combinator together with an identity and mathematical operation are defined in \ref{reductions}.
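As a further sketch, a product is obtained by taking 1 as the identity and \verb|*| as the operation; the successive values on the stack are 1, 1, 2, 6 and 24:
\begin{alltt}
\tto 1 2 3 4 \ttc 1 [ * ] reduce .
\textbf{24}
\end{alltt}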
\wordtable{
\vocabulary{sequences}
\ordinaryword{accumulate}{accumulate ( seq ident quot -- results )}
\texttt{quot:~previous element -- next}\\
}
Like \verb|reduce|, but instead outputs a sequence of intermediate values. The first element of the resulting sequence is always \verb|ident|. For example,
\begin{alltt}
\tto 2 2 2 2 2 \ttc 0 [ + ] accumulate .
\tto 0 2 4 6 8 \ttc
\end{alltt}
\wordtable{
\vocabulary{sequences}
\genericword{tree-each}{tree-each ( seq quot -- )}
\texttt{quot:~element --}\\
}
Applies the quotation to each element of the sequence. Elements that are themselves sequences are iterated recursively. Note that this word only operates on lists, strings, string buffers and vectors, not on virtual sequences or user-defined sequences.
\wordtable{
\vocabulary{sequences}
\ordinaryword{map}{map ( seq quot -- seq )}
\texttt{quot:~element -- element}\\
}
Applies the quotation to each element yielding a new element. The new elements are collected into a sequence of the same class as the input sequence.
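For instance, mapping over a vector yields a vector, and mapping over a list yields a list. The outputs in this sketch follow from the description above:
\begin{alltt}
\tto 1 2 3 \ttc [ 1 + ] map .
\textbf{\tto 2 3 4 \ttc}
[ 1 2 3 ] [ 2 * ] map .
\textbf{[ 2 4 6 ]}
\end{alltt}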
\wordtable{
\vocabulary{sequences}
\ordinaryword{nmap}{nmap ( seq quot -- )}
\texttt{quot:~element -- element}\\
}
Applies the quotation to each element yielding a new element, storing the new elements back in the original sequence. This modifies \texttt{seq} and so throws an exception if it is immutable.
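A sketch of in-place mapping follows. Because \verb|nmap| outputs nothing, a reference to the sequence is kept with \verb|dup|; the \verb|clone| is there on the assumption that a literal vector should not be modified directly:
\begin{alltt}
\tto 1 2 3 \ttc clone dup [ 2 * ] nmap .
\textbf{\tto 2 4 6 \ttc}
\end{alltt}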
\wordtable{
\vocabulary{sequences}
\ordinaryword{2each}{2each ( s1 s2 quot -- )}
\texttt{quot:~e1 e2 --}\\
}
Applies the quotation to pairs of elements from \texttt{s1} and \texttt{s2}, which must have the same length.
\wordtable{
\vocabulary{sequences}
\ordinaryword{2reduce}{2reduce ( seq1 seq2 ident quot -- result )}
\texttt{quot:~previous elt1 elt2 -- next}\\
}
Combines successive pairs of elements from the two sequences using a ternary operation. The first input value at each iteration except the first one is the result of the previous iteration. The first input value at the first iteration is \verb|ident|. For example, the \verb|v.| word computing the dot product of two vectors is implemented using \verb|2reduce|:
\begin{verbatim}
: v. ( v v -- n ) 0 [ * + ] 2reduce ;
\end{verbatim}
See \ref{inner-product} for details.
The \verb|2reduce| word has a trivial implementation:
\begin{verbatim}
: 2reduce >r -rot r> 2each ; inline
\end{verbatim}
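Used directly, the dot product of two small vectors can be sketched as follows; the running total is $0$, then $1\cdot 4=4$, then $4+2\cdot 5=14$, then $14+3\cdot 6=32$:
\begin{alltt}
\tto 1 2 3 \ttc \tto 4 5 6 \ttc 0 [ * + ] 2reduce .
\textbf{32}
\end{alltt}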
\wordtable{
\vocabulary{sequences}
\ordinaryword{2map}{2map ( s1 s2 quot -- seq )}
\texttt{quot:~e1 e2 -- element}\\
}
Applies the quotation to pairs of elements from \texttt{s1} and \texttt{s2}, yielding a new element. The two input sequences must have the same length. The new elements are collected into a sequence of the same class as \texttt{s1}. Here is an example computing the pair-wise product of the elements of two vectors:
\begin{alltt}
\tto 5 3 -2 \ttc \tto 8 16 3 \ttc [ * ] 2map .
\textbf{\tto 40 48 -6 \ttc}
\end{alltt}
In fact the \verb|v*| word in the \verb|math| vocabulary is defined to call \verb|[ * ] 2map|; see \ref{pairwise} for documentation on this and similar words.
\wordtable{
\vocabulary{sequences}
\genericword{find}{find ( seq quot -- i elt )}
\genericword{find*}{find*~( i seq quot -- i elt )}
\texttt{quot:~elt -- ?}\\
}
Applies the quotation to each element of the sequence in turn, until it outputs a true value or the end of the sequence is reached. If the quotation yields a true value for some sequence element, the element index and the element itself are output. Otherwise, outputs $-1$ and \verb|f|. The \verb|find*| combinator is a variation that takes a starting index as a parameter. Various higher-level words are built on this combinator. Usually, they are used instead:
\begin{verbatim}
contains? ( seq quot -- ? )
all? ( seq quot -- ? )
index ( elt seq -- i )
index* ( elt i seq -- i )
member? ( elt seq -- ? )
\end{verbatim}
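A sketch of \verb|find| itself follows. Both outputs are printed with two calls to \verb|.|, so the element (top of the stack) appears first and the index second. The first call finds the first even element; the second finds none. The sketch assumes the \verb|mod| word from the math vocabulary:
\begin{alltt}
\tto 1 3 4 7 \ttc [ 2 mod 0 = ] find . .
\textbf{4
2}
\tto 1 3 7 \ttc [ 2 mod 0 = ] find . .
\textbf{f
-1}
\end{alltt}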
\wordtable{
\vocabulary{sequences}
\ordinaryword{each-with}{each-with ( obj seq quot -- )}
\texttt{quot:~obj elt --}\\
\ordinaryword{map-with}{map-with ( obj seq quot -- seq )}
\texttt{quot:~obj elt -- elt}\\
\ordinaryword{tree-each-with}{tree-each-with ( obj seq quot -- )}
\texttt{quot:~obj elt --}\\
\ordinaryword{find-with}{find-with ( obj seq quot -- i elt )}
\texttt{quot:~obj elt -- ?}\\
\ordinaryword{find-with*}{find-with*~( obj i seq quot -- i elt )}
\texttt{quot:~obj elt -- ?}\\
}
Curried forms of the above combinators. They pass an additional object to each invocation of the quotation.
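The order of the quotation's inputs matters for non-commutative operations. In this sketch the extra object is 10, and each invocation computes the object minus the element; the result is assumed to have the same class as the input, as with \verb|map|:
\begin{alltt}
10 \tto 1 2 3 \ttc [ - ] map-with .
\textbf{\tto 9 8 7 \ttc}
\end{alltt}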
\subsection{Counted loops}\label{counted-loops}
Integers support the sequence protocol in a trivial fashion; a non-negative integer presents its non-negative predecessors as elements. For example, the integer 3, when viewed as a sequence, contains the elements 0, 1, 2. This is very useful for performing counted loops.
For example, the \verb|each| combinator, given an integer, simply calls the quotation that number of times, pushing a counter on each iteration that ranges from 0 up to, but not including, that integer:
\begin{alltt}
3 [ . ] each
0
1
2
\end{alltt}
A common idiom is to iterate over a sequence, while maintaining a loop counter. This can be done using \verb|2each|:
\begin{alltt}
\tto "a" "b" "c" \ttc dup length [
"Index: " write . "Element: " write .
] 2each
\textbf{Index: 0
Element: "a"
Index: 1
Element: "b"
Index: 2
Element: "c"}
\end{alltt}
Combinators that produce new sequences, such as \verb|map|, will output a vector if the input is an integer.
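For instance, the following sketch maps over the elements 0, 1, 2 of the integer 3 and collects the results in a vector:
\begin{alltt}
3 [ 1 + ] map .
\textbf{\tto 1 2 3 \ttc}
\end{alltt}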
If you wish to perform an iteration over a range of integers that does not begin from zero, or an iteration that starts at a specific index and decreases towards zero, use a \verb|<range>| sequence.
\glossary{name=range sequence,
description={an instance of the \texttt{range} class, which is a virtual sequence of integers}}
\wordtable{
\vocabulary{sequences}
\ordinaryword{<range>}{<range> ( a b -- seq )}
}
Creates an immutable sequence consisting of all integers in the interval $[a,b)$ (if $a<b$) or $(b,a]$ (if $a>b$). If $a=b$, the resulting sequence is empty. This is just a tuple implementing the sequence protocol.
\begin{alltt}
CHAR: a CHAR: z 1 + <range> .
<< range [ ] 97 123 1 >>
CHAR: a CHAR: z 1 + <range> >string .
"abcdefghijklmnopqrstuvwxyz"
CHAR: z CHAR: a 1 - <range> >string .
"zyxwvutsrqponmlkjihgfedcba"
\end{alltt}
\subsection{Aggregation and grouping}\label{aggregation}
\wordtable{
\vocabulary{sequences}
\ordinaryword{append}{append ( s1 s2 -- seq )}
}
Outputs a new sequence consisting of the elements of \texttt{s1} followed by the elements of \texttt{s2}. The new sequence is of the same class as \texttt{s1}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{add}{add ( seq elt -- seq )}
}
Outputs a new sequence consisting of the elements of \texttt{seq} followed by \verb|elt|.
The new sequence is of the same type as \texttt{seq}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{append3}{append3 ( s1 s2 s3 -- seq )}
}
Appends the three sequences \texttt{s1}, \texttt{s2} and \texttt{s3} into a new sequence of the same class as \texttt{s1}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{nappend}{nappend ( s1 s2 -- )}
}
Appends \texttt{s2} to \texttt{s1}. Nothing is output, and \texttt{s1} is modified.
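The four appending words described so far can be compared in a short sketch. Note that the result of \verb|append| takes its class from the first input, and that \verb|nappend| modifies its first input, which is why a reference is kept with \verb|dup| (the \verb|clone| is there on the assumption that a literal vector should not be modified directly):
\begin{alltt}
\tto 1 2 \ttc [ 3 4 ] append .
\textbf{\tto 1 2 3 4 \ttc}
\tto 1 2 \ttc 3 add .
\textbf{\tto 1 2 3 \ttc}
"ab" "cd" "ef" append3 .
\textbf{"abcdef"}
\tto 1 2 \ttc clone dup \tto 3 4 \ttc nappend .
\textbf{\tto 1 2 3 4 \ttc}
\end{alltt}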
\wordtable{
\vocabulary{sequences}
\ordinaryword{copy-into}{copy-into ( start to from -- )}
}
Copies all elements of \verb|from| into \verb|to|, with destination indices starting from \verb|start|. The \verb|to| sequence must be large enough, or an exception is thrown.
\wordtable{
\vocabulary{sequences}
\ordinaryword{concat}{concat ( sequence -- sequence )}
}
The input is a sequence of sequences. If the input is empty, the output is the empty list (\texttt{f}). Otherwise, the elements of the input sequence are concatenated together, and a new sequence of the same type as the first element is output.
\begin{alltt}
\tto "a" [ CHAR: b ] \tto CHAR: c \ttc \ttc concat .
\textbf{"abc"}
\end{alltt}
\wordtable{
\vocabulary{sequences}
\ordinaryword{join}{join ( sequence glue -- sequence )}
}
Like \verb|concat|, but \verb|glue| is placed between each pair of sequences, and the resulting sequence has the same type as \verb|glue|.
\begin{alltt}
\tto "alpha" "beta" "gamma" \ttc ", " join .
\textbf{"alpha, beta, gamma"}
\end{alltt}
\wordtable{
\vocabulary{sequences}
\ordinaryword{split1}{split1~( seq split -- before after )}
}
If \texttt{seq} does not contain \texttt{split} as a subsequence, then \texttt{before} is equal to \texttt{seq}, and \texttt{after} is \texttt{f}. Otherwise, \texttt{before} and \texttt{after} are both sequences, and yield the input excluding \texttt{split} when appended.
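Here is a sketch of both cases. The two outputs are printed with two calls to \verb|.|, so \texttt{after} (top of the stack) appears first. It is assumed that splitting a string yields strings:
\begin{alltt}
"key=value" "=" split1 . .
\textbf{"value"
"key"}
"abc" "=" split1 . .
\textbf{f
"abc"}
\end{alltt}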
\wordtable{
\vocabulary{sequences}
\ordinaryword{split}{split~( seq split -- list )}
}
Outputs a list of subsequences taken between occurrences of \texttt{split} in \texttt{seq}. If \texttt{split} does not occur in \texttt{seq}, outputs a singleton list containing \texttt{seq} only.
\begin{alltt}
"/usr/local/bin" "/" split .
\textbf{[ "" "usr" "local" "bin" ]}
\end{alltt}
\wordtable{
\vocabulary{sequences}
\ordinaryword{group}{group~( str n -- list )}
}
Splits the sequence into groups of $n$ elements and collects each group in a list. If the sequence length is not a multiple of $n$, the final subsequence in the list will be shorter than $n$.
\subsection{Searching and sorting}\label{seq-searching}
Words for finding the indices of elements and subsequences, and for sorting sequences.
\wordtable{
\vocabulary{sequences}
\ordinaryword{index}{index ( obj seq -- n )}
\ordinaryword{index*}{index* ( obj i seq -- n )}
}
Outputs the index of the first element in the sequence equal to \texttt{obj}. If no element is found, outputs $-1$. The \verb|index*| form allows a start index to be specified. A related word is \verb|member?| (\ref{set-theoretic}).
\wordtable{
\vocabulary{sequences}
\ordinaryword{start}{start ( subseq seq -- n )}
\ordinaryword{start*}{start* ( subseq i seq -- n )}
}
Outputs the start index of a subsequence, or $-1$ if the subsequence does not occur in the sequence. The \verb|start*| form allows a start index to be specified. A related word is \verb|subseq?|.
\wordtable{
\vocabulary{sequences}
\ordinaryword{subseq?}{subseq?~( s1 s2 -- ?~)}
}
Tests if \texttt{s2} contains \texttt{s1} as a subsequence.
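The searching words above can be sketched as follows; in each case the object or subsequence being searched for comes first, and indices are counted from zero:
\begin{alltt}
3 \tto 1 2 3 4 \ttc index .
\textbf{2}
9 \tto 1 2 3 4 \ttc index .
\textbf{-1}
"lo" "hello" start .
\textbf{3}
"ell" "hello" subseq? .
\textbf{t}
\end{alltt}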
\wordtable{
\vocabulary{sequences}
\ordinaryword{sort}{sort~( seq quot -- seq )}
\texttt{quot:~e1 e2 -- -1/0/1}\\
}
Sorts the sequence by comparing each pair of elements with the quotation. The quotation should output one of the following values:
\begin{description}
\item[Positive] indicating that \texttt{e1} follows \texttt{e2}
\item[Zero] indicating that \texttt{e1} is equal to \texttt{e2}
\item[Negative] indicating that \texttt{e1} precedes \texttt{e2}
\end{description}
A new sorted sequence is output, and the given sequence is not modified.
\wordtable{
\vocabulary{sequences}
\ordinaryword{nsort}{nsort~( seq quot -- )}
\texttt{quot:~e1 e2 -- -1/0/1}\\
}
Like \verb|sort|, except the sequence is sorted in-place. Giving an immutable sequence to this word will raise an exception.
\wordtable{
\vocabulary{sequences}
\ordinaryword{number-sort}{number-sort~( seq -- seq )}
}
Sorts a sequence of real numbers. Defined as follows:
\begin{verbatim}
: number-sort [ - ] sort ;
\end{verbatim}
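The comparator protocol can be illustrated with a short sketch. Since \verb|-| yields a positive value when the first number is larger, \verb|number-sort| places the smallest number first; swapping the inputs before subtracting reverses the order. Only the first element of each result is printed here, using the \verb|first| word, because the class of the sorted sequence is not specified above:
\begin{alltt}
\tto 3 1 2 \ttc number-sort first .
\textbf{1}
\tto 3 1 2 \ttc [ swap - ] sort first .
\textbf{3}
\end{alltt}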
\wordtable{
\vocabulary{sequences}
\ordinaryword{string-sort}{string-sort~( seq -- seq )}
}
Sorts a sequence of strings. Defined as follows:
\begin{verbatim}
: string-sort [ lexi ] sort ;
\end{verbatim}
\wordtable{
\vocabulary{sequences}
\ordinaryword{binsearch}{binsearch~( elt seq quot -- i )}
}
Performs a binary search for \verb|elt| on a sorted sequence. The quotation follows the same protocol as the comparator quotation given to \verb|sort|, and the sequence must already be sorted under this quotation. The index of the greatest element that is equal to or less than \verb|elt| is output. If the sequence is empty, outputs $-1$.
\wordtable{
\vocabulary{sequences}
\ordinaryword{binsearch*}{binsearch*~( elt seq quot -- elt )}
}
Like \verb|binsearch|, but outputs the element at that index, rather than the index itself. If the sequence is empty, outputs \verb|f|.
\subsection{Slicing and reshaping}\label{reshaping}
\glossary{name=slice,
description={an instance of the \texttt{slice} class, which is a virtual sequence sharing structure with a subrange of some underlying sequence}}
The first set of words is concerned with taking subsequences of a sequence. The words below come in pairs: the first word of each pair outputs a newly copied sequence, while the second outputs a virtual sequence sharing structure with the underlying sequence.
\wordtable{
\vocabulary{sequences}
\ordinaryword{head}{head~( n seq -- seq )}
\ordinaryword{head-slice}{head-slice ( n seq -- slice )}
}
Outputs a new sequence consisting of the first $n$ elements of the input sequence.
\wordtable{
\vocabulary{sequences}
\ordinaryword{tail}{tail~( n seq -- seq )}
\ordinaryword{tail-slice}{tail-slice ( n seq -- slice )}
}
Outputs a new sequence consisting of all elements of the sequence, starting at the $n$th index.
\wordtable{
\vocabulary{sequences}
\ordinaryword{head*}{head*~( n seq -- seq )}
\ordinaryword{head-slice*}{head-slice*~( n seq -- slice )}
}
Outputs a new sequence consisting of all elements of the sequence, until the $n$th element from the end. In other words, it outputs a sequence of the first $l-n$ elements of the input sequence, where $l$ is its length.
\wordtable{
\vocabulary{sequences}
\ordinaryword{tail*}{tail*~( n seq -- seq )}
\ordinaryword{tail-slice*}{tail-slice*~( n seq -- slice )}
}
Outputs a new sequence consisting of the last $n$ elements of the input sequence.
\wordtable{
\vocabulary{sequences}
\ordinaryword{subseq}{subseq~( from to seq -- seq )}
\ordinaryword{<slice>}{<slice> ( from to seq -- slice )}
}
Outputs a new sequence consisting of all elements in the interval $[from,to)$.
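The copying forms of the words above can be compared on an 11-character string. This is a sketch whose outputs are derived from the descriptions in this section, assuming that taking a subsequence of a string yields a string:
\begin{alltt}
5 "Hello world" head .
\textbf{"Hello"}
6 "Hello world" tail .
\textbf{"world"}
6 "Hello world" head* .
\textbf{"Hello"}
5 "Hello world" tail* .
\textbf{"world"}
3 8 "Hello world" subseq .
\textbf{"lo wo"}
\end{alltt}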
\wordtable{
\vocabulary{sequences}
\genericword{reverse}{reverse ( seq -- seq )}
\genericword{reverse-slice}{reverse-slice ( seq -- seq )}
}
Outputs a new sequence with the reverse element order.
\wordtable{
\vocabulary{sequences}
\ordinaryword{head?}{head?~( s1 s2 -- ?~)}
\ordinaryword{tail?}{tail?~( s1 s2 -- ?~)}
}
Tests if \texttt{s1} starts or ends with \texttt{s2}. If \texttt{s2} is longer than \texttt{s1}, outputs \texttt{f}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{cut}{cut ( n seq -- s1 s2 )}
}
Outputs a pair of sequences that equal the original sequence when appended. The first sequence has length $n$, the second has length $l-n$ where $l$ is the length of the input.
\begin{alltt}
5 "Hello world" cut .s
\textbf{" world"
"Hello"}
\end{alltt}
This word has a simple definition:
\begin{verbatim}
: cut ( n seq -- seq seq )
[ head ] 2keep tail ;
\end{verbatim}
\wordtable{
\vocabulary{sequences}
\ordinaryword{?head}{?head~( s1 s2 -- seq ?~)}
\ordinaryword{?tail}{?tail~( s1 s2 -- seq ?~)}
}
Tests if \texttt{s1} starts or ends with \texttt{s2} as a subsequence. If there is a match, outputs the subrange of \texttt{s1} excluding \texttt{s2} followed by \texttt{t}. If there is no match, outputs \texttt{s1} followed by \texttt{f}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{flip}{flip ( seq -- seq )}
}
Outputs the two-dimensional transpose of the sequence of sequences, all of which must have equal length. An example:
\begin{alltt}
\tto \tto 1 2 3 \ttc \tto 4 5 6 \ttc \ttc flip .
\textbf{\tto \tto 1 4 \ttc \tto 2 5 \ttc \tto 3 6 \ttc \ttc}
\end{alltt}
\subsection{Set-theoretic operations}\label{set-theoretic}
A set of words for testing membership, and aggregating sequences without regard for element order.
\wordtable{
\vocabulary{sequences}
\ordinaryword{member?}{member?~( elt seq -- ?~)}
}
Tests if \texttt{seq} contains an element equal to \texttt{elt}. A related word is \verb|index| (\ref{seq-searching}).
\wordtable{
\vocabulary{sequences}
\ordinaryword{memq?}{memq?~( elt seq -- ?~)}
}
Tests if the sequence contains the actual object given. Elements are compared by identity.
\wordtable{
\vocabulary{sequences}
\ordinaryword{contained?}{contained?~( s1 s2 -- ?~)}
}
Tests if every element of \texttt{s1} is equal to some element of \texttt{s2}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{remove}{remove ( object seq -- seq )}
}
Outputs a new sequence containing all elements of the input sequence except those equal to the \texttt{object}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{prune}{prune ( seq -- seq )}
}
Outputs a new sequence with each element of \verb|seq| appearing only once.
\wordtable{
\vocabulary{sequences}
\ordinaryword{seq-union}{seq-union ( seq seq -- seq )}
}
Outputs a sequence of elements present in at least one of the sequences, filtering duplicates by comparing elements for equality.
\wordtable{
\vocabulary{sequences}
\ordinaryword{seq-intersect}{seq-intersect ( seq seq -- seq )}
}
Outputs a sequence of elements present in both sequences, comparing elements for equality.
\wordtable{
\vocabulary{sequences}
\ordinaryword{seq-diff}{seq-diff ( s1 s2 -- seq )}
}
Outputs a sequence of elements present in \texttt{s2} but not \texttt{s1}, comparing elements for equality.
\wordtable{
\vocabulary{sequences}
\ordinaryword{subset}{subset ( seq quot -- seq )}
\texttt{quot:~element -- ?}\\
}
Applies the quotation to each element, and outputs a new sequence containing the elements of the original sequence for which the quotation output a true value.
\wordtable{
\vocabulary{sequences}
\ordinaryword{contains?}{contains?~( seq quot -- ?~)}
\texttt{quot:~element -- ?}\\
}
Applies the quotation to each element of the sequence. If an element is found for which the quotation outputs a true value, a true value is output. Otherwise if the end of the sequence is reached, \verb|f| is output. Given an empty sequence, vacuously outputs \texttt{f}.
\wordtable{
\vocabulary{sequences}
\ordinaryword{all?}{all?~( seq quot -- ?~)}
\texttt{quot:~element -- ?}\\
}
Outputs \texttt{t} if the quotation yields true when applied to each element, otherwise outputs \texttt{f}. Given an empty sequence, vacuously outputs \texttt{t}.
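The behavior of \verb|all?| and \verb|contains?|, including the vacuous cases for an empty sequence, can be sketched as follows. The sketch assumes that \verb|\tto \ttc| denotes an empty vector literal:
\begin{alltt}
\tto 1 2 3 \ttc [ 0 > ] all? .
\textbf{t}
\tto 1 2 3 \ttc [ 2 > ] all? .
\textbf{f}
\tto \ttc [ 2 > ] all? .
\textbf{t}
\tto \ttc [ 2 > ] contains? .
\textbf{f}
\end{alltt}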
\wordtable{
\vocabulary{sequences}
\ordinaryword{every?}{every?~( seq quot -- ?~)}
\texttt{quot:~element element -- ?}\\
}
Tests if all elements of the sequence are equivalent under the relation. The quotation should be an equality relation (see \ref{equality}), otherwise the result will not be useful. This is implemented by vacuously outputting \verb|t| if the sequence is empty, or otherwise, by applying the quotation to each element together with the first element in turn, and testing if it always yields a true value. Usually, this word is used to test if all elements of a sequence are equal, or the same element:
\begin{verbatim}
[ = ] every?
[ eq? ] every?
\end{verbatim}
A pair of utility words test if every element in a sequence is true, or if the sequence contains at least one true element.
\wordtable{
\vocabulary{sequences}
\ordinaryword{conjunction}{conjunction~( seq -- ?~)}
\ordinaryword{disjunction}{disjunction~( seq -- ?~)}
}
The implementations are trivial:
\begin{verbatim}
: conjunction ( v -- ? ) [ ] all? ;
: disjunction ( v -- ? ) [ ] contains? ;
\end{verbatim}
\wordtable{
\ordinaryword{subset-with}{subset-with ( object seq quot -- seq )}
\texttt{quot:~object element -- ?}\\
\ordinaryword{some-with?}{some-with?~( object seq quot -- ?~)}
\texttt{quot:~object element -- ?}\\
\ordinaryword{all-with?}{all-with?~( object seq quot -- ?~)}
\texttt{quot:~object element -- ?}\\
}
Curried forms of the above combinators. They pass an additional object to each invocation of the quotation.
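For instance, because the additional object sits below each element when the quotation runs, the following sketch selects the elements that 5 is less than. The printed result assumes that the output sequence has the same type as the input.
\begin{alltt}
5 [ 1 7 3 9 ] [ < ] subset-with .
\textbf{[ 7 9 ]}
\end{alltt}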
\section{Oddball operations}\label{oddball-seq}
These operations do not fit into any clearly-defined functional category, but are nonetheless useful.
\wordtable{
\vocabulary{sequences}
\genericword{empty?}{empty?~( seq -- ?~)}
}
Tests if the sequence contains no elements. The default implementation of this word tests if the length is zero; user-defined sequences can provide a custom implementation that is more efficient.
A few convenience words are defined for accessing the first few elements.
\wordtable{
\vocabulary{sequences}
\ordinaryword{first}{first ( seq -- elt )}
\ordinaryword{second}{second ( seq -- elt )}
\ordinaryword{third}{third ( seq -- elt )}
\ordinaryword{fourth}{fourth ( seq -- elt )}
}
Note the naming convention here; the \verb|first| word actually gets the 0th element:
\begin{verbatim}
: first 0 swap nth ; inline
\end{verbatim}
\wordtable{
\vocabulary{sequences}
\ordinaryword{2unseq}{2unseq ( seq -- first second )}
\ordinaryword{3unseq}{3unseq ( seq -- first second third )}
}
Outputs the first two, or the first three elements of the sequence, respectively.
\wordtable{
\vocabulary{sequences}
\ordinaryword{peek}{peek ( sequence -- element )}
}
Outputs the last element of the sequence. Throws an exception if the sequence is empty. This word has a trivial implementation:
\begin{verbatim}
: peek ( sequence -- element ) dup length 1 - swap nth ;
\end{verbatim}
\wordtable{
\vocabulary{sequences}
\ordinaryword{push}{push ( element sequence -- )}
\ordinaryword{pop}{pop ( sequence -- element )}
}
Adds and removes an element at the end of the sequence. The sequence's length is adjusted accordingly. These are implemented as follows:
\begin{verbatim}
: push ( element sequence -- )
dup length swap set-nth ;
: pop ( sequence -- element )
dup peek >r dup length 1 - swap set-length r> ;
\end{verbatim}
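A short session shows the stack-like behavior of these words. This is only a sketch: the variable \texttt{stack} is a name made up for this example, and the variable words it relies on are described in \ref{namespaces}.
\begin{alltt}
SYMBOL: stack
10 <vector> stack set
1 stack get push
2 stack get push
stack get pop .
\textbf{2}
stack get peek .
\textbf{1}
stack get length .
\textbf{1}
\end{alltt}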
\wordtable{
\vocabulary{sequences}
\ordinaryword{push-new}{push-new ( element sequence -- )}
}
Adds the element to the sequence if the sequence does not already contain an equal element.
\wordtable{
\vocabulary{sequences}
\ordinaryword{change-nth}{change-nth ( seq n quot -- )}
\texttt{quot:~element -- element}\\
}
Applies the quotation to the $n$th element of the sequence, and stores the output back in the $n$th slot of the sequence. This modifies \texttt{seq} and so throws an exception if it is immutable.
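For example, the following sketch multiplies the element at index 1 of a vector by ten. The \texttt{dup} keeps a reference to the vector so that it can be printed afterwards; indices are assumed to count from zero, as with \texttt{nth}.
\begin{alltt}
\tto 1 2 3 \ttc dup 1 [ 10 * ] change-nth .
\textbf{\tto 1 20 3 \ttc}
\end{alltt}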
\wordtable{
\vocabulary{sequences}
\ordinaryword{<repeated>}{<repeated> ( n object -- seq )}
}
Creates an immutable sequence consisting of \verb|object| repeated $n$ times. No storage allocation of $n$ elements is made; rather a repeated sequence is just a tuple where the \verb|nth| word is implemented to return the same value on each invocation.
\begin{alltt}
5 "hey" <repeated> .
\textbf{<< repeated [ ] 5 "hey" >>}
5 "hey" <repeated> >list .
\textbf{[ "hey" "hey" "hey" "hey" "hey" ]}
\end{alltt}
\glossary{name=repeated sequence,
description={an instance of the \texttt{repeated} class, which is a virtual, immutable sequence consisting of a fixed element repeated a certain number of times}}
\section{Vectors}\label{vectors}
\wordtable{
\vocabulary{vectors}
\classword{vector}
}
\vectorglos
A vector is a growable, mutable sequence whose elements are stored in a contiguous range of memory. The literal syntax is covered in \ref{vector-literals}. Very few words operate specifically on vectors; most operations on vectors are done with generic sequence words.
\wordtable{
\vocabulary{vectors}
\ordinaryword{vector?}{vector?~( object -- ?~)}
}
Tests if the object at the top of the stack is a vector.
\wordtable{
\vocabulary{vectors}
\ordinaryword{>vector}{>vector~( sequence -- vector )}
}
Turns any type of sequence into a vector. Given a vector, this makes a fresh copy.
\wordtable{
\vocabulary{vectors}
\ordinaryword{<vector>}{<vector>~( capacity -- vector )}
}
Creates a new vector with an initial capacity that determines how many elements it can store before it needs resizing. The initial length is zero.
\wordtable{
\vocabulary{vectors}
\ordinaryword{empty-vector}{empty-vector~( length -- vector )}
}
Creates a new vector of the requested length, where all elements are initially \texttt{f}.
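The distinction between capacity and length can be checked with \texttt{length}. A sketch based on the descriptions of \texttt{<vector>} and \texttt{empty-vector} above:
\begin{alltt}
10 <vector> length .
\textbf{0}
3 empty-vector length .
\textbf{3}
3 empty-vector .
\textbf{\tto f f f \ttc}
\end{alltt}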
\wordtable{
\vocabulary{vectors}
\ordinaryword{zero-vector}{zero-vector~( length -- vector )}
}
Creates a new vector of the requested length, where all elements are initially 0.
\section{Cons cells}\label{cons-cells}
\consglos
\glossary{name=car,description=the first component of a cons cell}
\glossary{name=cdr,description=the second component of a cons cell}
\wordtable{
\vocabulary{lists}
\classword{cons}
}
A \emph{cons cell} is an ordered pair of values. The first value is called the \emph{car},
the second is called the \emph{cdr}. The literal syntax of cons cells is documented in \ref{listsyntax}.
Cons cells, and by extension lists, are immutable.
\wordtable{
\vocabulary{lists}
\ordinaryword{cons?}{cons?~( object -- ?~)}
}
Tests if the object at the top of the stack is a cons cell.
\wordtable{
\vocabulary{lists}
\ordinaryword{cons}{cons ( car cdr -- cons )}
\ordinaryword{swons}{swons ( cdr car -- cons )}
}
Creates a new cons cell from two components. The \texttt{swons} word is defined as follows:
\begin{verbatim}
: swons swap cons ;
\end{verbatim}
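The difference in argument order can be seen in a brief illustration that follows directly from the definition above:
\begin{alltt}
1 2 cons .
\textbf{[[ 1 2 ]]}
1 2 swons .
\textbf{[[ 2 1 ]]}
\end{alltt}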
\wordtable{
\vocabulary{lists}
\ordinaryword{car}{car ( cons -- car )}
\ordinaryword{cdr}{cdr ( cons -- cdr )}
}
Outputs the individual components of a cons cell. Taking the car or cdr of the empty list yields the empty list back.
\begin{alltt}
5 "blind mice" cons car .
\textbf{5}
"peanut butter" "jelly" cons cdr .
\textbf{"jelly"}
\end{alltt}
\wordtable{
\vocabulary{lists}
\ordinaryword{uncons}{uncons ( cons -- car cdr )}
\ordinaryword{unswons}{unswons ( cons -- cdr car )}
}
Pushes both the car and cdr of the cons cell at once. These words are implemented in the obvious way:
\begin{verbatim}
: uncons ( cons -- car cdr ) dup car swap cdr ;
: unswons ( cons -- cdr car ) dup cdr swap car ;
\end{verbatim}
Here is an example:
\begin{alltt}
{[[} "potatoes" "gravy" {]]} uncons .s
\textbf{"gravy"
"potatoes"}
\end{alltt}
\wordtable{
\vocabulary{lists}
\ordinaryword{2car}{2car ( c1 c2 -- car1 car2 )}
\ordinaryword{2cdr}{2cdr ( c1 c2 -- cdr1 cdr2 )}
}
Deconstructs paired lists. Compare the stack effects with those of \verb|car|, \verb|cdr| and \verb|uncons|.
\subsection{Lists}\label{lists}
\listglos
\glossary{name=improper list,description={a sequence of cons cells where the cdr of the last cons cell is not \texttt{f}}}
\glossary{name=general list,description={a proper or improper list; that is, either \texttt{f} or a cons cell}}
Lists of values are represented with nested cons cells. The car is the first element of the list; the cdr is the rest of the list. The value \texttt{f} represents the empty list.
The following example demonstrates the construction of lists as chains of cons cells, along with the literal syntax used to print lists:
\begin{alltt}
{[} 1 2 3 {]} car .
\textbf{1}
{[} 1 2 3 {]} cdr .
\textbf{{[} 2 3 {]}}
{[} 1 2 3 {]} cdr cdr .
\textbf{{[} 3 {]}}
\end{alltt}
\begin{figure}
\caption{Cons cells making up the list \texttt{[ 1 2 3 ]}}
\begin{center}
\scalebox{0.5}{
%BEGIN IMAGE
\epsfbox{cons.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
List operations are typically implemented in a recursive fashion, where the cdr of the list is taken until the desired element is reached.
\wordtable{
\vocabulary{lists}
\classword{general-list}
\classword{list}
}
A \emph{general list} is either the empty list or a cons cell. A \emph{list} is either the empty list or a cons cell whose cdr is also a list. A list is sometimes also known as a \emph{proper list}, and a general list that is not a proper list is known as an \emph{improper list}.
Not all list operations will function given an improper list. However, methods are usually defined on \texttt{general-list} rather than \texttt{list}, since dispatching on \texttt{list} involves a costly check.
\wordtable{
\vocabulary{lists}
\ordinaryword{>list}{>list ( sequence -- list )}
}
Converts an arbitrary sequence into a list.
\wordtable{
\vocabulary{lists}
\ordinaryword{list?}{list?~( obj -- ?~)}
}
Tests if the object at the top of the stack is a proper list.
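The following sketch builds a list by hand from cons cells, and then contrasts a proper list with a cons cell whose cdr is not a list:
\begin{alltt}
1 2 3 f cons cons cons .
\textbf{[ 1 2 3 ]}
[ 1 2 3 ] list? .
\textbf{t}
[[ 1 2 ]] list? .
\textbf{f}
[[ 1 2 ]] cons? .
\textbf{t}
\end{alltt}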
\wordtable{
\vocabulary{lists}
\ordinaryword{unit}{unit ( obj -- [ obj ] )}
}
Makes a list of one element.
\wordtable{
\vocabulary{lists}
\ordinaryword{2list}{2list ( o1 o2 -- [ o1 o2 ] )}
}
Makes a list of two elements.
\wordtable{
\vocabulary{lists}
\ordinaryword{unique}{unique ( obj list -- list )}
}
If the list already contains an element equal to the object, outputs the list unchanged; otherwise conses the object onto the front of the list.
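A sketch of the three list construction words described above, assuming that a new element is consed onto the front of the list:
\begin{alltt}
5 unit .
\textbf{[ 5 ]}
1 2 2list .
\textbf{[ 1 2 ]}
4 [ 1 2 3 ] unique .
\textbf{[ 4 1 2 3 ]}
1 [ 1 2 3 ] unique .
\textbf{[ 1 2 3 ]}
\end{alltt}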
\section{Strings}\label{strings}
\stringglos
\wordtable{
\vocabulary{strings}
\classword{string}
}
A string is an immutable sequence of characters. The literal syntax is covered in \ref{string-literals}. Characters do not have a distinct data type, so elements taken out of strings appear as integers on the stack.
\wordtable{
\vocabulary{strings}
\ordinaryword{string?}{string?~( obj -- ?~)}
}
Tests if the object at the top of the stack is a string.
\wordtable{
\vocabulary{strings}
\ordinaryword{>string}{>string~( sequence -- string )}
}
Turns a sequence of integers into a string. The integer elements are interpreted as characters. Note that this is not a way to turn any object into a printable representation; for that feature, see \ref{prettyprint}.
\wordtable{
\vocabulary{strings}
\ordinaryword{fill}{fill~( n char -- string )}
}
Creates a string with \texttt{char} repeated $n$ times.
\wordtable{
\vocabulary{strings}
\ordinaryword{pad-left}{pad-left~( string n char -- string )}
\ordinaryword{pad-right}{pad-right~( string n char -- string )}
}
Creates a string with \texttt{char} repeated $\max(0,n-l)$ times, where $l$ is the length of \texttt{string}, then appends this new string on the left or the right of the input string. The result is therefore at least $n$ characters long.
\subsection{Characters}
\wordtable{
\vocabulary{strings}
\ordinaryword{ch>string}{ch>string ( n -- string )}
}
Converts an integer representing a character value into a single-element string.
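Since characters are plain integers, character codes can be passed directly to these words. A sketch, using 65, the ASCII code for the letter A:
\begin{alltt}
65 ch>string .
\textbf{"A"}
3 65 fill .
\textbf{"AAA"}
"AAA" first .
\textbf{65}
\end{alltt}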
\wordtable{
\vocabulary{strings}
\ordinaryword{blank?}{blank?~( n -- ?~)}
\ordinaryword{letter?}{letter?~( n -- ?~)}
\ordinaryword{LETTER?}{LETTER?~( n -- ?~)}
\ordinaryword{digit?}{digit?~( n -- ?~)}
\ordinaryword{printable?}{printable?~( n -- ?~)}
\ordinaryword{quotable?}{quotable?~( n -- ?~)}
\ordinaryword{url-quotable?}{url-quotable?~( n -- ?~)}
}
Various character classification predicates.
\section{String buffers}\label{string-buffers}
\sbufglos
\wordtable{
\vocabulary{strings}
\classword{sbuf}
}
A string buffer is a mutable and growable sequence of characters. String buffers can be used to construct new strings by accumulating substrings and characters. Usually, however, they are only used indirectly, since the sequence construction words in \ref{make-seq} are more convenient to use in many cases.
\wordtable{
\vocabulary{strings}
\ordinaryword{sbuf?}{sbuf?~( object -- ?~)}
}
Tests if the object at the top of the stack is a string buffer.
\wordtable{
\vocabulary{strings}
\ordinaryword{>sbuf}{>sbuf~( sequence -- sbuf )}
}
Turns a sequence of integers into a string buffer. Given a string buffer, this makes a fresh copy.
String buffers support the stream input and output protocol (\ref{string-streams}).
\section{Constructing sequences}\label{make-seq}
The library supports an idiom where sequences can be constructed without passing the partial sequence being built on the stack. This reduces stack noise, and thus simplifies code and makes it easier to understand.
\newcommand{\dynamicscopeglos}{\glossary{
name=dynamic scope,
description={a variable binding policy where bindings established in a scope are visible to all code executed while the scope is active}}}
\dynamicscopeglos
\wordtable{
\vocabulary{namespaces}
\ordinaryword{make}{make ( quot exemplar -- seq )}
}
Calls the quotation in a new \emph{dynamic scope}. The quotation and any words it calls can execute the \texttt{,} and \texttt{\%} words to accumulate elements. When the quotation returns, all accumulated elements are collected into a sequence with the same type as \verb|exemplar|.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{,}{,~( element -- )}
}
Adds the element to the end of the sequence being constructed.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{\%}{\% ( sequence -- )}
}
Appends the given sequence to the end of the sequence being constructed.
Here is an example of sequence construction:
\begin{alltt}
: silly [ [ dup , ] repeat ] \tto \ttc make , ;
[ 4 [ silly ] each ] [ ] make .
\textbf{[ \tto \ttc \tto 0 \ttc \tto 0 1 \ttc \tto 0 1 2 \ttc ]}
\end{alltt}
Note that \verb|make| will capture any variables set inside the quotation, due to dynamic scoping. See \ref{namespaces}.
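A simpler sketch shows the two accumulation words side by side: \texttt{,} adds a single element, while \texttt{\%} splices in every element of a sequence.
\begin{alltt}
[ 1 , 2 , [ 3 4 ] % ] [ ] make .
\textbf{[ 1 2 3 4 ]}
\end{alltt}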
\chapter{Collections}
\glossary{name=mapping,
description={an unordered collection of elements, accessed by key. Examples include association lists and hashtables}}
Apart from sequences, there are two types of collections in Factor:
\begin{itemize}
\item queues, which implement first-in-first-out semantics,
\item mappings, which associate keys with values.
\end{itemize}
\section{Queues}
The following set of words manages FIFO (first-in-first-out) queues.
\wordtable{
\vocabulary{lists}
\ordinaryword{<queue>}{<queue> ( -- queue )}
}
Makes a new queue with no elements.
\wordtable{
\vocabulary{lists}
\ordinaryword{queue-empty?}{queue-empty?~( queue -- ?~)}
}
Outputs \texttt{t} if the given queue does not contain any elements, \texttt{f} otherwise.
\wordtable{
\vocabulary{lists}
\ordinaryword{deque}{deque ( queue -- element )}
}
Dequeues an element. An exception is thrown if the queue is empty.
\wordtable{
\vocabulary{lists}
\ordinaryword{enque}{enque ( element queue -- )}
}
Enqueues an element.
\section{Mappings}
The two classes of mappings in the Factor library are association lists and hashtables.
\begin{tabular}[t]{l|c|c|c|l}
Class&Mutable&Ordered&Lookup&Primary purpose\\
\hline
\texttt{assoc}&&$\surd$&$O(n)$&Small, unchanging mappings\\
\texttt{hashtable}&$\surd$&&$O(1)$&Large or frequently-changing mappings
\end{tabular}
It might be tempting to always use hashtables. However, for very small mappings, association lists are just as efficient, and they are easier to work with since the entire set of list words can be used with them.
\subsection{Association lists}
\glossary{name=association list,
description={a list of pairs, where the car of each pair is a key and the cdr is the value associated with that key}}
Association lists are built from cons cells. They are structured like a ribbed spine, where the ``spine'' is a list and each ``rib'' is a cons cell holding a key/value pair.
\wordtable{
\vocabulary{lists}
\ordinaryword{assoc?}{assoc?~( object -- ?~)}
}
Tests if the object at the top of the stack is a proper list whose every element is a cons.
\wordtable{
\vocabulary{lists}
\ordinaryword{assoc}{assoc ( k alist -- v )}
\ordinaryword{assoc*}{assoc* ( k alist -- [[ k v ]] )}
}
These words look up a key in an association list, comparing keys in the list with the given key by equality with \texttt{=}. The list is searched starting from the beginning. The two words differ in that the latter returns the key/value pair located, whereas the former only returns the value. The \texttt{assoc*} word allows a distinction to be made between a missing value and a value equal to \texttt{f}.
\wordtable{
\vocabulary{lists}
\ordinaryword{acons}{acons ( v k alist -- alist )}
\ordinaryword{set-assoc}{set-assoc ( v k alist -- alist )}
}
These words output a new association list containing the key/value pair.
They differ in that \texttt{set-assoc} removes any existing key/value pairs with the given key first. In both cases, searching for the key in the returned association list gives the new value; with the slightly faster \texttt{acons}, however, the old pair remains in the list, shadowed by the new one.
\wordtable{
\vocabulary{lists}
\ordinaryword{remove-assoc}{remove-assoc ( k alist -- alist )}
}
Outputs a new association list which does not have any key/value pairs with the key equal to \texttt{k}.
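The lookup and update words can be compared in a short session. This is a sketch derived from the descriptions above; in particular it assumes that new pairs are added at the front of the list, which is what makes shadowing work.
\begin{alltt}
"Salsa" [ [[ "Salsa" "Hot" ]] [[ "Peppers" "Very Hot" ]] ] assoc .
\textbf{"Hot"}
"Salsa" [ [[ "Salsa" "Hot" ]] [[ "Peppers" "Very Hot" ]] ] assoc* .
\textbf{[[ "Salsa" "Hot" ]]}
"Mild" "Salsa" [ [[ "Salsa" "Hot" ]] ] acons .
\textbf{[ [[ "Salsa" "Mild" ]] [[ "Salsa" "Hot" ]] ]}
"Mild" "Salsa" [ [[ "Salsa" "Hot" ]] ] set-assoc .
\textbf{[ [[ "Salsa" "Mild" ]] ]}
\end{alltt}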
\begin{figure}
\caption{An association list and its graphical representation}
\begin{verbatim}
[
[[ "Salsa" "Hot" ]]
[[ "Stir-Fry" "Medium" ]]
[[ "Peppers" "Very Hot" ]]
]
\end{verbatim}
\begin{center}
\scalebox{0.45}{
%BEGIN IMAGE
\epsfbox{assoc.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
\subsection{Hashtables}\label{hashtables}
\hashglos
\glossary{name=bucket,
description={a container for key/value pairs inside a hashtable. A hash function assigns each key to a bucket, with the goal of spreading the keys as evenly as possible}}
\glossary{name=hashcode,
description={an integer chosen so that equal objects have equal hashcodes, and unequal objects' hashcodes are distributed as evenly as possible}}
\wordtable{
\vocabulary{hashtables}
\classword{hashtable}
}
A hashtable sorts key/value pairs into buckets using a hashing function. The number of buckets is chosen to be approximately equal to the number of key/value pairs in the hashtable, so assuming a good hash function that distributes keys evenly, lookups can be performed in constant time, with a quick hash calculation to determine a bucket, followed by testing of only one or two key/value pairs.
\wordtable{
\vocabulary{kernel}
\genericword{hashcode}{hashcode~( object -- n )}
}
Outputs the hashcode of the object. The contract of this generic word is as follows:
\begin{itemize}
\item the hashcode must be a fixnum (\ref{integers})\footnote{Strictly speaking, returning a bignum will not fail; however, it will result in lower overall performance since the compiler will no longer make type assumptions when compiling callers of \texttt{hashcode}.},
\item if two objects are equal under \texttt{=}, they must have the same hashcode,
\item the word must not have any side effects.
\end{itemize}
If mutable objects are used as hashtable keys, they must not be mutated in such a way that their hashcode changes. Doing so will violate bucket sorting invariants and result in undefined behavior.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hashtable?}{hashtable?~( object -- ?~)}
}
Tests if the object at the top of the stack is a hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{<hashtable>}{<hashtable>~( n -- hash )}
}
Creates a new empty hashtable with \texttt{n} buckets. As more elements are added to the hashtable, the number of buckets is automatically increased and the keys are re-sorted.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash}{hash ( k hash -- v )}
\ordinaryword{hash*}{hash* ( k hash -- [[ k v ]] )}
}
Looks up the value associated with a key. The two words differ in that the latter returns the key/value pair located, whereas the former only returns the value. The \texttt{hash*} word allows a distinction to be made between a missing value and a value equal to \texttt{f}.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{set-hash}{set-hash ( v k hash -- )}
}
Stores a hashtable entry associating \texttt{k} with \texttt{v}.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{remove-hash}{remove-hash ( k hash -- )}
}
Removes the entry, if any, associated with the key \texttt{k}.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash-clear}{hash-clear ( hash -- )}
}
Removes all entries from the hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash-size}{hash-size ( hash -- n )}
}
Outputs the number of key/value pairs in the hashtable.
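The words above can be combined as in the following sketch; the variable \texttt{prices} is a name made up for this example. Note how \texttt{hash*} distinguishes a key that is stored with the value \texttt{f} from a key that is absent, assuming that a missing key yields \texttt{f}.
\begin{alltt}
SYMBOL: prices
10 <hashtable> prices set
5 "apple" prices get set-hash
f "pear" prices get set-hash
"apple" prices get hash .
\textbf{5}
"pear" prices get hash* .
\textbf{[[ "pear" f ]]}
"plum" prices get hash* .
\textbf{f}
prices get hash-size .
\textbf{2}
"pear" prices get remove-hash
prices get hash-size .
\textbf{1}
\end{alltt}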
\wordtable{
\vocabulary{hashtables}
\ordinaryword{bucket-count}{bucket-count ( hash -- n )}
}
Outputs the number of buckets in the hashtable. Ideally, this will be approximately equal to, or greater than \texttt{hash-size}.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash-each}{hash-each ( hash quot -- )}
\texttt{quot:~[[ key value ]] --}\\
}
Applies the quotation to each key/value pair in the hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash-subset}{hash-subset ( hash quot -- hash )}
\texttt{quot:~[[ key value ]] -- ?}\\
}
Applies the quotation to each key/value pair, collecting the key/value pairs for which the quotation output a true value into a new hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{cache}{cache ( key hash quot -- value )}
\texttt{quot:~key -- value}\\
}
If the key is present in the hashtable, return the associated value, otherwise apply the quotation to the key, yielding a new value that is then stored in the hashtable.
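As a sketch of how \texttt{cache} might be used for memoization, the following stores the length of a string under the string itself. The variable \texttt{lengths} is a name made up for this example.
\begin{alltt}
SYMBOL: lengths
10 <hashtable> lengths set
"hello" lengths get [ length ] cache .
\textbf{5}
"hello" lengths get hash .
\textbf{5}
\end{alltt}
The second lookup shows that the computed value was stored, so a later call to \texttt{cache} with the same key would not need to run the quotation again.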
There is a pair of words for working with lazily-instantiated hashtables.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{?hash}{?hash ( key hash/f -- value )}
}
Outputs the value of the key, or \texttt{f} if the hashtable is \texttt{f}. The standard \verb|hash| word would raise an exception in the latter case.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{?set-hash}{?set-hash ( value key hash/f -- hash )}
}
If the given hashtable is not \verb|f|, store the key/value pair and output the same hashtable instance. Otherwise, if the given hashtable is \verb|f|, create a new hashtable holding the key/value pair.
The following pair of words from the UI framework (see \ref{ui}) demonstrates the lazily-instantiated hashtable idiom:
\begin{verbatim}
: paint-prop* ( gadget key -- value )
swap gadget-paint ?hash ;
: set-paint-prop ( gadget value key -- )
pick gadget-paint ?set-hash swap set-gadget-paint ;
\end{verbatim}
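The behavior of both words when given \texttt{f} in place of a hashtable can also be checked directly. A sketch based on the descriptions above:
\begin{alltt}
"x" f ?hash .
\textbf{f}
1 "x" f ?set-hash hash-size .
\textbf{1}
\end{alltt}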
\subsection{Converting between mappings}
\wordtable{
\vocabulary{hashtables}
\ordinaryword{alist>hash}{alist>hash ( assoc -- hash )}
}
Creates a hashtable with the same key/value pairs as the association list. If the association list contains duplicate keys, later keys take precedence; this behavior is the opposite of the \texttt{assoc} word, where earlier keys take precedence.
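The difference in precedence can be seen by looking up a duplicated key before and after conversion. A sketch that follows from the description above:
\begin{alltt}
"a" [ [[ "a" 1 ]] [[ "a" 2 ]] ] assoc .
\textbf{1}
"a" [ [[ "a" 1 ]] [[ "a" 2 ]] ] alist>hash hash .
\textbf{2}
\end{alltt}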
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash>alist}{hash>alist ( hash -- assoc )}
}
Creates an association list with the same key/value pairs as the hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{hash-keys}{hash-keys ( hash -- list )}
\ordinaryword{hash-values}{hash-values ( hash -- list )}
}
Builds lists of keys and values stored in the hashtable.
\wordtable{
\vocabulary{hashtables}
\ordinaryword{buckets>vector}{buckets>vector ( hash -- vector )}
}
Outputs a vector of association lists, where each association list contains the key/value pairs in a certain bucket. Useful for debugging hashcode distribution.
\subsection{Variables and namespaces}\label{namespaces}
A variable is an entry in a hashtable of bindings, with the hashtable being implicit rather than passed on the stack. These hashtables are termed \emph{namespaces}. Nesting of scopes is implemented with a search order on namespaces, defined by a \emph{name stack}. Since namespaces are just hashtables, any object can be used as a variable, however by convention, variables are keyed by symbols (\ref{symbols}).
The \texttt{get} and \texttt{set} words read and write variable values. The \texttt{get} word searches up the chain of nested namespaces, while \texttt{set} always sets variable values in the current namespace only. Namespaces are dynamically scoped; when a quotation is called from a nested scope, any words called by the quotation also execute in that scope.
\glossary{name=name stack,
description={a stack holding namespaces. Entering a dynamic scope pushes the name stack, leaving a scope pops it}}
\glossary{name=namespace,
description={a hashtable pushed on the name stack and used as a set of variable bindings}}
\glossary{name=current namespace,
description={the namespace at the top of the name stack}}
\wordtable{
\vocabulary{namespaces}
\ordinaryword{get}{get ( variable -- value )}
}
Searches the name stack for a namespace containing \texttt{variable}, and outputs the value. If no such namespace is found, outputs \texttt{f}.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{set}{set ( value variable -- )}
}
Sets the value of \texttt{variable} to \texttt{value} in the current namespace at the top of the name stack.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{on}{on ( variable -- )}
\ordinaryword{off}{off ( variable -- )}
}
Sets the value of \texttt{variable} to \texttt{t} and \texttt{f} respectively, implemented as follows:
\begin{verbatim}
: on ( variable -- ) t swap set ;
: off ( variable -- ) f swap set ;
\end{verbatim}
\wordtable{
\vocabulary{namespaces}
\ordinaryword{change}{change ( variable quot -- )}
\texttt{quot:~old -- new}\\
}
Applies the quotation to the current variable value, and stores the return value of the quotation back in the variable.
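For example, a counter could be incremented as in the following sketch; the variable \texttt{counter} is a name made up for this example.
\begin{alltt}
SYMBOL: counter
0 counter set
counter [ 1 + ] change
counter get .
\textbf{1}
\end{alltt}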
\begin{figure}
\caption{Dynamic scope example}
The following diagram shows the nesting of scopes in effect inside the call to \texttt{inner} that results from executing \texttt{outer}.
\begin{verbatim}
SYMBOL: rand
SYMBOL: rator
SYMBOL: gator
: inner ( -- )
rand get gator get * rator set ;
: middle ( -- )
5 rator set [ inner ] with-scope rator get ;
: outer ( -- )
2 gator set
3 rand set
[ 4 rand set middle ] with-scope ;
outer
\end{verbatim}
\begin{center}
\scalebox{0.5}{
%BEGIN IMAGE
\epsfbox{namestack.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
\wordtable{
\vocabulary{namespaces}
\ordinaryword{with-scope}{with-scope ( quot -- )}
}
Calls the quotation in a new namespace. Any variables set by the quotation are discarded when it returns.
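A minimal sketch of this behavior, using a variable \texttt{x} made up for this example: the value set inside the scope is visible there, but the outer value is seen again once the quotation returns.
\begin{alltt}
SYMBOL: x
1 x set
[ 2 x set x get . ] with-scope
\textbf{2}
x get .
\textbf{1}
\end{alltt}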
\wordtable{
\vocabulary{namespaces}
\ordinaryword{make-hash}{make-hash ( quot -- hash )}
}
Calls the quotation in a new namespace, and outputs this namespace when the quotation returns. Useful for quickly building hashtables; for example:
\begin{alltt}
[ 1 "one" set 2 "two" set ] make-hash .
\textbf{\tto\tto [[ "one" 1 ]] [[ "two" 2 ]] \ttc\ttc}
\end{alltt}
\wordtable{
\vocabulary{namespaces}
\ordinaryword{bind}{bind ( ns quot -- )}
}
Calls the quotation in the dynamic scope of \texttt{ns}. When variables are looked up by the quotation, \texttt{ns} is checked first, and setting variables in the quotation stores them in \texttt{ns}.
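As a sketch, a hashtable built with \texttt{make-hash} can be passed to \texttt{bind} so that its entries are read back as variables:
\begin{alltt}
[ 1 "one" set ] make-hash [ "one" get ] bind .
\textbf{1}
\end{alltt}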
\wordtable{
\vocabulary{namespaces}
\ordinaryword{namespace}{namespace ( -- ns )}
}
Outputs the current namespace.
\wordtable{
\vocabulary{namespaces}
\ordinaryword{global}{global ( -- ns )}
}
Outputs the global namespace. Most often this is used as a parameter to \texttt{bind}. For example, a global variable is set as follows:
\begin{verbatim}
SYMBOL: the-boss
global [ "Mr. Lahey" the-boss set ] bind
\end{verbatim}
\wordtable{
\vocabulary{namespaces}
\ordinaryword{nest}{nest ( variable -- ns )}
}
If the variable is set in the current namespace, outputs its value. Otherwise, sets its value to a new namespace and outputs that.
\chapter{Mathematics}
\numberglos
\begin{figure}
\caption{Numerical class hierarchy}
\begin{center}
\scalebox{0.5}{
%BEGIN IMAGE
\epsfbox{number.eps}
%END IMAGE
%HEVEA\imageflush
}
\end{center}
\end{figure}
Factor attempts to preserve natural mathematical semantics for numbers. Multiplying two large integers never results in overflow, and dividing two integers yields an exact fraction rather than a floating point approximation. Floating point numbers are also supported, along with complex numbers.
\section{Number protocol}\label{number-protocol}
\numupgradeglos
The following usual operations are supported by all numbers.
These words obey the rules of numerical upgrading. If one of the inputs is a \texttt{bignum} and the other is a \texttt{fixnum}, the latter is first coerced to a \texttt{bignum}; if one of the inputs is a \texttt{float}, the other is coerced to a \texttt{float}.
Two examples where you should note the types of the inputs and outputs:
\begin{alltt}
3 >fixnum 6 >bignum * class .
\textbf{bignum}
1/2 2.0 + .
\textbf{2.5}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{+}{+ ( n n -- n )}
\ordinaryword{-}{- ( n n -- n )}
\ordinaryword{*}{* ( n n -- n )}
\ordinaryword{/}{/ ( n n -- n )}
}
The non-commutative operations \texttt{-} and \texttt{/} take operands from the stack in the natural order; \texttt{6 2 /} divides 6 by 2.
The following ordering operations are supported on real numbers only.
\wordtable{
\vocabulary{math}
\ordinaryword{<}{< ( n n -- ?~)}
\ordinaryword{<=}{<= ( n n -- ?~)}
\ordinaryword{>}{> ( n n -- ?~)}
\ordinaryword{>=}{>= ( n n -- ?~)}
}
The following pair of division operations are supported on integers only.
\wordtable{
\vocabulary{math}
\ordinaryword{/i}{/i ( n n -- integer )}
\ordinaryword{/f}{/f ( n n -- float )}
}
The \texttt{/} word gives an exact answer where possible. These two words output the answer in other forms. The \texttt{/i} word truncates the result towards zero, and \texttt{/f} converts it to a floating point approximation.
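The three division words can be compared on the same kinds of input. This is a sketch based on the descriptions above:
\begin{alltt}
6 2 / .
\textbf{3}
6 4 / .
\textbf{3/2}
7 2 /i .
\textbf{3}
-7 2 /i .
\textbf{-3}
7 2 /f .
\textbf{3.5}
\end{alltt}
The fourth result illustrates truncation towards zero: the exact quotient is $-7/2$, and \texttt{/i} outputs $-3$ rather than $-4$.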
\section{Integers}\label{integers}
\integerglos
\wordtable{
\vocabulary{math}
\classword{integer}
\classword{fixnum}
\classword{bignum}
}
The simplest type of number is the integer. Integers come in two varieties -- \emph{fixnums} and \emph{bignums}. As their names suggest, a fixnum is a fixed-width quantity\footnote{On 32-bit systems, an element of the interval $(-2^{29},2^{29}]$, and on 64-bit systems, the interval $(-2^{61},2^{61}]$. Because fixnums automatically grow to bignums, usually you do not have to worry about details like this.}, and is a bit quicker to manipulate than an arbitrary-precision bignum.
\wordtable{
\vocabulary{math}
\predword{integer?}
\predword{fixnum?}
\predword{bignum?}
}
The predicate word \texttt{integer?}~tests if the top of the stack is an integer. If this returns true, then exactly one of \texttt{fixnum?}~or \texttt{bignum?}~would return true for that object. Usually, your code does not have to worry if it is dealing with fixnums or bignums.
Integer operations automatically return bignums if the result would be too big to fit in a fixnum. Here is an example where multiplying two fixnums returns a bignum:
\begin{alltt}
134217728 fixnum? .
\textbf{t}
128 fixnum? .
\textbf{t}
134217728 128 * .
\textbf{17179869184}
134217728 128 * bignum? .
\textbf{t}
\end{alltt}
Integers can be entered using a different base; see \ref{integer-literals}.
The word \texttt{.} prints numbers in decimal, regardless of how they were input. A set of words in the \texttt{prettyprint} vocabulary is provided to print integers using another base.
\wordtable{
\vocabulary{prettyprint}
\ordinaryword{.h}{.h ( n -- )}
\ordinaryword{.o}{.o ( n -- )}
\ordinaryword{.b}{.b ( n -- )}
}
These words print an integer in hexadecimal, octal or binary, respectively.
\subsection{Modular arithmetic}
\wordtable{
\vocabulary{math}
\ordinaryword{mod}{mod ( x y -- r )}
}
Computes the remainder of dividing \texttt{x} by \texttt{y}. If the result is 0, then \texttt{x} is a multiple of \texttt{y}.
\begin{alltt}
100 3 mod .
\textbf{1}
-546 34 mod .
\textbf{-2}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{rem}{rem ( x y -- r )}
}
This is the same as \texttt{mod} except the answer is never negative.
\begin{alltt}
-546 34 rem .
\textbf{32}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{/mod}{/mod ( x y -- q r )}
}
Computes both the quotient and remainder. That is, \texttt{/mod} could be defined as follows, except in practice it is slightly more efficient:
\begin{verbatim}
: /mod ( x y -- q r ) 2dup /i -rot mod ;
\end{verbatim}
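Assuming the quotient is the truncated quotient computed by \texttt{/i}, the two outputs are related to the inputs by the identity
$$x = qy + r$$
For instance, taking $x=-546$ and $y=34$ as in the \texttt{mod} example earlier in this section, the quotient is $q=-16$ and the remainder is $r=-2$, and indeed $(-16)\cdot 34 + (-2) = -546$.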
\wordtable{
\vocabulary{math}
\ordinaryword{gcd}{gcd ( x y -- a d )}
}
Applies the Euclidean algorithm to \texttt{x} and \texttt{y}. The output values satisfy the following property for some integer $b$:
$$ax+by=d$$
Furthermore, $d$ is the greatest common divisor of $x$ and $y$.
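As a worked illustration, take $x=12$ and $y=18$, whose greatest common divisor is $6$. The particular coefficient that \texttt{gcd} outputs is not specified here, so the values below show only one valid choice, namely $a=-1$ and $b=1$:
$$(-1)\cdot 12 + 1\cdot 18 = 6$$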
\wordtable{
\vocabulary{math}
\ordinaryword{mod-inv}{mod-inv ( x n -- y )}
}
Computes a value \texttt{y} that satisfies the following property:
$$xy \equiv 1 \pmod{n}$$
An exception is thrown if no such \texttt{y} exists.
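For example, $5$ is an inverse of $3$ modulo $7$, because
$$3\cdot 5 = 15 = 2\cdot 7 + 1$$
so \texttt{3 7 mod-inv} should output a value congruent to $5$ modulo $7$ (whether the result is always reduced to the range $0\le y<n$ is not stated here). By contrast, $2$ has no inverse modulo $4$, since $2y$ is always even and so never leaves a remainder of $1$ when divided by $4$.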
\wordtable{
\vocabulary{math}
\ordinaryword{\hhat{}mod}{\hhat{}mod ( x y n -- z )}
}
Raises \texttt{x} to the power of \texttt{y}, modulo \texttt{n}. This is far more efficient than first calling \texttt{\^{}} followed by \texttt{mod}.
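The saving comes from never forming the full power. The following sketch assumes a repeated-squaring strategy (this section does not say which algorithm \texttt{\^{}mod} actually uses) and evaluates $3^5$ modulo $7$, reducing after every step:
$$3^2 = 9 \equiv 2,\qquad 3^4 \equiv 2^2 = 4,\qquad 3^5 = 3^4\cdot 3 \equiv 4\cdot 3 = 12 \equiv 5 \pmod{7}$$
Computing $3^5=243$ first and reducing afterwards gives the same answer, since $243 = 34\cdot 7 + 5$, but the intermediate power grows very quickly as the exponent increases, whereas the reduced intermediate values always stay below $n^2$.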
\subsection{Bitwise operations}\label{bitwise}
There are two ways of looking at an integer -- as a mathematical entity, or as a string of bits. The latter representation motivates \emph{bitwise operations}.
\wordtable{
\vocabulary{math}
\ordinaryword{bitand}{bitand ( x y -- z )}
}
Outputs a new integer where each bit is set if and only if the corresponding bit is set in both $x$ and $y$.
\begin{alltt}
BIN: 101 BIN: 10 bitand .b
\textbf{0}
BIN: 110 BIN: 10 bitand .b
\textbf{10}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{bitor}{bitor ( x y -- z )}
}
Outputs a new integer where each bit is set if and only if the corresponding bit is set in at least one of $x$ or $y$.
\begin{alltt}
BIN: 101 BIN: 10 bitor .b
\textbf{111}
BIN: 110 BIN: 10 bitor .b
\textbf{110}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{bitxor}{bitxor ( x y -{}- x\^{}y )}
}
Outputs a new integer where each bit is set if and only if the corresponding bit is set in exactly one of $x$ or $y$.
\begin{alltt}
BIN: 101 BIN: 10 bitxor .b
\textbf{111}
BIN: 110 BIN: 10 bitxor .b
\textbf{100}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{bitnot}{bitnot ( x -{}- y )}
}
Computes the bitwise complement of the input; that is, each bit in the input number is flipped. Because integers are represented in two's complement form, this is actually equivalent to negating the integer, and subtracting 1.
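In other words, the output satisfies $y=-x-1$. A short illustration that follows directly from this identity:
\begin{alltt}
5 bitnot .
\textbf{-6}
-1 bitnot .
\textbf{0}
\end{alltt}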
\wordtable{
\vocabulary{math}
\ordinaryword{shift}{shift ( x n -{}- y )}
}
Computes a new integer consisting of the bits of the first integer, shifted to the left by $n$ positions. If $n$ is negative, the bits are shifted to the right instead, and bits that ``fall off'' are discarded.
\begin{alltt}
BIN: 101 5 shift .b
\textbf{10100000}
BIN: 11111 -2 shift .b
\textbf{111}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{log2}{log2 ( n -{}- b )}
}
Computes the largest integer less than or equal to $\log_2 n$. The input must be positive and the result is always an integer. In most cases, the \verb|log| word (\ref{algebraic}) should be used instead, since it allows any complex number as input, and the result is not truncated to an integer.
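For instance, since $2^3 = 8 \le 10 < 16 = 2^4$, the definition above implies the following:
\begin{alltt}
8 log2 .
\textbf{3}
10 log2 .
\textbf{3}
\end{alltt}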
\wordtable{
\vocabulary{math}
\ordinaryword{each-bit}{each-bit ( n quot -{}- | quot: 0/1 -{}- )}
}
Applies the quotation to each bit of the input. The input must be a positive integer.
\subsection{Generating random numbers}
\wordtable{
\vocabulary{math}
\ordinaryword{random-int}{random-int ( a b -- n )}
}
Outputs a pseudo-random integer in the interval $[a,b]$.
\section{Rational numbers}\label{ratios}
\newcommand{\rationalglos}{\glossary{
name=rational,
description={an instance of the \texttt{rational} class, which is a disjoint union of the
\texttt{integer} and \texttt{ratio} classes}}}
\rationalglos
\ratioglos
\wordtable{
\vocabulary{math}
\classword{ratio}
}
If we add, subtract or multiply any two integers, the result is always an integer. However, this is not the case with division. When dividing a numerator by a denominator where the numerator is not an integer multiple of the denominator, a ratio is returned instead.
\begin{alltt}
1210 11 / .
\textbf{110}
100 330 / .
\textbf{10/33}
\end{alltt}
Ratios are printed and can be input literally in the form above. Ratios are always reduced to lowest terms by factoring out the greatest common divisor of the numerator and denominator. A ratio with a denominator of 1 becomes an integer. Trying to create a ratio with a denominator of 0 raises an error.
Ratios behave just like any other number -- all numerical operations work as expected.
\begin{alltt}
1/2 1/3 + .
\textbf{5/6}
100 6 / 3 * .
\textbf{50}
\end{alltt}
\wordtable{
\vocabulary{math}
\predword{ratio?}
}
Tests if the top of the stack is a ratio.
\wordtable{
\vocabulary{math}
\ordinaryword{numerator}{numerator ( rational -- numerator )}
\ordinaryword{denominator}{denominator ( rational -- denominator )}
}
Deconstructs rational numbers into their numerator and denominator. The denominator is always positive; for integers, it equals 1.
\begin{alltt}
75/33 numerator .
\textbf{25}
75/33 denominator .
\textbf{11}
12 numerator .
\textbf{12}
\end{alltt}
\section{Floats}\label{floats}
\wordtable{
\vocabulary{math}
\classword{float}
}
\newcommand{\realglos}{\glossary{
name=real,
description={an instance of the \texttt{real} class, which is a disjoint union of the
\texttt{rational} and \texttt{float} classes}}}
\realglos
\floatglos
Rational numbers represent \emph{exact} quantities. On the other hand, a floating point number is an \emph{approximation}. While rationals can grow to any required precision, floating point numbers are fixed-width, and manipulating them is usually faster than manipulating ratios or bignums (but slower than manipulating fixnums). Floating point literals are often used to represent irrational numbers, which have no exact representation as a ratio of two integers. Floating point literals are input with a decimal point.
\begin{alltt}
1.23 1.5 + .
\textbf{2.73}
\end{alltt}
Introducing a floating point number in a computation forces the result to be expressed in floating point.
\begin{alltt}
5/4 1/2 + .
\textbf{7/4}
5/4 0.5 + .
\textbf{1.75}
\end{alltt}
\wordtable{
\vocabulary{math}
\predword{float?}
}
Tests if the top of the stack is a floating point number.
\wordtable{
\vocabulary{math}
\ordinaryword{>float}{>float ( real -- float )}
}
Turns any real number into a floating point approximation.
\subsection{Binary representation of floats}\label{float-bits}
Floating point numbers are represented internally in IEEE 754 double-precision format. This internal representation can be accessed for advanced operations and input/output purposes.
\wordtable{
\vocabulary{math}
\ordinaryword{float>bits}{float>bits ( x -- n )}
\ordinaryword{double>bits}{double>bits ( x -- n )}
}
Converts the floating point number $x$ into IEEE 754 single or double precision form, and outputs an integer holding the binary representation of the result.
\wordtable{
\vocabulary{math}
\ordinaryword{bits>float}{bits>float ( n -- x )}
\ordinaryword{bits>double}{bits>double ( n -- x )}
}
Converts an integer representation $n$ of an IEEE 754 single or double precision float into a \verb|float| object.
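As a point of reference, the standard IEEE 754 encodings of $1.0$ are shown below in hexadecimal. The bit patterns come from the IEEE 754 format itself; that the words above produce exactly these integers is an assumption based on their descriptions.

\begin{tabular}{l|l}
Format&Bits of $1.0$\\
\hline
Single precision (32 bits)&\texttt{3f800000}\\
Double precision (64 bits)&\texttt{3ff0000000000000}
\end{tabular}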
\section{Complex numbers}\label{complex-numbers}
\wordtable{
\vocabulary{math}
\classword{complex}
}
Complex numbers arise as solutions to quadratic equations whose graph does not intersect the $x$ axis. Their literal syntax is covered in \ref{complex-literals}.
\wordtable{
\vocabulary{math}
\predword{complex?}
}
Tests if the top of the stack is a complex number. Note that unlike in mathematics, where all real numbers are also complex numbers, Factor only considers a number to be a complex number if its imaginary part is non-zero.
\wordtable{
\vocabulary{math}
\ordinaryword{conjugate}{conjugate ( n -- n )}
}
Outputs the complex conjugate of a complex number. The complex conjugate of $a+bi$ is denoted $\overline{a+bi}$ and equals $a-bi$.
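A useful consequence is that the product of a number and its conjugate is always real:
$$(a+bi)\overline{(a+bi)} = (a+bi)(a-bi) = a^2+b^2$$
For example, $(3+4i)(3-4i) = 9+16 = 25$, which is the square of the absolute value of $3+4i$.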
\wordtable{
\vocabulary{math}
\ordinaryword{real}{real ( n -- n )}
\ordinaryword{imaginary}{imaginary ( n -- n )}
}
Deconstructs complex numbers into their real and imaginary components. The imaginary component of a real number is always zero.
\begin{alltt}
-1 sqrt real .
\textbf{0}
-1 sqrt imaginary .
\textbf{1}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{>rect}{>rect ( n -- re im )}
\ordinaryword{rect>}{rect> ( re im -- n )}
}
Converts between complex numbers and pairs of real numbers representing them in rectangular form.
\begin{alltt}
-1 sqrt sqrt >rect .s
\textbf{0.7071067811865475
0.7071067811865476}
1/3 5 rect> .
\textbf{\#\tto 1/3 5 \ttc\#}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{>polar}{>polar ( n -- r theta )}
\ordinaryword{polar>}{polar> ( r theta -- n )}
}
Converts between complex numbers and pairs of real numbers representing them in polar form. The polar form of a complex number consists of an absolute value and argument.
\begin{alltt}
\#\tto 4 5 \ttc\# >polar .s
\textbf{0.8960553845713439
6.403124237432849}
\end{alltt}
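The figures in this example can be checked by hand from the usual definitions of the absolute value and argument of $4+5i$:
$$r=\sqrt{4^2+5^2}=\sqrt{41}\approx 6.4031,\qquad \theta=\arctan(5/4)\approx 0.8961$$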
\wordtable{
\vocabulary{math}
\ordinaryword{abs}{abs ( n -- r )}
\ordinaryword{arg}{arg ( n -- theta )}
}
Computes the absolute value and argument individually.
\begin{alltt}
-5.3 abs .
\textbf{5.3}
i arg .
\textbf{1.570796326794897}
\end{alltt}
\section{Algebraic and transcendental functions}\label{algebraic}
There is a pair of words for computing additive and multiplicative inverses.
\wordtable{
\vocabulary{math}
\ordinaryword{neg}{neg ( x -- -x )}
\ordinaryword{recip}{recip ( x -- 1/x )}
}
These words are defined in the obvious way:
\begin{verbatim}
: neg 0 swap - ;
: recip 1 swap / ;
\end{verbatim}
The library includes the standard set of words for rounding real numbers to integers.
\wordtable{
\vocabulary{math}
\ordinaryword{ceiling}{ceiling ( x -- n )}
}
Outputs the smallest integer greater than or equal to $x$.
\wordtable{
\vocabulary{math}
\ordinaryword{floor}{floor ( x -- n )}
}
Outputs the greatest integer smaller than or equal to $x$.
\wordtable{
\vocabulary{math}
\ordinaryword{truncate}{truncate ( x -- n )}
}
Outputs the integer that results from subtracting the fractional component of $x$.
The relation between these three words can be seen in the following table:
\begin{tabular}{r|r|r|r}
$x$&\verb|ceiling|&\verb|floor|&\verb|truncate|\\
\hline
$3/2$&$2$&$1$&$1$\\
$-3/2$&$-1$&$-2$&$-1$\\
$2$&$2$&$2$&$2$
\end{tabular}
The next set of words computes powers and logarithms.
\wordtable{
\vocabulary{math}
\ordinaryword{sq}{sq ( x -- y )}
\ordinaryword{sqrt}{sqrt ( x -- y )}
}
Computes the square (raised to power 2) and square root (raised to power $1/2$).
\wordtable{
\vocabulary{math}
\ordinaryword{\hhat}{\^{} ( x y -- z )}
}
Raises \texttt{x} to the power of \texttt{y}. If \texttt{y} is an integer the answer is computed exactly, otherwise a floating point approximation is used.
\wordtable{
\vocabulary{math}
\ordinaryword{exp}{exp ( n -- n )}
}
Raises the number $e$ to a specified power. The number $e$ can be pushed on the stack with the \texttt{e} word, so \texttt{exp} could have been defined as follows\footnote{Of course, $e\approx 2.718281828459045$}:
\begin{alltt}
: exp ( x -- y ) e swap \^{} ;
\end{alltt}
However, it is actually defined otherwise, for efficiency.\footnote{In fact, things are done the other way around; the word \texttt{\^{}} is actually defined in terms of \texttt{exp}, to correctly handle complex number arguments.}
\wordtable{
\vocabulary{math}
\ordinaryword{log}{log ( x -- y )}
}
Computes the natural (base $e$) logarithm. This is the inverse of the \texttt{exp} function.
\begin{alltt}
e log .
\textbf{1.0}
-1 log .
\textbf{\#\tto 0.0 3.141592653589793 \ttc\#}
\end{alltt}
The \texttt{math} vocabulary provides the full set of trigonometric and hyperbolic functions, along with inverses and reciprocals. Complex number arguments are supported.
\index{\texttt{sin}}
\index{\texttt{cos}}
\index{\texttt{tan}}
\index{\texttt{cosec}}
\index{\texttt{sec}}
\index{\texttt{cot}}
\index{\texttt{asin}}
\index{\texttt{acos}}
\index{\texttt{atan}}
\index{\texttt{acosec}}
\index{\texttt{asec}}
\index{\texttt{acot}}
\index{\texttt{sinh}}
\index{\texttt{cosh}}
\index{\texttt{tanh}}
\index{\texttt{cosech}}
\index{\texttt{sech}}
\index{\texttt{coth}}
\index{\texttt{asinh}}
\index{\texttt{acosh}}
\index{\texttt{atanh}}
\index{\texttt{acosech}}
\index{\texttt{asech}}
\index{\texttt{acoth}}
\begin{tabular}{l|l|l|l|l}
Function&Trigonometric&Hyperbolic&Trig. inverse&Hyp. inverse\\
\hline
Sine&\texttt{sin}&\texttt{sinh}&\texttt{asin}&\texttt{asinh}\\
Cosine&\texttt{cos}&\texttt{cosh}&\texttt{acos}&\texttt{acosh}\\
Tangent&\texttt{tan}&\texttt{tanh}&\texttt{atan}&\texttt{atanh}\\
\hline
Cosecant&\texttt{cosec}&\texttt{cosech}&\texttt{acosec}&\texttt{acosech}\\
Secant&\texttt{sec}&\texttt{sech}&\texttt{asec}&\texttt{asech}\\
Cotangent&\texttt{cot}&\texttt{coth}&\texttt{acot}&\texttt{acoth}
\end{tabular}
\section{Constants}
The following words in the \texttt{math} vocabulary push constant values on the stack.
\index{\texttt{i}}
\index{\texttt{-i}}
\index{\texttt{inf}}
\index{\texttt{-inf}}
\index{\texttt{e}}
\index{\texttt{pi}}
\begin{tabular}{l|l}
Word&Value\\
\hline
\texttt{i}&Positive imaginary unit -- \texttt{\pound\tto~0 1 \ttc\pound}\\
\texttt{-i}&Negative imaginary unit -- \texttt{\pound\tto~0 -1 \ttc\pound}\\
\texttt{inf}&Positive floating point infinity\\
\texttt{-inf}&Negative floating point infinity\\
\texttt{e}&Base of natural logarithm ($e\approx 2.7182818284590452354$)\\
\texttt{pi}&Ratio of circumference to diameter ($\pi\approx 3.14159265358979323846$)\\
\end{tabular}
\section{Linear algebra}
\subsection{Vectors}
Any Factor sequence can be used to represent a mathematical vector, not just instances of the \verb|vector| class. Anywhere a vector is mentioned in this section, keep in mind it is a mathematical term, not a Factor data type.
The usual mathematical operations on vectors are supported.
\subsubsection{Scaling operations}
\wordtable{
\vocabulary{math}
\ordinaryword{vneg}{vneg ( vec -- vec )}
}
Negates each element of a vector.
\wordtable{
\vocabulary{math}
\ordinaryword{v*n}{v*n ( vec n -- vec )}
\ordinaryword{n*v}{n*v ( n vec -- vec )}
}
Multiplies each element of the vector by a scalar. The two words only differ in argument order.
\wordtable{
\vocabulary{math}
\ordinaryword{v/n}{v/n ( vec n -- vec )}
\ordinaryword{n/v}{n/v ( n vec -- vec )}
}
Divides each element of the vector by a scalar, or alternatively, divides the scalar by each element of a vector. The two words yield different results; the elements of the two resulting vectors are reciprocals of each other.
\subsubsection{Pairwise operations}\label{pairwise}
These words all expect a pair of vectors of equal length. They apply a binary operation to each pair of elements, producing a new vector. They are all implemented using the \verb|2map| combinator (\ref{iteration}).
\wordtable{
\vocabulary{math}
\ordinaryword{v+}{v+ ( vec vec -- vec )}
\ordinaryword{v-}{v-~( vec vec -- vec )}
\ordinaryword{v*}{v*~( vec vec -- vec )}
\ordinaryword{v/}{v/~( vec vec -- vec )}
\ordinaryword{vmax}{vmax~( vec vec -- vec )}
\ordinaryword{vmin}{vmin~( vec vec -- vec )}
\ordinaryword{v<}{v<~( vec vec -- vec )}
\ordinaryword{v<=}{v<=~( vec vec -- vec )}
\ordinaryword{v>}{v>~( vec vec -- vec )}
\ordinaryword{v>=}{v>=~( vec vec -- vec )}
\ordinaryword{vand}{vand~( vec vec -- vec )}
\ordinaryword{vor}{vor~( vec vec -- vec )}
}
Note that \verb|v*| is not the inner product. The inner product is the \verb|v.| word (\ref{inner-product}).
\begin{alltt}
\tto 0 2 1/2 1 \ttc \tto 5 6 3 8 \ttc v+ .
\textbf{\tto 5 8 7/2 9 \ttc}
\end{alltt}
\subsubsection{Reducing operations}\label{reductions}
These words take a vector as input and produce a single number (or boolean).
\wordtable{
\vocabulary{math}
\ordinaryword{sum}{sum~( vec -- n )}
\ordinaryword{product}{product~( vec -- n )}
}
Adds or multiplies all numbers in the vector. These are implemented using the \verb|reduce| combinator (\ref{iteration}):
\begin{verbatim}
: sum ( v -- n ) 0 [ + ] reduce ;
: product ( v -- n ) 1 [ * ] reduce ;
\end{verbatim}
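A brief illustration that follows from these definitions:
\begin{alltt}
\tto 1 2 3 4 \ttc sum .
\textbf{10}
\tto 1 2 3 4 \ttc product .
\textbf{24}
\end{alltt}
Since the reduction starts from the identity element of each operation, the definitions suggest that an empty vector has a sum of 0 and a product of 1.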
\subsubsection{Inner and cross products}\label{inner-product}
\wordtable{
\vocabulary{math}
\ordinaryword{v.}{v.~( vec vec -- n )}
}
Computes the real inner product of two vectors. They must be of equal length.
\wordtable{
\vocabulary{math}
\ordinaryword{c.}{c.~( vec vec -- n )}
}
Computes the complex inner product of two vectors. The complex inner product is conjugate-symmetric; that is, $\langle a,b\rangle=\overline{\langle b,a\rangle}$, where $\overline{z}$ is the complex conjugate of $z$.
\wordtable{
\vocabulary{math}
\ordinaryword{norm}{norm~( vec -- n )}
}
Computes the norm (``length'') of a vector. The norm of a vector $v$ is defined as $\sqrt{\langle v,v\rangle}$.
\wordtable{
\vocabulary{math}
\ordinaryword{normalize}{normalize~( vec -- vec )}
}
Outputs a vector with the same direction, but length 1. Defined as follows:
\begin{verbatim}
: normalize ( vec -- vec ) dup norm v/n ;
\end{verbatim}
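As a worked illustration in conventional notation (computed by hand rather than taken from Factor output), consider the vector $v=(3,4)$. Its norm is
$$\sqrt{3\cdot 3+4\cdot 4}=\sqrt{25}=5$$
so normalizing it divides each element by $5$, giving $(3/5,4/5)$. The result has norm $\sqrt{9/25+16/25}=1$, as expected.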
\wordtable{
\vocabulary{math}
\ordinaryword{cross}{cross~( v1 v2 -- vec )}
}
Computes the cross product $v_1\times v_2$. The following example illustrates the fact that the cross product of two vectors is always orthogonal to both of them.
\begin{alltt}
\tto 1 6/7 -8 \ttc \tto 8/5 3 -2 \ttc cross .
\textbf{\tto 156/7 -54/5 57/35 \ttc}
\tto 156/7 -54/5 57/35 \ttc \tto 1 6/7 -8 \ttc v. .
\textbf{0}
\tto 156/7 -54/5 57/35 \ttc \tto 8/5 3 -2 \ttc v. .
\textbf{0}
\end{alltt}
\subsection{Matrices}\label{matrices}
Matrices are represented as sequences of sequences of equal length. For example, consider
the following object:
\begin{verbatim}
{ { 1 0 -1 }
{ 2 1/3 6 }
{ 4 -2 0 }
{ 0 0 8 } }
\end{verbatim}
It corresponds to the following mathematical matrix:
$$\left( \begin{array}{c c c}
1 & 0 & -1 \\
2 & 1/3 & 6 \\
4 & -2 & 0 \\
0 & 0 & 8 \end{array}
\right)$$
The transpose of a matrix may be computed with the \verb|flip| word (\ref{reshaping}).
\wordtable{
\vocabulary{math}
\ordinaryword{<zero-matrix>}{<zero-matrix> ( rows cols -- matrix )}
}
Creates a new matrix with the given dimensions and all elements set to zero.
\wordtable{
\vocabulary{math}
\ordinaryword{<identity-matrix>}{<identity-matrix> ( n -- matrix )}
}
Creates a new $n\times n$ matrix where all elements on the main diagonal are 1, and all other elements are zero; for example:
\begin{alltt}
3 <identity-matrix> .
\textbf{\tto \tto 1 0 0 \ttc \tto 0 1 0 \ttc \tto 0 0 1 \ttc \ttc}
\end{alltt}
The following are the usual algebraic operations on matrices.
\wordtable{
\vocabulary{math}
\ordinaryword{n*m}{n*m ( n matrix -- matrix )}
}
Multiplies each element of a matrix by a scalar.
\begin{alltt}
5 2 <identity-matrix> n*m prettyprint
\textbf{\tto \tto 5 0 \ttc
\tto 0 5 \ttc \ttc}
\end{alltt}
\wordtable{
\vocabulary{math}
\ordinaryword{m+}{m+~( matrix matrix -- matrix )}
}
Adds two matrices. They must have the same dimensions.
\wordtable{
\vocabulary{math}
\ordinaryword{m-}{m-~( matrix matrix -- matrix )}
}
Subtracts two matrices. They must have the same dimensions.
\wordtable{
\vocabulary{math}
\ordinaryword{m*}{m*~( matrix matrix -- matrix )}
}
Multiplies two matrices element-wise. They must have the same dimensions. This is \emph{not} matrix multiplication in the usual mathematical sense.
\wordtable{
\vocabulary{math}
\ordinaryword{m.}{m.~( matrix matrix -- matrix )}
}
Composes two matrices as linear operators. This is the usual mathematical matrix multiplication, and the first matrix must have the same number of columns as the second matrix has rows.
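To make the distinction between \verb|m*| and \verb|m.| concrete, here is a small worked example in conventional notation (computed by hand, not taken from Factor output). Let
$$A=\left( \begin{array}{c c}
1 & 2 \\
3 & 4 \end{array}
\right),\qquad
B=\left( \begin{array}{c c}
0 & 1 \\
1 & 0 \end{array}
\right)$$
The element-wise product of $A$ and $B$, as computed by \verb|m*|, and their matrix product, as computed by \verb|m.|, are respectively
$$\left( \begin{array}{c c}
0 & 2 \\
3 & 0 \end{array}
\right)
\qquad\mbox{and}\qquad
\left( \begin{array}{c c}
2 & 1 \\
4 & 3 \end{array}
\right)$$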
\wordtable{
\vocabulary{math}
\ordinaryword{m.v}{m.v~( matrix vector -- vector )}
}
Applies a matrix to a vector on the right, as a linear transformation. The vector is
treated as a matrix with one column.
\begin{alltt}
\tto \tto 0 1 \ttc \tto 1 0 \ttc \ttc \tto 5 -3 \ttc m.v .
\textbf{\tto -3 5 \ttc}
\end{alltt}
\wordtable{
\vocabulary{matrices}
\ordinaryword{v.m}{v.m~( vector matrix -- vector )}
}
Applies a matrix to a vector on the left, as a linear transformation. The vector is
treated as a matrix with one row.
\chapter{Streams}
\glossary{name=stream,
description={a source or sink of characters supporting some subset of the stream protocol, used as an end-point for input/output operations}}
Input and output is centered around the concept of a \emph{stream}, which is a source or
sink of characters. Streams also support formatted output, which may be used to present styled text in a manner independent of output medium. There are two stream implementations that read and write external resources:
\begin{description}
\item[File streams] read and write local files.
\item[Network streams] connect to servers and accept connections from clients.
\end{description}
The remaining types of streams wrap underlying streams and transform the data as it is read or written:
\begin{description}
\item[Line streams] read lines of text from an underlying stream supporting only character input.
\item[HTML streams] implement the formatted output protocol to generate HTML from styled text attributes, then direct the HTML to an underlying stream.
\item[Duplex streams] combine an input and output stream into a single bidirectional stream.
\item[Null streams] return end of file on input, and ignore output.
\end{description}
String buffers support the stream output protocol. See \ref{stdio}.
\section{Stream protocol}\label{stream-protocol}
\glossary{name=input stream,
description={a stream that implements the \texttt{stream-readln} and \texttt{stream-read} generic words and can be used for character input}}
\glossary{name=output stream,
description={a stream that implements the \texttt{stream-format}, \texttt{stream-flush} and \texttt{stream-finish} generic words and can be used for character output}}
There are various subsets of the stream protocol that a class can implement so that its instances may be used as streams. The following generic word is mandatory.
\wordtable{
\vocabulary{io}
\genericword{stream-close}{stream-close ( s -- )}
}
Releases any external resources associated with the stream, such as file handles and network connections. No further operations can be performed on the stream after this call.
You must close streams after you are finished working with them. A convenient way to automate this is by using the \texttt{with-stream} word in \ref{stdio}.
The following three words are optional, and should be implemented on input streams.
\wordtable{
\vocabulary{io}
\genericword{stream-readln}{stream-readln ( s -- str/f )}
}
Reads a line of text and outputs it on the stack. If the end of the stream has been reached, outputs \texttt{f}. The precise meaning of a ``line'' depends on the stream. Streams that do not support this generic word can be wrapped in a line stream that reads lines terminated by \verb|\n|, \verb|\r| or \verb|\r\n| (\ref{special-streams}). File and network streams are automatically wrapped in line streams.
\wordtable{
\vocabulary{io}
\genericword{stream-read1}{stream-read1 ( s -- char/f )}
}
Reads a character from the stream. If the end of the stream is reached, outputs \verb|f|.
\wordtable{
\vocabulary{io}
\genericword{stream-read}{stream-read ( n s -- str )}
}
Reads \texttt{n} characters from the stream. If fewer than \texttt{n} characters are available before the end of the stream, a shorter string is output.
The following four words are optional, and should be implemented on output streams.
\wordtable{
\vocabulary{io}
\genericword{stream-write1}{stream-write1 ( ch s -- )}
}
Outputs a character to the stream. This might not result in immediate output to the underlying resource if the stream performs buffering, like all file and network streams do. Output can be forced with \verb|stream-flush|.
\wordtable{
\vocabulary{io}
\genericword{stream-format}{stream-format ( str attrs s -- )}
}
Outputs a string to the stream. As with \verb|stream-write1|, this may not result in an immediate output operation unless \verb|stream-flush| is called.
The \texttt{attrs} parameter is an association list holding style information (\ref{styles}). Most of the time no style information needs to be output, and either the \texttt{stream-write} or \texttt{stream-print} word is used. Those words wrap \verb|stream-format| and are described in the next section.
\wordtable{
\vocabulary{io}
\genericword{stream-flush}{stream-flush ( s -- )}
}
Ensures all pending output operations have been completed. With many output streams, written output is buffered and not sent to the underlying resource until either the buffer is full or an explicit call to \texttt{stream-flush} is made.
\wordtable{
\vocabulary{io}
\genericword{stream-finish}{stream-finish ( s -- )}
}
Ensures the user sees prior output. It is not as strong as \texttt{stream-flush}. The contract is as follows: if the stream is connected to an interactive end-point such as a terminal, \texttt{stream-finish} should execute \texttt{stream-flush}. If the stream is a file or network stream used for ``batch'' operations, this word should have an empty definition.
The \texttt{stream-print} word executes \texttt{stream-finish} after each line of output.
With some streams, the above operations may suspend the current thread and execute other threads until input data is available (\ref{threads}).
\section{Stream utilities}
The following three words are implemented in terms of the stream protocol, and should work with any stream supporting the required underlying operations.
\wordtable{
\vocabulary{io}
\ordinaryword{stream-write}{stream-write ( string stream -- )}
}
Outputs a string to the stream, without any specific style information. Implemented as follows:
\begin{verbatim}
: stream-write ( string stream -- )
f swap stream-format ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{stream-terpri}{stream-terpri ( stream -- )}
}
Outputs a newline to the stream, then executes \texttt{stream-finish} to force the line to be displayed on interactive streams.
\wordtable{
\vocabulary{io}
\ordinaryword{stream-print}{stream-print ( string stream -- )}
}
Outputs a string to the stream, then calls \verb|stream-terpri| to force a newline. Defined as follows:
\begin{verbatim}
: stream-print ( string stream -- )
[ stream-write ] keep stream-terpri ;
\end{verbatim}
\section{The default stream}\label{stdio}
\glossary{name=default stream,
description={the value of the \texttt{stdio} variable, used by various words as an implicit stream parameter}}
\glossary{name=stdio,
description={see default stream}}
Various words take an implicit stream parameter from the \texttt{stdio} variable to reduce stack shuffling. Unless rebound in a child namespace, this variable will be set to a console stream for interacting with the user.
\wordtable{
\vocabulary{io}
\ordinaryword{close}{close ( -- )}
}
\begin{verbatim}
: close stdio get stream-close ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{readln}{readln ( -- str/f )}
}
\begin{verbatim}
: readln stdio get stream-readln ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{read1}{read1 ( -- char/f )}
}
\begin{verbatim}
: read1 stdio get stream-read1 ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{read}{read ( n -- str/f )}
}
\begin{verbatim}
: read stdio get stream-read ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{write1}{write1 ( char -- )}
}
\begin{verbatim}
: write1 stdio get stream-write1 ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{write}{write ( str -- )}
}
\begin{verbatim}
: write stdio get stream-write ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{format}{format ( str attrs -- )}
}
\begin{verbatim}
: format stdio get stream-format ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{terpri}{terpri ( -- )}
}
\begin{verbatim}
: terpri stdio get stream-terpri ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{print}{print ( str -- )}
}
\begin{verbatim}
: print stdio get stream-print ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{flush}{flush ( -- )}
}
\begin{verbatim}
: flush stdio get stream-flush ;
\end{verbatim}
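As an illustration of how these words combine, here is a minimal sketch of a hypothetical word \texttt{greet}, which is not part of the library. It assumes that \texttt{stdio} is bound to an interactive stream and that end of input has not been reached, so that \texttt{readln} outputs a string rather than \texttt{f}:
\begin{verbatim}
: greet ( -- )
    ! Flush so the prompt is visible before waiting for input
    "What is your name? " write flush
    ! Read the reply, then write the greeting followed by the name
    readln "Hello, " write print ;
\end{verbatim}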
The value of the \texttt{stdio} variable can be rebound inside a quotation with the following combinators.
\wordtable{
\vocabulary{io}
\ordinaryword{with-stream}{with-stream ( stream quot -- )}
}
Calls the quotation in a new dynamic scope, with the \texttt{stdio} variable set to \texttt{stream}. The stream is closed when the quotation returns or if an exception
is thrown.
\wordtable{
\vocabulary{io}
\ordinaryword{with-stream*}{with-stream* ( stream quot -- )}
}
Like \verb|with-stream|, except the stream is only closed in the case of an error.
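For instance, a word that appends a line to a long-lived stream might be sketched as follows. The name \texttt{log-line} is hypothetical, and the sketch assumes the quotation can consume the string left on the data stack beneath the stream:
\begin{verbatim}
: log-line ( string stream -- )
    ! stdio is rebound only while the quotation runs, and
    ! the stream stays open unless an error is thrown
    [ print flush ] with-stream* ;
\end{verbatim}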
\section{Styled output}\label{styles}
HTML streams (\ref{html}) and pane streams (\ref{panes}) support styled text output using the \verb|stream-format| word. The style association list given to this word can contain any of the following keys:
\begin{tabular}{l|l}
Key&Description\\
\hline
\ttindex{foreground}&The foreground color, as a list with red, green, blue components\\
\ttindex{background}&The background color, as a list with red, green, blue components\\
\ttindex{font}&A font family name\\
\ttindex{font-style}&One of \ttindex{plain}, \ttindex{bold}, \ttindex{italic}, or \ttindex{bold-italic}\\
\ttindex{font-size}&An integer\\
\ttindex{underline}&A boolean\\
\ttindex{presented}&If set, a presentation for this object is output\\
\ttindex{file}&If set, a hyperlink to that file is output\\
\ttindex{icon}&If set, the icon named by this resource path is output\\
\end{tabular}
All keys are symbols in the \verb|styles| vocabulary.
Note that
HTML streams only use the \verb|presented| key if it is set to a word; in that case, a link to the browser responder is output. The \verb|file| key causes an HTML stream to output a link to the file responder. Both responders must be enabled for such links to function.
Pane streams support presentation of any object, but do not support the \verb|file| key.
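A style association list might be passed to \verb|format| as in the following sketch. It assumes that association list entries can be written with the cons literal syntax \verb|[[ key value ]]|, and that \texttt{stdio} is bound to a stream that supports styles; on other streams the attribute may have no visible effect:
\begin{verbatim}
USE: styles
"Warning" [ [[ font-style bold ]] ] format
\end{verbatim}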
\section{String streams}\label{string-streams}
String buffers support both the stream input and output protocol directly, with the exception of \verb|stream-readln|.
\wordtable{
\vocabulary{io}
\ordinaryword{<string-reader>}{<string-reader> ( string -- stream )}
}
Creates a new stream for reading characters from a string. First, a string buffer is created holding the reversed string, since characters are read in reverse by repeated calls to \verb|pop| (\ref{oddball-seq}). The result is wrapped in a line stream providing a \verb|stream-readln| implementation (\ref{special-streams}):
\begin{verbatim}
: <string-reader> ( string -- stream )
<reversed> >sbuf <line-reader> ;
\end{verbatim}
\wordtable{
\vocabulary{io}
\ordinaryword{string-in}{string-in ( string quot -- )}
}
Calls the quotation in a new dynamic scope, with the \texttt{stdio} variable set to a stream reading from a string. Executing \texttt{read1}, \texttt{read} or \texttt{readln} will take text from the string reader.
\wordtable{
\vocabulary{io}
\ordinaryword{string-out}{string-out ( quot -- string )}
}
Calls the quotation in a new dynamic scope, with the \texttt{stdio} variable set to a new string output stream. Executing \texttt{write}, \texttt{format} or \texttt{print} will append text to the string buffer. When the quotation returns, the string accumulated by the stream is output.
The stream output protocol is very natural with a string buffer:
\begin{verbatim}
M: sbuf stream-write1 push ;
M: sbuf stream-format rot nappend drop ;
\end{verbatim}
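Both combinators can be tried at the listener. In the following sketch, the first line should print \texttt{Hello world} and the second should print \texttt{Hello}. In each case \texttt{print} runs after the combinator returns, so it writes to the original default stream; the second line also assumes that a value left on the stack by the quotation remains there afterwards:
\begin{verbatim}
[ "Hello" write " world" write ] string-out print
"Hello world" [ 5 read ] string-in print
\end{verbatim}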
\section{Reading and writing binary data}
\glossary{name=big endian,
description={a representation of an integer as a sequence of bytes, ordered from most significant to least significant. This is the native byte ordering for PowerPC, SPARC, and ARM processors}}
\glossary{name=little endian,
description={a representation of an integer as a sequence of bytes, ordered from least significant to most significant. This is the native byte ordering for x86, x86-64 and Alpha processors}}
The core stream words read and write strings. Packed binary integers can be read and written by converting to and from sequences of bytes. Floating point numbers can be read and written by converting them into their bitwise integer representation (\ref{float-bits}).
There are two ways to order the bytes making up an integer; \emph{little endian} byte order outputs the least significant byte first, and the most significant byte last, whereas \emph{big endian} is the other way around.
Consider the hexadecimal integer \texttt{HEX: cafebabe}. Big endian byte order yields the following sequence of bytes:
\begin{tabular}{l|c|c|c|c}
Byte:&1&2&3&4\\
\hline
Value:&\verb|ca|&\verb|fe|&\verb|ba|&\verb|be|\\
\end{tabular}
Compare this with little endian byte order:
\begin{tabular}{l|c|c|c|c}
Byte:&1&2&3&4\\
\hline
Value:&\verb|be|&\verb|ba|&\verb|fe|&\verb|ca|\\
\end{tabular}
With the above explanation of byte ordering, the behavior of the following words should be clear.
\wordtable{
\vocabulary{io}
\ordinaryword{be>}{be> ( seq -- x )}
\ordinaryword{le>}{le> ( seq -- x )}
}
Converts a sequence of bytes in big or little endian order into an unsigned integer.
\wordtable{
\vocabulary{io}
\ordinaryword{>be}{>be ( x n -- string )}
\ordinaryword{>le}{>le ( x n -- string )}
}
Converts an integer into a string of $n$ bytes in big or little endian order. Truncation will occur if the integer is not in the range $[-2^{8n},2^{8n})$.
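These words combine with the default stream words to read and write fixed-width integers. The following sketch defines two hypothetical words, \texttt{write-be32} and \texttt{read-be32}, for 32-bit big endian integers. It assumes the value passed to \texttt{write-be32} is non-negative and fits in four bytes, and that at least four characters remain on the input stream:
\begin{verbatim}
: write-be32 ( x -- ) 4 >be write ;
: read-be32 ( -- x ) 4 read be> ;
\end{verbatim}
A round trip through a string stream (\ref{string-streams}) should then print \texttt{t}:
\begin{verbatim}
[ HEX: cafebabe write-be32 ] string-out
[ read-be32 ] string-in
HEX: cafebabe = .
\end{verbatim}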
\section{Reading and writing files}
Files are read and written in a standard way, by attaching a reader or writer stream to the file. It is vital that file streams are closed after all input/output operations have been performed; a convenient way is to use the \verb|with-stream| word (\ref{stdio}).
\glossary{name=file reader,
description=an input stream reading from a file}
\glossary{name=file writer,
description=an output stream writing to a file}
\wordtable{
\vocabulary{io}
\ordinaryword{<file-reader>}{<file-reader> ( path -- stream )}
}
Opens the file for reading and returns an input stream. An exception is thrown if the file could not be opened.
\wordtable{
\vocabulary{io}
\ordinaryword{<file-writer>}{<file-writer> ( path -- stream )}
}
Opens the file for writing and returns an output stream. An exception is thrown if the file could not be opened.
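A minimal sketch of writing a line to a file and reading it back follows. The file name \texttt{greeting.txt} is only an example, and the second line assumes that the string left on the stack by the quotation is still there after \verb|with-stream| returns:
\begin{verbatim}
"greeting.txt" <file-writer> [ "Hello world" print ] with-stream
"greeting.txt" <file-reader> [ readln ] with-stream print
\end{verbatim}
In both cases \verb|with-stream| closes the file stream when the quotation returns.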
The following set of words provide access to file system metadata.
\wordtable{
\vocabulary{io}
\ordinaryword{file-extension}{file-extension~( path -- string/f )}
}
Outputs the remainder of the file's name after the last occurrence of a period, or \texttt{f} if there is no extension.
\begin{alltt}
"world.takeover.plan.txt" file-extension .
\textbf{"txt"}
\end{alltt}
\wordtable{
\vocabulary{io}
\ordinaryword{exists?}{exists?~( path -- ?~)}
}
Tests if a file exists.
\wordtable{
\vocabulary{io}
\ordinaryword{directory?}{directory?~( path -- ?~)}
}
Tests if a file is a directory. Outputs \texttt{f} if the file does not exist.
\wordtable{
\vocabulary{io}
\ordinaryword{file-length}{file-length~( path -- n/f~)}
}
Outputs the size of the file, or \texttt{f} if it does not exist.
\wordtable{
\vocabulary{io}
\ordinaryword{stat}{stat~( path -- list~)}
}
Outputs a list of file system attributes, or \texttt{f} if the file does not exist. The elements of the list are the following:
\begin{description}
\item[Directory flag] a boolean
\item[Length] an integer
\item[Permissions] on Unix, a standard \texttt{chmod}-style permission bitmap
\item[Last modification time] milliseconds since midnight, January 1st 1970 GMT
\end{description}
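Since \texttt{directory?} outputs \texttt{f} for a path that does not exist, testing for an ordinary file requires checking \texttt{exists?} as well. A sketch, using the hypothetical name \texttt{regular-file?}:
\begin{verbatim}
: regular-file? ( path -- ? )
    ! Only a path that exists and is not a directory qualifies
    dup exists? [ directory? not ] [ drop f ] ifte ;
\end{verbatim}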
\section{TCP/IP networking}
\glossary{name=server stream,
description=a stream listening on a TCP/IP socket}
\glossary{name=client stream,
description=a bidirectional stream for an end-point of a TCP/IP connection}
\wordtable{
\vocabulary{io}
\ordinaryword{<client>}{<client>~( host port -- stream~)}
}
Connects to TCP/IP port number \texttt{port} on the host named by \texttt{host}, and returns a bidirectional stream. An exception is thrown if the connection attempt fails.
\wordtable{
\vocabulary{io}
\ordinaryword{<server>}{<server>~( port -- server~)}
}
Begins listening for connections to \texttt{port} on all network interfaces. An exception is thrown if the port cannot be opened. The returned object can be used as an input to the \texttt{stream-close} and \texttt{accept} words.
\wordtable{
\vocabulary{io}
\ordinaryword{accept}{accept~( server -- stream~)}
}
Waits for a connection to the port number that \texttt{server} is listening on, and outputs a bidirectional stream when the connection has been established. An exception is thrown if an error occurs.
\wordtable{
\vocabulary{io}
\ordinaryword{client-stream-host}{client-stream-host~( stream -- host~)}
\ordinaryword{client-stream-port}{client-stream-port~( stream -- port~)}
}
Outputs the IP address as a dotted-quad string, and the local port number, respectively, of a client socket returned from \texttt{accept}.
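The following sketch ties these words together. The hypothetical word \texttt{serve-once} accepts a single connection, echoes back the first line the client sends, and then stops listening. Error handling is omitted, and the sketch assumes the client sends at least one line:
\begin{verbatim}
: serve-once ( port -- )
    <server> dup accept
    ! Within the quotation, stdio is the client stream
    [ readln print flush ] with-stream
    ! Stop listening by closing the server
    stream-close ;
\end{verbatim}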
\section{Special streams}\label{special-streams}
These special streams wrap other streams.
\glossary{name=null stream,
description=a bidirectional stream that ignores output and returns end of file on input}
\glossary{name=duplex stream,
description=a bidirectional stream delegating to an input stream for input and an output stream for output}
\glossary{name=wrapper stream,
description=a bidirectional stream delegating to an underlying stream and providing a namespace where the delegated stream is the default stream}
\glossary{name=line stream,
description=an input stream that reads lines a character at a time from an underlying stream}
\wordtable{
\vocabulary{io}
\ordinaryword{<null-stream>}{<null-stream>~( -- stream~)}
}
Creates a null stream, which ignores output written to it, and returns end of file if an attempt is made to read.
\wordtable{
\vocabulary{io}
\ordinaryword{<duplex-stream>}{<duplex-stream>~( in out -- stream~)}
}
Creates a duplex stream. Writing to a duplex stream will write to \texttt{out}, and reading from a duplex stream will read from \texttt{in}. Closing a duplex stream closes both the input and output streams.
\wordtable{
\vocabulary{io}
\ordinaryword{<line-stream>}{<line-stream>~( stream -- stream~)}
}
A line stream provides an implementation of the \verb|stream-readln| generic word that reads lines a character at a time from the underlying stream. Lines are terminated by either \verb|\n|, \verb|\r| or \verb|\r\n|.
\wordtable{
\vocabulary{io}
\ordinaryword{<wrapper-stream>}{<wrapper-stream>~( stream -- stream~)}
}
Creates a stream wrapping \texttt{stream}. The given stream becomes the delegate of the new wrapper stream, so calling any stream operation on the wrapper passes it on to the delegate.
You can then define your own tuple class that delegates to a wrapper stream, then override methods on this new tuple class, and use the following combinator in your method definitions.
\wordtable{
\vocabulary{io}
\ordinaryword{with-wrapper}{with-wrapper~( stream quot -- ~)}
}
Executes the quotation in a dynamic scope where the \texttt{stdio} variable is set to the wrapped stream.
The following example implements a stream that emits \TeX\ markup when a certain attribute is set in the \texttt{attrs} parameter to \texttt{stream-format}.
\begin{verbatim}
USING: generic kernel lists stdio streams ;
TUPLE: tex-stream ;
C: tex-stream ( stream -- stream )
[ >r <wrapper-stream> r> set-delegate ] keep ;
M: tex-stream stream-format ( string attrs stream -- )
[
font-style swap assoc bold = [
"\textbf{" write write "}" write
] [
write
] ifte
] with-wrapper ;
\end{verbatim}
\section{Printing objects}
\glossary{name=prettyprinter,
description={a set of words for printing objects in readable form}}
One of Factor's key features is the ability to print almost any object in a readable form. This greatly aids debugging and provides the building blocks for light-weight object serialization facilities.
\subsection{The unparser}
The unparser provides a basic facility for turning certain types of objects into strings. A more general facility supporting more types is the prettyprinter (\ref{prettyprint}).
\glossary{
name=unreadable string,
description={a string which raises a parse error when parsed}}
\glossary{
name=readable form,
description={a readable form of an object is a string that parses to that object}}
\wordtable{
\vocabulary{unparser}
\genericword{unparse}{unparse~( object -- string~)}
}
Outputs a string representation of \texttt{object}. Only the following classes of objects are supported; for anything else, an unreadable string is output:
\begin{verbatim}
boolean
dll
number
sbuf
string
word
\end{verbatim}
A set of words is provided for converting integers into strings with various bases.
\wordtable{
\vocabulary{unparser}
\ordinaryword{>base}{>base~( n base -- string~)}
}
Converts \texttt{n} into a string representation in the given base. The base must be between 2 and 36, inclusive.
\wordtable{
\vocabulary{unparser}
\ordinaryword{>bin}{>bin~( n -- string~)}
\ordinaryword{>oct}{>oct~( n -- string~)}
\ordinaryword{>dec}{>dec~( n -- string~)}
\ordinaryword{>hex}{>hex~( n -- string~)}
}
Convenience words defined in terms of \texttt{>base} for converting integers into string representations in base 2, 8, 10 and 16, respectively.
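A few examples at the listener, with the output one would expect from the descriptions above:
\begin{alltt}
10 >bin print
\textbf{1010}
8 >oct print
\textbf{10}
255 2 >base print
\textbf{11111111}
\end{alltt}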
\subsection{The prettyprinter}\label{prettyprint}
\wordtable{
\vocabulary{prettyprint}
\ordinaryword{prettyprint}{prettyprint~( object --~)}
}
Prints the object using literal syntax that can be parsed back again. Even though the prettyprinter supports more classes of objects than \texttt{unparse}, it is still not a general serialization mechanism. The following restrictions apply:
\begin{itemize}
\item Not all objects print in a readable way. Namely, the following classes do not:
\begin{verbatim}
array
byte-array
displaced-alien
\end{verbatim}
\item Shared structure is not reflected in the printed output; if the output is parsed back in, fresh objects are created for all literal denotations.
\item Circular structure is not printed in a readable way. Circular references are printed as ``\texttt{\&}''.
\item Floating point numbers might not equal themselves after being printed and read, since a decimal representation of a float is inexact.
\end{itemize}
\wordtable{
\vocabulary{prettyprint}
\ordinaryword{.}{.~( object --~)}
}
Prettyprint the object, except all output is on a single line without indentation, and deeply-nested structure is not printed fully. This word is intended for interactive use at the listener.
\wordtable{
\vocabulary{prettyprint}
\ordinaryword{[.]}{[.]~( sequence --~)}
}
Prettyprint each element of the sequence on its own line using the \texttt{.} word.
\subsection{Variables controlling the prettyprinter}
The following variables affect the prettyprinter if set in the dynamic scope from which \texttt{prettyprint} is called.
\wordtable{
\vocabulary{prettyprint}
\symbolword{tab-size}
}
Specifies the indentation for recursive objects such as lists, vectors, hashtables and tuples. The default tab size is 4.
\wordtable{
\vocabulary{prettyprint}
\symbolword{prettyprint-limit}
}
Controls the maximum nesting depth. Printing structures that nest further than this will simply print ``\texttt{\#}''. If this is set to \texttt{f}, the nesting depth is unlimited. The default is \texttt{f}. Inside calls to \texttt{.}, set to 16, which translates to four levels of nesting with the default tab size.
\wordtable{
\vocabulary{prettyprint}
\symbolword{one-line}
}
If set to true, the prettyprinter does not emit newlines. The default is \texttt{f}. Inside calls to \texttt{.}, set to \texttt{t}.
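Such a variable can be bound for the duration of a single call by setting it in a nested scope. The following sketch assumes the \texttt{set} and \texttt{with-scope} words described with namespaces (\ref{namespaces}), with stack effects \texttt{( value variable -- )} and \texttt{( quot -- )} respectively. The list should then be printed on a single line:
\begin{verbatim}
[ t one-line set [ 1 [ 2 [ 3 ] ] ] prettyprint ] with-scope
\end{verbatim}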
\subsection{Extending the prettyprinter}
If you define your own data type and wish to add new syntax for it, you must implement two facilities:
\begin{itemize}
\item Parsing word(s) for reading your data type,
\item A prettyprinter method for printing your data type.
\end{itemize}
Parsing words are documented in \ref{parsing-words}.
\wordtable{
\vocabulary{prettyprint}
\genericword{prettyprint*}{prettyprint* ( indent object -- indent )}
}
Prettyprints the given object. Unlike \texttt{prettyprint}, this word does not emit a trailing newline, and the current indent level is given. This word is also generic, so you can add methods to have it print your own data types in a nice way.
The remaining words in this section are useful in the implementation of prettyprinter methods.
\wordtable{
\vocabulary{prettyprint}
\genericword{unparse.}{unparse.~( object -- )}
}
Prints the textual representation of an object as returned by \verb|unparse|. Generally \texttt{prettyprint*} is used instead; one important distinction with \verb|unparse.| is that if the given object is a parsing word, the output is not prefixed with \texttt{POSTPONE:}.
\wordtable{
\vocabulary{prettyprint}
\genericword{prettyprint-newline}{prettyprint-newline ( indent -- )}
}
Emits a newline followed by the given amount of indentation.
\wordtable{
\vocabulary{prettyprint}
\genericword{?prettyprint-newline}{?prettyprint-newline ( indent -- )}
}
If \texttt{one-line} is on, emits a space, otherwise, emits a newline followed by the given amount of indentation.
\wordtable{
\vocabulary{prettyprint}
\genericword{<prettyprint}{<prettyprint~( indent -- indent )}
}
Increases the indent level and emits a newline if \texttt{one-line} is off.
\wordtable{
\vocabulary{prettyprint}
\genericword{prettyprint>}{prettyprint>~( indent -- indent )}
}
Decreases the indent level and emits a newline if \texttt{one-line} is off.
\chapter{The parser}
This chapter concerns itself with reflective access to, and extension of, the parser. The parser algorithm and standard syntax are described in \ref{syntax}. Before the parser proper is documented, we draw attention to a set of words for parsing numbers. They are called by the parser, and are useful in their own right.
\section{Parsing numbers}\label{parsing-numbers}
\wordtable{
\vocabulary{parser}
\ordinaryword{str>number}{str>number~( string -- number )}
}
Attempts to parse the string as a number. An exception is thrown if the string does not represent a number in one of the following forms:
\begin{itemize}
\item An integer; see \ref{integer-literals}
\item A ratio; see \ref{ratio-literals}
\item A float; see \ref{float-literals}
\end{itemize}
In particular, complex numbers are parsed by the \verb|#{| and \verb|}#| parsing words, not by the number parser. To parse complex number literals, use the \texttt{parse} word (\ref{parsing-quotations}).
\wordtable{
\vocabulary{parser}
\ordinaryword{parse-number}{parse-number~( string -- number/f )}
}
Like \texttt{str>number}, except instead of raising an error, outputs \texttt{f} if the string is not a valid literal number.
\wordtable{
\vocabulary{parser}
\genericword{base>}{base>~( string base -- integer )}
}
Converts a string representation of an integer in the given base into an integer. Throws an exception if the string is not a valid representation of an integer.
\wordtable{
\vocabulary{parser}
\ordinaryword{bin>}{bin>~( string -- integer )}
\ordinaryword{oct>}{oct>~( string -- integer )}
\ordinaryword{dec>}{dec>~( string -- integer )}
\ordinaryword{hex>}{hex>~( string -- integer )}
}
Convenience words defined in terms of \texttt{base>} for parsing integers in base 2, 8, 10 and 16, respectively.
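A few examples at the listener, with the output one would expect from the descriptions above:
\begin{alltt}
"deadbeef" hex> .
\textbf{3735928559}
"1010" bin> .
\textbf{10}
"102" 3 base> .
\textbf{11}
"hello" parse-number .
\textbf{f}
\end{alltt}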
\section{Parsing quotations}\label{parsing-quotations}
As documented in \ref{vocabsearch}, the parser looks up words in the vocabulary search path. New word definitions are added to the current vocabulary. These two parameters are stored in a pair of variables (\ref{namespaces}):
\begin{description}
\item[\texttt{"use"}] the vocabulary search path; a list of strings
\item[\texttt{"in"}] the current vocabulary; a string
\end{description}
\wordtable{
\vocabulary{parser}
\genericword{parse}{parse~( string -- list )}
}
Parses the string and outputs a quotation. The vocabulary search path and current vocabulary are taken from the current scope.
\begin{alltt}
"1 2 3" parse .
\textbf{[ 1 2 3 ]}
\end{alltt}
\wordtable{
\vocabulary{parser}
\genericword{eval}{eval~( string -- )}
}
Parses a string then calls the resulting quotation.
\begin{alltt}
"2 2 + ." eval
\textbf{4}
\end{alltt}
The \texttt{eval} word is defined as follows:
\begin{verbatim}
: eval parse call ;
\end{verbatim}
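Because \texttt{parse} outputs an ordinary quotation, the result can be inspected before it is called. In this sketch, the first \texttt{.} should print \verb|[ 2 2 + ]| and the second should print \texttt{4}:
\begin{verbatim}
"2 2 +" parse dup . call .
\end{verbatim}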
\section{Parsing from streams}
Words for parsing input from streams use the following initial values for the \texttt{"use"} and \texttt{"in"} variables:
\begin{description}
\item[\texttt{"use"}] \texttt{[ "scratchpad" "syntax" ]}
\item[\texttt{"in"}] \texttt{"scratchpad"}
\end{description}
\wordtable{
\vocabulary{parser}
\genericword{parse-stream}{parse-stream~( name stream -- list )}
}
Parses lines of text from the stream and outputs a quotation. The \texttt{name} parameter identifies the stream in error messages. The stream is closed when the end is reached.
\wordtable{
\vocabulary{parser}
\genericword{parse-file}{parse-file~( path -- list )}
}
Parses the contents of a file and outputs a quotation. Defined as follows:
\begin{verbatim}
: parse-file dup <file-reader> parse-stream ;
\end{verbatim}
\wordtable{
\vocabulary{parser}
\genericword{run-file}{run-file~( path -- )}
}
Parses the contents of a file and calls the resulting quotation. Defined as follows:
\begin{verbatim}
: run-file parse-file call ;
\end{verbatim}
\section{Parsing words}\label{parsing-words}
\parsingwordglos
Parsing words execute at parse time, and therefore can access and modify the state of the parser, as well as add objects to the parse tree. Parsing words are a difficult concept to grasp, so this section has several examples and explains the workings of some of the parsing words provided in the library.
To define a parsing word, suffix the colon definition with the \texttt{parsing} word.
\wordtable{
\vocabulary{syntax}
\parsingword{parsing}{parsing}
}
Marks the most recently defined word as a parsing word. For example:
\begin{verbatim}
: hello "Hello world" print ; parsing
\end{verbatim}
Now writing \texttt{hello} anywhere will print the message \texttt{"Hello world"} at parse time. Of course, this is a useless definition. In the sequel, we will look into writing useful parsing words that modify parser state.
\subsection{Nested structure}
The first thing to look at is how the parse tree is built. When parsing begins, the empty list is pushed on the data stack; whenever the parser algorithm appends an object to the parse tree, it conses the object onto the quotation at the top of the stack. This builds the quotation in reverse order, so when parsing is done, the quotation is reversed before it is called.
Let's look at a simple example; the parsing of \texttt{"1 2 3"}:
\begin{tabular}{l|l|l}
\hline
Token&Stack before&Stack after\\
\hline
\verb|1|&\verb|[ ]|&\verb|[ 1 ]|\\
\verb|2|&\verb|[ 1 ]|&\verb|[ 2 1 ]|\\
\verb|3|&\verb|[ 2 1 ]|&\verb|[ 3 2 1 ]|
\end{tabular}
Once the end of the string has been reached, the quotation is reversed, and the output, as you would expect, is \verb|[ 1 2 3 ]|.
Nested structure is a bit more involved. The basic idea is that parsing words can push an empty list on the stack, then all subsequent tokens are consed onto this quotation, until another parsing word adds this quotation to the quotation underneath.
The following definitions of the \verb|[| and \verb|]| parsing words illustrate the idiom:
\begin{verbatim}
: [ f ; parsing
: ] reverse swons ; parsing
\end{verbatim}
Let us look at how the following string parses:
\begin{verbatim}
"1 [ 2 3 ] 4"
\end{verbatim}
\begin{tabular}{l|l|l|l}
\hline
Token&Stack before&Stack after&Note\\
\hline
\verb|1|&\verb|[ ]|&\verb|[ 1 ]|&\\
\textbf{\texttt{[}}&\verb|[ 1 ]|&\verb|[ 1 ] [ ]|&pushes an empty list\\
\verb|2|&\verb|[ 1 ] [ ]|&\verb|[ 1 ] [ 2 ]|&\\
\verb|3|&\verb|[ 1 ] [ 2 ]|&\verb|[ 1 ] [ 3 2 ]|&\\
\textbf{\texttt{]}}&\verb|[ 1 ] [ 3 2 ]|&\verb|[ [ 2 3 ] 1 ]|&calls \verb|reverse swons|\\
\verb|4|&\verb|[ [ 2 3 ] 1 ]|&\verb|[ 4 [ 2 3 ] 1 ]|&
\end{tabular}
Now, the parser reverses the original quotation, and the resulting output is clear:
\begin{verbatim}
[ 1 [ 2 3 ] 4 ]
\end{verbatim}
Data types such as vectors, hashtables and so on are built in a similar way. For example, the vector parsing words are defined thus:
\begin{verbatim}
: { f ; parsing
: } reverse >vector swons ; parsing
\end{verbatim}
Indeed, any type of object can be added to the parse tree in this fashion.
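As a sketch of the same idiom applied to new syntax, the hypothetical parsing words \verb|R[| and \verb|]R| below denote a list literal whose elements appear in reverse order. Since elements are consed onto the front of the accumulated list, omitting the call to \texttt{reverse} is all that is needed:
\begin{verbatim}
: R[ f ; parsing
: ]R swons ; parsing
\end{verbatim}
With these definitions, \verb|R[ 1 2 3 ]R| should parse to the same literal as \verb|[ 3 2 1 ]|.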
\subsection{Reading ahead}\label{reading-ahead}
\glossary{name=reading ahead,
description=a parsing word reads ahead if it scans following tokens from the input string}
The next idiom to look at is parsing words that read ahead.
\wordtable{
\vocabulary{parser}
\ordinaryword{scan}{scan ( -- string )}
}
Outputs the next token as a string, or \texttt{f} if the end of the input has been reached. Advances the parser state to after this token.
\wordtable{
\vocabulary{parser}
\ordinaryword{scan-word}{scan-word ( -- word )}
}
Reads the next token from the input and looks up a word with this name. If the lookup fails, attempts to parse the word as a number by calling \verb|str>number|. Outputs \verb|f| if the end of input has been reached. There is no confusion with the \verb|f| literal here, since the latter is read as the \verb|f| parsing word.
The first example is the \verb|HEX:| word, documented in \ref{integer-literals}. This word is defined so that the following two lines are equivalent:
\begin{verbatim}
HEX: deadbeef
3735928559
\end{verbatim}
It is defined in terms of a lower-level \texttt{(BASE)} word that takes the numerical base on the data stack, reads the next token from the string, then calls \texttt{base>} (\ref{parsing-numbers}):
\begin{verbatim}
: (BASE) ( base -- ) scan swap base> swons ;
: HEX: 16 (BASE) ; parsing
\end{verbatim}
The key word here is \texttt{scan}.
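Other bases can be supported the same way. As a sketch, a hypothetical \texttt{TERNARY:} parsing word for base 3 literals could be defined as follows, assuming \texttt{(BASE)} is visible in the vocabulary search path:
\begin{verbatim}
: TERNARY: 3 (BASE) ; parsing
\end{verbatim}
With this definition, \texttt{TERNARY: 102} would be equivalent to the literal \texttt{11}.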
The next example of a parsing word we will look at is the \verb|\| word. It is used to insert a word literally in a quotation, that is, push it on the stack during evaluation, so that the following two lines are equivalent:
\begin{verbatim}
\ <vector> execute
<vector>
\end{verbatim}
The implementation of the \verb|\| parsing word reads the next token from the input stream using \verb|scan-word|.
It then uses the \verb|literalize| word to turn the word into an object that pushes the word
on the stack, and then appends this to the parse tree:
\begin{verbatim}
: \ scan-word literalize swons ; parsing
\end{verbatim}
\wordtable{
\vocabulary{words}
\ordinaryword{literalize}{literalize ( object -- object )}
}
Turns non-self-evaluating objects (words and wrappers) into wrappers that push those objects, and is a no-op on everything else. This word is generic (\ref{generic}), with three trivial methods:
\begin{verbatim}
GENERIC: literalize ( object -- object )
M: object literalize ;
M: wrapper literalize <wrapper> ;
M: word literalize <wrapper> ;
\end{verbatim}
\wrapglos
Instances of the \verb|wrapper| class hold a reference to a single object. When the evaluator encounters a wrapper, it pushes the wrapped object on the data stack.
\wordtable{
\vocabulary{kernel}
\ordinaryword{wrapped}{wrapped ( wrapper -- object )}
}
Outputs the object wrapped by the wrapper. This word is used in the implementation of the \verb|wrapper| method of the \verb|prettyprint*| generic word:
\begin{verbatim}
M: wrapper prettyprint*
dup wrapped word? [
\ \ unparse. bl wrapped unparse.
] [
\ W[ unparse. bl wrapped prettyprint* \ ]W unparse.
] ifte ;
\end{verbatim}
The somewhat more verbose \verb|W[ ... ]W| syntax is only part of the language for completeness, to handle the corner case where a wrapper wrapping another wrapper is printed out and read back in by the parser.
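As an illustration of how these words fit together, the following sketch wraps a word and unwraps it again. It is meant to be entered interactively; the stack comments are expectations derived from the definitions above, not a transcript of a session:
\begin{verbatim}
\ dup        ! pushes the word dup itself
literalize   ! ( word -- wrapper )
wrapped      ! ( wrapper -- word )
\end{verbatim}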
\subsection{Defining words}
\definingwordglos
Defining words add definitions to the dictionary without modifying the parse tree.
The first example to look at is the \verb|SYMBOL:| word. It reads the next token from the input stream, creates a word with that name, and makes it a symbol (\ref{symbols}). The next
example is the common \verb|:| word, which creates a colon definition. First, it reads the
name of the new word, then the definition is built up until \verb|;|. The latter
example will demonstrate building nested structure in defining words.
First, let us look at the \verb|SYMBOL:| word (\ref{symbols}).
\begin{verbatim}
: SYMBOL: CREATE define-symbol ; parsing
\end{verbatim}
The key factor in the above definition is \verb|CREATE|, which reads a token from the input and creates a word with that name. This word is then passed to \verb|define-symbol|.
\wordtable{
\vocabulary{parser}
\ordinaryword{CREATE}{CREATE ( -- word )}
}
Reads the next token from the input and creates a word in the current vocabulary with that name. It uses \verb|create-in| to do this (\ref{creating-words}).
The definition of \verb|:| introduces the next idiom, and that is building a quotation and then adding a definition using \verb|;|.
\begin{verbatim}
: :
CREATE [ define-compound ] [ ]
"in-definition" on ; parsing
\end{verbatim}
The factors of the word are, in order:
\begin{description}
\item[\texttt{CREATE}] reads the following token and pushes a new word on the stack,
\item[\texttt{[ define-compound ]}] a quotation to be called by \verb|;|,
\item[\texttt{[ ]}] an empty list that the parser will build the colon definition on,
\item[\texttt{"in-definition" on}] sets a flag that subsequent parsing words can query.
\end{description}
While \verb|:| is very specific, \verb|;| is quite general because it takes a quotation pushed by a previous parsing word. You can use \verb|;| in your own parsing words.
\wordtable{
\parsingword{;}{;~( definer parsed -- )}
\texttt{definer:~parsed --}\\
}
Reverses the \verb|parsed| quotation, and passes it as input to the \verb|definer| quotation.
The definition of this word is in some sense dual to \verb|:| even though it is more general:
\begin{verbatim}
: ; "in-definition" off reverse swap call ; parsing
\end{verbatim}
Suppose we are parsing the following string:
\begin{verbatim}
: sq dup * ;
\end{verbatim}
We can trace the parsing as before.
\begin{tabular}{l|l|l}
\hline
Token&Stack after&Note\\
\hline
\verb|:|&\verb|[ ] sq [ define-compound ] [ ]|&reads the next token\\
\verb|dup|&\verb|[ ] sq [ define-compound ] [ dup ]|\\
\verb|*|&\verb|[ ] sq [ define-compound ] [ * dup ]|&\\
\verb|;|&\verb|[ ]|&reverses and defines
\end{tabular}
The call to the \verb|;| word proceeds as follows:
\begin{description}
\item[\texttt{"in-definition" off}] this variable was switched on by \verb|:|.
\item[\texttt{reverse}] reverses \verb|[ * dup ]| yielding \verb|[ dup * ]|.
\item[\texttt{swap call}] calls \texttt{[ define-compound ]}. Thus, \verb|define-compound| is called to define \verb|sq| as the quotation \verb|[ dup * ]|.
\end{description}
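Because \verb|;| only needs a definer quotation and a list to accumulate into, other defining words can reuse it. The following is a sketch of a hypothetical \verb|LIST:| word, which is not part of the library, that defines a word pushing a literal list. It uses only \verb|CREATE|, \verb|swons| and \verb|define-compound| as they appear above, and assumes that \verb|define-compound| takes the word followed by the quotation, as the trace suggests:
\begin{verbatim}
! Hypothetical defining word; a sketch, not library code.
! The definer receives ( word list ), wraps the list in a
! one-element quotation, and defines the word with it.
: LIST:
    CREATE [ f swap swons define-compound ] [ ] ; parsing
\end{verbatim}
Under these assumptions, \verb|LIST: primes 2 3 5 7 ;| would define \verb|primes| as a word that pushes the list \verb|[ 2 3 5 7 ]|. Unlike \verb|:|, this sketch does not switch the \verb|"in-definition"| flag on, so parsing words that query the flag would not treat the body as a definition.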
\subsection{String mode and parser variables}\label{string-mode}
\stringmodeglos
String mode allows custom parsing of tokenized input. For even more esoteric situations, the input text can be accessed directly.
String mode is controlled by the \verb|string-mode| variable.
\wordtable{
\vocabulary{parser}
\symbolword{string-mode}
}
When enabled, the parser adds tokens to the parse tree as strings. This creates a paradox because further parsing words are not executed while string mode is on. However, if the token \verb|";"| is read, there is a special case that calls the \verb|;| parsing word. This parsing word reverses the quotation at the top of the stack, and calls the quotation underneath it, as usual.
An illustration of this idiom is found in the \verb|USING:| parsing word. It reads a list of vocabularies, terminated by \verb|;|. However, the vocabulary names do not name words, except by coincidence; so string mode is used to read them.
\begin{verbatim}
: USING:
string-mode on [
string-mode off [ use+ ] each
] [ ] ; parsing
\end{verbatim}
Make note of the quotation that is left in position for \verb|;| to call. It switches off string mode, so that normal parsing can resume, then adds the given vocabularies to the search path.
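The same pattern works for any parsing word that wants its arguments as raw strings. As a sketch, a hypothetical \verb|STRINGS:| word, which is not part of the library, could collect the tokens up to \verb|;| and add them to the parse tree as a single list of strings:
\begin{verbatim}
! Hypothetical parsing word; a sketch, not library code.
! The quotation left for ; receives the list of tokens,
! switches string mode off, and conses the list onto the
! parse tree.
: STRINGS:
    string-mode on [
        string-mode off swons
    ] [ ] ; parsing
\end{verbatim}
With this definition, \verb|STRINGS: foo bar ;| would be expected to read as the list \verb|[ "foo" "bar" ]|.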
If the parser features described in the earlier sections are still insufficient, you can directly access a pair of variables holding parser state:
\begin{description}
\item[\texttt{"line"}] the text being parsed,
\item[\texttt{"col"}] the column number.
\end{description}
The \verb|"col"| variable is implicitly changed by the \verb|scan| word (\ref{reading-ahead}), and by the following word.
\wordtable{
\vocabulary{parser}
\ordinaryword{until-eol}{until-eol ( -- string )}
}
Outputs the remainder of the line being parsed. The \verb|"col"| variable is set to point to the end of the line.
This word is used to implement end-of-line comments:
\begin{verbatim}
: ! until-eol drop ; parsing
\end{verbatim}
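Instead of being dropped, the rest of the line can be kept. The following sketch defines a hypothetical \verb|LINE:| word, which is not part of the library, that adds the remainder of the line to the parse tree as a string literal. Whether the string includes the whitespace directly after the parsing word is not specified here and should be checked before relying on it:
\begin{verbatim}
! Hypothetical parsing word; a sketch, not library code.
: LINE: until-eol swons ; parsing
\end{verbatim}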
\chapter{UI framework}\label{ui}
\begin{itemize}
\item An object on the screen is a ``gadget''
\item Every gadget can have child gadgets
\item The world gadget is the parent of all other gadgets
\item Everything is drawn on an SDL surface.
\item The hand gadget holds mouse and keyboard focus state.
\end{itemize}
\section{Low-level graphics rendering}
\begin{itemize}
\item SDL - with-screen, with-surface, make-rect, with-pixels, rgb, make-color, surface, width, height, bpp
\item keysyms: modifiers and keysyms hash
\item SDL\_ttf
\item SDL\_gfx
\end{itemize}
\section{Gadgets}
\begin{itemize}
\item rectangles
\item intersection, union, other stuff, inside?
\item co-ordinates are 3-vectors
\end{itemize}
\subsection{Hierarchy}
\begin{itemize}
\item Words for adding, removing gadgets
\item Layout protocol - relayout, layout*, pref-dim
\item pick-up
\end{itemize}
\subsection{Painting}
\begin{itemize}
\item Paint properties, are inherited from the parent
\item paint-prop set-paint-prop paint-prop*
\item Interior fill
\item Boundary fill
\end{itemize}
\subsection{Solid interior and boundary}
\begin{itemize}
\item uses foreground, background paint-props
\item reverse video and rollover
\end{itemize}
\subsubsection{Bevel boundary}
\begin{itemize}
\item width specified in bevel tuple
\item uses bevel-1, bevel-2, reverse-video paint-props
\end{itemize}
\subsubsection{Gradient interior}
\begin{itemize}
\item from/to color specified in tuple
\end{itemize}
\subsection{Gestures}
\begin{itemize}
\item Gestures table
\item Mouse, keyboard, focus gestures
\item Custom gestures
\end{itemize}
\section{Basic gadgets}
\subsection{Labels and text rendering}
\begin{itemize}
\item A label is a piece of uneditable text
\item \verb|<label>|
\item \verb|lookup-font|
\item \verb|gadget-font|
\item \verb|label-size|
\item \verb|size-string|
\item \verb|draw-string|
\end{itemize}
\subsection{Borders}
\begin{itemize}
\item Borders position a child inside a delegate, padded by a certain amount at the top and bottom
\item empty-border
\item line-border
\item bevel-border
\end{itemize}
\subsection{Buttons}
\begin{itemize}
\item \verb|button-actions|
\item \verb|<button>|
\end{itemize}
\subsection{Pack layouts}
\begin{itemize}
\item Pack layouts arrange gadgets along an axis
\item Align and fill parameters
\item \verb|<pile>|, \verb|<line-pile>|, \verb|<shelf>|, \verb|<line-shelf>|, \verb|<stack>|, \verb|<pack>|
\end{itemize}
\subsubsection{Incremental layout}
\begin{itemize}
\item General idea, when it can be used
\item \verb|<incremental>|
\item \verb|add-incremental|
\item \verb|clear-incremental|
\end{itemize}
\subsection{Frame layout}
\begin{itemize}
\item \verb|<frame>|
\item \verb|add-left|, \verb|add-top|, \verb|add-bottom|, \verb|add-right|, \verb|add-center|
\end{itemize}
\section{Composed gadgets}
\subsection{Editors}
\subsection{Menus}
\begin{itemize}
\item \verb|<menu>|, \verb|show-menu|
\end{itemize}
\subsection{Presentations and commands}
\begin{itemize}
\item Presentations show a menu of commands when clicked
\item \verb|<presentation>|
\item \verb|define-command|
\item \verb|applicable|
\end{itemize}
\subsection{Scrolling}
\begin{itemize}
\item Scrollers are scrolled via a pair of sliders, or the mouse wheel
\item \verb|<scroller>|
\item \verb|scroll|
\item \verb|scroll>bottom|
\end{itemize}
\subsection{Panes}\label{panes}
\begin{itemize}
\item A pane is a text input and output stream gadget displaying styled text and presentations.
\item \verb|<pane>|
\item \verb|pane-clear|
\item panes are bidirectional streams
\item panes are wrapped in scrollers
\end{itemize}
\subsection{Splitters}
\begin{itemize}
\item A splitter displays two components side-by-side, separated with a draggable divider.
\item \verb|<splitter>|, \verb|<x-splitter>|, \verb|<y-splitter>|
\end{itemize}
\subsection{Books}
\begin{itemize}
\item A book lets you flick through pages.
\item \verb|<book>|
\item \verb|show-page|
\item \verb|first-page|, \verb|prev-page|, \verb|next-page|, \verb|last-page|
\item \verb|book-buttons|
\item \verb|<book-browser>|
\end{itemize}
\chapter{Web framework}
Factor includes facilities for interoperating with web-based services. This includes an HTTP client, and an HTTP server with a continuation-based web application framework.
\section{HTTP client}
\wordtable{
\vocabulary{http-client}
\ordinaryword{http-get}{http-get ( url -- code headers stream )}
}
Connects to the server specified in the URL, and makes a \verb|GET| request to retrieve that resource.
\wordtable{
\vocabulary{http-client}
\ordinaryword{http-post}{http-post ( type content url -- code headers stream )}
}
Attempts to connect to the server specified in the URL, and makes a \verb|POST| request with the specified content type and content. The content is automatically URL-encoded for you.
With both words, the output values are as follows:
\begin{description}
\item[\texttt{code}] an integer with the HTTP response code; for example, 404 denotes ``file not found'' whereas 200 means ``success''.
\item[\texttt{headers}] an association list of returned headers.
\item[\texttt{stream}] a stream for reading the resource.
\end{description}
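As a minimal sketch, the following phrase requests the vocabulary browser page from a locally running Factor HTTP server (\ref{httpd}). The URL assumes that a server is listening on port 8888; any other reachable URL can be substituted:
\begin{verbatim}
USING: http-client ;
"http://localhost:8888/responder/browser/" http-get
! The stack now holds, from bottom to top:
!   code headers stream
\end{verbatim}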
The following pair of words convert a string to and from its URL-encoded form.
\wordtable{
\vocabulary{http}
\ordinaryword{url-encode}{url-encode ( string -- string )}
\ordinaryword{url-decode}{url-decode ( string -- string )}
}
These words are called automatically by much of the web framework; however, it is sometimes useful to call them directly.
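A direct call might look like the following sketch. The input string is arbitrary, and the exact encoded form is deliberately not shown, since it depends on which characters the implementation chooses to escape:
\begin{verbatim}
USING: http ;
"name=J. Random Hacker" url-encode
! leaves the URL-encoded form of the string on the stack
url-decode
! should leave the original string on the stack again
\end{verbatim}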
\section{HTTP server}\label{httpd}
The HTTP server listens for requests on a port and hands them off to responders. A responder takes action based on the type of request.
\wordtable{
\vocabulary{httpd}
\ordinaryword{httpd}{httpd ( port -- )}
}
Starts listening for HTTP requests on \verb|port|.
The HTTP server is usually started with a phrase like the following:
\begin{alltt}
USING: httpd threads ;
[ 8888 httpd ] in-thread
\end{alltt}
\wordtable{
\vocabulary{httpd}
\ordinaryword{stop-httpd}{stop-httpd ( -- )}
}
Stops the HTTP server.
A useful application of the HTTP server is the built-in vocabulary browser. You can use it simply by starting the HTTP server then visiting the following location:
\begin{verbatim}
http://localhost:8888/responder/browser/
\end{verbatim}
\subsection{Serving static content}
Static content may be served by setting the \verb|"doc-root"| variable to a directory holding the content. This variable may be set in the global namespace, or indeed, individually on each virtual host.
\begin{verbatim}
"/var/www/" "doc-root" set
\end{verbatim}
If a directory holds an \verb|index.html| file, the file is served when the directory is requested, otherwise a directory listing is produced. The directory listing references icons sent via the resource responder. The icons are located in the Factor source tree, and the \verb|"resource-path"| variable may be set to the root of the source tree in order for the icons to be located:
\begin{verbatim}
"/home/slava/work/Factor/" "resource-path" set
\end{verbatim}
A facility for ad-hoc server-side scripting exists. If a file with the \verb|.factsp| filename extension is requested, the file is run with \verb|run-file| and any output it sends to the default stream is sent to the client (\ref{stdio}). These ``Factor server pages'' are slower and less powerful than responders, so it is recommended that responders be used instead.
A different static site can be associated with each virtual host by setting the \verb|"doc-root"| variable in each virtual host (\ref{vhosts}).
\subsection{Responders}
\glossary{name=responder,
description={A named handler for HTTP requests, installed in the \texttt{responders} variable}}
\glossary{name=HTTP responder,
description={See responder}}
The HTTP server listens on a port number for HTTP requests and issues requests to \emph{responders}. The following form of request is understood specially by the HTTP server:
\begin{verbatim}
http://myhost/responder/foo/bar
\end{verbatim}
Such a request results in the \verb|foo| responder being invoked with the \verb|bar| argument. Requesting a path that does not begin with \verb|/responder| simply invokes the file responder with that path, as documented in the previous section.
\subsubsection{Managing responders}
\wordtable{
\vocabulary{httpd}
\symbolword{responders}
}
Responders are located in a hashtable stored in the \verb|responders| variable. This variable must be in scope when the \verb|httpd| word is invoked. This is usually the case, as the global namespace includes a default value filled out with all responders that are included as part of the library.
The following words manage the set of installed responders:
\wordtable{
\vocabulary{httpd}
\ordinaryword{set-default-responder}{set-default-responder ( name -- )}
}
Sets the default responder, that is, the one handling requests not prefixed by \verb|/responder|, to the responder \verb|name|. The initial value for the default responder is \verb|"file"|, identifying the file responder.
\wordtable{
\vocabulary{httpd}
\ordinaryword{add-responder}{add-responder ( responder -- )}
}
Adds a responder to the hashtable stored in the \verb|responders| variable that is in scope.
\subsubsection{Developing a responder}
A responder is a hashtable where the following keys must be set:
\begin{description}
\item[\texttt{"responder"}] The name of the responder
\item[\texttt{"get"}] A quotation with stack effect \verb|( resource -- )|, invoked when a client requests the path \texttt{/responder/\emph{name}/\emph{resource}} using the \texttt{GET} method
\item[\texttt{"post"}] A quotation with stack effect \verb|( resource -- )|, invoked when a client requests the path using the \texttt{POST} method
\item[\texttt{"head"}] A quotation with stack effect \verb|( resource -- )|, invoked when a client requests the path using the \texttt{HEAD} method
\end{description}
The quotations are called by the HTTP server in the dynamic scope of the responder, with the following additional variables set:
\begin{description}
\item[\texttt{"method"}] The HTTP method requested by the client; one of \texttt{"get"}, \texttt{"post"} or \texttt{"head"}
\item[\texttt{"request"}] The full URL requested by the client, sans any query parameters trailing a \verb|?| character
\item[\texttt{"query"}] An association list of URL-decoded string pairs, obtained by splitting the query string, if any
\item[\texttt{"raw-query"}] The raw query string. Usually not used
\item[\texttt{"header"}] An association list of URL-decoded string pairs, obtained from the headers sent by the client
\item[\texttt{"response"}] Only set in the \verb|POST| method, an association list of string pairs, obtained from the response sent by the client
\end{description}
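Putting the pieces together, a responder can be sketched as a hashtable literal passed to \verb|add-responder|. The name \verb|hello| is hypothetical, and the quotations here merely discard the resource argument; a real responder would write a response to the client, which this sketch does not attempt. The hashtable literal syntax is the one used for virtual hosts in \ref{vhosts}:
\begin{verbatim}
USING: httpd kernel ;
! Sketch of a do-nothing responder named "hello".
! Each quotation has stack effect ( resource -- ).
{{
    [[ "responder" "hello" ]]
    [[ "get"  [ drop ] ]]
    [[ "post" [ drop ] ]]
    [[ "head" [ drop ] ]]
}} add-responder
\end{verbatim}
Once installed, a \verb|GET| request for \verb|/responder/hello/world| would invoke the \verb|"get"| quotation with the resource \verb|world| on the stack.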
\subsection{Virtual hosts}\label{vhosts}
\glossary{name=virtual hosting,
description={A technique where a server is given several host name aliases via DNS; then the server serves different web sites, depending on the particular host name alias requested by the client}}
Factor's HTTP server supports virtual hosting. This is done with an additional layer of indirection, where each virtual host has its own set of responders. Virtual hosting is optional; by default, all virtual hosts share the same set of responders.
\wordtable{
\vocabulary{httpd}
\symbolword{vhosts}
}
Virtual hosts are defined in a hashtable mapping virtual host names to hashtables of responders. An initial value for this variable is defined in the global namespace; it provides an empty default virtual host. The default virtual host is the value associated with the \verb|"default"| key.
When a client makes a request, the HTTP server first searches for a responder in the requested virtual host; if no responder is found there, the default virtual host is checked. If this also fails, the global \verb|responders| variable is tested.
As an example, suppose the machine running the HTTP server has two DNS aliases, \verb|porky.ham.net| and \verb|spam.ham.net|. The following setup serves a different static web site from each virtual host, while all other responders are taken from the global table and are shared between the two hosts:
\begin{verbatim}
vhosts get [
"porky.ham.net" {{
[[ "doc-root" "/var/www/porky/" ]]
}} set
"spam.ham.net" {{
[[ "doc-root" "/var/www/spam/" ]]
}} set
] bind
\end{verbatim}
\section{HTML streams}\label{html}
An HTML stream wraps an existing stream. Strings written to the HTML stream have their special characters converted to HTML entities via the \verb|chars>entities| word documented below. In addition, the \texttt{attrs} parameter to the \texttt{stream-format} word is inspected for style attributes that direct the stream to wrap the output in various HTML tags; see \ref{styles}.
\wordtable{
\vocabulary{html}
\ordinaryword{with-html-stream}{with-html-stream ( quot -- )}
}
Calls the quotation in a new dynamic scope. The \texttt{stdio} variable is set to an HTML stream wrapping the previous value of \texttt{stdio}, so calls to \texttt{write}, \texttt{format} and \texttt{print} go through the HTML stream.
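For example, the following sketch prints a string containing markup characters through an HTML stream. It assumes the \verb|html| vocabulary and the vocabulary defining \verb|print| are in the search path:
\begin{verbatim}
[ "<b>not bold</b>" print ] with-html-stream
\end{verbatim}
Given the entity conversion described above, the underlying stream should receive the following text, which a browser displays literally rather than as bold text:
\begin{verbatim}
&lt;b&gt;not bold&lt;/b&gt;
\end{verbatim}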
\wordtable{
\vocabulary{html}
\ordinaryword{html-document}{html-document ( title quot -- )}
}
Builds on \texttt{with-html-stream} to emit the basic structure of an HTML document, consisting of \texttt{<html>}, \texttt{<head>} and \texttt{<body>} tags. The title is output in two places; a \texttt{<title>} tag and \texttt{<h1>} tag.
\wordtable{
\vocabulary{html}
\ordinaryword{simple-html-document}{simple-html-document ( title quot -- )}
}
Like \texttt{html-document}, except the output is wrapped inside a \texttt{<pre>} tag.
\wordtable{
\vocabulary{html}
\ordinaryword{chars>entities}{chars>entities ( string -- string )}
}
Converts various special characters in the input string (or any sequence of characters) into corresponding HTML entities. The following characters are converted:
\begin{tabular}{l|l}
Character&Entity\\
\hline
\verb|<| &\verb|&lt;|\\
\verb|>| &\verb|&gt;|\\
\verb|&| &\verb|&amp;|\\
\verb|'| &\verb|&apos;|\\
\verb|"| &\verb|&quot;|
\end{tabular}
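As a short illustration of the table, assuming the \verb|html| vocabulary is in the search path:
\begin{verbatim}
"Tom & Jerry" chars>entities
! leaves "Tom &amp; Jerry" on the stack
\end{verbatim}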
\chapter{Alien interface}
Factor's alien interface provides a means of directly calling native libraries written in C and other languages. There are no
wrappers to write, other than having to specify the return type and parameter types for
the functions you wish to call.
\section{Loading native libraries}
A native library must be made available to Factor under a logical name before use. This is done via command line parameters, or the \verb|add-library| word.
The following two command line parameters can be specified for each library to load; the second parameter is optional.
\begin{description}
\item[\texttt{-libraries:\emph{logical}:name=\emph{name}}] associates a logical name with a system-specific native library name,
\item[\texttt{-libraries:\emph{logical}:abi=\emph{type}}] specifies the calling convention to use; \verb|type| is either \verb|cdecl| or \verb|stdcall|. If not specified, the default is \verb|cdecl|. On Unix, all libraries follow the \verb|cdecl| convention. On Windows, most libraries (but not all) follow \verb|stdcall|.
\end{description}
For example:
\begin{alltt}
\textbf{\$} ./f factor.image -libraries:sdl:name=libSDL-1.2.so
\end{alltt}
Another option is to add libraries while Factor is running.
\wordtable{
\vocabulary{alien}
\ordinaryword{add-library}{add-library ( library name abi -- )}
}
Adds a logical library named \verb|library|. The underlying shared library name is \verb|name|, and the calling convention is \verb|abi|, which must be either \verb|"cdecl"| or \verb|"stdcall"|.
For example:
\begin{alltt}
"kernel32" "kernel32.dll" "stdcall" add-library
\end{alltt}
The next word is used in the implementation of the alien interface, and it can also be used
interactively to test if a library can be loaded.
\wordtable{
\vocabulary{alien}
\ordinaryword{load-library}{load-library ( library -- dll )}
}
Attempts to load the library with the given logical name, and outputs a DLL handle. If the library is already loaded, the existing DLL is output.
More will be said about DLL handles in \ref{alien-internals}.
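For instance, assuming a logical library named \verb|sdl| has been registered as in the command line example above, the following phrase attempts to load it and leaves a DLL handle on the stack:
\begin{verbatim}
"sdl" load-library
\end{verbatim}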
\section{Calling native functions}
Native function bindings are established using a pair of parsing words.
\wordtable{
\vocabulary{alien}
\parsingword{LIBRARY:}{LIBRARY:~\emph{name}}
}
Specifies the logical name of the C library in which the following functions are found.
\wordtable{
\vocabulary{alien}
\parsingword{FUNCTION:}{FUNCTION:~\emph{returns} \emph{name} ( \emph{type} \emph{name}, \emph{...} )}
}
Defines a new word \verb|name| that calls the C function with the same name, found in the library given by the most recent \verb|LIBRARY:| declaration.
The \verb|returns| token names a C type, or \verb|void| if no return value is expected.
Parameters are given by consecutive type/name pairs, where the type is again a C type, and the name is for documentation purposes. C types are documented in \ref{aliens}.
The word generated by \verb|FUNCTION:| must be compiled before use (\ref{compiler}). Executing it without compiling will throw an exception.
For example, suppose you have a \verb|foo| library exporting the following function:
\begin{verbatim}
void the_answer(char* question, int value) {
printf("The answer to %s is %d.\n",question,value);
}
\end{verbatim}
You can define a word for invoking it:
\begin{verbatim}
LIBRARY: foo
FUNCTION: void the_answer ( char* question, int value ) ;
\end{verbatim}
Now, after being compiled, the word can be executed with two parameters on the stack:
\begin{alltt}
\bs the\_answer compile
\textbf{Compiling the\_answer}
"the question" 42 the\_answer
\textbf{The answer to the question is 42.}
\end{alltt}
Note that the parentheses and commas are only syntactic sugar; the following two definitions are equivalent, although the former is slightly more readable:
\begin{verbatim}
FUNCTION: void glHint ( GLenum target, GLenum mode ) ;
FUNCTION: void glHint GLenum target GLenum mode ;
\end{verbatim}
\section{Alien objects}\label{aliens}
\glossary{
name=alien,
description={an instance of the \texttt{alien} class, holding a pointer to native memory outside the Factor heap}}
The alien interface can work with an assortment of native data types:
\begin{itemize}
\item integer and floating point values
\item null-terminated strings
\item structures (\ref{alien-structs})
\item unions (\ref{alien-unions})
\end{itemize}
Table \ref{c-types} lists the built-in return value and parameter types. The sizes are given for a 32-bit system. Native numbers and strings are handled in a straightforward way. Pointers are a bit more complicated, and are wrapped inside alien objects on the Factor side.
\begin{table}
\caption{\label{c-types}Supported native types}
\begin{tabular}{l|l|l}
Name&Size&Representation\\
\hline
\texttt{char} &1& Signed integer\\
\texttt{uchar} &1& Unsigned integer\\
\texttt{short} &2& Signed integer\\
\texttt{ushort} &2& Unsigned integer\\
\texttt{int} &4& Signed integer\\
\texttt{uint} &4& Unsigned integer\\
\texttt{long} &4& Signed integer\\
\texttt{ulong} &4& Unsigned integer\\
\texttt{longlong} &8& Signed integer\\
\texttt{ulonglong} &8& Unsigned integer\\
\texttt{float} &4& Single-precision float\\
\texttt{double} &8& Double-precision float\\
\texttt{char*} &4& Pointer to null-terminated byte string\\
\texttt{ushort*} &4& Pointer to null-terminated UTF16 string\\
\texttt{void*} &4& Generic pointer
\end{tabular}
\end{table}
A facility similar to C's \verb|typedef| type aliasing is provided. It can help with readability, as well as ease of development of library bindings.
\wordtable{
\vocabulary{alien}
\parsingword{TYPEDEF:}{TYPEDEF:~\emph{old} \emph{new}}
}
Defines a C type named \verb|new| that is identical to \verb|old|, along with a pointer type \verb|new*| that is identical to \verb|old*|.
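As a sketch, a binding for a library that declares its own integer type names can introduce them before any \verb|FUNCTION:| declarations. The alias below matches the \verb|GLenum| type used in the \verb|glHint| declaration shown earlier; that actual bindings define it exactly this way is an assumption:
\begin{verbatim}
! Assumed alias: GLenum as an unsigned 32-bit integer.
TYPEDEF: uint GLenum
\end{verbatim}
After this, both \verb|GLenum| and \verb|GLenum*| can be used wherever a C type is expected.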
\wordtable{
\vocabulary{alien}
\ordinaryword{c-size}{c-size ( type -- n )}
}
Outputs the size of the given C type. This is just like the \verb|sizeof| operator in C.
Many native functions expect you to specify sizes for input and output parameters, and
this word can be used for that purpose.
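A brief sketch follows. It assumes that C types are named by strings when passed to \verb|c-size|, which this section does not state explicitly; the sizes in the comments are those given in table \ref{c-types} for a 32-bit system:
\begin{verbatim}
"int" c-size      ! 4
"double" c-size   ! 8
\end{verbatim}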
\wordtable{
\vocabulary{alien}
\classword{alien}
}
Pointers to native memory, including \verb|void*| and other types, are represented as objects of the \verb|alien| class.
\wordtable{
\vocabulary{alien}
\predword{alien?}
}
Tests if the object at the top of the stack is an alien pointer.
\subsection{Structures}\label{alien-structs}
One way to think of a C-style \verb|struct| is that it abstracts reading and writing field values stored at a range of memory given a pointer, by associating a type and offset with each field. This is the view taken by the alien interface, where defining a C structure creates a set of words for reading and writing fields of various types, offset from a base pointer given by an alien object.
\wordtable{
\vocabulary{alien}
\parsingword{BEGIN-STRUCT:}{BEGIN-STRUCT: \emph{name}}
}
Begins reading a C structure definition.
\wordtable{
\vocabulary{alien}
\parsingword{FIELD:}{FIELD: \emph{type} \emph{name}}
}
Adds a field to the structure. The \verb|type| token identifies a C type, and \verb|name| gives a name to the field. A pair of accessor words is defined, where \emph{structure} is the name of the structure and \emph{field} is the name of the field:
\begin{alltt}
\emph{structure}-\emph{field} ( alien -- value )
set-\emph{structure}-\emph{field} ( value alien -- )
\end{alltt}
\wordtable{
\vocabulary{alien}
\parsingword{END-STRUCT}{END-STRUCT}
}
Ends a structure definition.
Defining a structure adds two new C types, where \verb|name| is the name of the structure:
\begin{description}
\item[\texttt{\emph{name}}] the type of the structure itself; structure and union definitions can define members to be of this type.
\item[\texttt{\emph{name}*}] the type of a pointer to the structure; this type can be used with return values and parameters, and in fact it is an alias for \texttt{void*}.
\end{description}
Additionally, the following two words are defined:
\begin{description}
\item[\texttt{<\emph{name}> ( -- byte-array )}] allocates a byte array large enough to hold the structure in the Factor heap. The field accessor words can then be used to work with this byte array. This feature allows calling native functions that expect pointers to caller-allocated structures\footnote{
There is an important restriction, however; the function must not retain the pointer in a global variable after it returns. Since the structure is allocated in the Factor heap, the garbage collector is free to move it between native function calls. If this behavior is undesirable, memory can be managed manually instead (\ref{malloc}).}.
\item[\texttt{\emph{name}-nth ( n alien -- alien )}] given a pointer and index into an array of structures, returns a pointer to the structure at that index.
\end{description}
Here is an example of a structure with various fields:
\begin{verbatim}
BEGIN-STRUCT: surface
FIELD: uint flags
FIELD: format* format
FIELD: int w
FIELD: int h
FIELD: ushort pitch
FIELD: void* pixels
FIELD: int offset
FIELD: void* hwdata
FIELD: short clip-x
FIELD: short clip-y
FIELD: ushort clip-w
FIELD: ushort clip-h
FIELD: uint unused1
FIELD: uint locked
FIELD: int map
FIELD: uint format_version
FIELD: int refcount
END-STRUCT
\end{verbatim}
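Given this definition, the generated words can be used as in the following sketch. It assumes that the \verb|format| structure referenced by the \verb|format*| field has also been defined, and that the accessor words are usable in the current session; the stack comments follow from the stack effects given above:
\begin{verbatim}
<surface>                  ! stack: surface
640 over set-surface-w     ! store 640 in the w field
480 over set-surface-h     ! store 480 in the h field
dup surface-w swap surface-h
! stack: 640 480
\end{verbatim}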
\subsection{Unions}\label{alien-unions}
A C-style \verb|union| type allocates enough space for its largest member. In the alien interface, unions are used to allocate byte arrays in the Factor heap that may hold any one of the union's members.
\wordtable{
\vocabulary{alien}
\parsingword{BEGIN-UNION:}{BEGIN-UNION: \emph{name}}
}
Begins reading a C union definition.
\wordtable{
\vocabulary{alien}
\parsingword{MEMBER:}{MEMBER: \emph{type}}
}
Adds a member type to the union.
\wordtable{
\vocabulary{alien}
\parsingword{END-UNION}{END-UNION}
}
Ends a union definition.
Unions define C types and words analogous to those for structures; see \ref{alien-structs}.
Here is an example:
\begin{verbatim}
BEGIN-UNION: event
MEMBER: event
MEMBER: active-event
MEMBER: keyboard-event
MEMBER: motion-event
MEMBER: button-event
END-UNION
\end{verbatim}
\subsection{Enumerations}
A C-style \verb|enum| type defines a set of integer constants. The alien interface lets you define a set of words that push integers on the stack in much the same way as you would in C. While these words can be used for any purpose, using them outside of interfacing with C is discouraged.
\wordtable{
\vocabulary{alien}
\parsingword{BEGIN-ENUM:}{BEGIN-ENUM: \emph{start}}
}
Begins an enumeration that numbers constants starting from \verb|start|.
\wordtable{
\vocabulary{alien}
\parsingword{ENUM:}{ENUM: \emph{name}}
}
Defines a compound word \verb|name| that pushes an integer. The integer's value is incremented each time \verb|ENUM:| defines a new word.
\wordtable{
\vocabulary{alien}
\parsingword{END-ENUM}{END-ENUM}
}
Ends an enumeration.
Here is an example:
\begin{verbatim}
BEGIN-ENUM: 0
    ENUM: monday
    ENUM: tuesday
    ENUM: wednesday
    ENUM: thursday
    ENUM: friday
    ENUM: saturday
    ENUM: sunday
END-ENUM
\end{verbatim}
This is in fact functionally equivalent to the following code:
\begin{verbatim}
: monday 0 ;
: tuesday 1 ;
: wednesday 2 ;
: thursday 3 ;
: friday 4 ;
: saturday 5 ;
: sunday 6 ;
\end{verbatim}
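The starting value need not be zero. As a hedged sketch (the color names below are hypothetical and not part of any library), an enumeration that begins at 1 could be written like this:
\begin{verbatim}
BEGIN-ENUM: 1
    ENUM: red
    ENUM: green
    ENUM: blue
END-ENUM
\end{verbatim}
Following the numbering rule described above, this should be equivalent to defining \verb|red|, \verb|green| and \verb|blue| as words that push 1, 2 and 3 respectively.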
\section{Low-level interface}\label{alien-internals}
The alien interface is built on top of a handful of primitives. Sometimes, it is
useful to call these primitives directly for debugging purposes.
\wordtable{
\vocabulary{alien}
\classword{dll}
}
Instances of this class are handles to native libraries.
\wordtable{
\vocabulary{alien}
\ordinaryword{dlopen}{dlopen ( path -- dll )}
}
Opens the specified native library and returns a DLL object. The input parameter is the
name of a native library file,
\emph{not} a logical library name.
\wordtable{
\vocabulary{alien}
\ordinaryword{dlsym}{dlsym ( symbol dll -- address )}
}
Looks up a named symbol in a native library, and outputs its address. If the \verb|dll| is \verb|f|, the lookup is performed in the runtime executable itself.
\wordtable{
\vocabulary{alien}
\ordinaryword{dlclose}{dlclose ( dll -- )}
}
Closes a native library and frees associated native resources.
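As a rough sketch of how these three words fit together, the following phrase opens a library, prints the address of one of its symbols, and closes the library again. The file name \texttt{libm.so.6} and the symbol \texttt{cos} are assumptions about the host system; they are not provided by the alien interface.
\begin{verbatim}
USE: alien
"libm.so.6" dlopen      ! ( path -- dll )
"cos" over dlsym .      ! look up the symbol, print its address
dlclose                 ! release the library handle
\end{verbatim}
The \verb|over| keeps the DLL handle on the stack underneath the symbol name, so that it is still available for \verb|dlclose| after \verb|dlsym| has consumed its own copy.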
\wordtable{
\vocabulary{alien}
\ordinaryword{alien-address}{alien-address ( alien -- address )}
}
Outputs the address of an alien, as an integer.
\wordtable{
\vocabulary{alien}
\ordinaryword{<alien>}{<alien> ( address -- alien )}
}
Creates an alien pointing to the specified address.
\wordtable{
\vocabulary{alien}
\ordinaryword{<displaced-alien>}{<displaced-alien> ( offset alien -- alien )}
}
Outputs an alien pointing at an offset from the base pointer of the input alien. Displaced aliens are used to access nested structures and native arrays.
\wordtable{
\vocabulary{alien}
\ordinaryword{alien-signed-cell}{alien-signed-cell ( alien offset -- n )}
\ordinaryword{set-alien-signed-cell}{set-alien-signed-cell ( n alien offset -- )}
\ordinaryword{alien-unsigned-cell}{alien-unsigned-cell ( alien offset -- n )}
\ordinaryword{set-alien-unsigned-cell}{set-alien-unsigned-cell ( n alien offset -- )}
\ordinaryword{alien-signed-8}{alien-signed-8 ( alien offset -- n )}
\ordinaryword{set-alien-signed-8}{set-alien-signed-8 ( n alien offset -- )}
\ordinaryword{alien-unsigned-8}{alien-unsigned-8 ( alien offset -- n )}
\ordinaryword{set-alien-unsigned-8}{set-alien-unsigned-8 ( n alien offset -- )}
\ordinaryword{alien-signed-4}{alien-signed-4 ( alien offset -- n )}
\ordinaryword{set-alien-signed-4}{set-alien-signed-4 ( n alien offset -- )}
\ordinaryword{alien-unsigned-4}{alien-unsigned-4 ( alien offset -- n )}
\ordinaryword{set-alien-unsigned-4}{set-alien-unsigned-4 ( n alien offset -- )}
\ordinaryword{alien-signed-2}{alien-signed-2 ( alien offset -- n )}
\ordinaryword{set-alien-signed-2}{set-alien-signed-2 ( n alien offset -- )}
\ordinaryword{alien-unsigned-2}{alien-unsigned-2 ( alien offset -- n )}
\ordinaryword{set-alien-unsigned-2}{set-alien-unsigned-2 ( n alien offset -- )}
\ordinaryword{alien-signed-1}{alien-signed-1 ( alien offset -- n )}
\ordinaryword{set-alien-signed-1}{set-alien-signed-1 ( n alien offset -- )}
\ordinaryword{alien-unsigned-1}{alien-unsigned-1 ( alien offset -- n )}
\ordinaryword{set-alien-unsigned-1}{set-alien-unsigned-1 ( n alien offset -- )}
\ordinaryword{alien-value-string}{alien-value-string ( alien offset -- string )}
}
These primitives read and write native memory. They can be given an alien, displaced alien, or byte array. No bounds checking of any kind is performed.
\section{Manual memory management}\label{malloc}
If for whatever reason Factor's memory management is unsuitable for a certain task, you can
directly call the standard C memory management routines. These words are very raw and deal with addresses directly, and of course it is easy to corrupt memory or crash the runtime
this way.
\wordtable{
\vocabulary{kernel-internals}
\ordinaryword{malloc}{malloc ( size -- address )}
}
Allocate a block of size \verb|size| and output a pointer to it.
\wordtable{
\vocabulary{kernel-internals}
\ordinaryword{realloc}{realloc ( address size -- address )}
}
Resize a block previously allocated with \verb|malloc|.
\wordtable{
\vocabulary{kernel-internals}
\ordinaryword{free}{free ( address -- )}
}
Deallocate a block previously allocated with \verb|malloc|.
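The following sketch ties the allocation words together with the memory access primitives from \ref{alien-internals}. It relies only on the stack effects documented in this chapter; the block size of 16 bytes and the stored values are arbitrary, and the stack comments describe the expected state after each line rather than a verified transcript.
\begin{verbatim}
USE: alien
USE: kernel-internals

16 malloc                     ! ( -- address )
dup <alien>                   ! ( -- address alien )
42 over 0 set-alien-signed-4  ! store 42 in the first four bytes
dup 0 alien-signed-4 .        ! read it back and print it
8 over <displaced-alien>      ! ( -- address alien displaced )
7 swap 0 set-alien-signed-4   ! store 7 at byte 8 of the block
8 alien-signed-4 .            ! read the same bytes via the base alien
free                          ! ( address -- )
\end{verbatim}
If the primitives behave as documented, the two \verb|.| calls print 42 and 7. Since no bounds checking is performed, writing at an offset of 16 or more would silently corrupt memory outside the block, and using the alien after \verb|free| is equally unsafe.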
\part{Development tools}
Factor supports interactive development in a live environment. Instead of working with
static executable files and restarting your application after each change, you can
incrementally make changes to your application and test them immediately. If you
notice an undesirable behavior, Factor's powerful reflection features will aid in
pinpointing the error.
If you are used to a statically typed language, you might find Factor's tendency to only fail at runtime hard to work with at first. However, the interactive development tools outlined in this part allow a much quicker turn-around time for testing changes. Also, write unit tests -- unit testing is a great way to ensure that old bugs do not re-appear once they've been fixed.
\chapter{System organization}
\section{The listener}\label{listener}
Factor is an \emph{image-based environment}. When you compiled Factor, you also generated a file named \texttt{factor.image}. I will have more to say about images later, but for now it suffices to understand that to start Factor, you must pass the image file name on the command line:
\begin{alltt}
./f factor.image
\textbf{Loading factor.image... relocating... done
Factor 0.73 :: http://factor.sourceforge.net :: unix/x86
(C) 2003, 2005 Slava Pestov, Chris Double,
Mackenzie Straight
ok}
\end{alltt}
An \texttt{\textbf{ok}} prompt is printed after the initial banner, indicating the listener is ready to execute Factor phrases. The listener is a piece of Factor code, like any other; however, it helps to think of it as the primary interface to the Factor system. The listener reads Factor code and executes it. You can try the classical first program:
\begin{alltt}
"Hello, world." print
\textbf{Hello, world.}
\end{alltt}
Multi-line phrases are supported; if there are unclosed brackets, the listener outputs \texttt{...} instead of the \texttt{ok} prompt, and the entire phrase is executed once all brackets are closed:
\begin{alltt}
[ 1 2 3 ] [
\textbf{...} .
\textbf{...} ] each
\textbf{1
2
3}
\end{alltt}
The listener knows when to print a continuation prompt by looking at the height of the
parser stack. Parsing words such as \texttt{[} and \texttt{:} leave elements on the parser
stack; these elements are popped by \texttt{]} and \texttt{;}.
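The same mechanism applies to colon definitions. In the following sketch, \texttt{square} is a hypothetical word defined only for illustration; the continuation prompt should appear because \texttt{:} has not yet been matched by \texttt{;}:
\begin{alltt}
: square ( x -- y )
\textbf{...} dup * ;
5 square .
\textbf{25}
\end{alltt}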
On startup, Factor reads the \texttt{.factor-rc} file from your home directory. You can put
any quick definitions you want available at the listener there. To avoid loading this
file, pass the \texttt{-no-user-init} command line switch. Another way to have a set of definitions available at all times is to save a custom image, as described later in this chapter.
\section{Source files}
While it is possible to do all development in the listener and save your work in images, it is far more convenient to work with source files, at least until an in-image structure editor is developed.
By convention, Factor source files are saved with the \texttt{.factor} filename extension. They can be loaded into the image as follows:
\begin{alltt}
"examples/numbers-game.factor" run-file
\end{alltt}
In Factor, loading a source file replaces any existing definitions\footnote{But see \ref{compiler}; this is not true of compiled code.}. Each word definition remembers what source file it was loaded from (if any). To reload the source file associated with a definition, use the \texttt{reload} word:
\begin{alltt}
\bs draw reload
\end{alltt}
Word definitions also retain the line number where they are located in their original source file. This allows you to open a word definition in jEdit\footnote{\texttt{http://www.jedit.org}} for editing using the
\texttt{jedit} word:
\begin{alltt}
\bs compile jedit
\end{alltt}
This word requires that a jEdit instance is already running.
The \texttt{jedit} word will open word definitions from the Factor library once the full path of the Factor source tree is entered into the \texttt{"resource-path"} variable. One way to do this is to add a phrase like the following to your \texttt{.factor-rc}:
\begin{verbatim}
"/home/slava/Factor/" "resource-path" set
\end{verbatim}
\section{Images}
The \texttt{factor.image} file is basically a dump of all objects in the heap. A new image can be saved as follows:
\begin{alltt}
"work.image" save-image
\textbf{Saving work.image...}
\end{alltt}
When you save an image before exiting Factor, then start Factor again, everything will be almost as you left it. Try the following:
\begin{alltt}
./f factor.image
"Learn Factor" "reminder" set
"factor.image" save-image bye
\textbf{Saving factor.image...}
\end{alltt}
Factor will save the image and exit. Now start it again and see that the reminder is still there:
\begin{alltt}
./f factor.image
"reminder" get .
\textbf{"Learn Factor"}
\end{alltt}
This is what is meant by the image being an \emph{infinite session}. When you shut down and restart Factor, what happens is much closer to a laptop's ``suspend'' mode than to a desktop computer being fully shut down.
\section{Looking at objects}
Probably the most important debugging tool of them all is the \texttt{.} word. It prints the object at the top of the stack in a form that can be parsed by the Factor parser. A related word is \texttt{prettyprint}. It is identical to \texttt{.} except the output is more verbose; lists, vectors and hashtables are broken up into multiple lines and indented.
\begin{alltt}
[ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ] .
\textbf{[ [ \tto 1 \ttc \tto 2 \ttc ] dup car swap cdr ]}
\end{alltt}
Most objects print in a parsable form, but not all. One exception to this rule is objects with external state, such as I/O ports or aliens (pointers to native structures). Also, objects with circular or very deeply nested structure will not print in a fully parsable form, since the prettyprinter has a limit on maximum nesting. Here is an example -- a vector is created that holds a list whose first element is the vector itself:
\begin{alltt}
\tto \ttc [ unit 0 ] keep [ set-vector-nth ] keep .
\textbf{\tto [ ... ] \ttc}
\end{alltt}
The prettyprinted form of a vector or list with many elements is not always readable. The \texttt{[.]} and \texttt{\tto.\ttc} words output a list or a vector, respectively, with each element on its own line. In fact, the stack printing words are defined in terms of \texttt{\tto.\ttc}:
\begin{verbatim}
: .s datastack {.} ;
: .r callstack {.} ;
\end{verbatim}
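These words are also convenient for long lists produced by other tools. As a small sketch that relies only on the behavior described above, the list of vocabulary names pushed by \texttt{vocabs} can be printed one name per line instead of as a single wrapped list:
\begin{verbatim}
vocabs [.]
\end{verbatim}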
Before we move on, one final set of output words is used to print integers in
different numeric bases. The \texttt{.b} word prints an integer in binary, \texttt{.o} in octal, and \texttt{.h} in hexadecimal.
\begin{alltt}
31337 .b
\textbf{111101001101001}
31337 .o
\textbf{75151}
31337 .h
\textbf{7a69}
\end{alltt}
\chapter{Word tools}
\section{Exploring vocabularies}\label{exploring-vocabs}
Factor organizes code in a two-tier structure of vocabularies and words. A word is the smallest unit of code; it corresponds to a function or method in other languages. Vocabularies group related words together for easy browsing and tracking of source dependencies.
Entering \texttt{vocabs .}~in the listener produces a list of all existing vocabularies:
\begin{alltt}
vocabs .
\textbf{[ "alien" "ansi" "assembler" "browser-responder"
"command-line" "compiler" "cont-responder" "errors"
"file-responder" "files" "gadgets" "generic"
"hashtables" "html" "httpd" "httpd-responder" "image"
"inference" "interpreter" "io-internals" "jedit"
"kernel" "kernel-internals" "line-editor" "listener"
"lists" "logging" "math" "math-internals" "memory"
"namespaces" "parser" "prettyprint" "profiler"
"quit-responder" "random" "resource-responder"
"scratchpad" "sdl" "shells" "stdio" "streams"
"strings" "syntax" "telnetd" "test" "test-responder"
"threads" "unparser" "url-encoding" "vectors" "words" ]}
\end{alltt}
As you can see, there are a lot of vocabularies! Now, you can use \texttt{words .}~to list the words inside a given vocabulary:
\begin{alltt}
"namespaces" words .
\textbf{[ (get) , <namespace> >n append, bind change cons@
dec extend get global inc init-namespaces list-buffer
literal, make-list make-rlist make-rstring make-string
make-vector n> namespace namestack nest off on put set
set-global set-namestack unique, unique@ with-scope ]}
\end{alltt}
You can look at the definition of any word, including library words, using \texttt{see}. Keep in mind you might have to \texttt{USE:} the vocabulary first.
\begin{alltt}
USE: httpd
\bs httpd-connection see
\textbf{IN: httpd : httpd-connection ( socket -- )
"http-server" get accept [
httpd-client
] in-thread drop ;}
\end{alltt}
The \texttt{see} word shows a reconstruction of the source code, not the original source code. So in particular, formatting and some comments are lost.
\section{Cross-referencing words}
The \texttt{apropos.} word is handy when searching for related words. It lists all words
whose names contain a given string. The \texttt{apropos.} word is also useful when you know the exact name of a word, but are unsure what vocabulary it is in. For example, if you're looking for ways to iterate over various collections, you can do an apropos search for \texttt{map}:
\begin{alltt}
"map" apropos.
\textbf{IN: generic
typemap
IN: hashtables
map>hash
IN: sdl
set-surface-map
surface-map
IN: sequences
2map
map
map-with
nmap}
\end{alltt}
From the above output, you can see a few words to explore, such as \verb|map|, \verb|map-with|, and \verb|2map|.
The \texttt{usage} word finds all words that refer to a given word and pushes a list on the stack. This word is helpful in two situations; the first is for learning -- a good way to learn a word is to see it used in context. The second is during refactoring -- if you change a word's stack effect, you must also update all words that call it. Usually you print the
return value of \texttt{usage} using \texttt{.}:
\begin{alltt}
\bs string-map usage .
\textbf{schars>entities
filter-null
url-encode}
\end{alltt}
Another useful word is \texttt{usages}. Unlike \texttt{usage}, it finds all usages, even
indirect ones -- so if a word refers to another word that refers to the given word,
both words will be in the output list.
\section{Exploring classes}
Factor supports object-oriented programming via generic words (\ref{generic}). Generic words are called
like ordinary words; however, they can have multiple definitions, one per class, and
these definitions do not have to appear in the same source file. Such a definition is
termed a \emph{method}, and the method is said to \emph{specialize} on a certain
class. A class in the most
general sense is just a set of objects. You can output a list of classes in the system
with \texttt{classes .}:
\begin{alltt}
classes .
\textbf{[ alien alien-error byte-array displaced-alien
dll ansi-stream disp-only displaced indirect operand
register absolute absolute-16/16 relative relative-bitfld
item kernel-error no-method border checkbox dialog editor
ellipse etched-rect frame gadget hand hollow-ellipse
hollow-rect label line menu pane pile plain-ellipse
plain-rect rectangle roll-rect scroller shelf slider
stack tile viewport world 2generic arrayed builtin
complement generic null object predicate tuple
tuple-class union hashtable html-stream class-tie
computed inference-error inference-warning literal
literal-tie value buffer port jedit-stream boolean
general-t array cons general-list list bignum complex
fixnum float integer number ratio rational real
parse-error potential-float potential-ratio
button-down-event button-up-event joy-axis-event
joy-ball-event joy-button-down-event joy-button-up-event
joy-hat-event key-down-event key-up-event motion-event
quit-event resize-event user-event sequence stdio-stream
client-stream fd-stream null-stream server string-output
wrapper-stream LETTER blank digit letter printable sbuf
string text POSTPONE: f POSTPONE: t vector compound
primitive symbol undefined word ]}
\end{alltt}
If you \texttt{see} a generic word, all methods defined on the generic word are shown.
Alternatively, you can use \texttt{methods.} to print all methods specializing on a
given class:
\begin{alltt}
\bs list methods.
\textbf{PREDICATE: general-list list
dup [
last* cdr
] when not ;
IN: gadgets
M: list custom-sheet
[
length count
] keep zip alist>sheet "Elements:" <titled> ;
IN: prettyprint
M: list prettyprint*
[
[
POSTPONE: [
] car swap [
POSTPONE: ]
] car prettyprint-sequence
] check-recursion ;}
\end{alltt}
\chapter{Debugging and optimizing}
\section{Looking at stacks}
To see the contents of the data or return stack, use the \texttt{.s} and \texttt{.r} words.
Each stack is printed with each element on its own line; the top of the stack is the first element printed.
\section{The debugger}
If the execution of a phrase in the listener causes an error to be thrown, the error
is printed and the stacks at the time of the error are saved. If you've spent any
time with Factor at all, you are probably familiar with this type of message:
\begin{alltt}
[ 1 2 3 ] 4 append reverse
\textbf{The generic word car does not have a suitable method for 4
:s :r show stacks at time of error.
:get ( var -- value ) inspects the error namestack.}
\end{alltt}
The words \texttt{:s} and \texttt{:r} behave like their counterparts that are prefixed with \texttt{.}, except they show the stacks as they were when the error was thrown.
The return stack warrants some special attention. To develop in Factor successfully, you will need to understand how it works. Let's look at the first few lines of the return stack at the time of the above error:
\begin{verbatim}
[ swap cdr ]
uncons
[ r> tuck 2slip ]
(each)
[ swons ]
[ each ]
each
\end{verbatim}
You can see the sequence of calls leading up to the error was \texttt{each} calling \texttt{(each)} calling \texttt{uncons}. The error tells us that the \texttt{car} word is the one that failed. Now, you can stare at the stack dump and notice that if the call to \texttt{car} had been successful and execution had returned to \texttt{(each)}, the quotation \texttt{[ r> tuck 2slip ]} would resume executing. The first word there, \texttt{r>}, would take the quotation \texttt{[ swons ]} and put it back on the data stack. After \texttt{(each)} returned, it would then continue executing the quotation \texttt{[ each ]}. So what is going on here is a recursive loop, \texttt{[ swons ] each}. If you look at the definition of \texttt{reverse}, you will see that this is exactly what is being done:
\begin{verbatim}
: reverse ( list -- list ) [ ] swap [ swons ] each ;
\end{verbatim}
So a list is being reversed, but at some stage, the \texttt{car} is taken of something that is not a list. Now, you can look at the data stack with \texttt{:s}:
\begin{verbatim}
<< no-method [ ] 4 car >>
car
4
4
[ 3 2 1 ]
\end{verbatim}
So now, the mystery has been solved: as \texttt{reverse} iterates down the input value, it hits a cons cell whose \texttt{cdr} is not a list. Indeed, if you look at the value we are passing to \texttt{reverse}, you will see why:
\begin{alltt}
[ 1 2 3 ] 4 append .
\textbf{[[ 1 [[ 2 [[ 3 4 ]] ]] ]]}
\end{alltt}
In the future, the debugger will be linked with the walker, documented below. Right now, the walker is a separate tool. Another caveat is that in compiled code, the return stack is not reconstructed if there is an error. Until this is fixed, you should only compile code once it is debugged. For more potential compiler pitfalls, see \ref{compiler}.
\section{The walker}
The walker lets you step through the execution of a quotation. When a compound definition is reached, you can either keep walking inside the definition, or execute it in one step. The stacks can be inspected at each stage.
There are two ways to use the walker. First of all, you can call the \texttt{walk} word explicitly, giving it a quotation:
\begin{alltt}
[ [ 10 [ dup , ] repeat ] [ ] make ] walk
\textbf{\&s \&r show stepper stacks.
\&get ( var -- value ) inspects the stepper namestack.
step -- single step over
into -- single step into
continue -- continue execution
bye -- exit single-stepper
[ [ 10 [ dup , ] repeat ] [ ] make ]
walk}
\end{alltt}
As you can see, the walker prints a brief help message, then the currently executing quotation. It changes the listener prompt from \texttt{ok} to \texttt{walk}, to remind you that there is a suspended continuation.
The first element of the quotation shown is the next object to be evaluated. If it is a literal, both \texttt{step} and \texttt{into} have the effect of pushing it on the walker data stack. If it is a compound definition, then \texttt{into} will recurse the walker into the compound definition; otherwise, the word executes in one step.
The \texttt{\&r} word shows the walker return stack, which is laid out just like the primary interpreter's return stack. In fact, a good way to understand how Factor's return stack works is to play with the walker.
Note that the walker does not automatically stop when the quotation originally given finishes executing; it just keeps on walking up the return stack, and even lets you step through the listener's code. You can invoke \texttt{continue} or \texttt{exit} to terminate the walker.
While the walker can be invoked explicitly using the \texttt{walk} word, sometimes it is more convenient to \emph{annotate} a word such that the walker is invoked automatically when the word is called. This can be done using the \texttt{break} word:
\begin{alltt}
\bs layout* break
\end{alltt}
Now, when some piece of code calls \texttt{layout*}, the walker will open, and you will be able to step through execution and see exactly what's going on. An important point to keep in mind is that when the walker is invoked in this manner, \texttt{exit} will not have the desired effect; execution will continue, but the data stack will be inconsistent, and an error will most likely be raised a short time later. Always use \texttt{continue} to resume execution after a break.
The walker is very handy, but sometimes you just want to see if a word is being called at all and when, and you don't care to single-step it. In that case, you can use the \texttt{watch} word:
\begin{alltt}
\bs draw-shape watch
\end{alltt}
Now when \texttt{draw-shape} is called, a message will be printed to that effect.
You can undo the effect of \texttt{break} or \texttt{watch} by reloading the original source file containing the word definition in question:
\begin{alltt}
\bs layout* reload
\bs draw-shape reload
\end{alltt}
\section{Dealing with hangs}
If you accidentally start an infinite loop, you can send the Factor runtime a \texttt{QUIT} signal. On Unix, this is done by pressing \texttt{Control-\bs} in the controlling terminal. This will cause the runtime to dump the data and return stacks in a semi-readable form. Note that this will help you find the root cause of the hang, but it will not let you interrupt the infinite loop.
\section{Unit testing}
Unit tests are very easy to write. They are usually placed in source files. A unit test can be executed with the \texttt{unit-test} word in the \texttt{test} vocabulary. This word takes a list and a quotation; the quotation is executed, and the resulting data stack is compared against the list. If they are not equal, the unit test has failed. Here is an example of a unit test:
\begin{verbatim}
[ "Hello, crazy world" ] [
"editor" get [ 0 caret set ] bind
", crazy" 5 "editor" get [ line-insert ] bind
"editor" get [ line-text get ] bind
] unit-test
\end{verbatim}
To have a unit test assert that a piece of code does not execute successfully, but rather throws an exception, use the \texttt{unit-test-fails} word. It takes only one quotation; if the quotation does \emph{not} throw an exception, the unit test has failed.
\begin{verbatim}
[ -3 { } vector-nth ] unit-test-fails
\end{verbatim}
Unit testing is a good habit to get into. Sometimes, writing tests first, before any code, can speed the development process too; by running your unit test script, you can gauge progress.
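For a smaller starting point, here is a sketch of a test file built only from words that appear elsewhere in this handbook. It assumes that the vocabularies defining \verb|+|, \verb|reverse| and \verb|append| are already in the search path; add the appropriate \verb|USE:| declarations if they are not.
\begin{verbatim}
USE: test

! The quotation must leave exactly the values in the list.
[ 4 ] [ 2 2 + ] unit-test
[ [ 3 2 1 ] ] [ [ 1 2 3 ] reverse ] unit-test

! Reversing an improper list throws, as seen in the
! debugger section, so this test expects a failure.
[ [ 1 2 3 ] 4 append reverse ] unit-test-fails
\end{verbatim}
Note that the expected result is itself a list of stack values, which is why the second test wraps \verb|[ 3 2 1 ]| in an outer list.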
\section{Timing code}
The \texttt{time} word reports the time taken to execute a quotation, in milliseconds. The portion of time spent in garbage collection is also shown:
\begin{alltt}
[ 1000000 [ f f cons drop ] repeat ] time
\textbf{515 milliseconds run time
11 milliseconds GC time}
\end{alltt}
\section{Exploring memory usage}
Factor supports heap introspection. You can find all objects in the heap that match a certain predicate using the \texttt{instances} word. For example, if you suspect a resource leak, you can find all I/O ports as follows:
\begin{alltt}
USE: io-internals
[ port? ] instances .
\textbf{[ \#<port @ 805466443> \#<port @ 805466499> ]}
\end{alltt}
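When hunting a leak, the number of matching objects is often more telling than the objects themselves. As a sketch, and assuming the list word \verb|length| is available in the listener, the count of live ports can be printed before and after running the suspect code; a count that keeps growing suggests that ports are not being closed:
\begin{verbatim}
USE: io-internals
[ port? ] instances length .
\end{verbatim}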
The \texttt{references} word finds all objects that refer to a given object:
\begin{alltt}
[ float? ] instances car references .
\textbf{[ \#<array @ 805542171> [ -1.0 0.0 / ] ]}
\end{alltt}
You can print a memory usage summary with \texttt{room.}:
\begin{alltt}
room.
\textbf{Data space: 16384 KB total 2530 KB used 13853 KB free
Code space: 16384 KB total 490 KB used 15893 KB free}
\end{alltt}
And finally, a detailed memory allocation breakdown by type with \texttt{heap-stats.}:
\begin{alltt}
heap-stats.
\textbf{bignum: 312 bytes, 17 instances
cons: 850376 bytes, 106297 instances
float: 112 bytes, 7 instances
t: 8 bytes, 1 instances
array: 202064 bytes, 3756 instances
hashtable: 54912 bytes, 3432 instances
vector: 5184 bytes, 324 instances
string: 391024 bytes, 7056 instances
sbuf: 64 bytes, 4 instances
port: 112 bytes, 2 instances
word: 96960 bytes, 3030 instances
tuple: 688 bytes, 22 instances}
\end{alltt}
\chapter{Stack effect inference}
The stack effect inference tool checks correctness of code before it is run.
A \emph{stack effect} is a list of input classes and a list of output classes corresponding to
the effect a quotation has on the stack when called. For example, the stack effect of \verb|[ dup * ]| is \verb|[ [ integer ] [ integer ] ]|. The stack checker is used by passing a quotation to the \texttt{infer} word. It uses a sophisticated algorithm to infer stack effects of recursive words, combinators, and other tricky constructions; however, it cannot infer the stack effect of all words. In particular, anything using continuations, such as \texttt{catch} and I/O, will stump the stack checker.
\section{Usage}
The main entry point of the stack checker is a single word.
\wordtable{
\vocabulary{inference}
\ordinaryword{infer}{infer ( quot -- effect )}
}
Takes a quotation and attempts to infer its stack effect. An exception is thrown if the stack effect cannot be inferred.
\section{The algorithm}
The stack effect inference algorithm mirrors the evaluator algorithm (\ref{quotations}). A ``meta data stack'' holds two types of entries: computed values, whose type is known but whose literal value will only be known at runtime, and literals, whose value is known statically. When a literal value is encountered, it is simply placed on the meta data stack. When a word is encountered, one of several actions is taken, depending on the type of the word:
\begin{itemize}
\item If the word has special stack effect inference behavior, this behavior is invoked. Shuffle words and various primitives fall into this category.
\item If the word's stack effect is already known, then the inputs are removed from the meta data stack, and output values are added. If the meta data stack contains insufficient values, more values are added, and the newly added values are placed in the input list. Since inference begins with an empty stack, the input list contains all required input values when inference is complete.
\item If the word is marked to be inlined, stack effect inference recurses into the word definition and uses the same meta data stack. See \ref{declarations}.
\item Otherwise, the word's stack effect is inferred in a fresh inferencer instance, and the stack effect is cached. The fresh inferencer is used rather than the current one, so that type information and literals on the current meta data stack do not affect the subsequently-cached stack effect.
\end{itemize}
The following two examples demonstrate some simple cases:
\begin{alltt}
[ 1 2 3 ] infer .
\textbf{[ [ ] [ fixnum fixnum fixnum ] ]}
[ "hi" swap ] infer .
\textbf{[ [ object ] [ string object ] ]}
\end{alltt}
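To see how the rules above produce the second result, the inference of \verb|[ "hi" swap ]| can be traced by hand. The table below is a reading of the rules as stated, not a dump of the inferencer's internal state; the meta data stack is written with its top on the right.
\begin{verbatim}
step              meta data stack    input list
----------------  -----------------  ----------
(start)           (empty)            (empty)
"hi"  (literal)   "hi"               (empty)
swap  needs two   object "hi"        object
      values; one
      is added
swap  applied     "hi" object        object
\end{verbatim}
When inference completes, the input list holds one value of class \verb|object|, and the meta data stack holds a string literal followed by that same value, giving the effect \verb|[ [ object ] [ string object ] ]|.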
\subsection{Combinators}
A simple combinator such as \verb|keep| does not by itself have a stack effect, since \verb|call| takes an arbitrary quotation from the stack, which itself may have an arbitrary stack effect.
\begin{verbatim}
IN: kernel
: keep ( x quot -- x | quot: x -- )
over >r call r> ; inline
\end{verbatim}
On the other hand, the stack effect of a word that passes a literal quotation to \verb|keep| can be inferred. The quotation is a literal on the meta data stack, and since \verb|keep| is marked \verb|inline|, the special inference behavior of \verb|call| receives this quotation.
\begin{alltt}
[ [ dup * ] keep ] infer .
\textbf{[ [ number ] [ number number ] ]}
\end{alltt}
Note that if \verb|call| is applied to a computed value, for example, a quotation taken from a variable, or a quotation that is constructed immediately before the \verb|call|, the stack effect inferencer will raise an error.
\begin{alltt}
[ frog get call ] infer .
\textbf{! Inference error: A literal value was expected where a
computed value was found: \#<computed @ 716167923>
! Recursive state:
:s :r :n :c show stacks at time of error.
:get ( var -- value ) inspects the error namestack.}
\end{alltt}
Another word with special inference behavior is \verb|execute|. It is used much more rarely than \verb|call| and behaves similarly, except that it takes a word as input rather than a quotation.
\subsection{Conditionals}
Simpler than a stack effect is the concept of a stack height difference. This is simply the input value count subtracted from the output value count. A conditional's stack effect can be inferred if each branch has the same stack height difference; in this case, we say that the conditional is \emph{balanced}, and the total stack effect is computed by performing a unification of types across each branch.
The following two examples exhibit balanced conditionals:
\begin{verbatim}
[ 1 ] [ dup ] ifte
dup cons? [ unit ] when cons
\end{verbatim}
The following example is not balanced and raises an error when we attempt to infer its stack effect:
\begin{alltt}
[ [ dup ] [ drop ] ifte ] infer .
\textbf{! Inference error: Unbalanced branches
! Recursive state:
:s :r :n :c show stacks at time of error.
:get ( var -- value ) inspects the error namestack.}
\end{alltt}
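The balance check can be sketched in Python as follows. This is a simplified model, not the inferencer's code: it works on value counts only, ignores the unification of types, and does not count the condition value consumed by the conditional itself. The names \verb|height_difference| and \verb|unify_branches| are hypothetical.
\begin{verbatim}
def height_difference(effect):
    """Output value count minus input value count."""
    n_in, n_out = effect
    return n_out - n_in

def unify_branches(branches):
    """Return the combined (inputs, outputs) counts of balanced branches.

    Raises ValueError if the branches are not balanced.
    """
    differences = {height_difference(b) for b in branches}
    if len(differences) != 1:
        raise ValueError("Unbalanced branches")
    # The conditional needs as many inputs as its hungriest branch.
    n_in = max(b[0] for b in branches)
    return n_in, n_in + differences.pop()

# [ 1 ] [ dup ]: effects (0, 1) and (1, 2), both with difference +1
print(unify_branches([(0, 1), (1, 2)]))   # (1, 2)
\end{verbatim}
Passing the effects of \verb|[ dup ]| and \verb|[ drop ]|, that is (1, 2) and (1, 0), raises the error instead, since their height differences are +1 and -1.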
\subsection{Recursive words}
Recursive words all have the same general form; there is a conditional, and one branch of the conditional is the \emph{base case} terminating the recursion, and the other branch is the \emph{inductive case}, which reduces the problem and recurses on the reduced problem. A key observation one must make is that in a well-formed recursion, the recursive call in the inductive case eventually results in the base case being called, so we can take the stack effect of the recursive call to be the stack effect of the base case.
Consider the following implementation of a word that measures the length of a list:
\begin{verbatim}
: length ( list -- n )
[ cdr length 1 + ] [ 0 ] ifte* ;
\end{verbatim}
The stack effect can be inferred without difficulty:
\begin{alltt}
[ length ] infer .
\textbf{[ [ object ] [ integer ] ]}
\end{alltt}
The base case is taken if the top of the stack is \verb|f|, and the base case has a stack effect \verb|[ [ object ] [ fixnum ] ]|.
On the other hand, if the top of the stack is something else, the inductive case is taken. The inductive case makes a recursive call to \verb|length|, and once we substitute the stack effect of the base case into this call point, we can infer that the stack effect of the inductive case is \verb|[ [ object ] [ integer ] ]|.
If both branches contain a recursive call, the stack effect inferencer gives up.
\begin{alltt}
: fie [ fie ] [ fie ] ifte ;
[ fie ] infer .
\textbf{! Inference error: fie does not have a base case
! Recursive state:
:s :r :n :c show stacks at time of error.
:get ( var -- value ) inspects the error namestack.}
\end{alltt}
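The substitution of the base case into the recursive call can be sketched in Python. This is a model under stated assumptions rather than the actual inferencer: effects are (inputs, outputs) counts without types, the marker \verb|RECURSE| and the function names are hypothetical, and the first branch without a recursive call is taken to be the base case.
\begin{verbatim}
RECURSE = object()   # marks a call to the word being inferred

def compose(effects):
    """Compose (inputs, outputs) pairs into one pair."""
    required = height = 0
    for n_in, n_out in effects:
        if height < n_in:
            required += n_in - height
            height = n_in
        height += n_out - n_in
    return required, height

def infer_recursive(branches):
    """Infer (inputs, outputs) for a conditional with recursive calls."""
    base_cases = [b for b in branches if RECURSE not in b]
    if not base_cases:
        raise ValueError("does not have a base case")
    base = compose(base_cases[0])
    # Substitute the base case's effect at every recursive call.
    effects = [compose([base if e is RECURSE else e for e in b])
               for b in branches]
    if len({n_out - n_in for n_in, n_out in effects}) != 1:
        raise ValueError("Unbalanced branches")
    n_in = max(e[0] for e in effects)
    return n_in, n_in + effects[0][1] - effects[0][0]

# length: the inductive case is cdr, length, 1, +; the base case is
# taken from the text as one object in, one fixnum out.
inductive = [(1, 1), RECURSE, (0, 1), (2, 1)]
base = [(1, 1)]
print(infer_recursive([inductive, base]))   # (1, 1)
\end{verbatim}
Calling \verb|infer_recursive([[RECURSE], [RECURSE]])|, which models \verb|fie|, raises the missing base case error.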
\chapter{The compiler}\label{compiler}
\section{Basic usage}
The compiler can provide a substantial speed boost for words whose stack effect can be inferred. Words without a known stack effect cannot be compiled, and must be run in the interpreter. The compiler generates native code, and so far, x86 and PowerPC backends have been developed.
To compile a single word, call \texttt{compile}:
\begin{alltt}
\bs pref-size compile
\textbf{Compiling pref-size}
\end{alltt}
During bootstrap, all words in the library with a known stack effect are compiled. You can
circumvent this, for whatever reason, by passing the \texttt{-no-compile} switch during
bootstrap:
\begin{alltt}
\textbf{bash\$} ./f boot.image.le32 -no-compile
\end{alltt}
The compiler has two limitations you must be aware of. First, if an exception is thrown in compiled code, the return stack will be incomplete, since compiled words do not push themselves there. Second, compiled code cannot be profiled. These limitations will be resolved in a future release.
The compiler consists of multiple stages: first, a dataflow graph is inferred; then various optimizations are performed on this graph; then it is transformed into a linear representation, on which further optimizations are performed; and finally, machine code is generated from the linear representation.
\section{Dataflow intermediate representation}
The dataflow IR represents nested control structure, and annotates all calls with stack input and output annotations. Such annotations consist of lists of values, where a value abstracts over a possibly unknown computation result. The IR has a tree shape, where each node is a tuple delegating to an instance of the \verb|node| tuple class.
The \verb|node| tuple has the following slots:
\begin{description}
\item[\texttt{param}] The meaning is determined by the tuple wrapping the node instance. For example with \verb|#call| nodes, this is the word being called.
\item[\texttt{in-d}] A list of input values popped from the data stack.
\item[\texttt{in-r}] A list of input values popped from the return stack. Only used by \verb|#call >r| nodes.
\item[\texttt{out-d}] A list of output values pushed on the data stack.
\item[\texttt{out-r}] A list of output values pushed on the return stack. Only used by \verb|#call r>| nodes.
\item[\texttt{node-classes}] A hashtable mapping values to classes.
\item[\texttt{node-literals}] A hashtable mapping values to literals.
\item[\texttt{node-successor}] The direct successor of the node.
\item[\texttt{node-children}] A list of the node's children, for example if this is a branch or label node. The number of children depends on the type of node.
\end{description}
Note that nodes are linked by the \verb|node-successor| slot. Nested structure is realized by a list value in the \verb|node-children| slot.
The stack effect inferencer transforms quotations into dataflow IR.
The \verb|node-classes| and \verb|node-literals| slots are filled in by a separate class inference stage (\ref{class-inference}).
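The way \verb|node-successor| and \verb|node-children| combine into a tree can be pictured with a small Python model. This is only a sketch for illustration: the class \verb|Node|, its field names, and the traversal order of \verb|walk| are assumptions, and the Factor tuples carry more slots than shown here.
\begin{verbatim}
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class Node:
    kind: str                       # e.g. "#call", "#ifte", "#return"
    param: Any = None               # meaning depends on the node kind
    in_d: List[Any] = field(default_factory=list)
    out_d: List[Any] = field(default_factory=list)
    successor: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

def walk(node):
    """Yield each node, then its children, then its successors."""
    while node is not None:
        yield node
        for child in node.children:
            yield from walk(child)
        node = node.successor

# A conditional with two branches, followed by #meet and #return.
true_branch = Node("#push", successor=Node("#values"))
false_branch = Node("#call", param="dup", successor=Node("#values"))
tree = Node("#ifte", children=[true_branch, false_branch],
            successor=Node("#meet", successor=Node("#return")))
print([n.kind for n in walk(tree)])
# ['#ifte', '#push', '#values', '#call', '#values', '#meet', '#return']
\end{verbatim}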
\wordtable{
\vocabulary{inference}
\ordinaryword{dataflow}{dataflow ( quot -- node )}
}
Produces the dataflow IR of a quotation.
\wordtable{
\vocabulary{inference}
\ordinaryword{dataflow.}{dataflow.~( quot ? -- )}
}
Prints dataflow IR in human-readable form. The boolean indicates if stack effect annotations should be output.
\subsection{Values}
Values are an abstraction over possibly known computation inputs and outputs. There are three types of values:
\begin{description}
\item[Literal values] represent a known constant
\item[Computed values] represent inputs and outputs whose specific value is not known
\item[Joined values] represent a unification of possible values of a stack slot where branched control flow meets
\end{description}
The \verb|value| tuple has one slot, \verb|value-recursion|. This is a list of nested lexical scopes, used to resolve recursive stack effects.
\subsection{Straight-line code}
\begin{description}
\item[\texttt{\#push}] Pushes literal values on the data stack.
\begin{description}
\item[\texttt{node-out-d}] A list of literals.
\end{description}
\item[\texttt{\#drop}] Pops literal values from the data stack.
\begin{description}
\item[\texttt{node-in-d}] A list of literals.
\end{description}
\item[\texttt{\#call}] Invokes the word identified by \verb|node-param|.
\begin{description}
\item[\texttt{node-param}]A word.\\
\item[\texttt{node-in-d}]Input values.\\
\item[\texttt{node-out-d}]Output values.
\end{description}
\item[\texttt{\#call-label}] Like \verb|#call| but \verb|node-param| refers to a parent \verb|#label| node.
\end{description}
\subsection{Branching and recursion}
\begin{description}
\item[\texttt{\#ifte}] A conditional expression.
\begin{description}
\item[\texttt{node-in-d}]A singleton list holding the condition being tested.\\
\item[\texttt{node-children}]A list of two nodes, the true and false branches.
\end{description}
\item[\texttt{\#dispatch}] A jump table.
\begin{description}
\item[\texttt{node-in-d}]A singleton list holding the jump table index.\\
\item[\texttt{node-children}]A list of nodes, in consecutive jump table order.
\end{description}
\item[\texttt{\#values}] Found at the end of each branch in an \verb|#ifte| or \verb|#dispatch| node.
\begin{description}
\item[\texttt{node-out-d}]A list of values present on the data stack at the end of the branch.\\
\end{description}
\item[\texttt{\#meet}] Must be the successor of an \verb|#ifte| or \verb|#dispatch| node.
\begin{description}
\item[\texttt{node-in-d}]A list of \verb|meet| values unified from the \verb|#values| node at the end of each branch.\\
\end{description}
\item[\texttt{\#label}] A named block of code. Child \verb|#call-label| nodes can recurse on this label.
\begin{description}
\item[\texttt{node-param}]A gensym identifying the label.\\
\item[\texttt{node-children}]A singleton list whose sole element is the labelled node.
\end{description}
\item[\texttt{\#entry}] Must be the first node of a \verb|#label|. These nodes are created by the stack effect inferencer, but are only properly filled in by the recursive value inference stage (\ref{recursive-inference}).
\begin{description}
\item[\texttt{node-in-d}]A list of \verb|meet| values unified from all entry points into the block scoped by the \verb|#label| node.\\
\end{description}
\item[\texttt{\#return}] Found at the end of a word's dataflow IR.
\begin{description}
\item[\texttt{node-out-d}]Values present on the stack when the word returns.
\end{description}
\end{description}
\section{Dataflow optimizer}
The dataflow optimizer consists of a set of loosely-related words and passes over dataflow IR that apply various transformations with the intent of improving the efficiency of the generated code.
\subsection{Killing unused literals}
\subsection{Class inference}\label{class-inference}
\subsection{Recursive value inference}\label{recursive-inference}
\subsection{Method inlining and type check elimination}
\subsection{Branch folding}
\subsection{Partial evaluation}
\section{Linear intermediate representation}
The linear IR is the second of the two intermediate
representations used by Factor. It is essentially a high-level
assembly language. Linear IR operations are called \emph{virtual operations} (VOPs). The last stage of the compiler generates machine code instructions corresponding to each VOP in the linear IR.
To perform every stage except machine code generation, use the \texttt{precompile} word. It dumps the optimized linear IR instead of generating code, which is useful for inspecting the compiler's output.
\subsection{Control flow}
\begin{description}
\item[\texttt{\%prologue}] On x86, this does nothing. On PowerPC, at the start of
each word that calls a subroutine, we store the link
register in r0, then push r0 on the C stack.
\item[\texttt{\%call-label}] On PowerPC, uses near calling convention, where the
caller pushes the return address.
\item[\texttt{\%call}] On PowerPC, if calling a primitive, compiles a sequence that loads a 32-bit literal and jumps to that address. For other compiled words, compiles an immediate branch with link, so all compiled word definitions must be within 64 megabytes of each other.
\item[\texttt{\%jump-label}] Like \texttt{\%call-label} except the return address is not saved. Used for tail calls.
\item[\texttt{\%jump}] Like \texttt{\%call} except the return address is not saved. Used for tail calls.
\item[\texttt{\%dispatch}] Compiles a piece of code that jumps to an offset in a
jump table indexed by an integer. The jump table consists of \texttt{\%target-label} and \texttt{\%target} VOPs, which must immediately follow this VOP.
\item[\texttt{\%target}] Not supported on PowerPC.
\item[\texttt{\%target-label}] A jump table entry.
\end{description}
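The 64 megabyte figure for \texttt{\%call} follows from the encoding of the PowerPC immediate branch, which holds a 24-bit word offset. Shifted left by two bits, this gives a signed 26-bit byte displacement, so a single branch reaches roughly 32 megabytes in either direction. The Python sketch below merely checks that arithmetic; the function name \verb|branch_reachable| is hypothetical and no such check is claimed to exist in the compiler.
\begin{verbatim}
LI_BITS = 24                      # width of the branch offset field
REACH = 1 << (LI_BITS + 2 - 1)    # 2**25 bytes in each direction

def branch_reachable(source, target):
    """True if an immediate branch at source can reach target."""
    displacement = target - source
    return (displacement % 4 == 0
            and -REACH <= displacement <= REACH - 4)

# Total size of the window covered by one branch, in megabytes.
print(2 * REACH // (1024 * 1024))   # 64
\end{verbatim}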
\subsection{Slots and objects}
\begin{description}
\item[\texttt{\%slot}] The untagged object is in \texttt{vop-out-1}, the tagged slot
number is in \texttt{vop-in-1}.
\item[\texttt{\%fast-slot}] The tagged object is in \texttt{vop-out-1}, the pointer offset is
in \texttt{vop-in-1}. The offset already takes the type tag into
account, so it's just one instruction to load.
\item[\texttt{\%set-slot}] The new value is \texttt{vop-in-1}, the object is \texttt{vop-in-2}, and
the slot number is \texttt{vop-in-3}.
\item[\texttt{\%fast-set-slot}] The new value is \texttt{vop-in-1}, the object is \texttt{vop-in-2}, and
the slot offset is \texttt{vop-in-3}.
The offset already takes the type tag into account, so
it's just one instruction to load.
\item[\texttt{\%write-barrier}] Mark the card containing the object pointed to by \texttt{vop-in-1}.
\item[\texttt{\%untag}] Mask off the tag bits of \texttt{vop-in-1}, store result in
\texttt{vop-in-1} (which should equal \texttt{vop-out-1}!)
\item[\texttt{\%untag-fixnum}] Shift \texttt{vop-in-1} to the right by 3 bits, store result in
\texttt{vop-in-1} (which should equal \texttt{vop-out-1}!)
\item[\texttt{\%type}] Intrinsic version of the \texttt{type} primitive. It outputs an
unboxed value in \texttt{vop-out-1}.
\end{description}
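The effect of \texttt{\%untag} and \texttt{\%untag-fixnum} on a cell can be illustrated with plain integer arithmetic. The Python sketch below assumes that the tag occupies the low 3 bits of a cell, which is inferred from the 3-bit shift described above; the function names are hypothetical.
\begin{verbatim}
TAG_BITS = 3                      # assumed width of the type tag
TAG_MASK = (1 << TAG_BITS) - 1    # 0b111

def untag(cell):
    """Mask off the tag bits, leaving an aligned address."""
    return cell & ~TAG_MASK

def untag_fixnum(cell):
    """Shift the tag bits out, leaving the integer value."""
    return cell >> TAG_BITS

print(hex(untag(0x1005)))   # 0x1000
print(untag_fixnum(40))     # 5
\end{verbatim}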
\subsection{Alien interface}
\begin{description}
\item[\texttt{\%parameters}] Ignored on x86.
\item[\texttt{\%parameter}] Ignored on x86.
\item[\texttt{\%unbox}] An unboxer function takes a value from the data stack
and converts it into a C value.
\item[\texttt{\%box}] A boxer function takes a C value as a parameter,
converts it into a Factor value, and pushes it on the data
stack.
On x86, C functions return integers in EAX.
\item[\texttt{\%box-float}] On x86, C functions return floats on the FP stack.
\item[\texttt{\%box-double}] On x86, C functions return doubles on the FP stack.
\item[\texttt{\%cleanup}] Ignored on PowerPC.
On x86, in the cdecl ABI, the caller must pop input
parameters off the C stack. In stdcall, the callee does
it, so this node is not used in that case.
\end{description}
\printglossary
\input{handbook.ind}
\end{document}