compiler work
parent c02755227e
commit 34041bedbf
2
Makefile
|
@ -3,7 +3,7 @@ CC = gcc
|
|||
# On PowerPC G5:
|
||||
# CFLAGS = -mcpu=970 -mtune=970 -mpowerpc64 -ffast-math -O3
|
||||
# On Pentium 4:
|
||||
# CFLAGS = -march=pentium4 -ffast-math -O3
|
||||
# CFLAGS = -march=pentium4 -ffast-math -O3 -fomit-frame-pointer
|
||||
# Add -fomit-frame-pointer if you don't care about debugging
|
||||
CFLAGS = -Os -g -Wall
|
||||
|
||||
|
|
|
@ -409,6 +409,7 @@ is pushed on the stack. Try evaluating the following:
|
|||
call .s
|
||||
\emph{\{ 5 \}}
|
||||
\end{alltt}
|
||||
|
||||
\texttt{call} \texttt{( quot -{}- )} executes the quotation at the
|
||||
top of the stack. Using \texttt{call} with a literal quotation is
|
||||
useless; writing out the elements of the quotation has the same effect.
|
||||
|
@ -442,10 +443,6 @@ More combinators will be introduced in later sections.
|
|||
|
||||
\subsection{Recursion}
|
||||
|
||||
The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional.
|
||||
|
||||
FIXME
|
||||
|
||||
\section{Numbers}
|
||||
|
||||
Factor provides a rich set of math words. Factor numbers more closely model the mathematical concept of a number than other languages. Where possible, exact answers are given -- for example, adding or multiplying two integers never results in overflow, and dividing two integers yields a fraction rather than a truncated result. Complex numbers are supported, allowing many functions to be computed with parameters that would raise errors or return ``not a number'' in other languages.
|
||||
|
@ -2400,11 +2397,28 @@ The name stack is really just a vector. The words \texttt{>n} and \texttt{n>} ar
|
|||
: n> ( n:namespace -- namespace ) namestack* vector-pop ;
|
||||
\end{alltt}
|
||||
|
||||
\section{Metaprogramming}
|
||||
\section{The execution model in depth}
|
||||
|
||||
Recall that code quotations are in fact just linked lists. Factor code is data, and vice versa. Essentially, the interpreter iterates through code quotations, pushing literals and executing words. When a word is executed, one of two things happens -- either the word has a colon definition, and the interpreter is invoked recursively on the definition, or the word is primitive, and it is executed by the underlying virtual machine. A word is itself a first-class object.
|
||||
\subsection{Recursion}
|
||||
|
||||
It is the job of the parser to transform source code denoting literals and words into their internal representations. This is done using a vocabulary of \emph{parsing words}. The prettyprinter does the converse, by printing out data structures in a parsable form (both to humans and Factor). Because code is data, text representation of source code doubles as a way to serialize almost any Factor object.
|
||||
The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional.
|
||||
|
||||
|
||||
tail recursion
|
||||
|
||||
preserving values between iterations
|
||||
|
||||
ensuring a consistent stack effect
|
||||
|
||||
works well with lists, since only the head is passed
|
||||
|
||||
not so well with vectors and strings -- need an obj+index
|
||||
|
||||
\subsection{Combinators}
|
||||
|
||||
a combinator is a recursive word that takes quotations
|
||||
|
||||
how to ensure a consistent stack view for the quotations
|
||||
|
||||
\subsection{Looking at words}
|
||||
|
||||
|
@ -2452,24 +2466,6 @@ If the primitive number is set to 1, the word is a colon definition and the para
|
|||
|
||||
The word \texttt{define ( word quot -{}- )} defines a word to have the specified colon definition. Note that \texttt{create} and \texttt{define} perform an action somewhat analogous to the \texttt{: ... ;} notation for colon definitions, except at parse time rather than run time.
|
||||
|
||||
\subsection{The prettyprinter}
|
||||
|
||||
We've already seen the word \texttt{.} which prints the top of the stack in a form that may be read back in. The word \texttt{prettyprint} is similar, except the output is in an indented, multiple-line format. Both words are in the \texttt{prettyprint} vocabulary. Here is an example:
|
||||
|
||||
\begin{alltt}
|
||||
{[} 1 {[} 2 3 4 {]} 5 {]} .
|
||||
\emph{{[} 1 {[} 2 3 4 {]} 5 {]}}
|
||||
{[} 1 {[} 2 3 4 {]} 5 {]} prettyprint
|
||||
\emph{{[}
|
||||
1 {[}
|
||||
2 3 4
|
||||
{]} 5
|
||||
{]}}
|
||||
\end{alltt}
|
||||
|
||||
|
||||
\subsection{The parser}
|
||||
|
||||
\subsection{Parsing words}
|
||||
|
||||
Let's take a closer look at Factor syntax. Consider a simple expression,
|
||||
|
@ -2538,6 +2534,29 @@ next occurrence of \texttt{{}''}, and appends this string to the
|
|||
current node of the parse tree. Note that strings and words are different
|
||||
types of objects. Strings are covered in great detail later.
|
||||
|
||||
\section{NOT DONE}
|
||||
|
||||
Recall that code quotations are in fact just linked lists. Factor code is data, and vice versa. Essentially, the interpreter iterates through code quotations, pushing literals and executing words. When a word is executed, one of two things happens -- either the word has a colon definition, and the interpreter is invoked recursively on the definition, or the word is primitive, and it is executed by the underlying virtual machine. A word is itself a first-class object.
|
||||
|
||||
It is the job of the parser to transform source code denoting literals and words into their internal representations. This is done using a vocabulary of \emph{parsing words}. The prettyprinter does the converse, by printing out data structures in a parsable form (both to humans and Factor). Because code is data, text representation of source code doubles as a way to serialize almost any Factor object.
|
||||
|
||||
\subsection{The prettyprinter}
|
||||
|
||||
We've already seen the word \texttt{.} which prints the top of the stack in a form that may be read back in. The word \texttt{prettyprint} is similar, except the output is in an indented, multiple-line format. Both words are in the \texttt{prettyprint} vocabulary. Here is an example:
|
||||
|
||||
\begin{alltt}
|
||||
{[} 1 {[} 2 3 4 {]} 5 {]} .
|
||||
\emph{{[} 1 {[} 2 3 4 {]} 5 {]}}
|
||||
{[} 1 {[} 2 3 4 {]} 5 {]} prettyprint
|
||||
\emph{{[}
|
||||
1 {[}
|
||||
2 3 4
|
||||
{]} 5
|
||||
{]}}
|
||||
\end{alltt}
|
||||
|
||||
|
||||
\subsection{Profiling}
|
||||
|
||||
\section{PRACTICAL: Infix syntax}
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@ import java.io.*;
|
|||
|
||||
public class FactorInterpreter implements FactorObject, Runnable
|
||||
{
|
||||
public static final String VERSION = "0.65";
|
||||
public static final String VERSION = "0.66";
|
||||
|
||||
public static final Cons DEFAULT_USE = new Cons("builtins",
|
||||
new Cons("syntax",new Cons("scratchpad",null)));
|
||||
|
|
|
@ -26,6 +26,7 @@
|
|||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: compiler
|
||||
USE: combinators
|
||||
USE: math
|
||||
USE: kernel
|
||||
USE: stack
|
||||
|
@ -36,6 +37,13 @@ USE: stack
|
|||
: init-assembler ( -- )
|
||||
compiled-offset literal-table + set-compiled-offset ;
|
||||
|
||||
: compile-aligned ( n -- )
|
||||
dup compiled-offset mod dup 0 = [
|
||||
2drop
|
||||
] [
|
||||
- compiled-offset + set-compiled-offset
|
||||
] ifte ;
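The `compile-aligned` word above appears intended to advance the compile offset to the next multiple of `n` (a cell, where `begin-compiling` calls it). Below is a minimal C sketch of that round-up arithmetic. `align_up` is a hypothetical name, not part of the Factor source. The sketch takes the remainder of the offset by the alignment, which may not match the Factor word operand for operand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round offset up to the next multiple of n.
   An offset that is already aligned is returned unchanged. */
static size_t align_up(size_t offset, size_t n)
{
    assert(n > 0);
    size_t rem = offset % n;      /* how far past the last boundary we are */
    return rem == 0 ? offset : offset + (n - rem);
}
```

With a 4-byte cell, an offset of 101 would move to 104, while 100 stays where it is.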
|
||||
|
||||
: intern-literal ( obj -- lit# )
|
||||
address-of
|
||||
literal-top set-compiled-cell
|
||||
|
|
|
@ -41,6 +41,9 @@ USE: combinators
|
|||
: ESI 6 ;
|
||||
: EDI 7 ;
|
||||
|
||||
: MOD-R/M ( r/m reg/opcode mod -- )
|
||||
6 shift swap 3 shift bitor bitor compile-byte ;
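For readers unfamiliar with x86 encoding, `MOD-R/M` packs three fields into a single byte: `mod` in bits 7-6, `reg/opcode` in bits 5-3, and `r/m` in bits 2-0. The C sketch below mirrors that packing. The function name `modrm` and the range checks are illustrative assumptions, not part of the Factor source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build an x86 ModR/M byte.
   Argument order follows the Factor stack effect ( r/m reg/opcode mod -- ). */
static uint8_t modrm(unsigned rm, unsigned reg_opcode, unsigned mod)
{
    assert(rm < 8 && reg_opcode < 8 && mod < 4);
    return (uint8_t)((mod << 6) | (reg_opcode << 3) | rm);
}
```

This reproduces the hand-written constants that the later hunks replace: `modrm(5, 0, 0)` gives `0x05`, `modrm(0, 7, 0)` gives `0x38`, and `modrm(0, 5, 3)` gives `0xe8`, where register number 0 is EAX.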
|
||||
|
||||
: PUSH ( reg -- )
|
||||
HEX: 50 + compile-byte ;
|
||||
|
||||
|
@ -57,7 +60,7 @@ USE: combinators
|
|||
drop HEX: a1 compile-byte
|
||||
] [
|
||||
HEX: 8b compile-byte
|
||||
3 shift BIN: 101 bitor compile-byte
|
||||
BIN: 101 swap 0 MOD-R/M
|
||||
] ifte compile-cell ;
|
||||
|
||||
: I>[R] ( imm reg -- )
|
||||
|
@ -71,21 +74,21 @@ USE: combinators
|
|||
nip HEX: a3 compile-byte
|
||||
] [
|
||||
HEX: 89 compile-byte
|
||||
swap 3 shift BIN: 101 bitor compile-byte
|
||||
swap BIN: 101 swap 0 MOD-R/M
|
||||
] ifte compile-cell ;
|
||||
|
||||
: [R]>R ( reg reg -- )
|
||||
#! MOV INDIRECT <reg> TO <reg>.
|
||||
HEX: 8b compile-byte swap 3 shift bitor compile-byte ;
|
||||
HEX: 8b compile-byte swap 0 MOD-R/M ;
|
||||
|
||||
: R>[R] ( reg reg -- )
|
||||
#! MOV <reg> TO INDIRECT <reg>.
|
||||
HEX: 89 compile-byte swap 3 shift bitor compile-byte ;
|
||||
HEX: 89 compile-byte swap 0 MOD-R/M ;
|
||||
|
||||
: I+[I] ( imm addr -- )
|
||||
#! ADD <imm> TO ADDRESS <addr>
|
||||
HEX: 81 compile-byte
|
||||
HEX: 05 compile-byte
|
||||
BIN: 101 0 0 MOD-R/M
|
||||
compile-cell
|
||||
compile-cell ;
|
||||
|
||||
|
@ -93,14 +96,14 @@ USE: combinators
|
|||
#! SUBTRACT <imm> FROM <reg>, STORE RESULT IN <reg>
|
||||
over -128 127 between? [
|
||||
HEX: 83 compile-byte
|
||||
HEX: e8 + compile-byte
|
||||
BIN: 101 BIN: 11 MOD-R/M
|
||||
compile-byte
|
||||
] [
|
||||
dup EAX = [
|
||||
drop HEX: 2d compile-byte
|
||||
] [
|
||||
HEX: 81 compile-byte
|
||||
BIN: 11101000 bitor
|
||||
BIN: 101 BIN: 11 MOD-R/M
|
||||
] ifte
|
||||
compile-cell
|
||||
] ifte ;
|
||||
|
@ -111,11 +114,11 @@ USE: combinators
|
|||
#! 81 38 33 33 33 00 cmpl $0x333333,(%eax)
|
||||
over -128 127 between? [
|
||||
HEX: 83 compile-byte
|
||||
HEX: 38 + compile-byte
|
||||
BIN: 111 0 MOD-R/M
|
||||
compile-byte
|
||||
] [
|
||||
HEX: 81 compile-byte
|
||||
HEX: 38 + compile-byte
|
||||
BIN: 111 0 MOD-R/M
|
||||
compile-cell
|
||||
] ifte ;
|
||||
|
||||
|
@ -127,8 +130,8 @@ USE: combinators
|
|||
4 DATASTACK I+[I] ;
|
||||
|
||||
: [LITERAL] ( cell -- )
|
||||
#! Push literal on data stack by following an indirect
|
||||
#! pointer.
|
||||
#! Push complex literal on data stack by following an
|
||||
#! indirect pointer.
|
||||
ECX PUSH
|
||||
( cell -- ) ECX [I]>R
|
||||
DATASTACK EAX [I]>R
|
||||
|
|
|
@ -132,6 +132,7 @@ USE: words
|
|||
] with-scope ;
|
||||
|
||||
: begin-compiling ( word -- )
|
||||
cell compile-aligned
|
||||
compiled-offset "compiled-xt" rot set-word-property ;
|
||||
|
||||
: end-compiling ( word -- xt )
|
||||
|
|
|
@ -26,24 +26,42 @@
|
|||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: compiler
|
||||
USE: combinators
|
||||
USE: words
|
||||
USE: stack
|
||||
USE: kernel
|
||||
USE: math
|
||||
USE: lists
|
||||
|
||||
: compile-ifte ( -- )
|
||||
pop-literal pop-literal commit-literals
|
||||
: compile-f-test ( -- fixup )
|
||||
#! Push addr where we write the branch target address.
|
||||
POP-DS
|
||||
! ptr to condition is now in EAX
|
||||
f address-of EAX CMP-I-[R]
|
||||
compiled-offset JE ( -- fixup ) >r
|
||||
( t -- ) compile-quot
|
||||
RET
|
||||
compiled-offset r> ( fixup -- ) fixup
|
||||
( f -- ) compile-quot
|
||||
RET ;
|
||||
compiled-offset JE ;
|
||||
|
||||
[ compile-ifte ]
|
||||
"compiling"
|
||||
"ifte" [ "combinators" ] search
|
||||
set-word-property
|
||||
: branch-target ( fixup -- )
|
||||
cell compile-aligned compiled-offset swap fixup ;
|
||||
|
||||
: compile-else ( fixup -- fixup )
|
||||
#! Push addr where we write the branch target address,
|
||||
#! and fixup branch target address from compile-f-test.
|
||||
#! Push f for the fixup if we're tail position.
|
||||
tail? [ RET f ] [ 0 JUMP ] ifte swap branch-target ;
|
||||
|
||||
: compile-end-if ( fixup -- )
|
||||
tail? [ drop RET ] [ branch-target ] ifte ;
|
||||
|
||||
: compile-ifte ( -- )
|
||||
pop-literal pop-literal commit-literals
|
||||
compile-f-test >r
|
||||
( t -- ) compile-quot
|
||||
r> compile-else >r
|
||||
( f -- ) compile-quot
|
||||
r> compile-end-if ;
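The new `compile-ifte` relies on forward-branch fixups: a branch is emitted before its target is known, the location of its operand is remembered, and the operand is patched once the target offset exists. The C sketch below illustrates the general technique under stated assumptions. It uses the x86 `JE rel32` encoding (`0F 84`), a little-endian host, and hypothetical names (`CodeBuf`, `emit_je_placeholder`, `patch_branch`). The Factor words `JE` and `fixup` may encode the target differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size code buffer. */
typedef struct {
    uint8_t code[64];
    size_t  len;
} CodeBuf;

/* Emit JE with a zeroed 32-bit displacement; return the displacement's offset. */
static size_t emit_je_placeholder(CodeBuf *b)
{
    assert(b->len + 6 <= sizeof b->code);
    b->code[b->len++] = 0x0F;
    b->code[b->len++] = 0x84;
    size_t fixup = b->len;
    memset(b->code + b->len, 0, 4);
    b->len += 4;
    return fixup;
}

/* Patch the displacement so the branch lands on target.
   rel32 is measured from the end of the 4-byte displacement field. */
static void patch_branch(CodeBuf *b, size_t fixup, size_t target)
{
    int32_t rel = (int32_t)target - (int32_t)(fixup + 4);
    memcpy(b->code + fixup, &rel, 4);   /* assumes a little-endian host */
}
```

In this picture, `compile-f-test` corresponds to emitting the placeholder, and `branch-target` corresponds to the patch step once the false branch's start offset is known.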
|
||||
|
||||
[
|
||||
[ ifte compile-ifte ]
|
||||
] [
|
||||
unswons "compiling" swap set-word-property
|
||||
] each
|
||||
|
|
|
@ -25,7 +25,6 @@
|
|||
! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
|
||||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: cross-compiler
|
||||
USE: combinators
|
||||
USE: kernel
|
||||
USE: lists
|
||||
|
@ -127,7 +126,7 @@ DEFER: set-word-plist
|
|||
IN: unparser
|
||||
DEFER: unparse-float
|
||||
|
||||
IN: cross-compiler
|
||||
IN: image
|
||||
|
||||
: primitives, ( -- )
|
||||
1 [
|
||||
|
|
|
@ -25,7 +25,12 @@
|
|||
! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
|
||||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: cross-compiler
|
||||
IN: namespaces
|
||||
|
||||
( Java Factor doesn't have this )
|
||||
: namespace-buckets 23 ;
|
||||
|
||||
IN: image
|
||||
USE: combinators
|
||||
USE: errors
|
||||
USE: hashtables
|
||||
|
@ -254,12 +259,6 @@ DEFER: '
|
|||
|
||||
( Word definitions )
|
||||
|
||||
IN: namespaces
|
||||
|
||||
: namespace-buckets 23 ;
|
||||
|
||||
IN: cross-compiler
|
||||
|
||||
: (vocabulary) ( name -- vocab )
|
||||
#! Vocabulary for target image.
|
||||
dup "vocabularies" get hash dup [
|
||||
|
|
|
@ -25,7 +25,7 @@
|
|||
! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
|
||||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: cross-compiler
|
||||
IN: image
|
||||
USE: combinators
|
||||
USE: kernel
|
||||
USE: lists
|
||||
|
|
|
@ -26,7 +26,7 @@
|
|||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
USE: lists
|
||||
USE: cross-compiler
|
||||
USE: image
|
||||
|
||||
primitives,
|
||||
[
|
||||
|
|
|
@ -25,7 +25,7 @@
|
|||
! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
|
||||
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
IN: cross-compiler
|
||||
IN: image
|
||||
USE: namespaces
|
||||
USE: parser
|
||||
|
||||
|
|
|
@ -28,7 +28,6 @@
|
|||
IN: syntax
|
||||
|
||||
USE: combinators
|
||||
USE: cross-compiler
|
||||
USE: errors
|
||||
USE: kernel
|
||||
USE: lists
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
USE: test
|
||||
USE: cross-compiler
|
||||
USE: image
|
||||
USE: namespaces
|
||||
USE: stdio
|
||||
|
||||
|
|
|
@ -75,3 +75,8 @@ garbage-collection
|
|||
: one-rec [ f one-rec ] [ "hi" ] ifte ; compiled
|
||||
|
||||
[ "hi" ] [ t one-rec ] unit-test
|
||||
|
||||
: after-ifte-test
|
||||
t [ ] [ ] ifte 5 ; compiled
|
||||
|
||||
[ 5 ] [ after-ifte-test ] unit-test
|
||||
|
|
|
@ -7,7 +7,8 @@ void* alloc_guarded(CELL size)
|
|||
int pagesize = getpagesize();
|
||||
|
||||
char* array = mmap((void*)0,pagesize + size + pagesize,
|
||||
PROT_READ | PROT_WRITE,MAP_ANON | MAP_PRIVATE,-1,0);
|
||||
PROT_READ | PROT_WRITE | PROT_EXEC,
|
||||
MAP_ANON | MAP_PRIVATE,-1,0);
|
||||
|
||||
if(mprotect(array,pagesize,PROT_NONE) == -1)
|
||||
fatal_error("Cannot allocate low guard page",(CELL)array);
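Adding `PROT_EXEC` lets machine code written into this region be executed, which the native compiler presumably needs. The self-contained C sketch below shows the same guard-page pattern with the protection flags passed in by the caller. `guarded_alloc` is a hypothetical name, and rounding the size up to whole pages is an assumption of this sketch rather than something the hunk above does.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map size bytes (rounded up to whole pages) with an
   inaccessible guard page on each side. Returns NULL on failure. */
static void *guarded_alloc(size_t size, int prot)
{
    assert(size > 0);
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;
    size_t page = (size_t)ps;
    size_t body = (size + page - 1) / page * page;

    char *base = mmap(NULL, page + body + page, prot,
                      MAP_ANON | MAP_PRIVATE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the first and last page inaccessible so overruns fault. */
    if (mprotect(base, page, PROT_NONE) == -1 ||
        mprotect(base + page + body, page, PROT_NONE) == -1) {
        munmap(base, page + body + page);
        return NULL;
    }
    return base + page;
}
```

A caller that wants executable memory would pass `PROT_READ | PROT_WRITE | PROT_EXEC`; some hardened systems refuse mappings that are writable and executable at once, in which case the call returns `NULL`.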
|
||||