From 34041bedbf71a684af82eebfb160812bc86e2c66 Mon Sep 17 00:00:00 2001 From: Slava Pestov Date: Sat, 11 Sep 2004 19:26:24 +0000 Subject: [PATCH] compiler work --- Makefile | 2 +- doc/devel-guide.tex | 69 ++++++++++++------- factor/FactorInterpreter.java | 2 +- library/compiler/assembler.factor | 8 +++ library/compiler/assembly-x86.factor | 25 ++++--- library/compiler/compiler.factor | 1 + library/compiler/words.factor | 42 +++++++---- library/cross-compiler.factor | 3 +- library/image.factor | 13 ++-- library/platform/jvm/cross-compiler.factor | 2 +- library/platform/native/boot.factor | 2 +- library/platform/native/cross-compiler.factor | 2 +- library/platform/native/parse-syntax.factor | 1 - library/test/image.factor | 2 +- library/test/x86-compiler/compiler.factor | 5 ++ native/memory.c | 3 +- 16 files changed, 117 insertions(+), 65 deletions(-) diff --git a/Makefile b/Makefile index 1ef2485b08..c8c1ce7f8a 100644 --- a/Makefile +++ b/Makefile @@ -3,7 +3,7 @@ CC = gcc # On PowerPC G5: # CFLAGS = -mcpu=970 -mtune=970 -mpowerpc64 -ffast-math -O3 # On Pentium 4: -# CFLAGS = -march=pentium4 -ffast-math -O3 +# CFLAGS = -march=pentium4 -ffast-math -O3 -fomit-frame-pointer # Add -fomit-frame-pointer if you don't care about debugging CFLAGS = -Os -g -Wall diff --git a/doc/devel-guide.tex b/doc/devel-guide.tex index 9b31957c3c..e04f1c21aa 100644 --- a/doc/devel-guide.tex +++ b/doc/devel-guide.tex @@ -409,6 +409,7 @@ is pushed on the stack. Try evaluating the following: call .s \emph{\{ 5 \}} \end{alltt} + \texttt{call} \texttt{( quot -{}- )} executes the quotation at the top of the stack. Using \texttt{call} with a literal quotation is useless; writing out the elements of the quotation has the same effect. @@ -442,10 +443,6 @@ More combinators will be introduced in later sections. \subsection{Recursion} -The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional. 
- -FIXME - \section{Numbers} Factor provides a rich set of math words. Factor numbers more closely model the mathematical concept of a number than other languages. Where possible, exact answers are given -- for example, adding or multiplying two integers never results in overflow, and dividing two integers yields a fraction rather than a truncated result. Complex numbers are supported, allowing many functions to be computed with parameters that would raise errors or return ``not a number'' in other languages. @@ -2400,11 +2397,28 @@ The name stack is really just a vector. The words \texttt{>n} and \texttt{n>} ar : n> ( n:namespace -- namespace ) namestack* vector-pop ; \end{alltt} -\section{Metaprogramming} +\section{The execution model in depth} -Recall that code quotations are in fact just linked lists. Factor code is data, and vice versa. Essentially, the interpreter iterates through code quotations, pushing literals and executing words. When a word is executed, one of two things happen -- either the word has a colon definition, and the interpreter is invoked recursively on the definition, or the word is primitive, and it is executed by the underlying virtual machine. A word is itself a first-class object. +\subsection{Recursion} -It is the job of the parser to transform source code denoting literals and words into their internal representations. This is done using a vocabulary of \emph{parsing words}. The prettyprinter does the converse, by printing out data structures in a parsable form (both to humans and Factor). Because code is data, text representation of source code doubles as a way to serialize almost any Factor object. +The idea of \emph{recursion} is key to understanding Factor. A \emph{recursive} word definition is one that refers to itself, usually in one branch of a conditional. 
+ + +tail recursion + +preserving values between iterations + +ensuring a consistent stack effect + +works well with lists, since only the head is passed + +not so well with vectors and strings -- need an obj+index + +\subsection{Combinators} + +a combinator is a recursive word that takes quotations + +how to ensure a consistent stack view for the quotations \subsection{Looking at words} @@ -2452,24 +2466,6 @@ If the primitive number is set to 1, the word is a colon definition and the para The word \texttt{define ( word quot -{}- )} defines a word to have the specified colon definition. Note that \texttt{create} and \texttt{define} perform an action somewhat analagous to the \texttt{: ... ;} notation for colon definitions, except at parse time rather than run time. -\subsection{The prettyprinter} - -We've already seen the word \texttt{.} which prints the top of the stack in a form that may be read back in. The word \texttt{prettyprint} is similar, except the output is in an indented, multiple-line format. Both words are in the \texttt{prettyprint} vocabulary. Here is an example: - -\begin{alltt} -{[} 1 {[} 2 3 4 {]} 5 {]} . -\emph{{[} 1 {[} 2 3 4 {]} 5 {]}} -{[} 1 {[} 2 3 4 {]} 5 {]} prettyprint -\emph{{[} - 1 {[} - 2 3 4 - {]} 5 -{]}} -\end{alltt} - - -\subsection{The parser} - \subsection{Parsing words} Lets take a closer look at Factor syntax. Consider a simple expression, @@ -2538,6 +2534,29 @@ next occurrence of \texttt{{}''}, and appends this string to the current node of the parse tree. Note that strings and words are different types of objects. Strings are covered in great detail later. +\section{NOT DONE} + +Recall that code quotations are in fact just linked lists. Factor code is data, and vice versa. Essentially, the interpreter iterates through code quotations, pushing literals and executing words. 
When a word is executed, one of two things happen -- either the word has a colon definition, and the interpreter is invoked recursively on the definition, or the word is primitive, and it is executed by the underlying virtual machine. A word is itself a first-class object. + +It is the job of the parser to transform source code denoting literals and words into their internal representations. This is done using a vocabulary of \emph{parsing words}. The prettyprinter does the converse, by printing out data structures in a parsable form (both to humans and Factor). Because code is data, text representation of source code doubles as a way to serialize almost any Factor object. + +\subsection{The prettyprinter} + +We've already seen the word \texttt{.} which prints the top of the stack in a form that may be read back in. The word \texttt{prettyprint} is similar, except the output is in an indented, multiple-line format. Both words are in the \texttt{prettyprint} vocabulary. Here is an example: + +\begin{alltt} +{[} 1 {[} 2 3 4 {]} 5 {]} . +\emph{{[} 1 {[} 2 3 4 {]} 5 {]}} +{[} 1 {[} 2 3 4 {]} 5 {]} prettyprint +\emph{{[} + 1 {[} + 2 3 4 + {]} 5 +{]}} +\end{alltt} + + +\subsection{Profiling} \section{PRACTICAL: Infix syntax} diff --git a/factor/FactorInterpreter.java b/factor/FactorInterpreter.java index 6ab77d451d..ba2a968241 100644 --- a/factor/FactorInterpreter.java +++ b/factor/FactorInterpreter.java @@ -35,7 +35,7 @@ import java.io.*; public class FactorInterpreter implements FactorObject, Runnable { - public static final String VERSION = "0.65"; + public static final String VERSION = "0.66"; public static final Cons DEFAULT_USE = new Cons("builtins", new Cons("syntax",new Cons("scratchpad",null))); diff --git a/library/compiler/assembler.factor b/library/compiler/assembler.factor index 6411bf076d..1466530bd3 100644 --- a/library/compiler/assembler.factor +++ b/library/compiler/assembler.factor @@ -26,6 +26,7 @@ ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
IN: compiler +USE: combinators USE: math USE: kernel USE: stack @@ -36,6 +37,13 @@ USE: stack : init-assembler ( -- ) compiled-offset literal-table + set-compiled-offset ; +: compile-aligned ( n -- ) + compiled-offset over mod dup 0 = [ + 2drop + ] [ + - compiled-offset + set-compiled-offset + ] ifte ; + : intern-literal ( obj -- lit# ) address-of literal-top set-compiled-cell diff --git a/library/compiler/assembly-x86.factor b/library/compiler/assembly-x86.factor index 34cdcf145f..286f119c0b 100644 --- a/library/compiler/assembly-x86.factor +++ b/library/compiler/assembly-x86.factor @@ -41,6 +41,9 @@ USE: combinators : ESI 6 ; : EDI 7 ; +: MOD-R/M ( r/m reg/opcode mod -- ) + 6 shift swap 3 shift bitor bitor compile-byte ; + : PUSH ( reg -- ) HEX: 50 + compile-byte ; @@ -57,7 +60,7 @@ USE: combinators drop HEX: a1 compile-byte ] [ HEX: 8b compile-byte - 3 shift BIN: 101 bitor compile-byte + BIN: 101 swap 0 MOD-R/M ] ifte compile-cell ; : I>[R] ( imm reg -- ) @@ -71,21 +74,21 @@ USE: combinators nip HEX: a3 compile-byte ] [ HEX: 89 compile-byte - swap 3 shift BIN: 101 bitor compile-byte + swap BIN: 101 swap 0 MOD-R/M ] ifte compile-cell ; : [R]>R ( reg reg -- ) #! MOV INDIRECT TO . - HEX: 8b compile-byte swap 3 shift bitor compile-byte ; + HEX: 8b compile-byte swap 0 MOD-R/M ; : R>[R] ( reg reg -- ) #! MOV TO INDIRECT . - HEX: 89 compile-byte swap 3 shift bitor compile-byte ; + HEX: 89 compile-byte swap 0 MOD-R/M ; : I+[I] ( imm addr -- ) #! ADD TO ADDRESS HEX: 81 compile-byte - HEX: 05 compile-byte + BIN: 101 0 0 MOD-R/M compile-cell compile-cell ; @@ -93,14 +96,14 @@ USE: combinators #! SUBTRACT FROM , STORE RESULT IN over -128 127 between? [ HEX: 83 compile-byte - HEX: e8 + compile-byte + BIN: 101 BIN: 11 MOD-R/M compile-byte ] [ dup EAX = [ drop HEX: 2d compile-byte ] [ HEX: 81 compile-byte - BIN: 11101000 bitor + BIN: 101 BIN: 11 MOD-R/M ] ifte compile-cell ] ifte ; @@ -111,11 +114,11 @@ USE: combinators #! 
81 38 33 33 33 00 cmpl $0x333333,(%eax) over -128 127 between? [ HEX: 83 compile-byte - HEX: 38 + compile-byte + BIN: 111 0 MOD-R/M compile-byte ] [ HEX: 81 compile-byte - HEX: 38 + compile-byte + BIN: 111 0 MOD-R/M compile-cell ] ifte ; @@ -127,8 +130,8 @@ USE: combinators 4 DATASTACK I+[I] ; : [LITERAL] ( cell -- ) - #! Push literal on data stack by following an indirect - #! pointer. + #! Push complex literal on data stack by following an + #! indirect pointer. ECX PUSH ( cell -- ) ECX [I]>R DATASTACK EAX [I]>R diff --git a/library/compiler/compiler.factor b/library/compiler/compiler.factor index 2103b06d38..c7c9a19f75 100644 --- a/library/compiler/compiler.factor +++ b/library/compiler/compiler.factor @@ -132,6 +132,7 @@ USE: words ] with-scope ; : begin-compiling ( word -- ) + cell compile-aligned compiled-offset "compiled-xt" rot set-word-property ; : end-compiling ( word -- xt ) diff --git a/library/compiler/words.factor b/library/compiler/words.factor index 973f7fb62c..2461a90151 100644 --- a/library/compiler/words.factor +++ b/library/compiler/words.factor @@ -26,24 +26,42 @@ ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. IN: compiler +USE: combinators USE: words USE: stack USE: kernel USE: math +USE: lists -: compile-ifte ( -- ) - pop-literal pop-literal commit-literals +: compile-f-test ( -- fixup ) + #! Push addr where we write the branch target address. POP-DS ! ptr to condition is now in EAX f address-of EAX CMP-I-[R] - compiled-offset JE ( -- fixup ) >r - ( t -- ) compile-quot - RET - compiled-offset r> ( fixup -- ) fixup - ( f -- ) compile-quot - RET ; + compiled-offset JE ; -[ compile-ifte ] -"compiling" -"ifte" [ "combinators" ] search -set-word-property +: branch-target ( fixup -- ) + cell compile-aligned compiled-offset swap fixup ; + +: compile-else ( fixup -- fixup ) + #! Push addr where we write the branch target address, + #! and fixup branch target address from compile-f-test. + #! Push f for the fixup if we're tail position. + tail? 
[ RET f ] [ 0 JUMP ] ifte swap branch-target ; + +: compile-end-if ( fixup -- ) + tail? [ drop RET ] [ branch-target ] ifte ; + +: compile-ifte ( -- ) + pop-literal pop-literal commit-literals + compile-f-test >r + ( t -- ) compile-quot + r> compile-else >r + ( f -- ) compile-quot + r> compile-end-if ; + +[ + [ ifte compile-ifte ] +] [ + unswons "compiling" swap set-word-property +] each diff --git a/library/cross-compiler.factor b/library/cross-compiler.factor index def7a9f474..16f14ff2d7 100644 --- a/library/cross-compiler.factor +++ b/library/cross-compiler.factor @@ -25,7 +25,6 @@ ! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -IN: cross-compiler USE: combinators USE: kernel USE: lists @@ -127,7 +126,7 @@ DEFER: set-word-plist IN: unparser DEFER: unparse-float -IN: cross-compiler +IN: image : primitives, ( -- ) 1 [ diff --git a/library/image.factor b/library/image.factor index 531ac6a2b7..5bfd0cc0f9 100644 --- a/library/image.factor +++ b/library/image.factor @@ -25,7 +25,12 @@ ! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -IN: cross-compiler +IN: namespaces + +( Java Factor doesn't have this ) +: namespace-buckets 23 ; + +IN: image USE: combinators USE: errors USE: hashtables @@ -254,12 +259,6 @@ DEFER: ' ( Word definitions ) -IN: namespaces - -: namespace-buckets 23 ; - -IN: cross-compiler - : (vocabulary) ( name -- vocab ) #! Vocabulary for target image. dup "vocabularies" get hash dup [ diff --git a/library/platform/jvm/cross-compiler.factor b/library/platform/jvm/cross-compiler.factor index ebfb392b80..5f9ff4aa34 100644 --- a/library/platform/jvm/cross-compiler.factor +++ b/library/platform/jvm/cross-compiler.factor @@ -25,7 +25,7 @@ ! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
-IN: cross-compiler +IN: image USE: combinators USE: kernel USE: lists diff --git a/library/platform/native/boot.factor b/library/platform/native/boot.factor index bd432d62fc..0018cfcae5 100644 --- a/library/platform/native/boot.factor +++ b/library/platform/native/boot.factor @@ -26,7 +26,7 @@ ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. USE: lists -USE: cross-compiler +USE: image primitives, [ diff --git a/library/platform/native/cross-compiler.factor b/library/platform/native/cross-compiler.factor index 8d276074e1..78a758cd60 100644 --- a/library/platform/native/cross-compiler.factor +++ b/library/platform/native/cross-compiler.factor @@ -25,7 +25,7 @@ ! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -IN: cross-compiler +IN: image USE: namespaces USE: parser diff --git a/library/platform/native/parse-syntax.factor b/library/platform/native/parse-syntax.factor index 5a258dbc60..61b1287fba 100644 --- a/library/platform/native/parse-syntax.factor +++ b/library/platform/native/parse-syntax.factor @@ -28,7 +28,6 @@ IN: syntax USE: combinators -USE: cross-compiler USE: errors USE: kernel USE: lists diff --git a/library/test/image.factor b/library/test/image.factor index 269ead768f..22203b3fc4 100644 --- a/library/test/image.factor +++ b/library/test/image.factor @@ -1,5 +1,5 @@ USE: test -USE: cross-compiler +USE: image USE: namespaces USE: stdio diff --git a/library/test/x86-compiler/compiler.factor b/library/test/x86-compiler/compiler.factor index 7d1988f608..5624ed9e86 100644 --- a/library/test/x86-compiler/compiler.factor +++ b/library/test/x86-compiler/compiler.factor @@ -75,3 +75,8 @@ garbage-collection : one-rec [ f one-rec ] [ "hi" ] ifte ; compiled [ "hi" ] [ t one-rec ] unit-test + +: after-ifte-test + t [ ] [ ] ifte 5 ; compiled + +[ 5 ] [ after-ifte-test ] unit-test diff --git a/native/memory.c b/native/memory.c index 38da000eae..cd37423588 100644 --- a/native/memory.c +++ 
b/native/memory.c @@ -7,7 +7,8 @@ void* alloc_guarded(CELL size) int pagesize = getpagesize(); char* array = mmap((void*)0,pagesize + size + pagesize, - PROT_READ | PROT_WRITE,MAP_ANON | MAP_PRIVATE,-1,0); + PROT_READ | PROT_WRITE | PROT_EXEC, + MAP_ANON | MAP_PRIVATE,-1,0); if(mprotect(array,pagesize,PROT_NONE) == -1) fatal_error("Cannot allocate low guard page",(CELL)array);
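Review note on the assembly-x86.factor hunks: the new MOD-R/M word replaces several hand-written opcode constants. The C sketch below is not part of the patch. It is an independent check, using the hypothetical helper names modrm and check_modrm_matches_old_constants, and it assumes the standard x86 ModR/M layout (mod in bits 7-6, reg/opcode in bits 5-3, r/m in bits 2-0). Under that assumption it compares the packed byte against the expressions the patch removes (HEX: 05, HEX: e8 +, HEX: 38 +, and the "3 shift ... bitor" forms).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: pack an x86 ModR/M byte.
   Argument order mirrors the stack effect of the Factor word
   MOD-R/M ( r/m reg/opcode mod -- ). */
static uint8_t modrm(unsigned rm, unsigned reg_opcode, unsigned mod)
{
    return (uint8_t)(((mod & 3u) << 6) | ((reg_opcode & 7u) << 3) | (rm & 7u));
}

/* Compare the packed byte with each expression the patch replaces,
   for all eight general-purpose register numbers (EAX=0 .. EDI=7). */
static void check_modrm_matches_old_constants(void)
{
    unsigned a, b;

    for (a = 0; a < 8; a++) {
        /* [I]>R and R>[I]: old form was  reg 3 shift BIN: 101 bitor */
        assert(modrm(5, a, 0) == ((a << 3) | 5u));
        /* I-R: old form was  HEX: e8 +  */
        assert(modrm(a, 5, 3) == 0xe8u + a);
        /* CMP-I-[R]: old form was  HEX: 38 +  */
        assert(modrm(a, 7, 0) == 0x38u + a);
        /* [R]>R and R>[R]: old form was  swap 3 shift bitor */
        for (b = 0; b < 8; b++)
            assert(modrm(b, a, 0) == ((a << 3) | b));
    }
    /* I+[I]: old form was the literal HEX: 05 */
    assert(modrm(5, 0, 0) == 0x05u);
}
```

Calling check_modrm_matches_old_constants() from a main function should return without tripping an assertion if the refactoring is byte-for-byte equivalent for these cases. It does not cover the one place where the old code omitted compile-byte after BIN: 11101000 bitor, which the new MOD-R/M call also changes.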