PowerPC backend fixes, new %write-barrier VOP

cvs
Slava Pestov 2005-06-04 06:20:54 +00:00
parent a76f7107c3
commit 8453c00bbf
11 changed files with 354 additions and 127 deletions

198
doc/compiler.tex Normal file

@ -0,0 +1,198 @@
\documentclass{article}
\usepackage[plainpages=false,colorlinks]{hyperref}
\usepackage[style=list,toc]{glossary}
\usepackage{alltt}
\usepackage{times}
\usepackage{tabularx}
\usepackage{epsfig}
\usepackage{epsf}
\usepackage{amssymb}
\usepackage{epstopdf}
\pagestyle{headings}
\setcounter{tocdepth}{3}
\setcounter{secnumdepth}{3}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}
\newcommand{\bs}{\char'134}
\newcommand{\dq}{\char'42}
\newcommand{\tto}{\symbol{123}}
\newcommand{\ttc}{\symbol{125}}
\newcommand{\pound}{\char'43}
\newcommand{\vocabulary}[1]{\emph{Vocabulary:} \texttt{#1}&\\}
\newcommand{\parsingword}[2]{\index{\texttt{#1}}\emph{Parsing word:} \texttt{#2}&\\}
\newcommand{\ordinaryword}[2]{\index{\texttt{#1}}\emph{Word:} \texttt{#2}&\\}
\newcommand{\symbolword}[1]{\index{\texttt{#1}}\emph{Symbol:} \texttt{#1}&\\}
\newcommand{\classword}[1]{\index{\texttt{#1}}\emph{Class:} \texttt{#1}&\\}
\newcommand{\genericword}[2]{\index{\texttt{#1}}\emph{Generic word:} \texttt{#2}&\\}
\newcommand{\predword}[1]{\ordinaryword{#1}{#1~( object -- ?~)}}
\setlength{\tabcolsep}{1mm}
\newcommand{\wordtable}[1]{
%HEVEA\renewcommand{\index}[1]{}
%HEVEA\renewcommand{\glossary}[1]{}
\begin{tabularx}{12cm}{lX}
\hline
#1
\hline
\end{tabularx}
}
\makeatletter
\makeatother
\begin{document}
\title{The Factor compiler}
\author{Slava Pestov}
\maketitle
\tableofcontents{}
\section{The compiler}
The compiler can provide a substantial speed boost for words whose stack effect can be inferred. Words without a known stack effect cannot be compiled, and must be run in the interpreter. The compiler generates native code, and so far, x86 and PowerPC backends have been developed.
To compile a single word, call \texttt{compile}:
\begin{alltt}
\textbf{ok} \bs pref-size compile
\textbf{Compiling pref-size}
\end{alltt}
During bootstrap, all words in the library with a known stack effect are compiled. You can
circumvent this, for whatever reason, by passing the \texttt{-no-compile} switch during
bootstrap:
\begin{alltt}
\textbf{bash\$} ./f boot.image.le32 -no-compile
\end{alltt}
The compiler has two limitations you must be aware of. First, if an exception is thrown in compiled code, the return stack will be incomplete, since compiled words do not push themselves there. Second, compiled code cannot be profiled. These limitations will be resolved in a future release.
The compiler consists of multiple stages. First, a dataflow graph is inferred, and various optimizations are performed on this graph. The graph is then transformed into a linear representation, which is optimized further. Finally, machine code is generated from the linear representation.
\subsection{Linear intermediate representation}
The linear IR is the second of the two intermediate
representations used by Factor. It is essentially a high-level
assembly language. Linear IR operations are called \emph{virtual operations}, or VOPs. The last stage of the compiler generates machine code instructions corresponding to each VOP in the linear IR.
To perform every stage except machine code generation, use the \texttt{precompile} word. This dumps the optimized linear IR instead of generating code, which can be useful for inspecting the compiler's output.
\begin{alltt}
\textbf{ok} \bs append precompile
\textbf{<< \%prologue << vop [ ] [ ] [ ] [ ] >> >>
<< \%peek-d << vop [ ] [ 1 ] [ << vreg ... 0 >> ] [ ] >> >>
<< \%peek-d << vop [ ] [ 0 ] [ << vreg ... 1 >> ] [ ] >> >>
<< \%replace-d << vop [ ] [ 0 << vreg ... 0 >> ] [ ] [ ] >> >>
<< \%replace-d << vop [ ] [ 1 << vreg ... 1 >> ] [ ] [ ] >> >>
<< \%inc-d << vop [ ] [ -1 ] [ ] [ ] >> >>
<< \%return << vop [ ] [ ] [ ] [ ] >> >>}
\end{alltt}
\subsubsection{Control flow}
\begin{description}
\item[\texttt{\%prologue}] On x86, this does nothing. On PowerPC, at the start of
each word that calls a subroutine, we store the link
register in r0, then push r0 on the C stack.
\item[\texttt{\%call-label}] On PowerPC, uses near calling convention, where the
caller pushes the return address.
\item[\texttt{\%call}] On PowerPC, if calling a primitive, compiles a sequence that loads a 32-bit literal and jumps to that address. For other compiled words, compiles an immediate branch with link. Such a branch reaches at most 32 megabytes in either direction, so every compiled word definition must be within 32 megabytes of each of its call sites.
\item[\texttt{\%jump-label}] Like \texttt{\%call-label} except the return address is not saved. Used for tail calls.
\item[\texttt{\%jump}] Like \texttt{\%call} except the return address is not saved. Used for tail calls.
\item[\texttt{\%dispatch}] Compile a piece of code that jumps to an offset in a
jump table indexed by an integer. The jump table, which consists of \texttt{\%target-label} and \texttt{\%target} VOPs, must immediately follow this VOP.
\item[\texttt{\%target}] Not supported on PowerPC.
\item[\texttt{\%target-label}] A jump table entry.
\end{description}
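The range limit on \texttt{\%call} follows from the PowerPC instruction encoding: an immediate branch holds a 24-bit word displacement, which is shifted left by two bits and sign-extended. The Python sketch below is not part of the Factor sources, and all of its names are made up for illustration. It only checks whether a relative branch between two addresses can be encoded this way.

```python
# Illustrative sketch only: names are hypothetical, not taken from the Factor sources.

LI_BITS = 24   # width of the LI field in a PowerPC I-form branch
SHIFT = 2      # instructions are word-aligned, so LI is scaled by 4

MIN_DISP = -(1 << (LI_BITS + SHIFT - 1))       # -32 MB
MAX_DISP = (1 << (LI_BITS + SHIFT - 1)) - 4    # +32 MB minus one instruction


def branch_reachable(call_site: int, target: int) -> bool:
    """Return True if a relative branch at call_site can reach target."""
    disp = target - call_site
    return disp % 4 == 0 and MIN_DISP <= disp <= MAX_DISP


def encode_li(call_site: int, target: int) -> int:
    """Return the 24-bit LI field for a relative branch, or raise ValueError."""
    if not branch_reachable(call_site, target):
        raise ValueError("target out of range for an immediate branch")
    # Drop the two always-zero low bits and keep 24 bits (two's complement).
    return ((target - call_site) >> SHIFT) & ((1 << LI_BITS) - 1)
```

A code generator could call a check like \texttt{branch\_reachable} before emitting the branch and fall back to the load-literal-and-jump sequence when the target is too far away. Whether Factor does so is not stated here.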
\subsubsection{Slots and objects}
\begin{description}
\item[\texttt{\%slot}] The untagged object is in \texttt{vop-out-1}, the tagged slot
number is in \texttt{vop-in-1}.
\item[\texttt{\%fast-slot}] The tagged object is in \texttt{vop-out-1}, the pointer offset is
in \texttt{vop-in-1}. The offset already takes the type tag into
account, so it's just one instruction to load.
\item[\texttt{\%set-slot}] The new value is \texttt{vop-in-1}, the object is \texttt{vop-in-2}, and
the slot number is \texttt{vop-in-3}.
\item[\texttt{\%fast-set-slot}] The new value is \texttt{vop-in-1}, the object is \texttt{vop-in-2}, and
the slot offset is \texttt{vop-in-3}.
The offset already takes the type tag into account, so
it's just one instruction to store.
\item[\texttt{\%write-barrier}] Mark the card containing the object pointed to by \texttt{vop-in-1}.
\item[\texttt{\%untag}] Mask off the tag bits of \texttt{vop-in-1}, store result in
\texttt{vop-in-1} (which should equal \texttt{vop-out-1}!)
\item[\texttt{\%untag-fixnum}] Shift \texttt{vop-in-1} to the right by 3 bits, store result in
\texttt{vop-in-1} (which should equal \texttt{vop-out-1}!)
\item[\texttt{\%type}] Intrinsic version of the \texttt{type} primitive. It outputs an
unboxed value in \texttt{vop-out-1}.
\end{description}
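The bit manipulation behind these VOPs can be sketched in Python. This is only an illustration, not the code the backends emit. The 3-bit tag width comes from the shift count above. The sketch also assumes that fixnums carry a zero tag, that cells are 4 bytes wide, and that the card table is a plain byte array indexed from the start of the heap. \texttt{CARD\_BITS} and \texttt{CARD\_MARK} are placeholder values, and all function names are hypothetical.

```python
# Illustrative sketch only. TAG_BITS comes from the text (3 tag bits);
# CARD_BITS and CARD_MARK are placeholder values, not Factor's actual constants.

TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1
CARD_BITS = 7       # assumed: each card covers 2**7 bytes
CARD_MARK = 0x80    # assumed: bit set in a card's byte when it is dirty


def untag(cell: int) -> int:
    """%untag: clear the tag bits, leaving an aligned pointer."""
    return cell & ~TAG_MASK


def untag_fixnum(cell: int) -> int:
    """%untag-fixnum: arithmetic shift right by the tag width."""
    return cell >> TAG_BITS


def slot_offset(tagged_slot_number: int) -> int:
    """Turn a tagged fixnum slot number into a byte offset (4-byte cells).

    With a zero tag, a fixnum n is stored as n << 3, so shifting right
    by 1 yields n * 4.
    """
    return tagged_slot_number >> 1


def write_barrier(cards: bytearray, heap_base: int, address: int) -> None:
    """%write-barrier: mark the card that contains `address`."""
    index = (address - heap_base) >> CARD_BITS
    cards[index] |= CARD_MARK
```

For example, with these placeholder constants, a store to an address 130 bytes past the heap base marks card 1, because \texttt{130 >> 7} is 1.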
\subsubsection{Alien interface}
\begin{description}
\item[\texttt{\%parameters}] Ignored on x86.
\item[\texttt{\%parameter}] Ignored on x86.
\item[\texttt{\%unbox}] An unboxer function takes a value from the data stack
and converts it into a C value.
\item[\texttt{\%box}] A boxer function takes a C value as a parameter,
converts it into a Factor value, and pushes it on the data
stack.
On x86, C functions return integers in EAX.
\item[\texttt{\%box-float}] On x86, C functions return floats on the FP stack.
\item[\texttt{\%box-double}] On x86, C functions return doubles on the FP stack.
\item[\texttt{\%cleanup}] Ignored on PowerPC.
On x86, in the cdecl ABI, the caller must pop input
parameters off the C stack. In stdcall, the callee does
it, so this node is not used in that case.
\end{description}
\end{document}


@ -1,58 +0,0 @@
VOPs:
%prologue on x86, this does nothing. On PowerPC, at the start of
each word that calls a subroutine, we store the link
register in r0, then push r0 on the C stack.
%call-label on PowerPC, uses near calling convention, where the
caller pushes the return address.
%dispatch compile a piece of code that jumps to an offset in a
jump table indexed by an integer. The jump table must immediately follow this VOP.
%slot the untagged object is in vop-out-1, the tagged slot
number is in vop-in-1.
%fast-slot the tagged object is in vop-out-1, the pointer offset is
in vop-in-1. the offset already takes the type tag into
account, so its just one instruction to load.
%set-slot the new value is vop-in-1, the object is vop-in-2, and
the slot number is vop-in-3.
%fast-set-slot the new value is vop-in-1, the object is vop-in-2, and
the slot offset is vop-in-3.
the offset already takes the type tag into account, so
it's just one instruction to load.
%parameters ignored on x86.
%parameter ignored on x86.
%unbox an unboxer function takes a value from the data stack
and converts it into a C value.
%box a boxer function takes a C value as a parameter and
converts into a Factor value, and pushes it on the data
stack.
on x86, C functions return integers in EAX.
%box-float on x86, C functions return floats on the FP stack.
%box-double on x86, C functions return doubles on the FP stack.
%cleanup ignored on PowerPC.
on x86, in the cdecl ABI, the caller must pop input
parameters off the C stack. In stdcall, the callee does
it, so this node is not used in that case.
%untag mask off the low 3 bits of vop-in-1, store result in
vop-in-1 (which should equal vop-out-1!)
%untag-fixnum shift vop-in-1 to the right by 3 bits, store result in
vop-in-1 (which should equal vop-out-1!)
%type Intrinstic version of type primitive. It outputs an
unboxed value in vop-out-1.


@ -40,10 +40,11 @@ import org.gjt.sp.util.Log;
public class ExternalFactor extends VocabularyLookup
{
//{{{ ExternalFactor constructor
public ExternalFactor(int port)
public ExternalFactor(String host, int port)
{
/* Start stream server */;
streamServer = port;
this.port = port;
this.host = host;
for(int i = 1; i < 6; i++)
{
@ -74,7 +75,7 @@ public class ExternalFactor extends VocabularyLookup
{
if(closed)
throw new IOException("Socket closed");
return new Socket("localhost",streamServer);
return new Socket(host,port);
} //}}}
//{{{ openWire() method
@ -343,6 +344,7 @@ public class ExternalFactor extends VocabularyLookup
private DataInputStream in;
private DataOutputStream out;
private int streamServer;
private String host;
private int port;
//}}}
}


@ -3,7 +3,7 @@
/*
* $Id$
*
* Copyright (C) 2004 Slava Pestov.
* Copyright (C) 2004, 2005 Slava Pestov.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
@ -47,26 +47,63 @@ public class FactorOptionPane extends AbstractOptionPane
//{{{ _init() method
protected void _init()
{
ButtonGroup grp = new ButtonGroup();
RadioAction h = new RadioAction();
addComponent(jEdit.getProperty("options.factor.port"),
port = new JTextField(jEdit.getProperty("factor.external.port")));
addComponent(programType = new JRadioButton(jEdit.getProperty(
"options.factor.type.program")));
programType.addActionListener(h);
grp.add(programType);
addComponent(jEdit.getProperty("options.factor.program"),
createProgramField(jEdit.getProperty("factor.external.program")));
addComponent(jEdit.getProperty("options.factor.image"),
createImageField(jEdit.getProperty("factor.external.image")));
addComponent(jEdit.getProperty("options.factor.args"),
createArgsField(jEdit.getProperty("factor.external.args")));
addComponent(remoteType = new JRadioButton(jEdit.getProperty(
"options.factor.type.remote")));
remoteType.addActionListener(h);
grp.add(remoteType);
addComponent(jEdit.getProperty("options.factor.host"),
host = new JTextField(jEdit.getProperty("factor.external.host")));
String type = jEdit.getProperty("factor.external.type");
if("program".equals(type))
programType.setSelected(true);
else
remoteType.setSelected(true);
updateEnabled();
} //}}}
//{{{ _save() method
protected void _save()
{
if(programType.isSelected())
jEdit.setProperty("factor.external.type","program");
else
jEdit.setProperty("factor.external.type","remote");
jEdit.setProperty("factor.external.program",program.getText());
jEdit.setProperty("factor.external.image",image.getText());
jEdit.setProperty("factor.external.args",args.getText());
jEdit.setProperty("factor.external.host",host.getText());
jEdit.setProperty("factor.external.port",port.getText());
} //}}}
//{{{ Private members
private JTextField port;
private JRadioButton programType;
private JTextField program;
private JTextField image;
private JTextField args;
private JRadioButton remoteType;
private JTextField host;
//{{{ createProgramField() method
private JComponent createProgramField(String text)
@ -106,18 +143,30 @@ public class FactorOptionPane extends AbstractOptionPane
JButton button = new RolloverButton(
GUIUtilities.loadIcon("Open.png"));
button.setToolTipText(jEdit.getProperty("options.factor.choose"));
button.addActionListener(new ActionHandler(field));
button.addActionListener(new ChooseFileAction(field));
h.add(button);
return h;
} //}}}
//{{{ ActionHandler class
class ActionHandler implements ActionListener
//{{{ updateEnabled() method
private void updateEnabled()
{
boolean b = programType.isSelected();
program.setEnabled(b);
image.setEnabled(b);
args.setEnabled(b);
host.setEnabled(!b);
} //}}}
//}}}
//{{{ ChooseFileAction class
class ChooseFileAction implements ActionListener
{
private JTextField field;
ActionHandler(JTextField field)
ChooseFileAction(JTextField field)
{
this.field = field;
}
@ -134,4 +183,13 @@ public class FactorOptionPane extends AbstractOptionPane
field.setText(paths[0]);
}
} //}}}
//{{{ RadioAction class
class RadioAction implements ActionListener
{
public void actionPerformed(ActionEvent evt)
{
updateEnabled();
}
} //}}}
}


@ -95,18 +95,9 @@ public class FactorPlugin extends EditPlugin
}
} //}}}
//{{{ getExternalInstance() method
/**
* Returns the object representing a connection to an external Factor instance.
* It will start the interpreter if it's not already running.
*/
public synchronized static ExternalFactor getExternalInstance()
//{{{ startExternalProcess() method
private static void startExternalProcess(int port)
{
if(external == null)
{
InputStream in = null;
OutputStream out = null;
try
{
String exePath = jEdit.getProperty(
@ -118,7 +109,7 @@ public class FactorPlugin extends EditPlugin
args.add(imagePath);
args.add("-null-stdio");
args.add("-shell=telnet");
args.add("-telnetd-port=" + PORT);
args.add("-telnetd-port=" + port);
String[] extraArgs = jEdit.getProperty(
"factor.external.args")
.split(" ");
@ -130,8 +121,6 @@ public class FactorPlugin extends EditPlugin
new File(MiscUtilities
.getParentOfPath(imagePath)));
external = new ExternalFactor(PORT);
process.getOutputStream().close();
process.getInputStream().close();
process.getErrorStream().close();
@ -143,6 +132,32 @@ public class FactorPlugin extends EditPlugin
Log.log(Log.ERROR,FactorPlugin.class,e);
process = null;
}
} //}}}
//{{{ getExternalInstance() method
/**
* Returns the object representing a connection to an external Factor instance.
* It will start the interpreter if it's not already running.
*/
public synchronized static ExternalFactor getExternalInstance()
{
if(external == null)
{
InputStream in = null;
OutputStream out = null;
String type = jEdit.getProperty("factor.external.type");
String host;
int port = jEdit.getIntegerProperty("factor.external.port",PORT);
if("program".equals(type))
{
host = "localhost";
startExternalProcess(port);
}
else
host = jEdit.getProperty("factor.external.host");
external = new ExternalFactor(host,port);
}
return external;
@ -604,8 +619,7 @@ public class FactorPlugin extends EditPlugin
return;
}
Asset asset = data.getAssetAtPosition(
textArea.getCaretPosition());
IAsset asset = data.getAssetAtOffset(textArea.getCaretPosition());
if(asset == null)
{
@ -618,7 +632,7 @@ public class FactorPlugin extends EditPlugin
if(newWord == null)
return;
int start = asset.start.getOffset();
int start = asset.getStart().getOffset();
/* Hack */
start = buffer.getLineStartOffset(
buffer.getLineOfOffset(start));


@ -91,9 +91,16 @@ plugin.factor.jedit.FactorPlugin.option-pane=factor
options.factor.label=Factor
options.factor.code=new factor.jedit.FactorOptionPane();
options.factor.port=Port number:
options.factor.type.program=Communicate with local Factor
options.factor.program=Factor runtime executable:
options.factor.image=Factor image:
options.factor.choose=Choose file...
options.factor.args=Additional arguments:
options.factor.type.remote=Communicate with remote Factor
options.factor.host=Host name:
factor.external.args=-jedit
factor.external.host=localhost
factor.external.port=9999
factor.external.type=program


@ -87,12 +87,14 @@ sequences words ;
in-2
2 %dec-d ,
slot@ >r 0 1 r> %fast-set-slot ,
0 %write-barrier ,
] [
drop
in-3
3 %dec-d ,
1 %untag ,
0 1 2 %set-slot ,
1 %write-barrier ,
] ifte
] "intrinsic" set-word-prop


@ -211,12 +211,12 @@ M: word BC >r 0 BC r> relative-14 ;
: BLRL 20 BCLRL ;
: BCCTR 0 264 0 0 b-form 19 insn ;
: BCTR 20 BCCTR ;
: MFSPR 5 shift 339 xfx-form 31 insn ;
: MFLR 8 MFSPR ;
: MFCTR 9 MFSPR ;
: MFXER 1 MFSPR ; : MFLR 8 MFSPR ; : MFCTR 9 MFSPR ;
: MTSPR 5 shift 467 xfx-form 31 insn ;
: MTLR 8 MTSPR ;
: MTCTR 9 MTSPR ;
: MTXER 1 MTSPR ; : MTLR 8 MTSPR ; : MTCTR 9 MTSPR ;
: LOAD32 >r w>h/h r> tuck LIS dup rot ORI ;


@ -16,26 +16,28 @@ M: %slot generate-node ( vop -- )
M: %fast-slot generate-node ( vop -- )
dup vop-out-1 v>operand dup rot vop-in-1 LWZ ;
: write-barrier ( reg -- )
#! Mark the card pointed to by vreg.
dup dup card-bits SRAWI
dup dup 16 ADD
20 over 0 LBZ
20 20 card-mark ORI
20 swap 0 STB ;
M: %set-slot generate-node ( vop -- )
dup vop-in-3 v>operand over vop-in-2 v>operand
! turn tagged fixnum slot # into an offset, multiple of 4
over dup 1 SRAWI
! compute slot address in vop-in-2
over dup pick ADD
over dup rot ADD
! store new slot value
>r >r vop-in-1 v>operand r> 0 STW r> write-barrier ;
>r vop-in-1 v>operand r> 0 STW ;
M: %fast-set-slot generate-node ( vop -- )
dup vop-in-1 v>operand over vop-in-2 v>operand
[ rot vop-in-3 STW ] keep write-barrier ;
[ vop-in-1 v>operand ] keep
[ vop-in-2 v>operand ] keep
vop-in-3 STW ;
M: %write-barrier generate-node ( vop -- )
#! Mark the card pointed to by vreg.
vop-in-1 v>operand
dup dup card-bits SRAWI
dup dup 16 ADD
20 over 0 LBZ
20 20 card-mark ORI
20 swap 0 STB ;
: userenv ( reg -- )
#! Load the userenv pointer in a virtual register.

View File

@ -178,6 +178,9 @@ VOP: %fast-set-slot
<%fast-set-slot> ;
M: %fast-set-slot basic-block? drop t ;
VOP: %write-barrier
: %write-barrier ( ptr ) <vreg> unit f f <%write-barrier> ;
! fixnum intrinsics
VOP: %fixnum+ : %fixnum+ 3-vop <%fixnum+> ;
VOP: %fixnum- : %fixnum- 3-vop <%fixnum-> ;
@ -261,7 +264,7 @@ M: %untag-fixnum basic-block? drop t ;
swap vop-out-1 = [ "bad VOP destination" throw ] unless ;
: check-src ( vop reg -- )
swap vop-out-1 = [ "bad VOP source" throw ] unless ;
swap vop-in-1 = [ "bad VOP source" throw ] unless ;
VOP: %getenv
: %getenv swap src/dest-vop <%getenv> ;


@ -19,8 +19,9 @@ M: %fast-slot generate-node ( vop -- )
: card-offset 1 getenv ;
: write-barrier ( reg -- )
M: %write-barrier generate-node ( vop -- )
#! Mark the card pointed to by vreg.
vop-in-1 v>operand
dup card-bits SHR
card-offset 2list card-mark OR
0 rel-cards ;
@ -30,15 +31,13 @@ M: %set-slot generate-node ( vop -- )
! turn tagged fixnum slot # into an offset, multiple of 4
over 1 SHR
! compute slot address in vop-in-2
2dup ADD
dupd ADD
! store new slot value
>r >r vop-in-1 v>operand r> unit swap MOV r>
write-barrier ;
>r vop-in-1 v>operand r> unit swap MOV ;
M: %fast-set-slot generate-node ( vop -- )
dup vop-in-3 over vop-in-2 v>operand
[ swap 2list swap vop-in-1 v>operand MOV ] keep
write-barrier ;
swap 2list swap vop-in-1 v>operand MOV ;
: userenv@ ( n -- addr )
cell * "userenv" f dlsym + ;