Get lazy lists and parser combinators working with new factor.

cvs
Chris Double 2005-04-30 03:20:11 +00:00
parent 19bc49ece3
commit aa5b8fe510
5 changed files with 905 additions and 560 deletions

View File

@ -28,17 +28,21 @@ USE: lists
USE: combinators USE: combinators
USE: kernel USE: kernel
USE: logic USE: logic
USE: sequences
USE: namespaces
: lfrom ( n -- llist ) : lfrom ( n -- llist )
#! Return a lazy list of increasing numbers starting #! Return a lazy list of increasing numbers starting
#! from the initial value 'n'. #! from the initial value 'n'.
dup [ succ lfrom ] curry1 lcons ; dup unit delay swap
[ 1 + lfrom ] cons delay lcons ;
: lfrom-by ( n quot -- llist ) : lfrom-by ( n quot -- llist )
#! Return a lazy list of values starting from n, with #! Return a lazy list of values starting from n, with
#! each successive value being the result of applying quot to #! each successive value being the result of applying quot to
#! n. #! n.
dupd [ dup [ call ] dip lfrom-by ] curry2 lcons ; swap dup unit delay -rot
[ , dup , \ call , , \ lfrom-by , ] make-list delay lcons ;
: lnaturals 0 lfrom ; : lnaturals 0 lfrom ;
: lpositves 1 lfrom ; : lpositves 1 lfrom ;
@ -56,8 +60,9 @@ USE: logic
: sieve ( llist - llist ) : sieve ( llist - llist )
#! Given a lazy list of numbers, use the sieve of eratosthenes #! Given a lazy list of numbers, use the sieve of eratosthenes
#! algorithm to return a lazy list of primes. #! algorithm to return a lazy list of primes.
luncons over [ divisible-by? not ] curry1 lsubset [ sieve ] curry1 lcons ; luncons over [ divisible-by? not ]
cons lsubset [ sieve ] cons delay >r unit delay r> lcons ;
: lprimes 2 lfrom sieve ; : lprimes 2 lfrom sieve ;
: first-ten-primes 10 lprimes ltake ; : first-ten-primes 10 lprimes ltake llist>list ;

View File

@ -21,152 +21,277 @@
! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ! OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
IN: lazy IN: lazy
USE: lists
USE: stack
USE: math
USE: stdio
USE: prettyprint
USE: kernel USE: kernel
USE: combinators USE: sequences
USE: logic USE: namespaces
USE: lists
USE: math
: curry1 ( n quot -- quot ) TUPLE: promise quot forced? value ;
#! Return a quotation that when called will initially
#! have 'n' pushed on the stack.
cons ;
: curry2 ( n1 n2 quot -- quot ) : delay ( quot -- <promise> )
#! Return a quotation that when called will initially #! Given a quotation, create a promise which may later be forced.
#! have 'n1' and 'n2' pushed on the stack. #! When forced the quotation will execute returning the value. Future
cons cons ; #! forces of the promise will return that value and not re-execute
#! the quotation.
f f <promise> ;
: delay ( value -- promise ) : (force) ( <promise> -- value )
#! Return a promise that when 'forced' returns the original value. #! Force the given promise leaving the value of calling the
unit ; #! promises quotation on the stack. Re-forcing the promise
#! will return the same value and not recall the quotation.
dup promise-forced? [
dup promise-quot call over set-promise-value
t over set-promise-forced?
] unless
promise-value ;
: force ( promise -- value ) : force ( <promise> -- value )
#! Return the value associated with the promise. (force) dup promise? [
call ; force
] when ;
TUPLE: lcons car cdr ;
: lcons ( car promise -- lcons ) SYMBOL: lazy-nil
#! Return a lazy pair, where the cdr is a promise and must DEFER: lnil
#! be forced to return the value. [ [ ] ] delay lazy-nil set
cons ;
: lunit ( a -- llist ) : lnil ( -- lcons )
#! Construct a lazy list of one element. #! Return the nil lazy list.
[ ] delay lcons ; lazy-nil get ;
: lnil? ( lcons -- bool )
#! Is the given lazy cons the nil value
force not ;
: lcar ( lcons -- car ) : lcar ( lcons -- car )
#! Return the car of a lazy pair. #! Return the value of the head of the lazy list.
car ; dup lnil? [
force lcons-car (force)
] unless ;
: lcdr ( lcons -- cdr ) : lcdr ( lcons -- cdr )
#! Return the cdr of a lazy pair, implicitly forcing it. #! Return the value of the rest of the lazy list.
cdr force ; #! This is itself a lazy list.
dup lnil? [
force lcons-cdr (force)
] unless ;
: lcons ( lcar lcdr -- promise )
#! Given a car and cdr, both lazy values, return a lazy cons.
swap [ , , \ <lcons> , ] make-list delay ;
: lunit ( lvalue -- llist )
#! Given a lazy value (a quotation that when called produces
#! the value) produce a lazy list containing that value.
[ lnil ] delay lcons ;
: lnth ( n llist -- value ) : lnth ( n llist -- value )
#! Return the nth item in a lazy list #! Return the nth item in a lazy list
swap [ lcdr ] times lcar ; swap [ lcdr ] times lcar ;
: luncons ( lcons -- car cdr ) : luncons ( lcons -- car cdr )
#! Return the car and forced cdr of the lazy cons. #! Return the car and cdr of the lazy list
uncons force ; dup lcar swap lcdr ;
: (ltake) ( n llist accum -- list )
>r >r pred dup 0 < [
drop r> drop r> reverse
] [
r> luncons swap r> cons (ltake)
] ifte ;
: ltake ( n llist -- list )
#! Return a list containing the first n items from
#! the lazy list.
[ ] (ltake) ;
: lmap ( llist quot -- llist ) : lmap ( llist quot -- llist )
#! Return a lazy list containing the collected result of calling #! Return a lazy list containing the collected result of calling
#! quot on the original lazy list. #! quot on the original lazy list.
over [ ] = [ over lnil? [
2drop [ ] drop
] [ ] [
>r luncons r> swap 2dup
dup swapd [ , \ lcdr , , \ lmap , ] make-list delay >r
[ lmap ] curry2 [ , \ lcar , , \ call , ] make-list delay r>
>r call r>
lcons lcons
] ifte ; ] ifte ;
: ltake ( n llist -- llist )
#! Return a lazy list containing the first n items from
#! the original lazy list.
over 0 = [
2drop lnil
] [
dup lnil? [
nip
] [
swap dupd ( llist llist n -- )
[ [ 1 - ] cons , \ call , , \ lcdr , \ ltake , ] make-list delay >r
[ , \ lcar , ] make-list delay r>
lcons
] ifte
] ifte ;
DEFER: lsubset
TUPLE: lsubset-state llist pred ;
: (lsubset-cdr) ( state -- llist )
#! Given a predicate and a lazy list, do the cdr
#! portion of lsubset.
dup lsubset-state-llist lcdr swap lsubset-state-pred lsubset ;
: (lsubset-car) ( state -- value )
#! Given a predicate and a lazy list, do the car
#! portion of lsubset.
dup lsubset-state-llist lcar over
lsubset-state-pred dupd call [ ( state lcar -- )
nip
] [ ( state lcar -- )
drop dup lsubset-state-llist lcdr over set-lsubset-state-llist
(lsubset-car)
] ifte ;
: (lsubset-set-first-car) ( state -- bool )
#! Set the state to the first valid car. If none found
#! return false.
dup lsubset-state-llist lcar over
lsubset-state-pred dupd call [ ( state lcar -- )
2drop t
] [ ( state lcar -- )
drop dup lsubset-state-llist lcdr dup lnil? [
2drop f
] [
over set-lsubset-state-llist
(lsubset-set-first-car)
] ifte
] ifte ;
: lsubset ( llist pred -- llist ) : lsubset ( llist pred -- llist )
#! Return a lazy list containing only the items from the original #! Return a lazy list containing only the items from the original
#! lazy list for which the predicate returns a value other than f. #! lazy list for which the predicate returns a value other than f.
over [ ] = [ over lnil? [
2drop [ ] drop
] [ ] [
>r luncons r> <lsubset-state> dup
dup swapd (lsubset-set-first-car) [
[ lsubset ] curry2 dup
-rot dupd call [ [ (lsubset-cdr) ] cons delay >r
swap lcons [ (lsubset-car) ] cons delay r> lcons
] [ ] [
drop call drop lnil
] ifte
] ifte ;
DEFER: lappend*
DEFER: (lappend*)
TUPLE: lappend*-state current rest ;
USE: stdio
: (lappend*-cdr) ( state -- llist )
#! Given the state object, do the cdr portion of the
#! lazy append.
dup lappend*-state-current dup lnil? [ ( state current -- )
nip
] [ ( state current -- )
lcdr ( state cdr -- )
dup lnil? [ ( state cdr -- )
drop dup lappend*-state-rest dup lnil? [ ( state rest )
nip
] [
nip
luncons ( state rest-car rest-cdr -- )
<lappend*-state> (lappend*)
] ifte
] [ ( state cdr -- )
swap lappend*-state-rest <lappend*-state> (lappend*)
] ifte ] ifte
] ifte ; ] ifte ;
: lappend* ; : (lappend*-car) ( state -- value )
: (lappend*) ; #! Given the state object, do the car portion of the
: lappend-list* ; #! lazy append.
dup lappend*-state-current dup lnil? [ ( state current -- )
: lappend-item* ( llists list item -- llist ) nip
-rot [ lappend-list* ] curry2 lcons ; ] [ ( state current -- )
lcar nip
: lappend-list* ( llists list -- llist )
dup [
#! non-empty list
luncons swap lappend-item*
] [
#! empty list
drop lappend*
] ifte ;
: (lappend*) ( llists -- llist )
dup lcar [ ( llists )
#! Yes, the first item in the list is a valid llist
luncons swap lappend-list*
] [
#! The first item in the list is an empty list.
#! Resume passing the next list.
lcdr lappend*
] ifte ; ] ifte ;
: (lappend*) ( state -- llist )
#! Do the main work of the lazy list appending using a
#! state object.
dup
[ (lappend*-cdr) ] cons delay >r
[ (lappend*-car) ] cons delay r> lcons ;
: lappend* ( llists -- llist ) : lappend* ( llists -- llist )
#! Given a lazy list of lazy lists, return a lazy list that #! Given a lazy list of lazy lists, return a lazy list that
#! works through all of the sub-lists in sequence. #! works through all of the sub-lists in sequence.
dup [ [ lnil? not ] lsubset
(lappend*) dup lnil? [
] [ luncons <lappend*-state> (lappend*)
#! Leave empty list on the stack ] unless ;
] ifte ;
: list>llist ( list -- llist ) DEFER: list>llist
#! Convert a list to a lazy list.
dup [
uncons [ list>llist ] curry1 lcons
] when ;
: lappend ( llist1 llist2 -- llist ) : lappend ( llist1 llist2 -- llist )
#! Concatenate two lazy lists such that they appear to be one big lazy list. #! Concatenate two lazy lists such that they appear to be one big
#! lazy list.
2list list>llist lappend* ; 2list list>llist lappend* ;
: leach ( llist quot -- ) : leach ( llist quot -- )
#! Call the quotation on each item in the lazy list. #! Call the quotation on each item in the lazy list.
#! Warning: If the list is infinite then this will #! Warning: If the list is infinite then this will
#! never return. #! never return.
over [ over lnil? [
>r luncons r> tuck >r >r call r> r> leach
] [
2drop 2drop
] [
>r luncons r> tuck >r >r call r> r> leach
] ifte ; ] ifte ;
: (llist>list) ( result llist -- list )
#! Helper function for llist>list.
dup lnil? [
drop
] [
dup lcar ( result llist car )
swap lcdr >r swons r> (llist>list)
] ifte ;
: llist>list ( llist -- list )
#! Convert a lazy list to a normal list. This will cause
#! an infinite loop if the lazy list is an infinite list.
f swap (llist>list) reverse ;
: list>llist ( list -- llist )
#! Convert a list to a lazy list.
dup [
uncons [ list>llist ] cons delay >r unit delay r> lcons
] [
drop lnil
] ifte ;
M: lcons nth lnth ;
: test1
[ 1 ] list>llist
[ 2 ] list>llist
2list
list>llist
lappend* ;
: test2
[ 1 2 ] list>llist
[ 3 4 ] list>llist
2list
list>llist
lappend* ;
: test3
[ 1 2 3 ] list>llist
[ 4 5 6 ] list>llist
[ 7 8 9 ] list>llist
3list
list>llist
lappend* ;
: test4
[ 1 2 3 4 5 ] list>llist
[ 2 mod 1 = ] lsubset ;
: test5 lnil unit delay lunit [ lnil? not ] lsubset ;
: test6 lnil unit delay lunit lappend* ;

View File

@ -9,14 +9,22 @@
ability to describe infinite structures, and to delay execution of ability to describe infinite structures, and to delay execution of
expressions until they are actually used.</p> expressions until they are actually used.</p>
<p>Lazy lists, like normal lists, are composed of a head and tail. In <p>Lazy lists, like normal lists, are composed of a head and tail. In
a lazy list the tail is something called a 'promise'. To convert a a lazy list the head and tail are something called a 'promise'.
To convert a
'promise' into its actual value a word called 'force' is used. To 'promise' into its actual value a word called 'force' is used. To
convert a value into a 'promise' the word to use is 'delay'.</p> convert a value into a 'promise' the word to use is 'delay'.</p>
<table border="1">
<tr><td><a href="#delay">delay</a></td></tr>
<tr><td><a href="#force">force</a></td></tr>
</table>
<p>Many of the lazy list words are named similar to the standard list <p>Many of the lazy list words are named similar to the standard list
words but with an 'l' suffixed to it. Here are the commonly used words but with an 'l' suffixed to it. Here are the commonly used
words and their equivalent list operation:</p> words and their equivalent list operation:</p>
<table border="1"> <table border="1">
<tr><th>Lazy List</th><th>Normal List</th></tr> <tr><th>Lazy List</th><th>Normal List</th></tr>
<tr><td><a href="#lnil">lnil</a></td><td>[ ]</td></tr>
<tr><td><a href="#lnilp">lnil?</a></td><td>Test for nil value</td></tr>
<tr><td><a href="#lcons">lcons</a></td><td>cons</td></tr> <tr><td><a href="#lcons">lcons</a></td><td>cons</td></tr>
<tr><td><a href="#lunit">lunit</a></td><td>unit</td></tr> <tr><td><a href="#lunit">lunit</a></td><td>unit</td></tr>
<tr><td><a href="#lcar">lcar</a></td><td>car</td></tr> <tr><td><a href="#lcar">lcar</a></td><td>car</td></tr>
@ -38,47 +46,122 @@ list.</td></tr>
<tr><td><a href="#list>llist">list>llist</a></td><td>Given a normal list, return a lazy list <tr><td><a href="#list>llist">list>llist</a></td><td>Given a normal list, return a lazy list
that contains the same elements as the normal list.</td></tr> that contains the same elements as the normal list.</td></tr>
</table> </table>
<p>A couple of helper functions are also provided by the lazy
vocabulary.</p>
<table border="1">
<tr><td>curry1</td><td>Given a value and a quotation, returns a new
quotation that when called will have the value on the stack.</td></tr>
<tr><td>curry2</td><td>Given two values and a quotation, returns a new
quotation that when called will have the two values on the
stack.</td></tr>
</table>
<h2>Reference</h2> <h2>Reference</h2>
<!-- delay description -->
<a name="delay">
<h3>delay ( quot -- &lt;promise&gt; )</h3>
<p>'delay' is used to convert a value or expression into a promise.
The word 'force' is used to convert that promise back to its
value, or to force evaluation of the expression to return a value.
</p>
<p>The value on the stack that 'delay' expects must be quoted. This is
a requirement to prevent it from being evaluated.
</p>
<pre class="code">
( 1 ) [ 42 ] <a href="#delay">delay</a> dup .
=> &lt;&lt; promise [ ] [ 42 ] [ ] [ ] &gt;&gt;
( 2 ) <a href="#force">force</a> .
=> 42
</pre>
<!-- force description -->
<a name="force">
<h3>force ( &lt;promise&gt; -- value )</h3>
<p>'force' will evaluate a promises original expression
and leave the value of that expression on the stack.
</p>
<p>A promise can be forced multiple times but the expression
is only evaluated once. Future calls of 'force' on the promise
will returned the cached value of the original force. If the
expression contains side effects, such as i/o, then that i/o
will only occur on the first 'force'. See below for an example
(steps 3-5).
</p>
<p>If a promise is itself delayed, a force will evaluate all promises
until a value is returned. Due to this behaviour it is generally not
possible to delay a promise. The example below shows what happens
in this case.
</p>
<pre class="code">
( 1 ) [ 42 ] <a href="#delay">delay</a> dup .
=> &lt;&lt; promise [ ] [ 42 ] [ ] [ ] &gt;&gt;
( 2 ) <a href="#force">force</a> .
=> 42
#! Multiple forces on a promise returns cached value
( 3 ) [ "hello" print 42 ] <a href="#delay">delay</a> dup .
=> << promise [ ] [ "hello" print 42 ] [ ] [ ] >>
( 4 ) dup <a href="#force">force</a> .
=> hello
42
( 5 ) <a href="#force">force</a> .
=> 42
#! Forcing a delayed promise cascades up to return
#! original value, rather than the promise.
( 6 ) [ [ 42 ] <a href="#delay">delay</a> ] <a href="#delay">delay</a> dup .
=> << promise [ ] [ [ 42 ] delay ] [ ] [ ] >>
( 7 ) <a href="#force">force</a> .
=> 42
</pre>
<!-- lnil description -->
<a name="lnil">
<h3>lnil ( -- lcons )</h3>
<p>Returns a value representing the empty lazy list.</p>
<pre class="code">
( 1 ) <a href="#lnil">lnil</a> .
=> << promise [ ] [ [ ] ] t [ ] >>
</pre>
<!-- lnil description -->
<a name="lnilp">
<h3>lnil? ( lcons -- bool )</h3>
<p>Returns true if the given lazy cons is the value representing
the empty lazy list.</p>
<pre class="code">
( 1 ) <a href="#lnil">lnil</a> <a href="#lnilp">lnil?</a> .
=> t
( 2 ) [ 1 ] <a href="#list2llist">list&gt;llist</a> dup <a href="#lnilp">lnil?</a> .
=> [ ]
( 3 ) <a href="#lcdr">lcdr</a> <a href="#lnilp">lnil?</a> .
=> t
</pre>
<!-- lcons description --> <!-- lcons description -->
<a name="lcons"> <a name="lcons">
<h3>lcons ( value promise -- lcons )</h3> <h3>lcons ( car-promise cdr-promise -- lcons )</h3>
<p>Provides the same effect as 'cons' does for normal lists. It <p>Provides the same effect as 'cons' does for normal lists.
creates a cons cell where the first element is the value given and the Both values provided must be promises (ie. expressions that have
second element is a promise.</p> had <a href="#delay">delay</a> called on them).
<a name="promise"> </p>
<p>A promise is either a value that has had 'force' called on it, or <p>As the car and cdr passed on the stack are promises, they are not
a quotation that when 'call' is applied to it, returns the actual evaluated until <a href="#lcar">lcar</a> or <a href="#lcdr">lcdr</a>
value.</p> are called on the lazy cons.</p>
<pre class="code"> <pre class="code">
( 1 ) 5 6 delay <a href="#lcons">lcons</a> dup . ( 1 ) [ "car" ] <a href="#delay">delay</a> [ "cdr" ] <a href="#delay">delay</a> <a href="#lcons">lcons</a> dup .
=> [ 5 6 ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) dup <a href="#lcar">lcar</a> . ( 2 ) dup <a href="#lcar">lcar</a> .
=> 5 => "car"
( 3 ) dup <a href="#lcdr">lcdr</a> . ( 3 ) dup <a href="#lcdr">lcdr</a> .
=> 6 => "cdr"
</pre> </pre>
<!-- lunit description --> <!-- lunit description -->
<a name="lunit"> <a name="lunit">
<h3>lunit ( value -- llist )</h3> <h3>lunit ( value-promise -- llist )</h3>
<p>Provides the same effect as 'unit' does for normal lists. It <p>Provides the same effect as 'unit' does for normal lists. It
creates a lazy list where the first element is the value given.</p> creates a lazy list where the first element is the value given.</p>
<p>Like <a href="#lcons">lcons</a>, the value on the stack must be
a promise and is not evaluated until the <a href="#lcar">lcar</a>
of the list is requested.</a>
<pre class="code"> <pre class="code">
( 1 ) 42 <a href="#lunit">lunit</a> dup . ( 1 ) [ 42 ] <a href="#delay">delay</a> <a href="#lunit">lunit</a> dup .
=> [ 42 f ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) dup <a href="#lcar">lcar</a> . ( 2 ) dup <a href="#lcar">lcar</a> .
=> 42 => 42
( 3 ) dup <a href="#lcdr">lcdr</a> . ( 3 ) dup <a href="#lcdr">lcdr</a> <a href="#lnilp">lnil?</a> .
=> f => t
( 4 ) [ . ] <a href="#leach">leach</a> ( 4 ) [ . ] <a href="#leach">leach</a>
=> 42 => 42
</pre> </pre>
@ -87,10 +170,11 @@ creates a lazy list where the first element is the value given.</p>
<a name="lcar"> <a name="lcar">
<h3>lcar ( lcons -- value )</h3> <h3>lcar ( lcons -- value )</h3>
<p>Provides the same effect as 'car' does for normal lists. It <p>Provides the same effect as 'car' does for normal lists. It
returns the first element in a lazy cons cell.</p> returns the first element in a lazy cons cell. This will force
the evaluation of that element.</p>
<pre class="code"> <pre class="code">
( 1 ) 42 <a href="#lunit">lunit</a> dup . ( 1 ) [ 42 ] <a href="#delay">delay</a> <a href="#lunit">lunit</a> dup .
=> [ 42 f ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) <a href="#lcar">lcar</a> . ( 2 ) <a href="#lcar">lcar</a> .
=> 42 => 42
</pre> </pre>
@ -102,15 +186,15 @@ returns the first element in a lazy cons cell.</p>
returns the second element in a lazy cons cell and forces it. This returns the second element in a lazy cons cell and forces it. This
causes that element to be evaluated immediately.</p> causes that element to be evaluated immediately.</p>
<pre class="code"> <pre class="code">
( 1 ) 5 [ 5 6 + ] <a href="#lcons">lcons</a> dup . ( 1 ) [ 1 ] <a href="#delay">delay</a> [ 5 6 + ] <a href="#delay">delay</a> <a href="#lcons">lcons</a> dup .
=> [ 5 5 6 + ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) <a href="#lcdr">lcdr</a> . ( 2 ) <a href="#lcdr">lcdr</a> .
=> 11 => 11
</pre> </pre>
<pre class="code"> <pre class="code">
( 1 ) 5 lfrom dup . ( 1 ) 5 <a href="#lfrom">lfrom</a> dup .
=> [ 5 5 succ lfrom ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) <a href="#lcdr">lcdr</a> dup <a href="#lcar">lcar</a> . ( 2 ) <a href="#lcdr">lcdr</a> dup <a href="#lcar">lcar</a> .
=> 6 => 6
( 3 ) <a href="#lcdr">lcdr</a> dup <a href="#lcar">lcar</a> . ( 3 ) <a href="#lcdr">lcdr</a> dup <a href="#lcar">lcar</a> .
@ -126,8 +210,8 @@ causes that element to be evaluated immediately.</p>
returns the nth value in the lazy list. It causes all the values up to returns the nth value in the lazy list. It causes all the values up to
'n' to be evaluated.</p> 'n' to be evaluated.</p>
<pre class="code"> <pre class="code">
( 1 ) 1 lfrom ( 1 ) 1 <a href="#lfrom">lfrom</a> dup .
=> [ 1 1 succ lfrom ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) 5 swap <a href="#lnth">lnth</a> . ( 2 ) 5 swap <a href="#lnth">lnth</a> .
=> 6 => 6
</pre> </pre>
@ -136,32 +220,31 @@ returns the nth value in the lazy list. It causes all the values up to
<a name="luncons"> <a name="luncons">
<h3>luncons ( lcons -- car cdr )</h3> <h3>luncons ( lcons -- car cdr )</h3>
<p>Provides the same effect as 'uncons' does for normal lists. It <p>Provides the same effect as 'uncons' does for normal lists. It
returns the car and cdr of the lazy list. Note that cdr is forced returns the car and cdr of the lazy list.</p>
resulting in it being evaluated.</p>
<pre class="code"> <pre class="code">
( 1 ) 5 [ 6 ] <a href="#lcons">lcons</a> dup . ( 1 ) [ 5 ] <a href="#delay">delay</a> [ 6 ] <a href="#delay">delay</a> <a href="#lcons">lcons</a> dup .
=> [ 5 6 ] => &lt;&lt; promise ... &gt;&gt;
( 2 ) <a href="#luncons">luncons</a> .s ( 2 ) <a href="#luncons">luncons</a> . .
=> { 5 6 } => 6
5
</pre> </pre>
<!-- lmap description --> <!-- lmap description -->
<a name="lmap"> <a name="lmap">
<h3>lmap ( llist quot -- llist )</h3> <h3>lmap ( llist quot -- llist )</h3>
<p>Provides the same effect as 'map' does for normal lists. It <p>Lazily maps over a lazy list applying the quotation to each element.
lazily maps over a lazy list applying the quotation to each element.
A new lazy list is returned which contains the results of the A new lazy list is returned which contains the results of the
quotation.</p> quotation.</p>
<p>When initially called <a href="#lmap">lmap</a> will only call quot on the first element <p>When intially called nothing in the original lazy list is
of the list. It then constructs a lazy list that performs the evaluated. Only when <a href="#lcar">lcar</a> is called will the item
next '<a href="#lmap">lmap</a>' operation on the next element when it is evaluated. This in the list be evaluated and applied to the quotation. Ditto with <a
allows mapping over infinite lists.</p> href="#lcdr">lcdr</a>, thus allowing infinite lists to be mapped over.</p>
<pre class="code"> <pre class="code">
( 1 ) 1 lfrom ( 1 ) 1 <a href="#lfrom">lfrom</a>
=> < infinite list of incrementing numbers > => < infinite list of incrementing numbers >
( 2 ) [ 2 * ] <a href="#lmap">lmap</a> ( 2 ) [ 2 * ] <a href="#lmap">lmap</a>
=> < infinite list of numbers incrementing by 2 > => < infinite list of numbers incrementing by 2 >
( 3 ) 5 swap <a href="#ltake">ltake</a> . ( 3 ) 5 swap <a href="#ltake">ltake</a> <a href="#llist2list">llist&gt;list</a> .
=> [ 2 4 6 8 10 ] => [ 2 4 6 8 10 ]
</pre> </pre>
@ -174,17 +257,15 @@ element. If that quotation returns true, the element will be included
in the resulting lazy list. If it is false, the element will be skipped. in the resulting lazy list. If it is false, the element will be skipped.
A new lazy list is returned which contains all elements where the A new lazy list is returned which contains all elements where the
predicate returned true.</p> predicate returned true.</p>
<p>When initially called <a href="#lsubset">lsubset</a> will only call <p>Like <a href="#lmap">lmap</a>, when initially called no evaluation
the predicate quotation on the first element will occur. A lazy list is returned that when values are retrieved
of the list. It then constructs a lazy list that performs the from in then items are evaluated and checked against the predicate.</p>
next '<a href="#lsubset">lsubset</a>' operation on the next element when it is evaluated. This
allows subsetting over infinite lists.</p>
<pre class="code"> <pre class="code">
( 1 ) 1 lfrom ( 1 ) 1 <a href="#lfrom">lfrom</a>
=> < infinite list of incrementing numbers > => < infinite list of incrementing numbers >
( 2 ) [ prime? ] <a href="#lsubset">lsubset</a> ( 2 ) [ <a href="#primep">prime?</a> ] <a href="#lsubset">lsubset</a>
=> < infinite list of prime numbers > => < infinite list of prime numbers >
( 3 ) 5 swap <a href="#ltake">ltake</a> . ( 3 ) 5 swap <a href="#ltake">ltake</a> <a href="#llist2list">llist&gt;list</a> .
=> [ 2 3 5 7 11 ] => [ 2 3 5 7 11 ]
</pre> </pre>
@ -196,7 +277,7 @@ lazily iterates over a lazy list applying the quotation to each
element. If this operation is applied to an infinite list it will element. If this operation is applied to an infinite list it will
never return unless the quotation escapes out by calling a continuation.</p> never return unless the quotation escapes out by calling a continuation.</p>
<pre class="code"> <pre class="code">
( 1 ) 1 lfrom ( 1 ) 1 <a href="#lfrom">lfrom</a>
=> < infinite list of incrementing numbers > => < infinite list of incrementing numbers >
( 2 ) [ 2 mod 1 = ] <a href="#lsubset">lsubset</a> ( 2 ) [ 2 mod 1 = ] <a href="#lsubset">lsubset</a>
=> < infinite list of odd numbers > => < infinite list of odd numbers >
@ -210,13 +291,13 @@ never return unless the quotation escapes out by calling a continuation.</p>
<!-- ltake description --> <!-- ltake description -->
<a name="ltake"> <a name="ltake">
<h3>ltake ( n llist -- list )</h3> <h3>ltake ( n llist -- llist )</h3>
<p>Iterates over the lazy list 'n' times, appending each element to a <p>Iterates over the lazy list 'n' times, appending each element to a
normal list. The normal list is returned. This provides a convenient lazy list. This provides a convenient way of getting elements out of
way of getting elements out of a lazy list.</p> an infinite lazy list.</p>
<pre class="code"> <pre class="code">
( 1 ) : ones 1 [ ones ] <a href="#lcons">lcons</a> ; ( 1 ) : ones [ 1 ] delay [ ones ] delay <a href="#lcons">lcons</a> ;
( 2 ) 5 ones <a href="#ltake">ltake</a> ( 2 ) 5 ones <a href="#ltake">ltake</a> <a href="#llist2list">llist&gt;list</a> .
=> [ 1 1 1 1 1 ] => [ 1 1 1 1 1 ]
</pre> </pre>
@ -227,7 +308,7 @@ way of getting elements out of a lazy list.</p>
lazily on iteration rather than immediately so it works very fast no lazily on iteration rather than immediately so it works very fast no
matter how large the list.</p> matter how large the list.</p>
<pre class="code"> <pre class="code">
( 1 ) [ 1 2 3 ] <a href="#list>llist">list>llist</a> [ 4 5 6 ] <a href="#list>llist">list>llist</a> <a href="#lappend">lappend</a> ( 1 ) [ 1 2 3 ] <a href="#list2llist">list&gt;llist</a> [ 4 5 6 ] <a href="#list2llist">list&gt;llist</a> <a href="#lappend">lappend</a>
( 2 ) [ . ] <a href="#leach">leach</a> ( 2 ) [ . ] <a href="#leach">leach</a>
=> 1 => 1
2 2
@ -244,10 +325,10 @@ matter how large the list.</p>
lazy fashion. The actual appending is done lazily on iteration rather lazy fashion. The actual appending is done lazily on iteration rather
than immediately so it works very fast no matter how large the lists.</p> than immediately so it works very fast no matter how large the lists.</p>
<pre class="code"> <pre class="code">
( 1 ) [ 1 2 3 ] <a href="#list>llist">list>llist</a> ( 1 ) [ 1 2 3 ] <a href="#list2>llist">list&gt;llist</a>
( 2 ) [ 4 5 6 ] <a href="#list>llist">list>llist</a> ( 2 ) [ 4 5 6 ] <a href="#list2llist">list&gt;llist</a>
( 3 ) [ 7 8 9 ] <a href="#list>llist">list>llist</a> ( 3 ) [ 7 8 9 ] <a href="#list2llist">list&gt;llist</a>
( 4 ) 3list <a href="#list>llist">list>llist</a> <a href="#lappendstar">lappend*</a> ( 4 ) 3list <a href="#list2llist">list&gt;llist</a> <a href="#lappendstar">lappend*</a>
( 5 ) [ . ] <a href="#leach">leach</a> ( 5 ) [ . ] <a href="#leach">leach</a>
=> 1 => 1
2 2
@ -261,12 +342,12 @@ than immediately so it works very fast no matter how large the lists.</p>
</pre> </pre>
<!-- list>llist description --> <!-- list>llist description -->
<a name="list>llist"> <a name="list2llist">
<h3>list>llist ( list -- llist )</h3> <h3>list&gt;llist ( list -- llist )</h3>
<p>Converts a normal list into a lazy list. This is done lazily so the <p>Converts a normal list into a lazy list. This is done lazily so the
initial list is not iterated through immediately.</p> initial list is not iterated through immediately.</p>
<pre class="code"> <pre class="code">
( 1 ) [ 1 2 3 ] <a href="#list>llist">list>llist</a> ( 1 ) [ 1 2 3 ] <a href="#list2llist">list&gt;llist</a>
( 2 ) [ . ] <a href="#leach">leach</a> ( 2 ) [ . ] <a href="#leach">leach</a>
=> 1 => 1
2 2

View File

@ -22,66 +22,85 @@
! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ! ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
IN: parser-combinators IN: parser-combinators
USE: lazy USE: lazy
USE: stack
USE: lists
USE: strings
USE: math
USE: logic
USE: kernel USE: kernel
USE: combinators USE: sequences
USE: parser USE: strings
USE: lists
USE: math
: phead ( object -- head ) GENERIC: phead
M: string phead ( object -- head )
#! Polymorphic head. Return the head item of the object. #! Polymorphic head. Return the head item of the object.
#! For a string this is the first character. #! For a string this is the first character.
#! For a list this is the car. 0 swap nth ;
[
[ string? ] [ 0 swap str-nth ]
[ list? ] [ car ]
] cond ;
: ptail ( object -- tail ) M: list phead ( object -- head )
#! Polymorphic head. Return the head item of the object.
#! For a list this is the car.
car ;
M: cons phead ( object -- head )
#! Polymorphic head. Return the head item of the object.
#! For a list this is the car.
car ;
GENERIC: ptail
M: string ptail ( object -- tail )
#! Polymorphic tail. Return the tail of the object. #! Polymorphic tail. Return the tail of the object.
#! For a string this is everything but the first character. #! For a string this is everything but the first character.
1 swap string-tail ;
M: list ptail ( object -- tail )
#! Polymorphic tail. Return the tail of the object.
#! For a list this is the cdr. #! For a list this is the cdr.
[ cdr ;
[ string? ] [ 1 swap str-tail ]
[ list? ] [ cdr ] M: cons ptail ( object -- tail )
] cond ; #! Polymorphic tail. Return the tail of the object.
#! For a list this is the cdr.
cdr ;
: pfirst ( object -- first ) : pfirst ( object -- first )
#! Polymorphic first #! Polymorphic first. The first item in a collection.
phead ; phead ;
: psecond ( object -- second ) GENERIC: psecond
M: string psecond ( object -- second )
#! Polymorphic second #! Polymorphic second
[ 1 swap nth ;
[ string? ] [ 1 swap str-nth ]
[ list? ] [ cdr car ] M: list psecond ( object -- second )
] cond ; #! Polymorphic second
cdr car ;
: ph:t ( object -- head tail ) : ph:t ( object -- head tail )
#! Return the head and tail of the object. #! Return the head and tail of the object.
dup phead swap ptail ; dup phead swap ptail ;
: pempty? ( object -- bool ) GENERIC: pempty?
#! Polymorphic empty test.
[ M: string pempty? ( object -- bool )
[ string? ] [ "" = ] #! Return true if the collection is empty.
[ list? ] [ not ] string-length 0 = ;
] cond ;
M: list pempty? ( object -- bool )
#! Return true if the collection is empty.
not ;
: string-take ( n string -- string ) : string-take ( n string -- string )
#! Return a string with the first 'n' characters #! Return a string with the first 'n' characters
#! of the original string. #! of the original string.
dup str-length pick < [ dup string-length pick < [
2drop "" 2drop ""
] [ ] [
str-head string-head
] ifte ; ] ifte ;
: (list-take) ( n list accum -- list ) : (list-take) ( n list accum -- list )
>r >r pred dup 0 < [ >r >r 1 - dup 0 < [
drop r> drop r> reverse drop r> drop r> reverse
] [ ] [
r> uncons swap r> cons (list-take) r> uncons swap r> cons (list-take)
@ -92,342 +111,381 @@ USE: parser
#! of the original list. #! of the original list.
[ ] (list-take) ; [ ] (list-take) ;
: ptake ( n object -- object ) GENERIC: ptake
M: string ptake ( n object -- object )
#! Polymorphic take. #! Polymorphic take.
#! Return a collection of the first 'n' #! Return a collection of the first 'n'
#! characters from the original collection. #! characters from the original collection.
[ string-take ;
[ string? ] [ string-take ]
[ list? ] [ list-take ] M: list ptake ( n object -- object )
] cond ; #! Polymorphic take.
#! Return a collection of the first 'n'
#! characters from the original collection.
list-take ;
: string-drop ( n string -- string ) : string-drop ( n string -- string )
#! Return a string with the first 'n' characters #! Return a string with the first 'n' characters
#! of the original string removed. #! of the original string removed.
dup str-length pick < [ dup string-length pick < [
2drop "" 2drop ""
] [ ] [
str-tail string-tail
] ifte ; ] ifte ;
: list-drop ( n list -- list ) : list-drop ( n list -- list )
#! Return a list with the first 'n' items #! Return a list with the first 'n' items
#! of the original list removed. #! of the original list removed.
>r pred dup 0 < [ >r 1 - dup 0 < [
drop r> drop r>
] [ ] [
r> cdr list-drop r> cdr list-drop
] ifte ; ] ifte ;
: pdrop ( n object -- object ) GENERIC: pdrop
M: string pdrop ( n object -- object )
#! Polymorphic drop. #! Polymorphic drop.
#! Return a collection the same as 'object' #! Return a collection the same as 'object'
#! but with the first n items removed. #! but with the first n items removed.
[ string-drop ;
[ string? ] [ string-drop ]
[ list? ] [ list-drop ]
] cond ;
: ifte-head= ( string-or-list ch [ quot1 ] [ quot2 ] -- )
#! When the character 'ch' is equal to the head
#! of the string or list, run the quot1 otherwise run quot2.
r> r> swap phead = r> r> ifte ;
: symbol ( ch -- parser ) M: list pdrop ( n object -- object )
#! Return a parser that parses the given symbol. #! Polymorphic drop.
[ ( inp ch -- result ) #! Return a collection the same as 'object'
2dup [ #! but with the first n items removed.
swap ptail cons lunit list-drop ;
] [
2drop [ ] : token-parser ( inp sequence -- llist )
] ifte-head= #! A parser that parses a specific sequence of
] curry1 ; #! characters.
2dup length swap ptake over = [
swap over length swap pdrop swons unit delay lunit
] [
2drop lnil
] ifte ;
: token ( string -- parser ) : token ( string -- parser )
#! Return a parser that parses the given string. #! Return a token parser that parses the given string.
[ ( inp string -- result ) [ token-parser ] cons ;
2dup str-length swap ptake over = [
swap over str-length swap pdrop cons lunit
] [
2drop [ ]
] ifte
] curry1 ;
: satisfy-parser ( inp pred -- llist )
#! A parser that succeeds if the predicate,
#! when passed the first character in the input, returns
#! true.
over pempty? [
2drop lnil
] [
over phead swap call [
ph:t swons unit delay lunit
] [
drop lnil
] ifte
] ifte ;
: satisfy ( p -- parser ) : satisfy ( p -- parser )
#! Return a parser that succeeds if the predicate 'p', #! Return a parser that succeeds if the predicate 'p',
#! when passed the first character in the input, returns #! when passed the first character in the input, returns
#! true. #! true.
[ ( inp p -- result ) [ satisfy-parser ] cons ;
over pempty? [
2drop [ ]
] [
over phead swap call [
ph:t cons lunit
] [
drop [ ]
] ifte
] ifte
] curry1 ;
: satisfy2 ( p r -- parser ) : satisfy2-parser ( inp pred quot -- llist )
#! Return a parser that succeeds if the predicate 'p', #! A parser that succeeds if the predicate,
#! when passed the first character in the input, returns #! when passed the first character in the input, returns
#! true. On success the word 'r' is called with the #! true. On success the quotation is called with the
#! successfully parser character on the stack. The result #! successfully parsed character on the stack. The result
#! of this is returned as the result of the parser. #! of that call is returned as the result portion of the
[ ( inp p r -- result ) #! successfull parse lazy list.
>r over phead swap call [ -rot over phead swap call [ ( quot inp -- )
ph:t swap r> call swons lunit ph:t >r swap call r> swons unit delay lunit
] [ ] [
r> 2drop [ ] 2drop lnil
] ifte ] ifte ;
] curry2 ;
: satisfy2 ( pred quot -- parser )
#! Return a satisfy2-parser.
[ satisfy2-parser ] cons cons ;
: epsilon-parser ( input -- llist )
#! A parser that parses the empty string. It
#! does not consume any input and always returns
#! an empty list as the parse tree with the
#! unmodified input.
"" cons unit delay lunit ;
: epsilon ( -- parser ) : epsilon ( -- parser )
#! A parser that parses the empty string. #! Return an epsilon parser
[ ( inp -- result ) [ epsilon-parser ] ;
"" swap cons lunit
] ;
: succeed ( r -- parser ) : succeed-parser ( input result -- llist )
#! A parser that always returns 'r' and consumes no input. #! A parser that always returns 'result' as a
[ ( inp r -- result ) #! successful parse with no input consumed.
swap cons lunit cons unit delay lunit ;
] curry1 ;
: succeed ( result -- parser )
#! Return a succeed parser.
[ succeed-parser ] cons ;
: fail-parser ( input -- llist )
#! A parser that always fails and returns
#! an empty list of successes.
drop lnil ;
: fail ( -- parser ) : fail ( -- parser )
#! A parser that always fails #! Return a fail-parser.
[ [ fail-parser ] ;
drop [ ]
] ;
USE: prettyprint : <&>-do-parser3 ( [[ x1 xs2 ]] x -- result )
USE: unparser #! Called by <&>-do-parser2 on each result of the
#! parse from parser2.
>r uncons r> ( x1 xs2 x )
swap cons cons ;
: ensure-list ( a -- [ a ] ) : <&>-do-parser2 ( [[ x xs ]] parser2 -- result )
#! If 'a' is not a list, make it one. #! Called by the <&>-parser on each result of the
dup list? [ unit ] unless ; #! successfull parse of parser1. It's input is the
#! cons containing the data parsed and the remaining
: ++ ( a b -- [ a b ] ) #! input. This word will parser2 on the remaining input
#! Join two items into a list. #! returning a new cons cell containing the combined
>r ensure-list r> ensure-list append ; #! parse result.
>r unswons r> ( x xs parser2 )
: <&> ( p1 p2 -- parser ) call swap ( llist x )
[ <&>-do-parser3 ] cons lmap ;
: <&>-parser ( input parser1 parser2 -- llist )
#! Parse 'input' by sequentially combining the
#! two parsers. First parser1 is applied to the
#! input then parser2 is applied to the rest of
#! the input strings from the first parser.
>r call r> ( [[ x xs ]] p2 -- result )
[ <&>-do-parser2 ] cons lmap lappend* ;
: <&> ( parser1 parser2 -- parser )
#! Sequentially combine two parsers, returning a parser #! Sequentially combine two parsers, returning a parser
#! that first calls p1, then p2 all remaining results from #! that first calls p1, then p2 all remaining results from
#! p1. #! p1.
[ ( inp p1 p2 -- result ) [ <&>-parser ] cons cons ;
>r call r> [ ( [ x | xs ] p2 -- result )
>r uncons r> call swap [ ( [ x2 | xs2 ] x -- result ) : <|>-parser ( input parser1 parser2 -- result )
>r uncons swap r> swap ++ swons #! Return the combined list resulting from the parses
] curry1 lmap #! of parser1 and parser2 being applied to the same
] curry1 lmap lappend* #! input. This implements the choice parsing operator.
] curry2 ; >r dupd call swap r> call lappend ;
: <|> ( p1 p2 -- parser ) : <|> ( p1 p2 -- parser )
#! Choice operator for parsers. Return a parser that does #! Choice operator for parsers. Return a parser that does
#! p1 or p2 depending on which will succeed. #! p1 or p2 depending on which will succeed.
[ ( inp p1 p2 -- result ) [ <|>-parser ] cons cons ;
rot tuck swap call >r swap call r> lappend
] curry2 ;
: p-abc ( -- parser ) : string-ltrim ( string -- string )
#! Test Parser. Parses the string "abc" #! Return a new string without any leading whitespace
"a" token "b" token "c" token <&> <&> ; #! from the original string.
dup phead blank? [ ptail string-ltrim ] when ;
: parse-skipwhite ( string -- string ) : sp-parser ( input parser -- result )
dup phead blank? [ #! Skip all leading whitespace from the input then call
ptail parse-skipwhite #! the parser on the remaining input.
] [ >r string-ltrim r> call ;
] ifte ;
: sp ( parser -- parser ) : sp ( parser -- parser )
#! Return a parser that first skips all whitespace before #! Return a parser that first skips all whitespace before
#! parsing. #! calling the original parser.
[ ( inp parser -- result ) [ sp-parser ] cons ;
>r parse-skipwhite r> call
] curry1 ; : just-parser ( input parser -- result )
#! Calls the given parser on the input removes
#! from the results anything where the remaining
#! input to be parsed is not empty. So ensures a
#! fully parsed input string.
call [ car pempty? ] lsubset ;
: just ( parser -- parser ) : just ( parser -- parser )
#! Return a parser that works exactly like the input parser #! Return an instance of the just-parser.
#! but guarantees that the rest string is empty. [ just-parser ] cons ;
[ ( inp parser -- result )
call [ ( [ x | xs ] -- )
cdr str-length 0 =
] lsubset
] curry1 ;
: <@ ( p f -- parser ) : (<@-parser-replace) ( [[ inp result ]] quot -- [[ inp new-result ]] )
#! Given a parser p and a quotation f return a parser #! Perform the result replacement step of <@-parser.
#! that does the same as p but in addition applies f #! Given a successfull parse result, calls the quotation
#! to the resulting parse tree. #! with the result portion on the stack. The result of
[ ( inp p f -- result ) #! that call is then used as the new result.
>r call r> [ ( [ x | xs ] f -- [ fx | xs ] ) swap uncons rot call cons ;
swap uncons r> swap over [ call ] [ drop ] ifte r> cons
] curry1 lmap
] curry2 ;
: p-1 ( -- parser ) : <@-parser ( input parser quot -- result )
"1" token "123" swap call lcar ; #! Calls the parser on the input. For each successfull
#! parse the quot is call with the parse result on the stack.
#! The result of that quotation then becomes the new parse result.
#! This allows modification of parse tree results (like
#! converting strings to integers, etc).
-rot call dup lnil? [ ( quot lnil -- )
nip
] [ ( quot result -- )
[ (<@-parser-replace) ] rot swons lmap
] ifte ;
: p-2 ( -- parser ) : <@ ( parser quot -- parser )
"1" token [ str>number ] <@ "123" swap call lcar ; #! Return an <@-parser.
[ <@-parser ] cons cons ;
: some ( parser -- det-parser ) : some-parser ( input parser -- result )
#! Given a parser, return a parser that only produces the #! Calls the parser on the input, guarantees
#! resulting parse tree of the first successful complete parse. #! the parse is complete (the remaining input is empty),
[ ( inp parser -- result ) #! picks the first solution and only returns the parse
just call lcar car #! tree since the remaining input is empty.
] curry1 ; just call lcar cdr ;
: delayed-parser ( [ parser ] -- parser )
[ ( inp [ parser ] -- result )
call call
] curry1 ;
: parens ; : some ( parser -- deterministic-parser )
: parens ( -- parser ) #! Creates a 'some-parser'.
#! Parse nested parentheses [ some-parser ] cons ;
"(" token [ parens ] delayed-parser <&>
")" token <&> [ parens ] delayed-parser <&>
epsilon <|> ;
: nesting ( -- parser ) : <&-parser ( input parser1 parser2 -- result )
#! Count the maximum depth of nested parentheses. #! Same as <&> except discard the results of the second parser.
"(" token [ nesting ] delayed-parser <&> ")" token <&> <&> [ phead ] <@ call ;
[ nesting ] delayed-parser <&> [ .s drop "a" ] <@ epsilon <|> ;
: <& ( parser1 parser2 -- parser ) : <& ( parser1 parser2 -- parser )
#! Same as <&> except only return the first item in the parse tree. #! Same as <&> except discard the results of the second parser.
<&> [ pfirst ] <@ ; [ <&-parser ] cons cons ;
: &>-parser ( input parser1 parser2 -- result )
#! Same as <&> except discard the results of the first parser.
<&> [ ptail ] <@ call ;
: &> ( parser1 parser2 -- parser ) : &> ( parser1 parser2 -- parser )
#! Same as <&> except only return the second item in the parse tree. #! Same as <&> except discard the results of the first parser.
<&> [ psecond ] <@ ; [ &>-parser ] cons cons ;
: lst ( [ x [ xs ] ] -- [x:xs] ) : (a,(b,c))>((a,b,c)) ( list -- list )
#! I need a good name for this word... #! Convert a list where the car is a single value
dup cdr [ uncons car cons ] when unit ; #! and the cdr is a list to a list containing a flattened
#! list.
uncons car cons unit ;
: <:&>-parser ( input parser1 parser2 -- result )
#! Same as <&> except postprocess the result with
#! (a,(b,c))>((a,b,c)).
<&> [ (a,(b,c))>((a,b,c)) ] <@ call ;
: <:&> ( parser1 parser2 -- parser )
#! Same as <&> except postprocess the result with
#! (a,(b,c))>((a,b,c)).
[ <:&>-parser ] cons cons ;
DEFER: <*>
: (<*>) ( parser -- parser )
#! Non-delayed implementation of <*>
dup <*> <:&> [ ] succeed <|> ;
: <*> ( parser -- parser ) : <*> ( parser -- parser )
#! Return a parser that accepts zero or more occurences of the original #! Return a parser that accepts zero or more occurences of the original
#! parser. #! parser.
dup [ <*> ] curry1 delayed-parser <&> [ lst ] <@ [ ] succeed <|> ; [ (<*>) call ] cons ;
: (<+>) ( parser -- parser )
#! Non-delayed implementation of <+>
dup <*> <:&> ;
: <+> ( parser -- parser ) : <+> ( parser -- parser )
#! Return a parser that accepts one or more occurences of the original #! Return a parser that accepts one or more occurences of the original
#! parser. #! parser.
dup [ <*> ] curry1 delayed-parser <&> [ lst ] <@ ; [ (<+>) call ] cons ;
: <?> ( parser -- parser ) : (<?>) ( parser -- parser )
#! Return a parser where its construct is optional. It may or may not occur. #! Non-delayed implementation of <?>
[ ] succeed <|> ; [ unit ] <@ [ ] succeed <|> ;
: <first> ( parser -- parser )
#! Transform a parser into a parser that only returns the first success.
[
call dup [ lcar lunit ] when
] curry1 ;
: <!*> ( parser -- parser ) : <?> ( parser -- parser )
#! Version of <*> that only returns the first success. #! Return a parser that optionally uses the parser
<*> <first> ; #! if that parser would be successfull.
[ (<?>) call ] cons ;
: <!+> ( parser -- parser ) USE: prettyprint
#! Version of <+> that only returns the first success. USE: parser
<+> <first> ; USE: unparser
USE: stdio
: ab-test ! Testing <&>
"a" token <*> "b" token <&> "aaaaab" swap call [ . ] leach ; : test1 "abcd" "a" token "b" token <&> call [ . ] leach ;
: test1a "abcd" "a" token "b" token <&> "c" token <&> call [ . ] leach ;
: test1b "abcd" "a" token "b" token "c" token <&> <&> call [ . ] leach ;
: test2 "decd" "a" token "b" token <&> call [ . ] leach ;
: test3 "dbcd" "a" token "b" token <&> call [ . ] leach ;
: test4 "adcd" "a" token "b" token <&> call [ . ] leach ;
: ab-test2 ! Testing <|>
"a" token <*> "b" token <&> [ "a" "a" "a" "b" ] swap call [ . ] leach ; : test5 "abcd" "a" token "b" token <|> call [ . ] leach ;
: test6 "bbcd" "a" token "b" token <|> call [ . ] leach ;
: test7 "cbcd" "a" token "b" token <|> call [ . ] leach ;
: a "a" token "a" token <&> epsilon <|> ; ! Testing sp
: b "b" token epsilon <|> ; : test8 " abcd" "a" token call [ . ] leach ;
: c "c" token "c" token <&> ; : test9 " abcd" "a" token sp call [ . ] leach ;
: d "d" token "d" token <&> ;
: count-a "a" token [ count-a ] delayed-parser &> "b" token <& [ 1 + ] <@ 0 succeed <|> ;
: tca "aaabbb" count-a call [ . ] leach ;
: parse-digit ( -- parser ) ! Testing just
#! Return a parser for digits : test10 "abcd" "abcd" token "abc" token <|> call [ . ] leach ;
[ digit? ] satisfy [ CHAR: 0 - ] <@ ; : test11 "abcd" "abcd" token "abc" token <|> just call [ . ] leach ;
: (reduce) ( start quot list -- value ) ! Testing <@
#! Call quot with start and the first value in the list. : test12 "01234" [ digit? ] satisfy call [ . ] leach ;
#! quot is then called with the result of quot and the : test13 "01234" [ digit? ] satisfy [ digit> ] <@ call [ . ] leach ;
#! next item in the list until the list is exhausted.
uncons >r swap dup swap r> call r> r> dup [
(reduce)
] [
2drop
] ifte ;
: reduce ( list start quot -- value ) ! Testing some
#! Call quot with start and the first value in the list. : test14 "begin1" "begin" token call [ . ] leach ;
#! quot is then called with the result of quot and the : test15 "This should fail with an error" print
#! next item in the list until the list is exhausted. "begin1" "begin" token some call . ;
rot (reduce) ; : test16 "begin" "begin" token some call . ;
: natural ( -- parser ) ! parens test function
#! a parser for natural numbers. : parens ( -- parser )
parse-digit <*> [ car 0 [ swap 10 * + ] reduce unit ] <@ ; #! Return a parser that parses nested parentheses.
[ "(" token parens <&> ")" token <&> parens <&> epsilon <|> call ] ;
: natural2 ( -- parser ) : test17 "" parens call [ . ] leach ;
#! a parser for natural numbers. : test18 "()" parens call [ . ] leach ;
parse-digit <!+> [ car 0 [ swap 10 * + ] reduce unit ] <@ ; : test19 "((()))" parens call [ . ] leach ;
: integer ( -- parser ) ! <& parser and &> parser
#! A parser that can parser possible negative numbers. : test20 "abcd" "a" token "b" token <&> call [ . ] leach ;
"-" token <?> [ drop -1 ] <@ natural2 <&> [ 1 [ * ] reduce ] <@ ; : test21 "abcd" "a" token "b" token <& call [ . ] leach ;
: test22 "abcd" "a" token "b" token &> call [ . ] leach ;
: identifier ( -- parser ) ! nesting example
#! Parse identifiers : parens-open "(" token ;
[ letter? ] satisfy <+> [ car cat ] <@ ; : parens-close ")" token ;
: nesting
[ parens-open
nesting &>
parens-close <&
nesting <&>
[ unswons 1 + max ] <@
0 succeed <|>
call ] ;
: identifier2 ( -- parser ) : test23 "" nesting just call [ . ] leach ;
#! Parse identifiers : test24 "()" nesting just call [ . ] leach ;
[ letter? ] satisfy <!+> [ car cat ] <@ ; : test25 "(())" nesting just call [ . ] leach ;
: test26 "()(()(()()))()" nesting just call [ . ] leach ;
: ints ( -- parser ) ! Testing <*> and <:&>
integer "+" token [ drop [ [ + ] ] ] <@ <&> : test27 "1234" "1" token <*> call [ . ] leach ;
integer <&> [ call swap call ] <@ ; : test28 "1111234" "1" token <*> call [ . ] leach ;
: test28a "1111234" "1" token <*> [ car cat unit ] <@ call [ . ] leach ;
: test29 "234" "1" token <*> call [ . ] leach ;
: pdigit [ digit? ] satisfy [ digit> ] <@ ;
: pnatural pdigit <*> ;
: pnatural2 pnatural [ car [ >digit ] map cat dup pempty? [ drop 0 ] [ str>number ] ifte unit ] <@ ;
: test30 "12345" pnatural2 call [ . ] leach ;
! Testing <+>
: test31 "1234" "1" token <+> call [ . ] leach ;
: test32 "1111234" "1" token <+> call [ . ] leach ;
: test33 "234" "1" token <+> call [ . ] leach ;
: url-quotable ( -- parser ) ! Testing <?>
! [a-zA-Z0-9/_?] re-matches : test34 "ab" "a" token pdigit <?> <&> "b" token <&> call [ . ] leach ;
[ letter? ] satisfy : test35 "ac" "a" token pdigit <?> <&> "b" token <&> call [ . ] leach ;
[ LETTER? ] satisfy <|> : test36 "a5b" "a" token pdigit <?> <&> "b" token <&> call [ . ] leach ;
[ digit? ] satisfy <|> : pinteger "-" token <?> pnatural2 <&> [ uncons swap [ car -1 * ] when ] <@ ;
CHAR: / symbol <|> : test37 "123" pinteger call [ . ] leach ;
CHAR: _ symbol <|> : test38 "-123" pinteger call [ . ] leach ;
CHAR: ? symbol <|> just ;
: http-header ( -- parser )
[ CHAR: : = not ] satisfy <!+> [ car cat ] <@
": " token [ drop f ] <@ <&>
[ drop t ] satisfy <!+> [ car cat ] <@ <&> just ;
: parse-http-header ( string -- [ name value ] )
http-header call lcar car ;
: get-request ( -- parser )
"GET" token
[ drop t ] satisfy <!+> sp [ car cat ] <@ <&> ;
: post-request ( -- parser )
"POST" token
[ drop t ] satisfy <!+> sp [ car cat ] <@ <&> ;
: all-request ( -- parser )
"POST" token
[ 32 = not ] satisfy <!+> sp [ car cat ] <@ <&>
"HTTP/1.0" token sp <&> ;
: split-url ( -- parser )
"http://" token
[ CHAR: / = not ] satisfy <!*> [ car cat ] <@ <&>
"/" token <&>
[ drop t ] satisfy <!*> [ car cat ] <@ <&> ;

View File

@ -16,17 +16,17 @@
operation.</p> operation.</p>
<p>The result returned by a parser is known as a 'list of <p>The result returned by a parser is known as a 'list of
successes'. It is a lazy list of standard Factor cons cells. Each cons successes'. It is a lazy list of standard Factor cons cells. Each cons
cell is a result of a parse. The car of the cell is the result of the cell is a result of a parse. The car of the cell is the remaining
parse operation and the cdr of the cell is the remaining input left to input left to be parsed and the cdr of the cell is the result of the
be parsed.</p> parsing operation.</p>
<p>A list is used for the result as a parse operation can potentially <p>A lazy list is used for the result as a parse operation can potentially
return many successful results. For example, a parser that parses one return many successful results. For example, a parser that parses one
or more digits will return more than one result for the input "123". A or more digits will return more than one result for the input "123". A
successful parse could be "1", "12" or "123".</p> successful parse could be "1", "12" or "123".</p>
<p>The list is lazy so if only one parse result is required the <p>The list is lazy so if only one parse result is required the
remaining results won't actually be processed if they are not remaining results won't actually be processed if they are not
requested. This improves efficiency.</p> requested. This improves efficiency.</p>
<p>The car of the result pair can be any value that the parser wishes <p>The cdr of the result pair can be any value that the parser wishes
to return. It could be the successful portion of the input string to return. It could be the successful portion of the input string
parsed, an abstract syntax tree representing the parsed input, or even parsed, an abstract syntax tree representing the parsed input, or even
a quotation that should get called for later processing.</p> a quotation that should get called for later processing.</p>
@ -39,32 +39,32 @@ general parsers work and the 'list of sucesses' that are returned as a
result.</p> result.</p>
<pre class="code"> <pre class="code">
(1) : char-a ( inp -- result ) (1) : char-a ( inp -- result )
0 over str-nth CHAR: a = [ 0 over string-nth CHAR: a = [
1 str-tail CHAR: a swons lunit 1 swap string-tail CHAR: a cons unit delay lunit
] [ ] [
drop f drop lnil
] ifte ; ] ifte ;
(2) "atest" char-a [ [ . ] leach ] when* (2) "atest" char-a [ [ . ] leach ] when*
=> [ 97 | "test" ] =&gt; [[ "test" 97 ]]
(3) "test" char-a [ [ . ] leach ] when* (3) "test" char-a [ [ . ] leach ] when*
=> =&gt;
</pre> </pre>
<p>'char-a' is a parser that only accepts the character 'a' in the <p>'char-a' is a parser that only accepts the character 'a' in the
input string. When passed an input string with a string with a leading input string. When passed an input string with a string with a leading
'a' then the 'list of successes' has 1 result value. The car of that 'a' then the 'list of successes' has 1 result value. The cdr of that
result value is the character 'a' successfully parsed, and the cdr is result value is the character 'a' successfully parsed, and the car is
the remaining input string. On failure of the parse an empty list is the remaining input string. On failure of the parse an empty list is
returned.</p> returned.</p>
<p>The parser combinator library provides a combinator, <&>, that takes <p>The parser combinator library provides a combinator, &lt;&amp;&gt;, that takes
two parsers off the stack and returns a parser that calls the original two parsers off the stack and returns a parser that calls the original
two in sequence. An example of use would be calling 'char-a' twice, two in sequence. An example of use would be calling 'char-a' twice,
which would then result in an input string expected with two 'a' which would then result in an input string expected with two 'a'
characters leading:</p> characters leading:</p>
<pre class="code"> <pre class="code">
(1) "aatest" [ char-a ] [ char-a ] <&> call (1) "aatest" [ char-a ] [ char-a ] &lt;&amp;&gt; call
=> < list of successes > =&gt; < list of successes >
(2) [ . ] leach (2) [ . ] leach
=> [ [ 97 97 ] | "test" ] =&gt; [[ "test" [[ 97 97 ]] ]]
</pre> </pre>
<h2>Tokens</h2> <h2>Tokens</h2>
<p>Creating parsers for specfic characters and tokens can be a chore <p>Creating parsers for specfic characters and tokens can be a chore
@ -72,13 +72,13 @@ so there is a word that, given a string token on the stack, returns
a parser that parses that particular token:</p> a parser that parses that particular token:</p>
<pre class="code"> <pre class="code">
(1) "begin" token (1) "begin" token
=> < a parser that parses the token "begin" > =&gt; < a parser that parses the token "begin" >
(2) dup "this should fail" swap call . (2) dup "this should fail" swap call lnil? .
=> f =&gt; t
(3) "begin a successfull parse" swap call (3) "begin a successfull parse" swap call
=> < lazy list > =&gt; < lazy list >
(4) [ . ] leach (4) [ . ] leach
=> [ "begin" | " a successfull parse" ] =&gt; [[ " a successfull parse" "begin" ]]
</pre> </pre>
<h2>Predicate matching</h2> <h2>Predicate matching</h2>
<p>The word 'satisfy' takes a quotation from the top of the stack and <p>The word 'satisfy' takes a quotation from the top of the stack and
@ -89,9 +89,9 @@ true then the parse is successful, otherwise it fails:</p>
(1) : digit-parser ( -- parser ) (1) : digit-parser ( -- parser )
[ digit? ] satisfy ; [ digit? ] satisfy ;
(2) "5" digit-parser call [ . ] leach (2) "5" digit-parser call [ . ] leach
=> [ 53 | "" ] =&gt; [[ "" 53 ]]
(3) "a" digit-parser call (3) "a" digit-parser call lnil? .
=> f =&gt; t
</pre> </pre>
<p>Note that 'digit-parser' returns a parser, it is not the parser <p>Note that 'digit-parser' returns a parser, it is not the parser
itself. It is really a parser generating word like 'token'. Whereas itself. It is really a parser generating word like 'token'. Whereas
@ -105,14 +105,14 @@ parser parsed. The result of the '<*>' generated parser will be a list
of the successful results returned by the original parser.</p> of the successful results returned by the original parser.</p>
<pre class="code"> <pre class="code">
(1) digit-parser <*> (1) digit-parser <*>
=> < parser > =&gt; < parser >
(2) "123" swap call (2) "123" swap call
=> < lazy list > =&gt; < lazy list >
(3) [ . ] leach (3) [ . ] leach
=> [ [ [ 49 50 51 ] ] | "" ] =&gt; [ "" [ 49 50 51 ] ]
[ [ [ 49 50 ] ] | "3" ] [ "3" [ 49 50 ] ]
[ [ [ 49 ] ] | "23" ] [ "23" [ 49 ] ]
[ f | "123" ] [ "123" ]
</pre> </pre>
<p>In this case there are multiple successful parses. This is because <p>In this case there are multiple successful parses. This is because
the occurrence of zero or more digits happens more than once. There is the occurrence of zero or more digits happens more than once. There is
@ -124,7 +124,7 @@ remaining parse results are never produced.</p>
parsed. Sometimes you want this to be something else, like an abstract parsed. Sometimes you want this to be something else, like an abstract
syntax tree, or some calculation. For the digit case we may want the syntax tree, or some calculation. For the digit case we may want the
actual integer number.</p> actual integer number.</p>
<p>For this we can use the '<@' parser <p>For this we can use the '&lt;@' parser
combinator. This combinator takes a parser and a quotation on the combinator. This combinator takes a parser and a quotation on the
stack and returns a new parser. When the new parser is called it will stack and returns a new parser. When the new parser is called it will
call the original parser to produce the results, then it will call the call the original parser to produce the results, then it will call the
@ -132,27 +132,48 @@ quotation on each successfull result, and the result of that quotation
will be the result of the parse:</p> will be the result of the parse:</p>
<pre class="code"> <pre class="code">
(1) : digit-parser2 ( -- parser ) (1) : digit-parser2 ( -- parser )
[ digit? ] satisfy [ CHAR: 0 - ] <@ ; [ digit? ] satisfy [ digit> ] &lt;@ ;
(2) "5" digit-parser2 call [ . ] leach (2) "5" digit-parser2 call [ . ] leach
=> [ 5 | "" ] =&gt; [[ "" 5 ]]
</pre> </pre>
<p>Notice that now the result is the actual integer '5' rather than <p>Notice that now the result is the actual integer '5' rather than
character code '53'.</p> character code '53'.</p>
<pre class="code"> <pre class="code">
(1) : natural-parser ( -- parser ) (1) : digit-list>number ( list -- number )
digit-parser2 <*> [ car 0 [ swap 10 * + ] reduce unit ] <@ ; #! Converts a list of digits to a number
(2) "123" natural-parser call [ >digit ] map cat dup string-length 0 = [
=> < lazy list > drop 0
(3) [ . ] leach ] [
=> [ [ 123 ] | "" ] str>number
[ [ 12 ] | "3" ] ] ifte ;
[ [ 1 ] | "23" ] (2) : natural-parser ( -- parser )
[ f | "123" ] digit-parser2 <*> [ car digit-list>number unit ] &lt;@ ;
(3) "123" natural-parser call
=&gt; < lazy list >
(4) [ . ] leach
=&gt; [ "" 123 ]
[ "3" 12 ]
[ "23" 1 ]
[ "123" 0 ]
[ [ 123 ] | "" ]
</pre> </pre>
<p>The number parsed is the actual integer number due to the operation <p>The number parsed is the actual integer number due to the operation
of the '<@' word. This allows parsers to not only parse the input of the '&lt;@' word. This allows parsers to not only parse the input
string but perform operations and transformations on the syntax tree string but perform operations and transformations on the syntax tree
returned.</p> returned.</p>
<p>A useful debugging method to work out what to use in the quotation
passed to &lt;@ is to write an initial version of the parser that just
displays the topmost item on the stack:</p>
<pre class="code">
(1) : natural-parser-debug ( -- parser )
digit-parser2 <*> [ "debug: " write dup . ] &lt;@ ;
(3) "123" natural-parser-debug call lcar .
=&gt; debug: [ [ 1 2 3 ] ]
[ "" [ 1 2 3 ] ]
</pre>
<p>From the debug output we can see how to manipulate the result to
get what we want. In this case it's the quotation in the previous example.</p>
<h2>Sequential combinator</h2> <h2>Sequential combinator</h2>
<p>To create a full grammar we need a parser combinator that does <p>To create a full grammar we need a parser combinator that does
sequential compositions. That is, given two parsers, the sequential sequential compositions. That is, given two parsers, the sequential
@ -160,41 +181,101 @@ combinator will first run the first parser, and then run the second on
the remaining text to be parsed. As the first parser returns a lazy the remaining text to be parsed. As the first parser returns a lazy
list, the second parser will be run on each item of the lazy list. Of list, the second parser will be run on each item of the lazy list. Of
course this is done lazily so it only ends up being done when those course this is done lazily so it only ends up being done when those
list items are requested. The sequential combinator word is <&>.</p> list items are requested. The sequential combinator word is &lt;&amp;&gt;.</p>
<pre class="code"> <pre class="code">
( 1 ) "number:" token ( 1 ) "number:" token
=> < parser that parses the text 'number:' > =&gt; < parser that parses the text 'number:' >
( 2 ) natural ( 2 ) natural-parser
=> < parser that parses natural numbers > =&gt; < parser that parses natural numbers >
( 3 ) <&> ( 3 ) &lt;&amp;&gt;
=> < parser that parses 'number:' followed by a natural > =&gt; < parser that parses 'number:' followed by a natural >
( 4 ) "number:1000" swap call ( 4 ) "number:100" swap call
=> < list of successes > =&gt; < list of successes >
( 5 ) [ . ] leach ( 5 ) [ . ] leach
=> [ [ "number:" 1000 ] | "" ] =&gt; [ "" "number:" 100 ]
[ [ "number:" 100 ] | "0" ] [ "0" "number:" 10 ]
[ [ "number:" 10 ] | "00" ] [ "00" "number:" 1 ]
[ [ "number:" 1 ] | "000" ] [ "100" "number:" 0 ]
[ [ "number:" ] | "1000" ]
</pre> </pre>
<p>In this example we might prefer not to have the parse result
contain the token, we want just the number. Two alternatives to &lt;&amp;&gt;
provide the ability to select which result to use from the two
parsers. These operators are &lt;&amp; and &amp;&gt;. The &lt; or &gt; points
in the direction of which parser to retain the results from. So our
example above could be:</p>
<pre class="code">
( 1 ) "number:" token
=&gt; < parser that parses the text 'number:' >
( 2 ) natural-parser
=&gt; < parser that parses natural numbers >
( 3 ) &amp;&gt;
=&gt; < parser that parses 'number:' followed by a natural >
( 4 ) "number:100" swap call
=&gt; < list of successes >
( 5 ) [ . ] leach
=&gt; [ "" 100 ]
[ "0" 10 ]
[ "00" 1 ]
[ "100" 0 ]
</pre>
<p>Notice how the parse result only contains the number due to &&gt;
being used to retain the result of the second parser.</p>
<h2>Choice combinator</h2> <h2>Choice combinator</h2>
<p>As well as a sequential combinator we need an alternative <p>As well as a sequential combinator we need an alternative
combinator. The word for this is <|>. It takes two parsers from the combinator. The word for this is &lt;|&gt;. It takes two parsers from the
stack and returns a parser that will first try the first parser. If it stack and returns a parser that will first try the first parser. If it
succeeds then the result for that is returned. If it fails then the succeeds then the result for that is returned. If it fails then the
second parser is tried and its result returned.</p> second parser is tried and its result returned.</p>
<pre class="code"> <pre class="code">
( 1 ) "one" token ( 1 ) "one" token
=> < parser that parses the text 'one' > =&gt; < parser that parses the text 'one' >
( 2 ) "two" token ( 2 ) "two" token
=> < parser that parses the text 'two' > =&gt; < parser that parses the text 'two' >
( 3 ) <|> ( 3 ) &lt;|&gt;
=> < parser that parses 'one' or 'two' > =&gt; < parser that parses 'one' or 'two' >
( 4 ) "one" over call [ . ] leach ( 4 ) "one" over call [ . ] leach
=> [ "one" | "" ] =&gt; [[ "" "one" ]]
( 5 ) "two" swap call [ . ] leach ( 5 ) "two" swap call [ . ] leach
=> [ "two" | "" ] =&gt; [[ "" "two" ]]
</pre> </pre>
<h2>Option combinator</h2>
<p>The option combinator, &lt;?&gt; allows adding optional elements to
a parser. It takes one parser off the stack and if the parse succeeds
add it to the result tree, otherwise it will ignore it and
continue. The example below extends our natural-parser to parse
integers with an optional leading minus sign.</p>
<pre class="code">
( 1 ) : integer-parser
"-" token <?> natural-parser &lt;&amp;&gt; ;
( 2 ) "200" integer-parser call [ . ] leach
=&gt; [ "" [ ] 200 ]
[ "0" [ ] 20 ]
[ "00" [ ] 2 ]
[ "200" [ ] 0 ]
( 3 ) "-200" integer-parser call [ . ] leach
=&gt; [ "" [ "-" ] 200 ]
[ "0" [ "-" ] 20 ]
[ "00" [ "-" ] 2 ]
[ "200" [ "-" ] 0 ]
[ "-200" [ ] 0 ]
( 4 ) : integer-parser2
integer-parser [ uncons swap [ car -1 * ] when ] &lt;@ ;
( 5 ) "200" integer-parser2 call [ . ] leach
=&gt; [ "" 200 ]
[ "0" 20 ]
[ "00" 2 ]
[ "200" 0 ]
( 6 ) "-200" integer-parser2 call [ . ] leach
=&gt; [ "" -200 ]
[ "0" -20 ]
[ "00" -2 ]
[ "200" 0 ]
[ "-200" 0 ]
</pre>
<h2>Skipping Whitespace</h2> <h2>Skipping Whitespace</h2>
<p>A parser transformer exists, the word 'sp', that takes an existing <p>A parser transformer exists, the word 'sp', that takes an existing
parser and returns a new one that will first skip any whitespace parser and returns a new one that will first skip any whitespace
@ -202,59 +283,54 @@ before calling the original parser. This makes it easy to write
grammers that avoid whitespace without having to explicitly code it grammers that avoid whitespace without having to explicitly code it
into the grammar.</p> into the grammar.</p>
<pre class="code"> <pre class="code">
( 1 ) natural ( 1 ) " 123" natural-parser call [ . ] leach
=> < a parser for natural numbers > =&gt; [ " 123" 0 ]
( 2 ) "+" token sp ( 2 ) " 123" natural-parser sp call [ . ] leach
=> < parser for '+' which ignores leading whitespace > =&gt; [ "" 123 ]
( 3 ) over sp [ "3" 12 ]
=> < a parser for natural numbers skipping leading whitespace > [ "23" 1 ]
( 4 ) <&> <&> [ "123" 0 ]
=> < a parser for natural + natural >
( 5 ) "1 + 2" over call lcar .
=> [ [ 1 "+" 2 ] | "" ]
( 6 ) "3+4" over call lcar .
=> [ [ 3 "+" 4 ] | "" ]
</pre> </pre>
<h2>Eval grammar example</h2> <h2>Eval grammar example</h2>
<p>This example presents a simple grammar that will parse a number <p>This example presents a simple grammar that will parse a number
followed by an operator and another number. A factor expression that followed by an operator and another number. A factor expression that
computes the entered value will be executed.</p> computes the entered value will be executed.</p>
<pre class="code"> <pre class="code">
( 1 ) natural ( 1 ) natural-parser
=> < a parser for natural numbers > =&gt; < a parser for natural numbers >
( 2 ) "/" token "*" token "+" token "-" token <|> <|> <|> ( 2 ) "/" token "*" token "+" token "-" token &lt;|&gt; &lt;|&gt; &lt;|&gt;
=> < a parser for the operator > =&gt; < a parser for the operator >
( 3 ) sp [ unit [ eval ] append unit ] <@ ( 3 ) sp [ "\\ " swap cat2 eval unit ] &lt;@
=> < operator parser that skips whitespace and converts to a =&gt; < operator parser that skips whitespace and converts to a
factor expression > factor expression >
( 4 ) natural sp ( 4 ) natural-parser sp
=> < a whitespace skipping natural parser > =&gt; < a whitespace skipping natural parser >
( 5 ) <&> <&> [ call swap call ] <@ ( 5 ) &lt;&amp;&gt; &lt;&amp;&gt; [ uncons uncons swap append append call ] &lt;@
=> < a parser that parsers the expression, converts it to =&gt; < a parser that parsers the expression, converts it to
factor, calls it and puts the result in the parse tree > factor, calls it and puts the result in the parse tree >
( 6 ) "123 + 456" over call lcar . ( 6 ) "123 + 456" over call lcar .
=> [ 579 | "" ] =&gt; [[ "" 579 ]]
( 7 ) "300-100" over call lcar . ( 7 ) "300-100" over call lcar .
=> [ 200 | "" ] =&gt; [[ "" 200 ]]
( 8 ) "200/2" over call lcar . ( 8 ) "200/2" over call lcar .
=> [ 100 | "" ] =&gt; [[ "" 100 ]]
</pre> </pre>
<p>It looks complicated when expanded as above but the entire parser, <p>It looks complicated when expanded as above but the entire parser,
factored a little, looks quite readable:</p> factored a little, looks quite readable:</p>
<pre class="code"> <pre class="code">
( 1 ) : operator ( -- parser ) ( 1 ) : operator ( -- parser )
"/" token "/" token
"*" token <|> "*" token &lt;|&gt;
"+" token <|> "+" token &lt;|&gt;
"-" token <|> "-" token &lt;|&gt;
[ unit [ eval ] append unit ] <@ ; [ "\\ " swap cat2 eval unit ] &lt;@ ;
( 2 ) : expression ( -- parser ) ( 2 ) : expression ( -- parser )
natural natural-parser
operator sp <&> operator sp &lt;&amp;&gt;
natural sp <&> natural-parser sp &lt;&amp;&gt;
[ call swap call ] <@ ; [ uncons swap uncons -rot append append reverse call ] &lt;@ ;
( 3 ) "40+2" expression call lcar . ( 3 ) "40+2" expression call lcar .
=> [ 42 | "" ] =&gt; [[ "" 42 ]]
</pre> </pre>
<p class="footer"> <p class="footer">
News and updates to this software can be obtained from the authors News and updates to this software can be obtained from the authors