342 lines
		
	
	
		
			14 KiB
		
	
	
	
		
			HTML
		
	
	
		
		
			
		
	
	
			342 lines
		
	
	
		
			14 KiB
		
	
	
	
		
			HTML
		
	
	
| 
								 | 
							
								<html>
							 | 
						||
| 
								 | 
							
								  <head>
							 | 
						||
| 
								 | 
							
								    <title>Parser Combinators</title>
							 | 
						||
| 
								 | 
							
								    <link rel="stylesheet" type="text/css" href="style.css">
							 | 
						||
| 
								 | 
							
								      </head>
							 | 
						||
| 
								 | 
							
								  <body>
							 | 
						||
| 
								 | 
							
								    <h1>Parsers</h1>
							 | 
						||
| 
								 | 
							
								<p class="note">The parser combinator library described here is based
							 | 
						||
| 
								 | 
							
								  on a library written for the Clean pure functional programming language and
							 | 
						||
| 
								 | 
							
								  described in chapter 5 of the 'Clean Book' (<a
							 | 
						||
| 
								 | 
							
								  href="ftp://ftp.cs.kun.nl/pub/Clean/papers/cleanbook/II.05.ParserCombinators.pdf">PDF
							 | 
						||
| 
								 | 
							
								  available here</a>). Based on the description
							 | 
						||
| 
								 | 
							
								  in that chapter I developed a version for Factor, a concatenative
							 | 
						||
| 
								 | 
							
								  language.</p>  
							 | 
						||
| 
								 | 
							
								<p>A parser is a word or quotation that, when called, processes
							 | 
						||
| 
								 | 
							
								   an input string on the stack, performs some parsing operation on
							 | 
						||
| 
								 | 
							
								   it, and returns a result indicating the success of the parsing
							 | 
						||
| 
								 | 
							
								   operation.</p> 
							 | 
						||
| 
								 | 
							
								<p>The result returned by a parser is known as a 'list of
							 | 
						||
| 
								 | 
							
								successes'. It is a lazy list of standard Factor cons cells. Each cons
							 | 
						||
| 
								 | 
							
								cell is a result of a parse. The car of the cell is the remaining
							 | 
						||
| 
								 | 
							
								input left to be parsed and the cdr of the cell is the result of the
							 | 
						||
| 
								 | 
							
								parsing operation.</p>
							 | 
						||
| 
								 | 
							
								<p>A lazy list is used for the result as a parse operation can potentially
							 | 
						||
| 
								 | 
							
								return many successful results. For example, a parser that parses one
							 | 
						||
| 
								 | 
							
								or more digits will return more than one result for the input "123". A
							 | 
						||
| 
								 | 
							
								successful parse could be "1", "12" or "123".</p>
							 | 
						||
| 
								 | 
							
								<p>The list is lazy so if only one parse result is required the
							 | 
						||
| 
								 | 
							
								remaining results won't actually be processed if they are not
							 | 
						||
| 
								 | 
							
								requested. This improves efficiency.</p>
							 | 
						||
| 
								 | 
							
								<p>The cdr of the result pair can be any value that the parser wishes
							 | 
						||
| 
								 | 
							
								to return. It could be the successful portion of the input string
							 | 
						||
| 
								 | 
							
								parsed, an abstract syntax tree representing the parsed input, or even
							 | 
						||
| 
								 | 
							
								a quotation that should get called for later processing.</p>
							 | 
						||
| 
								 | 
							
								<p>A Parser Combinator is a word that takes one or more parsers and
							 | 
						||
| 
								 | 
							
								returns a parser that when called uses the original parsers in some
							 | 
						||
| 
								 | 
							
								manner.</p>
							 | 
						||
| 
								 | 
							
								<h1>Example Parsers</h1>
							 | 
						||
| 
								 | 
							
								<p>The following are some very simple parsers that demonstrate how
							 | 
						||
| 
								 | 
							
								general parsers work and the 'list of sucesses' that are returned as a
							 | 
						||
| 
								 | 
							
								result.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) : char-a ( inp -- result )
							 | 
						||
| 
								 | 
							
								        0 over string-nth CHAR: a = [
							 | 
						||
| 
								 | 
							
								          1 swap string-tail CHAR: a cons unit delay lunit
							 | 
						||
| 
								 | 
							
								        ] [
							 | 
						||
| 
								 | 
							
								          drop lnil
							 | 
						||
| 
								 | 
							
								        ] ifte ;
							 | 
						||
| 
								 | 
							
								  (2) "atest" char-a [ [ . ] leach ] when*
							 | 
						||
| 
								 | 
							
								      => [[ "test" 97 ]]
							 | 
						||
| 
								 | 
							
								  (3) "test"  char-a [ [ . ] leach ] when*
							 | 
						||
| 
								 | 
							
								      =>
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>'char-a' is a parser that only accepts the character 'a' in the
							 | 
						||
| 
								 | 
							
								input string. When passed an input string with a string with a leading
							 | 
						||
| 
								 | 
							
								'a' then the 'list of successes' has 1 result value. The cdr of that
							 | 
						||
| 
								 | 
							
								result value is the character 'a' successfully parsed, and the car is
							 | 
						||
| 
								 | 
							
								the remaining input string. On failure of the parse an empty list is
							 | 
						||
| 
								 | 
							
								returned.</p> 
							 | 
						||
| 
								 | 
							
								<p>The parser combinator library provides a combinator, <&>, that takes
							 | 
						||
| 
								 | 
							
								two parsers off the stack and returns a parser that calls the original
							 | 
						||
| 
								 | 
							
								two in sequence. An example of use would be calling 'char-a' twice,
							 | 
						||
| 
								 | 
							
								which would then result in an input string expected with two 'a'
							 | 
						||
| 
								 | 
							
								characters leading:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) "aatest" [ char-a ] [ char-a ] <&> call
							 | 
						||
| 
								 | 
							
								      => < list of successes >
							 | 
						||
| 
								 | 
							
								  (2) [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [[ "test" [[ 97 97 ]] ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<h2>Tokens</h2>
							 | 
						||
| 
								 | 
							
								<p>Creating parsers for specfic characters and tokens can be a chore
							 | 
						||
| 
								 | 
							
								so there is a word that, given a string token on the stack, returns
							 | 
						||
| 
								 | 
							
								a parser that parses that particular token:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) "begin" token 
							 | 
						||
| 
								 | 
							
								      => < a parser that parses the token "begin" >
							 | 
						||
| 
								 | 
							
								  (2) dup "this should fail" swap call lnil? .
							 | 
						||
| 
								 | 
							
								      => t
							 | 
						||
| 
								 | 
							
								  (3) "begin a successfull parse" swap call 
							 | 
						||
| 
								 | 
							
								      => < lazy list >
							 | 
						||
| 
								 | 
							
								  (4) [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [[ " a successfull parse" "begin" ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<h2>Predicate matching</h2>
							 | 
						||
| 
								 | 
							
								<p>The word 'satisfy' takes a quotation from the top of the stack and
							 | 
						||
| 
								 | 
							
								returns a parser than when called will call the quotation with the
							 | 
						||
| 
								 | 
							
								first item in the input string on the stack. If the quotation returns
							 | 
						||
| 
								 | 
							
								true then the parse is successful, otherwise it fails:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) : digit-parser ( -- parser )
							 | 
						||
| 
								 | 
							
								        [ digit? ] satisfy ;
							 | 
						||
| 
								 | 
							
								  (2) "5" digit-parser call [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [[ "" 53 ]]
							 | 
						||
| 
								 | 
							
								  (3) "a" digit-parser call lnil? .
							 | 
						||
| 
								 | 
							
								      => t
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>Note that 'digit-parser' returns a parser, it is not the parser
							 | 
						||
| 
								 | 
							
								itself. It is really a parser generating word like 'token'. Whereas
							 | 
						||
| 
								 | 
							
								our 'char-a' word defined originally was a parser itself.</p>
							 | 
						||
| 
								 | 
							
								<h2>Zero or more matches</h2>
							 | 
						||
| 
								 | 
							
								<p>Now that we can parse single digits it would be nice to easily
							 | 
						||
| 
								 | 
							
								parse a string of them. The '<*>' parser combinator word will do
							 | 
						||
| 
								 | 
							
								this. It accepts a parser on the top of the stack and produces a
							 | 
						||
| 
								 | 
							
								parser that parses zero or more of the constructs that the original
							 | 
						||
| 
								 | 
							
								parser parsed. The result of the '<*>' generated parser will be a list
							 | 
						||
| 
								 | 
							
								of the successful results returned by the original parser.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) digit-parser <*>
							 | 
						||
| 
								 | 
							
								      => < parser >
							 | 
						||
| 
								 | 
							
								  (2) "123" swap call
							 | 
						||
| 
								 | 
							
								      => < lazy list >
							 | 
						||
| 
								 | 
							
								  (3) [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [ "" [ 49 50 51 ] ]
							 | 
						||
| 
								 | 
							
								           [ "3" [ 49 50 ] ]
							 | 
						||
| 
								 | 
							
								           [ "23" [ 49 ] ]
							 | 
						||
| 
								 | 
							
								           [ "123" ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>In this case there are multiple successful parses. This is because
							 | 
						||
| 
								 | 
							
								the occurrence of zero or more digits happens more than once. There is
							 | 
						||
| 
								 | 
							
								also the 'f' case where zero digits is parsed. If only the 'longest
							 | 
						||
| 
								 | 
							
								match' is required then the lcar of the lazy list can be used and the
							 | 
						||
| 
								 | 
							
								remaining parse results are never produced.</p>
							 | 
						||
| 
								 | 
							
								<h2>Manipulating parse trees</h2>
							 | 
						||
| 
								 | 
							
								<p>The result of the previous parse was the list of characters
							 | 
						||
| 
								 | 
							
								parsed. Sometimes you want this to be something else, like an abstract
							 | 
						||
| 
								 | 
							
								syntax tree, or some calculation. For the digit case we may want the
							 | 
						||
| 
								 | 
							
								actual integer number.</p>
							 | 
						||
| 
								 | 
							
								<p>For this we can use the '<@' parser
							 | 
						||
| 
								 | 
							
								combinator. This combinator takes a parser and a quotation on the
							 | 
						||
| 
								 | 
							
								stack and returns a new parser. When the new parser is called it will
							 | 
						||
| 
								 | 
							
								call the original parser to produce the results, then it will call the
							 | 
						||
| 
								 | 
							
								quotation on each successfull result, and the result of that quotation
							 | 
						||
| 
								 | 
							
								will be the result of the parse:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) : digit-parser2 ( -- parser )
							 | 
						||
| 
								 | 
							
								        [ digit? ] satisfy [ digit> ] <@ ;
							 | 
						||
| 
								 | 
							
								  (2) "5" digit-parser2 call [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [[ "" 5 ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>Notice that now the result is the actual integer '5' rather than
							 | 
						||
| 
								 | 
							
								character code '53'.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) : digit-list>number ( list -- number )
							 | 
						||
| 
								 | 
							
								         #! Converts a list of digits to a number
							 | 
						||
| 
								 | 
							
								         [ >digit ] map >string dup empty? [ 
							 | 
						||
| 
								 | 
							
								           drop 0 
							 | 
						||
| 
								 | 
							
								         ] [
							 | 
						||
| 
								 | 
							
									   str>number 
							 | 
						||
| 
								 | 
							
								         ]  ifte ;
							 | 
						||
| 
								 | 
							
								  (2) : natural-parser ( -- parser )
							 | 
						||
| 
								 | 
							
								        digit-parser2 <*> [ car digit-list>number unit  ] <@  ;
							 | 
						||
| 
								 | 
							
								  (3) "123" natural-parser call
							 | 
						||
| 
								 | 
							
								      => < lazy list >
							 | 
						||
| 
								 | 
							
								  (4) [ . ] leach
							 | 
						||
| 
								 | 
							
								      => [ "" 123 ]
							 | 
						||
| 
								 | 
							
								           [ "3" 12 ]
							 | 
						||
| 
								 | 
							
								           [ "23" 1 ]
							 | 
						||
| 
								 | 
							
								           [ "123" 0 ]
							 | 
						||
| 
								 | 
							
								           [ [ 123 ] | "" ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>The number parsed is the actual integer number due to the operation
							 | 
						||
| 
								 | 
							
								of the '<@' word. This allows parsers to not only parse the input
							 | 
						||
| 
								 | 
							
								string but perform operations and transformations on the syntax tree
							 | 
						||
| 
								 | 
							
								returned.</p>
							 | 
						||
| 
								 | 
							
								<p>A useful debugging method to work out what to use in the quotation
							 | 
						||
| 
								 | 
							
								passed to <@ is to write an initial version of the parser that just
							 | 
						||
| 
								 | 
							
								displays the topmost item on the stack:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  (1) : natural-parser-debug ( -- parser )
							 | 
						||
| 
								 | 
							
								        digit-parser2 <*> [ "debug: " write dup . ] <@  ;
							 | 
						||
| 
								 | 
							
								  (3) "123" natural-parser-debug call lcar .
							 | 
						||
| 
								 | 
							
								      => debug: [ [ 1 2 3 ] ]
							 | 
						||
| 
								 | 
							
								           [ "" [ 1 2 3 ] ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>From the debug output we can see how to manipulate the result to
							 | 
						||
| 
								 | 
							
								get what we want. In this case it's the quotation in the previous example.</p>
							 | 
						||
| 
								 | 
							
								 
							 | 
						||
| 
								 | 
							
								<h2>Sequential combinator</h2>
							 | 
						||
| 
								 | 
							
								<p>To create a full grammar we need a parser combinator that does
							 | 
						||
| 
								 | 
							
								sequential compositions. That is, given two parsers, the sequential
							 | 
						||
| 
								 | 
							
								combinator will first run the first parser, and then run the second on
							 | 
						||
| 
								 | 
							
								the remaining text to be parsed. As the first parser returns a lazy
							 | 
						||
| 
								 | 
							
								list, the second parser will be run on each item of the lazy list. Of
							 | 
						||
| 
								 | 
							
								course this is done lazily so it only ends up being done when those
							 | 
						||
| 
								 | 
							
								list items are requested. The sequential combinator word is <&>.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) "number:" token 
							 | 
						||
| 
								 | 
							
								       => < parser that parses the text 'number:' >
							 | 
						||
| 
								 | 
							
								  ( 2 ) natural-parser
							 | 
						||
| 
								 | 
							
								       => < parser that parses natural numbers >
							 | 
						||
| 
								 | 
							
								  ( 3 ) <&>
							 | 
						||
| 
								 | 
							
								       => < parser that parses 'number:' followed by a natural >
							 | 
						||
| 
								 | 
							
								  ( 4 ) "number:100" swap call
							 | 
						||
| 
								 | 
							
								       => < list of successes >
							 | 
						||
| 
								 | 
							
								  ( 5 ) [ . ] leach
							 | 
						||
| 
								 | 
							
								       => [ "" "number:" 100 ]
							 | 
						||
| 
								 | 
							
								            [ "0" "number:" 10 ]
							 | 
						||
| 
								 | 
							
								            [ "00" "number:" 1 ]
							 | 
						||
| 
								 | 
							
								            [ "100" "number:" 0 ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>In this  example we might prefer not to have the parse result
							 | 
						||
| 
								 | 
							
								contain the token, we want just the number. Two alternatives to <&>
							 | 
						||
| 
								 | 
							
								provide the ability to select which result to use from the two
							 | 
						||
| 
								 | 
							
								parsers. These operators are <& and &>. The < or > points 
							 | 
						||
| 
								 | 
							
								in the direction of which parser to retain the results from. So our
							 | 
						||
| 
								 | 
							
								example above could be:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) "number:" token 
							 | 
						||
| 
								 | 
							
								       => < parser that parses the text 'number:' >
							 | 
						||
| 
								 | 
							
								  ( 2 ) natural-parser
							 | 
						||
| 
								 | 
							
								       => < parser that parses natural numbers >
							 | 
						||
| 
								 | 
							
								  ( 3 ) &>
							 | 
						||
| 
								 | 
							
								       => < parser that parses 'number:' followed by a natural >
							 | 
						||
| 
								 | 
							
								  ( 4 ) "number:100" swap call
							 | 
						||
| 
								 | 
							
								       => < list of successes >
							 | 
						||
| 
								 | 
							
								  ( 5 ) [ . ] leach
							 | 
						||
| 
								 | 
							
								       => [ "" 100 ]
							 | 
						||
| 
								 | 
							
								            [ "0" 10 ]
							 | 
						||
| 
								 | 
							
								            [ "00" 1 ]
							 | 
						||
| 
								 | 
							
								            [ "100" 0 ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>Notice how the parse result only contains the number due to &>
							 | 
						||
| 
								 | 
							
								being used to retain the result of the second parser.</p>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<h2>Choice combinator</h2>
							 | 
						||
| 
								 | 
							
								<p>As well as a sequential combinator we need an alternative
							 | 
						||
| 
								 | 
							
								combinator. The word for this is <|>. It takes two parsers from the
							 | 
						||
| 
								 | 
							
								stack and returns a parser that will first try the first parser. If it
							 | 
						||
| 
								 | 
							
								succeeds then the result for that is returned. If it fails then the
							 | 
						||
| 
								 | 
							
								second parser is tried and its result returned.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) "one" token
							 | 
						||
| 
								 | 
							
								        => < parser that parses the text 'one' >
							 | 
						||
| 
								 | 
							
								  ( 2 ) "two" token 
							 | 
						||
| 
								 | 
							
								        => < parser that parses the text 'two' >
							 | 
						||
| 
								 | 
							
								  ( 3 ) <|>
							 | 
						||
| 
								 | 
							
								        => < parser that parses 'one' or 'two' >
							 | 
						||
| 
								 | 
							
								  ( 4 ) "one" over call [ . ] leach
							 | 
						||
| 
								 | 
							
								        => [[ "" "one" ]]
							 | 
						||
| 
								 | 
							
								  ( 5 ) "two" swap call [ . ] leach
							 | 
						||
| 
								 | 
							
								        => [[ "" "two" ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<h2>Option combinator</h2>
							 | 
						||
| 
								 | 
							
								<p>The option combinator, <?> allows adding optional elements to
							 | 
						||
| 
								 | 
							
								a parser. It takes one parser off the stack and if the parse succeeds
							 | 
						||
| 
								 | 
							
								add it to the result tree, otherwise it will ignore it and
							 | 
						||
| 
								 | 
							
								continue. The example below extends our natural-parser to parse
							 | 
						||
| 
								 | 
							
								integers with an optional leading minus sign.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) : integer-parser
							 | 
						||
| 
								 | 
							
								          "-" token <?> natural-parser <&> ;
							 | 
						||
| 
								 | 
							
								  ( 2 ) "200" integer-parser call [ . ] leach 
							 | 
						||
| 
								 | 
							
								       => [ "" [ ] 200 ]
							 | 
						||
| 
								 | 
							
								            [ "0" [ ] 20 ]
							 | 
						||
| 
								 | 
							
								            [ "00" [ ] 2 ]
							 | 
						||
| 
								 | 
							
								            [ "200" [ ] 0 ]
							 | 
						||
| 
								 | 
							
								  ( 3 ) "-200" integer-parser call [ . ] leach
							 | 
						||
| 
								 | 
							
								       => [ "" [ "-" ] 200 ]
							 | 
						||
| 
								 | 
							
								            [ "0" [ "-" ] 20 ]
							 | 
						||
| 
								 | 
							
								            [ "00" [ "-" ] 2 ]
							 | 
						||
| 
								 | 
							
								            [ "200" [ "-" ] 0 ]
							 | 
						||
| 
								 | 
							
								            [ "-200" [ ] 0 ]
							 | 
						||
| 
								 | 
							
								  ( 4 ) : integer-parser2
							 | 
						||
| 
								 | 
							
								          integer-parser [ uncons swap [ car -1 * ] when ] <@ ;
							 | 
						||
| 
								 | 
							
								  ( 5 ) "200" integer-parser2 call [ . ] leach 
							 | 
						||
| 
								 | 
							
								       => [ "" 200 ]
							 | 
						||
| 
								 | 
							
								            [ "0" 20 ]
							 | 
						||
| 
								 | 
							
								            [ "00" 2 ]
							 | 
						||
| 
								 | 
							
								            [ "200" 0 ]
							 | 
						||
| 
								 | 
							
								  ( 6 ) "-200" integer-parser2 call [ . ] leach
							 | 
						||
| 
								 | 
							
								       => [ "" -200 ]
							 | 
						||
| 
								 | 
							
								            [ "0" -20 ]
							 | 
						||
| 
								 | 
							
								            [ "00" -2 ]
							 | 
						||
| 
								 | 
							
								            [ "200" 0 ]
							 | 
						||
| 
								 | 
							
								            [ "-200" 0 ]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								<h2>Skipping Whitespace</h2>
							 | 
						||
| 
								 | 
							
								<p>A parser transformer exists, the word 'sp', that takes an existing
							 | 
						||
| 
								 | 
							
								parser and returns a new one that will first skip any whitespace
							 | 
						||
| 
								 | 
							
								before calling the original parser. This makes it easy to write
							 | 
						||
| 
								 | 
							
								grammers that avoid whitespace without having to explicitly code it
							 | 
						||
| 
								 | 
							
								into the grammar.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) "  123" natural-parser call [ . ] leach
							 | 
						||
| 
								 | 
							
								        => [ "  123" 0 ]
							 | 
						||
| 
								 | 
							
								  ( 2 ) "  123" natural-parser sp call [ . ] leach
							 | 
						||
| 
								 | 
							
								        => [ "" 123 ]
							 | 
						||
| 
								 | 
							
								             [ "3" 12 ]
							 | 
						||
| 
								 | 
							
								             [ "23" 1 ]
							 | 
						||
| 
								 | 
							
								             [ "123" 0 ]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<h2>Eval grammar example</h2>
							 | 
						||
| 
								 | 
							
								<p>This example presents a simple grammar that will parse a number
							 | 
						||
| 
								 | 
							
								followed by an operator and another number. A factor expression that
							 | 
						||
| 
								 | 
							
								computes the entered value will be executed.</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) natural-parser
							 | 
						||
| 
								 | 
							
								        => < a parser for natural numbers >
							 | 
						||
| 
								 | 
							
								  ( 2 ) "/" token "*" token "+" token "-" token <|> <|> <|>
							 | 
						||
| 
								 | 
							
								        => < a parser for the operator >
							 | 
						||
| 
								 | 
							
								  ( 3 ) sp [ "\\ " swap cat2 eval unit ] <@
							 | 
						||
| 
								 | 
							
								        => < operator parser that skips whitespace and converts to a 
							 | 
						||
| 
								 | 
							
								             factor expression >
							 | 
						||
| 
								 | 
							
								  ( 4 ) natural-parser sp
							 | 
						||
| 
								 | 
							
								        => < a whitespace skipping natural parser >
							 | 
						||
| 
								 | 
							
								  ( 5 ) <&> <&> [ uncons uncons swap append append call ] <@
							 | 
						||
| 
								 | 
							
								        => < a parser that parsers the expression, converts it to
							 | 
						||
| 
								 | 
							
								             factor, calls it and puts the result in the parse tree >
							 | 
						||
| 
								 | 
							
								  ( 6 ) "123 + 456" over call lcar .
							 | 
						||
| 
								 | 
							
								        => [[ "" 579 ]]
							 | 
						||
| 
								 | 
							
								  ( 7 ) "300-100" over call lcar .
							 | 
						||
| 
								 | 
							
								        => [[ "" 200 ]]
							 | 
						||
| 
								 | 
							
								  ( 8 ) "200/2" over call lcar .
							 | 
						||
| 
								 | 
							
								        => [[ "" 100 ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p>It looks complicated when expanded as above but the entire parser,
							 | 
						||
| 
								 | 
							
								factored a little, looks quite readable:</p>
							 | 
						||
| 
								 | 
							
								<pre class="code">
							 | 
						||
| 
								 | 
							
								  ( 1 ) : operator ( -- parser )
							 | 
						||
| 
								 | 
							
								          "/" token 
							 | 
						||
| 
								 | 
							
								          "*" token <|>
							 | 
						||
| 
								 | 
							
								          "+" token <|>
							 | 
						||
| 
								 | 
							
								          "-" token <|>
							 | 
						||
| 
								 | 
							
								          [ "\\ " swap cat2 eval unit ] <@ ;
							 | 
						||
| 
								 | 
							
								  ( 2 ) : expression ( -- parser )
							 | 
						||
| 
								 | 
							
								          natural-parser 
							 | 
						||
| 
								 | 
							
								          operator sp <&>  
							 | 
						||
| 
								 | 
							
								          natural-parser sp <&> 
							 | 
						||
| 
								 | 
							
								          [ uncons swap uncons -rot append append reverse call ] <@ ;
							 | 
						||
| 
								 | 
							
								  ( 3 ) "40+2" expression call lcar .
							 | 
						||
| 
								 | 
							
								        => [[ "" 42 ]]
							 | 
						||
| 
								 | 
							
								</pre>
							 | 
						||
| 
								 | 
							
								<p class="footer">
							 | 
						||
| 
								 | 
							
								News and updates to this software can be obtained from the authors
							 | 
						||
| 
								 | 
							
								weblog: <a href="http://radio.weblogs.com/0102385">Chris Double</a>.</p>
							 | 
						||
| 
								 | 
							
								<p id="copyright">Copyright (c) 2004, Chris Double. All Rights Reserved.</p>
							 | 
						||
| 
								 | 
							
								</body> </html>
							 |