USING: help kernel math parser words io ;
ARTICLE: "syntax" "Syntax"
"In Factor, an " { $emphasis "object" } " is a piece of data that can be identified. Code is data, so Factor syntax is actually a syntax for describing objects, of which code is a special case. Factor syntax is read by the parser. The parser performs two kinds of tasks -- it creates objects from their " { $emphasis "printed representations" } ", and it adds " { $emphasis "word definitions" } " to the dictionary (see " { $link "words" } "). The parser can be extended (see " { $link "parser" } ")."
{ $subsection "parser-algorithm" }
{ $subsection "vocabulary-search" }
{ $subsection "syntax-comments" }
{ $subsection "syntax-literals" } ;
ARTICLE: "parser-algorithm" "Parser algorithm"
"At the most abstract level, Factor syntax consists of whitespace-separated tokens. The parser tokenizes the input on whitespace boundaries. The parser is case-sensitive and whitespace between tokens is significant, so the following three expressions tokenize differently:"
{ $code "2X+\n2 X +\n2 x +" }
"As the parser reads tokens it makes a distinction between numbers, ordinary words, and parsing words. Tokens are appended to the parse tree, the top level of which is a quotation returned by the original parser invocation. Nested levels of the parse tree are created by parsing words."
$nl
"The parser iterates through the input text, checking each character in turn. Here is the parser algorithm in more detail -- some of the concepts therein will be defined shortly:"
{ $list
{ "If the current character is a double-quote (\"), the " { $link POSTPONE: " } " parsing word is executed, causing a string to be read." }
{
"Otherwise, the next token is taken from the input. The parser searches for a word named by the token in the currently used set of vocabularies. If the word is found, one of the following two actions is taken:"
{ $list
"If the word is an ordinary word, it is appended to the parse tree."
"If the word is a parsing word, it is executed."
}
}
"Otherwise, if the token does not represent a known word, the parser attempts to parse it as a number. If the token is a number, the number object is added to the parse tree. Otherwise, an error is raised and parsing halts."
}
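"As a rough illustration of these steps, consider the following input line. It is a made-up example, not taken from the parser sources, and it assumes that " { $snippet "print" } " and " { $snippet "+" } " are in the vocabulary search path:"
{ $code "\"hello\" print 10 HEX: ff +" }
"Here the string is read by the " { $link POSTPONE: " } " parsing word; " { $snippet "print" } " and " { $snippet "+" } " are found as ordinary words and appended to the parse tree; " { $snippet "10" } " does not name a known word and is therefore parsed as a number; and " { $link POSTPONE: HEX: } " is a parsing word, so it executes immediately and reads the token after it itself."
$nl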
"Parsing words play a key role in parsing; while ordinary words and numbers are simply added to the parse tree, parsing words execute in the context of the parser, and can do their own parsing and create nested data structures in the parse tree. Parsing words are also able to define new words."
$nl
"While parsing words supporting arbitrary syntax can be defined, the default set is found in the " { $vocab-link "syntax" } " vocabulary and provides the basis for all further syntactic interaction with Factor." ;
ARTICLE: "vocabulary-search" "Vocabulary search"
"A " { $emphasis "word" } " is a code definition identified by a name. Words are sorted into " { $emphasis "vocabularies" } ". Words are discussed in depth in " { $link "words" } "."
$nl
"When the parser reads a token, it attempts to look up a word named by that token. The lookup is performed by searching each vocabulary in the search path, in order."
$nl
"Because the parser looks up each word as soon as its token is read, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. Use the " { $link POSTPONE: DEFER: } " parsing word to get around this limitation, for example when defining mutually recursive words."
$nl
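"A minimal sketch of such a forward declaration follows. The words " { $snippet "my-even?" } " and " { $snippet "my-odd?" } " are hypothetical and not part of the library; the sketch assumes non-negative integer input and that " { $snippet "zero?" } ", " { $snippet "1-" } " and " { $snippet "if" } " are in the search path:"
{ $code
    "DEFER: my-odd?"
    ""
    ": my-even? ( n -- ? )"
    "    #! Zero is even; otherwise n is even if n-1 is odd."
    "    dup zero? [ drop t ] [ 1- my-odd? ] if ;"
    ""
    ": my-odd? ( n -- ? )"
    "    #! Zero is not odd; otherwise n is odd if n-1 is even."
    "    dup zero? [ drop f ] [ 1- my-even? ] if ;"
}
"Without the first line, the reference to " { $snippet "my-odd?" } " inside " { $snippet "my-even?" } " would name a word that has not been defined yet."
$nl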
"For a source file the vocabulary search path starts off with two vocabularies:"
{ $code "syntax\nscratchpad" }
"The " { $vocab-link "syntax" } " vocabulary consists of a set of parsing words for reading Factor data and defining new words. The " { $vocab-link "scratchpad" } " vocabulary is the default vocabulary for new word definitions."
$nl
"At the interactive listener, the default search path contains many more vocabularies. Details on the default search path and parser invocation are found in " { $link "parser" } "."
$nl
"Three parsing words deal with the vocabulary search path:"
{ $subsection POSTPONE: USE: }
{ $subsection POSTPONE: USING: }
{ $subsection POSTPONE: IN: }
{ $subsection "vocabulary-search-shadow" } ;
ARTICLE: "vocabulary-search-shadow" "Shadowing word names"
"If adding a vocabulary to the search path results in a word in another vocabulary becoming inaccessible due to the new vocabulary defining a word with the same name, a message is printed to the " { $link stdio } " stream. Except when debugging suspected name clashes, these messages can be ignored."
$nl
"Here is an example where shadowing occurs:"
{ $code
"IN: foe"
"USING: sequences io ;"
""
": append"
" #! Prints a message, then calls sequences::append."
" \"foe::append calls sequences::append\" print append ;"
""
"IN: fee"
""
": append"
" #! Infinite recursion! Calls fee::append."
" \"fee::append calls fee::append\" print append ;"
""
"USE: foe"
""
": append"
" #! Redefining fee::append to call foe::append."
" \"fee::append calls foe::append\" print append ;"
""
"\"1234\" \"5678\" append print"
}
"When placed in a source file and run, the above code produces the following output:"
{ $code
"fee::append calls foe::append"
"foe::append calls sequences::append"
"12345678"
} ;
ARTICLE: "syntax-comments" "Comments"
{ $subsection POSTPONE: ! }
{ $subsection POSTPONE: #! } ;
ARTICLE: "syntax-literals" "Literals"
"Many different types of objects can be constructed at parse time via literal syntax. Numbers are a special case since support for reading them is built into the parser. All other literals are constructed via parsing words."
$nl
"If a quotation contains a literal object, the same literal object instance is used each time the quotation executes; that is, literals are ``live''."
$nl
"Using mutable object literals in word definitions requires care: if such an object is mutated, the word definition itself changes, which in most cases is not what you would expect."
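$nl
"The following sketch shows the pitfall. The words " { $snippet "shared-vector" } " and " { $snippet "fresh-vector" } " are hypothetical, and the sketch assumes that " { $snippet "push" } " has the stack effect " { $snippet "( elt seq -- )" } " and that " { $snippet "clone" } " copies its argument:"
{ $code
    ": shared-vector ( -- vector ) V{ } ;"
    ""
    "! This mutates the literal stored inside the definition itself."
    "1 shared-vector push"
    ""
    "! Later calls return the same vector, which is no longer empty."
    "shared-vector ."
    ""
    "! Cloning the literal hands each caller a fresh copy instead."
    ": fresh-vector ( -- vector ) V{ } clone ;"
}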
{ $subsection "syntax-numbers" }
{ $subsection "syntax-words" }
{ $subsection "syntax-booleans" }
{ $subsection "syntax-quots" }
{ $subsection "syntax-arrays" }
{ $subsection "syntax-vectors" }
{ $subsection "syntax-strings" }
{ $subsection "syntax-sbufs" }
{ $subsection "syntax-byte-arrays" }
{ $subsection "syntax-bit-arrays" }
{ $subsection "syntax-hashtables" }
{ $subsection "syntax-tuples" }
{ $subsection "syntax-aliens" } ;
ARTICLE: "syntax-numbers" "Number syntax"
"If a vocabulary lookup of a token fails, the parser attempts to parse it as a number."
{ $subsection "syntax-integers" }
{ $subsection "syntax-ratios" }
{ $subsection "syntax-floats" }
{ $subsection "syntax-complex-numbers" } ;
ARTICLE: "syntax-integers" "Integer syntax"
"The printed representation of an integer consists of a sequence of digits, optionally prefixed by a sign."
{ $code
"123456"
"-10"
"2432902008176640000"
}
"Integers are entered in base 10 unless prefixed with a base change parsing word."
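" For instance, assuming each base change word reads the single token that follows it, every line in this sketch denotes the same integer as the decimal value in its comment:"
{ $code
    "BIN: 1010  ! 10"
    "OCT: 17    ! 15"
    "HEX: ff    ! 255"
}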
{ $subsection POSTPONE: BIN: }
{ $subsection POSTPONE: OCT: }
{ $subsection POSTPONE: HEX: }
"More information on integers can be found in " { $link "integers" } "." ;
ARTICLE: "syntax-ratios" "Ratio syntax"
"The printed representation of a ratio is a pair of integers separated by a slash (/). No intermediate whitespace is permitted. Either integer may be signed; however, the ratio is normalized into a form where the denominator is positive and the greatest common divisor of the two terms is 1."
{ $code
"75/33"
"1/10"
"-5/-6"
}
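"Applying the normalization rule to these three literals by hand (a worked illustration, not captured parser output) gives the following values:"
{ $code
    "75/33  ! 25/11, after dividing both terms by their GCD of 3"
    "1/10   ! 1/10, already in lowest terms"
    "-5/-6  ! 5/6, since the denominator is made positive"
}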
"More information on ratios can be found in " { $link "rationals" } "." ;
ARTICLE: "syntax-floats" "Float syntax"
"Floating point literals must contain a decimal point, and may contain an exponent:"
{ $code
"10.5"
"-3.1456"
"7.e13"
"1.0e-5"
}
"More information on floats can be found in " { $link "floats" } "." ;
ARTICLE: "syntax-complex-numbers" "Complex number syntax"
"A complex number is given by two components, a ``real'' part and an ``imaginary'' part. The components must be integers, ratios, or floats."
{ $code
"C{ 1/2 1/3 } ! the complex number 1/2+1/3i"
"C{ 0 1 } ! the imaginary unit"
}
{ $subsection POSTPONE: C{ }
"More information on complex numbers can be found in " { $link "complex-numbers" } "." ;
ARTICLE: "syntax-words" "Word syntax"
"A word occurring inside a quotation is executed when the quotation is called. Sometimes a word needs to be pushed on the data stack instead. The canonical use-case for this is passing the word to the " { $link execute } " combinator, or alternatively, reflectively accessing word properties (" { $link "word-props" } ")."
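$nl
"A small sketch of the first use case, assuming " { $snippet "+" } " from the " { $vocab-link "math" } " vocabulary is in the search path. Both lines leave the same sum on the data stack:"
{ $code
    "2 3 +            ! the word is executed as soon as it is reached"
    "2 3 \\ + execute  ! the word is pushed first, then executed explicitly"
}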
{ $subsection POSTPONE: \ }
{ $subsection POSTPONE: POSTPONE: }
"The implementation of the " { $link POSTPONE: \ } " word is discussed in detail in " { $link "reading-ahead" } ". Words are documented in " { $link "words" } "." ;
ARTICLE: "syntax-booleans" "Boolean syntax"
"Any Factor object may be used as a truth value in a conditional expression. The " { $link f } " object is false and anything else is true. The " { $link f } " object is also used to represent the empty list, as well as the concept of a missing value. The canonical truth value is the " { $link t } " object."
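$nl
"For example, since only " { $link f } " is false, the integer zero counts as true. This is a sketch that assumes " { $snippet "if" } " and " { $snippet "print" } " are in the search path:"
{ $code
    "f [ \"true\" ] [ \"false\" ] if print  ! prints false"
    "0 [ \"true\" ] [ \"false\" ] if print  ! prints true"
    "t [ \"true\" ] [ \"false\" ] if print  ! prints true"
}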
{ $subsection POSTPONE: f }
{ $subsection t } ;
ARTICLE: "syntax-strings" "Character and string syntax"
"Factor has no distinct character type; however, Unicode character value integers can be read by specifying a literal character, or an escaped representation thereof."
{ $subsection POSTPONE: CHAR: }
{ $subsection POSTPONE: " }
{ $subsection "escape" }
"Strings are documented in " { $link "strings" } "." ;
ARTICLE: "escape" "Character escape codes"
{ $table
{ "Escape code" "Meaning" }
{ { $snippet "\\\\" } { $snippet "\\" } }
{ { $snippet "\\s" } "a space" }
{ { $snippet "\\t" } "a tab" }
{ { $snippet "\\n" } "a newline" }
{ { $snippet "\\r" } "a carriage return" }
{ { $snippet "\\0" } "a null byte (ASCII 0)" }
{ { $snippet "\\e" } "escape (ASCII 27)" }
{ { $snippet "\\\"" } { $snippet "\"" } }
}
"A Unicode character can be specified by its code number by writing " { $snippet "\\u" } " followed by a four-digit hexadecimal number. That is, the following two expressions are equivalent:"
{ $code
"CHAR: \\u0078"
"HEX: 78"
}
"While not useful for single characters, this syntax is also permitted inside strings." ;
ARTICLE: "syntax-sbufs" "String buffer syntax"
{ $subsection POSTPONE: SBUF" }
"String buffers are documented in " { $link "sbufs" } "." ;
ARTICLE: "syntax-arrays" "Array syntax"
{ $subsection POSTPONE: { }
{ $subsection POSTPONE: } }
"Arrays are documented in " { $link "arrays" } "." ;
ARTICLE: "syntax-vectors" "Vector syntax"
{ $subsection POSTPONE: V{ }
"Vectors are documented in " { $link "vectors" } "." ;
ARTICLE: "syntax-hashtables" "Hashtable syntax"
{ $subsection POSTPONE: H{ }
"Hashtables are documented in " { $link "hashtables" } "." ;
ARTICLE: "syntax-tuples" "Tuple syntax"
{ $subsection POSTPONE: T{ }
"Tuples are documented in " { $link "tuples" } "." ;
ARTICLE: "syntax-quots" "Quotation syntax"
{ $subsection POSTPONE: [ }
{ $subsection POSTPONE: ] }
"Quotations are documented in " { $link "quotations" } "." ;
ARTICLE: "syntax-bit-arrays" "Bit array syntax"
{ $subsection POSTPONE: ?{ }
"Bit arrays are documented in " { $link "bit-arrays" } "." ;
ARTICLE: "syntax-byte-arrays" "Byte array syntax"
{ $subsection POSTPONE: B{ }
"Byte arrays are documented in " { $link "byte-arrays" } "." ;
ARTICLE: "syntax-aliens" "Alien object syntax"
"These literal forms mainly exist for print-outs, and should not be input unless you know what you are doing."
{ $subsection POSTPONE: DLL" }
{ $subsection POSTPONE: ALIEN: }
"The alien interface is documented in " { $link "alien" } "." ;