"At the most abstract level, Factor syntax consists of whitespace-separated tokens. The parser tokenizes the input on whitespace boundaries. The parser is case-sensitive and whitespace between tokens is significant, so the following three expressions tokenize differently:"
"As the parser reads tokens it makes a distinction between numbers, ordinary words, and parsing words. Tokens are appended to the parse tree, the top level of which is a quotation returned by the original parser invocation. Nested levels of the parse tree are created by parsing words."
$nl
"The parser iterates through the input text, checking each character in turn. Here is the parser algorithm in more detail -- some of the concepts therein will be defined shortly:"
{ $list
{ "If the current character is a double-quote (\"), the " { $link POSTPONE:" } " parsing word is executed, causing a string to be read." }
{
"Otherwise, the next token is taken from the input. The parser searches for a word named by the token in the currently used set of vocabularies. If the word is found, one of the following two actions is taken:"
{ $list
"If the word is an ordinary word, it is appended to the parse tree."
"If the word is a parsing word, it is executed."
}
}
"Otherwise if the token does not represent a known word, the parser attempts to parse it as a number. If the token is a number, the number object is added to the parse tree. Otherwise, an error is raised and parsing halts."
}
"Parsing words play a key role in parsing; while ordinary words and numbers are simply added to the parse tree, parsing words execute in the context of the parser, and can do their own parsing and create nested data structures in the parse tree. Parsing words are also able to define new words."
$nl
"While parsing words supporting arbitrary syntax can be defined, the default set is found in the " { $vocab-link "syntax" } " vocabulary and provides the basis for all further syntactic interaction with Factor.";
ARTICLE: "syntax-immediate""Parse time evaluation"
"Code can be evaluated at parse time. This is a rarely-used feature; one use-case is " { $link "loading-libs" } ", where you want to execute some code before the words in a source file are compiled."
"Integers are entered in base 10 unless prefixed with a base-changing prefix. " { $snippet "0x" } " begins a hexadecimal literal, " { $snippet "0o" } " an octal literal, and " { $snippet "0b" } " a binary literal. A sign, if any, goes before the base prefix."
"The printed representation of a ratio is a pair of integers separated by a slash (" { $snippet "/" } "). A ratio can also be written as a proper fraction by following an integer part with " { $snippet "+" } " or " { $snippet "-" } " (matching the sign of the integer) and a ratio. No intermediate whitespace is permitted within a ratio literal. Here are some examples:"
"Floating point literals are specified when a literal number contains a decimal point or exponent. Exponents are marked by an " { $snippet "e" } " or " { $snippet "E" } ":"
"Hexadecimal float literals are also supported. These consist of a hexadecimal literal with a decimal point and an optional base-two exponent expressed as a decimal number after " { $snippet "p" } " or " { $snippet "P" } ":"
"The normalized hex form " { $snippet "±0x1.MMMMMMMMMMMMMp±EEEE" } " allows any floating-point number to be specified precisely. The values of MMMMMMMMMMMMM and EEEE map directly to the mantissa and exponent fields of the binary IEEE 754 representation."
"A word occurring inside a quotation is executed when the quotation is called. Sometimes a word needs to be pushed on the data stack instead. The canonical use case for this is passing the word to the " { $link execute } " combinator, or alternatively, reflectively accessing word properties (" { $link "word-props" } ")."
"The implementation of the " { $link POSTPONE:\ } " word is discussed in detail in " { $link "reading-ahead" } ". Words are documented in " { $link "words" } ".";
"Factor has no distinct character type. Integers representing Unicode code points can be read by specifying a literal character, or an escaped representation thereof."
"Note that this is " { $emphasis "not" } " syntax to declare stack effects of words. This pushes an " { $link effect } " instance on the stack for reflection, for use with words such as " { $link define-declared } ", " { $link call-effect } " and " { $link execute-effect } "."
"Many different types of objects can be constructed at parse time via literal syntax. Numbers are a special case since support for reading them is built-in to the parser. All other literals are constructed via parsing words."
"Using mutable object literals in word definitions requires care, since if those objects are mutated, the actual word definition will be changed, which is in most cases not what you would expect. Literals should be " { $link clone } "d before being passed to a word which may potentially mutate them."
"Factor has two main forms of syntax: " { $emphasis "definition" } " syntax and " { $emphasis "literal" } " syntax. Code is data, so the syntax for code is a special case of object literal syntax. This section documents literal syntax. Definition syntax is covered in " { $link "words" } ". Extending the parser is the main topic of " { $link "parser" } "."
{ $description "Declares the most recently defined word as a delimiter. Delimiters are words which are only ever valid as the end of a nested block to be read by " { $link parse-until } ". An unpaired occurrence of a delimiter is a parse error." } ;
{ $description "Declares the most recently defined word as deprecated. If the " { $vocab-link "tools.deprecation" } " vocabulary is loaded, usages of deprecated words will be noted by the " { $link "tools.errors" } " system." }
{ $notes "Code that uses deprecated words continues to function normally; the errors are purely informational. However, code that uses deprecated words should be updated, for the deprecated words are intended to be removed soon." } ;
{ $examples "In the below example, the " { $snippet "world" } " word is never called, however its body references a parsing word which executes immediately:" { $example "USE: io""IN: scratchpad""<< SYNTAX: HELLO \"Hello parser!\" print ; >>\n: world ( -- ) HELLO ;""Hello parser!" } } ;
"Combinators must be inlined in order to compile with the optimizing compiler - see " { $link "inference-combinators" } ". For any other word, inlining is merely an optimization. Note that inlined words that can be compiled stand-alone are also, themselves, compiled by the optimizing compiler."
{ $description "Declares the most recently defined word as a recursive word." }
{ $notes "This declaration is only required for " { $link POSTPONE:inline } " words which call themselves. See " { $link "inference-recursive-combinators" } "." } ;
"Declares that the most recently defined word may be evaluated at compile-time if all inputs are literal. Foldable words must satisfy a very strong contract:"
{ $list
"foldable words must not have any observable side effects,"
"foldable words must halt - for example, a word computing a series until it coverges should not be foldable, since compilation will not halt in the event the series does not converge."
"both inputs and outputs of foldable words must be immutable."
}
"The last restriction ensures that words such as " { $link clone } " do not satisfy the foldable word contract. Indeed, " { $link clone } " will output a mutable object if its input is mutable, and so it is undesirable to evaluate it at compile-time, since doing so would give incorrect semantics for code that clones mutable objects and proceeds to mutate them."
"Folding optimizations are not applied if the call site of a word is in the same source file as the word. This is a side-effect of the compilation unit system; see " { $link "compilation-units" } "."
{ $examples "Most operations on numbers are foldable. For example, " { $snippet "2 2 +" } " compiles to a literal 4, since " { $link + } " is declared foldable." } ;
HELP:flushable
{ $syntax ": foo ... ; flushable" }
{ $description
"Declares that the most recently defined word has no side effects, and thus calls to this word may be pruned by the compiler if the outputs are not used."
$nl
"Note that many words are flushable but not foldable, for example " { $link clone } " and " { $link <array> } "."
{ $description "The " { $link f } " parsing word adds the " { $link f } " object to the parse tree, and is also the class whose sole instance is the " { $link f } " object. The " { $link f } " object is the singleton false value, the only object that is not true. The " { $link f } " object is not equal to the " { $link f } " class word, which can be pushed on the stack using word wrapper syntax:"
{ $code "f ! the singleton f object denoting falsity\n\\ f ! the f class word" } } ;
HELP:[
{ $syntax "[ elements... ]" }
{ $description "Marks the beginning of a literal quotation." }
{ $examples { $code "[ 1 2 3 ]" } } ;
{ POSTPONE:[POSTPONE:] } related-words
HELP:]
{ $syntax "]" }
{ $description "Marks the end of a literal quotation."
$nl
"Parsing words can use this word as a generic end delimiter." } ;
HELP:}
{ $syntax "}" }
{ $description "Marks the end of an array, vector, hashtable, complex number, tuple, or wrapper."
$nl
"Parsing words can use this word as a generic end delimiter." } ;
{ $description "Marks the beginning of a literal hashtable, given as a list of two-element arrays holding key/value pairs. Literal hashtables are terminated by " { $link POSTPONE:} } "." }
{ $description "Marks the beginning of a literal hash set, given as a list of its members. Literal hashtables are terminated by " { $link POSTPONE:} } "." }
{ $description "Parses a complex number given in rectangular form as a pair of real numbers. Literal complex numbers are terminated by " { $link POSTPONE:} } "." } ;
{ "empty tuple form: if no slot values are specified, then the literal tuple will have all slots set to their initial values (see " { $link "slot-initial-values" } ")." }
{ "BOA-form: if the first element of " { $snippet "slots" } " is " { $snippet "f" } ", then the remaining elements are slot values corresponding to slots in the order in which they are defined in the " { $link POSTPONE:TUPLE: } " form." }
{ "assoc-form: otherwise, " { $snippet "slots" } " is interpreted as a sequence of " { $snippet "{ slot-name value }" } " pairs. The " { $snippet "slot-name" } " should not be quoted." }
}
"BOA form is more concise, whereas assoc form is more readable for larger tuples with many slots, or if only a few slots are to be specified."
$nl
"With BOA form, specifying an insufficient number of values is given after the class word, the remaining slots of the tuple are set to their initial values (see " { $link "slot-initial-values" } "). If too many values are given, an error will be raised." }
{ $examples
"An empty tuple; since vectors have their own literal syntax, the above is equivalent to " { $snippet "V{ }" } ""
{ $code "T{ vector }" }
"A BOA-form tuple:"
{ $code
"USE: colors"
"T{ rgba f 1.0 0.0 0.5 }"
}
"An assoc-form tuple equal to the above:"
{ $code
"USE: colors"
"T{ rgba { red 1.0 } { green 0.0 } { blue 0.5 } }"
{ $description "Marks the beginning of a literal wrapper. Literal wrappers are terminated by " { $link POSTPONE:} } "." } ;
HELP:POSTPONE:
{ $syntax "POSTPONE: word" }
{ $values { "word""a word" } }
{ $description "Reads the next word from the input string and appends the word to the parse tree, even if it is a parsing word." }
{ $examples "For an ordinary word " { $snippet "foo" } ", " { $snippet "foo" } " and " { $snippet "POSTPONE: foo" } " are equivalent; however, if " { $snippet "foo" } " is a parsing word, the former will execute it at parse time, while the latter will execute it at runtime." }
{ $notes "This word is used inside parsing words to delegate further action to another parsing word, and to refer to parsing words literally from literal arrays and such." } ;
"Parsing words can use this word as a generic end delimiter."
} ;
HELP:SYMBOL:
{ $syntax "SYMBOL: word" }
{ $values { "word""a new word to define" } }
{ $description "Defines a new symbol word in the current vocabulary. Symbols push themselves on the stack when executed, and are used to identify variables (see " { $link "namespaces" } ") as well as for storing crufties in word properties (see " { $link "word-props" } ")." }
{ $description "Reads the next word from the input and appends a wrapper holding the word to the parse tree. When the evaluator encounters a wrapper, it pushes the wrapped word literally on the data stack." }
{ $examples "The following two lines are equivalent:" { $code "0 \\ <vector> execute\n0 <vector>" } "If " { $snippet "foo" } " is a symbol, the following two lines are equivalent:" { $code "foo""\\ foo" } } ;
{ $description "Create a word in the current vocabulary that simply raises an error when executed. Usually, the word will be replaced with a real definition later." }
{ $notes "Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. Mutually-recursive pairs of words can be implemented by " { $emphasis "deferring" } " one of the words in the pair allowing the second word in the pair to parse, then by defining the first word." }
{ $examples { $code "DEFER: foe\n: fie ... foe ... ;\n: foe ... fie ... ;" } } ;
HELP:FORGET:
{ $syntax "FORGET: word" }
{ $values { "word""a word" } }
{ $description "Removes the word from its vocabulary, or does nothing if no such word exists. Existing definitions that reference forgotten words will continue to work, but new occurrences of the word will not parse." } ;
{ $notes "If adding the vocabulary introduces ambiguity, the vocabulary will take precedence when resolving any ambiguous names. This is a rare case; for example, suppose a vocabulary " { $snippet "fish" } " defines a word named " { $snippet "go:fishing" } ", and a vocabulary named " { $snippet "go" } " defines a word named " { $snippet "fishing" } ". Then, the following will call the latter word:"
{ $description "Adds " { $snippet "words" } " from " { $snippet "vocab" } " to the search path." }
{ $notes "If adding the words introduces ambiguity, the words will take precedence when resolving any ambiguous names." }
{ $examples
"Both the " { $vocab-link "vocabs.parser" } " and " { $vocab-link "binary-search" } " vocabularies define a word named " { $snippet "search" } ". The following will throw an " { $link ambiguous-use-error } ":"
{ $values { "vocabulary""a new vocabulary name" } }
{ $description "Sets the current vocabulary where new words will be defined, creating the vocabulary first if it does not exist. After the vocabulary has been created, it can be listed in " { $link POSTPONE:USE: } " and " { $link POSTPONE:USING: } " declarations." } ;
{ $description "Reads from the input string until the next occurrence of " { $snippet "\"" } " or " { $snippet "\"\"\"" } ", and appends the resulting string to the parse tree. String literals can span multiple lines. Various special characters can be read by inserting " { $link "escape" } ". For triple quoted strings, the double-quote character does not require escaping." }
{ $values { "string""literal and escaped characters" } }
{ $description "Reads from the input string until the next occurrence of " { $link POSTPONE:" } ", converts the string to a string buffer, and appends it to the parse tree." }
{ $description "Reads from the input string until the next occurrence of " { $link POSTPONE:" } ", creates a new " { $link pathname } ", and appends it to the parse tree. Pathnames presented in the UI are clickable, which opens them in a text editor configured with " { $link "editor" } "." }
{ $description "Literal stack effect syntax. Also used by syntax words (such as " { $link POSTPONE:: } "), typically declaring the stack effect of the word definition which follows." }
{ $description "Discards all input until the end of the line." }
{ $notes "To allow Unix-style \"shebang\" scripts to work as expected, " { $snippet "#!" } " is parsed as a separate token regardless of following whitespace if it appears at the beginning of a line."
{ $description "Defines a new generic word in the current vocabulary. Initially, it contains no methods, and thus will throw a " { $link no-method } " error when called." } ;
{ $description "Defines a new generic word which dispatches on the " { $snippet "n" } "th most element from the top of the stack in the current vocabulary. Initially, it contains no methods, and thus will throw a " { $link no-method } " error when called." }
{ $values { "word""a new word to define" } { "variable" word } }
{ $description "Defines a new hook word in the current vocabulary. Hook words are generic words which dispatch on the value of a variable, so methods are defined with " { $link POSTPONE:M: } ". Hook words differ from other generic words in that the dispatch value is removed from the stack before the chosen method is called." }
{ $values { "class""a new class word to define" } }
{ $description "Defines a mixin class. A mixin is similar to a union class, except it has no members initially, and new members can be added with the " { $link POSTPONE:INSTANCE: } " word." }
{ $examples "The " { $link sequence } " and " { $link assoc } " mixin classes." } ;
{ $syntax "PREDICATE: class < superclass predicate... ;" }
{ $values { "class""a new class word to define" } { "superclass""an existing class word" } { "predicate""membership test with stack effect " { $snippet "( superclass -- ? )" } } }
"Defines a predicate class deriving from " { $snippet "superclass" } "."
$nl
"An object is an instance of a predicate class if two conditions hold:"
{ $list
"it is an instance of the predicate's superclass,"
"it satisfies the predicate"
}
"Each predicate must be defined as a subclass of some other class. This ensures that predicates inheriting from disjoint classes do not need to be exhaustively tested during method dispatch."
"Slot attributes are lists of slot attribute specifiers followed by values; a slot attribute specifier is one of " { $link initial: } " or " { $link read-only } ". See " { $link "tuple-declarations" } " for details." }
{ $description "Declares the most recently defined word as a final tuple class which cannot be subclassed. Attempting to subclass a final class raises a " { $link bad-superclass } " error." } ;
{ $description "Defines a tuple slot to be read-only. If a tuple has read-only slots, instances of the tuple should only be created by calling " { $link boa } ", instead of " { $link new } ". Using " { $link boa } " is the only way to set the value of a read-only slot." } ;
{ $description "Defines a protocol slot; that is, defines the accessor words for a slot named " { $snippet "slot" } " without associating it with any specific tuple." } ;
"In both cases, a word " { $snippet "<color>" } " is defined, which reads three values from the stack and creates a " { $snippet "color" } " instance having these values in the " { $snippet "red" } ", " { $snippet "green" } " and " { $snippet "blue" } " slots, respectively."
{ $description "Defines the main entry point for the current vocabulary and source file. This word will be executed when this vocabulary is passed to " { $link run } " or the source file is passed to " { $link run-script } "." } ;
{ $description "Begins a block of private word definitions. Private word definitions are placed in the current vocabulary name, suffixed with " { $snippet ".private" } "." }
{ $description "Evaluates some code at parse time." }
{ $notes "Calling words defined in the same source file at parse time is prohibited; see compilation unit as where it was defined; see " { $link "compilation-units" } "." } ;
HELP:>>
{ $syntax ">>" }
{ $description "Marks the end of a parse time code block." } ;
{ $description "Calls the next applicable method. Only valid inside a method definition. The values at the top of the stack are passed on to the next method, and they must be compatible with that method's class specializer." }
{ $notes "This word looks like an ordinary word but it is a parsing word. It cannot be factored out of a method definition, since the code expansion references the current method object directly." }
"Throws a " { $link no-next-method } " error if this is the least specific method, and throws an " { $link inconsistent-next-method } " error if the values at the top of the stack are not compatible with the current method's specializer."
{ $description "Calls the quotation on the top of the stack, asserting that it has the given stack effect. The quotation does not need to be known at compile time." }
{ $description "Calls the word on the top of the stack, asserting that it has the given stack effect. The word does not need to be known at compile time." }