factor/basis/xml/traversal/traversal-docs.factor

! Copyright (C) 2005, 2009 Daniel Ehrenberg
! See http://factorcode.org/license.txt for BSD license.
USING: help.markup help.syntax xml.data sequences strings multiline ;
IN: xml.traversal

ABOUT: "xml.traversal"

ARTICLE: "xml.traversal" "Utilities for traversing XML"
    "The " { $vocab-link "xml.traversal" } " vocabulary provides utilities for traversing an XML DOM tree and viewing the contents of a single tag. The following words are defined:"
    $nl
    { $subsection { "xml.traversal" "intro" } }
    { $subsection tag-named }
    { $subsection tags-named }
    { $subsection deep-tag-named }
    { $subsection deep-tags-named }
    { $subsection get-id }
    "To get at the contents of a single tag, use"
    { $subsection children>string }
    { $subsection children-tags }
    { $subsection first-child-tag }
    { $subsection assert-tag } ;

ARTICLE: { "xml.traversal" "intro" } "An example of XML processing"
"To illustrate how to use the XML library, we develop a simple Atom parser in Factor. Atom is an XML-based syndication format, like RSS. To see the full version of what we develop here, look at " { $snippet "basis/syndication" } " at the " { $snippet "atom1.0" } " word. First, we want to load a file and get a DOM tree for it."
{ $code <" "file.xml" file>xml "> }
"No encoding descriptor is needed, because XML files contain sufficient information to auto-detect the encoding. Next, we want to extract information from the tree. To get the title, we can use the following:"
{ $code <" "title" tag-named children>string "> }
"The " { $link tag-named } " word finds the first tag named " { $snippet "title" } " in the top level (just under the main tag). Then, with a tag on the stack, its children are asserted to be a string, and the string is returned." $nl
"For a slightly more complicated example, we can look at how entries are parsed. To get a sequence of tags with the name " { $snippet "entry" } ":"
{ $code <" "entry" tags-named "> }
"Imagine that, for each of these, we want to get the URL of the entry. In Atom, the URLs are in a " { $snippet "link" } " tag which is contained in the " { $snippet "entry" } " tag. There are multiple " { $snippet "link" } " tags, but one of them contains the attribute " { $snippet "rel=alternate" } ", and the " { $snippet "href" } " attribute has the URL. So, given an element of the sequence produced in the above quotation, we run the code:"
{ $code <" "link" tags-named [ "rel" attr "alternate" = ] find nip "> }
"to get the link tag on the stack, and"
{ $code <" "href" attr >url "> }
"to extract the URL from it." ;

HELP: deep-tag-named
{ $values { "tag" "an XML tag or document" } { "name/string" "an XML name or string representing a name" } { "matching-tag" tag } }
{ $description "Finds an XML tag with a matching name, recursively searching children and children of children." }
{ $see-also tags-named tag-named deep-tags-named } ;

HELP: deep-tags-named
{ $values { "tag" "an XML tag or document" } { "name/string" "an XML name or string representing a name" } { "tags-seq" "a sequence of tags" } }
{ $description "Returns a sequence of all tags of a matching name, recursively searching children and children of children." }
{ $see-also tag-named deep-tag-named tags-named } ;

HELP: children>string
{ $values { "tag" "an XML tag or document" } { "string" "a string" } }
{ $description "Concatenates the children of the tag, throwing an exception when there is a non-string child." } ;

HELP: children-tags
{ $values { "tag" "an XML tag or document" } { "sequence" sequence } }
{ $description "Gets the children of the tag that are themselves tags." }
{ $see-also first-child-tag } ;

HELP: first-child-tag
{ $values { "tag" "an XML tag or document" } { "tag" tag } }
{ $description "Returns the first child of the given tag that is a tag." }
{ $see-also children-tags } ;

HELP: tag-named
{ $values { "tag" "an XML tag or document" }
    { "name/string" "an XML name or string representing the name" }
    { "matching-tag" tag } }
{ $description "Finds the first tag with matching name which is the direct child of the given tag." }
{ $see-also deep-tags-named deep-tag-named tags-named } ;

HELP: tags-named
{ $values { "tag" "an XML tag or document" }
    { "name/string" "an XML name or string representing the name" }
    { "tags-seq" "a sequence of tags" } }
{ $description "Finds all tags with matching name that are the direct children of the given tag." }
{ $see-also deep-tag-named deep-tags-named tag-named } ;

HELP: get-id
{ $values { "tag" "an XML tag or document" } { "id" "a string" } { "elem" "an XML element or f" } }
{ $description "Finds the XML tag with the specified id, ignoring the namespace." } ;
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`! Copyright (C) 2005, 2009 Daniel Ehrenberg`
			`! See http://factorcode.org/license.txt for BSD license.`
More expository XML docs 2009-03-16 23:29:38 -04:00			`USING: help.markup help.syntax xml.data sequences strings multiline ;`
Moving XML vocabularies around 2009-02-05 22:17:03 -05:00			`IN: xml.traversal`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00
Moving XML vocabularies around 2009-02-05 22:17:03 -05:00			`ABOUT: "xml.traversal"`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00
Moving XML vocabularies around 2009-02-05 22:17:03 -05:00			`ARTICLE: "xml.traversal" "Utilities for traversing XML"`
			`"The " { $vocab-link "xml.traversal" } " vocabulary provides utilities for traversing an XML DOM tree and viewing the contents of a single tag. The following words are defined:"`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`$nl`
More expository XML docs 2009-03-16 23:29:38 -04:00			`{ $subsection { "xml.traversal" "intro" } }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $subsection tag-named }`
			`{ $subsection tags-named }`
			`{ $subsection deep-tag-named }`
			`{ $subsection deep-tags-named }`
			`{ $subsection get-id }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`"To get at the contents of a single tag, use"`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $subsection children>string }`
			`{ $subsection children-tags }`
			`{ $subsection first-child-tag }`
			`{ $subsection assert-tag } ;`

More expository XML docs 2009-03-16 23:29:38 -04:00			`ARTICLE: { "xml.traversal" "intro" } "An example of XML processing"`
			`"To illustrate how to use the XML library, we develop a simple Atom parser in Factor. Atom is an XML-based syndication format, like RSS. To see the full version of what we develop here, look at " { $snippet "basis/syndication" } " at the " { $snippet "atom1.0" } " word. First, we want to load a file and get a DOM tree for it."`
			`{ $code <" "file.xml" file>xml "> }`
			`"No encoding descriptor is needed, because XML files contain sufficient information to auto-detect the encoding. Next, we want to extract information from the tree. To get the title, we can use the following:"`
			`{ $code <" "title" tag-named children>string "> }`
			`"The " { $link tag-named } " word finds the first tag named " { $snippet "title" } " in the top level (just under the main tag). Then, with a tag on the stack, its children are asserted to be a string, and the string is returned." $nl`
			`"For a slightly more complicated example, we can look at how entries are parsed. To get a sequence of tags with the name " { $snippet "entry" } ":"`
			`{ $code <" "entry" tags-named "> }`
			`"Imagine that, for each of these, we want to get the URL of the entry. In Atom, the URLs are in a " { $snippet "link" } " tag which is contained in the " { $snippet "entry" } " tag. There are multiple " { $snippet "link" } " tags, but one of them contains the attribute " { $snippet "rel=alternate" } ", and the " { $snippet "href" } " attribute has the URL. So, given an element of the sequence produced in the above quotation, we run the code:"`
			`{ $code <" "link" tags-named [ "rel" attr "alternate" = ] find nip "> }`
			`"to get the link tag on the stack, and"`
			`{ $code <" "href" attr >url "> }`
			`"to extract the URL from it." ;`

XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`HELP: deep-tag-named`
			`{ $values { "tag" "an XML tag or document" } { "name/string" "an XML name or string representing a name" } { "matching-tag" tag } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Finds an XML tag with a matching name, recursively searching children and children of children." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also tags-named tag-named deep-tags-named } ;`

			`HELP: deep-tags-named`
			`{ $values { "tag" "an XML tag or document" } { "name/string" "an XML name or string representing a name" } { "tags-seq" "a sequence of tags" } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Returns a sequence of all tags of a matching name, recursively searching children and children of children." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also tag-named deep-tag-named tags-named } ;`

			`HELP: children>string`
			`{ $values { "tag" "an XML tag or document" } { "string" "a string" } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Concatenates the children of the tag, throwing an exception when there is a non-string child." } ;`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00
			`HELP: children-tags`
			`{ $values { "tag" "an XML tag or document" } { "sequence" sequence } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Gets the children of the tag that are themselves tags." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also first-child-tag } ;`

			`HELP: first-child-tag`
			`{ $values { "tag" "an XML tag or document" } { "tag" tag } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Returns the first child of the given tag that is a tag." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also children-tags } ;`

			`HELP: tag-named`
			`{ $values { "tag" "an XML tag or document" }`
			`{ "name/string" "an XML name or string representing the name" }`
			`{ "matching-tag" tag } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Finds the first tag with matching name which is the direct child of the given tag." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also deep-tags-named deep-tag-named tags-named } ;`

			`HELP: tags-named`
			`{ $values { "tag" "an XML tag or document" }`
			`{ "name/string" "an XML name or string representing the name" }`
			`{ "tags-seq" "a sequence of tags" } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Finds all tags with matching name that are the direct children of the given tag." }`
XML refactoring, splitting up docs 2009-01-21 00:54:33 -05:00			`{ $see-also deep-tag-named deep-tags-named tag-named } ;`

			`HELP: get-id`
			`{ $values { "tag" "an XML tag or document" } { "id" "a string" } { "elem" "an XML element or f" } }`
Splitting off PROCESS:/TAG: into a separate vocab; new word XML-NS: 2009-01-27 14:34:14 -05:00			`{ $description "Finds the XML tag with the specified id, ignoring the namespace." } ;`