More expository XML docs

2009-03-16 22:29:38 -05:00 · 2009-03-16 22:29:38 -05:00 · fec49cb616
parent 7a0ce748df
commit fec49cb616
2 changed files with 23 additions and 3 deletions
--- a/basis/xml/traversal/traversal-docs.factor
+++ b/basis/xml/traversal/traversal-docs.factor
@@ -1,6 +1,6 @@
 ! Copyright (C) 2005, 2009 Daniel Ehrenberg
 ! See http://factorcode.org/license.txt for BSD license.
-USING: help.markup help.syntax xml.data sequences strings ;
+USING: help.markup help.syntax xml.data sequences strings multiline ;
 IN: xml.traversal

 ABOUT: "xml.traversal"
@@ -8,7 +8,7 @@ ABOUT: "xml.traversal"
 ARTICLE: "xml.traversal" "Utilities for traversing XML"
    "The " { $vocab-link "xml.traversal" } " vocabulary provides utilities for traversing an XML DOM tree and viewing the contents of a single tag. The following words are defined:"
    $nl
-    "Note: the difference between deep-tag-named and tag-named is that the former searches recursively among all children and children of children of the tag, while the latter only looks at the direct children, and is therefore more efficient."
+    { $subsection { "xml.traversal" "intro" } }
    { $subsection tag-named }
    { $subsection tags-named }
    { $subsection deep-tag-named }
@@ -20,6 +20,20 @@ ARTICLE: "xml.traversal" "Utilities for traversing XML"
    { $subsection first-child-tag }
    { $subsection assert-tag } ;

+ARTICLE: { "xml.traversal" "intro" } "An example of XML processing"
+"To illustrate how to use the XML library, we develop a simple Atom parser in Factor. Atom is an XML-based syndication format, like RSS. To see the full version of what we develop here, look at " { $snippet "basis/syndication" } " at the " { $snippet "atom1.0" } " word. First, we want to load a file and get a DOM tree for it."
+{ $code <" "file.xml" file>xml "> }
+"No encoding descriptor is needed, because XML files contain sufficient information to auto-detect the encoding. Next, we want to extract information from the tree. To get the title, we can use the following:"
+{ $code <" "title" tag-named children>string "> }
+"The " { $link tag-named } " word finds the first tag named " { $snippet "title" } " in the top level (just under the main tag). Then, with a tag on the stack, its children are asserted to be a string, and the string is returned." $nl
+"For a slightly more complicated example, we can look at how entries are parsed. To get a sequence of tags with the name " { $snippet "entry" } ":"
+{ $code <" "entry" tags-named "> }
+"Imagine that, for each of these, we want to get the URL of the entry. In Atom, the URLs are in a " { $snippet "link" } " tag which is contained in the " { $snippet "entry" } " tag. There are multiple " { $snippet "link" } " tags, but one of them contains the attribute " { $snippet "rel=alternate" } ", and the " { $snippet "href" } " attribute has the URL. So, given an element of the sequence produced in the above quotation, we run the code:"
+{ $code <" "link" tags-named [ "rel" attr "alternate" = ] find nip "> }
+"to get the link tag on the stack, and"
+{ $code <" "href" attr >url "> }
+"to extract the URL from it." ;
+
 HELP: deep-tag-named
 { $values { "tag" "an XML tag or document" } { "name/string" "an XML name or string representing a name" } { "matching-tag" tag } }
 { $description "Finds an XML tag with a matching name, recursively searching children and children of children." }
--- a/basis/xml/xml-docs.factor
+++ b/basis/xml/xml-docs.factor
@@ -67,9 +67,9 @@ HELP: string>dtd

 ARTICLE: { "xml" "reading" } "Reading XML"
    "The following words are used to read something into an XML document"
-    { $subsection string>xml }
    { $subsection read-xml }
    { $subsection read-xml-chunk }
+    { $subsection string>xml }
    { $subsection string>xml-chunk }
    { $subsection file>xml }
    { $subsection bytes>xml }
@@ -90,10 +90,16 @@ ARTICLE: { "xml" "events" } "Event-based XML parsing"
    { $subsection pull-event }
    { $subsection pull-elem } ;

+ARTICLE: { "xml" "namespaces" } "Working with XML namespaces"
+"The Factor XML parser implements XML namespaces, and provides convenient utilities for working with them. Anywhere in the public API that a name is accepted as an argument, either a string or an XML name is accepted. If a string is used, it is coerced into a name by giving it a null namespace. Names are stored as " { $link name } " tuples, which have slots for the namespace prefix and namespace URL as well as the main part of the tag name." $nl
+"To make it easier to create XML names, the parsing word " { $snippet "XML-NS:" } " is provided in the " { $vocab-link "xml.syntax" } " vocabulary." $nl
+"When parsing XML, names are automatically augmented with the appropriate namespace URL when the information is available. This does not take into account any XML schema which might allow for such prefixes to be omitted. When generating XML to be written, keep in mind that the XML writer knows only about the literal prefixes and ignores the URLs. It is your job to make sure that they match up correctly, and that there is the appropriate " { $snippet "xmlns" } " declaration." ;
+
 ARTICLE: "xml" "XML parser"
 "The " { $vocab-link "xml" } " vocabulary implements the XML 1.0 and 1.1 standards, converting strings of text into XML and vice versa. The parser checks for well-formedness but is not validating. There is only partial support for processing DTDs."
    { $subsection { "xml" "reading" } }
    { $subsection { "xml" "events" } }
+    { $subsection { "xml" "namespaces" } }
    { $vocab-subsection "Writing XML" "xml.writer" }
    { $vocab-subsection "XML parsing errors" "xml.errors" }
    { $vocab-subsection "XML entities" "xml.entities" }
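
For reference, the snippets in the new "An example of XML processing" article can be combined into a few small words. This is only a sketch, not part of the commit: the vocabulary name atom-sketch and the word names feed-title, entry-url and entry-urls are hypothetical, and it assumes that attr lives in xml.data and >url in urls.

```factor
USING: kernel sequences urls xml xml.data xml.traversal ;
IN: atom-sketch ! hypothetical vocabulary name, for illustration only

! Text of the first top-level "title" tag.
: feed-title ( xml -- string )
    "title" tag-named children>string ;

! URL of one entry: pick the "link" tag whose "rel" attribute
! is "alternate", then read its "href" attribute.
: entry-url ( entry -- url )
    "link" tags-named [ "rel" attr "alternate" = ] find nip
    "href" attr >url ;

! URLs of all top-level "entry" tags.
: entry-urls ( xml -- urls )
    "entry" tags-named [ entry-url ] map ;
```

With these loaded, "file.xml" file>xml entry-urls would leave a sequence of URLs on the stack. As in the article's snippet, entry-url does no error handling: if an entry has no link with rel=alternate, find leaves f on the stack and the attribute lookup that follows is not meaningful.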