diff --git a/basis/unicode/breaks/breaks-docs.factor b/basis/unicode/breaks/breaks-docs.factor
index e604c10c06..eb8c2eb00c 100644
--- a/basis/unicode/breaks/breaks-docs.factor
+++ b/basis/unicode/breaks/breaks-docs.factor
@@ -4,7 +4,7 @@ IN: unicode.breaks
 ABOUT: "unicode.breaks"
 
 ARTICLE: "unicode.breaks" "Word and grapheme breaks"
-"The " { $vocab-link "unicode.breaks" "unicode.breaks" } " vocabulary partially implements Unicode Standard Annex #29. This provides for segmentation of a string along grapheme and word boundaries. In Unicode, a grapheme, or a basic unit of display in text, may be more than one code point. For example, in the string \"e\\u000301\" (where U+0301 is a combining acute accent), there is only one grapheme, as the acute accent goes above the e, forming a single grapheme. Word breaks, in general, are more complicated than simply splitting by whitespace, and the Unicode algorithm provides for that."
+"The " { $vocab-link "unicode.breaks" } " vocabulary partially implements Unicode Standard Annex #29. This provides for segmentation of a string along grapheme and word boundaries. In Unicode, a grapheme, or a basic unit of display in text, may be more than one code point. For example, in the string \"e\\u000301\" (where U+0301 is a combining acute accent), there is only one grapheme, as the acute accent goes above the e, forming a single grapheme. Word breaks, in general, are more complicated than simply splitting by whitespace, and the Unicode algorithm provides for that."
 $nl "Operations for graphemes:" { $subsections first-grapheme
diff --git a/basis/unicode/data/data-docs.factor b/basis/unicode/data/data-docs.factor
index 9df0481217..563987ae75 100644
--- a/basis/unicode/data/data-docs.factor
+++ b/basis/unicode/data/data-docs.factor
@@ -6,7 +6,7 @@ IN: unicode.data
 ABOUT: "unicode.data"
 
 ARTICLE: "unicode.data" "Unicode data tables"
-"The " { $vocab-link "unicode.data" "unicode.data" } " vocabulary contains core Unicode data tables and code for parsing this from files. The following words access these data tables."
+"The " { $vocab-link "unicode.data" } " vocabulary contains core Unicode data tables and code for parsing this from files. The following words access these data tables."
 { $subsections
     canonical-entry
     combine-chars
diff --git a/basis/unicode/normalize/normalize-docs.factor b/basis/unicode/normalize/normalize-docs.factor
index e4433f13e0..58f381446e 100644
--- a/basis/unicode/normalize/normalize-docs.factor
+++ b/basis/unicode/normalize/normalize-docs.factor
@@ -4,7 +4,7 @@ IN: unicode.normalize
 ABOUT: "unicode.normalize"
 
 ARTICLE: "unicode.normalize" "Unicode normalization"
-"The " { $vocab-link "unicode.normalize" "unicode.normalize" } " vocabulary defines words for normalizing Unicode strings."
+"The " { $vocab-link "unicode.normalize" } " vocabulary defines words for normalizing Unicode strings."
 $nl
 "In Unicode, it is often possible to have multiple sequences of characters which really represent exactly the same thing. For example, to represent e with an acute accent above, there are two possible strings: " { $snippet "\"e\\u000301\"" } " (the e character, followed by the combining acute accent character) and " { $snippet "\"\\u0000e9\"" } " (a single character, e with an acute accent)."
 $nl