Once your topics are in strict XHTML form, you can work with them in more detail and use that strictness to keep the content consistent.
Although this primarily enables you to use XSL transforms to check the content and move it into any required format, it is often overlooked that clean content also lends itself to a range of searches that confirm your standards are being applied.
A tool such as TextPad enables you to perform the following kinds of search and replace (S&R):
- Terminology: Simple search for terms that are not used by your house style.
- Formatting: Search to check that particular terms have the appropriate XHTML formatting.
- Linking: Search to ensure all instances of particular types of elements are linked. For example, all instances of ‘dialogs’ and ‘tabs’ link to the appropriate topic.
Care should be taken when doing a global S&R as there is as much potential for creating global problems as there is for fixing them.
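One way to reduce that risk is to preview what a replacement would change before committing to it. The following is a minimal Python sketch of the idea, not a TextPad feature; the function name `preview_replacements` and the sample text are hypothetical.

```python
import re

def preview_replacements(text, pattern, replacement):
    """Return (line_number, before, after) for each line the replace would change."""
    regex = re.compile(pattern)
    changes = []
    for number, line in enumerate(text.splitlines(), start=1):
        updated = regex.sub(replacement, line)
        if updated != line:
            changes.append((number, line, updated))
    return changes

# Hypothetical topic content: one real use of "tab" and one table.
sample = (
    "<p>Select the Student tab.</p>\n"
    "<table><tr><td>Name</td></tr></table>"
)

# A naive search also rewrites the <table> tags on line 2.
naive = preview_replacements(sample, "tab", "page")

# Word boundaries restrict the replace to the standalone word.
safe = preview_replacements(sample, r"\btab\b", "page")

for number, before, after in naive:
    print(number, before, "->", after)
```

Comparing `naive` with `safe` shows how a global replace can damage markup that merely contains the search term.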
TextPad enables you to use regular expressions to refine these searches. Regular expressions are used to perform complex searches in software such as Flare and TextPad. They enable you to identify phrases that are not followed by particular characters, using a negated character class (“a[^b]”), and phrases that contain alternative sections (“a|b”).
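As a rough illustration of these building blocks, here is a short sketch using Python's `re` module as a stand-in. TextPad and Flare use their own regular expression flavours, so treat the syntax as an assumption to check against each tool; the sample sentences are made up.

```python
import re

text = "Open the dialog, then close the dialog."

# [^,] is a negated character class: any single character except a comma.
not_before_comma = re.findall(r"dialog[^,]", text)

# a|b is alternation: match either alternative.
either_control = re.findall(r"dialog|tab", "the Student tab and the Course dialog")

# ? makes the preceding item optional.
singular_or_plural = re.findall(r"dialogs?", "one dialog, two dialogs")

print(not_before_comma)    # ['dialog.']
print(either_control)      # ['tab', 'dialog']
print(singular_or_plural)  # ['dialog', 'dialogs']
```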
Some examples of these expressions, as used in practice by technical authors, are as follows:
- “[^=|(]”[^ |>|?|/|)]” – Find double quotes that are not part of HTML attributes. Our house style uses only single quotes, so this expression locates any double quotes in the body of the topic text.
  - Avoids ‘” ‘ (end of attribute mid tag)
  - Avoids ‘”/>’ (end of attribute end tag)
  - Avoids ‘=”‘ (start of attribute)
  - Avoids ‘”?’ (header doctype)
  - Avoids ‘(“‘ (filename in quotes)
- “[^,]” – Find the phrase without a preceding comma. This is useful if you always use phrases of the form “To achieve XXX, perform the following”. It identifies any places where you have missed the “,”.
- “[^(his)|(he)|(current)] dialog[^<|s| |)]” – This assumes you provide links from procedural topics to screen topics, for example from “Student Dialog”. It works by identifying all occurrences of “dialog” that are not followed by the end of the link, “</a>”. Because the bracketed prefix is a character class, it excludes single characters rather than whole words, so any preceding word ending in one of the listed letters is also skipped.
  - Avoids “This dialog”
  - Avoids “The dialog”
  - Avoids “Current dialog”
  - Avoids “dialogs”
  - Avoids “dialog “
  - Avoids “dialog)”
  - Finds “dialog” not followed by “</a>”
- “tab[^<|s| |e|a|l|]” – This assumes you provide links from procedural topics to screen topics, for example from “Student tab”. It works by identifying all occurrences of “tab” that are not followed by the end of the link, “</a>”.
  - Avoids “tabs”
  - Avoids “tab “
  - Avoids “tabe”
  - Avoids “taba”
  - Avoids “tabl” (table)
- “[^(and)|(This)|(</a>)] Tab[^h|<|\.|”|l|e|s|a]” – We always write the control type (dialog, tab, etc.) in lowercase. This expression identifies any cases where “Tab” is capitalized and is not at the start of a sentence. It also rules out occurrences inside “Table” etc.
- “<img.*/> ” – Find images with spaces after them. We insert the space after an image via a CSS style, so we need to ensure that images don’t have “hard spaces” after them.
- “[^ |>]<a” – Find links without whitespace before them.
- “[^ |>|(]<b>” – Find bold text without whitespace before it.
- “<a.[^(href)]” – Find links with no href attribute.
- “href=””” – links with empty hrefs.
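To try several of these expressions against a topic in one pass, they can be collected into a small checker. The sketch below is an assumption-laden illustration: it uses Python's `re` module rather than TextPad's engine, writes the expressions with straight quotes, and the names `CHECKS`, `run_checks` and `SAMPLE`, along with the sample markup, are hypothetical.

```python
import re

# Expressions from the list above, written with straight quotes.
CHECKS = {
    "double quote in body text": r'[^=|(]"[^ |>|?|/|)]',
    "link without whitespace before": r"[^ |>]<a",
    "bold without whitespace before": r"[^ |>|(]<b>",
    "image followed by a space": r"<img.*/> ",
    "link with no href": r"<a.[^(href)]",
    "link with empty href": r'href=""',
}

def run_checks(text):
    """Return (line_number, description, matched_text) for every hit."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for description, pattern in CHECKS.items():
            for match in re.finditer(pattern, line):
                hits.append((number, description, match.group(0)))
    return hits

# Hypothetical topic: lines 1 to 5 each break a rule, line 6 is clean.
SAMPLE = """\
<p class="note">He said "hello" to us.</p>
<p>See the<a href="student.htm">Student dialog</a>.</p>
<p>Press<b>Enter</b> to continue.</p>
<p><img src="logo.png" alt="Logo"/> Welcome.</p>
<p><a name="top">Top</a> and <a href="">nowhere</a></p>
<p>See the <a href="course.htm">Course tab</a> and press <b>Tab</b>.</p>
"""

for number, description, matched in run_checks(SAMPLE):
    print(f"line {number}: {description}: {matched!r}")
```

Because the bracketed groups are character classes, a check such as “link with no href” only tests the first character of the attribute name, so it is best treated as a quick screen rather than a complete validation.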