Indexing options
What to index

You can specify here which parts of a page are included in or excluded from indexing. These include the page title, content, and filename. Meta information such as meta descriptions, keywords, and author information can also be indexed.

Excluding certain sections of pages makes the index data files smaller and the indexing procedure faster and less memory intensive. It may also make your searches more accurate, because only the relevant sections of a page are indexed.

URL domain will index the domain name. For example, when indexing the page "http://www.mysite.com/section1/index.html", the words "www mysite com" are indexed (if dots are not enabled as a join character in the indexing word rules). If dots are enabled for joining words, the full domain "www.mysite.com" is indexed as one word.

URL path will index the path name as well, so that when indexing the page "http://www.mysite.com/section1/index.html", the word "section1" is indexed.

Dublin Core metadata can also be indexed. When this option is enabled, Zoom will index the DC.Title, DC.Subject, and DC.Identifier meta tags as described by the Dublin Core Metadata Initiative (DCMI).

Note that "Link text" and "ALT text for images" only affect the indexing of these elements for the target or destination file. That is, if a text link appears on "pageA.html" to "imageB.jpg", with the link text (or ALT text) "picture of my pets", then these words will be indexed for the file "imageB.jpg", and NOT for "pageA.html".

Indexing word rules

This allows you to specify which characters are allowed to act as a join character between two words. Characters that are not enabled as join characters act as word separators. For example, if the ‘dash/hyphen’ character is a join character, words such as “web-based” will be indexed as one word; otherwise, it would be split into two words, “web” and “based”. Note that a join character must be immediately preceded and followed by another valid character to be indexed.
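The effect of join characters can be illustrated with a small Python sketch. This is not Zoom's own code: the function name `split_words` and the assumption that a "valid character" means an ASCII letter or digit are hypothetical and chosen only for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption for this sketch: a "valid character" is an ASCII letter or digit.
VALID = "[A-Za-z0-9]+"

def split_words(text, join_chars=""):
    """Split text into words. A join character is kept inside a word only
    when it is immediately preceded and followed by a valid character."""
    if join_chars:
        joiner = "[" + re.escape(join_chars) + "]"
        pattern = VALID + "(?:" + joiner + VALID + ")*"
    else:
        pattern = VALID
    return re.findall(pattern, text)

# Hyphen enabled as a join character: "web-based" stays one word.
print(split_words("web-based search", join_chars="-"))
# Hyphen not enabled: it acts as a separator.
print(split_words("web-based search"))
# A hyphen followed by a space does not join anything.
print(split_words("web- based", join_chars="-"))

# The same rule applied to the domain of the example URL.
domain = urlparse("http://www.mysite.com/section1/index.html").netloc
print(split_words(domain))                  # dots not enabled
print(split_words(domain, join_chars="."))  # dots enabled
```

With dots not enabled, the domain is split into "www", "mysite", and "com"; with dots enabled it is kept as the single word "www.mysite.com", matching the URL domain behaviour described above.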
A list of the characters available for this option:
Rewrite links

This option allows you to rewrite the URLs of the indexed pages. This can be useful if you are spidering a development version of your site on a test server (e.g. http://test.mycompany.com/) and creating index files to go on the live server (e.g. http://www.mycompany.com/). You would do this by specifying rewrite options that replace all instances of "http://test.mycompany.com/" in the indexed URLs with "http://www.mycompany.com/".

You could also use this option to make all the search result links relative rather than absolute, by replacing the domain (e.g. "http://www.mysite.com/") with a relative path (e.g. "./" or "../"). We only recommend this for users who are very familiar with relative linking and who understand that the links will only work if the generated search files are placed in an appropriate folder on the server.
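As a rough sketch of what such a rewrite does to an indexed URL, the following Python snippet applies plain find-and-replace rules. The function name `rewrite_url` and the idea of applying several rules in order are assumptions made for illustration; the actual rewriting is configured in the indexer, not written as code.

```python
def rewrite_url(url, rules):
    """Apply each (find, replace) rule in order, replacing all instances
    of the 'find' text in the URL."""
    for find, replace in rules:
        url = url.replace(find, replace)
    return url

# Test server to live server, as in the example above.
live_rules = [("http://test.mycompany.com/", "http://www.mycompany.com/")]
print(rewrite_url("http://test.mycompany.com/products/index.html", live_rules))
# prints http://www.mycompany.com/products/index.html

# Absolute to relative links.
relative_rules = [("http://www.mysite.com/", "./")]
print(rewrite_url("http://www.mysite.com/section1/index.html", relative_rules))
# prints ./section1/index.html
```

The second result only resolves correctly if the search files sit in the folder that "./" is meant to be relative to, which is why the relative form is recommended only for experienced users.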