By default, a Search runs against the index data for documents. Content Search provides a deeper level of business intelligence by looking within the text-based documents in your database for keywords or phrases. Each document found is further weighted for relevance by the number of times the keywords (“hits”) are found. The hits are highlighted in the Document Viewer, so that users can navigate from hit to hit in a document image.
Content Searches are run in the same way as other Searches, with the addition of a Keyword text box, where users enter words or phrases to search for within the document’s text. Users can create simple word searches or more complex queries that combine terms with logical operators (look for this and that, look for this or that), phrase-based searching, and proximity searching. Content Search supports keyword, fuzzy, stemming, phrase, and wildcard searching within the contents of a text-searchable document and highlights any matching instances. Please refer to Understanding Content-Based Searches for more details.
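The hit-based relevance weighting described above can be illustrated with a minimal sketch. This is not the product's actual ranking code; the function names `count_hits` and `rank_by_hits` are hypothetical, and the sketch assumes that a "hit" is a whole-word, case-insensitive occurrence of a single keyword.

```python
from __future__ import annotations

import re


def count_hits(text: str, keyword: str) -> int:
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_by_hits(documents: dict[str, str], keyword: str) -> list[tuple[str, int]]:
    """Return (name, hits) pairs for matching documents, most hits first."""
    scored = [(name, count_hits(text, keyword)) for name, text in documents.items()]
    matching = [pair for pair in scored if pair[1] > 0]
    return sorted(matching, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative document text only; not taken from a real Archive.
    docs = {
        "a.txt": "Invoice for invoice 12",
        "b.txt": "Invoice",
        "c.txt": "receipt",
    }
    print(rank_by_hits(docs, "invoice"))
```

Documents with no hits are dropped, and the rest are ordered by hit count, which mirrors the idea that more hits mean higher relevance.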
There are three requirements for successfully configuring a search of document contents: first, the selected Archive must be a Content-Searchable Archive; next, the documents searched must have text-based content; and finally, the Search must be enabled for Content Search.
To do full-text searching of document content, based on an OCR scan of the document, do the following:
- Create or edit an Archive to enable Content Searches.
- Create a Search for the Archive, click Advanced, and in the Options group, select Content Search Enabled. (Note that every Search, including a Content Search, must have at least one Index Field assigned.)
- In the Content Search Options group that appears, optionally select one or more of the following:
  - Phonics – Enable to use phonic (“sounds like”) searching.
  - Fuzziness – Enable to allow a “degree of wrongness” in the search criteria. Enter a number (1 - 10) in the text box or use the arrows to increase or decrease the amount. This number represents the number of allowable incorrect characters. For example, a fuzziness of 1 would allow a search for "appple" to find the word "apple".
  - Stemming – Enable to extend a search to cover grammatical variations on a word. For example, a search for fish would also find fishing.
- Click Save.
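As a rough illustration of the options above, the following sketch models the Content Search Options group as a small data structure and treats Fuzziness as a limit on edit distance. The class `ContentSearchOptions` and the functions `edit_distance` and `fuzzy_match` are hypothetical names, not part of the product. The use of Levenshtein edit distance is also an assumption, taken from the description of Fuzziness as a number of allowable incorrect characters; the actual matching algorithm may differ.

```python
from dataclasses import dataclass


@dataclass
class ContentSearchOptions:
    """Hypothetical stand-in for the Content Search Options group."""

    phonics: bool = False
    fuzziness: int = 0  # 0 means disabled; otherwise 1-10 incorrect characters
    stemming: bool = False

    def __post_init__(self) -> None:
        if not 0 <= self.fuzziness <= 10:
            raise ValueError("fuzziness must be 0 (off) or between 1 and 10")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_match(term: str, word: str, options: ContentSearchOptions) -> bool:
    """Assumed rule: a word matches if it is within `fuzziness` edits of the term."""
    return edit_distance(term.lower(), word.lower()) <= options.fuzziness


if __name__ == "__main__":
    options = ContentSearchOptions(fuzziness=1)
    print(fuzzy_match("appple", "apple", options))
```

With a fuzziness of 1, the misspelled term "appple" is one deletion away from "apple", so the two are treated as a match under this assumed rule.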
Content Search special characters (also called Content Search modifiers):
* – Wildcard, “zero to many characters.” One or more allowed, anywhere in the search term.
? – Wildcard, “one and only one character.” One or more allowed, anywhere in the search term.
= – Wildcard, “one and only one digit.” One or more allowed, anywhere in the search term.
% – Fuzzy, “degree of wrongness.” One or more allowed, anywhere in the search term.
# – Phonic, “sounds like.” One allowed, at the beginning of the search term.
~ – Stemming, “forms of the word.” One allowed, at the end of the search term.
AND, OR, NOT – Boolean operators. Use between terms, for example: (educat* or train*) AND (manag* or direct*)
w/# – Proximity. Acts as AND, but requires the terms to be within # words of each other.
( ) – Parenthetical control. Use around terms to control the order of operations.
“ ” – Phrase. Use quotes around a string of words to search for a phrase.
#~~# – Numeric range. Searches for strings of digits, not numeric values. For example, 12~~14 would also find 14,000.
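To make the wildcard and proximity modifiers more concrete, here is a minimal sketch that translates the *, ?, and = wildcards into a regular expression and checks a w/# style proximity condition. It only illustrates the behavior described in the list above; the function names `wildcard_to_regex`, `find_hits`, and `within` are hypothetical. The sketch assumes that wildcards match within a single word and that proximity is counted in whole words in either direction, which may not match the product's search engine exactly.

```python
import re

# Assumed mapping from the wildcard characters to regex fragments.
_WILDCARDS = {
    "*": r"\w*",  # zero to many characters
    "?": r"\w",   # one and only one character
    "=": r"\d",   # one and only one digit
}


def wildcard_to_regex(term: str) -> "re.Pattern[str]":
    """Translate a single wildcard search term into a whole-word regex."""
    parts = [_WILDCARDS.get(ch, re.escape(ch)) for ch in term]
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def find_hits(term: str, text: str) -> list:
    """Return every word in text that matches the wildcard term."""
    return [m.group(0) for m in wildcard_to_regex(term).finditer(text)]


def within(text: str, first: str, second: str, distance: int) -> bool:
    """Sketch of 'first w/distance second': both words present and close together."""
    words = re.findall(r"\w+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= distance for i in first_pos for j in second_pos)


if __name__ == "__main__":
    print(find_hits("educat*", "Education and educators educate"))
    print(find_hits("appl?", "apple apply apples"))
    print(within("the invoice was paid in full", "invoice", "paid", 2))
```

A term such as educat* expands to any word beginning with "educat", while appl? requires exactly one more character, so "apples" is not a hit for it.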